commit b1d2e4e03f upstream.
Fixup the checksum for CHECKSUM_COMPLETE when pulling skbs on RX path.
Otherwise we get splats when tc mirred is used to redirect packets to ifb.
Before fix:
nic: hw csum failure
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
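For reference, a minimal sketch of the kind of fixup described above (a generic
illustration, not the actual lan78xx diff; example_rx_pull is a hypothetical helper):

#include <linux/skbuff.h>

/*
 * When data is pulled from an skb carrying ip_summed == CHECKSUM_COMPLETE,
 * skb->csum still covers the pulled bytes. skb_postpull_rcsum() subtracts
 * their contribution, avoiding the "hw csum failure" splat later on.
 */
static void example_rx_pull(struct sk_buff *skb, unsigned int len)
{
        const void *start = skb->data;

        skb_pull(skb, len);
        skb_postpull_rcsum(skb, start, len);
}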
The format change from S24_LE to S24_3LE effectively disables 24-bit
mode as S24_3LE isn't supported by bcm2835-i2s. This causes issues
with drivers that want to use wm8804 in 24-bit mode.
Adding the S32_LE format is also incorrect, according to the datasheet
only 16-24 bit formats are supported.
Signed-off-by: Matthias Reichl <hias@horus.com>
Commit eea52743eb5654ec6f52b0e8b4aefec952543697 upstream.
Fixes: b3e3893e1253 ("net: use core MTU range checking"),
which patched only one of the two functions used to set up the
USB Gadget Ethernet driver, causing a serious performance
regression: the MTU could no longer be increased above 1500.
Signed-off-by: John Greb <h3x4m3r0n@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
The SC16IS752 has an Enhanced Feature Register which is aliased at the
same address as the Interrupt Identification Register; accessing it
requires that a magic value is written to the Line Configuration
Register. If an interrupt is raised while the EFR is mapped in then
the ISR won't be able to access the IIR, leading to the "Unexpected
interrupt" error messages.
Avoid the problem by claiming a mutex around accesses to the EFR
register, also claiming the mutex in the interrupt handler work
item (this is equivalent to disabling interrupts to interlock against
a non-threaded interrupt handler).
See: https://github.com/raspberrypi/linux/issues/2529
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
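A minimal sketch of the locking pattern described above (register names, values
and helpers are hypothetical placeholders, not the sc16is7xx code):

#include <linux/mutex.h>
#include <linux/types.h>

#define DEMO_LCR_REG    0x03    /* Line Configuration Register       */
#define DEMO_IIR_EFR    0x02    /* IIR, aliased with the EFR         */
#define DEMO_LCR_MAGIC  0xBF    /* magic value that maps the EFR in  */

struct demo_chip {
        struct mutex efr_lock;  /* serialises the EFR window against IRQ work */
};

static void demo_reg_write(struct demo_chip *chip, u8 reg, u8 val)
{
        /* placeholder for the real I2C/SPI/regmap access */
}

static void demo_write_efr(struct demo_chip *chip, u8 val)
{
        mutex_lock(&chip->efr_lock);
        demo_reg_write(chip, DEMO_LCR_REG, DEMO_LCR_MAGIC); /* EFR mapped in  */
        demo_reg_write(chip, DEMO_IIR_EFR, val);
        demo_reg_write(chip, DEMO_LCR_REG, 0x03);           /* back to normal */
        mutex_unlock(&chip->efr_lock);
}

/*
 * The interrupt handler work item takes chip->efr_lock before reading the
 * IIR, so it can never find the EFR mapped in at that address.
 */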
It has been observed that a Pi with no SD card will poll the interface
continuously, using up to 10% CPU. Add some new parameters to the
sdtweak overlay to control this behaviour:
poll_once   Only look for a card once, at boot time. If none is found
            then the interface is effectively disabled.
enable      Set to "off" or "no" to completely disable the interface.
See: https://github.com/raspberrypi/linux/issues/2567
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
WM8804 can run with PLL frequencies of 256xfs and 128xfs for most sample
rates. At 192kHz only 128xfs is supported. The existing driver selects
128xfs automatically for some lower sample rates. By using an
additional mclk_div divider, it is now possible to control the
behaviour. This allows using 256xfs PLL frequency on all sample rates up
to 96kHz. It should allow lower jitter and better signal quality. The
behaviour has to be controlled by the sound card driver, because some
sample frequencies share the same setting, e.g. 192kHz and 96kHz both use
a 24.576MHz master clock; the only difference is the MCLK divider.
This also adds support for 32-bit data.
Signed-off-by: Daniel Matuschek <daniel@matuschek.net>
The published API to the dynamic overlay application mechanism now
takes a Flattened Device Tree blob as input so that it can manage the
lifetime of the unflattened tree. Conveniently, the new API call -
of_overlay_fdt_apply - is virtually a drop-in replacement for
create_overlay, which can now be deleted.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ad83c7cb2f ("irqchip/irq-bcm2836: Add support for DT interrupt polarity")
changed the way that the BCM2836/7 local interrupts are mapped; instead
of being pre-mapped they are now mapped on-demand. A side effect of this
change is that the call to irq_of_parse_and_map from armctrl_of_init
creates a new mapping, forming a gap between the IRQs and the FIQs. This
gap breaks the FIQ<->IRQ mapping which up to now has been done by assuming:
1) that the value of FIQ_START is the same as the number of normal IRQs
that will be mapped (still true), and
2) that this value is also the offset between an IRQ and its equivalent
FIQ (which is no longer the case).
Remove both assumptions by measuring the interval between the last IRQ
and the last FIQ, passing it as the parameter to init_FIQ().
Fixes: https://github.com/raspberrypi/linux/issues/2432
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Register for reboot notifications, sending RPI_FIRMWARE_NOTIFY_REBOOT
over the mailbox interface on reception.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
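A rough sketch of this pattern, assuming a firmware handle obtained at probe
time (simplified; not the exact driver code):

#include <linux/notifier.h>
#include <linux/reboot.h>
#include <linux/types.h>
#include <soc/bcm2835/raspberrypi-firmware.h>

static struct rpi_firmware *demo_fw;    /* from rpi_firmware_get() at probe */

static int demo_reboot_notify(struct notifier_block *nb, unsigned long action,
                              void *data)
{
        u32 flags = 0;

        /* Tell the firmware a reboot is about to happen. */
        rpi_firmware_property(demo_fw, RPI_FIRMWARE_NOTIFY_REBOOT,
                              &flags, sizeof(flags));
        return NOTIFY_DONE;
}

static struct notifier_block demo_reboot_nb = {
        .notifier_call = demo_reboot_notify,
};

/* At probe time: register_reboot_notifier(&demo_reboot_nb); */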
Upstream Linux deems using output GPIOs to generate IRQs as a bogus
use case, even though the BCM2835 GPIO controller is capable of doing
so. A number of users would like to make use of this facility, so
disable the checks.
See: https://github.com/raspberrypi/linux/issues/2527
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
With HW_VLAN_CTAG_RX enabled we don't observe the checksum
issue, so amend the workaround to only drop back to s/w
checksums if VLAN offload is disabled.
See #2458.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The chip supports stripping the VLAN tag and reporting it
in metadata. Implement this as it also appears to solve the
issues observed in checksum computation.
See #2458.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
HW_VLAN_CTAG_FILTER was partially implemented, but not fully exposed to
Linux. Complete the implementation.
See #2458.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
There appears to be some issue in the LAN78xx where the checksum
computed on a VLAN tagged packet is incorrect, or at least not
in the form that the kernel is after. This is most easily shown
by pinging a device via a VLAN tagged interface and it will dump
out the error message and stack trace from netdev_rx_csum_fault.
It has also been seen with standard TCP and UDP packets.
Until this is fully understood, request that the network stack
computes the checksum on packets signalled as having a VLAN tag
applied.
See https://github.com/raspberrypi/linux/issues/2458
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Enable EEE mode as soon as possible after connecting to the PHY, and
before phy_start. This avoids a second link negotiation, which speeds
up booting and stops the interface failing to become ready.
See: https://github.com/raspberrypi/linux/issues/2437
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The frame abort timeout being set by lan78xx_set_rx_max_frame_length
didn't account for any VLAN headers, resulting in very low
throughput if used with tagged VLANs.
Use VLAN_ETH_HLEN instead of ETH_HLEN to correct for this.
See https://github.com/raspberrypi/linux/issues/2458
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The patch to set the lan78xx MAC address from DT does so regardless of
whether or not the interface already has a valid address. As the
initialisation function is called from the reset handler when the
interface is brought up, it is impossible to change the MAC address
in a way that persists across the interface being brought up.
Fix the problem by moving the DT reading code after the check for a
valid address.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=209309
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add support for DT property "microchip,led-modes", a vector of two
cells (u32s) in the range 0-15, each of which sets the mode for one
of the two LEDs. The possible values are:
0=link/activity           1=link1000/activity
2=link100/activity        3=link10/activity
4=link100/1000/activity   5=link10/1000/activity
6=link10/100/activity     14=off   15=on
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
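A sketch of how a driver might pick up this property (hypothetical helper,
assuming the standard of_property_read_u32_array() call):

#include <linux/of.h>

#define DEMO_LED_COUNT 2

static void demo_read_led_modes(struct device_node *np, u32 modes[DEMO_LED_COUNT])
{
        if (of_property_read_u32_array(np, "microchip,led-modes",
                                       modes, DEMO_LED_COUNT) == 0) {
                /* modes[0] and modes[1] now hold values in the range 0-15,
                 * one per LED, ready to be written to the LED config. */
        }
}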
By user request, add a switch to prevent the clocks being stopped when
the stream is paused, stopped or shutdown. Provide access to the switch
by adding a 'non-stop-clocks' parameter to the audioinjector-addons
overlay.
See: https://github.com/raspberrypi/linux/issues/2409
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
It seems that trying to go from unlatched to unlatched will time out
waiting for STOP, and we can just skip that.
Signed-off-by: Eric Anholt <eric@anholt.net>
We don't use the same async update path between fkms and normal kms,
and the normal kms workaround ended up making us wait. This became a
larger problem in rpi-4.14.y, as the USB HID update rate throttling
got (accidentally?) dropped.
Signed-off-by: Eric Anholt <eric@anholt.net>
The SMICS interrupt fires continuously, but since it's 1/100 the rate
of the USB interrupts, we don't really need a way to turn it off. We
do need to make sure that we don't tell DRM about it until DRM has
asked for the interrupt at least once, because otherwise it will throw
a warning at boot time.
Signed-off-by: Eric Anholt <eric@anholt.net>
The default LED modes put 1000Mb/activity on orange and 100Mb/activity
on green, but this leaves no indication for a 10Mb link. Change the
defaults to put 10Mb/100Mb/activity on green.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add two new DT properties:
* microchip,eee-enabled - a boolean to enable EEE
* microchip,tx-lpi-timer - time in microseconds to wait before entering
low power state
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This warning appears with GCC 6.4.0 from toolchains.bootlin.com:
../drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c: In function ‘vchiq_open’:
../drivers/staging/vc04_services/interface/vchiq_arm/vchiq_arm.c:1735:7: warning: unused variable ‘ret’ [-Wunused-variable]
int ret;
^~~
This variable's usage was removed by commit 3c980263c5 ("staging:
vchiq_arm: Make debugfs failure non-fatal"), making it useless.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
This warning appears with GCC 6.4.0 from toolchains.bootlin.com:
../sound/soc/bcm/allo-piano-dac-plus.c: In function ‘snd_allo_piano_dac_init’:
../sound/soc/bcm/allo-piano-dac-plus.c:711:30: warning: argument to ‘sizeof’ in ‘memset’ call is the same expression as the destination; did you mean to dereference it? [-Wsizeof-pointer-memaccess]
memset(glb_ptr, 0x00, sizeof(glb_ptr));
^
Suggested-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
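The warning arises because only sizeof(pointer) bytes are cleared; the usual
fix, sketched here with a hypothetical structure, is to clear the object the
pointer refers to:

#include <linux/string.h>

struct demo_state {             /* stand-in for the real type */
        int mode;
        int volume;
};

static void demo_clear(struct demo_state *glb_ptr)
{
        /* Wrong: clears only sizeof(void *) bytes.    */
        /* memset(glb_ptr, 0x00, sizeof(glb_ptr));     */

        /* Right: clears the whole pointed-to object.  */
        memset(glb_ptr, 0x00, sizeof(*glb_ptr));
}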
Under-voltage due to inadequate power supplies is a recurring problem for
new Raspberry Pi users. There are visual indications that an
under-voltage situation is occurring, like a blinking power LED and a
lightning icon on the desktop (not shown when using the vc4 driver), but
for new users it's not obvious that this signifies a critical situation.
This patch provides a twofold improvement to the situation:
Firstly, it logs under-voltage events to the kernel log. This also provides
information for headless installations.
Secondly, it provides a sysfs file to read the value. This improves on
'vcgencmd' by providing change notification: userspace can poll on the
file and be notified of changes to the value.
A script can poll the file and use a dbus notification to put a window on
the desktop with information about the severity and a recommendation to
change the power supply. A link to more information can also be provided.
Only changes to the sticky bits are reported (cleared between readings).
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
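A userspace sketch of the polling scheme described above (the sysfs path is a
placeholder; substitute the file actually exported by the driver):

#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        /* Placeholder path - use the real sysfs file exported by the driver. */
        const char *path = "/sys/devices/platform/example/under_voltage";
        char buf[16];
        int fd = open(path, O_RDONLY);

        if (fd < 0)
                return 1;

        for (;;) {
                struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };

                /* Reading the attribute (re-)arms the notification... */
                if (lseek(fd, 0, SEEK_SET) < 0 || read(fd, buf, sizeof(buf)) < 0)
                        break;

                /* ...then poll() returns when the driver signals a change. */
                if (poll(&pfd, 1, -1) < 0)
                        break;

                printf("under-voltage status changed\n");
        }
        close(fd);
        return 0;
}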
Prevent voltage low warnings from filling log
Although the correct fix for low voltage warnings is to
improve the power supply, the current implementation
of the detection can fill the log if the warning
happens frequently. This replaces the logging with
slightly custom ratelimited logging.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
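A sketch of this kind of ratelimited logging using the kernel's ratelimit
helpers (the interval and burst values are illustrative, not the ones used in
the patch):

#include <linux/printk.h>
#include <linux/ratelimit.h>

/* Allow at most one warning per 30-second window. */
static DEFINE_RATELIMIT_STATE(demo_undervolt_rs, 30 * HZ, 1);

static void demo_report_undervoltage(void)
{
        if (__ratelimit(&demo_undervolt_rs))
                pr_warn("Under-voltage detected!\n");
}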
Reduce log spam when mailbox call not implemented
This changes the logging message when a mailbox call
fails to the dev_dbg level. In addition, it fixes the
low voltage detection logging code so that if the
mailbox call fails, it logs at error level
and sets a flag so the call is no longer attempted.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
The SC16IS752 is a dual-channel device. The two channels are largely
independent, but the IRQ signals are wired together as an open-drain,
active low signal which will be driven low while either of the
channels requires attention, which can be for significant periods of
time until operations complete and the interrupt can be acknowledged.
In that respect it should be treated as a true level-sensitive IRQ.
The kernel, however, needs to be able to exit interrupt context in
order to use I2C or SPI to access the device registers (which may
involve sleeping). Therefore the interrupt needs to be masked out or
paused in some way.
The usual way to manage sleeping from within an interrupt handler
is to use a threaded interrupt handler - a regular interrupt routine
does the minimum amount of work needed to triage the interrupt before
waking the interrupt service thread. If the threaded IRQ is marked as
IRQF_ONESHOT the kernel will automatically mask out the interrupt
until the thread runs to completion. The sc16is7xx driver used to
use a threaded IRQ, but a patch switched to using a kthread_worker
in order to set realtime priorities on the handler thread and for
other optimisations. The end result is a non-threaded IRQ that
schedules some work and then returns IRQ_HANDLED, making the kernel
think that all IRQ processing has completed.
The work-around to prevent a constant stream of interrupts is to
mark the interrupt as edge-sensitive rather than level-sensitive,
but interpreting an active-low source as a falling-edge source
requires care to prevent a total cessation of interrupts. Whereas
an edge-triggered source will generate a new edge for every interrupt
condition, a level-triggered source will keep the signal at the
interrupting level until it no longer requires attention; in other
words, the host won't see another edge until all interrupt conditions
are cleared. It is therefore vital that the interrupt handler does not
exit with an outstanding interrupt condition, otherwise the kernel
will not receive another interrupt unless some other operation causes
the interrupt state on the device to be cleared.
The existing sc16is7xx driver has a very simple interrupt "thread"
(kthread_work job) that processes interrupts on each channel in turn
until there are no more. If both channels are active and the first
channel starts interrupting while the handler for the second channel
is running then it will not be detected and an IRQ stall ensues. This
could be handled easily if there was a shared IRQ status register, or
a convenient way to determine if the IRQ had been deasserted for any
length of time, but both appear to be lacking.
Avoid this problem (or at least make it much less likely to happen)
by reducing the granularity of per-channel interrupt processing
to one condition per iteration, only exiting the overall loop when
both channels are no longer interrupting.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
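A sketch of the servicing strategy in pseudo-driver form (helper names are
hypothetical):

#include <linux/types.h>

#define DEMO_NUM_CHANNELS 2

/* Read the channel's IIR, service at most one pending condition, and
 * return true if anything needed attention - placeholder body. */
static bool demo_handle_one_condition(int channel)
{
        return false;
}

static void demo_irq_work(void)
{
        bool busy;

        do {
                int ch;

                busy = false;
                /* One condition per channel per pass; only stop once a
                 * full pass finds both channels idle, so a channel that
                 * starts interrupting while its sibling is being serviced
                 * is still picked up. */
                for (ch = 0; ch < DEMO_NUM_CHANNELS; ch++)
                        busy |= demo_handle_one_condition(ch);
        } while (busy);
}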
I2C busses can be assigned specific bus numbers using aliases in
Device Tree - string properties where the name is the alias and the
value is the path to the node. The current DT parameter mechanism
does not allow property names to be derived from a parameter value
in any way, so it isn't possible to generate unique or matching
aliases for nodes from an overlay that can generate multiple
instances, e.g. i2c-gpio.
Work around this limitation (at least temporarily) by allowing
the i2c adapter number to be initialised from the "reg" property
if present.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
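A sketch of the workaround (simplified, with a hypothetical registration
helper):

#include <linux/i2c.h>
#include <linux/of.h>

static int demo_register_adapter(struct device_node *np, struct i2c_adapter *adap)
{
        u32 bus_num;

        /* Seed the bus number from "reg" when present... */
        if (np && of_property_read_u32(np, "reg", &bus_num) == 0) {
                adap->nr = bus_num;
                return i2c_add_numbered_adapter(adap);
        }
        /* ...otherwise fall back to dynamic assignment. */
        return i2c_add_adapter(adap);
}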
This ensures that the screen goes blank during DPMS (screensaver),
including the cursor. Planes don't necessarily get disabled during
CRTC disable, so we need to be careful to not leave them on or turn
them back on early.
Signed-off-by: Eric Anholt <eric@anholt.net>
In the rewrite of vc4_crtc.c for fkms, I dropped the part of the
CRTC's atomic flush handler that moved the completion event from the
proposed atomic state change to the CRTC's current state. That meant
that when full-screen pageflipping happened (glxgears -fullscreen in
X, compton, or weston), the app would end up blocked forever waiting
to draw its next frame.
Signed-off-by: Eric Anholt <eric@anholt.net>
There is a new test in __irq_startup that the IRQ is activated, which
hasn't been the case for FIQs since they bypass some of the usual setup.
Augment enable_fiq to include a call to irq_activate to avoid the
warning.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VideoCore bootloader passes in Serial number and
Revision number through Device Tree. Make these available to
userspace through /proc/cpuinfo.
Mainline status:
There is a commit in linux-next that standardizes passing the serial
number through Device Tree (string: /serial-number):
ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo
There was an attempt to do the same with the revision number, but it
didn't get in:
[PATCH v2 1/2] arm: devtree: Set system_rev from DT revision
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
The driver was using a fixed resolution; this commit
adds touchscreen size, coordinate flip and coordinate swap
features via Device Tree overlays.
Adds overrides so the VC4 can adjust the DT parameters
appropriately; there is a newer version of the VC4 side
driver that can now set up the appropriate DT values
if required.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
The MCP2515 datasheet clearly describes a level-triggered interrupt
pin. Therefore the receiving interrupt controller must also be
configured for level-triggered operation otherwise there is a danger
of a missed interrupt condition blocking all subsequent interrupts.
The ONESHOT flag ensures that the interrupt is masked until the
threaded interrupt handler exits.
Rather than change the flags globally (they must have worked for at
least one user), allow the flags to be overridden from Device Tree
in the event that the device has a DT node.
See: https://github.com/raspberrypi/linux/issues/2175
See: https://github.com/raspberrypi/linux/issues/2263
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
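A sketch of the override (not the actual mcp251x change): keep the existing
default flags, but honour a trigger type supplied via the Device Tree
interrupt specifier, together with ONESHOT:

#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/of.h>

static unsigned long demo_irq_flags(struct device_node *np, unsigned int irq,
                                    unsigned long default_flags)
{
        unsigned long flags = default_flags | IRQF_ONESHOT;

        if (np) {
                u32 type = irq_get_trigger_type(irq); /* from the DT specifier */

                if (type)
                        flags = type | IRQF_ONESHOT;
        }
        return flags;   /* pass to request_threaded_irq() */
}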
Uses the debugfs I/F to provide access to the AXI
bus performance monitors.
Requires the new mailbox peripheral access for access
to the VPU performance registers; system bus access
is done using direct register reads.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
IRQ-CPU mapping is round robined on ARM64 to increase
concurrency and allow multiple interrupts to be serviced
at a time. This reduces the need for FIQ.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
On ARM64, the FIQ mechanism used by this driver is not currently
implemented. As a workaround, a regular IRQ is used instead
of FIQ.
In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time. This reduces the need for FIQ.
Tests Run:
This mechanism is most likely to break when multiple USB devices
are attached at the same time. So the system was tested under
stress.
Devices:
1. USB speakers playing back FLAC audio through VLC
at 96kHz (higher than typical, but supported on my speakers).
2. sftp transferring large files through the built-in Ethernet
connection, which is connected through USB.
3. Keyboard and mouse attached and being used.
Although I do occasionally hear some glitches, the music seems to
play quite well.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
ARM64: Modify default config to get raspbian to boot (#1686)
1. Enable emulation of deprecated instructions.
2. Enable ARM 8.1 and 8.2 features which are not detected at runtime.
3. Switch the default governor to powersave.
4. Include the watchdog timer driver in the kernel image rather than as a module.
Tested with raspbian-jessie 2016-09-23.
ARM64: Make it work again on 4.9 (#1790)
* Invoke the dtc compiler with the same options used in arm mode.
* ARM64 now uses the bcm2835 platform just like ARM32.
* ARM64: Update bcmrpi3_defconfig
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
Update arm64 Makefile to compile bcm2710 dtb file
The line 'dtb-$(CONFIG_ARCH_BCM2835) += bcm2710-rpi-3-b.dtb' has been copied from the previous rpi-4.14.y version into the rpi-4.15.y arch/arm64/boot/dts/broadcom/Makefile to restore compilation of the bcm2710-rpi-3-b.dtb device tree blob under the 'make ARCH=arm64 dtbs' command.
Since we don't have any CLM-capable firmware yet, silence the warning
of its absence by using request_firmware_direct, which should also
be marginally quicker.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The brcmfmac WiFi driver always complains about the '00' country code.
Modify the driver to ignore '00' silently.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Disable power management
Disable wireless power saving in the brcmfmac WLAN driver. This is a
temporary measure until the connectivity loss resulting from power
saving is resolved.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Use original country code as a fallback
Commit 73345fd212:
brcmfmac: Configure country code using device specific settings
prevents region codes from working on devices that lack a region code
translation table. In the event of an absent table, preserve the old
behaviour of using the provided code as-is.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Plug memory leak in brcmf_fill_bss_param
See: https://github.com/raspberrypi/linux/issues/1471
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: do not use internal roaming engine by default
There is some evidence that disabling the internal roaming engine cures disconnects, so make it the default.
It can be overridden with the module parameter roamoff=0.
See: http://projectable.me/optimize-my-pi-wi-fi/
brcmfmac: Change stop_ap sequence
Patch from Broadcom/Cypress to resolve a customer error
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This is a port of Pantelis Antoniou's v3 port that makes use of the
new upstreamed configfs support for binary attributes.
Original commit message:
Add a runtime interface, using configfs, for generic device tree overlay
usage. With it, it is possible to use device tree overlays without having
to use a per-platform overlay manager.
Please see Documentation/devicetree/configfs-overlays.txt for more info.
Changes since v2:
- Removed ifdef CONFIG_OF_OVERLAY (since for now it's required)
- Created a documentation entry
- Slight rewording in Kconfig
Changes since v1:
- of_resolve() -> of_resolve_phandles().
Originally-signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: Fix build errors on other platforms
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: fix build error
There is an error when compiling rpi-4.6.y branch:
CC drivers/of/configfs.o
drivers/of/configfs.c:291:21: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.default_groups = of_cfs_def_groups,
^
drivers/of/configfs.c:291:21: note: (near initialization for 'of_cfs_subsys.su_group.default_groups.next')
The .default_groups field has been a linked list since commit
1ae1602de0.
This commit uses configfs_add_default_group to fix the problem.
Signed-off-by: Slawomir Stepien <sst@poczta.fm>
configfs: New of_overlay API
These warnings appear with GCC 7.3.0 from toolchains.bootlin.com:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c: In function ‘mgt_dispatcher’:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:734:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if(check_fwstate(pmlmepriv, WIFI_AP_STATE) == _TRUE)
^
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:739:3: note: here
case WIFI_ASSOCREQ:
^~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/rtl8192c_phycfg.c: In function ‘phy_TxPwrIdxToDbm’:
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/rtl8192c_phycfg.c:2365:10: warning: this statement may fall through [-Wimplicit-fallthrough=]
Offset = -8;
~~~~~~~^~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/rtl8192c_phycfg.c:2366:2: note: here
default:
^~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/usb/usb_halinit.c: In function ‘GetHwReg8192CU’:
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/usb/usb_halinit.c:5694:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
*((u16 *)(val)) = pHalData->BasicRateSet;
~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/hal/rtl8192c/usb/usb_halinit.c:5695:3: note: here
case HW_VAR_TXPAUSE:
^~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_linux.c: In function ‘set_group_key’:
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_linux.c:7383:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
keylen = 16;
~~~~~~~^~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_linux.c:7384:3: note: here
default:
^~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c: In function ‘set_group_key’:
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:822:11: warning: this statement may fall through [-Wimplicit-fallthrough=]
keylen = 16;
~~~~~~~^~~~
../drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:823:3: note: here
default:
^~~~~~~
None of them appear to be a real issue but it is trivial to make the
warnings go away.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
These warnings appear with GCC 6.4.0 from toolchains.bootlin.com:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c: In function ‘aes_cipher’:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1504:5: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
for (j = 0; j < 8; j++)
^~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1507:2: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘for’
payload_index = hdrlen + 8;
^~~~~~~~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c: In function ‘aes_decipher’:
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1878:5: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
for (j = 0; j < 8; j++)
^~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_security.c:1881:2: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘for’
payload_index = hdrlen + 8;
^~~~~~~~~~~~~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:5666:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
if( _rtw_memcmp(pwdinfo->rx_prov_disc_info.peerDevAddr, empty_addr, ETH_ALEN) );
^~
../drivers/net/wireless/realtek/rtl8192cu/core/rtw_mlme_ext.c:5667:6: note: ...this statement, but the latter is misleadingly indented as if it is guarded by the ‘if’
_rtw_memcpy(pwdinfo->rx_prov_disc_info.peerDevAddr, GetAddr2Ptr(pframe), ETH_ALEN);
^~~~~~~~~~~
It appears to be due to tabs versus spaces.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
suppress spurious messages
Add #if for 3.14 kernel change (#87)
Fixes compiling after changes in f663dd9aaf and 99932d4fc0. Fixes #86
Set dev_type to wlan
Fixes #23
Tentatively added support for more 8188CUS based devices.
Add support for more 8188CUS and 8192CUS devices
Add ProductId for the Netgear N150 - WNA1000M
Fixes CONFIG_CONCURRENT_MODE CONFIG_MULTI_VIR_IFACES
Fixes compatibility with 3.13
Enables warnings in the compiler and fixes some issues; reference => https://github.com/diederikdehaas/rtl8812AU
Starts device in station mode instead of monitor, fixes NetworkManager issues
Enable cfg80211 support
Fix cfg80211 for kernel >= 4.7
Fixes rtl8192cu for kernel >= 4.8
rtl8192: Fixup build
fixup: rtl8192cu fixes from milhouse
rtl8192: switch to netdev->priv_destructor()
When trying to build from the rpi-4.11.y branch, I'm getting the
following error :
drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:3464:10: error: 'struct net_device' has no member named 'destructor'
It seems to occur since this upstream commit :
cf124db566
[...]
netdev->priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev->destructor(), except for
free_netdev().
netdev->needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().
Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>
ARM64: Fix build break for RTL8187/RTL8192CU wifi
These drivers use an ASM function from the base
system to compute the ipv6 checksum. These functions
are not available on ARM64, probably because nobody
has bothered to write them. The base system does have
a generic "C" version, so a simple fix is to include
the header to use the generic version on ARM64 only.
A longer term solution would be to submit the necessary
ASM function to the upstream source.
With this change, these drivers now compile without
any errors on ARM64.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
Add non-mainline source for rtl8192cu wireless driver version v4.0.2_9000 as
this is widely used. Disable older rtlwifi driver.
8192cu needs old wireless extensions
The obsolete WIRELESS_EXT configuration is used
by the old Realtek code and is needed for AP support.
8192cu: CONFIG_AP_MODE hardcoded in autoconf.h
rtl8192c_rf6052: PHY_RFShadowRefresh(): fix off-by-one
Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org>
rtl8192cu: Add PID for D-Link DWA 131
Add a mailbox-driven backlight controller for the Raspberry Pi DSI
touchscreen display. Requires updated GPU firmware to recognise the
mailbox request.
Signed-off-by: Gordon Hollingworth <gordon@raspberrypi.org>
Add Raspberry Pi firmware driver to the dependencies of backlight driver
Otherwise the backlight driver fails to build if the firmware
loading driver is not in the kernel
Signed-off-by: Alex Riesen <alexander.riesen@cetitec.com>
AudioInjector Octo: sample rates, regulators, reset
This patch adds new sample rates to the Audioinjector Octo sound card. The
new supported rates are (in kHz) :
96, 48, 32, 24, 16, 8, 88.2, 44.1, 29.4, 22.05, 14.7
Reference the bcm270x DT regulators in the overlay.
This patch adds a reset GPIO for the AudioInjector.net octo sound card.
Audioinjector octo : Make the playback and capture symmetric
This patch ensures that the sample rate and channel count of the audioinjector
octo sound card are symmetric.
Fe-Pi Audio Sound Card is based on NXP SGTL5000 codec.
The mechanical specification of the board is the same as the Raspberry Pi Zero.
3.5mm jacks for Headphone/Mic, Line In, and Line Out.
Signed-off-by: Henry Kupis <fe-pi@cox.net>
Signed-off-by: Miquel Blauw <info@dionaudio.nl>
ASoC: dionaudio_loco-v2: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
Note: due to problems with deferred probing of regulators
the following softdep should be added to a modprobe.d file
softdep arizona-spi pre: arizona-ldo1
Signed-off-by: Matthias Reichl <hias@horus.com>
Pisound dynamic overlay (#1760)
Restructuring pisound-overlay.dts, so it can be loaded and unloaded dynamically using dtoverlay.
Print a logline when the kernel module is removed.
pisound improvements:
* Added a writable sysfs object to enable scripts / user space software
to blink MIDI activity LEDs for variable duration.
* Improved hw_param constraints setting.
* Added compatibility with S16_LE sample format.
* Exposed some simple placeholder volume controls, so the card appears
in volumealsa widget.
Add missing dependency: SND_PISOUND selects SND_RAWMIDI.
Without it the Pisound module fails to compile.
See https://github.com/raspberrypi/linux/issues/2366
Updates for Pisound module code:
* Merged 'Fix a warning in DEBUG builds' (1c8b82b).
* Updating some strings and copyright information.
* Fix for handling high load of MIDI input and output.
* Use dual rate oversampling ratio for 96kHz instead of single
rate one.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fixing memset call in pisound.c
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
Fix for Pisound's MIDI Input getting blocked for a while in rare cases.
There was a possible race condition which could lead to the input FIFO
queue being underflowed, causing a high amount of processing in the worker
thread for some period of time.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
The Piano DAC 2.1 has support for 4 channels with subwoofer.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
Add clock changes and mute gpios (#1938)
Also improve code style and adhere to ALSA coding conventions.
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>
PianoPlus: Dual Mono & Dual Stereo features added (#2069)
allo-piano-dac-plus: Master volume added + fixes
Master volume added, which controls both DACs' volumes.
See: https://github.com/raspberrypi/linux/pull/2149
Also fix initial max volume, default mode value, and unmute.
Signed-off-by: allocom <sparky-dev@allo.com>
ASoC: allo-piano-dac-plus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
Add initial 2 channel (stereo) support for Allo Piano DAC (2.0/2.1) boards,
using allo-piano-dac-pcm512x-audio overlay and allo-piano-dac ALSA ASoC
machine driver.
NB. The initial support is 2 channel (stereo) ONLY!
(The Piano DAC 2.1 will only support 2 channel (stereo) left/right output,
pending an update to the upstream pcm512x codec driver, which will have
to be submitted via upstream. With the initial downstream support,
provided by this patch, the Piano DAC 2.1 subwoofer outputs will
not function.)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Signed-off-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
ASoC: allo-piano-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params and ops as they are no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
Support IQAudIO Digi board with iqaudio_digi machine driver and
iqaudio-digi-wm8804-audio overlay.
NB. Machine driver is a cut and paste of hifiberry_digi code, with format
and general cleanup to comply with kernel coding standards.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Contains the sound/soc/bcm ALSA machine driver and necessary alterations to the Kconfig and Makefile.
Adds the dts overlay and updates the Makefile and README.
Updates the relevant defconfig files to enable building for the Raspberry Pi.
Thanks to Phil Elwell (pelwell) for the review, simple-card concepts and discussion. Thanks to Clive Messer for overlay naming suggestions.
Added support for headphones, microphone and bclk_ratio settings.
This patch adds headphone and microphone capability to the Audio Injector sound card. The patch also sets the bit clock ratio for use in the bcm2835-i2s driver. The bcm2835-i2s can't handle an 8 kHz sample rate when the bit clock is at 12 MHz because its register is only 10 bits wide which can't represent the ch2 offset of 1508. For that reason, the rate constraint is added.
This commit adds basic support for the codec usage including: Device tree overlay,
binding I2S bus and setting I2S mode, clock source and frequency setting according
to spec.
Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
justboom-dac: Adjust for ALSA API change
As of 4.4, snd_soc_limit_volume now takes a struct snd_soc_card *
rather than a struct snd_soc_codec *.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
ASoC: justboom-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Also remove hw_params as it's no longer needed.
Signed-off-by: Matthias Reichl <hias@horus.com>
The driver contains a low-level hardware driver for the TAS5713 and the
drivers for the Raspberry Pi I2S subsystem.
TAS5713: return error if initialisation fails
The existing TAS5713 driver logs errors during initialisation, but does not return
an error code. Therefore even if initialisation fails, the driver will still be
loaded, but won't work. This patch fixes this: an I2C communication error will now
be reported correctly by a non-zero return code.
HiFiBerry Amp: fix device-tree problems
Some code to load the driver based on device-tree-overlays was missing. This is added by this patch.
The driver is based on the HiFiBerry DAC driver. However HiFiBerry DAC+ uses
a different codec chip (PCM5122), therefore a new driver is necessary.
Add support for the HiFiBerry DAC+ Pro.
The HiFiBerry DAC+ and DAC+ Pro products both use the existing bcm sound driver, with the DAC+ Pro having a special clock device driver representing the two high-precision oscillators.
An additional bug fix is included for the PCM512x codec, whereby the physical size of the sample frame is used in the calculation of the LRCK divisor, as it was found to be wrong when using 24-bit samples contained in a little-endian 4-byte sample frame.
Limit PCM512x "Digital" gain to 0dB by default with HiFiBerry DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Add dt param to force HiFiBerry DAC+ Pro into slave mode
"dtoverlay=hifiberry-dacplus,slave"
Add 'slave' param to use HiFiBerry DAC+ Pro in slave mode,
with Pi as master for bit and frame clock.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Fixed a bug when using 352.8kHz sample rate
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
ASoC: pcm512x: revert downstream changes
This partially reverts commit 185ea05465
which was added by https://github.com/raspberrypi/linux/pull/1152
The downstream pcm512x changes caused a regression, it broke normal
use of the 24bit format with the codec, eg when using simple-audio-card.
The actual bug with 24bit playback is the incorrect usage
of physical_width in various drivers in the downstream tree
which causes 24bit data to be transmitted with 32 clock
cycles. So it's not the pcm512x that needs fixing, it's the
soundcard drivers.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
ASoC: hifiberry_dacplus: transmit S24_LE with 64 BCLK cycles
Signed-off-by: Matthias Reichl <hias@horus.com>
Set a limit of 0dB on Digital Volume Control
The main volume control in the PCM512x DAC has a range up to
+24dB. This is dangerously loud and can potentially cause massive
clipping in the output stages. Therefore this sets a sensible
limit of 0dB for this control.
Allow up to 24dB digital gain to be applied when using IQAudIO DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Modify IQAudIO DAC+ ASoC driver to set card/dai config from dt
Add the ability to set the card name, dai name and dai stream name, from
dt config.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
IQaudIO: auto-mute for AMP+ and DigiAMP+
IQAudIO amplifier mute via GPIO22. Add dt params for "one-shot" unmute
and auto mute.
Revision 2, auto mute implementing HiassofT suggestion to mute/unmute
using set_bias_level, rather than startup/shutdown....
"By default DAPM waits 5 seconds (pmdown_time) before shutting down
playback streams so a close/stop immediately followed by open/start
doesn't trigger an amp mute+unmute."
Tested on both AMP+ (via DAC+) and DigiAMP+, with both options...
dtoverlay=iqaudio-dacplus,unmute_amp
"one-shot" unmute when kernel module loads.
dtoverlay=iqaudio-dacplus,auto_mute_amp
Unmute amp when ALSA device opened by a client. Mute, with 5 second delay
when ALSA device closed. (Re-opening the device within the 5 second close
window, will cancel mute.)
Revision 4, using gpiod.
Revision 5, clean-up formatting before adding mute code.
- Convert tab plus 4 space formatting to 2x tab
- Remove '// NOT USED' commented code
Revision 6, don't attempt to "one-shot" unmute amp, unless card is
successfully registered.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
ASoC: iqaudio-dac: fix S24_LE format
Remove set_bclk_ratio call so 24-bit data is transmitted in
24 bclk cycles.
Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Daniel Matuschek <daniel@matuschek.net>
Add a parameter to turn off SPDIF output if no audio is playing
This patch adds the parameter auto_shutdown_output to the kernel module.
Default behaviour of the module is the same, but when auto_shutdown_output
is set to 1, the SPDIF output will shut down if no stream is playing.
Bugfix for 32kHz sample rate, which was missing.
HiFiBerry Digi: set SPDIF status bits for sample rate
The HiFiBerry Digi driver did not signal the sample rate in the SPDIF status bits.
While this is optional, some DACs and receivers do not accept this signal. This patch
adds the sample rate bits in the SPDIF status block.
Added HiFiBerry Digi+ Pro driver
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
This adds a machine driver for the HifiBerry DAC.
It is a sound card that can
be stacked onto the Raspberry Pi.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
PCM512x can accept data padded with additional BCLK cycles
but the driver currently lacks an interface to configure this.
This leads to the problem that S24_LE format in master mode
can result in non-integer clock divisors and pcm512x running
at a rather off rate.
For example 48kHz with 48fs BCLK and SCLK at 24.576MHz uses
a divisor of 10 (rounded down from 10.6666) and results in a
51.2kHz LRCLK. With 64fs BCLK a divisor of 8 is used and
LRCLK runs at exactly 48kHz.
Fix this by providing a minimal set_tdm_slot implementation
so machine drivers can optionally configure custom BCLK ratios.
Signed-off-by: Matthias Reichl <hias@horus.com>
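A small helper (illustrative only) reproducing the arithmetic above:

/*
 * With SCK = 24.576 MHz and a 48 kHz rate:
 *   48fs BCLK: 24576000 / (48000 * 48) = 10.66..., truncated to 10,
 *              so LRCLK = 24576000 / (10 * 48) = 51200 Hz.
 *   64fs BCLK: 24576000 / (48000 * 64) = 8 exactly,
 *              so LRCLK = 24576000 / (8 * 64)  = 48000 Hz.
 */
static unsigned int demo_lrclk_hz(unsigned int sck_hz, unsigned int rate_hz,
                                  unsigned int bclk_ratio)
{
        unsigned int div = sck_hz / (rate_hz * bclk_ratio); /* integer divisor */

        return sck_hz / (div * bclk_ratio);
}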
The Raspberry Pi firmware manages the power-down and reboot
process. To do this it installs a pm_power_off handler, causing
the gpio-poweroff module to abort the probe function.
This patch introduces a "force" DT property that overrides that
behaviour, and also adds a DT overlay to enable and control it.
Note that running in an active-low configuration (DT parameter
"active_low") requires a custom dt-blob.bin and probably won't
allow a reboot without switching off, so an external inversion
of the trigger signal may be preferable.
Provide a __copy_from_user that uses memcpy. On BCM2708, use
optimised memcpy/memmove/memcmp/memset implementations.
arch/arm: Add mmiocpy/set aliases for memcpy/set
See: https://github.com/raspberrypi/linux/issues/1082
copy_from_user: CPU_SW_DOMAIN_PAN compatibility
The downstream copy_from_user acceleration must also play nice with
CONFIG_CPU_SW_DOMAIN_PAN.
See: https://github.com/raspberrypi/linux/issues/1381
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Fix driver detection failure: check that the buffer response is non-zero, meaning the touchscreen was detected.
rpi-ft5406: Use firmware API
RPI-FT5406: Enable aarch64 support through explicit iomem interface
Signed-off-by: Gerhard de Clercq <gerharddeclercq@outlook.com>
Based on the patch authored by Ali Gholami Rudi at
https://lkml.org/lkml/2009/7/13/153
Provide an ioctl for userspace applications, but only if this operation
is hardware accelerated (otherwise it does not make any sense).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: Add ioctl for reading gpu memory through dma
The "input" trigger makes the associated GPIO an input. This is to support
the Raspberry Pi PWR LED, which is driven by external hardware in normal use.
N.B. pwr_led is not available on Model A or B boards.
leds-gpio: Implement the brightness_get method
The power LED uses some clever logic that means it is driven
by a voltage measuring circuit when configured as input, otherwise
it is driven by the GPIO output value. This patch wires up the
brightness_get method for leds-gpio so that user-space can monitor
the LED value via /sys/class/gpio/led1/brightness. Using the input
trigger this returns an indication of the system power health,
otherwise it is just whatever value the trigger has written most
recently.
See: https://github.com/raspberrypi/linux/issues/1064
Add the bare minimum needed to boot BCM2708 from a Device Tree.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
BCM2708: DT: change 'axi' nodename to 'soc'
Change DT node named 'axi' to 'soc' so it matches ARCH_BCM2835.
The VC4 bootloader fills in certain properties in the 'axi' subtree,
but since this is part of an upstreaming effort, the name is changed.
Signed-off-by: Noralf Tronnes notro@tronnes.org
BCM2708_DT: Correct length of the peripheral space
Use dts-dirs feature for overlays.
The kernel makefiles have a dts-dirs target that is for vendor subdirectories.
Using this fixes the install_dtbs target, which previously did not install the overlays.
BCM270X_DT: configure I2S DMA channels
Signed-off-by: Matthias Reichl <hias@horus.com>
BCM270X_DT: switch to bcm2835-i2s
I2S soundcard drivers with proper devicetree support (i.e. not linking
to the cpu_dai/platform via name but to cpu/platform via of_node)
will work out of the box without any modifications.
When the kernel is compiled without devicetree support the platform
code will instantiate the bcm2708-i2s driver and I2S soundcard drivers
will link to it via name, as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
SDIO-overlay: add poll_once-boolean parameter
Add a parameter to toggle SDIO device polling between
every second and once at boot time.
Signed-off-by: Patrick Boettcher <patrick.boettcher@posteo.de>
BCM270X_DT: Make mmc overlay compatible with current firmware
The original DT overlay logic followed a merge-then-patch procedure,
i.e. parameters are applied to the loaded overlay before the overlay
is merged into the base DTB. This sequence has been changed to
patch-then-merge, in order to support parameterised node names, and
to protect against bad overlays. As a result, overrides (parameters)
must only target labels in the overlay, but the overlay can obviously target nodes in the base DTB.
mmc-overlay.dts (that switches back to the original mmc sdcard
driver) is the only overlay violating that rule, and this patch
fixes it.
bcm270x_dt: Use the sdhost MMC controller by default
The "mmc" overlay reverts to using the other controller.
squash: Add cprman to dt
BCM270X_DT: Use clk_core for I2C interfaces
BCM270X_DT: Use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi
The mainline Device Tree files are quite close to downstream now.
Let's use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi as base files
for our dts files.
Mainline dts files are based on these files:
  bcm2835-rpi.dtsi
  bcm2835.dtsi bcm2836.dtsi
  bcm283x.dtsi
Current downstream are based on these:
  bcm2708.dtsi bcm2709.dtsi bcm2710.dtsi
  bcm2708_common.dtsi
This patch introduces this dependency:
  bcm2708.dtsi bcm2709.dtsi
  bcm2708-rpi.dtsi
  bcm270x.dtsi
  bcm2835.dtsi bcm2836.dtsi
  bcm283x.dtsi
And:
  bcm2710.dtsi
  bcm2708-rpi.dtsi
  bcm270x.dtsi
  bcm283x.dtsi
bcm270x.dtsi contains the downstream bcm283x.dtsi diff.
bcm2708-rpi.dtsi is the downstream version of bcm2835-rpi.dtsi.
Other changes:
- The led node has moved from /soc/leds to /leds. This is not a problem
since the label is used to reference it.
- The clk_osc reg property changes from 6 to 3.
- The gpu nodes have their interrupt property set in the base file.
- the clocks label does not point to the /clocks node anymore, but
points to the cprman node. This is not a problem since the overlays
that use the clock node refer to it directly: target-path = "/clocks";
- some nodes now have 2 labels since mainline and downstream differs in
this respect: cprman/clocks, spi0/spi, gpu/vc4.
- some nodes don't have an explicit status = "okay" since they're not
disabled in the base file: watchdog and random.
- gpiomem doesn't need an explicit status = "okay".
- bcm2708-rpi-cm.dts got the hpd-gpios property from bcm2708_common.dtsi;
it's now set directly in that file.
- bcm2709-rpi-2-b.dts has the timer node moved from /soc/timer to /timer.
- Removed clock-frequency property on the bcm{2709,2710}.dtsi timer nodes.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Use raspberrypi-power to turn on USB power
Use the raspberrypi-power driver to turn on USB power.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Add a .dtbo target, use for overlays
Change the filenames and extensions to keep the pre-DDT style of
overlay (<name>-overlay.dtb) distinct from new ones that use a
different style of local fixups (<name>.dtbo), and to match other
platforms.
The RPi firmware uses the DDTK trailer atom to choose which type of
overlay to use for each kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Don't generate "linux,phandle" props
The EPAPR standard says to use "phandle" properties to store phandles,
rather than the deprecated "linux,phandle" version. By default, dtc
generates both, but adding "-H epapr" causes it to only generate
"phandle"s, saving some space and clutter.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add overlay for enc28j60 on SPI2
Works on SPI2 for compute module
BCM270X_DT: Add midi-uart0 overlay
MIDI requires 31.25kbaud, a baudrate unsupported by Linux. The
midi-uart0 overlay configures uart0 (ttyAMA0) to use a fake clock
so that requesting 38.4kbaud actually gets 31.25kbaud.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
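The trick in numbers, assuming a 48 MHz UART clock (the real clock, and the
exact value used by the overlay, may differ):

/*
 * The PL011 derives its baud divisor from the clock-frequency given in DT.
 * If DT advertises a clock scaled by 38400/31250, a request for 38.4 kbaud
 * programs a divisor that yields 31.25 kbaud on the real clock, e.g.:
 *   fake   = 48000000 * 38400 / 31250    = 58982400 Hz
 *   actual = 48000000 * 38400 / 58982400 = 31250 baud
 */
static unsigned long demo_midi_fake_clock(unsigned long real_hz)
{
        return (unsigned long)((unsigned long long)real_hz * 38400 / 31250);
}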
BCM270X_DT: Add i2c-sensor overlay
The i2c-sensor overlay is a container for various pressure and
temperature sensors, currently bmp085 and bmp280. The standalone
bmp085_i2c-sensor overlay is now deprecated.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: overlays/*-overlay.dtb -> overlays/*.dtbo (#1752)
We now create overlays as .dtbo files.
build: support for .dtbo files for dtb overlays
Kernel 4.4.6+ on Raspberry Pi supports .dtbo files for overlays, instead of .dtb.
Patch the kernel, which has faulty rules to generate .dtbo the way yocto does
Signed-off-by: Herve Jourdain <herve.jourdain@neuf.fr>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
BCM270X: Drop position requirement for CMA in VC4 overlay.
No longer necessary since 2aefcd5761,
and will probably help people that want to choose a larger CMA
allocation (particularly on pi0/1).
Signed-off-by: Eric Anholt <eric@anholt.net>
BCM270X_DT: RPi Device Tree tidy
Use the upstream sdhost node, add thermal-zones, and factor out some
common elements.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
kbuild: Silence unhelpful DTC warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi firmware looks for a trailer on the kernel image to
determine whether it was compiled with Device Tree support enabled.
If the firmware finds a kernel without this trailer, or which has a
trailer indicating that it isn't DT-capable, it disables DT support
and reverts to using ATAGs.
The mkknlimg utility adds that trailer, having first analysed the
image to look for signs of DT support and the kernel version string.
knlinfo displays the contents of the trailer in the given kernel image.
scripts/mkknlimg: Add support for ARCH_BCM2835
Add a new trailer field indicating whether this is an ARCH_BCM2835
build, as opposed to MACH_BCM2708/9. If the loader finds this flag
is set it changes the default base dtb file name from bcm270x...
to bcm283y...
Also update knlinfo to show the status of the field.
scripts/mkknlimg: Improve ARCH_BCM2835 detection
The board support code contains sufficient strings to be able to
distinguish 2708 vs. 2835 builds, so remove the check for
bcm2835-pm-wdt which could exist in either.
Also, since the canned configuration is no longer built in (it's
a module), remove the config string checking.
See: https://github.com/raspberrypi/linux/issues/1157
scripts: Multi-platform support for mkknlimg and knlinfo
The firmware uses tags in the kernel trailer to choose which dtb file
to load. Current firmware loads bcm2835-*.dtb if the '283x' tag is true,
otherwise it loads bcm270*.dtb. This scheme breaks if an image supports
multiple platforms.
This patch adds '270X' and '283X' tags to indicate support for RPi and
upstream platforms, respectively. '283x' (note lower case 'x') is left
for old firmware, and is only set if the image only supports upstream
builds.
scripts/mkknlimg: Append a trailer for all input
Now that the firmware assumes an unsigned kernel is DT-capable, it is
helpful to be able to mark a kernel as being non-DT-capable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
scripts/knlinfo: Decode DDTK atom
Show the DDTK atom as being a boolean.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mkknlimg: Retain downstream-kernel detection
With the death of ARCH_BCM2708 and ARCH_BCM2709, a new way is needed to
determine if this is a "downstream" build that wants the firmware to
load a bcm27xx .dtb. The vc_cma driver is used downstream but not
upstream, making vc_cma_init a suitable predicate symbol.
mkknlimg: Find some more downstream-only strings
See: https://github.com/raspberrypi/linux/issues/1920
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
scripts: Update mkknlimg, just in case
With the removal of the vc_cma driver, mkknlimg lost an indication that
the user had built a downstream kernel. Update the script, adding a few
more key strings, in case it is still being used.
Note that mkknlimg is now deprecated, except to tag kernels as upstream
(283x), and thus requiring upstream DTBs.
See: https://github.com/raspberrypi/linux/issues/2239
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Support booting without Device Tree.
Turn on USB power.
Load driver early because of lacking support for deferred probing
in many drivers.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
firmware: bcm2835: Don't turn on USB power
The raspberrypi-power driver is now used to turn on USB power.
This partly reverts commit:
firmware: bcm2835: Support ARCH_BCM270x
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Add module for accessing the mailbox property channel through
/dev/vcio. Was previously in bcm2708-vcio.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
i2c-bcm2708: fixed baudrate
Fixed issue where the wrong CDIV value was set for baudrates below 3815 Hz (for 250MHz bus clock).
In that case the computed CDIV value was more than 0xffff. However the CDIV register width is only 16 bits.
This resulted in an incorrect setting of CDIV and a higher baudrate than intended.
Example: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0x1704 -> 42430Hz
After correction: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0xffff -> 3815Hz
The correct baudrate is shown in the log after the cdiv > 0xffff correction.
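A sketch of the clamp, using the numbers from the example above:

#include <linux/types.h>

/*
 * 250 MHz bus clock, 3500 Hz requested:
 *   cdiv = 250000000 / 3500 = 71428 = 0x11704, which overflows the 16-bit
 *   CDIV field (previously truncated to 0x1704, giving ~42430 Hz).
 * Clamping to 0xffff gives 250000000 / 0xffff = ~3815 Hz, the slowest rate
 * the hardware can manage, and the value now reported in the log.
 */
static u32 demo_cdiv(u32 bus_hz, u32 baud)
{
        u32 cdiv = bus_hz / baud;

        if (cdiv > 0xffff)
                cdiv = 0xffff;
        return cdiv;
}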
Perform I2C combined transactions when possible
Perform I2C combined transactions whenever possible, within the
restrictions of the Broadcom Serial Controller.
Disable DONE interrupt during TA poll
Prevent interrupt from being triggered if poll is missed and transfer
starts and finishes.
i2c: Make combined transactions optional and disabled by default
i2c: bcm2708: add device tree support
Add DT support to driver and add to .dtsi file.
Setup pins in .dts file.
i2c is disabled by default.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
bcm2708: don't register i2c controllers when using DT
The devices for the i2c controllers are in the Device Tree.
Only register devices when not using DT.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
I2C: Only register the I2C device for the current board revision
i2c_bcm2708: Fix clock reference counting
Fix grabbing lock from atomic context in i2c driver
2 main changes:
- check for timeouts in the bcm2708_bsc_setup function as indicated by this comment:
/* poll for transfer start bit (should only take 1-20 polls) */
This implies that the setup function can now fail so account for this everywhere it's called
- Removed the clk_get_rate call from inside the setup function as it locks a mutex and that's not ok since we call it from under a spin lock.
i2c-bcm2708: When using DT, leave the GPIO setup to pinctrl
i2c-bcm2708: Increase timeouts to allow larger transfers
Use the timeout value provided by the I2C_TIMEOUT ioctl when waiting
for completion. The default timeout is 1 second.
See: https://github.com/raspberrypi/linux/issues/260
i2c-bcm2708/BCM270X_DT: Add support for I2C2
The third I2C bus (I2C2) is normally reserved for HDMI use. Careless
use of this bus can break an attached display - use with caution.
It is recommended to disable accesses by VideoCore by setting
hdmi_ignore_edid=1 or hdmi_edid_file=1 in config.txt.
The interface is disabled by default - enable using the
i2c2_iknowwhatimdoing DT parameter.
bcm2708-spi: Don't use static pin configuration with DT
Also remove superfluous error checking - the SPI framework ensures the
validity of the chip_select value.
i2c-bcm2708: Remove non-DT support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Set the BSC_CLKT clock stretching timeout to 35ms as per SMBus specs.
Fixes i2c_bcm2708: Write to FIFO correctly - v2 (#1574)
* i2c: fix i2c_bcm2708: Clear FIFO before sending data
Make sure FIFO gets cleared before trying to send
data in case of a repeated start (COMBINED=Y).
* i2c: fix i2c_bcm2708: Only write to FIFO when not full
Check if FIFO can accept data before writing (see the sketch after this list).
To avoid a peripheral read on the last iteration of a loop,
both bcm2708_bsc_fifo_fill and ~drain are changed as well.
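A hedged sketch of the FIFO-fill shape described in the second item; the register offsets and the TXD bit value are assumptions for illustration only:

#include <linux/io.h>
#include <linux/types.h>

#define BSC_S_OFF	0x04		/* status register (assumed offset) */
#define BSC_FIFO_OFF	0x10		/* data FIFO register (assumed offset) */
#define BSC_S_TXD	(1 << 4)	/* "FIFO can accept data" (assumed bit) */

/* Check buffer exhaustion before touching the status register, so the last
 * byte does not trigger an extra peripheral read, and never write when the
 * FIFO reports full. */
static void bsc_fifo_fill_sketch(void __iomem *base, const u8 *buf,
				 unsigned int *pos, unsigned int len)
{
	while (*pos < len && (readl(base + BSC_S_OFF) & BSC_S_TXD))
		writel(buf[(*pos)++], base + BSC_FIFO_OFF);
}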
Use the clock manager instead of a self-made clock manager.
Also fix some error paths that showed up during development
(especially a missing release of DMA resources on rmmod)
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Add experimental support for the VideoCore shared memory service.
This allows user processes to allocate memory from VideoCore's
GPU relocatable heap and mmap the buffers. Additionally, the memory
handles can be passed to other VideoCore services such as MMAL, OpenMax
and DispmanX.
TODO
* This driver was originally released for BCM28155 which has a different
cache architecture to BCM2835. Consequently, in this release only
uncached mappings are supported. However, there's no fundamental
reason why cached mappings cannot be supported on BCM2835.
* More refactoring is required to remove the typedefs.
* Re-enable some of the commented-out debug-fs statistics which were
disabled when migrating code from proc-fs.
* There's a lot of code to support sharing of VCSM in order to support
Android. This could probably be done more cleanly or perhaps just
removed.
Signed-off-by: Tim Gover <timgover@gmail.com>
config: Disable VC_SM for now to fix hang with cutdown kernel
vcsm: Use boolean as it cannot be built as module
On building the bcm_vc_sm as a module we get the following error:
v7_dma_flush_range and do_munmap are undefined in vc-sm.ko.
Fix by making it not an option to build as module
vcsm: Add ioctl for custom cache flushing
vc-sm: Move headers out of arch directory
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
vcsm: Treat EBUSY as success rather than SIGBUS
Currently, if two cores access the same page concurrently, one will return VM_FAULT_NOPAGE
and the other VM_FAULT_SIGBUS, crashing the user code.
Also report when mapping fails.
Signed-off-by: popcornmix <popcornmix@gmail.com>
vcsm: Provide new ioctl to clean/invalidate a 2D block
vcsm: Convert to loading via device tree.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
VCSM: New option to import a DMABUF for VPU use
Takes a dmabuf, and then calls over to the VPU to wrap
it into a suitable handle.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
vcsm: fix multi-platform build
vcsm: add macros for cache functions
vcsm: use dma APIs for cache functions
* Will handle multi-platform builds
vcsm: Fix up macros to avoid breaking numbers used by existing apps
vcsm: Define cache operation constants in user header
Without this change, users have to use raw values (1, 2, 3) to specify
cache operation.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Support for finding user/vc handle in memory pool
vmcs_sm_{usr,vc}_handle_from_pid_and_address() failed to find a
handle if the specified user pointer was not exactly the one that the memory
locking call returned, even if the pointer was within the range of the
map/resource. So the functions are fixed to match against the whole range.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Unify cache manipulating functions
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Fix obscure conditions
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Fix memory leaking on clean_invalid2 ioctl handler
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Describe the use of cache operation constants
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Fix obscure conditions again
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Add no-op cache operation constant
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Revert to do page-table-walk-based cache manipulating on some ioctl calls
On the FLUSH, INVALID and CLEAN_INVALID ioctl calls, cache operations based on
a page-table walk were used to handle the case where the buffer is not pinned.
So revert these calls to page-table-walk-based cache manipulation.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
vcsm: Define cache operation constants in user header
Without this change, users have to use raw values (1, 2, 3) to specify
cache operation.
Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
BCM270x: Move vc_mem
Make the vc_mem module available for ARCH_BCM2835 by moving it.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM2835 has two SD card interfaces. This driver uses the other one.
bcm2835-sdhost: Error handling fix, and code clarification
bcm2835-sdhost: Adding overclocking option
Allow a different clock speed to be substituted for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the sdhost interface is restricted to integer divisions of
core_freq, and the highest sensible option for a core_freq of 250MHz
is 84 (250/3 = 83.3MHz), the next being 125 (250/2) which is much too
high.
Use at your own risk.
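The divisor restriction above can be illustrated with a quick standalone calculation (one plausible rounding is used here; the next entry adjusts the driver's rounding so that e.g. 62 selects 62.5MHz):

#include <stdio.h>

int main(void)
{
	const unsigned int core_hz = 250000000;
	const unsigned int requests_mhz[] = { 50, 63, 84, 125 };

	for (int i = 0; i < 4; i++) {
		unsigned int req_hz = requests_mhz[i] * 1000000;
		/* smallest integer divisor whose result does not exceed the request */
		unsigned int div = (core_hz + req_hz - 1) / req_hz;

		printf("overclock_50=%3u -> divisor %u -> %.1f MHz\n",
		       requests_mhz[i], div, core_hz / (double)div / 1e6);
	}
	return 0;
}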
bcm2835-sdhost: Round up the overclock, so 62 works for 62.5MHz
Also only warn once for each overclock setting.
bcm2835-sdhost: Improve error handling and recovery
1) Expose the hw_reset method to the MMC framework, removing many
internal calls by the driver.
2) Reduce overclock setting on error.
3) Increase timeout to cope with high capacity cards.
4) Add properties and parameters to control pio_limit and debug.
5) Reduce messages at probe time.
bcm2835-sdhost: Further improve overclock back-off
bcm2835-sdhost: Clear HBLC for PIO mode
Also update pio_limit default in overlay README.
bcm2835-sdhost: Add the ERASE capability
See: https://github.com/raspberrypi/linux/issues/1076
bcm2835-sdhost: Ignore CRC7 for MMC CMD1
It seems that the sdhost interface returns CRC7 errors for CMD1,
which is the MMC-specific SEND_OP_COND. Returning these errors to
the MMC layer causes a downward spiral, but ignoring them seems
to be harmless.
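A hedged sketch of the filtering idea (the MMC core reports CRC errors as -EILSEQ; exactly where the driver suppresses the error is an assumption here):

#include <linux/errno.h>
#include <linux/mmc/core.h>	/* struct mmc_command */
#include <linux/mmc/mmc.h>	/* MMC_SEND_OP_COND (CMD1) */

static void sdhost_ignore_cmd1_crc_sketch(struct mmc_command *cmd)
{
	/* CRC7 failures on SEND_OP_COND are treated as harmless noise. */
	if (cmd->opcode == MMC_SEND_OP_COND && cmd->error == -EILSEQ)
		cmd->error = 0;
}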
bcm2835-mmc/sdhost: Remove ARCH_BCM2835 differences
The bcm2835-mmc driver (and the -sdhost driver that was copied from it)
contains code to handle SDIO interrupts in a threaded interrupt
handler rather than waking the MMC framework thread. The change
follows a patch from Russell King that adds the facility as the
preferred way of working.
However, the new code path is only present in ARCH_BCM2835
builds, which I have taken to be a way of testing the waters
rather than making the change across the board; I can't see
any technical reason why it wouldn't be enabled for MACH_BCM270X
builds. So this patch standardises on the ARCH_BCM2835 code,
removing the old code paths.
bcm2835-sdhost: Don't log timeout errors unless debug=1
The MMC card-discovery process generates timeouts. This is
expected behaviour, so reporting it to the user serves no purpose.
Suppress the reporting of timeout errors unless the debug flag
is on.
bcm2835-sdhost: Add workaround for odd behaviour on some cards
For reasons not understood, the sdhost driver fails when reading
sectors very near the end of some SD cards. The problem could
be related to the similar issue that reading the final sector
of any card as part of a multiple read never completes, and the
workaround is an extension of the mechanism introduced to solve
that problem which ensures those sectors are always read singly.
bcm2835-sdhost: Major revision
This is a significant revision of the bcm2835-sdhost driver. It
improves on the original in a number of ways:
1) Through the use of CMD23 for reads it appears to avoid problems
reading some sectors on certain high speed cards.
2) Better atomicity to prevent crashes.
3) Higher performance.
4) Activity logging included, for easier diagnosis in the event
of a problem.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Restore ATOMIC flag to PIO sg mapping
Allocation problems have been seen in a wireless driver, and
this is the only change which might have been responsible.
SQUASH: bcm2835-sdhost: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-sdhost driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Workaround for "slow" sectors
Some cards have been seen to cause timeouts after certain sectors are
read. This workaround enforces a minimum delay between the stop after
reading one of those sectors and a subsequent data command.
Using CMD23 (SET_BLOCK_COUNT) avoids this problem, so good cards will
not be penalised by this workaround.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
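A hedged sketch of the minimum-delay mechanism; the structure, field names and the 5 ms figure are illustrative assumptions, not the driver's actual values:

#include <linux/jiffies.h>
#include <linux/delay.h>

struct slow_sector_guard_sketch {
	unsigned long earliest_next_cmd;	/* in jiffies */
};

/* Called after the stop that follows a read of a known "slow" sector. */
static void note_slow_sector_sketch(struct slow_sector_guard_sketch *g)
{
	g->earliest_next_cmd = jiffies + msecs_to_jiffies(5);
}

/* Called before issuing the next data command (sleepable context assumed). */
static void wait_after_slow_sector_sketch(struct slow_sector_guard_sketch *g)
{
	while (time_before(jiffies, g->earliest_next_cmd))
		usleep_range(100, 200);
}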
bcm2835-sdhost: Firmware manages the clock divisor
The bcm2835-sdhost driver hands control of the CDIV clock divisor
register to matching firmware, allowing it to adjust to a changing
core clock. This removes the need to use the performance governor or
to enable io_is_busy on the on-demand governor in order to get the
best SD performance.
N.B. As SD clocks must be an integer divisor of the core clock, it is
possible that the SD clock for "turbo" mode can be different (even
lower) than "normal" mode.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Reset the clock in task context
Since reprogramming the clock can now involve a round-trip to the
firmware it must not be done at atomic context, and a tasklet
is not a task.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Don't exit cmd wait loop on error
The FAIL flag can be set in the CMD register before command processing
is complete, leading to spurious "failed to complete" errors. This has
the effect of promoting harmless CRC7 errors during CMD1 processing
into errors that can delay and even prevent booting.
Also:
1) Convert the last KERN_ERROR message in the register dumping to
KERN_INFO.
2) Remove an unnecessary reset call from bcm2835_sdhost_add_host.
See: https://github.com/raspberrypi/linux/pull/1492
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: mmc_card_blockaddr fix
Get the definition of mmc_card_blockaddr from drivers/mmc/core/card.h.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: New timer API
mmc: bcm2835-sdhost: Support underclocking
Support underclocking of the SD bus in two ways:
1. using the max-frequency DT property (which currently has no DT
parameter), and
2. using the existing sd_overclock parameter.
The two methods differ slightly - in the former the MMC subsystem is
aware of the underclocking, while in the latter it isn't - but the
end results should be the same.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: bcm2835-sdhost: Add include
highmem.h (needed for kmap_atomic) is pulled in by one of the other
include files, but only with some CONFIG settings. Make the inclusion
explicit to cater for cases where the CONFIG setting is absent.
See: https://github.com/raspberrypi/linux/issues/2366
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: Disable CMD23 transfers on all cards
Pending wire-level investigation of these types of transfers
and associated errors on bcm2835-mmc, disable for now. Fallback of
CMD18/CMD25 transfers will be used automatically by the MMC layer.
Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org>
mmc: bcm2835-mmc: enable DT support for all architectures
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Enable Device Tree support for all architectures.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
mmc: bcm2835-mmc: fix probe error handling
Probe error handling is broken in several places.
Simplify error handling by using device managed functions.
Replace pr_{err,info} with dev_{err,info}.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Add locks when accessing sdhost registers
bcm2835-mmc: Add range of debug options for slowing things down
bcm2835-mmc: Add option to disable some delays
bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Adding overclocking option
Allow a different clock speed to be substituted for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the mmc interface is restricted to EVEN integer divisions of
250MHz, and the highest sensible option is 63 (250/4 = 62.5), the
next being 125 (250/2) which is much too high.
Use at your own risk.
bcm2835-mmc: Round up the overclock, so 62 works for 62.5MHz
Also only warn once for each overclock setting.
mmc: bcm2835-mmc: Make available on ARCH_BCM2835
Make the bcm2835-mmc driver available for use on ARCH_BCM2835.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-mmc entry
Add Device Tree entry for bcm2835-mmc.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Don't overwrite MMC capabilities from DT
bcm2835-mmc: Don't override bus width capabilities from devicetree
Take out the force setting of the MMC_CAP_4_BIT_DATA host capability
so that the result read from devicetree via mmc_of_parse() is
preserved.
bcm2835-mmc: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-mmc driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-mmc: New timer API
mmc: bcm2835-mmc: Support underclocking
Support underclocking of the SD bus using the max-frequency DT property
(which currently has no DT parameter). The sd_overclock parameter
already provides another way to achieve the same thing which should be
equivalent in end result, but it is a bug not to support max-frequency
as well.
See: https://github.com/raspberrypi/linux/issues/2350
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add support for DMA controller of BCM2708 as used in the Raspberry Pi.
Currently it only supports cyclic DMA.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
dmaengine: expand functionality by supporting scatter/gather transfers
sdhci-bcm2708 and dma.c: fix for LITE channels
DMA: fix cyclic LITE length overflow bug
dmaengine: bcm2708: Remove chancnt affectations
Mirror bcm2835-dma.c commit 9eba5536a7:
chancnt is already filled by dma_async_device_register, which uses the channel
list to know how many channels there are.
Since it's already filled, we can safely remove it from the drivers' probe
function.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: overwrite dreq only if it is not set
dreq is set when the DMA channel is fetched from Device Tree.
slave_id is set using dmaengine_slave_config().
Only overwrite dreq with slave_id if it is not set.
dreq/slave_id in the cyclic DMA case is not touched, because I don't
have hardware to test with.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
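A hedged sketch of the described precedence (the helper and its parameters are placeholders, not the driver's actual names):

#include <linux/types.h>

/* Keep a dreq that came from Device Tree; only fall back to the slave_id
 * passed via dmaengine_slave_config() when no dreq has been set yet. */
static void set_dreq_sketch(u32 *dreq, u32 slave_id)
{
	if (*dreq == 0 && slave_id != 0)
		*dreq = slave_id;
}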
dmaengine: bcm2708: do device registration in the board file
Don't register the device in the driver. Do it in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Add Device Tree support to the non ARCH_BCM2835 case.
Use the same driver name regardless of architecture.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-dma entry
Add Device Tree entry for bcm2835-dma.
The entry doesn't contain any resources since they are handled
by the arch/arm/mach-bcm270x/dma.c driver.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine: Add debug options
BCM270x: Add memory and irq resources to dmaengine device and DT
Prepare for merging of the legacy DMA API arch driver dma.c
with bcm2708-dmaengine by adding memory and irq resources both
to platform file device and Device Tree node.
Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c
Merge the legacy DMA API driver with bcm2708-dmaengine.
This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox
driver is also needed).
Changes to the dma.c code:
- Use BIT() macro.
- Cutdown some comments to one line.
- Add mutex to vc_dmaman and use this, since the dev lock is locked
during probing of the engine part.
- Add global g_dmaman variable since drvdata is used by the engine part.
- Restructure for readability:
vc_dmaman_chan_alloc()
vc_dmaman_chan_free()
bcm_dma_chan_free()
- Restructure bcm_dma_chan_alloc() to simplify error handling.
- Use device irq resources instead of hardcoded bcm_dma_irqs table.
- Remove dev_dmaman_register() and code it directly.
- Remove dev_dmaman_deregister() and code it directly.
- Simplify bcm_dmaman_probe() using devm_* functions.
- Get dmachans from DT if available.
- Keep 'dma.dmachans' module argument name for backwards compatibility.
Make it available on ARCH_BCM2835 as well.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: set residue_granularity field
bcm2708-dmaengine supports residue reporting at burst level
but didn't report this via the residue_granularity field.
Without this field set properly we get playback issues with I2S cards.
dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer
bcm2708-dmaengine: Use more DMA channels (but not 12)
1) Only the bcm2708_fb driver uses the legacy DMA API, and
it requires a BULK-capable channel, so all other types
(FAST, NORMAL and LITE) can be made available to the regular
DMA API.
2) DMA channels 11-14 share an interrupt. The driver can't
handle this, so don't use channels 12-14 (12 was used, probably
because it appears to have an interrupt, but in reality that
interrupt is for activity on ANY channel). This may explain
a lockup encountered when running out of DMA channels.
The combined effect of this patch is to leave 7 DMA channels
available + channel 0 for bcm2708_fb via the legacy API.
See: https://github.com/raspberrypi/linux/issues/1110
See: https://github.com/raspberrypi/linux/issues/1108
dmaengine: bcm2708: Make legacy API available for bcm2835-dma
bcm2708_fb uses the legacy DMA API, so in order to start using
bcm2835-dma, bcm2835-dma has to support the legacy API. Make this
possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove().
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Change DT compatible string
Both bcm2835-dma and bcm2708-dmaengine have the same compatible string.
So change compatible to "brcm,bcm2708-dma".
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Remove driver but keep legacy API
Dropping non-DT support means we don't need this driver,
but we still need the legacy DMA API.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine - Fix arm64 portability/build issues
dma-bcm2708: Fix module compilation of CONFIG_DMA_BCM2708
bcm2708-dmaengine.c defines functions like bcm_dma_start which are
defined as well in dma-bcm2708.h as inline versions when
CONFIG_DMA_BCM2708 is not defined. This works fine when
CONFIG_DMA_BCM2708 is built in, but when it is selected as a module the build
fails with redefinition errors, because the build system then defines
CONFIG_DMA_BCM2708_MODULE instead of CONFIG_DMA_BCM2708.
This patch makes the header use CONFIG_DMA_BCM2708_MODULE too when
available.
Fixes https://github.com/raspberrypi/linux/issues/2056
Signed-off-by: Andrei Gherzan <andrei@gherzan.com>
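A hedged sketch of the header guard change (Kconfig defines CONFIG_FOO_MODULE for tristate symbols built as modules; the same effect is often written as #if IS_ENABLED(CONFIG_DMA_BCM2708) using <linux/kconfig.h>):

/* The modular build defines CONFIG_DMA_BCM2708_MODULE rather than
 * CONFIG_DMA_BCM2708, so the header must accept either before exposing the
 * real prototypes instead of the inline fallbacks. */
#if defined(CONFIG_DMA_BCM2708) || defined(CONFIG_DMA_BCM2708_MODULE)
/* prototypes of the real bcm_dma_start() and friends go here */
#else
/* static inline no-op fallbacks for kernels without the legacy DMA API */
#endif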
Especially on platforms with a slower CPU but a relatively high
framebuffer fill bandwidth, like current ARM devices, the existing
console monochrome imageblit function used to draw console text is
suboptimal for common pixel depths such as 16bpp and 32bpp. The existing
code is quite general and can deal with several pixel depths. By creating
special case functions for 16bpp and 32bpp, by far the most common pixel
formats used on modern systems, a significant speed-up is attained
which can be readily felt on ARM-based devices like the Raspberry Pi
and the Allwinner platform, but should help any platform using the
fb layer.
The special case functions allow constant folding, eliminating a number
of instructions including divide operations, and allow the use of an
unrolled loop, eliminating instructions with a variable shift size,
reducing source memory access instructions, and eliminating excessive
branching. These unrolled loops also allow much better code optimization
by the C compiler. The code that selects which optimized variant is used
is also simplified, eliminating integer divide instructions.
The speed-up, measured by timing 'cat file.txt' in the console, varies
between 40% and 70%, when testing on the Raspberry Pi and Allwinner
ARM-based platforms, depending on font size and the pixel depth, with
the greater benefit for 32bpp.
Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
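A hedged sketch of the kind of 32bpp special case described above (not the actual cfbimgblt code; the function and its parameters are illustrative):

#include <linux/types.h>

/* Expand one row of a monochrome font bitmap into 32bpp pixels. With the
 * pixel size fixed at compile time the loop needs no divides and no
 * variable-width shifts, and each byte unrolls into eight constant tests. */
static void fast_imageblit32_sketch(u32 *dst, const u8 *src_row,
				    unsigned int row_bytes,
				    u32 fgcolor, u32 bgcolor)
{
	unsigned int i;

	for (i = 0; i < row_bytes; i++) {
		u8 bits = src_row[i];

		dst[0] = (bits & 0x80) ? fgcolor : bgcolor;
		dst[1] = (bits & 0x40) ? fgcolor : bgcolor;
		dst[2] = (bits & 0x20) ? fgcolor : bgcolor;
		dst[3] = (bits & 0x10) ? fgcolor : bgcolor;
		dst[4] = (bits & 0x08) ? fgcolor : bgcolor;
		dst[5] = (bits & 0x04) ? fgcolor : bgcolor;
		dst[6] = (bits & 0x02) ? fgcolor : bgcolor;
		dst[7] = (bits & 0x01) ? fgcolor : bgcolor;
		dst += 8;
	}
}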
bcm2708_fb : Implement blanking support using the mailbox property interface
bcm2708_fb: Add pan and vsync controls
bcm2708_fb: DMA acceleration for fb_copyarea
Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425
Also used Simon's dmaer_master module as a reference for tweaking DMA
settings for better performance.
For now busylooping only. IRQ support might be added later.
With a non-overclocked Raspberry Pi, the performance is ~360 MB/s
for simple copy or ~260 MB/s for two-pass copy (used when dragging
windows to the right).
In the case of using DMA channel 0, the performance improves
to ~440 MB/s.
For comparison, VFP optimized CPU copy can only do ~114 MB/s in
the same conditions (hindered by reading uncached source buffer).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: report number of dma copies
Add a counter (exported via debugfs) reporting the
number of dma copies that the framebuffer driver
has done, in order to help evaluate different
optimization strategies.
Signed-off-by: Luke Diamand <luked@broadcom.com>
bcm2708_fb: use IRQ for DMA copies
The copyarea ioctl() uses DMA to speed things along. This
was busy-waiting for completion. This change supports using
an interrupt instead for larger transfers. For small
transfers, busy-waiting is still likely to be faster.
Signed-off-by: Luke Diamand <luke@diamand.org>
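A hedged sketch of the size-based choice between busy-waiting and sleeping on an IRQ-signalled completion; the threshold value and the dma_busy helper are assumptions:

#include <linux/completion.h>
#include <linux/delay.h>
#include <linux/types.h>

#define DMA_IRQ_THRESHOLD_SKETCH	(16 * 1024)	/* bytes; assumed value */

static void fb_dma_wait_sketch(struct completion *done, size_t bytes,
			       bool (*dma_busy)(void))
{
	if (bytes < DMA_IRQ_THRESHOLD_SKETCH) {
		/* Small copies: polling beats the IRQ round-trip latency. */
		while (dma_busy())
			udelay(1);
	} else {
		/* Large copies: sleep until the DMA IRQ handler signals us. */
		wait_for_completion(done);
	}
}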
bcm2708: Make ioctl logging quieter
video: fbdev: bcm2708_fb: Don't panic on error
No need to panic the kernel if the video driver fails.
Just print a message and return an error.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
fbdev: bcm2708_fb: Add ARCH_BCM2835 support
Add Device Tree support.
Pass the device to dma_alloc_coherent() in order to get the
correct bus address on ARCH_BCM2835.
Use the new DMA legacy API header file.
Including <mach/platform.h> is not necessary.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: Add bcm2708-fb device
Add bcm2708-fb to Device Tree and don't add the
platform device when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Cleanup of bcm2708_fb file to kernel coding standards
Some minor functional changes: remove a use of
in_atomic, and replace various debug messages
that manually specify the function name with
("%s", __func__).
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
usb: dwc: fix lockdep false positive
Signed-off-by: Kari Suvanto <karis79@gmail.com>
usb: dwc: fix inconsistent lock state
Signed-off-by: Kari Suvanto <karis79@gmail.com>
Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas
Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.
Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh
Make sure we wait for the reset to finish
dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
memory corruption, escalating to OOPS under high USB load.
dwc_otg: Fix unsafe access of QTD during URB enqueue
In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.
This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).
dwc_otg: Fix incorrect URB allocation error handling
If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.
dwc_otg: fix potential use-after-free case in interrupt handler
If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.
dwc_otg: add handling of SPLIT transaction data toggle errors
Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).
dwc_otg: implement tasklet for returning URBs to usbcore hcd layer
The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.
This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.
dwc_otg: fix NAK holdoff and allow on split transactions only
This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.
Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.
dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler
usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4a1a).
This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.
NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer. This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.
USB fix using a FIQ to implement split transactions
This commit adds a FIQ implementation that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux.
dwc_otg: fix device attributes and avoid kernel warnings on boot
dwc_otg: avoid logging function that can cause panics
See: https://github.com/raspberrypi/firmware/issues/21
Thanks to cleverca22 for fix
dwc_otg: mask correct interrupts after transaction error recovery
The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.
By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unnecessary FIQ interrupt
from being generated if the FIQ is enabled.
dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ
In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.
Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.
Fix function tracing
dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue
dwc_otg: prevent OOPSes during device disconnects
The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.
Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.
dwc_otg: prevent BUG() in TT allocation if hub address is > 16
A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.
This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.
Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.
dwc_otg: make channel halts with unknown state less damaging
If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.
Add catchall handling to treat as a transaction error and retry.
dwc_otg: fiq_split: use TTs with more granularity
This fixes certain issues with split transaction scheduling.
- Isochronous multi-packet OUT transactions now hog the TT until
they are completed - this prevents hubs aborting transactions
if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
allows simultaneous use of the TT's bulk/control and periodic
transaction buffers
This commit will mainly affect USB audio playback.
dwc_otg: fix potential sleep while atomic during urb enqueue
Fixes a regression introduced with eb1b482a. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GFP_ATOMIC flag set. Force this flag when inside the larger
critical section.
dwc_otg: make fiq_split_enable imply fiq_fix_enable
Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.
dwc_otg: prevent crashes on host port disconnects
Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.
- Clean up queue heads properly in kill_urbs_in_qh_list by
removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
device drivers so they respond in a more correct fashion
and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
interrupts if the associated URB was dequeued (and the
driver state was cleared)
dwc_otg: prevent leaking URBs during enqueue
A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.
dwc_otg: Enable NAK holdoff for control split transactions
Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500ms. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb238.
In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.
dwc_otg: Fix for occasional lockup on boot when doing a USB reset
dwc_otg: Don't issue traffic to LS devices in FS mode
Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.
Fix ARM architecture issue with local_irq_restore()
If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.
Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.
Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.
dwc_otg: fiq_fsm: Base commit for driver rewrite
This commit removes the previous FIQ fixes entirely and adds fiq_fsm.
This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.
All driver options have been removed and replaced with:
- dwc_otg.fiq_enable (bool)
- dwc_otg.fiq_fsm_enable (bool)
- dwc_otg.fiq_fsm_mask (bitmask)
- dwc_otg.nak_holdoff (unsigned int)
Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.
fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used
If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).
Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.
In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.
Push the error count reset into the FIQ handler.
fiq_fsm: Implement timeout mechanism
For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.
Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.
fiq_fsm: fix bounce buffer utilisation for Isochronous OUT
Multi-packet isochronous OUT transactions were subject to a few boundary
bugs. Fix them.
Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.
dwc_otg: Return full-speed frame numbers in HS mode
The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.
fiq_fsm: save PID on completion of interrupt OUT transfers
Also add edge case handling for interrupt transports.
Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.
fiq_fsm: add missing case for fiq_fsm_tt_in_use()
Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.
fiq_fsm: clear hcintmsk for aborted transactions
Prevents the FIQ from erroneously handling interrupts
on a timed out channel.
fiq_fsm: enable by default
fiq_fsm: fix dequeues for non-periodic split transactions
If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.
fiq_fsm: Disable by default
fiq_fsm: Handle HC babble errors
The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.
dwc_otg: fix interrupt registration for fiq_enable=0
Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.
fiq_fsm: Enable by default
dwc_otg: Fix various issues with root port and transaction errors
Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.
Fix a few thinkos with the transaction error passthrough for fiq_fsm.
fiq_fsm: Implement hack for Split Interrupt transactions
Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.
It goes without saying that this is a fairly egregious USB specification
violation, but it works.
Original idea by Hans Petter Selasky @ FreeBSD.org.
dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.
dwc_otg: introduce fiq_fsm_spin(un|)lock()
SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.
This also makes it possible to run the FIQ and IRQ handlers on different
cores.
fiq_fsm: fix build on bcm2708 and bcm2709 platforms
dwc_otg: put some barriers back where they should be for UP
bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active
dwc_otg: fixup read-modify-write in critical paths
Be more careful about read-modify-write on registers that the FIQ
also touches.
Guard fiq_fsm_spin_lock with fiq_enable check
fiq_fsm: Falling out of the state machine isn't fatal
This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.
Also get rid of the useless return value.
squash: dwc_otg: Allow to build without SMP
usb: core: make overcurrent messages more prominent
Hub overcurrent messages are more serious than "debug". Increase loglevel.
usb: dwc_otg: Don't use dma_to_virt()
Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.
Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.
transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Fix crash when fiq_enable=0
dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly
Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.
Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.
dwc_otg: Force host mode to fix incorrect compute module boards
dwc_otg: Add ARCH_BCM2835 support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Simplify FIQ irq number code
Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Remove duplicate gadget probe/unregister function
dwc_otg: Properly set the HFIR
Douglas Anderson reported:
According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1
This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)
and reported lower timing jitter on a USB analyser
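A hedged arithmetic sketch of the quoted rule, with the PHY clock given in whole MHz (e.g. a 30 MHz HS PHY clock yields 125 * 30 - 1 = 3749):

/* FRINT per the newer databook: interval * PHY clock (in MHz) - 1. */
static unsigned int hfir_frint_sketch(unsigned int phy_clk_mhz, int high_speed)
{
	unsigned int interval_us = high_speed ? 125 : 1000;

	return interval_us * phy_clk_mhz - 1;
}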
dwc_otg: trim xfer length when a buffer larger than the allocated size is received
dwc_otg: Don't free qh align buffers in atomic context
dwc_otg: Enable the hack for Split Interrupt transactions by default
dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: Use kzalloc when suitable
dwc_otg: Pass struct device to dma_alloc*()
This makes it possible to get the bus address from Device Tree.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: fix summarize urb->actual_length for isochronous transfers
The kernel does not copy the input data of ISO transfers to userspace
if actual_length is set only in the individual ISO frame descriptors and not
summed into urb->actual_length.
Fixes raspberrypi/linux#903
fiq_fsm: Use correct states when starting isoc OUT transfers
In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.
For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.
Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.
Set the FSM state up properly if we select a single-packet transfer.
Fixes https://github.com/raspberrypi/linux/issues/1842
dwc_otg: make nak_holdoff work as intended with empty queues
If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.
Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.
Fixes https://github.com/raspberrypi/linux/issues/1709
dwc_otg: fix split transaction data toggle handling around dequeues
See https://github.com/raspberrypi/linux/issues/1709
Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()
dwc_otg: fix several potential crash sources
On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.
dwc_otg: delete hcd->channel_lock
The lock serves no purpose as it is only held while the HCD spinlock
is already being held.
dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt
Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.
There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.
dwcotg: Allow to build without FIQ on ARM64
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: make periodic scheduling behave properly for FS buses
If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.
Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.
Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.
Related issue: https://github.com/raspberrypi/linux/issues/2020
dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly
Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.
dwc_otg: add module parameter int_ep_interval_min
Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.
The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).
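A hedged sketch of the clamp applied when a high-speed interrupt endpoint is activated (treating 0 as "parameter ignored" is an assumption, as are the names):

/* Intervals are in microframes for high-speed interrupt endpoints. */
static unsigned int clamp_int_ep_interval_sketch(unsigned int ep_interval,
						 unsigned int int_ep_interval_min)
{
	if (int_ep_interval_min && ep_interval < int_ep_interval_min)
		return int_ep_interval_min;
	return ep_interval;
}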
dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints
Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.
Constrain queuing non-periodic split transactions so they are performed
serially in such cases.
Related: https://github.com/raspberrypi/linux/issues/2024
dwc_otg: Fixup change to DRIVER_ATTR interface
dwc_otg: Fix compilation warnings
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
USB_DWCOTG: Disable building dwc_otg as a module (#2265)
When dwc_otg is built as a module, build will fail with the following
error:
ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2
Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.
As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.
See: https://github.com/raspberrypi/linux/issues/2258
Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>
dwc_otg: New timer API
dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE
dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()
Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.
dwc_otg: Fix a regression when dequeueing isochronous transfers
In 282bed95 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.
This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.
Restore the state machine fence for isochronous transfers.
fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)
See: https://github.com/raspberrypi/linux/issues/2140
dwc_otg: add smp_mb() to prevent driver state corruption on boot
Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.
The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.
usb: dwc_otg: fix memory corruption in dwc_otg driver
[Upstream commit 51b1b64917]
The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.
The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.
Thanks to Stephen Warren for tracking down the cause of this.
Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
usb: dwc_otg: Fix unreachable switch statement warning
This warning appears with GCC 7.3.0 from toolchains.bootlin.com:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
~~~~~~~~~~~~~~~~~^~~~
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: popcornmix <popcornmix@gmail.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2709: Drop platform smp and timer init code
irq-bcm2836 handles this through these functions:
bcm2835_init_local_timer_frequency()
bcm2836_arm_irqchip_smp_init()
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm270x: Use watchdog for reboot/poweroff
The watchdog driver already has support for reboot/poweroff.
Make use of this and remove the code from the platform files.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
board_bcm2835: Remove coherent dma pool increase - API has gone
It can be useful to be able to open multiple vchiq instances in a
single process. This currently fails due to a debugfs collision,
so make such a failure non-fatal.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The BCM2835 PL011 implementation seems to have a bug that can lead to a
transmission lockup if CTS changes frequently. A workaround was added to
the driver with a vendor-specific flag to enable it, but this flag is
currently not set for ARM implementations.
Add a "cts-event-workaround" property to Pi DTBs and use the presence
of that property to force the flag to be enabled in the driver.
See: https://github.com/raspberrypi/linux/issues/1280
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 register accessor functions use the _relaxed versions of the
standard readl() and writel() functions, meaning that there are no
automatic memory barriers. When polling a FIFO status register to check
for fullness, it is necessary to ensure that any outstanding writes have
completed; otherwise the flags are effectively stale, making it possible
that the next write is to a full FIFO.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The UART clock is initialised to be as close to the requested
frequency as possible without exceeding it. Now that there is a
clock manager that returns the actual frequencies, an expected
48MHz clock is reported as 47999625. If the requested baudrate
== requested clock/16, there is no headroom and the slight
reduction in actual clock rate results in failure.
Detect cases where it looks like a "round" clock was chosen and
adjust the reported clock to match that "round" value. As the
code comment says:
/*
* If increasing a clock by less than 0.1% changes it
* from ..999.. to ..000.., round up.
*/
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The pl011 driver looks for DT aliases of the form "serial<n>",
and if found uses <n> as the device ID. This can cause
/dev/ttyAMA0 to become /dev/ttyAMA1, which is confusing if the
other serial port is provided by the 8250 driver which doesn't
use the same logic.
For applications of the LAN78xx that don't have valid programmed
EEPROMs or OTPs, enabling both LEDs and auto-negotiation by default
seems reasonable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
There is a standard mechanism for locating and using a MAC address from
the Device Tree. Use this facility in the lan78xx driver to support
applications without programmed EEPROM or OTP.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The syscon node defines a register range that duplicates that used by
the local_intc node on bcm2836/7. Since irq-bcm2835 and irq-bcm2836 are
built in and always present together (both drivers are enabled by
CONFIG_ARCH_BCM2835), it is possible to replace the syscon usage with a
global variable that simplifies the code. Doing so does lose the
locking provided by regmap, but as only one side is using the regmap
interface (irq-bcm2835 uses readl and writel) there is no loss of
atomicity.
See: https://github.com/raspberrypi/firmware/issues/926
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Without a prompt string, a config setting can't be included in a
defconfig. Give CONFIG_SND_SOC_ICS43432 a prompt so that Pi soundcards
can use the driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
See commit dae803e165 -- the warning is
expected sometimes when using CMA. However, that commit still spams
my kernel log with these warnings.
Signed-off-by: Eric Anholt <eric@anholt.net>
This adds a debug module parameter to aid in debugging transfer issues
by printing info to the kernel log. When enabled, status values are
collected in the interrupt routine and msg info in
bcm2835_i2c_start_transfer(). This is done in a way that tries to avoid
affecting timing. Having printk in the isr can mask issues.
debug values (additive):
1: Print info on error
2: Print info on all transfers
3: Print messages before transfer is started
The value can be changed at runtime:
/sys/module/i2c_bcm2835/parameters/debug
Example output, debug=3:
[ 747.114448] bcm2835_i2c_xfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.114463] bcm2835_i2c_xfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117809] start_transfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.117825] isr: remain=2, status=0x30000055 : TA TXW TXD TXE [i2c1]
[ 747.117839] start_transfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117849] isr: remain=32, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117861] isr: remain=20, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117870] isr: remain=8, status=0x32 : DONE TXD RXD [i2c1]
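A minimal sketch of such a module parameter (permissions and description wording assumed, not copied from the patch):

#include <linux/module.h>

static unsigned int debug;
module_param(debug, uint, 0644);        /* writable via /sys/module/i2c_bcm2835/parameters/debug */
MODULE_PARM_DESC(debug, "1=print info on error, 2=print info on all transfers, 3=also print messages before transfer");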
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
This patch fixes a problem with VFP state save and restore related
to exception handling (panic with message "BUG: unsupported FP
instruction in kernel mode") present on VFP11 floating point units
(as used with ARM1176JZF-S CPUs, e.g. on first generation Raspberry
Pi boards). This patch was developed and discussed on
https://github.com/raspberrypi/linux/issues/859
A precondition to see the crashes is that floating point exception
traps are enabled. In this case, the VFP11 might determine that a FPU
operation needs to trap at a point in time when it is not possible to
signal this to the ARM11 core any more. The VFP11 will then set the
FPEXC.EX bit and store the trapped opcode in FPINST. (In some cases,
a second opcode might have been accepted by the VFP11 before the
exception was detected and could be reported to the ARM11 - in this
case, the VFP11 also sets FPEXC.FP2V and stores the second opcode in
FPINST2.)
If FPEXC.EX is set, the VFP11 will "bounce" the next FPU opcode issued
by the ARM11 CPU, which will be seen by the ARM11 as an undefined opcode
trap. The VFP support code examines the FPEXC.EX and FPEXC.FP2V bits
to decide what actions to take, i.e., whether to emulate the opcodes
found in FPINST and FPINST2, and whether to retry the bounced instruction.
If a user space application has left the VFP11 in this "pending trap"
state, the next FPU opcode issued to the VFP11 might actually be the
VSTMIA operation vfp_save_state() uses to store the FPU registers
to memory (in our test cases, when building the signal stack frame).
In this case, the kernel crashes as described above.
This patch fixes the problem by making sure that vfp_save_state() is
always entered with FPEXC.EX cleared. (The current value of FPEXC has
already been saved, so this does not corrupt the context. Clearing
FPEXC.EX has no effects on FPINST or FPINST2. Also note that many
callers already modify FPEXC by setting FPEXC.EN before invoking
vfp_save_state().)
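A hedged sketch of this first fix, using the register accessors from arch/arm/vfp (the exact call site and 'thread' variable are assumptions):

u32 fpexc = fmrx(FPEXC);                /* original FPEXC already saved by the caller */

fmxr(FPEXC, fpexc & ~FPEXC_EX);         /* clearing EX leaves FPINST/FPINST2 untouched */
vfp_save_state(&thread->vfpstate, fpexc);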
This patch also addresses a second problem related to FPEXC.EX: After
returning from signal handling, the kernel reloads the VFP context
from the user mode stack. However, the current code explicitly clears
both FPEXC.EX and FPEXC.FP2V during reload. As VFP11 requires these
bits to be preserved, this patch disables clearing them for VFP
implementations belonging to architecture 1. There should be no
negative side effects: the user can set both bits by executing FPU
opcodes anyway, and while user code may now place arbitrary values
into FPINST and FPINST2 (e.g., non-VFP ARM opcodes) the VFP support
code knows which instructions can be emulated, and rejects other
opcodes with "unhandled bounce" messages, so there should be no
security impact from allowing reloading FPEXC.EX and FPEXC.FP2V.
Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
At present there is no mechanism to specify driver load order,
which can lead to deferrals and repeated retries until successful.
Since this situation is expected, reduce the dmesg level to
INFO and mention that the operation will be retried.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
These divide off of PLLD_PER and are used for the ethernet and wifi
PHYs source PLLs. Neither of them is currently represented by a phy
device that would grab the clock for us.
This keeps other drivers from killing the networking PHYs when they
disable their own clocks and trigger PLLD_PER's refcount going to 0.
v2: Skip marking as critical if they aren't on at boot.
Signed-off-by: Eric Anholt <eric@anholt.net>
The VPU is responsible for managing the core clock, usually under
direction from the bcm2835-cpufreq driver but not via the clk-bcm2835
driver. Since the core frequency can change without warning, it is
safer to report the maximum clock rate to users of the core clock -
I2C, SPI and the mini UART - to err on the safe side when calculating
clock divisors.
If the DT node for the clock driver includes a reference to the
firmware node, use the firmware API to query the maximum core clock
instead of reading the divider registers.
Prior to this patch, a "100KHz" I2C bus was sometimes clocked at about
160KHz. In particular, switching to the 4.9 kernel was likely to break
SenseHAT usage on a Pi3.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The claim-clocks property can be used to prevent PLLs and dividers
from being marked as critical. It contains a vector of clock IDs,
as defined by dt-bindings/clock/bcm2835.h.
Use this mechanism to claim PLLD_DSI0, PLLD_DSI1, PLLH_AUX and
PLLH_PIX for the vc4_kms_v3d driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VPU configures and relies on several PLLs and dividers. Mark all
enabled dividers and their PLLs as CRITICAL to prevent the kernel from
switching them off.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi firmware looks at the RSTS register to know which
partition to boot from. The reboot syscall command
LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument.
Add support for passing in a partition number 0..63 to boot from.
Partition 63 is a special partition indicating halt.
If the partition doesn't exist, the firmware falls back to partition 0.
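A sketch of the string handling this implies (helper name hypothetical):

#include <linux/kernel.h>

/* Hedged sketch: interpret the RESTART2 argument as a partition 0..63. */
static unsigned int sketch_restart_partition(const char *cmd)
{
        unsigned long partition;

        if (cmd && kstrtoul(cmd, 10, &partition) == 0 && partition <= 63)
                return partition;       /* 63 means "halt" */
        return 0;                       /* unknown input: firmware uses partition 0 */
}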
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.
Support the legacy DMA API which is needed by bcm2708_fb.
Don't mask out channel 2.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
The spi-bcm2835 driver automatically uses GPIO chip-selects due to
some unreliability of the native ones. In doing so it chooses the
same pins as the native chip-selects would use, but the existing
code always uses pins 7 and 8, wherever the SPI function is mapped.
Search the pinctrl group assigned to the driver for pins that
correspond to native chip-selects, and use those for GPIO chip-
selects.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add a duplicate irq range with an offset on the hwirq's so the
driver can detect that enable_fiq() is used.
Tested with downstream dwc_otg USB controller driver.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
The old arch-specific IRQ macros included a dsb to ensure the
write to clear the mailbox interrupt completed before returning
from the interrupt. The BCM2836 irqchip driver needs the same
precaution to avoid spurious interrupts.
Spurious interrupts are still possible for other reasons,
though, so trap them early.
smsc95xx is adjusting truesize when it shouldn't, and following a recent patch from Eric this is now triggering warnings.
This patch stops smsc95xx from changing truesize.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
commit a13f085d11 upstream.
This fixes the following issues:
- When a buffer size is supplied to reiserfs_listxattr() such that each
individual name fits, but the concatenation of all names doesn't fit,
reiserfs_listxattr() overflows the supplied buffer. This leads to a
kernel heap overflow (verified using KASAN) followed by an out-of-bounds
usercopy and is therefore a security bug.
- When a buffer size is supplied to reiserfs_listxattr() such that a
name doesn't fit, -ERANGE should be returned. But reiserfs instead just
truncates the list of names; I have verified that if the only xattr on a
file has a longer name than the supplied buffer length, listxattr()
incorrectly returns zero.
With my patch applied, -ERANGE is returned in both cases and the memory
corruption doesn't happen anymore.
Credit for making me clean this code up a bit goes to Al Viro, who pointed
out that the ->actor calling convention is suboptimal and should be
changed.
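For reference, the expected userspace-visible behaviour after the fix can be checked with something like this (the path is hypothetical):

#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/xattr.h>

int main(void)
{
        char small[4];  /* deliberately too small for any xattr name list */
        ssize_t n = listxattr("/mnt/reiserfs/file", small, sizeof(small));

        if (n < 0 && errno == ERANGE)
                printf("buffer too small: ERANGE as expected\n");
        return 0;
}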
Link: http://lkml.kernel.org/r/20180802151539.5373-1-jannh@google.com
Fixes: 48b32a3553 ("reiserfs: use generic xattr handlers")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Jeff Mahoney <jeffm@suse.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bed4ff1ed4 upstream.
This fixes a race condition, where the DMAEN bit ends up being set after
I2C slave has transmitted a byte following the dummy read. When that
happens, an interrupt is generated instead, and no DMA request is generated
to kickstart the DMA read, and a timeout happens after DMA_TIMEOUT (1 sec).
Fixed by setting the DMAEN bit before the dummy read.
Signed-off-by: Esben Haabendal <eha@deif.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c463a158cb upstream.
acpi_gsb_i2c_write_bytes() returns i2c_transfer()'s return value, which
is the number of transfers executed on success, so 1.
The ACPI code expects us to store 0 in gsb->status for success, not 1.
Specifically this breaks the following code in the Thinkpad 8 DSDT:
ECWR = I2CW = ECWR /* \_SB_.I2C1.BAT0.ECWR */
If ((ECST == Zero))
{
ECRD = I2CR /* \_SB_.I2C1.I2CR */
}
Before this commit we set ECST to 1, causing the read to never happen,
breaking battery monitoring on the Thinkpad 8.
This commit makes acpi_gsb_i2c_write_bytes() return 0 when i2c_transfer()
returns 1, i.e. when the single write transfer completed successfully, and
makes it return -EIO for other (unexpected) return values >= 0.
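The mapping described above amounts to something like this sketch (variable names assumed):

ret = i2c_transfer(client->adapter, msgs, 1);
if (ret < 0)
        status = ret;           /* propagate genuine errors */
else if (ret == 1)
        status = 0;             /* the single write transfer succeeded */
else
        status = -EIO;          /* any other count is unexpected */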
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1204e35bed upstream.
Commit b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug
events for a device") iterates over the devices on a hotplug port's
subordinate bus in pciehp's IRQ handler without acquiring pci_bus_sem.
It is thus possible for a user to cause a crash by concurrently
manipulating the device list, e.g. by disabling slot power via sysfs
on a different CPU or by initiating a remove/rescan via sysfs.
This can't be fixed by acquiring pci_bus_sem because it may sleep.
The simplest fix is to avoid the list iteration altogether and just
check the ignore_hotplug flag on the port itself. This works because
pci_ignore_hotplug() sets the flag both on the device as well as on its
parent bridge.
We do lose the ability to print the name of the device blocking hotplug
in the debug message, but that's probably bearable.
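A sketch of the simplified check (field access paths assumed):

struct pci_dev *pdev = ctrl->pcie->port;        /* the hotplug port itself */

if (pdev->ignore_hotplug) {
        ctrl_dbg(ctrl, "ignoring hotplug event (ignore_hotplug is set)\n");
        return;
}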
Fixes: b440bde74f ("PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 281e878eab upstream.
When pciehp is unbound (e.g. on unplug of a Thunderbolt device), the
hotplug_slot struct is deregistered and thus freed before freeing the
IRQ. The IRQ handler and the work items it schedules print the slot
name referenced from the freed structure in various informational and
debug log messages, each time resulting in a quadruple dereference of
freed pointers (hotplug_slot -> pci_slot -> kobject -> name).
At best the slot name is logged as "(null)", at worst kernel memory is
exposed in logs or the driver crashes:
pciehp 0000:10:00.0:pcie204: Slot((null)): Card not present
An attacker may provoke the bug by unplugging multiple devices on a
Thunderbolt daisy chain at once. Unplugging can also be simulated by
powering down slots via sysfs. The bug is particularly easy to trigger
in poll mode.
It has been present since the driver's introduction in 2004:
https://git.kernel.org/tglx/history/c/c16b4b14d980
Fix by rearranging teardown such that the IRQ is freed first. Run the
work items queued by the IRQ handler to completion before freeing the
hotplug_slot struct by draining the work queue from the ->release_slot
callback which is invoked by pci_hp_deregister().
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.6.4
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3dbe97efe8 upstream.
PCIe r4.0, sec 9.3.5.4, "Device Control Register", shows both
Max_Payload_Size (MPS) and Max_Read_request_Size (MRRS) to be 'RsvdP' for
VFs. Just prior to the table it states:
"PF and VF functionality is defined in Section 7.5.3.4 except where
noted in Table 9-16. For VF fields marked 'RsvdP', the PF setting
applies to the VF."
All of which implies that with respect to Max_Payload_Size Supported
(MPSS), MPS, and MRRS values, we should not be paying any attention to the
VF's fields, but rather only to the PF's. Only looking at the PF's fields
also logically makes sense as it's the sole physical interface to the PCIe
bus.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200527
Fixes: 27d868b5e6 ("PCI: Set MPS to match upstream bridge")
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # 4.3+
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sinan Kaya <okaya@kernel.org>
Cc: Dongdong Liu <liudongdong3@huawei.com>
Cc: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ce6435820 upstream.
If addition of sysfs files fails on registration of a hotplug slot, the
struct pci_slot as well as the entry in the slot_list is leaked. The
issue has been present since the hotplug core was introduced in 2002:
https://git.kernel.org/tglx/history/c/a8a2069f432c
Perhaps the idea was that even though sysfs addition fails, the slot
should still be usable. But that's not how drivers use the interface,
they abort probe if a non-zero value is returned.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v2.4.15+
Cc: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d64b539b7 upstream.
Commit 26112ddc25 (PCI / ACPI / PM: Resume bridges w/o drivers on
suspend-to-RAM) attempted to fix a functional regression resulting
from commit c62ec4610c (PM / core: Fix direct_complete handling
for devices with no callbacks) by resuming PCI bridges without
drivers (that is, "parallel PCI" ones) during system-wide suspend if
the target system state is not ACPI S0 (working state).
That turns out to be insufficient, however, as it is reported that, at
least in one case, the platform firmware gets confused if a PCIe
root port is suspended before entering the ACPI S3 sleep state.
That issue was exposed by commit 77b3729ca03 (PCI / PM: Use
SMART_SUSPEND and LEAVE_SUSPENDED flags for PCIe ports) that allowed
PCIe ports to stay in runtime suspend during system-wide suspend
(which is OK for suspend-to-idle, but turns out to be problematic
otherwise).
For this reason, drop the driver check from acpi_pci_need_resume()
and resume all bridges (including PCIe ports with drivers) during
system-wide suspend if the target system state is not ACPI S0.
[If the target system state is ACPI S0, it means suspend-to-idle
and the platform firmware is not going to be invoked to actually
suspend the system, so there is no need to resume the bridges in
that case.]
Fixes: 77b3729ca03 (PCI / PM: Use SMART_SUSPEND and LEAVE_SUSPENDED flags for PCIe ports)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200675
Reported-by: teika kazura <teika@gmx.com>
Tested-by: teika kazura <teika@gmx.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+: 26112ddc25 (PCI / ACPI / PM: Resume bridges ...)
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit caa21e19e0 upstream.
Invoking shutdown for a socket in state SMC_LISTEN does not make
sense. Nevertheless programs like syzbot fuzzing the kernel may
try to do this. For SMC this means a socket refcounting problem.
This patch makes sure a shutdown call for an SMC socket in state
SMC_LISTEN simply returns with -ENOTCONN.
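The guard described above boils down to a few lines (sketch, not the exact diff):

if (sk->sk_state == SMC_LISTEN) {
        rc = -ENOTCONN;         /* shutdown on a listening SMC socket makes no sense */
        goto out;
}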
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4576cd469d upstream.
TPACKET_V3 stores variable length frames in fixed length blocks.
Blocks must be able to store a block header, optional private space
and at least one minimum sized frame.
Frames, even for a zero snaplen packet, store metadata headers and
optional reserved space.
In the block size bounds check, ensure that the frame of the
chosen configuration fits. This includes sockaddr_ll and optional
tp_reserve.
Syzbot was able to construct a ring with insufficient room for the
sockaddr_ll in the header of a zero-length frame, triggering an
out-of-bounds write in dev_parse_header.
Convert the comparison to less than, as zero is a valid snap len.
This matches the test for minimum tp_frame_size immediately below.
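Conceptually the check is the following (the locals are assumed names, not the exact code):

/* Space one zero-snaplen frame still needs: metadata header including
 * sockaddr_ll plus the socket's reserved headroom. */
unsigned int min_frame_size = po->tp_hdrlen + po->tp_reserve;

if (req->tp_block_size < block_hdr_plus_priv + min_frame_size) /* '<': snaplen 0 is valid */
        return -EINVAL;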
Fixes: f6fb8f100b ("af-packet: TPACKET_V3 flexible buffer implementation.")
Fixes: eb73190f4f ("net/packet: refine check for priv area size")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6613b6173d upstream.
When first DCCP packet is SYNC or SYNCACK, we insert a new conntrack
that has an un-initialized timeout value, i.e. such entry could be
reaped at any time.
Mark them as INVALID and only ignore SYNC/SYNCACK when connection had
an old state.
Reported-by: syzbot+6f18401420df260e37ed@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7797167ffd upstream.
Now that we use a sync prior to releasing the locks in syscall.S, we don't need
the PA 2.0 ordered stores used to release some locks. Using an ordered store,
potentially slows the release and subsequent code.
There are a number of other ordered stores and loads that serve no purpose. I
have converted these to normal stores.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b885ac1dc upstream.
Now that mb() is an instruction barrier, it will slow performance if we issue
unnecessary barriers.
The spinlock defines have a number of unnecessary barriers. The __ldcw()
define is both a hardware and compiler barrier. The mb() barriers in the
routines using __ldcw() serve no purpose.
The only barrier needed is the one in arch_spin_unlock(). We need to ensure
all accesses are complete prior to releasing the lock.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ddf74e79a5 upstream.
idx can be indirectly controlled by user-space, hence leading to a
potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c:408 amdgpu_set_pp_force_state()
warn: potential spectre issue 'data.states'
Fix this by sanitizing idx before using it to index data.states
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
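The sanitization follows the usual Spectre-V1 pattern (sketch; surrounding code assumed):

#include <linux/nospec.h>

if (idx >= ARRAY_SIZE(data.states))
        return -EINVAL;
idx = array_index_nospec(idx, ARRAY_SIZE(data.states)); /* clamp under speculation too */
state = data.states[idx];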
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de5372da60 upstream.
info.index can be indirectly controlled by user-space, hence leading
to a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/gpu/drm/i915/gvt/kvmgt.c:1232 intel_vgpu_ioctl() warn:
potential spectre issue 'vgpu->vdev.region' [r]
Fix this by sanitizing info.index before indirectly using it to index
vgpu->vdev.region
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a5d5e5d51 upstream.
'ac->ac_g_ex.fe_len' is a user-controlled value which is used in the
derivation of 'ac->ac_2order'. 'ac->ac_2order', in turn, is used to
index arrays which makes it a potential spectre gadget. Fix this by
sanitizing the value assigned to 'ac->ac_2order'. This covers the
following accesses found with the help of smatch:
* fs/ext4/mballoc.c:1896 ext4_mb_simple_scan_group() warn: potential
spectre issue 'grp->bb_counters' [w] (local cap)
* fs/ext4/mballoc.c:445 mb_find_buddy() warn: potential spectre issue
'EXT4_SB(e4b->bd_sb)->s_mb_offsets' [r] (local cap)
* fs/ext4/mballoc.c:446 mb_find_buddy() warn: potential spectre issue
'EXT4_SB(e4b->bd_sb)->s_mb_maxs' [r] (local cap)
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c40a56a781 upstream.
The kernel image is mapped into two places in the virtual address space
(addresses without KASLR, of course):
1. The kernel direct map (0xffff880000000000)
2. The "high kernel map" (0xffffffff81000000)
We actually execute out of #2. If we get the address of a kernel symbol,
it points to #2, but almost all physical-to-virtual translations point to #1.
Parts of the "high kernel map" alias are mapped in the userspace page
tables with the Global bit for performance reasons. The parts that we map
to userspace do not (er, should not) have secrets. When PTI is enabled then
the global bit is usually not set in the high mapping and just used to
compensate for poor performance on systems which lack PCID.
This is fine, except that some areas in the kernel image that are adjacent
to the non-secret-containing areas are unused holes. We free these holes
back into the normal page allocator and reuse them as normal kernel memory.
The memory will, of course, get *used* via the normal map, but the alias
mapping is kept.
This otherwise unused alias mapping of the holes will, by default, keep the
Global bit, be mapped out to userspace, and be vulnerable to Meltdown.
Remove the alias mapping of these pages entirely. This is likely to
fracture the 2M page mapping the kernel image near these areas, but this
should affect a minority of the area.
The pageattr code changes *all* aliases mapping the physical pages that it
operates on (by default). We only want to modify a single alias, so we
need to tweak its behavior.
This unmapping behavior is currently dependent on PTI being in place.
Going forward, we should at least consider doing this for all
configurations. Having an extra read-write alias for memory is not exactly
ideal for debugging things like random memory corruption and this does
undercut features like DEBUG_PAGEALLOC or future work like eXclusive Page
Frame Ownership (XPFO).
Before this patch:
current_kernel:---[ High Kernel Mapping ]---
current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd
current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte
current_kernel-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd
current_kernel-0xffffffff82c00000-0xffffffff82e00000 2M RW NX pte
current_kernel-0xffffffff82e00000-0xffffffff83200000 4M RW PSE NX pmd
current_kernel-0xffffffff83200000-0xffffffffa0000000 462M pmd
current_user:---[ High Kernel Mapping ]---
current_user-0xffffffff80000000-0xffffffff81000000 16M pmd
current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_user-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte
current_user-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd
After this patch:
current_kernel:---[ High Kernel Mapping ]---
current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd
current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K pte
current_kernel-0xffffffff82000000-0xffffffff82400000 4M ro PSE GLB NX pmd
current_kernel-0xffffffff82400000-0xffffffff82488000 544K ro NX pte
current_kernel-0xffffffff82488000-0xffffffff82600000 1504K pte
current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd
current_kernel-0xffffffff82c00000-0xffffffff82c0d000 52K RW NX pte
current_kernel-0xffffffff82c0d000-0xffffffff82dc0000 1740K pte
current_user:---[ High Kernel Mapping ]---
current_user-0xffffffff80000000-0xffffffff81000000 16M pmd
current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_user-0xffffffff81e11000-0xffffffff82000000 1980K pte
current_user-0xffffffff82000000-0xffffffff82400000 4M ro PSE GLB NX pmd
current_user-0xffffffff82400000-0xffffffff82488000 544K ro NX pte
current_user-0xffffffff82488000-0xffffffff82600000 1504K pte
current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd
[ tglx: Do not unmap on 32bit as there is only one mapping ]
Fixes: 0f561fce4d ("x86/pti: Enable global pages for shared areas")
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kees Cook <keescook@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/20180802225831.5F6A2BFC@viggo.jf.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 490fc05386 ]
Like vm_area_dup(), it initializes the anon_vma_chain head, and the
basic mm pointer.
The rest of the fields end up being different for different users,
although the plan is to also initialize the 'vm_ops' field to a dummy
entry.
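A sketch of the helper as described above (the field set follows the text; upstream may initialize more):

static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
{
        vma->vm_mm = mm;                        /* the basic mm pointer */
        INIT_LIST_HEAD(&vma->anon_vma_chain);   /* the anon_vma_chain head */
}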
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 95faf6992d ]
.. and re-initialize the anon_vma_chain head.
This removes some boiler-plate from the users, and also makes it clear
why it didn't need to use the 'zalloc()' version.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3928d4f5ee ]
The vm_area_struct is one of the most fundamental memory management
objects, but the management of it is entirely open-coded everywhere,
ranging from allocation and freeing (using kmem_cache_[z]alloc and
kmem_cache_free) to initializing all the fields.
We want to unify this in order to end up having some unified
initialization of the vmas, and the first step to this is to at least
have basic allocation functions.
Right now those functions are literally just wrappers around the
kmem_cache_*() calls. This is a purely mechanical conversion:
# new vma:
kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL) -> vm_area_alloc()
# copy old vma
kmem_cache_alloc(vm_area_cachep, GFP_KERNEL) -> vm_area_dup(old)
# free vma
kmem_cache_free(vm_area_cachep, vma) -> vm_area_free(vma)
to the point where the old vma passed in to the vm_area_dup() function
isn't even used yet (because I've left all the old manual initialization
alone).
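At this stage the wrappers are literally trivial (sketch matching the conversion table above):

struct vm_area_struct *vm_area_alloc(void)
{
        return kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
}

struct vm_area_struct *vm_area_dup(struct vm_area_struct *orig)
{
        return kmem_cache_alloc(vm_area_cachep, GFP_KERNEL);    /* 'orig' unused for now */
}

void vm_area_free(struct vm_area_struct *vma)
{
        kmem_cache_free(vm_area_cachep, vma);
}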
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9b38276813 ]
The old code in nvme_user_cmd() passed the userspace virtual address
from nvme_passthru_cmd.metadata as the length of the metadata buffer
as well as the address to nvme_submit_user_cmd().
Fixes: 63263d60 ("nvme: Use metadata for passthrough commands")
Signed-off-by: Roland Dreier <roland@purestorage.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2502e5a025 ]
Fix return code check for "max brightness" ACPI call.
The Dell laptop ACPI video brightness control is not present on Dell
laptops anymore, but was present in older kernel versions.
The code that checks the return value is incorrect since the SMM
refactoring.
The old code was:
if (buffer->output[0] == 0)
Which was changed to:
ret = dell_send_request(...)
if (ret)
However, dell_send_request() will return 0 if buffer->output[0] == 0,
so we must change the check to:
if (ret == 0)
This issue was found on a Dell M4800 laptop, and the fix tested on it
as well.
Fixes: 549b4930f0 ("dell-smbios: Introduce dispatcher for SMM calls")
Signed-off-by: Damien Thébault <damien@dtbo.net>
Tested-by: Damien Thébault <damien@dtbo.net>
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Reviewed-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e01a06c808 ]
The Marvell switches report their interrupts in a level sensitive way.
When using edge sensitive detection, a race condition in the interrupt
handler of the switch might result in the OS missing all future events,
which might make the switch non-functional.
The problem is that both mv88e6xxx_g2_irq_thread_fn() and
mv88e6xxx_g1_irq_thread_work() sample the irq cause register
(MV88E6XXX_G2_INT_SRC and MV88E6XXX_G1_STS respectively) once and then
handle the observed sources. If, after sampling but before all observed
irq sources are handled, a new irq source becomes active, this is not noticed
by the handler, which returns unsuspecting; but the interrupt line stays
active, which prevents the edge detector from kicking in.
All device trees but imx6qdl-zii-rdu2 get this right (most of them by
not specifying an interrupt parent). So fix imx6qdl-zii-rdu2
accordingly.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: f64992d1a9 ("ARM: dts: imx6: RDU2: Add Switch interrupts")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2c4d6baf1b ]
The ec_no_wakeup matcher added for Thinkpad X1 Carbon 6th gen systems
matched only a single DMI model (20KGS3JF01), which didn't cover
my laptop (20KH002JUS). Change it to match based on the DMI product family
to cover all X1 6th gen systems.
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e56b8ce363 ]
Attempt to make cryptic TCP seq number error messages clearer by
(1) identifying the source of the message as "TCP", (2) identifying the
errors as "seq # bug", and (3) grouping the field identifiers and values
by separating them with commas.
E.g., the following message is changed from:
recvmsg bug 2: copied 73BCB6CD seq 70F17CBE rcvnxt 73BCB9AA fl 0
WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:1881 tcp_recvmsg+0x649/0xb90
to:
TCP recvmsg seq # bug 2: copied 73BCB6CD, seq 70F17CBE, rcvnxt 73BCB9AA, fl 0
WARNING: CPU: 2 PID: 1501 at /linux/net/ipv4/tcp.c:2011 tcp_recvmsg+0x694/0xba0
Suggested-by: 積丹尼 Dan Jacobson <jidanni@jidanni.org>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 711c62dfa6 ]
In case the SPI thread is not running, a simple reset of sync
state won't fix the transmit timeout. We also need to wake up the kernel
thread.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: ed7d42e24e ("net: qca_spi: fix transmit queue timeout handling")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b2bab426dc ]
As long as the synchronization with the QCA7000 isn't finished, we
cannot accept packets from the upper layers. So let the SPI thread
enable the TX queue after sync and avoid unwanted packet drop.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 291ab06ecf ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f6bdc42f02 ]
During clc handshake the receive timeout is set to CLC_WAIT_TIME.
Remember and reset the original timeout value after the receive calls,
and remove a duplicate assignment of CLC_WAIT_TIME.
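The save/restore pattern described above looks roughly like this (sketch; field paths as used elsewhere in the SMC code, locals assumed):

long old_timeo = smc->clcsock->sk->sk_rcvtimeo;

smc->clcsock->sk->sk_rcvtimeo = CLC_WAIT_TIME;
len = kernel_recvmsg(smc->clcsock, &msg, &vec, 1, buflen, krflags);
smc->clcsock->sk->sk_rcvtimeo = old_timeo;      /* restore the caller's timeout */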
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e10f780503 ]
Inside a nested guest, access to hardware can be slow enough that
tsc_read_refs always return ULLONG_MAX, causing tsc_refine_calibration_work
to be called periodically and the nested guest to spend a lot of time
reading the ACPI timer.
However, if the TSC frequency is available from the pvclock page,
we can just set X86_FEATURE_TSC_KNOWN_FREQ and avoid the 'refine'
recalibration operation.
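A sketch of what this amounts to in the kvmclock path (placement and the 'src' accessor are assumptions):

/* Hedged sketch: once the hypervisor provides the TSC frequency via the
 * pvclock page, mark it as known so the refine work is never scheduled. */
setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
return pvclock_tsc_khz(src);    /* 'src' is the per-cpu pvclock time info */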
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
[Commit message rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3a9b045506 ]
This driver can spam the kernel log with multiple messages of:
net eth0: eth0: allmulti set
Usually 4 or 8 at a time (probably because of using ConnMan).
This message doesn't seem useful, so let's demote it from dev_info()
to dev_dbg().
Signed-off-by: David Lechner <david@lechnology.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4aac0b4347 ]
octeon_mgmt driver doesn't drop RX frames that are 1-4 bytes bigger than
MTU set for the corresponding interface. The problem is in the
AGL_GMX_RX0/1_FRM_MAX register setting, which should not account for VLAN
tagging.
According to Octeon HW manual:
"For tagged frames, MAX increases by four bytes for each VLAN found up to a
maximum of two VLANs, or MAX + 8 bytes."
OCTEON_FRAME_HEADER_LEN "define" is fine for ring buffer management, but
should not be used for AGL_GMX_RX0/1_FRM_MAX.
The problem can easily be reproduced using the "ping" command. If the affected
system has the default MTU of 1500, another host (with MTU >= 1504) can
successfully "ping" the affected system with payload sizes 1473-1476,
resulting in IP packets of size 1501-1504 being accepted by the mgmt driver.
Fixed system still accepts IP packets of 1500 bytes even with VLAN tagging,
because the limits are lifted in HW as expected, for every VLAN tag.
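Illustration only (the register write helper is hypothetical): program the limit from the MTU alone and let the MAC add the per-VLAN allowance itself.

/* Sketch: no VLAN_HLEN here; the MAC raises the limit by 4 bytes per
 * VLAN tag (up to two) on its own. */
agl_gmx_write_frm_max(port, netdev->mtu + ETH_HLEN + ETH_FCS_LEN);      /* hypothetical helper */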
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 665d4953cd ]
In commit ac0b4145d6 ("btrfs: scrub: Don't use inode pages for device
replace") we removed the branch of copy_nocow_pages() to avoid
corruption for compressed nodatasum extents.
However, the above commit only solves the problem in scrub_extent(); if
we fail to read some pages during scrub_pages(),
sctx->no_io_error_seen will be non-zero and we go to the fixup function
scrub_handle_errored_block().
In scrub_handle_errored_block(), for an sctx without csum (no matter whether
we're doing replace or scrub) we go to the scrub_fixup_nodatasum() routine,
which does a similar thing to copy_nocow_pages(), but does it
without the extra check found in the copy_nocow_pages() routine.
So for test cases like btrfs/100, where we emulate read errors during
replace/scrub, we could corrupt compressed extent data again.
This patch fixes it simply by avoiding any "optimization" for
nodatasum and falling back to the normal fixup routine, which tries to read
from any good copy.
This also solves WARN_ON() or dead lock caused by lame backref iteration
in scrub_fixup_nodatasum() routine.
The deadlock or WARN_ON() won't be triggered before commit ac0b4145d6
("btrfs: scrub: Don't use inode pages for device replace") since
copy_nocow_pages() has better locking and an extra check for data extents,
and it already does the fixup work by trying to read data from any good
copy, so it won't go through scrub_fixup_nodatasum() anyway.
This patch disables the faulty code and will be removed completely in a
followup patch.
Fixes: ac0b4145d6 ("btrfs: scrub: Don't use inode pages for device replace")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d3f6daede2 ]
pwm node should not be under gpio6 node in the device tree.
This fixes detection of the pwm on Droid 4.
Fixes: 6d7bdd328d ("ARM: dts: omap4-droid4: update touchscreen")
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
[tony@atomide.com: added fixes tag]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3578a7ecb6 ]
Testing has uncovered a failure case that is not handled properly. In the
event that a login fails and we are not able to recover on the spot, we
return 0 from do_reset, preventing any error recovery code from being
triggered. Additionally, the state is set to "probed" meaning that when we
are able to trigger the error recovery, the driver always comes up in the
probed state. To handle the case properly, we need to return a failure code
here and set the adapter state to the state in which we entered the reset,
indicating the state we would like to come out of the recovery reset
in.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c133459765 ]
CC [M] drivers/net/ethernet/freescale/fman/fman.o
In file included from ../drivers/net/ethernet/freescale/fman/fman.c:35:
../include/linux/fsl/guts.h: In function 'guts_set_dmacr':
../include/linux/fsl/guts.h:165:2: error: implicit declaration of function 'clrsetbits_be32' [-Werror=implicit-function-declaration]
clrsetbits_be32(&guts->dmacr, 3 << shift, device << shift);
^~~~~~~~~~~~~~~
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Madalin Bucur <madalin.bucur@nxp.com>
Cc: netdev@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 916c5e1413 ]
The netvsc device may need to fallback to running in single queue
mode if host side only wants to support single queue.
Recent change for handling mtu broke this in setup logic.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 3ffe64f1a6 ("hv_netvsc: split sub-channel setup into async and sync")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d14d37952 ]
During a device failover, there may be latency between the loss
of the current backing device and a notification from firmware that
a failover has occurred. This latency can result in a large amount of
error printouts as firmware returns outgoing traffic with a generic
error code. These are not necessarily errors in this case as the
firmware is busy swapping in a new backing adapter and is not ready
to send packets yet. This patch reclassifies those error codes as
warnings with an explanation that a failover may be pending. All
other return codes will be considered errors.
Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b9952b5218 ]
A comment in the review of the patch adding the phandle cache said that
the cache would have to be updated when modules are applied and removed.
This patch implements the cache updates.
Fixes: 0b3ce78e90 ("of: cache phandle nodes to reduce cost of of_find_node_by_phandle()")
Reported-by: Alan Tull <atull@kernel.org>
Suggested-by: Alan Tull <atull@kernel.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e117cb52bd ]
Mark noticed that syzkaller is able to reliably trigger the following warning:
dl_rq->running_bw > dl_rq->this_bw
WARNING: CPU: 1 PID: 153 at kernel/sched/deadline.c:124 switched_from_dl+0x454/0x608
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 153 Comm: syz-executor253 Not tainted 4.18.0-rc3+ #29
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x458
show_stack+0x20/0x30
dump_stack+0x180/0x250
panic+0x2dc/0x4ec
__warn_printk+0x0/0x150
report_bug+0x228/0x2d8
bug_handler+0xa0/0x1a0
brk_handler+0x2f0/0x568
do_debug_exception+0x1bc/0x5d0
el1_dbg+0x18/0x78
switched_from_dl+0x454/0x608
__sched_setscheduler+0x8cc/0x2018
sys_sched_setattr+0x340/0x758
el0_svc_naked+0x30/0x34
syzkaller reproducer runs a bunch of threads that constantly switch
between DEADLINE and NORMAL classes while interacting through futexes.
The splat above is caused by the fact that if a DEADLINE task is setattr
back to NORMAL while in non_contending state (blocked on a futex -
inactive timer armed), its contribution to running_bw is not removed
before sub_rq_bw() gets called (!task_on_rq_queued() branch) and the
latter sees running_bw > this_bw.
Fix it by removing a task contribution from running_bw if the task is
not queued and in non_contending state while switched to a different
class.
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: claudio@evidence.eu.com
Cc: rostedt@goodmis.org
Link: http://lkml.kernel.org/r/20180711072948.27061-1-juri.lelli@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c29e9da56b ]
platform_get_resource() may fail and return NULL, so we had
better check its return value to avoid a NULL pointer dereference
a bit later in the code.
This is detected by Coccinelle semantic patch.
@@
expression pdev, res, n, t, e, e1, e2;
@@
res = platform_get_resource(pdev, t, n);
+ if (!res)
+ return -EINVAL;
... when != res == NULL
e = devm_ioremap_nocache(e1, res->start, e2);
Fixes: cc4fa83f66 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 993675a310 ]
If variable length link layer headers result in a packet shorter
than dev->hard_header_len, reset the network header offset. Else
skb->mac_len may exceed skb->len after skb_reset_mac_len.
packet_sendmsg_spkt already has similar logic.
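A sketch of the guard (locals assumed):

/* If the variable-length link-layer header ended up shorter than
 * dev->hard_header_len, drop the stale network header offset so that
 * skb->mac_len cannot exceed skb->len after skb_reset_mac_len(). */
if (sock->type == SOCK_RAW && len < dev->hard_header_len)
        skb_reset_network_header(skb);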
Fixes: b84bbaf7a6 ("packet: in packet_snd start writing at link layer allocation")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 193f20033c ]
We should take and release the filter_sem consistently during the
reset process, in the same manner as the mac_lock and reset_lock.
For lockdep consistency we also take the filter_sem for write around
other calls to efx->type->init().
Fixes: c2bebe37c6 ("sfc: give ef10 its own rwsem in the filter table instead of filter_lock")
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1c56c0994a ]
In some situations we may end up calling down_read while already
holding the semaphore for write, thus hanging. This has been seen
when setting the MAC address for the interface. The hung task log
in this situation includes this stack:
down_read
efx_ef10_filter_insert
efx_ef10_filter_insert_addr_list
efx_ef10_filter_vlan_sync_rx_mode
efx_ef10_filter_add_vlan
efx_ef10_filter_table_probe
efx_ef10_set_mac_address
efx_set_mac_address
dev_set_mac_address
In addition, lockdep rightly points out that nested calling of
down_read is incorrect.
Fixes: c2bebe37c6 ("sfc: give ef10 its own rwsem in the filter table instead of filter_lock")
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8d4235f715 ]
[Why]
When a dce100 asic was suspended, the clocks were not set to 0.
Upon resume, the new clock was compared to the existing clock,
they were found to be the same, and so the clock was not set.
This resulted in a pernicious blackscreen.
[How]
In atomic commit, check to see if there are any active pipes.
If no, set clocks to 0
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d14c780c11 ]
This change makes it so that we are much more explicit about the ordering
of updates to the receive address register (RAR) table. Prior to this patch
I believe we may have been updating the table while entries were still
active, or possibly allowing for reordering of things since we weren't
explicitly flushing writes to either the lower or upper portion of the
register prior to accessing the other half.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 923847413f ]
The AM3517 has a different OTG controller location than the OMAP3,
which is included from omap3.dtsi. This results in a hwmod error.
Since the AM3517 has a different OTG controller address, this patch
disables the one that isn't available.
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f8b5b2183 ]
Call secure services to enable ACTLR[0] (Enable invalidates of BTB with
ICIALLU) when branch hardening is enabled for kernel.
On GP devices OMAP5/DRA7, there is no possibility to update secure
side since "secure world" is ROM and there are no override mechanisms
possible. On HS devices, appropriate PPA should do the workarounds as
well.
However, the configuration is only done for secondary core, since it is
expected that firmware/bootloader will have enabled the required
configuration for the primary boot core (note: bootloaders typically
will NOT enable secondary processors, since it has no need to do so).
Signed-off-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 576cd32082 ]
It has been observed that with a particular order of initialisation,
the netdev can be up, but the SFP module still has its TX_DISABLE
signal asserted. This occurs when the network device is brought up before
the SFP kernel module has been inserted by userspace.
This happens because the sfp-bus layer does not hear about the change in
network device state, and so assumes that it is still down. Set
netdev->sfp when the upstream is registered to work around this problem.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b4c7e2bd2e ]
Dynamic ftrace requires modifying the code segments that are usually
set to read-only. To do this, a per arch function is called both before
and after the ftrace modifications are performed. The "before" function
will set kernel code text to read-write to allow for ftrace to make the
modifications, and the "after" function will set the kernel code text
back to "read-only" to keep the kernel code text protected.
The issue happens when dynamic ftrace is tested at boot up. The test is
done before the kernel code text has been set to read-only. But the
"before" and "after" calls are still performed. The "after" call will
change the kernel code text to read-only prematurely, and other boot
code that expects this code to be read-write will fail.
The solution is to add a variable that is set when the kernel code text
is expected to be converted to read-only, and make the ftrace "before"
and "after" calls do nothing if that variable is not yet set. This is
similar to the x86 solution from commit 1623963097 ("ftrace, x86:
make kernel text writable only for conversions").
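The guard is conceptually a single flag check (sketch; the flag name follows the x86 precedent and is an assumption here):

static int kernel_set_to_readonly __read_mostly;

void set_kernel_text_rw(void)
{
        if (!kernel_set_to_readonly)
                return;         /* too early: text is still writable anyway */
        /* ... flip the kernel text mapping to read-write ... */
}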
Link: http://lkml.kernel.org/r/20180620212906.24b7b66e@vmware.local.home
Reported-by: Stefan Agner <stefan@agner.ch>
Tested-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ee6581ceba ]
Incremental patch to fix the unchecked dereference in acpi_nfit_ctl.
Reported by Dan Carpenter: the patch "acpi/nfit: fix cmd_rc for
acpi_nfit_ctl to always return a value" from Jun 28, 2018, leads to the following
Smatch complaint:
drivers/acpi/nfit/core.c:578 acpi_nfit_ctl()
warn: variable dereferenced before check 'cmd_rc' (see line 411)
drivers/acpi/nfit/core.c
410
411 *cmd_rc = -EINVAL;
^^^^^^^^^^^^^^^^^^
Patch adds unchecked dereference.
Fixes: c1985cefd8 ("acpi/nfit: fix cmd_rc for acpi_nfit_ctl to always return a value")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 32aa928a7b ]
Builds started failing in Fedora on Python 3.7 with:
`.gnu.debuglto_.debug_macro' referenced in section
`.gnu.debuglto_.debug_macro' of
util/scripting-engines/trace-event-python.o: defined in discarded
section
In Fedora, Python 3.7 added -flto to the list of --cflags and since it
was only applied to util/scripting-engines/trace-event-python.c and
scripts/python/Perf-Trace-Util/Context.c, linking failed.
It's not the first time the addition of flags has broken builds: commit
c6707fdef7 ("perf tools: Fix up build in hardnened environments")
appears to have fixed a similar problem. "python-config --includes"
provides the proper -I flags and doesn't introduce additional CFLAGS.
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180710154612.6285-1-jcline@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit db0ba84c04 ]
The dictionaries are attached to the parameter tuple that steals the
references and takes care of releasing them when appropriate. The code
should not decrement the reference counts explicitly. E.g. if libpython
has been built with reference debugging enabled, the superfluous DECREFs
will trigger this error when running perf script:
Fatal Python error: Objects/tupleobject.c:238 object at
0x7f10f2041b40 has negative ref count -1
Aborted (core dumped)
If the reference debugging is not enabled, the superfluous DECREFs might
cause the dict objects to be silently released while they are still in
use. This may trigger various other assertions or just cause perf
crashes and/or weird and unexpected data changes in the stored Python
objects.
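The underlying CPython rule, for reference: PyTuple_SetItem() steals the reference it is given, so no explicit Py_DECREF of the stored dict is needed (sketch):

PyObject *dict = PyDict_New();

PyTuple_SetItem(t, n++, dict);  /* the tuple now owns 'dict' */
/* no Py_DECREF(dict) here - the tuple releases it when it is freed */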
Signed-off-by: Janne Huttunen <janne.huttunen@nokia.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jaroslav Skarvada <jskarvad@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1531133990-17485-1-git-send-email-janne.huttunen@nokia.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a09603f851 ]
We are getting following warnings on gcc8 that break compilation:
$ make
CC jvmti/jvmti_agent.o
jvmti/jvmti_agent.c: In function ‘jvmti_open’:
jvmti/jvmti_agent.c:252:35: error: ‘/jit-’ directive output may be truncated \
writing 5 bytes into a region of size between 1 and 4096 [-Werror=format-truncation=]
snprintf(dump_path, PATH_MAX, "%s/jit-%i.dump", jit_path, getpid());
There's no point in checking the result of the snprintf() call in
jvmti_open(); the following open() call will fail in case the
name is mangled or too long.
Use the tools/lib/ function scnprintf(), which caps the value returned by
the underlying snprintf() calls, and thus get rid of those warnings.
$ make DEBUG=1
CC arch/x86/util/perf_regs.o
arch/x86/util/perf_regs.c: In function ‘arch_sdt_arg_parse_op’:
arch/x86/util/perf_regs.c:229:4: error: ‘strncpy’ output truncated before terminating nul
copying 2 bytes from a string of the same length [-Werror=stringop-truncation]
strncpy(prefix, "+0", 2);
^~~~~~~~~~~~~~~~~~~~~~~~
Use scnprintf() instead of the strncpy() (which we know is safe here)
to get rid of that warning.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180702134202.17745-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
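The useful property of scnprintf() is that it returns the number of characters actually written to the buffer, never more than the buffer can hold, so the result is safe to consume. A hedged sketch of the idiom, with scnprintf() re-implemented locally since the real one lives in tools/lib:
#include <stdarg.h>
#include <stdio.h>

/* Minimal stand-in for the tools/lib scnprintf(): returns chars written. */
static int scnprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int i;

    va_start(args, fmt);
    i = vsnprintf(buf, size, fmt, args);
    va_end(args);

    return (i >= (int)size) ? (int)size - 1 : i;
}

int main(void)
{
    char path[64];
    /* The return value is bounded by the buffer, so it is safe to reuse. */
    int len = scnprintf(path, sizeof(path), "%s/jit-%d.dump", "/tmp", 1234);

    printf("%d: %s\n", len, path);
    return 0;
}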
[ Upstream commit b9626f45ab ]
The below path error can occur:
# ./xdp2skb_meta.sh --dev eth0 --list
./xdp2skb_meta.sh: line 61: /usr/sbin/tc: No such file or directory
So just use command names instead of absolute paths of tc and ip.
In addition, it allows callers to redefine the $TC and $IP paths.
Fixes: 36e04a2d78 ("samples/bpf: xdp2skb_meta shows transferring info from XDP to SKB")
Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 30f529473e ]
Calling bnxt_set_max_func_irqs() to modify the max IRQ count requested or
freed by the RDMA driver is flawed. The max IRQ count is checked when
re-initializing the IRQ vectors and this can happen multiple times
during ifup or ethtool -L. If the max IRQ is reduced and the RDMA
driver is operational, we may not initialize IRQs correctly. This
problem shows up on VFs with a very small number of MSIX vectors.
There is no other logic that relies on the IRQ count excluding the ones
used by RDMA. So we fix it by just removing the call to subtract or
add the IRQs used by RDMA.
Fixes: a588e4580a ("bnxt_en: Add interface to support RDMA driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 78f058a4aa ]
The current code returns -ENOMEM and does not bother to set the output
parameters to 0 when no rings are available. Some callers, such as
bnxt_get_channels() will display garbage ring numbers when that happens.
Fix it by always setting the output parameters.
Fixes: 6e6c5a57fb ("bnxt_en: Modify bnxt_get_max_rings() to support shared or non shared rings.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
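The general rule applied here is to initialize out-parameters before any early error return, so callers never read garbage. A small illustrative sketch (names are made up, not the bnxt code):
#include <errno.h>
#include <stdio.h>

/* *max_rx / *max_tx are always valid after the call, even on error. */
static int get_max_rings(int avail, int *max_rx, int *max_tx)
{
    *max_rx = 0;                /* set the outputs up front ...        */
    *max_tx = 0;

    if (avail <= 0)
        return -ENOMEM;         /* ... so an error never leaks garbage */

    *max_rx = avail / 2;
    *max_tx = avail - avail / 2;
    return 0;
}

int main(void)
{
    int rx, tx;

    get_max_rings(0, &rx, &tx); /* error path: rx and tx are still 0 */
    printf("rx=%d tx=%d\n", rx, tx);
    return 0;
}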
[ Upstream commit 07f4fde53d ]
If there aren't enough RX rings available, the driver will attempt to
use a single RX ring without the aggregation ring. If that also
fails, the BNXT_FLAG_AGG_RINGS flag is cleared but the other ring
parameters are not set consistently to reflect that. If more RX
rings become available at the next open, the RX rings will be in
an inconsistent state and may crash when freeing the RX rings.
Fix it by restoring the BNXT_FLAG_AGG_RINGS if not enough RX rings are
available to run without aggregation rings.
Fixes: bdbd1eb59c ("bnxt_en: Handle no aggregation ring gracefully.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e32d4e60b3 ]
It is possible that OVS may set the DEI/CFI bit in the vlan_tci mask to
don't care. Hence, checking for an exact vlan_tci match will end up
rejecting the vlan flow.
This patch fixes the problem by checking for vlan_pcp and vid
separately, instead of checking for the entire vlan_tci.
Fixes: e85a9be93c (bnxt_en: do not allow wildcard matches for L2 flows)
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
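For context, the 16-bit VLAN TCI packs the PCP into bits 15-13, the DEI/CFI into bit 12 and the VID into bits 11-0, so an exact match on the whole TCI also forces a match on DEI. A hedged sketch of matching PCP and VID independently (mask values per 802.1Q; the function is illustrative):
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define VLAN_PCP_MASK   0xe000  /* PCP: bits 15-13 */
#define VLAN_DEI_MASK   0x1000  /* DEI/CFI: bit 12, left out of the compare */
#define VLAN_VID_MASK   0x0fff  /* VID: bits 11-0 */

/* Match PCP and VID independently so a don't-care DEI still matches. */
static bool vlan_flow_matches(uint16_t want_tci, uint16_t pkt_tci)
{
    return (want_tci & VLAN_PCP_MASK) == (pkt_tci & VLAN_PCP_MASK) &&
           (want_tci & VLAN_VID_MASK) == (pkt_tci & VLAN_VID_MASK);
}

int main(void)
{
    /* Same PCP and VID, DEI differs: still a match. */
    printf("%d\n", vlan_flow_matches(0x2064, 0x3064));
    return 0;
}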
[ Upstream commit e8708786d4 ]
This is used in configs lacking hardware atomics to emulate atomic r-m-w
for user space, implemented by disabling preemption in kernel.
However there are issues in current implementation:
1. Process not terminated if invalid user pointer passed:
i.e. __get_user() failed.
2. The reason for this patch was __put_user() failure not being handled
either, specifically for the COW break scenario.
The zero page is initially wired up and read from __get_user()
succeeds. A subsequent write by __put_user() induces a
Protection Violation, but COW can't finish as Linux page fault
handler is disabled due to preempt disable.
And what's worse is we silently return the stale value to user space.
Fix this specific case by re-enabling preemption and explicitly
fixing up the fault and retrying the whole sequence over.
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: rewrote the changelog]
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ec58ba16e1 ]
In case of HSDK we have an intermediate INTC in the form of a DW APB GPIO
controller, which is used as de-bounce logic for interrupt wires that come from
outside the board.
We cannot use existing "irq-dw-apb-ictl" driver here because all input
lines are routed to corresponding output lines but not muxed into one
line (this is configured in RTL and we cannot change this in software).
But even if we added such a feature to the "irq-dw-apb-ictl" driver it wouldn't
benefit us, as the higher-level INTC (in case of HSDK it is the IDU) already has
per-input control, so adding a fully-controlled intermediate INTC would only
add overhead to interrupt processing with no other benefit.
Thus we just do one-time configuration of DW APB GPIO controller and
forget about it.
Based on the implementation available in the arch/arc/plat-axs10x/axs10x.c file.
Acked-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2045cdfa1b ]
Loading the nf_conntrack module with doubled hashsize parameter, i.e.
modprobe nf_conntrack hashsize=12345 hashsize=12345
causes NULL-ptr deref.
If 'hashsize' is specified twice, the nf_conntrack_set_hashsize() function
will also be called twice.
The first nf_conntrack_set_hashsize() call will set the
'nf_conntrack_htable_size' variable:
nf_conntrack_set_hashsize()
...
/* On boot, we can set this without any fancy locking. */
if (!nf_conntrack_htable_size)
return param_set_uint(val, kp);
But on the second invocation, the nf_conntrack_htable_size is already set,
so the nf_conntrack_set_hashsize() will take a different path and call
the nf_conntrack_hash_resize() function. Which will crash on the attempt
to dereference 'nf_conntrack_hash' pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
RIP: 0010:nf_conntrack_hash_resize+0x255/0x490 [nf_conntrack]
Call Trace:
nf_conntrack_set_hashsize+0xcd/0x100 [nf_conntrack]
parse_args+0x1f9/0x5a0
load_module+0x1281/0x1a50
__se_sys_finit_module+0xbe/0xf0
do_syscall_64+0x7c/0x390
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fix this by checking !nf_conntrack_hash instead of
!nf_conntrack_htable_size. nf_conntrack_hash will be initialized only
after the module has loaded, so the second invocation of
nf_conntrack_set_hashsize() won't crash; it will just reinitialize
nf_conntrack_htable_size again.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
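As a hedged sketch of the shape of the fix (not the exact netfilter code): key the boot-time fast path on the pointer that the resize path would dereference, rather than on a size value that an earlier parameter parse may already have set.
static unsigned int htable_size;
static void *conntrack_hash;    /* allocated later, when the module initializes */

static int resize_hash(unsigned int val)
{
    /* would reallocate and rehash; dereferences conntrack_hash */
    (void)val;
    return 0;
}

static int set_hashsize(unsigned int val)
{
    /*
     * Key the "still booting" path on the pointer the resize would use,
     * so a repeated hashsize= parameter only updates the value instead
     * of resizing a table that does not exist yet.
     */
    if (!conntrack_hash) {
        htable_size = val;
        return 0;
    }
    return resize_hash(val);
}

int main(void)
{
    set_hashsize(12345);
    set_hashsize(12345);    /* second call is safe: hash still unallocated */
    return htable_size == 12345 ? 0 : 1;
}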
[ Upstream commit 21d5e07819 ]
iptables-nft never requests these, but make this explicitly illegal.
If it were requested, the kernel could oops as ->eval is NULL. Furthermore,
the builtin targets have no owning module, so it's possible to rmmod the
eb/ip/ip6_tables module even while they are loaded.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92298c1cd8 ]
Add the missing locks to the IRQ enable/disable paths, and fix a comment
in the interrupt handler: reading the ISR clears down the status bits,
but does not reset the interrupt so it can signal again. That seems to
require a write.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5265f0338b ]
Here we are checking for the buffer length, not an offset for writing
to, so using > is correct. The current code incorrectly rejects a
command buffer ending at the memory buffer's end.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
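Put differently: when offset plus length is compared against a buffer's total size, ending exactly at the end is legal, so the error condition is '>' rather than '>='. A tiny illustrative check:
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* A command ending exactly at the end of the buffer is still valid. */
static bool cmdbuf_in_bounds(size_t offset, size_t words, size_t buf_words)
{
    return offset + words <= buf_words; /* reject only offset + words > buf_words */
}

int main(void)
{
    printf("%d %d\n", cmdbuf_in_bounds(8, 8, 16),   /* 1: ends at the end, OK */
                      cmdbuf_in_bounds(8, 9, 16));  /* 0: runs past the end   */
    return 0;
}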
[ Upstream commit 4466b1f0e0 ]
Host1x's CDMA can't access the command buffers if IOMMU and Host1x
firewall are enabled in the kernel's config because the firewall doesn't map
the copied buffer into IOVA space. Fix this by skipping IOMMU
initialization if firewall is enabled as firewall merges sparse cmdbufs
into a single contiguous buffer and hence IOMMU isn't needed in this case.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8a81388ec2 ]
Instead of having the function name hard-coded (it might change and we
might forget to update it in the debug output) we can use __func__,
and also shorten the line so we do not need to break it. Also fix an
extra blank line while we are here.
Found by checkpatch.
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
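The idiom in question, shown as a standalone sketch with printf() standing in for the driver's debug macro (the function name here is hypothetical):
#include <stdio.h>

/* hypothetical driver function; __func__ tracks any future rename */
static int demo_start(void)
{
    printf("%s: called\n", __func__);
    return 0;
}

int main(void)
{
    return demo_start();
}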
[ Upstream commit f6f2a4a2eb ]
Setting the low threshold to 0 has no effect on frags allocation,
we need to clear high_thresh instead.
The code was pre-existent to commit 648700f76b ("inet: frags:
use rhashtables for reassembly units"), but before the above,
such assignment had a different role: prevent concurrent eviction
from the worker and the netns cleanup helper.
Fixes: 648700f76b ("inet: frags: use rhashtables for reassembly units")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0c6bc6e531 ]
Multiple BPF helpers in use by sk_skb programs calculate the max
skb length using the __bpf_skb_max_len function. However, this
calculates the max length using the skb->dev pointer which can be
NULL when an sk_skb program is paired with an sk_msg program.
To force this, an sk_msg program needs to redirect into the ingress
path of a sock with an attached sk_skb program. Then the sk_skb
program would need to call one of the helpers that adjust the skb
size.
To fix the null ptr dereference use SKB_MAX_ALLOC size if no dev
is available.
Fixes: 8934ce2fd0 ("bpf: sockmap redirect ingress support")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
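A hedged kernel-context sketch of the guard described above (treat it as an illustration of the fallback rather than the exact helper):
#include <linux/skbuff.h>

/* Illustrative: fall back to SKB_MAX_ALLOC when the skb has no device. */
static u32 skb_max_len_example(const struct sk_buff *skb)
{
    return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len
                    : SKB_MAX_ALLOC;
}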
[ Upstream commit 87ed1405ef ]
In commit ca04d9d3e1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on
Qcom chips") you can see a call like:
devm_nvmem_cell_get(dev, NULL);
Note that the cell ID passed to the function is NULL. This is because
the qcom-qusb2 driver is expected to work only on systems where the
PHY node is hooked up via device-tree and is nameless.
This works OK for the most part. The first thing nvmem_cell_get()
does is to call of_nvmem_cell_get() and there it's documented that a
NULL name is fine. The problem happens when the call to
of_nvmem_cell_get() returns -EINVAL. In such a case we'll fall back
to nvmem_cell_get_from_list() and eventually might (if nvmem_cells
isn't an empty list) crash with something that looks like:
strcmp
nvmem_find_cell
__nvmem_device_get
nvmem_cell_get_from_list
nvmem_cell_get
devm_nvmem_cell_get
qusb2_phy_probe
There are several different ways we could fix this problem:
One could argue that perhaps the qcom-qusb2 driver should be changed
to use of_nvmem_cell_get() which is allowed to have a NULL name. In
that case, we'd need to add a patch to introduce
devm_of_nvmem_cell_get() since the qcom-qusb2 driver is using devm
managed resources.
One could also argue that perhaps we could just add a name to
qcom-qusb2. That would be OK but I believe it effectively changes the
device tree bindings, so maybe it's a no-go.
In this patch I have chosen to fix the problem by simply not crashing
when a NULL cell_id is passed to nvmem_cell_get().
NOTE: for the qcom-qusb2 driver the "nvmem-cells" property is
defined to be optional, so it's expected to be a common case that
we would hit this crash; this is more than just a theoretical fix.
Fixes: ca04d9d3e1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
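The chosen fix amounts to a NULL guard in front of the name-based lookup. A self-contained userspace sketch of that shape (the cell table and names are made up, not the nvmem code):
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct cell { const char *name; };

/* made-up lookup table standing in for the registered nvmem cells */
static struct cell cells[] = { { "ref-clk" }, { "cal-data" } };

static struct cell *cell_get(const char *cell_id)
{
    size_t i;

    /* Guard the fallback path: strcmp(NULL, ...) would crash. */
    if (!cell_id)
        return NULL;

    for (i = 0; i < sizeof(cells) / sizeof(cells[0]); i++)
        if (!strcmp(cells[i].name, cell_id))
            return &cells[i];
    return NULL;
}

int main(void)
{
    printf("%p %p\n", (void *)cell_get(NULL), (void *)cell_get("cal-data"));
    return 0;
}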
[ Upstream commit 0c1049dcb4 ]
PXA3xx platforms have 56 interrupts that are stored in two ICMR
registers. The code in pxa_irq_suspend() and pxa_irq_resume() however
does a simple division by 32 which only leads to one register being
saved at suspend and restored at resume time. The NAND interrupt
setting, for instance, is lost.
Fix this by using DIV_ROUND_UP() instead.
Signed-off-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
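The arithmetic behind the fix: with 56 interrupt sources, 56 / 32 truncates to 1 register, while DIV_ROUND_UP(56, 32) yields 2. A minimal sketch showing the difference (the macro is defined locally, matching the kernel's definition):
#include <stdio.h>

#define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

int main(void)
{
    /* 56 local interrupts, 32 bits per ICMR register */
    printf("56 / 32              = %d\n", 56 / 32);               /* 1: second ICMR lost */
    printf("DIV_ROUND_UP(56, 32) = %d\n", DIV_ROUND_UP(56, 32));  /* 2: both saved       */
    return 0;
}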
[ Upstream commit 843789f6dd ]
t4_get_flash_params() fails in a fatal fashion if the FLASH part isn't
one of the recognized parts. But this leads to desperate efforts to update
drivers when various FLASH parts which we are using suddenly become
unavailable and we need to substitute new FLASH parts. This has led to
more than one Customer Field Emergency when a Customer has an old driver
and suddenly can't use newly shipped adapters.
This commit fixes this by simply assuming that the FLASH part is 4MB in
size if it can't be identified. Note that all Chelsio adapters will have
flash parts which are at least 4MB in size.
Signed-off-by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9faa89d4ed ]
The setting of the node address is not thread safe, meaning that
two discoverers may decide to set it simultaneously, with a duplicate
entry in the name table as a result. We fix that with this commit.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 92018c7ca9 ]
The duplicate address discovery protocol is not safe against two
discoverers running in parallel. The one executing first after the
trial period is over will set the node address and change its own
message type to DSC_REQ_MSG. The one executing last may find that the
node address is already set, and never change message type, with the
result that its links may never be established.
In this commit we ensure that the message type is always set correctly
after the trial period is over.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e415577f57 ]
With the duplicate address discovery protocol for tipc nodes addresses
we introduced a one second trial period before a node is allocated a
hash number to use as address.
Unfortunately, we fail to handle the case when a regular LINK REQUEST/
RESPONSE arrives from a cluster node during the trial period. Such
messages are not ignored as they should be, leading to link setup
attempts while the node still has no address.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2a57f18242 ]
The function for checking if there is a node address conflict is
supposed to return a suggestion for a new address if it finds a
conflict, and zero otherwise. But in case the peer being checked
is previously unknown, it instead returns a "suggestion" for
the checked address itself. This results in a DSC_TRIAL_FAIL_MSG
being sent unnecessarily to the peer, and sometimes makes the trial
period start over again.
Fixes: 25b0b9c4e8 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 05925e52a7 ]
The change fixes sleep in atomic context bug, which is encountered
every time when link settings are changed by ethtool.
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however that helper function is
called in atomic context under a grabbed spinlock, because
phy_start_aneg() is called by phy_ethtool_ksettings_set() and by
replaced phy_ethtool_sset() helpers from phylib.
Now duplex mode setting is enforced in ravb_adjust_link() only, also
now RX/TX is disabled when link is put down or modifications to E-MAC
registers ECMR and GECMR are expected for both cases of checked and
ignored link status pin state from E-MAC interrupt handler.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0973a4dd79 ]
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however the helper function is
called in atomic context.
The bug can be reproduced by running "ethtool -r" command, the bug
is reported if CONFIG_DEBUG_ATOMIC_SLEEP build option is enabled.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5cb3f52a11 ]
The change fixes sleep in atomic context bug, which is encountered
every time when link settings are changed by ethtool.
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however that helper function is
called in atomic context under a grabbed spinlock, because
phy_start_aneg() is called by phy_ethtool_ksettings_set() and by
replaced phy_ethtool_sset() helpers from phylib.
Now duplex mode setting is enforced in sh_eth_adjust_link() only,
also now RX/TX is disabled when link is put down or modifications
to E-MAC registers ECMR and GECMR are expected for both cases of
checked and ignored link status pin state from E-MAC interrupt handler.
For reference the change is a partial rework of commit 1e1b812bbe
("sh_eth: fix handling of no LINK signal").
Fixes: dc19e4e5e0 ("sh: sh_eth: Add support ethtool")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 53a710b504 ]
Since commit 35b5f6b1a8 ("PHYLIB: Locking fixes for PHY I/O
potentially sleeping") phy_start_aneg() function utilizes a mutex
to serialize changes to phy state, however the helper function is
called in atomic context.
The bug can be reproduced by running "ethtool -r" command, the bug
is reported if CONFIG_DEBUG_ATOMIC_SLEEP build option is enabled.
Fixes: dc19e4e5e0 ("sh: sh_eth: Add support ethtool")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d27e77a3de ]
All the control messages broadcast to remote routers are using
QRTR_NODE_BCAST instead of the local router NODE ID, which causes
the packets to be dropped on the remote router due to an invalid NODE ID.
Signed-off-by: Arun Kumar Neelakantam <aneela@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a9ba23d48d ]
At present the ipv6_renew_options_kern() function ends up calling into
access_ok() which is problematic if done from inside an interrupt as
access_ok() calls WARN_ON_IN_IRQ() on some (all?) architectures
(x86-64 is affected). Example warning/backtrace is shown below:
WARNING: CPU: 1 PID: 3144 at lib/usercopy.c:11 _copy_from_user+0x85/0x90
...
Call Trace:
<IRQ>
ipv6_renew_option+0xb2/0xf0
ipv6_renew_options+0x26a/0x340
ipv6_renew_options_kern+0x2c/0x40
calipso_req_setattr+0x72/0xe0
netlbl_req_setattr+0x126/0x1b0
selinux_netlbl_inet_conn_request+0x80/0x100
selinux_inet_conn_request+0x6d/0xb0
security_inet_conn_request+0x32/0x50
tcp_conn_request+0x35f/0xe00
? __lock_acquire+0x250/0x16c0
? selinux_socket_sock_rcv_skb+0x1ae/0x210
? tcp_rcv_state_process+0x289/0x106b
tcp_rcv_state_process+0x289/0x106b
? tcp_v6_do_rcv+0x1a7/0x3c0
tcp_v6_do_rcv+0x1a7/0x3c0
tcp_v6_rcv+0xc82/0xcf0
ip6_input_finish+0x10d/0x690
ip6_input+0x45/0x1e0
? ip6_rcv_finish+0x1d0/0x1d0
ipv6_rcv+0x32b/0x880
? ip6_make_skb+0x1e0/0x1e0
__netif_receive_skb_core+0x6f2/0xdf0
? process_backlog+0x85/0x250
? process_backlog+0x85/0x250
? process_backlog+0xec/0x250
process_backlog+0xec/0x250
net_rx_action+0x153/0x480
__do_softirq+0xd9/0x4f7
do_softirq_own_stack+0x2a/0x40
</IRQ>
...
While not present in the backtrace, ipv6_renew_option() ends up calling
access_ok() via the following chain:
access_ok()
_copy_from_user()
copy_from_user()
ipv6_renew_option()
The fix presented in this patch is to perform the userspace copy
earlier in the call chain such that it is only called when the option
data is actually coming from userspace; that place is
do_ipv6_setsockopt(). Not only does this solve the problem seen in
the backtrace above, it also allows us to simplify the code quite a
bit by removing ipv6_renew_options_kern() completely. We also take
this opportunity to cleanup ipv6_renew_options()/ipv6_renew_option()
a small amount as well.
This patch is heavily based on a rough patch by Al Viro. I've taken
his original patch, converted a kmemdup() call in do_ipv6_setsockopt()
to a memdup_user() call, made better use of the e_inval jump target in
the same function, and cleaned up the use of ipv6_renew_option() by
ipv6_renew_options().
CC: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
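memdup_user() is the helper that makes the early copy convenient: it allocates a kernel buffer and copies the userspace data in one step, returning an ERR_PTR on failure. A hedged kernel-context sketch of the pattern (the setsockopt plumbing is elided; the wrapper name is made up):
#include <linux/err.h>
#include <linux/errno.h>
#include <linux/string.h>
#include <linux/uaccess.h>

/* Copy option data out of userspace up front, before any code that may
 * later run in a context without access to the caller's address space. */
static void *copy_opt_from_user(const void __user *optval, int optlen)
{
    if (optlen <= 0)
        return ERR_PTR(-EINVAL);

    /* kmalloc() + copy_from_user() in one step; ERR_PTR on failure */
    return memdup_user(optval, optlen);
}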
[ Upstream commit d376bef9c2 ]
nft_compat relies on xt_request_find_match to increment
refcount of the module that provides the match/target.
The (builtin) icmp matches didn't set the module owner so it
was possible to rmmod ip(6)tables while icmp extensions were still in use.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a6032120d3 ]
Without CONFIG_GPIOLIB, some headers are not included implicitly,
leading to a build failure:
drivers/net/ieee802154/mcr20a.c: In function 'mcr20a_probe':
drivers/net/ieee802154/mcr20a.c:1347:13: error: implicit declaration of function 'irq_get_trigger_type'; did you mean 'irq_get_irqchip_state'? [-Werror=implicit-function-declaration]
Include gpio/consumer.h and irq.h directly rather than through the
gpiolib header.
Fixes: 8c6ad9cc51 ("ieee802154: Add NXP MCR20A IEEE 802.15.4 transceiver driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Xue Liu <liuxuenetmail@gmail.com>
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
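Per the message above, the fix is simply to spell out the dependencies instead of relying on what the gpiolib header happens to pull in:
#include <linux/gpio/consumer.h>    /* gpiod_* consumer API */
#include <linux/irq.h>              /* irq_get_trigger_type() */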
[ Upstream commit 492b7e8945 ]
To avoid the build warning message below, add a new generate_load()
helper that checks the return value:
ignoring return value of ‘system’, declared with attribute warn_unused_result
It also refactors the duplicated code of both
test_perf_event_all_cpu() and test_perf_event_task().
Cc: Teng Qin <qinteng@fb.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
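The warning comes from glibc marking system() with warn_unused_result; wrapping the call in a helper that checks the result addresses it in one place. A minimal userspace sketch of such a helper (the dd command line is only an example load):
#include <stdio.h>
#include <stdlib.h>

/* Kick off some CPU load; fail loudly instead of ignoring system()'s result. */
static int generate_load(void)
{
    if (system("dd if=/dev/zero of=/dev/null count=5000k") != 0) {
        fprintf(stderr, "failed to generate some load\n");
        return -1;
    }
    return 0;
}

int main(void)
{
    return generate_load();
}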
[ Upstream commit 4d5d33a085 ]
This fixes build error regarding redefinition:
CLANG-bpf samples/bpf/parse_varlen.o
samples/bpf/parse_varlen.c:111:8: error: redefinition of 'vlan_hdr'
struct vlan_hdr {
^
./include/linux/if_vlan.h:38:8: note: previous definition is here
So remove duplicate 'struct vlan_hdr' in sample code and include if_vlan.h
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8606544890 ]
This file has never existed in the upstream kernel, but it's guarded by
an #ifdef that's also never existed in the upstream kernel. As a part
of our interrupt controller refactoring this header is no longer
necessary, but this reference managed to sneak in anyway.
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d461e3da90 ]
In certain conditions, the device may not be able to link in gigabit mode. This software workaround ensures that the device will not enter the failure state.
Fixes: d0cad87170 ("SMSC75XX USB 2.0 Gigabit Ethernet Devices")
Signed-off-by: Yuiko Oshino <yuiko.oshino@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1e8e18f694 ]
There is a special case that the size is "(N << KASAN_SHADOW_SCALE_SHIFT)
Pages plus X", the value of X is [1, KASAN_SHADOW_SCALE_SIZE-1]. The
operation "size >> KASAN_SHADOW_SCALE_SHIFT" will drop X, and the
roundup operation cannot recover the missing page. For example:
size=0x28006, PAGE_SIZE=0x1000, KASAN_SHADOW_SCALE_SHIFT=3, we will get
shadow_size=0x5000, but actually we need 6 pages.
shadow_size = round_up(size >> KASAN_SHADOW_SCALE_SHIFT, PAGE_SIZE);
This can lead to a kernel crash when kasan is enabled and the value of
mod->core_layout.size or mod->init_layout.size is like above. Because
the shadow memory of X has not been allocated and mapped.
move_module:
ptr = module_alloc(mod->core_layout.size);
...
memset(ptr, 0, mod->core_layout.size); //crashed
Unable to handle kernel paging request at virtual address ffff0fffff97b000
......
Call trace:
__asan_storeN+0x174/0x1a8
memset+0x24/0x48
layout_and_allocate+0xcd8/0x1800
load_module+0x190/0x23e8
SyS_finit_module+0x148/0x180
Link: http://lkml.kernel.org/r/1529659626-12660-1-git-send-email-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Dmitriy Vyukov <dvyukov@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Libin <huawei.libin@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
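Worked through with the numbers from the report: shifting before rounding loses the fractional shadow page, while rounding the scaled size up first does not. A standalone sketch with the constants defined locally (the 'fixed' computation is one way to do it, not necessarily the exact upstream hunk):
#include <stdio.h>

#define PAGE_SIZE                 0x1000UL
#define KASAN_SHADOW_SCALE_SHIFT  3
#define KASAN_SHADOW_SCALE_SIZE   (1UL << KASAN_SHADOW_SCALE_SHIFT)
#define round_up(x, y)            ((((x) - 1) | ((y) - 1)) + 1)

int main(void)
{
    unsigned long size = 0x28006;

    /* shift first: the 6 trailing bytes are lost -> 0x5000 -> 5 pages */
    unsigned long buggy = round_up(size >> KASAN_SHADOW_SCALE_SHIFT, PAGE_SIZE);

    /* round the scaled size up first: 0x5001 -> 0x6000 -> 6 pages */
    unsigned long scaled = (size + KASAN_SHADOW_SCALE_SIZE - 1)
                            >> KASAN_SHADOW_SCALE_SHIFT;
    unsigned long fixed = round_up(scaled, PAGE_SIZE);

    printf("buggy=0x%lx fixed=0x%lx\n", buggy, fixed);  /* 0x5000 vs 0x6000 */
    return 0;
}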
[ Upstream commit 26b68dd2f4 ]
Silence warnings (triggered at W=1) by adding relevant __printf attributes.
CC kernel/trace/trace.o
kernel/trace/trace.c: In function ‘__trace_array_vprintk’:
kernel/trace/trace.c:2979:2: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
len = vscnprintf(tbuffer, TRACE_BUF_SIZE, fmt, args);
^~~
AR kernel/trace/built-in.o
Link: http://lkml.kernel.org/r/20180308205843.27447-1-malat@debian.org
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
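The attribute being added is GCC's format(printf, ...); for a function that takes a va_list, the 'first variadic argument' index is 0. A standalone sketch of the annotation (the kernel spells it via a __printf() macro, reproduced here):
#include <stdarg.h>
#include <stdio.h>

#define __printf(a, b)  __attribute__((format(printf, a, b)))

/* va_list variant: arguments cannot be type-checked, so the last index is 0. */
static __printf(1, 0) int my_vlog(const char *fmt, va_list args)
{
    return vprintf(fmt, args);
}

/* Normal variadic variant: fmt is argument 1, varargs start at argument 2. */
static __printf(1, 2) int my_log(const char *fmt, ...)
{
    va_list args;
    int len;

    va_start(args, fmt);
    len = my_vlog(fmt, args);
    va_end(args);
    return len;
}

int main(void)
{
    return my_log("answer=%d\n", 42) < 0;
}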
[ Upstream commit 52ee6ef36e ]
The current code does not inspect the return value of skb_to_sgvec. This
can cause a nullptr kernel panic when the malformed sgvec is passed into
the crypto request.
Checking the return value of skb_to_sgvec and skipping decryption if it
is negative fixes this problem.
Fixes: c46234ebb4 ("tls: RX path for ktls")
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
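skb_to_sgvec() returns the number of scatterlist entries it filled in, or a negative errno; the fix propagates that error instead of handing a half-built sgvec to the crypto layer. A hedged kernel-context sketch of the call pattern (not the exact tls code):
#include <linux/scatterlist.h>
#include <linux/skbuff.h>

static int build_sg_for_decrypt(struct sk_buff *skb, struct scatterlist *sg,
                                int offset, int len)
{
    int nsg = skb_to_sgvec(skb, sg, offset, len);

    /* A negative return means the sgvec could not be built; bail out
     * rather than handing a malformed list to the crypto request. */
    if (nsg < 0)
        return nsg;

    /* ...the crypto request would consume the 'nsg' filled entries here... */
    return 0;
}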
[ Upstream commit 3b8d573586 ]
The touch sensors on the 2nd-gen Intuos tablets don't use a 4096x4096
sensor like other similar tablets (3rd-gen Bamboo, Intuos5, etc.).
The incorrect maximum XY values don't normally affect userspace since
touch input from these devices is typically relative rather than
absolute. It does, however, cause problems when absolute distances
need to be measured, e.g. for gesture recognition. Since the resolution
of the touch sensor on these devices is 10 units / mm (versus 100 for
the pen sensor), the proper maximum values can be calculated by simply
dividing by 10.
Fixes: b5fd2a3e92 ("Input: wacom - add support for three new Intuos devices")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1cef1150ef ]
Gaurav reports that commit:
85f1abe001 ("kthread, sched/wait: Fix kthread_parkme() completion issue")
isn't working for him. Because of the following race:
> controller Thread                     CPUHP Thread
> takedown_cpu
> kthread_park
> kthread_parkme
> Set KTHREAD_SHOULD_PARK
>                                       smpboot_thread_fn
>                                       set Task interruptible
>
> wake_up_process
> if (!(p->state & state))
>         goto out;
>
>                                       Kthread_parkme
>                                       SET TASK_PARKED
>                                       schedule
>                                       raw_spin_lock(&rq->lock)
> ttwu_remote
> waiting for __task_rq_lock
>                                       context_switch
>
>                                       finish_lock_switch
>
>                                       Case TASK_PARKED
>                                       kthread_park_complete
>
> SET Running
Furthermore, Oleg noticed that the whole scheduler TASK_PARKED
handling is buggered because the TASK_DEAD thing is done with
preemption disabled, the current code can still complete early on
preemption :/
So basically revert that earlier fix and go with a variant of the
alternative mentioned in the commit. Promote TASK_PARKED to special
state to avoid the store-store issue on task->state leading to the
WARN in kthread_unpark() -> __kthread_bind().
But in addition, add wait_task_inactive() to kthread_park() to ensure
the task really is PARKED when we return from kthread_park(). This
avoids the whole kthread still gets migrated nonsense -- although it
would be really good to get this done differently.
Reported-by: Gaurav Kohli <gkohli@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 85f1abe001 ("kthread, sched/wait: Fix kthread_parkme() completion issue")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3482d98bbc ]
When a cfs_rq is throttled, parent cfs_rq->nr_running is decreased and
everything happens at cfs_rq level. Currently util_est stays unchanged
in such a case and keeps accounting the utilization of throttled tasks.
This can somewhat make sense as we don't dequeue tasks but only throttled
cfs_rq.
If a task of another group is enqueued/dequeued and root cfs_rq becomes
idle during the dequeue, util_est will be cleared whereas it was
accounting util_est of throttled tasks before. So the behavior of util_est
is not always the same regarding throttled tasks and depends on side
activity. Furthermore, util_est will not be updated when the cfs_rq is
unthrottled, as everything happens at the cfs_rq level. The main result is that
util_est will stay null whereas we now have running tasks. We have to wait
for the next dequeue/enqueue of the previously throttled tasks to get an
up to date util_est.
Remove the assumption that cfs_rq's estimated utilization of a CPU is 0
if there is no running task so the util_est of a task remains until the
latter is dequeued even if its cfs_rq has been throttled.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 7f65ea42eb ("sched/fair: Add util_est on top of PELT")
Link: http://lkml.kernel.org/r/1528972380-16268-1-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 512ac999d2 ]
I noticed that cgroup task groups constantly get throttled even
if they have low CPU usage, this causes some jitters on the response
time to some of our business containers when enabling CPU quotas.
It's very simple to reproduce:
mkdir /sys/fs/cgroup/cpu/test
cd /sys/fs/cgroup/cpu/test
echo 100000 > cpu.cfs_quota_us
echo $$ > tasks
then repeat:
cat cpu.stat | grep nr_throttled # nr_throttled will increase steadily
After some analysis, we found that cfs_rq::runtime_remaining will
be cleared by expire_cfs_rq_runtime() due to two equal but stale
"cfs_{b|q}->runtime_expires" after period timer is re-armed.
The current condition to judge clock drift in expire_cfs_rq_runtime()
is wrong, the two runtime_expires are actually the same when clock
drift happens, so this condition can never hit. The original design was
correctly done by this commit:
a9cf55b286 ("sched: Expire invalid runtime")
... but was changed to be the current implementation due to its locking bug.
This patch introduces another way, it adds a new field in both structures
cfs_rq and cfs_bandwidth to record the expiration update sequence, and
uses them to figure out if clock drift happens (true if they are equal).
Signed-off-by: Xunlei Pang <xlpang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 51f2176d74 ("sched/fair: Fix unlocked reads of some cfs_b->quota/period")
Link: http://lkml.kernel.org/r/20180620101834.24455-1-xlpang@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d9c0ffcabd ]
Some people have reported that the warning in sched_tick_remote()
occasionally triggers, especially in favour of some RCU-Torture
pressure:
WARNING: CPU: 11 PID: 906 at kernel/sched/core.c:3138 sched_tick_remote+0xb6/0xc0
Modules linked in:
CPU: 11 PID: 906 Comm: kworker/u32:3 Not tainted 4.18.0-rc2+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Workqueue: events_unbound sched_tick_remote
RIP: 0010:sched_tick_remote+0xb6/0xc0
Code: e8 0f 06 b8 00 c6 03 00 fb eb 9d 8b 43 04 85 c0 75 8d 48 8b 83 e0 0a 00 00 48 85 c0 75 81 eb 88 48 89 df e8 bc fe ff ff eb aa <0f> 0b eb
+c5 66 0f 1f 44 00 00 bf 17 00 00 00 e8 b6 2e fe ff 0f b6
Call Trace:
process_one_work+0x1df/0x3b0
worker_thread+0x44/0x3d0
kthread+0xf3/0x130
? set_worker_desc+0xb0/0xb0
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x35/0x40
This happens when the remote tick applies on an idle task. Usually the
idle_cpu() check avoids that, but it is performed before we lock the
runqueue and it is therefore racy. It was intended to be that way in
order to prevent useless runqueue locks since the idle task tick
callback is a no-op.
Now if the racy check slips out of our hands and we end up remotely
ticking an idle task, the empty task_tick_idle() is harmless. Still
it won't pass the WARN_ON_ONCE() test that ensures rq_clock_task() is
not too far from curr->se.exec_start because update_curr_idle() doesn't
update the exec_start value like other scheduler policies. Hence the
reported false positive.
So let's have another check, while the rq is locked, to make sure we
don't remote tick on an idle task. The lockless idle_cpu() still applies
to avoid unnecessary rq lock contention.
Reported-by: Jacek Tomaka <jacekt@dug.com>
Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1530203381-31234-1-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6897e6ecb3 ]
We found that the original implementation will only use the built-in dtb
pointer instead of the pointer passed from the bootloader. This bug is fixed
by this patch.
Signed-off-by: Greentime Hu <greentime@andestech.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 74cb319bd9 ]
pm_runtime_put_sync() gets called every time in xhci_dbc_stop().
If dbc is not started, this incorrectly makes the runtime PM counter
become 0 and calls the autosuspend function. Then we'll keep seeing this:
[54664.762220] xhci_hcd 0000:00:14.0: Root hub is not suspended
So only call pm_runtime_put_sync() when dbc was started.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5dc2d3996a ]
After we change the ipvlan mode from l3 to l2, or vice versa, we only
reset the IFF_NOARP flag, but don't flush the ARP table cache, which will
cause eth->h_dest to be equal to eth->h_source in ipvlan_xmit_mode_l2().
Then the packet will never leave the host.
Here is the reproducer on local host:
ip link set eth1 up
ip addr add 192.168.1.1/24 dev eth1
ip link add link eth1 ipvlan1 type ipvlan mode l3
ip netns add net1
ip link set ipvlan1 netns net1
ip netns exec net1 ip link set ipvlan1 up
ip netns exec net1 ip addr add 192.168.2.1/24 dev ipvlan1
ip route add 192.168.2.0/24 via 192.168.1.2
ping 192.168.2.2 -c 2
ip netns exec net1 ip link set ipvlan1 type ipvlan mode l2
ping 192.168.2.2 -c 2
Add the same configuration on remote host. After we set the mode to l2,
we could find that the src/dst MAC addresses are the same on eth1:
21:26:06.648565 00:b7:13:ad:d3:05 > 00:b7:13:ad:d3:05, ethertype IPv4 (0x0800), length 98: (tos 0x0, ttl 64, id 58356, offset 0, flags [DF], proto ICMP (1), length 84)
192.168.2.1 > 192.168.2.2: ICMP echo request, id 22686, seq 1, length 64
Fix this by calling dev_change_flags(), which will call netdevice notifier
with flag change info.
v2:
a) As pointed out by Wang Cong, check the return value of dev_change_flags()
when changing dev flags.
b) As suggested by Stefano and Sabrina, move flags setting before l3mdev_ops.
So we don't need to redo ipvlan_{, un}register_nf_hook() again in err path.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 2ad7bf3638 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 08b393d01c ]
Since the following commit:
cd77849a69 ("objtool: Fix GCC 8 cold subfunction detection for aliased functions")
... if the kernel is built with EXTRA_CFLAGS='-fno-reorder-functions',
objtool can get stuck in an infinite loop.
That flag causes the new GCC 8 cold subfunctions to be placed in .text
instead of .text.unlikely. But it also has an unfortunate quirk: in the
symbol table, the subfunction (e.g., nmi_panic.cold.7) is nested inside
the parent (nmi_panic).
That function overlap confuses objtool, and causes it to get into an
infinite loop in next_insn_same_func(). Here's Allan's description of
the loop:
"Objtool iterates through the instructions in nmi_panic using
next_insn_same_func. Once it reaches the end of nmi_panic at 0x534 it
jumps to 0x528 as that's the start of nmi_panic.cold.7. However, since
the instructions starting at 0x528 are still associated with nmi_panic
objtool will get stuck in a loop, continually jumping back to 0x528
after reaching 0x534."
Fix it by shortening the length of the parent function so that the
functions no longer overlap.
Reported-and-analyzed-by: Allan Xavier <allan.x.xavier@oracle.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Allan Xavier <allan.x.xavier@oracle.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9e704c52bee651129b036be14feda317ae5606ae.1530136978.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ecd60532e0 ]
Booting a ColdFire m68k core with MMU enabled causes a "bad page state"
oops since commit 1d40a5ea01 ("mm: mark pages in use for page tables"):
BUG: Bad page state in process sh pfn:01ce2
page:004fefc8 count:0 mapcount:-1024 mapping:00000000 index:0x0
flags: 0x0()
raw: 00000000 00000000 00000000 fffffbff 00000000 00000100 00000200 00000000
raw: 039c4000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 22 Comm: sh Not tainted 4.17.0-07461-g1d40a5ea01d5 #13
Fix by calling pgtable_page_dtor() in our __pte_free_tlb() code path,
so that the PG_table flag is cleared before we free the pte page.
Note that I had to change pte_free() from extern to static.
Otherwise you get a lot of warnings like this:
./arch/m68k/include/asm/mcf_pgalloc.h:80:2: warning: ‘pgtable_page_dtor’ is static but used in inline function ‘pte_free’ which is not static
pgtable_page_dtor(page);
^
And making it static is consistent with our use of this in the other
m68k pgalloc definitions of pte_free().
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
CC: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a068b94d74 ]
Building the kernel with CONFIG_THUMB2_KERNEL=y and
CONFIG_CRYPTO_SPECK_NEON set fails with the following errors:
arch/arm/crypto/speck-neon-core.S: Assembler messages:
arch/arm/crypto/speck-neon-core.S:419: Error: r13 not allowed here -- `bic sp,#0xf'
arch/arm/crypto/speck-neon-core.S:423: Error: r13 not allowed here -- `bic sp,#0xf'
arch/arm/crypto/speck-neon-core.S:427: Error: r13 not allowed here -- `bic sp,#0xf'
arch/arm/crypto/speck-neon-core.S:431: Error: r13 not allowed here -- `bic sp,#0xf'
The problem is that the 'bic' instruction can't operate on the 'sp'
register in Thumb2 mode. Fix it by using a temporary register. This
isn't in the main loop, so the performance difference is negligible.
This also matches what aes-neonbs-core.S does.
Reported-by: Stefan Agner <stefan@agner.ch>
Fixes: ede9622162 ("crypto: arm/speck - add NEON-accelerated implementation of Speck-XTS")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ae15a41a64 ]
Originally in patch e6d20c55a4 ("openrisc: entry: Fix delay slot
detection") I fixed delay slot detection, but only for QEMU. We missed
that hardware delay slot detection using delay slot exception flag (DSX)
was still broken. This was because QEMU set the DSX flag in both
pre-exception supervision register (ESR) and supervision register (SR)
register, but on real hardware the DSX flag is only set on the SR
register during exceptions.
Fix this by carrying the DSX flag into the SR register during exception.
We also update the DSX flag read locations to read the value from the SR
register not the pt_regs SR register which represents ESR. The ESR
should never have the DSX flag set.
In the process I updated/removed a few comments to match the current
state, including removing a comment saying that the DSX detection logic
was inefficient and needed to be rewritten.
I have tested this on QEMU with a patch ensuring it matches the hardware
specification.
Link: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00000.html
Fixes: e6d20c55a4 ("openrisc: entry: Fix delay slot detection")
Signed-off-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1273c253c3 ]
Commit 546eb0317c "libnvdimm, pmem: Do not flush power-fail protected CPU caches"
fixed the write_cache detection to correctly show the lack of a write
cache based on the platform capabilities described in the ACPI NFIT. The
nfit_test unit tests expected a write cache to be present, so change the
nfit test namespaces to only advertise a persistence domain limited to
the memory controller. This allows the kernel to show a write_cache
attribute, and the test behaviour remains unchanged.
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c1985cefd8 ]
cmd_rc is passed in by reference to the acpi_nfit_ctl() function and the
caller expects a value to be returned. However, when the package is passed through
via the ND_CMD_CALL command, cmd_rc is not touched. Make sure cmd_rc is
always set.
Fixes: aef2533822 ("libnvdimm, nfit: centralize command status translation")
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d025da9eb1 ]
commit e830baa9c3 ("qeth: restore device features after recovery") and
commit ce34435641 ("s390/qeth: rely on kernel for feature recovery")
made sure that the HW functions for device features get re-programmed
after recovery.
But we missed that the same handling is also required when a card is
first set offline (destroying all HW context), and then online again.
Fix this by moving the re-enable action out of the recovery-only path.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 595e802e53 ]
The DPAA HW requires that at least 256 bytes from the start of the
first scatter-gather table entry are allocated and accessible. The
hardware reads the maximum size the table can have in one access,
thus requiring that the allocation and mapping be done for the
maximum size of 256B even if there is a smaller number of entries
in the table.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b95f6fbc8e ]
The FMan hardware parser needs to be configured to remove the
short frame padding from the checksum calculation, otherwise
short UDP and TCP frames are likely to be marked as having a
bad checksum.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 484c016d93 ]
The driver performs an internal reload when it receives a tx-timeout event from
the OS. The internal reload might fail in some scenarios, e.g., fatal HW issues.
In such cases the OS still sees the link as up, which would result in undesirable
behavior such as repeated tx-timeouts.
The patch addresses this issue by indicating link-down to the OS when a
tx-timeout is detected, and keeping the link in the down state till the
internal reload is successful.
Please consider applying it to 'net' branch.
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3dc6ddfedc ]
The call to of_get_next_child() returns a node pointer with refcount
incremented thus it must be explicitly decremented here in the error
path and after the last usage.
Fixes: d3c68e0a7e ("PCI: faraday: Add Faraday Technology FTPCI100 PCI Host Bridge driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 342639d996 ]
The call to of_get_next_child() returns a node pointer with
refcount incremented thus it must be explicitly decremented
here after the last usage.
Fixes: ab597d35ef ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8c3f9bd851 ]
The call to of_get_next_child() returns a node pointer with refcount
incremented thus it must be explicitly decremented here after the last
usage.
Fixes: 8961def568 ("PCI: xilinx: Add Xilinx AXI PCIe Host Bridge IP driver")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
[lorenzo.pieralisi@arm.com: reworked commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f605ce5eb2 ]
If we ever fail in the bpf_jit_prog() pass that writes the
actual insns to the image after we got the header via bpf_jit_binary_alloc(),
then we also need to make sure to free it through bpf_jit_binary_free()
again when bailing out. Given we had prior bpf_jit_prog() passes to
initially probe for clobbered registers and program size and to fill in
the addrs array for jump targets, this is more of a theoretical one,
but at least make sure this doesn't break with future changes.
Fixes: 0546231057 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1f57f8d442 ]
Some devices have different queue limits depending on the type of IO. A
classic case is SATA NCQ, where some commands can queue, but others
cannot. If we have NCQ commands inflight and encounter a non-queueable
command, the driver returns busy. Currently we attempt to dispatch more
from the scheduler, if we were able to queue some commands. But for the
case where we ended up stopping due to BUSY, we should not attempt to
retrieve more from the scheduler. If we do, we can get into a situation
where we attempt to queue a non-queueable command, get BUSY, then
successfully retrieve more commands from that scheduler and queue those.
This can repeat forever, starving the non-queueable command indefinitely.
Fix this by NOT attempting to pull more commands from the scheduler, if
we get a BUSY return. This should also be more optimal in terms of
letting requests stay in the scheduler for as long as possible, if we
get a BUSY due to the regular out-of-tags condition.
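Illustrative pseudo-C of the dispatch loop change; the helper names here are
placeholders, not the exact blk-mq internals:
    for (;;) {
            struct request *rq = sched_dispatch_request(hctx);  /* placeholder */

            if (!rq)
                    break;                   /* scheduler is empty */
            if (!dispatch_to_driver(rq)) {   /* placeholder; driver returned BUSY */
                    /* stop here: do not pull more from the scheduler, let the
                     * busy (e.g. non-queueable NCQ) command be retried first */
                    break;
            }
    }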
Reviewed-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd209ef809 ]
Fix following issues related to planar YUV pixel format configuration:
- NV16/61 modes were incorrectly programmed as NV12/21,
- YVU420 was programmed as YUV420 on source,
- YVU420 and YUV422 were programmed as YUV420 on output.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 188f60ab8e ]
Commit 9757235f45 ("nl80211: correct checks for
NL80211_MESHCONF_HT_OPMODE value") relaxed the range for the HT
operation field in meshconf, while also adding checks requiring
the non-greenfield and non-ht-sta bits to be set in certain
circumstances. The latter bit is actually reserved for mesh BSSes
according to Table 9-168 in 802.11-2016, so in fact it should not
be set.
wpa_supplicant sets these bits because the mesh and AP code share
the same implementation, but authsae does not. As a result, some
meshconf updates from authsae which set only the NONHT_MIXED
protection bits were being rejected.
In order to avoid breaking userspace by changing the rules again,
simply accept the values with or without the bits set, and mask
off the reserved bit to match the spec.
While in here, update the 802.11-2012 reference to 802.11-2016.
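Roughly what the fix does when parsing the meshconf attribute (a sketch, not
the exact diff):
    u16 ht_opmode = nla_get_u16(tb[NL80211_MESHCONF_HT_OPMODE]);

    /* accept values with or without the non-GF / non-HT-STA bits set,
     * then mask off the bit that is reserved for mesh BSSes */
    ht_opmode &= ~IEEE80211_HT_OP_MODE_NON_HT_STA_PRSNT;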
Fixes: 9757235f45 ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value")
Cc: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Bob Copeland <bobcopeland@fb.com>
Reviewed-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e7441c9274 ]
On pre-emption enabled kernels the following print was being seen due to
missing local_bh_disable/local_bh_enable calls. mac80211 assumes that
pre-emption is disabled in the data path.
BUG: using smp_processor_id() in preemptible [00000000] code: iwd/517
caller is __ieee80211_subif_start_xmit+0x144/0x210 [mac80211]
[...]
Call Trace:
dump_stack+0x5c/0x80
check_preemption_disabled.cold.0+0x46/0x51
__ieee80211_subif_start_xmit+0x144/0x210 [mac80211]
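In essence the fix brackets the transmit call seen in the trace with BH
disabling so the data path runs with preemption disabled (a minimal sketch of
the pattern, not the full diff):
    local_bh_disable();
    __ieee80211_subif_start_xmit(skb, skb->dev, flags);
    local_bh_enable();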
Fixes: 9118064914 ("mac80211: Add support for tx_control_port")
Signed-off-by: Denis Kenzior <denkenz@gmail.com>
[commit message rewrite, fixes tag]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bda3153998 ]
During assembly, the spare marked for replacement is not checked, so
conf->fullsync cannot be updated to 1. As a result, recovery treats the
array as clean and all recovering sectors are skipped. The original
device is replaced with the not-yet-recovered spare.
mdadm -C /dev/md0 -l10 -n4 -pn2 /dev/loop[0123]
mdadm /dev/md0 -a /dev/loop4
mdadm /dev/md0 --replace /dev/loop0
mdadm -S /dev/md0 # stop array during recovery
mdadm -A /dev/md0 /dev/loop[01234]
After reassembly, recovery appears to run, but it completes immediately;
in fact no recovery is actually performed.
To solve this problem, add the missing logic for replacement spares.
(raid1.c and raid5.c already perform this check.)
Reported-by: Alex Chen <alexchen@synology.com>
Reviewed-by: Alex Wu <alexwu@synology.com>
Reviewed-by: Chung-Chiang Cheng <cccheng@synology.com>
Signed-off-by: BingJing Chang <bingjingc@synology.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9191fc2a43 ]
When a station initially connects to an AP with a narrower bandwidth and
the AP later changes to a wider bandwidth, the AP will beacon with the wider
bandwidth IE, e.g. VHT20->VHT40->VHT80 or VHT40->VHT80.
Since the supported bandwidth is limited by the PHYMODE, when the station
receives the bandwidth change request it also needs to reconfigure the
PHYMODE setting in firmware instead of only updating the bandwidth info;
otherwise it triggers a firmware crash with an unsupported bandwidth.
The issue was observed in WLAN.RM.4.4.1-00051-QCARMSWP-1, QCA6174 with
below scenario:
AP xxx changed bandwidth, new config is 5200 MHz, width 2 (5190/0 MHz)
disconnect from AP xxx for new auth to yyy
RX ReassocResp from xxx (capab=0x1111 status=0 aid=102)
associated
....
AP xxx changed bandwidth, new config is 5200 MHz, width 2 (5190/0 MHz)
AP xxx changed bandwidth, new config is 5200 MHz, width 3 (5210/0 MHz)
....
firmware register dump:
[00]: 0x05030000 0x000015B3 0x00987291 0x00955B31
[04]: 0x00987291 0x00060730 0x00000004 0x00000001
[08]: 0x004089F0 0x00955A00 0x000A0B00 0x00400000
[12]: 0x00000009 0x00000000 0x00952CD0 0x00952CE6
[16]: 0x00952CC4 0x0098E25F 0x00000000 0x0091080D
[20]: 0x40987291 0x0040E7A8 0x00000000 0x0041EE3C
[24]: 0x809ABF05 0x0040E808 0x00000000 0xC0987291
[28]: 0x809A650C 0x0040E948 0x0041FE40 0x004345C4
[32]: 0x809A5C63 0x0040E988 0x0040E9AC 0x0042D1A8
[36]: 0x8091D252 0x0040E9A8 0x00000002 0x00000001
[40]: 0x809FDA9D 0x0040EA58 0x0043D554 0x0042D554
[44]: 0x809F8B22 0x0040EA78 0x0043D554 0x00000001
[48]: 0x80911210 0x0040EAC8 0x00000010 0x004041D0
[52]: 0x80911154 0x0040EB28 0x00400000 0x00000000
[56]: 0x8091122D 0x0040EB48 0x00000000 0x00400600
Reported-by: Rouven Czerwinski <rouven@czerwinskis.de>
Tested-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Ryan Hsu <ryanhsu@codeaurora.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e3f329c600 ]
The reported residue is already calculated in BURST unit granularity, so
advertise this capability properly to other devices in the system.
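The advertisement is a one-liner on the driver's struct dma_device (sketch;
'pd' is the name assumed here for that structure):
    pd->residue_granularity = DMA_RESIDUE_GRANULARITY_BURST;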
Fixes: aee4d1fac8 ("dmaengine: pl330: improve pl330_tx_status() function")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1c38f4afd5 ]
meson-gxl-mali.dtsi is only used on GXL SoCs. Thus it should use the GXL
specific compatible string instead of the GXBB one.
For now this is purely cosmetic since the (out-of-tree) lima driver for
this GPU currently uses the "arm,mali-450" match instead of the SoC
specific one. However, update the .dts to match the documentation since
this driver behavior might change in the future.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6d28d57751 ]
Like the odroid-c2 and wetek, the s400 uses the RTL8211F and seems to
suffer from the same kind of stability issue.
Doing an iperf3 download test, we can see a significant number of LPI
interrupts on the tx path. After a short while (5 to 15 seconds), the
network connection dies. If using rootfs over NFS, the connection may
also break during the boot sequence.
We still don't have a real explanation for this problem so let's disable
EEE once again.
Fixes: f6f6ac914b ("ARM64: dts: meson-axg: enable ethernet for A113D S400 board")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 429388682d ]
The implementation of flush_icache_range() includes instruction sequences
which are themselves patched at runtime, so it is not safe to call from
the patching framework.
This patch reworks the alternatives cache-flushing code so that it rolls
its own internal D-cache maintenance using DC CIVAC before invalidating
the entire I-cache after all alternatives have been applied at boot.
Modules don't cause any issues, since flush_icache_range() is safe to
call by the time they are loaded.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Rohit Khanna <rokhanna@nvidia.com>
Cc: Alexander Van Brunt <avanbrunt@nvidia.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 86676c4685 ]
This patch fixes the wrong name of the headphone widget used for receiving
headphone plug insert/remove events from simple-card or audio-graph-card.
If we use the wrong widget name then we get warning messages such as
"asoc-audio-graph-card sound: ASoC: DAPM unknown pin Headphones"
when the plug is inserted into or removed from the headphone jack.
Fixes: fb21a0acaa ("arm64: dts: uniphier: add sound node")
Signed-off-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3eb1b955cd ]
The intc #interrupt-cells is equal to 1, but the gpio node currently has
2 cells per IRQ, which is wrong. Remove the additional cell for each of
the interrupts.
Signed-off-by: Keerthy <j-keerthy@ti.com>
Fixes: 2e38b946dc ("ARM: davinci: da850: add GPIO DT node")
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c809195f55 ]
The RDS core module creates rds_connections based on callbacks
from rds_loop_transport when sending/receiving packets to local
addresses.
These connections will need to be cleaned up when they are
created from a netns that is not init_net, and that netns is deleted.
Add the changes aligned with the changes from
commit ebeeb1ad9b ("rds: tcp: use rds_destroy_pending() to synchronize
netns/module teardown and rds connection/workq management") for
rds_loop_transport
Reported-and-tested-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a8d70a054a ]
In smartnic env, if the host (PF) driver is not an e-switch manager, we
are not allowed to apply eswitch ports setups such as vlan (VST),
spoof-checks, min/max rate or state.
Make sure we are eswitch manager when coming to issue these callbacks
and err otherwise.
Also fix the definition of ESW_ALLOWED to rely on eswitch_manager
capability and on the vport_group_manger.
Operations on the VF nic vport context, such as setting a mac or reading
the vport counters are allowed to the PF in this scheme.
The modify nic vport guid code was modified to omit checking the
nic_vport_node_guid_modify eswitch capability.
The reason for doing so is that modifying node guid requires vport group
manager capability, and there's no need to check further capabilities.
1. set_vf_vlan - disallowed
2. set_vf_spoofchk - disallowed
3. set_vf_mac - allowed
4. get_vf_config - allowed
5. set_vf_trust - disallowed
6. set_vf_rate - disallowed
7. get_vf_stat - allowed
8. set_vf_link_state - disallowed
Fixes: f942380c12 ('net/mlx5: E-Switch, Vport ingress/egress ACLs rules for spoofchk')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dffd22aed2 ]
When proc_dostring() is called with a non-zero offset in strict mode, it
doesn't just write to the ->data buffer, it also reads. Make sure it
doesn't read uninitialized data.
Fixes: c6ac37d8d8 ("netfilter: nf_log: fix error on write NONE to [...]")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1b6fe9798a ]
When booting from MMC/SD in rw mode on DA850 EVM, the system
crashes because the write protect pin should be active high
and not active low. This patch fixes the polarity of the WP pin.
Fixes: bdf0e8364f ("ARM: davinci: da850-evm: use gpio descriptor for mmc pins")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 90f26cc6bb ]
Commit 4e88d4c083 ("usb: add a flag to skip PHY
initialization to struct usb_hcd") deleted the assignment of
hcd->usb_phy. This causes usb_phy_notify_connect/disconnect
not to be called, so the USB PHY driver is not notified of the hot plug
event and the disconnection will not be detected by the hardware.
Fixes: 4e88d4c083 ("usb: add a flag to skip PHY initialization to struct usb_hcd")
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reported-by: Mats Karrman <mats.dev.list@gmail.com>
Tested-by: Mats Karrman <mats.dev.list@gmail.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 983107072b ]
Currently we can hit following assert when running numa bench:
$ perf bench numa mem -p 3 -t 1 -P 512 -s 100 -zZ0cm --thp 1
perf: bench/numa.c:1577: __bench_numa: Assertion `!(!(((wait_stat) & 0x7f) == 0))' failed.
The assertion is correct, because we hit the SIGFPE in following line:
Thread 2.2 "thread 0/0" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7fffd28c6700 (LWP 11750)]
0x000.. in worker_thread (__tdata=0x7.. ) at bench/numa.c:1257
1257 td->speed_gbs = bytes_done / (td->runtime_ns / NSEC_PER_SEC) / 1e9;
We don't check whether the runtime is actually longer than 1 second,
so this can end up as a division by zero in the FPU.
Add the check to prevent this.
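A hedged sketch of the guard, reusing the names from the line quoted above:
    u64 secs = td->runtime_ns / NSEC_PER_SEC;

    /* avoid the FPU division by zero when the runtime is under one second */
    td->speed_gbs = secs ? bytes_done / secs / 1e9 : 0.0;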
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180620094036.17278-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c6555c1457 ]
Arnaldo reported the perf build failure with latest llvm/clang compiler
(7.0).
$ make LIBCLANGLLVM=1 -C tools/perf/
<SNIP>
CC /tmp/tmp.t53Qo38zci/tests/kmod-path.o
util/c++/clang.cpp: In function ‘std::unique_ptr<llvm::SmallVectorImpl<char> >
perf::getBPFObjectFromModule(llvm::Module*)’:
util/c++/clang.cpp:150:43: error: no matching function for call to
‘llvm::TargetMachine::addPassesToEmitFile(llvm::legacy::PassManager&,
llvm::raw_svector_ostream&, llvm::TargetMachine::CodeGenFileType)’
TargetMachine::CGFT_ObjectFile)) {
^
In file included from util/c++/clang.cpp:25:0:
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note: candidate:
virtual bool llvm::TargetMachine::addPassesToEmitFile(
llvm::legacy::PassManagerBase&, llvm::raw_pwrite_stream&,
llvm::raw_pwrite_stream*, llvm::TargetMachine::CodeGenFileType, bool,
llvm::MachineModuleInfo*)
virtual bool addPassesToEmitFile(PassManagerBase &, raw_pwrite_stream &,
^~~~~~~~~~~~~~~~~~~
/usr/local/include/llvm/Target/TargetMachine.h:254:16: note:
candidate expects 6 arguments, 3 provided
mv: cannot stat '/tmp/tmp.t53Qo38zci/util/c++/.clang.o.tmp': No such file or directory
make[7]: *** [/home/acme/git/perf/tools/build/Makefile.build:101:
/tmp/tmp.t53Qo38zci/util/c++/clang.o] Error 1
make[6]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: c++] Error 2
make[5]: *** [/home/acme/git/perf/tools/build/Makefile.build:139: util] Error 2
make[5]: *** Waiting for unfinished jobs....
CC /tmp/tmp.t53Qo38zci/tests/thread-map.o
The addPassesToEmitFile() function signature changed in LLVM 7.0, and that
change caused the failure. This patch fixes the issue by using the
proper function signature for each compiler version.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20180616174739.1076733-1-yhs@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 933ccf2002 ]
Add missing error handling for parse_events calls in test_event function
that led to following segfault on s390:
running test 52 'intel_pt//u'
perf: Segmentation fault
...
/lib64/libc.so.6(vasprintf+0xe6) [0x3fffca3f106]
/lib64/libc.so.6(asprintf+0x46) [0x3fffca1aa96]
./perf(parse_events_add_pmu+0xb8) [0x80132088]
./perf(parse_events_parse+0xc62) [0x8019529a]
./perf(parse_events+0x98) [0x801341c0]
./perf(test__parse_events+0x48) [0x800cd140]
./perf(cmd_test+0x26a) [0x800bd44a]
test child interrupted
Adding the struct parse_events_error argument to parse_events call. Also
adding parse_events_print_error to get more details on the parsing
failures, like:
# perf test 6 -v
running test 52 'intel_pt//u'failed to parse event 'intel_pt//u', err 1, str 'Cannot find PMU `intel_pt'. Missing kernel support?'
event syntax error: 'intel_pt//u'
\___ Cannot find PMU `intel_pt'. Missing kernel support?
Committer note:
Use named initializers in the struct parse_events_error variable to
avoid breaking the build on centos5, 6 and others with a similar gcc:
cc1: warnings being treated as errors
tests/parse-events.c: In function 'test_event':
tests/parse-events.c:1696: error: missing initializer
tests/parse-events.c:1696: error: (near initialization for 'err.str')
Reported-by: Kim Phillips <kim.phillips@arm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20180611093422.1005-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 143c99f6ac ]
For some cases, the callchain provided by the kernel may be empty. So,
the callchain ip filtering code will cause a crash if we do not check
whether the struct ip_callchain pointer is NULL before accessing any
members.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
# perf record -b -e cycles:u ls
Before:
# perf report --branch-history
perf: Segmentation fault
-------- backtrace --------
perf[0x1027615c]
linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x7fff856304d8]
perf(arch_skip_callchain_idx+0x44)[0x10257c58]
perf[0x1017f2e4]
perf(thread__resolve_callchain+0x124)[0x1017ff5c]
perf(sample__resolve_callchain+0xf0)[0x10172788]
...
After:
# perf report --branch-history
Samples: 25 of event 'cycles:u', Event count (approx.): 2306870
Overhead Source:Line Symbol Shared Object
+ 11.60% _init+35736 [.] _init ls
+ 9.84% strcoll_l.c:137 [.] __strcoll_l libc-2.26.so
+ 9.16% memcpy.S:175 [.] __memcpy_power7 libc-2.26.so
+ 9.01% gconv_charset.h:54 [.] _nl_find_locale libc-2.26.so
+ 8.87% dl-addr.c:52 [.] _dl_addr libc-2.26.so
+ 8.83% _init+236 [.] _init ls
...
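The added guard in arch_skip_callchain_idx() amounts to a NULL/size check
before the chain entries are dereferenced (hedged sketch; skip_slot is the
function's existing local):
    if (!chain || chain->nr < 3)
            return skip_slot;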
Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20180611104049.11048-1-sandipan@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b930e62ecd ]
On s390 this test case fails because the socket identification numbers
assigned to the CPUs are higher than the CPU identification numbers.
Fix this by adding the platform architecture to the perf data header
flag information. This helps identify the test platform and handle
s390 specifics in process_cpu_topology().
Before:
[root@p23lp27 perf]# perf test -vvvvv -F 39
39: Session topology :
--- start ---
templ file: /tmp/perf-test-iUv755
socket_id number is too big.You may need to upgrade the perf tool.
---- end ----
Session topology: Skip
[root@p23lp27 perf]#
After:
[root@p23lp27 perf]# perf test -vvvvv -F 39
39: Session topology :
--- start ---
templ file: /tmp/perf-test-8X8VTs
CPU 0, core 0, socket 6
CPU 1, core 1, socket 3
---- end ----
Session topology: Ok
[root@p23lp27 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: c84974ed9f ("perf test: Add entry to test cpu topology")
Link: http://lkml.kernel.org/r/20180611073153.15592-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0176622953 ]
On s390 the socket identifier assigned to a CPU identifier is random and
(depending on the configuration of the LPAR) may be higher than the CPU
identifier. This is currently not supported.
Fix this by allowing arbitrary socket identifiers to be assigned to a
CPU id.
Output before:
[root@p23lp27 perf]# ./perf report --header -I -v
...
socket_id number is too big.You may need to upgrade the perf tool.
Error:
The perf.data file has no samples!
# ========
# captured on : Tue May 29 09:29:57 2018
# header version : 1
...
# Core ID and Socket ID information is not available
...
[root@p23lp27 perf]#
Output after:
[root@p23lp27 perf]# ./perf report --header -I -v
...
Error:
The perf.data file has no samples!
# ========
# captured on : Tue May 29 09:29:57 2018
# header version : 1
...
# CPU 0: Core ID 0, Socket ID 6
# CPU 1: Core ID 1, Socket ID 3
# CPU 2: Core ID -1, Socket ID -1
...
[root@p23lp27 perf]#
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180611073153.15592-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b2d00d7c61 ]
The line numers for if-entries in the menu tree are off by one or more
lines which is confusing when debugging for correctness of unrelated changes.
According to the git log, commit a02f0570ae (kconfig: improve
error handling in the parser) was the last one that changed that part
of the parser and replaced
"if_entry: T_IF expr T_EOL"
by
"if_entry: T_IF expr nl"
but the commit message does not state why this has been done.
When reverting that part of the commit, only the line numbers are
corrected (checked with cdebug = DEBUG_PARSE in zconf.y), otherwise
the menu tree remains unchanged (checked with zconfdump() enabled in
conf.c).
An example for the corrected line numbers:
drivers/soc/Kconfig:15:source drivers/soc/tegra/Kconfig
drivers/soc/tegra/Kconfig:4:if
drivers/soc/tegra/Kconfig:6:if
changes to:
drivers/soc/Kconfig:15:source drivers/soc/tegra/Kconfig
drivers/soc/tegra/Kconfig:1:if
drivers/soc/tegra/Kconfig:4:if
Signed-off-by: Dirk Gouders <dirk@gouders.net>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36eb93509c ]
Initialize the 'err' variable to remove the build warning shown below:
drivers/usb/host/xhci-tegra.c: In function 'tegra_xusb_mbox_thread':
drivers/usb/host/xhci-tegra.c:552:6: warning: 'err' may be used uninitialized in this function [-Wuninitialized]
drivers/usb/host/xhci-tegra.c:482:6: note: 'err' was declared here
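The fix is simply to give 'err' a defined initial value (sketch):
    int err = 0;   /* avoid returning an uninitialized value on early exits */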
Fixes: e84fce0f88 ("usb: xhci: Add NVIDIA Tegra XUSB controller driver")
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 68d676a089 ]
Stopping offload completely if replace of program failed dates
back to days of transparent offload. Back then we wanted to
silently fall back to the in-driver processing. Today we mark
programs for offload when they are loaded into the kernel, so
the transparent offload is no longer a reality.
Flags check in the driver will only allow replace of a driver
program with another driver program or an offload program with
another offload program.
When driver program is replaced stopping offload is a no-op,
because driver program isn't offloaded. When replacing
offloaded program if the offload fails the entire operation
will fail all the way back to user space and we should continue
using the old program. IOW when replacing a driver program
stopping offload is unnecessary and when replacing offloaded
program - it's a bug, old program should continue to run.
In practice this bug would mean that if offload operation was to
fail (either due to FW communication error, kernel OOM or new
program being offloaded but for a different netdev) driver
would continue reporting that previous XDP program is offloaded
but in fact no program will be loaded in hardware. The failure
is fairly unlikely (found by inspection, when working on the code)
but it's unpleasant.
Backport note: even though the bug was introduced in commit
cafa92ac25 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE"),
this fix depends on commit 441a33031f ("net: xdp: don't allow
device-bound programs in driver mode"), so this fix is sufficient
only in v4.15 or newer. Kernels v4.13.x and v4.14.x do need to
stop offload if it was transparent/opportunistic, i.e. if
XDP_FLAGS_HW_MODE was not set on running program.
Fixes: cafa92ac25 ("nfp: bpf: add support for XDP_FLAGS_HW_MODE")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9a4c63888 ]
The kernel may spew a WARNING with UBSAN undefined behavior at
handling ALSA sequencer ioctl SNDRV_SEQ_IOCTL_QUERY_NEXT_CLIENT:
UBSAN: Undefined behaviour in sound/core/seq/seq_clientmgr.c:2007:14
signed integer overflow:
2147483647 + 1 cannot be represented in type 'int'
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x122/0x1c8 lib/dump_stack.c:113
ubsan_epilogue+0x12/0x86 lib/ubsan.c:159
handle_overflow+0x1c2/0x21f lib/ubsan.c:190
__ubsan_handle_add_overflow+0x2a/0x31 lib/ubsan.c:198
snd_seq_ioctl_query_next_client+0x1ac/0x1d0 sound/core/seq/seq_clientmgr.c:2007
snd_seq_ioctl+0x264/0x3d0 sound/core/seq/seq_clientmgr.c:2144
....
It happens only when INT_MAX is passed there, as we're incrementing it
unconditionally. So the fix is trivial, check the value with
INT_MAX. Although the bug itself is fairly harmless, it's better to
fix it so that fuzzers won't hit this again later.
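An illustrative version of the guard matching the description above (not
necessarily the exact diff):
    /* search for the next client: only advance if we are not at INT_MAX yet */
    if (info->client < INT_MAX)
            info->client++;
    if (info->client < 0)
            info->client = 0;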
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200211
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dc0f0a026d ]
When kcs_bmc_handle_event calls kcs_force_abort function to handle the
not open (no user running) KCS channel transaction, the returned status
value -ENODEV causes the low level IRQ handler indicating that the irq
was not for him by returning IRQ_NONE. After some time, this IRQ will
be treated to be spurious one, and the exception dump happens.
irq 30: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.15-npcm750 #1
Hardware name: NPCMX50 Chip family
[<c010b264>] (unwind_backtrace) from [<c0106930>] (show_stack+0x20/0x24)
[<c0106930>] (show_stack) from [<c03dad38>] (dump_stack+0x8c/0xa0)
[<c03dad38>] (dump_stack) from [<c0168810>] (__report_bad_irq+0x3c/0xdc)
[<c0168810>] (__report_bad_irq) from [<c0168c34>] (note_interrupt+0x29c/0x2ec)
[<c0168c34>] (note_interrupt) from [<c0165c80>] (handle_irq_event_percpu+0x5c/0x68)
[<c0165c80>] (handle_irq_event_percpu) from [<c0165cd4>] (handle_irq_event+0x48/0x6c)
[<c0165cd4>] (handle_irq_event) from [<c0169664>] (handle_fasteoi_irq+0xc8/0x198)
[<c0169664>] (handle_fasteoi_irq) from [<c016529c>] (__handle_domain_irq+0x90/0xe8)
[<c016529c>] (__handle_domain_irq) from [<c01014bc>] (gic_handle_irq+0x58/0x9c)
[<c01014bc>] (gic_handle_irq) from [<c010752c>] (__irq_svc+0x6c/0x90)
Exception stack(0xc0a01de8 to 0xc0a01e30)
1de0: 00002080 c0a6fbc0 00000000 00000000 00000000 c096d294
1e00: 00000000 00000001 dc406400 f03ff100 00000082 c0a01e94 c0a6fbc0 c0a01e38
1e20: 00200102 c01015bc 60000113 ffffffff
[<c010752c>] (__irq_svc) from [<c01015bc>] (__do_softirq+0xbc/0x358)
[<c01015bc>] (__do_softirq) from [<c011c798>] (irq_exit+0xb8/0xec)
[<c011c798>] (irq_exit) from [<c01652a0>] (__handle_domain_irq+0x94/0xe8)
[<c01652a0>] (__handle_domain_irq) from [<c01014bc>] (gic_handle_irq+0x58/0x9c)
[<c01014bc>] (gic_handle_irq) from [<c010752c>] (__irq_svc+0x6c/0x90)
Exception stack(0xc0a01ef8 to 0xc0a01f40)
1ee0: 00000000 000003ae
1f00: dcc0f338 c0111060 c0a00000 c0a0cc44 c0a0cbe4 c0a1c22b c07bc218 00000001
1f20: dcffca40 c0a01f54 c0a01f58 c0a01f48 c0103524 c0103528 60000013 ffffffff
[<c010752c>] (__irq_svc) from [<c0103528>] (arch_cpu_idle+0x48/0x4c)
[<c0103528>] (arch_cpu_idle) from [<c0681390>] (default_idle_call+0x30/0x3c)
[<c0681390>] (default_idle_call) from [<c0156f24>] (do_idle+0xc8/0x134)
[<c0156f24>] (do_idle) from [<c015722c>] (cpu_startup_entry+0x28/0x2c)
[<c015722c>] (cpu_startup_entry) from [<c067ad74>] (rest_init+0x84/0x88)
[<c067ad74>] (rest_init) from [<c0900d44>] (start_kernel+0x388/0x394)
[<c0900d44>] (start_kernel) from [<0000807c>] (0x807c)
handlers:
[<c041c5dc>] npcm7xx_kcs_irq
Disabling IRQ #30
Change the returned status from -ENODEV to 0. The -ENODEV was originally
used to tell the low-level IRQ handler that no user was running, but it
did not take the IRQ handling design into account.
Since multiple KCS channels share one IRQ handler, check the IBF flag
before doing the force abort. If IBF is set, return 0 to the low-level IRQ
handler after handling, to indicate that the IRQ has been handled.
Signed-off-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 13399ff25f ]
According to the IIO ABI, relative humidity readings should be returned
in milli percent.
This patch addresses that by applying proper scaling and returning an
integer instead of a fractional format type specifier.
Note that the fixes tag points to a commit before the driver was heavily
refactored to introduce SPI support, so the patch won't apply that far back.
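A hedged sketch, assuming the compensated value is in units of 1/1024 %RH as
in the datasheet; the fix scales to milli percent and reports a plain integer:
    *val = comp_humidity * 1000 / 1024;   /* 1/1024 %RH -> milli percent */
    return IIO_VAL_INT;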
Signed-off-by: Tomasz Duszynski <tduszyns@gmail.com>
Fixes: 14beaa8f5a ("iio: pressure: bmp280: add humidity support")
Acked-by: Matt Ranostay <matt.ranostay@konsulko.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a44ebeff6b ]
When a (broken) node wrongly sends multicast TT entries with a ROAM
flag then this causes any receiving node to drop all entries for the
same multicast MAC address announced by other nodes, leading to
packet loss.
Fix this DoS vector by only storing TT sync flags. For multicast TT
non-sync'ing flag bits like ROAM are unused so far anyway.
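Conceptually the change masks the stored flags down to the sync'd bits
(sketch; the field and mask names are assumed from the driver's convention):
    tt_global->common.flags = flags & BATADV_TT_SYNC_MASK;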
Fixes: 1d8ab8d3c1 ("batman-adv: Modified forwarding behaviour for multicast packets")
Reported-by: Leonardo Mörlein <me@irrelefant.net>
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4a519b83da ]
Since commit 54e22f265e ("batman-adv: fix TT sync flag inconsistencies")
TT sync flags and TT non-sync'd flags are supposed to be stored
separately.
The previous patch missed to apply this separation on a TT entry with
only a single TT orig entry.
This is a minor fix because with only a single TT orig entry the DDoS
issue the former patch solves does not apply.
Fixes: 54e22f265e ("batman-adv: fix TT sync flag inconsistencies")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6da7be7d24 ]
batman-adv is creating special debugfs directories in the init
net_namespace for each created soft-interface (batadv net_device). But it
is possible to rename a net_device to a completely different name then the
original one.
It can therefore happen that a user registers a new batadv net_device with
the name "bat0". batman-adv is then also adding a new directory under
$debugfs/batman-adv/ with the name "bat0".
The user then decides to rename this device to "bat1" and registers a
different batadv device with the name "bat0". batman-adv will then try to
create a directory with the name "bat0" under $debugfs/batman-adv/ again.
But there already exists one with this name under this path and thus this
fails. batman-adv will detect a problem and rollback the registering of
this device.
batman-adv must therefore take care of renaming the debugfs directories for
soft-interfaces whenever it detects such a net_device rename.
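A minimal sketch of such rename handling, assuming it is driven from a
NETDEV_CHANGENAME notifier and that bat_priv->debug_dir holds the directory
dentry (the function name here is illustrative):
    static void example_debugfs_rename_meshif(struct net_device *dev)
    {
            struct batadv_priv *bat_priv = netdev_priv(dev);
            struct dentry *dir = bat_priv->debug_dir;

            /* move the existing debugfs directory to the new interface name */
            debugfs_rename(dir->d_parent, dir, dir->d_parent, dev->name);
    }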
Fixes: c6c8fea297 ("net: Add batman-adv meshing protocol")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36dc621cec ]
batman-adv is creating special debugfs directories in the init
net_namespace for each valid hard-interface (net_device). But it is
possible to rename a net_device to a completely different name than the
original one.
It can therefore happen that a user registers a new net_device which gets
the name "wlan0" assigned by default. batman-adv is also adding a new
directory under $debugfs/batman-adv/ with the name "wlan0".
The user then decides to rename this device to "wl_pri" and registers a
different device. The kernel may now decide to use the name "wlan0" again
for this new device. batman-adv will detect it as a valid net_device and
tries to create a directory with the name "wlan0" under
$debugfs/batman-adv/. But there already exists one with this name under
this path and thus this fails. batman-adv will detect a problem and
rollback the registering of this device.
batman-adv must therefore take care of renaming the debugfs directories
for hard-interfaces whenever it detects such a net_device rename.
Fixes: 5bc7c1eb44 ("batman-adv: add debugfs structure for information per interface")
Reported-by: John Soros <sorosj@gmail.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9713cb0cf1 ]
A reference for the best gateway is taken when the list of gateways in the
mesh is sent via netlink. This is necessary to check whether the currently
dumped entry is the currently selected gateway or not. This information is
then transferred as flag BATADV_ATTR_FLAG_BEST.
After the comparison of the current entry is done,
batadv_v_gw_dump_entry() has to decrease the reference counter again.
Otherwise the reference will be held and thus prevents a proper shutdown of
the batman-adv interfaces (and some of the interfaces enslaved in it).
Fixes: b71bb6f924 ("batman-adv: add B.A.T.M.A.N. V bat_gw_dump implementations")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b5685d2687 ]
A reference for the best gateway is taken when the list of gateways in the
mesh is sent via netlink. This is necessary to check whether the currently
dumped entry is the currently selected gateway or not. This information is
then transferred as flag BATADV_ATTR_FLAG_BEST.
After the comparison of the current entry is done,
batadv_iv_gw_dump_entry() has to decrease the reference counter again.
Otherwise the reference will be held and thus prevents a proper shutdown of
the batman-adv interfaces (and some of the interfaces enslaved in it).
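The fix for both dump paths boils down to dropping that reference once the
comparison is done (sketch):
    if (curr_gw)
            batadv_gw_node_put(curr_gw);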
Fixes: efb766af06 ("batman-adv: add B.A.T.M.A.N. IV bat_gw_dump implementations")
Reported-by: Andreas Ziegler <dev@andreas-ziegler.de>
Tested-by: Andreas Ziegler <dev@andreas-ziegler.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Acked-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7b4e88434c ]
Smack: Mark inode instant in smack_task_to_inode
/proc clean-up in commit 1bbc55131e
resulted in smack_task_to_inode() being called before smack_d_instantiate.
This resulted in the smk_inode value being ignored, even while present
for files in /proc/self. Marking the inode as instant here fixes that.
Signed-off-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c6da92808 ]
After recieving MLD querys, we update idev->mc_maxdelay with max_delay
from query header. This make the later unsolicited reports have the same
interval with mc_maxdelay, which means we may send unsolicited reports with
long interval time instead of default configured interval time.
Also as we will not call ipv6_mc_reset() after device up. This issue will
be there even after leave the group and join other groups.
Fixes: fc4eba58b4 ("ipv6: make unsolicited report intervals configurable for mld")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cbdceb9b3e ]
With gcc 4.1.2 when compiling for 32-bit:
drivers/mtd/devices/mtd_dataflash.c:736: warning: integer constant is too large for ‘long’ type
drivers/mtd/devices/mtd_dataflash.c:737: warning: integer constant is too large for ‘long’ type
Add the missing "ULL" suffixes to fix this.
Fixes: 67e4145ebf ("mtd: dataflash: Add flash_info for AT45DB641E")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 81e167c2a2 ]
The test_kmod.sh script requires root privileges for successful execution
of the test.
This patch notifies the user about the privileges the script demands for
successful execution of the test.
Signed-off-by: Jeffrin Jose T (Rajagiri SET) <ahiliation@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 12f8c553a5 ]
We had commit 06e226c7fb ("clk: sunxi-ng: Move all clock types to a
library") and commit 799c434154 ("kbuild: thin archives make default
for all archs") in the same development cycle, from different trees.
With migration to the thin archive, the entire drivers/clk/sunxi-ng/lib.a
is linked to the vmlinux. This does not break build, but we do not get
any size saving.
However, we do not need to go back to the individual Kconfig options.
The default configuration pulls in all (or most) of the CCU parts anyway.
Also, once we enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION, we can simply
list all files with obj-y, and the linker will drop all unused functions
by itself.
After the long discussion [1], people there agreed to fix this, but
nobody sent a patch after all. I am doing it now.
I lifted up CONFIG_SUNXI_CCU to drivers/clk/Makefile because everything
in drivers/clk/sunxi-ng/ depends on SUNXI_CCU.
[1] https://patchwork.kernel.org/patch/9796521/
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f9cafc140 ]
There is a race between nvme_remove and nvme_reset_work that can
lead to an io hang.
nvme_remove nvme_reset_work
-> nvme_remove_dead_ctrl
-> nvme_dev_disable
-> quiesce request_queue
-> queue remove_work
-> cancel_work_sync reset_work
-> nvme_remove_namespaces
-> splice ctrl->namespaces
nvme_remove_dead_ctrl_work
-> nvme_kill_queues
-> nvme_ns_remove do nothing
-> blk_cleanup_queue
-> blk_freeze_queue
Finally, the request_queue is in the quiesced state when we wait for the
freeze, so we get an io hang here. To fix it, move nvme_kill_queues
from nvme_remove_dead_ctrl_work to nvme_remove_dead_ctrl.
Suggested-by: Keith Busch <keith.busch@linux.intel.com>
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ba56bc3a07 ]
When booting a 64 KB pages kernel on a ACPI GICv3 system that
implements support for v2 emulation, the following warning is
produced
GICV size 0x2000 not a multiple of page size 0x10000
and support for v2 emulation is disabled, preventing GICv2 VMs
from being able to run on such hosts.
The reason is that vgic_v3_probe() performs a sanity check on the
size of the window (it should be a multiple of the page size),
while the ACPI MADT parsing code hardcodes the size of the window
to 8 KB. This makes sense, considering that ACPI does not bother
to describe the size in the first place, under the assumption that
platforms implementing ACPI will follow the architecture and not
put anything else in the same 64 KB window.
So let's just drop the sanity check altogether, and assume that
the window is at least 64 KB in size.
Fixes: 9097773245 ("KVM: arm/arm64: vgic-new: vgic_init: implement kvm_vgic_hyp_init")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fedb1bd3d2 ]
Currently it increments SctpFragUsrMsgs when the user message size is
exactly the same as the maximum fragment size, which is wrong.
The fix is to increment it only when the user message is bigger than the
maximum fragment size.
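Sketch of the corrected condition (strictly greater than, so an exact fit is
not counted as fragmentation):
    if (msg_len > max_data)
            SCTP_INC_STATS(sock_net(asoc->base.sk), SCTP_MIB_FRAGUSRMSGS);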
Fixes: bfd2e4b873 ("sctp: refactor sctp_datamsg_from_user")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ea0820bb77 ]
Device tree based systems without of_dev_auxdata will have the mdio
device named differently than "davinci_mdio(.0)". In this case use the
device's parent's compatible string for matching.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 08ba91ee6e ]
If NBD_DISCONNECT_ON_CLOSE is set on a device, then the driver will
issue a disconnect from nbd_release if the device has no remaining
bdev->bd_openers.
Fix the return value so that a reconfigure that only sets the flag succeeds.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f24ef7413 ]
The machine_desc->init_per_cpu() hook is supposed to be per-cpu
initialization and would seem to apply equally to UP and SMP.
In fact the comment in the header file seems to suggest it works for
UP too, which was not the case; this patch fixes that.
This enables !CONFIG_SMP builds for platforms such as hsdk.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
[vgupta: trimmed changelog]
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d68a90e148 ]
Controllers that are not yet enabled should not really enforce keep alive
timeouts, but we still want to track a timeout and cleanup in case a host
died before it enabled the controller. Hence, simply reset the keep
alive timer when the controller is enabled.
Suggested-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c947657b15 ]
The race is between completing the request at error recovery work and
rdma completions. If we cancel the request before getting the good
rdma completion we get a NULL deref of the request MR at
nvme_rdma_process_nvme_rsp().
When canceling the request we return its MR to the MR pool (set mr to
NULL) and also unmap its data. Canceling the requests while the RDMA
queues are active is not safe, because while the queues are active we can
get good RDMA completions that use the mr pointer, which may be NULL.
Completing the request too soon may lead also to performing DMA to/from
user buffers which might have been already unmapped.
The commit fixes the race by draining the QP before starting the abort
commands mechanism.
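In essence, the QP is drained before the inflight requests are cancelled so
that no late completion can race with the teardown (hedged sketch; the
queue/tag-set names are illustrative):
    ib_drain_qp(queue->qp);                       /* wait out pending completions */
    blk_mq_tagset_busy_iter(&ctrl->tag_set,
                            nvme_cancel_request, &ctrl->ctrl);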
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d0641015b ]
Failures after nvme_init_ctrl will defer resource cleanups to .free_ctrl
when the reference is released, hence we should not free the controller
queues for these failures.
Fix that by moving controller queues allocation before controller
initialization and correctly freeing them for failures before
initialization and skip them for failures after initialization.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc8a2d9bcb ]
The Stratix10 platform has an additional reset line, OCP (Open Core Protocol),
that also needs to get deasserted for the stmmac ethernet controller to work.
Thus we need to update the Kconfig to include ARCH_STRATIX10 in order to build
dwmac-socfpga.
Also, remove the redundant check for the reset controller pointer. The
reset driver already checks for the pointer and returns 0 if the pointer
is NULL.
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4e8439aa34 ]
The array bpq_eth_addr is only used to get the size of an
address, whereas the bcast_addr is used to set the broadcast
address. This leads to a warning when using clang:
drivers/net/hamradio/bpqether.c:94:13: warning: variable 'bpq_eth_addr' is not
needed and will not be emitted [-Wunneeded-internal-declaration]
static char bpq_eth_addr[6];
^
Remove both variables and use the common eth_broadcast_addr
to set the broadcast address.
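Sketch of the replacement in the device setup path:
    eth_broadcast_addr(dev->broadcast);  /* ff:ff:ff:ff:ff:ff, no local array needed */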
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3256d29fc7 ]
lockdep spotted that we are using rfs_h.lock in enic_get_rxnfc() without
initializing it. rfs_h.lock is initialized in enic_open(), but ethtool_ops
can be called while the interface is down.
Move enic_rfs_flw_tbl_init to enic_probe.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 18 PID: 1189 Comm: ethtool Not tainted 4.17.0-rc7-devel+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
dump_stack+0x85/0xc0
register_lock_class+0x550/0x560
? __handle_mm_fault+0xa8b/0x1100
__lock_acquire+0x81/0x670
lock_acquire+0xb9/0x1e0
? enic_get_rxnfc+0x139/0x2b0 [enic]
_raw_spin_lock_bh+0x38/0x80
? enic_get_rxnfc+0x139/0x2b0 [enic]
enic_get_rxnfc+0x139/0x2b0 [enic]
ethtool_get_rxnfc+0x8d/0x1c0
dev_ethtool+0x16c8/0x2400
? __mutex_lock+0x64d/0xa00
? dev_load+0x6a/0x150
dev_ioctl+0x253/0x4b0
sock_do_ioctl+0x9a/0x130
sock_ioctl+0x1af/0x350
do_vfs_ioctl+0x8e/0x670
? syscall_trace_enter+0x1e2/0x380
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x5a/0x170
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ff54d5cd9e ]
Do not advertise the DCBX_LLD_MANAGED capability, i.e. do not allow an
external agent to manage the dcbx/lldp negotiation. MFW acts as the lldp
agent for qed* devices, and no other lldp agent is allowed to coexist with MFW.
Also update a debug print so that it does not display redundant info.
Fixes: a1d8d8a51 ("qed: Add dcbnl support.")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f9de4df90 ]
Memory for packet buffers need to be freed in the error paths as there is
no consumer (e.g., upper layer) for such packets and that memory will never
get freed.
The issue was uncovered when port was attacked with flood of isatap
packets, these are multicast packets hence were directed at all the PFs.
For foce PF, this meant they were routed to the ll2 module which in turn
drops such packets.
Fixes: 0a7fb11c ("qed: Add Light L2 support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b154886f78 ]
We can't call function trace hook before setup percpu offset.
When entering secondary_start_kernel(), percpu offset has not
been initialized. So this lead hotplug malfunction.
Here is the flow to reproduce this bug:
echo 0 > /sys/devices/system/cpu/cpu1/online
echo function > /sys/kernel/debug/tracing/current_tracer
echo 1 > /sys/kernel/debug/tracing/tracing_on
echo 1 > /sys/devices/system/cpu/cpu1/online
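A sketch of the kind of annotation implied by the description above: mark the
early secondary entry point notrace so the function tracer cannot run before
the per-cpu offset is set up.
    asmlinkage notrace void secondary_start_kernel(void);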
Acked-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Zhizhou Zhang <zhizhouzhang@asrmicro.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd65a941f6 ]
dma_alloc_*() buffers might be exposed to userspace via mmap() call, so
they should be cleared on allocation. In case of IOMMU-based dma-mapping
implementation such buffer clearing was missing in the code path for
DMA_ATTR_FORCE_CONTIGUOUS flag handling, because dma_alloc_from_contiguous()
doesn't honor __GFP_ZERO flag. This patch fixes this issue. For more
information on clearing buffers allocated by dma_alloc_* functions,
see commit 6829e274a6 ("arm64: dma-mapping: always clear allocated
buffers").
Fixes: 44176bb38f ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2dbf8dffbf ]
Right now, we can call nfs_commit_inode() while holding the session slot,
which could lead to NFSv4 deadlocks. Ensure we only keep the slot if
the server returned a layout that we have to process.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit de6e5d3841 ]
Marking CPUs stopped by smp_send_stop as offline can cause warnings
due to cross-CPU wakeups. This trace was noticed on a busy system
running a sysrq+c crash test, after the injected crash:
WARNING: CPU: 51 PID: 1546 at kernel/sched/core.c:1179 set_task_cpu+0x22c/0x240
CPU: 51 PID: 1546 Comm: kworker/u352:1 Tainted: G D
Workqueue: mlx5e mlx5e_update_stats_work [mlx5_core]
[...]
NIP [c00000000017c21c] set_task_cpu+0x22c/0x240
LR [c00000000017d580] try_to_wake_up+0x230/0x720
Call Trace:
[c000000001017700] runqueues+0x0/0xb00 (unreliable)
[c00000000017d580] try_to_wake_up+0x230/0x720
[c00000000015a214] insert_work+0x104/0x140
[c00000000015adb0] __queue_work+0x230/0x690
[c000003fc5007910] [c00000000015b26c] queue_work_on+0x5c/0x90
[c0080000135fc8f8] mlx5_cmd_exec+0x538/0xcb0 [mlx5_core]
[c008000013608fd0] mlx5_core_access_reg+0x140/0x1d0 [mlx5_core]
[c00800001362777c] mlx5e_update_pport_counters.constprop.59+0x6c/0x90 [mlx5_core]
[c008000013628868] mlx5e_update_ndo_stats+0x28/0x90 [mlx5_core]
[c008000013625558] mlx5e_update_stats_work+0x68/0xb0 [mlx5_core]
[c00000000015bcec] process_one_work+0x1bc/0x5f0
[c00000000015ecac] worker_thread+0xac/0x6b0
[c000000000168338] kthread+0x168/0x1b0
[c00000000000b628] ret_from_kernel_thread+0x5c/0xb4
This happens because, firstly, the CPU is not really offline in the
usual sense: processes and interrupts have not been migrated away.
Secondly, smp_send_stop does not happen atomically on all CPUs, so
one CPU can have marked itself offline while another CPU is still
running processes or interrupts which can affect the first CPU.
Fix this by just not marking the CPU as offline. It's more like
frozen in time, so offline does not really reflect its state properly
anyway. There should be nothing in the crash/panic path that walks
online CPUs and synchronously waits for them, so this change should
not introduce new hangs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 70c3c8cb83 ]
For an isoc split-in transfer with no data (the length of the DATA0
packet is zero), we can't simply return immediately, because
the DATA0 can be either the first or the second transaction
of the isoc split-in transaction. If the DATA0 packet with no
data is the first transaction, we can return immediately.
But if the DATA0 packet with no data is the second transaction
of the isoc split-in transaction sequence, we need to increase
qtd->isoc_frame_index and give the urb back to the device driver if needed,
otherwise the MDATA packet will be lost.
A typical test case is connecting the dwc2 controller to a
USB HS hub (GL852G-12) and plugging a USB FS audio device (Plantronics
headset) into the hub's downstream port. When the USB mic is then used
to record, noise can be heard on playback.
In the case, the isoc split in transaction sequence like this:
- SSPLIT IN transaction
- CSPLIT IN transaction
- MDATA packet (176 bytes)
- CSPLIT IN transaction
- DATA0 packet (0 byte)
This patch uses both the length of DATA0 and qtd->isoc_split_offset
to check whether the DATA0 is in the second transaction.
Tested-by: Gevorg Sahakyan <sahakyan@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: William Wu <william.wu@rock-chips.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit af424a4107 ]
The commit 3bc04e28a0 ("usb: dwc2: host: Get aligned DMA in
a more supported way") rips out a lot of code to simplify the
allocation of aligned DMA. However, it also introduces a new
issue when using isoc split-in transfers.
In my test case, I connect the dwc2 controller to a USB HS
hub (GL852G-12) and plug a USB FS audio device (Plantronics
headset) into the hub's downstream port. When the USB mic is then
used to record, noise can be heard on playback.
This is because the USB hub uses an MDATA packet for the first
transaction and a DATA0 packet for the second transaction of the isoc
split-in transaction. A typical isoc split-in transaction sequence
looks like this:
- SSPLIT IN transaction
- CSPLIT IN transaction
- MDATA packet
- CSPLIT IN transaction
- DATA0 packet
The DMA address of MDATA (urb->dma) is always DWORD-aligned, but
the DMA address of DATA0 (urb->dma + qtd->isoc_split_offset) may
not be DWORD-aligned; it depends on qtd->isoc_split_offset (the
length of MDATA). In my test case, the length of MDATA is usually
unaligned, which causes DATA0 packet transmission errors.
This patch uses a kmem_cache to allocate an aligned DMA buffer for the isoc
split-in transaction. Note that according to the USB 2.0 spec, the
maximum data payload size is 1023 bytes for each FS isoc endpoint,
and the maximum allowable interrupt data payload size is 64 bytes
or less for an FS interrupt endpoint. So we set the object size to
1024 bytes in the kmem cache.
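As a rough sketch of the allocation scheme (illustrative names; the actual
dwc2 change differs in its details):
/*
 * Hedged sketch: a kmem_cache of 1024-byte buffers, aligned to 4 bytes so
 * the DMA address handed to the controller for DATA0 is DWORD-aligned.
 */
#include <linux/errno.h>
#include <linux/slab.h>

static struct kmem_cache *unaligned_buf_cache;

static int unaligned_buf_cache_create(void)
{
	unaligned_buf_cache = kmem_cache_create("isoc-unaligned-dma", 1024, 4,
						SLAB_CACHE_DMA, NULL);
	return unaligned_buf_cache ? 0 : -ENOMEM;
}

static void *unaligned_buf_alloc(gfp_t flags)
{
	return kmem_cache_alloc(unaligned_buf_cache, flags);
}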
Tested-by: Gevorg Sahakyan <sahakyan@synopsys.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: William Wu <william.wu@rock-chips.com>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 22bb5cfdf1 ]
When a hub is connected to the DWC2 host,
auto suspend occurs and the host goes into
hibernation. When any device is then connected to the hub,
the host exits hibernation incorrectly.
- Add a dwc2_hcd_rem_wakeup() function call to
exit from the suspend state on remote wakeup.
- Increase the timeout value for the port suspend bit to be set.
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: Artur Petrosyan <arturp@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9bd0946da ]
Commit 0198d7bb8a ("ASoC: omap-mcbsp: Convert to use the sdma-pcm
instead of omap-pcm") resulted in broken audio playback on OMAP1510
(discovered on Amstrad Delta).
When running on OMAP1510, omap-pcm used to obtain DMA offset from
snd_dmaengine_pcm_pointer_no_residue() based on DMA interrupt triggered
software calculations instead of snd_dmaengine_pcm_pointer() which
depended on residue value calculated from omap_dma_get_src_pos().
Similar code path is still available in now used
sound/soc/soc-generic-dmaengine-pcm.c but it is not triggered.
It was verified already before that omap_get_dma_src_pos() from
arch/arm/plat-omap/dma.c didn't work correctly for OMAP1510 - see
commit 1bdd741991 ("ASoC: OMAP: fix OMAP1510 broken PCM pointer
callback") for details. Apparently the same applies to its successor,
omap_dma_get_src_pos() from drivers/dma/ti/omap-dma.c.
On the other hand, snd_dmaengine_pcm_pointer_no_residue() is described
as deprecated and discouraged for use in new drivers because of its
unreliable accuracy. However, it seems to be the only working option for
OMAP1510 now, as long as a software-calculated residue is not
implemented as an OMAP1510 fallback in omap-dma.
Using snd_dmaengine_pcm_pointer_no_residue() code path instead of
snd_dmaengine_pcm_pointer() in sound/soc/soc-generic-dmaengine-pcm.c
can be triggered in two ways:
- by passing pcm->flags |= SND_DMAENGINE_PCM_FLAG_NO_RESIDUE from
sound/soc/omap/sdma-pcm.c,
- by passing dma_caps.residue_granularity =
DMA_RESIDUE_GRANULARITY_DESCRIPTOR from DMA engine.
Let's do the latter.
Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fae2a63737 ]
Currently smatch warns of possible Spectre-V1 issue in ahci_led_store():
drivers/ata/libahci.c:1150 ahci_led_store() warn: potential spectre issue 'pp->em_priv' (local cap)
Userspace controls @pmp from following callchain:
em_message->store()
->ata_scsi_em_message_store()
-->ap->ops->em_store()
--->ahci_led_store()
After the mask+shift @pmp is effectively an 8b value, which is used to
index into an array of length 8, so sanitize the array index.
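As a minimal sketch of the sanitization pattern (the helper here is
illustrative, built from the description above, not the exact libahci diff):
#include <linux/nospec.h>

#define EM_MAX_SLOTS 8	/* array length from the description above */

static unsigned long em_led_read(unsigned long *em_priv, unsigned int pmp)
{
	/* Clamp the user-influenced index under speculation before use. */
	pmp = array_index_nospec(pmp, EM_MAX_SLOTS);
	return em_priv[pmp];
}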
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 375dc53d03 ]
Run the completer task to post a work completion after processing
a memory registration or invalidate work request. This covers the
case where the memory registration or invalidate was the last work
request posted to the qp.
Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com>
Reviewed-by: Yonatan Cohen <yonatanc@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 89610dc2c2 ]
In the situation where DE and SE do not share the same interrupt number,
the global SE interrupt mask bit MASK_IRQ_EN in MASKIRQ must be set, or
else the other mask bits will not work and no SE interrupt will occur. This
patch enables MASK_IRQ_EN for SE to fix this problem.
Signed-off-by: Alison Wang <alison.wang@nxp.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 109c4d18e5 ]
One needs to ensure that the crtcs are shut down so that the
drm_crtc_state->connector_mask reflects that no connectors
are currently active. Further, it reduces the reference
count for each connector. This ensures that the connectors
and encoders can be cleanly removed either when _unbind
is called for the corresponding drivers or by
drm_mode_config_cleanup().
We need drm_atomic_helper_shutdown() to be called before
component_unbind_all() otherwise the connectors attached to the
component device will have the wrong reference count value and will not
be cleanly removed.
Signed-off-by: Ayan Kumar Halder <ayan.halder@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a45fc268db ]
This patch fixes the below parser error of the IOB SLOW PMU.
# perf stat -a -e iob-slow0/cycle-count/ sleep 1
event syntax error: 'iob-slow0/cycle-count/'
\___ parser error
It replaces the "-" character with the "_" character inside the PMU name.
Signed-off-by: Hoan Tran <hoan.tran@amperecomputing.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dbe4a39331 ]
The i2c and PCIe controllers had an incorrect interrupt type, which should
have been set to IRQ_TYPE_LEVEL_HIGH; fix that.
Fixes: b9099ec754 ("ARM: dts: Add Broadcom Hurricane 2 DTS include file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 403fde6448 ]
The interrupts for the PCIe controllers should all be of type
IRQ_TYPE_LEVEL_HIGH instead of IRQ_TYPE_NONE.
Fixes: d71eb94120 ("ARM: dts: NSP: Add MSI support on PCI")
Fixes: 522199029f ("ARM: dts: NSP: Fix PCIE DT issue")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d6a3e55131 ]
Unless software synchronization objects (CONFIG_SW_SYNC) are enabled,
the sync test will be skipped:
TAP version 13
1..0 # Skipped: Sync framework not supported by kernel
Add a config fragment file to be able to run "make kselftest-merge" to
enable relevant configuration required in order to run the sync test.
Signed-off-by: Fathi Boudra <fathi.boudra@linaro.org>
Link: https://lkml.org/lkml/2017/5/5/14
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a4d7537789 ]
When vm test is skipped because of unmet dependencies and/or unsupported
configuration, it exits with error which is treated as a fail by the
Kselftest framework. This leads to false negative result even when the
test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
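A minimal userspace sketch of this convention (the precondition check is
illustrative; real tests probe CONFIG options, modules or device nodes):
#include <stdio.h>
#include <stdlib.h>

#define KSFT_SKIP 4	/* kselftest framework SKIP exit code */

int main(void)
{
	if (!getenv("TEST_PREREQ_PRESENT")) {	/* illustrative precondition */
		printf("test: [SKIP] prerequisite not available\n");
		return KSFT_SKIP;
	}
	printf("test: [PASS]\n");
	return 0;
}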
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 685814466b ]
When zram test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative result
even when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7d5311d4a ]
When user test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative result
even when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run. Add an explicit check
for module presence and return skip code if module isn't present.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c7db6ffb83 ]
When sysctl test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative result
even when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Changed return code to kselftest skip code in skip error legs that check
requirements and module probe test error leg.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8781578087 ]
When static_keys test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as a fail
by the Kselftest framework. This leads to false negative result even when
the test could not be run.
Change it to return kselftest skip code when a test gets skipped to clearly
report that the test could not be run.
Add explicit searches for the test_static_key_base and test_static_keys
modules and return the skip code if they aren't found, to differentiate between
the failure-to-load-module condition and the module-not-found condition.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 856e7c4b61 ]
When pstore_post_reboot test gets skipped because of unmet dependencies
and/or unsupported configuration, it returns 0 which is treated as a pass
by the Kselftest framework. This leads to false positive result even when
the test could not be run.
Change it to return kselftest skip code when a test gets skipped to clearly
report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ad9852af97 ]
The helper module would be unloaded after nf_conntrack_helper_unregister,
so it may cause a panic due to a race.
nf_ct_iterate_destroy(unhelp, me) resets the conntrack's helper to NULL,
but someone may have obtained the helper pointer during this period. It
would then panic when it accesses the helper after the module was unloaded.
Take an example as following:
CPU0 CPU1
ctnetlink_dump_helpinfo
helper = rcu_dereference(help->helper);
unhelp
set helper as NULL
unload helper module
helper->to_nlattr(skb, ct);
As shown above, cpu0 tries to access the helper after its module has been
unloaded, and the panic happens.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ce7bc036a ]
It is a waste of memory to use a full "struct netns_sysctl_ipv6"
while only one pointer is really used, considering netns_sysctl_ipv6
keeps growing.
Also, since "struct netns_frags" has cache line alignment,
it is better to move the frags_hdr pointer outside, otherwise
we spend a full cache line for this pointer.
This saves 192 bytes of memory per netns.
Fixes: c038a767cd ("ipv6: add a new namespace for nf_conntrack_reasm")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8195a655e5 ]
On this system EC interrupt triggers constantly kicking devices out of
low power states and thus blocking power management. The system also has
a PCIe root port hosting Alpine Ridge Thunderbolt controller and it
never gets a chance to go to D3cold because of this.
Since the power button works the same regardless if EC interrupt is
enabled or not during s2idle, add a quirk for this machine that sets
ec_no_wakeup=true preventing spurious wakeups.
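One plausible shape for such a quirk, sketched with illustrative DMI match
strings (the real table and flag live in drivers/acpi/ec.c):
#include <linux/dmi.h>
#include <linux/types.h>

static bool ec_no_wakeup;	/* stand-in for the flag named above */

static const struct dmi_system_id acpi_ec_no_wakeup_dmi[] = {
	{
		.ident = "Machine with spurious EC wakeups",
		.matches = {
			DMI_MATCH(DMI_SYS_VENDOR, "EXAMPLE VENDOR"),
			DMI_MATCH(DMI_PRODUCT_NAME, "EXAMPLE MODEL"),
		},
	},
	{ }
};

static void acpi_ec_apply_dmi_quirks(void)
{
	/* Keep the EC interrupt masked during s2idle on matching machines. */
	if (dmi_check_system(acpi_ec_no_wakeup_dmi))
		ec_no_wakeup = true;
}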
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 896e518883 ]
The clocks have already been explicitly disabled and put as part of
remove() so the runtime suspend callback must not be run when balancing
the runtime PM usage count before returning.
Fixes: 16adc674d0 ("usb: dwc3: add generic OF glue layer")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3637f12faf ]
Correct the MIPI/PCIe/USB_HSIC PGC offsets based on the
design RTL; the values in the Reference Manual
(Rev. 1, 01/2018 and older) are incorrect.
The correct offset values should be as below:
0x800 ~ 0x83F: PGC for core0 of A7 platform;
0x840 ~ 0x87F: PGC for core1 of A7 platform;
0x880 ~ 0x8BF: PGC for SCU of A7 platform;
0xA00 ~ 0xA3F: PGC for fastmix/megamix;
0xC00 ~ 0xC3F: PGC for MIPI PHY;
0xC40 ~ 0xC7F: PGC for PCIe_PHY;
0xC80 ~ 0xCBF: PGC for USB OTG1 PHY;
0xCC0 ~ 0xCFF: PGC for USB OTG2 PHY;
0xD00 ~ 0xD3F: PGC for USB HSIC PHY;
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Fixes: 03aa12629f ("soc: imx: Add GPCv2 power gating driver")
Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 91bb8f45f7 ]
Commit cc66b30382 ("hwmon: (nct6775) Rework temperature source and label
handling") changed a loop limit from "data->temp_label_num - 1" to "32",
as part of moving from a string array to a bit mask. This results in the
following error, reported by UBSAN.
UBSAN: Undefined behaviour in drivers/hwmon/nct6775.c:4179:27
shift exponent 32 is too large for 32-bit type 'long unsigned int'
Similar to the original loop, the limit has to be one less than the
number of bits.
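A self-contained illustration of the bug class, assuming a 32-bit unsigned
long as in the report (not the nct6775 code itself):
#include <stdio.h>

int main(void)
{
	unsigned long mask = 0;
	unsigned int width = sizeof(mask) * 8;
	unsigned int i;

	/* The shift below is by (i + 1), so the loop limit must be one
	 * less than the number of bits, or the last shift is undefined. */
	for (i = 0; i < width - 1; i++)
		mask |= 1UL << (i + 1);

	printf("mask = %#lx\n", mask);
	return 0;
}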
Fixes: cc66b30382 ("hwmon: (nct6775) Rework temperature source and label handling")
Reported-by: Paul Menzel <pmenzel+linux-hwmon@molgen.mpg.de>
Cc: Paul Menzel <pmenzel+linux-hwmon@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel+linux-hwmon@molgen.mpg.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d409014e4f ]
Compared to other clients the Linux smb3 client ramps up
credits very slowly, taking more than 128 operations before a
maximum size write could be sent (since the number of credits
requested is only 2 per small operation, causing the credit
limit to grow very slowly).
This initial lack of credits would impact large i/o performance
when large i/o is attempted early, before enough credits have been built up.
Signed-off-by: Steve French <sfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 47cf52a246 ]
Modern distros increasingly make use of BPF programs. A default
Ubuntu 18.04 installation boots with a number of cgroup_skb
programs loaded.
test_offloads.py tries to check if programs and maps are not
leaked on error paths by confirming the list of programs on the
system is empty between tests.
Since we can no longer expect the system to have no BPF objects
at boot, try to remember the programs and maps present at the start,
and skip those when scanning the system.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 74c11e300c ]
GCC built for arc*-*-linux has "-mmedium-calls" implicitly enabled by default
thus we don't see any problems during Linux kernel compilation.
----------------------------->8------------------------
arc-linux-gcc -mcpu=arc700 -Q --help=target | grep calls
-mlong-calls [disabled]
-mmedium-calls [enabled]
----------------------------->8------------------------
But if we try to use so-called Elf32 toolchain with GCC configured for
arc*-*-elf* then we'd see the following failure:
----------------------------->8------------------------
init/do_mounts.o: In function 'init_rootfs':
do_mounts.c:(.init.text+0x108): relocation truncated to fit: R_ARC_S21W_PCREL
against symbol 'unregister_filesystem' defined in .text section in fs/filesystems.o
arc-elf32-ld: final link failed: Symbol needs debug section which does not exist
make: *** [vmlinux] Error 1
----------------------------->8------------------------
That happens because neither "-mmedium-calls" nor "-mlong-calls" are enabled in
Elf32 GCC:
----------------------------->8------------------------
arc-elf32-gcc -mcpu=arc700 -Q --help=target | grep calls
-mlong-calls [disabled]
-mmedium-calls [disabled]
----------------------------->8------------------------
Now, to make it possible to use the Elf32 toolchain for building the Linux
kernel, we explicitly add "-mmedium-calls" to CFLAGS.
And since we add "-mmedium-calls" to the global CFLAGS, there's no point in
having per-file copies, so remove them.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3dc7c7badb ]
Before returning -EPERM we should release some resources, as already done
in the other error handling path of the function.
Fixes: d8f9cc328c ("IB/mlx4: Mark user MR as writable if actual virtual memory is writable")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 645a397d32 ]
The documentation for the touchscreen-swapped-x-y property states that
swapping is done after inverting if both are used. RMI4 did it the other
way around, leading to inconsistent behavior with regard to other
touchscreens.
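A sketch of the documented ordering with an illustrative helper (not the
RMI4 code): inversion is applied first, then the swap.
static void ts_transform(unsigned int *x, unsigned int *y,
			 unsigned int max_x, unsigned int max_y,
			 unsigned int invert_x, unsigned int invert_y,
			 unsigned int swap_xy)
{
	if (invert_x)
		*x = max_x - *x;
	if (invert_y)
		*y = max_y - *y;

	if (swap_xy) {			/* the swap happens after inversion */
		unsigned int tmp = *x;

		*x = *y;
		*y = tmp;
	}
}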
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Nick Dyer <nick@shmanahar.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 425cf5c135 ]
Some RoCE specific code in qedr_modify_qp was run over an iWARP device
when running perftest benchmarks without the -R option.
The commit 3e44e0ee08 ("IB/providers: Avoid null netdev check for RoCE")
exposed this. Dropping the check for NULL pointer on ndev in
qedr_modify_qp led to a null pointer dereference when running over
iWARP. Before, the code would identify ndev as being NULL and return an
error.
Fixes: 3e44e0ee08 ("IB/providers: Avoid null netdev check for RoCE")
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 828d810550 ]
In rxe_send, when network_type is not RDMA_NETWORK_IPV4 or
RDMA_NETWORK_IPV6, skb is freed and -EINVAL is returned.
Then rxe_xmit_packet will return -EINVAL too, and in rxe_requester
this skb is freed a second time.
In rxe_requester, kfree_skb is needed only after fill_packet fails,
so kfree_skb is moved from the err label to the fill_packet error check.
Fixes: 5793b46521 ("IB/rxe: remove unnecessary skb_clone in xmit")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0f5c6c30a0 ]
The mvneta Ethernet driver is used on a few different Marvell SoCs.
Some SoCs have per cpu interrupts for Ethernet events, the driver uses
a per CPU napi structure for this case. Some SoCs such as armada 3700
have a single interrupt for Ethernet events, the driver uses a global
napi structure for this case.
Currently mvneta_config_rss() always operates on the per-CPU napi structure.
Fix it by operating on the global napi in the "single interrupt" case, and on
the per-CPU napi structure in the remaining cases.
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Fixes: 2636ac3cc2 ("net: mvneta: Add network support for Armada 3700 SoC")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7a86f05faf ]
The mvneta Ethernet driver is used on a few different Marvell SoCs.
Some SoCs have per cpu interrupts for Ethernet events. Some SoCs have
a single interrupt, independent of the CPU. The driver handles this by
having a per CPU napi structure when there are per CPU interrupts, and
a global napi structure when there is a single interrupt.
When the napi core calls mvneta_poll(), it passes the napi
instance. This was not being propagated through the call chain, and
instead the per-cpu napi instance was passed to the napi_gro_receive()
call. This breaks when there is a single global napi instance.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 2636ac3cc2 ("net: mvneta: Add network support for Armada 3700 SoC")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7bb05b85bc ]
It was found that the Ethernet network on the ASUS X441UAR doesn't come back
on resume from suspend when using MSI-X. The chip is RTL8106e - version 39.
[ 21.848357] libphy: r8169: probed
[ 21.848473] r8169 0000:02:00.0 eth0: RTL8106e, 0c:9d:92:32:67:b4, XID
44900000, IRQ 127
[ 22.518860] r8169 0000:02:00.0 enp2s0: renamed from eth0
[ 29.458041] Generic PHY r8169-200:00: attached PHY driver [Generic
PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
[ 63.227398] r8169 0000:02:00.0 enp2s0: Link is Up - 100Mbps/Full -
flow control off
[ 124.514648] Generic PHY r8169-200:00: attached PHY driver [Generic
PHY] (mii_bus:phy_addr=r8169-200:00, irq=IGNORE)
Here is the ethernet controller in detail:
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
(rev 07)
Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
Ethernet controller [1043:200f]
Flags: bus master, fast devsel, latency 0, IRQ 16
I/O ports at e000 [size=256]
Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
Memory at e0000000 (64-bit, prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: r8169
Kernel modules: r8169
Falling back to MSI fixes the issue.
Fixes: 6c6aa15fde ("r8169: improve interrupt handling")
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e22002aa8 ]
It was possible to directly leak the kernel address where the isdn_dev
structure pointer was stored. This is a kernel ASLR bypass for anyone
with access to the ioctl. The code had been present since the beginning
of git history, but it shouldn't ever be needed for normal operation,
so remove it.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Karsten Keil <isdn@linux-pingi.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f294d00961 upstream.
Make sure to disable clocks and deregister any exported partitions
before returning on late probe errors.
Note that since commit ee895ccdf7 ("misc: sram: fix enabled clock leak
on error path"), partitions are deliberately exported before enabling
the clock so we stick to that logic here. A follow up patch will address
this.
Cc: stable <stable@vger.kernel.org> # 4.9
Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dfcab6ba57 upstream.
dw8250_set_termios() doesn't set baud rate if the arg "old ktermios" is
NULL. This happens during resume.
Call Trace:
...
[ 54.928108] dw8250_set_termios+0x162/0x170
[ 54.928114] serial8250_set_termios+0x17/0x20
[ 54.928117] uart_change_speed+0x64/0x160
[ 54.928119] uart_resume_port
...
So the baud rate is not restored after S3, which breaks apps that use the
UART, for example the console, bluetooth, etc.
We address this issue by setting the baud rate irrespective of the arg
"old", just like the drivers for other 8250 IPs. This has been tested on the
Intel Broxton platform.
Signed-off-by: Chen Hu <hu1.chen@intel.com>
Fixes: 4e26b134bd ("serial: 8250_dw: clock rate handling for all ACPI platforms")
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60ab0fafc4 upstream.
The sleep wake-up refactoring that I introduced in
commit c7e1b40590 ("tty: serial: exar: Relocate sleep wake-up handling")
did not account for devices with a slave device on the expansion port.
This patch pokes the INT0 register in the slave device, if present, in
order to ensure that MSI interrupts don't get permanently "stuck"
because of a sleep wake-up interrupt as described here:
commit 2c0ac5b48a ("serial: exar: Fix stuck MSIs")
This also converts an ioread8() to readb() in order to provide visual
consistency with the MMIO-only accessors used elsewhere in the driver.
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Fixes: c7e1b40590 ("tty: serial: exar: Relocate sleep wake-up handling")
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47ac76662c upstream.
Revert commit ecb988a3b7: tty: serial:
8250: 8250_core: NXP SC16C2552 workaround
The above commit causes a userland application to no longer correctly perform
its first write to a dumb terminal connected to /dev/ttyS0.
This commit seems to be the culprit. It's as though the TX FIFO is being
reset during that write. What should be displayed is:
PSW 80000000 INST 00000000 HALT
//
What is displayed is some variation of:
T 00000000 HAL//
Reverting this commit via this patch fixes my problem.
Signed-off-by: Mark Hounschell <dmarkh@cfl.rr.com>
Fixes: ecb988a3b7 ("tty: serial: 8250: 8250_core: NXP SC16C2552 workaround")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 231f941500 upstream.
Every time I tried to upgrade my laptop from 3.10.x to 4.x I faced an
issue by which the fan would run at full speed upon resume. Bisecting
it showed me the issue was introduced in 3.17 by commit 821d6f0359
(ACPI / sleep: Do not save NVS for new machines to accelerate S3). This
code only affects machines built starting as of 2012, but this Asus
1025C laptop was made in 2012 and apparently needs the NVS data to be
saved, otherwise the CPU's thermal state is not properly reported on
resume and the fan runs at full speed upon resume.
Here's a very simple way to check if such a machine is affected :
# cat /sys/class/thermal/thermal_zone0/temp
55000
( now suspend, wait one second and resume )
# cat /sys/class/thermal/thermal_zone0/temp
0
(and after ~15 seconds the fan starts to spin)
Let's apply the same quirk as commit cbc00c13 (ACPI: save NVS memory
for Lenovo G50-45) and reuse the function it provides. Note that this
commit was already backported to 4.9.x but not 4.4.x.
Cc: 3.17+ <stable@vger.kernel.org> # 3.17+: requires cbc00c13
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e60870012e upstream.
The portdata spinlock can be taken in interrupt context (via
sierra_outdat_callback()).
Disable interrupts when taking the portdata spinlock when discarding
deferred URBs during close to prevent a possible deadlock.
Fixes: 014333f77c ("USB: sierra: fix urb and memory leak on disconnect")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
[ johan: amend commit message and add fixes and stable tags ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a49a71f6e2 upstream.
The sanity checks in ALSA sequencer and OSS sequencer emulation codes
return falsely -ENXIO from poll callback. They should be EPOLLERR
instead.
This was caught thanks to the recent change to the return value.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3acd3e3bab upstream.
The endian conversions used in vxp_dma_read() and vxp_dma_write() are
superfluous and even wrong on big-endian machines, as inw() and outw()
already do conversions. Kill them.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dfef01e150 upstream.
snd_dma_alloc_pages_fallback() tries to allocate pages again when the
allocation fails with reduced size. But the first try actually
*increases* the size to power-of-two, which may give back a larger
chunk than the requested size. This confuses the callers, e.g. sgbuf
assumes that the size is equal or less, and it may result in a bad
loop due to the underflow and eventually lead to Oops.
The code of this function seems incorrectly assuming the usage of
get_order(). We need to decrease at first, then align to
power-of-two.
Reported-and-tested-by: he, bo <bo.he@intel.com>
Reported-by: zhang jun <jun.zhang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69756930f2 upstream.
One place in cs5535audio_build_dma_packets() does an extra conversion
via cpu_to_le32(); namely jmpprd_addr is passed to setup_prd() ops,
which writes the value via cs_writel(). That is, the callback does
the conversion by itself, and we don't need to convert beforehand.
This patch fixes that bogus conversion.
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50e9ffb199 upstream.
The virmidi output trigger tries to parse all the available bytes and
process as many sequencer events as possible. In a normal situation,
this is supposed to be relatively short, but a program may give a huge
buffer and it'll take a long time in a single spin lock, which may
eventually lead to a soft lockup.
This patch simply adds a workaround, a cond_resched() call in the loop
if applicable. A better solution would be to move the event processor
into a work, but let's put a duct-tape quickly at first.
Reported-and-tested-by: Dae R. Jeong <threeearcat@gmail.com>
Reported-by: syzbot+619d9f40141d826b097e@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fff71a4c05 upstream.
The endian conversions used in vx2_dma_read() and vx2_dma_write() are
superfluous and even wrong on big-endian machines, as inl() and outl()
already do conversions. Kill them.
Spotted by sparse, a warning like:
sound/pci/vx222/vx222_ops.c:278:30: warning: incorrect type in argument 1 (different base types)
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d77a4b4a5b upstream.
As a codec equivalent to the CX20724,
the CX8200 is also subject to the reboot bug.
Late 2017 and 2018 LG Gram and some HP Spectre laptops are known victims
of this issue, which causes extremely loud noises upon reboot.
Now that we know that this bug is subject to multiple codecs,
fix the comment as well.
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f59cf9a055 upstream.
On rare occasions, we still notice the internal speaker
spitting out spurious noises even after adding the problematic codec
to the list.
Adding a 10ms artificial delay before rebooting fixes the issue entirely.
Patch for Realtek codecs also adds the same amount of delay after
entering D3.
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 330bdcfadc ]
AF_RXRPC has a keepalive message generator that generates a message for a
peer ~20s after the last transmission to that peer to keep firewall ports
open. The implementation is incorrect in the following ways:
(1) It mixes up ktime_t and time64_t types.
(2) It uses ktime_get_real(), the output of which may jump forward or
backward due to adjustments to the time of day.
(3) If the current time jumps forward too much or jumps backwards, the
generator function will crank the base of the time ring round one slot
at a time (ie. a 1s period) until it catches up, spewing out VERSION
packets as it goes.
Fix the problem by:
(1) Only using time64_t. There's no need for sub-second resolution.
(2) Use ktime_get_seconds() rather than ktime_get_real() so that time
isn't perceived to go backwards.
(3) Simplifying rxrpc_peer_keepalive_worker() by splitting it into two
parts:
(a) The "worker" function that manages the buckets and the timer.
(b) The "dispatch" function that takes the pending peers and
potentially transmits a keepalive packet before putting them back
in the ring into the slot appropriate to the revised last-Tx time.
(4) Taking everything that's pending out of the ring and splicing it into
a temporary collector list for processing.
In the case that there's been a significant jump forward, the ring
gets entirely emptied and then the time base can be warped forward
before the peers are processed.
The warping can't happen if the ring isn't empty because the slot a
peer is in is keepalive-time dependent, relative to the base time.
(5) Limit the number of iterations of the bucket array when scanning it.
(6) Set the timer to skip any empty slots as there's no point waking up if
there's nothing to do yet.
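As a rough sketch of the bucketing in points (3) and (4), with illustrative
names and ring size (the real net/rxrpc code differs):
#define KEEPALIVE_INTERVAL	20	/* seconds, per the ~20s rule above */
#define KEEPALIVE_SLOTS		(KEEPALIVE_INTERVAL + 1)	/* illustrative */

/* Map a peer's last-Tx time (whole seconds) to a ring slot past the base. */
static unsigned int keepalive_slot(long long base, long long last_tx)
{
	long long due = last_tx + KEEPALIVE_INTERVAL - base;

	if (due < 0)
		due = 0;			/* overdue: send now */
	if (due >= KEEPALIVE_SLOTS)
		due = KEEPALIVE_SLOTS - 1;	/* far future: last bucket */
	return (unsigned int)due;
}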
This can be triggered by an incoming call from a server after a reboot with
AF_RXRPC and AFS built into the kernel causing a peer record to be set up
before userspace is started. The system clock is then adjusted by
userspace, thereby potentially causing the keepalive generator to have a
meltdown - which leads to a message like:
watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [kworker/0:1:23]
...
Workqueue: krxrpcd rxrpc_peer_keepalive_worker
EIP: lock_acquire+0x69/0x80
...
Call Trace:
? rxrpc_peer_keepalive_worker+0x5e/0x350
? _raw_spin_lock_bh+0x29/0x60
? rxrpc_peer_keepalive_worker+0x5e/0x350
? rxrpc_peer_keepalive_worker+0x5e/0x350
? __lock_acquire+0x3d3/0x870
? process_one_work+0x110/0x340
? process_one_work+0x166/0x340
? process_one_work+0x110/0x340
? worker_thread+0x39/0x3c0
? kthread+0xdb/0x110
? cancel_delayed_work+0x90/0x90
? kthread_stop+0x70/0x70
? ret_from_fork+0x19/0x24
Fixes: ace45bec6d ("rxrpc: Fix firewall route keepalive")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7c53a72245 ]
There have been two reports that network doesn't come back on resume
from suspend when using MSI-X. Both cases affect the same chip version
(RTL8168g - version 40), on different systems. Falling back to MSI
fixes the issue.
Even though we don't really have a proof yet that the network chip
version is to blame, let's disable MSI-X for this version.
Reported-by: Steve Dodd <steved424@gmail.com>
Reported-by: Lou Reed <gogen@disroot.org>
Tested-by: Steve Dodd <steved424@gmail.com>
Tested-by: Lou Reed <gogen@disroot.org>
Fixes: 6c6aa15fde ("r8169: improve interrupt handling")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 816f670623 ]
The current check relies on function BDF addresses and can get
us wrong, e.g. when two VFs are assigned into a VM and the PCI
v-address is set by the hypervisor.
Fixes: 5c65c564c9 ('net/mlx5e: Support offloading TC NIC hairpin flows')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Alaa Hleihel <alaa@mellanox.com>
Tested-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit caebd1b389 ]
In a previous patch mlxsw_afa_resource_del() was added to avoid a duplicate
resource destruction scenario.
For mirror actions, such duplicate destruction leads to a crash, as in:
# tc qdisc add dev swp49 ingress
# tc filter add dev swp49 parent ffff: \
protocol ip chain 100 pref 10 \
flower skip_sw dst_ip 192.168.101.1 action drop
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
action mirred egress mirror dev swp4
Therefore add a call to mlxsw_afa_resource_del() in
mlxsw_afa_mirror_destroy() in order to clear that resource
from rule's resources.
Fixes: d0d13c1858 ("mlxsw: spectrum_acl: Add support for mirror action")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7cc6169493 ]
Each tc flower rule uses a hidden count action. As counter resource may
not be available due to limited HW resources, update _counter_create()
and _counter_destroy() pair to follow previously introduced symmetric
error condition handling, add a call to mlxsw_afa_resource_del() as part
of the counter resource destruction.
Fixes: c18c1e186b ("mlxsw: core: Make counter index allocated inside the action append")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dda0a3a3fb ]
Some ACL actions require the allocation of a separate resource
prior to applying the action itself. When facing an error condition
during the setup phase of the action, the resource should be destroyed.
For such actions the destruction was done twice, which is dangerous
and led to a potential crash.
The destruction took place first upon an error in the action setup phase
and then again as the rule was destroyed.
The following sequence generated a crash:
# tc qdisc add dev swp49 ingress
# tc filter add dev swp49 parent ffff: \
protocol ip chain 100 pref 10 \
flower skip_sw dst_ip 192.168.101.1 action drop
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 action goto chain 100 \
action mirred egress mirror dev swp4
Therefore add mlxsw_afa_resource_del() as a complement of
mlxsw_afa_resource_add() to add symmetry to resource_list membership
handling. Call this from mlxsw_afa_fwd_entry_ref_destroy() to make the
_fwd_entry_ref_create() and _fwd_entry_ref_destroy() pair of calls a
NOP.
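A generic sketch of the symmetric bookkeeping described above (plain list
handling with illustrative names, not the mlxsw code):
#include <linux/list.h>

struct afa_resource {
	struct list_head list;
};

static void afa_resource_add(struct list_head *block_resources,
			     struct afa_resource *res)
{
	list_add(&res->list, block_resources);
}

static void afa_resource_del(struct afa_resource *res)
{
	/* Unlink on the error path so the later rule teardown skips it. */
	list_del(&res->list);
}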
Fixes: 140ce42121 ("mlxsw: core: Convert fwd_entry_ref list to be generic per-block resource list")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 82a40777de ]
According to RFC 791, 68 bytes is the minimum size of an IPv4 datagram every
device must be able to forward without further fragmentation, while 576
bytes is the minimum size of an IPv4 datagram every device has to be able
to receive. So 68 (IPV4_MIN_MTU) is the right value for the IPv4 min MTU
check in ip6_tnl_xmit().
While at it, change the code to use max() instead of an if statement.
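A small sketch of the resulting clamp (illustrative helper and constants; the
kernel code uses the max() macro with IPV4_MIN_MTU/IPV6_MIN_MTU directly):
#define SKETCH_IPV4_MIN_MTU	68	/* RFC 791 minimum forwardable datagram */
#define SKETCH_IPV6_MIN_MTU	1280	/* RFC 8200 minimum link MTU */

static unsigned int tnl_min_mtu(unsigned int mtu, int payload_is_ipv6)
{
	unsigned int min_mtu = payload_is_ipv6 ? SKETCH_IPV6_MIN_MTU
					       : SKETCH_IPV4_MIN_MTU;

	return mtu > min_mtu ? mtu : min_mtu;	/* i.e. max(mtu, min_mtu) */
}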
Fixes: c9fefa0819 ("ip6_tunnel: get the min mtu properly in ip6_tnl_xmit")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 11ba961c91 ]
It was noticed that the NIC always passes all multicast traffic to the host
regardless of the IFF_ALLMULTI flag on the interface.
The rule in the NIC's MC Filter Table that is configured to accept any
multicast packets is turned on if the IFF_MULTICAST flag is set on the
interface. This leads to passing all multicast traffic to the host.
This fix changes the condition to turn on that rule by checking the
IFF_ALLMULTI flag, as it should.
Fixes: b21f502f84 ("net:ethernet:aquantia: Fix for multicast filter handling.")
Signed-off-by: Dmitry Bogdanov <dmitry.bogdanov@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3757b255bf ]
Spectrum switch ACL action set is built in groups of three actions
which may point to additional actions. A group holds a single record
which can be set as goto record for pointing at a following group
or can be set to mark the termination of the lookup. This is perfectly
adequate for handling a series of actions to be executed on a packet.
While the SW model allows configuration of conflicting actions
where it is clear that some actions will never execute, the mlxsw
driver must block such configurations as it creates a conflict
over the single terminate/goto record value.
For a conflicting actions configuration such as:
# tc filter add dev swp49 parent ffff: \
protocol ip pref 10 \
flower skip_sw dst_ip 192.168.101.1 \
action goto chain 100 \
action mirred egress mirror dev swp4
Where it is clear that the last action will never execute, the
mlxsw driver was issuing a warning instead of returning an error.
Therefore replace that warning with an error for this specific
case.
Fixes: 4cda7d8d70 ("mlxsw: core: Introduce flexible actions support")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 455f05ecd2 ]
syzbot reported that we reinitialize an active delayed
work in vsock_stream_connect():
ODEBUG: init active (active state 0) object type: timer_list hint:
delayed_work_timer_fn+0x0/0x90 kernel/workqueue.c:1414
WARNING: CPU: 1 PID: 11518 at lib/debugobjects.c:329
debug_print_object+0x16a/0x210 lib/debugobjects.c:326
The pattern is apparently wrong: we should only initialize
the delayed work once and can then schedule it repeatedly. So we
have to move the initializations out to the allocation side.
And to avoid confusion, we can split the shared dwork
into two, instead of re-using the same one.
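As a sketch of the corrected pattern (illustrative struct and names, not the
vsock diff itself): initialize the delayed work once at allocation time, then
only schedule it afterwards.
#include <linux/slab.h>
#include <linux/workqueue.h>

struct conn_sock {
	struct delayed_work connect_work;
};

static void connect_work_fn(struct work_struct *work)
{
	/* timeout handling would go here */
}

static struct conn_sock *conn_sock_alloc(void)
{
	struct conn_sock *csk = kzalloc(sizeof(*csk), GFP_KERNEL);

	if (csk)
		INIT_DELAYED_WORK(&csk->connect_work, connect_work_fn);
	return csk;
}

/* Later, e.g. on every connect attempt: only (re)schedule, never re-init. */
static void conn_sock_arm_timeout(struct conn_sock *csk, unsigned long timeout)
{
	schedule_delayed_work(&csk->connect_work, timeout);
}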
Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Reported-by: <syzbot+8a9b1bd330476a4f3db6@syzkaller.appspotmail.com>
Cc: Andy king <acking@vmware.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0dcb82254d ]
llc_sap_put() decreases the refcnt before deleting the sap
from the global list. Therefore, there is a chance
llc_sap_find() could find a sap with a zero refcnt
in this global list.
Close this race condition by checking in llc_sap_find() whether
the refcnt is zero; if it is, the sap is being
removed, so we can just treat it as gone.
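A minimal sketch of that lookup rule (illustrative struct, not the llc code):
#include <linux/refcount.h>

struct sap_obj {
	refcount_t refcnt;
};

/* Only hand out a reference if the object isn't already on its way out. */
static bool sap_hold_safe(struct sap_obj *sap)
{
	return refcount_inc_not_zero(&sap->refcnt);
}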
Reported-by: <syzbot+278893f3f7803871f7ce@syzkaller.appspotmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 61ef4b07fc ]
The shift of 'cwnd' with '(now - hc->tx_lsndtime) / hc->tx_rto' value
can lead to undefined behavior [1].
In order to fix this use a gradual shift of the window with a 'while'
loop, similar to what tcp_cwnd_restart() is doing.
When comparing delta and RTO there is a minor difference between TCP
and DCCP, the last one also invokes dccp_cwnd_restart() and reduces
'cwnd' if delta equals RTO. That case is preserved in this change.
[1]:
[40850.963623] UBSAN: Undefined behaviour in net/dccp/ccids/ccid2.c:237:7
[40851.043858] shift exponent 67 is too large for 32-bit type 'unsigned int'
[40851.127163] CPU: 3 PID: 15940 Comm: netstress Tainted: G W E 4.18.0-rc7.x86_64 #1
...
[40851.377176] Call Trace:
[40851.408503] dump_stack+0xf1/0x17b
[40851.451331] ? show_regs_print_info+0x5/0x5
[40851.503555] ubsan_epilogue+0x9/0x7c
[40851.548363] __ubsan_handle_shift_out_of_bounds+0x25b/0x2b4
[40851.617109] ? __ubsan_handle_load_invalid_value+0x18f/0x18f
[40851.686796] ? xfrm4_output_finish+0x80/0x80
[40851.739827] ? lock_downgrade+0x6d0/0x6d0
[40851.789744] ? xfrm4_prepare_output+0x160/0x160
[40851.845912] ? ip_queue_xmit+0x810/0x1db0
[40851.895845] ? ccid2_hc_tx_packet_sent+0xd36/0x10a0 [dccp]
[40851.963530] ccid2_hc_tx_packet_sent+0xd36/0x10a0 [dccp]
[40852.029063] dccp_xmit_packet+0x1d3/0x720 [dccp]
[40852.086254] dccp_write_xmit+0x116/0x1d0 [dccp]
[40852.142412] dccp_sendmsg+0x428/0xb20 [dccp]
[40852.195454] ? inet_dccp_listen+0x200/0x200 [dccp]
[40852.254833] ? sched_clock+0x5/0x10
[40852.298508] ? sched_clock+0x5/0x10
[40852.342194] ? inet_create+0xdf0/0xdf0
[40852.388988] sock_sendmsg+0xd9/0x160
...
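A sketch of the gradual-shift idea (illustrative helper operating on plain
integers; the real ccid2/tcp code works on socket state):
static unsigned int cwnd_restart(unsigned int cwnd, unsigned int restart_cwnd,
				 long delta, long rto)
{
	/* Halve the window once per elapsed RTO instead of one huge shift. */
	while (delta >= rto && cwnd > restart_cwnd) {
		cwnd >>= 1;
		delta -= rto;
	}
	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}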
Fixes: 113ced1f52 ("dccp ccid-2: Perform congestion-window validation")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f19f5c49bb upstream.
It turns out that we should *not* invert all not-present mappings,
because the all zeroes case is obviously special.
clear_page() does not undergo the XOR logic to invert the address bits,
i.e. PTE, PMD and PUD entries that have not been individually written
will have val=0 and so will trigger __pte_needs_invert(). As a result,
{pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones
(adjusted by the max PFN mask) instead of zero. A zeroed entry is ok
because the page at physical address 0 is reserved early in boot
specifically to mitigate L1TF, so explicitly exempt them from the
inversion when reading the PFN.
Manifested as an unexpected mprotect(..., PROT_NONE) failure when called
on a VMA that has VM_PFNMAP and was mmap'd to as something other than
PROT_NONE but never used. mprotect() sends the PROT_NONE request down
prot_none_walk(), which walks the PTEs to check the PFNs.
prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns
-EACCES because it thinks mprotect() is trying to adjust a high MMIO
address.
[ This is a very modified version of Sean's original patch, but all
credit goes to Sean for doing this and also pointing out that
sometimes the __pte_needs_invert() function only gets the protection
bits, not the full eventual pte. But zero remains special even in
just protection bits, so that's ok. - Linus ]
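A simplified, self-contained sketch of the special case (see arch/x86's
__pte_needs_invert() for the real definition; the present-bit constant here
is illustrative):
#define SKETCH_PAGE_PRESENT	0x1ULL	/* bit 0, like x86's _PAGE_PRESENT */

static inline int pte_needs_invert_sketch(unsigned long long val)
{
	/* An all-zero (never written) entry is special: do not invert it. */
	return val && !(val & SKETCH_PAGE_PRESENT);
}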
Fixes: f22cc87f6c ("x86/speculation/l1tf: Invert all not present mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e0fb5df2e upstream.
ioremap() calls pud_free_pmd_page() / pmd_free_pte_page() when it creates
a pud / pmd map. The following preconditions are met at their entry.
- All pte entries for a target pud/pmd address range have been cleared.
- System-wide TLB purges have been performed for a target pud/pmd address
range.
The preconditions assure that there is no stale TLB entry for the range.
Speculation may not cache TLB entries since it requires all levels of page
entries, including ptes, to have P & A-bits set for an associated address.
However, speculation may cache pud/pmd entries (paging-structure caches)
when they have P-bit set.
Add a system-wide TLB purge (INVLPG) to a single page after clearing
pud/pmd entry's P-bit.
SDM 4.10.4.1, Operation that Invalidate TLBs and Paging-Structure Caches,
states that:
INVLPG invalidates all paging-structure caches associated with the
current PCID regardless of the linear addresses to which they correspond.
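Sketched in terms of the x86 helpers, the ordering described above looks
roughly like this in pmd_free_pte_page() (illustrative only, not the exact
function body):

	pte = (pte_t *)pmd_page_vaddr(*pmd);
	pmd_clear(pmd);

	/* INVLPG flushes the paging-structure caches for this PCID */
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);

	free_page((unsigned long)pte);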
Fixes: 28ee90fe60 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: cpandya@codeaurora.org
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Joerg Roedel <joro@8bytes.org>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-4-toshi.kani@hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8088d3dd4d upstream.
scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page. But in the error case of
skcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.
Fix it by reorganizing skcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
This bug was found by syzkaller fuzzing.
Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
	struct sockaddr_alg addr = {
		.salg_type = "skcipher",
		.salg_name = "cbc(aes-generic)",
	};
	char buffer[4096] __attribute__((aligned(4096))) = { 0 };
	int fd;

	fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(fd, (void *)&addr, sizeof(addr));
	setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
	fd = accept(fd, NULL, NULL);
	write(fd, buffer, 15);
	read(fd, buffer, 15);
}
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: b286d8b1a6 ("crypto: skcipher - Add skcipher walk interface")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 318abdfbe7 upstream.
Like the skcipher_walk and blkcipher_walk cases:
scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page. But in the error case of
ablkcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.
Fix it by reorganizing ablkcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: bf06099db1 ("crypto: skcipher - Add ablkcipher_walk interfaces")
Cc: <stable@vger.kernel.org> # v2.6.35+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0868def3e4 upstream.
Like the skcipher_walk case:
scatterwalk_done() is only meant to be called after a nonzero number of
bytes have been processed, since scatterwalk_pagedone() will flush the
dcache of the *previous* page. But in the error case of
blkcipher_walk_done(), e.g. if the input wasn't an integer number of
blocks, scatterwalk_done() was actually called after advancing 0 bytes.
This caused a crash ("BUG: unable to handle kernel paging request")
during '!PageSlab(page)' on architectures like arm and arm64 that define
ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE, provided that the input was
page-aligned as in that case walk->offset == 0.
Fix it by reorganizing blkcipher_walk_done() to skip the
scatterwalk_advance() and scatterwalk_done() if an error has occurred.
This bug was found by syzkaller fuzzing.
Reproducer, assuming ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
	struct sockaddr_alg addr = {
		.salg_type = "skcipher",
		.salg_name = "ecb(aes-generic)",
	};
	char buffer[4096] __attribute__((aligned(4096))) = { 0 };
	int fd;

	fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(fd, (void *)&addr, sizeof(addr));
	setsockopt(fd, SOL_ALG, ALG_SET_KEY, buffer, 16);
	fd = accept(fd, NULL, NULL);
	write(fd, buffer, 15);
	read(fd, buffer, 15);
}
Reported-by: Liu Chao <liuchao741@huawei.com>
Fixes: 5cde0af2a9 ("[CRYPTO] cipher: Added block cipher type")
Cc: <stable@vger.kernel.org> # v2.6.19+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb29648102 upstream.
syzbot reported a crash in vmac_final() when multiple threads
concurrently use the same "vmac(aes)" transform through AF_ALG. The bug
is pretty fundamental: the VMAC template doesn't separate per-request
state from per-tfm (per-key) state like the other hash algorithms do,
but rather stores it all in the tfm context. That's wrong.
Also, vmac_final() incorrectly zeroes most of the state including the
derived keys and cached pseudorandom pad. Therefore, only the first
VMAC invocation with a given key calculates the correct digest.
Fix these bugs by splitting the per-tfm state from the per-request state
and using the proper init/update/final sequencing for requests.
Reproducer for the crash:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
	int fd;
	struct sockaddr_alg addr = {
		.salg_type = "hash",
		.salg_name = "vmac(aes)",
	};
	char buf[256] = { 0 };

	fd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	bind(fd, (void *)&addr, sizeof(addr));
	setsockopt(fd, SOL_ALG, ALG_SET_KEY, buf, 16);
	fork();
	fd = accept(fd, NULL, NULL);
	for (;;)
		write(fd, buf, 256);
}
The immediate cause of the crash is that vmac_ctx_t.partial_size exceeds
VMAC_NHBYTES, causing vmac_final() to memset() a negative length.
Reported-by: syzbot+264bca3a6e8d645550d3@syzkaller.appspotmail.com
Fixes: f1939f7c56 ("crypto: vmac - New hash algorithm for intel_txt support")
Cc: <stable@vger.kernel.org> # v2.6.32+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73bf20ef3d upstream.
The VMAC template assumes the block cipher has a 128-bit block size, but
it failed to check for that. Thus it was possible to instantiate it
using a 64-bit block size cipher, e.g. "vmac(cast5)", causing
uninitialized memory to be used.
Add the needed check when instantiating the template.
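The check amounts to rejecting ciphers whose block size is not 128 bits at
template instantiation time, roughly (sketch):

	/* VMAC is only defined for 128-bit block ciphers */
	if (alg->cra_blocksize != 16)
		return -EINVAL;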
Fixes: f1939f7c56 ("crypto: vmac - New hash algorithm for intel_txt support")
Cc: <stable@vger.kernel.org> # v2.6.32+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f426d2b20f upstream.
The wait_event() function is used to detect command completion. The
interrupt handler will set the wait condition variable when the interrupt
is triggered. However, the variable used for wait_event() is initialized
after the command has been submitted, which can create a race condition
with the interrupt handler and result in the wait_event() never returning.
Move the initialization of the wait condition variable to just before
command submission.
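The ordering fix follows the usual wait_event() pattern; the snippet below is
purely illustrative and uses placeholder names, not the real ccp structures:

	sev->cmd_complete = 0;		/* reset the wait condition first */
	sev_issue_cmd(sev, cmd);	/* hypothetical submit helper */
	wait_event(sev->cmd_queue, sev->cmd_complete);	/* IRQ cannot be missed now */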
Fixes: 200664d523 ("crypto: ccp: Add Secure Encrypted Virtualization (SEV) command support")
Cc: <stable@vger.kernel.org> # 4.16.x-
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit afb31cd2d1 upstream.
Should the PSP initialization fail, the PSP data structure will be
freed and the value contained in the sp_device struct set to NULL.
At module unload, psp_dev_destroy() does not check if the pointer
value is NULL and will end up dereferencing a NULL pointer.
Add a pointer check of the psp_data field in the sp_device struct
in psp_dev_destroy() and return immediately if it is NULL.
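A minimal sketch of the guard (simplified):

	struct psp_device *psp = sp->psp_data;

	if (!psp)
		return;		/* PSP init failed earlier; nothing to tear down */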
Cc: <stable@vger.kernel.org> # 4.16.x-
Fixes: 2a6170dfe7 ("crypto: ccp: Add Platform Security Processor (PSP) device support")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00904aa0cd upstream.
We were copying our last cipher block into the request for use as IV for
all modes of operation. Fix this by discerning the behaviour based on
the mode of operation used: copy ciphertext for CBC, update counter for
CTR.
CC: stable@vger.kernel.org
Fixes: 63ee04c8b4 ("crypto: ccree - add skcipher support")
Reported-by: Hadar Gat <hadar.gat@arm.com>
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 405c018a25 upstream.
Commit d94a155c59 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits
adjustment corruption") has moved the query and calculation of the
x86_virt_bits and x86_phys_bits fields of the cpuinfo_x86 struct
from the get_cpu_cap function to a new function named
get_cpu_address_sizes.
One of the call sites related to Xen PV VMs was unfortunately missed
in the aforementioned commit. This prevents successful boot-up of
kernel versions 4.17 and up in Xen PV VMs if CONFIG_DEBUG_VIRTUAL
is enabled, due to the following code path:
enlighten_pv.c::xen_start_kernel
mmu_pv.c::xen_reserve_special_pages
page.h::__pa
physaddr.c::__phys_addr
physaddr.h::phys_addr_valid
phys_addr_valid uses boot_cpu_data.x86_phys_bits to validate physical
addresses. boot_cpu_data.x86_phys_bits is no longer populated before
the call to xen_reserve_special_pages due to the aforementioned commit
though, so the validation performed by phys_addr_valid fails, which
causes __phys_addr to trigger a BUG, preventing boot-up.
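Conceptually the fix is a one-line addition on the Xen PV boot path, sketched
below (the exact placement shown is illustrative):

	/* populate x86_virt_bits/x86_phys_bits before __pa()/phys_addr_valid()
	 * are first used on the PV boot path */
	get_cpu_cap(&boot_cpu_data);
	get_cpu_address_sizes(&boot_cpu_data);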
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: xen-devel@lists.xenproject.org
Cc: x86@kernel.org
Cc: stable@vger.kernel.org # for v4.17 and up
Fixes: d94a155c59 ("x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption")
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eac7073aa6 upstream.
The kernel image starts out with the Global bit set across the entire
kernel image. The bit is cleared with set_memory_nonglobal() in the
configurations with PCIDs where the performance benefits of the Global bit
are not needed.
However, this is fragile. It means that we are stuck opting *out* of the
less-secure (Global bit set) configuration, which seems backwards. Let's
start more secure (Global bit clear) and then let things opt back in if
they want performance, or are truly mapping common data between kernel and
userspace.
This fixes a bug. Before this patch, there are areas that are unmapped
from the user page tables (like everything above 0xffffffff82600000 in
the example below). These have the hallmark of being a wrong Global area:
they are not identical in the 'current_kernel' and 'current_user' page
table dumps. They are also read-write, which means they're much more
likely to contain secrets.
Before this patch:
current_kernel:---[ High Kernel Mapping ]---
current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd
current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K RW GLB NX pte
current_kernel-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE GLB NX pmd
current_kernel-0xffffffff82c00000-0xffffffff82e00000 2M RW GLB NX pte
current_kernel-0xffffffff82e00000-0xffffffff83200000 4M RW PSE GLB NX pmd
current_kernel-0xffffffff83200000-0xffffffffa0000000 462M pmd
current_user:---[ High Kernel Mapping ]---
current_user-0xffffffff80000000-0xffffffff81000000 16M pmd
current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_user-0xffffffff81e11000-0xffffffff82000000 1980K RW GLB NX pte
current_user-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd
After this patch:
current_kernel:---[ High Kernel Mapping ]---
current_kernel-0xffffffff80000000-0xffffffff81000000 16M pmd
current_kernel-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_kernel-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_kernel-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte
current_kernel-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_kernel-0xffffffff82600000-0xffffffff82c00000 6M RW PSE NX pmd
current_kernel-0xffffffff82c00000-0xffffffff82e00000 2M RW NX pte
current_kernel-0xffffffff82e00000-0xffffffff83200000 4M RW PSE NX pmd
current_kernel-0xffffffff83200000-0xffffffffa0000000 462M pmd
current_user:---[ High Kernel Mapping ]---
current_user-0xffffffff80000000-0xffffffff81000000 16M pmd
current_user-0xffffffff81000000-0xffffffff81e00000 14M ro PSE GLB x pmd
current_user-0xffffffff81e00000-0xffffffff81e11000 68K ro GLB x pte
current_user-0xffffffff81e11000-0xffffffff82000000 1980K RW NX pte
current_user-0xffffffff82000000-0xffffffff82600000 6M ro PSE GLB NX pmd
current_user-0xffffffff82600000-0xffffffffa0000000 474M pmd
Fixes: 0f561fce4d ("x86/pti: Enable global pages for shared areas")
Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: keescook@google.com
Cc: aarcange@redhat.com
Cc: jgross@suse.com
Cc: jpoimboe@redhat.com
Cc: gregkh@linuxfoundation.org
Cc: peterz@infradead.org
Cc: torvalds@linux-foundation.org
Cc: bp@alien8.de
Cc: luto@kernel.org
Cc: ak@linux.intel.com
Cc: Kees Cook <keescook@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/20180802225825.A100C071@viggo.jf.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a957467c5 upstream.
i8259.h uses inb/outb and thus needs to include asm/io.h to avoid the
following build error, as seen with x86_64:defconfig and CONFIG_SMP=n.
In file included from drivers/rtc/rtc-cmos.c:45:0:
arch/x86/include/asm/i8259.h: In function 'inb_pic':
arch/x86/include/asm/i8259.h:32:24: error:
implicit declaration of function 'inb'
arch/x86/include/asm/i8259.h: In function 'outb_pic':
arch/x86/include/asm/i8259.h:45:2: error:
implicit declaration of function 'outb'
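The fix boils down to a single include near the top of asm/i8259.h:

#include <asm/io.h>	/* for inb()/outb() used by inb_pic()/outb_pic() */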
Reported-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Suggested-by: Sebastian Gottschall <s.gottschall@dd-wrt.com>
Fixes: 447ae31667 ("x86: Don't include linux/irq.h from asm/hardirq.h")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 792adb90fa upstream.
The introduction of generic_max_swapfile_size and arch-specific versions has
broken linking on x86 with CONFIG_SWAP=n due to undefined reference to
'generic_max_swapfile_size'. Fix it by compiling the x86-specific
max_swapfile_size() only with CONFIG_SWAP=y.
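Sketch of the guard (the L1TF clamping itself is unchanged and elided):

#ifdef CONFIG_SWAP
unsigned long max_swapfile_size(void)
{
	unsigned long pages = generic_max_swapfile_size();

	/* ... x86-specific L1TF clamping of 'pages', unchanged ... */
	return pages;
}
#endif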
Reported-by: Tomas Pruzina <pruzinat@gmail.com>
Fixes: 377eeaa8e1 ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 269777aa53 upstream.
Commit 0cc3cd2165 ("cpu/hotplug: Boot HT siblings at least once")
breaks non-SMP builds.
[ I suspect the 'bool' fields should just be made to be bitfields and be
exposed regardless of configuration, but that's a separate cleanup
that I'll leave to the owners of this file for later. - Linus ]
Fixes: 0cc3cd2165 ("cpu/hotplug: Boot HT siblings at least once")
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Abel Vesa <abelvesa@linux.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0055f351e upstream.
The function has an inline "return false;" definition with CONFIG_SMP=n
but the "real" definition is also visible leading to "redefinition of
‘apic_id_is_primary_thread’" compiler error.
Guard it with #ifdef CONFIG_SMP.
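In sketch form, the header exposes only one definition per configuration, and
the out-of-line definition in the .c file now gets the same #ifdef CONFIG_SMP
guard:

#ifdef CONFIG_SMP
bool apic_id_is_primary_thread(unsigned int id);	/* defined in apic.c */
#else
static inline bool apic_id_is_primary_thread(unsigned int id) { return false; }
#endif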
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 6a4d2657e0 ("x86/smp: Provide topology_is_primary_thread()")
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07d981ad4c upstream
The kernel unnecessarily prevents late microcode loading when SMT is
disabled. It should be safe to allow it if all the primary threads are
online.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63b89a19cc upstream
To pick up changes from:
$ git log --oneline -2 -i include/uapi/linux/prctl.h
356e4bfff2 prctl: Add force disable speculation
b617cfc858 prctl: Add speculation control prctls
$ tools/perf/trace/beauty/prctl_option.sh > before.c
$ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
$ tools/perf/trace/beauty/prctl_option.sh > after.c
$ diff -u before.c after.c
# --- before.c 2018-06-01 10:39:53.834073962 -0300
# +++ after.c 2018-06-01 10:42:11.307985394 -0300
@@ -35,6 +35,8 @@
[42] = "GET_THP_DISABLE",
[45] = "SET_FP_MODE",
[46] = "GET_FP_MODE",
+ [52] = "GET_SPECULATION_CTRL",
+ [53] = "SET_SPECULATION_CTRL",
};
static const char *prctl_set_mm_options[] = {
[1] = "START_CODE",
$
This will be used by 'perf trace' to show these strings when beautifying
the prctl syscall args. At some point we'll be able to say something
like:
'perf trace --all-cpus -e prctl(option=*SPEC*)'
to filter by arg name.
This silences this warning when building tools/perf:
Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-zztsptwhc264r8wg44tqh5gp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1063711b57 upstream
The mmio tracer sets io mapping PTEs and PMDs to non present when enabled
without inverting the address bits, which makes the PTE entry vulnerable
for L1TF.
Make it use the right low level macros to actually invert the address bits
to protect against L1TF.
In principle this could be avoided because MMIO tracing is not likely to be
enabled on production machines, but the fix is straightforward and for
consistency's sake it's better to get rid of the open coded PTE manipulation.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 958f79b9ee upstream
set_memory_np() is used to mark kernel mappings not present, but it has
its own open coded mechanism which does not have the L1TF protection of
inverting the address bits.
Replace the open coded PTE manipulation with the L1TF protecting low level
PTE routines.
Passes the CPA self test.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0768f91530 upstream
Some cases in THP like:
- MADV_FREE
- mprotect
- split
mark the PMD non present temporarily to prevent races. The window for
an L1TF attack in these contexts is very small, but it should be fixed
for correctness' sake.
Use the proper low level functions for pmd/pud_mknotpresent() to address
this.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f22cc87f6c upstream
For kernel mappings PAGE_PROTNONE is not necessarily set for a non present
mapping, but the inversion logic explicitly checks for !PRESENT and
PROT_NONE.
Remove the PROT_NONE check and make the inversion unconditional for all not
present mappings.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc2d8d262c upstream
Josh reported that the late SMT evaluation in cpu_smt_state_init() sets
cpu_smt_control to CPU_SMT_NOT_SUPPORTED in case that 'nosmt' was supplied
on the kernel command line as it cannot differentiate between SMT disabled
by BIOS and SMT soft disable via 'nosmt'. That wrecks the state and
makes the sysfs interface unusable.
Rework this so that during bringup of the non boot CPUs the availability of
SMT is determined in cpu_smt_allowed(). If a newly booted CPU is not a
'primary' thread then set the local cpu_smt_available marker and evaluate
this explicitly right after the initial SMP bringup has finished.
SMT evaluation on x86 is a trainwreck as the firmware has all the
information _before_ booting the kernel, but there is no interface to query
it.
Fixes: 73d5e2b472 ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b76a3cff0 upstream
When nested virtualization is in use, VMENTER operations from the nested
hypervisor into the nested guest will always be processed by the bare metal
hypervisor, and KVM's "conditional cache flushes" mode in particular does a
flush on nested vmentry. Therefore, include the "skip L1D flush on
vmentry" bit in KVM's suggested ARCH_CAPABILITIES setting.
Add the relevant Documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e0b2b9166 upstream
Bit 3 of ARCH_CAPABILITIES tells a hypervisor that L1D flush on vmentry is
not needed. Add a new value to enum vmx_l1d_flush_state, which is used
either if there is no L1TF bug at all, or if bit 3 is set in ARCH_CAPABILITIES.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea156d192f upstream
Three changes to the content of the sysfs file:
- If EPT is disabled, L1TF cannot be exploited even across threads on the
same core, and SMT is irrelevant.
- If mitigation is completely disabled, and SMT is enabled, print "vulnerable"
instead of "vulnerable, SMT vulnerable"
- Reorder the two parts so that the main vulnerability state comes first
and the detail on SMT is second.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18b57ce2eb upstream
For VMEXITs caused by external interrupts, vmx_handle_external_intr()
indirectly calls into the interrupt handlers through the host's IDT.
It follows that these interrupts get accounted for in the
kvm_cpu_l1tf_flush_l1d per-cpu flag.
The subsequently executed vmx_l1d_flush() will thus be aware that some
interrupts have happened and conduct a L1d flush anyway.
Setting l1tf_flush_l1d from vmx_handle_external_intr() isn't needed
anymore. Drop it.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffcba43ff6 upstream
The last missing piece to having vmx_l1d_flush() take interrupts after
VMEXIT into account is to set the kvm_cpu_l1tf_flush_l1d per-cpu flag on
irq entry.
Issue calls to kvm_set_cpu_l1tf_flush_l1d() from entering_irq(),
ipi_entering_ack_irq(), smp_reschedule_interrupt() and
uv_bau_message_interrupt().
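For example, the entry helper ends up looking roughly like this (sketch):

static inline void entering_irq(void)
{
	irq_enter();
	kvm_set_cpu_l1tf_flush_l1d();
}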
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 447ae31667 upstream
The next patch in this series will have to make the definition of
irq_cpustat_t available to entering_irq().
Inclusion of asm/hardirq.h into asm/apic.h would cause circular header
dependencies like
asm/smp.h
asm/apic.h
asm/hardirq.h
linux/irq.h
linux/topology.h
linux/smp.h
asm/smp.h
or
linux/gfp.h
linux/mmzone.h
asm/mmzone.h
asm/mmzone_64.h
asm/smp.h
asm/apic.h
asm/hardirq.h
linux/irq.h
linux/irqdesc.h
linux/kobject.h
linux/sysfs.h
linux/kernfs.h
linux/idr.h
linux/gfp.h
and others.
This causes compilation errors because of the header guards becoming
effective in the second inclusion: symbols/macros that had been defined
before wouldn't be available to intermediate headers in the #include chain
anymore.
A possible workaround would be to move the definition of irq_cpustat_t
into its own header and include that from both, asm/hardirq.h and
asm/apic.h.
However, this wouldn't solve the real problem, namely asm/hardirq.h
unnecessarily pulling in all the linux/irq.h cruft: nothing in
asm/hardirq.h itself requires it. Also, note that there are some other
archs, like e.g. arm64, which don't have that #include in their
asm/hardirq.h.
Remove the linux/irq.h #include from x86' asm/hardirq.h.
Fix resulting compilation errors by adding appropriate #includes to *.c
files as needed.
Note that some of these *.c files could be cleaned up a bit with respect
to their set of #includes, but that is better done in separate patches, if
at all.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45b575c00d upstream
Part of the L1TF mitigation for vmx includes flushing the L1D cache upon
VMENTRY.
L1D flushes are costly and two modes of operations are provided to users:
"always" and the more selective "conditional" mode.
If operating in the latter, the cache would get flushed only if a host side
code path considered unconfined had been traversed. "Unconfined" in this
context means that it might have pulled in sensitive data like user data
or kernel crypto keys.
The need for L1D flushes is tracked by means of the per-vcpu flag
l1tf_flush_l1d. KVM exit handlers considered unconfined set it. A
vmx_l1d_flush() subsequently invoked before the next VMENTER will conduct a
L1d flush based on its value and reset that flag again.
Currently, interrupts delivered "normally" while in root operation between
VMEXIT and VMENTER are not taken into account. Part of the reason is that
these don't leave any traces and thus, the vmx code is unable to tell if
any such has happened.
As proposed by Paolo Bonzini, prepare for tracking all interrupts by
introducing a new per-cpu flag, "kvm_cpu_l1tf_flush_l1d". It will be in
strong analogy to the per-vcpu ->l1tf_flush_l1d.
A later patch will make interrupt handlers set it.
For the sake of cache locality, group kvm_cpu_l1tf_flush_l1d into x86'
per-cpu irq_cpustat_t as suggested by Peter Zijlstra.
Provide the helpers kvm_set_cpu_l1tf_flush_l1d(),
kvm_clear_cpu_l1tf_flush_l1d() and kvm_get_cpu_l1tf_flush_l1d(). Make them
trivial resp. non-existent for !CONFIG_KVM_INTEL as appropriate.
Let vmx_l1d_flush() handle kvm_cpu_l1tf_flush_l1d in the same way as
l1tf_flush_l1d.
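A sketch of the setter, assuming the flag lives in x86's per-cpu irq_stat as
described above:

static inline void kvm_set_cpu_l1tf_flush_l1d(void)
{
	__this_cpu_write(irq_stat.kvm_cpu_l1tf_flush_l1d, 1);
}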
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aee5f8a7e upstream
An upcoming patch will extend KVM's L1TF mitigation in conditional mode
to also cover interrupts after VMEXITs. For tracking those, stores to a
new per-cpu flag from interrupt handlers will become necessary.
In order to improve cache locality, this new flag will be added to x86's
irq_cpustat_t.
Make some space available there by shrinking the ->softirq_pending bitfield
from 32 to 16 bits: the number of bits actually used is only NR_SOFTIRQS,
i.e. 10.
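Sketch of the resulting x86 irq_cpustat_t layout (only the relevant leading
fields shown; the flag mentioned above arrives with the follow-up patch):

typedef struct {
	u16 __softirq_pending;		/* was 32 bits; 16 bits suffice */
	u8  kvm_cpu_l1tf_flush_l1d;	/* new flag, added by the next patch */
	unsigned int __nmi_count;
	/* ... remaining per-cpu interrupt statistics unchanged ... */
} ____cacheline_aligned irq_cpustat_t;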
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b6ccc6c3b upstream
Currently, vmx_vcpu_run() checks if l1tf_flush_l1d is set and invokes
vmx_l1d_flush() if so.
This test is unnecessary for the "always flush L1D" mode.
Move the check to vmx_l1d_flush()'s conditional mode code path.
Notes:
- vmx_l1d_flush() is likely to get inlined anyway and thus, there's no
extra function call.
- This inverts the (static) branch prediction, but there hadn't been any
explicit likely()/unlikely() annotations before and so it stays as is.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 427362a142 upstream
The vmx_l1d_flush_always static key is only ever evaluated if
vmx_l1d_should_flush is enabled. In that case however, there are only two
L1d flushing modes possible: "always" and "conditional".
The "conditional" mode's implementation tends to require more sophisticated
logic than the "always" mode.
Avoid inverted logic by replacing the 'vmx_l1d_flush_always' static key
with a 'vmx_l1d_flush_cond' one.
There is no change in functionality.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 379fd0c7e6 upstream
vmx_l1d_flush() gets invoked only if l1tf_flush_l1d is true. There's no
point in setting l1tf_flush_l1d to true from there again.
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73d5e2b472 upstream
If SMT is disabled in BIOS, the CPU code doesn't properly detect it.
The /sys/devices/system/cpu/smt/control file shows 'on', and the 'l1tf'
vulnerabilities file shows SMT as vulnerable.
Fix it by forcing 'cpu_smt_control' to CPU_SMT_NOT_SUPPORTED in such a
case. Unfortunately the detection can only be done after bringing all
the CPUs online, so we have to overwrite any previous writes to the
variable.
Reported-by: Joe Mario <jmario@redhat.com>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Fixes: f048c399e0 ("x86/topology: Provide topology_smt_supported()")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 288d152c23 upstream
The slow path in vmx_l1d_flush() reads from vmx_l1d_flush_pages in order
to evict the L1d cache.
However, these pages are never cleared and, in theory, their data could be
leaked.
More importantly, KSM could merge a nested hypervisor's vmx_l1d_flush_pages
to fewer than 1 << L1D_CACHE_ORDER host physical pages and this would break
the L1d flushing algorithm: L1D on x86_64 is tagged by physical addresses.
Fix this by initializing the individual vmx_l1d_flush_pages with a
different pattern each.
Rename the "empty_zp" asm constraint identifier in vmx_l1d_flush() to
"flush_pages" to reflect this change.
Fixes: a47dd5f067 ("x86/KVM/VMX: Add L1D flush algorithm")
Signed-off-by: Nicolai Stange <nstange@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6c26fcd2ab upstream
pfn_modify_allowed() and arch_has_pfn_modify_check() are outside of the
!__ASSEMBLY__ section in include/asm-generic/pgtable.h, which confuses
assembler on archs that don't have __HAVE_ARCH_PFN_MODIFY_ALLOWED (e.g.
ia64) and breaks build:
include/asm-generic/pgtable.h: Assembler messages:
include/asm-generic/pgtable.h:538: Error: Unknown opcode `static inline bool pfn_modify_allowed(unsigned long pfn,pgprot_t prot)'
include/asm-generic/pgtable.h:540: Error: Unknown opcode `return true'
include/asm-generic/pgtable.h:543: Error: Unknown opcode `static inline bool arch_has_pfn_modify_check(void)'
include/asm-generic/pgtable.h:545: Error: Unknown opcode `return false'
arch/ia64/kernel/entry.S:69: Error: `mov' does not fit into bundle
Move those two static inlines into the !__ASSEMBLY__ section so that they
don't confuse the asm build pass.
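Sketched from the error output above, the stubs simply move inside the
existing guard:

#ifndef __ASSEMBLY__
static inline bool pfn_modify_allowed(unsigned long pfn, pgprot_t prot)
{
	return true;
}

static inline bool arch_has_pfn_modify_check(void)
{
	return false;
}
#endif /* !__ASSEMBLY__ */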
Fixes: 42e4089c78 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d90a7a0ec8 upstream
Introduce the 'l1tf=' kernel command line option to allow for boot-time
switching of mitigation that is used on processors affected by L1TF.
The possible values are:
full
Provides all available mitigations for the L1TF vulnerability. Disables
SMT and enables all mitigations in the hypervisors. SMT control via
/sys/devices/system/cpu/smt/control is still possible after boot.
Hypervisors will issue a warning when the first VM is started in
a potentially insecure configuration, i.e. SMT enabled or L1D flush
disabled.
full,force
Same as 'full', but disables SMT control. Implies the 'nosmt=force'
command line option. sysfs control of SMT and the hypervisor flush
control is disabled.
flush
Leaves SMT enabled and enables the conditional hypervisor mitigation.
Hypervisors will issue a warning when the first VM is started in a
potentially insecure configuration, i.e. SMT enabled or L1D flush
disabled.
flush,nosmt
Disables SMT and enables the conditional hypervisor mitigation. SMT
control via /sys/devices/system/cpu/smt/control is still possible
after boot. If SMT is reenabled or flushing disabled at runtime
hypervisors will issue a warning.
flush,nowarn
Same as 'flush', but hypervisors will not warn when
a VM is started in a potentially insecure configuration.
off
Disables hypervisor mitigations and doesn't emit any warnings.
Default is 'flush'.
Let KVM adhere to these semantics, which means:
- 'l1tf=full,force' : Perform L1D flushes. No runtime control
possible.
- 'l1tf=full'
- 'l1tf=flush'
- 'l1tf=flush,nosmt' : Perform L1D flushes and warn on VM start if
SMT has been runtime enabled or L1D flushing
has been run-time enabled
- 'l1tf=flush,nowarn' : Perform L1D flushes and no warnings are emitted.
- 'l1tf=off' : L1D flushes are not performed and no warnings
are emitted.
KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush'
module parameter except when l1tf=full,force is set.
This makes KVM's private 'nosmt' option redundant, and as it is a bit
non-systematic anyway (this is something to control globally, not on
hypervisor level), remove that option.
Add the missing Documentation entry for the l1tf vulnerability sysfs file
while at it.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fee0aede6f upstream
The CPU_SMT_NOT_SUPPORTED state is set (if the processor does not support
SMT) when the sysfs SMT control file is initialized.
That was fine so far as this was only required to make the output of the
control file correct and to prevent writes in that case.
With the upcoming l1tf command line parameter, this needs to be set up
before the L1TF mitigation selection and command line parsing happens.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142323.121795971@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7db92e165a upstream
In preparation of allowing run time control for L1D flushing, move the
setup code to the module parameter handler.
In case of pre module init parsing, just store the value and let vmx_init()
do the actual setup after running kvm_init() so that enable_ept is having
the correct state.
During run-time invoke it directly from the parameter setter to prepare for
run-time control.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142322.694063239@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f055947ae upstream
The VMX module parameter to control the L1D flush should become
writeable.
The MSR list is set up at VM init per guest VCPU, but the run time
switching is based on a static key which is global. Toggling the MSR list
at run time might be feasible, but for now drop this optimization and use
the regular MSR write to make run-time switching possible.
The default mitigation is the conditional flush anyway, so for extra
paranoid setups this will add some small overhead, but the extra code
executed is in the noise compared to the flush itself.
Aside of that the EPT disabled case is not handled correctly at the moment
and the MSR list magic is in the way for fixing that as well.
If it's really providing a significant advantage, then this needs to be
revisited after the code is correct and the control is writable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142322.516940445@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 215af5499d upstream
Writing 'off' to /sys/devices/system/cpu/smt/control offlines all SMT
siblings. Writing 'on' merely enables the ability to online them, but does
not online them automatically.
Make 'on' more useful by onlining all offline siblings.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 390d975e0c upstream
If the L1D flush module parameter is set to 'always' and the IA32_FLUSH_CMD
MSR is available, optimize the VMENTER code with the MSR save list.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 989e3992d2 upstream
The IA32_FLUSH_CMD MSR needs only to be written on VMENTER. Extend
add_atomic_switch_msr() with an entry_only parameter to allow storing the
MSR only in the guest (ENTRY) MSR array.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33966dd6b2 upstream
There is no semantic change but this change allows an unbalanced amount of
MSRs to be loaded on VMEXIT and VMENTER, i.e. the number of MSRs to save or
restore on VMEXIT or VMENTER may be different.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c595ceee45 upstream
Add the logic for flushing L1D on VMENTER. The flush depends on the static
key being enabled and the new l1tf_flush_l1d flag being set.
The flag is set:
- Always, if the flush module parameter is 'always'
- Conditionally at:
- Entry to vcpu_run(), i.e. after executing user space
- From the sched_in notifier, i.e. when switching to a vCPU thread.
- From vmexit handlers which are considered unsafe, i.e. where
sensitive data can be brought into L1D:
- The emulator, which could be a good target for other speculative
execution-based threats,
- The MMU, which can bring host page tables in the L1 cache.
- External interrupts
- Nested operations that require the MMU (see above). That is
vmptrld, vmptrst, vmclear, vmwrite, vmread.
- When handling invept,invvpid
[ tglx: Split out from combo patch and reduced to a single flag ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3fa045be4c upstream
336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR
(IA32_FLUSH_CMD aka 0x10B) which has similar write-only semantics to other
MSRs defined in the document.
The semantics of this MSR is to allow "finer granularity invalidation of
caching structures than existing mechanisms like WBINVD. It will writeback
and invalidate the L1 data cache, including all cachelines brought in by
preceding instructions, without invalidating all caches (eg. L2 or
LLC). Some processors may also invalidate the first level level instruction
cache on a L1D_FLUSH command. The L1 data and instruction caches may be
shared across the logical processors of a core."
Use it instead of the loop based L1 flush algorithm.
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199511
[ tglx: Avoid allocating pages when the MSR is available ]
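Sketch of the fast path this enables in vmx_l1d_flush() (illustrative;
constant and feature names as defined elsewhere in this series):

	if (static_cpu_has(X86_FEATURE_FLUSH_L1D)) {
		wrmsrl(MSR_IA32_FLUSH_CMD, L1D_FLUSH);
		return;
	}
	/* otherwise fall back to the software flush loop */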
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a47dd5f067 upstream
To mitigate the L1 Terminal Fault vulnerability it's required to flush L1D
on VMENTER to prevent rogue guests from snooping host memory.
CPUs will have a new control MSR via a microcode update to flush L1D with a
single MSR write, but in the absence of microcode a fallback to a software
based flush algorithm is required.
Add a software flush loop which is based on code from Intel.
[ tglx: Split out from combo patch ]
[ bpetkov: Polish the asm code ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a399477e52 upstream
Add a mitigation mode parameter "vmentry_l1d_flush" for CVE-2018-3620, aka
L1 terminal fault. The valid arguments are:
- "always" L1D cache flush on every VMENTER.
- "cond" Conditional L1D cache flush, explained below
- "never" Disable the L1D cache flush mitigation
"cond" is trying to avoid L1D cache flushes on VMENTER if the code executed
between VMEXIT and VMENTER is considered safe, i.e. is not bringing any
interesting information into L1D which might be exploited.
[ tglx: Split out from a larger patch ]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26acfb666a upstream
If the L1TF CPU bug is present we allow the KVM module to be loaded as the
majority of users that use Linux and KVM have trusted guests and do not want a
broken setup.
Cloud vendors are the ones that are uncomfortable with CVE 2018-3620 and as
such they are the ones that should set nosmt to one.
Setting 'nosmt' means that the system administrator also needs to disable
SMT (Hyper-threading) in the BIOS, or via the 'nosmt' command line
parameter, or via the /sys/devices/system/cpu/smt/control. See commit
05736e4ac1 ("cpu/hotplug: Provide knobs to control SMT").
Other mitigations are to use task affinity, cpu sets, interrupt binding,
etc - anything to make sure that _only_ the same guests vCPUs are running
on sibling threads.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cc3cd2165 upstream
Due to the way Machine Check Exceptions work on X86 hyperthreads it's
required to boot up _all_ logical cores at least once in order to set the
CR4.MCE bit.
So instead of ignoring the sibling threads right away, let them boot up
once so they can configure themselves. After they come out of the initial
boot stage, check whether the CPU is a "secondary" sibling and, if so,
cancel the operation, which puts the CPU back into the offline state.
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 506a66f374 upstream
Dave Hansen reported, that it's outright dangerous to keep SMT siblings
disabled completely so they are stuck in the BIOS and wait for SIPI.
The reason is that Machine Check Exceptions are broadcasted to siblings and
the soft disabled sibling has CR4.MCE = 0. If a MCE is delivered to a
logical core with CR4.MCE = 0, it asserts IERR#, which shuts down or
reboots the machine. The MCE chapter in the SDM contains the following
blurb:
Because the logical processors within a physical package are tightly
coupled with respect to shared hardware resources, both logical
processors are notified of machine check errors that occur within a
given physical processor. If machine-check exceptions are enabled when
a fatal error is reported, all the logical processors within a physical
package are dispatched to the machine-check exception handler. If
machine-check exceptions are disabled, the logical processors enter the
shutdown state and assert the IERR# signal. When enabling machine-check
exceptions, the MCE flag in control register CR4 should be set for each
logical processor.
Reverting the commit which ignores siblings at enumeration time solves only
half of the problem. The core cpuhotplug logic needs to be adjusted as
well.
This thoughtfully engineered mechanism also turns the boot process on all
Intel HT enabled systems into a MCE lottery. MCE is enabled on the boot CPU
before the secondary CPUs are brought up. Depending on the number of
physical cores the window in which this situation can happen is smaller or
larger. On a HSW-EX it's about 750ms:
MCE is enabled on the boot CPU:
[ 0.244017] mce: CPU supports 22 MCE banks
The corresponding sibling #72 boots:
[ 1.008005] .... node #0, CPUs: #72
That means if an MCE hits on physical core 0 (logical CPUs 0 and 72)
between these two points the machine is going to shutdown. At least it's a
known safe state.
It's obvious that the early boot can be hit by an MCE as well and then runs
into the same situation because MCEs are not yet enabled on the boot CPU.
But after enabling them on the boot CPU, it does not make any sense to
prevent the kernel from recovering.
Adjust the nosmt kernel parameter documentation as well.
Reverts: 2207def700 ("x86/apic: Ignore secondary threads if nosmt=force")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e14d7dfb41 upstream
Jan has noticed that pte_pfn and co. resp. pfn_pte are incorrect for
CONFIG_PAE because phys_addr_t is wider than unsigned long and so the
pte_val resp. shift left would get truncated. Fix this up by using proper
types.
Fixes: 6b28baca9b ("x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation")
Reported-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d0f624905 upstream
The PAE 3-level paging code currently doesn't mitigate L1TF by flipping the
offset bits, and uses the high PTE word, thus bits 32-36 for type, 37-63 for
offset. The lower word is zeroed, thus systems with less than 4GB memory are
safe. With 4GB to 128GB the swap type selects the memory locations vulnerable
to L1TF; with even more memory, also the swap offset influences the address.
This might be a problem with 32bit PAE guests running on large 64bit hosts.
By continuing to keep the whole swap entry in either high or low 32bit word of
PTE we would limit the swap size too much. Thus this patch uses the whole PAE
PTE with the same layout as the 64bit version does. The macros just become a
bit tricky since they assume the arch-dependent swp_entry_t to be 32bit.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ce2f0393e upstream
The TOPOEXT reenablement is a workaround for broken BIOSen which didn't
enable the CPUID bit. amd_get_topology_early(), however, relies on
that bit being set so that it can read out the CPUID leaf and set
smp_num_siblings properly.
Move the reenablement up to early_init_amd(). While at it, simplify
amd_get_topology_early().
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11e34e64e4 upstream
336996-Speculative-Execution-Side-Channel-Mitigations.pdf defines a new MSR
(IA32_FLUSH_CMD) which is detected by CPUID.7.EDX[28]=1 bit being set.
This new MSR "gives software a way to invalidate structures with finer
granularity than other architectural methods like WBINVD."
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199511
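In cpufeatures.h terms this is a single new bit in the CPUID.7.EDX word,
roughly (sketch; the word index shown is how the 4.x layout places it):

#define X86_FEATURE_FLUSH_L1D	(18*32 + 28)	/* Flush L1D cache */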
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a7ed1ba4b upstream
The previous patch has limited swap file size so that large offsets cannot
clear bits above MAX_PA/2 in the pte and interfere with L1TF mitigation.
It assumed that offsets are encoded starting with bit 12, same as pfn. But
on x86_64, offsets are encoded starting with bit 9.
Thus the limit can be raised by 3 bits. That means 16TB with 42bit MAX_PA
and 256TB with 46bit MAX_PA.
Fixes: 377eeaa8e1 ("x86/speculation/l1tf: Limit swap file size to MAX_PA/2")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2207def700 upstream
nosmt on the kernel command line merely prevents the onlining of the
secondary SMT siblings.
nosmt=force makes the APIC detection code ignore the secondary SMT siblings
completely, so they do not even show up as possible CPUs. That reduces the
amount of memory allocated for per-cpu variables and prevents other
resources from being sized too large.
This is not fully equivalent to disabling SMT in the BIOS because the low
level SMT enabling in the BIOS can result in partitioning of resources
between the siblings, which is not undone by just ignoring them. Some CPUs
can use the full resources when their sibling is not onlined, but this is
depending on the CPU family and model and it's not well documented whether
this applies to all partitioned resources. That means depending on the
workload disabling SMT in the BIOS might result in better performance.
Linus analysis of the Intel manual:
The intel optimization manual is not very clear on what the partitioning
rules are.
I find:
"In general, the buffers for staging instructions between major pipe
stages are partitioned. These buffers include µop queues after the
execution trace cache, the queues after the register rename stage, the
reorder buffer which stages instructions for retirement, and the load
and store buffers.
In the case of load and store buffers, partitioning also provided an
easier implementation to maintain memory ordering for each logical
processor and detect memory ordering violations"
but some of that partitioning may be relaxed if the HT thread is "not
active":
"In Intel microarchitecture code name Sandy Bridge, the micro-op queue
is statically partitioned to provide 28 entries for each logical
processor, irrespective of software executing in single thread or
multiple threads. If one logical processor is not active in Intel
microarchitecture code name Ivy Bridge, then a single thread executing
on that processor core can use the 56 entries in the micro-op queue"
but I do not know what "not active" means, and how dynamic it is. Some of
that partitioning may be entirely static and depend on the early BIOS
disabling of HT, and even if we park the cores, the resources will just be
wasted.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e1d7e25fd upstream
To support force disabling of SMT it's required to know the number of
thread siblings early. amd_get_topology() cannot be called before the APIC
driver is selected, so split out the part which initializes
smp_num_siblings and invoke it from amd_early_init().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 119bff8a9c upstream
Old code used to check whether CPUID ext max level is >= 0x80000008 because
that last leaf contains the number of cores of the physical CPU. The three
functions called there now do not depend on that leaf anymore so the check
can go.
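For illustration, the kind of check being removed can be reproduced from userspace with the compiler's cpuid helper (a hedged sketch; x86 with GCC/clang and cpuid.h assumed):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    /* Highest supported extended CPUID leaf (CPUID.80000000H:EAX). */
    unsigned int max_ext = __get_cpuid_max(0x80000000, NULL);

    printf("max extended leaf: 0x%x\n", max_ext);
    printf("leaf 0x80000008 is %savailable\n",
           max_ext >= 0x80000008 ? "" : "not ");
    return 0;
}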
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1910ad5624 upstream
Make use of the new early detection function to initialize smp_num_siblings
on the boot cpu before the MP-Table or ACPI/MADT scan happens. That's
required for force disabling SMT.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95f3d39ccf upstream
To support force disabling of SMT it's required to know the number of
thread siblings early. detect_extended_topology() cannot be called before
the APIC driver is selected, so split out the part which initializes
smp_num_siblings.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 545401f444 upstream
To support force disabling of SMT it's required to know the number of
thread siblings early. detect_ht() cannot be called before the APIC driver
is selected, so split out the part which initializes smp_num_siblings.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55e6d279ab upstream
The value of this printout is dubious at best and there is no point in
having it in two different places along with convoluted ways to reach it.
Remove it completely.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05736e4ac1 upstream
Provide a command line and a sysfs knob to control SMT.
The command line options are:
'nosmt': Enumerate secondary threads, but do not online them
'nosmt=force': Ignore secondary threads completely during enumeration
via MP table and ACPI/MADT.
The sysfs control file has the following states (read/write):
'on': SMT is enabled. Secondary threads can be freely onlined
'off': SMT is disabled. Secondary threads, even if enumerated
cannot be onlined
'forceoff': SMT is permanently disabled. Writes to the control
file are rejected.
'notsupported': SMT is not supported by the CPU
The command line option 'nosmt' sets the sysfs control to 'off'. This
can be changed to 'on' to re-enable SMT at runtime.
The command line option 'nosmt=force' sets the sysfs control to
'forceoff'. This cannot be changed during runtime.
When SMT is 'on' and the control file is changed to 'off' then all online
secondary threads are offlined and attempts to online a secondary thread
later on are rejected.
When SMT is 'off' and the control file is changed to 'on' then secondary
threads can be onlined again. The 'off' -> 'on' transition does not
automatically online the secondary threads.
When the control file is set to 'forceoff', the behaviour is the same as
setting it to 'off', but the operation is irreversible and later writes to
the control file are rejected.
When the control status is 'notsupported' then writes to the control file
are rejected.
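For illustration, the control file behaves like any other sysfs attribute. A minimal userspace sketch, assuming the mainline path /sys/devices/system/cpu/smt/control (the exact path is not spelled out above):

#include <stdio.h>

#define SMT_CONTROL "/sys/devices/system/cpu/smt/control"

int main(void)
{
    char state[32];
    FILE *f = fopen(SMT_CONTROL, "r");

    if (!f) {
        perror("fopen");
        return 1;
    }
    if (fgets(state, sizeof(state), f))
        printf("current SMT state: %s", state);
    fclose(f);

    /* Writing "off" offlines all secondary threads; the write is rejected
     * when the state is 'forceoff' or 'notsupported'. */
    f = fopen(SMT_CONTROL, "w");
    if (f) {
        if (fputs("off\n", f) == EOF)
            perror("write");
        fclose(f);
    }
    return 0;
}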
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc1fe215e1 upstream
Split out the inner workings of do_cpu_down() to allow reuse of that
function for the upcoming SMT disabling mechanism.
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4de65696d upstream
The asymmetry caused a warning to trigger if the bootup was stopped in state
CPUHP_AP_ONLINE_IDLE. The warning no longer triggers as kthread_park() can
now be invoked on already or still parked threads. But there is still no
reason to have this be asymmetric.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a4d2657e0 upstream
If the CPU is supporting SMT then the primary thread can be found by
checking the lower APIC ID bits for zero. smp_num_siblings is used to build
the mask for the APIC ID bits which need to be taken into account.
This uses the MPTABLE or ACPI/MADT supplied APIC ID, which can be different
than the initial APIC ID in CPUID. But according to AMD the lower bits have
to be consistent. Intel gave a tentative confirmation as well.
Preparatory patch to support disabling SMT at boot/runtime.
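A minimal sketch of that check (illustrative only; it assumes smp_num_siblings is a power of two, while the kernel derives the mask from the sibling count as described above):

#include <stdio.h>

static int is_primary_thread(unsigned int apicid, unsigned int siblings)
{
    /* The low APIC ID bits enumerate the SMT thread within a core; a zero
     * thread index marks the primary thread. */
    unsigned int mask = siblings - 1;

    return (apicid & mask) == 0;
}

int main(void)
{
    /* With 2 siblings per core, even APIC IDs are the primary threads. */
    for (unsigned int apicid = 0; apicid < 8; apicid++)
        printf("APIC ID %u: %s thread\n", apicid,
               is_primary_thread(apicid, 2) ? "primary" : "secondary");
    return 0;
}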
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba2591a599 upstream
The static key sched_smt_present is only updated at boot time when SMT
siblings have been detected. Booting with maxcpus=1 and bringing the
siblings online after boot rebuilds the scheduling domains correctly but
does not update the static key, so the SMT code is not enabled.
Let the key be updated in the scheduler CPU hotplug code to fix this.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56563f53d3 upstream
The pr_warn in l1tf_select_mitigation would have used the prior pr_fmt
which was defined as "Spectre V2 : ".
Move the function to be past SSBD and also define the pr_fmt.
Fixes: 17dbca1193 ("x86/speculation/l1tf: Add sysfs reporting for l1tf")
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 377eeaa8e1 upstream
For the L1TF workaround it's necessary to limit the swap file size to below
MAX_PA/2, so that the higher bits of the swap offset inverted never point
to valid memory.
Add a mechanism for the architecture to override the swap file size check
in swapfile.c and add an x86-specific max swapfile check function that
enforces that limit.
The check is only enabled if the CPU is vulnerable to L1TF.
In VMs with 42bit MAX_PA the typical limit is 2TB now, on a native system
with 46bit PA it is 32TB. The limit is only per individual swap file, so
it's always possible to exceed these limits with multiple swap files or
partitions.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42e4089c78 upstream
For L1TF, PROT_NONE mappings are protected by inverting the PFN in the page
table entry. This sets the high bits in the CPU's address space, thus making
sure that an unmapped entry does not point to valid cached memory.
Some server system BIOSes put the MMIO mappings high up in the physical
address space. If such a high mapping was mapped to unprivileged users,
they could attack low memory by setting such a mapping to PROT_NONE. This
could happen through a special device driver which is not access
protected. Normal /dev/mem is of course access protected.
To avoid this, forbid PROT_NONE mappings or mprotect for high MMIO mappings.
Valid page mappings are allowed because the system is then unsafe anyway.
It's not expected that users commonly use PROT_NONE on MMIO. But to
minimize any impact this is only enforced if the mapping actually refers to
a high MMIO address (defined as the MAX_PA-1 bit being set), and the check
is also skipped for root.
For mmaps this is straightforward and can be handled in vm_insert_pfn and
in remap_pfn_range().
For mprotect it's a bit trickier. At the point where the actual PTEs are
accessed a lot of state has been changed and it would be difficult to undo
on an error. Since this is an uncommon case, use a separate early page table
walk pass for MMIO PROT_NONE mappings that checks for this condition
early. For non-MMIO and non-PROT_NONE mappings there are no changes.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 17dbca1193 upstream
L1TF core kernel workarounds are cheap and normally always enabled. However,
they should still be reported in sysfs if the system is vulnerable or
mitigated. Add the necessary CPU feature/bug bits.
- Extend the existing checks for Meltdowns to determine if the system is
vulnerable. All CPUs which are not vulnerable to Meltdown are also not
vulnerable to L1TF
- Check for 32bit non PAE and emit a warning as there is no practical way
for mitigation due to the limited physical address bits
- If the system has more than MAX_PA/2 physical memory the invert page
workarounds don't protect the system against the L1TF attack anymore,
because an inverted physical address will also point to valid
memory. Print a warning in this case and report that the system is
vulnerable.
Add a function which returns the PFN limit for the L1TF mitigation, which
will be used in follow up patches for sanity and range checks.
[ tglx: Renamed the CPU feature bit to L1TF_PTEINV ]
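A hedged sketch of what such a PFN limit helper boils down to (not the kernel source; 4K pages and the MAX_PA/2 boundary from the text are assumed):

#include <stdio.h>

#define PAGE_SHIFT 12

static unsigned long long l1tf_pfn_limit(unsigned int phys_bits)
{
    /* MAX_PA/2 expressed as a page frame number. */
    return 1ULL << (phys_bits - 1 - PAGE_SHIFT);
}

int main(void)
{
    printf("42-bit MAX_PA -> PFN limit 0x%llx\n", l1tf_pfn_limit(42));
    printf("46-bit MAX_PA -> PFN limit 0x%llx\n", l1tf_pfn_limit(46));
    return 0;
}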
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10a70416e1 upstream
The L1TF workaround doesn't make any attempt to mitigate speculative accesses
to the first physical page for zeroed PTEs. Normally it only contains some
data from the early real mode BIOS.
It's not entirely clear that the first page is reserved in all
configurations, so add an extra reservation call to make sure it is really
reserved. In most configurations (e.g. with the standard reservations)
it's likely a nop.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b28baca9b upstream
When PTEs are set to PROT_NONE the kernel just clears the Present bit and
preserves the PFN, which creates an attack surface for L1TF speculation
attacks.
This is important inside guests, because L1TF speculation bypasses physical
page remapping. While the host has its own mitigations preventing leaking
data from other VMs into the guest, this would still risk leaking the wrong
page inside the current guest.
This uses the same technique as Linus' swap entry patch: while an entry is
in PROTNONE state, invert the complete PFN part of it. This ensures that
the highest bit will point to non-existing memory.
The invert is done by pte/pmd_modify and pfn/pmd/pud_pte for PROTNONE and
pte/pmd/pud_pfn undo it.
This assumes that no code path touches the PFN part of a PTE directly
without using these primitives.
This doesn't handle the case where MMIO is at the top of the CPU physical
memory. If such an MMIO region was exposed by an unprivileged driver for
mmap, it would be possible to attack some real memory. However this
situation is rather unlikely.
For 32bit non PAE the inversion is not done because there are really not
enough bits to protect anything.
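An illustrative sketch of the inversion idea (simplified, not the kernel's pte/pmd helpers; a 46-bit MAX_PA and 4K pages are assumed):

#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PHYS_BITS  46                     /* assumed MAX_PA */
#define PFN_MASK   ((UINT64_C(1) << (PHYS_BITS - PAGE_SHIFT)) - 1)

/* Store the PFN inverted while the entry is in PROTNONE state, so the
 * "address" seen by speculation lies above populated memory. */
static uint64_t protnone_encode(uint64_t pfn) { return ~pfn & PFN_MASK; }
static uint64_t protnone_decode(uint64_t val) { return ~val & PFN_MASK; }

int main(void)
{
    uint64_t pfn = 0x1234;                /* a low, valid page frame */
    uint64_t stored = protnone_encode(pfn);

    printf("stored PFN bits: 0x%llx (high bits set)\n",
           (unsigned long long)stored);
    printf("round trip ok: %d\n", protnone_decode(stored) == pfn);
    return 0;
}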
Q: Why does the guest need to be protected when the HyperVisor already has
L1TF mitigations?
A: Here's an example:
Physical pages 1 2 get mapped into a guest as
GPA 1 -> PA 2
GPA 2 -> PA 1
through EPT.
The L1TF speculation ignores the EPT remapping.
Now the guest kernel maps GPA 1 to process A and GPA 2 to process B, and
they belong to different users and should be isolated.
A sets the GPA 1 PA 2 PTE to PROT_NONE to bypass the EPT remapping and
gets read access to the underlying physical page. Which in this case
points to PA 2, so it can read process B's data, if it happened to be in
L1, so isolation inside the guest is broken.
There's nothing the hypervisor can do about this. This mitigation has to
be done in the guest itself.
[ tglx: Massaged changelog ]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f22b4cd45 upstream
With L1 terminal fault the CPU speculates into unmapped PTEs, and the
resulting side effects allow reading the memory the PTE is pointing to, if
its contents are still in the L1 cache.
For swapped out pages Linux uses unmapped PTEs and stores a swap entry into
them.
To protect against L1TF it must be ensured that the swap entry is not
pointing to valid memory, which requires setting higher bits (between bit
36 and bit 45) that are inside the CPUs physical address space, but outside
any real memory.
To do this invert the offset to make sure the higher bits are always set,
as long as the swap file is not too big.
Note there is no workaround for 32bit !PAE, or on systems which have more
than MAX_PA/2 worth of memory. The latter case is very unlikely to happen on
real systems.
[ AK: Updated description and minor tweaks. Split out from the original
patch ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcd11afa7a upstream
If pages are swapped out, the swap entry is stored in the corresponding
PTE, which has the Present bit cleared. CPUs vulnerable to L1TF speculate
on PTE entries which have the present bit set and would treat the swap
entry as a physical address (PFN). To mitigate that, the upper bits of the PTE
must be set so the PTE points to non existent memory.
The swap entry stores the type and the offset of a swapped out page in the
PTE. The type is stored in bits 9-13 and the offset in bits 14-63. The
hardware ignores the bits beyond the physical address space limit, so to
make the mitigation effective it's required to start 'offset' at the lowest
possible bit so that even large swap offsets do not reach into the physical
address space limit bits.
Move offset to bit 9-58 and type to bit 59-63 which are the bits that
hardware generally doesn't care about.
That, in turn, means that if you are on a desktop chip with only 40 bits of
physical addressing, now that the offset starts at bit 9, there needs to be
30 bits of offset actually *in use* until bit 39 ends up being set, which
means when inverted it will again point into existing memory.
So that's 4 terabyte of swap space (because the offset is counted in pages,
so 30 bits of offset is 42 bits of actual coverage). With bigger physical
addressing, that obviously grows further, until the limit of the offset is
hit (at 50 bits of offset - 62 bits of actual swap file coverage).
This is a preparatory change for the actual swap entry inversion to protect
against L1TF.
[ AK: Updated description and minor tweaks. Split into two parts ]
[ tglx: Massaged changelog ]
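For illustration, the layout described above (offset in bits 9-58, type in bits 59-63) can be sketched as follows; the field positions come from the text, the helpers themselves are hypothetical and ignore the inversion added in the follow-up patch:

#include <stdio.h>
#include <stdint.h>

#define SWP_TYPE_BITS        5
#define SWP_OFFSET_FIRST_BIT 9
#define SWP_TYPE_FIRST_BIT   59

static uint64_t mk_swp_entry(uint64_t type, uint64_t offset)
{
    return (type << SWP_TYPE_FIRST_BIT) | (offset << SWP_OFFSET_FIRST_BIT);
}

static uint64_t swp_type(uint64_t entry)
{
    return entry >> SWP_TYPE_FIRST_BIT;
}

static uint64_t swp_offset(uint64_t entry)
{
    /* Shift the type bits out of the top, then drop the low PTE bits. */
    return (entry << SWP_TYPE_BITS) >> (SWP_TYPE_BITS + SWP_OFFSET_FIRST_BIT);
}

int main(void)
{
    uint64_t e = mk_swp_entry(3, 0x12345);

    printf("type=%llu offset=0x%llx\n",
           (unsigned long long)swp_type(e),
           (unsigned long long)swp_offset(e));
    return 0;
}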
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50896e180c upstream
L1 Terminal Fault (L1TF) is a speculation related vulnerability. The CPU
speculates on PTE entries which do not have the PRESENT bit set, if the
content of the resulting physical address is available in the L1D cache.
The OS side mitigation makes sure that a !PRESENT PTE entry points to a
physical address outside the actually existing and cacheable memory
space. This is achieved by inverting the upper bits of the PTE. Due to the
address space limitations this only works for 64bit and 32bit PAE kernels,
but not for 32bit non PAE.
This mitigation applies to both host and guest kernels, but in case of a
64bit host (hypervisor) and a 32bit PAE guest, inverting the upper bits of
the PAE address space (44bit) is not enough if the host has more than 43
bits of populated memory address space, because the speculation treats the
PTE content as a physical host address bypassing EPT.
The host (hypervisor) protects itself against the guest by flushing L1D as
needed, but pages inside the guest are not protected against attacks from
other processes inside the same guest.
For the guest the inverted PTE mask has to match the host to provide the
full protection for all pages the host could possibly map into the
guest. The host's populated address space is not known to the guest, so the
mask must cover the maximal possible host address space, i.e. 52 bits.
On 32bit PAE the maximum PTE mask is currently set to 44 bits because that
is the limit imposed by 32bit unsigned long PFNs in the VMs. This limits
the mask to be below what the host could possibly use for physical pages.
The L1TF PROT_NONE protection code uses the PTE masks to determine which
bits to invert to make sure the higher bits are set for unmapped entries to
prevent L1TF speculation attacks against EPT inside guests.
In order to invert all bits that could be used by the host, increase
__PHYSICAL_PAGE_SHIFT to 52 to match 64bit.
The real limit for a 32bit PAE kernel is still 44 bits because all Linux
PTEs are created from unsigned long PFNs, so they cannot be higher than 44
bits on a 32bit kernel. So these extra PFN bits should never be set. The
only users of this macro are using it to look at PTEs, so it's safe.
[ tglx: Massaged changelog ]
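A small illustration of why the wider mask matters (hypothetical values, not kernel code): inverting under a 44-bit mask leaves bits 44-51 clear, while the 52-bit mask sets them, pushing the speculated address above anything the host can map:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t addr_bits = 0x1234;                 /* some guest PTE address bits */
    uint64_t mask44 = (UINT64_C(1) << 44) - 1;   /* old 32bit PAE mask          */
    uint64_t mask52 = (UINT64_C(1) << 52) - 1;   /* mask matching 64bit         */

    printf("inverted under 44-bit mask: 0x%013llx\n",
           (unsigned long long)(~addr_bits & mask44));
    printf("inverted under 52-bit mask: 0x%013llx\n",
           (unsigned long long)(~addr_bits & mask52));
    return 0;
}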
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5800dc5c19 upstream.
Nadav reported that on guests we're failing to rewrite the indirect
calls to CALLEE_SAVE paravirt functions. In particular the
pv_queued_spin_unlock() call is left unpatched and that is all over the
place. This obviously wrecks Spectre-v2 mitigation (for paravirt
guests) which relies on not actually having indirect calls around.
The reason is an incorrect clobber test in paravirt_patch_call(); this
function rewrites an indirect call with a direct call to the _SAME_
function, there is no possible way the clobbers can be different
because of this.
Therefore remove this clobber check. Also put WARNs on the other patch
failure case (not enough room for the instruction) which I've not seen
trigger in my (limited) testing.
Three live kernel image disassemblies for lock_sock_nested (as a small
function that illustrates the problem nicely). PRE is the current
situation for guests, POST is with this patch applied and NATIVE is with
or without the patch for !guests.
PRE:
(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
0xffffffff817be970 <+0>: push %rbp
0xffffffff817be971 <+1>: mov %rdi,%rbp
0xffffffff817be974 <+4>: push %rbx
0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx
0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched>
0xffffffff817be981 <+17>: mov %rbx,%rdi
0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh>
0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax
0xffffffff817be98f <+31>: test %eax,%eax
0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74>
0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp)
0xffffffff817be99d <+45>: mov %rbx,%rdi
0xffffffff817be9a0 <+48>: callq *0xffffffff822299e8
0xffffffff817be9a7 <+55>: pop %rbx
0xffffffff817be9a8 <+56>: pop %rbp
0xffffffff817be9a9 <+57>: mov $0x200,%esi
0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi
0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip>
0xffffffff817be9ba <+74>: mov %rbp,%rdi
0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock>
0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.
POST:
(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
0xffffffff817be970 <+0>: push %rbp
0xffffffff817be971 <+1>: mov %rdi,%rbp
0xffffffff817be974 <+4>: push %rbx
0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx
0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched>
0xffffffff817be981 <+17>: mov %rbx,%rdi
0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh>
0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax
0xffffffff817be98f <+31>: test %eax,%eax
0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74>
0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp)
0xffffffff817be99d <+45>: mov %rbx,%rdi
0xffffffff817be9a0 <+48>: callq 0xffffffff810a0c20 <__raw_callee_save___pv_queued_spin_unlock>
0xffffffff817be9a5 <+53>: xchg %ax,%ax
0xffffffff817be9a7 <+55>: pop %rbx
0xffffffff817be9a8 <+56>: pop %rbp
0xffffffff817be9a9 <+57>: mov $0x200,%esi
0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi
0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063aa0 <__local_bh_enable_ip>
0xffffffff817be9ba <+74>: mov %rbp,%rdi
0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock>
0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.
NATIVE:
(gdb) disassemble lock_sock_nested
Dump of assembler code for function lock_sock_nested:
0xffffffff817be970 <+0>: push %rbp
0xffffffff817be971 <+1>: mov %rdi,%rbp
0xffffffff817be974 <+4>: push %rbx
0xffffffff817be975 <+5>: lea 0x88(%rbp),%rbx
0xffffffff817be97c <+12>: callq 0xffffffff819f7160 <_cond_resched>
0xffffffff817be981 <+17>: mov %rbx,%rdi
0xffffffff817be984 <+20>: callq 0xffffffff819fbb00 <_raw_spin_lock_bh>
0xffffffff817be989 <+25>: mov 0x8c(%rbp),%eax
0xffffffff817be98f <+31>: test %eax,%eax
0xffffffff817be991 <+33>: jne 0xffffffff817be9ba <lock_sock_nested+74>
0xffffffff817be993 <+35>: movl $0x1,0x8c(%rbp)
0xffffffff817be99d <+45>: mov %rbx,%rdi
0xffffffff817be9a0 <+48>: movb $0x0,(%rdi)
0xffffffff817be9a3 <+51>: nopl 0x0(%rax)
0xffffffff817be9a7 <+55>: pop %rbx
0xffffffff817be9a8 <+56>: pop %rbp
0xffffffff817be9a9 <+57>: mov $0x200,%esi
0xffffffff817be9ae <+62>: mov $0xffffffff817be993,%rdi
0xffffffff817be9b5 <+69>: jmpq 0xffffffff81063ae0 <__local_bh_enable_ip>
0xffffffff817be9ba <+74>: mov %rbp,%rdi
0xffffffff817be9bd <+77>: callq 0xffffffff817be8c0 <__lock_sock>
0xffffffff817be9c2 <+82>: jmp 0xffffffff817be993 <lock_sock_nested+35>
End of assembler dump.
Fixes: 63f70270cc ("[PATCH] i386: PARAVIRT: add common patching machinery")
Fixes: 3010a0663f ("x86/paravirt, objtool: Annotate indirect calls")
Reported-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bcfe05640 upstream.
Use the correct IRQ line for the MSI controller in the PCIe host
controller. Apparently a different IRQ line is used compared to other
i.MX6 variants. Without this change MSI IRQs aren't properly propagated
to the upstream interrupt controller.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Fixes: b1d17f68e5 ("ARM: dts: imx: add initial imx6sx device tree source")
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 119e1ef80e upstream.
__legitimize_mnt() has two problems - one is that in case of success
the check of mount_lock is not ordered wrt preceding increment of
refcount, making it possible to have successful __legitimize_mnt()
on one CPU just before the otherwise final mntput() on another,
with __legitimize_mnt() not seeing mntput() taking the lock and
mntput() not seeing the increment done by __legitimize_mnt().
Solved by a pair of barriers.
Another is that failure of __legitimize_mnt() on the second
read_seqretry() leaves us with reference that'll need to be
dropped by caller; however, if that races with final mntput()
we can end up with caller dropping rcu_read_lock() and doing
mntput() to release that reference - with the first mntput()
having freed the damn thing just as rcu_read_lock() had been
dropped. Solution: in "do mntput() yourself" failure case
grab mount_lock, check if MNT_DOOMED has been set by racing
final mntput() that has missed our increment and if it has -
undo the increment and treat that as "failure, caller doesn't
need to drop anything" case.
It's not easy to hit - the final mntput() has to come right
after the first read_seqretry() in __legitimize_mnt() *and*
manage to miss the increment done by __legitimize_mnt() before
the second read_seqretry() in there. The things that are almost
impossible to hit on bare hardware are not impossible on SMP
KVM, though...
Reported-by: Oleg Nesterov <oleg@redhat.com>
Fixes: 48a066e72d ("RCU'd vsfmounts")
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9ea0a46ca2 upstream.
mntput_no_expire() does the calculation of total refcount under mount_lock;
unfortunately, the decrement (as well as all increments) are done outside
of it, leading to false positives in the "are we dropping the last reference"
test. Consider the following situation:
* mnt is a lazy-umounted mount, kept alive by two opened files. One
of those files gets closed. Total refcount of mnt is 2. On CPU 42
mntput(mnt) (called from __fput()) drops one reference, decrementing component #42.
* After it has looked at component #0, the process on CPU 0 does
mntget(), incrementing component #0, gets preempted and gets to run again -
on CPU 69. There it does mntput(), which drops the reference (component #69)
and proceeds to spin on mount_lock.
* On CPU 42 our first mntput() finishes counting. It observes the
decrement of component #69, but not the increment of component #0. As the
result, the total it gets is not 1 as it should've been - it's 0. At which
point we decide that vfsmount needs to be killed and proceed to free it and
shut the filesystem down. However, there's still another opened file
on that filesystem, with reference to (now freed) vfsmount, etc. and we are
screwed.
It's not a wide race, but it can be reproduced with artificial slowdown of
the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups.
Fix consists of moving the refcount decrement under mount_lock; the tricky
part is that we want (and can) keep the fast case (i.e. mount that still
has non-NULL ->mnt_ns) entirely out of mount_lock. All places that zero
mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu()
before that mntput(). IOW, if mntput() observes (under rcu_read_lock())
a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to
be dropped.
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Jann Horn <jannh@google.com>
Fixes: 48a066e72d ("RCU'd vsfmounts")
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c0d7cd5c8 upstream.
RCU pathwalk relies upon the assumption that anything that changes
->d_inode of a dentry will invalidate its ->d_seq. That's almost
true - the one exception is that the final dput() of already unhashed
dentry does *not* touch ->d_seq at all. Unhashing does, though,
so for anything we'd found by RCU dcache lookup we are fine.
Unfortunately, we can *start* with an unhashed dentry or jump into
it.
We could try and be careful in the (few) places where that could
happen. Or we could just make the final dput() invalidate the damn
thing, unhashed or not. The latter is much simpler and easier to
backport, so let's do it that way.
Reported-by: "Dae R. Jeong" <threeearcat@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90bad5e05b upstream.
Since mountpoint crossing can happen without leaving lazy mode,
root dentries do need the same protection against having their
memory freed without RCU delay as everything else in the tree.
It's partially hidden by RCU delay between detaching from the
mount tree and dropping the vfsmount reference, but the starting
point of pathwalk can be on an already detached mount, in which
case umount-caused RCU delay has already passed by the time the
lazy pathwalk grabs rcu_read_lock(). If the starting point
happens to be at the root of that vfsmount *and* that vfsmount
covers the entire filesystem, we get trouble.
Fixes: 48a066e72d ("RCU'd vsfmounts")
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5b1404d08 upstream.
This is purely a preparatory patch for upcoming changes during the 4.19
merge window.
We have a function called "boot_cpu_state_init()" that isn't really
about the bootup cpu state: that is done much earlier by the similarly
named "boot_cpu_init()" (note lack of "state" in name).
This function initializes some hotplug CPU state, and needs to run after
the percpu data has been properly initialized. It even has a comment to
that effect.
Except it _doesn't_ actually run after the percpu data has been properly
initialized. On x86 it happens to do that, but on at least arm and
arm64, the percpu base pointers are initialized by the arch-specific
'smp_prepare_boot_cpu()' hook, which ran _after_ boot_cpu_state_init().
This had some unexpected results, and in particular we have a patch
pending for the merge window that did the obvious cleanup of using
'this_cpu_write()' in the cpu hotplug init code:
- per_cpu_ptr(&cpuhp_state, smp_processor_id())->state = CPUHP_ONLINE;
+ this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
which is obviously the right thing to do. Except because of the
ordering issue, it actually failed miserably and unexpectedly on arm64.
So this just fixes the ordering, and changes the name of the function to
be 'boot_cpu_hotplug_init()' to make it obvious that it's about cpu
hotplug state, because the core CPU state was supposed to have already
been done earlier.
Marked for stable, since the (not yet merged) patch that will show this
problem is marked for stable.
Reported-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Mian Yousaf Kaukab <yousaf.kaukab@suse.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1214fd7b49 upstream.
Surround scsi_execute() calls with scsi_autopm_get_device() and
scsi_autopm_put_device(). Note: removing sr_mutex protection from the
scsi_cd_get() and scsi_cd_put() calls is safe because the purpose of
sr_mutex is to serialize cdrom_*() calls.
This patch avoids that complaints similar to the following appear in the
kernel log if runtime power management is enabled:
INFO: task systemd-udevd:650 blocked for more than 120 seconds.
Not tainted 4.18.0-rc7-dbg+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
systemd-udevd D28176 650 513 0x00000104
Call Trace:
__schedule+0x444/0xfe0
schedule+0x4e/0xe0
schedule_preempt_disabled+0x18/0x30
__mutex_lock+0x41c/0xc70
mutex_lock_nested+0x1b/0x20
__blkdev_get+0x106/0x970
blkdev_get+0x22c/0x5a0
blkdev_open+0xe9/0x100
do_dentry_open.isra.19+0x33e/0x570
vfs_open+0x7c/0xd0
path_openat+0x6e3/0x1120
do_filp_open+0x11c/0x1c0
do_sys_open+0x208/0x2d0
__x64_sys_openat+0x59/0x70
do_syscall_64+0x77/0x230
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Maurizio Lombardi <mlombard@redhat.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: <stable@vger.kernel.org>
Tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5121700b34 upstream.
While working on bpf_tcp_sendmsg() code, I noticed that when a
sk->sk_err is set we error out with err = sk->sk_err. However
this is problematic since sk->sk_err is a positive error value
and therefore we will neither go into sk_stream_error() nor will
we report an error back to user space. I had this case with EPIPE
and user space was thinking sendmsg() succeeded since EPIPE is
a positive value, thinking we submitted 32 bytes. Fix it by
negating the sk->sk_err value.
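A tiny illustration of the sign problem (names and values here are made up; the real code returns the error from sendmsg): user space treats a negative return as an error, so a positive sk_err such as EPIPE looks like a byte count:

#include <stdio.h>
#include <errno.h>

/* 'negate' selects the fixed behaviour. */
static long sendmsg_result(int sk_err, int negate)
{
    if (sk_err)
        return negate ? -sk_err : sk_err;
    return 32;   /* pretend 32 bytes were copied */
}

int main(void)
{
    printf("buggy: %ld (positive, looks like bytes sent)\n",
           sendmsg_result(EPIPE, 0));
    printf("fixed: %ld (a proper -EPIPE)\n",
           sendmsg_result(EPIPE, 1));
    return 0;
}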
Fixes: 4f738adba3 ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c81c71730 upstream.
In bpf_tcp_sendmsg() the sk_alloc_sg() may fail. In the case of
ENOMEM, it may also mean that we've partially filled the scatterlist
entries with pages. Later, jumping to sk_stream_wait_memory()
we could further fail with an error for several reasons, however
we fail to call free_start_sg() if the local sk_msg_buff was used.
Fixes: 4f738adba3 ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f7a7beaee upstream.
If zram supports the writeback feature, it's no longer a
BD_CAP_SYNCHRONOUS_IO device because zram does asynchronous IO operations
for incompressible pages.
Do not pretend to be a synchronous IO device. It makes the system very
sluggish due to waiting for IO completion from upper layers.
Furthermore, it causes a use-after-free problem because swap thinks the
operation is done when the IO function returns, so it can free the page
(e.g., lock_page_or_retry and goto out_release in do_swap_page), but in
fact the IO is asynchronous, so the driver could access a just-freed page
afterward.
This patch fixes the problem.
BUG: Bad page state in process qemu-system-x86 pfn:3dfab21
page:ffffdfb137eac840 count:0 mapcount:0 mapping:0000000000000000 index:0x1
flags: 0x17fffc000000008(uptodate)
raw: 017fffc000000008 dead000000000100 dead000000000200 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag set
bad because of flags: 0x8(uptodate)
CPU: 4 PID: 1039 Comm: qemu-system-x86 Tainted: G B 4.18.0-rc5+ #1
Hardware name: Supermicro Super Server/X10SRL-F, BIOS 2.0b 05/02/2017
Call Trace:
dump_stack+0x5c/0x7b
bad_page+0xba/0x120
get_page_from_freelist+0x1016/0x1250
__alloc_pages_nodemask+0xfa/0x250
alloc_pages_vma+0x7c/0x1c0
do_swap_page+0x347/0x920
__handle_mm_fault+0x7b4/0x1110
handle_mm_fault+0xfc/0x1f0
__get_user_pages+0x12f/0x690
get_user_pages_unlocked+0x148/0x1f0
__gfn_to_pfn_memslot+0xff/0x3c0 [kvm]
try_async_pf+0x87/0x230 [kvm]
tdp_page_fault+0x132/0x290 [kvm]
kvm_mmu_page_fault+0x74/0x570 [kvm]
kvm_arch_vcpu_ioctl_run+0x9b3/0x1990 [kvm]
kvm_vcpu_ioctl+0x388/0x5d0 [kvm]
do_vfs_ioctl+0xa2/0x630
ksys_ioctl+0x70/0x80
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x55/0x100
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Link: https://lore.kernel.org/lkml/0516ae2d-b0fd-92c5-aa92-112ba7bd32fc@contabo.de/
Link: http://lkml.kernel.org/r/20180802051112.86174-1-minchan@kernel.org
[minchan@kernel.org: fix changelog, add comment]
Link: https://lore.kernel.org/lkml/0516ae2d-b0fd-92c5-aa92-112ba7bd32fc@contabo.de/
Link: http://lkml.kernel.org/r/20180802051112.86174-1-minchan@kernel.org
Link: http://lkml.kernel.org/r/20180805233722.217347-1-minchan@kernel.org
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Tino Lehnig <tino.lehnig@contabo.de>
Tested-by: Tino Lehnig <tino.lehnig@contabo.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: <stable@vger.kernel.org> [4.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2610e88946 upstream.
This commit:
9fb8d5dc4b ("stop_machine, Disable preemption when waking two stopper threads")
does not fully address the race condition that can occur
as follows:
On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.
Then, on CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.
If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However migration/1 cannot be
woken up by thread 2, since it is a kworker, so it is affine to
CPU 2, but CPU 2 is running migration/2 with preemption
disabled, so thread 2 will never run.
Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.
Co-Developed-by: Prasad Sodagudi <psodagud@codeaurora.org>
Co-Developed-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: gregkh@linuxfoundation.org
Cc: matt@codeblueprint.co.uk
Fixes: 9fb8d5dc4b ("stop_machine, Disable preemption when waking two stopper threads")
Link: http://lkml.kernel.org/r/1531856129-9871-1-git-send-email-isaacm@codeaurora.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c53776e29 upstream.
Way back in 4.9, we committed 4cd13c21b2 ("softirq: Let ksoftirqd do
its job"), and ever since we've had small nagging issues with it. For
example, we've had:
1ff688209e ("watchdog: core: make sure the watchdog_worker is not deferred")
8d5755b3f7 ("watchdog: softdog: fire watchdog even if softirqs do not get to run")
217f697436 ("net: busy-poll: allow preemption in sk_busy_loop()")
all of which worked around some of the effects of that commit.
The DVB people have also complained that the commit causes excessive USB
URB latencies, which seems to be due to the USB code using tasklets to
schedule USB traffic. This seems to be an issue mainly when already
living on the edge, but waiting for ksoftirqd to handle it really does
seem to cause excessive latencies.
Now Hanna Hawa reports that this issue isn't just limited to USB URB and
DVB, but also causes timeout problems for the Marvell SoC team:
"I'm facing kernel panic issue while running raid 5 on sata disks
connected to Macchiatobin (Marvell community board with Armada-8040
SoC with 4 ARMv8 cores of CA72) Raid 5 built with Marvell DMA engine
and async_tx mechanism (ASYNC_TX_DMA [=y]); the DMA driver (mv_xor_v2)
uses a tasklet to clean the done descriptors from the queue"
The latency problem causes a panic:
mv_xor_v2 f0400000.xor: dma_sync_wait: timeout!
Kernel panic - not syncing: async_tx_quiesce: DMA error waiting for transaction
We've discussed simply just reverting the original commit entirely, and
also much more involved solutions (with per-softirq threads etc). This
patch is intentionally stupid and fairly limited, because the issue
still remains, and the other solutions either got sidetracked or had
other issues.
We should probably also consider the timer softirqs to be synchronous
and not be delayed to ksoftirqd (since they were the issue with the
earlier watchdog problems), but that should be done as a separate patch.
This does only the tasklet cases.
Reported-and-tested-by: Hanna Hawa <hannah@marvell.com>
Reported-and-tested-by: Josef Griebichler <griebichler.josef@gmx.at>
Reported-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fedb8da963 upstream.
For years I thought all parisc machines executed loads and stores in
order. However, Jeff Law recently indicated on gcc-patches that this is
not correct. There are various degrees of out-of-order execution all the
way back to the PA7xxx processor series (hit-under-miss). The PA8xxx
series has full out-of-order execution for both integer operations, and
loads and stores.
This is described in the following article:
http://web.archive.org/web/20040214092531/http://www.cpus.hp.com/technical_references/advperf.shtml
For this reason, we need to define mb() and to insert a memory barrier
before the store unlocking spinlocks. This ensures that all memory
accesses are complete prior to unlocking. The ldcw instruction performs
the same function on entry.
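The ordering requirement can be illustrated with C11 atomics (a generic sketch, not the parisc assembler): everything done under the lock must be visible before the lock word is released:

#include <stdatomic.h>
#include <stdio.h>

static atomic_int lock_word = 1;   /* 1 == free, as in an ldcw-style lock */
static int shared_data;

static void unlock_with_barrier(void)
{
    shared_data = 42;                           /* update done under the lock */
    atomic_thread_fence(memory_order_release);  /* plays the role of mb()     */
    atomic_store_explicit(&lock_word, 1, memory_order_relaxed);
}

int main(void)
{
    unlock_with_barrier();
    printf("data=%d lock=%d\n", shared_data, atomic_load(&lock_word));
    return 0;
}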
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 66509a276c upstream.
Enable the -mlong-calls compiler option by default, because otherwise in most
cases linking the vmlinux binary fails due to truncations of R_PARISC_PCREL22F
relocations. This fixes building the 64-bit defconfig.
Cc: stable@vger.kernel.org # 4.0+
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit afca6c5b25 upstream.
A recent fuzzed filesystem image caused random dcache corruption
when the reproducer was run. This often showed up as panics in
lookup_slow() on a null inode->i_ops pointer when doing pathwalks.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
....
Call Trace:
lookup_slow+0x44/0x60
walk_component+0x3dd/0x9f0
link_path_walk+0x4a7/0x830
path_lookupat+0xc1/0x470
filename_lookup+0x129/0x270
user_path_at_empty+0x36/0x40
path_listxattr+0x98/0x110
SyS_listxattr+0x13/0x20
do_syscall_64+0xf5/0x280
entry_SYSCALL_64_after_hwframe+0x42/0xb7
but had many different failure modes including deadlocks trying to
lock the inode that was just allocated or KASAN reports of
use-after-free violations.
The cause of the problem was a corrupt INOBT on a v4 fs where the
root inode was marked as free in the inobt record. Hence when we
allocated an inode, it chose the root inode to allocate, found it in
the cache and re-initialised it.
We recently fixed a similar inode allocation issue caused by inobt
record corruption problem in xfs_iget_cache_miss() in commit
ee457001ed ("xfs: catch inode allocation state mismatch
corruption"). This change adds similar checks to the cache-hit path
to catch it, and turns the reproducer into a corruption shutdown
situation.
Reported-by: Wen Xu <wen.xu@gatech.edu>
Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typos in comment]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb3d48dcf8 upstream.
xfs_attr3_leaf_create may have errored out before instantiating a buffer,
for example if the blkno is out of range. In that case there is no work
to do to remove it, and in fact xfs_da_shrink_inode will lead to an oops
if we try.
This also seems to fix a flaw where the original error from
xfs_attr3_leaf_create gets overwritten in the cleanup case, and it
removes a pointless assignment to bp which isn't used after this.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199969
Reported-by: Xu, Wen <wen.xu@gatech.edu>
Tested-by: Xu, Wen <wen.xu@gatech.edu>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a32e236eb9 upstream.
It turns out that commit 721c7fc701 ("block: fail op_is_write()
requests to read-only partitions"), while obviously correct, causes
problems for some older lvm2 installations.
The reason is that the lvm snapshotting will continue to write to the
snapshot COW volume, even after the volume has been marked read-only.
End result: snapshot failure.
This has actually been fixed in newer version of the lvm2 tool, but the
old tools still exist, and the breakage was reported both in the kernel
bugzilla and in the Debian bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=200439
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=900442
The lvm2 fix is here
https://sourceware.org/git/?p=lvm2.git;a=commit;h=a6fdb9d9d70f51c49ad11a87ab4243344e6701a3
but until everybody has updated to recent versions, we'll have to weaken
the "never write to read-only partitions" check. It now allows the
write to happen, but causes a warning, something like this:
generic_make_request: Trying to write to read-only block-device dm-3 (partno X)
Modules linked in: nf_tables xt_cgroup xt_owner kvm_intel iwlmvm kvm irqbypass iwlwifi
CPU: 1 PID: 77 Comm: kworker/1:1 Not tainted 4.17.9-gentoo #3
Hardware name: LENOVO 20B6A019RT/20B6A019RT, BIOS GJET91WW (2.41 ) 09/21/2016
Workqueue: ksnaphd do_metadata
RIP: 0010:generic_make_request_checks+0x4ac/0x600
...
Call Trace:
generic_make_request+0x64/0x400
submit_bio+0x6c/0x140
dispatch_io+0x287/0x430
sync_io+0xc3/0x120
dm_io+0x1f8/0x220
do_metadata+0x1d/0x30
process_one_work+0x1b9/0x3e0
worker_thread+0x2b/0x3c0
kthread+0x113/0x130
ret_from_fork+0x35/0x40
Note that this is a "revert" in behavior only. I'm leaving alone the
actual code cleanups in commit 721c7fc701, but letting the previously
uncaught request go through with a warning instead of stopping it.
Fixes: 721c7fc701 ("block: fail op_is_write() requests to read-only partitions")
Reported-and-tested-by: WGH <wgh@torlan.ru>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Zdenek Kabelac <zkabelac@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd3599a0e1 upstream.
When we clone a range into a file we can end up dropping existing
extent maps (or trimming them) and replacing them with new ones if the
range to be cloned overlaps with a range in the destination inode.
When that happens we add the new extent maps to the list of modified
extents in the inode's extent map tree, so that a "fast" fsync (the flag
BTRFS_INODE_NEEDS_FULL_SYNC not set in the inode) will see the extent maps
and log corresponding extent items. However, at the end of range cloning
operation we do truncate all the pages in the affected range (in order to
ensure future reads will not get stale data). Sometimes this truncation
will release the corresponding extent maps besides the pages from the page
cache. If this happens, then a "fast" fsync operation will miss logging
some extent items, because it relies exclusively on the extent maps being
present in the inode's extent tree, leading to data loss/corruption if
the fsync ends up using the same transaction used by the clone operation
(that transaction was not committed in the meanwhile). An extent map is
released through the callback btrfs_invalidatepage(), which gets called by
truncate_inode_pages_range(), and it calls __btrfs_releasepage(). The
later ends up calling try_release_extent_mapping() which will release the
extent map if some conditions are met, like the file size being greater
than 16Mb, gfp flags allow blocking and the range not being locked (which
is the case during the clone operation) nor being the extent map flagged
as pinned (also the case for cloning).
The following example, turned into a test for fstests, reproduces the
issue:
$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt
$ xfs_io -f -c "pwrite -S 0x18 9000K 6908K" /mnt/foo
$ xfs_io -f -c "pwrite -S 0x20 2572K 156K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
# reflink destination offset corresponds to the size of file bar,
# 2728Kb minus 4Kb.
$ xfs_io -c ""reflink ${SCRATCH_MNT}/foo 0 2724K 15908K" /mnt/bar
$ xfs_io -c "fsync" /mnt/bar
$ md5sum /mnt/bar
95a95813a8c2abc9aa75a6c2914a077e /mnt/bar
<power fail>
$ mount /dev/sdb /mnt
$ md5sum /mnt/bar
207fd8d0b161be8a84b945f0df8d5f8d /mnt/bar
# digest should be 95a95813a8c2abc9aa75a6c2914a077e like before the
# power failure
In the above example, the destination offset of the clone operation
corresponds to the size of the "bar" file minus 4Kb. So during the clone
operation, the extent map covering the range from 2572Kb to 2728Kb gets
trimmed so that it ends at offset 2724Kb, and a new extent map covering
the range from 2724Kb to 11724Kb is created. So at the end of the clone
operation when we ask to truncate the pages in the range from 2724Kb to
2724Kb + 15908Kb, the page invalidation callback ends up removing the new
extent map (through try_release_extent_mapping()) when the page at offset
2724Kb is passed to that callback.
Fix this by setting the bit BTRFS_INODE_NEEDS_FULL_SYNC whenever an extent
map is removed at try_release_extent_mapping(), forcing the next fsync to
search for modified extents in the fs/subvolume tree instead of relying on
the presence of extent maps in memory. This way we can continue doing a
"fast" fsync if the destination range of a clone operation does not
overlap with an existing range or if any of the criteria necessary to
remove an extent map at try_release_extent_mapping() is not met (file
size not bigger than 16Mb or gfp flags do not allow blocking).
CC: stable@vger.kernel.org # 3.16+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73c8d89455 upstream.
Maintain the tracing on/off setting of the ring_buffer when switching
to the trace buffer snapshot.
Taking a snapshot is done by swapping the backup ring buffer
(max_tr_buffer). But since the tracing on/off setting is defined
by the ring buffer, when swapping it, the tracing on/off setting
can also be changed. This causes a strange result like below:
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 0 > tracing_on
/sys/kernel/debug/tracing # cat tracing_on
0
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
1
/sys/kernel/debug/tracing # echo 1 > snapshot
/sys/kernel/debug/tracing # cat tracing_on
0
We don't touch tracing_on, but taking a snapshot changes the tracing_on
setting each time. This is an anomaly, because the user doesn't know
that each "ring_buffer" stores its own tracing-enable state and
the snapshot is done by swapping ring buffers.
Link: http://lkml.kernel.org/r/153149929558.11274.11730609978254724394.stgit@devbox
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Hiraku Toyooka <hiraku.toyooka@cybertrust.co.jp>
Cc: stable@vger.kernel.org
Fixes: debdd57f51 ("tracing: Make a snapshot feature available from userspace")
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
[ Updated commit log and comment in the code ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a0e0829f9 upstream.
The full nohz tick is reprogrammed in irq_exit() only if the exit is not in
a nesting interrupt. This stands as an optimization: whether a hardirq or a
softirq is interrupted, the tick is going to be reprogrammed when necessary
at the end of the inner interrupt, with even potential new updates on the
timer queue.
When soft interrupts are interrupted, it's assumed that they are executing
on the tail of an interrupt return. In that case tick_nohz_irq_exit() is
called after softirq processing to take care of the tick reprogramming.
But the assumption is wrong: softirqs can be processed inline as well, ie:
outside of an interrupt, like in a call to local_bh_enable() or from
ksoftirqd.
Inline softirqs don't reprogram the tick once they are done, as opposed to
interrupt tail softirq processing. So if a tick interrupts an inline
softirq processing, the next timer will neither be reprogrammed from the
interrupting tick's irq_exit() nor after the interrupted softirq
processing. This situation may leave the tick unprogrammed while timers are
armed.
To fix this, simply keep reprogramming the tick even if a softirq has been
interrupted. That can be optimized further, but for now correctness is more
important.
Note that new timers enqueued in nohz_full mode after a softirq gets
interrupted will still be handled just fine through self-IPIs triggered by
the timer code.
Reported-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: stable@vger.kernel.org # 4.14+
Link: https://lkml.kernel.org/r/1533303094-15855-1-git-send-email-frederic@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 156c8b58ef upstream.
Masayoshi Mizuma reported that a warning message is shown while a CPU is
hot-removed on Broadwell servers:
WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988
uncore_pci_remove+0x10b/0x150
Call Trace:
pci_device_remove+0x42/0xd0
device_release_driver_internal+0x148/0x220
pci_stop_bus_device+0x76/0xa0
pci_stop_root_bus+0x44/0x60
acpi_pci_root_remove+0x1f/0x80
acpi_bus_trim+0x57/0x90
acpi_bus_trim+0x2e/0x90
acpi_device_hotplug+0x2bc/0x4b0
acpi_hotplug_work_fn+0x1a/0x30
process_one_work+0x174/0x3a0
worker_thread+0x4c/0x3d0
kthread+0xf8/0x130
This bug was introduced by:
commit 15a3e845b0 ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
The index of "QPI Port 2 filter" was hardcode to 2, but this conflicts with the
index of "PCU.3" which is "HSWEP_PCI_PCU_3", which equals to 2 as well.
To fix the conflict, the hardcoded index needs to be cleaned up:
- introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2
filter" on Broadwell,
- increase UNCORE_EXTRA_PCI_DEV_MAX by one,
- clean up the hardcoded index.
Debugged-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: msys.mizuma@gmail.com
Cc: stable@vger.kernel.org
Fixes: 15a3e845b0 ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
Link: http://lkml.kernel.org/r/1532953688-15008-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1f0301b33 upstream.
The support of force threading interrupts which are set up with both a
primary and a threaded handler wrecked the setup of regular requested
threaded interrupts (primary handler == NULL).
The reason is that it does not check whether the primary handler is set to
the default handler which wakes the handler thread. Instead it replaces the
thread handler with the primary handler as it would do with force threaded
interrupts which have been requested via request_irq(). So both the primary
and the thread handler become the same which then triggers the warning that
the thread handler tries to wake up a not configured secondary thread.
Fortunately this only happens when the driver omits the IRQF_ONESHOT flag
when requesting the threaded interrupt, which is normally caught by the
sanity checks when force irq threading is disabled.
Fix it by skipping the force threading setup when a regular threaded
interrupt is requested. As a consequence the interrupt request which lacks
the IRQF_ONESHOT flag is rejected correctly instead of silently wrecking
it.
Fixes: 2a1d3ab898 ("genirq: Handle force threading of irqs with primary and thread handler")
Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 961b33c244 upstream.
Bart Massey reported what turned out to be a usercopy whitelist false
positive in JFS when symlink contents exceeded 128 bytes. The inline
inode data (i_inline) is actually designed to overflow into the "extended
area" following it (i_inline_ea) when needed. So the whitelist needed to
be expanded to include both i_inline and i_inline_ea (the whole size
of which is calculated internally using IDATASIZE, 256, instead of
sizeof(i_inline), 128).
$ cd /mnt/jfs
$ touch $(perl -e 'print "B" x 250')
$ ln -s B* b
$ ls -l >/dev/null
[ 249.436410] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLUB object 'jfs_ip' (offset 616, size 250)!
Reported-by: Bart Massey <bart.massey@gmail.com>
Fixes: 8d2704d382 ("jfs: Define usercopy region in jfs_ip slab cache")
Cc: Dave Kleikamp <shaggy@kernel.org>
Cc: jfs-discussion@lists.sourceforge.net
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b08abbd9f5 upstream.
During the unload process, the chip can encounter a problem where a FW dump
would be captured. For this case, the full reset sequence will be skipped to
bring the chip back to a fully operational state.
Fixes: e315cd28b9 ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3dde080eb upstream.
In case of IOCB Queue full or system where memory is low and driver
receives large number of RSCN storm, the stale sp pointer can stay on
gpnid_list resulting in page_fault.
This patch fixes this issue by initializing the sp->elem list head and
removing sp->elem before memory is freed.
The following stack trace is seen:
9 [ffff987b37d1bc60] page_fault at ffffffffad516768 [exception RIP: qla24xx_async_gpnid+496]
10 [ffff987b37d1bd10] qla24xx_async_gpnid at ffffffffc039866d [qla2xxx]
11 [ffff987b37d1bd80] qla2x00_do_work at ffffffffc036169c [qla2xxx]
12 [ffff987b37d1be38] qla2x00_do_dpc_all_vps at ffffffffc03adfed [qla2xxx]
13 [ffff987b37d1be78] qla2x00_do_dpc at ffffffffc036458a [qla2xxx]
14 [ffff987b37d1bec8] kthread at ffffffffacebae31
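A minimal sketch of the two steps named above; where exactly they sit in the
driver is an assumption:
  INIT_LIST_HEAD(&sp->elem);      /* at sp setup, so list operations are always valid */
  list_del(&sp->elem);            /* unlink before the sp memory is freed */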
Fixes: 2d73ac6102 ("scsi: qla2xxx: Serialize GPNID for multiple RSCN")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 46d8c4b286 upstream.
This was detected by the self-test thanks to Ard's chunking patch.
I finally got around to testing this out on my ancient Via box. It
turns out that the workaround got the assembly wrong and we end up
doing count + initial cycles of the loop instead of just count.
This obviously causes corruption, either by overwriting the source
that is yet to be processed, or writing over the end of the buffer.
On CPUs that don't require the workaround only ECB is affected.
On Nano CPUs both ECB and CBC are affected.
This patch fixes it by doing the subtraction prior to the assembly.
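A minimal sketch of the arithmetic being fixed, with a hypothetical
xcrypt_blocks() helper standing in for the driver's real routines:
  if (initial)
          xcrypt_blocks(in, out, key, cword, initial);
  xcrypt_blocks(in + initial * AES_BLOCK_SIZE,
                out + initial * AES_BLOCK_SIZE,
                key, cword, count - initial);   /* subtract before the asm runs */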
Fixes: a76c1c23d0 ("crypto: padlock-aes - work around Nano CPU...")
Cc: <stable@vger.kernel.org>
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit addb8a6559 upstream.
The commit cited below checked that the port numbers provided in the
primary and alt AVs are legal.
That is sufficient to prevent a kernel panic. However, it is not
sufficient for correct operation.
In Linux, AVs (both primary and alt) must be completely self-described.
We do not accept an AV from userspace without an embedded port number.
(This has been the case since kernel 3.14 commit dbf727de74
("IB/core: Use GID table in AH creation and dmac resolution")).
For the primary AV, this embedded port number must match the port number
specified with IB_QP_PORT.
We also expect the port number embedded in the alt AV to match the
alt_port_num value passed by the userspace driver in the modify_qp command
base structure.
Add these checks to modify_qp.
Cc: <stable@vger.kernel.org> # 4.16
Fixes: 5d4c05c3ee ("RDMA/uverbs: Sanitize user entered port numbers prior to access it")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 299b6365a3 upstream.
NVRAM is designed to work with Broadcom's SDK Linux kernel which fakes
PCI domain 0 for all internal MMIO devices. Since official Linux kernel
uses platform devices for that purpose there is a mismatch in numbering
PCI domains.
There used to be a fix for that problem but it was accidentally dropped
during the last firmware loading rework. That resulted in brcmfmac not
being able to extract device-specific NVRAM content, causing all kinds of
calibration problems.
Reported-by: Aditya Xavier <adityaxavier@gmail.com>
Fixes: 2baa3aaee2 ("brcmfmac: introduce brcmf_fw_alloc_request() function")
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eec3636ad1 upstream.
Commit 05ea88608d ("mm, hugetlbfs: introduce ->pagesize() to
vm_operations_struct") adds a new ->pagesize() function to
hugetlb_vm_ops, intended to cover all hugetlbfs backed files.
With System V shared memory model, if "huge page" is specified, the
"shared memory" is backed by hugetlbfs files, but the mappings initiated
via shmget/shmat have their original vm_ops overwritten with shm_vm_ops,
so we need to add a ->pagesize function to shm_vm_ops. Otherwise,
vma_kernel_pagesize() returns PAGE_SIZE given a hugetlbfs backed vma,
resulting in the BUG below:
fs/hugetlbfs/inode.c
443 if (unlikely(page_mapped(page))) {
444 BUG_ON(truncate_op);
resulting in
hugetlbfs: oracle (4592): Using mlock ulimits for SHM_HUGETLB is deprecated
------------[ cut here ]------------
kernel BUG at fs/hugetlbfs/inode.c:444!
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 ...
CPU: 35 PID: 5583 Comm: oracle_5583_sbt Not tainted 4.14.35-1829.el7uek.x86_64 #2
RIP: 0010:remove_inode_hugepages+0x3db/0x3e2
....
Call Trace:
hugetlbfs_evict_inode+0x1e/0x3e
evict+0xdb/0x1af
iput+0x1a2/0x1f7
dentry_unlink_inode+0xc6/0xf0
__dentry_kill+0xd8/0x18d
dput+0x1b5/0x1ed
__fput+0x18b/0x216
____fput+0xe/0x10
task_work_run+0x90/0xa7
exit_to_usermode_loop+0xdd/0x116
do_syscall_64+0x187/0x1ae
entry_SYSCALL_64_after_hwframe+0x150/0x0
[jane.chu@oracle.com: relocate comment]
Link: http://lkml.kernel.org/r/20180731044831.26036-1-jane.chu@oracle.com
Link: http://lkml.kernel.org/r/20180727211727.5020-1-jane.chu@oracle.com
Fixes: 05ea88608d ("mm, hugetlbfs: introduce ->pagesize() to vm_operations_struct")
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b305f7ed0f upstream.
The variable 'context->module.name' may be a null pointer when
kmalloc returns NULL, so it's better to check it before use to
avoid a null dereference.
One more thing this patch does is use kstrdup instead of
(kmalloc + strcpy), and signal a lost record via audit_log_lost.
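A minimal sketch of that pattern; the GFP flag and the lost-record message are
assumptions, not taken from the patch:
  context->module.name = kstrdup(name, GFP_KERNEL);
  if (!context->module.name)
          audit_log_lost("out of memory in audit_log_kern_module");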
Cc: stable@vger.kernel.org # 4.11
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63aff65573 upstream.
VPID for the nested vcpu is allocated at vmx_create_vcpu whenever nested
vmx is turned on with the module parameter.
However, it's only freed if the L1 guest has executed VMXON which is not
a given.
As a result, on a system with nested==on every creation+deletion of an
L1 vcpu without running an L2 guest results in leaking one vpid. Since
the total number of vpids is limited to 64k, they can eventually get
exhausted, preventing L2 from starting.
Delay allocation of the L2 vpid until VMXON emulation, thus matching its
freeing.
Fixes: 5c614b3583
Cc: stable@vger.kernel.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3681dd548 upstream.
error_entry and error_exit communicate the user vs. kernel status of
the frame using %ebx. This is unnecessary -- the information is in
regs->cs. Just use regs->cs.
This makes error_entry simpler and makes error_exit more robust.
It also fixes a nasty bug. Before all the Spectre nonsense, the
xen_failsafe_callback entry point returned like this:
ALLOC_PT_GPREGS_ON_STACK
SAVE_C_REGS
SAVE_EXTRA_REGS
ENCODE_FRAME_POINTER
jmp error_exit
And it did not go through error_entry. This was bogus: RBX
contained garbage, and error_exit expected a flag in RBX.
Fortunately, it generally contained *nonzero* garbage, so the
correct code path was used. As part of the Spectre fixes, code was
added to clear RBX to mitigate certain speculation attacks. Now,
depending on kernel configuration, RBX got zeroed and, when running
some Wine workloads, the kernel crashes. This was introduced by:
commit 3ac6d8c787 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
With this patch applied, RBX is no longer needed as a flag, and the
problem goes away.
I suspect that malicious userspace could use this bug to crash the
kernel even without the offending patch applied, though.
[ Historical note: I wrote this patch as a cleanup before I was aware
of the bug it fixed. ]
[ Note to stable maintainers: this should probably get applied to all
kernels. If you're nervous about that, a more conservative fix to
add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
also fix the problem. ]
Reported-and-tested-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Fixes: 3ac6d8c787 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
Link: http://lkml.kernel.org/r/b5010a090d3586b2d6e06c7ad3ec5542d1241c45.1532282627.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8e8cd579b upstream.
'call' is a user-controlled value, so sanitize the array index after the
bounds check to avoid speculating past the bounds of the 'nargs' array.
Found with the help of Smatch:
net/socket.c:2508 __do_sys_socketcall() warn: potential spectre issue
'nargs' [r] (local cap)
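A minimal sketch of the mitigation pattern; the surrounding bounds check is
reconstructed from the syscall's known behaviour, not quoted from the patch:
  if (call < 1 || call > SYS_SENDMMSG)
          return -EINVAL;
  call = array_index_nospec(call, SYS_SENDMMSG + 1);
  len = nargs[call];      /* index can no longer be speculated out of bounds */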
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72c05f32f4 upstream.
ems_usb_probe() allocates memory for dev->tx_msg_buffer, but it is
never deallocated in ems_usb_disconnect().
Found by Linux Driver Verification project (linuxtesting.org).
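A minimal sketch of the missing cleanup; its exact placement in the disconnect
path is an assumption:
  kfree(dev->tx_msg_buffer);      /* release what ems_usb_probe() allocated */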
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71755ee535 upstream.
The squashfs fragment reading code doesn't actually verify that the
fragment is inside the fragment table. The end result _is_ verified to
be inside the image when actually reading the fragment data, but before
that is done, we may end up taking a page fault because the fragment
table itself might not even exist.
Another report from Anatoly and his endless squashfs image fuzzing.
Reported-by: Анатолий Тросиненко <anatoly.trosinenko@gmail.com>
Acked-by: Phillip Lougher <phillip.lougher@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d512584780 upstream.
Anatoly reports another squashfs fuzzing issue, where the decompression
parameters themselves are in a compressed block.
This causes squashfs_read_data() to be called in order to read the
decompression options before the decompression stream has been set
up, making squashfs go sideways.
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Acked-by: Phillip Lougher <phillip.lougher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8e1d162d8e ]
After introduction of the cited commit, mlx5e_build_nic_params
receives the netdevice mtu in order to set the sw_mtu of mlx5e_params.
For enhanced IPoIB, the netdevice mtu is not set at this stage;
therefore, the initial sw_mtu equals zero. As a result, the hw_mtu
of the receive queue will be calculated incorrectly causing traffic
issues.
To fix this issue, query for port mtu before building the nic params.
Fixes: 472a1e44b3 ("net/mlx5e: Save MTU in channels params")
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e8e70d249 ]
The hairpin offload code depends on the trust mode being PCP.
Hence we should set PCP as the default for handling cases where we are
not allowed to read the trust mode from the FW, or fail to initialize it.
Fixes: 106be53b6b ('net/mlx5e: Set per priority hairpin pairs')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5f5991f36d ]
Execute mlx5_eswitch_init() only if we have MLX5_ESWITCH_MANAGER
capabilities.
Do the same for mlx5_eswitch_cleanup().
Fixes: a9f7705ffd ("net/mlx5: Unify vport manager capability check")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c01f6c9b32 ]
The code here just checks that the user call ID isn't already in use, and
hence should compare user_call_ID with xcall->user_call_ID, which is the
current node's user_call_ID.
Fixes: 540b1c48c3 ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg")
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc5b6c0b62 ]
'protocol' is a user-controlled value, so sanitize it after the bounds
check to avoid using it for speculative out-of-bounds access to arrays
indexed by it.
This addresses the following accesses detected with the help of smatch:
* net/netlink/af_netlink.c:654 __netlink_create() warn: potential
spectre issue 'nlk_cb_mutex_keys' [w]
* net/netlink/af_netlink.c:654 __netlink_create() warn: potential
spectre issue 'nlk_cb_mutex_key_strings' [w]
* net/netlink/af_netlink.c:685 netlink_create() warn: potential spectre
issue 'nl_table' [w] (local cap)
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Jeremy Cline <jcline@redhat.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a94c689e6c ]
If a DSA slave network device was previously disabled, there is no need
to suspend or resume it.
Fixes: 2446254915 ("net: dsa: allow switch drivers to implement suspend/resume hooks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4672694bd4 ]
ip_frag_queue() might call pskb_pull() on one skb that
is already in the fragment queue.
We need to take care of possible truesize change, or we
might have an imbalance of the netns frags memory usage.
IPv6 is immune to this bug, because RFC5722, Section 4,
amended by Errata ID 3089 states :
When reassembling an IPv6 datagram, if
one or more its constituent fragments is determined to be an
overlapping fragment, the entire datagram (and any constituent
fragments) MUST be silently discarded.
Fixes: 158f323b98 ("net: adjust skb->truesize in pskb_expand_head()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 56e2c94f05 ]
We currently check the frags memory usage only when
a new frag queue is created. This allows attackers to first
consume the memory budget (default: 4 MB) by creating thousands
of frag queues, then send tiny skbs to exceed the high_thresh
limit by 2 to 3 orders of magnitude.
Note that before commit 648700f76b ("inet: frags: use rhashtables
for reassembly units"), work queue could be starved under DOS,
getting no cpu cycles.
After commit 648700f76b, only the per frag queue timer can eventually
remove an incomplete frag queue and its skbs.
Fixes: b13d3cbfb8 ("inet: frag: move eviction of queues to work queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jann Horn <jannh@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Oskolkov <posk@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 460a53106a upstream.
A previous change in the AML parser code blindly set all non-successful
dispatcher statuses to AE_OK. That approach is incorrect, though,
because successful control method invocations from module-level code
return AE_CTRL_TRANSFER. Overwriting this status with AE_OK causes the
AML parser to think that there was no return value from the control
method invocation.
Fixes: 92c0f4af386 (ACPICA: AML Parser: ignore dispatcher error status during table load)
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12864ff854 upstream.
Commit a09c591306 (ACPI / LPSS: Avoid PM quirks on suspend and
resume from S3) modified the ACPI driver for Intel SoCs (LPSS) to
avoid applying PM quirks on suspend and resume from S3 to address
system-wide suspend and resume problems on some systems, but it is
reported that the same issue also affects hibernation, so extend
the approach used by that commit to cover hibernation as well.
Fixes: a09c591306 (ACPI / LPSS: Avoid PM quirks on suspend and resume from S3)
Link: https://bugs.launchpad.net/bugs/1774950
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 15ecbe94a4 ]
Larry Brakmo's proposal ( https://patchwork.ozlabs.org/patch/935233/
tcp: force cwnd at least 2 in tcp_cwnd_reduction) made us rethink
our recent patch removing ~16 quick acks after ECN events.
tcp_enter_quickack_mode(sk, 1) makes sure one immediate ack is sent,
but in case the sender's cwnd was lowered to 1, we do not want
to have a delayed ack for the next packet we will receive.
Fixes: 522040ea5f ("tcp: do not aggressively quick ack after ECN events")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Neal Cardwell <ncardwell@google.com>
Cc: Lawrence Brakmo <brakmo@fb.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 522040ea5f ]
ECN signals currently force TCP to enter quickack mode for
up to 16 (TCP_MAX_QUICKACKS) following incoming packets.
We believe this is not needed, and only sending one immediate ack
for the current packet should be enough.
This should reduce the extra load noticed in DCTCP environments,
after congestion events.
This is part 2 of our effort to reduce pure ACK packets.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9a9c9b51e5 ]
We want to add finer control of the number of ACK packets sent after
ECN events.
This patch is not changing current behavior, it only enables following
change.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a3893637e1 ]
As explained in commit 9f9843a751 ("tcp: properly handle stretch
acks in slow start"), TCP stacks have to consider how many packets
are acknowledged in one single ACK, because of GRO, but also
because of ACK compression or losses.
We plan to add SACK compression in the following patch; we
must therefore not call tcp_enter_quickack_mode().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 61f4b23769 ]
On i386, nlk->ngroups might be 32 or 0, which leads to undefined
behaviour, resulting in a hang during boot.
Check for 0 ngroups and use (unsigned long long) as the type to shift.
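A minimal sketch matching that description; the surrounding variable handling
is an assumption:
  if (nlk->ngroups == 0)
          groups = 0;
  else if (nlk->ngroups < 64)
          groups &= (1ULL << nlk->ngroups) - 1;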
Fixes: 7acf9d4237 ("netlink: Do not subscribe to non-existent groups").
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7effaf06c3 ]
Fix dev_change_tx_queue_len so it rolls back to the original value
upon a failure in dev_qdisc_change_tx_queue_len.
This is already done for notifiers' failures; share the code.
In case of a failure in dev_qdisc_change_tx_queue_len, some tx queues
would still be of the new length, while they should be reverted.
Currently, the revert is not done and is marked with a TODO label
in dev_qdisc_change_tx_queue_len; a nice solution for it still needs
to be found.
Yet it is still better to not apply the newly requested value.
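A minimal sketch of the shared rollback, reconstructed from the description
rather than the diff:
  unsigned int orig_len = dev->tx_queue_len;
  int res;

  dev->tx_queue_len = new_len;
  res = notifier_to_errno(call_netdevice_notifiers(NETDEV_CHANGE_TX_QUEUE_LEN, dev));
  if (!res)
          res = dev_qdisc_change_tx_queue_len(dev);
  if (res)
          dev->tx_queue_len = orig_len;   /* one revert path for both failures */
  return res;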
Fixes: 48bfd55e7e ("net_sched: plug in qdisc ops change_tx_queue_len")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Reported-by: Ran Rozenstein <ranro@mellanox.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 942a656f1f ]
Break statements were missing for the Geneve case in
ndo_udp_tunnel_{add/del}, so raw MAC matchall
entries were not getting added.
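A minimal sketch of the control-flow fix; the work done per case is elided
into comments:
  switch (ti->type) {
  case UDP_TUNNEL_TYPE_VXLAN:
          /* program the VXLAN port */
          break;
  case UDP_TUNNEL_TYPE_GENEVE:
          /* program the Geneve port / raw MAC matchall entry */
          break;          /* this break was the missing one */
  default:
          return;
  }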
Fixes: c746fc0e8b2d ("cxgb4: add geneve offload support for T6")
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 822fb18a82 ]
When loading the module manually, after calling xenbus_switch_state to
initialize the state of the netfront device, the driver state did not change
quickly enough, which may lead to no dev being created in the latest kernel.
This patch adds a wait to make sure xenbus knows the driver is not in the
closed/unknown state.
Current state:
[vm]# ethtool eth0
Settings for eth0:
Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
Cannot get device settings: No such device
Cannot get wake-on-lan settings: No such device
Cannot get message level: No such device
Cannot get link status: No such device
No data available
With the patch installed:
[vm]# ethtool eth0
Settings for eth0:
Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
Link detected: yes
Signed-off-by: Xiao Liang <xiliang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ecbc42ca5d ]
When received packets are dropped in the virtio_net driver, the received
packets counter is incremented but the bytes counter is not.
As a result, for instance if we drop all packets via XDP, only received
packets are counted and bytes stays 0, which looks inconsistent.
IMHO received packets/bytes should be counted if packets are produced by
the hypervisor, like what common NICs on physical machines are doing.
So fix the bytes counter.
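A minimal sketch of the accounting change; the stats structure and field names
are illustrative, not the driver's exact ones:
  u64_stats_update_begin(&stats->syncp);
  stats->packets++;
  stats->bytes += len;    /* previously only the packet counter was bumped on drops */
  u64_stats_update_end(&stats->syncp);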
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 383d470936 ]
For some very small BDPs (with just a few packets) there was a
quantization effect where the target number of packets in flight
during the super-unity-gain (1.25x) phase of gain cycling was
implicitly truncated to a number of packets no larger than the normal
unity-gain (1.0x) phase of gain cycling. This meant that in multi-flow
scenarios some flows could get stuck with a lower bandwidth, because
they did not push enough packets inflight to discover that there was
more bandwidth available. This was really only an issue in multi-flow
LAN scenarios, where RTTs and BDPs are low enough for this to be an
issue.
This fix ensures that gain cycling can raise inflight for small BDPs
by ensuring that in PROBE_BW mode target inflight values with a
super-unity gain are always greater than inflight values with a gain
<= 1. Importantly, this applies whether the inflight value is
calculated for use as a cwnd value, or as a target inflight value for
the end of the super-unity phase in bbr_is_next_cycle_phase() (both
need to be bigger to ensure we can probe with more packets in flight
reliably).
This is a candidate fix for stable releases.
Fixes: 0f8782ea14 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9939a46d90 ]
As of today the STMMAC_ALIGN macro (which is used to align DMA buffers)
relies on the L1 cache line length (L1_CACHE_BYTES).
This isn't correct on systems with several cache levels, which
might have an L1 cache line length smaller than the L2 line. This
can lead to one cache line being shared between a DMA buffer and other
data, so we can lose this data when invalidating the DMA buffer before
the DMA transaction.
Fix that by using SMP_CACHE_BYTES instead of L1_CACHE_BYTES for
alignment.
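A minimal sketch of the macro change; ALIGN() is used for illustration, the
driver may spell it with an equivalent helper:
  #define STMMAC_ALIGN(x)	ALIGN(x, SMP_CACHE_BYTES)	/* previously aligned to L1_CACHE_BYTES */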
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b0753408aa ]
mdio_mux_iproc_probe() uses platform_set_drvdata() to store the md pointer
in the device, whereas mdio_mux_iproc_remove() retrieves the md pointer
via dev_get_platdata(&pdev->dev). This leads to the wrong resources being
released. The patch replaces the getter with platform_get_drvdata.
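A minimal sketch of the corrected getter; the structure name is illustrative:
  struct iproc_mdiomux_desc *md = platform_get_drvdata(pdev);	/* pairs with probe's platform_set_drvdata() */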
Fixes: 98bc865a1e ("net: mdio-mux: Add MDIO mux driver for iProc SoCs")
Signed-off-by: Anton Vasilyev <vasilyev@ispras.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 136f55f660 ]
As long as the bh tasklet hasn't been scheduled at least once, no packet from
the rx path will be handled. Since the tx path also schedules the same
tasklet, this situation only persists until the first packet transmission.
So fix this issue by scheduling the tasklet after link reset.
Link: https://github.com/raspberrypi/linux/issues/2617
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet")
Suggested-by: Floris Bos <bos@je-eigen-domein.nl>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7f3fc7ddf7 ]
If we enable or disable xgbe flow control via ethtool,
it doesn't work, because the parameter is not properly
assigned, so we need to adjust the assignment order
of the parameters.
Fixes: c1ce2f7736 ("amd-xgbe: Fix flow control setting logic")
Signed-off-by: tangpengpeng <tangpengpeng@higon.com>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 101f0cd4f2 ]
UBSAN triggers the following undefined behaviour warnings:
[...]
[ 13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22
[ 13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
[ 13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4
[ 13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
When splitting the address to high and low, GENMASK_ULL is used to generate
a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and
ena_com_add_single_rx_desc).
The problem is that dma_addr_bits is not initialized with a proper value
(besides being cleared in ena_com_create_io_queue).
Assign dma_addr_bits the correct value that is stored in ena_dev when
initializing the SQ.
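A minimal sketch of the assignment described above, using the names from the
text; where it lands in the queue-creation path is an assumption:
  io_sq->dma_addr_bits = ena_dev->dma_addr_bits;	/* set before any GENMASK_ULL() use */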
Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Gal Pressman <pressmangal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9fc12023d6 ]
Remove the BUG_ON() from the fib_compute_spec_dst routine and check
the in_dev pointer during flowi4 data structure initialization.
The fib_compute_spec_dst routine can run concurrently with device removal,
where the ip_ptr net_device pointer is set to NULL. This can happen
if userspace enables pkt info on a UDP rx socket and the device
is removed while traffic is flowing.
Fixes: 35ebf65e85 ("ipv4: Create and use fib_compute_spec_dst() helper")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eee1fe6476 upstream.
When a port is brought up/down do not enable/disable only the TXMAC
but the RXMAC as well. This is essential for the CPU port to work.
Fixes: 6b93fb4648 ("net-next: dsa: add new driver for qca8xxx family")
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79a4ed4f0f upstream.
By default autonegotiation is enabled to configure MAC on all ports.
For the CPU port autonegotiation can not be used so we need to set
some sensible defaults manually.
This patch forces the default setting of the CPU port to 1000Mbps/full
duplex which is the chip maximum capability.
Also correct size of the bit field used to configure link speed.
Fixes: 6b93fb4648 ("net-next: dsa: add new driver for qca8xxx family")
Signed-off-by: Michal Vokáč <michal.vokac@ysoft.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 940efcc888 upstream.
Flows can be created on UD and RAW_PACKET QP types. Attempts to provide
other QP types as an input causes to various unpredictable failures.
The reason is that in order to support all various types (e.g. XRC), we
are supposed to use real_qp handle and not qp handle and expect to
driver/FW to fail such (XRC) flows. The simpler and safer variant is to
ban all QP types except UD and RAW_PACKET, instead of relying on
driver/FW.
Cc: <stable@vger.kernel.org> # 3.11
Fixes: 436f2ad05a ("IB/core: Export ib_create/destroy_flow through uverbs")
Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bbfbf04c2d upstream.
If a GPIO chip is a part of a hierarchy IRQ domain, there is no
way to specify the trigger type when gpio(d)_to_irq() allocates an
interrupt on-the-fly.
Currently, uniphier_gpio_to_irq() sets IRQ_TYPE_NONE, but it causes
an error in the .alloc() hook of the parent domain.
(drivers/irq/irq-uniphier-aidet.c)
Even if we change irq-uniphier-aidet.c to accept the NONE type,
GIC complains about it since commit 83a86fbb5b ("irqchip/gic:
Loudly complain about the use of IRQ_TYPE_NONE").
Instead, use IRQ_TYPE_LEVEL_HIGH as a temporary value when an irq
is allocated. irq_set_irq_type() will override it when the irq is
really requested.
Fixes: dbe776c2ca ("gpio: uniphier: add UniPhier GPIO controller driver")
Reported-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 906402a44b upstream.
This fixes up the handling of fixed regulator polarity
inversion flags: while I remembered to fix it for the
undocumented "reg-fixed-voltage" I forgot about the
official "regulator-fixed" binding, there are two ways
to do a fixed regulator.
The error was noticed and fixed.
Fixes: a603a2b8d8 ("gpio: of: Add special quirk to parse regulator flags")
Cc: Mark Brown <broonie@kernel.org>
Cc: Thierry Reding <thierry.reding@gmail.com>
Reported-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5012284700 upstream.
Commit 8844618d8a: "ext4: only look at the bg_flags field if it is
valid" will complain if block group zero does not have the
EXT4_BG_INODE_ZEROED flag set. Unfortunately, this is not correct,
since a freshly created file system has this flag cleared. It gets set
almost immediately after the file system is mounted read-write --- but
the following somewhat unlikely sequence will end up triggering a
false positive report of a corrupted file system:
mkfs.ext4 /dev/vdc
mount -o ro /dev/vdc /vdc
mount -o remount,rw /dev/vdc
Instead, when initializing the inode table for block group zero, test
to make sure that itable_unused count is not too large, since that is
the case that will result in some or all of the reserved inodes
getting cleared.
This fixes the failures reported by Eric Whiteney when running
generic/230 and generic/231 in the nojournal test case.
Fixes: 8844618d8a ("ext4: only look at the bg_flags field if it is valid")
Reported-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d5a803c6a upstream.
With commit 044e6e3d74: "ext4: don't update checksum of new
initialized bitmaps" the buffer valid bit will get set without
actually setting up the checksum for the allocation bitmap, since the
checksum will get calculated once we actually allocate an inode or
block.
If we are doing this, then we need to (re-)check the verified bit
after we take the block group lock. Otherwise, we could race with
another process reading and verifying the bitmap, which would then
complain about the checksum being invalid.
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 362eca70b5 upstream.
The inline data code was updating the raw inode directly; this is
problematic since if metadata checksums are enabled,
ext4_mark_inode_dirty() must be called to update the inode's checksum.
In addition, the jbd2 layer requires that get_write_access() be called
before the metadata buffer is modified. Fix both of these problems.
https://bugzilla.kernel.org/show_bug.cgi?id=200443
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44de022c43 upstream.
Ext4_check_descriptors() was getting called before s_gdb_count was
initialized. So for file systems w/o the meta_bg feature, allocation
bitmaps could overlap the block group descriptors and ext4 wouldn't
notice.
For file systems with the meta_bg feature enabled, there was a
fencepost error which would cause the ext4_check_descriptors() to
incorrectly believe that the block allocation bitmap overlaps with the
block group descriptor blocks, and it would reject the mount.
Fix both of these problems.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01cfb7937a upstream.
Anatoly Trosinenko reports that a corrupted squashfs image can cause a
kernel oops. It turns out that squashfs can end up being confused about
negative fragment lengths.
The regular squashfs_read_data() does check for negative lengths, but
squashfs_read_metadata() did not, and the fragment size code just
blindly trusted the on-disk value. Fix both the fragment parsing and
the metadata reading code.
Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Phillip Lougher <phillip@squashfs.org.uk>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81e69df38e upstream.
Fedora has integrated the jitter entropy daemon to work around slow
boot problems, especially on VMs that don't support virtio-rng:
https://bugzilla.redhat.com/show_bug.cgi?id=1572944
It's understandable why they did this, but the Jitter entropy daemon
works fundamentally on the principle: "the CPU microarchitecture is
**so** complicated and we can't figure it out, so it *must* be
random". Yes, it uses statistical tests to "prove" it is secure, but
AES_ENCRYPT(NSA_KEY, COUNTER++) will also pass statistical tests with
flying colors.
So if RDRAND is available, mix it into entropy submitted from
userspace. It can't hurt, and if you believe the NSA has backdoored
RDRAND, then they probably have enough details about the Intel
microarchitecture that they can reverse engineer how the Jitter
entropy daemon affects the microarchitecture, and attack its output
stream. And if RDRAND is in fact an honest DRNG, it will immeasurably
improve on what the Jitter entropy daemon might produce.
This also provides some protection against someone who is able to read
or set the entropy seed file.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b16fd6305 upstream.
On Gen3, we can only do RXDMA once per transfer reliably. For that, we
must reset the device, then we can have RXDMA once. This patch
implements this. When there is no reset controller or the reset fails,
RXDMA will be blocked completely. Otherwise, it will be disabled after
the first RXDMA transfer. Based on a commit from the BSP by Hiromitsu
Yamasaki, yet completely refactored to handle multiple read messages
within one transfer.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d082dc1562 upstream.
The existing code to carve up the sg list expected an sg element per page,
which can be very incorrect with IOMMUs remapping multiple memory pages
to fewer bus addresses. To hit this error required a large io payload
(greater than 256k) and a system that maps on a per-page basis. It's
possible that large ios could get by fine if the system condensed the
sgl list into the first 64 elements.
This patch corrects the sg list handling by specifically walking the
sg list element by element and attempting to divide the transfer up
on a per-sg element boundary. While doing so, it still tries to keep
sequences under 256k, but will exceed that rule if a single sg element
is larger than 256k.
Fixes: 48fa362b6c ("nvmet-fc: simplify sg list handling")
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5151842b9d upstream.
After the bio has been updated to represent the remaining sectors, reset
bi_done so bio_rewind_iter() does not rewind further than it should.
This resolves a bio_integrity_process() failure on reads where the
original request was split.
Fixes: 63573e359d ("bio-integrity: Restore original iterator on verify stage")
Signed-off-by: Greg Edwards <gedwards@ddn.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b403ea2404 upstream.
If the last page of the bio is not "full", the length of the last
vector slot needs to be corrected. This slot has the index
(bio->bi_vcnt - 1), but only in bio->bi_io_vec. In the "bv" helper
array, which is shifted by the value of bio->bi_vcnt at function
invocation, the correct index is (nr_pages - 1).
v2: improved readability following suggestions from Ming Lei.
v3: followed a formatting suggestion from Christoph Hellwig.
Fixes: 2cefe4dbaa ("block: add bio_iov_iter_get_pages()")
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6e08e0995b ]
When an MMU notifier runs in memory reclaim context, it can deadlock
trying to take locks that are already held in the thread causing the
memory reclaim. The solution is to avoid memory reclaim while holding
locks that are taken in MMU notifiers.
This commit fixes kmalloc while holding rmn->lock by moving the call
outside the lock. The GFX MMU notifier also locks reservation objects.
I have no good solution for avoiding reclaim while holding reservation
objects. The HSA MMU notifier will not lock any reservation objects.
v2: Moved allocation outside lock instead of using GFP_NOIO
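A minimal sketch of the reordering; the lock type and surrounding code are
assumptions, the point is only that the allocation happens before the lock a
reclaim-context notifier might contend on:
  node = kmalloc(sizeof(*node), GFP_KERNEL);	/* may enter reclaim; must not hold the lock here */
  if (!node)
          return -ENOMEM;

  mutex_lock(&rmn->lock);
  /* insert 'node' into the tracked ranges */
  mutex_unlock(&rmn->lock);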
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e9cfeba6a ]
drm_atomic_helper_shutdown() needs to release the reference held by
plane->fb. Since commit 49d70aeaec ("drm/atomic-helper: Fix leak in
disable_all") we're doing that by calling drm_atomic_clean_old_fb() in
drm_atomic_helper_disable_all(). This also leaves plane->fb == NULL
afterwards. However, since drm_atomic_helper_disable_all() is also
used by the i915 gpu reset code
drm_atomic_helper_commit_duplicated_state() then has to undo the
damage and put the correct plane->fb pointers back in (and also
adjust the ref counts to match again as well).
That approach doesn't work so well for load detection as nothing
sets up the plane->old_fb pointers for us. This causes us to
leak an extra reference for each plane->fb when
drm_atomic_helper_commit_duplicated_state() calls
drm_atomic_clean_old_fb() after load detection.
To fix this let's call drm_atomic_clean_old_fb() only for
drm_atomic_helper_shutdown() as that's the only time we need to
actually drop the plane->fb references. In all the other cases
(load detection, gpu reset) we want to leave plane->fb alone.
v2: Don't inflict the clean_old_fbs bool to drivers (Daniel)
v3: Squash in the revert and rewrite the commit msg (Daniel)
Cc: martin.peres@free.fr
Cc: chris@chris-wilson.co.uk
Cc: Dave Airlie <airlied@gmail.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180322152313.6561-3-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> #pre-squash
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6e1386b2ee ]
sgtl5000 codec needs MCLK clock to be present so that it can
successfully read/write via I2C.
In the case of wandboard, MCLK is provided via
MX6QDL_PAD_GPIO_0__CCM_CLKO1 pad.
Move the MCLK pinctrl from hog group to the codec group, so that the
codec clock can be present prior to reading the codec ID.
This avoids the following error that happens from time to time:
[ 2.484443] sgtl5000 1-000a: Error reading chip id -6
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 933e1c4a66 ]
Clock gating parameter is a part of `dai_fmt`. It is supported by
`alsa-lib` when creating a topology binary file, but ignored by kernel
when loading this topology file.
After applying this commit, the clock gating parameter is not ignored any
more. This solution is backwards compatible. The existing behaviour is
not broken, because by default the parameter value is 0 and is ignored.
snd_soc_tplg_hw_config.clock_gated = 0 => no effect
snd_soc_tplg_hw_config.clock_gated = 1 => SND_SOC_DAIFMT_GATED
snd_soc_tplg_hw_config.clock_gated = 2 => SND_SOC_DAIFMT_CONT
For example, the following config, based on
alsa-lib/src/conf/topology/broadwell/broadwell.conf, is now supported:
~~~~
SectionHWConfig."CodecHWConfig" {
id "1"
format "I2S" # physical audio format.
pm_gate_clocks "true" # clock can be gated
}
SectionLink."Codec" {
# used for binding to the physical link
id "0"
hw_configs [
"CodecHWConfig"
]
default_hw_conf_id "1"
}
~~~~
Signed-off-by: Kirill Marinushkin <k.marinushkin@gmail.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Jaroslav Kysela <perex@perex.cz>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Mark Brown <broonie@kernel.org>
Cc: Pan Xiuli <xiuli.pan@linux.intel.com>
Cc: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Cc: linux-kernel@vger.kernel.org
Cc: alsa-devel@alsa-project.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9a00b697ce ]
When the interface is down, the head/tail of the descriptor
ring address is set to 0 in netsec_netdev_stop().
But the netsec hardware still keeps the previous descriptor
ring address, so there is an inconsistency between the driver
and the hardware after the interface is brought up at a later time.
To address this inconsistency, add netsec_reset_hardware()
when the interface is down.
In addition, to minimize the reset process,
add a flag to decide whether the driver loads the netsec microcode.
Even if the driver resets the netsec hardware, the netsec microcode
stays resident in RAM, so it is OK to only load the microcode
at initialization.
This patch is critical for installation over the network.
Signed-off-by: Masahisa KOJIMA <masahisa.kojima@linaro.org>
Fixes: 533dd11a12 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b66d448487 ]
drivers/media/cec/cec-pin-error-inj.c:231
cec_pin_error_inj_parse_line() error: uninitialized symbol 'pos'.
The tx-add-bytes command didn't check for the presence of an argument, and
also didn't check that it was > 0.
This should fix this error.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ab105deb6 ]
When in the ASPM L1.0 state (but not the PCI-PM L1.0 state), the most
recent LTR value and the LTR_L1.2_THRESHOLD determines whether the link
enters the L1.2 substate.
If we don't have LTR enabled, prevent the use of ASPM L1.2.
PCI-PM L1.2 may still be used because it doesn't depend on
LTR_L1.2_THRESHOLD (see PCIe r4.0, sec 5.5.1).
Tested-by: Srinath Mannam <srinath.mannam@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 465891fe92 ]
The SISLite specification originally defined the context control register with
a single field of bits to represent the LISN and also stipulated that the
register reset value be 0. The cxlflash driver took advantage of this when
programming the LISN for the master contexts via an unconditional write - no
other bits were preserved.
When unmap support was added, SISLite was updated to define bit 0 of the
context control register as a way for the AFU to notify the context owner that
unmap operations were supported. Thus the assumptions under which the register
is setup changed and the existing unconditional write is clobbering the unmap
state for master contexts. This is presently not an issue due to the order in
which the context control register is programmed in relation to the unmap bit
being queried but should be addressed to avoid a future regression in the
event this code is moved elsewhere.
To remedy this issue, preserve the bits when programming the LISN field in the
context control register. Since the LISN will now be programmed using a read
value, assert that the initial state of the LISN field is as described in
SISLite (0).
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a3feb6ef50 ]
The following Oops can be encountered if a device removal or system shutdown
is initiated while an EEH recovery is in process:
[c000000ff2f479c0] c008000015256f18 cxlflash_pci_slot_reset+0xa0/0x100
[cxlflash]
[c000000ff2f47a30] c00800000dae22e0 cxl_pci_slot_reset+0x168/0x290 [cxl]
[c000000ff2f47ae0] c00000000003ef1c eeh_report_reset+0xec/0x170
[c000000ff2f47b20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff2f47bb0] c00000000003f80c eeh_handle_normal_event+0x56c/0x580
[c000000ff2f47c60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff2f47d10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff2f47dc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff2f47e30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4
The remove handler frees AFU memory while the EEH recovery is in progress,
leading to a race condition. This can result in a crash if the recovery thread
tries to access this memory.
To resolve this issue, the cxlflash remove handler will evaluate the device
state and yield to any active reset or probing threads.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3239b8cd28 ]
Hardware could time out Fastpath IOs one second earlier than the timeout
provided by the host.
For non-RAID devices, the driver derives its timeout value from the
OS-provided timeout. Under certain scenarios, if the OS provides a timeout
value of 1 second, due to the above behavior the hardware will time out
immediately.
Increase timeout value for non-RAID fastpath IOs by 1 second.
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c848966806 ]
commit f2593cb1b2 ("ath10k: Search SMBIOS for OEM board file
extension") added a feature to ath10k that allows Board Data File
(BDF) conflicts between multiple devices that use the same device IDs
but have different calibration requirements to be resolved by allowing
a "variant" string to be stored in SMBIOS [and later device tree, added
by commit d06f26c5c8 ("ath10k: search DT for qcom,ath10k-calibration-
variant")] that gets appended to the ID stored in board-2.bin.
This original patch had a regression, however. Namely that devices with
a variant present in SMBIOS that didn't need custom BDFs could no longer
find the default BDF, which has no variant appended. The patch was
reverted and re-applied with a fix for this issue in commit 1657b8f84e
("search SMBIOS for OEM board file extension").
But the fix to fall back to a default BDF introduced another issue: the
driver currently parses IEs in board-2.bin one by one, and for each one
it first checks to see if it matches the ID with the variant appended.
If it doesn't, it checks to see if it matches the "fallback" ID with no
variant. If a matching BDF is found at any point during this search, the
search is terminated and that BDF is used. The issue is that it's very
possible (and is currently the case for board-2.bin files present in the
ath10k-firmware repository) for the default BDF to occur in an earlier
IE than the variant-specific BDF. In this case, the current code will
happily choose the default BDF even though a better-matching BDF is
present later in the file.
This patch fixes the issue by first searching the entire file for the ID
with variant, and searching for the fallback ID only if that search
fails. It also includes some code cleanup in the area, as
ath10k_core_fetch_board_data_api_n() no longer does its own string
mangling to remove the variant from an ID, instead leaving that job to a
new flag passed to ath10k_core_create_board_name().
I've tested this patch on a QCA4019 and verified that the driver behaves
correctly for 1) both fallback and variant BDFs present, 2) only fallback
BDF present, and 3) no matching BDFs present.
Fixes: 1657b8f84e ("ath10k: search SMBIOS for OEM board file extension")
Signed-off-by: Thomas Hebb <tommyhebb@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 02f3703934 ]
In of_get_regulation_constraints() we were taking the result of
of_map_mode() (an unsigned int) and assigning it to an int. We were
then checking whether this value was -EINVAL. Some implementers of
of_map_mode() were returning -EINVAL (even though the return type of
their function needed to be unsigned int) because they needed to
signal an error back to of_get_regulation_constraints().
In general in the regulator framework the mode is always referred to
as an unsigned int. While we could fix this to be a signed int (the
highest value we store in there right now is 0x8), it's actually
pretty clean to just define the regulator mode 0x0 (the lack of any
bits set) as an invalid mode. Let's do that.
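A hedged sketch of what an of_map_mode() implementation looks like under this convention (the mapped values in the switch are illustrative):
#include <linux/regulator/consumer.h>

static unsigned int example_of_map_mode(unsigned int mode)
{
	switch (mode) {
	case 1:
		return REGULATOR_MODE_NORMAL;
	case 2:
		return REGULATOR_MODE_IDLE;
	default:
		/* 0x0 (no mode bits set) now signals "invalid" instead of -EINVAL */
		return 0;
	}
}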
Fixes: 5e5e3a42c6 ("regulator: of: Add support for parsing initial and suspend modes")
Suggested-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b7e1e6859f ]
The OMAP3 ISP driver manages its MMU mappings through the IOMMU-aware
ARM DMA backend. The current code creates a dma_iommu_mapping and
attaches this to the ISP device, but never detaches the mapping in
either the probe failure paths or the driver remove path resulting
in an unbalanced mapping refcount and a memory leak. Fix this properly.
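A hedged sketch of the kind of teardown that balances the attach (function name is illustrative; the real probe-failure and remove paths differ in detail):
#include <linux/device.h>
#include <asm/dma-iommu.h>

static void example_isp_detach_iommu(struct device *dev,
				     struct dma_iommu_mapping *mapping)
{
	/* undo arm_iommu_attach_device() and drop the mapping refcount */
	arm_iommu_detach_device(dev);
	arm_iommu_release_mapping(mapping);
}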
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Suman Anna <s-anna@ti.com>
Tested-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c421c62a4a ]
The mce keyboard repeats pressed keys every 100ms. If the IR timeout
is set to less than that, we send key up events before the repeat
arrives, so we have key up/key down for each IR repeat.
The keyboard ends any sequence with a 0 scancode, in which case all keys
are cleared so there is no need to run the timeout timer: it only exists
for the case that the final 0 was not received.
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2a70b7879b ]
In some places, we still used get_seconds() instead of
ktime_get_real_seconds(), and I'm changing the remaining ones now to
all use ktime_get_real_seconds() so we use the full available range for
timestamps instead of overflowing the 'unsigned long' return value in
year 2106 on 32-bit kernels.
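The conversion is mechanical; a minimal sketch:
#include <linux/timekeeping.h>

static time64_t example_timestamp(void)
{
	/* was: get_seconds(), an unsigned long that wraps in 2106 on 32-bit */
	return ktime_get_real_seconds();	/* full 64-bit seconds */
}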
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 31545df391 ]
In crypto_authenc_esn_setkey we save pointers to the authenc keys
in a local variable of type struct crypto_authenc_keys and we don't
zeroize it after use. Fix this and don't leak pointers to the
authenc keys.
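A sketch of the pattern (error handling abbreviated; the actual setkey details are elided):
#include <crypto/authenc.h>
#include <linux/string.h>

static int example_setkey(const u8 *key, unsigned int keylen)
{
	struct crypto_authenc_keys keys;
	int err;

	err = crypto_authenc_extractkeys(&keys, key, keylen);
	if (err)
		return err;

	/* ... program the auth and enc keys ... */

	/* don't leave pointers to the key material on the stack */
	memzero_explicit(&keys, sizeof(keys));
	return 0;
}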
Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5d111f5190 ]
wait_for_connected() waits until a port changes status to
USB_PORT_STAT_CONNECTION, but this is not possible if
the port is unpowered. The loop will only exit at timeout.
Such a case takes place if an over-current incident happens
while the system is in S3. Then during resume wait_for_connected()
will wait 2s, which may be noticeable by the user.
Signed-off-by: Dominik Bozek <dominikx.bozek@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ece97f3a5f ]
simpleImage generation has been broken for some time. This patch fixes the
steps by which the simpleImage.*.ub file is generated: objdump of
vmlinux, followed by creation of the .ub file.
Also make sure that there is a stripped ELF version with the .strip suffix.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 050f810e23 ]
Platform device core assumes the ownership of dev.platform_data as
well as that it is dynamically allocated, and it will try to kfree it
as part of platform_device_release(). Change the code to use
platform_device_add_data() instead of a pointer to static memory
to avoid causing a BUG() when calling platform_device_put().
The problem can be reproduced by artificially enabling the error path
of platform_device_add() call (around line 357).
Note that this change also allows us to constify imx7_pgc_domains,
since we no longer need to be able to modify it.
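A hedged sketch of the pattern (the device name and helper are illustrative):
#include <linux/platform_device.h>

static int example_add_pgc_device(const void *domain_data, size_t size, int id)
{
	struct platform_device *pdev;
	int ret;

	pdev = platform_device_alloc("imx7-pgc-domain", id);
	if (!pdev)
		return -ENOMEM;

	/* the core copies the data and owns the copy, so a const/static source is fine */
	ret = platform_device_add_data(pdev, domain_data, size);
	if (ret) {
		platform_device_put(pdev);
		return ret;
	}

	return platform_device_add(pdev);
}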
Cc: Stefan Agner <stefan@agner.ch>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c1c734cb1f ]
As part of bringup I ended up wanting to call an earlycon driver by a
name that was exactly 16-bytes big, specifically "qcom_geni_serial".
Unfortunately, when I tried this I found that things compiled just
fine. They just didn't work.
Specifically the compiler felt perfectly justified in initting the
".name" field of "struct earlycon_id" with the full 16-bytes and just
skipping the '\0'. Needless to say, that behavior didn't seem ideal,
but I guess someone must have allowed it for a reason.
One way to fix this is to shorten the name field to 15 bytes and then
add an extra byte after that nobody touches. This should always be
initted to 0 and we're golden.
There are, of course, other ways to fix this too. We could audit all
the users of the "name" field and make them stop at both null
termination or at 16 bytes. We could also just make the name field
much bigger so that we're not likely to run into this. ...but both
seem like we'll just hit the bug again.
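A sketch of the layout change described above (remaining members elided):
struct example_earlycon_id {
	char	name[15];
	char	name_term;	/* nobody touches this; it stays 0 and terminates name */
	/* ... compat string, setup() callback, ... */
};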
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b004b21cc6 ]
Sergey reported that some much older Dell systems don't support
the OEM string "Dell System" but instead report www.dell.com
in the OEM strings.
Match both of these to indicate that this driver is running on
a Dell system.
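One way to express that check, as a hedged sketch (the driver's actual DMI-table based matching may differ):
#include <linux/dmi.h>

static bool example_is_dell_system(void)
{
	return dmi_find_device(DMI_DEV_TYPE_OEM_STRING, "Dell System", NULL) ||
	       dmi_find_device(DMI_DEV_TYPE_OEM_STRING, "www.dell.com", NULL);
}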
Reported-by: Sergey Kubushyn <ksi@koi8.net>
Tested-by: Sergey Kubushyn <ksi@koi8.net>
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
[dvhart: Simplify DMI logic and eliminate unnecessary variables]
Signed-off-by: Darren Hart (VMware) <dvhart@infradead.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 300eec7c0a ]
ic_nameservers, which stores the list of name servers discovered by
ipconfig, is initialised (i.e. has all of its elements set to NONE, or
0xffffffff) by ic_nameservers_predef() in the following scenarios:
- before the "ip=" and "nfsaddrs=" kernel command line parameters are
parsed (in ip_auto_config_setup());
- before autoconfiguring via DHCP or BOOTP (in ic_bootp_init()), in
order to clear any values that may have been set after parsing "ip="
or "nfsaddrs=" and are no longer needed.
This means that ic_nameservers_predef() is not called when neither "ip="
nor "nfsaddrs=" is specified on the kernel command line. In this
scenario, every element in ic_nameservers remains set to 0x00000000,
which is indistinguishable from ANY and causes pnp_seq_show() to write
the following (bogus) information to /proc/net/pnp:
#MANUAL
nameserver 0.0.0.0
nameserver 0.0.0.0
nameserver 0.0.0.0
This is potentially problematic for systems that blindly link
/etc/resolv.conf to /proc/net/pnp.
Ensure that ic_nameservers is also initialised when neither "ip=" nor
"nfsaddrs=" are specified by calling ic_nameservers_predef() in
ip_auto_config(), but only when ip_auto_config_setup() was not called
earlier. This causes the following to be written to /proc/net/pnp, and
is consistent with what gets written when ipconfig is configured
manually but no name servers are specified on the kernel command line:
#MANUAL
Signed-off-by: Chris Novakovic <chris@chrisn.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4dc93fcf0b ]
On the RAH registers there are semantic differences in the meaning of
the "queue" parameter for traffic steering depending on the controller
model: there is the 82575 meaning, in which "queue" means an RX hardware
queue, and the i350 meaning, where it is a reception pool.
The previous behaviour had no effect for i210 based controllers
because the QSEL bit of the RAH register wasn't being set.
This patch separates the condition in discrete cases, so the different
handling is clearer.
Fixes: 83c21335c8 ("igb: improve MAC filter handling")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 572e6c8dd1 ]
The core should only call free on a component if said component has
already had open called on it. This is not presently the case and most
compressed drivers in the kernel assume it will be. This causes null
pointer dereferences in the drivers as they attempt clean up for stuff
that was never put in place.
This is fixed by aborting the open callbacks once a failure is
encountered and then, during clean up, only iterating through the
component list up to that point.
This is a fairly quick fix to the issue, to allow backporting. There
is more refactoring to follow to tidy the code up a little.
Fixes: 9e7e3738ab ("ASoC: snd_soc_component_driver has snd_compr_ops")
Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com>
Acked-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7c8b77f815 ]
Heiko Stübner justified pretty well the change in commit e330eb86ba
("ARM: multi_v7_defconfig: enable Rockchip io-domain driver"). This
change is also needed for arm64 rockchip boards, so, do the same for arm64.
The io-domain driver is necessary to notify the soc about voltages
changes happening on supplying regulators. Probably the most important
user right now is the mmc tuning code, where the soc needs to get
notified when the voltage is dropped to the 1.8V point.
As this option is necessary to successfully tune UHS cards etc., it
should get built in. Otherwise, tuning will fail with:
dwmmc_rockchip fe320000.dwmmc: All phases bad!
mmc0: tuning execution failed: -5
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 671f8204b1 ]
Convert the sisr and sisr2 variable types to u32 to avoid the following
sparse warnings:
sound/soc/fsl/fsl_ssi.c:391:42: warning: incorrect type in argument 3 (different base types)
sound/soc/fsl/fsl_ssi.c:391:42: expected unsigned int *val
sound/soc/fsl/fsl_ssi.c:391:42: got restricted __be32 *<noident>
sound/soc/fsl/fsl_ssi.c:393:17: warning: restricted __be32 degrades to integer
sound/soc/fsl/fsl_ssi.c:393:15: warning: incorrect type in assignment (different base types)
sound/soc/fsl/fsl_ssi.c:393:15: expected restricted __be32 [usertype] sisr2
sound/soc/fsl/fsl_ssi.c:393:15: got unsigned int
sound/soc/fsl/fsl_ssi.c:396:50: warning: incorrect type in argument 3 (different base types)
sound/soc/fsl/fsl_ssi.c:396:50: expected unsigned int [unsigned] val
sound/soc/fsl/fsl_ssi.c:396:50: got restricted __be32 [usertype] sisr2
sound/soc/fsl/fsl_ssi.c:398:42: warning: incorrect type in argument 2 (different base types)
sound/soc/fsl/fsl_ssi.c:398:42: expected unsigned int [unsigned] [usertype] sisr
sound/soc/fsl/fsl_ssi.c:398:42: got restricted __be32 [addressable] [usertype] sisr
In other places where regmap_read() is used a u32 variable is passed
to store the register read value, so do the same here as well.
regmap API already takes care of endianness, so the usage of u32 is safe.
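A minimal sketch of the pattern (the register offset and mask are illustrative placeholders, not the SSI definitions):
#include <linux/interrupt.h>
#include <linux/regmap.h>

#define EXAMPLE_REG_SISR	0x14	/* illustrative offset */
#define EXAMPLE_SISR_MASK	0x00ff	/* illustrative write-1-to-clear mask */

static irqreturn_t example_ssi_isr(struct regmap *regs)
{
	u32 sisr, sisr2;

	/* regmap_read() takes an unsigned int *, so plain u32 is the right type */
	regmap_read(regs, EXAMPLE_REG_SISR, &sisr);
	sisr2 = sisr & EXAMPLE_SISR_MASK;
	if (sisr2)
		regmap_write(regs, EXAMPLE_REG_SISR, sisr2);

	return IRQ_HANDLED;
}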
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1c74d5c0de ]
Currently we enable handling of interrupts specific to Tegra124+, which
happen to overlap with previous generations. Let's specify the interrupt
mask per SoC generation for consistency, and in preparation for squashing
the Tegra20 driver into the common one, which will enable handling of GART
faults that may be undesirable on newer generations.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bf3fbdfbec ]
The ISR reads the interrupts-enable mask, but doesn't utilize it. Apply the
mask to the interrupt status and don't handle interrupts that the MC driver
hasn't asked for. The kernel would disable the spurious MC IRQ and report the
error. This would happen only in the case of a very severe bug.
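A sketch of applying the enable mask in the ISR (register offsets are illustrative):
#include <linux/interrupt.h>
#include <linux/io.h>

#define EXAMPLE_MC_INTSTATUS	0x00	/* illustrative offsets */
#define EXAMPLE_MC_INTMASK	0x04

static irqreturn_t example_mc_isr(int irq, void *data)
{
	void __iomem *regs = data;
	u32 status = readl(regs + EXAMPLE_MC_INTSTATUS);
	u32 mask = readl(regs + EXAMPLE_MC_INTMASK);

	/* ignore status bits we never asked to be interrupted for */
	status &= mask;
	if (!status)
		return IRQ_NONE;

	/* ... handle, then ack ... */
	writel(status, regs + EXAMPLE_MC_INTSTATUS);
	return IRQ_HANDLED;
}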
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 02acc80d19 ]
try_to_wake_up() might invoke delayacct_blkio_end() while holding the
pi_lock (which is a raw_spinlock_t). delayacct_blkio_end() acquires
task_delay_info.lock which is a spinlock_t. This causes a might sleep splat
on -RT where non raw spinlocks are converted to 'sleeping' spinlocks.
task_delay_info.lock is only held for a short amount of time, so it's not a
problem latency-wise to convert it to a raw spinlock.
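The conversion is the usual spinlock_t to raw_spinlock_t swap; a sketch:
#include <linux/spinlock.h>

struct example_delay_info {
	raw_spinlock_t	lock;	/* was spinlock_t; only held briefly */
	u64		blkio_delay;
};

static void example_delay_end(struct example_delay_info *di, u64 delta)
{
	unsigned long flags;

	raw_spin_lock_irqsave(&di->lock, flags);
	di->blkio_delay += delta;
	raw_spin_unlock_irqrestore(&di->lock, flags);
}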
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Balbir Singh <bsingharora@gmail.com>
Link: https://lkml.kernel.org/r/20180423161024.6710-1-bigeasy@linutronix.de
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 830e0dd999 ]
When operating at 1GbE, the base incval for the PTP clock is so large
that multiplying it by numbers close to the max_adj can overflow the
u64.
Rather than attempting to limit the max_adj to a value small enough to
avoid overflow, instead calculate the incvalue adjustment based on the
40GbE incvalue, and then multiply that by the scaling factor for the
link speed.
This sacrifices a small amount of precision in the adjustment but we
avoid erratic behavior of the clock due to the overflow caused if ppb is
very near the maximum adjustment.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e5d05e18b ]
We need to introduce a new compatible name for the Meson-AXG SoC
in order to support the RMII 100M ethernet PHY, since the PRG_ETH0
register of the dwmac glue layer has changed from the previous SoCs.
Signed-off-by: Yixun Lan <yixun.lan@amlogic.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 03d9fbc397 ]
The Meson8m2 SoC is a variant of Meson8 with some updates from Meson8b
(such as the Gigabit capable DesignWare MAC).
It is mostly pin compatible with Meson8, only 10 (existing) CBUS pins
get an additional function (four of these are Ethernet RXD2, RXD3, TXD2
and TXD3 which are required when the board uses an RGMII PHY).
The AOBUS pins seem to be identical on Meson8 and Meson8m2.
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 486e666136 ]
The use of stack Variable Length Arrays needs to be avoided, as they
can be a vector for stack exhaustion, which can be both a runtime bug
(kernel Oops) or a security flaw (overwriting memory beyond the
stack). Also, in general, as code evolves it is easy to lose track of
how big a VLA can get. Thus, we can end up having runtime failures
that are hard to debug. This is part of the directive[1] to remove all VLAs
from the kernel and to build with -Wvla.
Currently driver is using a VLA declared using the number of descriptors. This
array is used to store integer values and is later used as an argument to
`gpiod_set_array_value_cansleep()` This can be avoided by using
`kmalloc_array()` to allocate memory for the array of integer values. The memory is
freed before returning from the function.
From the code it appears that it is safe to sleep, so we can use GFP_KERNEL
(based on the _cansleep() suffix of the function `gpiod_set_array_value_cansleep()`).
It can be expected that this patch will result in a small increase in overhead
due to the use of `kmalloc_array()`.
[1] https://lkml.org/lkml/2018/3/7/621
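A sketch of the replacement (the surrounding pwrseq code is elided; the gpiod call is left as a comment to avoid guessing its exact signature here):
#include <linux/slab.h>

static void example_set_gpios_value(unsigned int ngpios, int value)
{
	unsigned int i;
	int *values;

	/* was: int values[ngpios];  -- a VLA on the stack */
	values = kmalloc_array(ngpios, sizeof(*values), GFP_KERNEL);
	if (!values)
		return;

	for (i = 0; i < ngpios; i++)
		values[i] = value;

	/* ... hand 'values' to gpiod_set_array_value_cansleep() as before ... */

	kfree(values);
}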
Signed-off-by: Tobin C. Harding <me@tobin.cc>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8e142e9e62 ]
DECLARE_TLV_DB_SCALE (alias of SNDRV_CTL_TLVD_DECLARE_DB_SCALE) is used but
tlv.h is not included. This causes a build failure when the local macro is
defined by uncommenting it.
This commit fixes the bug. At the same time, the alias macro is replaced
with the destination macro added in commit 46e860f768 ("ALSA: rename
TLV-related macros so that they're friendly to user applications").
Reported-by: Connor McAdams <conmanx360@gmail.com>
Fixes: 44f0c9782c ('ALSA: hda/ca0132: Add tuning controls')
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6290182b2b ]
Since commit 9776d32537 ("net: Move call_fib_rule_notifiers up in
fib_nl_newrule") it is possible to forbid the installation of
unsupported FIB rules.
Have mlxsw return an error for non-default FIB rules in addition to the
existing extack message.
Example:
# ip rule add from 198.51.100.1 table 10
Error: mlxsw_spectrum: FIB rules not supported.
Note that offload is only aborted when non-default FIB rules are already
installed and merely replayed during module initialization.
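A hedged sketch of the shape of the check (the callback is simplified; fib_rule_matchall() is used here as the "is this a default rule" test and should be treated as an assumption):
#include <linux/netlink.h>
#include <net/fib_rules.h>

static int example_fib_rule_event(struct fib_rule *rule,
				  struct netlink_ext_ack *extack)
{
	if (!fib_rule_matchall(rule)) {
		NL_SET_ERR_MSG_MOD(extack, "FIB rules not supported");
		return -EOPNOTSUPP;	/* reject instead of only aborting offload */
	}
	return 0;
}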
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ec4ee3c9b ]
sdhci_omap_config_iodelay_pinctrl_state() requires caps and caps2 to be
initialized (speed mode capabilities like UHS/HS200) before it is
invoked. While mmc_of_parse() initializes caps/caps2 if capabilities is
populated in device tree, it will remain uninitialized for capabilities
obtained from SDHCI_CAPABILITIES register.
Fix sdhci_omap_config_iodelay_pinctrl_state() to be used even while
getting the capabilities from SDHCI_CAPABILITIES register by invoking
sdhci_setup_host() before sdhci_omap_config_iodelay_pinctrl_state().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 509f89652f ]
Commit be7fd3c3a8 ("media: em28xx: Hauppauge DualHD second tuner
functionality") removed the logic which sets the alternate for the DVB
device. Without setting the right alternate, the device won't be
able to submit URBs, and userspace fails with -EMSGSIZE:
ERROR DMX_SET_PES_FILTER failed (PID = 0x2000): 90 Message too long
Tested with Hauppauge HVR-950 model A1C0.
Fixes: be7fd3c3a8 ("media: em28xx: Hauppauge DualHD second tuner functionality")
Cc: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5b19d284f5 ]
pageout() in MM translates EAGAIN, so it calls handle_write_error()
-> mapping_set_error() -> set_bit(AS_EIO, ...).
file_write_and_wait_range() will then see the EIO error, which is critical
for reporting an atomic_write failure to the user via the return value of a
subsequent fsync().
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 08813e0ec1 ]
If device tree is not enabled, of_find_regulator_by_node() should have
a dummy function since the function call is still there.
This is to fix build error after CONFIG_NO_AUTO_INLINE is introduced.
If this option is enabled, GCC will not auto-inline functions that are
not explicitly marked as inline.
In this case (no CONFIG_OF), the compiler will report an error in function
regulator_dev_lookup().
W/O NO_AUTO_INLINE, the function of_get_regulator() is auto-inlined and then
the call to of_find_regulator_by_node() is optimized out, since
of_get_regulator() always returns NULL.
W/ NO_AUTO_INLINE, the return value of of_get_regulator() is a variable
so the call to of_find_regulator_by_node() cannot be optimized out. So
we need a stub of_find_regulator_by_node().
static struct regulator_dev *regulator_dev_lookup(struct device *dev,
						  const char *supply)
{
	struct regulator_dev *r = NULL;
	struct device_node *node;
	struct regulator_map *map;
	const char *devname = NULL;

	regulator_supply_alias(&dev, &supply);

	/* first do a dt based lookup */
	if (dev && dev->of_node) {
		node = of_get_regulator(dev, supply);
		if (node) {
			r = of_find_regulator_by_node(node);
			if (r)
				return r;
	...
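The fix amounts to providing a !CONFIG_OF stub along these lines (a sketch; the real declaration lives in the regulator core's internal header):
#include <linux/of.h>

struct regulator_dev;

#ifdef CONFIG_OF
struct regulator_dev *of_find_regulator_by_node(struct device_node *np);
#else
static inline struct regulator_dev *of_find_regulator_by_node(struct device_node *np)
{
	return NULL;
}
#endif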
Signed-off-by: Changbin Du <changbin.du@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit de7eab301d ]
As the recent swiotlb bug revealed, we seem to have given up on the direct
DMA allocation too early and fallen back to the swiotlb allocation. The reason
is that the swiotlb allocator expected that dma_direct_alloc() would try
harder to get pages even below the 64bit DMA mask with GFP_DMA32, but the
function doesn't do that; it only deals with the GFP_DMA case.
This patch adds a similar fallback reallocation with GFP_DMA32 as we've
done with GFP_DMA. The condition is that the coherent mask is smaller
than 64bit (i.e. some address limitation), and neither GFP_DMA nor
GFP_DMA32 is set beforehand.
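A simplified sketch of the fallback (not the literal dma_direct_alloc() code):
#include <linux/dma-mapping.h>
#include <linux/gfp.h>

static struct page *example_alloc_coherent_pages(struct device *dev,
						 gfp_t gfp, unsigned int order)
{
	struct page *page;

again:
	page = alloc_pages(gfp, order);
	if (!page && dev->coherent_dma_mask < DMA_BIT_MASK(64) &&
	    !(gfp & (GFP_DMA | GFP_DMA32))) {
		/* address-limited device: retry once from ZONE_DMA32 */
		gfp |= GFP_DMA32;
		goto again;
	}
	return page;
}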
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f4616ceeb ]
Similar to what we do when we remove a PCI function, set the
QEDF_UNLOADING flag to prevent any requests from being queued while a
vport is being deleted. This prevents any requests from getting stuck
in limbo when the vport is unloaded or deleted.
Fixes the crash:
PID: 106676 TASK: ffff9a436aa90000 CPU: 12 COMMAND: "multipathd"
#0 [ffff9a43567d3550] machine_kexec+522 at ffffffffaca60b2a
#1 [ffff9a43567d35b0] __crash_kexec+114 at ffffffffacb13512
#2 [ffff9a43567d3680] crash_kexec+48 at ffffffffacb13600
#3 [ffff9a43567d3698] oops_end+168 at ffffffffad117768
#4 [ffff9a43567d36c0] no_context+645 at ffffffffad106f52
#5 [ffff9a43567d3710] __bad_area_nosemaphore+116 at ffffffffad106fe9
#6 [ffff9a43567d3760] bad_area+70 at ffffffffad107379
#7 [ffff9a43567d3788] __do_page_fault+1247 at ffffffffad11a8cf
#8 [ffff9a43567d37f0] do_page_fault+53 at ffffffffad11a915
#9 [ffff9a43567d3820] page_fault+40 at ffffffffad116768
[exception RIP: qedf_init_task+61]
RIP: ffffffffc0e13c2d RSP: ffff9a43567d38d0 RFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffbe920472c738 RCX: ffff9a434fa0e3e8
RDX: ffff9a434f695280 RSI: ffffbe920472c738 RDI: ffff9a43aa359c80
RBP: ffff9a43567d3950 R8: 0000000000000c15 R9: ffff9a3fb09b9880
R10: ffff9a434fa0e3e8 R11: ffff9a43567d35ce R12: 0000000000000000
R13: ffff9a434f695280 R14: ffff9a43aa359c80 R15: ffff9a3fb9e005c0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
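The fix mirrors the PCI-removal path: flag the unload before tearing the vport down and bail out early in the I/O path. A sketch (QEDF_UNLOADING is the driver's flag, per the text above; the flags storage and completion details are simplified):
#include <scsi/scsi.h>
#include <scsi/scsi_cmnd.h>

/* vport delete / unload path: block new I/O first */
static void example_block_io(unsigned long *flags)
{
	set_bit(QEDF_UNLOADING, flags);
}

/* I/O submission path: refuse work once unloading has started */
static int example_queuecommand(struct scsi_cmnd *sc_cmd, unsigned long *flags)
{
	if (test_bit(QEDF_UNLOADING, flags)) {
		sc_cmd->result = DID_NO_CONNECT << 16;
		/* ... complete the command back to the SCSI midlayer ... */
		return 0;
	}
	/* ... normal submission ... */
	return 0;
}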
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 495ac33a3b ]
With a later commit an instance of the struct device will be added to
struct genpd and with that the size of the struct tegra_powergate will
be over 1024 bytes. That generates the following warning:
drivers/soc/tegra/pmc.c:579:1: warning: the frame size of 1200 bytes is larger than 1024 bytes [-Wframe-larger-than=]
Avoid such warnings by allocating the structure dynamically.
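A sketch of the change (a stand-in struct is used here; the real struct tegra_powergate is the driver's):
#include <linux/slab.h>

struct example_powergate {
	char data[1200];	/* stand-in for the ~1.2 KiB struct with genpd inside */
};

static int example_powergate_op(void)
{
	struct example_powergate *pg;

	/* was an on-stack variable, blowing past the frame-size limit */
	pg = kzalloc(sizeof(*pg), GFP_KERNEL);
	if (!pg)
		return -ENOMEM;

	/* ... set up and use *pg ... */

	kfree(pg);
	return 0;
}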
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9413532788 ]
As a unconstrained command, a command can be sent to SATA disk even if
SATA disk status is BUSY, ERR or DRQ.
If an ATA reset assert is successful but ATA reset de-assert fails, then
it will retry the reset de-assert. If reset de- assert retry is
successful, we think it is okay to probe the device but actually it
still has Err status.
Apparently we need to retry the ATA reset assertion and de- assertion
instead for this mentioned scenario.
As such, we config ATA reset assert as a constrained command, if ATA
reset de-assert fails, then ATA reset de-assert retry will also
fail. Then we will retry the proper process of ATA reset assert and
de-assert again.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9899e4d352 ]
In tw_chrdev_ioctl(), the length of the data buffer is firstly copied
from the userspace pointer 'argp' and saved to the kernel object
'data_buffer_length'. Then a security check is performed on it to make
sure that the length is not more than 'TW_MAX_IOCTL_SECTORS *
512'. Otherwise, an error code -EINVAL is returned. If the security
check is passed, the entire ioctl command is copied again from the
'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various
operations are performed on 'tw_ioctl' according to the 'cmd'. Given
that the 'argp' pointer resides in userspace, a malicious userspace
process can race to change the buffer length between the two
copies. This way, the user can bypass the security check and inject an
invalid data buffer length. This can cause potential security issues in
the subsequent execution.
This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to
avoid the above issues.
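The check itself is a one-liner in the char device open path; a sketch:
#include <linux/capability.h>
#include <linux/fs.h>

static int example_chrdev_open(struct inode *inode, struct file *file)
{
	/* restrict the racy ioctl interface to privileged users */
	if (!capable(CAP_SYS_ADMIN))
		return -EACCES;

	return 0;
}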
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9318a3e02 ]
In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from
the userspace pointer 'argp' and saved to the kernel object
'driver_command'. Then a security check is performed on the data buffer
size indicated by 'driver_command', which is
'driver_command.buffer_length'. If the security check is passed, the
entire ioctl command is copied again from the 'argp' pointer and saved
to the kernel object 'tw_ioctl'. Then, various operations are performed
on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer
resides in userspace, a malicious userspace process can race to change
the buffer size between the two copies. This way, the user can bypass
the security check and inject an invalid data buffer size. This can cause
potential security issues in the subsequent execution.
This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open() to
avoid the above issues.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 707e7e9660 ]
The current code already forwards the VF MAC address to the PF, except
in one case. If the VF driver gets a valid MAC address from the firmware
during probe time, it will not forward the MAC address to the PF,
incorrectly assuming that the PF already knows the MAC address. This
causes "ip link show" to show zero VF MAC addresses for this case.
This assumption is not correct. Newer firmware remembers the VF MAC
address last used by the VF and provides it to the VF driver during
probe. So we need to always forward the VF MAC address to the PF.
The forwarded MAC address may now be the PF assigned MAC address and so we
need to make sure we approve it for this case.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dac0490718 ]
Only non-NPAR PFs need to actively check and manage unsupported link
speeds. NPAR functions and VFs do not control the link speed and
should skip the unsupported speed detection logic, to avoid warning
messages from firmware rejecting the unsupported firmware calls.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2b999ba899 ]
When computing the bitrate using values read from an SFP module EEPROM,
we use the nominal BR plus BR,min and BR,max to determine the
boundaries. But in some cases BR,min and BR,max aren't provided, which
led the SFP code to end up having the nominal value for both the minimum
and maximum bitrate values. When using a passive cable, the nominal
value should be used as the maximum one, and there is no minimum one
so we should use 0.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3c0a83b14e ]
The s390 CPU measurement facility sampling mode supports basic entries
and diagnostic entries. Each entry has a valid bit to indicate the
status of the entry as valid or invalid.
This bit is bit 31 in the diagnostic entry, but the bit mask definition
refers to bit 30.
Fix this by making the reserved field one bit larger.
Fixes: 7e75fc3ff4 ("s390/cpum_sf: Add raw data sampling to support the diagnostic-sampling function")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 77715b7ddb ]
The CPU Measurement sampling facility creates a trailer entry for each
Sample-Data-Block of stored samples. The trailer entry contains the sizes
(in bytes) of the stored sampling types:
- basic-sampling data entry size
- diagnostic-sampling data entry size
Both sizes are 2 bytes long.
This patch changes the trailer entry definition to reflect this.
Fixes: fcc77f5073 ("s390/cpum_sf: Atomically reset trailer entry fields of sample-data-blocks")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c4a121e82 ]
Add support for the BCM43364 chipset via an SDIO interface, as used in
e.g. the Murata 1FX module.
The BCM43364 uses the same firmware as the BCM43430 (which is already
included), the only difference is the omission of Bluetooth.
However, the SDIO_ID for the BCM43364 is 02D0:A9A4, giving it a MODALIAS
of sdio:c00v02D0dA9A4, which doesn't get recognised and hence doesn't
load the brcmfmac module. Adding the 'A9A4' ID in the appropriate place
triggers the brcmfmac driver to load, and then correctly use the
firmware file 'brcmfmac43430-sdio.bin'.
Signed-off-by: Sean Lanigan <sean@lano.id.au>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a75bbe71a2 ]
Per ONFI specification (Rev. 4.0), if the CRC of the first parameter page
read is not valid, the host should read redundant parameter page copies.
Fix FSL NAND driver to read the two redundant copies which are mandatory
in the specification.
Signed-off-by: Jane Wan <Jane.Wan@nokia.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 01affb000e ]
The use of a hard coded i2c address breaks the creation of the
second tuner in DualHD 01595 models. The issue is compounded
by lack of any error message stating that a driver failed
initialization. Use addr, which contains the correct address
for each tuner.
Fixes: ad32495b15 ("media: em28xx-dvb: simplify DVB module probing logic")
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d3a67f2747 ]
The renesas-ceu driver initializes the desired mbus_format at 'complete'
time, inspecting the supported subdevice ones and tuning some
parameters to produce the requested memory format from what the sensor
can produce. However, the initially selected mbus_format was not
provided to the subdevice during the set_fmt and try_fmt operations;
a '0' mbus format code was provided instead.
As long as the sensor defaults to a compatible mbus_format when an
invalid code as '0' is provided, capture operations work correctly. If
the subdevice defaults to an unsupported format (eg. some RGB
permutations) capture does not work properly due to a mismatch on the
expected and received image format on the wire.
Fix that by re-using the initially selected mbus_format code during
set_fmt and try_fmt subdevice operation calls.
Tested by printing out the format selection procedure with ov7670
sensor.
Before this patch:
[ 0.866001] ov7670_try_fmt_internal -- Looking for mbus_code 0x0000
[ 0.870882] ov7670_try_fmt_internal -- Try mbus_code 0x2008
[ 0.876336] ov7670_try_fmt_internal -- Try mbus_code 0x1002
[ 0.881387] ov7670_try_fmt_internal -- Try mbus_code 0x1008
[ 0.886537] ov7670_try_fmt_internal -- Try mbus_code 0x3001
[ 0.891584] ov7670_try_fmt_internal -- mbus_code defaulted to 0x2008
With this patch applied:
[ 0.867015] ov7670_try_fmt_internal -- Looking for mbus_code 0x2008
[ 0.873205] ov7670_try_fmt_internal -- Try mbus_code 0x2008: match
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0cc4655cb5 ]
This issue was reported by a user who downloaded a corrupt saa7164
firmware, then went looking for a valid xc5000 firmware to fix the
error displayed...but the device in question has no xc5000, thus after
much effort, the wild goose chase eventually led to a support call.
The xc5000 has nothing to do with saa7164 (as far as I can tell),
so replace the string with saa7164 as well as give a meaningful
hint on the firmware mismatch.
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c975e472ec ]
The Point of View mobii wintab p800w Bay Trail tablet comes with a Crystal
Cove PMIC, yet uses the LPSS PWM for backlight control, rather than the
Crystal Cove's PWM, so we need to call pwm_add_table() to add a
pwm_backlight mapping for the LPSS pwm despite there being an INT33FD
ACPI device present.
On all Bay Trail devices the _HRV object of the INT33FD ACPI device
will normally return 2, to indicate the Bay Trail variant of the CRC
PMIC is present, except on this tablet where _HRV is 0xffff. I guess this
is a hack to make the windows Crystal Cove PWM driver not bind.
Out of the 44 DSDTs with an INT33FD device in them which I have (from
different model devices), only the pov mobii wintab p800w uses 0xffff for
the HRV.
The byt_pwm_setup code calls acpi_dev_present to check for the presence
of an INT33FD ACPI device, which indicates that a CRC PMIC is present, and
if the INT33FD ACPI device is present then byt_pwm_setup will not add
a pwm_backlight mapping for the LPSS pwm, so that the CRC PWM will get
used instead.
acpi_dev_present has an hrv parameter; this commit makes us pass 2 instead
of -1, so that things still match on normal tablets, but on this special
case with its _HRV of 0xffff, the check will now fail, so that the
pwm_backlight mapping for the LPSS pwm gets added, fixing backlight
brightness control on this device.
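The relevant call, sketched (the lookup-table contents and provider name are illustrative):
#include <linux/acpi.h>
#include <linux/pwm.h>

static struct pwm_lookup example_pwm_tables[] = {
	/* provider device name is illustrative */
	PWM_LOOKUP("80860F09:00", 0, "pwm-backlight", NULL, 0, PWM_POLARITY_NORMAL),
};

static void example_byt_pwm_setup(void)
{
	/* skip the LPSS PWM mapping only if a CRC PMIC with _HRV == 2 is present */
	if (!acpi_dev_present("INT33FD", NULL, 2))
		pwm_add_table(example_pwm_tables, ARRAY_SIZE(example_pwm_tables));
}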
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 804689ad2d ]
For failed commands with valid sense data (e.g. NCQ commands),
scsi_check_sense() is used in ata_analyze_tf() to determine if the
command can be retried. In such case, rely on this decision and ignore
the command error mask based decision done in ata_worth_retry().
This fixes useless retries of commands such as unaligned writes on zoned
disks (TYPE_ZAC).
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit daab3349ad ]
We are not releasing the link GPIO descriptor with gpiod_put(), which causes
subsequent probing to get -EBUSY when calling fwnode_get_named_gpiod(). Fix this
by doing the release in phylink_destroy().
Fixes: 9525ae8395 ("phylink: add phylink infrastructure")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8a22a3e1e7 ]
Inclusion of include/dma-iommu.h when CONFIG_IOMMU_DMA is not selected
results in the following splat:
In file included from drivers/irqchip/irq-gic-v3-mbi.c:20:0:
./include/linux/dma-iommu.h:95:69: error: unknown type name ‘dma_addr_t’
static inline int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
^~~~~~~~~~
./include/linux/dma-iommu.h:108:74: warning: ‘struct list_head’ declared inside parameter list will not be visible outside of this definition or declaration
static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list)
^~~~~~~~~
scripts/Makefile.build:312: recipe for target 'drivers/irqchip/irq-gic-v3-mbi.o' failed
Fix it by including linux/types.h.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Link: https://lkml.kernel.org/r/20180508121438.11301-5-marc.zyngier@arm.com
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c79756cb5f ]
In commit bbc4e7d273 ("i40e: fix race condition with PTP_TX_IN_PROGRESS
bits") we modified the code which handles Tx timestamps so that we would
clear the progress bit as soon as possible.
A later commit 0bc0706b46 ("i40e: check for Tx timestamp timeouts during
watchdog") introduced similar code for detecting and handling cleanup of
a blocked Tx timestamp. This code did not use the same pattern for cleaning
up the skb.
Update this code to wait to free the skb until after the bit lock is
free, by first setting the ptp_tx_skb to NULL and clearing the lock.
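A sketch of the resulting ordering (the in-progress bit is passed in here rather than naming the driver's state bit):
#include <linux/bitops.h>
#include <linux/skbuff.h>

static void example_ptp_tx_cleanup(struct sk_buff **ptp_tx_skb,
				   unsigned long *state, int in_progress_bit)
{
	struct sk_buff *skb = *ptp_tx_skb;

	/* drop the driver's reference and release the lock bit first ... */
	*ptp_tx_skb = NULL;
	clear_bit_unlock(in_progress_bit, state);

	/* ... and only then free the skb */
	dev_kfree_skb_any(skb);
}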
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 77dd4bd0c0 ]
Single child nodes in OF graph don't need an address and now dtc will
warn about this:
Warning (graph_child_address): /soc/aips@50000000/ldb@53fa8008/lvds-channel@0: graph node has single child node 'port@0', #address-cells/#size-cells are not necessary
Since the LDB should always have an output port, fix the warning by
adding the output port, 2, to the DT.
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 50808bfcc1 ]
Function nvmem_reg_read can return a non-zero value indicating an error.
This returned value must be read and the error propagated to
nvmem_cell_prepare_write_buffer. Silence the following gcc warning (W=1):
drivers/nvmem/core.c:1093:9: warning: variable 'rc' set but
not used [-Wunused-but-set-variable]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57a66497e1 ]
The PMU node references two interrupts, but lacks the interrupt-affinity
property, which is required in that case:
hw perfevents: no interrupt-affinity property for /pmu, guessing.
Add the missing property to fix this.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7207b94754 ]
The PMU node references two interrupts, but lacks the interrupt-affinity
property, which is required in that case:
hw perfevents: no interrupt-affinity property for /pmu, guessing.
Add the missing property to fix this.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e95b8e718f ]
Since commit 83a86fbb5b ("irqchip/gic: Loudly complain about the use of IRQ_TYPE_NONE")
the kernel has been complaining about the use of IRQ_TYPE_NONE, which
shouldn't be used.
Use IRQ_TYPE_LEVEL_HIGH instead.
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fd827d0ec8 ]
Since commit 83a86fbb5b ("irqchip/gic: Loudly complain about the use of IRQ_TYPE_NONE")
the kernel has been complaining about the use of IRQ_TYPE_NONE, which
shouldn't be used.
Use IRQ_TYPE_LEVEL_HIGH instead.
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 54b5172087 ]
When the "poweroff" command is executed after wowlan enabled, we have
observed a system crash. In the system "poweroff" sequence, network-manager
is sent to inactive state by cleaning up the network interfaces, using
rsi_mac80211_remove_interface() and when driver tries to access those
network interfaces in rsi_wowlan_config() which was invoked by SDIO
shutdown, results in a crash. Added a NULL check before accessing the
network interfaces in rsi_wowlan_config().
Signed-off-by: Sanjay Kumar Konduri <sanjay.konduri@redpinesignals.com>
Signed-off-by: Siva Rebbagondla <siva.rebbagondla@redpinesignals.com>
Signed-off-by: Sushant Kumar Mishra <sushant.mishra@redpinesignals.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b3a81b6c4f ]
On many Chromebooks touch devices are multi-sourced; the components are
electrically compatible and one can be freely swapped for another without
changing the OS image or firmware.
To avoid a bunch of scary messages when the device is not actually present in
the system, let's try testing basic communication with it, and if there is no
response, terminate the probe early with -ENXIO.
Signed-off-by: Dmitry Torokhov <dtor@chromium.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9dcb3df428 ]
The interrupt controller inside the Wii's Hollywood chip is connected to
two masters, the "Broadway" PowerPC and the "Starlet" ARM926, each with
their own interrupt status and mask registers.
When booting the Wii with mini[1], interrupts from the SD card
controller (IRQ 7) are handled by the ARM, because mini provides SD
access over IPC. Linux however can't currently use or disable this IPC
service, so both sides try to handle IRQ 7 without coordination.
Let's instead make sure that all interrupts that are unmasked on the PPC
side are masked on the ARM side; this will also make sure that Linux can
properly talk to the SD card controller (and potentially other devices).
If access to a device through IPC is desired in the future, interrupts
from that device should not be handled by Linux directly.
[1]: https://github.com/lewurm/mini
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e02637e97d ]
DMA_VIRT_OPS requires that dma_addr_t is at least as wide as a
pointer, which is expressed as a dependency on !64BIT ||
ARCH_DMA_ADDR_T_64BIT.
For parisc64 this is not true, and if these IB modules are enabled,
kconfig warns:
WARNING: unmet direct dependencies detected for DMA_VIRT_OPS
Depends on [n]: HAS_DMA [=y] && (!64BIT [=y] || ARCH_DMA_ADDR_T_64BIT)
Selected by [m]:
- INFINIBAND_RDMAVT [=m] && INFINIBAND [=m] && 64BIT [=y] && PCI [=y]
- RDMA_RXE [=m] && INET [=y] && PCI [=y] && INFINIBAND [=m]
Add dependencies to fix this.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7a47f20eb1 ]
The method struct drm_connector_helper_funcs::mode_valid is defined
as returning an 'enum drm_mode_status' but the driver implementation
for this method uses an 'int' for it.
Fix this by using 'enum drm_mode_status' in the driver too.
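The corrected signature looks like this (function body and name are illustrative):
#include <drm/drm_connector.h>
#include <drm/drm_modes.h>

static enum drm_mode_status
example_connector_mode_valid(struct drm_connector *connector,
			     struct drm_display_mode *mode)
{
	/* return an enum drm_mode_status value, not a bare int */
	return MODE_OK;
}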
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a2a330ad66 ]
This patch is a continuation of
"843e3c7 drm/amd/display: defer modeset check in dm_update_planes_state",
where we started to eliminate the dependency on
DRM_MODE_ATOMIC_ALLOW_MODESET to be set by the user space,
which as such is not mandatory.
After deferring, this patch eliminates the dependency on the flag
for overlay planes.
This has to be done in stages as it's pretty complex and requires thorough
testing before we also free primary planes from the dependency on the modeset
flag.
V2: Simplified the plane type check.
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 06a574c7ef ]
The current sound support uses simple-audio-card, which can't support HDMI.
To use HDMI sound, we need to use audio-graph-card.
One note is that r8a7795 has 2 HDMI ports, but r8a7796 has 1.
Because of this mismatch, supporting HDMI in salvator-common is
impossible.
Thus, this patch switches the sound card to audio-graph-card and keeps
supporting ak4613 as the 1st sound node.
r8a7795/r8a7796 salvator-x{s} need to add HDMI sound individually.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Tested-by: Nguyen Viet Dung <nv-dung@jinso.co.jp>
Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1cfc63b5ae ]
When waiting for a cacheline to change state in cmpwait, we may immediately
wake-up the first time around the outer loop if the event register was
already set (for example, because of the event stream).
Avoid these spurious wakeups by explicitly clearing the event register
before loading the cacheline and setting the exclusive monitor.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit db01f7ccfa ]
The register settings for several resolutions aren't used
currently. So, comment them out.
Fix those warnings:
In file included from drivers/staging/media/atomisp/i2c/atomisp-gc2235.c:35:0:
drivers/staging/media/atomisp/i2c/gc2235.h:340:32: warning: 'gc2235_960_640_30fps' defined but not used [-Wunused-const-variable=]
static struct gc2235_reg const gc2235_960_640_30fps[] = {
^~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/gc2235.h:287:32: warning: 'gc2235_1296_736_30fps' defined but not used [-Wunused-const-variable=]
static struct gc2235_reg const gc2235_1296_736_30fps[] = {
^~~~~~~~~~~~~~~~~~~~~
In file included from drivers/staging/media/atomisp/i2c/atomisp-ov2722.c:35:0:
drivers/staging/media/atomisp/i2c/ov2722.h:999:32: warning: 'ov2722_720p_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2722_reg const ov2722_720p_30fps[] = {
^~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2722.h:787:32: warning: 'ov2722_1M3_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2722_reg const ov2722_1M3_30fps[] = {
^~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2722.h:476:32: warning: 'ov2722_VGA_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2722_reg const ov2722_VGA_30fps[] = {
^~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2722.h:367:32: warning: 'ov2722_480P_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2722_reg const ov2722_480P_30fps[] = {
^~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2722.h:257:32: warning: 'ov2722_QVGA_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2722_reg const ov2722_QVGA_30fps[] = {
^~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/atomisp-ov2680.c: In function '__ov2680_set_exposure':
In file included from drivers/staging/media/atomisp/i2c/atomisp-ov2680.c:35:0:
At top level:
drivers/staging/media/atomisp/i2c/ov2680.h:736:33: warning: 'ov2680_1616x1082_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_1616x1082_30fps[] = {
^~~~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:649:33: warning: 'ov2680_1456x1096_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_1456x1096_30fps[]= {
^~~~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:606:33: warning: 'ov2680_1296x976_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_1296x976_30fps[] = {
^~~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:563:33: warning: 'ov2680_720p_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_720p_30fps[] = {
^~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:520:33: warning: 'ov2680_800x600_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_800x600_30fps[] = {
^~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:475:33: warning: 'ov2680_720x592_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_720x592_30fps[] = {
^~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:433:33: warning: 'ov2680_656x496_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_656x496_30fps[] = {
^~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:389:33: warning: 'ov2680_QVGA_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_QVGA_30fps[] = {
^~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:346:33: warning: 'ov2680_CIF_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_CIF_30fps[] = {
^~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov2680.h:301:33: warning: 'ov2680_QCIF_30fps' defined but not used [-Wunused-const-variable=]
static struct ov2680_reg const ov2680_QCIF_30fps[] = {
^~~~~~~~~~~~~~~~~
In file included from drivers/staging/media/atomisp/i2c/ov5693/atomisp-ov5693.c:36:0:
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:988:32: warning: 'ov5693_1424x1168_30fps' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_1424x1168_30fps[] = {
^~~~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:954:32: warning: 'ov5693_2592x1944_30fps' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_2592x1944_30fps[] = {
^~~~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:889:32: warning: 'ov5693_2592x1456_30fps' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_2592x1456_30fps[] = {
^~~~~~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:862:32: warning: 'ov5693_1940x1096' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_1940x1096[] = {
^~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:796:32: warning: 'ov5693_1636p_30fps' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_1636p_30fps[] = {
^~~~~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:758:32: warning: 'ov5693_1296x736' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_1296x736[] = {
^~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:730:32: warning: 'ov5693_976x556' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_976x556[] = {
^~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:672:32: warning: 'ov5693_736x496' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_736x496[] = {
^~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:643:32: warning: 'ov5693_192x160' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_192x160[] = {
^~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:616:32: warning: 'ov5693_368x304' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_368x304[] = {
^~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:587:32: warning: 'ov5693_336x256' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_336x256[] = {
^~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:540:32: warning: 'ov5693_1296x976' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_1296x976[] = {
^~~~~~~~~~~~~~~
drivers/staging/media/atomisp/i2c/ov5693/ov5693.h:509:32: warning: 'ov5693_654x496' defined but not used [-Wunused-const-variable=]
static struct ov5693_reg const ov5693_654x496[] = {
^~~~~~~~~~~~~~
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e5c0680fd2 ]
drivers/staging/media/atomisp/i2c/atomisp-ov2680.c: In function ‘__ov2680_set_exposure’:
drivers/staging/media/atomisp/i2c/atomisp-ov2680.c:400:10: warning: variable ‘hts’ set but not used [-Wunused-but-set-variable]
u16 vts,hts;
^~~
drivers/staging/media/atomisp/i2c/atomisp-ov2680.c: In function ‘ov2680_detect’:
drivers/staging/media/atomisp/i2c/atomisp-ov2680.c:1164:5: warning: variable ‘revision’ set but not used [-Wunused-but-set-variable]
u8 revision;
^~~~~~~~
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit be8d8cdb8e ]
When hclge_ae_start is called, hdev->hw.mac.link may still be set
to one after multiple up/down cycles, which does not correspond to
the link state of the netdev when the netdev is up.
Fix it by setting hdev->hw.mac.link to zero when
hclge_ae_start is called.
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c6213eb1ae ]
This fixes Klocwork warnings: Pointer 'dev' returned from call to
function 'bus_find_device' at line 179 may be NULL and will be dereferenced
at line 181.
cpsw-phy-sel.c:179: 'dev' is assigned the return value from function 'bus_find_device'.
bus.c:342: 'bus_find_device' explicitly returns a NULL value.
cpsw-phy-sel.c:181: 'dev' is dereferenced by passing argument 1 to function 'dev_get_drvdata'.
device.h:1024: 'dev' is passed to function 'dev_get_drvdata'.
device.h:1026: 'dev' is explicitly dereferenced.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
[nsekhar@ti.com: add an error message, fix return path]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 56abbf8ad7 ]
The generic IRQ handling code ensures that an interrupt handler runs with
its interrupt masked or disabled. If the interrupt is level-triggered, the
interrupt handler must tell its device to stop asserting the interrupt
before returning. If it doesn't, we will immediately take the interrupt
again when the handler returns and the generic code unmasks the interrupt.
The driver doesn't know whether its interrupt is edge- or level-triggered,
so it must clear its interrupt source directly in its interrupt handler.
Previously we cleared the DPC interrupt status in the bottom half, i.e., in
deferred work, which can cause an interrupt storm if the DPC interrupt
happens to be level-triggered, e.g., if we're using INTx instead of MSI.
Clear the DPC interrupt status bit in the interrupt handler, not in the
deferred work.
Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <helgaas@kernel.org>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7a2148dfda ]
The current code decrements the timeout counter i, but at the end of
each loop i is incremented again, so the check for timeout will always
be false and hence the timeout mechanism is just a dead code path.
Potentially, if the RD_READY bit is not set, we could end up in
an infinite loop.
Fix this so the timeout starts from 1000 and decrements to zero;
if at the end of the loop i is zero we have a timeout condition.
Detected by CoverityScan, CID#1324008 ("Logically dead code")
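A minimal sketch of the countdown form of the loop described above (the register-read helper and sleep interval are placeholders, not the driver's exact code):

unsigned int i = 1000;
u32 val;
int rval;

do {
        rval = read_nvm_status(sensor, &val);   /* placeholder helper */
        if (rval)
                return rval;
        if (val & RD_READY)
                break;
        usleep_range(1000, 2000);
} while (--i);

if (i == 0)
        return -ETIMEDOUT;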
Fixes: ccfc97bdb5 ("[media] smiapp: Add driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f40e1590c5 ]
The IOVA API uses a memory cache to allocate IOVA nodes from. To make
sure that this cache is available, obtain a reference to it and release
the reference when the cache is no longer needed.
On 64-bit ARM this is hidden by the fact that the DMA mapping API gets
that reference and never releases it. On 32-bit ARM, this is papered
over by the Tegra DRM driver (the sole user of the host1x API requiring
the cache) acquiring a reference to the IOVA cache for its own purposes.
However, there may be additional users of this API in the future, so fix
this upfront to avoid surprises.
Fixes: 404bfb78da ("gpu: host1x: Add IOMMU support")
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b33d10624f ]
Current handle_read_error() function calls fix_read_error()
only if md device is RW and rdev does not include FailFast flag.
It does not handle a read error from a RW device including
FailFast flag.
I am not sure whether that is intended, but I found that a write IO error
sets the rdev faulty. The md module should handle read IO errors and
write IO errors equally, so I think a read IO error should also set the rdev faulty.
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0b01fd3d40 ]
If is_enabled() is not defined, the regulator core will assume
this regulator is already enabled, and then it can NOT actually be
re-enabled after being disabled.
Based on Li Jun's patch from the NXP kernel tree.
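A minimal sketch of what providing is_enabled() can look like for a regmap-backed regulator (the struct and field names below are illustrative assumptions, not the exact driver code):

static int example_is_enabled(struct regulator_dev *rdev)
{
        struct my_regulator *sreg = rdev_get_drvdata(rdev);        /* assumed private data */
        u32 val;
        int ret;

        ret = regmap_read(sreg->regmap, sreg->control_reg, &val);  /* assumed fields */
        if (ret)
                return ret;

        return !!(val & BIT(sreg->enable_bit));
}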
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e6000a438e ]
The IRQ is requested before the struct rtc is allocated and registered, but
this struct is used in the IRQ handler. This may lead to a NULL pointer
dereference.
Switch to devm_rtc_allocate_device/rtc_register_device to allocate the rtc
before requesting the IRQ.
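The resulting probe ordering looks roughly like this (the IRQ handler and ops names are illustrative):

rtc = devm_rtc_allocate_device(&pdev->dev);
if (IS_ERR(rtc))
        return PTR_ERR(rtc);
rtc->ops = &example_rtc_ops;                      /* illustrative ops */

/* The handler can safely use the rtc pointer from here on. */
ret = devm_request_irq(&pdev->dev, irq, example_rtc_irq, 0,
                       dev_name(&pdev->dev), rtc);
if (ret)
        return ret;

return rtc_register_device(rtc);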
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f2fc25c0f ]
Newer HW doesn't appear to send this event, which will cause long delays
in runlist updates if they don't complete immediately.
RM doesn't use these events anywhere, and an NVGPU commit message notes
that polling is the preferred method even on HW that supports the event.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 19ca10d82e ]
We previously only did this for push buffers, but an upcoming patch will
need to attach fences to all VMAs to resolve another issue.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 11e451e740 ]
Fences attached to deferred client work items now originate from channels
belonging to the client, meaning we can be certain they've been signalled
before we destroy a client.
This closes a race that could happen if the dma_fence_wait_timeout() call
didn't succeed. When the fence was later signalled, a use-after-free was
possible.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e3611e954 ]
The device can set the exception event bit in one of the response UPIU,
for example to notify the need for urgent BKOPs operation. In such a
case, the host driver calls ufshcd_exception_event_handler to handle
this notification. When trying to check the exception event status (for
finding the cause for the exception event), the device may be busy with
additional SCSI commands handling and may not respond within the 100ms
timeout.
To prevent that, we need to block SCSI commands during handling of
exception events and allow retransmissions of the query requests, in
case of timeout.
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b334456ec2 ]
Vendor specific setup_clocks ops may depend on clocks managed by the ufshcd
driver, so if the vendor specific setup_clocks callback is called when
the required clocks are turned off, it results in unclocked register
access.
This change makes sure that the required clocks are enabled before the
vendor specific setup_clocks callback is called.
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 62c8a069b5 ]
Marvell PPv2 Header Parser sets some bits in the 'result_info' field in
each lookup iteration, to identify different packet attributes such as
DSA / VLAN tag, protocol infos, etc. This is used in further
classification stages in the controller.
It's the DSA tag detection entry that is in charge of detecting when there
is a single VLAN tag.
This commit adds the missing update of the result_info in this case.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36dd26e0c8 ]
Improve fscrypt read performance by switching the decryption workqueue
from bound to unbound. With the bound workqueue, when multiple bios
completed on the same CPU, they were decrypted on that same CPU. But
with the unbound queue, they are now decrypted in parallel on any CPU.
Although fscrypt read performance can be tough to measure due to the
many sources of variation, this change is most beneficial when
decryption is slow, e.g. on CPUs without AES instructions. For example,
I timed tarring up encrypted directories on f2fs. On x86 with AES-NI
instructions disabled, the unbound workqueue improved performance by
about 25-35%, using 1 to NUM_CPUs jobs with 4 or 8 CPUs available. But
with AES-NI enabled, performance was unchanged to within ~2%.
I also did the same test on a quad-core ARM CPU using xts-speck128-neon
encryption. There performance was usually about 10% better with the
unbound workqueue, bringing it closer to the unencrypted speed.
The unbound workqueue may be worse in some cases due to worse locality,
but I think it's still the better default. dm-crypt uses an unbound
workqueue by default too, so this change makes fscrypt match.
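The switch itself boils down to the flags passed when creating the workqueue; a sketch (the exact flag combination in the upstream patch may differ):

fscrypt_read_workqueue = alloc_workqueue("fscrypt_read_queue",
                                         WQ_UNBOUND | WQ_HIGHPRI,
                                         num_online_cpus());
if (!fscrypt_read_workqueue)
        return -ENOMEM;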
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3c7624d8fc ]
If the hns3 driver has been built into the kernel and the same driver
built as a loadable module (KLM) is then loaded, it may trigger an error like
the one below:
[ 20.009555] hns3: Hisilicon Ethernet Network Driver for Hip08 Family - version
[ 20.016789] hns3: Copyright (c) 2017 Huawei Corporation.
[ 20.022100] Error: Driver 'hns3' is already registered, aborting...
[ 23.517397] Unable to handle kernel NULL pointer dereference at virtual address 00000000
...
[ 23.691583] Process insmod (pid: 1982, stack limit = 0x00000000cd5f21cb)
[ 23.698270] Call trace:
[ 23.700705] __list_del_entry_valid+0x2c/0xd8
[ 23.705049] hnae3_unregister_client+0x68/0xa8
[ 23.709487] hns3_init_module+0x98/0x1000 [hns3]
[ 23.714093] do_one_initcall+0x5c/0x170
[ 23.717918] do_init_module+0x64/0x1f4
[ 23.721654] load_module+0x1d14/0x24b0
[ 23.725390] SyS_init_module+0x158/0x208
[ 23.729300] el0_svc_naked+0x30/0x34
This patch fixes it by adding module version info.
Fixes: 38caee9d3e ("net: hns3: Add support of the HNAE3 framework")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eddf04626d ]
When the VF module is loading, the cmd queue initialization should
happen before the misc interrupt initialization; otherwise, the misc
interrupt handler may end up using an uninitialized cmd queue.
The same issue exists when the VF module is unloading.
This patch fixes it by adjusting the location of some functions.
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e48e23a1f ]
If pm_runtime_get_sync() fails we should call pm_runtime_put_noidle().
This is probably not a critical fix as we should only hit this when
things are broken elsewhere.
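The usual idiom for this looks like:

ret = pm_runtime_get_sync(dev);
if (ret < 0) {
        /* get_sync incremented the usage count even on failure, drop it */
        pm_runtime_put_noidle(dev);
        return ret;
}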
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1898eb61fb ]
The ARM CCN PMU driver uses dev_warn() to complain about parameters in
the user-provided perf_event_attr. This means that under normal
operation (e.g. a single invocation of the perf tool), a number of
warning messages may be logged to dmesg.
Tools may issue multiple syscalls to probe for feature support, and
multiple applications (from multiple users) can attempt to open events
simultaneously, so this is not very helpful, even if a user happens to
have access to dmesg. Worse, this can push important information out of
the dmesg ring buffer, and can significantly slow down syscall fuzzers,
vastly increasing the time it takes to find critical bugs.
Demote the dev_warn() instances to dev_dbg(), as is the case for all
other PMU drivers under drivers/perf/. Users who wish to debug PMU event
initialisation can enable dynamic debug to receive these messages.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 408fec36a1 ]
Currently we request control of native PCIe hotplug unconditionally.
Native PCIe hotplug events are handled by the pciehp driver, and if it is
not enabled those events will be lost.
Request control of native PCIe hotplug only if the pciehp driver is
enabled, so we will actually handle native PCIe hotplug events.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4ea69b2fd6 ]
For multi-function programs, loading the address of a callee
function to a register requires emitting instructions whose
count varies from one to five depending on the nature of the
address.
Since we come to know of the callee's address only before the
extra pass, the number of instructions required to load this
address may vary from what was previously generated. This can
make the JITed image grow or shrink.
To avoid this, we should always generate a constant five-instruction
sequence when loading function addresses, by padding the optimized load
sequence with NOPs.
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4d56a76ead ]
Currently, for multi-function programs, we cannot get the JITed
instructions using the bpf system call's BPF_OBJ_GET_INFO_BY_FD
command. Because of this, userspace tools such as bpftool fail
to identify a multi-function program as being JITed or not.
With the JIT enabled and the test program running, this can be
verified as follows:
# cat /proc/sys/net/core/bpf_jit_enable
1
Before applying this patch:
# bpftool prog list
1: kprobe name foo tag b811aab41a39ad3d gpl
loaded_at 2018-05-16T11:43:38+0530 uid 0
xlated 216B not jited memlock 65536B
...
# bpftool prog dump jited id 1
no instructions returned
After applying this patch:
# bpftool prog list
1: kprobe name foo tag b811aab41a39ad3d gpl
loaded_at 2018-05-16T12:13:01+0530 uid 0
xlated 216B jited 308B memlock 65536B
...
# bpftool prog dump jited id 1
0: nop
4: nop
8: mflr r0
c: std r0,16(r1)
10: stdu r1,-112(r1)
14: std r31,104(r1)
18: addi r31,r1,48
1c: li r3,10
...
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a86caa9ba5 ]
Sven Eckelmann reported an issue with the current IPQ4019 pinctrl.
Setting up any gpio-hog in the device-tree for his device would
"kill the bootup completely":
| [ 0.477838] msm_serial 78af000.serial: could not find pctldev for node /soc/pinctrl@1000000/serial_pinmux, deferring probe
| [ 0.499828] spi_qup 78b5000.spi: could not find pctldev for node /soc/pinctrl@1000000/spi_0_pinmux, deferring probe
| [ 1.298883] requesting hog GPIO enable USB2 power (chip 1000000.pinctrl, offset 58) failed, -517
| [ 1.299609] gpiochip_add_data: GPIOs 0..99 (1000000.pinctrl) failed to register
| [ 1.308589] ipq4019-pinctrl 1000000.pinctrl: Failed register gpiochip
| [ 1.316586] msm_serial 78af000.serial: could not find pctldev for node /soc/pinctrl@1000000/serial_pinmux, deferring probe
| [ 1.322415] spi_qup 78b5000.spi: could not find pctldev for node /soc/pinctrl@1000000/spi_0_pinmux, deferri
This was also verified on a RT-AC58U (IPQ4018) which would
no longer boot, if a gpio-hog was specified. (Tried forcing
the USB LED PIN (GPIO0) to high.).
The problem is that Pinctrl+GPIO registration is currently
performed in the following order in pinctrl-msm.c:
1. pinctrl_register()
2. gpiochip_add()
3. gpiochip_add_pin_range()
The actual error code -517 == -EPROBE_DEFER is coming from
pinctrl_get_device_gpio_range(), which is called through:
gpiochip_add
of_gpiochip_add
of_gpiochip_scan_gpios
gpiod_hog
gpiochip_request_own_desc
__gpiod_request
chip->request
gpiochip_generic_request
pinctrl_gpio_request
pinctrl_get_device_gpio_range
pinctrl_get_device_gpio_range() is unable to find any valid
pin ranges, since nothing has been added to the pinctrldev_list yet,
so the range can't be found and the operation fails with -EPROBE_DEFER.
This patch fixes the issue by adding the "gpio-ranges" property to
the pinctrl device node of all upstream Qcom SoCs. The pin ranges are
then added by the gpio core.
In order to remain compatible with older, existing DTs (and ACPI),
a check for the "gpio-ranges" property has been added to
msm_gpio_init(). This prevents the driver from adding the same entry
to the pinctrldev_list twice.
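A rough sketch of that compatibility check in the probe path (structure and field names are illustrative):

/* Only add the pin range manually when the DT (or ACPI) did not provide
 * a "gpio-ranges" property; otherwise the gpio core adds it for us. */
if (!of_property_read_bool(pctrl->dev->of_node, "gpio-ranges")) {
        ret = gpiochip_add_pin_range(&pctrl->chip, dev_name(pctrl->dev),
                                     0, 0, pctrl->chip.ngpio);
        if (ret) {
                dev_err(pctrl->dev, "failed to add pin range\n");
                return ret;
        }
}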
Reported-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Tested-by: Sven Eckelmann <sven.eckelmann@openmesh.com> [ipq4019]
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2181636471 ]
The device node iterators perform an of_node_get on each iteration, so a
jump out of the loop requires an of_node_put.
The semantic patch that fixes this problem is as follows
(http://coccinelle.lip6.fr):
// <smpl>
@@
expression root,e;
local idexpression child;
iterator name for_each_child_of_node;
@@
for_each_child_of_node(root, child) {
... when != of_node_put(child)
when != e = child
+ of_node_put(child);
? break;
...
}
... when != child
// </smpl>
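In C, the fixed pattern boils down to (with an illustrative predicate):

for_each_child_of_node(root, child) {
        if (is_the_node_we_want(child)) {    /* illustrative condition */
                of_node_put(child);          /* drop the iterator's reference */
                break;
        }
}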
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0921e11e1e ]
According to section 59.2.4 MSIOF Receive Mode Register 1 (SIRMDR1) in
the R-Car Gen3 datasheet Rev.1.00, the value of the SIRMDR1.SYNCAC bit
must match the value of the SITMDR1.SYNCAC bit. However,
sh_msiof_spi_setup() changes only the latter.
Fix this by updating the SIRMDR1 register like the SITMDR1 register,
taking into account register bits that exist in SITMDR1 only.
Reported-by: Renesas BSP team via Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Fixes: 7ff0b53c40 ("spi: sh-msiof: Avoid writing to registers from spi_master.setup()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 86bf20cb57 ]
This is a fix from reviewing the code, but it looks like it might be
able to lead to an Oops. It affects 32bit systems.
The KVM_MEMORY_ENCRYPT_REG_REGION ioctl uses a u64 for range->addr and
range->size but the high 32 bits would be truncated away on a 32 bit
system. This is harmless but it's also harmless to prevent it.
Then in sev_pin_memory() the "uaddr + ulen" calculation can wrap around.
The wrap around can happen on 32 bit or 64 bit systems, but I was only
able to figure out a problem for 32 bit systems. We would pick a number
which results in "npages" being zero. The sev_pin_memory() would then
return ZERO_SIZE_PTR without allocating anything.
I made it illegal to call sev_pin_memory() with "ulen" set to zero.
Hopefully, that doesn't cause any problems. I also changed the type of
"first" and "last" to long, just for cosmetic reasons. Otherwise on a
64 bit system you're saving "uaddr >> 12" in an int and it truncates the
high 20 bits away. The math works in the current code so far as I can
see but it's just weird.
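A sketch of the guarded page-count math this results in (mirroring the description above, not a verbatim copy of the function):

unsigned long first, last, npages;

if (ulen == 0 || uaddr + ulen < uaddr)      /* reject empty or wrapping ranges */
        return NULL;

first  = (uaddr & PAGE_MASK) >> PAGE_SHIFT;
last   = ((uaddr + ulen - 1) & PAGE_MASK) >> PAGE_SHIFT;
npages = last - first + 1;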
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
[Brijesh noted that the code is only reachable on X86_64.]
Reviewed-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eae5f709a4 ]
__printf is useful to verify format and arguments. Fix arg mismatch
reported by gcc, remove the following warnings (with W=1):
arch/powerpc/kernel/prom_init.c:1467:31: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1471:31: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1504:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1505:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1506:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1507:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1508:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1509:33: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:1975:39: error: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘unsigned int’
arch/powerpc/kernel/prom_init.c:1986:27: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:2567:38: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:2567:46: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:2569:38: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 2 has type ‘long unsigned int’
arch/powerpc/kernel/prom_init.c:2569:46: error: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘long unsigned int’
The patch also includes an argument mismatch fix for the #define DEBUG_PROM
case (warning not listed here).
This patch also fixes the following warnings revealed by checkpatch:
WARNING: Prefer using '"%s...", __func__' to using 'alloc_up', this function's name, in a string
#101: FILE: arch/powerpc/kernel/prom_init.c:1235:
+ prom_debug("alloc_up(%lx, %lx)\n", size, align);
and
WARNING: Prefer using '"%s...", __func__' to using 'alloc_down', this function's name, in a string
#138: FILE: arch/powerpc/kernel/prom_init.c:1278:
+ prom_debug("alloc_down(%lx, %lx, %s)\n", size, align,
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5a4b475cf8 ]
Since the value of x is never intended to be read, declare it with gcc
attribute as unused. Fix warning treated as error with W=1:
arch/powerpc/platforms/powermac/bootx_init.c:471:21: error: variable ‘x’ set but not used [-Werror=unused-but-set-variable]
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f72cf3f1d4 ]
Add a missing prototype for function `note_bootable_part` to silence a
warning treated as error with W=1:
arch/powerpc/platforms/powermac/setup.c:361:12: error: no previous prototype for ‘note_bootable_part’ [-Werror=missing-prototypes]
Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b87a358b4a ]
Add a missing include <platforms/chrp/chrp.h>.
These functions can all be static, make it so. Fix warnings treated as
errors with W=1:
arch/powerpc/platforms/chrp/time.c:41:13: error: no previous prototype for ‘chrp_time_init’ [-Werror=missing-prototypes]
arch/powerpc/platforms/chrp/time.c:66:5: error: no previous prototype for ‘chrp_cmos_clock_read’ [-Werror=missing-prototypes]
arch/powerpc/platforms/chrp/time.c:74:6: error: no previous prototype for ‘chrp_cmos_clock_write’ [-Werror=missing-prototypes]
arch/powerpc/platforms/chrp/time.c:86:5: error: no previous prototype for ‘chrp_set_rtc_time’ [-Werror=missing-prototypes]
arch/powerpc/platforms/chrp/time.c:130:6: error: no previous prototype for ‘chrp_get_rtc_time’ [-Werror=missing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c89ca59322 ]
The header file <linux/syscalls.h> was missing from the includes. Fix the
following warning, treated as error with W=1:
arch/powerpc/kernel/pci_32.c:286:6: error: no previous prototype for ‘sys_pciconfig_iobase’ [-Werror=missing-prototypes]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 699e2302c2 ]
The country code is used by the ath to detect the ISO 3166-1 alpha-2 name
and to select the correct conformance test limits (CTL) for a country. If
the country isn't available and it is still programmed in the EEPROM then
it will cause an error and stop the initialization with:
Invalid EEPROM contents
The current CTL mappings for this country are:
* 2.4GHz: ETSI
* 5GHz: FCC
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c790f2d23 ]
The country code is used by the ath to detect the ISO 3166-1 alpha-2 name
and to select the correct conformance test limits (CTL) for a country. If
the country isn't available and it is still programmed in the EEPROM then
it will cause an error and stop the initialization with:
Invalid EEPROM contents
The current CTL mappings for this country are:
* 2.4GHz: FCC
* 5GHz: FCC
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2a3169a54b ]
The country code is used by the ath to detect the ISO 3166-1 alpha-2 name
and to select the correct conformance test limits (CTL) for a country. If
the country isn't available and it is still programmed in the EEPROM then
it will cause an error and stop the initialization with:
Invalid EEPROM contents
The current CTL mappings for this country are:
* 2.4GHz: ETSI
* 5GHz: ETSI
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 667ddac574 ]
The country code is used by the ath to detect the ISO 3166-1 alpha-2 name
and to select the correct conformance test limits (CTL) for a country. If
the country isn't available and it is still programmed in the EEPROM then
it will cause an error and stop the initialization with:
Invalid EEPROM contents
The current CTL mappings for this country are:
* 2.4GHz: ETSI
* 5GHz: FCC
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1ea3986ad2 ]
The country code is used by the ath to detect the ISO 3166-1 alpha-2 name
and to select the correct conformance test limits (CTL) for a country. If
the country isn't available and it is still programmed in the EEPROM then
it will cause an error and stop the initialization with:
Invalid EEPROM contents
The current CTL mappings for this country are:
* 2.4GHz: ETSI
* 5GHz: FCC
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4f183687e3 ]
The regdomain code is used to select the correct conformance
test limits (CTL) for a country. If the regdomain code isn't available and
it is still programmed in the EEPROM then it will cause an error and stop
the initialization with:
Invalid EEPROM contents
The current CTL mappings for this regdomain code are:
* 2.4GHz: FCC
* 5GHz: FCC
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ba8df0c52 ]
The regdomain code is used to select the correct conformance
test limits (CTL) for a country. If the regdomain code isn't available and
it is still programmed in the EEPROM then it will cause an error and stop
the initialization with:
Invalid EEPROM contents
The current CTL mappings for this regdomain code are:
* 2.4GHz: ETSI
* 5GHz: ETSI
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45faf6e096 ]
The regdomain code is used to select the correct conformance
test limits (CTL) for a country. If the regdomain code isn't available and
it is still programmed in the EEPROM then it will cause an error and stop
the initialization with:
Invalid EEPROM contents
The current CTL mappings for this regdomain code are:
* 2.4GHz: ETSI
* 5GHz: ETSI
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 01fb2994a9 ]
The regdomain code is used to select the correct conformance
test limits (CTL) for a country. If the regdomain code isn't available and
it is still programmed in the EEPROM then it will cause an error and stop
the initialization with:
Invalid EEPROM contents
The current CTL mappings for this regdomain code are:
* 2.4GHz: ETSI
* 5GHz: FCC
Signed-off-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72cd4cc28e ]
The nvme timeout handling doesn't do anything if the pci channel is
offline, which is the case when recovering from PCI error event, so it
was a bad idea to sync the controller reset in this state. This patch
flushes the reset work in the error_resume callback instead when the
channel is back to online. This keeps AER handling serialized and
can recover from timeouts.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199757
Fixes: cc1d5e749a ("nvme/pci: Sync controller reset for AER slot_reset")
Reported-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e050f00a0 ]
For any failure after nvme_rdma_start_queue in
nvme_rdma_configure_admin_queue, the admin queue will be freed with the
NVME_RDMA_Q_LIVE flag still set. Once nvme_rdma_stop_queue is invoked,
that will cause a use-after-free.
BUG: KASAN: use-after-free in rdma_disconnect+0x1f/0xe0 [rdma_cm]
To fix it, call nvme_rdma_stop_queue for all the failed cases after
nvme_rdma_start_queue.
Signed-off-by: Jianchao Wang <jianchao.w.wang@oracle.com>
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 04a512fea3 ]
Two places report an error when a partition header is found to
not contain the right canary value. The error messages do not
properly byte swap the host ids. Fix this, and adjust the format
specifier to match the 16-bit unsigned data type.
Move the error handling for a bad canary value to the end of
qcom_smem_alloc_private(). This avoids some long lines, and
reduces the distraction of handling this unexpected problem.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8fa1a21409 ]
If there is at least one entry in the partition table, but no global
entry, qcom_smem_set_global_partition() should return an error
just like it does if there are no partition table entries.
It turns out the function still returns an error in this case, but
it waits to do so until it has mistakenly treated the last entry in
the table as if it were the global entry found.
Fix the function to return immediately if no global entry is found
in the table.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7df5ff258b ]
In qmi_handle_init(), a buffer is allocated to hold messages
received through the handle's socket. Any "normal" messages
(expected by the caller) will have a header prepended, so the
buffer size is adjusted to accommodate that.
The buffer must also be of sufficient size to receive control
messages, so the size is increased if necessary to ensure these
will fit.
Unfortunately the calculation is done wrong, making it possible
for the calculated buffer size to be too small to hold a "normal"
message. Specifically, if:
recv_buf_size > sizeof(struct qrtr_ctrl_pkt) - sizeof(struct qmi_header)
AND
recv_buf_size < sizeof(struct qrtr_ctrl_pkt)
the current logic will use sizeof(struct qrtr_ctrl_pkt) as the
receive buffer size, which is not enough to hold the maximum
"normal" message plus its header. Currently this problem occurs
for (13 < recv_buf_size < 20).
This patch corrects this.
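One ordering of the adjustment that satisfies both constraints (a sketch, not necessarily the exact upstream diff):

/* Always leave room for the qmi_header in front of a "normal" message,
 * then make sure control packets still fit as well. */
recv_buf_size += sizeof(struct qmi_header);
if (recv_buf_size < sizeof(struct qrtr_ctrl_pkt))
        recv_buf_size = sizeof(struct qrtr_ctrl_pkt);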
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 62fc00a661 ]
The `s2idle_lock' is acquired during suspend while interrupts are
disabled even on RT. The lock is acquired for short sections only.
Make it a RAW lock which avoids "sleeping while atomic" warnings on RT.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ff3d27a048 ]
Under the following case, qgroup rescan can double account cowed tree
blocks:
In this case, extent tree only has one tree block.
-
| transid=5 last committed=4
| btrfs_qgroup_rescan_worker()
| |- btrfs_start_transaction()
| | transid = 5
| |- qgroup_rescan_leaf()
| |- btrfs_search_slot_for_read() on extent tree
| Get the only extent tree block from commit root (transid = 4).
| Scan it, set qgroup_rescan_progress to the last
| EXTENT/META_ITEM + 1
| now qgroup_rescan_progress = A + 1.
|
| fs tree get CoWed, new tree block is at A + 16K
| transid 5 get committed
-
| transid=6 last committed=5
| btrfs_qgroup_rescan_worker()
| |- btrfs_start_transaction()
| | transid = 5
| |- qgroup_rescan_leaf()
| |- btrfs_search_slot_for_read() on extent tree
| Get the only extent tree block from commit root (transid = 5).
| scan it using qgroup_rescan_progress (A + 1).
| found new tree block beyond A, and it's an fs tree block,
| account it to increase qgroup numbers.
-
In the above case, tree block A and tree block A + 16K get accounted twice,
while qgroup rescan should stop when it has already reached the last leaf,
rather than continue using its qgroup_rescan_progress.
Such a case can be hit by just looping btrfs/017, which with some
probability triggers this double qgroup accounting problem.
Fix it by checking the path to determine whether we should finish the qgroup
rescan, rather than relying on the next loop to exit.
Reported-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d3a2e610e ]
Currently the code assumes that there's an implied barrier by the
sequence of code preceding the wakeup, namely the mutex unlock.
As Nikolay pointed out:
I think this is wrong (not your code) but the original assumption that
the RELEASE semantics provided by mutex_unlock is sufficient.
According to memory-barriers.txt:
Section 'LOCK ACQUISITION FUNCTIONS' states:
(2) RELEASE operation implication:
Memory operations issued before the RELEASE will be completed before the
RELEASE operation has completed.
Memory operations issued after the RELEASE *may* be completed before the
RELEASE operation has completed.
(I've bolded the may portion)
The example given there:
As an example, consider the following:
*A = a;
*B = b;
ACQUIRE
*C = c;
*D = d;
RELEASE
*E = e;
*F = f;
The following sequence of events is acceptable:
ACQUIRE, {*F,*A}, *E, {*C,*D}, *B, RELEASE
So if we assume that *C is modifying the flag which the waitqueue is checking,
and *E is the actual wakeup, then those accesses can be re-ordered...
IMHO this code should be considered broken...
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c08db7d8d2 ]
In btrfs_evict_inode(), if btrfs_truncate_inode_items() fails, the inode
item will still be in the tree but we still return the ino to the ino
cache. That will blow up later when someone tries to allocate that ino,
so don't return it to the cache.
Fixes: 581bb05094 ("Btrfs: Cache free inode numbers in memory")
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 90b2da89a0 ]
When a buffer is queued or requeued in vb2_buffer_done, don't
call the finish memop. In this case the buffer is only returned to vb2,
not to userspace.
Calling 'finish' here will cause an imbalance when the queue is
canceled, since the core will call the same memop again.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a3d71f256c ]
The current logic causes Smatch to false-detect a Spectre variant 1
vulnerability. The problem is that it initializes a u32 indirectly
from user space input.
After spending a while trying to write a fixup, I realized that, in
practice, this shouldn't be a problem, as the u32 is initialized
from a u8, but it took some time to discover that.
So, do some code cleanup to make the valid range for "op" clearer to
both humans and machines.
Fix this warning:
drivers/media/cec/cec-pin-error-inj.c:170 cec_pin_error_inj_parse_line() warn: potential spectre issue 'pin->error_inj_args'
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7d0b130cbb ]
The RX Buffer Descriptor contains a VALID bit which indicates if the BD
is valid and has some data. This field is set by the HNS3 hardware to
inform the driver that valid data is present in the BD, and should
be reset by the driver when the BD is being used again. In the existing
code this bit was not being (re-)initialized properly and hence was
causing problems.
Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 544a7bcd5c ]
When registering a RoCE client with an hnae3 VF device, it needs to check
whether the device supports the RoCE VF function. Otherwise, it will
lead to a call trace when RoCE does not support the VF function and the
RoCE device is removed.
The call trace is as follows:
[ 93.156614] Unable to handle kernel NULL pointer dereference at virtual address 00000015
<SNIP>
[ 93.278784] Call trace:
[ 93.278788] hnae3_match_n_instantiate+0x24/0xd8 [hnae3]
[ 93.278790] hnae3_register_client+0xcc/0x150 [hnae3]
[ 93.278801] hns_roce_hw_v2_init+0x18/0x1000 [hns_roce_hw_v2]
[ 93.278805] do_one_initcall+0x58/0x160
[ 93.278807] do_init_module+0x64/0x1d8
[ 93.278809] load_module+0x135c/0x15c8
[ 93.278811] SyS_finit_module+0x100/0x118
[ 93.278816] __sys_trace_return+0x0/0x4
[ 93.278827] Code: aa0003f5 12001c56 aa1e03e0 d503201f (b9402660)
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Reported-by: Xinwei Kong <kong.kongxinwei@hisilicon.com>
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6e91d48371 ]
the wl pointer can be null In case only wlcore_sdio is probed while
no WiLink module is successfully probed, as in the case of mounting a
wl12xx module while using a device tree file configured with wl18xx
related settings.
In this case the system was crashing in wl1271_suspend() as platform
device data is not set.
Make sure wl the pointer is valid before using it.
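The guard amounts to an early bail-out in the suspend path, roughly (simplified; only the NULL check is the point here):

static int wl1271_suspend(struct device *dev)
{
        struct wl1271 *wl = dev_get_drvdata(dev);   /* may be NULL, see above */

        if (!wl)
                return 0;

        /* normal suspend handling would continue here */
        return 0;
}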
Signed-off-by: Eyal Reizer <eyalr@ti.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b817047ae7 ]
Race condition is observed during rmmod of mwifiex_usb:
1. The rmmod thread will call mwifiex_usb_disconnect(), download
SHUTDOWN command and do wait_event_interruptible_timeout(),
waiting for response.
2. The main thread will handle the response and will do a
wake_up_interruptible(), unblocking rmmod thread.
3. On getting unblocked, rmmod thread will make rx_cmd.urb = NULL in
mwifiex_usb_free().
4. The main thread will try to resubmit rx_cmd.urb in
mwifiex_usb_submit_rx_urb(), which is NULL.
To fix, wait for main thread to complete before calling
mwifiex_usb_free().
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c0ac70ad2 ]
In case one BE is used by two FEs (FE1/FE2):
FE1--->BE-->
|
FE2----]
When FE1 and FE2 call dpcm_be_dai_hw_free() together,
the BE's user count will be 2 (> 1), hence it cannot be hw_free'd,
and the BE state is left at, e.g., SND_SOC_DPCM_STATE_STOP.
When FE1/FE2 later call dpcm_be_dai_shutdown(),
it will be skipped due to the wrong state,
leaving the BE neither hw_free'd nor shut down.
With this change, the BE DAI will be hw_free'd later when calling
dpcm_be_dai_shutdown() if it is still in an invalid state.
Signed-off-by: KaiChieh Chuang <kaichieh.chuang@mediatek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 984e9cf1b9 ]
When the arm-cci driver is enabled, but both CONFIG_ARM_CCI5xx_PMU and
CONFIG_ARM_CCI400_PMU are not, we get a warning about how parts of
the driver are never used:
drivers/perf/arm-cci.c:1454:29: error: 'cci_pmu_models' defined but not used [-Werror=unused-variable]
drivers/perf/arm-cci.c:693:16: error: 'cci_pmu_event_show' defined but not used [-Werror=unused-function]
drivers/perf/arm-cci.c:685:16: error: 'cci_pmu_format_show' defined but not used [-Werror=unused-function]
Marking all three functions as __maybe_unused avoids the warnings in
randconfig builds. I'm doing this lacking any ideas for a better fix.
Fixes: 3de6be7a3d ("drivers/bus: Split Arm CCI driver")
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9960521c44 ]
This patch fixes the following warning during boot:
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<(ptrval)>] qca_setup+0x194/0x750 [hci_uart]
WARNING: CPU: 2 PID: 1878 at kernel/sched/core.c:6135
__might_sleep+0x7c/0x88
In qca_set_baudrate(), the current task state is set to
TASK_UNINTERRUPTIBLE before going to sleep for 300ms. It was then
restored to TASK_INTERRUPTIBLE. This patch sets the current task state
back to TASK_RUNNING instead.
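For illustration, the corrected sleep sequence looks roughly like this (a
sketch, not a quote of the driver code):

set_current_state(TASK_UNINTERRUPTIBLE);
schedule_timeout(msecs_to_jiffies(300));
/* restore the state the scheduler expects instead of TASK_INTERRUPTIBLE */
set_current_state(TASK_RUNNING);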
Signed-off-by: Thierry Escande <thierry.escande@linaro.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d94c5a820d ]
Currently, a BA session is opened when the tx traffic exceeds
10 frames per second. As a result of inter-op problems with some
APs, add a condition to open a BA session only when the station is
already authorized.
Fixes: 482e48440a ("iwlwifi: mvm: change open and close criteria of a BA session")
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0f22e40053 ]
Make sure the rx_allocator worker is canceled before running the
rx_init routine. rx_init frees and re-allocates all rxb's pages. The
rx_allocator worker also allocates pages for the used rxb's. Running
rx_init and rx_allocator simultaneously causes a kernel panic. Fix
that by canceling the work in rx_init.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e73e81b6d0 ]
[Problem description and how we fix it]
We should balance dirty metadata pages at the end of
btrfs_finish_ordered_io, since a small, unmergeable random write can
potentially produce dirty metadata which is multiple times larger than
the data itself. For example, a small, unmergeable 4KiB write may
produce:
16KiB dirty leaf (and possibly 16KiB dirty node) in subvolume tree
16KiB dirty leaf (and possibly 16KiB dirty node) in checksum tree
16KiB dirty leaf (and possibly 16KiB dirty node) in extent tree
Although we do call balance_dirty_pages() on the write side, in the
buffered write path most metadata is dirtied only after we reach the
dirty background limit (which so far only counts dirty data pages) and
wake up the flusher thread. If there are many small, unmergeable random
writes spread across a large btree, we'll find a burst of dirty pages
exceeding the dirty_bytes limit right after we wake up the flusher
thread - which is not what we expect. In our machine, it caused an
out-of-memory problem, since a page cannot be dropped if it is marked
dirty.
Someone may worry that we may sleep in btrfs_btree_balance_dirty_nodelay,
but since we do btrfs_finish_ordered_io in a separate worker, it will not
stop the flusher from consuming dirty pages. Also, as we use a different
worker for metadata writeback endio, sleeping in btrfs_finish_ordered_io
helps us throttle the size of dirty metadata pages.
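As a sketch of the idea (the fs_info variable name is assumed from context,
not quoted from the patch):

/* at the tail of btrfs_finish_ordered_io(), once the ordered extent work is done */
btrfs_btree_balance_dirty_nodelay(fs_info);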
[Reproduce steps]
To reproduce the problem, we need to do 4KiB write randomly spread in a
large btree. In our 2GiB RAM machine:
1) Create 4 subvolumes.
2) Run fio on each subvolume:
[global]
direct=0
rw=randwrite
ioengine=libaio
bs=4k
iodepth=16
numjobs=1
group_reporting
size=128G
runtime=1800
norandommap
time_based
randrepeat=0
3) Take snapshot on each subvolume and repeat fio on existing files.
4) Repeat step (3) until we get large btrees.
In our case, by observing btrfs_root_item->bytes_used, we have 2GiB of
metadata in each subvolume tree and 12GiB of metadata in extent tree.
5) Stop all fio, take snapshot again, and wait until all delayed work is
completed.
6) Start all fio. A few seconds later we hit OOM when the flusher starts
to work.
It can be reproduced even when using nocow write.
Signed-off-by: Ethan Lien <ethanlien@synology.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ add comment ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 480daa9cb6 ]
Driver switches vif sta_state into QTNF_STA_CONNECTING when cfg80211
core initiates connect procedure. Further this state is changed either
to QTNF_STA_CONNECTED or to QTNF_STA_DISCONNECTED by BSS_JOIN and
BSS_LEAVE events from firmware. However it is possible that no such
events will be sent by firmware, e.g. if EAPOL timed out.
In this case vif sta_mode will remain in QTNF_STA_CONNECTING state and
all subsequent connection attempts will fail with -EBUSY error code.
Fix this by performing STA state transition from QTNF_STA_CONNECTING
to QTNF_STA_DISCONNECTED in cfg80211 disconnect callback.
No need to rely upon firmware events in this case.
Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd4b16b4f9 ]
We were picking up the wrong header; we should use asm/ioctls.h from the
kernel and not the header from the system (sys/ioctl.h). In the current
code we added the correct include and we added the kernel headers path to
the CFLAGS.
Fixes: ce290a1960 ("selftests: add devpts selftests")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7357dcf2ef ]
When devpts_pts test is skipped because of unmet dependencies and/or
unsupported configuration, it exits with error which is treated as
a fail by the Kselftest framework. This leads to false negative
result even when the test could not be run.
In another case, it returns pass for a skipped test, reporting a false
positive.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Change it to use ksft_exit_skip() when test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5c30a038fb ]
When intel_pstate test is skipped because of unmet dependencies and/or
unsupported configuration, it returns 0 which is treated as a pass
by the Kselftest framework. This leads to false positive result even
when the test could not be run.
Change it to return kselftest skip code when a test gets skipped to
clearly report that the test could not be run.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ab0e9c4b91 ]
When kvm test is skipped because of unmet dependencies and/or unsupported
configuration, it exits with error which is treated as a fail by the
Kselftest framework. This leads to false negative result even when the test
could not be run.
Change it to return kselftest skip code when a test gets skipped to clearly
report that the test could not be run.
Change it to use ksft_exit_skip() when the test is skipped. In addition,
refine the test_assert() message to include the strerror() string and add
an explicit check for EACCES to clearly identify when the test doesn't run
because access to required resources is denied, e.g.: open /dev/kvm
failed, rc: -1 errno: 13
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b27f0259e8 ]
When memfd test is skipped because of unmet dependencies and/or unsupported
configuration, it returns non-zero value which is treated as a fail by the
Kselftest framework. This leads to false negative result even when the test
could not be run.
Change it to return kselftest skip code when a test gets skipped to clearly
report that the test could not be run.
Added an explicit check for root user at the start of memfd hugetlbfs test
and return skip code if a non-root user attempts to run it.
In addition, return skip code when not enough huge pages are available to
run the test.
Kselftest framework SKIP code is 4 and the framework prints appropriate
messages to indicate that the test is skipped.
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e9d33f149f ]
A few changes improve the overall usability of the test:
* fix a hard-coded maximum frequency (3300),
* don't adjust the CPU frequency if only evaluating results,
* fix a comparison for multiple frequencies.
A symptom of that last issue looked like this:
./run.sh: line 107: [: too many arguments
./run.sh: line 110: 3099
3099
3100-3100: syntax error in expression (error token is \"3099
3100-3100\")
This happens because a check counts how many different frequencies
there are among the CPUs of the system, and after they are tallied
another read is performed, which might produce different results.
Signed-off-by: Daniel Díaz <daniel.diaz@linaro.org>
Signed-off-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a12bffebc0 ]
In bfq_requests_merged(), there is a deadlock because the lock on
bfqq->bfqd->lock is held by the calling function, but the code of
this function tries to grab the lock again.
This deadlock is currently hidden by another bug (fixed by next commit
for this source file), which causes the body of bfq_requests_merged()
to be never executed.
This commit removes the deadlock by removing the lock/unlock pair.
Signed-off-by: Filippo Muzzini <filippo.muzzini@outlook.it>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e5e5732d81 ]
After revoking an atomic write, the related LBA can be reused by others,
so we need to wait for page writeback before reusing the LBA, in order to
avoid interference between old atomic-write in-flight IO and new IO.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1174abfd83 ]
As described in commit "f2fs: don't drop any page on f2fs_cp_error()
case":
"We still provide readdir() after shutdown, so we should keep pages to
avoid additional IOs."
In order to provide the latest directory structure, let's keep dentry
pages in the cache after fs shutdown.
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4071e67cff ]
The following patch disables loading of the f2fs module on architectures
which have PAGE_SIZE > 4096, since it is impossible to mount f2fs on
such architectures; the log messages are:
mount: /mnt: wrong fs type, bad option, bad superblock on
/dev/vdiskb1, missing codepage or helper program, or other error.
/dev/vdiskb1: F2FS filesystem,
UUID=1d8b9ca4-2389-4910-af3b-10998969f09c, volume name ""
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid
page_cache_size (8192), supports only 4KB
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS
filesystem in 1th superblock
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid
page_cache_size (8192), supports only 4KB
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Can't find valid F2FS
filesystem in 2th superblock
May 15 18:03:13 ttip kernel: F2FS-fs (vdiskb1): Invalid
page_cache_size (8192), supports only 4KB
which was introduced by git commit 5c9b469295.
Tested on git kernel 4.17.0-rc6-00309-gec30dcf7f425
with patch applied:
modprobe: ERROR: could not insert 'f2fs': Invalid argument
May 28 01:40:28 v215 kernel: F2FS not supported on PAGE_SIZE(8192) != 4096
Signed-off-by: Anatoly Pugachev <matorola@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ae55e59da0 ]
If the server recalls the layout that was just handed out, we risk hitting
a race as described in RFC5661 Section 2.10.6.3 unless we ensure that we
release the sequence slot after processing the LAYOUTGET operation that
was sent as part of the OPEN compound.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e37d07983a ]
When cleaning up buffer entries as we wrap up, their state should be
"completed". If any of the entries is in "submitted" state, it means
that something bad has happened. Trigger a warning immediately instead of
waiting for the state flag to eventually be updated, thus hiding the
issue.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f142ac0b5d ]
Currently in case of error caused by bio_pc_add_page in
pblk_bio_add_pages two issues occur when calling from
pblk_rb_read_to_bio(). First one is in pblk_bio_free_pages, since we
are trying to free pages not allocated from our mempool. Second one
is the warn from dma_pool_free, that we are trying to free NULL
pointer dma.
This commit fix both issues.
Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Signed-off-by: Marcin Dziegielewski <marcin.dziegielewski@intel.com>
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f5be79673f ]
When hclge_ae_stop is called during resetting, it will cancel the
service_task by calling cancel_work_sync, which may cause the
service_task to exit without clearing HCLGE_STATE_SERVICE_SCHED
bit. If this happens, the service_task will never run again.
This patch fixes this problem by clearing it after calling
cancel_work_sync in hclge_ae_stop.
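A minimal sketch of the resulting ordering (the hdev member names are
assumptions for illustration):

cancel_work_sync(&hdev->service_task);
/* the task may have exited without clearing the flag, so clear it here */
clear_bit(HCLGE_STATE_SERVICE_SCHED, &hdev->state);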
Fixes: 46a3df9f97 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9617f66867 ]
When resetting, phy_state_machine may be accessing the phy through
firmware if the phy is not stopped or disconnected, which will
cause a firmware timeout problem because the firmware is busy
processing the reset request.
This patch fixes it by disabling the phy when resetting.
Fixes: b940aeae0ed6 ("net: hns3: never send command queue message to IMP when reset")
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45ba63a29f ]
When the main processor goes idle, by default its clock is stopped.
However, this also stops the clock of the co-processor.
Here, if the C1CLK clock is enabled, we disable this functionality.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4d3f36c5e9 ]
The si544 driver had a rounding problem: using the result of clk_round_rate
may set the clock to yet another rate, for example:
clk_round_rate(195000000) = 194999999
clk_round_rate(194999999) = 194999998
Clients would expect that after clk_set_rate(clk, freq2=clk_round_rate(clk, freq)) the
chip will be running at exactly freq2.
The problem was in the calculation of the feedback divider: it was always rounded
down instead of to the nearest possible VCO value.
After this change, the following holds true for any supported frequency:
actual_freq = clk_round_rate(clk, freq);
clk_set_rate(clk, actual_freq);
clk_round_rate(clk, actual_freq) == actual_freq && clk_get_rate(clk) == actual_freq
Signed-off-by: Mike Looijmans <mike.looijmans@topic.nl>
Fixes: 953cc3e811 ("clk: Add driver for the si544 clock generator chip")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 926bc2f100 ]
The stores to update the SLB shadow area must be made as they appear
in the C code, so that the hypervisor does not see an entry with
mismatched vsid and esid. Use WRITE_ONCE for this.
GCC has been observed to elide the first store to esid in the update,
which means that if the hypervisor interrupts the guest after storing
to vsid, it could see an entry with old esid and new vsid, which may
possibly result in memory corruption.
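A sketch of the intended store ordering (the structure and field names are
assumptions based on the description above):

/* clear the ESID first so the hypervisor never sees a mixed esid/vsid pair */
WRITE_ONCE(p->save_area[index].esid, 0);
WRITE_ONCE(p->save_area[index].vsid, cpu_to_be64(new_vsid));
WRITE_ONCE(p->save_area[index].esid, cpu_to_be64(new_esid));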
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 447808bf50 ]
time_init() will set up tb_ticks_per_usec based on reality.
time_init() is called *after* udbg_init_opal_common() during boot.
from arch/powerpc/kernel/time.c:
unsigned long tb_ticks_per_usec = 100; /* sane default */
Currently, all powernv systems have a timebase frequency of 512MHz
(512000000/1000000 == 0x200) - although there's nothing written
down anywhere that I can find saying that we couldn't make that
different based on the requirements in the ISA.
So, we've been (accidentally) thwacking the (currently) correct
(for powernv at least) value for tb_ticks_per_usec earlier than
we otherwise would have.
The "sane default" seems to be adequate for our purposes between
udbg_init_opal_common() and time_init() being called, and if it isn't,
then we should probably be setting it somewhere that isn't hvc_opal.c!
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 46d4be41b9 ]
Correct two cases where eeh_pcid_get() is used to reference the driver's
module but the reference is dropped before the driver pointer is used.
In eeh_rmv_device(), also refactor a little so that only two calls to
eeh_pcid_put() are needed rather than three, and the reference isn't
taken at all if it wasn't needed.
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a6b3964ad7 ]
A no-op form of ori (or immediate of 0 into r31 and the result stored
in r31) has been re-tasked as a speculation barrier. The instruction
only acts as a barrier on newer machines with appropriate firmware
support. On older CPUs it remains a harmless no-op.
Implement barrier_nospec using this instruction.
mpe: The semantics of the instruction are believed to be that it
prevents execution of subsequent instructions until preceding branches
have been fully resolved and are no longer executing speculatively.
There is no further documentation available at this time.
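A simplified sketch of the idea (the real kernel wires the barrier up
through patchable feature sections rather than emitting it unconditionally):

/* "ori 31,31,0": a no-op on older CPUs, a speculation barrier where supported */
#define barrier_nospec() asm volatile("ori 31,31,0" ::: "memory")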
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1128bb7813 ]
commit 87a156fb18 ("Align hot loops of some string functions")
degraded the performance of string functions by adding useless
nops.
A simple benchmark on an 8xx calling 100000x a memchr() that
matches the first byte runs in 41668 TB ticks before this patch
and in 35986 TB ticks after this patch. So this gives an
improvement of approx 10%
Another benchmark doing the same with a memchr() matching the 128th
byte runs in 1011365 TB ticks before this patch and 1005682 TB ticks
after this patch, so regardless on the number of loops, removing
those useless nops improves the test by 5683 TB ticks.
Fixes: 87a156fb18 ("Align hot loops of some string functions")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cb2595c139 ]
ucma_process_join() will free the new allocated "mc" struct,
if there is any error after that, especially the copy_to_user().
But in parallel, ucma_leave_multicast() could find this "mc"
through idr_find() before ucma_process_join() frees it, since it
is already published.
So "mc" could be used in ucma_leave_multicast() after it is been
allocated and freed in ucma_process_join(), since we don't refcnt
it.
Fix this by separating "publish" from ID allocation, so that we
can get an ID first and publish it later after copy_to_user().
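The resulting pattern is roughly the following (a sketch using the generic
idr API; the idr name is an assumption):

/* reserve an ID but publish NULL so a concurrent lookup cannot see "mc" yet */
mc->id = idr_alloc(&multicast_idr, NULL, 0, 0, GFP_KERNEL);
/* ... finish setup and copy_to_user() ... */
/* only now make the object visible to ucma_leave_multicast() */
idr_replace(&multicast_idr, mc, mc->id);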
Fixes: c8f6a362bf ("RDMA/cma: Add multicast communication support")
Reported-by: Noam Rathaus <noamr@beyondsecurity.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fff200caf6 ]
There have been multiple reports of crashes that look like
kernel: RIP: 0010:[<ffffffff8110303f>] timecounter_read+0xf/0x50
[...]
kernel: Call Trace:
kernel: [<ffffffffa0806b0f>] e1000e_phc_gettime+0x2f/0x60 [e1000e]
kernel: [<ffffffffa0806c5d>] e1000e_systim_overflow_work+0x1d/0x80 [e1000e]
kernel: [<ffffffff810992c5>] process_one_work+0x155/0x440
kernel: [<ffffffff81099e16>] worker_thread+0x116/0x4b0
kernel: [<ffffffff8109f422>] kthread+0xd2/0xf0
kernel: [<ffffffff8163184f>] ret_from_fork+0x3f/0x70
These can be traced back to the fact that e1000e_systim_reset() skips the
timecounter_init() call if e1000e_get_base_timinca() returns -EINVAL, which
leads to a null deref in timecounter_read().
Commit 83129b37ef ("e1000e: fix systim issues", v4.2-rc1) reworked
e1000e_get_base_timinca() in such a way that it can return -EINVAL for
e1000_pch_spt if the SYSCFI bit is not set in TSYNCRXCTL.
Some experimentation has shown that on I219 (e1000_pch_spt, "MAC: 12")
adapters, the E1000_TSYNCRXCTL_SYSCFI flag is unstable; TSYNCRXCTL reads
sometimes don't have the SYSCFI bit set. Retrying the read shortly after
finds the bit to be set. This was observed at boot (probe) but also link up
and link down.
Moreover, the phc (PTP Hardware Clock) seems to operate normally even after
reads where SYSCFI=0. Therefore, remove this register read and
unconditionally set the clock parameters.
Reported-by: Achim Mildenberger <admin@fph.physik.uni-karlsruhe.de>
Message-Id: <20180425065243.g5mqewg5irkwgwgv@f2>
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1075876
Fixes: 83129b37ef ("e1000e: fix systim issues")
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 73fb0949cf ]
KASAN found a UAF in ceph_statfs. This was a one-off bug, but looking at
the code it appears the monmap access needs to be protected, as it can
be modified while we're accessing it. Fix this by protecting the access
with the monc->mutex.
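The shape of the fix, as a sketch:

mutex_lock(&monc->mutex);
/* monc->monmap cannot be swapped for a new map while the mutex is held,
 * so the fields needed for statfs can be read safely here */
mutex_unlock(&monc->mutex);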
BUG: KASAN: use-after-free in ceph_statfs+0x21d/0x2c0
Read of size 8 at addr ffff88006844f2e0 by task trinity-c5/304
CPU: 0 PID: 304 Comm: trinity-c5 Not tainted 4.17.0-rc6+ #172
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0xa5/0x11b
? show_regs_print_info+0x5/0x5
? kmsg_dump_rewind+0x118/0x118
? ceph_statfs+0x21d/0x2c0
print_address_description+0x73/0x2b0
? ceph_statfs+0x21d/0x2c0
kasan_report+0x243/0x360
ceph_statfs+0x21d/0x2c0
? ceph_umount_begin+0x80/0x80
? kmem_cache_alloc+0xdf/0x1a0
statfs_by_dentry+0x79/0xb0
vfs_statfs+0x28/0x110
user_statfs+0x8c/0xe0
? vfs_statfs+0x110/0x110
? __fdget_raw+0x10/0x10
__se_sys_statfs+0x5d/0xa0
? user_statfs+0xe0/0xe0
? mutex_unlock+0x1d/0x40
? __x64_sys_statfs+0x20/0x30
do_syscall_64+0xee/0x290
? syscall_return_slowpath+0x1c0/0x1c0
? page_fault+0x1e/0x30
? syscall_return_slowpath+0x13c/0x1c0
? prepare_exit_to_usermode+0xdb/0x140
? syscall_trace_enter+0x330/0x330
? __put_user_4+0x1c/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Allocated by task 130:
__kmalloc+0x124/0x210
ceph_monmap_decode+0x1c1/0x400
dispatch+0x113/0xd20
ceph_con_workfn+0xa7e/0x44e0
process_one_work+0x5f0/0xa30
worker_thread+0x184/0xa70
kthread+0x1a0/0x1c0
ret_from_fork+0x35/0x40
Freed by task 130:
kfree+0xb8/0x210
dispatch+0x15a/0xd20
ceph_con_workfn+0xa7e/0x44e0
process_one_work+0x5f0/0xa30
worker_thread+0x184/0xa70
kthread+0x1a0/0x1c0
ret_from_fork+0x35/0x40
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Reviewed-by: "Yan, Zheng" <zyan@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2b787b66bc ]
The TW9910 PDN gpio (power down) is listed as active high in the chip
manual. It turns out it is actually active low, as when set to physical
level 0 it actually turns the video decoder power off.
Without this patch applied:
tw9910 0-0045: Product ID error 1f:2
With this patch applied:
tw9910 0-0045: tw9910 Product ID b:0
Fixes: commit "186c446f4b840bd77b79d3dc951ca436cb8abe79"
Signed-off-by: Jacopo Mondi <jacopo+renesas@jmondi.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 44ee54aabf ]
The DA9063 watchdog has only one register field to store the timeout value
and to enable the watchdog. The watchdog gets enabled if the value is
not zero. There is no issue if the watchdog is already running, but it
leads to problems if the watchdog is disabled.
If the watchdog is disabled and only the timeout value should be prepared,
the watchdog gets enabled too. Add a check to get the current watchdog
state and update the watchdog timeout value on the hardware side only if
the watchdog is already active.
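Roughly (the update helper is hypothetical; only watchdog_active() is a
real watchdog-core helper):

wdd->timeout = timeout;
if (watchdog_active(wdd))
	/* hypothetical helper: only touch the hardware if already running */
	ret = da9063_wdt_update_timeout(da9063, timeout);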
Fixes: 5e9c16e376 ("watchdog: Add DA9063 PMIC watchdog driver.")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bd975e6914 ]
When listing sets with timeout support, there's a possibility that
entries which are just timing out are listed/saved with a "0" timeout
value. However, when restoring the saved list, a zero timeout value
means permanent elements.
The new behaviour is that entries which are timing out are listed with
"timeout 1" instead of zero.
Fixes netfilter bugzilla #1258.
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cbdebe481a ]
Userspace `ipset` command forbids family option for hash:mac type:
ipset create test hash:mac family inet4
ipset v6.30: Unknown argument: `family'
However, this check is not done in the kernel itself. When someone uses
external netlink applications (the pyroute2 python library, for example),
one can create hash:mac with an invalid family, leading to inconsistent
results from userspace (the `ipset` command cannot read the set content
anymore).
This patch enforces the logic in the kernel and forbids insertion of
hash:mac with a family set.
Since IP_SET_PROTO_UNDEF is defined only for hash:mac, this patch has no
impact on other hash:* sets
Signed-off-by: Florent Fourcot <florent.fourcot@wifirst.fr>
Signed-off-by: Victorien Molle <victorien.molle@wifirst.fr>
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1819e40908 ]
When the VF frequently switches the CMDQ interrupt, if the CMDQ_SRC is not
cleared, the VF will not receive the new PF response after the interrupt
is re-enabled, the corresponding log is as follows:
[ 317.482222] hns3 0000:00:03.0: VF could not get mbx resp(=0) from PF
in 500 tries
[ 317.483137] hns3 0000:00:03.0: VF request to get tqp info from PF
failed -5
This patch fixes this problem by clearing CMDQ_SRC before enabling
interrupt and syncing pending IRQ handlers after disabling interrupt.
Fixes: e2cb1dec97 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fb1967a69f ]
When retransmitting the final ACK or ABORT packet for a call, the cid field
in the packet header is set to the connection's cid, but this is incorrect
as it also needs to include the channel number on that connection that the
call was made on.
Fix this by OR'ing in the channel number.
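In code terms the change amounts to something like this (field names are
assumptions based on the description):

pkt.whdr.cid = htonl(conn->proto.cid | channel);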
Note that this fixes the bug that commit 1a025028d4 ("rxrpc: Fix handling
of call quietly cancelled out on server") works around. I'm not intending
to revert that as it will help protect against problems that might occur
on the server.
Fixes: 3136ef49a1 ("rxrpc: Delay terminal ACK transmission on a client call")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit abfdff44bc ]
When using RTC_ALM_SET or RTC_WKALM_SET with rtc_wkalrm.enabled not set,
rtc_timer_enqueue() is not called and rtc_set_alarm() may succeed but the
subsequent RTC_AIE_ON ioctl will fail. RTC_ALM_READ would also fail in that
case.
Ensure rtc_set_alarm() fails when alarms are not supported to avoid letting
programs think the alarms are working for a particular RTC when they are
not.
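A minimal sketch of the added guard at the top of rtc_set_alarm()
(simplified):

if (!rtc->ops)
	return -ENODEV;
if (!rtc->ops->set_alarm)
	return -EINVAL;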
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e81bf9793b ]
The LKP robot found a 27% will-it-scale/page_fault3 performance
regression regarding commit e27be240df53("mm: memcg: make sure
memory.events is uptodate when waking pollers").
What the test does is:
1 mkstemp() a 128M file on a tmpfs;
2 start $nr_cpu processes, each to loop the following:
2.1 mmap() this file in shared write mode;
2.2 write 0 to this file in a PAGE_SIZE step till the end of the file;
2.3 unmap() this file and repeat this process.
3 After 5 minutes, check how many loops they managed to complete, the
higher the better.
The commit itself looks innocent enough as it merely changed some event
counting mechanism and this test didn't trigger those events at all.
Perf shows increased cycles spent accessing root_mem_cgroup->stat_cpu
in count_memcg_event_mm() (called by handle_mm_fault()) and in
__mod_memcg_state() called by page_add_file_rmap(). So it's likely due
to the changed layout of 'struct mem_cgroup', which either makes stat_cpu
fall into a constantly modified cacheline or causes some hot fields to
stop being in the same cacheline.
I verified this by moving memory_events[] back to where it was:
: --- a/include/linux/memcontrol.h
: +++ b/include/linux/memcontrol.h
: @@ -205,7 +205,6 @@ struct mem_cgroup {
: int oom_kill_disable;
:
: /* memory.events */
: - atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
: struct cgroup_file events_file;
:
: /* protect arrays of thresholds */
: @@ -238,6 +237,7 @@ struct mem_cgroup {
: struct mem_cgroup_stat_cpu __percpu *stat_cpu;
: atomic_long_t stat[MEMCG_NR_STAT];
: atomic_long_t events[NR_VM_EVENT_ITEMS];
: + atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
:
: unsigned long socket_pressure;
And performance was restored.
Later investigation found that as long as the following 3 fields
moving_account, move_lock_task and stat_cpu are in the same cacheline,
performance will be good. To avoid future performance surprise by other
commits changing the layout of 'struct mem_cgroup', this patch makes
sure the 3 fields stay in the same cacheline.
One concern with this approach is that moving_account and move_lock_task
can be modified when a process changes memory cgroup, while stat_cpu is an
always-read field, so it might hurt to place them in the same cacheline. I
assume it is rare for a process to change memory cgroup, so this should
be OK.
Link: https://lkml.kernel.org/r/20180528114019.GF9904@yexl-desktop
Link: http://lkml.kernel.org/r/20180601071115.GA27302@intel.com
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Reported-by: kernel test robot <xiaolong.ye@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 401c636a0e ]
When we get a hung task it can often be valuable to see _all_ the hung
tasks on the system before calling panic().
Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&id=5316056503549952
----------------------------------------
INFO: task syz-executor0:6540 blocked for more than 120 seconds.
Not tainted 4.16.0+ #13
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0 D23560 6540 4521 0x80000004
Call Trace:
context_switch kernel/sched/core.c:2848 [inline]
__schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
schedule+0xf5/0x430 kernel/sched/core.c:3549
schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
__mutex_lock_common kernel/locking/mutex.c:833 [inline]
__mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
__blkdev_driver_ioctl block/ioctl.c:303 [inline]
blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
isofs_get_last_session fs/isofs/inode.c:567 [inline]
isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
mount_bdev+0x2b7/0x370 fs/super.c:1119
isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
mount_fs+0x66/0x2d0 fs/super.c:1222
vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
vfs_kern_mount fs/namespace.c:2514 [inline]
do_new_mount fs/namespace.c:2517 [inline]
do_mount+0xea4/0x2b90 fs/namespace.c:2847
ksys_mount+0xab/0x120 fs/namespace.c:3063
SYSC_mount fs/namespace.c:3077 [inline]
SyS_mount+0x39/0x50 fs/namespace.c:3074
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
(...snipped...)
Showing all locks held in the system:
(...snipped...)
2 locks held by syz-executor0/6540:
#0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline]
#0: 00000000566d4c39 (&type->s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); */
#1: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
(...snipped...)
3 locks held by syz-executor7/6541:
#0: 0000000043ca8836 (&lo->lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&lo->lo_ctl_mutex, 1); */
#1: 000000007bf3d3f9 (&bdev->bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192
#2: 00000000566d4c39 (&type->s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(&sb->s_umount); */
----------------------------------------
When reporting an AB-BA deadlock like shown above, it would be nice if
trace of PID=6541 is printed as well as trace of PID=6540 before calling
panic().
Showing hung tasks up to /proc/sys/kernel/hung_task_warnings could delay
calling panic() but normally there should not be so many hung tasks.
Link: http://lkml.kernel.org/r/201804050705.BHE57833.HVFOFtSOMQJFOL@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Mandeep Singh Baines <msb@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 48d8476b41 ]
MAP_DMA ioctls might be called from various threads within a process;
for example, when using QEMU, the vCPU threads often generate these
calls, and we therefore take a reference to that vCPU task.
However, QEMU also supports vCPU hotplug on some machines and the task
that called MAP_DMA may have exited by the time UNMAP_DMA is called,
resulting in the mm_struct pointer being NULL and thus a failure to
match against the existing mapping.
To resolve this, we instead take a reference to the thread
group_leader, which has the same mm_struct and resource limits, but
is less likely to exit, at least in the QEMU case. A difficulty here is
guaranteeing that the capabilities of the group_leader match those of
the calling thread, which we resolve by tracking CAP_IPC_LOCK at the
time of calling rather than at an indeterminate time in the future.
Potentially this also results in better efficiency as this is now
recorded once per MAP_DMA ioctl.
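The shape of the change, as a simplified sketch:

/* pin the group leader, not the (possibly hot-unpluggable) vCPU thread */
struct task_struct *task = current->group_leader;
bool lock_cap = capable(CAP_IPC_LOCK);	/* record the caller's capability now */

get_task_struct(task);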
Reported-by: Xu Yandong <xuyandong2@huawei.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 002fe996f6 ]
When we create an mdev device, we check for duplicates against the
parent device and return -EEXIST if found, but the mdev device
namespace is global since we'll link all devices from the bus. We do
catch this later in sysfs_do_create_link_sd() to return -EEXIST, but
with it comes a kernel warning and stack trace for trying to create
duplicate sysfs links, which makes it an undesirable response.
Therefore we should really be looking for duplicates across all mdev
parent devices, or as implemented here, against our mdev device list.
Using mdev_list to prevent duplicates means that we can remove
mdev_parent.lock, but in order not to serialize mdev device creation
and removal globally, we add mdev_device.active which allows UUIDs to
be reserved such that we can drop the mdev_list_lock before the mdev
device is fully in place.
Two behavioral notes; first, mdev_parent.lock had the side-effect of
serializing mdev create and remove ops per parent device. This was
an implementation detail, not an intentional guarantee provided to
the mdev vendor drivers. Vendor drivers can trivially provide this
serialization internally if necessary. Second, review comments note
the new -EAGAIN behavior when the device, and in particular the remove
attribute, becomes visible in sysfs. If a remove is triggered prior
to completion of mdev_device_create() the user will see a -EAGAIN
error. While the errno is different, receiving an error during this
period is not new; the previous implementation returned -ENODEV for the
same condition. Furthermore, the consistency to the user is improved
in the case where mdev_device_remove_ops() returns error. Previously
concurrent calls to mdev_device_remove() could see the device
disappear with -ENODEV and return in the case of error. Now a user
would see -EAGAIN while the device is in this transitory state.
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3171822fdc ]
When running a fuzz tester against a KASAN-enabled kernel, the following
splat periodically occurs.
The problem occurs when the test sends a GETDEVICEINFO request with a
malformed xdr array (size but no data) for gdia_notify_types and the
array size is > 0x3fffffff, which results in an overflow in the value of
nbytes which is passed to read_buf().
If the array size is 0x40000000, 0x80000000, or 0xc0000000, then after
the overflow occurs the value of nbytes is 0, and when that happens the
pointer returned by read_buf() points to the end of the xdr data (i.e.
argp->end) when it really should be returning NULL.
Fix this by returning NFS4ERR_BAD_XDR if the array size is > 1000 (this
value is arbitrary, but it's the same threshold used by
nfsd4_decode_bitmap()... it could really be any value >= 1 since it's
expected to get at most a single bitmap in gdia_notify_types).
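A sketch of the added bound check (the surrounding xdr decode code is
paraphrased, not quoted):

num = be32_to_cpup(p++);	/* gdia_notify_types array length from the wire */
if (num > 1000)			/* arbitrary sanity bound, see above */
	return nfserr_bad_xdr;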
[ 119.256854] ==================================================================
[ 119.257611] BUG: KASAN: use-after-free in nfsd4_decode_getdeviceinfo+0x5a4/0x5b0 [nfsd]
[ 119.258422] Read of size 4 at addr ffff880113ada000 by task nfsd/538
[ 119.259146] CPU: 0 PID: 538 Comm: nfsd Not tainted 4.17.0+ #1
[ 119.259662] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
[ 119.261202] Call Trace:
[ 119.262265] dump_stack+0x71/0xab
[ 119.263371] print_address_description+0x6a/0x270
[ 119.264609] kasan_report+0x258/0x380
[ 119.265854] ? nfsd4_decode_getdeviceinfo+0x5a4/0x5b0 [nfsd]
[ 119.267291] nfsd4_decode_getdeviceinfo+0x5a4/0x5b0 [nfsd]
[ 119.268549] ? nfs4svc_decode_compoundargs+0xa5b/0x13c0 [nfsd]
[ 119.269873] ? nfsd4_decode_sequence+0x490/0x490 [nfsd]
[ 119.271095] nfs4svc_decode_compoundargs+0xa5b/0x13c0 [nfsd]
[ 119.272393] ? nfsd4_release_compoundargs+0x1b0/0x1b0 [nfsd]
[ 119.273658] nfsd_dispatch+0x183/0x850 [nfsd]
[ 119.274918] svc_process+0x161c/0x31a0 [sunrpc]
[ 119.276172] ? svc_printk+0x190/0x190 [sunrpc]
[ 119.277386] ? svc_xprt_release+0x451/0x680 [sunrpc]
[ 119.278622] nfsd+0x2b9/0x430 [nfsd]
[ 119.279771] ? nfsd_destroy+0x1c0/0x1c0 [nfsd]
[ 119.281157] kthread+0x2db/0x390
[ 119.282347] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 119.283756] ret_from_fork+0x35/0x40
[ 119.286041] Allocated by task 436:
[ 119.287525] kasan_kmalloc+0xa0/0xd0
[ 119.288685] kmem_cache_alloc+0xe9/0x1f0
[ 119.289900] get_empty_filp+0x7b/0x410
[ 119.291037] path_openat+0xca/0x4220
[ 119.292242] do_filp_open+0x182/0x280
[ 119.293411] do_sys_open+0x216/0x360
[ 119.294555] do_syscall_64+0xa0/0x2f0
[ 119.295721] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 119.298068] Freed by task 436:
[ 119.299271] __kasan_slab_free+0x130/0x180
[ 119.300557] kmem_cache_free+0x78/0x210
[ 119.301823] rcu_process_callbacks+0x35b/0xbd0
[ 119.303162] __do_softirq+0x192/0x5ea
[ 119.305443] The buggy address belongs to the object at ffff880113ada000
which belongs to the cache filp of size 256
[ 119.308556] The buggy address is located 0 bytes inside of
256-byte region [ffff880113ada000, ffff880113ada100)
[ 119.311376] The buggy address belongs to the page:
[ 119.312728] page:ffffea00044eb680 count:1 mapcount:0 mapping:0000000000000000 index:0xffff880113ada780
[ 119.314428] flags: 0x17ffe000000100(slab)
[ 119.315740] raw: 0017ffe000000100 0000000000000000 ffff880113ada780 00000001000c0001
[ 119.317379] raw: ffffea0004553c60 ffffea00045c11e0 ffff88011b167e00 0000000000000000
[ 119.319050] page dumped because: kasan: bad access detected
[ 119.321652] Memory state around the buggy address:
[ 119.322993] ffff880113ad9f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 119.324515] ffff880113ad9f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 119.326087] >ffff880113ada000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 119.327547] ^
[ 119.328730] ffff880113ada080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 119.330218] ffff880113ada100: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
[ 119.331740] ==================================================================
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 692ad280bf ]
I noticed a memory corruption crash in nfsd in
4.17-rc1. This patch corrects the issue.
Fix to return error if the delegation couldn't be hashed or there was
a recall in progress. Use the existing error path instead of
destroy_delegation() for readability.
Signed-off-by: Andrew Elble <aweits@rit.edu>
Fixes: 353601e7d3 ("nfsd: create a separate lease for each delegation")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f9312a5410 ]
If the server returns NFS4ERR_SEQ_FALSE_RETRY or NFS4ERR_RETRY_UNCACHED_REP,
then it thinks we're trying to replay an existing request. If so, then
let's just bump the sequence ID and retry the operation.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 646bb57ce8 ]
When we were enabling macvlan interfaces we weren't correctly configuring
things until ixgbe_setup_tc was called a second time either by tweaking the
number of queues or increasing the macvlan count past 15.
The issue came down to the fact that num_rx_pools is not populated until
after the queues and interrupts are reinitialized.
Instead of trying to set it sooner, we can just move the call that sets up
at least 1 traffic class into the SR-IOV/VMDq setup function so that we
only set it for this one case. We already had a spot in the code here that
was configuring the queues for TC 0 anyway, so it makes sense to also set
the number of TCs here as well.
Fixes: 49cfbeb7a9 ("ixgbe: Fix handling of macvlan Tx offload")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 93b7f7ad20 ]
Currently, when IO to the DS fails, the client returns the layout and
retries against the MDS. However, on umounting (inode eviction) it then
returns the layout again.
This is because pnfs_return_layout() was changed in
commit d78471d32b ("pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR")
to always set NFS_LAYOUT_RETURN_REQUESTED so even if we returned
the layout, it will be returned again. Instead, let's also check
if we have already marked the layout invalid.
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7bf7bb37f1 ]
When finding the parent netvsc device, the search needs to be across
all netvsc device instances (independent of network namespace).
Find the parent device of the VF using the upper_dev_get routine, which
searches only the adjacent list.
Fixes: e8ff40d4bf ("hv_netvsc: improve VF device matching")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
netns aware byref
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57f230ab04 ]
The max number of slots used in xennet_get_responses() is set to
MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD).
In old kernel-xen MAX_SKB_FRAGS was 18, while nowadays it is 17. This
difference is resulting in frequent messages "too many slots" and a
reduced network throughput for some workloads (factor 10 below that of
a kernel-xen based guest).
Replacing MAX_SKB_FRAGS by XEN_NETIF_NR_SLOTS_MIN for calculation of
the max number of slots to use solves that problem (tests showed no
more messages "too many slots" and throughput was as high as with the
kernel-xen based guest system).
Replace MAX_SKB_FRAGS-2 by XEN_NETIF_NR_SLOTS_MIN-1 in
netfront_tx_slot_available() for making it clearer what is really being
tested without actually modifying the tested value.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 655c79bb40 ]
As a theoretical problem, dup_mmap() of an mm_struct with 60000+ vmas
can loop while potentially allocating memory, with mm->mmap_sem held for
write by current thread. This is bad if current thread was selected as
an OOM victim, for current thread will continue allocations using memory
reserves while OOM reaper is unable to reclaim memory.
As an actually observable problem, it is not difficult to make OOM
reaper unable to reclaim memory if the OOM victim is blocked at
i_mmap_lock_write() in this loop. Unfortunately, since nobody can
explain whether it is safe to use killable wait there, let's check for
SIGKILL before trying to allocate memory. Even without an OOM event,
there is no point with continuing the loop from the beginning if current
thread is killed.
I tested with debug printk(). This patch should be safe because we
already fail if security_vm_enough_memory_mm() or
kmem_cache_alloc(GFP_KERNEL) fails and exit_mmap() handles it.
***** Aborting dup_mmap() due to SIGKILL *****
***** Aborting dup_mmap() due to SIGKILL *****
***** Aborting dup_mmap() due to SIGKILL *****
***** Aborting dup_mmap() due to SIGKILL *****
***** Aborting exit_mmap() due to NULL mmap *****
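A sketch of the check added at the top of the vma copy loop in dup_mmap()
(the error label is an assumption):

if (fatal_signal_pending(current)) {
	retval = -EINTR;
	goto out;	/* unwind through the existing failure path */
}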
[akpm@linux-foundation.org: add comment]
Link: http://lkml.kernel.org/r/201804071938.CDE04681.SOFVQJFtMHOOLF@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Rik van Riel <riel@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9484b986e ]
Patch series "kcov: fix unexpected faults".
These patches fix a few issues where KCOV code could trigger recursive
faults, discovered while debugging a patch enabling KCOV for arch/arm:
* On CONFIG_PREEMPT kernels, there's a small race window where
__sanitizer_cov_trace_pc() can see a bogus kcov_area.
* Lazy faulting of the vmalloc area can cause mutual recursion between
fault handling code and __sanitizer_cov_trace_pc().
* During the context switch, switching the mm can cause the kcov_area to
be transiently unmapped.
These are prerequisites for enabling KCOV on arm, but the issues
themselves are generic -- we just happen to avoid them by chance rather
than by design on x86-64 and arm64.
This patch (of 3):
For kernels built with CONFIG_PREEMPT, some C code may execute before or
after the interrupt handler, while the hardirq count is zero. In these
cases, in_task() can return true.
A task can be interrupted in the middle of a KCOV_DISABLE ioctl while it
resets the task's kcov data via kcov_task_init(). Instrumented code
executed during this period will call __sanitizer_cov_trace_pc(), and as
in_task() returns true, will inspect t->kcov_mode before trying to write
to t->kcov_area.
In kcov_task_init() we update t->kcov_{mode,area,size} with plain stores,
which may be re-ordered, torn, etc. Thus __sanitizer_cov_trace_pc() may
see bogus values for any of these fields, and may attempt to write to
memory which is not mapped.
Let's avoid this by using WRITE_ONCE() to set t->kcov_mode, with a
barrier() to ensure this is ordered before we clear t->kcov_{area,size}.
This ensures that any code executed while kcov_task_init() is preempted
will either see valid values for t->kcov_{area,size}, or will see that
t->kcov_mode is KCOV_MODE_DISABLED, and bail out without touching
t->kcov_area.
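Roughly, the re-initialization becomes (simplified):

WRITE_ONCE(t->kcov_mode, KCOV_MODE_DISABLED);
barrier();	/* order the mode store before clearing area/size */
t->kcov_size = 0;
t->kcov_area = NULL;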
Link: http://lkml.kernel.org/r/20180504135535.53744-2-mark.rutland@arm.com
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9e25826ffc ]
Switchdev notifications for addition of SWITCHDEV_OBJ_ID_PORT_VLAN are
distributed not only on clean addition, but also when flags on an
existing VLAN are changed. mlxsw_sp_bridge_port_vlan_add() calls
mlxsw_sp_port_vlan_get() to get at the port_vlan in question, which
implicitly references the object. This then leads to discrepancies in
reference counting when the VLAN is removed. spectrum.c warns about the
problem when the module is removed:
[13578.493090] WARNING: CPU: 0 PID: 2454 at drivers/net/ethernet/mellanox/mlxsw/spectrum.c:2973 mlxsw_sp_port_remove+0xfd/0x110 [mlxsw_spectrum]
[...]
[13578.627106] Call Trace:
[13578.629617] mlxsw_sp_fini+0x2a/0xe0 [mlxsw_spectrum]
[13578.634748] mlxsw_core_bus_device_unregister+0x3e/0x130 [mlxsw_core]
[13578.641290] mlxsw_pci_remove+0x13/0x40 [mlxsw_pci]
[13578.646238] pci_device_remove+0x31/0xb0
[13578.650244] device_release_driver_internal+0x14f/0x220
[13578.655562] driver_detach+0x32/0x70
[13578.659183] bus_remove_driver+0x47/0xa0
[13578.663134] pci_unregister_driver+0x1e/0x80
[13578.667486] mlxsw_sp_module_exit+0xc/0x3fa [mlxsw_spectrum]
[13578.673207] __x64_sys_delete_module+0x13b/0x1e0
[13578.677888] ? exit_to_usermode_loop+0x78/0x80
[13578.682374] do_syscall_64+0x39/0xe0
[13578.685976] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fix by putting the port_vlan when mlxsw_sp_port_vlan_bridge_join()
determines it's a flag-only change.
Fixes: b3529af6bb ("spectrum: Reference count VLAN entries")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b0eb6b41a upstream.
Arnd reports the following arm64 randconfig build error with the PSI
patches that add another page flag:
/git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
/git/arm-soc/include/linux/compiler.h:357:38: error: call to
'__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)
The additional page flag causes other information stored in
page->flags to get bumped into their own struct page member:
  #if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT \
          <= BITS_PER_LONG - NR_PAGEFLAGS
  #define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
  #else
  #define LAST_CPUPID_WIDTH 0
  #endif
  #if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
  #define LAST_CPUPID_NOT_IN_PAGE_FLAGS
  #endif
which in turn causes the struct page size to exceed the size set in
STRUCT_PAGE_MAX_SHIFT. This value is an estimate used to size the
VMEMMAP page array according to address space and struct page size.
However, the check is performed - and triggers here - on a !VMEMMAP
config, which consumes an additional 22 page bits for the sparse
section id. When VMEMMAP is enabled, those bits are returned, cpupid
doesn't need its own member, and the page passes the VMEMMAP check.
Restrict that check to the situation it was meant to check: that we
are sizing the VMEMMAP page array correctly.
Says Arnd:
Further experiments show that the build error already existed before,
but was only triggered with larger values of CONFIG_NR_CPU and/or
CONFIG_NODES_SHIFT that might be used in actual configurations but
not in randconfig builds.
With longer CPU and node masks, I could recreate the problem with
kernels as old as linux-4.7 when arm64 NUMA support got added.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org
Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms.")
Fixes: 3e1907d5bf ("arm64: mm: move vmemmap region right below the linear region")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2519c1bbe3 upstream.
Commit 57ea2a34ad ("tracing/kprobes: Fix trace_probe flags on
enable_trace_kprobe() failure") added an if statement that depends on another
if statement that gcc doesn't see will initialize the "link" variable and
gives the warning:
"warning: 'link' may be used uninitialized in this function"
It is really a false positive, but to quiet the warning, and also to make
sure that it never actually is used uninitialized, initialize the "link"
variable to NULL and add an if (!WARN_ON_ONCE(!link)) where the compiler
thinks it could be used uninitialized.
Cc: stable@vger.kernel.org
Fixes: 57ea2a34ad ("tracing/kprobes: Fix trace_probe flags on enable_trace_kprobe() failure")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e536e222f upstream.
There is a window for racing when printing directly to task->comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.
  creator                        other
  vsnprintf:
    fill (not terminated)
    count the rest               trace_sched_waking(p):
    ...                            memcpy(comm, p->comm, TASK_COMM_LEN)
    write \0
The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
...and a strcpy out of there would cause stack corruption:
[224761.522292] Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: ffffff9bf9783c78
crash-arm64> kbt | grep 'comm\|trace_print_context'
#6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
comm (char [16]) = "irq/497-pwr_even"
crash-arm64> rd 0xffffffd4d0e17d14 8
ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.
Solved by vsnprintf()-ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.
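The shape of the fix, in sketch form (variable names illustrative):

  char name[TASK_COMM_LEN];
  va_list args;

  va_start(args, namefmt);
  /* Format into a private buffer first; nothing can observe it here. */
  vsnprintf(name, sizeof(name), namefmt, args);
  va_end(args);
  /* set_task_comm() copies under the task lock and always leaves comm
   * NUL-terminated, so concurrent readers never see a half-written name. */
  set_task_comm(task, name);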
Link: http://lkml.kernel.org/r/20180726071539.188015-1-snild@sony.com
Cc: stable@vger.kernel.org
Fixes: bc0c38d139 ("ftrace: latency tracer infrastructure")
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Snild Dolkow <snild@sony.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15cc78644d upstream.
There was a case that triggered a double free in event_trigger_callback()
due to the called reg() function freeing the trigger_data and then it
getting freed again by the error return by the caller. The solution there
was to up the trigger_data ref count.
Code inspection found that event_enable_trigger_func() has the same issue,
but is not as easy to trigger (requires harder to trigger failures). It
needs to be solved slightly different as it needs more to clean up when the
reg() function fails.
Link: http://lkml.kernel.org/r/20180725124008.7008e586@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 7862ad1846 ("tracing: Add 'enable_event' and 'disable_event' event trigger commands")
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1863c38725 upstream.
Running the following:
# cd /sys/kernel/debug/tracing
# echo 500000 > buffer_size_kb
[ Or some other number that takes up most of memory ]
# echo snapshot > events/sched/sched_switch/trigger
Triggers the following bug:
------------[ cut here ]------------
kernel BUG at mm/slub.c:296!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC PTI
CPU: 6 PID: 6878 Comm: bash Not tainted 4.18.0-rc6-test+ #1066
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
RIP: 0010:kfree+0x16c/0x180
Code: 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 ac b3 f8 ff 48 89 d9 48 89 da 41 b8 01 00 00 00 5b 5d 41 5c 4c 89 d6 e9 f4 f3 ff ff <0f> 0b 0f 0b 48 8b 3d d9 d8 f9 00 e9 c1 fe ff ff 0f 1f 40 00 0f 1f
RSP: 0018:ffffb654436d3d88 EFLAGS: 00010246
RAX: ffff91a9d50f3d80 RBX: ffff91a9d50f3d80 RCX: ffff91a9d50f3d80
RDX: 00000000000006a4 RSI: ffff91a9de5a60e0 RDI: ffff91a9d9803500
RBP: ffffffff8d267c80 R08: 00000000000260e0 R09: ffffffff8c1a56be
R10: fffff0d404543cc0 R11: 0000000000000389 R12: ffffffff8c1a56be
R13: ffff91a9d9930e18 R14: ffff91a98c0c2890 R15: ffffffff8d267d00
FS: 00007f363ea64700(0000) GS:ffff91a9de580000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055c1cacc8e10 CR3: 00000000d9b46003 CR4: 00000000001606e0
Call Trace:
event_trigger_callback+0xee/0x1d0
event_trigger_write+0xfc/0x1a0
__vfs_write+0x33/0x190
? handle_mm_fault+0x115/0x230
? _cond_resched+0x16/0x40
vfs_write+0xb0/0x190
ksys_write+0x52/0xc0
do_syscall_64+0x5a/0x160
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f363e16ab50
Code: 73 01 c3 48 8b 0d 38 83 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 79 db 2c 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 1e e3 01 00 48 89 04 24
RSP: 002b:00007fff9a4c6378 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f363e16ab50
RDX: 0000000000000009 RSI: 000055c1cacc8e10 RDI: 0000000000000001
RBP: 000055c1cacc8e10 R08: 00007f363e435740 R09: 00007f363ea64700
R10: 0000000000000073 R11: 0000000000000246 R12: 0000000000000009
R13: 0000000000000001 R14: 00007f363e4345e0 R15: 00007f363e4303c0
Modules linked in: ip6table_filter ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_seq snd_seq_device i915 snd_pcm snd_timer i2c_i801 snd soundcore i2c_algo_bit drm_kms_helper
86_pkg_temp_thermal video kvm_intel kvm irqbypass wmi e1000e
---[ end trace d301afa879ddfa25 ]---
The cause is because the register_snapshot_trigger() call failed to
allocate the snapshot buffer, and then called unregister_trigger()
which freed the data that was passed to it. Then on return to the
function that called register_snapshot_trigger(), as it sees it
failed to register, it frees the trigger_data again and causes
a double free.
By calling event_trigger_init() on the trigger_data (which only ups
the reference counter for it), and then event_trigger_free() afterward,
the trigger_data would not get freed by the registering trigger function
as it would only up and lower the ref count for it. If the register
trigger function fails, then the event_trigger_free() called after it
will free the trigger data normally.
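In pattern form, the registration path now brackets reg() with an extra
reference, the same idea as the earlier event_trigger_callback() fix (a
sketch of the pattern, not the exact diff):

  /* take an extra reference so reg() can never drop the last one */
  event_trigger_init(trigger_ops, trigger_data);

  ret = cmd_ops->reg(glob, trigger_ops, trigger_data, file);
  if (!ret)
          ret = -ENOENT;  /* registering nothing counts as failure */

  /* drop our extra reference; this frees trigger_data only if reg()
   * failed and nothing else took ownership of it */
  event_trigger_free(trigger_ops, trigger_data);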
Link: http://lkml.kernel.org/r/20180724191331.738eb819@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 93e31ffbf4 ("tracing: Add 'snapshot' event trigger command")
Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b512719f77 upstream.
While forking, if delayacct init fails due to memory shortage, it
continues expecting all delayacct users to check task->delays pointer
against NULL before dereferencing it, which all of them used to do.
Commit c96f5471ce ("delayacct: Account blkio completion on the correct
task"), while updating delayacct_blkio_end() to take the target task
instead of always using %current, made the function test NULL on
%current->delays and then continue to operate on @p->delays. If
%current succeeded init while @p didn't, it leads to the following
crash.
BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
IP: __delayacct_blkio_end+0xc/0x40
PGD 8000001fd07e1067 P4D 8000001fd07e1067 PUD 1fcffbb067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 4 PID: 25774 Comm: QIOThread0 Not tainted 4.16.0-9_fbk1_rc2_1180_g6b593215b4d7 #9
RIP: 0010:__delayacct_blkio_end+0xc/0x40
Call Trace:
try_to_wake_up+0x2c0/0x600
autoremove_wake_function+0xe/0x30
__wake_up_common+0x74/0x120
wake_up_page_bit+0x9c/0xe0
mpage_end_io+0x27/0x70
blk_update_request+0x78/0x2c0
scsi_end_request+0x2c/0x1e0
scsi_io_completion+0x20b/0x5f0
blk_mq_complete_request+0xa2/0x100
ata_scsi_qc_complete+0x79/0x400
ata_qc_complete_multiple+0x86/0xd0
ahci_handle_port_interrupt+0xc9/0x5c0
ahci_handle_port_intr+0x54/0xb0
ahci_single_level_irq_intr+0x3b/0x60
__handle_irq_event_percpu+0x43/0x190
handle_irq_event_percpu+0x20/0x50
handle_irq_event+0x2a/0x50
handle_edge_irq+0x80/0x1c0
handle_irq+0xaf/0x120
do_IRQ+0x41/0xc0
common_interrupt+0xf/0xf
Fix it by updating delayacct_blkio_end() to check @p->delays instead.
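Roughly, the inline helper now tests the same task it is about to operate
on (an illustrative sketch of the fixed helper):

  static inline void delayacct_blkio_end(struct task_struct *p)
  {
          /* test @p->delays, the task we are about to touch, not current */
          if (p->delays)
                  __delayacct_blkio_end(p);
          delayacct_clear_flag(DELAYACCT_PF_BLKIO);
  }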
Link: http://lkml.kernel.org/r/20180724175542.GP1934745@devbig577.frc2.facebook.com
Fixes: c96f5471ce ("delayacct: Account blkio completion on the correct task")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dave Jones <dsj@fb.com>
Debugged-by: Dave Jones <dsj@fb.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Josh Snyder <joshs@netflix.com>
Cc: <stable@vger.kernel.org> [4.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15d36fecd0 upstream.
When pmem namespaces created are smaller than section size, this can
cause an issue during removal and gpf was observed:
general protection fault: 0000 [#1] SMP PTI
CPU: 36 PID: 3941 Comm: ndctl Tainted: G W 4.14.28-1.el7uek.x86_64 #2
task: ffff88acda150000 task.stack: ffffc900233a4000
RIP: 0010:__put_page+0x56/0x79
Call Trace:
devm_memremap_pages_release+0x155/0x23a
release_nodes+0x21e/0x260
devres_release_all+0x3c/0x48
device_release_driver_internal+0x15c/0x207
device_release_driver+0x12/0x14
unbind_store+0xba/0xd8
drv_attr_store+0x27/0x31
sysfs_kf_write+0x3f/0x46
kernfs_fop_write+0x10f/0x18b
__vfs_write+0x3a/0x16d
vfs_write+0xb2/0x1a1
SyS_write+0x55/0xb9
do_syscall_64+0x79/0x1ae
entry_SYSCALL_64_after_hwframe+0x3d/0x0
Add code to check whether we have a mapping already in the same section
and prevent additional mappings from being created if that is the case.
Link: http://lkml.kernel.org/r/152909478401.50143.312364396244072931.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 384cf4285b upstream.
The Lenovo LaVie Z laptop requires i8042 to be reset in order to
consistently detect its Elantech touchpad. The nomux and kbdreset
quirks are not sufficient.
It's possible the other LaVie Z models from NEC require this as well.
Cc: stable@vger.kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e935dba111 upstream.
Since Linux v4.10 release (commit 1d9174fbc5 "PM / Runtime: Defer
resuming of the device in pm_runtime_force_resume()"),
pm_runtime_force_resume() function doesn't runtime resume device if it was
not runtime active before system suspend. Thus, driver should not do any
register access after pm_runtime_force_resume() without checking the
runtime status of the device. To fix this issue, simply move
s3c64xx_spi_hwinit() call to s3c64xx_spi_runtime_resume() to ensure that
hardware is always properly initialized. This fixes Synchronous external
abort issue on system suspend/resume cycle on newer Exynos SoCs.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 393753b217 upstream.
Inside m_can_chip_config(), when setting up the new value of the CCCR,
the CCCR_NISO bit is not cleared like the others, CCCR_TEST, CCCR_MON,
CCCR_BRSE and CCCR_FDOE, before checking the can.ctrlmode bits for
CAN_CTRLMODE_FD_NON_ISO.
This way once the controller was configured for CAN_CTRLMODE_FD_NON_ISO,
this mode could never be cleared again.
This fix is only relevant for controllers with version 3.1.x or 3.2.x.
Older versions do not support NISO.
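Illustratively, the CCCR update now clears NISO together with the other
mode bits before re-deriving it from ctrlmode (a sketch using the bit
names mentioned above; the surrounding register read/write is omitted):

  /* clear all mode bits, including NISO, then set them from ctrlmode */
  cccr &= ~(CCCR_TEST | CCCR_MON | CCCR_BRSE | CCCR_FDOE | CCCR_NISO);
  if (priv->can.ctrlmode & CAN_CTRLMODE_FD_NON_ISO)
          cccr |= CCCR_NISO;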
Signed-off-by: Roman Fietze <roman.fietze@telemotive.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1675bee3e7 upstream.
pm_runtime_get_sync() returns 1 if the state of the device is already
'active'. This is not a failure case and should be treated as success.
Therefore fix error handling for pm_runtime_get_sync() call such that
it returns success when the value is 1.
Also clean up the TODO for using runtime PM for sleep mode as that is
implemented.
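A common way to handle this return convention (a sketch; the device
pointer and error path here are illustrative):

  ret = pm_runtime_get_sync(dev);
  if (ret < 0) {
          /* only negative values are errors; 1 means "already active" */
          pm_runtime_put_noidle(dev);
          return ret;
  }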
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d4c94ed9f upstream.
The DMA logic in firmwares < v3.3.0 embedded in the PCAN-PCIe FD cards
family is not capable of handling a mix of 32-bit and 64-bit logical
addresses. If the board is equipped with 2 or 4 CAN ports, then such a
situation might lead to a PCIe Bus Error "Malformed TLP" packet
as well as an "irq xx: nobody cared" issue.
This patch adds a workaround that requests only 32-bit DMA addresses
when these might be allocated outside of the 4 GB area.
This issue has been fixed in firmware v3.3.0 and later.
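In practice the workaround boils down to restricting the device to 32-bit
DMA on the affected boards; a minimal sketch (the condition for selecting
the fallback is simplified away here):

  /* firmware < v3.3.0 on multi-channel boards: keep DMA below 4 GB */
  err = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
  if (err)
          return err;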
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8399799725 upstream.
RX overflow interrupt (RXOFLW) is disabled even though xcan_interrupt()
processes it. This means that an RX overflow interrupt will only be
processed when another interrupt gets asserted (e.g. for RX/TX).
Fix that by enabling the RXOFLW interrupt.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f4f0f338c upstream.
xcan_interrupt() clears ERROR|RXOFLV|BSOFF|ARBLST interrupts if any of
them is asserted. This does not take into account that some of them
could have been asserted between interrupt status read and interrupt
clear, therefore clearing them without handling them.
Fix the code to only clear those interrupts that it knows are asserted
and therefore going to be processed in xcan_err_interrupt().
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 620050d9c2 upstream.
The xilinx_can driver assumes that the TXOK interrupt only clears after
it has been acknowledged as many times as there have been successfully
sent frames.
However, the documentation does not mention such behavior, instead
saying just that the interrupt is cleared when the clear bit is set.
Similarly, testing seems to also suggest that it is immediately cleared
regardless of the number of frames having been sent. Performing some
heavy TX load and then going back to idle has the tx_head drifting
further away from tx_tail over time, steadily reducing the number of
frames the driver keeps in the TX FIFO (but not to zero, as the TXOK
interrupt always frees up space for 1 frame from the driver's
perspective, so frames continue to be sent) and delaying the local echo
frames.
The TX FIFO tracking is also otherwise buggy as it does not account for
TX FIFO being cleared after software resets, causing
BUG!, TX FIFO full when queue awake!
messages to be output.
There does not seem to be any way to accurately track the state of the
TX FIFO for local echo support while using the full TX FIFO.
The Zynq version of the HW (but not the soft-AXI version) has watermark
programming support and with it an additional TX-FIFO-empty interrupt
bit.
Modify the driver to only put 1 frame into TX FIFO at a time on soft-AXI
and 2 frames at a time on Zynq. On Zynq the TXFEMP interrupt bit is used
to detect whether 1 or 2 frames have been sent at interrupt processing
time.
Tested with the integrated CAN on Zynq-7000 SoC. The 1-frame-FIFO mode
was also tested.
An alternative way to solve this would be to drop local echo support but
keep using the full TX FIFO.
v2: Add FIFO space check before TX queue wake with locking to
synchronize with queue stop. This avoids waking the queue when xmit()
had just filled it.
v3: Keep local echo support and reduce the amount of frames in FIFO
instead as suggested by Marc Kleine-Budde.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2574fe5451 upstream.
The xilinx_can driver performs a software reset when an RX overrun is
detected. This causes the device to enter Configuration mode where no
messages are received or transmitted.
The documentation does not mention any need to perform a reset on an RX
overrun, and testing by inducing an RX overflow also indicated that the
device continues to work just fine without a reset.
Remove the software reset.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 877e0b7594 upstream.
The xilinx_can driver contains no mechanism for propagating recovery
from CAN_STATE_ERROR_WARNING and CAN_STATE_ERROR_PASSIVE.
Add such a mechanism by factoring the handling of
XCAN_STATE_ERROR_PASSIVE and XCAN_STATE_ERROR_WARNING out of
xcan_err_interrupt and checking for recovery after RX and TX if the
interface is in one of those states.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ebd83bdb0 upstream.
There are several issues with the suspend/resume handling code of the
driver:
- The device is attached and detached in the runtime_suspend() and
runtime_resume() callbacks if the interface is running. However,
during xcan_chip_start() the interface is considered running,
causing the resume handler to incorrectly call netif_start_queue()
at the beginning of xcan_chip_start(), and on xcan_chip_start() error
return the suspend handler detaches the device, leaving the user
unable to bring up the device anymore.
- The device is not brought properly up on system resume. A reset is
done and the code tries to determine the bus state after that.
However, after reset the device is always in Configuration mode
(down), so the state checking code does not make sense and
communication will also not work.
- The suspend callback tries to set the device to sleep mode (low-power
mode which monitors the bus and brings the device back to normal mode
on activity), but then immediately disables the clocks (possibly
before the device reaches the sleep mode), which does not make sense
to me. If a clean shutdown is wanted before disabling clocks, we can
just bring it down completely instead of only sleep mode.
Reorganize the PM code so that only the clock logic remains in the
runtime PM callbacks and the system PM callbacks contain the device
bring-up/down logic. This makes calling the runtime PM callbacks during
e.g. xcan_chip_start() safe.
The system PM callbacks now simply call common code to start/stop the
HW if the interface was running, replacing the broken code from before.
xcan_chip_stop() is updated to use the common reset code so that it will
wait for the reset to complete. Reset also disables all interrupts so do
not do that separately.
Also, the device_may_wakeup() checks are removed as the driver does not
have wakeup support.
Tested on Zynq-7000 integrated CAN.
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32852c561b upstream.
If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.
This situation can occur when:
(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.
(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.
I'm able to trigger case (b) by flooding the bus with frames under load.
There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.
Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f5 ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 722e5f2b1e upstream.
Commit 52cdbdd498 (driver core: correct device's shutdown order)
introduced a regression by breaking device shutdown on some systems.
Namely, the devices_kset_move_last() call in really_probe() added by
that commit is a mistake as it may cause parents to follow children
in the devices_kset list which then causes shutdown to fail. For
example, if a device has children before really_probe() is called
for it (which is not uncommon), that call will cause it to be
reordered after the children in the devices_kset list and the
ordering of that list will not reflect the correct device shutdown
order any more.
Also it causes the devices_kset list to be constantly reordered
until all drivers have been probed which is totally pointless
overhead in the majority of cases and it only covered an issue
with system shutdown, while system-wide suspend/resume potentially
had the same issue on the affected platforms (which was not covered).
Moreover, the shutdown issue originally addressed by the change in
really_probe() made by commit 52cdbdd498 is not present in 4.18-rc
any more, since dra7 started to use the sdhci-omap driver which
doesn't disable any regulators during shutdown, so the really_probe()
part of commit 52cdbdd498 can be safely reverted. [The original
issue was related to the omap_hsmmc driver used by dra7 previously.]
For the above reasons, revert the really_probe() modifications made
by commit 52cdbdd498.
The other code changes made by commit 52cdbdd498 are useful and
they need not be reverted.
Fixes: 52cdbdd498 (driver core: correct device's shutdown order)
Link: https://lore.kernel.org/lkml/CAFgQCTt7VfqM=UyCnvNFxrSw8Z6cUtAi3HUwR4_xPAc03SgHjQ@mail.gmail.com/
Reported-by: Pingfan Liu <kernelfans@gmail.com>
Tested-by: Pingfan Liu <kernelfans@gmail.com>
Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73c2a01c52 upstream.
The dispatcher and the executer process the parse nodes during table
load. Error status from the evaluation confuses the AML parser. This
results in the parser failing to complete parsing of the current
scope op which becomes problematic. For the incorrect AML below, _ADR
never gets created.
  definition_block(...)
  {
      Scope (\_SB)
      {
          Device (PCI0){...}
          Name (OBJ1, 0x0)
          OBJ1 = PCI0 + 5 // Results in an operand error.
      } // \_SB not closed
      // parser looks for \_SB._SB.PCI0, results in AE_NOT_FOUND error
      // Entire scope block gets skipped.
      Scope (\_SB.PCI0)
      {
          Name (_ADR, 0x0)
      }
  }
Fix the above error by properly completing the initial \_SB scope
after an error by clearing errors that occur during table load. In
the above case, this means that OBJ1 = PCI0 + 5 is skipped.
Fixes: 5088814a6e (ACPICA: AML parser: attempt to continue loading table after error)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200363
Tested-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d644abf25 upstream.
Commit 1b9ba000 ("Allow function drivers to pause control
transfers") states that USB_GADGET_DELAYED_STATUS is only
supported if data phase is 0 bytes.
It seems that when the length is not 0 bytes, there is no
need to explicitly delay the data stage since the transfer
is not completed until the user responds. However, when the
length is 0, there is no data stage and the transfer is
finished once setup() returns, hence there is a need to
explicitly delay completion.
This manifests as the following bugs:
Prior to 946ef68ad4 ('Let setup() return
USB_GADGET_DELAYED_STATUS'), when setup is 0 bytes, ffs
would require the user to queue a 0 byte request in order to
clear setup state. However, that 0 byte request was actually
not needed and would hang and cause errors in other setup
requests.
After the above commit, 0 byte setups work since the gadget
now accepts empty queues to ep0 to clear the delay, but all
other setups hang.
Fixes: 946ef68ad4 ("Let setup() return USB_GADGET_DELAYED_STATUS")
Signed-off-by: Jerry Zhang <zhangjerry@google.com>
Cc: stable <stable@vger.kernel.org>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50b9773c13 upstream.
The current code is broken as it re-defines "req" inside the
if block, then goto out of it. Thus the request that ends
up being sent is not the one that was populated by the
code in question.
This fixes RNDIS driver autodetect by Windows 10 for me.
The bug was introduced by Chris rework to remove the local
queuing inside the if { } block of the redefined request.
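The failure mode is plain C variable shadowing; a generic, standalone
illustration (not the composite.c code itself):

  #include <stdio.h>

  int main(void)
  {
          int req = 0;              /* the request we mean to use */

          if (1) {
                  int req = 42;     /* shadows the outer 'req' */
                  (void)req;        /* "populated", but goes out of scope */
                  goto out;
          }
  out:
          printf("req = %d\n", req);  /* prints 0: the outer, untouched 'req' */
          return 0;
  }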
Fixes: 636ba13aec ("usb: gadget: composite: remove duplicated code in OS desc handling")
Cc: <stable@vger.kernel.org> # v4.17
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56406e017a upstream.
The commit 3bc04e28a0 ("usb: dwc2: host: Get aligned DMA in a more
supported way") introduced a common way to align DMA allocations.
The code in the commit aligns the struct dma_aligned_buffer but the
actual DMA address pointed by data[0] gets aligned to an offset from
the allocated boundary by the kmalloc_ptr and the old_xfer_buffer
pointers.
This is against the recommendation in Documentation/DMA-API.txt which
states:
Therefore, it is recommended that driver writers who don't take
special care to determine the cache line size at run time only map
virtual regions that begin and end on page boundaries (which are
guaranteed also to be cache line boundaries).
The effect of this is that architectures with non-coherent DMA caches
may run into memory corruption or kernel crashes with Unhandled
kernel unaligned accesses exceptions.
Fix the alignment by positioning the DMA area in front of the allocation
and use memory at the end of the area for storing the original
transfer_buffer pointer. This may have the added benefit of increased
performance as the DMA area is now fully aligned on all architectures.
Tested with Lantiq xRX200 (MIPS) and RPi Model B Rev 2 (ARM).
Fixes: 3bc04e28a0 ("usb: dwc2: host: Get aligned DMA in a more supported way")
Cc: <stable@vger.kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Antti Seppälä <a.seppala@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 249a32b7ee upstream.
Based on USB2.0 Spec Section 11.12.5,
"If a hub has per-port power switching and per-port current limiting,
an over-current on one port may still cause the power on another port
to fall below specific minimums. In this case, the affected port is
placed in the Power-Off state and C_PORT_OVER_CURRENT is set for the
port, but PORT_OVER_CURRENT is not set."
so let's check C_PORT_OVER_CURRENT too for the over-current condition.
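The check therefore needs to look at both the status and the change bits,
roughly like this (a sketch using the standard hub status/change masks):

  if (portstatus & USB_PORT_STAT_OVERCURRENT ||
      portchange & USB_PORT_STAT_C_OVERCURRENT) {
          /* treat the port as being in an over-current condition */
  }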
Fixes: 08d1dec6f4 ("usb:hub set hub->change_bits when over-current happens")
Cc: <stable@vger.kernel.org>
Tested-by: Alessandro Antenucci <antenucci@korg.it>
Signed-off-by: Bin Liu <b-liu@ti.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b96fba8d58 upstream.
If softsynthx_read() is called with `count < 3`, `count - 3` wraps, causing
the loop to copy as much data as available to the provided buffer. If
softsynthx_read() is invoked through sys_splice(), this causes an
unbounded kernel write; but even when userspace just reads from it
normally, a small size could cause userspace crashes.
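The underlying problem is ordinary unsigned wrap-around; a standalone
illustration of the bug class (not the speakup code itself):

  #include <stdio.h>
  #include <stddef.h>

  int main(void)
  {
          size_t count = 2;

          /* size_t arithmetic wraps: 2 - 3 becomes a huge bound, so a loop
           * guarded by "i < count - 3" keeps copying far past the buffer. */
          printf("count - 3 = %zu\n", count - 3);

          /* safe form: keep the arithmetic on the side that cannot wrap */
          for (size_t i = 0; i + 3 <= count; i++)
                  ; /* never entered when count < 3 */

          return 0;
  }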
Fixes: 425e586cf9 ("speakup: add unicode variant of /dev/softsynth")
Cc: stable@vger.kernel.org
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69a1d98c83 upstream.
Commit b83b8b1881 ("staging:r8188eu: Use lib80211 to support TKIP")
is causing 2 problems for me:
1) On one boot, the wifi on a laptop with an r8188eu wifi device would not
connect and dmesg contained an oops about scheduling while atomic
pointing to the tkip code. This went away after reverting the commit.
2) I reverted the revert to try and get the oops from 1. again to be able
to add it to this commit message. But now the system did connect to the
wifi only to print a whole bunch of oopses, followed by a hardfreeze a
few seconds later. Subsequent reboots also all lead to scenario 2. Until
I reverted the commit again.
Reverting the commit fixes both issues, making the laptop usable again.
Fixes: b83b8b1881 ("staging:r8188eu: Use lib80211 to support TKIP")
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Ivan Safonov <insafonov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 58152ecbbc ]
In case skb in out_or_order_queue is the result of
multiple skbs coalescing, we would like to get a proper gso_segs
counter tracking, so that future tcp_drop() can report an accurate
number.
I chose to not implement this tracking for skbs in receive queue,
since they are not dropped, unless socket is disconnected.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8541b21e78 ]
In order to be able to give better diagnostics and detect
malicious traffic, we need to have better sk->sk_drops tracking.
Fixes: 9f5afeae51 ("tcp: use an RB tree for ooo receive queue")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d4bf93ac1 ]
In case an attacker feeds tiny packets completely out of order,
tcp_collapse_ofo_queue() might scan the whole rb-tree, performing
expensive copies, but not changing socket memory usage at all.
1) Do not attempt to collapse tiny skbs.
2) Add logic to exit early when too many tiny skbs are detected.
We prefer not doing aggressive collapsing (which copies packets)
for pathological flows, and revert to tcp_prune_ofo_queue() which
will be less expensive.
In the future, we might add the possibility of terminating flows
that are proven to be malicious.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4a3313d8e ]
Right after a TCP flow is created, receiving tiny out of order
packets always hits the condition:
if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
tcp_clamp_window(sk);
tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc
(guarded by tcp_rmem[2])
Calling tcp_collapse_ofo_queue() in this case is not useful,
and offers a O(N^2) surface attack to malicious peers.
It is better not to attempt anything before full queue capacity is
reached, forcing the attacker to spend lots of resources and allowing
us to more easily detect the abuse.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72cd43ba64 ]
Juha-Matti Tilli reported that malicious peers could inject tiny
packets in out_of_order_queue, forcing very expensive calls
to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for
every incoming packet. out_of_order_queue rb-tree can contain
thousands of nodes, iterating over all of them is not nice.
Before linux-4.9, we would have pruned all packets in ofo_queue
in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs
truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB.
Since we plan to increase tcp_rmem[2] in the future to cope with
modern BDP, we cannot revert to the old behavior without great pain.
Strategy taken in this patch is to purge ~12.5 % of the queue capacity.
Fixes: 36a6503fed ("tcp: refine tcp_prune_ofo_queue() to not drop all packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e99465b952 ]
Problem:
In vxlan_newlink, a default fdb entry is added before register_netdev.
The default fdb creation function also notifies user-space of the
fdb entry on the vxlan device which user-space does not know about yet.
(RTM_NEWNEIGH goes before RTM_NEWLINK for the same ifindex).
This patch fixes the user-space netlink notification ordering issue
with the following changes:
- decouple fdb notify from fdb create.
- Move fdb notify after register_netdev.
- Call rtnl_configure_link in vxlan newlink handler to notify
userspace about the newlink before fdb notify and
hence avoiding the user-space race.
Fixes: afbd8bae9c ("vxlan: add implicit fdb entry for default destination")
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7431016b10 ]
- Add new vxlan_fdb_alloc helper
- rename existing vxlan_fdb_create into vxlan_fdb_update:
because it really creates or updates an existing
fdb entry
- move new fdb creation into a separate vxlan_fdb_create
Main motivation for this change is to introduce the ability
to decouple vxlan fdb creation and notify, used in a later patch.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5025f7f7d5 ]
rtnl_configure_link sets dev->rtnl_link_state to
RTNL_LINK_INITIALIZED and unconditionally calls
__dev_notify_flags to notify user-space of dev flags.
Current call sequence for rtnl_configure_link:
  rtnetlink_newlink
    rtnl_link_ops->newlink
    rtnl_configure_link (unconditionally notifies userspace of
                         default and new dev flags)
If a newlink handler wants to call rtnl_configure_link
early, we will end up with duplicate notifications to
user-space.
This patch fixes rtnl_configure_link to check rtnl_link_state
and call __dev_notify_flags with gchanges = 0 if already
RTNL_LINK_INITIALIZED.
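Sketch of the resulting logic in rtnl_configure_link (simplified):

  if (dev->rtnl_link_state == RTNL_LINK_INITIALIZED) {
          /* already announced once: report only real flag changes */
          __dev_notify_flags(dev, old_flags, 0U);
  } else {
          dev->rtnl_link_state = RTNL_LINK_INITIALIZED;
          __dev_notify_flags(dev, old_flags, ~0U);
  }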
Later in the series, this patch will help the following sequence
where a driver implementing newlink can call rtnl_configure_link
to initialize the link early. It makes the following call sequence
work:
  rtnetlink_newlink
    rtnl_link_ops->newlink (vxlan) -> rtnl_configure_link (initializes
                                      link and notifies user-space of
                                      default dev flags)
    rtnl_configure_link (updates dev flags if requested by user ifm
                         and notifies user-space of new dev flags)
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 33180bee86 ]
When driver converts HW timestamp to wall clock time it subtracts
the last saved cycle counter from the HW timestamp and converts the
difference to nanoseconds.
The conversion is done by multiplying the cycles difference with the
clock multiplier value as a first step and therefore the cycles
difference should be small enough so that the multiplication product
doesn't exceed 64bit.
The overflow handling routine is in charge of updating the last saved
cycle counter in driver and it is called periodically using kernel
delayed workqueue.
The delay period for this work is calculated using the max HW cycle
counter value (a 41 bit mask) as a base which doesn't take the 64bit
limit into account so the delay period may be incorrect and too
long to prevent a large difference between the HW counter and the last
saved counter in SW.
This change adjusts the work period for the HW clock overflow work by
taking the minimum between the previous value and the quotient of max
u64 value and the clock multiplier value.
Fixes: ef9814deaf ("net/mlx5e: Add HW timestamping (TS) support")
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 24b711edfc ]
Example setup:
host: ip -6 addr add dev eth1 2001:db8:104::4
where eth1 is enslaved to a VRF
switch: ip -6 ro add 2001:db8:104::4/128 dev br1
where br1 only has an LLA
ping6 2001:db8:104::4
ssh 2001:db8:104::4
(NOTE: UDP works fine if the PKTINFO has the address set to the global
address and the ifindex is set to the index of eth1, with an LLA as the
destination).
For ICMP, icmp6_iif needs to be updated to check if skb->dev is an
L3 master. If it is then return the ifindex from rt6i_idev similar
to what is done for loopback.
For TCP, restore the original tcp_v6_iif definition which is needed in
most places and add a new tcp_v6_iif_l3_slave that considers the
l3_slave variability. This latter check is only needed for socket
lookups.
Fixes: 9ff7438460 ("net: vrf: Handle ipv6 multicast and link-local addresses")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are two scenarios in which we will restore deleted records. The first
is when the device goes down and up (or unmap/remap). In this scenario the
new filter mode is the same as the previous one, because we get it from
in_dev->mc_list and we do not touch it during device down and up.
The other scenario is when a new socket joins a group which was just deleted
and has not finished sending status reports. In this scenario, we should use
the current filter mode instead of restoring the old one. Here are the 4
cases in total.
  old_socket   new_socket   before_fix   after_fix
  IN(A)        IN(A)        ALLOW(A)     ALLOW(A)
  IN(A)        EX( )        TO_IN( )     TO_EX( )
  EX( )        IN(A)        TO_EX( )     ALLOW(A)
  EX( )        EX( )        TO_EX( )     TO_EX( )
Fixes: 24803f38a5 (igmp: do not remove igmp souce list info when set link down)
Fixes: 1666d49e1d (mld: do not remove mld souce list info when set link down)
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 215d08a85b ]
The situation described in the comment can occur also with
PHY_IGNORE_INTERRUPT, therefore change the condition to include it.
Fixes: f555f34fdc ("net: phy: fix auto-negotiation stall due to unavailable interrupt")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 144fe2bfd2 ]
Current sg coalescing logic in sk_alloc_sg() (latter is used by tls and
sockmap) is not quite correct in that we do fetch the previous sg entry,
however the subsequent check whether the refilled page frag from the
socket is still the same as from the last entry with prior offset and
length matching the start of the current buffer is comparing always the
first sg list entry instead of the prior one.
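Conceptually, the coalescing test has to look at the most recently filled
entry; a heavily simplified sketch with placeholder names (cur,
first_coalesce and use are illustrative, not the function's actual
parameters):

  struct scatterlist *prev = &sg[cur - 1];

  if (cur > first_coalesce && sg_page(prev) == pfrag->page &&
      prev->offset + prev->length == pfrag->offset) {
          prev->length += use;    /* same page frag, contiguous: extend */
  } else {
          sg_set_page(&sg[cur], pfrag->page, use, pfrag->offset);
          cur++;
  }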
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b809ec869b ]
Previously only the neighbour state was checked to decide if an offloaded
entry should be removed. However, there can be situations when the entry
is dead but still marked as valid. This can lead to dead entries not
being removed from fw tables or even incorrect data being added.
Check the entry dead bit before deciding if it should be added to or
removed from fw neighbour tables.
Fixes: 8e6a9046b6 ("nfp: flower vxlan neighbour offload")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e279d634f3 ]
Removed an error message received when configuring ETS total
bandwidth to be zero.
Our hardware doesn't support such a configuration, so we shall
reject it in the driver. Nevertheless, we removed the error message
in order to eliminate error messages caused by old userspace tools
that try to pass such a configuration.
Fixes: ff0891915c ("net/mlx5e: Fix ETS BW check")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e29392eee ]
We get egress rules through the egdev mechanism when the ingress device
is not supporting offload, with the expected use-case of tunnel decap
ingress rule set on shared tunnel device.
Make sure to offload egress/egdev rules only if decap action (tunnel key
unset) exists there and err otherwise.
Fixes: 717503b9cf ("net: sched: convert cls_flower->egress_dev users to tc_setup_cb_egdev infra")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 60bd4af814 ]
When an e-switch TC rule is offloaded through the egdev (egress
device) mechanism, we treat this as egress, all other cases (NIC
and e-switch) are considered ingress.
This is a preparation step that will allow us to identify "wrong"
stat/del offload calls made by the TC core on egdev based flows and
ignore them.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fcf4793e27 ]
The current code does not check sk->sk_shutdown & RCV_SHUTDOWN.
tls_sw_recvmsg may return a positive value in the case where bytes have
already been copied when the socket is shutdown. sk->sk_err has been
cleared, causing the tls_wait_data to hang forever on a subsequent
invocation. Checking sk->sk_shutdown & RCV_SHUTDOWN, as in tcp_recvmsg,
fixes this problem.
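The added check mirrors what tcp_recvmsg does; in sketch form, the wait
loop bails out once the receive side has been shut down:

  if (sk->sk_shutdown & RCV_SHUTDOWN)
          return NULL;    /* no more data will ever arrive */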
Fixes: c46234ebb4 ("tls: RX path for ktls")
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 18041b5236 ]
Commit 7edf6d314c tried to resolve an inconsistency (BIOS WoL
settings are accepted, but device isn't wakeup-enabled) resulting
from a previous broken-BIOS workaround by making disabled WoL the
default.
This however had some side effects, most likely due to a broken BIOS
some systems don't properly resume from suspend when the MagicPacket
WoL bit isn't set in the chip, see
https://bugzilla.kernel.org/show_bug.cgi?id=200195
Therefore restore the WoL behavior from 4.16.
Reported-by: Albert Astals Cid <aacid@kde.org>
Fixes: 7edf6d314c ("r8169: disable WOL per default")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a0496ef2c2 ]
Per DCTCP RFC8257 (Section 3.2) the ACK reflecting the CE status change
has to be sent immediately so the sender can respond quickly:
""" When receiving packets, the CE codepoint MUST be processed as follows:
1. If the CE codepoint is set and DCTCP.CE is false, set DCTCP.CE to
true and send an immediate ACK.
2. If the CE codepoint is not set and DCTCP.CE is true, set DCTCP.CE
to false and send an immediate ACK.
"""
Previously DCTCP implementation may continue to delay the ACK. This
patch fixes that to implement the RFC by forcing an immediate ACK.
Tested with this packetdrill script provided by Larry Brakmo
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
+0.005 < [ce] . 2001:3001(1000) ack 2 win 257
+0.000 > [ect01] . 2:2(0) ack 2001
// Previously the ACK below would be delayed by 40ms
+0.000 > [ect01] E. 2:2(0) ack 3001
+0.500 < F. 9501:9501(0) ack 4 win 257
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 27cde44a25 ]
Currently when a DCTCP receiver delays an ACK and receives a
data packet with a different CE mark from the previous one, it
sends two immediate ACKs acking the previous and latest sequences
respectively (for ECN accounting).
Previously sending the first ACK may mark off the delayed ACK timer
(tcp_event_ack_sent). This may subsequently prevent sending the
second ACK to acknowledge the latest sequence (tcp_ack_snd_check).
The culprit is that tcp_send_ack() assumes it always acknowledges
the latest sequence, which is not true for the first special ACK.
The fix is to not make the assumption in tcp_send_ack and check the
actual ack sequence before cancelling the delayed ACK. Further it's
safer to pass the ack sequence number as a local variable into
tcp_send_ack routine, instead of intercepting tp->rcv_nxt to avoid
future bugs like this.
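In sketch form, the ack sequence becomes an explicit argument so the
regular path keeps its old behaviour while the special ACK can pass the
older sequence:

  void __tcp_send_ack(struct sock *sk, u32 rcv_nxt);  /* acks 'rcv_nxt' */

  void tcp_send_ack(struct sock *sk)
  {
          __tcp_send_ack(sk, tcp_sk(sk)->rcv_nxt);     /* acks the latest */
  }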
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b0c05d0e99 ]
Previously, when a data segment was sent an ACK was piggybacked
on the data segment without generating a CA_EVENT_NON_DELAYED_ACK
event to notify congestion control modules. So the DCTCP
ca->delayed_ack_reserved flag could incorrectly stay set when
in fact there were no delayed ACKs being reserved. This could result
in sending a special ECN notification ACK that carries an older
ACK sequence, when in fact there was no need for such an ACK.
DCTCP keeps track of the delayed ACK status with its own separate
state ca->delayed_ack_reserved. Previously it may accidentally cancel
the delayed ACK without updating this field upon sending a special
ACK that carries an older ACK sequence. This inconsistency would
lead to the DCTCP receiver never acknowledging the latest data until the
sender times out and retries in some cases.
Packetdrill script (provided by Larry Brakmo)
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 2:3(1) ack 2001
0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect01] . 3:3(0) ack 4001
0.210 < [ce] P. 4001:4501(500) ack 3 win 257
+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect01] PE. 3:4(1) ack 4501
+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
// Previously the ACK sequence below would be 4501, causing a long RTO
+0.040~+0.045 > [ect01] . 4:4(0) ack 5501 // delayed ack
+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257 // More data
+0 > [ect01] . 4:4(0) ack 6501 // now acks everything
+0.500 < F. 9501:9501(0) ack 4 win 257
Reported-by: Larry Brakmo <brakmo@fb.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Lawrence Brakmo <brakmo@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f7482683f1 ]
The calculation of "wqe_size" is not correct when the tx queue is busy in
hinic_xmit_frame().
When there are no free WQEs, the tx flow will unmap the skb buffer, then
ring the doorbell for the pending packets. But the "wqe_size" which is used
to calculate the doorbell address is not correct. The wqe size should be
cleared to 0, otherwise, it will cause a doorbell error.
This patch fixes the problem.
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Zhao Chen <zhaochen6@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 958c696f5a ]
Function mlx4_RST2INIT_QP_wrapper saved the qp number passed in the qp
context, rather than the one passed in the input modifier.
However, the qp number in the qp context is not defined as a
required parameter by the FW. Therefore, drivers may choose to not
specify the qp number in the qp context for the reset-to-init transition.
Thus, we must save the qp number passed in the command input modifier --
which is always present. (This saved qp number is used as the input
modifier for command 2RST_QP when a slave's qp's are destroyed).
Fixes: c82e9aa0a8 ("mlx4_core: resource tracking for HCA resources used by guests")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d82475ad4 ]
free_irq() waits until all handlers for this IRQ have completed. As the
relevant handler (mv88e6xxx_g1_irq_thread_fn()) takes the chip's reg_lock
it might never return if the thread calling free_irq() holds this lock.
For the same reason kthread_cancel_delayed_work_sync() in the polling case
must not hold this lock.
Also first free the irq (or stop the worker, respectively) so that
mv88e6xxx_g1_irq_thread_work() isn't called any more before the irq
mappings are dropped in mv88e6xxx_g1_irq_free_common(), to prevent the
worker thread from calling handle_nested_irq(0), which results in a NULL-pointer
exception.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2efd4fca70 ]
Syzbot reported a read beyond the end of the skb head when returning
IPV6_ORIGDSTADDR:
BUG: KMSAN: kernel-infoleak in put_cmsg+0x5ef/0x860 net/core/scm.c:242
CPU: 0 PID: 4501 Comm: syz-executor128 Not tainted 4.17.0+ #9
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x185/0x1d0 lib/dump_stack.c:113
kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1125
kmsan_internal_check_memory+0x138/0x1f0 mm/kmsan/kmsan.c:1219
kmsan_copy_to_user+0x7a/0x160 mm/kmsan/kmsan.c:1261
copy_to_user include/linux/uaccess.h:184 [inline]
put_cmsg+0x5ef/0x860 net/core/scm.c:242
ip6_datagram_recv_specific_ctl+0x1cf3/0x1eb0 net/ipv6/datagram.c:719
ip6_datagram_recv_ctl+0x41c/0x450 net/ipv6/datagram.c:733
rawv6_recvmsg+0x10fb/0x1460 net/ipv6/raw.c:521
[..]
This logic and its ipv4 counterpart read the destination port from
the packet at skb_transport_offset(skb) + 4.
With MSG_MORE and a local SOCK_RAW sender, syzbot was able to cook a
packet that stores headers exactly up to skb_transport_offset(skb) in
the head and the remainder in a frag.
Call pskb_may_pull before accessing the pointer to ensure that it lies
in skb head.
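To make the added check concrete, here is a toy model in C of the rule being
enforced (this is not the kernel's sk_buff code; the struct and helper are
invented for illustration, and unlike the real pskb_may_pull(), which can pull
the missing bytes from the frags into the head, this sketch simply refuses the
read):
struct toybuf {
        unsigned char *head;    /* linear ("head") part of the packet */
        unsigned int headlen;   /* bytes available in the head */
        /* the remainder of the packet lives in frags and is not directly addressable */
};

/* Read the 16-bit destination port stored at transport_off + 2..3, but only
 * after verifying that the 4 bytes starting at transport_off are in the head. */
static int read_dst_port(const struct toybuf *b, unsigned int transport_off,
                         unsigned short *port)
{
        if (transport_off + 4 > b->headlen)
                return -1;      /* would read beyond the linear head */
        *port = (unsigned short)((b->head[transport_off + 2] << 8) |
                                 b->head[transport_off + 3]);
        return 0;
}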
Link: http://lkml.kernel.org/r/CAF=yD-LEJwZj5a1-bAAj2Oy_hKmGygV6rsJ_WOrAYnv-fnayiQ@mail.gmail.com
Reported-by: syzbot+9adb4b567003cac781f0@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3dd1c9a127 ]
The skb hash for locally generated ip[v6] fragments belonging
to the same datagram can vary in several circumstances:
* for connected UDP[v6] sockets, the first fragment gets its hash
via set_owner_w()/skb_set_hash_from_sk()
* for unconnected IPv6 UDPv6 sockets, the first fragment can get
its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if
auto_flowlabel is enabled
For the following frags the hash is usually computed via
skb_get_hash().
The above can cause OoO for unconnected IPv6 UDPv6 sockets: in that
scenario the egress tx queue can be selected on a per packet basis
via the skb hash.
It may also fool flow-oriented schedulers into placing fragments belonging
to the same datagram in different flows.
Fix the issue by copying the skb hash from the head frag into
the others at fragmentation time.
Before this commit:
perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8"
netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n &
perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1
perf script
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0
After this commit:
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0
Fixes: b73c3d0e4f ("net: Save TX flow hash in sock and set in skbuf on xmit")
Fixes: 67800f9b1f ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c1f897ce18 ]
For some time now, if you load the bonding driver and configure bond
parameters via sysfs using minimal config options, such as specifying
nothing but the mode, relying on defaults for everything else, modes
that cannot use arp monitoring (802.3ad, balance-tlb, balance-alb) all
wind up with both arp_interval=0 (as it should be) and miimon=0, which
means the miimon monitor thread never actually runs. This is particularly
problematic for 802.3ad.
For example, from an LNST recipe I've set up:
$ modprobe bonding max_bonds=0
$ echo "+t_bond0" > /sys/class/net/bonding_masters
$ ip link set t_bond0 down
$ echo "802.3ad" > /sys/class/net/t_bond0/bonding/mode
$ ip link set ens1f1 down
$ echo "+ens1f1" > /sys/class/net/t_bond0/bonding/slaves
$ ip link set ens1f0 down
$ echo "+ens1f0" > /sys/class/net/t_bond0/bonding/slaves
$ ethtool -i t_bond0
$ ip link set ens1f1 up
$ ip link set ens1f0 up
$ ip link set t_bond0 up
$ ip addr add 192.168.9.1/24 dev t_bond0
$ ip addr add 2002::1/64 dev t_bond0
This bond comes up okay, but things look slightly suspect in
/proc/net/bonding/t_bond0 output:
$ grep -i mii /proc/net/bonding/t_bond0
MII Status: up
MII Polling Interval (ms): 0
MII Status: up
MII Status: up
Now, pull a cable on one of the ports in the bond, then reconnect it, and
you'll see:
Slave Interface: ens1f0
MII Status: down
Speed: 1000 Mbps
Duplex: full
I believe this became a major issue as of commit 4d2c0cda07, which for
802.3ad bonds, sets slave->link = BOND_LINK_DOWN, with a comment about
relying on link monitoring via miimon to set it correctly, but since the
miimon work queue never runs, the link just stays marked down.
If we simply tweak bond_option_mode_set() slightly, we can check for the
non-arp modes having no miimon value set, and insert BOND_DEFAULT_MIIMON,
which gets things back in full working order. This problem exists as far
back as 4.14, and might be worth fixing in all stable trees since then, though
the work-around is to simply specify a miimon value yourself.
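A minimal sketch of the tweak described above, assuming the usual
bond_option_mode_set() context with 'bond' and 'newval' in scope; this
paraphrases the idea rather than reproducing the actual patch:
/* for modes that cannot use ARP monitoring, fall back to a default miimon
 * value when none was configured, so the MII monitor work queue actually runs */
if (!bond_mode_uses_arp(newval->value) && !bond->params.miimon)
        bond->params.miimon = BOND_DEFAULT_MIIMON;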
Reported-by: Bob Ball <ball@umich.edu>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c987ac6f1f upstream.
On Amlogic Meson GXBB & GXL platforms, the SCPI Cortex-M4 Co-Processor
seems to depend on FCLK_DIV2 to be operational.
The issue has occurred since v4.17-rc1, freezing the kernel boot when
the 'schedutil' cpufreq governor was selected as default:
[ 12.071837] scpi_protocol scpi: SCP Protocol 0.0 Firmware 0.0.0 version
domain-0 init dvfs: 4
[ 12.087757] hctosys: unable to open rtc device (rtc0)
[ 12.087907] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 12.102241] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
But when disabling the MMC driver, the boot finished but cpufreq failed to
change the CPU frequency:
[ 12.153045] cpufreq: __target_index: Failed to change cpu frequency: -5
A bisect between v4.16 and v4.16-rc1 gave
05f814402d ("clk: meson: add fdiv clock gates") to be the first bad commit.
This commit added support for the missing clock gates before the fixed PLL
fixed dividers (FCLK_DIVx), and the clock framework basically disabled
all the unused fixed dividers, thus disabling a critical clock path for
the SCPI Co-Processor.
This patch simply sets the FCLK_DIV2 gate as critical to ensure
nobody can disable it.
Fixes: 05f814402d ("clk: meson: add fdiv clock gates")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
[few corrections in the commit description]
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb493fbc15 upstream.
Currently nouveau doesn't actually expose the state debugfs file that's
usually provided for any modesetting driver that supports atomic, even
if nouveau is loaded with atomic=1. This is due to the fact that the
standard debugfs files that DRM creates for atomic drivers is called
when drm_get_pci_dev() is called from nouveau_drm.c. This happens well
before we've initialized the display core, which is currently
responsible for setting the DRIVER_ATOMIC cap.
So, move the atomic option into nouveau_drm.c and just add the
DRIVER_ATOMIC cap whenever it's enabled on the kernel commandline. This
shouldn't cause any actual issues, as the atomic ioctl will still fail
as expected even if the display core doesn't disable it until later in
the init sequence. This also provides the added benefit of being able to
use the state debugfs file to check the current display state even if
clients aren't allowed to modify it through anything other than the
legacy ioctls.
Additionally, disable the DRIVER_ATOMIC cap in nv04's display core, as
this was already disabled there previously.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76fa4975f3 upstream.
A VM which has:
- a DMA capable device passed through to it (eg. network card);
- running a malicious kernel that ignores H_PUT_TCE failure;
- capability of using IOMMU pages bigger than physical pages
can create an IOMMU mapping that exposes (for example) 16MB of
the host physical memory to the device when only 64K was allocated to the VM.
The remaining 16MB - 64K will be some other content of host memory, possibly
including pages of the VM, but also pages of host kernel memory, host
programs or other VMs.
The attacking VM does not control the location of the page it can map,
and is only allowed to map as many pages as it has pages of RAM.
We already have a check in drivers/vfio/vfio_iommu_spapr_tce.c that
an IOMMU page is contained in the physical page so the PCI hardware won't
get access to unassigned host memory; however this check is missing in
the KVM fastpath (H_PUT_TCE accelerated code). We have been lucky so far and
have not hit this yet, because the very first time the mapping happens
we do not have tbl::it_userspace allocated yet and fall back to
userspace, which in turn calls the VFIO IOMMU driver; this fails and
the guest does not retry.
This stores the smallest preregistered page size in the preregistered
region descriptor and changes the mm_iommu_xxx API to check this against
the IOMMU page size.
This calculates maximum page size as a minimum of the natural region
alignment and compound page size. For the page shift this uses the shift
returned by find_linux_pte() which indicates how the page is mapped to
the current userspace - if the page is huge and this is not a zero, then
it is a leaf pte and the page is mapped within the range.
Fixes: 121f80ba68 ("KVM: PPC: VFIO: Add in-kernel acceleration for VFIO")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 565b9937f4 upstream.
The HPLL can be configured through a register (SCU24), however some
platforms chose to configure it through the strapping settings and do
not use the register. This was not noticed as the logic for bit 18 in
SCU24 was confused: set means programmed, but the driver read it as set
means strapped.
This gives us the correct HPLL value on Palmetto systems, from which
most of the peripheral clocks are generated.
Fixes: 5eda5d79e4 ("clk: Add clock driver for ASPEED BMC SoCs")
Cc: stable@vger.kernel.org # v4.15
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61c40f35f5 upstream.
Switching the CPU from the L2 or L3 frequencies (300 and 200 MHz
respectively) to the L0 frequency (1.2 GHz) requires a significant amount
of time to let VDD stabilize to the appropriate voltage. This amount of
time is large enough that it cannot be covered by the hardware
countdown register. Due to this, the CPU might start operating at L0
before the voltage is stabilized, leading to CPU stalls.
To work around this problem, we prevent switching directly from the
L2/L3 frequencies to the L0 frequency, and instead switch to the L1
frequency in-between. The sequence therefore becomes:
1. First switch from L2/L3 (200/300 MHz) to L1 (600 MHz)
2. Sleep 20 ms to let the VDD voltage stabilize
3. Then switch from L1 (600 MHz) to L0 (1200 MHz).
It is based on the work done by Ken Ma <make@marvell.com>
Cc: stable@vger.kernel.org
Fixes: 2089dc33ea ("clk: mvebu: armada-37xx-periph: add DVFS support for cpu clocks")
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38c0a74fe0 upstream.
The MIPS implementation of pci_resource_to_user() introduced in v3.12 by
commit 4c2924b725 ("MIPS: PCI: Use pci_resource_to_user to map pci
memory space properly") incorrectly sets *end to the address of the
byte after the resource, rather than the last byte of the resource.
This results in userland seeing resources as a byte larger than they
actually are, for example a 32 byte BAR will be reported by a tool such
as lspci as being 33 bytes in size:
Region 2: I/O ports at 1000 [disabled] [size=33]
Correct this by subtracting one from the calculated end address,
reporting the correct address to userland.
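The off-by-one is ordinary inclusive-versus-exclusive end arithmetic; a
standalone illustration with the same numbers as the lspci example above:
#include <stdio.h>

int main(void)
{
        unsigned long start = 0x1000, size = 32;
        unsigned long bad_end = start + size;       /* 0x1020: tools report size = end - start + 1 = 33 */
        unsigned long end = start + size - 1;       /* 0x101f: last byte of the resource, size = 32 */

        printf("wrong end %#lx, correct end %#lx\n", bad_end, end);
        return 0;
}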
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Rui Wang <rui.wang@windriver.com>
Fixes: 4c2924b725 ("MIPS: PCI: Use pci_resource_to_user to map pci memory space properly")
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v3.12+
Patchwork: https://patchwork.linux-mips.org/patch/19829/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ec916f82c upstream.
This commit may cause a less than required dma mask to be used for
some allocations, which apparently leads to module load failures for
iwlwifi sometimes.
This reverts commit d657c5c73c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Fabio Coatti <fabio.coatti@gmail.com>
Tested-by: Fabio Coatti <fabio.coatti@gmail.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd28325249 upstream.
This lets userspace read the MSR_IA32_ARCH_CAPABILITIES and check that all
requested features are available on the host.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 229bc19fd7 upstream.
Don't rely on the event interrupt (EINT) bit alone to detect a pending port
change in resume. If no change event is detected the host may be suspended
again, otherwise the roothubs are resumed.
There is a lag in the xHC setting EINT. If we don't notice the pending change
in resume, and the controller is runtime suspended again, it causes the
event handler to assume the host is dead as it will fail to read xHC registers
once PCI puts the controller to D3 state.
[ 268.520969] xhci_hcd: xhci_resume: starting port polling.
[ 268.520985] xhci_hcd: xhci_hub_status_data: stopping port polling.
[ 268.521030] xhci_hcd: xhci_suspend: stopping port polling.
[ 268.521040] xhci_hcd: // Setting command ring address to 0x349bd001
[ 268.521139] xhci_hcd: Port Status Change Event for port 3
[ 268.521149] xhci_hcd: resume root hub
[ 268.521163] xhci_hcd: port resume event for port 3
[ 268.521168] xhci_hcd: xHC is not running.
[ 268.521174] xhci_hcd: handle_port_status: starting port polling.
[ 268.596322] xhci_hcd: xhci_hc_died: xHCI host controller not responding, assume dead
The EINT lag is described in an additional note in the xHCI specs, section 4.19.2:
"Due to internal xHC scheduling and system delays, there will be a lag
between a change bit being set and the Port Status Change Event that it
generated being written to the Event Ring. If SW reads the PORTSC and
sees a change bit set, there is no guarantee that the corresponding Port
Status Change Event has already been written into the Event Ring."
Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f88a333b44 upstream.
kernel_wait4() expects a userland address for status - it's only
rusage that goes as a kernel one (and needs a copyout afterwards)
[ Also, fix the prototype of kernel_wait4() to have that __user
annotation - Linus ]
Fixes: 92ebce5ac5 ("osf_wait4: switch to kernel_wait4()")
Cc: stable@kernel.org # v4.13+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5c968f4802 ]
mii_nway_restart is not pm aware, which results in an rtnl deadlock.
Implement the mii_nway_restart logic manually by setting BMCR_ANRESTART if
BMCR_ANENABLE is set.
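Roughly what the manual restart looks like, using the generic autonegotiation
bits from <linux/mii.h>; mdio_read_pm()/mdio_write_pm() are hypothetical
stand-ins for whatever PM-aware MDIO accessors the driver provides, not real
function names:
int bmcr = mdio_read_pm(dev, phy_id, MII_BMCR);         /* hypothetical accessor */

if (bmcr >= 0 && (bmcr & BMCR_ANENABLE))
        mdio_write_pm(dev, phy_id, MII_BMCR, bmcr | BMCR_ANRESTART);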
To reproduce:
* plug an asix based usb network interface
* wait until the device enters PM (~5 sec)
* `ip link set eth1 up` will never return
Fixes: d9fe64e511 ("net: asix: Add in_pm parameter")
Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e66515999b ]
Commit adc176c547 ("ipv6 addrconf: Implemented enhanced DAD (RFC7527)")
added enhanced DAD with a nonce length of 6 bytes. However, RFC7527
doesn't specify the length of the nonce, other than being 6 + 8*k bytes,
with integer k >= 0 (RFC3971 5.3.2). The current implementation simply
assumes that the nonce will always be 6 bytes, but other systems are
free to choose different sizes.
If another system sends a nonce of different length but with the same 6
bytes prefix, it shouldn't be considered as the same nonce. Thus, check
that the length of the received nonce is the same as the length we sent.
Ugly scapy test script running on veth0:
from scapy.all import *

def loop():
    pkt = sniff(iface="veth0", filter="icmp6", count=1)
    pkt = pkt[0]
    b = bytearray(pkt[Raw].load)
    b[1] += 1
    b += b'\xde\xad\xbe\xef\xde\xad\xbe\xef'
    pkt[Raw].load = bytes(b)
    pkt[IPv6].plen += 8
    # fixup checksum after modifying the payload
    pkt[IPv6].payload.cksum -= 0x3b44
    if pkt[IPv6].payload.cksum < 0:
        pkt[IPv6].payload.cksum += 0xffff
    sendp(pkt, iface="veth0")
This should result in DAD failure for any address added to veth0's peer,
but is currently ignored.
Fixes: adc176c547 ("ipv6 addrconf: Implemented enhanced DAD (RFC7527)")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9e3bff9239 ]
SYSTEMPORT Lite reversed the logic compared to SYSTEMPORT: the
GIB_FCS_STRIP bit is set when the Ethernet FCS is stripped, and that bit
is not set by default. Fix the logic such that we properly check whether
that bit is set or not and we don't forward an extra 4 bytes to the
network stack.
Fixes: 44a4524c54 ("net: systemport: Add support for SYSTEMPORT Lite")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 432e629e56 ]
When a new rx packet arrives, the rx path will decide whether to reuse
the remainder of the page or not according to one of the below conditions:
1. frag_info->frag_stride == PAGE_SIZE / 2
2. frags->page_offset + frag_info->frag_size > PAGE_SIZE;
The first condition is not met when XDP is set.
For XDP, page_offset is always set to priv->rx_headroom which is
XDP_PACKET_HEADROOM and frag_info->frag_size is around mtu size + some
padding, still the 2nd release condition will hold since
XDP_PACKET_HEADROOM + 1536 < PAGE_SIZE, as a result the page will not
be released and will be _wrongly_ reused for next free rx descriptor.
In XDP there is an assumption to have a page per packet and reuse can
break such assumption and might cause packet data corruptions.
Fix this by adding an extra condition (!priv->rx_headroom) to the 2nd
case to avoid page reuse when XDP is set, since rx_headroom is set to 0
for non XDP setup and set to XDP_PACKET_HEADROOM for XDP setup.
No additional cache line is required for the new condition.
Fixes: 34db548bfb ("mlx4: add page recycling in receive path")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Suggested-by: Martin KaFai Lau <kafai@fb.com>
CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94b3b54230 ]
Setting up macvlan/macvtap networks over the atlantic NIC results
in no traffic over these networks because ndo_set_rx_mode did
not list UC MACs as registered in the unicast filter.
Here we fix that by taking into account the maximum number of UC
filters supported by hardware. If more than MAX addresses were
registered, we just enable promisc and/or allmulti to pass
the traffic in.
We also remove the MULTICAST_ADDRESS_MAX constant from aq_cfg since
that's not a configurable parameter at all.
Fixes: b21f502 ("net:ethernet:aquantia: Fix for multicast filter handling.")
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6b81b193b8 ]
If the out ring is temporarily full and the receive completion cannot go out,
we may still need to reschedule napi if certain conditions are met.
Otherwise the napi poll might be stopped forever, and cause network
disconnect.
Fixes: 7426b1a518 ("netvsc: optimize receive completions")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a659254755 ]
After commit b6c5734db0 ("sctp: fix the handling of ICMP Frag Needed
for too small MTUs"), sctp_transport_update_pmtu would refetch pathmtu
from the dst and set it to transport's pathmtu without any check.
The new pathmtu may be lower than MINSEGMENT if the dst is obsolete and
updated by .get_dst() in sctp_transport_update_pmtu. In this case, it
could have a smaller MTU as well, and thus we should validate it
against MINSEGMENT instead.
Syzbot reported a warning in sctp_mtu_payload caused by this.
This patch refetches the pathmtu by calling sctp_dst_mtu where it does
the check against MINSEGMENT.
v1->v2:
- refetch the pathmtu by calling sctp_dst_mtu instead, as per Marcelo's
suggestion.
Fixes: b6c5734db0 ("sctp: fix the handling of ICMP Frag Needed for too small MTUs")
Reported-by: syzbot+f0d9d7cba052f9344b03@syzkaller.appspotmail.com
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6ff0f871c2 ]
Which makes sure that the MTU respects the minimum value of
SCTP_DEFAULT_MINSEGMENT and that it is correctly aligned.
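A simplified model of the behaviour described, not the kernel source; it
assumes SCTP_DEFAULT_MINSEGMENT is 512 and that "correctly aligned" means
truncated down to a multiple of 4:
static unsigned int sctp_dst_mtu_model(unsigned int dst_mtu)
{
        unsigned int mtu = dst_mtu > 512 ? dst_mtu : 512;   /* enforce the minimum */

        return mtu & ~3u;                                   /* truncate to 4-byte alignment */
}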
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b7ed879425 ]
A KASAN:use-after-free bug was found related to ip6-erspan
while running selftests/net/ip6_gre_headroom.sh
It happens because of following sequence:
- ipv6hdr pointer is obtained from skb
- skb_cow_head() is called, skb->head memory is reallocated
- old data is accessed using ipv6hdr pointer
skb_cow_head() call was added in e41c7c68ea ("ip6erspan: make sure
enough headroom at xmit."), but looking at the history there was a
chance of a similar bug because gre_handle_offloads() and pskb_trim()
can also reallocate skb->head memory. The Fixes tag points to the commit
which introduced the possibility of this bug.
This patch moves the ipv6hdr pointer assignment after the skb_cow_head() call.
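The underlying bug class is keeping a pointer across a call that may
reallocate the buffer it points into; a small userspace analogy, with plain
realloc() standing in for skb_cow_head():
#include <stdlib.h>
#include <string.h>

int main(void)
{
        char *buf = malloc(64);
        if (!buf)
                return 1;

        char *hdr = buf;                  /* pointer into the buffer, like the ipv6hdr pointer */
        char *tmp = realloc(buf, 65536);  /* may move the data, like skb_cow_head() */
        if (!tmp) {
                free(buf);
                return 1;
        }
        buf = tmp;

        hdr = buf;                        /* re-derive the pointer; using the old one would be a use-after-free */
        memset(hdr, 0, 40);
        free(buf);
        return 0;
}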
Fixes: 5a963eb61b ("ip6_gre: Add ERSPAN native tunnel support")
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 38cd58ed9c ]
This adds the USB id of LTE modem Quectel EG91. It requires the
same quirk as other Quectel modems to make it work.
Signed-off-by: Matevz Vucnik <vucnikm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ba8376ce1 ]
It seems that a *break* is missing in order to avoid falling through
to the default case. Otherwise, checking *chan* makes no sense.
Fixes: 72df7a7244 ("ptp: Allow reassigning calibration pin function")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit df8ed346d4 ]
Currently the pause flags are also removed from phydev->supported because
they're not included in PHY_DEFAULT_FEATURES. I don't think this is
intended, especially when considering that this function can be called
via phy_set_max_speed() anywhere in a driver. Change the masking to mask
out only the values we're going to change. In addition remove the
misleading comment; the job of this small function is just to adjust the
supported and advertised speeds.
Fixes: f3a6bd393c ("phylib: Add phy_set_max_speed helper")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b5d2d75e07 ]
Eric reported that reverting the patch that fixed and simplified IPv6
multipath routes means reverting back to invalid userspace notifications.
eg.,
$ ip -6 route add 2001:db8:1::/64 nexthop dev eth0 nexthop dev eth1
only generates a single notification:
2001:db8:1::/64 dev eth0 metric 1024 pref medium
While working on a fix for this problem I found another case that is just
broken completely - a multipath route with a gateway followed by device
followed by gateway:
$ ip -6 ro add 2001:db8:103::/64
nexthop via 2001:db8:1::64
nexthop dev dummy2
nexthop via 2001:db8:3::64
In this case the device only route is dropped completely - no notification
to userspace but no addition to the FIB either:
$ ip -6 ro ls
2001:db8:1::/64 dev dummy1 proto kernel metric 256 pref medium
2001:db8:2::/64 dev dummy2 proto kernel metric 256 pref medium
2001:db8:3::/64 dev dummy3 proto kernel metric 256 pref medium
2001:db8:103::/64 metric 1024
nexthop via 2001:db8:1::64 dev dummy1 weight 1
nexthop via 2001:db8:3::64 dev dummy3 weight 1 pref medium
fe80::/64 dev dummy1 proto kernel metric 256 pref medium
fe80::/64 dev dummy2 proto kernel metric 256 pref medium
fe80::/64 dev dummy3 proto kernel metric 256 pref medium
Really, IPv6 multipath is just FUBAR'ed beyond repair when it comes to
device only routes, so do not allow it at all.
This change will break any scripts relying on the mpath api for insert,
but I don't see any other way to handle the permutations. Besides, since
the routes are added to the FIB as standalone (non-multipath) routes the
kernel is not doing what the user requested, so it might as well tell the
user that.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e7372197e1 ]
Xin reported that icmp replies may not use the address on the device the
echo request is received if the destination address is broadcast. Instead
a route lookup is done without considering VRF context. Fix by setting
oif in flow struct to the master device if it is enslaved. That directs
the lookup to the VRF table. If the device is not enslaved, oif is still
0 so there is no effect.
Fixes: cd2fbe1b6b ("net: Use VRF device index for lookups on RX")
Reported-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e78bfb0751 ]
Commit 8b7008620b ("net: Don't copy pfmemalloc flag in
__copy_skb_header()") introduced a different handling for the
pfmemalloc flag in copy and clone paths.
In __skb_clone(), now, the flag is set only if it was set in the
original skb, but not cleared if it wasn't. This is wrong and
might lead to socket buffers being flagged with pfmemalloc even
if the skb data wasn't allocated from pfmemalloc reserves. Copy
the flag instead of ORing it.
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Fixes: 8b7008620b ("net: Don't copy pfmemalloc flag in __copy_skb_header()")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8b7008620b ]
The pfmemalloc flag indicates that the skb was allocated from
the PFMEMALLOC reserves, and the flag is currently copied on skb
copy and clone.
However, an skb copied from an skb flagged with pfmemalloc
wasn't necessarily allocated from PFMEMALLOC reserves, and on
the other hand an skb allocated that way might be copied from an
skb that wasn't.
So we should not copy the flag on skb copy, and rather decide
whether to allow an skb to be associated with sockets unrelated
to page reclaim depending only on how it was allocated.
Move the pfmemalloc flag before headers_start[0] using an
existing 1-bit hole, so that __copy_skb_header() doesn't copy
it.
When cloning, we'll now take care of this flag explicitly,
contravening the warning comment of __skb_clone().
While at it, restore the newline usage introduced by commit
b193722731 ("net: reorganize sk_buff for faster
__copy_skb_header()") to visually separate bytes used in
bitfields after headers_start[0], that was gone after commit
a9e419dc7b ("netfilter: merge ctinfo into nfct pointer storage
area"), and describe the pfmemalloc flag in the kernel-doc
structure comment.
This doesn't change the size of sk_buff or cacheline boundaries,
but consolidates the 15-bit hole before tc_index into a 2-byte
hole before csum, which could now be filled more easily.
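The marker trick relied on here can be shown with a toy struct (not the real
sk_buff layout): fields placed before the zero-length headers_start[] marker
are excluded from the memcpy() that __copy_skb_header()-style code performs
over the marker-delimited range.
#include <stddef.h>
#include <string.h>

struct toy_skb {
        int refcount;                   /* never copied between buffers */
        unsigned int pfmemalloc:1;      /* moved before headers_start: no longer copied */
        char headers_start[0];          /* zero-length marker (GNU C, as in the kernel) */
        unsigned int mark;
        unsigned short tc_index;
        char headers_end[0];            /* zero-length marker */
};

static void copy_header(struct toy_skb *dst, const struct toy_skb *src)
{
        /* copies everything between the two markers and nothing outside them */
        memcpy(dst->headers_start, src->headers_start,
               offsetof(struct toy_skb, headers_end) -
               offsetof(struct toy_skb, headers_start));
}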
Reported-by: Patrick Talbert <ptalbert@redhat.com>
Fixes: c93bdd0e03 ("netvm: allow skb allocation to use PFMEMALLOC reserves")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit acc2cf4e37 ]
When tcp_diag_destroy closes a TCP_NEW_SYN_RECV socket, it first
frees it by calling inet_csk_reqsk_queue_drop_and_and_put in
tcp_abort, and then frees it again by calling sock_gen_put.
Since tcp_abort only has one caller, and all the other codepaths
in tcp_abort don't free the socket, just remove the free in that
function.
Cc: David Ahern <dsa@cumulusnetworks.com>
Tested: passes Android sock_diag_test.py, which exercises this codepath
Fixes: d7226c7a4d ("net: diag: Fix refcnt leak in error path destroying socket")
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 107d01f5ba ]
rhashtable_init() currently does not take into account the user-passed
min_size parameter unless param->nelem_hint is set as well. As such,
the default size (number of buckets) will always be HASH_DEFAULT_SIZE
even if the smallest allowed size is larger than that. Remediate this
by unconditionally calling into rounded_hashtable_size() and handling
things accordingly.
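A simplified sketch of the sizing rule after this change (not the kernel's
rounded_hashtable_size(); the 4/3 load factor and the 64-bucket default are
illustrative assumptions): the min_size floor is applied whether or not an
element hint was given.
static unsigned int rounded_size(unsigned int nelem_hint, unsigned int min_size)
{
        unsigned int size = nelem_hint ? nelem_hint * 4 / 3 : 64;

        if (size < min_size)
                size = min_size;        /* honour min_size even without a hint */

        /* round up to the next power of two */
        size--;
        size |= size >> 1;  size |= size >> 2;  size |= size >> 4;
        size |= size >> 8;  size |= size >> 16;
        return size + 1;
}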
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 83ed7d1fe2 ]
My randconfig builds came across an old missing dependency for ILA:
ERROR: "dst_cache_set_ip6" [net/ipv6/ila/ila.ko] undefined!
ERROR: "dst_cache_get" [net/ipv6/ila/ila.ko] undefined!
ERROR: "dst_cache_init" [net/ipv6/ila/ila.ko] undefined!
ERROR: "dst_cache_destroy" [net/ipv6/ila/ila.ko] undefined!
We almost never run into this by accident because randconfig builds
end up selecting DST_CACHE from some other tunnel protocol, and this
one appears to be the only one missing the explicit 'select'.
From all I can tell, this problem first appeared in linux-4.9
when dst_cache support got added to ILA.
Fixes: 79ff2fc31e ("ila: Cache a route to translated address")
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 169dc027fb ]
The rol32 call is currently rotating hash but the rol'd value is
being discarded. I believe the current code is incorrect and hash
should be assigned the rotated value returned from rol32.
Thanks to David Lebrun for spotting this.
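For reference, rol32() returns the rotated value rather than rotating in
place, so the result must be assigned back; the helper below is written in
the same spirit as the kernel's rol32():
static inline unsigned int rol32(unsigned int word, unsigned int shift)
{
        return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

static unsigned int mix(unsigned int hash)
{
        /* before the fix: rol32(hash, 16);  the rotated value was discarded */
        hash = rol32(hash, 16);            /* after the fix: the result is used */
        return hash;
}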
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 70ba5b6db9 ]
The low and high values of the net.ipv4.ping_group_range sysctl were
being silently forced to the default disabled state when a write to the
sysctl contained GIDs that didn't map to the associated user namespace.
Confusingly, the sysctl's write operation would return success and then
a subsequent read of the sysctl would indicate that the low and high
values are the overflowgid.
This patch changes the behavior by clearly returning an error when the
sysctl write operation receives a GID range that doesn't map to the
associated user namespace. In such a situation, the previous value of
the sysctl is preserved and that range will be returned in a subsequent
read of the sysctl.
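The user-visible change can be exercised with a short program that writes a
GID range to the sysctl and checks the result of write(); run inside a user
namespace where the chosen GIDs are not mapped, the write is now expected to
fail instead of silently resetting the range (the exact errno is not asserted
here):
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char *range = "100 200\n";
        int fd = open("/proc/sys/net/ipv4/ping_group_range", O_WRONLY);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        if (write(fd, range, strlen(range)) < 0)
                perror("write");        /* expected when the GIDs don't map into this namespace */
        else
                puts("write accepted");
        close(fd);
        return 0;
}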
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d5a672ac9f ]
The gen_stats facility will add a header for the toplevel nlattr of type
TCA_STATS2 that contains all stats added by qdisc callbacks. A reference
to this header is stored in the gnet_dump struct, and when all the
per-qdisc callbacks have finished adding their stats, the length of the
containing header will be adjusted to the right value.
However, on architectures that need padding (i.e., that don't set
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS), the padding nlattr is added
before the stats, which means that the stored pointer will point to the
padding, and so when the header is fixed up, the result is just a very
big padding nlattr. Because most qdiscs also supply the legacy TCA_STATS
struct, this problem has been mostly invisible, but we exposed it with
the netlink attribute-based statistics in CAKE.
Fix the issue by fixing up the stored pointer if it points to a padding
nlattr.
Tested-by: Pete Heist <pete@heistp.net>
Tested-by: Kevin Darbyshire-Bryant <kevin@darbyshire-bryant.me.uk>
Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37afe55b4a upstream.
When MST and atomic were introduced to nouveau, another structure that
could contain a drm_connector embedded within it was introduced: struct
nv50_mstc. This meant that we no longer would be able to simply loop
through our connector list and assume that nouveau_connector() would
return a proper pointer for each connector, since the assertion that
all connectors coming from nouveau have a full nouveau_connector struct
became invalid.
Unfortunately, none of the actual code that looped through connectors
ever got updated, which means that we've been causing invalid memory
accesses for quite a while now.
An example that was caught by KASAN:
[ 201.038698] ==================================================================
[ 201.038792] BUG: KASAN: slab-out-of-bounds in nvif_notify_get+0x190/0x1a0 [nouveau]
[ 201.038797] Read of size 4 at addr ffff88076738c650 by task kworker/0:3/718
[ 201.038800]
[ 201.038822] CPU: 0 PID: 718 Comm: kworker/0:3 Tainted: G O 4.18.0-rc4Lyude-Test+ #1
[ 201.038825] Hardware name: LENOVO 20EQS64N0B/20EQS64N0B, BIOS N1EET78W (1.51 ) 05/18/2018
[ 201.038882] Workqueue: events nouveau_display_hpd_work [nouveau]
[ 201.038887] Call Trace:
[ 201.038894] dump_stack+0xa4/0xfd
[ 201.038900] print_address_description+0x71/0x239
[ 201.038929] ? nvif_notify_get+0x190/0x1a0 [nouveau]
[ 201.038935] kasan_report.cold.6+0x242/0x2fe
[ 201.038942] __asan_report_load4_noabort+0x19/0x20
[ 201.038970] nvif_notify_get+0x190/0x1a0 [nouveau]
[ 201.038998] ? nvif_notify_put+0x1f0/0x1f0 [nouveau]
[ 201.039003] ? kmsg_dump_rewind_nolock+0xe4/0xe4
[ 201.039049] nouveau_display_init.cold.12+0x34/0x39 [nouveau]
[ 201.039089] ? nouveau_user_framebuffer_create+0x120/0x120 [nouveau]
[ 201.039133] nouveau_display_resume+0x5c0/0x810 [nouveau]
[ 201.039173] ? nvkm_client_ioctl+0x20/0x20 [nouveau]
[ 201.039215] nouveau_do_resume+0x19f/0x570 [nouveau]
[ 201.039256] nouveau_pmops_runtime_resume+0xd8/0x2a0 [nouveau]
[ 201.039264] pci_pm_runtime_resume+0x130/0x250
[ 201.039269] ? pci_restore_standard_config+0x70/0x70
[ 201.039275] __rpm_callback+0x1f2/0x5d0
[ 201.039279] ? rpm_resume+0x560/0x18a0
[ 201.039283] ? pci_restore_standard_config+0x70/0x70
[ 201.039287] ? pci_restore_standard_config+0x70/0x70
[ 201.039291] ? pci_restore_standard_config+0x70/0x70
[ 201.039296] rpm_callback+0x175/0x210
[ 201.039300] ? pci_restore_standard_config+0x70/0x70
[ 201.039305] rpm_resume+0xcc3/0x18a0
[ 201.039312] ? rpm_callback+0x210/0x210
[ 201.039317] ? __pm_runtime_resume+0x9e/0x100
[ 201.039322] ? kasan_check_write+0x14/0x20
[ 201.039326] ? do_raw_spin_lock+0xc2/0x1c0
[ 201.039333] __pm_runtime_resume+0xac/0x100
[ 201.039374] nouveau_display_hpd_work+0x67/0x1f0 [nouveau]
[ 201.039380] process_one_work+0x7a0/0x14d0
[ 201.039388] ? cancel_delayed_work_sync+0x20/0x20
[ 201.039392] ? lock_acquire+0x113/0x310
[ 201.039398] ? kasan_check_write+0x14/0x20
[ 201.039402] ? do_raw_spin_lock+0xc2/0x1c0
[ 201.039409] worker_thread+0x86/0xb50
[ 201.039418] kthread+0x2e9/0x3a0
[ 201.039422] ? process_one_work+0x14d0/0x14d0
[ 201.039426] ? kthread_create_worker_on_cpu+0xc0/0xc0
[ 201.039431] ret_from_fork+0x3a/0x50
[ 201.039441]
[ 201.039444] Allocated by task 79:
[ 201.039449] save_stack+0x43/0xd0
[ 201.039452] kasan_kmalloc+0xc4/0xe0
[ 201.039456] kmem_cache_alloc_trace+0x10a/0x260
[ 201.039494] nv50_mstm_add_connector+0x9a/0x340 [nouveau]
[ 201.039504] drm_dp_add_port+0xff5/0x1fc0 [drm_kms_helper]
[ 201.039511] drm_dp_send_link_address+0x4a7/0x740 [drm_kms_helper]
[ 201.039518] drm_dp_check_and_send_link_address+0x1a7/0x210 [drm_kms_helper]
[ 201.039525] drm_dp_mst_link_probe_work+0x71/0xb0 [drm_kms_helper]
[ 201.039529] process_one_work+0x7a0/0x14d0
[ 201.039533] worker_thread+0x86/0xb50
[ 201.039537] kthread+0x2e9/0x3a0
[ 201.039541] ret_from_fork+0x3a/0x50
[ 201.039543]
[ 201.039546] Freed by task 0:
[ 201.039549] (stack is not available)
[ 201.039551]
[ 201.039555] The buggy address belongs to the object at ffff88076738c1a8
which belongs to the cache kmalloc-2048 of size 2048
[ 201.039559] The buggy address is located 1192 bytes inside of
2048-byte region [ffff88076738c1a8, ffff88076738c9a8)
[ 201.039563] The buggy address belongs to the page:
[ 201.039567] page:ffffea001d9ce200 count:1 mapcount:0 mapping:ffff88084000d0c0 index:0x0 compound_mapcount: 0
[ 201.039573] flags: 0x8000000000008100(slab|head)
[ 201.039578] raw: 8000000000008100 ffffea001da3be08 ffffea001da25a08 ffff88084000d0c0
[ 201.039582] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000
[ 201.039585] page dumped because: kasan: bad access detected
[ 201.039588]
[ 201.039591] Memory state around the buggy address:
[ 201.039594] ffff88076738c500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 201.039598] ffff88076738c580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 201.039601] >ffff88076738c600: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
[ 201.039604] ^
[ 201.039607] ffff88076738c680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 201.039611] ffff88076738c700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 201.039613] ==================================================================
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22b76bbe08 upstream.
Every codepath in nouveau that loops through the connector list
currently does so using the old method, which is prone to race
conditions from MST connectors being created and destroyed. This has
been causing a multitude of problems, including memory corruption from
trying to access connectors that have already been freed!
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Cc: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68fe23a626 upstream.
This both uses the legacy modesetting structures in a racy manner, and
additionally also doesn't even check the right variable (enabled != the
CRTC is actually turned on for atomic).
This fixes issues on my P50 regarding the dedicated GPU not entering
runtime suspend.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5292221d6d upstream.
This reverts commit 018d82e5f0.
This breaks DDC in certain cases. Revert for 4.18 and previous kernels.
For 4.19, this is fixed with the following more extensive patches:
drm/amd/display: Serialize is_dp_sink_present
drm/amd/display: Break out function to simply read aux reply
drm/amd/display: Return aux replies directly to DRM
drm/amd/display: Right shift AUX reply value sooner than later
drm/amd/display: Read AUX channel even if only status byte is returned
Link: https://lists.freedesktop.org/archives/amd-gfx/2018-July/023788.html
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b03897cf31 upstream.
On 64-bit servers, SPRN_SPRG3 and its userspace read-only mirror
SPRN_USPRG3 are used as userspace VDSO write and read registers
respectively.
SPRN_SPRG3 is lost when we enter stop4 and above, and is currently not
restored. As a result, any read from SPRN_USPRG3 returns zero on an
exit from stop4 (Power9 only) and above.
Thus in this situation, on POWER9, any call from sched_getcpu() always
returns zero, as on powerpc, we call __kernel_getcpu() which relies
upon SPRN_USPRG3 to report the CPU and NUMA node information.
Fix this by restoring SPRN_SPRG3 on wake up from a deep stop state
with the sprg_vdso value that is cached in PACA.
Fixes: e1c1cfed54 ("powerpc/powernv: Save/Restore additional SPRs for stop4 cpuidle")
Cc: stable@vger.kernel.org # v4.14+
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fb8d5dc4b upstream.
When cpu_stop_queue_two_works() begins to wake the stopper threads, it does
so without preemption disabled, which leads to the following race
condition:
The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source
CPU, and cpu2 as the destination CPU. When adding the stopper threads to
the wake queue used in this function, the source CPU stopper thread is
added first, and the destination CPU stopper thread is added last.
When wake_up_q() is invoked to wake the stopper threads, the threads are
woken up in the order that they are queued in, so the source CPU's stopper
thread is woken up first, and it preempts the thread running on the source
CPU.
The stopper thread will then execute on the source CPU, disable preemption,
and begin executing multi_cpu_stop(), and wait for an ack from the
destination CPU's stopper thread, with preemption still disabled. Since the
worker thread that woke up the stopper thread on the source CPU is affine
to the source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper thread from
the wake queue, and thus, the destination CPU's stopper thread will never
run, causing the source CPU's stopper thread to wait forever, and stall.
Disable preemption when waking the stopper threads in
cpu_stop_queue_two_works().
Fixes: 0b26351b91 ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Co-Developed-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Isaac J. Manjarres <isaacm@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: peterz@infradead.org
Cc: matt@codeblueprint.co.uk
Cc: bigeasy@linutronix.de
Cc: gregkh@linuxfoundation.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1530655334-4601-1-git-send-email-isaacm@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e714d2778 upstream.
info.index can be indirectly controlled by user-space, hence leading
to a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/vfio/pci/vfio_pci.c:734 vfio_pci_ioctl()
warn: potential spectre issue 'vdev->region'
Fix this by sanitizing info.index before indirectly using it to index
vdev->region
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
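The sanitization in question is the usual branchless index clamp; the kernel's
array_index_nospec() helper exists for exactly this, and the standalone
function below is a simplified illustration of the idea rather than that exact
implementation (it relies on arithmetic right shift of a negative value, as
compilers used for the kernel provide):
static unsigned long clamp_index(unsigned long index, unsigned long size)
{
        /* all-ones mask when index < size, all-zeroes otherwise, computed as
         * a data dependency instead of a branch the CPU could speculate past */
        unsigned long mask = (unsigned long)(~(long)(index | (size - 1UL - index)) >>
                                             (sizeof(long) * 8 - 1));

        return index & mask;
}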
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95d6c0857e upstream.
Currently, intel_pstate doesn't register if _PSS is not present on
HP Proliant systems, because it expects the firmware to take over
CPU performance scaling in that case. However, if ACPI PCCH is
present, the firmware expects the kernel to use it for CPU
performance scaling and the pcc-cpufreq driver is loaded for that.
Unfortunately, the firmware interface used by that driver is not
scalable for fundamental reasons, so pcc-cpufreq is way suboptimal
on systems with more than just a few CPUs. In fact, it is better to
avoid using it at all.
For this reason, modify intel_pstate to look for ACPI PCCH if _PSS
is not present and register if it is there. Also prevent the
pcc-cpufreq driver from trying to initialize itself if intel_pstate
has been registered already.
Fixes: fbbcdc0744 (intel_pstate: skip the driver if ACPI has power mgmt option)
Reported-by: Andreas Herrmann <aherrmann@suse.com>
Reviewed-by: Andreas Herrmann <aherrmann@suse.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Tested-by: Andreas Herrmann <aherrmann@suse.com>
Cc: 4.16+ <stable@vger.kernel.org> # 4.16+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f15bde671 upstream.
It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.
Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
mem_cgroup_iter+0x2e0/0x6d4
shrink_zone+0x8c/0x324
balance_pgdat+0x450/0x640
kswapd+0x130/0x4b8
kthread+0xe8/0xfc
ret_from_fork+0x10/0x20
mem_cgroup_iter():
......
if (css_tryget(css)) <-- crash here
break;
......
The reason for the crash is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, which has been freed before and
filled with POISON_FREE (0x6b).
And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non-hierarchical mode.
Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.com
Fixes: 6df38689e0 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: <chunyan.zhang@unisoc.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93312b6da4 upstream.
mprotect(EXEC) was failing for stack mappings as the default vm flags were
missing MAYEXEC.
This was triggered by the glibc test suite nptl/tst-execstack testcase.
What is surprising is that despite running LTP for years, we didn't
catch this issue as it lacks a directed test case.
gcc dejagnu tests with nested functions, which also require an exec stack, work
fine though because they rely on the GNU_STACK segment emitted by the
compiler and handled in the kernel ELF loader.
This glibc case is different as the stack is non-exec to begin with and
a dlopen of a shared lib with a GNU_STACK segment triggers the exec stack
proceedings using an mprotect(PROT_EXEC), which was broken.
CC: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64234961c1 upstream.
We used to have a pre-set CONFIG_INITRAMFS_SOURCE with a local path
to the initramfs in ARC defconfigs. This was quite convenient for
in-house development but not that convenient for newcomers
who obviously don't have folders like "arc_initramfs" next to
the Linux source tree. This leads to a quite surprising failure
of the defconfig build:
------------------------------->8-----------------------------
../scripts/gen_initramfs_list.sh: Cannot open '../../arc_initramfs_hs/'
../usr/Makefile:57: recipe for target 'usr/initramfs_data.cpio.gz' failed
make[2]: *** [usr/initramfs_data.cpio.gz] Error 1
------------------------------->8-----------------------------
So now when more and more people start to deal with our defconfigs
let's make their life easier with removal of CONFIG_INITRAMFS_SOURCE.
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e3761145a upstream.
swap was broken on ARC due to a silly copy-paste issue.
We encode the offset from the swapcache page in __swp_entry() as (off << 13) but
were not decoding it back in __swp_offset() as (off >> 13) - it was still
(off << 13).
This finally fixes swap usage on ARC.
| # mkswap /dev/sda2
|
| # swapon -a -e /dev/sda2
| Adding 500728k swap on /dev/sda2. Priority:-2 extents:1 across:500728k
|
| # free
| total used free shared buffers cached
| Mem: 765104 13456 751648 4736 8 4736
| -/+ buffers/cache: 8712 756392
| Swap: 500728 0 500728
Cc: stable@vger.kernel.org
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af1fc5baa7 upstream.
This manifested as strace segfaulting on HSDK because gcc was targeting
the accumulator registers as GPRs, which the kernel was not saving/restoring
by default.
Cc: stable@vger.kernel.org #4.14+
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c991e408d upstream.
Markus reported that BTS is sporadically missing the tail of the trace
in the perf_event data buffer: [decode error (1): instruction overflow]
shown in GDB; and bisected it to the conversion of debug_store to PTI.
A little "optimization" crept into alloc_bts_buffer(), which mistakenly
placed bts_interrupt_threshold away from the 24-byte record boundary.
Intel SDM Vol 3B 17.4.9 says "This address must point to an offset from
the BTS buffer base that is a multiple of the BTS record size."
Revert "max" from a byte count to a record count, to calculate the
bts_interrupt_threshold correctly: which turns out to fix problem seen.
Fixes: c1961a4631 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-and-tested-by: Markus T Metzger <markus.t.metzger@intel.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@vger.kernel.org # v4.14+
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.1807141248290.1614@eggly.anvils
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94ffba4846 upstream.
pvti_cpu0_va is the address of shared kvmclock data structure.
pvti_cpu0_va is currently kept unset (1) on 32 bit systems, (2) when
kvmclock vsyscall is disabled, and (3) if kvmclock is not stable.
This poses a problem, because kvm_ptp needs pvti_cpu0_va, but (1) can
work on 32 bit, (2) has little relation to the vsyscall, and (3) does
not need stable kvmclock (although kvmclock won't be used for system
clock if it's not stable, so kvm_ptp is pointless in that case).
Expose pvti_cpu0_va whenever kvmclock is enabled to allow all users to
work with it.
This fixes a regression found on Gentoo: https://bugs.gentoo.org/658544.
Fixes: 9f08890ab9 ("x86/pvclock: add setter for pvclock_pvti_cpu0_va")
Cc: stable@vger.kernel.org
Reported-by: Andreas Steinmetz <ast@domdv.de>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b062b794c7 upstream.
When we switched from doing rdmsr() to reading FS/GS base values from
current->thread we completely forgot about legacy 32-bit userspaces which
we still support in KVM (why?). task->thread.{fsbase,gsbase} are only
synced for 64-bit processes, so calling save_fsgs_for_kvm() and using
its result from current is illegal for legacy processes.
There's no ARCH_SET_FS/GS prctls for legacy applications. Base MSRs are,
however, not always equal to zero. Intel's manual says (3.4.4 Segment
Loading Instructions in IA-32e Mode):
"In order to set up compatibility mode for an application, segment-load
instructions (MOV to Sreg, POP Sreg) work normally in 64-bit mode. An
entry is read from the system descriptor table (GDT or LDT) and is loaded
in the hidden portion of the segment register.
...
The hidden descriptor register fields for FS.base and GS.base are
physically mapped to MSRs in order to load all address bits supported by
a 64-bit implementation.
"
The issue was found by strace test suite where 32-bit ioctl_kvm_run test
started segfaulting.
Reported-by: Dmitry V. Levin <ldv@altlinux.org>
Bisected-by: Masatake YAMATO <yamato@redhat.com>
Fixes: 42b933b597 ("x86/kvm/vmx: read MSR_{FS,KERNEL_GS}_BASE from current->thread")
Cc: stable@vger.kernel.org
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2307af1c4b upstream.
When eVMCS is enabled, all VMCS allocated to be used by KVM are marked
with revision_id of KVM_EVMCS_VERSION instead of revision_id reported
by MSR_IA32_VMX_BASIC.
However, even though not explicitly documented by TLFS, VMXArea passed
as VMXON argument should still be marked with revision_id reported by
physical CPU.
This issue was found by the following setup:
* L0 = KVM which exposes eVMCS to its L1 guest.
* L1 = KVM which consumes the eVMCS reported by L0.
This setup caused the following to occur:
1) L1 executes hardware_enable().
2) hardware_enable() calls kvm_cpu_vmxon() to execute VMXON.
3) L0 intercepts L1's VMXON and executes handle_vmon(), which notes
vmxarea->revision_id != VMCS12_REVISION and therefore fails with
nested_vmx_failInvalid(), which sets RFLAGS.CF.
4) L1's kvm_cpu_vmxon() doesn't check RFLAGS.CF for failure, and therefore
hardware_enable() continues as usual.
5) L1's hardware_enable() then calls ept_sync_global(), which executes
INVEPT.
6) L0 intercepts the INVEPT and executes handle_invept(), which notes
!vmx->nested.vmxon and thus raises a #UD to L1.
7) The raised #UD causes L1 to panic.
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Cc: stable@vger.kernel.org
Fixes: 773e8a0425
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9432a31757 upstream.
A comment warning against this bug is there, but the code is not doing what
the comment says. Therefore it is possible that an EPOLLHUP races against
irq_bypass_register_consumer. The EPOLLHUP handler schedules irqfd_shutdown,
and if that runs soon enough, you get a use-after-free.
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5020a8e6b upstream.
Syzbot reports crashes in kvm_irqfd_assign(), caused by use-after-free
when kvm_irqfd_assign() and kvm_irqfd_deassign() run in parallel
for one specific eventfd. When the assign path hasn't finished but irqfd
has been added to kvm->irqfds.items list, another thread may deassign the
eventfd and free struct kvm_kernel_irqfd(). The assign path then uses
the struct kvm_kernel_irqfd that has been freed by deassign path. To avoid
such issue, keep irqfd under kvm->irq_srcu protection after the irqfd
has been added to kvm->irqfds.items list, and call synchronize_srcu()
in irq_shutdown() to make sure that irqfd has been fully initialized in
the assign path.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tianyu Lan <tianyu.lan@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5f3bc39a0 upstream.
The GPNFT command allocates 2 buffers for the switch query. On completion,
the same buffers were freed using a different size, instead of the original
size used at the time of allocation.
This patch saves the size of the request and response buffers and uses that
to free them.
The following stack trace can be seen when using a debug kernel:
dump_stack+0x19/0x1b
__warn+0xd8/0x100
warn_slowpath_fmt+0x5f/0x80
check_unmap+0xfb/0xa20
debug_dma_free_coherent+0x110/0x160
qla24xx_sp_unmap+0x131/0x1e0 [qla2xxx]
qla24xx_async_gnnft_done+0xb6/0x550 [qla2xxx]
qla2x00_do_work+0x1ec/0x9f0 [qla2xxx]
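A generic sketch of the save-the-size pattern described above (not the qla2xxx code; the buffer sizes are made up and plain malloc() stands in for a DMA-coherent allocator, which needs the original size on free):

#include <stdio.h>
#include <stdlib.h>

struct sized_buf {
	void   *addr;
	size_t  size;		/* saved at allocation, reused on free */
};

static int sized_alloc(struct sized_buf *sb, size_t size)
{
	sb->addr = malloc(size);
	if (!sb->addr)
		return -1;
	sb->size = size;
	return 0;
}

static void sized_free(struct sized_buf *sb)
{
	/* a dma_free_coherent()-style API must see the original size */
	printf("freeing %zu bytes\n", sb->size);
	free(sb->addr);
	sb->addr = NULL;
	sb->size = 0;
}

int main(void)
{
	struct sized_buf req, rsp;

	if (sized_alloc(&req, 4096))
		return 1;
	if (sized_alloc(&rsp, 16384)) {
		sized_free(&req);
		return 1;
	}
	sized_free(&req);
	sized_free(&rsp);
	return 0;
}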
Cc: <stable@vger.kernel.org> # v4.17+
Fixes: 33b28357dd ("scsi: qla2xxx: Fix Async GPN_FT for FCP and FC-NVMe scan")
Reported-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f13cff6c25 upstream.
Fix the description of sd_zbc_check_zone_size() to correctly explain that
the returned value is a number of device blocks, not bytes. Additionally,
the 32-bit "ret" variable used in this function may truncate the 64-bit
zone_blocks variable value upon return. To fix this, change the type of
"ret" to s64.
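A standalone illustration of the truncation (not the sd_zbc code; the zone size is made up):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t zone_blocks = 0x100080000ULL;	/* hypothetical count > 32 bits */
	int32_t  ret32 = (int32_t)zone_blocks;	/* truncated on return */
	int64_t  ret64 = (int64_t)zone_blocks;	/* preserved */

	printf("32-bit ret: 0x%x\n", (unsigned int)ret32);
	printf("64-bit ret: 0x%llx\n", (unsigned long long)ret64);
	return 0;
}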
Fixes: ccce20fc79 ("sd_zbc: Avoid that resetting a zone fails sporadically")
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Bart Van Assche <bart.vanassche@wdc.com>
Cc: stable@kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85782e037f upstream.
Partially undo commit 9facc33687 ("bpf: reject any prog that failed
read-only lock") since it caused a regression, that is, syzkaller was
able to manage to cause a panic via fault injection deep in set_memory_ro()
path by letting an allocation fail: In x86's __change_page_attr_set_clr()
it was able to change the attributes of the primary mapping but not in
the alias mapping via cpa_process_alias(), so the second, inner call
to the __change_page_attr() via __change_page_attr_set_clr() had to split
a larger page and failed in the alloc_pages() with the artificially triggered
allocation error which is then propagated down to the call site.
Thus, for set_memory_ro() this means that it returned with an error, but
from debugging a probe_kernel_write() revealed EFAULT on that memory since
the primary mapping succeeded to get changed. Therefore the subsequent
hdr->locked = 0 reset triggered the panic as it was performed on read-only
memory, so call-site assumptions were in fact wrong to assume that it would
either succeed /or/ not succeed at all since there's no such rollback in
set_memory_*() calls from partial change of mappings, in other words, we're
left in a state that is "half done". A later undo via set_memory_rw() is
succeeding though due to matching permissions on that part (aka due to the
try_preserve_large_page() succeeding). While reproducing locally with
explicitly triggering this error, the initial splitting only happens on
rare occasions and in real world it would additionally need oom conditions,
but that said, it could partially fail. Therefore, it is definitely wrong
to bail out on set_memory_ro() error and reject the program with the
set_memory_*() semantics we have today. Shouldn't have gone the extra mile
since no other user in tree today in fact checks for any set_memory_*()
errors, e.g. neither module_enable_ro() / module_disable_ro() for module
RO/NX handling which is mostly default these days nor kprobes core with
alloc_insn_page() / free_insn_page() as examples that could be invoked long
after bootup and original 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks") did neither when it got first introduced to BPF
so "improving" with bailing out was clearly not right when set_memory_*()
cannot handle it today.
Kees suggested that if set_memory_*() can fail, we should annotate it with
__must_check, and all callers need to deal with it gracefully given those
set_memory_*() markings aren't "advisory", but they're expected to actually
do what they say. This might be an option worth to move forward in future
but would at the same time require that set_memory_*() calls from supporting
archs are guaranteed to be "atomic" in that they provide rollback if part
of the range fails, once that happened, the transition from RW -> RO could
be made more robust that way, while subsequent RO -> RW transition /must/
continue guaranteeing to always succeed the undo part.
Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com
Reported-by: syzbot+d866d1925855328eac3b@syzkaller.appspotmail.com
Fixes: 9facc33687 ("bpf: reject any prog that failed read-only lock")
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18d405af30 upstream.
Any eBPF JIT whose underlying arch supports ARCH_HAS_SET_MEMORY needs to
use the bpf_jit_binary_{un,}lock_ro() pair instead of the
set_memory_{ro,rw}() pair directly, as otherwise changes to the former
might break. arm32's eBPF conversion missed this change, so fix it
up here.
Fixes: 39c13c204b ("arm: eBPF JIT compiler")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9262478220 upstream.
After commit 9facc33687 ("bpf: reject any prog that failed read-only lock")
offsetof(struct bpf_binary_header, image) became 3 instead of 4,
breaking powerpc BPF badly, since instructions need to be word aligned.
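The struct layouts below are hypothetical (not the real struct bpf_binary_header), but they show how adding a byte-sized field in front of a flexible array moves its offset off a 4-byte boundary, and how a static assertion can catch that at build time:

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

struct hdr_aligned {
	uint32_t pages;
	uint8_t  image[];	/* offset 4: word aligned */
};

struct hdr_misaligned {
	uint16_t pages;
	uint8_t  locked;
	uint8_t  image[];	/* offset 3: not word aligned */
};

_Static_assert(offsetof(struct hdr_aligned, image) % 4 == 0,
	       "instruction area must be word aligned");

int main(void)
{
	printf("aligned layout:    image at offset %zu\n",
	       offsetof(struct hdr_aligned, image));
	printf("misaligned layout: image at offset %zu\n",
	       offsetof(struct hdr_misaligned, image));
	return 0;
}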
Fixes: 9facc33687 ("bpf: reject any prog that failed read-only lock")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4f18c063a upstream.
In order to forward the guest's ARCH_WORKAROUND_2 calls to EL3,
add a small(-ish) sequence to handle it at EL2. Special care must
be taken to track the state of the guest itself by updating the
workaround flags. We also rely on patching to enable calls into
the firmware.
Note that since we need to execute branches, this always executes
after the Spectre-v2 mitigation has been applied.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55e3748e89 upstream.
In order to offer ARCH_WORKAROUND_2 support to guests, we need
a bit of infrastructure.
Let's add a flag indicating whether or not the guest uses
SSBD mitigation. Depending on the state of this flag, allow
KVM to disable ARCH_WORKAROUND_2 before entering the guest,
and enable it when exiting it.
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85478bab40 upstream.
As we're going to require to access per-cpu variables at EL2,
let's craft the minimum set of accessors required to implement
reading a per-cpu variable, relying on tpidr_el2 to contain the
per-cpu offset.
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cdc0108ba upstream.
If running on a system that performs dynamic SSBD mitigation, allow
userspace to request the mitigation for itself. This is implemented
as a prctl call, allowing the mitigation to be enabled or disabled at
will for this particular thread.
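For reference, a minimal userspace sketch of using the resulting interface, assuming headers and a kernel that provide PR_{GET,SET}_SPECULATION_CTRL and PR_SPEC_STORE_BYPASS (the call may legitimately fail, e.g. when the mitigation is not controllable per task):

#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <sys/prctl.h>
#include <linux/prctl.h>

int main(void)
{
	long state;

	/* PR_SPEC_DISABLE: disable speculative store bypass, i.e. mitigation on */
	if (prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS,
		  PR_SPEC_DISABLE, 0, 0) < 0) {
		fprintf(stderr, "PR_SET_SPECULATION_CTRL: %s\n", strerror(errno));
		return 1;
	}

	state = prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_STORE_BYPASS, 0, 0, 0);
	if (state < 0) {
		fprintf(stderr, "PR_GET_SPECULATION_CTRL: %s\n", strerror(errno));
		return 1;
	}
	printf("ssb control:%s%s\n",
	       (state & PR_SPEC_PRCTL) ? " per-task" : "",
	       (state & PR_SPEC_DISABLE) ? " mitigation enabled" : " mitigation disabled");
	return 0;
}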
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9dd9614f54 upstream.
In order to allow userspace to be mitigated on demand, let's
introduce a new thread flag that prevents the mitigation from
being turned off when exiting to userspace, and doesn't turn
it on on entry into the kernel (with the assumption that the
mitigation is always enabled in the kernel itself).
This will be used by a prctl interface introduced in a later
patch.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 647d0519b5 upstream.
On a system where firmware can dynamically change the state of the
mitigation, the CPU will always come up with the mitigation enabled,
including when coming back from suspend.
If the user has requested "no mitigation" via a command line option,
let's enforce it by calling into the firmware again to disable it.
Similarly, for a resume from hibernate, the mitigation could have
been disabled by the boot kernel. Let's ensure that it is set
back on in that case.
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 986372c436 upstream.
In order to avoid checking arm64_ssbd_callback_required on each
kernel entry/exit even if no mitigation is required, let's
add yet another alternative that by default jumps over the mitigation,
and that gets nop'ed out if we're doing dynamic mitigation.
Think of it as a poor man's static key...
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a43ae4dfe5 upstream.
On a system where the firmware implements ARCH_WORKAROUND_2,
it may be useful to either permanently enable or disable the
workaround for cases where the user decides that they'd rather
not get a trap overhead, and keep the mitigation permanently
on or off instead of switching it on exception entry/exit.
In any case, default to the mitigation being enabled.
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e2906245f upstream.
In order for the kernel to protect itself, let's call the SSBD mitigation
implemented by the higher exception level (either hypervisor or firmware)
on each transition between userspace and kernel.
We must take the PSCI conduit into account in order to target the
right exception level, hence the introduction of a runtime patching
callback.
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eff0e9e107 upstream.
We've so far used the PSCI return codes for SMCCC because they
were extremely similar. But with the new ARM DEN 0070A specification,
"NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS".
Let's bite the bullet and add SMCCC specific return codes. Users
can be repainted as and when required.
Acked-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7a8978432 upstream.
syzkaller managed to trigger the following bug through fault injection:
[...]
[ 141.043668] verifier bug. No program starts at insn 3
[ 141.044648] WARNING: CPU: 3 PID: 4072 at kernel/bpf/verifier.c:1613
get_callee_stack_depth kernel/bpf/verifier.c:1612 [inline]
[ 141.044648] WARNING: CPU: 3 PID: 4072 at kernel/bpf/verifier.c:1613
fixup_call_args kernel/bpf/verifier.c:5587 [inline]
[ 141.044648] WARNING: CPU: 3 PID: 4072 at kernel/bpf/verifier.c:1613
bpf_check+0x525e/0x5e60 kernel/bpf/verifier.c:5952
[ 141.047355] CPU: 3 PID: 4072 Comm: a.out Not tainted 4.18.0-rc4+ #51
[ 141.048446] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS 1.10.2-1 04/01/2014
[ 141.049877] Call Trace:
[ 141.050324] __dump_stack lib/dump_stack.c:77 [inline]
[ 141.050324] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
[ 141.050950] ? dump_stack_print_info.cold.2+0x52/0x52 lib/dump_stack.c:60
[ 141.051837] panic+0x238/0x4e7 kernel/panic.c:184
[ 141.052386] ? add_taint.cold.5+0x16/0x16 kernel/panic.c:385
[ 141.053101] ? __warn.cold.8+0x148/0x1ba kernel/panic.c:537
[ 141.053814] ? __warn.cold.8+0x117/0x1ba kernel/panic.c:530
[ 141.054506] ? get_callee_stack_depth kernel/bpf/verifier.c:1612 [inline]
[ 141.054506] ? fixup_call_args kernel/bpf/verifier.c:5587 [inline]
[ 141.054506] ? bpf_check+0x525e/0x5e60 kernel/bpf/verifier.c:5952
[ 141.055163] __warn.cold.8+0x163/0x1ba kernel/panic.c:538
[ 141.055820] ? get_callee_stack_depth kernel/bpf/verifier.c:1612 [inline]
[ 141.055820] ? fixup_call_args kernel/bpf/verifier.c:5587 [inline]
[ 141.055820] ? bpf_check+0x525e/0x5e60 kernel/bpf/verifier.c:5952
[...]
What happens in jit_subprogs() is that kcalloc() for the subprog func
buffer is failing with NULL, where we then bail out. The latter is a plain
return -ENOMEM, and this is definitely not okay since earlier in the
loop we are walking all subprogs and temporarily rewrite insn->off to
remember the subprog id as well as insn->imm to temporarily point the
call to __bpf_call_base + 1 for the initial JIT pass. Thus, bailing
out in such state and handing this over to the interpreter is troublesome
since later/subsequent e.g. find_subprog() lookups are based on wrong
insn->imm.
Therefore, once we hit this point, we need to jump to out_free path
where we undo all changes from earlier loop, so that interpreter can
work on unmodified insn->{off,imm}.
Another point is that should find_subprog() fail in jit_subprogs() due
to a verifier bug, then we also should not simply defer the program to
the interpreter since also here we did partial modifications. Instead
we should just bail out entirely and return an error to the user who is
trying to load the program.
Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: syzbot+7d427828b2ea6e592804@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9901c5d77e upstream.
This fixes a crash where we assign tcp_prot to IPv6 sockets instead
of tcpv6_prot.
Previously we overwrote the sk->prot field with tcp_prot even in the
AF_INET6 case. This patch ensures the correct tcp_prot and tcpv6_prot
are used.
Tested with 'netserver -6' and 'netperf -H [IPv6]' as well as
'netperf -H [IPv4]'. The ESTABLISHED check resolves the previously
crashing case here.
Fixes: 174a79ff95 ("bpf: sockmap with sk redirect support")
Reported-by: syzbot+5c063698bdbfac19f363@syzkaller.appspotmail.com
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd4a4ae468 upstream.
If we end up splitting a bio and the queue goes away between
the initial submission and the later split submission, then we
can block forever in blk_queue_enter() waiting for the reference
to drop to zero. This will never happen, since we already hold
a reference.
Mark a split bio as already having entered the queue, so we can
just use the live non-blocking queue enter variant.
Thanks to Tetsuo Handa for the analysis.
Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9facc33687 upstream.
We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extent, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.
Added via 65869a47f3 ("bpf: improve read-only handling"), this behavior
is slightly different compared to any of the other in-kernel set_memory_ro()
users who do not check the return code of set_memory_ro() and friends /at
all/ (e.g. in the case of module_enable_ro() / module_disable_ro()). Given
that in BPF this is a mandatory hardening step, we want to know whether there
are any issues that would leave both kinds of BPF data writable. So it happens
that syzkaller enabled fault injection and it triggered memory allocation
failure deep inside x86's change_page_attr_set_clr() which was triggered
from set_memory_ro().
Now, there are two options: i) leaving everything as is, and ii) reworking
the image locking code in order to have a final checkpoint out of the
central bpf_prog_select_runtime() which probes whether any of the calls
during prog setup weren't successful, and then bailing out with an error.
Option ii) is a better approach since this additional paranoia avoids
altogether leaving any potential W+X pages from BPF side in the system.
Therefore, let's be strict about it, and reject programs on such an unlikely
occasion. While testing I also noticed that one bpf_prog_lock_ro()
call was missing on the outer dummy prog in case of calls, e.g. in the
destructor we call bpf_prog_free_deferred() on the main prog where we
try to bpf_prog_unlock_free() the program, and since we go via
bpf_prog_select_runtime() do that as well.
Reported-by: syzbot+3b889862e65a98317058@syzkaller.appspotmail.com
Reported-by: syzbot+9e762b52dd17e616a7a5@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ee7e8697d upstream.
syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb->bdi->dev being NULL. And Dmitry confirmed that wb->state was
WB_shutting_down after wb->bdi->dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.
The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:
CPU1 (cgwb_release path)                  CPU2 (cgwb_bdi_unregister path)
                                          cgwb_bdi_unregister()
                                            cgwb_kill(*slot);
cgwb_release()
  queue_work(cgwb_release_wq, &wb->release_work);
cgwb_release_workfn()
                                            wb = list_first_entry(&bdi->wb_list, ...)
                                            spin_unlock_irq(&cgwb_lock);
  wb_shutdown(wb);
  ...
  kfree_rcu(wb, rcu);
                                            wb_shutdown(wb); -> oops use-after-free
We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().
Reported-by: syzbot <syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84379c9afe upstream.
Eric Dumazet reports:
Here is a reproducer of an annoying bug detected by syzkaller on our production kernel
[..]
./b78305423 enable_conntrack
Then :
sleep 60
dmesg | tail -10
[ 171.599093] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 181.631024] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 191.687076] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 201.703037] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 211.711072] unregister_netdevice: waiting for lo to become free. Usage count = 2
[ 221.959070] unregister_netdevice: waiting for lo to become free. Usage count = 2
Reproducer sends ipv6 fragment that hits nfct defrag via LOCAL_OUT hook.
skb gets queued until frag timer expiry -- 1 minute.
Normally nf_conntrack_reasm gets called during prerouting, so skb has
no dst yet which might explain why this wasn't spotted earlier.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: John Sperbeck <jsperbeck@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32da12216e upstream.
In the zerocopy sendmsg() path, there are error checks to revert
the zerocopy if we get any error code. syzkaller has discovered
that tls_push_record can return -ECONNRESET, which is fatal, and
happens after the point at which it is safe to revert the iter,
as we've already passed the memory to do_tcp_sendpages.
Previously this code could return -ENOMEM and we would want to
revert the iter, but AFAIK this no longer returns ENOMEM after
a447da7d00 ("tls: fix waitall behavior in tls_sw_recvmsg"),
so we fail for all error codes.
Reported-by: syzbot+c226690f7b3126c5ee04@syzkaller.appspotmail.com
Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
Signed-off-by: Dave Watson <davejwatson@fb.com>
Fixes: 3c4d755915 ("tls: kernel TLS support")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c604cb7670 upstream.
My recent fix for dns_resolver_preparse() printing very long strings was
incomplete, as shown by syzbot which still managed to hit the
WARN_ONCE() in set_precision() by adding a crafted "dns_resolver" key:
precision 50001 too large
WARNING: CPU: 7 PID: 864 at lib/vsprintf.c:2164 vsnprintf+0x48a/0x5a0
The bug this time isn't just a printing bug, but also a logical error
when multiple options ("#"-separated strings) are given in the key
payload. Specifically, when separating an option string into name and
value, if there is no value then the name is incorrectly considered to
end at the end of the key payload, rather than the end of the current
option. This bypasses validation of the option length, and also means
that specifying multiple options is broken -- which presumably has gone
unnoticed as there is currently only one valid option anyway.
A similar problem also applied to option values, as the kstrtoul() when
parsing the "dnserror" option will read past the end of the current
option and into the next option.
Fix these bugs by correctly computing the length of the option name and
by copying the option value, null-terminated, into a temporary buffer.
Reproducer for the WARN_ONCE() that syzbot hit:
perl -e 'print "#A#", "\0" x 50000' | keyctl padd dns_resolver desc @s
Reproducer for "dnserror" option being parsed incorrectly (expected
behavior is to fail when seeing the unknown option "foo", actual
behavior was to read the dnserror value as "1#foo" and fail there):
perl -e 'print "#dnserror=1#foo\0"' | keyctl padd dns_resolver desc @s
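A standalone sketch of the corrected parsing rule (not the kernel code): each option runs only to the next '#' or the end of the payload, and both the option name and its value are bounded by that option end rather than by the end of the whole payload:

#include <stdio.h>
#include <string.h>

static void parse_options(const char *opts, size_t len)
{
	const char *p = opts, *end = opts + len;

	while (p < end) {
		const char *opt_end = memchr(p, '#', end - p);
		const char *eq;
		size_t opt_len, name_len;
		char value[32];

		if (!opt_end)
			opt_end = end;
		opt_len = opt_end - p;

		eq = memchr(p, '=', opt_len);	/* bounded by this option only */
		name_len = eq ? (size_t)(eq - p) : opt_len;

		if (eq) {
			size_t vlen = opt_end - eq - 1;

			if (vlen >= sizeof(value))
				vlen = sizeof(value) - 1;
			memcpy(value, eq + 1, vlen);
			value[vlen] = '\0';	/* NUL-terminate before any numeric parse */
			printf("option '%.*s' = '%s'\n", (int)name_len, p, value);
		} else {
			printf("option '%.*s' (no value)\n", (int)name_len, p);
		}
		p = (opt_end < end) ? opt_end + 1 : end;
	}
}

int main(void)
{
	const char payload[] = "#dnserror=1#foo";

	/* skip the leading '#' before the first option */
	parse_options(payload + 1, sizeof(payload) - 2);
	return 0;
}

With the option end respected, "dnserror" parses as "1" and "foo" is then seen (and can be rejected) as a separate, unknown option.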
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: 4a2d789267 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11ff7288be upstream.
The ebtables evaluation loop expects targets to return
positive values (jumps), or negative values (absolute verdicts).
This is completely different from what xtables does.
In xtables, targets are expected to return the standard netfilter
verdicts, i.e. NF_DROP, NF_ACCEPT, etc.
ebtables will consider these as jumps.
Therefore reject any target found due to unspec fallback.
v2: also reject watchers. ebtables ignores their return value, so
a target that assumes skb ownership (and returns NF_STOLEN) causes
use-after-free.
The only watchers in the 'ebtables' front-end are log and nflog;
both have AF_BRIDGE specific wrappers on kernel side.
Reported-by: syzbot+2b43f681169a2a0d306a@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35a88a18d7 upstream.
Commit de0aa7b2f9 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
uses local_bh_disable()/enable(), because hv_pci_onchannelcallback() can
also run in tasklet context as the channel event callback, so bottom halves
should be disabled to prevent a race condition.
With CONFIG_PROVE_LOCKING=y in the recent mainline, or old kernels that
don't have commit f71b74bca6 ("irq/softirqs: Use lockdep to assert IRQs
are disabled/enabled"), when the upper layer IRQ code calls
hv_compose_msi_msg() with local IRQs disabled, we'll see a warning at the
beginning of __local_bh_enable_ip():
IRQs not enabled as expected
WARNING: CPU: 0 PID: 408 at kernel/softirq.c:162 __local_bh_enable_ip
The warning exposes an issue in de0aa7b2f9: local_bh_enable() can
potentially call do_softirq(), which is not supposed to run when local IRQs
are disabled. Let's fix this by using local_irq_save()/restore() instead.
Note: hv_pci_onchannelcallback() is not a hot path because it's only called
when the PCI device is hot added and removed, which is infrequent.
Fixes: de0aa7b2f9 ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: stable@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a98302de1 upstream.
Without this patch, firmware will not run properly on rtl8821ae, and it
causes bad user experience. For example, bad connection performance with
low rate, higher power consumption, and so on.
rtl8821ae uses two kinds of firmware, for the normal and WoWLAN cases, and
each firmware has its own data buffer and size. The original code always
overwrites the size of the normal firmware, rtlpriv->rtlhal.fwsize, and
this mismatch causes a firmware checksum error, so the firmware can't start.
In this situation, the driver gives the message "Firmware is not ready to run!".
Fixes: fe89707f0a ("rtlwifi: rtl8821ae: Simplify loading of WOWLAN firmware")
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Cc: Stable <stable@vger.kernel.org> # 4.0+
Reviewed-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 676bcfece1 upstream.
t.qset_idx can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:2286 cxgb_extension_ioctl()
warn: potential spectre issue 'adapter->msix_info'
Fix this by sanitizing t.qset_idx before using it to index
adapter->msix_info.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
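The snippet below is a simplified, standalone illustration of the idea behind the kernel's array_index_nospec() helper (it is not the cxgb3 patch itself): a branch-free mask clamps an out-of-range index to 0, so a mispredicted bounds check cannot feed an attacker-controlled index into the dependent load:

#include <stdio.h>

#define NQSETS 8

static unsigned long index_mask(unsigned long index, unsigned long size)
{
	/* all ones if index < size, zero otherwise (arithmetic shift assumed) */
	return ~(long)(index | (size - 1UL - index)) >> (sizeof(long) * 8 - 1);
}

int main(void)
{
	int msix_info[NQSETS] = { 11, 12, 13, 14, 15, 16, 17, 18 };
	unsigned long tests[] = { 5, 42 };	/* pretend these came from user space */

	for (size_t i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
		unsigned long qset_idx = tests[i];

		if (qset_idx >= NQSETS) {
			printf("rejected out-of-range index %lu\n", qset_idx);
			continue;
		}
		/* even if the check above is mispredicted, the mask forces an
		 * out-of-range index to 0 before the array access */
		qset_idx &= index_mask(qset_idx, NQSETS);
		printf("msix_info[%lu] = %d\n", qset_idx, msix_info[qset_idx]);
	}
	return 0;
}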
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2471c75efe ]
The driver was combining XDP_TX virtqueue_kick and XDP_REDIRECT
map flushing (xdp_do_flush_map). This is suboptimal, these two
flush operations should be kept separate.
The suboptimal behavior was introduced in commit 9267c430c6
("virtio-net: add missing virtqueue kick when flushing packets").
Fixes: 9267c430c6 ("virtio-net: add missing virtqueue kick when flushing packets")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4789a21880 ]
When qeth_l2_set_mac_address() finds the card in a non-reachable state,
it merely copies the new MAC address into dev->dev_addr so that
__qeth_l2_set_online() can later register it with the HW.
But __qeth_l2_set_online() may very well be running concurrently, so we
can't trust the card state without appropriate locking:
If the online sequence is past the point where it registers
dev->dev_addr (but not yet in SOFTSETUP state), any address change needs
to be properly programmed into the HW. Otherwise the netdevice ends up
with a different MAC address than what's set in the HW, and inbound
traffic is not forwarded as expected.
This is most likely to occur for OSD in LPAR, where
commit 21b1702af1 ("s390/qeth: improve fallback to random MAC address")
now triggers e.g. systemd to immediately change the MAC when the netdevice
is registered with a NET_ADDR_RANDOM address.
Fixes: bcacfcbc82 ("s390/qeth: fix MAC address update sequence")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d0a58fb97 ]
*ether_addr*_64bits functions have been introduced to optimize
performance-critical paths, which access the 6-byte ethernet address as a u64
value to get "nice" assembly. This harmless hack works nicely on ethernet
addresses shoved into a structure or a larger buffer, until busted by
KASAN on something like a plain (u8 *)[6].
qeth_l2_set_mac_address calls qeth_l2_remove_mac passing
u8 old_addr[ETH_ALEN] as an argument.
Adding/removing macs for an ethernet adapter is not that performance
critical. Moreover is_multicast_ether_addr_64bits itself on s390 is not
faster than is_multicast_ether_addr:
is_multicast_ether_addr(%r2) -> %r2
llc %r2,0(%r2)
risbg %r2,%r2,63,191,0
is_multicast_ether_addr_64bits(%r2) -> %r2
llgc %r2,0(%r2)
risbg %r2,%r2,63,191,0
So, let's just use is_multicast_ether_addr instead of
is_multicast_ether_addr_64bits.
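For comparison, the multicast test itself is just the I/G bit, the least-significant bit of the first octet, so it works on a plain u8[6] with no alignment assumptions; a standalone sketch (not the kernel implementation):

#include <stdio.h>
#include <stdint.h>

static int is_multicast_ether(const uint8_t addr[6])
{
	return addr[0] & 0x01;	/* I/G bit set => group (multicast) address */
}

int main(void)
{
	uint8_t unicast[6]   = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 };
	uint8_t multicast[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0xfb };

	printf("unicast:   %d\n", is_multicast_ether(unicast));
	printf("multicast: %d\n", is_multicast_ether(multicast));
	return 0;
}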
Fixes: bcacfcbc82 ("s390/qeth: fix MAC address update sequence")
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4664610537 ]
This reverts commit b7493e91c1.
On its own, querying RDEV for a MAC address works fine. But when upgrading
from a qeth that previously queried DDEV on a z/VM NIC (ie. any kernel with
commit ec61bd2fd2), the RDEV query now returns a _different_ MAC address
than the DDEV query.
If the NIC is configured with MACPROTECT, z/VM apparently requires us to
use the MAC that was initially returned (on DDEV) and registered. So after
upgrading to a kernel that uses RDEV, the SETVMAC registration cmd for the
new MAC address fails and we end up with a non-operable interface.
To avoid regressions on upgrade, switch back to using DDEV for the MAC
address query. The downgrade path (first RDEV, later DDEV) is fine, in this
case both queries return the same MAC address.
Fixes: b7493e91c1 ("s390/qeth: use Read device to query hypervisor for MAC")
Reported-by: Michal Kubecek <mkubecek@suse.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit aff2252a2a ]
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.
As such, we should avoid dealing with vport representors if we are not
the e-switch manager.
Fixes: b5ca15ad7e ('IB/mlx5: Add proper representors support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2e68931238 ]
The driver was combining the XDP_TX tail flush and XDP_REDIRECT
map flushing (xdp_do_flush_map). This is suboptimal, these two
flush operations should be kept separate.
It looks like the mistake was copy-pasted from ixgbe.
Fixes: d9314c474d ("i40e: add support for XDP_REDIRECT")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 56f772279a ]
In failure path, we overwrite err to what vnic_rq_disable() returns. In
case it returns 0, enic_open() returns success in case of error.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Fixes: e8588e2685 ("enic: enable rq before updating rq descriptors")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45c8184c1b ]
Update the features after calling register_netdev() otherwise the
device features are not set up correctly and it is not possible to change
the MTU of the device. After this change, the features reported by
ethtool match the device's features before the commit which introduced
the issue and it is possible to change the device's MTU.
Fixes: f599c64fdf ("xen-netfront: Fix race between device setup and open")
Reported-by: Liam Shepherd <liam@dancer.es>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 951a8ee6de ]
TC shared blocks allow multiple qdiscs to be grouped together and filters
shared between them. Currently the chains of filters attached to a block
are only flushed when the block is removed. If a qdisc is removed from a
block but the block still exists, flow del messages are not passed to the
callback registered for that qdisc. For the NFP, this presents the
possibility of rules still existing in hw when they should be removed.
Prevent binding to shared blocks until the kernel can send per qdisc del
messages when block unbinds occur.
tcf_block_shared() was not used outside of the core until now, so also
add an empty implementation for builds with CONFIG_NET_CLS=n.
Fixes: 4861738775 ("net: sched: introduce shared filter blocks infrastructure")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3f76df1982 ]
As noticed by Eric, we need to switch to the helper
dev_change_tx_queue_len() for SIOCSIFTXQLEN call path too,
otherwise we still miss dev_qdisc_change_tx_queue_len().
Fixes: 6a643ddb56 ("net: introduce helper dev_change_tx_queue_len()")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fec9d3b1dc ]
The macb driver currently crashes on at91rm9200 with the following trace:
Unable to handle kernel NULL pointer dereference at virtual address 00000014
[...]
[<c031da44>] (macb_rx_desc) from [<c031f2bc>] (at91ether_open+0x2e8/0x3f8)
[<c031f2bc>] (at91ether_open) from [<c041e8d8>] (__dev_open+0x120/0x13c)
[<c041e8d8>] (__dev_open) from [<c041ec08>] (__dev_change_flags+0x17c/0x1a8)
[<c041ec08>] (__dev_change_flags) from [<c041ec4c>] (dev_change_flags+0x18/0x4c)
[<c041ec4c>] (dev_change_flags) from [<c07a5f4c>] (ip_auto_config+0x220/0x10b0)
[<c07a5f4c>] (ip_auto_config) from [<c000a4fc>] (do_one_initcall+0x78/0x18c)
[<c000a4fc>] (do_one_initcall) from [<c0783e50>] (kernel_init_freeable+0x184/0x1c4)
[<c0783e50>] (kernel_init_freeable) from [<c0574d70>] (kernel_init+0x8/0xe8)
[<c0574d70>] (kernel_init) from [<c00090e0>] (ret_from_fork+0x14/0x34)
Solve that by initializing bp->queues[0].bp in at91ether_init (as is done
in macb_init).
Fixes: ae1f2a56d2 ("net: macb: Added support for many RX queues")
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a64119415f ]
Previously it was not possible to distinguish between mpls ether types and
other ether types. This leads to incorrect classification of offloaded
filters that match on mpls ether type. For example the following two
filters overlap:
# tc filter add dev eth0 parent ffff: \
protocol 0x8847 flower \
action mirred egress redirect dev eth1
# tc filter add dev eth0 parent ffff: \
protocol 0x0800 flower \
action mirred egress redirect dev eth2
The driver now correctly includes the mac_mpls layer where HW stores mpls
fields, when it detects an mpls ether type. It also sets the MPLS_Q bit to
indicate that the filter should match mpls packets.
Fixes: bb055c198d ("nfp: add mpls match offloading support")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 82be2ab159 ]
The following warning is seen when rmmod'ing hinic, because the affinity
value is not reset before calling free_irq(). This patch fixes it.
[ 55.181232] WARNING: CPU: 38 PID: 19589 at kernel/irq/manage.c:1608
__free_irq+0x2aa/0x2c0
Fixes: 352f58b0d9 ("net-next/hinic: Set Rxq irq to specific cpu for NUMA")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e5ab564c9e ]
The dst_cid and src_cid are 64 bits, therefore 64 bit accessors should be
used, and in fact in virtio_transport_common.c only 64 bit accessors are
used. Using 32 bit accessors for 64 bit values breaks big endian systems.
This patch fixes a wrong use of le32_to_cpu in virtio_transport_send_pkt.
Fixes: b911682318 ("VSOCK: add loopback to virtio_transport")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b8f1f65882 ]
Sock will be NULL if we pass -1 to vhost_net_set_backend(), but when
we meet errors during ubuf allocation, the code does not check for
NULL before calling sockfd_put(), which leads to a NULL
dereference. Fix this by checking the sock pointer first.
Fixes: bab632d69e ("vhost: vhost TX zero-copy support")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1236f22fba ]
If SACK is not enabled and the first cumulative ACK after the RTO
retransmission covers more than the retransmitted skb, a spurious
FRTO undo will trigger (assuming FRTO is enabled for that RTO).
The reason is that any non-retransmitted segment acknowledged will
set FLAG_ORIG_SACK_ACKED in tcp_clean_rtx_queue even if there is
no indication that it would have been delivered for real (the
scoreboard is not kept with TCPCB_SACKED_ACKED bits in the non-SACK
case so the check for that bit won't help like it does with SACK).
Having FLAG_ORIG_SACK_ACKED set results in the spurious FRTO undo
in tcp_process_loss.
We need to use more strict condition for non-SACK case and check
that none of the cumulatively ACKed segments were retransmitted
to prove that progress is due to original transmissions. Only then
keep FLAG_ORIG_SACK_ACKED set, allowing FRTO undo to proceed in
non-SACK case.
(FLAG_ORIG_SACK_ACKED is planned to be renamed to FLAG_ORIG_PROGRESS
to better indicate its purpose but to keep this change minimal, it
will be done in another patch).
Besides burstiness and congestion control violations, this problem
can result in RTO loop: When the loss recovery is prematurely
undone, only new data will be transmitted (if available) and
the next retransmission can occur only after a new RTO which in case
of multiple losses (that are not for consecutive packets) requires
one RTO per loss to recover.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Tested-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c860e997e9 ]
Fast Open key could be stored in different endian based on the CPU.
Previously hosts in different endianness in a server farm using
the same key config (sysctl value) would produce different cookies.
This patch fixes it by always storing it as little endian to keep
same API for LE hosts.
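A standalone sketch of the approach (not the TCP code; the key value is made up), using glibc's <endian.h> helpers: converting with htole32()/le32toh() at the boundaries makes the stored byte layout identical on little- and big-endian hosts:

#include <stdio.h>
#include <stdint.h>
#include <endian.h>

int main(void)
{
	uint32_t key_host[4] = { 0x11223344, 0x55667788, 0x99aabbcc, 0xddeeff00 };
	uint32_t key_stored[4];
	const unsigned char *bytes = (const unsigned char *)key_stored;

	/* always store little endian, whatever the host byte order */
	for (int i = 0; i < 4; i++)
		key_stored[i] = htole32(key_host[i]);

	printf("stored bytes:");
	for (size_t i = 0; i < sizeof(key_stored); i++)
		printf(" %02x", bytes[i]);
	printf("\n");

	/* convert back to host order when the key is used */
	printf("first word in host order: 0x%08x\n",
	       (unsigned int)le32toh(key_stored[0]));
	return 0;
}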
Reported-by: Daniele Iamartino <danielei@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 977c7114eb ]
On receving an incomplete message, the existing code stores the
remaining length of the cloned skb in the early_eaten field instead of
incrementing the value returned by __strp_recv. This defers invocation
of sock_rfree for the current skb until the next invocation of
__strp_recv, which returns early_eaten if early_eaten is non-zero.
This behavior causes a stall when the current message occupies the very
tail end of a massive skb, and strp_peek/need_bytes indicates that the
remainder of the current message has yet to arrive on the socket. The
TCP receive buffer is totally full, causing the TCP window to go to
zero, so the remainder of the message will never arrive.
Incrementing the value returned by __strp_recv by the amount otherwise
stored in early_eaten prevents stalls of this nature.
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b6cfffa7ad ]
HW does not support Half-duplex mode in the multi-queue
scenario. Fix it by not advertising the Half-Duplex
mode if multi-queue is enabled.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ce28867fd2 ]
If qeth_qdio_output_handler() detects that a transmit requires async
completion, it replaces the pending buffer's metadata object
(qeth_qdio_out_buffer) so that this queue buffer can be re-used while
the data is pending completion.
Later when the CQ indicates async completion of such a metadata object,
qeth_qdio_cq_handler() tries to free any data associated with this
object (since HW has now completed the transfer). By calling
qeth_clear_output_buffer(), it erroneously operates on the queue buffer
that _previously_ belonged to this transfer ... but which has been
potentially re-used several times by now.
This results in double-free's of the buffer's data, and failing
transmits as the buffer descriptor is scrubbed in mid-air.
The correct way of handling this situation is to
1. scrub the queue buffer when it is prepared for re-use, and
2. later obtain the data addresses from the async-completion notifier
(ie. the AOB), instead of the queue buffer.
All this only affects qeth devices used for af_iucv HiperTransport.
Fixes: 0da9581ddb ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ee1f47349 ]
When unplugging an r8152 adapter while the interface is UP, the NIC
becomes unusable. usb->disconnect (aka rtl8152_disconnect) deletes
napi. Then, rtl8152_disconnect calls unregister_netdev and that invokes
netdev->ndo_stop (aka rtl8152_close). rtl8152_close tries to
napi_disable, but the napi is already deleted by disconnect above. So
the first while loop in napi_disable never finishes. This results in
complete deadlock of the network layer as there is rtnl_mutex held by
unregister_netdev.
So avoid the call to napi_disable in rtl8152_close when the device is
already gone.
The other calls to usb_kill_urb, cancel_delayed_work_sync,
netif_stop_queue etc. seem to be fine. The urb and netdev is not
destroyed yet.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: linux-usb@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e7e197edd0 ]
This module exposes two USB configurations: a QMI+AT capable setup on
USB config #1 and a MBIM capable setup on USB config #2.
By default the kernel will choose the MBIM capable configuration as
long as the cdc_mbim driver is available. This patch adds support for
the QMI port in the secondary configuration.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bb7858ba11 ]
Memory size is limited in the kdump kernel environment. Allocation of more
msix-vectors (or queues) consumes a few tens of MBs of memory, which might
lead to kdump kernel failure.
This patch adds changes to limit the number of MSI-X vectors in kdump
kernel to minimum required value (i.e., 2 per engine).
Fixes: fe56b9e6a ("qed: Add module with basic common support")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 538f8d00ba ]
By default, driver sets the eswitch mode incorrectly as VEB (virtual
Ethernet bridging).
VEB eswitch mode needs to be set only when SR-IOV is enabled, and it should
be set to NONE by default. The patch incorporates this change.
Fixes: 0fefbfbaa ("qed*: Management firmware - notifications and defaults")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 82a4e71b15 ]
When ptp clock is not available for a PF (e.g., higher PFs in NPAR mode),
get-tsinfo() callback should return the software timestamp capabilities
instead of returning the error.
Fixes: 4c55215c ("qede: Add driver support for PTP")
Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8c43bd1706 ]
Similar to 69678bcd4d ("udp: fix SO_BINDTODEVICE"), TCP socket lookups
need to fail if dev_match is not true. Currently, a packet to a given port
can match a socket bound to device when it should not. In the VRF case,
this causes the lookup to hit a VRF socket and not a global socket
resulting in a response trying to go through the VRF when it should not.
Fixes: 3fa6f616a7 ("net: ipv4: add second dif to inet socket lookups")
Fixes: 4297a0ef08 ("net: ipv6: add second dif to inet6 socket lookups")
Reported-by: Lou Berger <lberger@labn.net>
Diagnosed-by: Renato Westphal <renato@opensourcerouting.org>
Tested-by: Renato Westphal <renato@opensourcerouting.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 12b03558ce ]
After commit 88078d98d1 ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE
are friends"), sungem owners reported the infamous "eth0: hw csum failure"
message.
CHECKSUM_COMPLETE has in fact never worked for this driver, but this
was masked by the fact that upper stacks had to strip the FCS, and
therefore skb->ip_summed was set back to CHECKSUM_NONE before
my recent change.
Driver configures a number of bytes to skip when the chip computes
the checksum, and for some reason only half of the Ethernet header
was skipped.
Then a second problem is that we should strip the FCS by default,
unless the driver is updated to eventually support NETIF_F_RXFCS in
the future.
Finally, a driver should check if NETIF_F_RXCSUM feature is enabled
or not, so that the admin can turn off rx checksum if wanted.
Many thanks to Andreas Schwab and Mathieu Malaterre for their
help in debugging this issue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7e85dc8cb3 ]
When blackhole is used on top of classful qdisc like hfsc it breaks
qlen and backlog counters because packets disappear without notice.
In HFSC non-zero qlen while all classes are inactive triggers warning:
WARNING: ... at net/sched/sch_hfsc.c:1393 hfsc_dequeue+0xba4/0xe90 [sch_hfsc]
and schedules watchdog work endlessly.
This patch returns __NET_XMIT_BYPASS in addition to NET_XMIT_SUCCESS;
this flag tells the upper layer that the packet is gone and isn't queued.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cbf56c2962 ]
in the following script
# tc actions add action ife encode allow prio pass index 42
# tc actions replace action ife encode allow tcindex drop index 42
the action control should remain equal to 'pass', if the kernel failed
to replace the TC action. Postpone the assignment of the action control,
to ensure it is not overwritten in the error path of tcf_ife_init().
Fixes: ef6980b6be ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0a889b9404 ]
a recursive lock warning [1] can be observed with the following script,
# $TC actions add action ife encode allow prio pass index 42
IFE type 0xED3E
# $TC actions replace action ife encode allow tcindex pass index 42
in case the kernel was unable to run the last command (e.g. because it
failed to load 'act_meta_skbtcindex'). For a similar reason, the kernel
can leak an idr entry in the error path of tcf_ife_init(), because
tcf_idr_release() is not called after a successful idr reservation:
# $TC actions add action ife encode allow tcindex index 47
IFE type 0xED3E
RTNETLINK answers: No such file or directory
We have an error talking to the kernel
# $TC actions add action ife encode allow tcindex index 47
IFE type 0xED3E
RTNETLINK answers: No space left on device
We have an error talking to the kernel
# $TC actions add action ife encode use mark 7 type 0xfefe pass index 47
IFE type 0xFEFE
RTNETLINK answers: No space left on device
We have an error talking to the kernel
Since tcfa_lock is already taken when the action is being edited, a call
to tcf_idr_release() wrongly makes tcf_idr_cleanup() take the same lock
again. On the other hand, tcf_idr_release() needs to be called in the
error path of tcf_ife_init(), to undo the last tcf_idr_create() invocation.
Fix both problems in tcf_ife_init().
Since the cleanup() routine can now be called when ife->params is NULL,
also add a NULL pointer check to avoid calling kfree_rcu(NULL, rcu).
[1]
============================================
WARNING: possible recursive locking detected
4.17.0-rc4.kasan+ #417 Tainted: G E
--------------------------------------------
tc/3932 is trying to acquire lock:
000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_cleanup+0x19/0x80 [act_ife]
but task is already holding lock:
000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&p->tcfa_lock)->rlock);
lock(&(&p->tcfa_lock)->rlock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by tc/3932:
#0: 000000007ca8e990 (rtnl_mutex){+.+.}, at: tcf_ife_init+0xf61/0x13c0 [act_ife]
#1: 000000005097c9a6 (&(&p->tcfa_lock)->rlock){+...}, at: tcf_ife_init+0xf6d/0x13c0 [act_ife]
stack backtrace:
CPU: 3 PID: 3932 Comm: tc Tainted: G E 4.17.0-rc4.kasan+ #417
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
dump_stack+0x9a/0xeb
__lock_acquire+0xf43/0x34a0
? debug_check_no_locks_freed+0x2b0/0x2b0
? debug_check_no_locks_freed+0x2b0/0x2b0
? debug_check_no_locks_freed+0x2b0/0x2b0
? __mutex_lock+0x62f/0x1240
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
? lock_acquire+0x10b/0x330
lock_acquire+0x10b/0x330
? tcf_ife_cleanup+0x19/0x80 [act_ife]
_raw_spin_lock_bh+0x38/0x70
? tcf_ife_cleanup+0x19/0x80 [act_ife]
tcf_ife_cleanup+0x19/0x80 [act_ife]
__tcf_idr_release+0xff/0x350
tcf_ife_init+0xdde/0x13c0 [act_ife]
? ife_exit_net+0x290/0x290 [act_ife]
? __lock_is_held+0xb4/0x140
tcf_action_init_1+0x67b/0xad0
? tcf_action_dump_old+0xa0/0xa0
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? memset+0x1f/0x40
tcf_action_init+0x30f/0x590
? tcf_action_init_1+0xad0/0xad0
? memset+0x1f/0x40
tc_ctl_action+0x48e/0x5e0
? mutex_lock_io_nested+0x1160/0x1160
? tca_action_gd+0x990/0x990
? sched_clock+0x5/0x10
? find_held_lock+0x39/0x1d0
rtnetlink_rcv_msg+0x4da/0x990
? validate_linkmsg+0x680/0x680
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
netlink_rcv_skb+0x127/0x350
? validate_linkmsg+0x680/0x680
? netlink_ack+0x970/0x970
? __kmalloc_node_track_caller+0x304/0x3a0
netlink_unicast+0x40f/0x5d0
? netlink_attachskb+0x580/0x580
? _copy_from_iter_full+0x187/0x760
? import_iovec+0x90/0x390
netlink_sendmsg+0x67f/0xb50
? netlink_unicast+0x5d0/0x5d0
? copy_msghdr_from_user+0x206/0x340
? netlink_unicast+0x5d0/0x5d0
sock_sendmsg+0xb3/0xf0
___sys_sendmsg+0x60a/0x8b0
? copy_msghdr_from_user+0x340/0x340
? lock_downgrade+0x5e0/0x5e0
? tty_write_lock+0x18/0x50
? kvm_sched_clock_read+0x1a/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x39/0x1d0
? lock_downgrade+0x5e0/0x5e0
? lock_acquire+0x10b/0x330
? __audit_syscall_entry+0x316/0x690
? current_kernel_time64+0x6b/0xd0
? __fget_light+0x55/0x1f0
? __sys_sendmsg+0xd2/0x170
__sys_sendmsg+0xd2/0x170
? __ia32_sys_shutdown+0x70/0x70
? syscall_trace_enter+0x57a/0xd60
? rcu_read_lock_sched_held+0xdc/0x110
? __bpf_trace_sys_enter+0x10/0x10
? do_syscall_64+0x22/0x480
do_syscall_64+0xa5/0x480
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fd646988ba0
RSP: 002b:00007fffc9fab3c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fffc9fab4f0 RCX: 00007fd646988ba0
RDX: 0000000000000000 RSI: 00007fffc9fab440 RDI: 0000000000000003
RBP: 000000005b28c8b3 R08: 0000000000000002 R09: 0000000000000000
R10: 00007fffc9faae20 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffc9fab504 R14: 0000000000000001 R15: 000000000066c100
Fixes: 4e8c861550 ("net sched: net sched: ife action fix late binding")
Fixes: ef6980b6be ("introduce IFE action")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 271f7ff5aa ]
When using s/w buffer management, buffers are allocated and DMA mapped.
When doing so on an arm64 platform, an offset correction is applied on
the DMA address, before storing it in an Rx descriptor. The issue is
this DMA address is then used later in the Rx path without removing the
offset correction. Thus the DMA address is wrong, which can lead to
various issues.
This patch fixes this by removing the offset correction from the DMA
address retrieved from the Rx descriptor before using it in the Rx path.
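Conceptually the fix boils down to undoing the correction when reading the address back from the descriptor; the buf_phys_addr and rx_offset_correction names below are assumptions based on the description:
        /* sketch: the descriptor holds (addr + correction), so subtract it again */
        dma_addr_t phys_addr = rx_desc->buf_phys_addr - pp->rx_offset_correction;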
Fixes: 8d5047cf9c ("net: mvneta: Convert to be 64 bits compatible")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 603b7bcff8 ]
The NULL character was not set correctly for the string containing
the command length, this caused failures reading the output of the
command due to a random length. The fix is to initialize the output
length string.
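The general pattern for avoiding such an unterminated string looks like this (a generic sketch, not the mlx5-specific code; dump_len is an illustrative variable):
        char dump_len_str[16] = {0};    /* zero-initialise so the string is always NUL-terminated */

        snprintf(dump_len_str, sizeof(dump_len_str), "%u", dump_len);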
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d412c31dae ]
The command interface can work in two modes: Events and Polling.
In the general case, each time we invoke a command, a work is
queued to handle it.
When working in events, the interrupt handler completes the
command execution. On the other hand, when working in polling
mode, the work itself completes it.
Due to a bug in the work handler, a command could have been
completed by the interrupt handler while the work handler hadn't
finished yet, causing it to be completed once again if the command
interface mode was changed from events to polling after the interrupt
handler was called.
mlx5_unload_one()
mlx5_stop_eqs()
// Destroy the EQ before cmd EQ
...cmd_work_handler()
write_doorbell()
--> EVENT_TYPE_CMD
mlx5_cmd_comp_handler() // First free
free_ent(cmd, ent->idx)
complete(&ent->done)
<-- mlx5_stop_eqs //cmd was complete
// move to polling before destroying the last cmd EQ
mlx5_cmd_use_polling()
cmd->mode = POLL;
--> cmd_work_handler (continues)
if (cmd->mode == POLL)
mlx5_cmd_comp_handler() // Double free
The solution is to store the cmd->mode before writing the doorbell.
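In sketch form, with illustrative helper names (not the driver's exact symbols), the ordering fix is:
        /* decide how this command will be completed *before* ringing the doorbell */
        bool poll_mode = (cmd->mode == CMD_MODE_POLLING);

        ring_doorbell(cmd, ent);                /* illustrative helper */
        if (poll_mode)
                poll_for_completion(cmd, ent);  /* illustrative helper */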
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0efc856249 ]
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the FW will err on driver attempts to deal with
setting/unsetting the eswitch and as a result the overall setup
of sriov will fail.
Fix that by avoiding the operation if e-switch management is not
allowed for this driver instance. While here, move to using the
correct name for the esw manager capability.
Fixes: 81848731ff ('net/mlx5: E-Switch, Add SR-IOV (FDB) support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Guy Kushnir <guyk@mellanox.com>
Reviewed-by: Eli Cohen <eli@melloanox.com>
Tested-by: Eli Cohen <eli@melloanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8ffd569aaa ]
The check for CPU hit statistics was not returning an immediate false
for any non-vport-rep netdev, and hence we crashed (say on mlx5 probed
VFs) if a user-space tool was calling into any possible netdev in the
system.
Fix that by doing a proper check before dereferencing.
Fixes: 1d447a3914 ('net/mlx5e: Extendable vport representor netdev private data')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Eli Cohen <eli@melloanox.com>
Reviewed-by: Eli Cohen <eli@melloanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 733d3e5497 ]
In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the switchdev mode representors are running on
the embedded cpu (EC) and not at the host.
As such, we should avoid dealing with vport representors if
we are not the esw manager.
While here, make sure to disallow eswitch switchdev related
setups through devlink if we are not esw managers.
Fixes: cb67b83292 ('net/mlx5e: Introduce SRIOV VF representors')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 64d7839af8 ]
When delta passed to gem_ptp_adjtime is negative, the sign is
maintained in the ns_to_timespec64 conversion. Hence timespec_add
should be used directly. timespec_sub will just subtract the negative
value thus increasing the time difference.
Signed-off-by: Harini Katakam <harini.katakam@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 603d4cf8fe ]
Since the addition of GRO for ESP, gro_receive can consume the skb and
return -EINPROGRESS. In that case, the lower layer GRO handler cannot
touch the skb anymore.
Commit 5f114163f2 ("net: Add a skb_gro_flush_final helper.") converted
some of the gro_receive handlers that can lead to ESP's gro_receive so
that they wouldn't access the skb when -EINPROGRESS is returned, but
missed other spots, mainly in tunneling protocols.
This patch finishes the conversion to using skb_gro_flush_final(), and
adds a new helper, skb_gro_flush_final_remcsum(), used in VXLAN and
GUE.
Fixes: 5f114163f2 ("net: Add a skb_gro_flush_final helper.")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0ce4e70ff0 ]
To compute delays, better not use time of the day which can
be changed by admins or malicious programs.
Also change ccid3_first_li() to use s64 type for delta variable
to avoid potential overflows.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: dccp@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ad088ec480 ]
The driver was combining the XDP_TX tail flush and XDP_REDIRECT
map flushing (xdp_do_flush_map). This is suboptimal, these two
flush operations should be kept separate.
Fixes: 11393cc9b9 ("xdp: Add batching support to redirect map")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 30877961b1 ]
Commit 296d485680 ("ipvlan: inherit MTU from master device") adjusted
the mtu from the master device when creating an ipvlan device, but it
would also override the mtu value set in rtnl_create_link. This causes
the IFLA_MTU param not to take effect.
So this patch does not adjust the mtu if the IFLA_MTU param is set when
creating an ipvlan device.
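In other words, the inherited MTU should only be applied when no explicit IFLA_MTU attribute was supplied; a sketch along those lines (the adjust helper name is an assumption):
        if (!tb[IFLA_MTU])
                ipvlan_adjust_mtu(ipvlan, phy_dev);     /* inherit only when no MTU was requested */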
Fixes: 296d485680 ("ipvlan: inherit MTU from master device")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fc9c2029e3 ]
The 'mask' argument to crypto_alloc_shash() uses the CRYPTO_ALG_* flags,
not 'gfp_t'. So don't pass GFP_KERNEL to it.
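A sketch of the corrected allocation (algo_name is an illustrative variable):
        tfm = crypto_alloc_shash(algo_name, 0, 0);      /* mask takes CRYPTO_ALG_*, not GFP_* */
        if (IS_ERR(tfm))
                return PTR_ERR(tfm);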
Fixes: bf355b8d2c ("ipv6: sr: add core files for SR HMAC support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3ffe64f1a6 ]
When doing device hotplug the sub channel must be async to avoid
deadlock issues because the device is discovered in softirq context.
When doing changes to MTU and number of channels, the setup
must be synchronous to avoid races such as when MTU and device
settings are done in a single ip command.
Reported-by: Thomas Walker <Thomas.Walker@twosigma.com>
Fixes: 8195b1396e ("hv_netvsc: fix deadlock on hotplug")
Fixes: 732e49850c ("netvsc: fix race on sub channel creation")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ced9e19150 ]
pool can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/atm/zatm.c:1491 zatm_ioctl() warn: potential spectre issue
'zatm_dev->pool_info' (local cap)
Fix this by sanitizing pool before using it to index
zatm_dev->pool_info.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
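The sanitization follows the usual array_index_nospec() pattern, roughly:
#include <linux/nospec.h>

        pool = array_index_nospec(pool, ZATM_LAST_POOL + 1);    /* clamp the index under speculation */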
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9bbe60a67b ]
ATM accounts for in-flight TX packets in sk_wmem_alloc of the VCC on
which they are to be sent. But it doesn't take ownership of those
packets from the sock (if any) which originally owned them. They should
remain owned by their actual sender until they've left the box.
There's a hack in pskb_expand_head() to avoid adjusting skb->truesize
for certain skbs, precisely to avoid messing up sk_wmem_alloc
accounting. Ideally that hack would cover the ATM use case too, but it
doesn't — skbs which aren't owned by any sock, for example PPP control
frames, still get their truesize adjusted when the low-level ATM driver
adds headroom.
This has apparently always been an issue: the truesize of a packet
would increase and sk_wmem_alloc on the VCC would go negative. But since
this happened only for control frames, not normal traffic, I think we
just got away with it; we probably needed to send 2GiB of LCP echo
frames before the misaccounting would ever have caused a problem and
caused atm_may_send() to start refusing packets.
Commit 14afee4b60 ("net: convert sock.sk_wmem_alloc from atomic_t to
refcount_t") did exactly what it was intended to do, and turned this
mostly-theoretical problem into a real one, causing PPPoATM to fail
immediately as sk_wmem_alloc underflows and atm_may_send() *immediately*
starts refusing to allow new packets.
The least intrusive solution to this problem is to stash the value of
skb->truesize that was accounted to the VCC, in a new member of the
ATM_SKB(skb) structure. Then in atm_pop_raw() subtract precisely that
value instead of the then-current value of skb->truesize.
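A sketch of that idea, with acct_truesize as the new ATM_SKB member described above:
        /* on transmit: remember exactly what was charged to the VCC */
        ATM_SKB(skb)->acct_truesize = skb->truesize;
        refcount_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);

        /* in atm_pop_raw(): subtract the remembered amount, not the current skb->truesize */
        WARN_ON(refcount_sub_and_test(ATM_SKB(skb)->acct_truesize,
                                      &sk_atm(vcc)->sk_wmem_alloc));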
Fixes: 158f323b98 ("net: adjust skb->truesize in pskb_expand_head()")
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Tested-by: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc800e8b39 ]
The __alx_open function can be called from ndo_open, which is called
under RTNL, or from alx_resume, which isn't. Since commit d768319cd4,
we're calling the netif_set_real_num_{tx,rx}_queues functions, which
need to be called under RTNL.
This is similar to commit 0c2cc02e57 ("igb: Move the calls to set the
Tx and Rx queues into igb_open").
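The resume path therefore needs to take the lock itself; a simplified sketch (the surrounding PCI/PHY handling is omitted):
static int alx_resume(struct device *dev)
{
        struct alx_priv *alx = dev_get_drvdata(dev);
        int err;

        rtnl_lock();                    /* __alx_open() sets real_num_{tx,rx}_queues */
        err = __alx_open(alx, true);
        rtnl_unlock();

        return err;
}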
Fixes: d768319cd4 ("alx: enable multiple tx queues")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de227ed796 upstream.
If the pinctrl node has the gpio-ranges property, the range will be added
by the gpio core and doesn't need to be added by the pinctrl driver.
But to keep backward compatibility, an explicit pinctrl_add_gpio_range
call is still needed when gpio-ranges is missing from the pinctrl
node in old dts files.
Cc: stable@vger.kernel.org
Fixes: d6ed935513 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 550b6f7e8c upstream.
The datasheet does not document any registers to control drive strength,
and no drive strength registers are for this reason described for this
SoC. The flags indicating that drive strength can be controlled are
however set for some pins in the driver.
This leads to a NULL pointer dereference when the sh-pfc core tries to
access the struct describing the drive strength registers, for example
when reading the sysfs file pinconf-pins.
Fix this by removing the SH_PFC_PIN_CFG_DRIVE_STRENGTH from all pins.
Fixes: b92ac66a18 ("pinctrl: sh-pfc: Add R8A77970 PFC support")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1b47a7c9e upstream.
Moving zero_resv_unavail before memmap_init_zone() caused a regression on
x86-32.
The cause is that we access struct pages before they are allocated when
CONFIG_FLAT_NODE_MEM_MAP is used.
free_area_init_nodes()
zero_resv_unavail()
mm_zero_struct_page(pfn_to_page(pfn)); <- struct page is not alloced
free_area_init_node()
if CONFIG_FLAT_NODE_MEM_MAP
alloc_node_mem_map()
memblock_virt_alloc_node_nopanic() <- struct page alloced here
On the other hand memblock_virt_alloc_node_nopanic() zeroes all the memory
that it returns, so we do not need to do zero_resv_unavail() here.
Fixes: e181ae0c5d ("mm: zero unavailable pages before memmap init")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Tested-by: Matt Hart <matt@mattface.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 568cc2f07c upstream.
Commit 2f28e4c24b (thermal: armada: Clarify control registers
accesses) introduced the new thermal binding. The new binding extends
the second registers field size to 8. Switch to the new binding to fix
thermal reading values. Without this change the fix for errata #132698
introduced in commit 8c0b888f66 (thermal: armada: Change sensors trim
default value) has no effect.
Cc: stable@vger.kernel.org # v4.16+
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0447378a4a upstream.
This patch extends the checks done prior to a nested VM entry.
Specifically, it extends the check_vmentry_prereqs function with checks
for fields relevant to the VM-entry event injection information, as
described in the Intel SDM, volume 3.
This patch is motivated by a syzkaller bug, where a bad VM-entry
interruption information field is generated in the VMCS02, which causes
the nested VM launch to fail. Then, KVM fails to resume L1.
While KVM should be improved to correctly resume L1 execution after a
failed nested launch, this change is justified because the existing code
to resume L1 is flaky/ad-hoc and the test coverage for resuming L1 is
sparse.
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Marc Orr <marcorr@google.com>
[Removed comment whose parts were describing previous revisions and the
rest was obvious from function/variable naming. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a8690ed6f upstream.
In commit 357d23c811a7 ("Remove the obsolete libibcm library")
in rdma-core [1], we removed the obsolete library which used the
/dev/infiniband/ucmX interface.
Following multiple syzkaller reports about non-sanitized
user input in the UCMA module, the short audit reveals the same
issues in UCM module too.
It is better to disable this interface in the kernel,
before the syzkaller team invests time and energy in hardening
this unused interface.
[1] https://github.com/linux-rdma/rdma-core/pull/279
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c568503ef0 upstream.
syzbot reports following splat:
BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450
net/bridge/netfilter/ebt_stp.c:162
ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]
The uninitialised access is
xt_mtchk_param->nft_compat
... which should be set to 0.
Fix it by zeroing the struct beforehand, same for tgchk.
ip(6)tables targetinfo uses c99-style initialiser, so no change
needed there.
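A sketch of the zeroing (tgchk is handled the same way):
        struct xt_mtchk_param mtpar;

        memset(&mtpar, 0, sizeof(mtpar));       /* leaves ->nft_compat at 0 */
        /* fill in the fields ebtables actually uses before calling xt_check_match() */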
Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com
Fixes: 55917a21d0 ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7b73cd5d7 upstream.
The x86 assembly implementations of Salsa20 use the frame base pointer
register (%ebp or %rbp), which breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
Recent (v4.10+) kernels will warn about this, e.g.
WARNING: kernel stack regs at 00000000a8291e69 in syzkaller047086:4677 has bad 'bp' value 000000001077994c
[...]
But after looking into it, I believe there's very little reason to still
retain the x86 Salsa20 code. First, these are *not* vectorized
(SSE2/SSSE3/AVX2) implementations, which would be needed to get anywhere
close to the best Salsa20 performance on any remotely modern x86
processor; they're just regular x86 assembly. Second, it's still
unclear that anyone is actually using the kernel's Salsa20 at all,
especially given that now ChaCha20 is supported too, and with much more
efficient SSSE3 and AVX2 implementations. Finally, in benchmarks I did
on both Intel and AMD processors with both gcc 8.1.0 and gcc 4.9.4, the
x86_64 salsa20-asm is actually slightly *slower* than salsa20-generic
(~3% slower on Skylake, ~10% slower on Zen), while the i686 salsa20-asm
is only slightly faster than salsa20-generic (~15% faster on Skylake,
~20% faster on Zen). The gcc version made little difference.
So, the x86_64 salsa20-asm is pretty clearly useless. That leaves just
the i686 salsa20-asm, which based on my tests provides a 15-20% speed
boost. But that's without updating the code to not use %ebp. And given
the maintenance cost, the small speed difference vs. salsa20-generic,
the fact that few people still use i686 kernels, the doubt that anyone
is even using the kernel's Salsa20 at all, and the fact that a SSE2
implementation would almost certainly be much faster on any remotely
modern x86 processor yet no one has cared enough to add one yet, I don't
think it's worthwhile to keep.
Thus, just remove both the x86_64 and i686 salsa20-asm implementations.
Reported-by: syzbot+ffa3a158337bbc01ff09@syzkaller.appspotmail.com
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 70dbcc2254 upstream.
Fix a regression introduced in Linux kernel 4.17 where sending a SCSI
command that does not transfer data (such as TEST UNIT READY) via
/dev/bsg/* results in EINVAL.
Fixes: 17cb960f29 ("bsg: split handling of SCSI CDBs vs transport requeues")
Cc: <stable@vger.kernel.org> # 4.17+
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ce0bba4e5 upstream.
Setting pv_irq_ops for Xen PV domains should be done as early as
possible in order to support e.g. very early printk() usage.
The same applies to xen_vcpu_info_reset(0), as it is needed for the
pv irq ops.
Move the call of xen_setup_machphys_mapping() after initializing the
pv functions as it contains a WARN_ON(), too.
Remove the no longer necessary conditional in xen_init_irq_ops()
from PVH V1 times to make clear this is a PV only function.
Cc: <stable@vger.kernel.org> # 4.14
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e69b5d308d upstream.
When removing the global bit from __supported_pte_mask do the same for
__default_kernel_pte_mask in order to avoid the WARN_ONCE() in
check_pgprot() when setting a kernel pte before having called
init_mem_mapping().
Cc: <stable@vger.kernel.org> # 4.17
Reported-by: Michael Young <m.a.young@durham.ac.uk>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b72717a20 upstream.
The code was mistakenly using the length of the page array memory instead
of the depth of the page array.
This would cause MR creation to fail in some cases.
Fixes: 8376b86de7 ("iw_cxgb4: Support the new memory registration API")
Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abe41184ab upstream.
I2C clients may misunderstand recovery pulses if they can't read SDA to
bail out early; in the worst case, they may interpret them as a write
operation. To avoid that, if we can write SDA, try to send a STOP to
avoid the misinterpretation.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54836e2d03 upstream.
On Tegra30 Cardhu the PCA9546 I2C mux is not ACK'ing I2C commands on
resume from suspend (which is caused by the reset signal for the I2C
mux not being configured correctly). However, this NACK is causing the
Tegra30 to hang on resuming from suspend which is not expected as we
detect NACKs and handle them. The hang observed appears to occur when
resetting the I2C controller to recover from the NACK.
Commit 77821b4678 ("i2c: tegra: proper handling of error cases") added
additional error handling for some error cases including NACK, however,
it appears that this change conflicts with an earlier fix by commit
f70893d083 ("i2c: tegra: Add delay before resetting the controller
after NACK"). After commit 77821b4678 was made we now disable 'packet
mode' before the delay from commit f70893d083 happens. Testing shows
that moving the delay to before disabling 'packet mode' fixes the hang
observed on Tegra30. The delay was added to give the I2C controller
chance to send a stop condition and so it makes sense to move this to
before we disable packet mode. Please note that packet mode is always
enabled for Tegra.
Fixes: 77821b4678 ("i2c: tegra: proper handling of error cases")
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b697d7d8c7 upstream.
The __get_txreq() function can return a pointer, ERR_PTR(-EBUSY), or NULL.
All of the relevant call sites look for IS_ERR, so the NULL return would
lead to a NULL pointer exception.
Do not use the ERR_PTR mechanism for this function.
Update all call sites to handle the return value correctly.
Clean up error paths to reflect return value.
Fixes: 45842abbb2 ("staging/rdma/hfi1: move txreq header code")
Cc: <stable@vger.kernel.org> # 4.9.x+
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kamenee Arumugam <kamenee.arumugam@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9feeb638cd upstream.
In 2016 GNU Make made a backwards incompatible change to the way '#'
characters were handled in Makefiles when used inside functions or
macros:
http://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57
Due to this change, when attempting to run `make prepare' I get a
spurious make syntax error:
/home/earnest/linux/tools/objtool/.fixdep.o.cmd:1: *** missing separator. Stop.
When inspecting `.fixdep.o.cmd' it includes two lines which use
unescaped comment characters at the top:
\# cannot find fixdep (/home/earnest/linux/tools/objtool//fixdep)
\# using basic dep data
This is because `tools/build/Build.include' prints these '\#'
characters:
printf '\# cannot find fixdep (%s)\n' $(fixdep) > $(dot-target).cmd; \
printf '\# using basic dep data\n\n' >> $(dot-target).cmd; \
This completes commit 9564a8cf42 ("Kbuild: fix # escaping in .cmd files
for future Make").
Link: https://bugzilla.kernel.org/show_bug.cgi?id=197847
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: stable@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fd8eb4ad8 upstream.
It does not matter if the caller of may_use_simd() migrates to
another cpu after the call, but it is still important that the
kernel_neon_busy percpu instance that is read matches the cpu the
task is running on at the time of the read.
This means that raw_cpu_read() is not sufficient. kernel_neon_busy
may appear true if the caller migrates during the execution of
raw_cpu_read() and the next task to be scheduled in on the initial
cpu calls kernel_neon_begin().
This patch replaces raw_cpu_read() with this_cpu_read() to protect
against this race.
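The resulting helper looks roughly like this (a sketch; the exact set of context checks is whatever the existing arm64 header already does):
static __must_check inline bool may_use_simd(void)
{
        /*
         * this_cpu_read() makes the read atomic with respect to preemption,
         * so the value cannot belong to a cpu the task has already left.
         */
        return !in_irq() && !irqs_disabled() && !in_nmi() &&
               !this_cpu_read(kernel_neon_busy);
}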
Cc: <stable@vger.kernel.org>
Fixes: cb84d11e16 ("arm64: neon: Remove support for nested or hardirq kernel-mode NEON")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Yandong Zhao <yandong77520@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33cc2c9667 upstream.
The notification of scrub completion happens within the scrub workqueue.
That can clearly race someone running scrub_show() and work_busy()
before the workqueue has a chance to flush the recently completed work.
Add a flag to reliably indicate the idle vs busy state. Without this
change applications using poll(2) to wait for scrub-completion may
falsely wakeup and read ARS as being busy even though the thread is
going idle and then hang indefinitely.
Fixes: bc6ba80858 ("nfit, address-range-scrub: rework and simplify ARS...")
Cc: <stable@vger.kernel.org>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Reported-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f8494fa3dd upstream.
Currently ftrace displays data in trace output like so:
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth
||| / delay
TASK-PID CPU TGID |||| TIMESTAMP FUNCTION
| | | | |||| | |
bash-1091 [000] ( 1091) d..2 28.313544: sched_switch:
However, Android's trace visualization tools expect a slightly different
format due to an out-of-tree patch that has been carried for a
decade; notice that the TGID and CPU fields are reversed:
_-----=> irqs-off
/ _----=> need-resched
| / _---=> hardirq/softirq
|| / _--=> preempt-depth
||| / delay
TASK-PID TGID CPU |||| TIMESTAMP FUNCTION
| | | | |||| | |
bash-1091 ( 1091) [002] d..2 64.965177: sched_switch:
From kernel v4.13 onwards, where TGID support was introduced, tracing
with systrace on all Android kernels will break (most Android kernels
have been on 4.9 with Android patches, so this issue hasn't been seen
yet).
The chrome browser's tracing tools also embed the systrace viewer which
uses the legacy TGID format and updates to that are known to be
difficult to make.
Considering this, I suggest we make this change to the upstream kernel
and backport it to all Android kernels. I believe this feature is merged
recently enough into the upstream kernel that it shouldn't be a problem.
Also logically, IMO it makes more sense to group the TGID with the
TASK-PID and the CPU after these.
Link: http://lkml.kernel.org/r/20180626000822.113931-1-joel@joelfernandes.org
Cc: jreck@google.com
Cc: tkjos@google.com
Cc: stable@vger.kernel.org
Fixes: 441dae8f2f ("tracing: Add support for display of tgid in trace output")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb177a732c upstream.
syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate
kernel BUG at mm/gup.c:1242!
invalid opcode: 0000 [#1] SMP
CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
RIP: 0010:__mm_populate+0x1e2/0x1f0
Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
Call Trace:
vm_brk_flags+0xc3/0x100
vm_brk+0x1f/0x30
load_elf_library+0x281/0x2e0
__ia32_sys_uselib+0x170/0x1e0
do_fast_syscall_32+0xca/0x420
entry_SYSENTER_compat+0x70/0x7f
The reason is that the length of the new brk is not page aligned when we
try to populate it. There is no reason to BUG on that, though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should. All we need is to tell mm_populate about it.
Besides that, there is absolutely no reason to BUG_ON in the first
place. The worst thing that could happen is that the last page wouldn't
get populated, and that is far from putting the system into an
inconsistent state.
Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags. The only other caller of do_brk_flags is the brk
syscall entry, and it makes sure to provide the proper length, so there
is no need for sanitization and we can use do_brk_flags without it.
Also remove the bogus BUG_ONs.
[osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
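The core of the move is simply aligning the length at the vm_brk_flags() level, roughly:
        /* in vm_brk_flags(): align the requested length once, before mapping and populating */
        len = PAGE_ALIGN(request);
        if (!len)
                return 0;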
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: syzbot <syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com>
Tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa8cbda88d upstream.
- Build the kernel without the fix
- Add some flag to the purgatories KBUILD_CFLAGS,I used
-fno-asynchronous-unwind-tables
- Re-build the kernel
When you look at make's output you see that sha256.o is not re-built in the
last step. Also readelf -S still shows the .eh_frame section for
sha256.o.
With the fix sha256.o is rebuilt in the last step.
Without FORCE make does not detect changes only made to the command line
options. So object files might not be re-built even when they should be.
Fix this by adding FORCE where it is missing.
Link: http://lkml.kernel.org/r/20180704110044.29279-2-prudo@linux.ibm.com
Fixes: df6f2801f5 ("kernel/kexec_file.c: move purgatories sha256 to common code")
Signed-off-by: Philipp Rudo <prudo@linux.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> [4.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e70cc2bd57 upstream.
Thomas reports:
"While looking around in /proc on my v4.14.52 system I noticed that all
processes got a lot of "Locked" memory in /proc/*/smaps. A lot more
memory than a regular user can usually lock with mlock().
Commit 493b0e9d94 (in v4.14-rc1) seems to have changed the behavior
of "Locked".
Before that commit the code was like this. Notice the VM_LOCKED check.
(vma->vm_flags & VM_LOCKED) ?
(unsigned long)(mss.pss >> (10 + PSS_SHIFT)) : 0);
After that commit Locked is now the same as Pss:
(unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
This looks like a mistake."
Indeed, the commit has added mss->pss_locked with the correct value that
depends on VM_LOCKED, but forgot to actually use it. Fix it.
Link: http://lkml.kernel.org/r/ebf6c7fb-fec3-6a26-544f-710ed193c154@suse.cz
Fixes: 493b0e9d94 ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Colascione <dancol@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aaa23f8600 upstream.
Obtaining the runtime pm wakeref can fail, especially in a hotplug
scenario where i915.ko has been unloaded. If we do not catch the
failure, we end up with an unbalanced pm.
v2 additions by tiwai:
hdmi_present_sense() checks the return value, handles only a
negative error case, and bails out only if it's really still suspended.
Also, snd_hda_power_down() is called at the error path so that the
refcount is balanced.
Along with it, the spec->pcm_lock is taken outside
hdmi_present_sense() in the caller side, so that it won't cause
a deadlock at reentrance via runtime resume.
v3 fix by tiwai:
Missing linux/pm_runtime.h is included.
References: 222bde0388 ("ALSA: hda - Fix mutex deadlock at HDMI/DP hotplug")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c6b17f1020 upstream.
We have two new lenovo desktop models which need to apply the fixup of
ALC294_FIXUP_LENOVO_MIC_LOCATION, and they have the same pin cfg as
the machine with subsystem id:0x17aa3136, now use the pincfg table
to apply the fixup for them.
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e181ae0c5d upstream.
We must zero struct pages for memory that is not backed by physical
memory, or kernel does not have access to.
Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:
https://www.spinics.net/lists/linux-mm/msg156764.html
Linus, also saw this bug on his machine, and confirmed that reverting
commit 124049decb ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.
The problem is that we incorrectly zero some struct pages after they
were setup.
The fix is to zero unavailable struct pages prior to initializing of
struct pages.
A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in
zero_resv_unavail().
Fixes: 124049decb ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fa3ecd878 upstream.
sgid directories have special semantics, making newly created files in
the directory belong to the group of the directory, and newly created
subdirectories will also become sgid. This is historically used for
group-shared directories.
But group directories writable by non-group members should not imply
that such non-group members can magically join the group, so make sure
to clear the sgid bit on non-directories for non-members (but remember
that sgid without group execute means "mandatory locking", just to
confuse things even more).
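The resulting mode handling in inode_init_owner() looks roughly like this (a sketch based on the description above):
        if (S_ISDIR(mode))
                mode |= S_ISGID;
        else if ((mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP) &&
                 !in_group_p(inode->i_gid) &&
                 !capable_wrt_inode_uidgid(dir, CAP_FSETID))
                mode &= ~S_ISGID;       /* non-member without CAP_FSETID: drop the sgid bit */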
Reported-by: Jann Horn <jannh@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 313db3d648 upstream.
The > should be >= here so that we don't read one element beyond the end
of the ep->stream_info->stream_rings[] array.
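The corrected bounds check, in sketch form:
        if (stream_id >= ep->stream_info->num_streams) /* was '>' */
                return NULL;                           /* don't index one element past the array */
        return ep->stream_info->stream_rings[stream_id];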
Fixes: e9df17eb14 ("USB: xhci: Correct assumptions about number of rings per endpoint.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bba57eddad upstream.
Corsair Strafe appears to suffer from the same issues
as the Corsair Strafe RGB.
Apply the same quirks (control message delay and init delay)
that the RGB version has to 1b1c:1b15.
With these quirks in place the keyboard works correctly upon
booting the system, and no longer requires reattaching the device.
Signed-off-by: Nico Sneck <snecknico@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f1e255d60a upstream.
In general, accessing userspace memory beyond the length of the supplied
buffer in VFS read/write handlers can lead to both kernel memory corruption
(via kernel_read()/kernel_write(), which can e.g. be triggered via
sys_splice()) and privilege escalation inside userspace.
Fix it by using simple_read_from_buffer() instead of custom logic.
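A sketch of the safer pattern for such a read handler (buffer size and format string are illustrative, and locking is omitted):
        char in_buffer[20];
        int len;

        len = snprintf(in_buffer, sizeof(in_buffer), "%lld\n", dev->bbu);
        return simple_read_from_buffer(buffer, count, ppos, in_buffer, len);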
Fixes: 6bc235a2e2 ("USB: add driver for Meywa-Denki & Kayac YUREX")
Signed-off-by: Jann Horn <jannh@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01b3cdfca2 upstream.
Fix broken modem-status error handling which could lead to bits of slab
data leaking to user space.
Fixes: 3b36a8fd67 ("usb: fix uninitialized variable warning in keyspan_pda")
Cc: stable <stable@vger.kernel.org> # 2.6.27
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e33eab9ded upstream.
The "r" variable is an int and "bufsize" is an unsigned int so the
comparison is type-promoted to unsigned. If usb_control_msg() returns a
negative value, it is treated as a high positive value and the error
handling doesn't work.
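A small illustration of the pitfall (generic code, not the driver's):
        int r = -EIO;                   /* e.g. a usb_control_msg() error */
        unsigned int bufsize = 8;

        if (r < bufsize)                /* r is promoted to unsigned, so this is false */
                pr_err("transfer failed or was short\n");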
Fixes: 2d5a9c72d0 ("USB: serial: ch341: fix control-message error handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 007a74907d upstream.
The commit 9aaa3b8b4c ("thunderbolt: Add support for preboot ACL")
introduced boot_acl attribute but missed the fact that now userspace
needs to poll the attribute constantly to find out whether it has
changed or not. Fix this by sending notification to the userspace
whenever the boot_acl attribute is changed.
Fixes: 9aaa3b8b4c ("thunderbolt: Add support for preboot ACL")
Reported-and-tested-by: Christian Kellner <christian@kellner.me>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Christian Kellner <christian@kellner.me>
Acked-by: Yehezkel Bernat <yehezkelshb@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 240630e618 upstream.
There have been several reports of LPM related hard freezes about once
a day on multiple Lenovo 50 series models. Strangely enough, these reports
were not disk-model specific, as LPM issues usually are, and some users
with the exact same disk + laptop were seeing them while other users
were not seeing these issues.
It turns out that enabling LPM triggers a firmware bug somewhere, which
has been fixed in later BIOS versions.
This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM
for dealing with this.
The ahci_broken_lpm() function contains DMI match info for the 4 models
which are known to be affected by this and the DMI BIOS date field for
known good BIOS versions. If the BIOS date is older than the one in the
table, LPM will be disabled and a warning will be printed.
Note the BIOS dates are for known good versions, some older versions may
work too, but we don't know for sure, the table is using dates from BIOS
versions for which users have confirmed that upgrading to that version
makes the problem go away.
Unfortunately I've been unable to get hold of the reporter who reported
that BIOS version 2.35 fixed the problems on the W541 for him. I've been
able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older
dmidecode, but I don't know the exact BIOS date as reported in the DMI.
Lenovo keeps a changelog with dates in their release notes, but the
dates there are the release dates not the build dates which are in DMI.
So I've chosen to set the date to which we compare to one day past the
release date of the 2.34 BIOS. I plan to fix this with a follow up
commit once I have the necessary info.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90d72ce079 upstream.
Embarrassingly, the recent fix introduced a worse problem than it solved,
causing the balloon not to inflate. The VM informed the hypervisor that
the pages for lock/unlock are sitting at the wrong address, because it
used the uninitialized page variable.
Fixes: b23220fe05 ("vmw_balloon: fixing double free when batching mode is off")
Cc: stable@vger.kernel.org
Reviewed-by: Xavier Deguillard <xdeguillard@vmware.com>
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa7eee8a14 upstream.
Sometimes when writing large size files to flash in direct/memory mapped
mode, it is seen that flash write enable command times out with error:
[ 503.146293] cadence-qspi 47040000.ospi: Flash command execution timed out.
This is because we need to make sure the previous direct write operation
is complete, by polling for the IDLE bit in CONFIG_REG, before starting the
next operation.
Fix this by polling for IDLE bit after memory mapped write.
Fixes: a27f2eaf2b ("mtd: spi-nor: cadence-quadspi: Add support for direct access mode")
Cc: stable@vger.kernel.org
Signed-off-by: Vignesh R <vigneshr@ti.com>
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7a020bff3 upstream.
This fixes regression introduced by
commit 8d52af6795 ("mei: speed up the power down flow")
In the power down or suspend flow a message can still be received
from the FW because the clients fake disconnection.
In the normal case we interpret messages w/o a destination as corrupted
and a link reset is performed in order to clean the channel,
but during power down the link reset is already in progress, resulting
in an endless loop. To resolve the issue, under the power down flow we
discard such messages silently.
Cc: <stable@vger.kernel.org> 4.16+
Fixes: 8d52af6795 ("mei: speed up the power down flow")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199541
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b320a0a9f2 upstream.
The block (LBA) specified must not exceed the last addressable LBA,
which is dev->n_sectors - 1. So the correct check is
"if (block >= dev->n_sectors)" and not "if (block > dev->n_sectors)".
Additionally, the asc/ascq to return for an LBA that is not a zone start
LBA should be ILLEGAL REQUEST, regardless of whether the bad LBA is out of
range.
Reported-by: David Butterfield <david.butterfield@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d59d2f9995 upstream.
RTL8822be can't bring up properly on ASUS X530UN, and dmesg says:
[ 8.591333] r8822be: module is from the staging directory, the quality
is unknown, you have been warned.
[ 8.593122] r8822be 0000:02:00.0: enabling device (0000 -> 0003)
[ 8.669163] r8822be: Using firmware rtlwifi/rtl8822befw.bin
[ 9.289939] r8822be: rtlwifi: wireless switch is on
[ 10.056426] r8822be 0000:02:00.0 wlp2s0: renamed from wlan0
...
[ 11.952534] r8822be: halmac_init_hal failed
[ 11.955933] r8822be: halmac_init_hal failed
[ 11.956227] r8822be: halmac_init_hal failed
[ 22.007942] r8822be: halmac_init_hal failed
Jian-Hong reported that it works if ASPM is turned off with the module
parameter aspm=0. In order to fix this problem more gently, this commit
doesn't turn off ASPM but enlarges the ASPM L1 latency to 7.
Reported-by: Jian-Hong Pan <jian-hong@endlessm.com>
Tested-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0341fc198 upstream.
This read handler had a lot of custom logic and wrote outside the bounds of
the provided buffer. This could lead to kernel and userspace memory
corruption. Just use simple_read_from_buffer() with a stack buffer.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25a98edd57 upstream.
This patch fixes an issue where the SDHI_INTERNAL_DMAC_RX_IN_USE
flag cannot be cleared because tmio_mmc_core sets host->data
to NULL before it calls tmio_mmc_abort_dma().
So, this patch clears SDHI_INTERNAL_DMAC_RX_IN_USE in
renesas_sdhi_internal_dmac_abort_dma() unconditionally. This doesn't
cause any side effects.
Fixes: 0cbc94daa5 ("mmc: renesas_sdhi_internal_dmac: limit DMA RX for old SoCs")
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a6b9f4d60 upstream.
Card write threshold control is supposed to be set since controller
version 2.80a for data write in HS400 mode and data read in
HS200/HS400/SDR104 mode. However, the current code returns without
configuring it in the case of data writes in HS400 mode.
The patch also fixes the current code going to
'disable' when doing data reads in HS400 mode.
Fixes: 7e4bf1bc95 ("mmc: dw_mmc: add the card write threshold for HS400 mode")
Signed-off-by: Qing Xia <xiaqing17@hisilicon.com>
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92748beac0 upstream.
If pinctrl nodes for 100/200MHz are missing, the controller should
not select any mode which needs signal frequencies of 100MHz or higher.
To prevent such speed modes the driver currently uses the quirk flag
SDHCI_QUIRK2_NO_1_8_V. This works nicely for SD cards since 1.8V
signaling is required for all faster modes and slower modes use 3.3V
signaling only.
However, there are eMMC modes which use 1.8V signaling and run below
100MHz, e.g. DDR52 at 1.8V. With using SDHCI_QUIRK2_NO_1_8_V this
mode is prevented. When using a fixed 1.8V regulator as vqmmc-supply
the stack has no valid mode to use. In this tenuous situation the
kernel continuously prints voltage switching errors:
mmc1: Switching to 3.3V signalling voltage failed
Avoid using SDHCI_QUIRK2_NO_1_8_V and prevent faster modes by
altering the SDHCI capability register. With that the stack is able
to select 1.8V modes even if no faster pinctrl states are available:
# cat /sys/kernel/debug/mmc1/ios
...
timing spec: 8 (mmc DDR52)
signal voltage: 1 (1.80 V)
...
Link: http://lkml.kernel.org/r/20180628081331.13051-1-stefan@agner.ch
Signed-off-by: Stefan Agner <stefan@agner.ch>
Fixes: ad93220de7 ("mmc: sdhci-esdhc-imx: change pinctrl state according
to uhs mode")
Cc: <stable@vger.kernel.org> # v4.13+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa85015c0d upstream.
After commit 18996f2db9 (ACPICA: Events: Stop unconditionally
clearing ACPI IRQs during suspend/resume) the status of ACPI events
is not cleared any more when entering the ACPI S5 system state (power
off) which causes some systems to power up immediately after turing
off power in certain situations.
That is a functional regression, so address it by making the code
clear the status of all ACPI events again when entering S5 (for
system-wide suspend or hibernation the clearing of the status of all
events is not desirable, as it might cause the kernel to miss wakeup
events sometimes).
Fixes: 18996f2db9 (ACPICA: Events: Stop unconditionally clearing ACPI IRQs during suspend/resume)
Reported-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Thomas Hänig <haenig@cosifan.de>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c83a726d6 upstream.
When the hangcheck handler was replaced by the DRM scheduler timeout
handling we dropped the forward progress check, as this might allow
clients to hog the GPU for a long time with a big job.
It turns out that even reasonably well behaved clients like the
Armada Xorg driver occasionally trip over the 500ms timeout. Bring
back the forward progress check to get rid of the userspace regression.
We would still like to fix userspace to submit smaller batches
if possible, but that is for another day.
Cc: <stable@vger.kernel.org>
Fixes: 6d7a20c077 (drm/etnaviv: replace hangcheck with scheduler timeout)
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf6ba3aeb2 upstream.
Russell King reported:
"When removing and reloading the etnaviv module, the following splat
occurs:
sysfs: cannot create duplicate filename '/devices/platform/etnaviv'
CPU: 0 PID: 1471 Comm: modprobe Not tainted 4.17.0+ #1608
Hardware name: Marvell Dove (Cubox)
Backtrace:
[<c00157d4>] (dump_backtrace) from [<c0015b8c>] (show_stack+0x18/0x1c)
r6:ef033e38 r5:ee07b340 r4:edb9d000 r3:00000000
[<c0015b74>] (show_stack) from [<c0620784>] (dump_stack+0x20/0x28)
[<c0620764>] (dump_stack) from [<c01bcd24>] (sysfs_warn_dup+0x5c/0x70)
[<c01bccc8>] (sysfs_warn_dup) from [<c01bce14>] (sysfs_create_dir_ns+0x90/0x98)
..."
Commit 246774d17f ("drm/etnaviv: remove the need for a gpu-subsystem
DT node") introduced DRM registration via
platform_device_register_simple(), but failed to call
platform_device_unregister() inside etnaviv_exit().
Fix the problem by calling platform_device_unregister() inside
etnaviv_exit(). While at it, also rearrange the function calls
in the exit path to make them happen in the opposite order of
registration.
Tested on an imx6-sabresd board.
Cc: <stable@vger.kernel.org>
Fixes: 246774d17f ("drm/etnaviv: remove the need for a gpu-subsystem DT node")
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 523402fa91 upstream.
We currently attempt to check whether a physical address range provided
to __ioremap() may be in use by the page allocator by examining the
value of PageReserved for each page in the region - lowmem pages not
marked reserved are presumed to be in use by the page allocator, and
requests to ioremap them fail.
The way we check this has been broken since commit 92923ca3aa ("mm:
meminit: only set page reserved in the memblock region"), because
memblock will typically not have any knowledge of non-RAM pages and
therefore those pages will not have the PageReserved flag set. Thus when
we attempt to ioremap a region outside of RAM we incorrectly fail
believing that the region is RAM that may be in use.
In most cases ioremap() on MIPS will take a fast-path to use the
unmapped kseg1 or xkphys virtual address spaces and never hit this path,
so the only way to hit it is for a MIPS32 system to attempt to ioremap()
an address range in lowmem with flags other than _CACHE_UNCACHED.
Perhaps the most straightforward way to do this is using
ioremap_uncached_accelerated(), which is how the problem was discovered.
Fix this by making use of walk_system_ram_range() to test the address
range provided to __ioremap() against only RAM pages, rather than all
lowmem pages. This means that if we have a lowmem I/O region, which is
very common for MIPS systems, we're free to ioremap() address ranges
within it. A nice bonus is that the test is no longer limited to lowmem.
The approach here matches the way x86 performed the same test after
commit c81c8a1eee ("x86, ioremap: Speed up check for RAM pages") until
x86 moved towards a slightly more complicated check using walk_mem_res()
for unrelated reasons with commit 0e4c12b45a ("x86/mm, resource: Use
PAGE_KERNEL protection for ioremap of memory pages").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reported-by: Serge Semin <fancer.lancer@gmail.com>
Tested-by: Serge Semin <fancer.lancer@gmail.com>
Fixes: 92923ca3aa ("mm: meminit: only set page reserved in the memblock region")
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.2+
Patchwork: https://patchwork.linux-mips.org/patch/19786/
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b63e132b64 upstream.
The current MIPS implementation of arch_trigger_cpumask_backtrace() is
broken because it attempts to use synchronous IPIs despite the fact that
it may be run with interrupts disabled.
This means that when arch_trigger_cpumask_backtrace() is invoked, for
example by the RCU CPU stall watchdog, we may:
- Deadlock due to use of synchronous IPIs with interrupts disabled,
causing the CPU that's attempting to generate the backtrace output
to hang itself.
- Not succeed in generating the desired output from remote CPUs.
- Produce warnings about this from smp_call_function_many(), for
example:
[42760.526910] INFO: rcu_sched detected stalls on CPUs/tasks:
[42760.535755] 0-...!: (1 GPs behind) idle=ade/140000000000000/0 softirq=526944/526945 fqs=0
[42760.547874] 1-...!: (0 ticks this GP) idle=e4a/140000000000000/0 softirq=547885/547885 fqs=0
[42760.559869] (detected by 2, t=2162 jiffies, g=266689, c=266688, q=33)
[42760.568927] ------------[ cut here ]------------
[42760.576146] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:416 smp_call_function_many+0x88/0x20c
[42760.587839] Modules linked in:
[42760.593152] CPU: 2 PID: 1216 Comm: sh Not tainted 4.15.4-00373-gee058bb4d0c2 #2
[42760.603767] Stack : 8e09bd20 8e09bd20 8e09bd20 fffffff0 00000007 00000006 00000000 8e09bca8
[42760.616937] 95b2b379 95b2b379 807a0080 00000007 81944518 0000018a 00000032 00000000
[42760.630095] 00000000 00000030 80000000 00000000 806eca74 00000009 8017e2b8 000001a0
[42760.643169] 00000000 00000002 00000000 8e09baa4 00000008 808b8008 86d69080 8e09bca0
[42760.656282] 8e09ad50 805e20aa 00000000 00000000 00000000 8017e2b8 00000009 801070ca
[42760.669424] ...
[42760.673919] Call Trace:
[42760.678672] [<27fde568>] show_stack+0x70/0xf0
[42760.685417] [<84751641>] dump_stack+0xaa/0xd0
[42760.692188] [<699d671c>] __warn+0x80/0x92
[42760.698549] [<68915d41>] warn_slowpath_null+0x28/0x36
[42760.705912] [<f7c76c1c>] smp_call_function_many+0x88/0x20c
[42760.713696] [<6bbdfc2a>] arch_trigger_cpumask_backtrace+0x30/0x4a
[42760.722216] [<f845bd33>] rcu_dump_cpu_stacks+0x6a/0x98
[42760.729580] [<796e7629>] rcu_check_callbacks+0x672/0x6ac
[42760.737476] [<059b3b43>] update_process_times+0x18/0x34
[42760.744981] [<6eb94941>] tick_sched_handle.isra.5+0x26/0x38
[42760.752793] [<478d3d70>] tick_sched_timer+0x1c/0x50
[42760.759882] [<e56ea39f>] __hrtimer_run_queues+0xc6/0x226
[42760.767418] [<e88bbcae>] hrtimer_interrupt+0x88/0x19a
[42760.775031] [<6765a19e>] gic_compare_interrupt+0x2e/0x3a
[42760.782761] [<0558bf5f>] handle_percpu_devid_irq+0x78/0x168
[42760.790795] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.798117] [<1b6d462c>] gic_handle_local_int+0x38/0x86
[42760.805545] [<b2ada1c7>] gic_irq_dispatch+0xa/0x14
[42760.812534] [<90c11ba2>] generic_handle_irq+0x1e/0x2c
[42760.820086] [<c7521934>] do_IRQ+0x16/0x20
[42760.826274] [<9aef3ce6>] plat_irq_dispatch+0x62/0x94
[42760.833458] [<6a94b53c>] except_vec_vi_end+0x70/0x78
[42760.840655] [<22284043>] smp_call_function_many+0x1ba/0x20c
[42760.848501] [<54022b58>] smp_call_function+0x1e/0x2c
[42760.855693] [<ab9fc705>] flush_tlb_mm+0x2a/0x98
[42760.862730] [<0844cdd0>] tlb_flush_mmu+0x1c/0x44
[42760.869628] [<cb259b74>] arch_tlb_finish_mmu+0x26/0x3e
[42760.877021] [<1aeaaf74>] tlb_finish_mmu+0x18/0x66
[42760.883907] [<b3fce717>] exit_mmap+0x76/0xea
[42760.890428] [<c4c8a2f6>] mmput+0x80/0x11a
[42760.896632] [<a41a08f4>] do_exit+0x1f4/0x80c
[42760.903158] [<ee01cef6>] do_group_exit+0x20/0x7e
[42760.909990] [<13fa8d54>] __wake_up_parent+0x0/0x1e
[42760.917045] [<46cf89d0>] smp_call_function_many+0x1a2/0x20c
[42760.924893] [<8c21a93b>] syscall_common+0x14/0x1c
[42760.931765] ---[ end trace 02aa09da9dc52a60 ]---
[42760.938342] ------------[ cut here ]------------
[42760.945311] WARNING: CPU: 2 PID: 1216 at kernel/smp.c:291 smp_call_function_single+0xee/0xf8
...
This patch switches MIPS' arch_trigger_cpumask_backtrace() to use async
IPIs & smp_call_function_single_async() in order to resolve this
problem. We ensure use of the pre-allocated call_single_data_t
structures is serialized by maintaining a cpumask indicating that
they're busy, and refusing to attempt to send an IPI when a CPU's bit is
set in this mask. This should only happen if a CPU hasn't responded to a
previous backtrace IPI - ie. if it's hung - and we print a warning to
the console in this case.
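A minimal sketch of the busy-mask serialization (illustrative, not the
exact arch code):
static DEFINE_PER_CPU(call_single_data_t, backtrace_csd);
static cpumask_t backtrace_csd_busy;

static void raise_backtrace(const struct cpumask *mask)
{
	unsigned int cpu;

	for_each_cpu(cpu, mask) {
		/* a still-set bit means the CPU never handled the
		 * previous backtrace IPI, i.e. it is probably hung */
		if (cpumask_test_and_set_cpu(cpu, &backtrace_csd_busy)) {
			pr_warn("Unable to send backtrace IPI to CPU%u - perhaps it hung?\n",
				cpu);
			continue;
		}

		/* backtrace_csd.func is assumed to be set up elsewhere to
		 * a handler that dumps the stack and clears the busy bit */
		smp_call_function_single_async(cpu,
					       &per_cpu(backtrace_csd, cpu));
	}
}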
I've marked this for stable branches as far back as v4.9, to which it
applies cleanly. Strictly speaking the faulty MIPS implementation can be
traced further back to commit 856839b768 ("MIPS: Add
arch_trigger_all_cpu_backtrace() function") in v3.19, but kernel
versions v3.19 through v4.8 will require further work to backport due to
the rework performed in commit 9a01c3ed5c ("nmi_backtrace: add more
trigger_*_cpu_backtrace() methods").
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19597/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.9+
Fixes: 856839b768 ("MIPS: Add arch_trigger_all_cpu_backtrace() function")
Fixes: 9a01c3ed5c ("nmi_backtrace: add more trigger_*_cpu_backtrace() methods")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a267832c2 upstream.
The generic nmi_cpu_backtrace() function calls show_regs() when a struct
pt_regs is available, and dump_stack() otherwise. If we were to make use
of the generic nmi_cpu_backtrace() with MIPS' current implementation of
show_regs() this would mean that we see only register data with no
accompanying stack information, in contrast with our current
implementation which calls dump_stack() regardless of whether register
state is available.
In preparation for making use of the generic nmi_cpu_backtrace() to
implement arch_trigger_cpumask_backtrace(), have our implementation of
show_regs() call dump_stack() and drop the explicit dump_stack() call in
arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace().
This will allow the output we produce to remain the same after a later
patch switches to using nmi_cpu_backtrace(). It may mean that we produce
extra stack output in other uses of show_regs(), but this:
1) Seems harmless.
2) Is good for consistency between arch_trigger_cpumask_backtrace()
and other users of show_regs().
3) Matches the behaviour of the ARM & PowerPC architectures.
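In effect, the MIPS show_regs() becomes (minimal sketch):
void show_regs(struct pt_regs *regs)
{
	__show_regs(regs);
	/* print the stack trace here, so arch_dump_stack() no longer
	 * needs its own explicit dump_stack() call */
	dump_stack();
}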
Marked for stable back to v4.9 as a prerequisite of the following patch
"MIPS: Call dump_stack() from show_regs()".
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/19596/
Cc: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58990d1ff3 upstream.
As commit 28e33f9d78 ("bpf: disallow arithmetic operations on
context pointer") already describes, f1174f77b5 ("bpf/verifier:
rework value tracking") removed the specific white-listed cases
we had previously where we would allow for pointer arithmetic in
order to further generalize it, and allow e.g. context access via
modified registers. While the dereferencing of modified context
pointers had been forbidden through 28e33f9d78, syzkaller did
recently manage to trigger several KASAN splats for slab out of
bounds access and use after frees by simply passing a modified
context pointer to a helper function which would then do the bad
access since verifier allowed it in adjust_ptr_min_max_vals().
Rejecting arithmetic on ctx pointer in adjust_ptr_min_max_vals()
generally could break existing programs as there's a valid use
case in tracing in combination with passing the ctx to helpers as
bpf_probe_read(), where the register then becomes unknown at
verification time due to adding a non-constant offset to it. An
access sequence may look like the following:
offset = args->filename; /* field __data_loc filename */
bpf_probe_read(&dst, len, (char *)args + offset); // args is ctx
There are two options: i) we could special case the ctx and as
soon as we add a constant or bounded offset to it (hence ctx type
wouldn't change) we could turn the ctx into an unknown scalar, or
ii) we generalize the sanity test for ctx member access into a
small helper and assert it on the ctx register that was passed
as a function argument. Fwiw, latter is more obvious and less
complex at the same time, and one case that may potentially be
legitimate in future for ctx member access at least would be for
ctx to carry a const offset. Therefore, fix follows approach
from ii) and adds test cases to BPF kselftests.
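A minimal sketch of the helper described in ii); names and the exact
predicate are illustrative:
static int check_ctx_reg(struct bpf_verifier_env *env,
			 const struct bpf_reg_state *reg, int regno)
{
	/* a ctx pointer handed to a helper (or dereferenced) must not
	 * carry any fixed or variable offset */
	if (reg->off ||
	    !tnum_is_const(reg->var_off) || reg->var_off.value) {
		verbose(env,
			"dereference of modified ctx ptr R%d disallowed\n",
			regno);
		return -EACCES;
	}

	return 0;
}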
Fixes: f1174f77b5 ("bpf/verifier: rework value tracking")
Reported-by: syzbot+3d0b2441dbb71751615e@syzkaller.appspotmail.com
Reported-by: syzbot+c8504affd4fdd0c1b626@syzkaller.appspotmail.com
Reported-by: syzbot+e5190cb881d8660fb1a3@syzkaller.appspotmail.com
Reported-by: syzbot+efae31b384d5badbd620@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 28557cc106 upstream.
Revert commit c7f26ccfb2 ("mm/vmstat.c: fix vmstat_update() preemption
BUG"). Steven saw a "using smp_processor_id() in preemptible" message
and added a preempt_disable() section around it to keep it quiet. This
is not the right thing to do; it does not fix the real problem.
vmstat_update() is invoked by a kworker on a specific CPU. This worker
is bound to this CPU. The name of the worker was "kworker/1:1" so it
should have been a worker which was bound to CPU1. A worker which can
run on any CPU would have a `u' before the first digit.
smp_processor_id() can be used in a preempt-enabled region as long as
the task is bound to a single CPU which is the case here. If it could
run on an arbitrary CPU then this is the problem we have and should seek
to resolve.
Not only must this smp_processor_id() user not be migrated to another
CPU, but refresh_cpu_vm_stats() might also access the wrong per-CPU
variables.
Not to mention that other code relies on the fact that such a worker
runs on one specific CPU only.
Therefore revert that commit and we should look instead what broke the
affinity mask of the kworker.
Link: http://lkml.kernel.org/r/20180504104451.20278-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Steven J. Hill <steven.hill@cavium.com>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce00bf07cc upstream.
The old code would indefinitely block other users of nf_log_mutex if
a userspace access in proc_dostring() blocked e.g. due to a userfaultfd
region. Fix it by moving proc_dostring() out of the locked region.
This is a followup to commit 266d07cb1c ("netfilter: nf_log: fix
sleeping function called from invalid context"), which changed this code
from using rcu_read_lock() to taking nf_log_mutex.
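A minimal sketch of the write path after the change (the read path and
the actual logger lookup are elided):
static int nf_log_proc_dostring(struct ctl_table *table, int write,
				void __user *buffer, size_t *lenp, loff_t *ppos)
{
	char buf[NFLOGGER_NAME_LEN];
	struct ctl_table tmp = *table;
	int r;

	if (!write)
		return -EINVAL;	/* read path elided from this sketch */

	/* copy from userspace *before* taking nf_log_mutex, so a blocking
	 * fault (e.g. a userfaultfd region) cannot stall other mutex users */
	tmp.data = buf;
	tmp.maxlen = sizeof(buf);
	r = proc_dostring(&tmp, write, buffer, lenp, ppos);
	if (r)
		return r;

	mutex_lock(&nf_log_mutex);
	/* ... look up the logger named in buf and install it ... */
	mutex_unlock(&nf_log_mutex);

	return 0;
}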
Fixes: 266d07cb1c ("netfilter: nf_log: fix sleeping function calle[...]")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbc626597c upstream.
Currently device_supports_dax() just checks to see if the QUEUE_FLAG_DAX
flag is set on the device's request queue to decide whether or not the
device supports filesystem DAX. Really we should be using
bdev_dax_supported() like filesystems do at mount time. This performs
other tests like checking to make sure the dax_direct_access() path works.
We also explicitly clear QUEUE_FLAG_DAX on the DM device's request queue if
any of the underlying devices do not support DAX. This makes the handling
of QUEUE_FLAG_DAX consistent with the setting/clearing of most other flags
in dm_table_set_restrictions().
Now that bdev_dax_supported() explicitly checks for QUEUE_FLAG_DAX, this
will ensure that filesystems built upon DM devices will only be able to
mount with DAX if all underlying devices also support DAX.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Fixes: commit 545ed20e6d ("dm: add infrastructure for DAX support")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15256f6cc4 upstream.
Add an explicit check for QUEUE_FLAG_DAX to __bdev_dax_supported(). This
is needed for DM configurations where the first element in the dm-linear or
dm-stripe target supports DAX, but other elements do not. Without this
check __bdev_dax_supported() will pass for such devices, letting a
filesystem on that device mount with the DAX option.
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Mike Snitzer <snitzer@redhat.com>
Fixes: commit 545ed20e6d ("dm: add infrastructure for DAX support")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80660f2025 upstream.
The function's return values are confusing given the way the function is
named. We expect a true or false return value, but it actually returns
0/-errno. This makes the code very confusing. Change the return value
to a bool: return true if DAX is supported and false if it is not.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba23cba9b3 upstream.
Change bdev_dax_supported so it takes a bdev parameter. This enables
multi-device filesystems like xfs to check that a dax device can work for
the particular filesystem. Once that's in place, actually fix all the
parts of XFS where we need to be able to distinguish between datadev and
rtdev.
This patch fixes the problem where we screw up the dax support checking
in xfs if the datadev and rtdev have different dax capabilities.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[rez: Re-added __bdev_dax_supported() for !CONFIG_FS_DAX cases]
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aa613674f upstream.
If DMA safe memory was allocated, but the subsequent I2C transfer
fails the memory is leaked. Plug this leak.
Fixes: 8a77821e74 ("i2c: smbus: use DMA safe buffers for emulated SMBus transactions")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e03477cb7 upstream.
In i2c_smbus_xfer_emulated(), the function i2c_transfer() is invoked to
transfer i2c messages. The number of actual transferred messages is
returned and saved to 'status'. If 'status' is negative, that means an
error occurred during the transfer process. In that case, the value of
'status' is an error code to indicate the reason of the transfer failure.
In most cases, i2c_transfer() can transfer 'num' messages with no error.
And so 'status' == 'num'. However, due to unexpected errors, it is probable
that only partial messages are transferred by i2c_transfer(). As a result,
'status' != 'num'. This special case is not checked after the invocation of
i2c_transfer() and can potentially lead to unexpected issues in the
following execution since it is expected that 'status' == 'num'.
This patch checks the return value of i2c_transfer() and returns an error
code -EIO if the number of actual transferred messages 'status' is not
equal to 'num'.
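A minimal sketch of the added check, using the local variable names from
i2c_smbus_xfer_emulated():
	/* i2c_transfer() returns the number of messages actually
	 * transferred, or a negative errno */
	status = i2c_transfer(adapter, msg, num);
	if (status < 0)
		return status;
	if (status != num)
		return -EIO;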
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f732850df upstream.
Detected on the Dell XPS 9365.
The laptop has 2 devices that benefit from the hid-generic auto-unbinding.
When those 2 devices are presented to the userspace, udev loads both wacom and
hid-multitouch. When this happens, the code in __hid_bus_reprobe_drivers() is
called concurrently and the second device gets reprobed twice.
Another bug in the power_supply subsystem prevents removing the wacom
driver if it has just finished its initialization, which basically kills
the wacom node.
[jkosina@suse.cz: reformat changelog a bit]
Fixes: c17a7476e4 ("HID: core: rewrite the hid-generic automatic unbind")
Cc: stable@vger.kernel.org # v4.17
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f65245f2d upstream.
uref->field_index, uref->usage_index, finfo.field_index and cinfo.index can be
indirectly controlled by user-space, hence leading to a potential exploitation
of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/hid/usbhid/hiddev.c:473 hiddev_ioctl_usage() warn: potential spectre issue 'report->field' (local cap)
drivers/hid/usbhid/hiddev.c:477 hiddev_ioctl_usage() warn: potential spectre issue 'field->usage' (local cap)
drivers/hid/usbhid/hiddev.c:757 hiddev_ioctl() warn: potential spectre issue 'report->field' (local cap)
drivers/hid/usbhid/hiddev.c:801 hiddev_ioctl() warn: potential spectre issue 'hid->collection' (local cap)
Fix this by sanitizing such structure fields before using them to index
report->field, field->usage and hid->collection
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
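A minimal sketch of the sanitization pattern with array_index_nospec()
from <linux/nospec.h> (illustrative; the merged code touches several
call sites):
	if (uref->field_index >= report->maxfield)
		goto inval;
	/* clamp the index under speculation before using it */
	uref->field_index = array_index_nospec(uref->field_index,
					       report->maxfield);
	field = report->field[uref->field_index];

	if (uref->usage_index >= field->maxusage)
		goto inval;
	uref->usage_index = array_index_nospec(uref->usage_index,
					       field->maxusage);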
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef6eaf2727 upstream.
Commit ac75a04104 ("HID: i2c-hid: fix size check and type usage") started
writing messages when the ret_size is <= 2 from i2c_master_recv. However, my
device i2c-DLL07D1 returns 2 for a short period of time (~0.5s) after I stop
moving the pointing stick or touchpad. It varies, but you get ~50 messages
each time which spams the log hard.
[ 95.925055] i2c_hid i2c-DLL07D1:01: i2c_hid_get_input: incomplete report (83/2)
This has also been observed with a i2c-ALP0017.
[ 1781.266353] i2c_hid i2c-ALP0017:00: i2c_hid_get_input: incomplete report (30/2)
Only print the message when ret_size is totally invalid and less than 2 to cut
down on the log spam.
Fixes: ac75a04104 ("HID: i2c-hid: fix size check and type usage")
Reported-by: John Smith <john-s-84@gmx.net>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a17712c8e4 upstream.
This patch attempts to close a hole leading to a BUG seen with hot
removals during writes [1].
A block device (NVME namespace in this test case) is formatted to EXT4
without partitions. It's mounted and write I/O is run to a file, then
the device is hot removed from the slot. The superblock attempts to be
written to the drive which is no longer present.
The typical chain of events leading to the BUG:
ext4_commit_super()
__sync_dirty_buffer()
submit_bh()
submit_bh_wbc()
BUG_ON(!buffer_mapped(bh));
This fix checks for the superblock's buffer head being mapped prior to
syncing.
[1] https://www.spinics.net/lists/linux-ext4/msg56527.html
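A minimal sketch of the guard near the top of ext4_commit_super(), where
sbh is the superblock's buffer head and error is the local return value:
	/* the device may have been hot-removed out from under us; an
	 * unmapped buffer head would otherwise trip
	 * BUG_ON(!buffer_mapped(bh)) in submit_bh_wbc() */
	if (!buffer_mapped(sbh))
		return error;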
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bfe0a5f47a upstream.
The kernel's ext4 mount-time checks were more permissive than
e2fsprogs's libext2fs checks when opening a file system. If the
superblock is considered too insane for debugfs or e2fsck to operate
on, the kernel has no business trying to mount it.
This will make file system fuzzing tools work harder, but the failure
cases that they find will be more useful and be easier to evaluate.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c37e9e0134 upstream.
If there is a directory entry pointing to a system inode (such as a
journal inode), complain and declare the file system to be corrupted.
Also, if the superblock's first inode number field is too small,
refuse to mount the file system.
This addresses CVE-2018-10882.
https://bugzilla.kernel.org/show_bug.cgi?id=200069
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8cdb5240ec upstream.
When expanding the extra isize space, we must never move the
system.data xattr out of the inode body. For performance reasons, it
doesn't make any sense, and the inline data implementation assumes
that system.data xattr is never in the external xattr block.
This addresses CVE-2018-10880
https://bugzilla.kernel.org/show_bug.cgi?id=200005
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e8ab72a81 upstream.
When converting from an inode from storing the data in-line to a data
block, ext4_destroy_inline_data_nolock() was only clearing the on-disk
copy of the i_blocks[] array. It was not clearing copy of the
i_blocks[] in ext4_inode_info, in i_data[], which is the copy actually
used by ext4_map_blocks().
This didn't matter much if we are using extents, since the extents
header would be invalid and thus the extents code would re-initialize
the extents tree. But if we are using indirect blocks, the previous
contents of the i_blocks array will be treated as block numbers, with
potentially catastrophic results to the file system integrity and/or
user data.
This gets worse if the file system is using a 1k block size and
s_first_data_block is zero, but even without this, the file system can get
quite badly corrupted.
This addresses CVE-2018-10881.
https://bugzilla.kernel.org/show_bug.cgi?id=200015
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8844618d8a upstream.
The bg_flags field in the block group descriptors is only valid if the
uninit_bg or metadata_csum feature is enabled. We were not
consistently looking at this field; fix this.
Also block group #0 must never have uninitialized allocation bitmaps,
or need to be zeroed, since that's where the root inode, and other
special inodes are set up. Check for these conditions and mark the
file system as corrupted if they are detected.
This addresses CVE-2018-10876.
https://bugzilla.kernel.org/show_bug.cgi?id=199403
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 513f86d738 upstream.
If an inode points to a block which is also some other type of
metadata block (such as a block allocation bitmap), the
buffer_verified flag can be set when it was validated as that other
metadata block type; however, it would make a really terrible external
attribute block. The reason why we use the verified flag is to avoid
constantly reverifying the block. However, it doesn't take much
overhead to make sure the magic number of the xattr block is correct,
and this will avoid potential crashes.
This addresses CVE-2018-10879.
https://bugzilla.kernel.org/show_bug.cgi?id=200001
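A minimal sketch of the cheap magic-number check (illustrative; the
merged code folds this into the existing verification path):
	/* an external xattr block must carry the xattr header magic,
	 * regardless of what buffer_verified says */
	if (BHDR(bh)->h_magic != cpu_to_le32(EXT4_XATTR_MAGIC))
		return -EFSCORRUPTED;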
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e09463f220 upstream.
The b_modified flag in a block's journal head should not be set until
after we're sure that jbd2_journal_dirty_metadata() will not abort
with an error due to there not being enough space reserved in the
jbd2 handle.
Otherwise, future attempts to modify the buffer may lead to a large
number of spurious errors and warnings.
This addresses CVE-2018-10883.
https://bugzilla.kernel.org/show_bug.cgi?id=200071
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9ff68521a upstream.
The other day I was testing one of the HP laptops at my office with an
i915/amdgpu hybrid setup and noticed that hotplugging was non-functional
on almost all of the display outputs. I eventually discovered that all
of the external outputs were connected to the amdgpu device instead of
i915, and that the hotplugs weren't being detected so long as the GPU
was in runtime suspend. After some talking with folks at AMD, I learned
that amdgpu is actually supposed to support hotplug detection in runtime
suspend so long as the OEM has implemented it properly in the firmware.
On this HP ZBook 15 G4 (the machine in question), amdgpu wasn't managing
to find the ATIF handle at all despite the fact that I could see acpi
events being sent in response to any hotplugging. After going through
dumps of the firmware, I discovered that this machine did in fact
support ATIF, but that its ATIF method lived in an entirely different
namespace than this device's handle (the device handle was
\_SB_.PCI0.PEG0.PEGP, but ATIF lives in ATPX's handle at
\_SB_.PCI0.GFX0).
So, fix this by probing ATPX's ACPI parent's namespace if we can't find
ATIF elsewhere, along with storing a pointer to the proper handle to use
for ATIF and using that instead of the device's handle.
This fixes HPD detection while in runtime suspend for this ZBook!
v2: Update the comment to reflect how the namespaces are arranged
based on the system configuration. (Alex)
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4aa5d5eb82 upstream.
Since it seems that some vendors are storing the ATIF ACPI methods under
the same handle that ATPX lives under instead of the device's own
handle, we're going to need to be able to retrieve this handle later so
we can probe for ATIF there.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99ec9e7751 upstream.
The displaylink hardware has such a peculiarity that it doesn't render a
command until the next command is received. This produces occasional
corruption, such as when setting 22x11 font on the console, only the first
line of the cursor will be blinking if the cursor is located at some
specific columns.
When we end up with a repeating pixel, the driver has a bug in that it leaves
one uninitialized byte after the command (and this byte is enough to flush
the command and render it - thus it fixes the screen corruption), however
when we end up with a non-repeating pixel, there is no byte appended and
this results in temporary screen corruption.
This patch fixes the screen corruption by always appending a byte 0xAF at
the end of URB. It also removes the uninitialized byte.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ffbe65578 upstream.
For every request we send, whether it is SMB1 or SMB2+, we attempt to
reconnect tcon (cifs_reconnect_tcon or smb2_reconnect) before carrying
out the request.
So, while server->tcpStatus != CifsNeedReconnect, we wait for the
reconnection to succeed on wait_event_interruptible_timeout(). If it
returns, that means that either the condition was evaluated to true, or
timeout elapsed, or it was interrupted by a signal.
Since we're not handling the case where the process woke up due to a
received signal (-ERESTARTSYS), the next call to
wait_event_interruptible_timeout() will _always_ fail and we end up
looping forever inside either cifs_reconnect_tcon() or smb2_reconnect().
Here's an example of how to trigger that:
$ mount.cifs //foo/share /mnt/test -o
username=foo,password=foo,vers=1.0,hard
(break connection to server before executing below cmd)
$ stat -f /mnt/test & sleep 140
[1] 2511
$ ps -aux -q 2511
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2511 0.0 0.0 12892 1008 pts/0 S 12:24 0:00 stat -f
/mnt/test
$ kill -9 2511
(wait for a while; process is stuck in the kernel)
$ ps -aux -q 2511
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 2511 83.2 0.0 12892 1008 pts/0 R 12:24 30:01 stat -f
/mnt/test
Using a 'hard' mount point means that cifs.ko will keep retrying
indefinitely; however, we must allow the process to be killed, otherwise
it would hang the system.
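A minimal sketch of the corrected wait (variable names follow the
reconnect helpers but are illustrative):
	long rc;

	rc = wait_event_interruptible_timeout(server->response_q,
					      server->tcpStatus != CifsNeedReconnect,
					      10 * HZ);
	if (rc < 0) {
		/* woken by a fatal signal: give up instead of looping
		 * forever in the reconnect path */
		cifs_dbg(FYI, "%s: aborting reconnect due to a received signal\n",
			 __func__);
		return -ERESTARTSYS;
	}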
Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Cc: stable@vger.kernel.org
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 696e420bb2 upstream.
With protocol version 2.0 mounts we have seen crashes with corrupt mid
entries. Either the server->pending_mid_q list becomes corrupt with a
cyclic reference in one element or a mid object fetched by the
demultiplexer thread becomes overwritten during use.
Code review identified a race between the demultiplexer thread and the
request issuing thread. The demultiplexer thread seems to be written
with the assumption that it is the sole user of the mid object until
it calls the mid callback which either wakes the issuer task or
deletes the mid.
This assumption is not true because the issuer task can be woken up
earlier by a signal. If the demultiplexer thread has proceeded as far
as setting the mid_state to MID_RESPONSE_RECEIVED then the issuer
thread will happily end up calling cifs_delete_mid while the
demultiplexer thread still is using the mid object.
Inserting a delay in the cifs demultiplexer thread widens the race
window and makes reproduction of the race very easy:
if (server->large_buf)
buf = server->bigbuf;
+ usleep_range(500, 4000);
server->lstrp = jiffies;
To resolve this I think the proper solution involves putting a
reference count on the mid object. This patch makes sure that the
demultiplexer thread holds a reference until it has finished
processing the transaction.
Cc: stable@vger.kernel.org
Signed-off-by: Lars Persson <larper@axis.com>
Acked-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07eaa43e66 upstream.
Disable the metastability workaround for USB2. The original
patch disabled the workaround on the wrong USB port.
Fixes: b8c9c6fa20 ("ARM: dts: dra7: Disable USB metastability workaround for USB2")
Cc: <stable@vger.kernel.org> [4.16+]
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0144eb204c upstream.
A previous patch removed OMAP clock aliases that were perceived
to be unnecessary. Unfortunately, it broke the ethernet on the
am3517-evm. This patch enables the MDIO clock and EMAC clock.
Fixes: 0ed266d7ae ("clk: ti: omap3: cleanup unnecessary clock aliases")
Cc: stable@vger.kernel.org #4.16+
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06d793b114 upstream.
The pinctrl settings were incorrect for the touchscreen interrupt line, causing
an interrupt storm. This change has been tested with both the atmel_mxt_ts and
RMI4 drivers on the RDU1 units.
The value 0x4 comes from the value of register IOMUXC_SW_PAD_CTL_PAD_CSI1_D8
from the old vendor kernel.
Signed-off-by: Nick Dyer <nick@shmanahar.org>
Fixes: ceef0396f3 ("ARM: dts: imx: add ZII RDU1 board")
Cc: <stable@vger.kernel.org> # 4.15+
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb94b55af3 upstream.
The patch noted in the fixes below converted get_user_pages_fast() to
get_user_pages_longterm(), however the two calls differ in a few ways.
First _fast() is documented to not require the mmap_sem, while _longterm()
is documented to need it. Hold the mmap sem as required.
Second, _fast accepts an 'int write' while _longterm uses 'unsigned int
gup_flags', so the expression '!!(prot & IOMMU_WRITE)' is only working by
luck as FOLL_WRITE is currently == 0x1. Use the expected FOLL_WRITE
constant instead.
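A minimal sketch of the corrected call under those two requirements:
	struct page *page[1];
	unsigned int flags = (prot & IOMMU_WRITE) ? FOLL_WRITE : 0;
	long ret;

	/* _longterm() is documented to need the mmap_sem held */
	down_read(&current->mm->mmap_sem);
	ret = get_user_pages_longterm(vaddr, 1, flags, page, NULL);
	up_read(&current->mm->mmap_sem);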
Fixes: 94db151dc8 ("vfio: disable filesystem-dax page pinning")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64dafbc953 upstream.
We have
struct drbd_requests { ... struct bio *private_bio; ... }
to hold a bio clone for local submission.
On local IO completion, we put that bio, and in case we want to use the
result later, we overload that member to hold the ERR_PTR() of the
completion result, which, before v4.3, used to be the passed-in
"int error", so we could first bio_put(), then assign.
v4.3-rc1~100^2~21 4246a0b63b block: add a bi_error field to struct bio
changed that:
bio_put(req->private_bio);
- req->private_bio = ERR_PTR(error);
+ req->private_bio = ERR_PTR(bio->bi_error);
This introduces an access after free, because it was not obvious that
req->private_bio == bio.
The impact of that was mostly unnoticeable, because we only use that value
in a multiple-failure case, and even then map any "unexpected" error
code to EIO, so worst case we could potentially mask a more specific
error with EIO in a multiple failure case.
Unless the pointed to memory region was unmapped, as is the case with
CONFIG_DEBUG_PAGEALLOC, in which case this results in
BUG: unable to handle kernel paging request
v4.13-rc1~70^2~75 4e4cbee93d block: switch bios to blk_status_t
changes it further to
bio_put(req->private_bio);
req->private_bio = ERR_PTR(blk_status_to_errno(bio->bi_status));
And blk_status_to_errno() now contains a WARN_ON_ONCE() for unexpected
values, which catches this "sometimes", if the memory has been reused
quickly enough for other things.
Should also go into stable since 4.3, with the trivial change around 4.13.
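A minimal sketch of the fix around the 4.13+ code:
	/* read the completion status before bio_put() drops what may be
	 * the last reference to this very same bio */
	blk_status_t status = bio->bi_status;

	bio_put(req->private_bio);
	req->private_bio = ERR_PTR(blk_status_to_errno(status));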
Cc: stable@vger.kernel.org
Fixes: 4246a0b63b block: add a bi_error field to struct bio
Reported-by: Sarah Newman <srn@prgmr.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 891f6a726c upstream.
In the critical section cleanup we must not mess with r1. For march=z9
or older, larl + ex (instead of exrl) are used with r1 as a temporary
register. This can clobber r1 in several interrupt handlers. Fix this by
using r11 as a temp register. r11 is being saved by all callers of
cleanup_critical.
Fixes: 6dd85fbb87 ("s390: move expoline assembler macros to a header")
Cc: stable@vger.kernel.org #v4.16
Reported-by: Oliver Kurz <okurz@suse.com>
Reported-by: Petr Tesařík <ptesarik@suse.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63ce3c384d upstream.
SPC5r17 states that the contents of the ADDITIONAL LENGTH field are not
altered based on the allocation length, so always calculate and pack the
full key list length even if the list itself is truncated.
According to Maged:
Yes it fixes the "Storage Spaces Persistent Reservation" test in the
Windows 2016 Server Failover Cluster validation suites when having
many connections that result in more than 8 registrations. I tested
your patch on 4.17 with iblock.
This behaviour can be tested using the libiscsi PrinReadKeys.Truncate test.
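A minimal sketch of the rule (field and list names approximate the PR
registration code):
	list_for_each_entry(pr_reg, &pr_tmpl->registration_list, pr_reg_list) {
		/* only pack the key while it still fits in the
		 * allocation length */
		if (off + 8 <= cmd->data_length) {
			put_unaligned_be64(pr_reg->pr_res_key, &buf[off]);
			off += 8;
		}
		/* but always count it towards ADDITIONAL LENGTH */
		add_len += 8;
	}

	put_unaligned_be32(add_len, &buf[4]);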
Cc: stable@vger.kernel.org
Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Maged Mokhtar <mmokhtar@petasan.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59b433c825 upstream.
The driver fails to set the correct queue depth for native devices, due to
failing to set the device type prior to calling aac_set_safw_target_qd().
This results in slave configure setting the queue depth to 1.
This causes around 30% performance degradation. Fixed by setting the dev
type before trying to set queue depth.
Reported-by: Steve Best <sbest@redhat.com>
Fixes: 0bcb45fb20 ("scsi: aacraid: Add helper function to set queue depth")
cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26b5b874af upstream.
As Al Viro noted in commit 128394eff3 ("sg_write()/bsg_write() is not fit
to be called under KERNEL_DS"), sg improperly accesses userspace memory
outside the provided buffer, permitting kernel memory corruption via
splice(). But it doesn't just do it on ->write(), also on ->read().
As a band-aid, make sure that the ->read() and ->write() handlers can not
be called in weird contexts (kernel context or credentials different from
file opener), like for ib_safe_file_access().
If someone needs to use these interfaces from different security contexts,
a new interface should be written that goes through the ->ioctl() handler.
I've mostly copypasted ib_safe_file_access() over as sg_safe_file_access()
because I couldn't find a good common header - please tell me if you know a
better way.
[mkp: s/_safe_/_check_/]
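A minimal sketch of the helper as described (messages and exact checks
in the merged code may differ):
static int sg_check_file_access(struct file *filp, const char *caller)
{
	/* reject credentials different from those of the file opener */
	if (filp->f_cred != current_real_cred()) {
		pr_err_once("%s: process %d (%s) changed security contexts after opening file descriptor, this is not allowed.\n",
			    caller, task_tgid_vnr(current), current->comm);
		return -EPERM;
	}
	/* reject calls made under KERNEL_DS / kernel context */
	if (uaccess_kernel()) {
		pr_err_once("%s: process %d (%s) is using deprecated kernel space pointers\n",
			    caller, task_tgid_vnr(current), current->comm);
		return -EACCES;
	}
	return 0;
}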
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf4d418e65 upstream.
'err' is used as a NUL-terminated string, but using strncpy() with the length
equal to the buffer size may result in lack of the termination:
kernel/trace/trace_events_hist.c: In function 'hist_err_event':
kernel/trace/trace_events_hist.c:396:3: error: 'strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation]
strncpy(err, var, MAX_FILTER_STR_VAL);
This changes it to use the safer strscpy() instead.
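The change, in essence:
	/* before: may leave 'err' without a terminating NUL */
	strncpy(err, var, MAX_FILTER_STR_VAL);

	/* after: bounded copy that always NUL-terminates and returns
	 * -E2BIG when the source had to be truncated */
	strscpy(err, var, MAX_FILTER_STR_VAL);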
Link: http://lkml.kernel.org/r/20180328140920.2842153-1-arnd@arndb.de
Cc: stable@vger.kernel.org
Fixes: f404da6e1d ("tracing: Add 'last error' error facility for hist triggers")
Acked-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cd5fe22d9 upstream.
Currently, there is nothing in amdgpu that actually uses these structs
other than amdgpu_acpi.c. Additionally, since we're about to start
saving the correct ACPI handle to use for calling ATIF in this struct
this saves us from having to handle making sure that the acpi_handle
(and by proxy, the type definition for acpi_handle and all of the other
acpi headers) doesn't need to be included within the amdgpu_drv struct
itself. This follows the example set by amdgpu_atpx_handler.c.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 673b427166 upstream.
A hooking API was implemented for 4.17 in fa93854f7a followed
by hooks for Thinkpad laptops in 2801b9683f. The Thinkpad
drivers did not support the Thinkpad 13 and the hooking API crashes
on unsupported batteries by altering a list of hooks during unsafe
iteration. Thus, Thinkpad 13 laptops could no longer boot.
Additionally, a lock was kept in place and debugging information was
printed out of order.
Fixes: fa93854f7a (battery: Add the battery hooking API)
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0d5f3b69a upstream.
Commit 5088814a6e (ACPICA: AML parser: attempt to continue loading
table after error) unintentionally added leading newlines to error
messages emitted by ACPICA which caused unexpected things to be
printed to the kernel log. Drop these newlines (which effectively
reverts the part of commit 5088814a6e adding them).
Fixes: 5088814a6e (ACPICA: AML parser: attempt to continue loading table after error)
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: 4.17+ <stable@vger.kernel.org> # 4.17+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26112ddc25 upstream.
It is reported that commit c62ec4610c (PM / core: Fix direct_complete
handling for devices with no callbacks) introduced a system suspend
regression on Samsung 305V4A by allowing a PCI bridge (not a PCIe
port) to stay in D3 over suspend-to-RAM, which is a side effect of
setting power.direct_complete for the children of that bridge that
have no PM callbacks.
On the majority of systems PCI bridges are not allowed to be
runtime-suspended (the power/control sysfs attribute is set to "on"
for them by default), but user space can change that setting and if
it does so and a given bridge has no children with PM callbacks, the
direct_complete optimization will be applied to it and it will stay
in suspend over system suspend. Apparently, that confuses the
platform firmware on the affected machine and that may very well
happen elsewhere, so avoid the direct_complete optimization for
PCI bridges with no drivers (if there is a driver, it should take
care of the PM handling) on suspend-to-RAM altogether (that should
not matter for suspend-to-idle as platform firmware is not involved
in it).
Fixes: c62ec4610c (PM / core: Fix direct_complete handling for devices with no callbacks)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199941
Reported-by: n0000b.n000b@gmail.com
Tested-by: n0000b.n000b@gmail.com
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df07101e1c upstream.
According to the reference manual the shp_2_mcu / mcu_2_shp
scripts must be used for devices connected through the SPBA.
This fixes an issue we saw with DMA transfers.
Sometimes the SPI controller RX FIFO was not empty after a DMA
transfer and the driver got stuck in the next PIO transfer when
it read one word more than expected.
commit dd4b487b32 ("ARM: dts: imx6: Use correct SDMA script
for SPI cores") is fixing the same issue but only for SPI1 - 4.
Fixes: 677940258d ("ARM: dts: imx6q: enable dma for ecspi5")
Signed-off-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adc972c5b8 upstream.
When the depth of a chain is bigger than NFT_JUMP_STACK_SIZE, nft_do_chain()
crashes. But there is no need to crash hard here.
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc6ddbecce upstream.
This needs to use xt_unregister_targets(), else the new revision is left
on the list, which then causes the list to point to a target struct that
has been freed.
Fixes: 472a73e007 ("netfilter: xt_conntrack: Support bit-shifting for CONNMARK & MARK targets.")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cede24d1b2 upstream.
In commit 47b7e7f828, this bit was removed at the same time the
RT6_LOOKUP_F_IFACE flag was removed. However, it is needed when
link-local addresses are used, which is a very common case: when
packets are routed, neighbor solicitations are done using link-local
addresses. For example, the following neighbor solicitation is not
matched by "-m rpfilter":
IP6 fe80::5254:33ff:fe00:1 > ff02::1:ff00:3: ICMP6, neighbor
solicitation, who has 2001:db8::5254:33ff:fe00:3, length 32
Commit 47b7e7f828 doesn't quite explain why we shouldn't use
RT6_LOOKUP_F_IFACE in the rpfilter case. I suppose the interface check
later in the function would make it redundant. However, the remainder
of the routing code is using RT6_LOOKUP_F_IFACE when there is no
source address (which matches rpfilter's case with a non-unicast
destination, like with neighbor solicitation).
Signed-off-by: Vincent Bernat <vincent@bernat.im>
Fixes: 47b7e7f828 ("netfilter: don't set F_IFACE on ipv6 fib lookups")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dccc4d517 upstream.
While Bspec doesn't list a specific sequence for turning off the DP port
on g4x we are getting an underrun if the port is disabled in the
.disable() hook. Looks like the pipe stops when the port stops, and by
that time the plane disable may not have completed yet. Also the plane(s)
seem to end up in some wonky state when this happens as they also signal
another underrun immediately after we turn them back on during the next
enable sequence.
We could add a vblank wait in .disable() to avoid wedging the planes,
but I assume we're still tripping up the pipe in some way. So it seems
better to me to just follow the ILK+ sequence and turn off the DP port
in .post_disable() instead. This sequence doesn't seem to suffer from
this problem. Could be it was always the intended sequence for DP and
the gen4 bspec was just never updated to include it.
Originally we used the bad sequence even on ilk+, but I changed that
in commit 08aff3fe26 ("drm/i915: Move DP port disable to post_disable
for pch platforms") as it was causing issues on those platforms as well.
I left out g4x then only because I didn't have the hardware to test it.
Now that I do it's fairly clear that the ilk+ sequence is also the
right choice for g4x.
v2: Fix whitespace fail (Jani)
Mention the ilk+ commit (Jani)
Cc: stable@vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180613160553.11664-2-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 51a9f6dfc0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dc055c9cc upstream.
On i965/g4x IIR is edge triggered. So in order for IIR to notice that
there is still a pending interrupt we have to force an edge in ISR.
For the ISR/IIR pipe event bits we can do that by temporarily
clearing all the PIPESTAT enable bits when we ack the status bits.
This will force the ISR pipe event bit low, and it can then go back
high when we restore the PIPESTAT enable bits.
This avoids the following race:
1. stat = read(PIPESTAT)
2. an enabled PIPESTAT status bit goes high
3. write(PIPESTAT, enable|stat);
4. write(IIR, PIPE_EVENT)
The end result is IIR==0 and ISR!=0. This can lead to nasty
vblank wait/flip_done timeouts if another interrupt source
doesn't trick us into looking at the PIPESTAT status bits despite
the IIR PIPE_EVENT bit being low.
Before i965 IIR was level triggered so this problem can't actually
happen there. And curiously VLV/CHV went back to the level triggered
scheme as well. But for simplicity we'll use the same i965/g4x
compatible code for all platforms.
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106033
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105225
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106030
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180611200258.27121-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 132c27c97c)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4de9f38bb2 upstream.
Currently, amdgpu_do_flip() spinlocks crtc->dev->event_lock and
releases it only after committing updates to the stream.
dc_commit_updates_for_stream() should be moved out of
spinlock for the below reasons:
1. event_lock is supposed to protect access to acrtc->pflip_status _only_
2. dc_commit_updates_for_stream() has potential sleeps and it is also
not appropriate to be in an atomic state
for such long sequences of code.
Signed-off-by: Shirish S <shirish.s@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe2a196529 upstream.
This fixes a regression I accidentally introduced that was picked up by
kasan, where we were checking the CRTC atomic states after DRM's helpers
had already freed them. Example:
==================================================================
BUG: KASAN: use-after-free in amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
Read of size 1 at addr ffff8803a697b071 by task kworker/u16:0/7
CPU: 7 PID: 7 Comm: kworker/u16:0 Tainted: G O 4.18.0-rc1Lyude-Upstream+ #1
Hardware name: HP HP ZBook 15 G4/8275, BIOS P70 Ver. 01.21 05/02/2018
Workqueue: events_unbound commit_work [drm_kms_helper]
Call Trace:
dump_stack+0xc1/0x169
? dump_stack_print_info.cold.1+0x42/0x42
? kmsg_dump_rewind_nolock+0xd9/0xd9
? printk+0x9f/0xc5
? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
print_address_description+0x6c/0x23c
? amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
kasan_report.cold.6+0x241/0x2fd
amdgpu_dm_atomic_commit_tail.cold.50+0x13d/0x15a [amdgpu]
? commit_planes_to_stream.constprop.45+0x13b0/0x13b0 [amdgpu]
? cpu_load_update_active+0x290/0x290
? finish_task_switch+0x2bd/0x840
? __switch_to_asm+0x34/0x70
? read_word_at_a_time+0xe/0x20
? strscpy+0x14b/0x460
? drm_atomic_helper_wait_for_dependencies+0x47d/0x7e0 [drm_kms_helper]
commit_tail+0x96/0xe0 [drm_kms_helper]
process_one_work+0x88a/0x1360
? create_worker+0x540/0x540
? __sched_text_start+0x8/0x8
? move_queued_task+0x760/0x760
? call_rcu_sched+0x20/0x20
? vsnprintf+0xcda/0x1350
? wait_woken+0x1c0/0x1c0
? mutex_unlock+0x1d/0x40
? init_timer_key+0x190/0x230
? schedule+0xea/0x390
? __schedule+0x1ea0/0x1ea0
? need_to_create_worker+0xe4/0x210
? init_worker_pool+0x700/0x700
? try_to_del_timer_sync+0xbf/0x110
? del_timer+0x120/0x120
? __mutex_lock_slowpath+0x10/0x10
worker_thread+0x196/0x11f0
? flush_rcu_work+0x50/0x50
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __switch_to_asm+0x34/0x70
? __switch_to_asm+0x40/0x70
? __schedule+0x7d6/0x1ea0
? migrate_swap_stop+0x850/0x880
? __sched_text_start+0x8/0x8
? save_stack+0x8c/0xb0
? kasan_kmalloc+0xbf/0xe0
? kmem_cache_alloc_trace+0xe4/0x190
? kthread+0x98/0x390
? ret_from_fork+0x35/0x40
? ret_from_fork+0x35/0x40
? deactivate_slab.isra.67+0x3c4/0x5c0
? kthread+0x98/0x390
? kthread+0x98/0x390
? set_track+0x76/0x120
? schedule+0xea/0x390
? __schedule+0x1ea0/0x1ea0
? wait_woken+0x1c0/0x1c0
? kasan_unpoison_shadow+0x30/0x40
? parse_args.cold.15+0x17a/0x17a
? flush_rcu_work+0x50/0x50
kthread+0x2d4/0x390
? kthread_create_worker_on_cpu+0xc0/0xc0
ret_from_fork+0x35/0x40
Allocated by task 1124:
kasan_kmalloc+0xbf/0xe0
kmem_cache_alloc_trace+0xe4/0x190
dm_crtc_duplicate_state+0x78/0x130 [amdgpu]
drm_atomic_get_crtc_state+0x147/0x410 [drm]
page_flip_common+0x57/0x230 [drm_kms_helper]
drm_atomic_helper_page_flip+0xa6/0x110 [drm_kms_helper]
drm_mode_page_flip_ioctl+0xc4b/0x10a0 [drm]
drm_ioctl_kernel+0x1d4/0x260 [drm]
drm_ioctl+0x433/0x920 [drm]
amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
do_vfs_ioctl+0x1a1/0x13d0
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x6f/0xb0
do_syscall_64+0x147/0x440
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Freed by task 1124:
__kasan_slab_free+0x12e/0x180
kfree+0x92/0x1a0
drm_atomic_state_default_clear+0x315/0xc40 [drm]
__drm_atomic_state_free+0x35/0xd0 [drm]
drm_atomic_helper_update_plane+0xac/0x350 [drm_kms_helper]
__setplane_internal+0x2d6/0x840 [drm]
drm_mode_cursor_universal+0x41e/0xbe0 [drm]
drm_mode_cursor_common+0x49f/0x880 [drm]
drm_mode_cursor_ioctl+0xd8/0x130 [drm]
drm_ioctl_kernel+0x1d4/0x260 [drm]
drm_ioctl+0x433/0x920 [drm]
amdgpu_drm_ioctl+0x11d/0x290 [amdgpu]
do_vfs_ioctl+0x1a1/0x13d0
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x6f/0xb0
do_syscall_64+0x147/0x440
entry_SYSCALL_64_after_hwframe+0x44/0xa9
The buggy address belongs to the object at ffff8803a697b068
which belongs to the cache kmalloc-1024 of size 1024
The buggy address is located 9 bytes inside of
1024-byte region [ffff8803a697b068, ffff8803a697b468)
The buggy address belongs to the page:
page:ffffea000e9a5e00 count:1 mapcount:0 mapping:ffff88041e00efc0 index:0x0 compound_mapcount: 0
flags: 0x8000000000008100(slab|head)
raw: 8000000000008100 ffffea000ecbc208 ffff88041e000c70 ffff88041e00efc0
raw: 0000000000000000 0000000000170017 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff8803a697af00: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8803a697af80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8803a697b000: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb
^
ffff8803a697b080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8803a697b100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
So, we fix this by counting the number of CRTCs this atomic commit disabled
early on in the function before their atomic states have been freed, then use
that count later to do the appropriate number of RPM puts at the end of the
function.
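A hedged sketch of that idea with hypothetical names (not the exact
amdgpu_dm code): the disabled-CRTC count is taken while the old/new CRTC
states are still valid, and the runtime PM puts are issued from that
count after the helpers may have freed the states.

#include <drm/drmP.h>
#include <drm/drm_atomic.h>
#include <linux/pm_runtime.h>

static void commit_tail_sketch(struct drm_device *dev,
			       struct drm_atomic_state *state)
{
	struct drm_crtc *crtc;
	struct drm_crtc_state *old_state, *new_state;
	int i, crtcs_disabled = 0;

	/* Count before DRM's helpers free the atomic states. */
	for_each_oldnew_crtc_in_state(state, crtc, old_state, new_state, i) {
		if (old_state->active && !new_state->active)
			crtcs_disabled++;
	}

	/* ... commit work; the CRTC states may be freed from here on ... */

	while (crtcs_disabled--)
		pm_runtime_put_autosuspend(dev->dev);
}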
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Cc: stable@vger.kernel.org
Fixes: 97028037a3 ("drm/amdgpu: Grab/put runtime PM references in atomic_commit_tail()")
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: Michel Dänzer <michel@daenzer.net>
Reported-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7303b39e46 upstream.
Even BOs with AMDGPU_GEM_CREATE_NO_CPU_ACCESS may end up at least
partially in CPU visible VRAM, in particular when all VRAM is visible.
v2:
* Don't take VRAM mgr spinlock, not needed (Christian König)
* Make loop logic simpler and clearer.
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9fda24804 upstream.
We've had a number of users report failures to detect and light up
display with DC with LVDS and VGA. These connector types are not
currently supported with DC. I'd like to add support but unfortunately
don't have a system with LVDS or VGA available.
In order not to cause regressions we should probably fall back to the
non-DC driver for ASICs that support VGA and LVDS.
These ASICs are:
* Bonaire
* Kabini
* Kaveri
* Mullins
ASIC support can always be force enabled with amdgpu.dc=1
v2: Keep Hawaii on DC
v3: Added Mullins to the list
Cc: stable@vger.kernel.org
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit be1c63c801 upstream.
When doing a modeset where the sink is transitioning from D3 to D0, it
would sometimes be possible for the initial power_up_phy() to start
timing out. This would only be observed if the last action before the
sink went into D3 mode was intel_dp_sink_dpms(DRM_MODE_DPMS_OFF). We
originally thought this might be an issue with us accidentally shutting
off the aux block when putting the sink into D3, but since the DP spec
mandates that sinks must wake up within 1ms while we have 100ms to
respond to an ESI irq, this didn't really add up. Turns out that the
problem is more subtle than that:
It turns out that the timeout is from us not enabling DPMS on the MST
hub before actually trying to initiate sideband communications. This
would cause the first sideband communication (power_up_phy()), to start
timing out because the sink wasn't ready to respond. Afterwards, we
would call intel_dp_sink_dpms(DRM_MODE_DPMS_ON) in
intel_ddi_pre_enable_dp(), which would actually result in waking up the
sink so that sideband requests would work again.
Since DPMS is what lets us actually bring the hub up into a state where
sideband communications become functional again, we just need to make
sure to enable DPMS on the display before attempting to perform sideband
communications.
Changes since v1:
- Remove comment above if (!intel_dp->is_mst) - vsryjala
- Move intel_dp_sink_dpms() for MST into intel_dp_post_disable_mst() to
keep enable/disable paths symmetrical
- Improve commit message - dhnkrn
Changes since v2:
- Only send DPMS off when we're disabling the last sink, and only send
DPMS on when we're enabling the first sink - dhnkrn
Changes since v3:
- Check against is_mst, not intel_dp->is_mst - dhnkrn/vsyrjala
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Fixes: ad260ab32a ("drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.")
Link: https://patchwork.freedesktop.org/patch/msgid/20180407011053.22437-1-lyude@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97028037a3 upstream.
So, unfortunately I recently made the discovery that in the upstream
kernel, the only reason that amdgpu is not currently suffering from
issues with runtime PM putting the GPU into suspend while it's driving
displays is due to the fact that on most prime systems, we have sound
devices associated with the GPU that hold their own runtime PM ref for
the GPU.
What this means however, is that in the event that there isn't any kind
of sound device active (which can easily be reproduced by building a
kernel with sound drivers disabled), the GPU will fall asleep even when
there are displays active. This appears to be in part due to the fact that
amdgpu has not actually ever relied on its rpm_idle() function to be
the only thing keeping it running, and normally grabs its own power
references whenever there are displays active (as can be seen with the
original pre-DC codepath in amdgpu_display_crtc_set_config() in
amdgpu_display.c). This means it's very likely that this bug was
introduced during the switch over to DC.
So to fix this, we start grabbing runtime PM references every time we
enable a previously disabled CRTC in atomic_commit_tail(). This appears
to be the correct solution, as it matches up with what i915 does in
i915/intel_runtime_pm.c.
The one side effect of this is that we ignore the variable that the
pre-DC code used to use for tracking when it needed runtime PM refs,
adev->have_disp_power_ref. This is mainly because there's no way for a
driver to tell whether or not all of its CRTCs are enabled or disabled
when we've begun committing an atomic state, as there may be CRTC
commits happening in parallel that aren't contained within the atomic
state being committed. So, it's safer to just get/put a reference for
each CRTC being enabled or disabled in the new atomic state.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20dcff436e upstream.
After the commit
7d8905d064 ("serial: 8250_pci: Enable device after we check black list")
pure serial multi-port cards, such as CH355, got blacklisted and thus
are no longer enumerated. Previously, it seems, blacklisting them was
done on purpose to shut up pciserial_init_one() about record duplication.
So, remove the entries from blacklist in order to get cards enumerated.
Fixes: 7d8905d064 ("serial: 8250_pci: Enable device after we check black list")
Reported-by: Matt Turner <mattst88@gmail.com>
Cc: Sergej Pupykin <ml@sergej.pp.ru>
Cc: Alexandr Petrenko <petrenkoas83@gmail.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b02ec67a8e upstream.
Interrupts are ignored if no event bit is set in the status register,
and this breaks the buffer interface. No data is shown when
running "iio_generic_buffer -n mma8451 -a" and interrupt counts go
crazy.
Fix by not returning IRQ_NONE if DRDY is set.
Fixes: 605f72de13 ("iio: accel: mma8452: improvements to handle
multiple events")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebec3f8f52 upstream.
syzbot is reporting stalls at __process_echoes() [1]. This is because
ldata->echo_commit < ldata->echo_tail becomes true for some reason, and
the discard loop then serves as an almost infinite loop. This patch tries
to avoid falling into the ldata->echo_commit < ldata->echo_tail situation
by accessing the echo_* variables more carefully.
Since reset_buffer_flags() is called without output_lock held, it should
not touch echo_* variables. And omit a call to reset_buffer_flags() from
n_tty_open() by using vzalloc().
Since add_echo_byte() is called without output_lock held, it needs memory
barrier between storing into echo_buf[] and incrementing echo_head counter.
echo_buf() needs corresponding memory barrier before reading echo_buf[].
Failing to handle the possibility of a not-yet-stored multi-byte operation
might be the reason for falling into the ldata->echo_commit < ldata->echo_tail
situation, for if I do WARN_ON(ldata->echo_commit == tail + 1) prior to
echo_buf(ldata, tail + 1), the WARN_ON() fires.
Also, explicitly mask with the buffer size in the former "while" loop, and
use ldata->echo_commit > tail in the latter "while" loop.
[1] https://syzkaller.appspot.com/bug?id=17f23b094cd80df750e5b0f8982c521ee6bcbf40
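A rough sketch of the ordering the patch relies on, using a simplified
hypothetical ring buffer rather than the real n_tty_data: the producer
publishes the byte before advancing echo_head, and the reader issues the
pairing read barrier before looking at the byte.

#include <linux/types.h>
#include <asm/barrier.h>

#define ECHO_BUF_SIZE 4096              /* hypothetical buffer size */

static unsigned char echo_buf[ECHO_BUF_SIZE];
static size_t echo_head;

static void add_echo_byte_sketch(unsigned char c)
{
	echo_buf[echo_head & (ECHO_BUF_SIZE - 1)] = c;
	smp_wmb();      /* store the byte before publishing the new head */
	echo_head++;
}

static unsigned char echo_buf_sketch(size_t i)
{
	smp_rmb();      /* pairs with the smp_wmb() in the producer */
	return echo_buf[i & (ECHO_BUF_SIZE - 1)];
}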
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: syzbot <syzbot+108696293d7a21ab688f@syzkaller.appspotmail.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68816e16b4 upstream.
According to UCSI Specification, Connector Change Event only
means a change in the Connector Status and Operation Mode
fields of the STATUS data structure. So any other change
should create another event.
Unfortunately on some platforms the firmware acting as PPM
(platform policy manager - usually embedded controller
firmware) still does not report any other status changes if
there is a connector change event. So if the connector power
or data role was changed when a device was plugged to the
connector, the driver does not get any indication about
that. The port will show wrong roles if that happens.
To fix the issue, always check the data and power roles
together with a connector change event.
Fixes: c1b0bc2dab ("usb: typec: Add support for UCSI interface")
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f9f9d168c upstream.
This fixes an issue where the driver fails with an error:
ioremap error for 0x3f799000-0x3f79a000, requested 0x2, got 0x0
On some platforms the UCSI ACPI mailbox SystemMemory
Operation Region may be setup before the driver has been
loaded. That will lead into the driver failing to map the
mailbox region, as it has been already marked as write-back
memory. acpi_os_ioremap() for x86 uses ioremap_cache()
unconditionally.
When the issue happens, the embedded controller has a
pending query event for the UCSI notification right after
boot-up which causes the operation region to be setup before
UCSI driver has been loaded.
The fix is to notify acpi core that the driver is about to
access memory region which potentially overlaps with an
operation region right before mapping it.
acpi_release_memory() will check if the memory has already
been setup (mapped) by acpi core, and deactivate it (unmap)
if it has. The driver is then able to map the memory with
ioremap_nocache() and set the memtype to uncached for the
region.
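Roughly, the probe path then looks like the sketch below. The names are
abbreviated and the acpi_release_memory() signature shown here is my
assumption based on the description, not a verified prototype: ask ACPI
to drop any existing mapping of the region first, then map it uncached.

#include <linux/acpi.h>
#include <linux/io.h>
#include <linux/ioport.h>

static void __iomem *map_ucsi_mailbox_sketch(struct device *dev,
					     struct resource *res)
{
	acpi_status status;

	/* Assumed helper from the follow-up ACPI patch: deactivate the
	 * SystemMemory Operation Region if ACPI has already mapped it. */
	status = acpi_release_memory(ACPI_HANDLE(dev), res, 1);
	if (ACPI_FAILURE(status))
		return NULL;

	/* Now the driver can establish its own uncached mapping. */
	return ioremap_nocache(res->start, resource_size(res));
}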
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Fixes: 8243edf441 ("usb: typec: ucsi: Add ACPI driver")
Cc: stable@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2d2e3c46b upstream.
Sometimes memory resource may be overlapping with
SystemMemory Operation Region by design, for example if the
memory region is used as a mailbox for communication with a
firmware in the system. One occasion of such mailboxes is
USB Type-C Connector System Software Interface (UCSI).
With regions like that, it is important that the driver is
able to map the memory with the requirements it has. For
example, the driver should be allowed to map the memory as
non-cached memory. However, if the operation region has been
accessed before the driver has mapped the memory, the memory
has been marked as write-back by the time the driver is
loaded. That means the driver will fail to map the memory
if it expects non-cached memory.
To work around the problem, introduce a helper that
drivers can use to temporarily deactivate (unmap)
SystemMemory Operation Regions that overlap with their
IO memory.
Fixes: 8243edf441 ("usb: typec: ucsi: Add ACPI driver")
Cc: stable@vger.kernel.org
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5a4f93511 upstream.
The port->logbuffer_head may be wrong if two processes enter
_tcpm_log at almost the same time. If the 2nd process enters _tcpm_log
before the 1st process updates the index, the 2nd process will
not allocate a logbuffer; when the 2nd process then tries to use the log
buffer, the index has already been updated by the 1st process, so it gets
a NULL pointer for the updated logbuffer, with an error message like below:
tcpci 0-0050: Log buffer index 6 is NULL
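The underlying pattern, sketched generically with hypothetical names (not
the actual tcpm.c code): the slot allocation and the head-index advance
happen under one lock, so a second logger can never observe an advanced
index whose buffer has not been allocated yet.

#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/string.h>

#define LOG_ENTRIES     32              /* hypothetical sizes */
#define LOG_ENTRY_SIZE  128

static DEFINE_MUTEX(log_lock);
static char *logbuffer[LOG_ENTRIES];
static int logbuffer_head;

static void log_entry_sketch(const char *msg)
{
	mutex_lock(&log_lock);

	if (!logbuffer[logbuffer_head])
		logbuffer[logbuffer_head] =
			kzalloc(LOG_ENTRY_SIZE, GFP_KERNEL);

	if (logbuffer[logbuffer_head]) {
		strscpy(logbuffer[logbuffer_head], msg, LOG_ENTRY_SIZE);
		logbuffer_head = (logbuffer_head + 1) % LOG_ENTRIES;
	}

	mutex_unlock(&log_lock);
}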
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jun Li <jun.li@nxp.com>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8760675932 upstream.
dwc2_get_ls_map() uses ttport to index into the
bitmap if we're on a multi_tt hub. But the bitmaps are indexed
from 0 to (hub->maxchild - 1), while ttport runs from
1 to hub->maxchild. This causes an invalid memory access
when ttport equals hub->maxchild.
Without this patch, I can easily hit a kernel panic by
connecting a low-speed USB mouse to the highest-numbered port of an FE2.1
multi-tt hub (1a40:0201) on an rk3288 platform.
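The off-by-one in isolation, as a standalone sketch with hypothetical
names (the real code lives in dwc2_get_ls_map()): ttport counts from 1,
so it has to be shifted down before indexing a 0-based per-port array.

#include <stddef.h>

#define MAX_PORTS 16                    /* hypothetical upper bound */

struct tt_maps_sketch {
	int maxchild;                   /* number of downstream ports */
	unsigned long *map[MAX_PORTS];  /* one map per port, 0-based */
};

static unsigned long *get_ls_map_sketch(struct tt_maps_sketch *tt, int ttport)
{
	if (ttport < 1 || ttport > tt->maxchild)
		return NULL;

	/* ttport is 1-based while the array is 0-based; using map[ttport]
	 * would read one element past the end when ttport == maxchild. */
	return tt->map[ttport - 1];
}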
Fixes: 9f9f09b048 ("usb: dwc2: host: Totally redo the microframe scheduler")
Cc: <stable@vger.kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: William Wu <william.wu@rock-chips.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f83982338 upstream.
Silicon Labs defines alternative VID/PID pairs for some chips that when
used will automatically install drivers for Windows users without manual
intervention. Unfortunately, these IDs are not recognized by the Linux
module, so using these IDs improves user experience on one platform but
degrades it on Linux. This patch addresses this problem.
Signed-off-by: Karoly Pados <pados@pados.hu>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd23a72698 upstream.
In vbg_misc_device_ioctl(), the header of the ioctl argument is copied from
the userspace pointer 'arg' and saved to the kernel object 'hdr'. Then the
'version', 'size_in', and 'size_out' fields of 'hdr' are verified.
Before this commit, after the checks a buffer for the entire request would
be allocated and then all data including the verified header would be
copied from the userspace 'arg' pointer again.
Given that the 'arg' pointer resides in userspace, a malicious userspace
process can race to change the data pointed to by 'arg' between the two
copies. By doing so, the user can bypass the verifications on the ioctl
argument.
This commit fixes this by using the already checked copy of the header
to fill the header part of the allocated buffer and only copying the
remainder of the data from userspace.
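The general shape of that fix, as a hedged sketch with hypothetical names
and limits (not the vboxguest source): copy and validate the header once,
reuse it when filling the allocated buffer, and only transfer the
remaining bytes from userspace in the second copy.

#include <linux/err.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/types.h>
#include <linux/uaccess.h>

#define MAX_REQ_SIZE_SKETCH (16 * 1024) /* hypothetical upper bound */

struct req_hdr_sketch {
	u32 version;
	u32 size_in;
	u32 size_out;
};

static void *fetch_request_sketch(void __user *arg)
{
	struct req_hdr_sketch hdr;
	void *buf;

	if (copy_from_user(&hdr, arg, sizeof(hdr)))
		return ERR_PTR(-EFAULT);
	if (hdr.size_in < sizeof(hdr) || hdr.size_in > MAX_REQ_SIZE_SKETCH)
		return ERR_PTR(-EINVAL);

	buf = kmalloc(hdr.size_in, GFP_KERNEL);
	if (!buf)
		return ERR_PTR(-ENOMEM);

	/* Reuse the already-validated header; never re-read it from
	 * userspace, so a racing writer cannot bypass the checks above. */
	memcpy(buf, &hdr, sizeof(hdr));
	if (copy_from_user((u8 *)buf + sizeof(hdr),
			   (u8 __user *)arg + sizeof(hdr),
			   hdr.size_in - sizeof(hdr))) {
		kfree(buf);
		return ERR_PTR(-EFAULT);
	}
	return buf;
}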
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Cc: Justin Forbes <jmforbes@linuxtx.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a685557fbb upstream.
Discards issued to a DM thin device can complete to userspace (via
fstrim) _before_ the metadata changes associated with the discards are
reflected in the thinp superblock (e.g. free blocks). As such, if a
user constructs a test that loops repeatedly over these steps, block
allocation can fail due to discards not having completed yet:
1) fill thin device via filesystem file
2) remove file
3) fstrim
From initial report, here:
https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html
"The root cause of this issue is that dm-thin will first remove
mapping and increase corresponding blocks' reference count to prevent
them from being reused before DISCARD bios get processed by the
underlying layers. However, increasing blocks' reference count could
also increase the nr_allocated_this_transaction in struct sm_disk
which makes smd->old_ll.nr_allocated +
smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks.
In this case, alloc_data_block() will never commit metadata to reset
the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
always return an underflow value."
While there is room for improvement to the space-map accounting that
thinp is making use of: the reality is this test is inherently racy and
will result in the previous iteration's fstrim's discard(s) completing
vs concurrent block allocation, via dd, in the next iteration of the
loop.
No amount of space map accounting improvements will be able to allow
users to use a block before a discard of that block has completed.
So the best we can really do is allow DM thinp to gracefully handle such
aggressive use of all the pool's data by degrading the pool into
out-of-data-space (OODS) mode. We _should_ get that behaviour already
(if space map accounting didn't falsely cause alloc_data_block() to
believe free space was available), but short of that we handle the
current reality that dm_pool_alloc_data_block() can return -ENOSPC.
Reported-by: Dennis Yang <dennisyang@qnap.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0da74120c5 upstream.
If a user is accessing a file in selinuxfs with a pointer to a userspace
buffer that is backed by e.g. a userfaultfd, the userspace access can
stall indefinitely, which can block fsi->mutex if it is held.
For sel_read_policy(), remove the locking, since this method doesn't seem
to access anything that requires locking.
For sel_read_bool(), move the user access below the locked region.
For sel_write_bool() and sel_commit_bools_write(), move the user access
up above the locked region.
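The reordering for the write path, sketched with hypothetical names (not
the actual selinuxfs code): the userspace buffer is copied while no lock
is held, and only the work on in-kernel state runs under the mutex, so a
stalled userfaultfd page cannot pin the mutex.

#include <linux/err.h>
#include <linux/mutex.h>
#include <linux/slab.h>
#include <linux/string.h>

static DEFINE_MUTEX(state_mutex);       /* stand-in for fsi->mutex */

static ssize_t write_bool_sketch(const char __user *ubuf, size_t count)
{
	char *page;
	ssize_t ret;

	/* Touching userspace memory can stall indefinitely (e.g. on a
	 * userfaultfd-backed page), so do it before taking the lock. */
	page = memdup_user_nul(ubuf, count);
	if (IS_ERR(page))
		return PTR_ERR(page);

	mutex_lock(&state_mutex);
	ret = count;    /* ... parse page and update kernel state here ... */
	mutex_unlock(&state_mutex);

	kfree(page);
	return ret;
}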
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: removed an unused variable in sel_read_policy()]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 124049decb upstream.
There is a kernel panic that is triggered when reading /proc/kpageflags
on the kernel booted with kernel parameter 'memmap=nn[KMG]!ss[KMG]':
BUG: unable to handle kernel paging request at fffffffffffffffe
PGD 9b20e067 P4D 9b20e067 PUD 9b210067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 2 PID: 1728 Comm: page-types Not tainted 4.17.0-rc6-mm1-v4.17-rc6-180605-0816-00236-g2dfb086ef02c+ #160
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-2.fc28 04/01/2014
RIP: 0010:stable_page_flags+0x27/0x3c0
Code: 00 00 00 0f 1f 44 00 00 48 85 ff 0f 84 a0 03 00 00 41 54 55 49 89 fc 53 48 8b 57 08 48 8b 2f 48 8d 42 ff 83 e2 01 48 0f 44 c7 <48> 8b 00 f6 c4 01 0f 84 10 03 00 00 31 db 49 8b 54 24 08 4c 89 e7
RSP: 0018:ffffbbd44111fde0 EFLAGS: 00010202
RAX: fffffffffffffffe RBX: 00007fffffffeff9 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000202 RDI: ffffed1182fff5c0
RBP: ffffffffffffffff R08: 0000000000000001 R09: 0000000000000001
R10: ffffbbd44111fed8 R11: 0000000000000000 R12: ffffed1182fff5c0
R13: 00000000000bffd7 R14: 0000000002fff5c0 R15: ffffbbd44111ff10
FS: 00007efc4335a500(0000) GS:ffff93a5bfc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffffffffffffffe CR3: 00000000b2a58000 CR4: 00000000001406e0
Call Trace:
kpageflags_read+0xc7/0x120
proc_reg_read+0x3c/0x60
__vfs_read+0x36/0x170
vfs_read+0x89/0x130
ksys_pread64+0x71/0x90
do_syscall_64+0x5b/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7efc42e75e23
Code: 09 00 ba 9f 01 00 00 e8 ab 81 f4 ff 66 2e 0f 1f 84 00 00 00 00 00 90 83 3d 29 0a 2d 00 00 75 13 49 89 ca b8 11 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 db d3 01 00 48 89 04 24
According to kernel bisection, this problem became visible due to commit
f7f99100d8 ("mm: stop zeroing memory during allocation in vmemmap")
which changes how struct pages are initialized.
Memblock layout affects the pfn ranges covered by node/zone. Consider
that we have a VM with 2 NUMA nodes and each node has 4GB memory, and
the default (no memmap= given) memblock layout is like below:
MEMBLOCK configuration:
memory size = 0x00000001fff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x4
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000100000000-0x000000013fffffff], 0x0000000040000000 bytes on node 0 flags: 0x0
memory[0x3] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
If you give memmap=1G!4G (so it just covers memory[0x2]),
the range [0x100000000-0x13fffffff] is gone:
MEMBLOCK configuration:
memory size = 0x00000001bff75c00 reserved size = 0x000000000300c000
memory.cnt = 0x3
memory[0x0] [0x0000000000001000-0x000000000009efff], 0x000000000009e000 bytes on node 0 flags: 0x0
memory[0x1] [0x0000000000100000-0x00000000bffd6fff], 0x00000000bfed7000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000140000000-0x000000023fffffff], 0x0000000100000000 bytes on node 1 flags: 0x0
...
This causes node 0's pfn range to shrink because it is calculated from the
address range of memblock.memory. So some of the struct pages in the gap
range are left uninitialized.
We have a function zero_resv_unavail() which zeroes the struct pages
within the reserved unavailable range (i.e. memblock.memory &&
!memblock.reserved). This patch utilizes it to cover all unavailable
ranges by putting them into memblock.reserved.
Link: http://lkml.kernel.org/r/20180615072947.GB23273@hori1.linux.bs1.fc.nec.co.jp
Fixes: f7f99100d8 ("mm: stop zeroing memory during allocation in vmemmap")
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Oscar Salvador <osalvador@suse.de>
Tested-by: "Herton R. Krzesinski" <herton@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4557641b4c upstream.
QUEUE_FLAG_DAX is an indication that a given block device supports
filesystem DAX and should not be set for PMEM namespaces which are in "raw"
mode. These namespaces lack struct page and are prevented from
participating in filesystem DAX as of commit 569d0365f5 ("dax: require
'struct page' by default for filesystem dax").
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Suggested-by: Mike Snitzer <snitzer@redhat.com>
Fixes: 569d0365f5 ("dax: require 'struct page' by default for filesystem dax")
Cc: stable@vger.kernel.org
Acked-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f21c601a2b upstream.
Use of bio_clone_bioset() is inefficient if there is no need to clone
the original bio's bio_vec array. Best to use the bio_clone_fast()
variant. Also, just using bio_advance() is only part of what is needed
to properly setup the clone -- it doesn't account for the various
bio_integrity() related work that also needs to be performed (see
bio_split).
Address both of these issues by switching from bio_clone_bioset() to
bio_split().
Fixes: 18a25da8 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.15+, requires removal of '&' before md->queue->bio_split
Reported-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d50d82faa0 upstream.
In kernel 4.17 I removed some code from dm-bufio that did slab cache
merging (commit 21bb132767: "dm bufio: remove code that merges slab
caches") - both slab and slub support merging caches with identical
attributes, so dm-bufio now just calls kmem_cache_create and relies on
implicit merging.
This uncovered a bug in the slub subsystem - if we delete a cache and
immediately create another cache with the same attributes, it fails
because of duplicate filename in /sys/kernel/slab/. The slub subsystem
offloads freeing the cache to a workqueue - and if we create the new
cache before the workqueue runs, it complains because of duplicate
filename in sysfs.
This patch fixes the bug by moving the call of kobject_del from
sysfs_slab_remove_workfn to shutdown_cache. kobject_del must be called
while we hold slab_mutex - so that the sysfs entry is deleted before a
cache with the same attributes could be created.
Running device-mapper-test-suite with:
dmtest run --suite thin-provisioning -n /commit_failure_causes_fallback/
triggered:
Buffer I/O error on dev dm-0, logical block 1572848, async page read
device-mapper: thin: 253:1: metadata operation 'dm_pool_alloc_data_block' failed: error = -5
device-mapper: thin: 253:1: aborting current metadata transaction
sysfs: cannot create duplicate filename '/kernel/slab/:a-0000144'
CPU: 2 PID: 1037 Comm: kworker/u48:1 Not tainted 4.17.0.snitm+ #25
Hardware name: Supermicro SYS-1029P-WTR/X11DDW-L, BIOS 2.0a 12/06/2017
Workqueue: dm-thin do_worker [dm_thin_pool]
Call Trace:
dump_stack+0x5a/0x73
sysfs_warn_dup+0x58/0x70
sysfs_create_dir_ns+0x77/0x80
kobject_add_internal+0xba/0x2e0
kobject_init_and_add+0x70/0xb0
sysfs_slab_add+0xb1/0x250
__kmem_cache_create+0x116/0x150
create_cache+0xd9/0x1f0
kmem_cache_create_usercopy+0x1c1/0x250
kmem_cache_create+0x18/0x20
dm_bufio_client_create+0x1ae/0x410 [dm_bufio]
dm_block_manager_create+0x5e/0x90 [dm_persistent_data]
__create_persistent_data_objects+0x38/0x940 [dm_thin_pool]
dm_pool_abort_metadata+0x64/0x90 [dm_thin_pool]
metadata_operation_failed+0x59/0x100 [dm_thin_pool]
alloc_data_block.isra.53+0x86/0x180 [dm_thin_pool]
process_cell+0x2a3/0x550 [dm_thin_pool]
do_worker+0x28d/0x8f0 [dm_thin_pool]
process_one_work+0x171/0x370
worker_thread+0x49/0x3f0
kthread+0xf8/0x130
ret_from_fork+0x35/0x40
kobject_add_internal failed for :a-0000144 with -EEXIST, don't try to register things with the same name in the same directory.
kmem_cache_create(dm_bufio_buffer-16) failed with error -17
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1806151817130.6333@file01.intranet.prod.int.rdu2.redhat.com
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Mike Snitzer <snitzer@redhat.com>
Tested-by: Mike Snitzer <snitzer@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e41fc8c5bd upstream.
We have 3 more Lenovo machines, they all have 2 front mics on them,
so they need the fixup to change the location for one of two mics.
Among these 3 Lenovo machines, one of them has the same pin cfg as the
machine with subid 0x17aa3138, so use the pin cfg table to apply fixup
for them. The remaining machines don't share the same pin cfg, so for now
use the subid to apply the fixup for them.
Fixes: a3dafb2200 ("ALSA: hda/realtek - adjust the location of one mic")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5a6cabf02 upstream.
Some Lenovo laptops, e.g. Lenovo P50, showed a pop noise at resume
or runtime resume. It turned out to be reduced by applying
alc_no_shutup() just like the TPT440 quirk does.
Since there are many Lenovo models showing the same behavior, put this
workaround in ALC269_FIXUP_THINKPAD_ACPI entry so that it's applied
commonly to all such Lenovo machines.
Reported-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Benjamin Berg <bberg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57cb54e53b upstream.
Henning Kühn reported that the discrete AMD GPU on his hybrid graphics
laptop no longer runtime-suspends due to the recent commit
07f4f97d7b ("vga_switcheroo: Use device link for HDA controller").
The root cause is that the HDMI codec on AMD GPU doesn't support
CLKSTOP and EPSS, which are currently mandatory for powering down the
HD-audio link at runtime suspend. Because the HD-audio link is still
up, HD-audio controller driver blocks the transition to D3.
For addressing the regression, this patch adds a new flag to indicate
the forced link-down, and sets it for AMD HDMI codecs appropriately
in the codec driver.
Fixes: 07f4f97d7b ("vga_switcheroo: Use device link for HDA controller")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106957
Reported-by: Lukas Wunner <lukas@wunner.de>
Reported-and-tested-by: Henning Kühn <prg@cooco.de>
Cc: <stable@vger.kernel.org> # v4.17+
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b41f794f28 upstream.
The kernel may spew a WARNING about UBSAN undefined behavior at
handling ALSA timer ioctl SNDRV_TIMER_IOCTL_NEXT_DEVICE:
UBSAN: Undefined behaviour in sound/core/timer.c:1524:19
signed integer overflow:
2147483647 + 1 cannot be represented in type 'int'
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x122/0x1c8 lib/dump_stack.c:113
ubsan_epilogue+0x12/0x86 lib/ubsan.c:159
handle_overflow+0x1c2/0x21f lib/ubsan.c:190
__ubsan_handle_add_overflow+0x2a/0x31 lib/ubsan.c:198
snd_timer_user_next_device sound/core/timer.c:1524 [inline]
__snd_timer_user_ioctl+0x204d/0x2520 sound/core/timer.c:1939
snd_timer_user_ioctl+0x67/0x95 sound/core/timer.c:1994
....
It happens only when a value of INT_MAX is passed, as we're
incrementing it unconditionally. So the fix is trivial: check the
value against INT_MAX. Although the bug itself is fairly harmless, it's
better to fix it so that fuzzers won't hit this again later.
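The guard itself is tiny; a generic sketch of the pattern (not the exact
sound/core/timer.c hunk):

#include <linux/kernel.h>

static void next_device_id_sketch(int *id)
{
	/* Incrementing INT_MAX is signed overflow (undefined behaviour),
	 * so only advance the index when there is room to do so. */
	if (*id < INT_MAX)
		(*id)++;
}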
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200213
Reported-and-tested-by: Team OWL337 <icytxw@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 50fc7b6195 upstream.
Commit 40f7090bb1 ("Input: elan_i2c_smbus - fix corrupted stack")
fixed most of the functions using i2c_smbus_read_block_data() to
allocate a buffer with the maximum block size. However three
functions were left unchanged:
* In elan_smbus_initialize(), increase the buffer size in the same
way.
* In elan_smbus_calibrate_result(), the buffer is provided by the
caller (calibrate_store()), so introduce a bounce buffer. Also
name the result buffer size.
* In elan_smbus_get_report(), the buffer is provided by the caller
but happens to be the right length. Add a compile-time assertion
to ensure this remains the case (see the sketch below).
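A hedged sketch of the bounce-buffer and compile-time-assertion pattern
described in the list above, with hypothetical command codes and sizes
rather than the real elan_i2c_smbus values:

#include <linux/bug.h>
#include <linux/i2c.h>
#include <linux/string.h>

#define CALIBRATE_RESULT_LEN_SKETCH 1   /* hypothetical result size */
#define REPORT_LEN_SKETCH 32            /* hypothetical report size */

static int calibrate_result_sketch(struct i2c_client *client, u8 *val)
{
	/* Bounce buffer sized for the largest SMBus block transfer, so a
	 * short caller-supplied buffer can never be overrun. */
	u8 buf[I2C_SMBUS_BLOCK_MAX];
	int ret;

	ret = i2c_smbus_read_block_data(client, 0x01 /* hypothetical cmd */,
					buf);
	if (ret < 0)
		return ret;

	memcpy(val, buf, CALIBRATE_RESULT_LEN_SKETCH);
	return 0;
}

static void check_report_len_sketch(void)
{
	/* Compile-time guarantee that a fixed-size report buffer stays
	 * large enough for a full SMBus block read. */
	BUILD_BUG_ON(REPORT_LEN_SKETCH < I2C_SMBUS_BLOCK_MAX);
}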
Cc: <stable@vger.kernel.org> # 3.19+
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03ae3a9caf upstream.
The commit ba667650c5 ("Input: psmouse - clean up code") was pretty
brain-dead and broke extra button reporting for a variety of PS/2 mice:
Genius, Thinkmouse and Intellimouse Explorer. We need to actually inspect
the data coming from the device when reporting events.
Fixes: ba667650c5 ("Input: psmouse - clean up code")
Reported-by: Jiri Slaby <jslaby@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd6bee81c9 upstream.
This fixes using the controller with SDL2.
SDL2 has a naive algorithm to apply the correct settings to a controller.
For X-Box compatible controllers it expects that the controller name
contains a variation of an 'XBOX' string.
This patch changes the identifier to contain "X-Box" as substring. Tested
with Steam and C-Dogs-SDL which both detect the controller properly after
adding this patch.
Fixes: c1ba08390a ("Input: xpad - add GPD Win 2 Controller USB IDs")
Cc: stable@vger.kernel.org
Signed-off-by: Enno Boland <gottox@voidlinux.eu>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fa65653e57 upstream.
Detect when a directory entry is (possibly partially) beyond directory
size and return EIO in that case since it means the filesystem is
corrupted. Otherwise directory operations can further corrupt the
directory and possibly also oops the kernel.
CC: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
CC: stable@vger.kernel.org
Reported-and-tested-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc45519eb1 upstream.
This patch reverts commit 3243ff2a05 ("net: ethernet: davinci_emac:
Deduplicate bus_find_device() by name matching") and adds a comment
which should stop anyone from reintroducing the same "fix" in the future.
We can't use bus_find_device_by_name() here because the device name is
not guaranteed to be 'davinci_mdio'. On some systems it can be
'davinci_mdio.0' so we need to use strncmp() against the first part of
the string to correctly match it.
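In other words the match has to be a prefix comparison rather than an
exact name lookup; a standalone sketch of the distinction (device names
shown are just examples):

#include <stdbool.h>
#include <string.h>

/* "davinci_mdio" must match both "davinci_mdio" and "davinci_mdio.0". */
static bool mdio_name_matches_sketch(const char *dev_name)
{
	static const char prefix[] = "davinci_mdio";

	/* An exact strcmp() (what bus_find_device_by_name() effectively
	 * does) would reject "davinci_mdio.0". */
	return strncmp(dev_name, prefix, sizeof(prefix) - 1) == 0;
}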
Fixes: 3243ff2a05 ("net: ethernet: davinci_emac: Deduplicate bus_find_device() by name matching")
Cc: stable@vger.kernel.org
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eef04c7b37 upstream.
Commit 910f8befdf ("xen/pirq: fix error path cleanup when binding
MSIs") fixed a couple of errors in error cleanup path of
xen_bind_pirq_msi_to_irq(). This cleanup allowed a call to
__unbind_from_irq() with an unbound irq, which would result in
triggering the BUG_ON there.
Since there is really no reason for the BUG_ON (xen_free_irq() can
operate on unbound irqs) we can remove it.
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: stable@vger.kernel.org
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 70303420b5 upstream.
syzkaller detected an out-of-bounds issue with the events filter code,
specifically here:
prog[N].pred = NULL; /* #13 */
prog[N].target = 1; /* TRUE */
prog[N+1].pred = NULL;
prog[N+1].target = 0; /* FALSE */
-> prog[N-1].target = N;
prog[N-1].when_to_branch = false;
As that's the first reference to a "N-1" index, it appears that the code got
here with N = 0, which means the filter parser found no filter to parse
(which shouldn't ever happen, but apparently it did).
Add a new error to the parsing code that will check to make sure that N is
not zero before going into this part of the code. If N = 0, then -EINVAL is
returned, and an error message is added to the filter.
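A sketch of that guard, under the assumption that N is the number of
predicates emitted by the parser (hypothetical function name):

#include <linux/errno.h>

static int finalize_filter_sketch(int N)
{
	/* The parser should never hand over an empty program; bail out
	 * instead of touching prog[N - 1] when N == 0. */
	if (N == 0)
		return -EINVAL; /* and append a parse error to the filter */

	/* ... safe to patch prog[N - 1].target / when_to_branch here ... */
	return 0;
}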
Cc: stable@vger.kernel.org
Fixes: 80765597bc ("tracing: Rewrite filter logic to be simpler and faster")
Reported-by: air icy <icytxw@gmail.com>
bugzilla url: https://bugzilla.kernel.org/show_bug.cgi?id=200019
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2bdce74412 upstream.
Hussam reports:
I was poking around and for no real reason, I did cat /dev/mem and
strings /dev/mem. Then I saw the following warning in dmesg. I saved it
and rebooted immediately.
memremap attempted on mixed range 0x000000000009c000 size: 0x1000
------------[ cut here ]------------
WARNING: CPU: 0 PID: 11810 at kernel/memremap.c:98 memremap+0x104/0x170
[..]
Call Trace:
xlate_dev_mem_ptr+0x25/0x40
read_mem+0x89/0x1a0
__vfs_read+0x36/0x170
The memremap() implementation checks for attempts to remap System RAM
with MEMREMAP_WB and instead redirects those mapping attempts to the
linear map. However, that only works if the physical address range
being remapped is page aligned. In low memory we have situations like
the following:
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
...where System RAM intersects Reserved ranges on a sub-page
granularity.
Given that devmem_is_allowed() special cases any attempt to map System
RAM in the first 1MB of memory, replace page_is_ram() with the more
precise region_intersects() to trap attempts to map disallowed ranges.
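A hedged sketch of the stricter check (the exact call site differs; the
region_intersects() arguments shown are my assumption of the usual System
RAM query):

#include <linux/ioport.h>
#include <linux/mm.h>

/* Reject a /dev/mem mapping when the page overlaps System RAM at all,
 * even if only part of the page is RAM (sub-page granularity). */
static bool page_is_mappable_sketch(unsigned long pfn)
{
	u64 addr = (u64)pfn << PAGE_SHIFT;

	return region_intersects(addr, PAGE_SIZE, IORESOURCE_SYSTEM_RAM,
				 IORES_DESC_NONE) == REGION_DISJOINT;
}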
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199999
Link: http://lkml.kernel.org/r/152856436164.18127.2847888121707136898.stgit@dwillia2-desk3.amr.corp.intel.com
Fixes: 92281dee82 ("arch: introduce memremap()")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reported-by: Hussam Al-Tayeb <me@hussam.eu.org>
Tested-by: Hussam Al-Tayeb <me@hussam.eu.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1105a2fc02 upstream.
On our armv8a server (QDF2400), I noticed lots of WARN_ONs caused by
rmap_item->address not being PAGE_SIZE aligned under memory pressure
tests (starting 20 guests and running memhog in the host).
WARNING: CPU: 4 PID: 4641 at virt/kvm/arm/mmu.c:1826 kvm_age_hva_handler+0xc0/0xc8
CPU: 4 PID: 4641 Comm: memhog Tainted: G W 4.17.0-rc3+ #8
Call trace:
kvm_age_hva_handler+0xc0/0xc8
handle_hva_to_gpa+0xa8/0xe0
kvm_age_hva+0x4c/0xe8
kvm_mmu_notifier_clear_flush_young+0x54/0x98
__mmu_notifier_clear_flush_young+0x6c/0xa0
page_referenced_one+0x154/0x1d8
rmap_walk_ksm+0x12c/0x1d0
rmap_walk+0x94/0xa0
page_referenced+0x194/0x1b0
shrink_page_list+0x674/0xc28
shrink_inactive_list+0x26c/0x5b8
shrink_node_memcg+0x35c/0x620
shrink_node+0x100/0x430
do_try_to_free_pages+0xe0/0x3a8
try_to_free_pages+0xe4/0x230
__alloc_pages_nodemask+0x564/0xdc0
alloc_pages_vma+0x90/0x228
do_anonymous_page+0xc8/0x4d0
__handle_mm_fault+0x4a0/0x508
handle_mm_fault+0xf8/0x1b0
do_page_fault+0x218/0x4b8
do_translation_fault+0x90/0xa0
do_mem_abort+0x68/0xf0
el0_da+0x24/0x28
In rmap_walk_ksm, the rmap_item->address might still have the
STABLE_FLAG set, so the start and end in handle_hva_to_gpa might not be
PAGE_SIZE aligned. This causes exceptions in handle_hva_to_gpa
on arm64.
This patch fixes it by ignoring (not removing) the low bits of address
when doing rmap_walk_ksm.
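A minimal sketch of that masking (not the exact mm/ksm.c hunk): the flag
bits stored in the low bits of rmap_item->address stay in place, but the
value handed on to the notifier path is page aligned.

#include <linux/mm.h>

/* rmap_item->address carries flags (e.g. STABLE_FLAG) in its low bits. */
static unsigned long item_hva_sketch(unsigned long item_address)
{
	/* Ignore (do not clear) the flag bits; callers such as
	 * handle_hva_to_gpa() expect PAGE_SIZE-aligned addresses. */
	return item_address & PAGE_MASK;
}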
IMO, it should be backported to the stable tree. The storm of WARN_ONs is
very easy for me to reproduce. More than that, I watched a panic (not
reproducible) as follows:
page:ffff7fe003742d80 count:-4871 mapcount:-2126053375 mapping: (null) index:0x0
flags: 0x1fffc00000000000()
raw: 1fffc00000000000 0000000000000000 0000000000000000 ffffecf981470000
raw: dead000000000100 dead000000000200 ffff8017c001c000 0000000000000000
page dumped because: nonzero _refcount
CPU: 29 PID: 18323 Comm: qemu-kvm Tainted: G W 4.14.15-5.hxt.aarch64 #1
Hardware name: <snip for confidential issues>
Call trace:
dump_backtrace+0x0/0x22c
show_stack+0x24/0x2c
dump_stack+0x8c/0xb0
bad_page+0xf4/0x154
free_pages_check_bad+0x90/0x9c
free_pcppages_bulk+0x464/0x518
free_hot_cold_page+0x22c/0x300
__put_page+0x54/0x60
unmap_stage2_range+0x170/0x2b4
kvm_unmap_hva_handler+0x30/0x40
handle_hva_to_gpa+0xb0/0xec
kvm_unmap_hva_range+0x5c/0xd0
I even injected a fault on purpose in kvm_unmap_hva_range by setting
size=size-0x200; the call trace is similar to the above. So I thought the
panic has the same root cause as the WARN_ONs.
Andrea said:
: It looks a straightforward safe fix, on x86 hva_to_gfn_memslot would
: zap those bits and hide the misalignment caused by the low metadata
: bits being erroneously left set in the address, but the arm code
: notices when that's the last page in the memslot and the hva_end is
: getting aligned and the size is below one page.
:
: I think the problem triggers in the addr += PAGE_SIZE of
: unmap_stage2_ptes that never matches end because end is aligned but
: addr is not.
:
: } while (pte++, addr += PAGE_SIZE, addr != end);
:
: x86 again only works on hva_start/hva_end after converting it to
: gfn_start/end and that being in pfn units the bits are zapped before
: they risk to cause trouble.
Jia He said:
: I've tested by myself in arm64 server (QDF2400,46 cpus,96G mem) Without
: this patch, the WARN_ON is very easy for reproducing. After this patch, I
: have run the same benchmarch for a whole day without any WARN_ONs
Link: http://lkml.kernel.org/r/1525403506-6750-1-git-send-email-hejianet@gmail.com
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Jia He <hejianet@gmail.com>
Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Cc: Arvind Yadav <arvind.yadav.cs@gmail.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23edca8649 upstream.
There is a problem if we are going to unmap an rbd device and the
watch_dwork is going to queue delayed work for watch:
unmap Thread                          watch Thread                  timer
do_rbd_remove
  cancel_tasks_sync(rbd_dev)
                                      queue_delayed_work for watch
  destroy_workqueue(rbd_dev->task_wq)
    drain_workqueue(wq)
    destroy other resources in wq
                                                                    call_timer_fn
                                                                      __queue_work()
Then the delayed work escapes the cancel_tasks_sync() and
destroy_workqueue() and we get a use-after-free call trace:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] SMP PTI
Modules linked in:
CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 4.17.0-rc6+ #13
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:__queue_work+0x6a/0x3b0
RSP: 0018:ffff9427df1c3e90 EFLAGS: 00010086
RAX: ffff9427deca8400 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff9427deca8400 RSI: ffff9427df1c3e50 RDI: 0000000000000000
RBP: ffff942783e39e00 R08: ffff9427deca8400 R09: ffff9427df1c3f00
R10: 0000000000000004 R11: 0000000000000005 R12: ffff9427cfb85970
R13: 0000000000002000 R14: 000000000001eca0 R15: 0000000000000007
FS: 0000000000000000(0000) GS:ffff9427df1c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000004c900a005 CR4: 00000000000206e0
Call Trace:
<IRQ>
? __queue_work+0x3b0/0x3b0
call_timer_fn+0x2d/0x130
run_timer_softirq+0x16e/0x430
? tick_sched_timer+0x37/0x70
__do_softirq+0xd2/0x280
irq_exit+0xd5/0xe0
smp_apic_timer_interrupt+0x6c/0x130
apic_timer_interrupt+0xf/0x20
[ Move rbd_dev->watch_dwork cancellation so that rbd_reregister_watch()
either bails out early because the watch is UNREGISTERED at that point
or just gets cancelled. ]
Cc: stable@vger.kernel.org
Fixes: 99d1694310 ("rbd: retry watch re-registration periodically")
Signed-off-by: Dongsheng Yang <dongsheng.yang@easystack.cn>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d375b58c1 upstream.
On some devices the contents of the ctrl register get lost over a
suspend/resume and the PWM comes back up disabled after the resume.
This is seen on some Bay Trail devices with the PWM in ACPI enumerated
mode, so it shows up as a platform device instead of a PCI device.
If we still think it is enabled and then try to change the duty-cycle
after this, we end up with a "PWM_SW_UPDATE was not cleared" error and
the PWM is stuck in that state from then on.
This commit adds suspend and resume pm callbacks to the pwm-lpss-platform
code, which save/restore the ctrl register over a suspend/resume, fixing
this.
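The shape of such callbacks, sketched with hypothetical names and register
offsets (not the actual pwm-lpss-platform code): the ctrl register is read
out in the suspend handler and written back on resume, typically wired up
through a dev_pm_ops structure.

#include <linux/io.h>
#include <linux/types.h>

#define PWM_CTRL_OFFSET_SKETCH 0x0      /* hypothetical register offset */

struct pwm_ctx_sketch {
	void __iomem *base;
	u32 saved_ctrl;
};

static int pwm_suspend_sketch(struct pwm_ctx_sketch *ctx)
{
	ctx->saved_ctrl = readl(ctx->base + PWM_CTRL_OFFSET_SKETCH);
	return 0;
}

static int pwm_resume_sketch(struct pwm_ctx_sketch *ctx)
{
	writel(ctx->saved_ctrl, ctx->base + PWM_CTRL_OFFSET_SKETCH);
	return 0;
}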
Note that:
1) There is no need to do this over a runtime suspend, since we
only runtime suspend when disabled and then we properly set the enable
bit and reprogram the timings when we re-enable the PWM.
2) This may be happening on more systems than we realize, but has been
covered up so far by a bug in the acpi-lpss.c code which was saving/restoring
the regular device registers instead of the lpss private registers due to
lpss_device_desc.prv_offset not being set. This is fixed by a later patch
in this series.
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdcb613d49 upstream.
The LPSS PWM device on Bay Trail and Cherry Trail devices has a set
of private registers at offset 0x800, the current lpss_device_desc for
them already sets the LPSS_SAVE_CTX flag to have these saved/restored
over device-suspend, but the current lpss_device_desc was not setting
the prv_offset field, leading to the regular device registers getting
saved/restored instead.
This is causing the PWM controller to no longer work, resulting in a black
screen, after a suspend/resume on systems where the firmware clears the
APB clock and reset bits at offset 0x804.
This commit fixes this by properly setting prv_offset to 0x800 for
the PWM devices.
Cc: stable@vger.kernel.org
Fixes: e1c7481797 ("ACPI / LPSS: Add Intel BayTrail ACPI mode PWM")
Fixes: 1bfbd8eb8a ("ACPI / LPSS: Add ACPI IDs for Intel Braswell")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d68894800e upstream.
In nfs_idmap_read_and_verify_message there is an incorrect sprintf '%d'
that converts the __u32 'im_id' from struct idmap_msg to 'id_str', which
is a stack char array variable of length NFS_UINT_MAXLEN == 11.
If a uid or gid value is > 2147483647 = 0x7fffffff, the conversion
overflows into a negative value, for example:
crash> p (unsigned) (0x80000000)
$1 = 2147483648
crash> p (signed) (0x80000000)
$2 = -2147483648
The '-' sign is written to the buffer and this causes a 1 byte overflow
when the NULL byte is written, which corrupts kernel stack memory. If
CONFIG_CC_STACKPROTECTOR_STRONG is set we see a stack-protector panic:
[11558053.616565] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffffa05b8a8c
[11558053.639063] CPU: 6 PID: 9423 Comm: rpc.idmapd Tainted: G W ------------ T 3.10.0-514.el7.x86_64 #1
[11558053.641990] Hardware name: Red Hat OpenStack Compute, BIOS 1.10.2-3.el7_4.1 04/01/2014
[11558053.644462] ffffffff818c7bc0 00000000b1f3aec1 ffff880de0f9bd48 ffffffff81685eac
[11558053.646430] ffff880de0f9bdc8 ffffffff8167f2b3 ffffffff00000010 ffff880de0f9bdd8
[11558053.648313] ffff880de0f9bd78 00000000b1f3aec1 ffffffff811dcb03 ffffffffa05b8a8c
[11558053.650107] Call Trace:
[11558053.651347] [<ffffffff81685eac>] dump_stack+0x19/0x1b
[11558053.653013] [<ffffffff8167f2b3>] panic+0xe3/0x1f2
[11558053.666240] [<ffffffff811dcb03>] ? kfree+0x103/0x140
[11558053.682589] [<ffffffffa05b8a8c>] ? idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4]
[11558053.689710] [<ffffffff810855db>] __stack_chk_fail+0x1b/0x30
[11558053.691619] [<ffffffffa05b8a8c>] idmap_pipe_downcall+0x1cc/0x1e0 [nfsv4]
[11558053.693867] [<ffffffffa00209d6>] rpc_pipe_write+0x56/0x70 [sunrpc]
[11558053.695763] [<ffffffff811fe12d>] vfs_write+0xbd/0x1e0
[11558053.702236] [<ffffffff810acccc>] ? task_work_run+0xac/0xe0
[11558053.704215] [<ffffffff811fec4f>] SyS_write+0x7f/0xe0
[11558053.709674] [<ffffffff816964c9>] system_call_fastpath+0x16/0x1b
Fix this by calling the internally defined nfs_map_numeric_to_string()
function which properly uses '%u' to convert this __u32. For consistency,
also replace the one other place where snprintf is called.
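The difference is easy to demonstrate in isolation (standalone sketch, not
the NFS code): '%d' reinterprets the __u32 as signed and can emit a
leading '-', which needs one byte more than the unsigned form.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t im_id = 0x80000000u;   /* 2147483648 */
	char id_str[11];                /* NFS_UINT_MAXLEN == 11 */

	/* "%u" never produces a sign; the longest value "4294967295" plus
	 * its terminating NUL fits exactly in 11 bytes. */
	snprintf(id_str, sizeof(id_str), "%u", (unsigned int)im_id);
	puts(id_str);

	/* The broken sprintf("%d") variant would produce "-2147483648",
	 * 11 characters whose terminating NUL overruns an 11-byte buffer. */
	return 0;
}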
Signed-off-by: Dave Wysochanski <dwysocha@redhat.com>
Reported-by: Stephen Johnston <sjohnsto@redhat.com>
Fixes: cf4ab538f1 ("NFSv4: Fix the string length returned by the idmapper")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c2ece6ef6 upstream.
nfsd4_readdir_rsize restricts rd_maxcount to svc_max_payload when
estimating the size of the readdir reply, but nfsd_encode_readdir
restricts it to INT_MAX when encoding the reply. This can result in log
messages like "kernel: RPC request reserved 32896 but used 1049444".
Restrict rd_dircount similarly (no reason it should be larger than
svc_max_payload).
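A hedged sketch of that clamping (hypothetical helper; the real change
sits in the nfsd4 reply-size estimation): both counts are limited to the
per-request payload ceiling up front.

#include <linux/kernel.h>
#include <linux/sunrpc/svc.h>

static void clamp_readdir_counts_sketch(struct svc_rqst *rqstp,
					u32 *rd_dircount, u32 *rd_maxcount)
{
	u32 maxcount = svc_max_payload(rqstp);

	/* Never promise more than a single RPC reply can carry. */
	*rd_maxcount = min_t(u32, *rd_maxcount, maxcount);
	*rd_dircount = min_t(u32, *rd_dircount, maxcount);
}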
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76d81243a4 upstream.
As warned by smatch:
drivers/media/dvb-core/dvb_frontend.c:314 dvb_frontend_get_event() warn: inconsistent returns 'sem:&fepriv->sem'.
Locked on: line 288
line 295
line 306
line 314
Unlocked on: line 303
The lock implementation for get event is wrong, as, if an
interrupt occurs, down_interruptible() will fail, and the
routine will call up() twice when userspace calls the ioctl
again.
The bad code has been there since Linux migrated to git in
2005.
Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63039c29f7 upstream.
The MCE Remote sends a 0 scancode when keys are released. If this is not
received or decoded, then keys can get "stuck"; the keyup event is not
sent since the input_sync() is missing from the timeout handler.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea72fbf588 upstream.
As warned by smatch:
drivers/media/v4l2-core/v4l2-compat-ioctl32.c:879 put_v4l2_ext_controls32() warn: check for integer overflow 'count'
The access_ok() logic should check for too big arrays too.
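The kind of check meant here, sketched generically with a hypothetical
element type (the access_ok() form shown is the pre-5.0 three-argument
variant used in this era):

#include <linux/errno.h>
#include <linux/kernel.h>
#include <linux/uaccess.h>

struct ctrl32_sketch {                  /* hypothetical array element */
	u32 id;
	u32 size;
	u32 value;
};

static int check_user_array_sketch(void __user *up, u32 count)
{
	/* Reject counts whose byte size would wrap around before the
	 * access_ok() range check even sees it. */
	if (count > UINT_MAX / sizeof(struct ctrl32_sketch))
		return -EINVAL;

	if (!access_ok(VERIFY_WRITE, up,
		       count * sizeof(struct ctrl32_sketch)))
		return -EFAULT;

	return 0;
}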
Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f620d1d7af upstream.
media: uvcvideo: Support UVC 1.5 video probe & commit controls
The length of UVC 1.5 video control is 48, and it is 34 for UVC 1.1.
Change it to 48 for UVC 1.5 devices so that they can be
recognized.
More changes to the driver are needed for full UVC 1.5 compatibility.
However, at least the UVC 1.5 Realtek RTS5847/RTS5852 cameras have been
reported to work well.
[laurent.pinchart@ideasonboard.com: Factor out code to helper function, update size checks]
Cc: stable@vger.kernel.org
Signed-off-by: ming_qian <ming_qian@realsil.com.cn>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Ana Guerrero Lopez <ana.guerrero@collabora.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83967993f2 upstream.
Commit 372b2b0399 ("media: v4l: vsp1: Release buffers in
start_streaming error path") introduced a helper to clean up buffers on
error paths, but inadvertently changed the code such that only the
output WPF buffers were cleaned, rather than the video node being
operated on.
Since then vsp1_video_cleanup_pipeline() has grown to perform both video
node cleanup, as well as pipeline cleanup. Split the implementation into
two distinct functions that perform the required work, so that each
video node can release its buffers correctly on streamoff. The pipe
cleanup that was performed in the vsp1_video_stop_streaming() (releasing
the pipe->dl) is moved to the function for clarity.
Fixes: 372b2b0399 ("media: v4l: vsp1: Release buffers in start_streaming error path")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0b0d540db upstream.
The two incorrect nodes below in existing DTS files would cause a boot
failure, since address 0 is in fact not where the memory device is
located:
memory {
	device_type = "memory";
	reg = <0x0 0x0 0x0 0x0>;
};

memory@80000000 {
	reg = <0x0 0x80000000 0x0 0x40000000>;
};
In order to avoid having a memory node starting at address 0, we can't
include skeleton64.dtsi and instead need to explicitly define the few
properties the DTS relies on, such as #address-cells and #size-cells in
the root node and device_type in the memory@80000000 node.
Cc: stable@vger.kernel.org
Fixes: 31ac0d69a1 ("ARM: dts: mediatek: add MT7623 basic support")
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2724807f7f upstream.
Any failure in the secure call for transferring mem ownership of mba
region to Q6 would result in reporting that the remoteproc device
is running. This is because the previous q6v5_clk_enable would have
been a success. Prevent this by updating variable 'ret' accordingly.
Cc: stable@vger.kernel.org
Signed-off-by: Sibi Sankar <sibis@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e93a65857 upstream.
Intel Cannon Lake PCH has a much higher LPSS I2C input clock of 216 MHz
than Sunrisepoint, which uses 120 MHz. Preliminary information was that
both share the same clock rate, but the actual silicon implements an
elevated rate for better support of 3.4 MHz high-speed I2C.
This incorrect input clock rate results in too high an I2C bus clock
when ACPI doesn't provide tuned I2C timing parameters, since the I2C
host controller driver calculates them from the input clock rate.
Fix this by using the correct rate. We still share the same 230 ns SDA
hold time value as Sunrisepoint.
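A small standalone illustration of the effect (the clock rates come from
the text above; the 400 kHz target is only an example): dividers
computed for a 120 MHz input clock make a 216 MHz part run the bus 1.8x
too fast.

	#include <stdio.h>

	int main(void)
	{
		const unsigned long assumed_hz = 120000000;	/* Sunrisepoint rate */
		const unsigned long actual_hz  = 216000000;	/* Cannon Lake rate */
		const unsigned long target_bus = 400000;	/* Fast-mode I2C */

		unsigned long divider = assumed_hz / target_bus;

		/* prints 720000 Hz, i.e. 1.8x the intended 400 kHz */
		printf("resulting bus clock: %lu Hz\n", actual_hz / divider);
		return 0;
	}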
Cc: stable@vger.kernel.org
Fixes: b418bbff36 ("mfd: intel-lpss: Add Intel Cannonlake PCI IDs")
Reported-by: Jian-Hong Pan <jian-hong@endlessm.com>
Reported-by: Chris Chiu <chiu@endlessm.com>
Reported-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d28b625208 upstream.
According to the documentation, the REMAP register has to be programmed
in either DMA or PIO mode of the slice.
Move the DMA capability check down so that the REMAP register also gets
programmed in PIO mode.
Cc: stable@vger.kernel.org # 4.3+
Fixes: 4b45efe852 ("mfd: Add support for Intel Sunrisepoint LPSS devices")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81114baa83 upstream.
Related to https://lkml.org/lkml/2018/4/8/661
Sometimes we need to write metadata to a newly allocated block address;
we then allocate a zeroed page in the inner inode's address space, fill
in partial data, and leave the rest at zero, meaning those fields stay
in their initial state.
Two inner inodes (the meta inode and the node inode) set __GFP_ZERO.
Having checked both, we can avoid __GFP_ZERO for them and do the
initialization ourselves, avoiding unneeded/redundant zeroing by mm.
Cc: <stable@vger.kernel.org>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e16c4790de upstream.
This reverts commit b468620f2a.
It turns out that this broke drm on AMD platforms. Quoting Gabriel C:
"I can confirm reverting b468620f2a fixes
that issue for me.
The GPU is working fine with SME enabled.
Now with working GPU :) I can also confirm performance is back to
normal without doing any other workarounds"
Christian König analyzed it partially:
"As far as I analyzed it we now get an -ENOMEM from dma_alloc_attrs()
in drivers/gpu/drm/ttm/ttm_page_alloc_dma.c when IOMMU is enabled"
and Christoph Hellwig responded:
"I think the prime issue is that dma_direct_alloc respects the dma
mask. Which we don't need if actually using the iommu. This would be
mostly harmless except for the SEV bit high in the address that
makes the checks fail.
For now I'd say revert this commit for 4.17/4.18-rc and I'll look into
addressing these issues properly"
Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Christian König <christian.koenig@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@kernel.org # v4.17
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b12dfa124 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
This would only cause trouble if the child node is missing while there
is an unrelated node named "backlight" elsewhere in the tree.
Cc: stable <stable@vger.kernel.org> # 3.7
Fixes: eebfdc17cc ("backlight: Add TPS65217 WLED driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1cc0ec3da upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent mfd node was also prematurely freed,
while the child backlight node was leaked.
Cc: stable <stable@vger.kernel.org> # 3.9
Fixes: 47ec340cb8 ("mfd: max8925: Support dt for backlight")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a9c8bb2ac upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent mfd node was also prematurely freed.
Cc: stable <stable@vger.kernel.org> # 3.10
Fixes: 59eb2b5e57 ("drivers/video/backlight/as3711_bl.c: add OF support")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 353748a359 upstream.
There is potential for the size and len fields in ubifs_data_node to be
too large causing either a negative value for the length fields or an
integer overflow leading to an incorrect memory allocation. Likewise,
when the len field is small, an integer underflow may occur.
Signed-off-by: Silvio Cesare <silvio.cesare@gmail.com>
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 781932375f upstream.
Fastmap cannot track the LEB unmap operation; therefore it can
happen that after an interrupted erasure the mapping still looks
good from Fastmap's point of view, while reading from the PEB will
cause an ECC error and confuse the upper layer.
Instead of teaching users of UBI how to deal with that, we read back
the VID header and check for errors. If the PEB is empty or shows ECC
errors, we fix up the mapping and schedule the PEB for erasure.
Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Cc: <stable@vger.kernel.org>
Reported-by: martin bayern <Martinbayern@outlook.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e7d801610 upstream.
Ben Hutchings pointed out that 29b7a6fa1e ("ubi: fastmap: Don't flush
fastmap work on detach") does not really fix the problem, it just
reduces the risk to hit the race window where fastmap work races against
free()'ing ubi->volumes[].
The correct approach is making sure that no more fastmap work is in
progress before we free ubi data structures.
So we cancel fastmap work right after the ubi background thread is
stopped.
By setting ubi->thread_enabled to zero we make sure that no further work
tries to wake the thread.
Fixes: 29b7a6fa1e ("ubi: fastmap: Don't flush fastmap work on detach")
Fixes: 74cdaf2400 ("UBI: Fastmap: Fix memory leaks while closing the WL sub-system")
Cc: stable@vger.kernel.org
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Martin Townsend <mtownsend1973@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a2e84c6ed upstream.
All the managed resources would already be freed by the time the
release function is invoked, so touching such memory in
qcom_smd_edge_release() would do bad things.
Found this issue while testing an audio use case where the DSP is
started up and shut down in a loop.
This patch fixes the issue by using plain kzalloc() to allocate
channel->name and channel, which are then freed in
qcom_smd_edge_release().
Without this patch, restarting a remoteproc would crash the system.
Fixes: 53e2822e56 ("rpmsg: Introduce Qualcomm SMD backend")
Cc: <stable@vger.kernel.org>
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 011abdc9df upstream.
If "re-add" is written to the "state" file for a device
which is faulty, this has an effect similar to removing
and re-adding the device. It should take up the
same slot in the array that it previously had, and
an accelerated (e.g. bitmap-based) rebuild should happen.
The slot that "it previously had" is determined by
rdev->saved_raid_disk.
However this is not set when a device fails (only when a device
is added), and it is cleared when resync completes.
This means that "re-add" will normally work once, but may not work a
second time.
This patch includes two fixes.
1/ when a device fails, record the ->raid_disk value in
->saved_raid_disk before clearing ->raid_disk
2/ when "re-add" is written to a device for which
->saved_raid_disk is not set, fail.
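A minimal standalone sketch of the two changes above (the struct and
helpers are illustrative, not the md code):

	#include <stdbool.h>

	struct rdev_sketch {
		int raid_disk;		/* current slot, -1 if none */
		int saved_raid_disk;	/* remembered slot for accelerated re-add */
	};

	/* 1/ on failure, remember the slot before clearing it */
	static void mark_failed(struct rdev_sketch *rdev)
	{
		if (rdev->raid_disk >= 0)
			rdev->saved_raid_disk = rdev->raid_disk;
		rdev->raid_disk = -1;
	}

	/* 2/ "re-add" is only allowed if a slot was remembered */
	static bool can_re_add(const struct rdev_sketch *rdev)
	{
		return rdev->saved_raid_disk >= 0;
	}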
I think this is suitable for stable as it can
cause re-adding a device to be forced to do a full
resync which takes a lot longer and so puts data at
more risk.
Cc: <stable@vger.kernel.org> (v4.1)
Fixes: 97f6cd39da ("md-cluster: re-add capabilities")
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09018d4bd7 upstream.
The clk-gate core takes bit_idx through clk_register_gate() and the
clk_gate_ops then use BIT(bit_idx), but rtc-sun6i was passing
BIT(bit_idx) as bit_idx, so it became BIT(BIT(bit_idx)), which is wrong
and eventually left the external gate clock disabled.
Fix this by passing the bit index. The original change was introduced
by the commit below:
"rtc: sun6i: Add support for the external oscillator gate"
(sha1: 17ecd24641)
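A standalone illustration of the bug class (not the driver code):
passing BIT(n) where the API expects the plain bit index n ends up
gating the wrong bit entirely.

	#include <stdio.h>

	#define BIT(n)	(1UL << (n))

	int main(void)
	{
		unsigned int bit_idx = 3;

		printf("intended mask: 0x%lx\n", BIT(bit_idx));		/* 0x8 */
		printf("actual mask:   0x%lx\n", BIT(BIT(bit_idx)));	/* 0x100, wrong bit */
		return 0;
	}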
Signed-off-by: Michael Trimarchi <michael@amarulasolutions.com>
Fixes: 17ecd24641 ("rtc: sun6i: Add support for the external oscillator gate")
Cc: stable@vger.kernel.org
Signed-off-by: Jagan Teki <jagan@amarulasolutions.com>
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a982e45dc1 upstream.
When a USB device is connected to the USB host port on the SAM9N12 then
you get "-62" error which seems to indicate USB replies from the device
are timing out. Based on a logic sniffer, I saw the USB bus was running
at half speed.
The PLL code uses cached MUL and DIV values which get set in set_rate()
and applied in prepare(), but the recalc_rate() function queries the
hardware instead of using these cached values. Therefore, if
recalc_rate() is called between a set_rate() and prepare(), the wrong
frequency is calculated and later the USB clock divider for the SAM9N12
SoC will be configured for an incorrect clock.
In my case, the PLL hardware was set to 96 MHz before the OHCI driver
loads, and therefore the USB clock divider was being set to /2 even
though the OHCI driver set the PLL to 48 MHz.
As an alternative explanation, I noticed this was fixed in the past by
87e2ed338f ("clk: at91: fix recalc_rate implementation of PLL
driver") but the bug was later re-introduced by 1bdf02326b ("clk:
at91: make use of syscon/regmap internally").
Fixes: 1bdf02326b ("clk: at91: make use of syscon/regmap internally")
Cc: <stable@vger.kernel.org>
Signed-off-by: Marcin Ziemianowicz <marcin@ziemianowicz.com>
Acked-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72e1f23020 upstream.
Until commit 05f814402d ("clk: meson: add fdiv clock gates") we
relied on the bootloader to enable the fclk_div clock gates. It turns
out that our clock tree is incomplete at least on Meson8b (tested with
an Odroid-C1, which uses an RGMII PHY) because after the mentioned
commit Ethernet is not working anymore (no RX/TX activity can be seen).
At the same time Ethernet was still working on Meson8m2 with a RMII PHY.
Testing has shown that as soon as "fclk_div2" is disabled Ethernet stops
working on Odroid-C1. Unfortunately it's currently not clear what the
Ethernet controller IP block uses the fclk_div2 clock for. Mark the
clock as CLK_IS_CRITICAL to keep it enabled (as it's already enabled by
most bootloaders by default, which is why we didn't notice it before).
Fixes: 05f814402d ("clk: meson: add fdiv clock gates")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce7f11a230 upstream.
Prior to this commit we would only do a "deep flush" (have nvdimm_flush()
write to each of the flush hints for a region) in response to an
msync/fsync/sync call if the nvdimm_has_cache() returned true at the time
we were setting up the request queue. This happens due to the write cache
value passed in to blk_queue_write_cache(), which then causes the block
layer to send down BIOs with REQ_FUA and REQ_PREFLUSH set. We do have a
"write_cache" sysfs entry for namespaces, i.e.:
/sys/bus/nd/devices/pfn0.1/block/pmem0/dax/write_cache
which can be used to control whether or not the kernel thinks a given
namespace has a write cache, but this didn't modify the deep flush behavior
that we set up when the driver was initialized. Instead, it only modified
whether or not DAX would flush CPU caches via dax_flush() in response to
*sync calls.
Simplify this by making the *sync deep flush always happen, regardless of
the write cache setting of a namespace. The DAX CPU cache flushing will
still be controlled by the write_cache setting of the namespace.
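A hedged sketch of the described change (not necessarily the exact
hunk): stop keying flush support off nvdimm_has_cache() so that
REQ_PREFLUSH/REQ_FUA, and therefore the deep flush in nvdimm_flush(),
always reach the driver.

	/* before: flushes depended on the cache detection at init time */
	blk_queue_write_cache(q, nvdimm_has_cache(nd_region), fua);

	/* after (sketch): always advertise a write cache */
	blk_queue_write_cache(q, true, fua);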
Cc: <stable@vger.kernel.org>
Fixes: 5fdf8e5ba5 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 254a4cd50b upstream.
The pmem driver does not honor a forced read-only setting for very long:
$ blockdev --setro /dev/pmem0
$ blockdev --getro /dev/pmem0
1
followed by various commands like these:
$ blockdev --rereadpt /dev/pmem0
or
$ mkfs.ext4 /dev/pmem0
results in this in the kernel serial log:
nd_pmem namespace0.0: region0 read-write, marking pmem0 read-write
with the read-only setting lost:
$ blockdev --getro /dev/pmem0
0
That's from bus.c nvdimm_revalidate_disk(), which always applies the
setting from nd_region (which is initially based on the ACPI NFIT
NVDIMM state flags not_armed bit).
In contrast, commit 20bd1d026a ("scsi: sd: Keep disk read-only when
re-reading partition") fixed this issue for SCSI devices to preserve
the previous setting if it was set to read-only.
This patch modifies bus.c to preserve any previous read-only setting.
It also eliminates the kernel serial log print except for cases where
read-write is changed to read-only, so it doesn't print read-only to
read-only non-changes.
Cc: <stable@vger.kernel.org>
Fixes: 5813882094 ("libnvdimm, nfit: handle unarmed dimms, mark namespaces read-only")
Signed-off-by: Robert Elliott <elliott@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c3d20aada upstream.
That other commit introduced an inconsistency because it would trace on
ERP_FAILED for all callers of port forced reopen triggers (not just
terminate_rport_io), but it would not trace on ERP_FAILED for all callers of
other ERP triggers such as adapter, port regular, LUN.
Therefore, generalize that other commit. zfcp_erp_action_enqueue() already
had two early outs which re-used the one zfcp_dbf_rec_trig() call. All ERP
trigger functions finally run through zfcp_erp_action_enqueue(). So move
the special handling for ZFCP_STATUS_COMMON_ERP_FAILED into
zfcp_erp_action_enqueue() and add another early out with new trace marker
for pseudo ERP need in this case. This removes all early returns from all
ERP trigger functions so we always end up at zfcp_dbf_rec_trig().
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : .......
LUN : 0x...
WWPN : 0x...
D_ID : 0x...
Adapter status : 0x...
Port status : 0x...
LUN status : 0x...
Ready count : 0x...
Running count : 0x...
ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_...
ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d70aab5592 upstream.
For problem determination we always want to see, when we are invoked
via the terminate_rport_io callback, whether we actually perform
something or not.
Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec:
lose remote port
t workqueue
[s] zfcp_q_<dev> IRQ zfcperp<dev>
=== ================== =================== ============================
0 recv RSCN
q p.test_link_work
block rport
start fast_io_fail_tmo
send ADISC ELS
4 recv ADISC fail
block zfcp_port
port forced reopen
send open port
12 recv open port fail
q p.gid_pn_work
zfcp_erp_wakeup
(zfcp_erp_wait would return)
GID_PN fail
Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail,
e.g. with the typical 5 sec setting.
port.status |= ERP_FAILED
If fast_io_fail_tmo triggers after this point, we missed a SCSI trace.
workqueue
fc_dl_<host>
==================
27 fc_timeout_fail_rport_io
fc_terminate_rport_io
zfcp_scsi_terminate_rport_io
zfcp_erp_port_forced_reopen
_zfcp_erp_port_forced_reopen
if (port.status & ERP_FAILED)
return;
Therefore, write a trace before above early return.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : sctrpi1 SCSI terminate rport I/O
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x<wwpn>
D_ID : 0x<n_port_id>
Adapter status : 0x...
Port status : 0x...
LUN status : 0x00000000 none (invalid)
Ready count : 0x...
Running count : 0x...
ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96d9270499 upstream.
get_device() and its internally used kobject_get() only return NULL if they
get passed NULL as argument. zfcp_get_port_by_wwpn() loops over
adapter->port_list so the iteration variable port is always non-NULL.
Struct device is embedded in struct zfcp_port so &port->dev is always
non-NULL. This is the argument to get_device(). However, if we get an
fc_rport in terminate_rport_io() for which we cannot find a match within
zfcp_get_port_by_wwpn(), the latter can return NULL. v2.6.30 commit
70932935b6 ("[SCSI] zfcp: Fix oops when port disappears") introduced an
early return without adding a trace record for this case. Even if we don't
need recovery in this case, for debugging we should still see that our
callback was invoked originally by scsi_transport_fc.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : sctrpin SCSI terminate rport I/O, no zfcp port
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x<wwpn> WWPN
D_ID : 0x<n_port_id> N_Port-ID
Adapter status : 0x...
Port status : 0xffffffff unknown (-1)
LUN status : 0x00000000 none (invalid)
Ready count : 0x...
Running count : 0x...
ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need : 0xc0 ZFCP_ERP_ACTION_NONE
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: 70932935b6 ("[SCSI] zfcp: Fix oops when port disappears")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 512857a795 upstream.
If a SCSI device is deleted during scsi_eh host reset, we cannot get a
reference to the SCSI device anymore since scsi_device_get returns !=0 by
design. Assuming the recovery of adapter and port(s) was successful,
zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for the
half-gone SCSI device. Unfortunately, it causes the following confusing
trace record which states that zfcp will do a LUN recovery as "ERP need" is
ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want".
Old example trace record formatted with zfcpdbf from s390-tools:
Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN : 0x<FCP_LUN>
WWPN : 0x<WWPN>
D_ID : 0x<N_Port-ID>
Adapter status : 0x5400050b
Port status : 0x54000001
LUN status : 0x40000000 ZFCP_STATUS_COMMON_RUNNING
but not ZFCP_STATUS_COMMON_UNBLOCKED as it
was closed on close part of adapter reopen
ERP want : 0x01
ERP need : 0x01 misleading
However, zfcp_erp_setup_act() returns NULL as it cannot get the reference.
Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery
actually happens.
We always do want the recovery trigger trace record even if no erp_action
could be enqueued as in this case. For other cases where we did not enqueue
an erp_action, 'need' has always been zero to indicate this. In order to
indicate above goto out, introduce an eyecatcher "flag" to mark the "ERP
need" as 'not needed' but still keep the information which erp_action type,
that zfcp_erp_required_act() had decided upon, is needed. 0xc_ is chosen to
be visibly different from 0x0_ in "ERP want".
New example trace record formatted with zfcpdbf from s390-tools:
Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN : 0x<FCP_LUN>
WWPN : 0x<WWPN>
D_ID : 0x<N_Port-ID>
Adapter status : 0x5400050b
Port status : 0x54000001
LUN status : 0x40000000
ERP want : 0x01
ERP need : 0xc1 would need LUN ERP, but no action set up
^
Before v2.6.38 commit ae0904f60f ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.") we could detect this case because the
"erp_action" field in the trace was NULL. The rework removed erp_action as
argument and field from the trace.
This patch here is for tracing. A fix to allow LUN recovery in the case at
hand is a topic for a separate patch.
See also commit fdbd1c5e27 ("[SCSI] zfcp: Allow running unit/LUN shutdown
without acquiring reference") for a similar case and background info.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: ae0904f60f ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81979ae63e upstream.
We already have a SCSI trace for the end of abort and scsi_eh TMF. Due to
zfcp_erp_wait() and fc_block_scsi_eh() time can pass between the start of
our eh callback and an actual send/recv of an abort / TMF request. In order
to see the temporal sequence including any abort / TMF send retries, add a
trace before the above two blocking functions. This supports problem
determination with scsi_eh and parallel zfcp ERP.
No need to explicitly trace the beginning of our eh callback, since we
typically can send an abort / TMF and see its HBA response (in the worst
case, it's a pseudo response on dismiss all of adapter recovery, e.g. due to
an FSF request timeout [fsrth_1] of the abort / TMF). If we cannot send, we
now get a trace record for the first "abrt_wt" or "[lt]r_wait" which denotes
almost the beginning of the callback.
No need to explicitly trace the wakeup after the above two blocking
functions because the next retry loop causes another trace in any case and
that is sufficient.
Example trace records formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : abrt_wt abort, before zfcp_erp_wait()
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x<scsi_id>
SCSI LUN : 0x<scsi_lun>
SCSI LUN high : 0x<scsi_lun_high>
SCSI result : 0x<scsi_result_of_cmd_to_be_aborted>
SCSI retries : 0x<retries_of_cmd_to_be_aborted>
SCSI allowed : 0x<allowed_retries_of_cmd_to_be_aborted>
SCSI scribble : 0x<req_id_of_cmd_to_be_aborted>
SCSI opcode : <CDB_of_cmd_to_be_aborted>
FCP rsp inf cod: 0x.. none (invalid)
FCP rsp IU : ... none (invalid)
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : lr_wait LUN reset, before zfcp_erp_wait()
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x<scsi_id>
SCSI LUN : 0x<scsi_lun>
SCSI LUN high : 0x<scsi_lun_high>
SCSI result : 0x... unrelated
SCSI retries : 0x.. unrelated
SCSI allowed : 0x.. unrelated
SCSI scribble : 0x... unrelated
SCSI opcode : ... unrelated
FCP rsp inf cod: 0x.. none (invalid)
FCP rsp IU : ... none (invalid)
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: 63caf367e1 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
Fixes: af4de36d91 ("[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df30781699 upstream.
For problem determination we need to see whether and why we were successful
or not. This allows deduction of scsi_eh escalation.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : schrh_r SCSI host reset handler result
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0xffffffff none (invalid)
SCSI LUN : 0xffffffff none (invalid)
SCSI LUN high : 0xffffffff none (invalid)
SCSI result : 0x00002002 field re-used for midlayer value: SUCCESS
or in other cases: 0x2009 == FAST_IO_FAIL
SCSI retries : 0xff none (invalid)
SCSI allowed : 0xff none (invalid)
SCSI scribble : 0xffffffffffffffff none (invalid)
SCSI opcode : ffffffff ffffffff ffffffff ffffffff none (invalid)
FCP rsp inf cod: 0xff none (invalid)
FCP rsp IU : 00000000 00000000 00000000 00000000 none (invalid)
00000000 00000000
v2.6.35 commit a1dbfddd02 ("[SCSI] zfcp: Pass return code from
fc_block_scsi_eh to scsi eh") introduced the first return with something
other than the previously hardcoded single SUCCESS return path.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: a1dbfddd02 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49d7bd3681 upstream.
The patch reverts changes done in qlt_schedule_sess_for_deletion(). To
avoid spinlock recursion, sess->vha->work_lock should be used instead
of ha->tgt.sess_lock, which can already be held in callers such as
qlt_reset() or qlt_handle_login().
[mkp: roll in build warning reported by sfr]
Fixes: 1c6cacf4ea ("scsi: qla2xxx: Fixup locking for session deletion")
Cc: <stable@vger.kernel.org> #v4.17
Signed-off-by: Mikhail Malygin <m.malygin@yadro.com>
Reported-by: Mikhail Malygin <m.malygin@yadro.com>
Tested-by: Mikhail Malygin <m.malygin@yadro.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3cedc8797b upstream.
Some newer targets use the "Status Qualifier" response in a returned
"Busy Status". This new response code of 0x4001 has the "Scope" bits
set, which translates to "Affects all units accessible by target". Due
to this new value returned in the Scope bits, the driver was using the
value as a timeout, which resulted in the driver waiting for a 27 min
timeout.
This patch masks off the Scope bits so that the driver does not use
this value as the retry delay time.
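A standalone illustration of the masking, assuming the SAM layout in
which bits 15:14 of the status qualifier carry the Scope field and the
low bits the retry delay:

	#include <stdio.h>
	#include <stdint.h>

	#define SQ_SCOPE_MASK	0xC000u		/* bits 15:14: Scope */

	int main(void)
	{
		uint16_t status_qualifier = 0x4001;	/* Scope = 01b, delay = 1 */
		uint16_t retry_delay = status_qualifier & ~SQ_SCOPE_MASK;

		printf("raw 0x%04x -> retry delay %u\n",
		       (unsigned)status_qualifier, (unsigned)retry_delay);
		return 0;
	}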
Cc: <stable@vger.kernel.org>
Signed-off-by: Anil Gurumurthy <anil.gurumurthy@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 413c2f3348 upstream.
This patch prevents the driver from setting a lower default speed of
1 GB/sec if the switch does not support the Get Port Speed Capabilities
(GPSC) command. Setting this default speed results in much lower write
performance for large sequential WRITEs. This patch modifies the driver
to check the gpsc_supported flag and prevents it from issuing
MBC_SET_PORT_PARAM (001Ah) to set a default speed of 1 GB/sec. If the
driver does not send this mailbox command, the firmware assumes the
maximum supported link speed and will operate at that speed.
Cc: stable@vger.kernel.org
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reported-by: Eda Zhou <ezhou@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Tested-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d98ba8d70 upstream.
Commit cc27b735ad ("PCI/portdrv: Turn off PCIe services during
shutdown") was added to the kernel to shut down pending PCIe port
service interrupts during reboot so that a newly started kexec kernel
wouldn't observe pending interrupts.
pcie_port_device_remove() disables the root port and switches by
calling pci_disable_device() after all PCIe service drivers are shut
down. This has been found to cause crashes on HP DL360 Gen9 machines
during reboot because the hpsa driver does not clear the bus master bit
during the shutdown procedure by calling pci_disable_device().
Disable the device as part of the shutdown sequence.
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199779
Fixes: cc27b735ad ("PCI/portdrv: Turn off PCIe services during shutdown")
Cc: stable@vger.kernel.org
Reported-by: Ryan Finnie <ryan@finnie.org>
Tested-by: Don Brace <don.brace@microsemi.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52ab9768f7 upstream.
Since commit 80c49563e2 ("scsi: scsi_debug: implement IMMED bit") there
are long delays in F_SYNC_DELAY and F_SSU_DELAY. This can cause a memory
leak in schedule_resp(), which can be invoked while unloading the
scsi_debug module: free_all_queued() had already freed all sd_dp and
schedule_resp will alloc a new one, which will never get freed. Here's the
kmemleak report while running xfstests generic/350:
unreferenced object 0xffff88007d752b00 (size 128):
comm "rmmod", pid 26940, jiffies 4295816945 (age 7.588s)
hex dump (first 32 bytes):
00 2b 75 7d 00 88 ff ff 00 00 00 00 00 00 00 00 .+u}............
00 00 00 00 00 00 00 00 8e 31 a2 34 5f 03 00 00 .........1.4_...
backtrace:
[<000000002abd83d0>] 0xffffffffa000705e
[<000000004c063fda>] scsi_dispatch_cmd+0xc7/0x1a0
[<000000000c119a00>] scsi_request_fn+0x251/0x550
[<000000009de0c736>] __blk_run_queue+0x3f/0x60
[<000000001c4453c8>] blk_execute_rq_nowait+0x98/0xd0
[<00000000d17ec79f>] blk_execute_rq+0x3a/0x50
[<00000000a7654b6e>] scsi_execute+0x113/0x250
[<00000000fd78f7cd>] sd_sync_cache+0x95/0x160
[<0000000024dacb14>] sd_shutdown+0x9b/0xd0
[<00000000e9101710>] sd_remove+0x5f/0xb0
[<00000000c43f0d63>] device_release_driver_internal+0x13c/0x1f0
[<00000000e8ad57b6>] bus_remove_device+0xe9/0x160
[<00000000713a7b8a>] device_del+0x120/0x320
[<00000000e5db670c>] __scsi_remove_device+0x115/0x150
[<00000000eccbef30>] scsi_forget_host+0x20/0x60
[<00000000cd5a0738>] scsi_remove_host+0x6d/0x120
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Luis Henriques <lhenriques@suse.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9b6de77b1 upstream.
get_user_pages_fast() for device pages is missing the typical validation
that all page references have been taken while the mapping was valid.
Without this validation truncate operations can not reliably coordinate
against new page reference events like O_DIRECT.
Cc: <stable@vger.kernel.org>
Fixes: 3565fce3a6 ("mm, x86: get_user_pages() for dax mappings")
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a5b45383c upstream.
Use 'devm_iio_kfifo_allocate()' instead of 'iio_kfifo_allocate()' in order
to simplify code and avoid a memory leak in an error path in
'sca3000_probe()'. A call to 'sca3000_unconfigure_ring()' was missing.
Sent via the next merge window as this is an unimportant bug and there
are other patches dependent on it.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7eb6b35d93 upstream.
In the current state, these attributes are broken, because they are
registered already, and the kernel throws a warning.
The first registration happens via the `IIO_CHAN_INFO_SAMP_FREQ` flag from
the `ad_sigma_delta` driver.
In this commit these attrs are removed; in a following commit the
IIO_CHAN_INFO_SAMP_FREQ behavior will be implemented, replacing these
hooks.
This is done to make things a bit easier to review as there is a bit of
overlap in the patch if it's done all at once.
Fixes: a13e831fca ("staging: iio: ad7192: implement IIO_CHAN_INFO_SAMP_FREQ")
Signed-off-by: Alexandru Ardelean <alexandru.ardelean@analog.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5b4a50b74 upstream.
If we failed during a rename exchange operation after starting/joining a
transaction, we would end up replacing the return value, stored in the
local 'ret' variable, with the return value from btrfs_end_transaction().
So this could end up returning 0 (success) to user space despite the
operation having failed and aborted the transaction, because if there are
multiple tasks having a reference on the transaction at the time
btrfs_end_transaction() is called by the rename exchange, that function
returns 0 (otherwise it returns -EIO and not the original error value).
So fix this by not overwriting the return value on error after getting
a transaction handle.
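A hedged sketch of the error-propagation pattern (variable names are
illustrative): the first failure has to survive the call to
btrfs_end_transaction() instead of being overwritten by its return
value.

	ret2 = btrfs_end_transaction(trans);
	ret = ret ? ret : ret2;		/* keep the original error, if any */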
Fixes: cdd1fedf82 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT")
CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b65c32ec5a upstream.
The signatureValue field of a X.509 certificate is encoded as a BIT STRING.
For RSA signatures this BIT STRING is of so-called primitive subtype, which
contains a u8 prefix indicating a count of unused bits in the encoding.
We have to strip this prefix from signature data, just as we already do for
key data in x509_extract_key_data() function.
This wasn't noticed earlier because this prefix byte is zero for RSA key
sizes divisible by 8. Since BIT STRING is a big-endian encoding adding zero
prefixes has no bearing on its value.
The signature length, however was incorrect, which is a problem for RSA
implementations that need it to be exactly correct (like AMD CCP).
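A hedged sketch of the stripping step in plain C (not the parser's
actual code): the first content byte of a DER BIT STRING is the
unused-bits count, which must be 0 for an RSA signature and must not be
counted as signature data.

	#include <stddef.h>

	static int strip_bit_string_prefix(const unsigned char **sig, size_t *sig_len)
	{
		if (*sig_len < 1 || (*sig)[0] != 0)
			return -1;	/* malformed, or unused bits present */
		(*sig)++;		/* skip the prefix byte ... */
		(*sig_len)--;		/* ... and correct the length */
		return 0;
	}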
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: c26fd69fa0 ("X.509: Add a crypto key parser for binary (DER) X.509 certificates")
Cc: stable@vger.kernel.org
Signed-off-by: James Morris <james.morris@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abcbcb80cd upstream.
For the common cases where 1000 is a multiple of HZ, or HZ is a multiple of
1000, jiffies_to_msecs() never returns zero when passed a non-zero time
period.
However, if HZ > 1000 and not an integer multiple of 1000 (e.g. 1024 or
1200, as used on alpha and DECstation), jiffies_to_msecs() may return zero
for small non-zero time periods. This may break code that relies on
receiving back a non-zero value.
jiffies_to_usecs() does not need such a fix: one jiffy can only be less
than one µs if HZ > 1000000, and such large values of HZ are already
rejected at build time, twice:
- include/linux/jiffies.h does #error if HZ >= 12288,
- kernel/time/time.c has BUILD_BUG_ON(HZ > USEC_PER_SEC).
Broken since forever.
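A standalone illustration of the failure mode and the obvious remedy,
using HZ = 1024 as in the alpha/DECstation case mentioned above:

	#include <stdio.h>

	#define HZ 1024
	#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

	int main(void)
	{
		unsigned long j = 1;	/* one jiffy */

		printf("truncating:  %lu ms\n", j * 1000 / HZ);			/* 0 */
		printf("rounding up: %lu ms\n", DIV_ROUND_UP(j * 1000, HZ));	/* 1 */
		return 0;
	}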
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Stephen Boyd <sboyd@kernel.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180622143357.7495-1-geert@linux-m68k.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff7c991714 upstream.
When scaling max/min settings are changed, they are internally
converted to a ratio using the maximum 1-core turbo frequency. This
works fine when the 1-core max is the same irrespective of the core.
But under Turbo 3.0, this will not be the case. For example:
Core 0: max turbo pstate: 43 (4.3 GHz)
Core 1: max turbo pstate: 45 (4.5 GHz)
In this case the 1-core turbo ratio will be the maximum of all, so it
will be 45 (4.5 GHz). Suppose scaling max is set to 4 GHz (ratio 40)
for all cores; then on the first core it will be
= max_state * policy->max / max_freq
= 43 * (4000000 / 4500000)
= 38 (3.8 GHz)
which is 200 MHz less than desired. On the second core, it will be
correctly set to ratio 40 (4 GHz). The same holds true for the scaling
min frequency limit. So this requires use of the correct turbo max
frequency for the first core, which in this case is 4.3 GHz. So we need
to adjust the per-CPU cpu->pstate.turbo_freq using the maximum HWP
ratio of that core.
This change uses the HWP capability of a core to adjust the max turbo
frequency. But since Broadwell HWP doesn't use ratios in the HWP
capabilities, we have to use the legacy max 1-core turbo ratio. This is
not a problem, as the HWP capabilities don't differ among cores on
Broadwell. We do need to check for a non-Broadwell CPU model before
applying this change, though.
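The arithmetic from the example above as a standalone sketch
(frequencies in kHz; values taken from the text of this commit):

	#include <stdio.h>

	int main(void)
	{
		const long core_max_state  = 43;	/* first core: 4.3 GHz */
		const long policy_max      = 4000000;	/* requested cap: 4.0 GHz */
		const long global_max_freq = 4500000;	/* 1-core turbo of the best core */
		const long core_max_freq   = 4300000;	/* this core's own turbo */

		/* prints 38: 200 MHz short of the requested cap */
		printf("global 1-core turbo: ratio %ld\n",
		       core_max_state * policy_max / global_max_freq);
		/* prints 40: the intended 4 GHz cap */
		printf("per-core turbo:      ratio %ld\n",
		       core_max_state * policy_max / core_max_freq);
		return 0;
	}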
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.6+ <stable@vger.kernel.org> # 4.6+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc3322bc16 upstream.
Commit b89405b610 ("pinctrl: devicetree: Fix dt_to_map_one_config
handling of hogs") causes the pinctrl hog pins to not get initialized
on i.MX platforms leaving them with the IOMUX settings untouched.
This causes several regressions on i.MX such as:
- OV5640 camera driver can not be probed anymore on imx6qdl-sabresd
because the camera clock pin is in a pinctrl_hog group and since
its pinctrl initialization is skipped, the camera clock is kept
in GPIO functionality instead of CLK_CKO function.
- Audio stopped working on imx6qdl-wandboard and imx53-qsb for
the same reason.
Richard Fitzgerald explains the problem:
"I see the bug. If the hog node isn't a 1st level child of the pinctrl
parent node it will go around the for(;;) loop again but on the first
pass I overwrite pctldev with the result of
get_pinctrl_dev_from_of_node() so it doesn't point to the pinctrl driver
any more."
Fix the issue by stashing the original pctldev so it doesn't
get overwritten.
Fixes: b89405b610 ("pinctrl: devicetree: Fix dt_to_map_one_config handling of hogs")
Cc: <stable@vger.kernel.org>
Reported-by: Mika Penttilä <mika.penttila@nextfour.com>
Reported-by: Steve Longerbeam <slongerbeam@gmail.com>
Suggested-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Reviewed-by: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cf9a338db upstream.
All banks with GPIO interrupts should be at the beginning of the bank
array, without any other type of bank between them. This order is
expected by exynos_eint_gpio_irq() when translating an interrupt group
to a bank. Otherwise, a kernel NULL pointer dereference would happen
when trying to handle an interrupt, due to the wrong bank being looked
up. Observed on s5pv210 when trying to handle a gpj0 interrupt, where
the kernel was mapping it to the gpi bank.
Cc: stable@vger.kernel.org
Fixes: 023e06dfa6 ("pinctrl: exynos: add exynos5410 SoC specific data")
Fixes: 608a26a7bc ("pinctrl: Add s5pv210 support to pinctrl-exynos")
Signed-off-by: Paweł Chmiel <pawel.mikolaj.chmiel@gmail.com>
Reviewed-by: Tomasz Figa <tomasz.figa@gmail.com>
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 702d1e81fe upstream.
Until now, if we found a spurious irq in irq_handler, we only updated
the status in the register but not the status in the code. Because of
this, the system would get stuck in an infinite loop.
[gregory.clement@bootlin.com: update comment and add fix and stable tags]
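A hedged sketch of the fix pattern (register offsets and helpers below
are illustrative, not the armada-37xx driver's): once a spurious
interrupt is acknowledged in hardware, the cached status word has to be
updated as well or the dispatch loop never terminates.

	u32 status = readl(base + IRQ_STATUS_OFF);	/* hypothetical offset */

	while (status) {
		u32 hwirq = ffs(status) - 1;

		if (!irq_is_expected(hwirq)) {		/* hypothetical check */
			writel(BIT(hwirq), base + IRQ_STATUS_OFF); /* ack in HW */
			status &= ~BIT(hwirq);		/* and in the local copy */
			continue;
		}

		generic_handle_irq(irq_find_mapping(domain, hwirq));
		status &= ~BIT(hwirq);
	}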
Fixes: 30ac0d3b07 ("pinctrl: armada-37xx: Add edge both type gpio irq support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Terry Zhou <bjzhou@marvell.com>
Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5b903fba9 upstream.
Having the CHARLCD Kconfig symbol between "menuconfig AUXDISPLAY"
and "if AUXDISPLAY" breaks the AUXDISPLAY submenus, so move the
CHARLCD Kconfig symbol near the end of the file so that the menu
display is continuous.
Also include ARM_CHARLCD inside of the if AUXDISPLAY/endif block.
Geert says that it should be there.
Fixes: 39f8ea4672 ("auxdisplay: charlcd: Extract character LCD core from misc/panel")
Cc: stable@vger.kernel.org # v4.12
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3374c545c2 upstream.
When distributing extra bus number space to hotplug bridges for future
extension, we don't account for the fact that there might be non-hotplug
bridges on the bus after the hotplug bridges. For example:
01:00.0 --+- 02:00.0 (HotPlug-) -- Thunderbolt host controller
+- 02:01.0 (HotPlug+)
\- 02:02.0 (HotPlug-) -- xHCI host controller
pci_scan_child_bus_extend() is supposed to distribute the remaining bus
numbers to the hotplug bridge at 02:01.0, but only after accounting for all
bridges on bus 02. Since we don't check whether there's another
non-hotplug bridge after the hotplug bridge 02:01.0, it may not leave space
for the non-hotplug bridge:
pci 0000:00:1b.0: PCI bridge to [bus 01-39] (Root Port)
pci 0000:01:00.0: PCI bridge to [bus 02-39]
...
pci 0000:02:00.0: PCI bridge to [bus 03]
pci 0000:02:01.0: PCI bridge to [bus 04]
pci_bus 0000:04: [bus 04-39] extended by 0x35
pci_bus 0000:04: bus scan returning with max=39
pci_bus 0000:04: busn_res: [bus 04-39] end is updated to 39
pci 0000:02:02.0: scanning [bus 00-00] behind bridge, pass 1
pci_bus 0000:3a: scanning bus
pci_bus 0000:3a: bus scan returning with max=3a
pci_bus 0000:3a: busn_res: [bus 3a] end is updated to 3a
pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:02 [bus 02-39]
pci_bus 0000:3a: [bus 3a] partially hidden behind bridge 0000:01 [bus 01-39]
pci_bus 0000:02: bus scan returning with max=3a
pci_bus 0000:02: busn_res: [bus 02-39] end can not be updated to 3a
The resulting 'lspci -t' output looks like this:
+-1b.0-[01-39]----00.0-[02-3a]--+-00.0-[03]----00.0
^^ +-01.0-[04-39]--
\-02.0-[3a]----00.0
^^
The xHCI host controller behind 02:02.0 is not usable because it would have
to be assigned bus 3a, which is not accessible through 00:1b.0.
To fix this, reserve at least one bus for each bridge while scanning
already configured bridges. Then use this information in the second
scan to correct the available extra bus space for hotplug bridges.
After this change the 'lspci -t' output is what is expected:
+-1b.0-[01-39]----00.0-[02-39]--+-00.0-[03]----00.0
+-01.0-[04-38]--
\-02.0-[39]----00.0
The xHCI controller is now on bus 39, where it is usable.
Fixes: 1c02ea8100 ("PCI: Distribute available buses to hotplug-capable bridges")
Reported-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13c65840fe upstream.
After a suspend/resume cycle the Presence Detect or Data Link Layer Status
Changed bits might be set. If we don't clear them those events will not
fire anymore and nothing happens for instance when a device is now
hot-unplugged.
Fix this by clearing those bits in a newly introduced function
pcie_reenable_notification(). This should be fine because immediately
after, we check if the adapter is still present by reading directly from
the status register.
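A hedged sketch of the clearing step (these are the standard PCIe slot
status bits; this is not necessarily the exact hunk, and pdev is an
illustrative name): both bits are RW1C, so writing 1s clears any events
latched across the suspend.

	pcie_capability_write_word(pdev, PCI_EXP_SLTSTA,
				   PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC);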
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29927dfb7f upstream.
When Linux runs as a guest VM in Hyper-V and Hyper-V adds a virtual PCI
bus to the guest, Hyper-V always provides a unique PCI domain.
commit 4a9b0933bd ("PCI: hv: Use device serial number as PCI domain")
overrode unique domain with the serial number of the first device added to
the virtual PCI bus.
The reason for that patch was to have a consistent and short name for the
device, but Hyper-V doesn't provide unique serial numbers. Using non-unique
serial numbers as domain IDs leads to duplicate device addresses, which
causes PCI bus registration to fail.
commit 0c195567a8 ("netvsc: transparent VF management") avoids the need
for commit 4a9b0933bd ("PCI: hv: Use device serial number as PCI
domain"). When scripts were used to configure VF devices, the name of
the VF needed to be consistent and short, but with commit 0c195567a8
("netvsc: transparent VF management") all the setup is done in the kernel,
and we do not need to maintain consistent name.
Revert commit 4a9b0933bd ("PCI: hv: Use device serial number as PCI
domain") so we can reliably support multiple devices being assigned to
a guest.
Tag the patch for stable kernels containing commit 0c195567a8
("netvsc: transparent VF management").
Fixes: 4a9b0933bd ("PCI: hv: Use device serial number as PCI domain")
Signed-off-by: Sridhar Pitchai <sridhar.pitchai@microsoft.com>
[lorenzo.pieralisi@arm.com: trimmed commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org # v4.14+
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e76e56823a upstream.
This commit fixes incorrect setting of the reset bits for the PCI/VGA
and PECI modules.
1. The reset bit for PCI/VGA is 8.
2. The PECI reset bit is missing, so add bit 10 as its reset bit.
Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Fixes: 15ed8ce5f8 ("clk: aspeed: Register gated clocks")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a027b47db upstream.
The erratum and workaround are described by BCM5300X-ES300-RDS.pdf as
below.
R10: PCIe Transactions Periodically Fail
Description: The BCM5300X PCIe does not maintain transaction ordering.
This may cause PCIe transaction failure.
Fix Comment: Add a dummy PCIe configuration read after a PCIe
configuration write to ensure PCIe configuration access
ordering. Set ES bit of CP0 config7 register to enable
sync function so that the sync instruction is functional.
Resolution: hndpci.c: extpci_write_config()
hndmips.c: si_mips_init()
mipsinc.h CONF7_ES
This is fixed by the CFE MIPS bcmsi chipset driver also for BCM47XX.
Also the dummy PCIe configuration read is already implemented in the
Linux BCMA driver.
Enable ExternalSync in Config7 when CONFIG_BCMA_DRIVER_PCI_HOSTMODE=y
too so that the sync instruction is externalised.
Signed-off-by: Tokunori Ikegami <ikegami@allied-telesis.co.jp>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
Cc: Rafał Miłecki <zajec5@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/19461/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cd8116f17 upstream.
The "sector is in requested range" test used to determine whether
sectors should be re-locked or not is done on a variable that is reset
every time we cross a chip boundary, which can lead to some blocks being
re-locked while the caller expects them to be unlocked.
Fix the check to make sure this cannot happen.
Fixes: 1648eaaa15 ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
Cc: stable@vger.kernel.org
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fdfc3dbad upstream.
cfi_ppb_unlock() tries to relock all sectors that were locked before
unlocking the whole chip.
This locking used the chip start address + the FULL offset from the
first flash chip, thereby forming an illegal address. Fix that by using
the chip offset (adr).
Fixes: 1648eaaa15 ("mtd: cfi_cmdset_0002: Support Persistent Protection Bits (PPB) locking")
Cc: stable@vger.kernel.org
Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f77f244d8 upstream.
The v21 version of the NAND flash controller contains a Spare Area Size
Register (SPAS) at offset 0x10. Its setting defaults to the maximum
spare area size of 218 bytes. The size that is set in this register is
used by the controller when it calculates the ECC bytes internally in
hardware.
Usually, this register is updated from settings in the IIM fuses when
the system is booting from NAND flash. For other boot media, however,
the SPAS register remains at the default setting, which may not work for
the particular flash chip on the board. The same goes for flash chips
whose configuration cannot be set in the IIM fuses (e.g. chips with 2k
sector size and 128 bytes spare area size can't be configured in the IIM
fuses on imx25 systems).
Set the SPAS register explicitly during the preset operation. Derive the
register value from mtd->oobsize that was detected during probe by
decoding the flash chip's ID bytes.
While at it, rename the define for the spare area register's offset to
NFC_V21_RSLTSPARE_AREA. The register at offset 0x10 on v1 controllers is
different from the register on v21 controllers.
Fixes: d484018 ("mtd: mxc_nand: set NFC registers after reset")
Cc: stable@vger.kernel.org
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Reviewed-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9893e6fa9 upstream.
A positive return value from read_oob() leads to false BAD blocks. For
some NAND controllers, OOB bytes are protected with ECC and read_oob()
returns the number of bitflips. If there is any bitflip in the
ECC-protected OOB bytes of the BAD block status page, then that block
gets treated as BAD.
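A hedged sketch of the corrected check (illustrative, not the exact
diff): only a negative read_oob() return is an error; a positive value
just reports corrected bitflips and the OOB contents remain valid for
the bad-block marker test.

	res = chip->ecc.read_oob(mtd, chip, page);
	if (res < 0)
		return res;			/* real read error */
	/* res >= 0: inspect the (possibly corrected) marker byte */
	bad = chip->oob_poi[chip->badblockpos] != 0xff;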
Fixes: c120e75e0e ("mtd: nand: use read_oob() instead of cmdfunc() for bad block check")
Cc: <stable@vger.kernel.org>
Signed-off-by: Abhishek Sahu <absahu@codeaurora.org>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f6e698604 upstream.
Since commit 1bb8866677 ("mtd: nand: denali: handle timing parameters
by setup_data_interface()"), denali_dt.c gets the clock rate from the
clock driver. The driver expects the frequency of the bus interface
clock, whereas the clock driver of SOCFPGA provides the core clock.
Thus, the setup_data_interface() hook calculates timing parameters
based on a wrong frequency.
To make it work without relying on the clock driver, hard-code the clock
frequency, 200MHz. This is fine for existing DT of UniPhier, and also
fixes the issue of SOCFPGA because both platforms use 200 MHz for the
bus interface clock.
Fixes: 1bb8866677 ("mtd: nand: denali: handle timing parameters by setup_data_interface()")
Cc: linux-stable <stable@vger.kernel.org> #4.14+
Reported-by: Philipp Rosenberger <p.rosenberger@linutronix.de>
Suggested-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Tested-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 782d1967d0 upstream.
The ONFI spec clearly says that FAIL bit is only valid for PROGRAM,
ERASE and READ-with-on-die-ECC operations, and should be ignored
otherwise.
It seems that checking it after sending a SET_FEATURES is a bad idea
because a previous READ, PROGRAM or ERASE op may have failed, and
depending on the implementation, the FAIL bit is not cleared until a
new READ, PROGRAM or ERASE is started.
This leads to ->set_features() returning -EIO while it actually worked,
which can sometimes stop a batch of READ/PROGRAM ops.
Note that we only fix the ->exec_op() path here, because some drivers
are abusing the NAND_STATUS_FAIL flag in their ->waitfunc()
implementation to propagate other kind of errors, like
wait-ready-timeout or controller-related errors. Let's not try to fix
those drivers since they worked fine so far.
Fixes: 8878b126df ("mtd: nand: add ->exec_op() implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Acked-by: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8f688ec43 upstream.
The use of -EAGAIN in rpcrdma_convert_iovs() is a latent bug: the
transport never calls xprt_write_space() when more pages become
available. -ENOBUFS will trigger the correct "delay briefly and call
again" logic.
Fixes: 7a89f9c626 ("xprtrdma: Honor ->send_request API contract")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # 4.8+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1eb9364ce8 upstream.
During disassociation the ucontext will become NULL, however due to how
the SRCU locking works the ucontext must only be examined after looking
at the ib_dev, which governs the RCU control flow.
With the wrong ordering userspace will see EINVAL instead of EIO for a
disassociated uverbs FD, which breaks rdma-core.
Cc: stable@vger.kernel.org
Fixes: 491d5c6a30 ("RDMA/uverbs: Move uncontext check before SRCU read lock")
Reported-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bc0299d97 upstream.
The following code fails to allocate a buffer for the
tail address that the hardware DMAs into when the user
context DMA_RTAIL is set.
    if (HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL)) {
        rcd->rcvhdrtail_kvaddr = dma_zalloc_coherent(
            &dd->pcidev->dev, PAGE_SIZE, &dma_hdrqtail,
            gfp_flags);
        if (!rcd->rcvhdrtail_kvaddr)
            goto bail_free;
        rcd->rcvhdrqtailaddr_dma = dma_hdrqtail;
    }
So rcvhdrtail_kvaddr would then be NULL.
The mmap logic fails to check for a NULL rcvhdrtail_kvaddr.
The fix is to test for both the user and kernel DMA_RTAIL options
during the allocation, as well as to test for a NULL
rcvhdrtail_kvaddr during mmap processing.
Additionally, all downstream testing of the capmask for DMA_RTAIL
has been eliminated in favor of testing rcvhdrtail_kvaddr.
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af8aab7137 upstream.
All threads queuing CQ entries on different CQs are unnecessarily
synchronized by a spin lock to check whether the CQ kthread worker has
been destroyed before queuing a CQ entry.
The lock used in 6efaf10f16 ("IB/rdmavt: Avoid queuing work into a
destroyed cq kthread worker") is a device global lock and will have
poor performance at scale as completions are entered from a large
number of CPUs.
Convert to use RCU where the read side of RCU is rvt_cq_enter() to
determine that the worker is alive prior to triggering the
completion event.
Apply write side RCU semantics in rvt_driver_cq_init() and
rvt_cq_exit().
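A conceptual sketch of the read side (simplified; field names follow the
rvt structures as far as they are known here):
    rcu_read_lock();
    worker = rcu_dereference(rdi->worker);
    if (likely(worker))
        kthread_queue_work(worker, &cq->comptask);
    rcu_read_unlock();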
Fixes: 6efaf10f16 ("IB/rdmavt: Avoid queuing work into a destroyed cq kthread worker")
Cc: <stable@vger.kernel.org> # 4.14.x
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a93a0a3111 upstream.
User send context integrity bits are cleared before the context is
disabled. If the send context is still processing data, any packets
that need those integrity bits will cause an error and halt the send
context.
During the disable handling, the driver waits for the context to drain.
If the context is halted, the driver will eventually timeout because
the context won't drain and then incorrectly bounce the link.
Reorder the bit clearing and the context disable.
Examine the software state and send context status as well as the
egress status to determine if a send context is in the halted state.
Promote the check macros to static functions for consistency with the
new check and to follow kernel style.
Remove an unused define that refers to the egress timeout.
Cc: <stable@vger.kernel.org> # 4.9.x
Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b74a83cf5 upstream.
On fatal error the driver simulates CQE's for ULPs that rely on
completion of all their posted work-requests.
For the GSI traffic, the mlx5 has its own mechanism that sends the
completions via software CQE's directly to the relevant CQ.
This should be kept in fatal error too, so the driver should simulate
such CQE's with the specified error state in order to complete GSI QP
work requests.
Without the fix the following deadlock might appear:
schedule_timeout+0x274/0x350
wait_for_common+0xec/0x240
mcast_remove_one+0xd0/0x120 [ib_core]
ib_unregister_device+0x12c/0x230 [ib_core]
mlx5_ib_remove+0xc4/0x270 [mlx5_ib]
mlx5_detach_device+0x184/0x1a0 [mlx5_core]
mlx5_unload_one+0x308/0x340 [mlx5_core]
mlx5_pci_err_detected+0x74/0xe0 [mlx5_core]
Cc: <stable@vger.kernel.org> # 4.7
Fixes: 89ea94a7b6 ("IB/mlx5: Reset flow support for IB kernel ULPs")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8f9cc328c upstream.
To allow rereg_user_mr to modify the MR from read-only to writable without
using get_user_pages again, we needed to define the initial MR as writable.
However, this was originally done unconditionally, without taking into
account the writability of the underlying virtual memory.
As a result, any attempt to register a read-only MR over read-only
virtual memory failed.
To fix this, do not add the writable flag bit when the user virtual memory
is not writable (e.g. const memory).
However, when the underlying memory is NOT writable (and we therefore
do not define the initial MR as writable), the IB core adds a
"force writable" flag to its user-pages request. If this succeeds,
the reg_user_mr caller gets a writable copy of the original pages.
If the user-space caller then does a rereg_user_mr operation to enable
writability, this will succeed. This should not be allowed, since
the original virtual memory was not writable.
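A conceptual sketch of the flag computation (the helper name is
hypothetical, only the idea matters here):
    /* Request local write access only when the backing user memory is
     * actually writable; otherwise register the MR read-only. */
    if (mlx4_ib_umem_writable(start, length))   /* hypothetical helper */
        access_flags |= IB_ACCESS_LOCAL_WRITE;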
Cc: <stable@vger.kernel.org>
Fixes: 9376932d0c ("IB/mlx4_ib: Add support for user MR re-registration")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52e1cf2d19 upstream.
Commit:
79832f0b5f ("efi/libstub/tpm: Initialize pointer variables to zero for mixed mode")
fixes a problem with the tpm code on mixed mode (64-bit kernel on 32-bit UEFI),
where 64-bit pointer variables are not fully initialized by the 32-bit EFI code.
A similar problem applies to the efi_physical_addr_t variables which
are written by the ->get_event_log() EFI call. Even though efi_physical_addr_t
is 64-bit everywhere, it seems that some 32-bit UEFI implementations only
fill in the lower 32 bits when passed a pointer to an efi_physical_addr_t
to fill.
This commit initializes these to 0, to ensure the upper 32 bits are
0 in mixed mode. This fixes recent kernels sometimes hanging during
early boot on mixed mode UEFI systems.
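The essence of the fix, as a sketch (variable names illustrative):
    /* Start both 64-bit outputs at 0 so a 32-bit UEFI that writes only
     * the low 32 bits cannot leave garbage in the upper half. */
    efi_physical_addr_t log_location = 0;
    efi_physical_addr_t log_last_entry = 0;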
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: <stable@vger.kernel.org> # v4.16+
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/20180622064222.11633-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ab2011ea3 upstream.
There is a race condition in the tpm_common_write function allowing
two threads on the same /dev/tpm<N>, or two different applications
on the same /dev/tpmrm<N>, to overwrite each other's commands/responses.
Fixed this by taking the priv->buffer_mutex early in the function.
Also converted the priv->data_pending from atomic to a regular size_t
type. There is no need for it to be atomic since it is only touched
under the protection of the priv->buffer_mutex.
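A conceptual sketch of the resulting ordering (simplified, not the exact
tpm-dev code):
    mutex_lock(&priv->buffer_mutex);
    if (priv->data_pending != 0) {
        mutex_unlock(&priv->buffer_mutex);
        return -EBUSY;
    }
    /* copy the command into priv's buffer and send it, all while the
     * mutex is held, so concurrent writers cannot interleave */
    mutex_unlock(&priv->buffer_mutex);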
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b9cf7905f upstream.
For strings, account for the trailing \0 in the property length field.
This is consistent with how dtc builds string properties.
Function __of_prop_dup() would misbehave on such properties as it duplicates
properties based on the property length field creating new string values
without trailing \0s.
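An illustrative example of the rule being enforced (hypothetical property
value, not code from the patch):
    prop->value  = kstrdup("okay", GFP_KERNEL);
    prop->length = strlen("okay") + 1;  /* 5, including the trailing \0 */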
Signed-off-by: Stefan M Schaeckeler <sschaeck@cisco.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Tested-by: Frank Rowand <frank.rowand@sony.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 482137bf2a upstream.
The smatch static checker marks the data in offset as untrusted,
leading it to warn:
drivers/of/resolver.c:125 update_usages_of_a_phandle_reference()
error: buffer underflow 'prop->value' 's32min-s32max'
Add a check to verify that the offset is within the property data.
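A sketch of the added sanity check (simplified):
    if (offset < 0 || offset + sizeof(__be32) > prop->length)
        return -EINVAL;     /* offset points outside the property data */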
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48e21ded04 upstream.
Vendor firmware/uboot has different reserved regions depending on
firmware version, but current codebase reserves the same regions on
GXL and GXBB, so move the additional reserved memory region to common
.dtsi.
Found when putting a recent vendor u-boot on meson-gxbb-p200.
Suggested-by: Neil Armstrong <narmstrong@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5b4885b1d upstream.
There is a problem with the sd-uhs mode when doing a soft reboot.
Switching back from 1.8v to 3.3v messes with the card, which no longer
respond (timeout errors). According to the specification, we should
perform a card reset (power cycling the card) but this is something we
cannot control on this design.
Then the only solution to restore the communication with the card is an
"unplug-plug" which is not acceptable
Until we find a solution, if any, disable the sd-uhs modes on this design.
For the people using uhs at the moment, there will a performance drop as
a result.
Fixes: 3cde63ebc8 ("ARM64: dts: meson-gxl: libretech-cc: enable high speed modes")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71c8fc0c96 upstream.
When rewriting swapper using nG mappings, we must perform cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.
This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.
Fixes: f992b4dfd5 ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings")
Cc: <stable@vger.kernel.org> # 4.16.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5b7dd647f upstream.
We inspect __kpti_forced early on as part of the cpufeature enable
callback which remaps the swapper page table using non-global entries.
Ensure that __kpti_forced has been updated to reflect the kpti=
command-line option before we start using it.
Fixes: ea1e3de85e ("arm64: entry: Add fake CPU feature for unmapping the kernel at EL0")
Cc: <stable@vger.kernel.org> # 4.16.x-
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fe42512b2 upstream.
Commit 17c2895 ("arm64: Abstract syscallno manipulation") abstracts
out the pt_regs.syscallno value for a syscall cancelled by a tracer
as NO_SYSCALL, and provides helpers to set and check for this
condition. However, the way this was implemented has the
unintended side-effect of disabling part of the syscall restart
logic.
This comes about because the second in_syscall() check in
do_signal() re-evaluates the "in a syscall" condition based on the
updated pt_regs instead of the original pt_regs. forget_syscall()
is explicitly called prior to the second check in order to prevent
restart logic in the ret_to_user path being spuriously triggered,
which means that the second in_syscall() check always yields false.
This triggers a failure in
tools/testing/selftests/seccomp/seccomp_bpf.c, when using ptrace to
suppress a signal that interrupts a nanosleep() syscall.
Misbehaviour of this type is only expected in the case where a
tracer suppresses a signal and the target process is either being
single-stepped or the interrupted syscall attempts to restart via
-ERESTARTBLOCK.
This patch restores the old behaviour by performing the
in_syscall() check only once at the start of the function.
Fixes: 17c2895860 ("arm64: Abstract syscallno manipulation")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reported-by: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 4.14.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 837bf7cc3b upstream.
enable_best_rng() is used in hwrng_unregister() to switch away from the
currently active RNG, if that is the one currently being removed.
However enable_best_rng() might fail, if the next RNG's init routine
fails. In that case enable_best_rng() will return an error code and
the currently active RNG will remain active.
After unregistering this might lead to crashes due to use-after-free.
Fix this by dropping the currently active RNG, if enable_best_rng()
failed. This will result in no RNG to be active, if the next-best
one failed to initialize.
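A minimal sketch of the fallback (call details simplified):
    if (enable_best_rng() < 0) {
        /* Switching away failed; drop the RNG being removed anyway so
         * nothing keeps using soon-to-be-freed state. */
        drop_current_rng();
    }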
This problem was introduced by 142a27f0a7
Fixes: 142a27f0a7 ("hwrng: core - Reset user selected rng by...")
Reported-by: Wirz <spam@lukas-wirz.de>
Tested-by: Wirz <spam@lukas-wirz.de>
Signed-off-by: Michael Büsch <m@bues.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3877ef7a1c upstream.
The NAND compatible "denali,denal-nand-dt" property has never been used and
is obsolete. Remove it.
Cc: stable@vger.kernel.org
Fixes: f549af06e9b6 ("ARM: dts: socfpga: Add NAND device tree for Arria10")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4eda9b766b upstream.
The Denali NAND x-clock should be supplied by nand_x_clk, not by
nand_clk. Fix this, otherwise the Denali driver gets incorrect
clock frequency information and incorrectly configures the NAND
timing.
Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marex@denx.de>
Fixes: d837a80d19 ("ARM: dts: socfpga: add nand controller nodes")
Cc: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bceb1f25b8 upstream.
The voltage of the VCC-1V2 regulator on the ALL-H3-CC H3 ver. should be
1.2V, not the 3.3V currently defined in the device tree.
Fix the voltage in the device tree.
Fixes: 6ca358645d ("ARM: dts: sun8i: h3: Add dts file for Libre Computer Board ALL-H3-CC H3 ver.")
Cc: <stable@vger.kernel.org> # 4.16.x
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6e7b7c2c8 upstream.
The VDD-CPUX voltage of ALL-H3-CC H3 ver should be 1.2V, not the 3.3V
currently defined in the device tree.
Fix the voltage in the device tree.
Fixes: 6ca358645d ("ARM: dts: sun8i: h3: Add dts file for Libre Computer Board ALL-H3-CC H3 ver.")
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Reviewed-by: Chen-Yu Tsai <wens@csie.org>
Cc: <stable@vger.kernel.org> # 4.16.x
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76ed0b803a upstream.
NUMREGBYTES (which is used as the size for gdb_regs[]) is incorrectly
based on DBG_MAX_REG_NUM instead of GDB_MAX_REGS. DBG_MAX_REG_NUM
is the number of total registers, while GDB_MAX_REGS is the number
of 'unsigned longs' it takes to serialize those registers. Since
FP registers require 3 'unsigned longs' each, DBG_MAX_REG_NUM is
smaller than GDB_MAX_REGS.
This causes GDB 8.0 to give the following error on connect:
"Truncated register 19 in remote 'g' packet"
This also causes the register serialization/deserialization logic
to overflow gdb_regs[], overwriting whatever follows.
Fixes: 834b2964b7 ("kgdb,arm: fix register dump")
Cc: <stable@vger.kernel.org> # 2.6.37+
Signed-off-by: David Rivshin <drivshin@allworx.com>
Acked-by: Rabin Vincent <rabin@rab.in>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6c84ba22f upstream.
Currently we see a kernel-oops reported on Power-9 while attaching a
context to an AFU, with radix-mode and sysfs attr 'prefault_mode' set
to anything other than 'none'. The backtrace of the oops is of this
form:
Unable to handle kernel paging request for data at address 0x00000080
Faulting instruction address: 0xc00800000bcf3b20
cpu 0x1: Vector: 300 (Data Access) at [c00000037f003800]
pc: c00800000bcf3b20: cxl_load_segment+0x178/0x290 [cxl]
lr: c00800000bcf39f0: cxl_load_segment+0x48/0x290 [cxl]
sp: c00000037f003a80
msr: 9000000000009033
dar: 80
dsisr: 40000000
current = 0xc00000037f280000
paca = 0xc0000003ffffe600 softe: 3 irq_happened: 0x01
pid = 3529, comm = afp_no_int
<snip>
cxl_prefault+0xfc/0x248 [cxl]
process_element_entry_psl9+0xd8/0x1a0 [cxl]
cxl_attach_dedicated_process_psl9+0x44/0x130 [cxl]
native_attach_process+0xc0/0x130 [cxl]
afu_ioctl+0x3f4/0x5e0 [cxl]
do_vfs_ioctl+0xdc/0x890
ksys_ioctl+0x68/0xf0
sys_ioctl+0x40/0xa0
system_call+0x58/0x6c
The issue arises because on Power-8 the AFU attr 'prefault_mode' was
used to improve initial storage fault performance by prefaulting
process segments. However, on Power-9 with radix mode we don't have
Storage-Segments that we can prefault, and prefaulting process Pages
would be too costly and fine-grained.
Hence, since the prefaulting mechanism doesn't make sense for
radix mode, this patch updates prefault_mode_store() to not allow any
value other than CXL_PREFAULT_NONE when radix mode is enabled.
Fixes: f24be42aab ("cxl: Add psl9 specific code")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 749a0278c2 upstream.
In the device tree CPU features quirk code we want to set
CPU_FTR_POWER9_DD2_1 on all Power9s that aren't DD2.0 or earlier. But
we got the logic wrong and instead set it on all CPUs that aren't
Power9 DD2.0 or earlier, ie. including Power8.
Fix it by making sure we're on a Power9. This isn't a bug in practice
because the only code that checks the feature is Power9 only to begin
with. But we'll backport it anyway to avoid confusion.
Fixes: 9e9626ed3a ("powerpc/64s: Fix POWER9 DD2.2 and above in DT CPU features")
Cc: stable@vger.kernel.org # v4.17+
Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 546eb0317c upstream.
This commit:
5fdf8e5ba5 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
intended to make sure that deep flush was always available even on
platforms which support a power-fail protected CPU cache. An unintended
side effect of this change was that we also lost the ability to skip
flushing CPU caches on platforms with a power-fail protected CPU cache.
Fix this by skipping the low level cache flushing in dax_flush() if we have
CPU caches which are power-fail protected. The user can still override this
behavior by manually setting the write_cache state of a namespace. See
libndctl's ndctl_namespace_write_cache_is_enabled(),
ndctl_namespace_enable_write_cache() and
ndctl_namespace_disable_write_cache() functions.
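A sketch of the resulting behaviour in dax_flush() (simplified):
    /* Only do the low-level cache writeback when write_cache is enabled
     * for this device; on power-fail protected CPU caches it now
     * defaults to off, but the user can still force it on. */
    if (dax_write_cache_enabled(dax_dev))
        arch_wb_cache_pmem(addr, size);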
Cc: <stable@vger.kernel.org>
Fixes: 5fdf8e5ba5 ("libnvdimm: re-enable deep flush for pmem devices via fsync()")
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 722cde76d6 upstream.
Unregister fadump on the kexec down path, otherwise the fadump
registration in the new kexec-ed kernel complains that fadump is already
registered. This makes the new kernel continue using the fadump registered
by the previous kernel, which may lead to invalid vmcore generation. Hence
this patch fixes the issue by un-registering fadump in fadump_cleanup(),
which is called during the kexec path, so that the new kernel can register
fadump with new, valid values.
Fixes: b500afff11 ("fadump: Invalidate registration and release reserved memory for general use.")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac9816dcba upstream.
Init all present cpus for deep states instead of "all possible" cpus.
Init fails if a possible cpu is guarded, resulting in only non-deep
states being available for cpuidle/hotplug.
Stewart says, this means that for single threaded workloads, if you
guard out a CPU core you'll not get WoF (Workload Optimised
Frequency), which means that performance goes down when you wouldn't
expect it to.
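The shape of the change, as a sketch (the helper name is hypothetical; the
point is the iterator):
    for_each_present_cpu(cpu)       /* was for_each_possible_cpu() */
        rc = init_deep_idle_for_cpu(cpu);   /* hypothetical helper */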
Fixes: 77b54e9f21 ("powernv/powerpc: Add winkle support for offline cpus")
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7574364906 upstream.
NX can set the 3rd bit in CR register for XER[SO] (Summary overflow)
which is not related to paste request. The current paste function
returns failure for a successful request when this bit is set. So mask
this bit and check the proper return status.
Fixes: 2392c8c8c0 ("powerpc/powernv/vas: Define copy/paste interfaces")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Haren Myneni <haren@us.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98fd72fe82 upstream.
When IODA2 creates a PE, it creates an IOMMU table with it_ops::free
set to pnv_ioda2_table_free() which calls pnv_pci_ioda2_table_free_pages().
Since iommu_tce_table_put() calls it_ops::free when the last reference
to the table is released, explicit call to pnv_pci_ioda2_table_free_pages()
is not needed so let's remove it.
This should fix a double free in the case of PCI hotplug, as
pnv_pci_ioda2_table_free_pages() resets neither
iommu_table::it_base nor ::it_size.
This was not exposed by SRIOV as it uses a different code path via
pnv_pcibios_sriov_disable().
IODA1 does not initialize it_ops::free so it does not have this issue.
Fixes: c5f7700bbd ("powerpc/powernv: Dynamically release PE")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd6ef7eebf upstream.
Back when we first introduced the DAWR, in commit 4ae7ebe952
("powerpc: Change hardware breakpoint to allow longer ranges"), we
screwed up the constraint making it a 1024 byte boundary rather than a
512. This makes the check overly permissive. Fortunately GDB is the
only real user and it always did the right thing, so we never
noticed.
This fixes the constraint to 512 bytes.
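A conceptual version of the corrected constraint (not the exact
hw_breakpoint code): the watched range must not cross a 512-byte boundary,
i.e. start and end must agree in all bits above bit 8.
    if ((start >> 9) != ((start + len - 1) >> 9))
        return -EINVAL;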
Fixes: 4ae7ebe952 ("powerpc: Change hardware breakpoint to allow longer ranges")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2032678e5 upstream.
Currently memory is allocated for core-imc based on cpu_present_mask,
which has bit 'cpu' set iff cpu is populated. We use (cpu number / threads
per core) as the array index to access the memory.
Under some circumstances firmware marks a CPU as GUARDed and boots the
system; until cleared of errors, these CPUs are unavailable for all
subsequent boots. GUARDed CPUs are possible but not present from Linux's
view, so it blows a hole when we assume the max length of our allocation
is driven by our max present cpus, whereas one of the cpus might be online
and beyond the max present cpus, due to the hole.
So the (cpu number / threads per core) value can exceed the array bounds
and lead to a memory overflow.
Call trace observed during a guard test:
Faulting instruction address: 0xc000000000149f1c
cpu 0x69: Vector: 380 (Data Access Out of Range) at [c000003fea303420]
pc:c000000000149f1c: prefetch_freepointer+0x14/0x30
lr:c00000000014e0f8: __kmalloc+0x1a8/0x1ac
sp:c000003fea3036a0
msr:9000000000009033
dar:c9c54b2c91dbf6b7
current = 0xc000003fea2c0000
paca = 0xc00000000fddd880 softe: 3 irq_happened: 0x01
pid = 1, comm = swapper/104
Linux version 4.16.7-openpower1 (smc@smc-desktop) (gcc version 6.4.0
(Buildroot 2018.02.1-00006-ga8d1126)) #2 SMP Fri May 4 16:44:54 PDT 2018
enter ? for help
call trace:
__kmalloc+0x1a8/0x1ac
(unreliable)
init_imc_pmu+0x7f4/0xbf0
opal_imc_counters_probe+0x3fc/0x43c
platform_drv_probe+0x48/0x80
driver_probe_device+0x22c/0x308
__driver_attach+0xa0/0xd8
bus_for_each_dev+0x88/0xb4
driver_attach+0x2c/0x40
bus_add_driver+0x1e8/0x228
driver_register+0xd0/0x114
__platform_driver_register+0x50/0x64
opal_imc_driver_init+0x24/0x38
do_one_initcall+0x150/0x15c
kernel_init_freeable+0x250/0x254
kernel_init+0x1c/0x150
ret_from_kernel_thread+0x5c/0xc8
Allocating memory for core-imc based on cpu_possible_mask, which has
bit 'cpu' set iff cpu is populatable, will fix this issue.
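A sketch of the sizing change (simplified; structure fields follow the IMC
code as far as they are known here):
    nr_cores = DIV_ROUND_UP(num_possible_cpus(), threads_per_core);
    pmu->mem_info = kcalloc(nr_cores, sizeof(*pmu->mem_info), GFP_KERNEL);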
Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Fixes: 39a846db1d ("powerpc/perf: Add core IMC PMU support")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f7c06e26e upstream.
In commit e2a800beac ("powerpc/hw_brk: Fix off by one error when
validating DAWR region end") we fixed setting the DAWR end point to
its max value via PPC_PTRACE_SETHWDEBUG. Unfortunately we broke
PTRACE_SET_DEBUGREG when setting a 512 byte aligned breakpoint.
PTRACE_SET_DEBUGREG currently sets the length of the breakpoint to
zero (memset() in hw_breakpoint_init()). This worked with
arch_validate_hwbkpt_settings() before the above patch was applied but
is now broken if the breakpoint is 512-byte aligned.
This sets the length of the breakpoint to 8 bytes when using
PTRACE_SET_DEBUGREG.
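The essence of the fix, as a sketch (simplified):
    hw_breakpoint_init(&attr);
    /* ... set bp_addr and bp_type as before ... */
    attr.bp_len = 8;    /* explicit length instead of the zeroed default */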
Fixes: e2a800beac ("powerpc/hw_brk: Fix off by one error when validating DAWR region end")
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eabdb8ca86 upstream.
Disassociate the exec_key from a VMA if the VMA permission is not
PROT_EXEC anymore. Otherwise the exec_only key continues to be
associated with the vma, causing unexpected behavior.
The problem was reported on x86 by Shakeel Butt, which is also
applicable on powerpc.
Fixes: 5586cf61e1 ("powerpc: introduce execute-only pkey")
Cc: stable@vger.kernel.org # v4.16+
Reported-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Ram Pai <linuxram@us.ibm.com>
Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91d0697188 upstream.
Currently we do not have an isync, or any other context synchronizing
instruction prior to the slbie/slbmte in _switch() that updates the
SLB entry for the kernel stack.
However that is not correct as outlined in the ISA.
From Power ISA Version 3.0B, Book III, Chapter 11, page 1133:
"Changing the contents of ... the contents of SLB entries ... can
have the side effect of altering the context in which data
addresses and instruction addresses are interpreted, and in which
instructions are executed and data accesses are performed.
...
These side effects need not occur in program order, and therefore
may require explicit synchronization by software.
...
The synchronizing instruction before the context-altering
instruction ensures that all instructions up to and including that
synchronizing instruction are fetched and executed in the context
that existed before the alteration."
And page 1136:
"For data accesses, the context synchronizing instruction before the
slbie, slbieg, slbia, slbmte, tlbie, or tlbiel instruction ensures
that all preceding instructions that access data storage have
completed to a point at which they have reported all exceptions
they will cause."
We're not aware of any bugs caused by this, but it should be fixed
regardless.
Add the missing isync when updating kernel stack SLB entry.
Cc: stable@vger.kernel.org
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
[mpe: Flesh out change log with more ISA text & explanation]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df0e91d488 upstream.
Fuse has an "atomic_o_trunc" mode, where userspace filesystem uses the
O_TRUNC flag in the OPEN request to truncate the file atomically with the
open.
In this mode there's no need to send a SETATTR request to userspace after
the open, so fuse_do_setattr() checks this mode and returns. But this
misses the important step of truncating the pagecache.
Add the missing parts of truncation to the ATTR_OPEN branch.
Reported-by: Chad Austin <chadaustin@fb.com>
Fixes: 6ff958edbf ("fuse: add atomic open+truncate support")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a301eb16d upstream.
If a connection gets aborted while congested, FUSE can leave
nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
wait spuriously which can lead to severe performance degradation.
The leak is caused by gating congestion state clearing with
fc->connected test in request_end(). This was added way back in 2009
by 26c3679101 ("fuse: destroy bdi on umount"). While the commit
description doesn't explain why the test was added, it most likely was
to avoid dereferencing bdi after it got destroyed.
Since then, bdi lifetime rules have changed many times and now we're
always guaranteed to have access to the bdi while the superblock is
alive (fc->sb).
Drop fc->connected conditional to avoid leaking congestion states.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joshua Miller <joshmiller@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: stable@vger.kernel.org # v2.6.29+
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe50a7d039 upstream.
There was one place where the timeout value for an operation was
not being set, if a capabilities request was done from idle. Move
the timeout value setting to before where that change might be
requested.
IMHO the cause here is the invisible returns in the macros. Maybe
that's a job for later, though.
Reported-by: Nordmark Claes <Claes.Nordmark@tieto.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2026d35741 upstream.
The function __builtin_expect returns long type (see the gcc
documentation), and so do the macros likely and unlikely. Unfortunately, when
CONFIG_PROFILE_ANNOTATED_BRANCHES is selected, the macros likely and
unlikely expand to __branch_check__ and __branch_check__ truncates the
long type to int. This unintended truncation may cause bugs in various
kernel code (we found a bug in dm-writecache because of it), so it's
better to fix __branch_check__ to return long.
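A tiny stand-alone illustration of why the truncation matters (plain
userspace C, not kernel code): a 64-bit value with only high bits set
collapses to 0 when squeezed through int.
    #include <stdio.h>
    /* Mimics the problem: __builtin_expect() returns long, but a wrapper
     * returning int silently truncates the tested value. */
    static int truncating_check(long cond)
    {
        return (int)__builtin_expect(cond, 1);
    }
    int main(void)
    {
        long high_bits_only = 1L << 40;
        /* Prints "1099511627776 0": the truncated value would make an
         * "if (likely(x))"-style test take the wrong branch. */
        printf("%ld %d\n", high_bits_only, truncating_check(high_bits_only));
        return 0;
    }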
Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1805300818140.24812@file01.intranet.prod.int.rdu2.redhat.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 1f0d69a9fc ("tracing: profile likely and unlikely annotations")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 756b56a9e8 upstream.
The trigger code is picky in how it can be disabled as there may be
dependencies between different events and synthetic events. Change the order
on how triggers are reset.
1) Reset triggers of all synthetic events first
2) Remove triggers with actions attached to them
3) Remove all other triggers
If this order isn't followed, then some triggers will not be reset, and an
error may happen because a trigger is busy.
Cc: stable@vger.kernel.org
Fixes: cfa0963dc4 ("kselftests/ftrace : Add event trigger testcases")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d302ed3cc upstream.
According to "EP93xx User’s Guide", I2STXLinCtrlData and I2SRXLinCtrlData
registers actually have different format. The only currently used bit
(Left_Right_Justify) has different position. Fix this and simplify the
whole setup taking into account the fact that both registers have zero
default value.
The practical effect of the above is repaired SND_SOC_DAIFMT_RIGHT_J
support (currently unused).
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d534113be upstream.
The bit responsible for LRCLK polarity is i2s_tlrs (0), not i2s_trel (2)
(refer to "EP93xx User's Guide").
Previously card drivers which specified SND_SOC_DAIFMT_NB_IF actually got
SND_SOC_DAIFMT_NB_NF, an adaptation is necessary to retain the old
behavior.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff2faf1289 upstream.
dapm_kcontrol_data is freed as part of dapm_kcontrol_free(), leaving the
paths pointer dangling in the list.
This leads to system crash when we try to unload and reload sound card.
I hit this bug during ADSP crash/reboot test case on Dragon board DB410c.
Without this patch, on SLAB Poisoning enabled build, kernel crashes with
"BUG kmalloc-128 (Tainted: G W ): Poison overwritten"
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e01827ed9 upstream.
Some low-speed and full-speed devices (for example, bluetooth)
do not have time to initialize. For them, ETIMEDOUT is a valid error.
We need to give them another try. Otherwise, they will never be
initialized correctly and dmesg will show messages like
"Bluetooth: hci0 command 0x1002 tx timeout".
Fixes: 264904ccc3 ("usb: retry reset if a device times out")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Maxim Moseychuk <franchesko.salias.hudro.pedros@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef5332c10d upstream.
This reverts commit e060d376cc ("mmc: renesas_sdhi: fix WP detection")
and adds some code to really fix the regressions.
It was missed so far that Renesas R-Car instantiations of SDHI chose to
disable internal WP and used the existence of "wp-gpios" to en/disable
WP at all.
With the first refactoring by Yamada-san with commit 2ad1db059b ("mmc:
renesas_sdhi: use MMC_CAP2_NO_WRITE_PROTECT instead of TMIO own flag"),
WP was always disabled even when GPIOs were present. With Wolfram's
first fix which gets now reverted, GPIOs were honored. But when not
available, the fallback was to internal WP and not to disabled WP. This
caused wrong WP status on uSD card slots.
Restore the old behaviour now. By default, WP is disabled. When a GPIO
is found, the GPIO re-enables WP. We will think about possible better
ways to handle this in the future.
Tested on a previously regressing Renesas Lager board (H2) and a still
working Renesas Salvator-X board (M3-W).
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Cc: stable@vger.kernel.org # v4.17+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5c2a97b3a upstream.
This commit fixes a rare but possible case when the clk rate is updated
without update of the regulator voltage.
At boot up, CPUfreq checks if the system is running at the right freq. This
is a sanity check in case a bootloader set clk rate that is outside of freq
table present with cpufreq core. In such cases system can be unstable so
better to change it to a freq that is present in the freq-table.
The CPUfreq takes next freq that is >= policy->cur and this is our
target_freq that needs to be set now.
dev_pm_opp_set_rate(dev, target_freq) checks the target_freq and the
old_freq (a current rate). If these are equal it returns early. If not,
it searches for OPP (old_opp) that fits best to old_freq (not listed in
the table) and updates old_freq (!).
Here, we can end up with old_freq = old_opp.rate = target_freq, which
is not handled in _generic_set_opp_regulator(). It's supposed to update
voltage only when freq > old_freq || freq < old_freq.
    if (freq > old_freq) {
        ret = _set_opp_voltage(dev, reg, new_supply);
    [...]
    if (freq < old_freq) {
        ret = _set_opp_voltage(dev, reg, new_supply);
        if (ret)
It results in, no voltage update while clk rate is updated.
Example:
    freq-table = {
        1000MHz 1.15V
         666MHz 1.10V
         333MHz 1.05V
    }
    boot-up-freq = 800MHz   # not listed in freq-table
    freq = target_freq = 1GHz
    old_freq = 800MHz
    old_opp = _find_freq_ceil(opp_table, &old_freq);   # (old_freq is modified!)
    old_freq = 1GHz
Fixes: 6a0712f6f1 ("PM / OPP: Add dev_pm_opp_set_rate()")
Cc: 4.6+ <stable@vger.kernel.org> # v4.6+
Signed-off-by: Waldemar Rymarkiewicz <waldemar.rymarkiewicz@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47e5abfb54 upstream.
If a device link is added via device_link_add() by the driver of the
link's consumer device, the supplier's runtime PM usage counter is
going to be dropped by the pm_runtime_put_suppliers() call in
driver_probe_device(). However, in that case it is not incremented
unless the supplier driver is already present and the link is not
stateless. That leads to a runtime PM usage counter imbalance for
the supplier device in a few cases.
To prevent that from happening, bump up the supplier runtime
PM usage counter in device_link_add() for all links with the
DL_FLAG_PM_RUNTIME flag set that are added at the consumer probe
time. Use pm_runtime_get_noresume() for that as the callers of
device_link_add() who want the supplier to be resumed by it are
expected to pass DL_FLAG_RPM_ACTIVE in flags to it anyway, but
additionally resume the supplier if the link is added during
consumer driver probe to retain the existing behavior for the
callers depending on it.
Fixes: 21d5c57b37 (PM / runtime: Use device links)
Reported-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 656088aa9b upstream.
The state_saved flag should not be cleared in pci_pm_suspend() if the
given device is going to remain suspended, or the device's config
space will not be restored properly during the subsequent resume.
Namely, if the device is going to stay in suspend, both the late
and noirq callbacks return early for it, so if its state_saved flag
is cleared in pci_pm_suspend(), it will remain unset throughout the
remaining part of suspend and resume and pci_restore_state() called
for the device going forward will return without doing anything.
For this reason, change pci_pm_suspend() to only clear state_saved
if the given device is not going to remain suspended. [This is
analogous to what commit ae860a19f3 (PCI / PM: Do not clear
state_saved in pci_pm_freeze() when smart suspend is set) did for
hibernation.]
Fixes: c4b65157ae (PCI / PM: Take SMART_SUSPEND driver flag into account)
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72038df3c5 upstream.
In case the PM domain fails to be powered on in genpd_dev_pm_attach(), it
returns -EPROBE_DEFER, but keeping the device attached to its PM domain.
This leads to problems when the next attempt to attach is re-tried. More
precisely, in that situation an -EEXIST error code is returned, because the
device already has its PM domain pointer assigned, from the first attempt.
Now, because of the sloppy error handling by the existing callers of
dev_pm_domain_attach(), probing is allowed to continue when -EEXIST is
returned. However, in such case there are no guarantees that the PM domain
is powered on by genpd, which may lead to hangs when buses/drivers try to
access their devices.
Let's fix this behaviour, simply by detaching the device when powering on
fails in genpd_dev_pm_attach().
Cc: v4.11+ <stable@vger.kernel.org> # v4.11+
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8afb1d2c12 upstream.
Commit 40f70c03e3 ("serial: sh-sci: add locking to console write
function to avoid SMP lockup") copied the strategy to avoid locking
problems in conjuncture with the console from the UART8250
driver. Instead of directly using spin_{try}lock_irqsave(),
local_irq_save() followed by spin_{try}lock() was used. While this is
correct on mainline, for -rt it is a problem. spin_{try}lock() will
check if it is running in a valid context. Since the local_irq_save()
has already been executed, the context has changed and
spin_{try}lock() will complain. The reason why spin_{try}lock()
complains is that on -rt the spin locks are turned into mutexes and
therefore can sleep. Sleeping with interrupts disabled is not valid.
BUG: sleeping function called from invalid context at /home/wagi/work/rt/v4.4-cip-rt/kernel/locking/rtmutex.c:995
in_atomic(): 0, irqs_disabled(): 128, pid: 778, name: irq/76-eth0
CPU: 0 PID: 778 Comm: irq/76-eth0 Not tainted 4.4.126-test-cip22-rt14-00403-gcd03665c8318 #12
Hardware name: Generic RZ/G1 (Flattened Device Tree)
Backtrace:
[<c00140a0>] (dump_backtrace) from [<c001424c>] (show_stack+0x18/0x1c)
r7:c06b01f0 r6:60010193 r5:00000000 r4:c06b01f0
[<c0014234>] (show_stack) from [<c01d3c94>] (dump_stack+0x78/0x94)
[<c01d3c1c>] (dump_stack) from [<c004c134>] (___might_sleep+0x134/0x194)
r7:60010113 r6:c06d3559 r5:00000000 r4:ffffe000
[<c004c000>] (___might_sleep) from [<c04ded60>] (rt_spin_lock+0x20/0x74)
r5:c06f4d60 r4:c06f4d60
[<c04ded40>] (rt_spin_lock) from [<c02577e4>] (serial_console_write+0x100/0x118)
r5:c06f4d60 r4:c06f4d60
[<c02576e4>] (serial_console_write) from [<c0061060>] (call_console_drivers.constprop.15+0x10c/0x124)
r10:c06d2894 r9:c04e18b0 r8:00000028 r7:00000000 r6:c06d3559 r5:c06d2798
r4:c06b9914 r3:c02576e4
[<c0060f54>] (call_console_drivers.constprop.15) from [<c0062984>] (console_unlock+0x32c/0x430)
r10:c06d30d8 r9:00000028 r8:c06dd518 r7:00000005 r6:00000000 r5:c06d2798
r4:c06d2798 r3:00000028
[<c0062658>] (console_unlock) from [<c0062e1c>] (vprintk_emit+0x394/0x4f0)
r10:c06d2798 r9:c06d30ee r8:00000006 r7:00000005 r6:c06a78fc r5:00000027
r4:00000003
[<c0062a88>] (vprintk_emit) from [<c0062fa0>] (vprintk+0x28/0x30)
r10:c060bd46 r9:00001000 r8:c06b9a90 r7:c06b9a90 r6:c06b994c r5:c06b9a3c
r4:c0062fa8
[<c0062f78>] (vprintk) from [<c0062fb8>] (vprintk_default+0x10/0x14)
[<c0062fa8>] (vprintk_default) from [<c009cd30>] (printk+0x78/0x84)
[<c009ccbc>] (printk) from [<c025afdc>] (credit_entropy_bits+0x17c/0x2cc)
r3:00000001 r2:decade60 r1:c061a5ee r0:c061a523
r4:00000006
[<c025ae60>] (credit_entropy_bits) from [<c025bf74>] (add_interrupt_randomness+0x160/0x178)
r10:466e7196 r9:1f536000 r8:fffeef74 r7:00000000 r6:c06b9a60 r5:c06b9a3c
r4:dfbcf680
[<c025be14>] (add_interrupt_randomness) from [<c006536c>] (irq_thread+0x1e8/0x248)
r10:c006537c r9:c06cdf21 r8:c0064fcc r7:df791c24 r6:df791c00 r5:ffffe000
r4:df525180
[<c0065184>] (irq_thread) from [<c003fba4>] (kthread+0x108/0x11c)
r10:00000000 r9:00000000 r8:c0065184 r7:df791c00 r6:00000000 r5:df791d00
r4:decac000
[<c003fa9c>] (kthread) from [<c00101b8>] (ret_from_fork+0x14/0x3c)
r8:00000000 r7:00000000 r6:00000000 r5:c003fa9c r4:df791d00
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Daniel Wagner <daniel.wagner@siemens.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7d6a82d90 upstream.
On many older systems using SW sequencer the PREOP_OPTYPE register
contains two preopcodes as following:
PREOP_OPTYPE=0xf2785006
The last two bytes are the opcodes decoded to:
0x50 - Write enable for volatile status register
0x06 - Write enable
The former is used to modify volatile bits in the status register. For
non-volatile bits the latter is needed. Preopcodes are used in SW
sequencer to send one command "atomically" without anything else
interfering with the transfer. The sequence that gets executed is:
- Send preopcode (write enable) from PREOP_OPTYPE register
- Send the actual SPI command
- Poll busy bit in the status register (0x05, RDSR)
Commit 8c473dd61b ("spi-nor: intel-spi: Don't assume OPMENU0/1 to be
programmed by BIOS") enabled atomic sequence handling but because both
preopcodes are programmed, the following happens:
    if (preop >> 8)
        val |= SSFSTS_CTL_SPOP;
Since on these systems preop >> 8 == 0x50 we end up picking volatile
write enable instead. Because of this the actual write command is pretty
much NOP unless there is a WREN latched in the chip already.
Furthermore we should not really just assume that WREN was issued in
previous call to intel_spi_write_reg() because that might not be the
case.
This updates driver to first check that the opcode is actually available
in PREOP_OPTYPE register and if not return error back to the spi-nor
core (if the controller is not locked we program it now). In addition we
save the opcode to ispi->atomic_preopcode field which is checked in next
call to intel_spi_sw_cycle() to actually enable atomic sequence using
the requested preopcode.
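A conceptual sketch of the opcode selection (simplified):
    /* Pick the PREOP slot that matches the write enable actually
     * requested instead of blindly selecting the second slot. */
    if (ispi->atomic_preopcode == (preop & 0xff))
        ;                               /* slot 0, SPOP stays clear */
    else if (ispi->atomic_preopcode == ((preop >> 8) & 0xff))
        val |= SSFSTS_CTL_SPOP;         /* slot 1, e.g. the 0x50 case */
    else
        return -EINVAL;                 /* opcode not programmed */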
Fixes: 8c473dd61b ("spi-nor: intel-spi: Don't assume OPMENU0/1 to be programmed by BIOS")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b410b12266 upstream.
Older models of Chromebooks did not describe the LPC EC in their ACPI
tables; starting with Strago-based devices Google is using GOOG0004 device
to describe EC LPC.
DMI-based match is fragile and does not work reliably, especially when
using custom firmware. It is also not needed when we can locate the right
ACPI device, so let's stop bailing out when DMI does not match but the
right ACPI device is present.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Benson Leung <bleung@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f90f9ef2d upstream.
If 020/030 support is enabled, get_io_area() leaves an IO_SIZE gap
between mappings which is added to the vm_struct representing the
mapping. __ioremap() uses the actual requested size (after alignment),
while __iounmap() is passed the size from the vm_struct.
On 020/030, early termination descriptors are used to set up mappings of
extent 'size', which are validated on unmapping. The unmapped gap of
size IO_SIZE defeats the sanity check of the pmd tables, causing
__iounmap() to loop forever on 030.
On 040/060, unmapping of page table entries does not check for a valid
mapping, so the unmapping loop always completes there.
Adjust size to be unmapped by the gap that had been added in the
vm_struct prior.
This fixes the hang in atari_platform_init() reported a long time ago,
and a similar one reported by Finn recently (addressed by removing
ioremap() use from the SWIM driver).
Tested on my Falcon in 030 mode - untested but should work the same on
040/060 (the extra page tables cleared there would never have been set
up anyway).
Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
[geert: Minor commit description improvements]
[geert: This was fixed in 2.4.23, but not in 2.5.x]
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f74c8a647 upstream.
mce_no_way_out() does a quick check during #MC to see whether some of
the MCEs logged would require the kernel to panic immediately. And it
passes a struct mce where MCi_STATUS gets written.
However, after having saved a valid status value, the next iteration
of the loop which goes over the MCA banks on the CPU, overwrites the
valid status value because we're using struct mce as storage instead of
a temporary variable, which leads to MCE records with an empty status value:
mce: [Hardware Error]: CPU 0: Machine Check Exception: 6 Bank 0: 0000000000000000
mce: [Hardware Error]: RIP 10:<ffffffffbd42fbd7> {trigger_mce+0x7/0x10}
In order to prevent the loss of the status register value, return
immediately when severity is a panic one so that we can panic
immediately with the first fatal MCE logged. This is also the intention
of this function and not to noodle over the banks while a fatal MCE is
already logged.
Tony: read the rest of the MCA bank to populate the struct mce fully.
Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180622095428.626-8-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40c36e2741 upstream.
Some injection testing resulted in the following console log:
mce: [Hardware Error]: CPU 22: Machine Check Exception: f Bank 1: bd80000000100134
mce: [Hardware Error]: RIP 10:<ffffffffc05292dd> {pmem_do_bvec+0x11d/0x330 [nd_pmem]}
mce: [Hardware Error]: TSC c51a63035d52 ADDR 3234bc4000 MISC 88
mce: [Hardware Error]: PROCESSOR 0:50654 TIME 1526502199 SOCKET 0 APIC 38 microcode 2000043
mce: [Hardware Error]: Run the above through 'mcelog --ascii'
Kernel panic - not syncing: Machine check from unknown source
This confused everybody because the first line quite clearly shows
that we found a logged error in "Bank 1", while the last line says
"unknown source".
The problem is that the Linux code doesn't do the right thing
for a local machine check that results in a fatal error.
It turns out that we know very early in the handler whether the
machine check is fatal. The call to mce_no_way_out() has checked
all the banks for the CPU that took the local machine check. If
it says we must crash, we can do so right away with the right
messages.
We do scan all the banks again. This means that we might initially
not see a problem, but during the second scan find something fatal.
If this happens we print a slightly different message (so I can
see if it actually ever happens).
[ bp: Remove unneeded severity assignment. ]
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: stable@vger.kernel.org # 4.2
Link: http://lkml.kernel.org/r/52e049a497e86fd0b71c529651def8871c804df0.1527283897.git.tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7d606f560 upstream.
Since we added support to add recovery from some errors inside the kernel in:
commit b2f9d678e2 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
we have done a less than stellar job at reporting the cause of recoverable
machine checks that occur in other parts of the kernel. The user just gets
the unhelpful message:
mce: [Hardware Error]: Machine check: Action required: unknown MCACOD
doubly unhelpful when they check the manual for the reported IA32_MSR_STATUS.MCACOD
and see that it is listed as one of the standard recoverable values.
Add an extra rule to the MCE severity table to catch this case and report it
as:
mce: [Hardware Error]: Machine check: Data load in unrecoverable area of kernel
Fixes: b2f9d678e2 ("x86/mce: Check for faults tagged in EXTABLE_CLASS_FAULT exception table entries")
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: stable@vger.kernel.org # 4.6+
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/4cc7c465150a9a48b8b9f45d0b840278e77eb9b5.1527283897.git.tony.luck@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eab6870fee upstream.
Mark Rutland noticed that GCC optimization passes have the potential to elide
necessary invocations of the array_index_mask_nospec() instruction sequence,
so mark the asm() volatile.
Mark explains:
"The volatile will inhibit *some* cases where the compiler could lift the
array_index_nospec() call out of a branch, e.g. where there are multiple
invocations of array_index_nospec() with the same arguments:
	if (idx < foo) {
		idx1 = array_idx_nospec(idx, foo);
		do_something(idx1);
	}
	< some other code >
	if (idx < foo) {
		idx2 = array_idx_nospec(idx, foo);
		do_something_else(idx2);
	}
... since the compiler can determine that the two invocations yield the same
result, and reuse the first result (likely the same register as idx was in
originally) for the second branch, effectively re-writing the above as:
	if (idx < foo) {
		idx = array_idx_nospec(idx, foo);
		do_something(idx);
	}
	< some other code >
	if (idx < foo) {
		do_something_else(idx);
	}
... if we don't take the first branch, then speculatively take the second, we
lose the nospec protection.
There's more info on volatile asm in the GCC docs:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
"
Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: babdde2698 ("x86: Implement array_index_mask_nospec")
Link: https://lkml.kernel.org/lkml/152838798950.14521.4893346294059739135.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7810e6781e upstream.
In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
allocations that can ignore memory policies. The zonelist is obtained
from current CPU's node. This is a problem for __GFP_THISNODE
allocations that want to allocate on a different node, e.g. because the
allocating thread has been migrated to a different CPU.
This has been observed to break SLAB in our 4.4-based kernel, because
there it relies on __GFP_THISNODE working as intended. If a slab page
is put on wrong node's list, then further list manipulations may corrupt
the list because page_to_nid() is used to determine which node's
list_lock should be locked and thus we may take a wrong lock and race.
Current SLAB implementation seems to be immune by luck thanks to commit
511e3a0588 ("mm/slab: make cache_grow() handle the page allocated on
arbitrary node") but there may be others assuming that __GFP_THISNODE
works as promised.
We can fix it by simply removing the zonelist reset completely. There
is actually no reason to reset it, because memory policies and cpusets
don't affect the zonelist choice in the first place. This was different
when commit 183f6371aa ("mm: ignore mempolicies when using
ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
own restricted zonelists.
We might consider this for 4.17 although I don't know if there's
anything currently broken.
SLAB is currently not affected, but in kernels older than 4.7 that don't
yet have 511e3a0588 ("mm/slab: make cache_grow() handle the page
allocated on arbitrary node") it is. That's at least 4.4 LTS. Older
ones I'll have to check.
So stable backports should be more important, but will have to be
reviewed carefully, as the code went through many changes. BTW I think
that also the ac->preferred_zoneref reset is currently useless if we
don't also reset ac->nodemask from a mempolicy to NULL first (which we
probably should for the OOM victims etc?), but I would leave that for a
separate patch.
Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 183f6371aa ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cc41e0995 upstream.
When registering a new binfmt_misc handler, it is possible to overflow
the offset to get a negative value, which might crash the system, or
possibly leak kernel data.
Here is a crash log when 2500000000 was used as an offset:
BUG: unable to handle kernel paging request at ffff989cfd6edca0
IP: load_misc_binary+0x22b/0x470 [binfmt_misc]
PGD 1ef3e067 P4D 1ef3e067 PUD 0
Oops: 0000 [#1] SMP NOPTI
Modules linked in: binfmt_misc kvm_intel ppdev kvm irqbypass joydev input_leds serio_raw mac_hid parport_pc qemu_fw_cfg parpy
CPU: 0 PID: 2499 Comm: bash Not tainted 4.15.0-22-generic #24-Ubuntu
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
RIP: 0010:load_misc_binary+0x22b/0x470 [binfmt_misc]
Call Trace:
search_binary_handler+0x97/0x1d0
do_execveat_common.isra.34+0x667/0x810
SyS_execve+0x31/0x40
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Use kstrtoint instead of simple_strtoul. It will work as the code
already set the delimiter byte to '\0' and we only do it when the field
is not empty.
Tested with offsets -1, 2500000000, UINT_MAX and INT_MAX. Also tested
with examples documented at Documentation/admin-guide/binfmt-misc.rst
and other registrations from packages on Ubuntu.
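A rough user-space illustration of the failure mode (parse_int_strict() below is only a stand-in for kstrtoint, not the kernel helper): a simple_strtoul-style parse silently truncates 2500000000 into a negative int offset, while a range-checked parse rejects it.

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* stand-in for kstrtoint(): strict base-10 parse into an int with range check */
static int parse_int_strict(const char *s, int *out)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(s, &end, 10);
	if (errno || end == s || *end != '\0' || val < INT_MIN || val > INT_MAX)
		return -EINVAL;
	*out = (int)val;
	return 0;
}

int main(void)
{
	const char *field = "2500000000";
	int offset = (int)strtoul(field, NULL, 10);	/* old-style unchecked parse */
	int checked;

	printf("unchecked parse: %d\n", offset);	/* negative on common platforms */
	if (parse_int_strict(field, &checked))
		printf("strict parse rejects %s\n", field);
	return 0;
}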
Link: http://lkml.kernel.org/r/20180529135648.14254-1-cascardo@canonical.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 670ae9caac upstream.
struct vhost_msg within struct vhost_msg_node is copied to userspace.
Unfortunately it turns out on 64 bit systems vhost_msg has padding after
type which gcc doesn't initialize, leaking 4 uninitialized bytes to
userspace.
This padding also unfortunately means 32 bit users of this interface are
broken on a 64 bit kernel which will need to be fixed separately.
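A user-space sketch of the layout issue (struct demo_msg below is a simplified stand-in for struct vhost_msg, not the real definition): the union's 8-byte alignment leaves a 4-byte hole after 'type', which a plain field-by-field initialization never writes.

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* simplified stand-in for struct vhost_msg: an int followed by a union
 * whose members force 8-byte alignment */
struct demo_msg {
	int type;
	union {
		uint64_t iotlb_like;	/* placeholder for struct vhost_iotlb_msg */
		uint8_t padding[64];
	};
};

int main(void)
{
	struct demo_msg msg;

	printf("union starts at offset %zu, sizeof = %zu\n",
	       offsetof(struct demo_msg, padding), sizeof(msg));
	/* on 64-bit builds the union starts at offset 8, so bytes 4..7 are a
	 * hole; zero the whole struct before any copy-to-userspace-style copy */
	memset(&msg, 0, sizeof(msg));
	msg.type = 1;
	return 0;
}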
Fixes: CVE-2018-1118
Cc: stable@vger.kernel.org
Reported-by: Kevin Easton <kevin@guarana.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: syzbot+87cfa083e727a224754b@syzkaller.appspotmail.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d471b6b22d upstream.
The HID descriptor for the 2nd-gen Intuos Pro large (PTH-860) contains
a typo which defines an incorrect logical maximum Y value. This causes
a small portion of the bottom of the tablet to become unusable (both
because the area is below the "bottom" of the tablet and because
'wacom_wac_event' ignores out-of-range values). It also results in a
skewed aspect ratio.
To fix this, we add a quirk to 'wacom_usage_mapping' which overwrites
the data with the correct value.
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
CC: stable@vger.kernel.org # v4.10+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebeaa36754 upstream.
The current ISH driver only registers suspend/resume PM callbacks, which don't
support hibernation (suspend to disk). Basically, after hibernation the ISH
can't resume properly and the user may not see sensor events (for example:
screen rotation may not work).
User will not see a crash or panic or anything except the following message
in log:
hid-sensor-hub 001F:8086:22D8.0001: timeout waiting for response from ISHTP device
So this patch adds support for S4/hibernation to ISH by using the
SIMPLE_DEV_PM_OPS() macro instead of struct dev_pm_ops directly. The suspend
and resume functions will now be used for both suspend to RAM and hibernation.
If power management is disabled, SIMPLE_DEV_PM_OPS will do nothing, the suspend
and resume related functions won't be used, so mark them as __maybe_unused to
clarify that this is the intended behavior, and remove #ifdefs for power
management.
Cc: stable@vger.kernel.org
Signed-off-by: Even Xu <even.xu@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f6a4b4c9d0 upstream.
As long as a symlink inode remains in-core, the destination (and
therefore size) will not be re-fetched from the server, as it cannot
change. The original implementation of the attribute cache assumed that
setting the expiry time in the past was sufficient to cause a re-fetch
of all attributes on the next getattr. That does not work in this case.
The bug manifested itself as follows. When the command sequence
touch foo; ln -s foo bar; ls -l bar
is run, the output was
lrwxrwxrwx. 1 fedora fedora 4906 Apr 24 19:10 bar -> foo
However, after a re-mount, ls -l bar produces
lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo
After this commit, even before a re-mount, the output is
lrwxrwxrwx. 1 fedora fedora 3 Apr 24 19:10 bar -> foo
Reported-by: Becky Ligon <ligon@clemson.edu>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Fixes: 71680c18c8 ("orangefs: Cache getattr results.")
Cc: stable@vger.kernel.org
Cc: hubcap@omnibond.com
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9039d98581 upstream.
The page loading code trusts the data provided in the firmware images
a bit too much and may cause a buffer overflow or copy unknown data if
the block sizes don't match what we expect.
To prevent potential problems, harden the code by checking if the
sizes we are copying are what we expect.
Cc: stable@vger.kernel.org
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7832cd2a3 upstream.
Since commit cb84343fce ("media: lirc: do not call close() or open() on
unregistered devices") rc_open() will return -ENODEV if rcdev->registered
is false. Ensure this is set before we register the input device and the
lirc device, else we have a short window where neither the lirc nor the
input device can be opened.
Fixes: cb84343fce ("media: lirc: do not call close() or open() on unregistered devices")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0dc68cabdb upstream.
The addition of an extra operation to use the GET_INFO command
overwrites all existing flags from the uvc_ctrls table. This includes
setting all controls as supporting GET_MIN, GET_MAX, GET_RES, and
GET_DEF regardless of whether they do or not.
Move the initialisation of these control capabilities directly to the
uvc_ctrl_fill_xu_info() call where they were originally located in that
use case, and ensure that the new functionality in uvc_ctrl_get_flags()
will only set flags based on their reported capability from the GET_INFO
call.
Fixes: 859086ae36 ("media: uvcvideo: Apply flags from device to actual properties")
Cc: stable@vger.kernel.org
Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com>
Tested-by: Guennadi Liakhovetski <guennadi.liakhovetski@intel.com>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12f47073a4 upstream.
The case that interrupt affinity setting fails with -EBUSY can be handled
in the kernel completely by using the already available generic pending
infrastructure.
If an irq_chip::set_affinity() fails with -EBUSY, handle it like the
interrupts for which irq_chip::set_affinity() can only be invoked from
interrupt context. Copy the new affinity mask to irq_desc::pending_mask and
set the affinity pending bit. The next raised interrupt for the affected
irq will check the pending bit and try to set the new affinity from the
handler. This avoids returning -EBUSY to user space when an affinity change is
requested while the previous change has not yet been cleaned up. The new
affinity will take effect when the next interrupt is raised
from the device.
Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.819273597@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a33a5d2d16 upstream.
The generic pending interrupt mechanism moves interrupts from the interrupt
handler on the original target CPU to the new destination CPU. This is
required for x86 and ia64 due to the way the interrupt delivery and
acknowledge works if the interrupts are not remapped.
However that update can fail for various reasons. Some of them are valid
reasons to discard the pending update, but the case, when the previous move
has not been fully cleaned up is not a legit reason to fail.
Check the return value of irq_do_set_affinity() for -EBUSY, which indicates
a pending cleanup, and rearm the pending move in the irq descriptor so it's
tried again when the next interrupt arrives.
Fixes: 996c591227 ("x86/irq: Plug vector cleanup race")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.386544292@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0255770cc upstream.
apic_ack_edge() is explicitly for handling interrupt affinity cleanup when
interrupt remapping is not available or disabled.
Remapped interrupts and also some of the platform specific special
interrupts, e.g. UV, invoke ack_APIC_irq() directly.
To address the issue of failing an affinity update with -EBUSY the delayed
affinity mechanism can be reused, but ack_APIC_irq() does not handle
that. Adding this to ack_APIC_irq() is not possible, because that function
is also used for exceptions and directly handled interrupts like IPIs.
Create a new function, which just contains the conditional invocation of
irq_move_irq() and the final ack_APIC_irq().
Reuse the new function in apic_ack_edge().
Preparatory change for the real fix.
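Roughly, the new helper boils down to something like this sketch (not a literal copy of the upstream diff):

/* sketch: pair the pending-move handling with the final APIC ack so that
 * callers which bypass apic_ack_edge() still retrigger a delayed affinity
 * update */
void apic_ack_irq(struct irq_data *irqd)
{
	irq_move_irq(irqd);	/* no-op unless an affinity move is pending */
	ack_APIC_irq();
}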
Fixes: dccfe3147b ("x86/vector: Simplify vector move cleanup")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <liu.song.a23@gmail.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tariq Toukan <tariqt@mellanox.com>
Link: https://lkml.kernel.org/r/20180604162224.471925894@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 80ae7b1a91 upstream.
Several people observed the WARN_ON() in irq_matrix_free() which triggers
when the caller tries to free a vector which is not in the allocation
range. Song provided the trace information which allowed the root cause to be
decoded.
The rework of the vector allocation mechanism failed to preserve a sanity
check, which prevents setting a new target vector/CPU when the previous
affinity change has not fully completed.
As a result a half finished affinity change can be overwritten, which can
cause the leak of an irq descriptor pointer on the previous target CPU and
double enqueue of the hlist head into the cleanup lists of two or more
CPUs. After one CPU cleaned up its vector the next CPU will invoke the
cleanup handler with vector 0, which triggers the out of range warning in
the matrix allocator.
Prevent this by checking the interrupt's apic_data to verify that the
move_in_progress flag is false and the hlist node is not hashed. Return
-EBUSY if not.
This prevents the damage and restores the behaviour before the vector
allocation rework, but due to other changes in that area it also widens the
chance that user space can observe -EBUSY. In theory this should be fine,
but actually not all user space tools handle -EBUSY correctly. Addressing
that is not part of this fix, but will be addressed in follow up patches.
Fixes: 69cde0004a ("x86/vector: Use matrix allocator for vector assignment")
Reported-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reported-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Song Liu <liu.song.a23@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Song Liu <songliubraving@fb.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: Mike Travis <mike.travis@hpe.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/20180604162224.303870257@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 955bc61328 upstream.
According to the API, you may only call clk_get_rate() after actually
enabling it.
Found by Linux Driver Verification project (linuxtesting.org).
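The shape of the fix, sketched from the description above (the 'mdev' name and the surrounding error handling are illustrative):

/* sketch: enable the clock first, only then read its rate */
ret = clk_prepare_enable(mdev->clk);
if (ret)
	return ret;

clkrate = clk_get_rate(mdev->clk);	/* valid only while the clock is enabled */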
Fixes: a5fd9139f7 ("w1: add 1-wire master driver for i.MX27 / i.MX31")
Signed-off-by: Stefan Potyra <Stefan.Potyra@elektrobit.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc1d5e749a upstream.
AER handling expects a successful return from slot_reset means the
driver made the device functional again. The nvme driver had been using
an asynchronous reset to recover the device, so the device
may still be initializing after control is returned to the
AER handler. This creates problems for subsequent event handling,
causing the initialization to fail.
This patch fixes that by syncing the controller reset before returning
to the AER driver, and reporting the true state of the reset.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199657
Reported-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Cc: Sinan Kaya <okaya@codeaurora.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cfce3a86b upstream.
Commit 184add2ca2 ("libata: Apply NOLPM quirk for SanDisk
SD7UB3Q*G1001 SSDs") disabled LPM for SanDisk SD7UB3Q*G1001 SSDs.
This has led to several reports from users of that SSD for whom LPM
was working fine and who now have a significantly increased idle
power consumption on their laptops.
Likely there is another problem on the T450s from the original
reporter which gets exposed by the uncore reaching deeper sleep
states (higher PC-states) due to LPM being enabled. The problem as
reported, a hard freeze about once a day, already did not sound like
it would be caused by LPM and the reports of the SSD working fine
confirm this. The original reporter is ok with dropping the quirk.
An X250 user has reported the same hard freeze problem and for him
the problem went away after unrelated updates, I suspect some GPU
driver stack changes fixed things.
TL;DR: The original reporter's problem was triggered by LPM but is not
an LPM issue, so drop the quirk for the SSD in question.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1583207
Cc: stable@vger.kernel.org
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Reported-by: Lorenzo Dalrio <lorenzo.dalrio@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: "Richard W.M. Jones" <rjones@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7592019634 upstream.
According to current code implementation, detecting the long
idle period is done by checking if the interval between two
adjacent utilization update handlers is long enough. Although
this mechanism can detect if the idle period is long enough
(no utilization hooks invoked during idle period), it might
not cover a corner case: if the task has occupied the CPU
for too long which causes no context switches during that
period, then no utilization handler will be launched until this
high prio task is scheduled out. As a result, the idle_periods
field might be calculated incorrectly because it regards the
100% load as 0%, which confuses the conservative governor that
uses this field.
Change the detection to compare the idle_time with sampling_rate
directly.
Reported-by: Artem S. Tashkinov <t.artem@mailcity.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5d295b06d upstream.
Commit 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when
failure) has fixed a memory leak in the failure path, however
the patch returned a positive value on get_cpu_device() failure
instead of the previous negative value. Fix this incorrect error
return value properly.
Fixes: 05829d9431 (cpufreq: ti-cpufreq: kfree opp_data when failure)
Cc: 4.14+ <stable@vger.kernel.org> # v4.14+
Signed-off-by: Suman Anna <s-anna@ti.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7d1f119c4 upstream.
If the policy limits are updated via cpufreq_update_policy() and
subsequently via sysfs, the limits stored in user_policy may be
set incorrectly.
For example, if both min and max are set via sysfs to the maximum
available frequency, user_policy.min and user_policy.max will also
be the maximum. If a policy notifier triggered by
cpufreq_update_policy() lowers both the min and the max at this
point, that change is not reflected by the user_policy limits, so
if the max is updated again via sysfs to the same lower value,
then user_policy.max will be lower than user_policy.min which
shouldn't happen. In particular, if one of the policy CPUs is
then taken offline and back online, cpufreq_set_policy() will
fail for it due to a failing limits check.
To prevent that from happening, initialize the min and max fields
of the new_policy object to the ones stored in user_policy that
were previously set via sysfs.
Signed-off-by: Kevin Wangtao <kevin.wangtao@hisilicon.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Subject & changelog ]
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f183464684 upstream.
cgwb_release() punts the actual release to cgwb_release_workfn() on
system_wq. Depending on the number of cgroups or block devices, there
can be a lot of cgwb_release_workfn() in flight at the same time.
We're periodically seeing close to 256 kworkers getting stuck with the
following stack trace, and over time the entire system gets stuck.
[<ffffffff810ee40c>] _synchronize_rcu_expedited.constprop.72+0x2fc/0x330
[<ffffffff810ee634>] synchronize_rcu_expedited+0x24/0x30
[<ffffffff811ccf23>] bdi_unregister+0x53/0x290
[<ffffffff811cd1e9>] release_bdi+0x89/0xc0
[<ffffffff811cd645>] wb_exit+0x85/0xa0
[<ffffffff811cdc84>] cgwb_release_workfn+0x54/0xb0
[<ffffffff810a68d0>] process_one_work+0x150/0x410
[<ffffffff810a71fd>] worker_thread+0x6d/0x520
[<ffffffff810ad3dc>] kthread+0x12c/0x160
[<ffffffff81969019>] ret_from_fork+0x29/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
The events leading to the lockup are...
1. A lot of cgwb_release_workfn() is queued at the same time and all
system_wq kworkers are assigned to execute them.
2. They all end up calling synchronize_rcu_expedited(). One of them
wins and tries to perform the expedited synchronization.
3. However, that involves queueing rcu_exp_work to system_wq and
waiting for it. Because #1 is holding all available kworkers on
system_wq, rcu_exp_work can't be executed. cgwb_release_workfn()
is waiting for synchronize_rcu_expedited() which in turn is waiting
for cgwb_release_workfn() to free up some of the kworkers.
We shouldn't be scheduling hundreds of cgwb_release_workfn() at the
same time. There's nothing to be gained from that. This patch
updates the cgwb release path to use a dedicated percpu workqueue with
@max_active of 1.
While this resolves the problem at hand, it might be a good idea to
isolate rcu_exp_work to its own workqueue too as it can be used from
various paths and is prone to this sort of indirect A-A deadlocks.
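A sketch of the described approach (identifiers are illustrative, not a literal copy of the patch): allocate a workqueue limited to one in-flight work item and queue the release work there instead of on system_wq.

/* sketch: dedicated release workqueue with @max_active of 1 */
static struct workqueue_struct *cgwb_release_wq;

static int __init cgwb_init(void)
{
	cgwb_release_wq = alloc_workqueue("cgwb_release", 0, 1);
	if (!cgwb_release_wq)
		return -ENOMEM;
	return 0;
}

/* and in cgwb_release(), queue on the dedicated workqueue rather than system_wq: */
	queue_work(cgwb_release_wq, &wb->release_work);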
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a347c7ad8e upstream.
It is not allowed to reinit the q->tag_set_list list entry while an RCU grace
period has not completed yet, otherwise the following soft lockup in
blk_mq_sched_restart() happens:
[ 1064.252652] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [fio:9270]
[ 1064.254445] task: ffff99b912e8b900 task.stack: ffffa6d54c758000
[ 1064.254613] RIP: 0010:blk_mq_sched_restart+0x96/0x150
[ 1064.256510] Call Trace:
[ 1064.256664] <IRQ>
[ 1064.256824] blk_mq_free_request+0xea/0x100
[ 1064.256987] msg_io_conf+0x59/0xd0 [ibnbd_client]
[ 1064.257175] complete_rdma_req+0xf2/0x230 [ibtrs_client]
[ 1064.257340] ? ibtrs_post_recv_empty+0x4d/0x70 [ibtrs_core]
[ 1064.257502] ibtrs_clt_rdma_done+0xd1/0x1e0 [ibtrs_client]
[ 1064.257669] ib_create_qp+0x321/0x380 [ib_core]
[ 1064.257841] ib_process_cq_direct+0xbd/0x120 [ib_core]
[ 1064.258007] irq_poll_softirq+0xb7/0xe0
[ 1064.258165] __do_softirq+0x106/0x2a2
[ 1064.258328] irq_exit+0x92/0xa0
[ 1064.258509] do_IRQ+0x4a/0xd0
[ 1064.258660] common_interrupt+0x7a/0x7a
[ 1064.258818] </IRQ>
Meanwhile another context frees other queue but with the same set of
shared tags:
[ 1288.201183] INFO: task bash:5910 blocked for more than 180 seconds.
[ 1288.201833] bash D 0 5910 5820 0x00000000
[ 1288.202016] Call Trace:
[ 1288.202315] schedule+0x32/0x80
[ 1288.202462] schedule_timeout+0x1e5/0x380
[ 1288.203838] wait_for_completion+0xb0/0x120
[ 1288.204137] __wait_rcu_gp+0x125/0x160
[ 1288.204287] synchronize_sched+0x6e/0x80
[ 1288.204770] blk_mq_free_queue+0x74/0xe0
[ 1288.204922] blk_cleanup_queue+0xc7/0x110
[ 1288.205073] ibnbd_clt_unmap_device+0x1bc/0x280 [ibnbd_client]
[ 1288.205389] ibnbd_clt_unmap_dev_store+0x169/0x1f0 [ibnbd_client]
[ 1288.205548] kernfs_fop_write+0x109/0x180
[ 1288.206328] vfs_write+0xb3/0x1a0
[ 1288.206476] SyS_write+0x52/0xc0
[ 1288.206624] do_syscall_64+0x68/0x1d0
[ 1288.206774] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
What happened is the following:
1. There are several MQ queues with shared tags.
2. One queue is about to be freed and now task is in
blk_mq_del_queue_tag_set().
3. Other CPU is in blk_mq_sched_restart() and loops over all queues in
tag list in order to find hctx to restart.
Because the linked list entry was modified in blk_mq_del_queue_tag_set()
without properly waiting for a grace period, blk_mq_sched_restart()
never ends, spinning in list_for_each_entry_rcu_rr(), thus the soft lockup.
The fix is simple: reinit the list entry after an RCU grace period has elapsed.
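Sketched from the description (not the literal diff): unlink the entry with the RCU list primitive, wait out a grace period, and only then reinitialize it.

/* sketch: in blk_mq_del_queue_tag_set() */
	mutex_lock(&set->tag_list_lock);
	list_del_rcu(&q->tag_set_list);
	mutex_unlock(&set->tag_list_lock);

	/* let concurrent list_for_each_entry_rcu_rr() walkers finish */
	synchronize_rcu();

	INIT_LIST_HEAD(&q->tag_set_list);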
Fixes: 705cda97ee ("blk-mq: Make it safe to use RCU to iterate over blk_mq_tag_set.tag_list")
Cc: stable@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-block@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e2b19675d upstream.
When we stopped relying on the bdev everywhere I broke updating the
block device size on the fly, which ceph relies on. We can't just do
set_capacity, we also have to do bd_set_size so things like parted will
notice the device size change.
Fixes: 29eaadc ("nbd: stop using the bdev everywhere")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3f7c93976 upstream.
I messed up changing the size of an NBD device while it was connected by
not actually updating the device or doing the uevent. Fix this by
updating everything if we're connected and we change the size.
cc: stable@vger.kernel.org
Fixes: 639812a ("nbd: don't set the device size until we're connected")
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8364da4751 upstream.
This fixes a use-after-free bug; we shouldn't be using disk->queue right
after we do del_gendisk(disk). Save the queue and do the cleanup after
the del_gendisk.
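The pattern described above, as a sketch (the surrounding nbd code is omitted):

/* sketch: grab the queue before del_gendisk(), since disk->queue must not
 * be touched afterwards */
	struct request_queue *q = disk->queue;

	del_gendisk(disk);
	blk_cleanup_queue(q);	/* use the saved pointer, not disk->queue */
	put_disk(disk);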
Fixes: c6a4759ea0 ("nbd: add device refcounting")
cc: stable@vger.kernel.org
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2adf22fdf upstream.
The server detects reconnect by the (non-zero) value in PreviousSessionId
of SMB2/SMB3 SessionSetup request, but this behavior regressed due
to commit 166cea4dc3
("SMB2: Separate RawNTLMSSP authentication from SMB2_sess_setup")
CC: Stable <stable@vger.kernel.org>
CC: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 985c78d3ff upstream.
Each of the strings that we want to put into the buf[MAX_FLAG_OPT_SIZE]
in flags_read() is two characters long. But the sprintf() adds
a trailing newline and will add a terminating NUL byte. So
MAX_FLAG_OPT_SIZE needs to be 4.
sprintf() calls vsnprintf() and *that* does return:
" * The return value is the number of characters which would
* be generated for the given input, excluding the trailing
* '\0', as per ISO C99."
Note the "excluding".
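A quick user-space check of that arithmetic (the "hu" string and the "%s\n" format here are stand-ins for the actual flag strings): two characters, the trailing newline, and the terminating NUL need a 4-byte buffer, and the returned count excludes the NUL.

#include <stdio.h>

#define MAX_FLAG_OPT_SIZE 4	/* two characters + '\n' + '\0' */

int main(void)
{
	char buf[MAX_FLAG_OPT_SIZE];
	int n = snprintf(buf, sizeof(buf), "%s\n", "hu");

	/* prints 3: the '\0' is written into buf but not counted */
	printf("snprintf returned %d, buffer size %zu\n", n, sizeof(buf));
	return 0;
}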
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: linux-edac <linux-edac@vger.kernel.org>
Link: http://lkml.kernel.org/r/20180427163707.ktaiysvbk3yhk4wm@agluck-desk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3aa60d511 upstream.
When 'kzalloc()' fails in 'snd_hda_attach_pcm_stream()', a new pcm instance is
created without setting its operators via 'snd_pcm_set_ops()'. Following
operations on the new pcm instance can trigger kernel null pointer dereferences
and cause a kernel oops.
This bug was found with my work on building a gray-box fault-injection tool for
linux-kernel-module binaries. A kernel null pointer dereference was confirmed
from line 'substream->ops->open()' in function 'snd_pcm_open_substream()' in
file 'sound/core/pcm_native.c'.
This patch fixes the bug by calling 'snd_device_free()' in the error handling
path of 'kzalloc()', which removes the new pcm instance from the snd card before
returning with an error code.
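A sketch of the described error path (variable names are assumed from the description; the rest of the function is omitted):

	apcm = kzalloc(sizeof(*apcm), GFP_KERNEL);
	if (apcm == NULL) {
		/* drop the pcm created by snd_pcm_new() above so that no
		 * ops-less pcm instance stays attached to the card */
		snd_device_free(chip->card, pcm);
		return -ENOMEM;
	}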
Signed-off-by: Bo Chen <chenbo@pdx.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 986376b68d upstream.
We have several Lenovo AIOs like M810z, M820z and M920z, they have
the same design for mic-mute hotkey and led and they use the same
codec with the same pin configuration, so use the pin conf table to
apply fix to all of them.
Fixes: 29693efcea ("ALSA: hda - Fix micmute hotkey problem for a lenovo AIO machine")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ebf6b1e45 upstream.
The commit 33193dca67 ("ALSA: usb-audio: Add a quirk for Nura's
first gen headset") added a quirk for Nura headset with USB ID
0a12:1243, in the hope that it doesn't conflict with others.
Unfortunately, other devices (e.g. Philips Wecall) with the very same
ID got broken by this change, spewing an error like:
usb 2-1.8.2: 2:1: cannot set freq 48000 to ep 0x3
Until we find a proper solution, fix the regression at first by
disabling the added quirk entry.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199905
Fixes: 33193dca67 ("ALSA: usb-audio: Add a quirk for Nura's first gen headset")
Reviewed-by: Martin Peres <martin.peres@free.fr>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac0b4145d6 upstream.
[BUG]
Btrfs can create a compressed extent without checksum (even though it
shouldn't), and if we then try to replace a device containing such an extent,
the resulting device will contain all the uncompressed data instead of the
compressed one.
Test case already submitted to fstests:
https://patchwork.kernel.org/patch/10442353/
[CAUSE]
When handling a compressed extent without checksum, device replace will
go into the copy_nocow_pages() function.
In that function, btrfs will get all inodes referring to this data
extent and then use find_or_create_page() to get pages directly from those
inodes.
The problem here is that pages taken directly from an inode are always
uncompressed, so for a compressed data extent they mismatch the on-disk data.
Thus this leads to corrupted compressed data extent written to replace
device.
[FIX]
In this attempt, we could just remove the "optimization" branch, and let
the unified scrub_pages() handle it.
Although scrub_pages() won't bother reusing the page cache and will be a
little slower, it does the correct csum checking and won't cause the
data corruption caused by the "optimization".
Note about the fix: this is the minimal fix that can be backported to
older stable trees without conflicts. The whole callchain from
copy_nocow_pages() can be deleted, and will be in followup patches.
Fixes: ff023aac31 ("Btrfs: add code to scrub to copy read data to another disk")
CC: stable@vger.kernel.org # 4.4+
Reported-by: James Harvey <jamespharvey20@gmail.com>
Reviewed-by: James Harvey <jamespharvey20@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ remove code removal, add note why ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 090a127afa upstream.
In cow_file_range(), create_io_em() may fail, but its return value is
not recorded. The return value may then be 0 even though it failed, which
is wrong behavior.
Let cow_file_range() return PTR_ERR(em) if create_io_em() failed.
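Sketched from the description (the out_reserve label is assumed from the surrounding error paths): capture the ERR_PTR value instead of discarding it.

	/* em comes from create_io_em(); previously its error was ignored */
	if (IS_ERR(em)) {
		ret = PTR_ERR(em);
		goto out_reserve;
	}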
Fixes: 6f9994dbab ("Btrfs: create a helper to create em for IO")
CC: stable@vger.kernel.org # 4.11+
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd4e994bd1 upstream.
If we have invalid flags set, when we error out we must drop our writer
counter and free the buffer we allocated for the arguments. This bug is
trivially reproduced with the following program on 4.7+:
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>
int main(int argc, char **argv)
{
	struct btrfs_ioctl_vol_args_v2 vol_args = {
		.flags = UINT64_MAX,
	};
	int ret;
	int fd;

	if (argc != 2) {
		fprintf(stderr, "usage: %s PATH\n", argv[0]);
		return EXIT_FAILURE;
	}

	fd = open(argv[1], O_WRONLY);
	if (fd == -1) {
		perror("open");
		return EXIT_FAILURE;
	}

	ret = ioctl(fd, BTRFS_IOC_RM_DEV_V2, &vol_args);
	if (ret == -1)
		perror("ioctl");

	close(fd);
	return EXIT_SUCCESS;
}
When unmounting the filesystem, we'll hit the
WARN_ON(mnt_get_writers(mnt)) in cleanup_mnt() and also may prevent the
filesystem to be remounted read-only as the writer count will stay
lifted.
Fixes: 6b526ed70c ("btrfs: introduce device delete by devid")
CC: stable@vger.kernel.org # 4.9+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5c40d598f upstream.
In btrfs_clone_files(), we must check the NODATASUM flag while the
inodes are locked. Otherwise, it's possible that btrfs_ioctl_setflags()
will change the flags after we check and we can end up with a party
checksummed file.
The race window is only a few instructions in size, between the if and
the locks which is:
3834 if (S_ISDIR(src->i_mode) || S_ISDIR(inode->i_mode))
3835 return -EISDIR;
where the setflags must be run and toggle the NODATASUM flag (provided
the file size is 0). The clone will block on the inode lock, setflags
takes the inode lock, changes flags, releases the lock and the clone continues.
Not impossible but still needs a lot of bad luck to hit unintentionally.
Fixes: 0e7b824c4e ("Btrfs: don't make a file partly checksummed through file clone")
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ update changelog ]
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37becec95a upstream.
I got a report that after upgrading to 4.16, someone's filesystems
weren't mounting:
[ 23.845852] BTRFS info (device loop0): unrecognized mount option 'subvol='
Before 4.16, this mounted the default subvolume. It turns out that this
empty "subvol=" is actually an application bug, but it was causing the
application to fail, so it's an ABI break if you squint.
The generic parsing code we use for mount options (match_token())
doesn't match an empty string as "%s". Previously, setup_root_args()
removed the "subvol=" string, but the mount path was cleaned up to not
need that. Add a dummy Opt_subvol_empty to fix this.
The simple workaround is to use / or . for the value of 'subvol=' .
Fixes: 312c89fbca ("btrfs: cleanup btrfs_mount() using btrfs_mount_root()")
CC: stable@vger.kernel.org # 4.16+
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f2f76f751 upstream.
ext4_resize_fs() has an off-by-one bug when checking whether growing of
a filesystem will not overflow inode count. As a result it allows a
filesystem with 8192 inodes per group to grow to 64TB, which overflows
the inode count to 0 and makes the filesystem unusable. Fix it.
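A user-space illustration of the arithmetic (assuming the ext4 defaults of 4 KiB blocks and 32768 blocks per group, so 64 TB corresponds to 2^19 groups; this is not the exact ext4 check): with 8192 inodes per group the 32-bit inode counter wraps to exactly 0, and only an inclusive bound check catches the boundary case.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint64_t groups = 1ULL << 19;		/* ~64 TB with 4 KiB blocks, 32768 blocks/group */
	uint64_t inodes_per_group = 8192;
	uint64_t inodes = groups * inodes_per_group;	/* 2^32 */

	printf("true inode count: %llu, stored in 32 bits: %u\n",
	       (unsigned long long)inodes, (uint32_t)inodes);

	if (inodes > UINT32_MAX)
		printf("inclusive check: resize rejected\n");
	if (!(inodes > UINT32_MAX + 1ULL))
		printf("off-by-one check: boundary value slips through and wraps to %u\n",
		       (uint32_t)inodes);
	return 0;
}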
Cc: stable@vger.kernel.org
Fixes: 3f8a6411fb
Reported-by: Jaco Kroon <jaco@uls.co.za>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a2b307c21 upstream.
Ext4 will always create ext4 extended attributes which do not have a
value (where e_value_size is zero) with e_value_offs set to zero. In
most places e_value_offs will not be used in a substantive way if
e_value_size is zero.
There was one exception to this, which is in ext4_xattr_set_entry(),
where if there is a maliciously crafted file system where there is an
extended attribute with e_value_offs is non-zero and e_value_size is
0, the attempt to remove this xattr will result in a negative value
getting passed to memmove, leading to the following sadness:
[ 41.225365] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[ 44.538641] BUG: unable to handle kernel paging request at ffff9ec9a3000000
[ 44.538733] IP: __memmove+0x81/0x1a0
[ 44.538755] PGD 1249bd067 P4D 1249bd067 PUD 1249c1067 PMD 80000001230000e1
[ 44.538793] Oops: 0003 [#1] SMP PTI
[ 44.539074] CPU: 0 PID: 1470 Comm: poc Not tainted 4.16.0-rc1+ #1
...
[ 44.539475] Call Trace:
[ 44.539832] ext4_xattr_set_entry+0x9e7/0xf80
...
[ 44.539972] ext4_xattr_block_set+0x212/0xea0
...
[ 44.540041] ext4_xattr_set_handle+0x514/0x610
[ 44.540065] ext4_xattr_set+0x7f/0x120
[ 44.540090] __vfs_removexattr+0x4d/0x60
[ 44.540112] vfs_removexattr+0x75/0xe0
[ 44.540132] removexattr+0x4d/0x80
...
[ 44.540279] path_removexattr+0x91/0xb0
[ 44.540300] SyS_removexattr+0xf/0x20
[ 44.540322] do_syscall_64+0x71/0x120
[ 44.540344] entry_SYSCALL_64_after_hwframe+0x21/0x86
https://bugzilla.kernel.org/show_bug.cgi?id=199347
This addresses CVE-2018-10840.
Reported-by: "Xu, Wen" <wen.xu@gatech.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Fixes: dec214d00e ("ext4: xattr inode deduplication")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb9b5f01c3 upstream.
If ext4_find_inline_data_nolock() returns an error it needs to get
reflected up to ext4_iget(). In order to fix this,
ext4_iget_extra_inode() needs to return an error (and not return
void).
This is related to "ext4: do not allow external inodes for inline
data" (which fixes CVE-2018-11412) in that in the errors=continue
case, it would be useful for userspace to receive an error
indicating that the file system is corrupted.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 117166efb1 upstream.
The inline data feature was implemented before we added support for
external inodes for xattrs. It makes no sense to support that
combination, but the problem is that there are a number of extended
attribute checks that are skipped if e_value_inum is non-zero.
Unfortunately, the inline data code is completely e_value_inum
unaware, and attempts to interpret the xattr fields as if it were an
inline xattr --- at which point, Hilarity Ensues.
This addresses CVE-2018-11412.
https://bugzilla.kernel.org/show_bug.cgi?id=199803
Reported-by: Jann Horn <jannh@google.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: e50e5129f3 ("ext4: xattr-in-inode support")
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eee597ac93 upstream.
Currently in ext4_punch_hole we're going to skip the mtime update if
there are no actual blocks to release. However we've actually modified
the file by zeroing the partial block so the mtime should be updated.
Moreover the sync and datasync handling is skipped as well, which is
also wrong. Fix it.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Joe Habermann <joe.habermann@quantum.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ee3ee06a8 upstream.
When ext4_ind_map_blocks() computes the length of a hole, it doesn't account
for the fact that the mapped offset may be somewhere in the middle of a
completely empty subtree. In that case it will return too large a length
for the hole, which then results in lseek(SEEK_DATA) returning
an incorrect offset beyond the end of the hole.
Fix the problem by correctly taking offset within a subtree into account
when computing a length of a hole.
Fixes: facab4d971
CC: stable@vger.kernel.org
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5088814a6e upstream.
This change alters the parser so that the table load does not abort
upon an error.
Notable changes:
If there is an error while parsing an element of the termlist, we
will skip parsing the current termlist element and continue parsing
to the next opcode in the termlist.
If we get an error while parsing the conditional of If/Else/While or
the device name of Scope, we will skip the body of the statement all
together and pop the parser_state.
If we get an error while parsing the base offset and length of an
operation region declaration, we will remove the operation region
from the namespace.
Signed-off-by: Erik Schmauss <erik.schmauss@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 52acf73b6e ]
Recently people reported the NIC stops working after
"ifdown eth0; ifup eth0". It turns out in this case the TX queues are not
enabled, after the refactoring of the common detach logic: when the NIC
has sub-channels, usually we enable all the TX queues after all
sub-channels are set up: see rndis_set_subchannel() ->
netif_device_attach(), but in the case of "ifdown eth0; ifup eth0" where
the number of channels doesn't change, we also must make sure the TX queues
are enabled. The patch fixes the regression.
Fixes: 7b2ee50c0c ("hv_netvsc: common detach logic")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fd3a886258 ]
Tun, tap, virtio, packet and uml vector all use struct virtio_net_hdr
to communicate packet metadata to userspace.
For skbuffs with vlan, the first two return the packet as it may have
existed on the wire, inserting the VLAN tag in the user buffer. Then
virtio_net_hdr.csum_start needs to be adjusted by VLAN_HLEN bytes.
Commit f09e2249c4 ("macvtap: restore vlan header on user read")
added this feature to macvtap. Commit 3ce9b20f19 ("macvtap: Fix
csum_start when VLAN tags are present") then fixed up csum_start.
Virtio, packet and uml do not insert the vlan header in the user
buffer.
When introducing virtio_net_hdr_from_skb to deduplicate filling in
the virtio_net_hdr, the variant from macvtap which adds VLAN_HLEN was
applied uniformly, breaking csum offset for packets with vlan on
virtio and packet.
Make insertion of VLAN_HLEN optional. Convert the callers to pass it
when needed.
Fixes: e858fae2b0 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Fixes: 1276f24eee ("packet: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6c206b2009 ]
After commit 6b229cf77d ("udp: add batching to udp_rmem_release()")
the sk_rmem_alloc field no longer measures the receive queue length
exactly, because we batch the rmem release. The issue
is really apparent only after commit 0d4a6608f6 ("udp: do rmem bulk
free even if the rx sk queue is empty"): the user space can easily
check for an empty socket with not-0 queue length reported by the 'ss'
tool or the procfs interface.
We need to use a custom UDP helper to report the correct queue length,
taking into account the forward allocation deficit.
Reported-by: trevor.francis@46labs.com
Fixes: 6b229cf77d ("UDP: add batching to udp_rmem_release()")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6d8c50dcb0 ]
fchownat() doesn't even hold a refcnt on the fd until it figures out
the fd is really needed (otherwise it is ignored) and releases it after
it resolves the path. This means sock_close() could race with
sockfs_setattr(), which leads to a NULL pointer dereference
since typically we set sock->sk to NULL in ->release().
As pointed out by Al, this is unique to sockfs. So we can fix this
in socket layer by acquiring inode_lock in sock_close() and
checking against NULL in sockfs_setattr().
sock_release() is called in many places, only the sock_close()
path matters here. And fortunately, this should not affect normal
sock_close() as it is only called when the last fd refcnt is gone.
It only affects sock_close() with a parallel sockfs_setattr() in
progress, which is not common.
Fixes: 86741ec254 ("net: core: Add a UID field to struct sock.")
Reported-by: shankarapailoor <shankarapailoor@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 06030dbaf3 ]
Current behavior in tls_sw_recvmsg() is to wait for incoming tls
messages and copy up to exactly len bytes of data that the user
provided. This is problematic in the sense that i) if no packet
is currently queued in strparser we keep waiting until one has been
processed and pushed into tls receive layer for tls_wait_data() to
wake up and push the decrypted bits to user space. Given after
tls decryption, we're back at streaming data, use sock_rcvlowat()
hint from tcp socket instead. Retain current behavior with MSG_WAITALL
flag and otherwise use the hint target for breaking the loop and
returning to application. This is done if currently no ctx->recv_pkt
is ready, otherwise continue to process it from our strparser
backlog.
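For context, sock_rcvlowat() reflects what an application sets with SO_RCVLOWAT; a minimal user-space example of configuring that hint on a TCP socket (the value is arbitrary):

#include <stdio.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int lowat = 512;	/* receive calls may return once 512 bytes are ready */

	if (fd < 0 || setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
				 &lowat, sizeof(lowat)) < 0) {
		perror("SO_RCVLOWAT");
		return 1;
	}
	printf("receive low-water mark set to %d bytes\n", lowat);
	return 0;
}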
Fixes: c46234ebb4 ("tls: RX path for ktls")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a447da7d00 ]
syzkaller managed to trigger a use-after-free in tls like the
following:
BUG: KASAN: use-after-free in tls_push_record.constprop.15+0x6a2/0x810 [tls]
Write of size 1 at addr ffff88037aa08000 by task a.out/2317
CPU: 3 PID: 2317 Comm: a.out Not tainted 4.17.0+ #144
Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016
Call Trace:
dump_stack+0x71/0xab
print_address_description+0x6a/0x280
kasan_report+0x258/0x380
? tls_push_record.constprop.15+0x6a2/0x810 [tls]
tls_push_record.constprop.15+0x6a2/0x810 [tls]
tls_sw_push_pending_record+0x2e/0x40 [tls]
tls_sk_proto_close+0x3fe/0x710 [tls]
? tcp_check_oom+0x4c0/0x4c0
? tls_write_space+0x260/0x260 [tls]
? kmem_cache_free+0x88/0x1f0
inet_release+0xd6/0x1b0
__sock_release+0xc0/0x240
sock_close+0x11/0x20
__fput+0x22d/0x660
task_work_run+0x114/0x1a0
do_exit+0x71a/0x2780
? mm_update_next_owner+0x650/0x650
? handle_mm_fault+0x2f5/0x5f0
? __do_page_fault+0x44f/0xa50
? mm_fault_error+0x2d0/0x2d0
do_group_exit+0xde/0x300
__x64_sys_exit_group+0x3a/0x50
do_syscall_64+0x9a/0x300
? page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This happened through fault injection where aead_req allocation in
tls_do_encryption() eventually failed and we returned -ENOMEM from
the function. Turns out that the use-after-free is triggered from
tls_sw_sendmsg() in the second tls_push_record(). The error then
triggers a jump to waiting for memory in sk_stream_wait_memory()
resp. returning immediately in case of MSG_DONTWAIT. What follows is
the trim_both_sgl(sk, orig_size), which drops elements from the sg
list added via tls_sw_sendmsg(). Now the use-after-free gets triggered
when the socket is being closed, where tls_sk_proto_close() callback
is invoked. The tls_complete_pending_work() will figure that there's
a pending closed tls record to be flushed and thus calls into the
tls_push_pending_closed_record() from there. ctx->push_pending_record()
is called from the latter, which is the tls_sw_push_pending_record()
from sw path. This again calls into tls_push_record(). And here the
tls_fill_prepend() will panic since the buffer address has been freed
earlier via trim_both_sgl(). One way to fix it is to move the aead
request allocation out of tls_do_encryption() early into tls_push_record().
This means we don't prep the tls header and advance state to the
TLS_PENDING_CLOSED_RECORD before the allocation, which could potentially
fail, has happened. That fixes the issue on my side.
Fixes: 3c4d755915 ("tls: kernel TLS support")
Reported-by: syzbot+5c74af81c547738e1684@syzkaller.appspotmail.com
Reported-by: syzbot+709f2810a6a05f11d4d3@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4fd44a98ff ]
commit 079096f103 ("tcp/dccp: install syn_recv requests into ehash
table") introduced an optimization for the handling of child sockets
created for a new TCP connection.
But this optimization passes any data associated with the last ACK of the
connection handshake up the stack without verifying its checksum, because it
calls tcp_child_process(), which in turn calls tcp_rcv_state_process()
directly. These lower-level processing functions do not do any checksum
verification.
Insert a tcp_checksum_complete call in the TCP_NEW_SYN_RECEIVE path to
fix this.
Fixes: 079096f103 ("tcp/dccp: install syn_recv requests into ehash table")
Signed-off-by: Frank van der Linden <fllinden@amazon.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8d499533e0 ]
use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA
netlink attribute, in case it is less than SIMP_MAX_DATA and it does not
end with '\0' character.
v2: fix errors in the commit message, thanks Hangbin Liu
Fixes: fa1b1cff3d ("net_cls_act: Make act_simple use of netlink policy.")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b718e8c8f4 ]
DP83620 register set is compatible with the DP83848, but it also supports
100base-FX. When the hardware is configured such that fiber mode is
enabled, autonegotiation is not possible.
The chip, however, doesn't expose this information via BMSR_ANEGCAPABLE.
Instead, this bit is always set high, even if the particular hardware
configuration makes it so that auto negotiation is not possible [1]. Under
these circumstances, the phy subsystem keeps trying for autonegotiation to
happen, without success.
Therefore, we inspect the BMCR_ANENABLE bit after genphy_config_init, which on
reset is set to 0 when auto negotiation is disabled, and we use this
value instead of BMSR_ANEGCAPABLE.
[1] https://e2e.ti.com/support/interface/ethernet/f/903/p/697165/2571170
Signed-off-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 349b71d6f4 ]
When pskb_trim_rcsum fails, the lack of error-handling code may
cause unexpected results.
This patch adds error-handling code after calling pskb_trim_rcsum.
Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0975764684 ]
IPVS setups with a local client and a remote tunnel server need
to create an exception for the local virtual IP. What we do is to
change PMTU from 64KB (on "lo") to 1460 in the common case.
Suggested-by: Martin KaFai Lau <kafai@fb.com>
Fixes: 45e4fd2668 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Fixes: 7343ff31eb ("ipv6: Don't create clones of host routes.")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eb55bbf865 ]
There is a timing issue under active-standby mode: when bond_enslave() is
called, bond->params.primary might not be initialized yet.
Any time the primary slave string changes, bond->force_primary should be
set to true to make sure the primary becomes the active slave.
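A rough sketch of the intent, with a hypothetical example_bond structure in place of struct bonding: any change to the primary option string also sets force_primary, so re-selection happens even if bond_enslave() raced with the option write.

    #include <linux/string.h>
    #include <linux/types.h>

    struct example_bond {
            char primary[16];
            bool force_primary;
    };

    static void example_set_primary(struct example_bond *bond, const char *name)
    {
            if (strcmp(bond->primary, name)) {
                    strscpy(bond->primary, name, sizeof(bond->primary));
                    bond->force_primary = true;     /* make it become active */
            }
    }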
Signed-off-by: Xiangning Yu <yuxiangning@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58d813afbe upstream.
This was originally mistakenly submitted to net-next. Resubmitting to net.
The comparison of numvecs < 0 is always false because numvecs is a u32,
and hence the error return from a failed call to pci_alloc_irq_vectors
is never detected. Fix this by using a signed int to handle the error
return before assigning the result to numvecs.
Detected by CoverityScan, CID#1468650 ("Unsigned compared against 0")
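A sketch of the fix pattern (not the aquantia code itself): capture the return value of pci_alloc_irq_vectors() in a signed variable so a negative error code is detectable, then assign the successful count to the unsigned field.

    #include <linux/pci.h>

    static int example_alloc_vectors(struct pci_dev *pdev, u32 *numvecs)
    {
            int err;

            err = pci_alloc_irq_vectors(pdev, 1, *numvecs, PCI_IRQ_MSIX);
            if (err < 0)
                    return err;             /* visible now that err is signed */

            *numvecs = err;                 /* vectors actually allocated */
            return 0;
    }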
Fixes: a09bd81b54 ("net: aquantia: Limit number of vectors to actually allocated irqs")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 730f23b660 upstream.
In p8_aes_xts_init() we do a printk(KERN_INFO ...) to report the
fallback implementation we're using. However with a slow console this
can significantly affect the speed of crypto operations. So remove it.
Fixes: c07f5d3da6 ("crypto: vmx - Adding support for XTS")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1411b5218a upstream.
In the vmx AES init routines we do a printk(KERN_INFO ...) to report
the fallback implementation we're using.
However with a slow console this can significantly affect the speed of
crypto operations. Using 'cryptsetup benchmark' the removal of the
printk() leads to a ~5x speedup for aes-cbc decryption.
So remove them.
Fixes: 8676590a15 ("crypto: vmx - Adding AES routines for VMX module")
Fixes: 8c755ace35 ("crypto: vmx - Adding CBC routines for VMX module")
Fixes: 4f7f60d312 ("crypto: vmx - Adding CTR routines for VMX module")
Fixes: cc333cd68d ("crypto: vmx - Adding GHASH routines for VMX module")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c782a8c43e upstream.
After issuing a request, an endless loop was used to read the
completion state from memory which is asynchronously updated
by the ZIP coprocessor.
Add an upper bound to the retry attempts to prevent a CPU getting stuck
forever in case of an error. Additionally, add a read memory barrier
and a small delay between the reading attempts.
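A sketch of the bounded-poll pattern described above; EXAMPLE_MAX_WAIT and the 5 microsecond delay are illustrative values, not the driver's.

    #include <linux/compiler.h>
    #include <linux/delay.h>
    #include <linux/errno.h>
    #include <asm/barrier.h>

    #define EXAMPLE_MAX_WAIT 10000          /* hypothetical retry bound */

    static int example_wait_for_completion(const u8 *done)
    {
            int i;

            for (i = 0; i < EXAMPLE_MAX_WAIT; i++) {
                    rmb();                  /* observe the coprocessor's store */
                    if (READ_ONCE(*done))
                            return 0;
                    udelay(5);
            }
            return -ETIMEDOUT;      /* give up instead of spinning forever */
    }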
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37ff02acaa upstream.
Enabling virtual mapped kernel stacks breaks the thunderx_zip
driver. On compression or decompression the executing CPU hangs
in an endless loop. The reason for this is the driver's use of __pa,
which no longer works for an address that is not part of the 1:1
mapping.
The zip driver allocates a result struct on the stack and needs
to tell the hardware the physical address within this struct
that is used to signal the completion of the request.
As the hardware gets the wrong address after the broken __pa
conversion it writes to an arbitrary address. The zip driver then
waits forever for the completion byte to contain a non-zero value.
Allocating the result struct from 1:1 mapped memory resolves this
bug.
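A tiny sketch of the allocation change, using a hypothetical example_zip_result: kmalloc() memory lies in the kernel's linear (1:1) mapping, so __pa()/virt_to_phys() yields the address the hardware will really write to, unlike a CONFIG_VMAP_STACK stack variable.

    #include <linux/slab.h>

    struct example_zip_result {
            u8 completion;          /* hardware sets this to non-zero */
    };

    static struct example_zip_result *example_alloc_result(void)
    {
            /* Linear-mapped memory: safe to hand its physical address to HW. */
            return kmalloc(sizeof(struct example_zip_result), GFP_KERNEL);
    }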
Signed-off-by: Jan Glauber <jglauber@cavium.com>
Reviewed-by: Robert Richter <rrichter@cavium.com>
Cc: stable <stable@vger.kernel.org> # 4.14
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a488aaec6 upstream.
There are two IV-related issues:
(1) crypto API does not guarantee to provide an IV buffer that is DMAable,
thus it's incorrect to DMA map it
(2) for in-place decryption, since ciphertext is overwritten with
plaintext, updated IV (req->info) will contain the last block of plaintext
(instead of the last block of ciphertext)
While these two issues could be fixed separately, it's straightforward
to fix both at the same time - by using the {ablkcipher,aead}_edesc
extended descriptor to store the IV that will be fed to the crypto engine;
this allows for fixing (2) by saving req->src[last_block] in req->info
directly, i.e. without allocating yet another temporary buffer.
A side effect of the fix is that it's no longer possible to have the IV
contiguous with req->src or req->dst.
Code checking for this case is removed.
Cc: <stable@vger.kernel.org> # 4.14+
Fixes: a68a193805 ("crypto: caam/qi - properly set IV after {en,de}crypt")
Link: http://lkml.kernel.org/r/20170113084620.GF22022@gondor.apana.org.au
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 115957bb3e upstream.
There are two IV-related issues:
(1) crypto API does not guarantee to provide an IV buffer that is DMAable,
thus it's incorrect to DMA map it
(2) for in-place decryption, since ciphertext is overwritten with
plaintext, updated req->info will contain the last block of plaintext
(instead of the last block of ciphertext)
While these two issues could be fixed separately, it's straightforward
to fix both at the same time - by allocating extra space in the
ablkcipher_edesc for the IV that will be fed to the crypto engine;
this allows for fixing (2) by saving req->src[last_block] in req->info
directly, i.e. without allocating another temporary buffer.
A side effect of the fix is that it's no longer possible to have the IV
and req->src contiguous. Code checking for this case is removed.
Cc: <stable@vger.kernel.org> # 4.13+
Fixes: 854b06f768 ("crypto: caam - properly set IV after {en,de}crypt")
Link: http://lkml.kernel.org/r/20170113084620.GF22022@gondor.apana.org.au
Reported-by: Gilad Ben-Yossef <gilad@benyossef.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 281a58c832 upstream.
The product signature and HW revision registers have different offsets on
the older HW revisions.
This fixes the problem of the driver failing sanity check on silicon
despite working on the FPGA emulation systems.
Fixes: 27b3b22dd9 ("crypto: ccree - add support for older HW revs")
Cc: stable@vger.kernel.org
Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a7e625ce5 upstream.
Commit 9b96fbacda ("serial: PL011: clear pending interrupts")
clears the RX and receive timeout interrupts on pl011 startup, to
avoid a screaming-interrupt scenario that can occur when the
firmware or bootloader leaves these interrupts asserted.
This has been noted as an issue when running Linux on qemu [1].
Unfortunately, the above fix seems to lead to potential
misbehaviour if the RX FIFO interrupt is asserted _non_ spuriously
on driver startup, if the RX FIFO is also already full to the
trigger level.
Clearing the RX FIFO interrupt does not change the FIFO fill level.
In this scenario, because the interrupt is now clear and because
the FIFO is already full to the trigger level, no new assertion of
the RX FIFO interrupt can occur unless the FIFO is drained back
below the trigger level. This never occurs because the pl011
driver is waiting for an RX FIFO interrupt to tell it that there is
something to read, and does not read the FIFO at all until that
interrupt occurs.
Thus, simply clearing "spurious" interrupts on startup may be
misguided, since there is no way to be sure that the interrupts are
truly spurious, and things can go wrong if they are not.
This patch instead clears the interrupt condition by draining the
RX FIFO during UART startup, after clearing any potentially
spurious interrupt. This should ensure that an interrupt will
definitely be asserted if the RX FIFO subsequently becomes
sufficiently full.
The drain is done at the point of enabling interrupts only. This
means that it will occur any time the UART is newly opened through
the tty layer. It will not apply to polled-mode use of the UART by
kgdboc: since that scenario cannot use interrupts by design, this
should not matter. kgdboc will interact badly with "normal" use of
the UART in any case: this patch makes no attempt to paper over
such issues.
This patch does not attempt to address the case where the RX FIFO
fills faster than it can be drained: that is a pathological
hardware design problem that is beyond the scope of the driver to
work around. As a failsafe, the number of poll iterations for
draining the FIFO is limited to twice the FIFO size. This will
ensure that the kernel at least boots even if it is impossible to
drain the FIFO for some reason.
[1] [Qemu-devel] [Qemu-arm] [PATCH] pl011: do not put into fifo
before enabled the interruption
https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg06446.html
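A sketch of the bounded drain, with local EXAMPLE_ register constants following the PL011 register map (DR at 0x00, FR at 0x18, RXFE bit 0x10) and an assumed FIFO depth; the real driver uses its own accessors and the per-port FIFO size.

    #include <linux/io.h>

    #define EXAMPLE_UART01x_DR      0x00    /* data register */
    #define EXAMPLE_UART01x_FR      0x18    /* flag register */
    #define EXAMPLE_UART01x_FR_RXFE 0x10    /* RX FIFO empty */
    #define EXAMPLE_FIFO_DEPTH      32      /* assumed depth, illustrative */

    static void example_drain_rx_fifo(void __iomem *base)
    {
            unsigned int limit = 2 * EXAMPLE_FIFO_DEPTH;

            /* Read and discard until the FIFO reports empty, but never loop
             * more than twice the FIFO depth so boot cannot get stuck. */
            while (limit-- &&
                   !(readl(base + EXAMPLE_UART01x_FR) & EXAMPLE_UART01x_FR_RXFE))
                    readl(base + EXAMPLE_UART01x_DR);
    }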
Reported-by: Wei Xu <xuwei5@hisilicon.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Fixes: 9b96fbacda ("serial: PL011: clear pending interrupts")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Cc: stable <stable@vger.kernel.org>
Tested-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f59acbc5e0 upstream.
In the 4.9 kernel, the sysfs files for Hyper-V VMBus changed names but
the documentation files were not updated. The current sysfs file
names are /sys/bus/vmbus/devices/<UUID>/...
See commit 9a56e5d6a0ba ("Drivers: hv: make VMBus bus ids persistent")
and commit f6b2db084b ("vmbus: make sysfs names consistent with PCI")
Reported-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13dc04d0e5 upstream.
I noticed that unused UARTs won't always idle properly unless at least
a one-byte tx transfer is done first.
After some debugging I narrowed down the problem to the scr register
dma configuration bits that need to be set before softreset for the
clocks to idle. Unless we do this, the module clkctrl idlest bits
may be set to 1 instead of 3 meaning the clock will never idle and
is blocking deeper idle states for the whole domain.
This might be related to the configuration done by the bootloader
or kexec booting where certain configurations cause the 8250 or
the clkctrl clock to jam in a way where setting the scr bits and a
reset are needed to clear it. I've tried diffing the 8250
registers for the various modes, but did not see anything specific.
So far I've only seen this on omap4 but I'm suspecting this might
also happen on the other clkctrl using SoCs considering they
already have a quirk enabled for UART_ERRATA_CLOCK_DISABLE.
Let's fix the issue by configuring scr before reset for basic dma
even if we don't use it. The scr register will be reset when we do
the softreset a few lines later, and we restore scr on resume. We should
do this for all the SoCs with the UART_ERRATA_CLOCK_DISABLE quirk flag
set, since those SoCs all use clkctrl similar to omap4.
Looks like both OMAP_UART_SCR_DMAMODE_1 | OMAP_UART_SCR_DMAMODE_CTL
bits are needed for the clkctrl to idle after a softreset.
And we need to add omap4 to also use the UART_ERRATA_CLOCK_DISABLE
for the related workaround to be enabled. This same compatible
value will also be used for omap5.
Fixes: cdb929e445 ("serial: 8250_omap: workaround errata around idling UART after using DMA")
Cc: Keerthy <j-keerthy@ti.com>
Cc: Matthijs van Duin <matthijsvanduin@gmail.com>
Cc: Sekhar Nori <nsekhar@ti.com>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa2f80e752 upstream.
The best granularity of residue that the DMA engine can report is in BURST
units, so the serial driver must use MAXBURST = 1 and DMA_SLAVE_BUSWIDTH_1_BYTE
if it relies on the exact number of bytes transferred by the DMA engine.
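A sketch of the slave configuration the commit describes, shown for the RX direction only (the real driver configures both directions with its own FIFO addresses): a one-byte bus width and a burst of one make the engine's residue granularity a single byte.

    #include <linux/dmaengine.h>
    #include <linux/types.h>

    static int example_config_rx_dma(struct dma_chan *chan, dma_addr_t rx_fifo)
    {
            struct dma_slave_config cfg = {
                    .direction      = DMA_DEV_TO_MEM,
                    .src_addr       = rx_fifo,
                    .src_addr_width = DMA_SLAVE_BUSWIDTH_1_BYTE,
                    .src_maxburst   = 1,    /* residue accurate to one byte */
            };

            return dmaengine_slave_config(chan, &cfg);
    }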
Fixes: 62c37eedb7 ("serial: samsung: add dma reqest/release functions")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9594b5be7e upstream.
I was puzzled while looking at /proc/interrupts: random things showed
up between reboots. This happened more often than I realised at first. The
"correct" output should be:
|38: 11861 atmel-aic5 2 Level ttyS0
but I saw sometimes
|38: 6426 atmel-aic5 2 Level tty1
and accounted it wrongly as correct. This is use after free and the
former example randomly got the "old" pointer which pointed to the same
content. With SLAB_FREELIST_RANDOM and HARDENED I even got
|38: 7067 atmel-aic5 2 Level E=Started User Manager for UID 0
or other nonsense.
As it turns out, the tty pointer that is accessed in atmel_startup() is
freed before atmel_shutdown(). It seems to happen quite often that the
tty for ttyS0 is allocated and freed while ->shutdown is not invoked. I
don't do anything special - just a systemd boot :)
Use dev_name(&pdev->dev) as the IRQ name for request_irq(). This exists
as long as the driver is loaded so no use-after-free here.
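A sketch of the pattern: request the IRQ with dev_name(&pdev->dev), which lives as long as the platform device is bound, instead of a tty name that may already have been freed by the time the line shows up in /proc/interrupts.

    #include <linux/interrupt.h>
    #include <linux/platform_device.h>

    static int example_request_uart_irq(struct platform_device *pdev,
                                        unsigned int irq,
                                        irq_handler_t handler, void *port)
    {
            return request_irq(irq, handler, IRQF_SHARED,
                               dev_name(&pdev->dev), port);
    }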
Cc: stable@vger.kernel.org
Fixes: 761ed4a945 ("tty: serial_core: convert uart_close to use tty_port_close")
Acked-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 003bc1dee2 upstream.
This patch fixes an issue that this driver cannot call phy_init()
if a gadget driver is already loaded because usb_add_gadget_udc()
might call renesas_usb3_start() via .udc_start.
This patch also revises the typo (s/an optional/optional/).
Fixes: 279d4bc640 ("usb: gadget: udc: renesas_usb3: add support for generic phy")
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d998844016 upstream.
This patch fixes an issue that this driver causes panic if a gadget
driver is already loaded because usb_add_gadget_udc() might call
renesas_usb3_start() via .udc_start, and then pm_runtime_get_sync()
in renesas_usb3_start() doesn't work correctly.
Note that the usb3_to_dev() macro should not be called at this timing
because the macro uses the gadget structure.
Fixes: cf06df3fae ("usb: gadget: udc: renesas_usb3: move pm_runtime_{en,dis}able()")
Cc: <stable@vger.kernel.org> # v4.15+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4a014a7339 upstream.
When printer_write() calls usb_ep_queue(), a udc driver (e.g.
renesas_usbhs driver) may call usb_gadget_giveback_request() in
the udc .queue ops immediately. Then, printer_write() calls
list_add(&req->list, &dev->tx_reqs_active) wrongly. After that,
if we do unbind the printer driver, WARN_ON() happens in
printer_func_unbind() because the list entry is not removed.
So, this patch moves list_add(&req->list, &dev->tx_reqs_active)
calling before usb_ep_queue().
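A sketch of the reordering, with the list head passed in instead of the driver's dev->tx_reqs_active: add the request to the active list first, because the UDC may give the request back synchronously from inside usb_ep_queue().

    #include <linux/gfp.h>
    #include <linux/list.h>
    #include <linux/usb/gadget.h>

    static int example_queue_tx(struct usb_ep *ep, struct usb_request *req,
                                struct list_head *tx_reqs_active)
    {
            int ret;

            list_add(&req->list, tx_reqs_active);   /* before queueing */
            ret = usb_ep_queue(ep, req, GFP_ATOMIC);
            if (ret)
                    list_del(&req->list);           /* undo on failure */
            return ret;
    }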
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05826ff135 upstream.
The USB Type-C PHY in the Intel WhiskeyCove PMIC has a built-in
USB Type-C state machine which we were relying on to
configure the CC lines correctly. This patch removes that
dependency and configures the CC line according to commands
from the port manager (tcpm.c) in wcove_set_cc().
This fixes an issue where USB devices attached to the USB
Type-C port do not get enumerated. When acting as
source/host, the HW FSM sometimes fails to configure the PHY
correctly.
Fixes: 3c4fb9f169 ("usb: typec: wcove: start using tcpm for USB PD support")
Cc: stable@vger.kernel.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b4555e776 upstream.
The driver currently crashes due to a NULL pointer dereference
while updating the PHY tune register if the nvmem cell is NULL.
Since the fused value for the Tune1/2 register is optional,
we'd rather bail out.
Fixes: ca04d9d3e1 ("phy: qcom-qusb2: New driver for QUSB2 PHY on Qcom chips")
Reviewed-by: Vivek Gautam <vivek.gautam@codeaurora.org>
Reviewed-by: Evan Green <evgreen@chromium.org>
Cc: stable <stable@vger.kernel.org> # 4.14+
Signed-off-by: Manu Gautam <mgautam@codeaurora.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c4e97ddfe upstream.
The ALWAYS_SYNC flag is currently honored by the usb-storage driver but not UAS
and is required to work around devices that become unstable upon being
queried for cache. This code is taken straight from:
drivers/usb/storage/scsiglue.c:284
Signed-off-by: Alexander Kappner <agk@godking.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0d6ec8809 upstream.
pdev_nr and rhport can be controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/usb/usbip/vhci_sysfs.c:238 detach_store() warn: potential spectre issue 'vhcis'
drivers/usb/usbip/vhci_sysfs.c:328 attach_store() warn: potential spectre issue 'vhcis'
drivers/usb/usbip/vhci_sysfs.c:338 attach_store() warn: potential spectre issue 'vhci->vhci_hcd_ss->vdev'
drivers/usb/usbip/vhci_sysfs.c:340 attach_store() warn: potential spectre issue 'vhci->vhci_hcd_hs->vdev'
Fix this by sanitizing pdev_nr and rhport before using them to index
vhcis and vhci->vhci_hcd_ss->vdev respectively.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
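A sketch of the Spectre-v1 sanitization pattern with a hypothetical example_port array and bound: bounds-check the user-controlled index, then clamp it with array_index_nospec() before using it.

    #include <linux/nospec.h>

    #define EXAMPLE_NR_PORTS 8              /* hypothetical array bound */

    struct example_port {
            int busy;
    };

    static struct example_port *example_lookup(struct example_port *ports,
                                               unsigned long idx)
    {
            if (idx >= EXAMPLE_NR_PORTS)
                    return NULL;
            idx = array_index_nospec(idx, EXAMPLE_NR_PORTS);
            return &ports[idx];
    }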
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dbafc28955 upstream.
It's amazing that this driver ever worked, but now that x86 doesn't
allow USB data to be sent off of the stack, it really does not work at
all. Fix this up by properly allocating the data for the small
"commands" that get sent to the device off of the stack.
We do this for one command by having a whole urb just for ack messages,
as they can be submitted in interrupt context, so we can not use
usb_bulk_msg(). But the poweron command can sleep (and does), so use
usb_bulk_msg() for that transfer.
Reported-by: Carlos Manuel Santos <cmmpsantos@gmail.com>
Cc: Samuel Ortiz <sameo@linux.intel.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45ad559a29 upstream.
Syzbot reported yet another warning with Ion:
WARNING: CPU: 0 PID: 1467 at drivers/staging/android/ion/ion.c:122
ion_buffer_destroy+0xd4/0x190 drivers/staging/android/ion/ion.c:122
Kernel panic - not syncing: panic_on_warn set ...
This is catching that a buffer was freed with an existing kernel mapping
still present. This can easily be triggered from userspace by calling
DMA_BUF_SYNC_START without calling DMA_BUF_SYNC_END. Switch to a single
pr_warn_once to indicate the error without being disruptive.
Reported-by: syzbot+cd8bcd40cb049efa2770@syzkaller.appspotmail.com
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c9fa24ca7 upstream.
The functions that were used in the emulation of fxrstor, fxsave, sgdt and
sidt were originally meant for task switching, and as such they did not
check privilege levels. This is very bad when the same functions are used
in the emulation of unprivileged instructions. This is CVE-2018-10853.
The obvious fix is to add a new argument to ops->read_std and ops->write_std,
which decides whether the access is a "system" access or should use the
processor's CPL.
Fixes: 129a72a0d3 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce14e868a5 upstream.
In the next patch the emulator's .read_std and .write_std callbacks will
grow another argument, which is not needed in kvm_read_guest_virt and
kvm_write_guest_virt_system's callers. Since we have to make separate
functions, let's give the currently existing names a nicer interface, too.
Fixes: 129a72a0d3 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 727ba748e1 upstream.
VMX instructions executed inside an L1 VM will always trigger a VM exit
even when executed with cpl 3. This means we must perform the
privilege check in software.
Fixes: 70f3aac964ae ("kvm: nVMX: Remove superfluous VMX instruction fault checks")
Cc: stable@vger.kernel.org
Signed-off-by: Felix Wilhelm <fwilhelm@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 766d3571d8 upstream.
KVM_X86_DISABLE_EXITS_HTL really refers to exit on halt.
Obviously a typo: should be named KVM_X86_DISABLE_EXITS_HLT.
Fixes: caa057a2ca ("KVM: X86: Provide a capability to disable HLT intercepts")
Cc: stable@vger.kernel.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79367a6574 upstream.
Wrap the common invocation of ctxt->ops->read_std and ctxt->ops->write_std, so
as to have a smaller patch when the functions grow another argument.
Fixes: 129a72a0d3 ("KVM: x86: Introduce segmented_write_std", 2017-01-12)
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a780a3ea62 upstream.
MSB of CR3 is a reserved bit if the PCIDE bit is not set in CR4.
It should be checked when the PCIDE bit is not set; however, commit
'd1cd3ce900441 ("KVM: MMU: check guest CR3 reserved bits based on
its physical address width")' removes the bit 63 checking
unconditionally. This patch fixes it by checking bit 63 of CR3
when PCIDE bit is not set in CR4.
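A worked sketch of the rule, with locally defined constants (CR4.PCIDE is bit 17, the CR3 no-flush hint is bit 63): bit 63 of CR3 is only legal when PCIDE is set.

    #include <linux/types.h>

    #define EXAMPLE_CR4_PCIDE        (1ULL << 17)
    #define EXAMPLE_CR3_PCID_NOFLUSH (1ULL << 63)

    /* Returns true when CR3 passes the bit-63 reserved-bit check. */
    static bool example_cr3_bit63_valid(u64 cr3, u64 cr4)
    {
            if (!(cr4 & EXAMPLE_CR4_PCIDE) && (cr3 & EXAMPLE_CR3_PCID_NOFLUSH))
                    return false;   /* reserved bit set without PCIDE */
            return true;
    }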
Fixes: d1cd3ce900 (KVM: MMU: check guest CR3 reserved bits based on its physical address width)
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Liran Alon <liran.alon@oracle.com>
Cc: stable@vger.kernel.org
Reviewed-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 327ea4adcf upstream.
Avoid that complaints similar to the following appear in the kernel log
if the number of zones is sufficiently large:
fio: page allocation failure: order:9, mode:0x140c0c0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null)
Call Trace:
dump_stack+0x63/0x88
warn_alloc+0xf5/0x190
__alloc_pages_slowpath+0x8f0/0xb0d
__alloc_pages_nodemask+0x242/0x260
alloc_pages_current+0x6a/0xb0
kmalloc_order+0x18/0x50
kmalloc_order_trace+0x26/0xb0
__kmalloc+0x20e/0x220
blkdev_report_zones_ioctl+0xa5/0x1a0
blkdev_ioctl+0x1ba/0x930
block_ioctl+0x41/0x50
do_vfs_ioctl+0xaa/0x610
SyS_ioctl+0x79/0x90
do_syscall_64+0x79/0x1b0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Fixes: 3ed05a987e ("blk-zoned: implement ioctls")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Shaun Tancheff <shaun.tancheff@seagate.com>
Cc: Damien Le Moal <damien.lemoal@hgst.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c826fed67 upstream.
- Tx request and data are copied to the HW queue in 64B descriptors; check
  for end of queue and adjust the current position to start from the
  beginning before passing the additional request info.
- Key context copy should check the key length only.
- A few reverse christmas tree corrections.
Signed-off-by: Atul Gupta <atul.gupta@chelsio.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3635da2a3 upstream.
Before the guest finishes the device initialization, the device can be
removed anytime by the host, and after that the host won't respond to
the guest's request, so the guest should be prepared to handle this
case.
Add a polling mechanism to detect device presence.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: edited commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e783bb00ad ]
commit 0bbbf0e7d0 ("ipmr, ip6mr: Unite creation of new mr_table")
refactored ipmr_new_table, so that it now returns NULL when
mr_table_alloc fails. Unfortunately, all callers of ipmr_new_table
expect an ERR_PTR.
This can result in NULL deref, for example when ipmr_rules_exit calls
ipmr_free_table with NULL net->ipv4.mrt in the
!CONFIG_IP_MROUTE_MULTIPLE_TABLES version.
This patch makes mr_table_alloc return errors, and changes
ip6mr_new_table and its callers to return/expect error pointers as
well. It also removes the version of mr_table_alloc defined under
!CONFIG_IP_MROUTE_COMMON, since it is never used.
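A small sketch of the convention the fix restores, using a hypothetical example_table: the allocator returns ERR_PTR() on failure and the caller tests with IS_ERR(), so an ERR_PTR-expecting caller never dereferences NULL.

    #include <linux/err.h>
    #include <linux/slab.h>

    struct example_table {
            int id;
    };

    static struct example_table *example_table_alloc(gfp_t gfp)
    {
            struct example_table *t = kzalloc(sizeof(*t), gfp);

            return t ? t : ERR_PTR(-ENOMEM);
    }

    static int example_caller(void)
    {
            struct example_table *t = example_table_alloc(GFP_KERNEL);

            if (IS_ERR(t))
                    return PTR_ERR(t);      /* matches what callers expect */
            /* ... use t ... */
            kfree(t);
            return 0;
    }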
Fixes: 0bbbf0e7d0 ("ipmr, ip6mr: Unite creation of new mr_table")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5040cc990c ]
In the Broadcom Cygnus SoC, the brcm tag needs to be inserted
in between the mac address and the ether type (should use
'DSA_PROTO_TAG_BRCM') for the packets sent to the internal
b53 switch.
Since the Cygnus was added with the BCM58XX device id and the
BCM58XX uses 'DSA_PROTO_TAG_BRCM_PREPEND', the data path is
broken, due to the incorrect brcm tag location.
Add a new b53 device id (BCM583XX) for Cygnus family to fix the
issue. Add the new device id to the BCM58XX family as Cygnus
is similar to the BCM58XX in most other functionalities.
Fixes: 1160603960 ("net: dsa: b53: Support prepended Broadcom tags")
Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
Reported-by: Clément Péron <peron.clem@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Clément Péron <peron.clem@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f17becfbe ]
Use the right device to determine if redirect should be sent especially
when using vrf. The same applies when sending the redirect.
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 25ea66544b ]
This code was introduced in 2011 around the same time that we made
netdev_features_t a u64 type. These days a u32 is not big enough to
hold all the potential features.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1d88ba1ebb ]
syzbot reported an rcu_sched self-detected stall on a CPU, which is caused
by too small a value set for rto_min with the SCTP_RTOINFO sockopt. With this
value, hb_timer will get stuck there, as in its timer handler it starts
this timer again with this value, then goes to the timer handler again.
This problem is there since very beginning, and thanks to Eric for the
reproducer shared from a syzbot mail.
This patch fixes it by not allowing sctp_transport_timeout to return a
smaller value than HZ/5 for hb_timer, which is based on TCP's min rto.
Note that it doesn't fix this issue by limiting rto_min, as some users
are still using small rto and no proper value was found for it yet.
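A one-line sketch of the clamp: whatever timeout sctp_transport_timeout computes for the heartbeat timer is raised to at least HZ/5, mirroring TCP's minimum RTO.

    #include <linux/jiffies.h>
    #include <linux/kernel.h>

    static unsigned long example_hb_timeout(unsigned long transport_timeout)
    {
            /* Never arm the heartbeat timer with less than HZ/5 jiffies. */
            return max_t(unsigned long, transport_timeout, HZ / 5);
    }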
Reported-by: syzbot+3dcd59a1f907245f891f@syzkaller.appspotmail.com
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 75d4e704fa ]
Per discussion with David at netconf 2018, let's clarify
DaveM's position on handling stable backports in the netdev-FAQ.
This is important for people relying on upstream -stable
releases.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d609342cc ]
Commit d02ba2a611 ("l2tp: fix race in pppol2tp_release with session
object destroy") tried to fix a race condition where a PPPoL2TP socket
would disappear while the L2TP session was still using it. However, it
missed the root issue which is that an L2TP session may accept to be
reconnected if its associated socket has entered the release process.
The tentative fix makes the session hold the socket it is connected to.
That saves the kernel from crashing, but introduces refcount leakage,
preventing the socket from completing the release process. Once stalled,
everything the socket depends on can't be released anymore, including
the L2TP session and the l2tp_ppp module.
The root issue is that, when releasing a connected PPPoL2TP socket, the
session's ->sk pointer (RCU-protected) is reset to NULL and we have to
wait for a grace period before destroying the socket. The socket drops
the session in its ->sk_destruct callback function, so the session
will exist until the last reference on the socket is dropped.
Therefore, there is a time frame where pppol2tp_connect() may accept
reconnecting a session, as it only checks ->sk to figure out if the
session is connected. This time frame is shortened by the fact that
pppol2tp_release() calls l2tp_session_delete(), making the session
unreachable before resetting ->sk. However, pppol2tp_connect() may
grab the session before it gets unhashed by l2tp_session_delete(), but
it may test ->sk after the latter got reset. The race is not so hard to
trigger and syzbot found a pretty reliable reproducer:
https://syzkaller.appspot.com/bug?id=418578d2a4389074524e04d641eacb091961b2cf
Before d02ba2a611, another race could let pppol2tp_release()
overwrite the ->__sk pointer of an L2TP session, thus tricking
pppol2tp_put_sk() into calling sock_put() on a socket that is different
than the one for which pppol2tp_release() was originally called. To get
there, we had to trigger the race described above, therefore having one
PPPoL2TP socket being released, while the session it is connected to is
reconnecting to a different PPPoL2TP socket. When releasing this new
socket fast enough, pppol2tp_release() overwrites the session's
->__sk pointer with the address of the new socket, before the first
pppol2tp_put_sk() call gets scheduled. Then the pppol2tp_put_sk() call
invoked by the original socket will sock_put() the new socket,
potentially dropping its last reference. When the second
pppol2tp_put_sk() finally runs, its socket has already been freed.
With d02ba2a611, the session takes a reference on both sockets.
Furthermore, the session's ->sk pointer is reset in the
pppol2tp_session_close() callback function rather than in
pppol2tp_release(). Therefore, ->__sk can't be overwritten and
pppol2tp_put_sk() is called only once (l2tp_session_delete() will only
run pppol2tp_session_close() once, to protect the session against
concurrent deletion requests). Now pppol2tp_put_sk() will properly
sock_put() the original socket, but the new socket will remain, as
l2tp_session_delete() prevented the release process from completing.
Here, we don't depend on the ->__sk race to trigger the bug. Getting
into the pppol2tp_connect() race is enough to leak the reference, no
matter when the new socket is released.
So it all boils down to pppol2tp_connect() failing to realise that the
session has already been connected. This patch drops the unneeded extra
reference counting (mostly reverting d02ba2a611) and checks that
neither ->sk nor ->__sk is set before allowing a session to be
connected.
Fixes: d02ba2a611 ("l2tp: fix race in pppol2tp_release with session object destroy")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fa1be7e01e ]
Some of the code paths calculating flow hash for IPv6 use flowlabel member
of struct flowi6 which, despite its name, encodes both flow label and
traffic class. If traffic class changes within a TCP connection (as e.g.
ssh does), ECMP route can switch between path. It's also inconsistent with
other code paths where ip6_flowlabel() (returning only flow label) is used
to feed the key.
Use only flow label everywhere, including one place where hash key is set
using ip6_flowinfo().
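A sketch of the distinction: ip6_flowlabel() masks the header word down to the 20-bit flow label, so feeding it (rather than flowinfo, which also carries the traffic class) into the hash keeps the ECMP path stable when DSCP/ECN changes mid-connection.

    #include <net/ipv6.h>

    static __be32 example_ecmp_hash_key(const struct ipv6hdr *hdr)
    {
            /* Flow label only; traffic class bits are masked off. */
            return ip6_flowlabel(hdr);
    }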
Fixes: 51ebd31815 ("ipv6: add support of equal cost multipath (ECMP)")
Fixes: f70ea018da ("net: Add functions to get skb->hash based on flow structures")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 848235edb5 ]
Currently, raw6_sk(sk)->ip6mr_table is set unconditionally during
ip6_mroute_setsockopt(MRT6_TABLE). A subsequent attempt at the same
setsockopt will fail with -ENOENT, since we haven't actually created
that table.
A similar fix for ipv4 was included in commit 5e1859fbcc ("ipv4: ipmr:
various fixes and cleanups").
Fixes: d1db275dd3 ("ipv6: ip6mr: support multiple tables")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd612f18a4 ]
Nearby code that also tests port suggests that the P0 constant should be
used when port is zero.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e,e1;
@@
* e ? e1 : e1
// </smpl>
Fixes: 6c3218c6f7 ("bnx2x: Adjust ETS to 578xx")
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a79fd3908 upstream.
Some drivers, such as vxlan and wireguard, use the skb's dst in order to
determine things like PMTU. They therefore lose functionality when flow
offloading is enabled. So, we ensure the skb has it before xmit'ing it
in the offloading path.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This repository contains the Linux kernel used on the Raspberry Pi. If you believe that the issue you are seeing is kernel-related, this is the right place. If not, we have other repositories for the GPU firmware at [github.com/raspberrypi/firmware](https://github.com/raspberrypi/firmware) and for Raspberry Pi userland applications at [github.com/raspberrypi/userland](https://github.com/raspberrypi/userland). If you have problems with the Raspbian distribution packages, report them at [github.com/RPi-Distro/repo](https://github.com/RPi-Distro/repo). If you simply have a question, then [the Raspberry Pi forums](https://www.raspberrypi.org/forums) are the best place to ask it.
**Describe the bug**
Add a clear and concise description of what you think the bug is.
**To reproduce**
List the steps required to reproduce the issue.
**Expected behaviour**
Add a clear and concise description of what you expected to happen.
**Actual behaviour**
Add a clear and concise description of what actually happened.
**System**
Copy and paste the results of the `raspinfo` command into this section. Alternatively, copy and paste a pastebin link, or add answers to the following questions:
* Which model of Raspberry Pi? e.g. Pi3B+, PiZeroW
* Which OS and version (`cat /etc/rpi-issue`)?
* Which firmware version (`vcgencmd version`)?
* Which kernel version (`uname -a`)?
**Logs**
If applicable, add the relevant output from `dmesg` or similar.
1: The multi-queue block layer is instantiated with a hardware dispatch
queue for each CPU node in the system.
use_lightnvm=[0/1]: Default: 0
Register device with LightNVM. Requires blk-mq and CONFIG_NVM to be enabled.
no_sched=[0/1]: Default: 0
0: nullb* use default blk-mq io scheduler.
1: nullb* doesn't use io scheduler.
blocking=[0/1]: Default: 0
0: Register as a non-blocking blk-mq driver device.
1: Register as a blocking blk-mq driver device, null_blk will set
the BLK_MQ_F_BLOCKING flag, indicating that it sometimes/always
needs to block in its ->queue_rq() function.
shared_tags=[0/1]: Default: 0
0: Tag set is not shared.
1: Tag set shared between devices for blk-mq. Only makes sense with
nr_devices > 1, otherwise there's no tag set to share.
zoned=[0/1]: Default: 0
0: Block device is exposed as a random-access block device.
1: Block device is exposed as a host-managed zoned block device.
zone_size=[MB]: Default: 256
Per zone size when exposed as a zoned block device. Must be a power of two.