linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-27 04:22:58 +00:00

Author	SHA1	Message	Date
Phil Elwell	d5c0e92fd7	overlays: i2c1-bcm2708: Don't overwrite i2c1 pins node It is bad practise to overwrite an entire node in an overlay. Instead, target the node and overwrite any properties that need changing. See: https://github.com/raspberrypi/linux/pull/2118 Suggested-by: soodvarun78 <soodvarun78@gmail.com> Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:50 +01:00
Conn	6344586e5a	config: enhance DualShock3 controller support Enable rumble support in Sony HID & HID battery strength.	2017-07-21 15:30:50 +01:00
Phil Elwell	0c8e419521	bcm2835-mmc: Prevent DMA race condition The end of a read operation is triggered by the completion of the DMA transfer, but writes are complete when the data IRQ is raised. The bcm2835-mmc driver contains a race between the handling of the DMA completion interrupt and the submission of the next request. Fix the race by deferring the completion of the request until the DMA transfer finishes. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:49 +01:00
Phil Elwell	c336fcc30a	bcm2835-mmc: Fix DMA usage The previous change ("bcm2835-mmc: Only claim one DMA channel") used an incorrect variable, the effect of which was to prevent DMA from being used at all. Fix that bug by using the right variable. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:49 +01:00
Matthias Reichl	671b5e9fdf	config: enable generic S/PDIF codec drivers (#2104) These drivers can be used as dummy ADC/DAC drivers for attaching general codecs that don't need to be configured. This option will build 2 additional drivers, spdif_receiver and spdif_transmitter. Since these drivers have DT bindings they are handy for quick testing of I2S peripherals with simple-audio-card. eg: fragment@0 { target-path = "/"; __overlay__ { dummy_receiver: spdif-receiver { #address-cells = <0>; #size-cells = <0>; #sound-dai-cells = <0>; compatible = "linux,spdif-dir"; status = "okay"; }; }; }; Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:30:48 +01:00
Matthijs Kooijman	e9f345c16f	overlays: Add gpio-shutdown overlay (#2103) This overlay facilitates the addition of a powerbutton by converting GPIO edges into KEY_POWER keypresses, which can be handled by systemd-logind to shut down the system. Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>	2017-07-21 15:30:47 +01:00
Allo	772dd939bf	PianoPlus: Dual Mono & Dual Stereo features added (#2069)	2017-07-21 15:30:47 +01:00
Steve Conner	6bfd2cf1dd	New i2c-rtc-gpio device overlay (#2092) Created new i2c-rtc-gpio device overlay by combining i2c-rtc and i2c-gpio. Tested with PCF2127 on CM3.	2017-07-21 15:30:46 +01:00
Eric Anholt	4ebc00356a	bcm2708: Drop CMA alignment from FKMS mode as well. I dropped it from KMS mode in `d88274d88e` and should have done both of them at that time. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:30:46 +01:00
popcornmix	85210fe346	bcm2835-cpufreq: Change licence to GPLv2 Signed-off-by: Eben Upton <eben.upton@broadcom.com> Signed-off-by: Dom Cobley <dom@raspberrypi.com>	2017-07-21 15:30:45 +01:00
Andrei Gherzan	d62acd5f62	dma-bcm2708: Fix module compilation of CONFIG_DMA_BCM2708 bcm2708-dmaengine.c defines functions like bcm_dma_start which are defined as well in dma-bcm2708.h as inline versions when CONFIG_DMA_BCM2708 is not defined. This works fine when CONFIG_DMA_BCM2708 is built in, but when it is selected as module build fails with redefinition errors because in the build system when CONFIG_DMA_BCM2708 is selected as module, the macro becomes CONFIG_DMA_BCM2708_MODULE. This patch makes the header use CONFIG_DMA_BCM2708_MODULE too when available. Fixes https://github.com/raspberrypi/linux/issues/2056 Signed-off-by: Andrei Gherzan <andrei@gherzan.com>	2017-07-21 15:30:45 +01:00
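The built-in versus module distinction described in the commit above can be reproduced outside the kernel. The following is a minimal sketch, not the contents of dma-bcm2708.h: it simulates a `=m` build by defining only the `_MODULE` variant of the symbol, and `bcm_dma_stub_used()` is a hypothetical helper that reports which branch of the header was compiled.

```c
/* Simulate "CONFIG_DMA_BCM2708=m": for a tristate option built as a module,
 * Kconfig defines CONFIG_DMA_BCM2708_MODULE and leaves CONFIG_DMA_BCM2708
 * undefined. */
#define CONFIG_DMA_BCM2708_MODULE 1

#if defined(CONFIG_DMA_BCM2708) || defined(CONFIG_DMA_BCM2708_MODULE)
/* Driver is available (built-in or module): a header would only declare the
 * real functions here, so nothing clashes with the driver's definitions. */
static inline int bcm_dma_stub_used(void) { return 0; } /* hypothetical helper */
#else
/* Driver is absent: a header would provide inline no-op stubs here. */
static inline int bcm_dma_stub_used(void) { return 1; } /* hypothetical helper */
#endif
```

Testing only `defined(CONFIG_DMA_BCM2708)` would select the stub branch in a module build, which is the redefinition scenario the commit describes. Inside the kernel tree, `IS_ENABLED(CONFIG_DMA_BCM2708)` expresses the same "built-in or module" test.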
sandeepal	1fff13b410	Allo Digione Driver (#2048) Driver for the Allo Digione soundcard	2017-07-21 15:30:44 +01:00
Stefan Tatschner	c594627138	Add device tree config for htu21 See: https://github.com/raspberrypi/linux/pull/2041 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:43 +01:00
Phil Elwell	5610f9908a	BCM270X_DT: Improve i2c-sensor and i2c-rtc overlay Use the "__dormant__" feature to permit multiple instances of each overlay, which is more useful now that changing the "reg" property also changes the node address. Although the overlay grows slightly, when applied only the requested node is included. Usage does not change, except that the "lm75addr" parameter of the i2c-sensor overlay has been deprecated in favour of the generic "addr" parameter. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:43 +01:00
Phil Elwell	2e40c9f569	config: Adding SENSOR_JC42 The jc42 module supports a number of I2C-based temperature sensor modules. [ DM_RAID0 config lost because now selected by DM_RAID ] See: https://github.com/raspberrypi/linux/issues/2046 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:42 +01:00
Anton Onishchenko	b69fc2c191	mpu6050 device tree overlay (#2031) Add overlay and config options for InvenSense MPU6050 6-axis motion detector.	2017-07-21 15:30:42 +01:00
popcornmix	e1118cfe93	config: Add CONFIG_IPV6_SIT_6RD	2017-07-21 15:30:41 +01:00
popcornmix	75e63659b6	config: Add CONFIG_IPV6_ROUTE_INFO	2017-07-21 15:30:41 +01:00
Liviu Dudau	422ff2a226	ASoC: TLV320AIC23: Unquote NULL from control name commit `a03faba972` upstream. Without this I am getting the following messages at boot on my Trimslice: tlv320aic23-codec 2-001a: Control not supported for path LLINEIN -> [NULL] -> Line Input tlv320aic23-codec 2-001a: ASoC: no dapm match for LLINEIN --> NULL --> Line Input tlv320aic23-codec 2-001a: ASoC: Failed to add route LLINEIN -> NULL -> Line Input tlv320aic23-codec 2-001a: Control not supported for path RLINEIN -> [NULL] -> Line Input tlv320aic23-codec 2-001a: ASoC: no dapm match for RLINEIN --> NULL --> Line Input tlv320aic23-codec 2-001a: ASoC: Failed to add route RLINEIN -> NULL -> Line Input tlv320aic23-codec 2-001a: Control not supported for path MICIN -> [NULL] -> Mic Input tlv320aic23-codec 2-001a: ASoC: no dapm match for MICIN --> NULL --> Mic Input tlv320aic23-codec 2-001a: ASoC: Failed to add route MICIN -> NULL -> Mic Input tegra-snd-trimslice sound: tlv320aic23-hifi <-> 70002800.i2s mapping ok Signed-off-by: Liviu Dudau <liviu@dudau.co.uk> Signed-off-by: Mark Brown <broonie@kernel.org>	2017-07-21 15:30:40 +01:00
chenzhiwo	6e87abb735	Add device tree overlay for GPIO connected rotary encoder. See Documentation/input/rotary-encoder.txt for more information.	2017-07-21 15:30:40 +01:00
Ahmet Inan	a876cd42ea	overlays: Add Goodix overlay Add support for I2C connected Goodix gt9271 multiple touch controller using GPIOs 4 and 17 (pins 7 and 11 on GPIO header) for interrupt and reset. Signed-off-by: Ahmet Inan <inan@distec.de>	2017-07-21 15:30:39 +01:00
Phil Elwell	2385daca80	irq_bcm2836: Send event when onlining sleeping cores In order to reduce power consumption and bus traffic, it is sensible for secondary cores to enter a low-power idle state when waiting to be started. The wfe instruction causes a core to wait until an event or interrupt arrives before continuing to the next instruction. The sev instruction sends a wakeup event to the other cores, so call it from bcm2836_smp_boot_secondary, the function that wakes up the waiting cores during booting. It is harmless to use this patch without the corresponding change adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are updated and this patch is not applied then the other cores will sleep forever. See: https://github.com/raspberrypi/linux/issues/1989 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:39 +01:00
Phil Elwell	fdf388a67c	ARM: dts: bcm283x: Reserve first page for firmware The Raspberry Pi startup stub files for multi-core BCM27XX processors make the secondary CPUs spin until the corresponding mailbox is written. These stubs are loaded at physical address 0x00000xxx (as seen by the ARMs), but this page will be reused by the kernel unless it is explicitly reserved, causing the waiting cores to execute random code. Use the /memreserve/ Device Tree directive to mark the first page as off-limits to the kernel. See: https://github.com/raspberrypi/linux/issues/1989 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:38 +01:00
Scott Ellis	30730c830e	BCM270X_DT: Add tmp102 to i2c sensor overlay Signed-off-by: Scott Ellis <scott@jumpnowtek.com>	2017-07-21 15:30:38 +01:00
Scott Ellis	e376ff0729	config: Enable TI TMP102 temp sensor module Signed-off-by: Scott Ellis <scott@jumpnowtek.com>	2017-07-21 15:30:37 +01:00
Phil Elwell	ccfbc8d3dd	config: Add CONFIG_BMP280 (and CONFIG_BMP280_I2C) Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:37 +01:00
Phil Elwell	8106cc44f5	BCM270X_DT: Add bme280 and bmp180 to i2c-sensor overlay Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:36 +01:00
Matt Flax	bc4ab40291	Audioinjector octo : Make the playback and capture symmetric This patch ensures that the sample rate and channel count of the audioinjector octo sound card are symmetric.	2017-07-21 15:30:36 +01:00
Matt Flax	152abddf6a	Audioinjector : make the octo and pi sound cards have different driver names This patch gives the audioinjector octo and pi soundcards different driver names. This allows both to be loaded without clashing.	2017-07-21 15:30:35 +01:00
Bilal Amarni	4f9f477455	rtl8192: switch to netdev->priv_destructor() When trying to build from the rpi-4.11.y branch, I'm getting the following error : drivers/net/wireless/realtek/rtl8192cu/os_dep/linux/ioctl_cfg80211.c:3464:10: error: 'struct net_device' has no member named 'destructor' It seems to occur since this upstream commit : `cf124db566` [...] netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Signed-off-by: Bilal Amarni <bilal.amarni@gmail.com>	2017-07-21 15:30:34 +01:00
Matthias Reichl	ac11953a3d	ASoC: bcm2835: Enforce full symmetry bcm2835's configuration registers can't be changed when a stream is running, which means asymmetric configurations aren't supported. Channel and rate symmetry are already enforced by constraints but samplebits had been missed. As hw_params doesn't check for symmetry constraints by itself and just returns success if a stream is running this led to situations where asymmetric configurations were seeming to succeed but of course didn't work because the hardware wasn't configured at all. Fix this by adding the missing samplebits symmetry constraint. Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:30:34 +01:00
Matthias Reichl	50be212ed1	ASoC: bcm2835: Support additional samplerates up to 384kHz Sample rates are only restricted by the capabilities of the clock driver, so use SNDRV_PCM_RATE_CONTINUOUS instead of SNDRV_PCM_RATE_8000_192000. Tests (eg with pcm5122) have shown that bcm2835 works fine in 384kHz/32bit stereo mode, so change the maximum allowed rate from 192kHz to 384kHz. Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:30:33 +01:00
Matthias Reichl	66106af9bf	ASoC: bcm2835: Support left/right justified and DSP modes DSP modes and left/right justified modes can be supported on bcm2835 by configuring the frame sync polarity and frame sync length registers and by adjusting the channel data position registers. Clock and frame sync polarity handling in hw_params has been refactored to make the interaction between logical rising/falling edge frame start and physical configuration (changed by normal/inverted polarity modes) clearer. Modes where the first active data bit is transmitted immediately after frame start (eg DSP mode B with slot 0 active) only work reliably if bcm2835 is configured as frame master. In frame slave mode channel swap (or shift, this isn't quite clear yet) can occur. Currently the driver only warns if an unstable configuration is detected but doesn't prevent using them. Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:30:33 +01:00
Matthias Reichl	afb1c713ca	ASoC: bcm2835: Add support for TDM modes bcm2835 supports arbitrary positioning of channel data within a frame and thus is capable of supporting TDM modes. Since the driver is limited to 2-channel operations only TDM setups with exactly 2 active slots are supported. Logical TDM slot numbering follows the usual convention: For I2S-like modes, with a 50% duty-cycle frame clock, slots 0, 2, ... are transmitted in the first half of a frame, slots 1, 3, ... are transmitted in the second half. For DSP modes slot numbering is ascending: 0, 1, 2, 3, ... Channel position calculation has been refactored to use TDM info and moved out of hw_params. set_tdm_slot, set_bclk_ratio and hw_params now check more strictly if the configuration is valid. Illegal configurations like odd number of slots in I2S mode, data lengths exceeding slot width or frame sizes larger than the hardware limit of 1024 are rejected. Also hw_params now properly checks for errors from clk_set_rate. Allowed PCM formats are already guarded by stream constraints, thus the formats check in hw_params has been removed and data_length is now retrieved via params_width(). Also standard functions like snd_soc_params_to_bclk are now being used instead of manual calculations to make the code more readable. Special care has been taken to ensure that set_bclk_ratio works as before. The bclk ratio is mapped to a 2-channel TDM config with a slot width of half the ratio. In order to support odd ratios, which can't be expressed via a TDM config, the ratio (frame length) is stored and used by hw_params. Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:30:32 +01:00
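The slot numbering convention and the validity checks described in the TDM commit above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the driver's code: `tdm_slot_offset()` and `MAX_FRAME_LEN` are hypothetical names, and only the rules quoted in the commit message (even slot count for I2S-like modes, frame length limit of 1024, ascending order for DSP modes) are modelled.

```c
#include <stdbool.h>

#define MAX_FRAME_LEN 1024 /* hardware frame-size limit quoted in the commit */

/* Return the bit offset of `slot` from the start of the frame, or -1 if the
 * layout is rejected. `i2s_like` selects the 50% duty-cycle frame clock
 * convention; false selects DSP-style ascending numbering. */
int tdm_slot_offset(bool i2s_like, int slots, int slot_width, int slot)
{
    if (slots <= 0 || slot_width <= 0 || slot < 0 || slot >= slots)
        return -1;

    int frame_len = slots * slot_width;
    if (frame_len > MAX_FRAME_LEN)
        return -1;                      /* frame too long for the hardware */

    if (!i2s_like)
        return slot * slot_width;       /* DSP: slots 0, 1, 2, 3, ... */

    if (slots % 2 != 0)
        return -1;                      /* odd slot count in I2S mode */

    /* I2S-like: even slots fill the first half of the frame in order,
     * odd slots fill the second half in order. */
    return (slot % 2) * (frame_len / 2) + (slot / 2) * slot_width;
}
```

With four 32-bit slots in an I2S-like mode, slots 0 and 2 land at offsets 0 and 32 in the first half of the frame, while slots 1 and 3 land at offsets 64 and 96 in the second half.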
Phil Elwell	69b9dec468	SQUASH: mmc: Apply ERASE_BROKEN quirks correctly Squash with: mmc: Add MMC_QUIRK_ERASE_BROKEN for some cards Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:32 +01:00
Phil Elwell	021a18ae8f	overlays: README: remove vestigial SDIO parameters Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:31 +01:00
Phil Elwell	5dfec3b239	BCM270X_DT: Add midi-uart1 overlay Add a scaler to the ttyS0 clock so that requesting 38400 baud results in an approximately 31250 baud signal. This is analogous to midi-uart0, except for ttyS0, which may be useful on Pi3 and also may avoid an issue with ttyAMA0 failing to synchronise to an active data stream. See: https://www.raspberrypi.org/forums/viewtopic.php?f=107&t=183860 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:31 +01:00
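The arithmetic behind the midi-uart1 clock scaler can be checked with a short sketch. The function name `scaled_baud()` is hypothetical, and the ratio 625/768 is simply 31250/38400 in lowest terms; the multiplier and divider actually used by the overlay are not given in the commit message and may differ, which would explain the word "approximately".

```c
#include <stdint.h>

/* Baud rate produced on the wire when `requested` baud is asked of a UART
 * whose reference clock has been scaled by mult/div. */
uint32_t scaled_baud(uint32_t requested, uint32_t mult, uint32_t div)
{
    /* Widen to 64 bits so the intermediate product cannot overflow. */
    return (uint32_t)(((uint64_t)requested * mult) / div);
}
```

For example, `scaled_baud(38400, 625, 768)` evaluates to 31250, the MIDI line rate.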
Phil Elwell	d06400e148	serial: 8250: Fix THRE flag usage for CAP_MINI The BCM2835 MINI UART has non-standard THRE semantics. Conventionally the bit means that the FIFO is empty (although there may still be a byte in the transmit register), but on 2835 it indicates that the FIFO is not full. This causes interrupts after every byte is transmitted, with the FIFO providing some interrupt latency tolerance. A consequence of this difference is that the usual strategy of writing multiple bytes into the TX FIFO after checking THRE once is unsafe. In the worst case of 7 bytes in the FIFO, writing 8 bytes loses all but the first since by then the FIFO is full. There is an HFIFO ("Hidden FIFO") bit which is almost what is needed, but it only adds more bytes while both THRE and TEMT are set, i.e. when the TX side is completely idle. This is unnecessarily pessimistic. Add a new special case, predicated on CAP_MINI, that loops until THRE is no longer set. With this change, the FIFO fills quickly but subsequent writes are paced by the transmission rate. See: https://github.com/raspberrypi/linux/issues/1855 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:30 +01:00
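The difference between the two transmit strategies in the THRE commit above can be modelled with a toy FIFO. This is a sketch under assumptions, not the 8250 driver code: `struct mini_uart`, `tx_burst()` and `tx_paced()` are hypothetical names, the FIFO depth of 8 is taken from the "7 bytes in the FIFO, writing 8" example, and THRE is modelled as the mini-UART's "room for another byte" flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIFO_DEPTH 8 /* assumed depth, implied by the commit's worst case */

struct mini_uart {
    size_t fill; /* bytes currently queued in the TX FIFO */
};

/* Modelled mini-UART THRE: set while the FIFO can accept another byte. */
static bool thre(const struct mini_uart *u)
{
    return u->fill < FIFO_DEPTH;
}

/* Conventional strategy: check THRE once, then write a whole burst.
 * Returns the number of bytes lost because the FIFO was already full. */
size_t tx_burst(struct mini_uart *u, size_t want)
{
    size_t lost = 0;

    if (!thre(u))
        return 0; /* nothing attempted, nothing lost */
    for (size_t i = 0; i < want; i++) {
        if (u->fill < FIFO_DEPTH)
            u->fill++;
        else
            lost++; /* write to a full FIFO is dropped */
    }
    return lost;
}

/* CAP_MINI strategy: re-check THRE before every byte and stop once it
 * clears. Returns the number of bytes written; the rest stay queued in
 * the caller's buffer for the next interrupt. */
size_t tx_paced(struct mini_uart *u, size_t want)
{
    size_t written = 0;

    while (written < want && thre(u)) {
        u->fill++;
        written++;
    }
    return written;
}
```

Starting from 7 queued bytes, `tx_burst()` with 8 bytes loses 7 of them, matching the worst case in the commit message, whereas `tx_paced()` writes a single byte and loses none.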
P33M	1eb2efce82	dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints Certain hub types do not discriminate between pipe direction (IN or OUT) when considering non-periodic transfers. Therefore these hubs get confused if multiple transfers are issued in different directions with the same device address and endpoint number. Constrain queuing non-periodic split transactions so they are performed serially in such cases. Related: https://github.com/raspberrypi/linux/issues/2024	2017-07-21 15:30:29 +01:00
P33M	d456b51f02	dwc_otg: add module parameter int_ep_interval_min Add a module parameter (defaulting to ignored) that clamps the polling rate of high-speed Interrupt endpoints to a minimum microframe interval. The parameter is modifiable at runtime as it is used when activating new endpoints (such as on device connect).	2017-07-21 15:30:29 +01:00
popcornmix	fc9cb00f47	config: Add CONFIG_CAN_GS_USB	2017-07-21 15:30:28 +01:00
P33M	b40f3f3b21	dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly Get rid of the spammy printk and local pointer mangling. Also, there is a nominal benefit for using fiq_fsm for isochronous transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second) so remove the root port speed check.	2017-07-21 15:30:28 +01:00
Phil Elwell	a4044a28c6	serial: 8250: Add CAP_MINI, set for bcm2835aux commit `d087e7a991` upstream. The AUX/mini-UART in the BCM2835 family of processors is a cut-down 8250 clone. In particular it is lacking support for the following features: CSTOPB PARENB PARODD CMSPAR CS5 CS6 Add a new capability (UART_CAP_MINI) that exposes the restrictions to the user of the termios API by turning off the unsupported features in the request. N.B. It is almost possible to automatically discover the missing features by reading back the LCR register, but the CSIZE bits don't cooperate (contrary to the documentation, both bits are significant, but CS5 and CS6 are mapped to CS7) and the code is much longer. See: https://github.com/raspberrypi/linux/issues/1561 Signed-off-by: Phil Elwell <phil@raspberrypi.org> Acked-by: Eric Anholt <eric@anholt.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 15:30:27 +01:00
P33M	78d631dbba	dwc_otg: make periodic scheduling behave properly for FS buses If the root port is in full-speed mode, transfer times at 12mbit/s would be calculated but matched against high-speed quotas. Reinitialise hcd->frame_usecs[i] on each port enable event so that full-speed bandwidth can be tracked sensibly. Also, don't bother using the FIQ for transfers when in full-speed mode - at the slower bus speed, interrupt frequency is reduced by an order of magnitude. Related issue: https://github.com/raspberrypi/linux/issues/2020	2017-07-21 15:30:27 +01:00
Bilal Amarni	3bd4441377	[ARM64] enable drivers for GPIO expander and vcio	2017-07-21 15:30:26 +01:00
Phil Elwell	1f7eba341f	clk: bcm2835: Minimise clock jitter for PCM clock Fractional clock dividers generate accurate average frequencies but with jitter, particularly when the integer divisor is small. Introduce a new metric of clock accuracy to penalise clocks with a good average but worse jitter compared to clocks with an average which is no better but with lower jitter. The metric is the ideal rate minus the worst deviation from that ideal using the nearest integer divisors. Use this metric for parent selection for clocks requiring low jitter (currently just PCM). Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:26 +01:00
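The jitter metric described in the commit above ("the ideal rate minus the worst deviation from that ideal using the nearest integer divisors") can be worked through in a few lines. This is a standalone sketch, not the clk-bcm2835 implementation: `low_jitter_metric()` is a hypothetical name, and the parent rates used in the example (a 19.2 MHz oscillator and a 500 MHz PLLD_PER) are assumed values chosen for illustration.

```c
#include <stdint.h>

/* Score a parent clock for a low-jitter consumer: higher is better.
 * A fractional divider alternates between the two nearest integer
 * divisors, so the instantaneous rate swings between parent/div_lo and
 * parent/(div_lo + 1). The score is the target rate minus the larger of
 * those two deviations. */
uint64_t low_jitter_metric(uint64_t parent_hz, uint64_t target_hz)
{
    if (target_hz == 0)
        return 0;

    uint64_t div_lo = parent_hz / target_hz; /* integer divisor at or below ideal */
    if (div_lo == 0)
        return 0;                 /* parent is slower than the target */
    if (parent_hz % target_hz == 0)
        return target_hz;         /* exact integer divide: no jitter */

    uint64_t fast = parent_hz / div_lo;       /* rate at or above the target */
    uint64_t slow = parent_hz / (div_lo + 1); /* rate at or below the target */
    uint64_t dev_fast = fast - target_hz;
    uint64_t dev_slow = target_hz - slow;
    uint64_t worst = dev_fast > dev_slow ? dev_fast : dev_slow;

    return worst > target_hz ? 0 : target_hz - worst;
}
```

For a 1536000 Hz target, the assumed 19.2 MHz parent needs a divisor of 12.5 and scores 1472000, while the assumed 500 MHz parent needs a divisor near 325.5 and scores 1533539, so the faster parent is preferred even though both can hit the same average rate.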
Phil Elwell	2006379501	clk: bcm2835: Limit PCM clock to OSC and PLLD_PER It is unwise to use sources other than the oscillator and PLLD_PER for the PCM peripheral (and perhaps others - TBD) because their rate can change and they may even be switched off, so explicitly restrict the choice using dummy entries in the list of potential parents (item index is significant). See: https://github.com/raspberrypi/linux/issues/1949 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:25 +01:00
popcornmix	e3557cd68b	config: Add CONFIG_I2C_ROBOTFUZZ_OSIF	2017-07-21 15:30:24 +01:00
popcornmix	9c3f9a7ec7	config: Add CONFIG_TOUCHSCREEN_EDT_FT5X06	2017-07-21 15:30:24 +01:00
popcornmix	8128372c29	config: Add FB_TFT_ST7789V module	2017-07-21 15:30:23 +01:00
popcornmix	fb5e99a3eb	config: Add CONFIG_TOUCHSCREEN_GOODIX	2017-07-21 15:30:23 +01:00
P33M	bbdff72ce5	dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt Host channels are already halted in kill_urbs_in_qh_list() with the subsequent interrupt processing behaving as if the URB was dequeued via HCD callback. There's no need to clobber the host channel registers a second time as this exposes races between the driver and host channel resulting in hcd->free_hc_list becoming corrupted.	2017-07-21 15:30:22 +01:00
P33M	38adc2cf4e	dwc_otg: delete hcd->channel_lock The lock serves no purpose as it is only held while the HCD spinlock is already being held.	2017-07-21 15:30:21 +01:00
P33M	ec433ae456	dwc_otg: fix several potential crash sources On root port disconnect events, the host driver state is cleared and in-progress host channels are forcibly stopped. This doesn't play well with the FIQ running in the background, so: - Guard the disconnect callback with both the host spinlock and FIQ spinlock - Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out so we don't dereference a qtd that has gone away - Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.	2017-07-21 15:30:21 +01:00
Phil Elwell	538a327fa9	BCM270X_DT: Tidy up mmc, sdhost, sdio overlays The mmc, sdhost, sdio and sdio-1bit overlays had a few anachronisms and oddities which were overdue for fixing. The new versions should be functionally equivalent. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:20 +01:00
popcornmix	71419dd874	vc4_fkms: Apply firmware overscan offset to hardware cursor	2017-07-21 15:30:20 +01:00
popcornmix	32b2124773	squash: vc4_firmware_kms fixups	2017-07-21 15:30:19 +01:00
Eric Anholt	9982c2e002	drm/vc4: Fix sending of page flip completion events in FKMS mode. In the rewrite of vc4_crtc.c for fkms, I dropped the part of the CRTC's atomic flush handler that moved the completion event from the proposed atomic state change to the CRTC's current state. That meant that when full screen pageflipping happened (glxgears -fullscreen in X, compton, or weston), the app would end up blocked forever waiting to draw its next frame. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:30:19 +01:00
Eric Anholt	8904b89efb	drm/vc4: Add DRM_DEBUG_ATOMIC for the insides of fkms. Trying to debug weston on fkms involved figuring out what calls I was making to the firmware. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:30:18 +01:00
Eric Anholt	83a548e9ea	drm/vc4: Name the primary and cursor planes in fkms. This makes debugging nicer, compared to trying to remember what the IDs are. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:30:18 +01:00
Eric Anholt	f90d9f66c2	drm/vc4: Add a mode for using the closed firmware for display. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:30:17 +01:00
Nisar Sayed	a3df4d5531	According to RFC 2460, an IPv6 UDP checksum that evaluates to zero must be changed to 0xffff. This is not supported by the smsc95xx family, so enable csum offload only for IPv4 TCP/UDP packets. Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com> Reported-by: popcorn mix <popcornmix@gmail.com>	2017-07-21 15:30:17 +01:00
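The RFC 2460 rule referred to in the commit above is easy to state in software, which is what the stack falls back to when offload is disabled. The following is a minimal sketch of the final step only; `udp6_csum_finish()` is a hypothetical name and the function assumes the caller has already accumulated the ones'-complement sum over the pseudo-header, UDP header and payload.

```c
#include <stdint.h>

/* Fold a 32-bit ones'-complement accumulator to 16 bits and finalise it as
 * a UDP checksum for IPv6. A computed value of zero is transmitted as
 * 0xffff, because zero is not a valid UDP checksum over IPv6. */
uint16_t udp6_csum_finish(uint32_t sum)
{
    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    uint16_t csum = (uint16_t)~sum;
    return csum == 0 ? 0xffff : csum;
}
```

Hardware that emits the zero value unchanged produces packets that an IPv6 receiver will discard, which is why the commit restricts offload to IPv4.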
popcornmix	83e1965d42	bcm2708_fb: Avoid firmware mbox call in vc_mem_copy If firmware has locked up it is useful to get vcdbg log out without a firmware mbox response. Issue the mbox call at probe time instead. Signed-off-by: popcornmix <popcornmix@gmail.com>	2017-07-21 15:30:16 +01:00
P33M	1af74879f3	fiq_fsm: Use correct states when starting isoc OUT transfers In fiq_fsm_start_next_periodic() if an isochronous OUT transfer was selected, no regard was given as to whether this was a single-packet transfer or a multi-packet staged transfer. For single-packet transfers, this had the effect of repeatedly sending OUT packets with bogus data and lengths. Eventually if the channel was repeatedly enabled enough times, this would lock up the OTG core and no further bus transfers would happen. Set the FSM state up properly if we select a single-packet transfer. Fixes https://github.com/raspberrypi/linux/issues/1842	2017-07-21 15:30:16 +01:00
popcornmix	2d5be3aff2	vcsm: Treat EBUSY as success rather than SIGBUS Currently if two cores access the same page concurrently one will return VM_FAULT_NOPAGE and the other VM_FAULT_SIGBUS crashing the user code. Also report when mapping fails. Signed-off-by: popcornmix <popcornmix@gmail.com>	2017-07-21 15:30:15 +01:00
P33M	5ab62a4cac	dwc_otg: fix split transaction data toggle handling around dequeues See https://github.com/raspberrypi/linux/issues/1709 Fix several issues regarding endpoint state when URBs are dequeued - If the HCD is disconnected, flush FIQ-enabled channels properly - Save the data toggle state for bulk endpoints if the last transfer from an endpoint where URBs were dequeued returned a data packet - Reset hc->start_pkt_count properly in assign_and_init_hc()	2017-07-21 15:30:14 +01:00
P33M	bef8fb8208	dwc_otg: make nak_holdoff work as intended with empty queues If URBs reading from non-periodic split endpoints were dequeued and the last transfer from the endpoint was a NAK handshake, the resulting qh->nak_frame value was stale which would result in unnecessarily long polling intervals for the first subsequent transfer with a fresh URB. Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against a case where a single URB is submitted to the endpoint, a NAK was received on the transfer immediately prior to receiving data and the device subsequently resubmits another URB past the qh->nak_frame interval. Fixes https://github.com/raspberrypi/linux/issues/1709	2017-07-21 15:30:14 +01:00
Yasunari Takiguchi	ea1074f9ef	BCM2708: Add Raspberry Pi TV HAT Device Tree Support This is an EXAMPLE CODE of Raspberry Pi TV HAT device tree overlay. Although this is not a part of our release code, it has been used to verify CXD2880 device driver with TV HAT. Add the following line to /boot/config.txt to enable TV HAT: dtoverlay=rpi-tv Reboot Raspberry Pi and check the existence of /proc/device-tree/soc/spi@7e204000/cxd2880@0. If it exists, the installation is successful. You should be able to find the following three files. /dev/dvb/adapter0/frontend0 /dev/dvb/adapter0/demux0 /dev/dvb/adapter0/dvr0 Signed-off-by: Yasunari Takiguchi <Yasunari.Takiguchi@sony.com> Signed-off-by: Masayuki Yamamoto <Masayuki.Yamamoto@sony.com> Signed-off-by: Hideki Nozawa <Hideki.Nozawa@sony.com> Signed-off-by: Kota Yonezawa <Kota.Yonezawa@sony.com> Signed-off-by: Toshihiko Matsumoto <Toshihiko.Matsumoto@sony.com> Signed-off-by: Satoshi Watanabe <Satoshi.C.Watanabe@sony.com>	2017-07-21 15:30:13 +01:00
Yasunari Takiguchi	4cc5ed6c6e	This is the driver for Sony CXD2880 DVB-T2/T tuner + demodulator. It includes the CXD2880 driver and the CXD2880 SPI adapter. The current CXD2880 driver version is 1.4.1 - 1.0.1 released on April 13, 2017. Signed-off-by: Yasunari Takiguchi <Yasunari.Takiguchi@sony.com> Signed-off-by: Masayuki Yamamoto <Masayuki.Yamamoto@sony.com> Signed-off-by: Hideki Nozawa <Hideki.Nozawa@sony.com> Signed-off-by: Kota Yonezawa <Kota.Yonezawa@sony.com> Signed-off-by: Toshihiko Matsumoto <Toshihiko.Matsumoto@sony.com> Signed-off-by: Satoshi Watanabe <Satoshi.C.Watanabe@sony.com>	2017-07-21 15:30:13 +01:00
BabuSubashChandar	744ba4009b	Add clock changes and mute gpios (#1938) Also improve code style and adhere to ALSA coding conventions. Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net> Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com> Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>	2017-07-21 15:30:12 +01:00
BabuSubashChandar C	223866f5de	Add support for new clock rate and mute gpios. Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net> Reviewed-by: Deepak <deepak@zilogic.com> Reviewed-by: BabuSubashChandar <babusubashchandar@zilogic.com>	2017-07-21 15:30:12 +01:00
BabuSubashChandar	b25036f84f	Add support for Allo Boss DAC add-on board for Raspberry Pi. (#1924) Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net> Reviewed-by: Deepak <deepak@zilogic.com> Reviewed-by: BabuSubashChandar <babusubashchandar@zilogic.com>	2017-07-21 15:30:11 +01:00
Raashid Muhammed	79f0d4d43e	Add support for Allo Piano DAC 2.1 plus add-on board for Raspberry Pi. The Piano DAC 2.1 has support for 4 channels with subwoofer. Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net> Reviewed-by: Vijay Kumar B. <vijaykumar@zilogic.com> Reviewed-by: Raashid Muhammed <raashidmuhammed@zilogic.com>	2017-07-21 15:30:11 +01:00
Peter Malkin	20d83f4d2f	Driver support for Google voiceHAT soundcard.	2017-07-21 15:30:10 +01:00
Matt Flax	c797f2de43	AudioInjector Octo: sample rates, regulators, reset This patch adds new sample rates to the Audioinjector Octo sound card. The new supported rates are (in kHz) : 96, 48, 32, 24, 16, 8, 88.2, 44.1, 29.4, 22.05, 14.7 Reference the bcm270x DT regulators in the overlay. This patch adds a reset GPIO for the AudioInjector.net octo sound card.	2017-07-21 15:30:09 +01:00
popcornmix	5696ef2f9d	config: Add back MMC_BCM2835_DMA	2017-07-21 15:30:09 +01:00
Phil Elwell	4a423e7c1a	leds-gpio: Remove stray assignment to brightness_set The brightness_set method is intended for use cases that must not block, and can only be used if the GPIO provider can never sleep. Remove an accidental initialisation (a copy-and-paste error) that sets it regardless, which has been seen to cause crashes with the gpio expander driver. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:08 +01:00
Phil Elwell	38340e4214	BCM270X_DT: Allow multiple instances of w1-gpio overlays Upcoming firmware will modify the address portion of node names when their "reg" property is written by a dtparam. Modify the w1-gpio overlays to write the gpiopin parameter value to "reg" properties, so that multiple instances can be loaded simultaneously. Note: The value of the "address" is unimportant - the w1 subsystem assigns instance numbers to buses sequentially from 1, and it is not necessary to know which bus a device is on in order to find it.	2017-07-21 15:30:08 +01:00
Phil Elwell	41b3e088fa	BCM270X_DT: Add lm75 to i2c-sensor overlay See: https://www.raspberrypi.org/forums/viewtopic.php?f=107&t=177236 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:07 +01:00
John Greb	529aaf8a53	Match dwc2 device-tree fifo sizes to the hardware values. Since commit `aa381a7259` was reverted with `3fa9538539` the g-tx-fifo-size array in the device-tree needs to match the preset values in the bcm2835. Resolves https://github.com/raspberrypi/linux/issues/1876	2017-07-21 15:30:07 +01:00
Phil Elwell	8d980078f2	thermal: Compatible strings for bcm2836, bcm2837 The upstream dt-bindings documentation for bcm2835-thermal (which exists even though the driver isn't upstreamed) says to use dedicated compatible strings on bcm2836 and bcm2837, even though the downstream driver doesn't support them. The Pi2 DTB uses "brcm,bcm2836-thermal", so the driver doesn't load. The Pi3 DTB doesn't override the base value, but the arm64 Pi3 support uses "brcm,bcm2837-thermal". Solve the documentation problem by adding "brcm,bcm2836-thermal" and "brcm,bcm2837-thermal" as alternative compatible strings for the bcm2835-thermal driver. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:06 +01:00
Phil Elwell	940f718490	bcm2835-sdhost: mmc_card_blockaddr fix Get the definition of mmc_card_blockaddr from drivers/mmc/core/card.h. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:06 +01:00
Phil Elwell	52c4b18f09	mmc: Add MMC_QUIRK_ERASE_BROKEN for some cards Some SD cards have been found that corrupt data when small blocks are erased. Add a quirk to indicate that ERASE should not be used, and set it for cards of that type. Signed-off-by: Phil Elwell <phil@raspberrypi.org> mmc: Apply QUIRK_BROKEN_ERASE to other capacities Signed-off-by: Phil Elwell <phil@raspberrypi.org> mmc: Add card_quirks module parameter, log quirks Use mmc_block.card_quirks to override the quirks for all SD or MMC cards. The value is a bitfield using the bit positions defined in include/linux/mmc/card.h. If the module parameter is placed in the kernel command line (or bootargs) stored on the card then, assuming the device only has one SD card interface, the override effectively becomes card-specific. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:05 +01:00
Phil Elwell	2f41c261cc	config: Re-enable the bcm2835-mmc driver With the patch to assign mmc device IDs based on DT aliases and appropriate aliases in the rpi DTBs, it is now safe to re-enable the bcm2835-mmc driver. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:05 +01:00
Phil Elwell	86ff9317f8	BCM270X_DT: Add numbered aliases for SD/MMC devices In order to force a specific ID assignment to SD/MMC devices, add numbered aliases to the DT: sdhost -> mmc0, mmc -> mmc1 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:04 +01:00
Stefan Agner	4575781e19	mmc: read mmc alias from device tree To get the SD/MMC host device ID, read the alias from the device tree. This is useful in case a SoC has multiple SD/MMC host controllers while the second controller should logically be the first device (e.g. if the second controller is connected to an internal eMMC). Combined with block device numbering using MMC/SD host device ID, this results in predictable name assignment of the internal eMMC block device. Signed-off-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Dmitry Torokhov <dtor@chromium.org> [dianders: rebase + roll in http://crosreview.com/259916] Signed-off-by: Douglas Anderson <dianders@chromium.org>	2017-07-21 15:30:03 +01:00
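The alias lookup described in this commit can be sketched roughly as follows. This is an illustrative Python model, not the kernel code: the `aliases` mapping and the `host_id_from_alias` helper are hypothetical names, and only the `sdhost -> mmc0, mmc -> mmc1` assignment is taken from the BCM270X_DT commit listed above.

```python
import re

def host_id_from_alias(aliases, node, stem="mmc"):
    """Return the number n of the first '<stem><n>' alias that points at node.

    Returns None when no matching alias exists, in which case a caller
    would fall back to its usual ID allocation (assumed, not shown here).
    """
    for name, target in aliases.items():
        match = re.fullmatch(rf"{stem}(\d+)", name)
        if match and target == node:
            return int(match.group(1))
    return None

# Hypothetical alias table mirroring "sdhost -> mmc0, mmc -> mmc1".
aliases = {"mmc0": "sdhost", "mmc1": "mmc", "serial0": "uart0"}
```

With this table, the `sdhost` node would get ID 0 and the `mmc` node ID 1 regardless of probe order, which is the predictable naming the commit message is after.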
Phil Elwell	b24e557c48	mkknlimg: Find some more downstream-only strings See: https://github.com/raspberrypi/linux/issues/1920 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:03 +01:00
Phil Elwell	581ebb3d87	BCM270X_DT: Enable AUX interrupt controller in DT See: https://github.com/raspberrypi/linux/issues/1484 https://github.com/raspberrypi/linux/issues/1573 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:02 +01:00
Phil Elwell	20699c212b	bcm2835-aux: Add aux interrupt controller The AUX block has a shared interrupt line with a register indicating which devices have active IRQs. Expose this as a nested interrupt controller to avoid sharing problems. See: https://github.com/raspberrypi/linux/issues/1484 https://github.com/raspberrypi/linux/issues/1573 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:02 +01:00
Phil Elwell	9d6d67ce56	ASoC: Add prompt for ICS43432 codec Without a prompt string, a config setting can't be included in a defconfig. Give CONFIG_SND_SOC_ICS43432 a prompt so that Pi soundcards can use the driver. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:01 +01:00
Phil Elwell	4491d298ad	config: Make spidev a loadable module spidev isn't required early in the boot process, and not all users need it (spi_bcm2835 is a module), so make it a loadable module. See: https://github.com/raspberrypi/linux/issues/1897 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:30:01 +01:00
popcornmix	5af260d714	config: disable MMC driver temporarily for now. Currently causes a breakage to sdhost driver. However when MMC is disabled Pi3 wifi will not work	2017-07-21 15:30:00 +01:00
Dave Stevenson	18cb72ffea	bcm2835-gpio-exp: Copy/paste error adding base twice brcmexp_gpio_set was adding gpio->gc.base to the offset twice, so passing an invalid number to the mailbox service. The firmware treated it modulo-8 anyway, but was logging an assert every time. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>	2017-07-21 15:30:00 +01:00
Dave Stevenson	b17ef8a62a	BCM270X_DT: Invert Pi3 power LED to match fw change Firmware expgpio driver reworked due to complaint over hotplug detect. Requires power LED to change sense as firmware is no longer inverting the read value. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>	2017-07-21 15:29:59 +01:00
Phil Elwell	0ab43d71be	pinctrl-bcm2835: Fix interrupt handling for GPIOs 28-31 and 46-53 Contrary to the documentation, the BCM2835 GPIO controller actually has four interrupt lines - one each for the three IRQ groups and one common. Rather confusingly, the GPIO interrupt groups don't correspond directly with the GPIO control banks. Instead, GPIOs 0-27 generate IRQ GPIO0, 28-45 GPIO1 and 46-53 GPIO2. Awkwardly, the GPIOs for IRQ GPIO1 straddle two 32-entry GPIO banks, so it is cleaner to split out a function to process the interrupts for a single GPIO bank. This bug has only just been observed because GPIOs above 27 can only be accessed on an old Raspberry Pi with the optional P5 header fitted, where the pins are often used for I2S instead.	2017-07-21 15:29:59 +01:00
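The mismatch between IRQ groups and control banks described in this commit can be illustrated with a small sketch. The function names are hypothetical, and the code only models the ranges quoted in the message (0-27, 28-45, 46-53, with 32 GPIOs per bank); it is not the driver's implementation.

```python
GPIOS_PER_BANK = 32  # "32-entry GPIO banks" per the commit message

def irq_group(gpio):
    """Map a GPIO number to its IRQ group (0 for GPIO0, 1 for GPIO1, 2 for GPIO2)."""
    if not 0 <= gpio <= 53:
        raise ValueError("BCM2835 GPIO numbers run from 0 to 53")
    if gpio <= 27:
        return 0
    if gpio <= 45:
        return 1
    return 2

def bank(gpio):
    """Map a GPIO number to its control bank."""
    return gpio // GPIOS_PER_BANK

def banks_for_group(group):
    """List the control banks touched by one IRQ group."""
    return sorted({bank(g) for g in range(54) if irq_group(g) == group})
```

Here `banks_for_group(1)` covers both bank 0 (GPIOs 28-31) and bank 1 (GPIOs 32-45), which is why handling interrupts one bank at a time is the cleaner split.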
Eric Anholt	f5abd50d02	clk: bcm2835: Mark GPIO clocks enabled at boot as critical. These divide off of PLLD_PER and are used for the ethernet and wifi PHYs source PLLs. Neither of them is currently represented by a phy device that would grab the clock for us. This keeps other drivers from killing the networking PHYs when they disable their own clocks and trigger PLLD_PER's refcount going to 0. v2: Skip marking as critical if they aren't on at boot. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:29:58 +01:00
Khem Raj	67674d836c	build/arm64: Add rules for .dtbo files for dts overlays We now create overlays as .dtbo files. Signed-off-by: Khem Raj <raj.khem@gmail.com>	2017-07-21 15:29:58 +01:00
Michael Zoran	085f3cb97d	ARM64: Force hardware emulation of deprecated instructions.	2017-07-21 15:29:57 +01:00
Michael Zoran	e6b06f4a98	ARM64: Enable DWC_OTG Driver In ARM64 Build Config(bcmrpi3_defconfig) Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:57 +01:00
Michael Zoran	d230a37d1b	ARM64: Round-Robin dispatch IRQs between CPUs. IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:56 +01:00
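As a rough illustration of round-robin dispatch (not the actual patch), the sketch below hands each IRQ to the next CPU in turn. The IRQ numbers, the CPU list, and the `assign_irqs` helper are all hypothetical.

```python
from itertools import cycle

def assign_irqs(irqs, cpus):
    """Assign each IRQ to the next CPU in turn, wrapping around at the end."""
    cpu_cycle = cycle(cpus)
    return {irq: next(cpu_cycle) for irq in irqs}

# Five made-up IRQ numbers spread over four cores.
mapping = assign_irqs([10, 11, 12, 13, 14], [0, 1, 2, 3])
```

Spreading interrupts this way lets several of them be serviced at once on different cores, which is the concurrency gain the commit message refers to.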
Michael Zoran	357090b9b0	ARM64/DWC_OTG: Port dwc_otg driver to ARM64 In ARM64, the FIQ mechanism used by this driver is not currently implemented. As a workaround, regular IRQ is used instead of FIQ. In a separate change, the IRQ-CPU mapping is round robined on ARM64 to increase concurrency and allow multiple interrupts to be serviced at a time. This reduces the need for FIQ. Tests Run: This mechanism is most likely to break when multiple USB devices are attached at the same time. So the system was tested under stress. Devices: 1. USB Speakers playing back a FLAC audio through VLC at 96KHz.(Higher than typical, but supported on my speakers). 2. sftp transferring large files through the built-in ethernet connection which is connected through USB. 3. Keyboard and mouse attached and being used. Although I do occasionally hear some glitches, the music seems to play quite well. Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:56 +01:00
Michael Zoran	04e714d7fb	ARM64: Enable RTL8187/RTL8192CU wifi in build config These drivers build now, so they can be enabled back in the build configuration just like they are for 32 bit. Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:55 +01:00
Michael Zoran	f379156a1b	ARM64: Fix build break for RTL8187/RTL8192CU wifi These drivers use an ASM function from the base system to compute the ipv6 checksum. These functions are not available on ARM64, probably because nobody has bothered to write them. The base system does have a generic "C" version, so a simple fix is to include the header to use the generic version on ARM64 only. A longer term solution would be to submit the necessary ASM function to the upstream source. With this change, these drivers now compile without any errors on ARM64. Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:55 +01:00
Electron752	3af9c46623	ARM64: Enable Kernel Address Space Randomization (#1792 ) Randomization allows the mapping between virtual addresses and physical addresses to be different on each boot. This makes it more difficult to exploit security vulnerabilities that require knowledge of fixed hardware addresses. The firmware generates an 8-byte random number during bootup and stores it in the device tree under chosen/kaslr-seed. This number is used to randomize the address mapping. This change enables this feature in the build configuration for ARM64. Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:54 +01:00
Michael Zoran	e9b3c9f6f6	ARM64: Run bcmrpi3_defconfig through savedefconfig. Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:53 +01:00
Michael Zoran	306e8019a3	ARM64: Enable HDMI audio and vc04_services in bcmrpi3_defconfig Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:53 +01:00
Electron752	6925c8c39b	ARM64: Make it work again on 4.9 (#1790 ) * Invoke the dtc compiler with the same options used in arm mode. * ARM64 now uses the bcm2835 platform just like ARM32. * ARM64: Update bcmrpi3_defconfig Signed-off-by: Michael Zoran <mzoran@crowfest.net>	2017-07-21 15:29:52 +01:00
Eric Anholt	45b125f5af	raspberrypi-firmware: Export the general transaction function. The vc4-firmware-kms module is going to be doing the MBOX FB call. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:29:52 +01:00
Eric Anholt	6607493d9f	raspberrypi-firmware: Define the MBOX channel in the header. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:29:51 +01:00
Michael Zoran	ae9dd432c5	Add arm64 configuration and device tree differences. Disable MMC_BCM2835_SDHOST and MMC_BCM2835 since these drivers are crashing at the moment. ARM64: Modify default config to get raspbian to boot (#1686) 1. Enable emulation of deprecated instructions. 2. Enable ARM 8.1 and 8.2 features which are not detected at runtime. 3. Switch the default governor to powersave. 4. Include the watchdog timer driver in the kernel image rather than a module. Tested with raspbian-jessie 2016-09-23.	2017-07-21 15:29:51 +01:00
popcornmix	7f2b53b398	config: Add default configs	2017-07-21 15:29:50 +01:00
Phil Elwell	02e06c6abe	hci_h5: Don't send conf_req when ACTIVE Without this patch, a modem and kernel can continuously bombard each other with conf_req and conf_rsp messages, in a demented game of tag.	2017-07-21 15:29:50 +01:00
Phil Elwell	1031a399a6	brcmfmac: Mute expected startup 'errors' The brcmfmac WiFi driver always complains about the '00' country code and the firmware version is reported as an error. Modify the driver to ignore '00' silently and display firmware version at INFO level. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:49 +01:00
Cheong2K	8bc5391f4b	brcm: adds support for BCM43341 wifi brcmfmac: Disable power management Disable wireless power saving in the brcmfmac WLAN driver. This is a temporary measure until the connectivity loss resulting from power saving is resolved. Signed-off-by: Phil Elwell <phil@raspberrypi.org> brcmfmac: Use original country code as a fallback Commit `73345fd212`: brcmfmac: Configure country code using device specific settings prevents region codes from working on devices that lack a region code translation table. In the event of an absent table, preserve the old behaviour of using the provided code as-is. Signed-off-by: Phil Elwell <phil@raspberrypi.org> brcmfmac: Plug memory leak in brcmf_fill_bss_param See: https://github.com/raspberrypi/linux/issues/1471 Signed-off-by: Phil Elwell <phil@raspberrypi.org> brcmfmac: do not use internal roaming engine by default Some evidence of curing disconnects with this disabled, so make it a default. Can be overridden with module parameter roamoff=0 See: http://projectable.me/optimize-my-pi-wi-fi/ brcmfmac: Change stop_ap sequence Patch from Broadcom/Cypress to resolve a customer error Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:48 +01:00
Pantelis Antoniou	6736fcf88b	OF: DT-Overlay configfs interface This is a port of Pantelis Antoniou's v3 port that makes use of the new upstreamed configfs support for binary attributes. Original commit message: Add a runtime interface to using configfs for generic device tree overlay usage. With it, it's possible to use device tree overlays without having to use a per-platform overlay manager. Please see Documentation/devicetree/configfs-overlays.txt for more info. Changes since v2: - Removed ifdef CONFIG_OF_OVERLAY (since for now it's required) - Created a documentation entry - Slight rewording in Kconfig Changes since v1: - of_resolve() -> of_resolve_phandles(). Originally-signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Signed-off-by: Phil Elwell <phil@raspberrypi.org> DT configfs: Fix build errors on other platforms Signed-off-by: Phil Elwell <phil@raspberrypi.org> DT configfs: fix build error There is an error when compiling rpi-4.6.y branch: CC drivers/of/configfs.o drivers/of/configfs.c:291:21: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types] .default_groups = of_cfs_def_groups, ^ drivers/of/configfs.c:291:21: note: (near initialization for 'of_cfs_subsys.su_group.default_groups.next') The .default_groups is a linked list since commit `1ae1602de0`. This commit uses configfs_add_default_group to fix this problem. Signed-off-by: Slawomir Stepien <sst@poczta.fm>	2017-07-21 15:29:48 +01:00
Phil Elwell	8b0a226a1d	net: Fix rtl8192cu build errors on other platforms Signed-off-by: Phil Elwell <phil@raspberrypi.org> suppress spurious messages Add #if for 3.14 kernel change (#87) Fixes compiling after changes in `f663dd9aaf` and `99932d4fc0` Fixes #86 Set dev_type to wlan Fixes #23 Tentatively added support for more 8188CUS based devices. Add support for more 8188CUS and 8192CUS devices Add ProductId for the Netgear N150 - WNA1000M Fixes CONFIG_CONCURRENT_MODE CONFIG_MULTI_VIR_IFACES Fixes compatibility with 3.13 Enables warning in the compiler and fixes some issues, reference => https://github.com/diederikdehaas/rtl8812AU Starts device in station mode instead of monitor, fixes NetworkManager issues Enable cfg80211 support Fix cfg80211 for kernel >= 4.7 Fixes rtl8192cu for kernel >= 4.8	2017-07-21 15:29:47 +01:00
popcornmix	de9f499890	net: Add non-mainline source for rtl8192cu wlan Add non-mainline source for rtl8192cu wireless driver version v4.0.2_9000 as this is widely used. Disable older rtlwifi driver. 8192cu needs old wireless extensions The obsolete WIRELESS_EXT configuration is used by the old Realtek code and is needed for AP support. 8192cu: CONFIG_AP_MODE hardcoded in autoconf.h rtl8192c_rf6052: PHY_RFShadowRefresh(): fix off-by-one Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org> rtl8192cu: Add PID for D-Link DWA 131	2017-07-21 15:29:47 +01:00
Phil Elwell	722ffabf92	amba_pl011: Round input clock up The UART clock is initialised to be as close to the requested frequency as possible without exceeding it. Now that there is a clock manager that returns the actual frequencies, an expected 48MHz clock is reported as 47999625. If the requested baudrate == requested clock/16, there is no headroom and the slight reduction in actual clock rate results in failure. Detect cases where it looks like a "round" clock was chosen and adjust the reported clock to match that "round" value. As the code comment says: /* * If increasing a clock by less than 0.1% changes it * from ..999.. to ..000.., round up. */ Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:46 +01:00
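The rounding rule quoted in this commit's code comment can be sketched as below. This is a hedged Python model of the idea, not the driver's C code: the `round_clock` name and the exact choice of rounding granularity are assumptions made for illustration.

```python
def round_clock(clk_hz):
    """Round a clock such as 47999625 Hz up to its 'round' value of 48000000 Hz.

    The granularity (scaler) is chosen so that clk_hz has at most six
    significant digits above it, which keeps the adjustment well under
    0.1% of the clock. The clock is only changed when rounding up yields
    a value whose last three significant digits are zero.
    """
    scaler = 1
    while scaler * 100000 < clk_hz:
        scaler *= 10
    rounded = -(-clk_hz // scaler) * scaler  # ceiling to a multiple of scaler
    if (rounded // scaler) % 1000 == 0:
        return rounded
    return clk_hz

# A 3000000 baud request needs a clock of at least 16 * 3000000 Hz.
reported = 47999625
max_baud_before = reported // 16
max_baud_after = round_clock(reported) // 16
```

The last three lines show the headroom problem from the commit message: dividing the reported 47999625 Hz by 16 falls just short of 3000000, while the rounded clock reaches it.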
Phil Elwell	0a15a991e1	amba_pl011: Don't use DT aliases for numbering The pl011 driver looks for DT aliases of the form "serial<n>", and if found uses <n> as the device ID. This can cause /dev/ttyAMA0 to become /dev/ttyAMA1, which is confusing if the other serial port is provided by the 8250 driver which doesn't use the same logic.	2017-07-21 15:29:45 +01:00
Dave Stevenson	67e480c38a	bcm2835-gpio-exp: Driver for GPIO expander via mailbox service Pi3 and Compute Module 3 have a GPIO expander that the VPU communicates with. There is a mailbox service that now allows control of this expander, so add a kernel driver that can make use of it. Pwr_led node added to device-tree for Pi3. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>	2017-07-21 15:29:45 +01:00
popcornmix	9358b82507	bcm2835-virtgpio: Virtual GPIO driver Add a virtual GPIO driver that uses the firmware mailbox interface to request that the VPU toggles LEDs.	2017-07-21 15:29:44 +01:00
P33M	7989b11216	rpi_display: add backlight driver and overlay Add a mailbox-driven backlight controller for the Raspberry Pi DSI touchscreen display. Requires updated GPU firmware to recognise the mailbox request. Signed-off-by: Gordon Hollingworth <gordon@raspberrypi.org>	2017-07-21 15:29:44 +01:00
Matt Flax	a8cceaab7e	Add support for the AudioInjector.net Octo sound card	2017-07-21 15:29:43 +01:00
Fe-Pi	57a77d2169	Add support for Fe-Pi audio sound card. (#1867 ) Fe-Pi Audio Sound Card is based on NXP SGTL5000 codec. Mechanical specification of the board is the same as the Raspberry Pi Zero. 3.5mm jacks for Headphone/Mic, Line In, and Line Out. Signed-off-by: Henry Kupis <fe-pi@cox.net>	2017-07-21 15:29:42 +01:00
Miquel	557c68916d	sound: Support for Dion Audio LOCO-V2 DAC-AMP HAT Signed-off-by: Miquel Blauw <info@dionaudio.nl>	2017-07-21 15:29:42 +01:00
Matthias Reichl	3c3a8153d8	ASoC: Add driver for Cirrus Logic Audio Card Note: due to problems with deferred probing of regulators the following softdep should be added to a modprobe.d file softdep arizona-spi pre: arizona-ldo1 Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:29:41 +01:00
gtrainavicius	147619f796	Support for Blokas Labs pisound board Pisound dynamic overlay (#1760) Restructuring pisound-overlay.dts, so it can be loaded and unloaded dynamically using dtoverlay. Print a logline when the kernel module is removed. pisound improvements: * Added a writable sysfs object to enable scripts / user space software to blink MIDI activity LEDs for variable duration. * Improved hw_param constraints setting. * Added compatibility with S16_LE sample format. * Exposed some simple placeholder volume controls, so the card appears in volumealsa widget. Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>	2017-07-21 15:29:41 +01:00
Clive Messer	cc29410e38	Allo Piano DAC boards: Initial 2 channel (stereo) support (#1645 ) Add initial 2 channel (stereo) support for Allo Piano DAC (2.0/2.1) boards, using allo-piano-dac-pcm512x-audio overlay and allo-piano-dac ALSA ASoC machine driver. NB. The initial support is 2 channel (stereo) ONLY! (The Piano DAC 2.1 will only support 2 channel (stereo) left/right output, pending an update to the upstream pcm512x codec driver, which will have to be submitted via upstream. With the initial downstream support, provided by this patch, the Piano DAC 2.1 subwoofer outputs will not function.) Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net> Signed-off-by: Clive Messer <clive.messer@digitaldreamtime.co.uk> Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>	2017-07-21 15:29:40 +01:00
DigitalDreamtime	4a7d2259db	Add support for Dion Audio LOCO DAC-AMP HAT Using dedicated machine driver and pcm5102a codec driver. Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>	2017-07-21 15:29:39 +01:00
escalator2015	8394d8a2de	New driver for RRA DigiDAC1 soundcard using WM8741 + WM8804	2017-07-21 15:29:39 +01:00
DigitalDreamtime	2e5bd466ce	Add IQAudIO Digi WM8804 board support Support IQAudIO Digi board with iqaudio_digi machine driver and iqaudio-digi-wm8804-audio overlay. NB. Machine driver is a cut and paste of hifiberry_digi code, with format and general cleanup to comply with kernel coding standards. Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>	2017-07-21 15:29:38 +01:00
Matt Flax	698cb72b53	New AudioInjector.net Pi soundcard with low jitter audio in and out. Contains the sound/soc/bcm ALSA machine driver and necessary alterations to the Kconfig and Makefile. Adds the dts overlay and updates the Makefile and README. Updates the relevant defconfig files to enable building for the Raspberry Pi. Thanks to Phil Elwell (pelwell) for the review, simple-card concepts and discussion. Thanks to Clive Messer for overlay naming suggestions. Added support for headphones, microphone and bclk_ratio settings. This patch adds headphone and microphone capability to the Audio Injector sound card. The patch also sets the bit clock ratio for use in the bcm2835-i2s driver. The bcm2835-i2s can't handle an 8 kHz sample rate when the bit clock is at 12 MHz because its register is only 10 bits wide which can't represent the ch2 offset of 1508. For that reason, the rate constraint is added.	2017-07-21 15:29:38 +01:00
Andrey Grodzovsky	58c0e8414f	ARM: adau1977-adc: Add basic machine driver for adau1977 codec driver. This commit adds basic support for the codec usage including: Device tree overlay, binding I2S bus and setting I2S mode, clock source and frequency setting according to spec. Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>	2017-07-21 15:29:37 +01:00
Aaron Shaw	7a8cf67aa4	Add Support for JustBoom Audio boards justboom-dac: Adjust for ALSA API change As of 4.4, snd_soc_limit_volume now takes a struct snd_soc_card * rather than a struct snd_soc_codec *. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:37 +01:00
Jan Grulich	3361b2c837	RaspiDAC3 support Signed-off-by: Jan Grulich <jan@grulich.eu> config: fix RaspiDAC Rev.3x dependencies Change depends to SND_BCM2708_SOC_I2S \|\| SND_BCM2835_SOC_I2S like the other I2S soundcard drivers. Signed-off-by: Matthias Reichl <hias@horus.com>	2017-07-21 15:29:36 +01:00
Waldemar Brodkorb	f6e82126fc	Add driver for rpi-proto Forward port of 3.10.x driver from https://github.com/koalo We are using a custom board and would like to use rpi 3.18.x kernel. Patch works fine for our embedded system. URL to the audio chip: http://www.mikroe.com/add-on-boards/audio-voice/audio-codec-proto/ Playback tested with devicetree enabled. Signed-off-by: Waldemar Brodkorb <wbrodkorb@conet.de>	2017-07-21 15:29:35 +01:00
Ryan Coe	b0b389acb3	Update ds1307 driver for device-tree support Signed-off-by: Ryan Coe <bluemrp9@gmail.com>	2017-07-21 15:29:35 +01:00
Daniel Matuschek	007512668e	Added driver for HiFiBerry Amp amplifier add-on board The driver contains a low-level hardware driver for the TAS5713 and the drivers for the Raspberry Pi I2S subsystem. TAS5713: return error if initialisation fails Existing TAS5713 driver logs errors during initialisation, but does not return an error code. Therefore even if initialisation fails, the driver will still be loaded, but won't work. This patch fixes this. I2C communication error will now be reported correctly by a non-zero return code. HiFiBerry Amp: fix device-tree problems Some code to load the driver based on device-tree-overlays was missing. This is added by this patch.	2017-07-21 15:29:34 +01:00
Daniel Matuschek	9ad09e437c	Added support for HiFiBerry DAC+ The driver is based on the HiFiBerry DAC driver. However HiFiBerry DAC+ uses a different codec chip (PCM5122), therefore a new driver is necessary. Add support for the HiFiBerry DAC+ Pro. The HiFiBerry DAC+ and DAC+ Pro products both use the existing bcm sound driver with the DAC+ Pro having a special clock device driver representing the two high precision oscillators. An additional bug fix is included for the PCM512x codec whereby the physical size of the sample frame is used in the calculation of the LRCK divisor as it was found to be wrong when using 24-bit depth sample contained in a little endian 4-byte sample frame. Limit PCM512x "Digital" gain to 0dB by default with HiFiBerry DAC+ 24db_digital_gain DT param can be used to specify that PCM512x codec "Digital" volume control should not be limited to 0dB gain, and if specified will allow the full 24dB gain. Add dt param to force HiFiBerry DAC+ Pro into slave mode "dtoverlay=hifiberry-dacplus,slave" Add 'slave' param to use HiFiBerry DAC+ Pro in slave mode, with Pi as master for bit and frame clock. Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>	2017-07-21 15:29:34 +01:00
Gordon Garrity	e9a490642f	Add IQaudIO Sound Card support for Raspberry Pi Set a limit of 0dB on Digital Volume Control The main volume control in the PCM512x DAC has a range up to +24dB. This is dangerously loud and can potentially cause massive clipping in the output stages. Therefore this sets a sensible limit of 0dB for this control. Allow up to 24dB digital gain to be applied when using IQAudIO DAC+ 24db_digital_gain DT param can be used to specify that PCM512x codec "Digital" volume control should not be limited to 0dB gain, and if specified will allow the full 24dB gain. Modify IQAudIO DAC+ ASoC driver to set card/dai config from dt Add the ability to set the card name, dai name and dai stream name, from dt config. Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk> IQaudIO: auto-mute for AMP+ and DigiAMP+ IQAudIO amplifier mute via GPIO22. Add dt params for "one-shot" unmute and auto mute. Revision 2, auto mute implementing HiassofT suggestion to mute/unmute using set_bias_level, rather than startup/shutdown.... "By default DAPM waits 5 seconds (pmdown_time) before shutting down playback streams so a close/stop immediately followed by open/start doesn't trigger an amp mute+unmute." Tested on both AMP+ (via DAC+) and DigiAMP+, with both options... dtoverlay=iqaudio-dacplus,unmute_amp "one-shot" unmute when kernel module loads. dtoverlay=iqaudio-dacplus,auto_mute_amp Unmute amp when ALSA device opened by a client. Mute, with 5 second delay when ALSA device closed. (Re-opening the device within the 5 second close window, will cancel mute.) Revision 4, using gpiod. Revision 5, clean-up formatting before adding mute code. - Convert tab plus 4 space formatting to 2x tab - Remove '// NOT USED' commented code Revision 6, don't attempt to "one-shot" unmute amp, unless card is successfully registered. Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>	2017-07-21 15:29:33 +01:00
Daniel Matuschek	89222c7094	ASoC: BCM:Add support for HiFiBerry Digi. Driver is based on the patched WM8804 driver. Signed-off-by: Daniel Matuschek <daniel@matuschek.net> Add a parameter to turn off SPDIF output if no audio is playing This patch adds the parameter auto_shutdown_output to the kernel module. Default behaviour of the module is the same, but when auto_shutdown_output is set to 1, the SPDIF output will shutdown if no stream is playing. bugfix for 32kHz sample rate, was missing HiFiBerry Digi: set SPDIF status bits for sample rate The HiFiBerry Digi driver did not signal the sample rate in the SPDIF status bits. While this is optional, some DACs and receivers do not accept this signal. This patch adds the sample rate bits in the SPDIF status block. Added HiFiBerry Digi+ Pro driver Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>	2017-07-21 15:29:32 +01:00
Daniel Matuschek	f6ef1f57d3	ASoC: wm8804: Implement MCLK configuration options, add 32bit support WM8804 can run with PLL frequencies of 256xfs and 128xfs for most sample rates. At 192kHz only 128xfs is supported. The existing driver selects 128xfs automatically for some lower sample rates. By using an additional mclk_div divider, it is now possible to control the behaviour. This allows using 256xfs PLL frequency on all sample rates up to 96kHz. It should allow lower jitter and better signal quality. The behavior has to be controlled by the sound card driver, because some sample frequencies share the same setting. e.g. 192kHz and 96kHz use 24.576MHz master clock. The only difference is the MCLK divider. This also added support for 32bit data. Signed-off-by: Daniel Matuschek <daniel@matuschek.net>	2017-07-21 15:29:31 +01:00
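The arithmetic behind "192kHz and 96kHz use 24.576MHz master clock" can be checked with a short sketch. The `mclk_hz` helper is a hypothetical name, and the code only multiplies the sample rate by the 256xfs or 128xfs ratio named in the commit message; it does not model the WM8804 register settings.

```python
def mclk_hz(sample_rate_hz, fs_multiplier):
    """Master clock frequency for a sample rate at a given fs ratio (256 or 128)."""
    if fs_multiplier not in (128, 256):
        raise ValueError("the commit message only mentions 128xfs and 256xfs")
    return sample_rate_hz * fs_multiplier

mclk_96k = mclk_hz(96000, 256)    # 256xfs at 96 kHz
mclk_192k = mclk_hz(192000, 128)  # 128xfs at 192 kHz
```

Both cases come out at 24576000 Hz, so the master clock alone cannot tell them apart, which is why the sound card driver has to choose the MCLK divider.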
Florian Meier	a9bdb9df5b	ASoC: Add support for Rpi-DAC	2017-07-21 15:29:31 +01:00
Florian Meier	01bcb85ef7	ASoC: Add support for HifiBerry DAC This adds a machine driver for the HifiBerry DAC. It is a sound card that can be stacked onto the Raspberry Pi. Signed-off-by: Florian Meier <florian.meier@koalo.de>	2017-07-21 15:29:30 +01:00
Phil Elwell	af6690c0c7	mfd: Add Raspberry Pi Sense HAT core driver	2017-07-21 15:29:29 +01:00
Phil Elwell	b43e7d8b4f	gpio-poweroff: Allow it to work on Raspberry Pi The Raspberry Pi firmware manages the power-down and reboot process. To do this it installs a pm_power_off handler, causing the gpio-poweroff module to abort the probe function. This patch introduces a "force" DT property that overrides that behaviour, and also adds a DT overlay to enable and control it. Note that running in an active-low configuration (DT parameter "active_low") requires a custom dt-blob.bin and probably won't allow a reboot without switching off, so an external inversion of the trigger signal may be preferable.	2017-07-21 15:29:28 +01:00
popcornmix	898300b5c3	Improve __copy_to_user and __copy_from_user performance Provide a __copy_from_user that uses memcpy. On BCM2708, use optimised memcpy/memmove/memcmp/memset implementations. arch/arm: Add mmiocpy/set aliases for memcpy/set See: https://github.com/raspberrypi/linux/issues/1082 copy_from_user: CPU_SW_DOMAIN_PAN compatibility The downstream copy_from_user acceleration must also play nice with CONFIG_CPU_SW_DOMAIN_PAN. See: https://github.com/raspberrypi/linux/issues/1381 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:28 +01:00
Gordon Hollingworth	2bc6147011	rpi-ft5406: Add touchscreen driver for pi LCD display Fix driver detection failure Check that the buffer response is non-zero meaning the touchscreen was detected rpi-ft5406: Use firmware API RPI-FT5406: Enable aarch64 support through explicit iomem interface Signed-off-by: Gerhard de Clercq <gerharddeclercq@outlook.com>	2017-07-21 15:29:27 +01:00
popcornmix	ac8de07eab	hid: Reduce default mouse polling interval to 60Hz Reduces overhead when using X	2017-07-21 15:29:27 +01:00
popcornmix	65db7b6e60	Added Device IDs for August DVB-T 205	2017-07-21 15:29:26 +01:00
popcornmix	16f390bd57	enabling the realtime clock 1-wire chip DS1307 and 1-wire on GPIO4 (as a module) 1-wire: Add support for configuring pin for w1-gpio kernel module See: https://github.com/raspberrypi/linux/pull/457 Add bitbanging pullups, use them for w1-gpio Allows parasite power to work, uses module option pullup=1 bcm2708: Ensure 1-wire pullup is disabled by default, and expose as module parameter Signed-off-by: Alex J Lennon <ajlennon@dynamicdevices.co.uk> w1-gpio: Add gpiopin module parameter and correctly free up gpio pull-up pin, if set Signed-off-by: Alex J Lennon <ajlennon@dynamicdevices.co.uk> w1-gpio: Sort out the pullup/parasitic power tangle	2017-07-21 15:29:26 +01:00
Harm Hanemaaijer	5eb71e1753	Speed up console framebuffer imageblit function Especially on platforms with a slower CPU but a relatively high framebuffer fill bandwidth, like current ARM devices, the existing console monochrome imageblit function used to draw console text is suboptimal for common pixel depths such as 16bpp and 32bpp. The existing code is quite general and can deal with several pixel depths. By creating special case functions for 16bpp and 32bpp, by far the most common pixel formats used on modern systems, a significant speed-up is attained which can be readily felt on ARM-based devices like the Raspberry Pi and the Allwinner platform, but should help any platform using the fb layer. The special case functions allow constant folding, eliminating a number of instructions including divide operations, and allow the use of an unrolled loop, eliminating instructions with a variable shift size, reducing source memory access instructions, and eliminating excessive branching. These unrolled loops also allow much better code optimization by the C compiler. The code that selects which optimized variant is used is also simplified, eliminating integer divide instructions. The speed-up, measured by timing 'cat file.txt' in the console, varies between 40% and 70%, when testing on the Raspberry Pi and Allwinner ARM-based platforms, depending on font size and the pixel depth, with the greater benefit for 32bpp. Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>	2017-07-21 15:29:25 +01:00
Siarhei Siamashka	5f401cdd41	fbdev: add FBIOCOPYAREA ioctl Based on the patch authored by Ali Gholami Rudi at https://lkml.org/lkml/2009/7/13/153 Provide an ioctl for userspace applications, but only if this operation is hardware accelerated (otherwise it does not make any sense). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com> bcm2708_fb: Add ioctl for reading gpu memory through dma	2017-07-21 15:29:24 +01:00
Phil Elwell	60b7ce61cc	BCM270x_DT: Add pwr_led, and the required "input" trigger The "input" trigger makes the associated GPIO an input. This is to support the Raspberry Pi PWR LED, which is driven by external hardware in normal use. N.B. pwr_led is not available on Model A or B boards. leds-gpio: Implement the brightness_get method The power LED uses some clever logic that means it is driven by a voltage measuring circuit when configured as input, otherwise it is driven by the GPIO output value. This patch wires up the brightness_get method for leds-gpio so that user-space can monitor the LED value via /sys/class/gpio/led1/brightness. Using the input trigger this returns an indication of the system power health, otherwise it is just whatever value the trigger has written most recently. See: https://github.com/raspberrypi/linux/issues/1064	2017-07-21 15:29:24 +01:00
notro	1fa9ecd8a8	BCM2708: Add core Device Tree support Add the bare minimum needed to boot BCM2708 from a Device Tree. Signed-off-by: Noralf Tronnes <notro@tronnes.org> BCM2708: DT: change 'axi' nodename to 'soc' Change DT node named 'axi' to 'soc' so it matches ARCH_BCM2835. The VC4 bootloader fills in certain properties in the 'axi' subtree, but since this is part of an upstreaming effort, the name is changed. Signed-off-by: Noralf Tronnes notro@tronnes.org BCM2708_DT: Correct length of the peripheral space Use dts-dirs feature for overlays. The kernel makefiles have a dts-dirs target that is for vendor subdirectories. Using this fixes the install_dtbs target, which previously did not install the overlays. BCM270X_DT: configure I2S DMA channels Signed-off-by: Matthias Reichl <hias@horus.com> BCM270X_DT: switch to bcm2835-i2s I2S soundcard drivers with proper devicetree support (i.e. not linking to the cpu_dai/platform via name but to cpu/platform via of_node) will work out of the box without any modifications. When the kernel is compiled without devicetree support the platform code will instantiate the bcm2708-i2s driver and I2S soundcard drivers will link to it via name, as before. Signed-off-by: Matthias Reichl <hias@horus.com> SDIO-overlay: add poll_once-boolean parameter Add paramter to toggle sdio-device-polling done every second or once at boot-time. Signed-off-by: Patrick Boettcher <patrick.boettcher@posteo.de> BCM270X_DT: Make mmc overlay compatible with current firmware The original DT overlay logic followed a merge-then-patch procedure, i.e. parameters are applied to the loaded overlay before the overlay is merged into the base DTB. This sequence has been changed to patch-then-merge, in order to support parameterised node names, and to protect against bad overlays. As a result, overrides (parameters) must only target labels in the overlay, but the overlay can obviously target nodes in the base DTB. 
mmc-overlay.dts (that switches back to the original mmc sdcard driver) is the only overlay violating that rule, and this patch fixes it. bcm270x_dt: Use the sdhost MMC controller by default The "mmc" overlay reverts to using the other controller. squash: Add cprman to dt BCM270X_DT: Use clk_core for I2C interfaces BCM270X_DT: Use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi The mainline Device Tree files are quite close to downstream now. Let's use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi as base files for our dts files. Mainline dts files are based on these files: bcm2835-rpi.dtsi bcm2835.dtsi bcm2836.dtsi bcm283x.dtsi Current downstream are based on these: bcm2708.dtsi bcm2709.dtsi bcm2710.dtsi bcm2708_common.dtsi This patch introduces this dependency: bcm2708.dtsi bcm2709.dtsi bcm2708-rpi.dtsi bcm270x.dtsi bcm2835.dtsi bcm2836.dtsi bcm283x.dtsi And: bcm2710.dtsi bcm2708-rpi.dtsi bcm270x.dtsi bcm283x.dtsi bcm270x.dtsi contains the downstream bcm283x.dtsi diff. bcm2708-rpi.dtsi is the downstream version of bcm2835-rpi.dtsi. Other changes: - The led node has moved from /soc/leds to /leds. This is not a problem since the label is used to reference it. - The clk_osc reg property changes from 6 to 3. - The gpu nodes has their interrupt property set in the base file. - the clocks label does not point to the /clocks node anymore, but points to the cprman node. This is not a problem since the overlays that use the clock node refer to it directly: target-path = "/clocks"; - some nodes now have 2 labels since mainline and downstream differs in this respect: cprman/clocks, spi0/spi, gpu/vc4. - some nodes doesn't have an explicit status = "okay" since they're not disabled in the base file: watchdog and random. - gpiomem doesn't need an explicit status = "okay". - bcm2708-rpi-cm.dts got the hpd-gpios property from bcm2708_common.dtsi, it's now set directly in that file. - bcm2709-rpi-2-b.dts has the timer node moved from /soc/timer to /timer. 
- Removed clock-frequency property on the bcm{2709,2710}.dtsi timer nodes. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270X_DT: Use raspberrypi-power to turn on USB power Use the raspberrypi-power driver to turn on USB power. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270X_DT: Add a .dtbo target, use for overlays Change the filenames and extensions to keep the pre-DDT style of overlay (<name>-overlay.dtb) distinct from new ones that use a different style of local fixups (<name>.dtbo), and to match other platforms. The RPi firmware uses the DDTK trailer atom to choose which type of overlay to use for each kernel. Signed-off-by: Phil Elwell <phil@raspberrypi.org> BCM270X_DT: Don't generate "linux,phandle" props The EPAPR standard says to use "phandle" properties to store phandles, rather than the deprecated "linux,phandle" version. By default, dtc generates both, but adding "-H epapr" causes it to only generate "phandle"s, saving some space and clutter. Signed-off-by: Phil Elwell <phil@raspberrypi.org> BCM270X_DT: Add overlay for enc28j60 on SPI2 Works on SPI2 for compute module BCM270X_DT: Add midi-uart0 overlay MIDI requires 31.25kbaud, a baudrate unsupported by Linux. The midi-uart0 overlay configures uart0 (ttyAMA0) to use a fake clock so that requesting 38.4kbaud actually gets 31.25kbaud. Signed-off-by: Phil Elwell <phil@raspberrypi.org> BCM270X_DT: Add i2c-sensor overlay The i2c-sensor overlay is a container for various pressure and temperature sensors, currently bmp085 and bmp280. The standalone bmp085_i2c-sensor overlay is now deprecated. Signed-off-by: Phil Elwell <phil@raspberrypi.org> BCM270X_DT: overlays/-overlay.dtb -> overlays/.dtbo (#1752) We now create overlays as .dtbo files. build: support for .dtbo files for dtb overlays Kernel 4.4.6+ on RaspberryPi support .dtbo files for overlays, instead of .dtb. 
Patch the kernel, which has faulty rules to generate .dtbo the way yocto does Signed-off-by: Herve Jourdain <herve.jourdain@neuf.fr> Signed-off-by: Khem Raj <raj.khem@gmail.com>	2017-07-21 15:29:23 +01:00
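The midi-uart0 entry in the commit above (1fa9ecd8a8) relies on a small piece of arithmetic: if the driver computes its divisor from a reported clock that is 38400/31250 times the real one, a request for 38.4 kbaud produces 31.25 kbaud on the wire. The sketch below checks that ratio. It is illustrative only and is not taken from the overlay source; the 48 MHz UART clock and the function names are assumptions, and the divisor formula is the usual PL011 one with six fractional bits.

```python
# Illustrative arithmetic only; names and the 48 MHz clock are assumptions.
MIDI_BAUD = 31_250        # MIDI wire rate
REQUESTED_BAUD = 38_400   # standard rate that userspace asks for


def fake_clock_hz(real_clock_hz):
    """Clock rate to report so that a 38400 request yields 31250 on the wire."""
    return real_clock_hz * REQUESTED_BAUD // MIDI_BAUD


def wire_baud(real_clock_hz, reported_clock_hz, requested_baud):
    """Baud rate generated when the divisor is computed from the reported
    clock while the hardware actually runs from the real clock."""
    # PL011-style divisor with 6 fractional bits: round(clk * 4 / baud)
    div64 = (reported_clock_hz * 4 + requested_baud // 2) // requested_baud
    return real_clock_hz * 4 / div64


REAL_CLOCK_HZ = 48_000_000  # assumed UART clock, not stated in the log
fake = fake_clock_hz(REAL_CLOCK_HZ)
print(fake, wire_baud(REAL_CLOCK_HZ, fake, REQUESTED_BAUD))
```

The same ratio holds for any real clock, so the check does not depend on the assumed 48 MHz value.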
Phil Elwell	859b7f34bc	scripts: Add mkknlimg and knlinfo scripts from tools repo The Raspberry Pi firmware looks for a trailer on the kernel image to determine whether it was compiled with Device Tree support enabled. If the firmware finds a kernel without this trailer, or which has a trailer indicating that it isn't DT-capable, it disables DT support and reverts to using ATAGs. The mkknlimg utility adds that trailer, having first analysed the image to look for signs of DT support and the kernel version string. knlinfo displays the contents of the trailer in the given kernel image. scripts/mkknlimg: Add support for ARCH_BCM2835 Add a new trailer field indicating whether this is an ARCH_BCM2835 build, as opposed to MACH_BCM2708/9. If the loader finds this flag is set it changes the default base dtb file name from bcm270x... to bcm283y... Also update knlinfo to show the status of the field. scripts/mkknlimg: Improve ARCH_BCM2835 detection The board support code contains sufficient strings to be able to distinguish 2708 vs. 2835 builds, so remove the check for bcm2835-pm-wdt which could exist in either. Also, since the canned configuration is no longer built in (it's a module), remove the config string checking. See: https://github.com/raspberrypi/linux/issues/1157 scripts: Multi-platform support for mkknlimg and knlinfo The firmware uses tags in the kernel trailer to choose which dtb file to load. Current firmware loads bcm2835-.dtb if the '283x' tag is true, otherwise it loads bcm270.dtb. This scheme breaks if an image supports multiple platforms. This patch adds '270X' and '283X' tags to indicate support for RPi and upstream platforms, respectively. '283x' (note lower case 'x') is left for old firmware, and is only set if the image only supports upstream builds. scripts/mkknlimg: Append a trailer for all input Now that the firmware assumes an unsigned kernel is DT-capable, it is helpful to be able to mark a kernel as being non-DT-capable. 
Signed-off-by: Phil Elwell <phil@raspberrypi.org> scripts/knlinfo: Decode DDTK atom Show the DDTK atom as being a boolean. Signed-off-by: Phil Elwell <phil@raspberrypi.org> mkknlimg: Retain downstream-kernel detection With the death of ARCH_BCM2708 and ARCH_BCM2709, a new way is needed to determine if this is a "downstream" build that wants the firmware to load a bcm27xx .dtb. The vc_cma driver is used downstream but not upstream, making vc_cma_init a suitable predicate symbol.	2017-07-21 15:29:23 +01:00
Noralf Trønnes	caa393cc40	firmware: bcm2835: Support ARCH_BCM270x Support booting without Device Tree. Turn on USB power. Load driver early because of lacking support for deferred probing in many drivers. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> firmware: bcm2835: Don't turn on USB power The raspberrypi-power driver is now used to turn on USB power. This partly reverts commit: firmware: bcm2835: Support ARCH_BCM270x Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:22 +01:00
Noralf Trønnes	7769dd32fd	char: broadcom: Add vcio module Add module for accessing the mailbox property channel through /dev/vcio. Was previously in bcm2708-vcio. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:22 +01:00
popcornmix	746010fbbc	Add Chris Boot's i2c driver i2c-bcm2708: fixed baudrate Fixed issue where the wrong CDIV value was set for baudrates below 3815 Hz (for 250MHz bus clock). In that case the computed CDIV value was more than 0xffff. However the CDIV register width is only 16 bits. This resulted in incorrect setting of CDIV and higher baudrate than intended. Example: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0x1704 -> 42430Hz After correction: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0xffff -> 3815Hz The correct baudrate is shown in the log after the cdiv > 0xffff correction. Perform I2C combined transactions when possible Perform I2C combined transactions whenever possible, within the restrictions of the Broadcomm Serial Controller. Disable DONE interrupt during TA poll Prevent interrupt from being triggered if poll is missed and transfer starts and finishes. i2c: Make combined transactions optional and disabled by default i2c: bcm2708: add device tree support Add DT support to driver and add to .dtsi file. Setup pins in .dts file. i2c is disabled by default. Signed-off-by: Noralf Tronnes <notro@tronnes.org> bcm2708: don't register i2c controllers when using DT The devices for the i2c controllers are in the Device Tree. Only register devices when not using DT. Signed-off-by: Noralf Tronnes <notro@tronnes.org> I2C: Only register the I2C device for the current board revision i2c_bcm2708: Fix clock reference counting Fix grabbing lock from atomic context in i2c driver 2 main changes: - check for timeouts in the bcm2708_bsc_setup function as indicated by this comment: /* poll for transfer start bit (should only take 1-20 polls) / This implies that the setup function can now fail so account for this everywhere it's called - Removed the clk_get_rate call from inside the setup function as it locks a mutex and that's not ok since we call it from under a spin lock. 
i2c-bcm2708: When using DT, leave the GPIO setup to pinctrl i2c-bcm2708: Increase timeouts to allow larger transfers Use the timeout value provided by the I2C_TIMEOUT ioctl when waiting for completion. The default timeout is 1 second. See: https://github.com/raspberrypi/linux/issues/260 i2c-bcm2708/BCM270X_DT: Add support for I2C2 The third I2C bus (I2C2) is normally reserved for HDMI use. Careless use of this bus can break an attached display - use with caution. It is recommended to disable accesses by VideoCore by setting hdmi_ignore_edid=1 or hdmi_edid_file=1 in config.txt. The interface is disabled by default - enable using the i2c2_iknowwhatimdoing DT parameter. bcm2708-spi: Don't use static pin configuration with DT Also remove superfluous error checking - the SPI framework ensures the validity of the chip_select value. i2c-bcm2708: Remove non-DT support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Set the BSC_CLKT clock streching timeout to 35ms as per SMBus specs. Fixes i2c_bcm2708: Write to FIFO correctly - v2 (#1574) i2c: fix i2c_bcm2708: Clear FIFO before sending data Make sure FIFO gets cleared before trying to send data in case of a repeated start (COMBINED=Y). * i2c: fix i2c_bcm2708: Only write to FIFO when not full Check if FIFO can accept data before writing. To avoid a peripheral read on the last iteration of a loop, both bcm2708_bsc_fifo_fill and ~drain are changed as well.	2017-07-21 15:29:21 +01:00
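The "fixed baudrate" example in the commit above (746010fbbc) can be reproduced with a few lines of arithmetic. The sketch below is not the driver code; the function names are hypothetical, and it only models the 16-bit CDIV register and the 250 MHz bus clock that the commit message mentions.

```python
# Hypothetical helper names; models only the numbers quoted in the log.
BUS_CLOCK_HZ = 250_000_000  # bus clock quoted in the commit message
CDIV_MAX = 0xFFFF           # CDIV register is 16 bits wide


def cdiv_truncated(baud_hz, bus_hz=BUS_CLOCK_HZ):
    """Old behaviour: the computed divisor silently loses its upper bits."""
    return (bus_hz // baud_hz) & CDIV_MAX


def cdiv_clamped(baud_hz, bus_hz=BUS_CLOCK_HZ):
    """Corrected behaviour: the divisor saturates at the register maximum."""
    return min(bus_hz // baud_hz, CDIV_MAX)


def actual_baud_hz(cdiv, bus_hz=BUS_CLOCK_HZ):
    """Baud rate the bus really runs at for a given divisor."""
    return bus_hz / cdiv


for name, fn in (("truncated", cdiv_truncated), ("clamped", cdiv_clamped)):
    cdiv = fn(3500)
    print(f"{name}: CDIV={cdiv:#x} -> {actual_baud_hz(cdiv):.0f} Hz")
```

For a 3500 Hz request the truncated divisor is 0x1704 (about 42430 Hz), while the clamped divisor is 0xffff (about 3815 Hz), matching the figures in the commit message.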
popcornmix	e3e872cce3	Added hwmon/thermal driver for reporting core temperature. Thanks Dorian BCM270x: Move thermal sensor to Device Tree Add Device Tree support to bcm2835-thermal driver. Add thermal sensor device to Device Tree. Don't add platform device when booting in DT mode. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:20 +01:00
popcornmix	07eccca1f5	Add cpufreq driver Signed-off-by: popcornmix <popcornmix@gmail.com>	2017-07-21 15:29:20 +01:00
Aron Szabo	b13314b973	lirc: added support for RaspberryPi GPIO lirc_rpi: Use read_current_timer to determine transmitter delay. Thanks to jjmz and others See: https://github.com/raspberrypi/linux/issues/525 lirc: Remove restriction on gpio pins that can be used with lirc Compute Module, for example, could use different pins lirc_rpi: Add parameter to specify input pin pull Depending on the connected IR circuitry it might be desirable to change the GPIO's internal pull from its pull-down default behaviour. Add a module parameter to allow the user to set it explicitly. Signed-off-by: Julian Scheel <julian@jusst.de> lirc-rpi: Use the higher-level irq control functions This module used to access the irq_chip methods of the gpio controller directly, rather than going through the standard enable_irq/irq_set_irq_type functions. This caused problems on pinctrl-bcm2835 which only implements the irq_enable/disable methods and not irq_unmask/mask. lirc-rpi: Correct the interrupt usage 1) Correct the use of enable_irq (i.e. don't call it so often) 2) Correct the shutdown sequence. 3) Avoid a bcm2708_gpio driver quirk by setting the irq flags earlier lirc-rpi: use getnstimeofday instead of read_current_timer read_current_timer isn't guaranteed to return values in microseconds, and indeed it doesn't on a Pi2. Issue: linux#827 lirc-rpi: Add device tree support, and a suitable overlay The overlay supports DT parameters that match the old module parameters, except that gpio_in_pull should be set using the strings "up", "down" or "off". lirc-rpi: Also support pinctrl-bcm2835 in non-DT mode fix auto-sense in lirc_rpi driver On a Raspberry Pi 2, the lirc_rpi driver might receive spurious interrupts and change its low-active / high-active setting. When this happens, the IR remote control stops working. This patch disables this auto-detection if the 'sense' parameter was set in the device tree, making the driver robust to such spurious interrupts.	2017-07-21 15:29:19 +01:00
Luke Wren	97608fdade	Add SMI NAND driver Signed-off-by: Luke Wren <wren6991@gmail.com>	2017-07-21 15:29:19 +01:00
Martin Sperl	48b478d013	MISC: bcm2835: smi: use clock manager and fix reload issues Use clock manager instead of self-made clockmanager. Also fix some error paths that showed up during development (especially missing release of dma resources on rmmod) Signed-off-by: Martin Sperl <kernel@martin.sperl.org>	2017-07-21 15:29:18 +01:00
Luke Wren	0aff46c0bb	Add SMI driver Signed-off-by: Luke Wren <wren6991@gmail.com>	2017-07-21 15:29:17 +01:00
Luke Wren	c9bbf05c6f	Add /dev/gpiomem device for rootless user GPIO access Signed-off-by: Luke Wren <luke@raspberrypi.org> bcm2835-gpiomem: Fix for ARCH_BCM2835 builds Build on ARCH_BCM2835, and fail to probe if no IO resource. See: https://github.com/raspberrypi/linux/issues/1154	2017-07-21 15:29:17 +01:00
Tim Gover	49eab76764	vcsm: VideoCore shared memory service for BCM2835 Add experimental support for the VideoCore shared memory service. This allows user processes to allocate memory from VideoCore's GPU relocatable heap and mmap the buffers. Additionally, the memory handles can be passed to other VideoCore services such as MMAL, OpenMax and DispmanX TODO * This driver was originally released for BCM28155 which has a different cache architecture to BCM2835. Consequently, in this release only uncached mappings are supported. However, there's no fundamental reason why cached mappings cannot be supported on BCM2835 * More refactoring is required to remove the typedefs. * Re-enable some of the commented out debug-fs statistics which were disabled when migrating code from proc-fs. * There's a lot of code to support sharing of VCSM in order to support Android. This could probably be done more cleanly or perhaps just removed. Signed-off-by: Tim Gover <timgover@gmail.com> config: Disable VC_SM for now to fix hang with cutdown kernel vcsm: Use boolean as it cannot be built as module On building the bcm_vc_sm as a module we get the following error: v7_dma_flush_range and do_munmap are undefined in vc-sm.ko. Fix by making it not an option to build as module vcsm: Add ioctl for custom cache flushing vc-sm: Move headers out of arch directory Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:16 +01:00
popcornmix	2c9171a929	vc_mem: Add vc_mem driver for querying firmware memory addresses Signed-off-by: popcornmix <popcornmix@gmail.com> BCM270x: Move vc_mem Make the vc_mem module available for ARCH_BCM2835 by moving it. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:16 +01:00
Phil Elwell	83e7bb7ebb	Adding bcm2835-sdhost driver, and an overlay to enable it BCM2835 has two SD card interfaces. This driver uses the other one. bcm2835-sdhost: Error handling fix, and code clarification bcm2835-sdhost: Adding overclocking option Allow a different clock speed to be substitued for a requested 50MHz. This option is exposed using the "overclock_50" DT parameter. Note that the sdhost interface is restricted to integer divisions of core_freq, and the highest sensible option for a core_freq of 250MHz is 84 (250/3 = 83.3MHz), the next being 125 (250/2) which is much too high. Use at your own risk. bcm2835-sdhost: Round up the overclock, so 62 works for 62.5Mhz Also only warn once for each overclock setting. bcm2835-sdhost: Improve error handling and recovery 1) Expose the hw_reset method to the MMC framework, removing many internal calls by the driver. 2) Reduce overclock setting on error. 3) Increase timeout to cope with high capacity cards. 4) Add properties and parameters to control pio_limit and debug. 5) Reduce messages at probe time. bcm2835-sdhost: Further improve overclock back-off bcm2835-sdhost: Clear HBLC for PIO mode Also update pio_limit default in overlay README. bcm2835-sdhost: Add the ERASE capability See: https://github.com/raspberrypi/linux/issues/1076 bcm2835-sdhost: Ignore CRC7 for MMC CMD1 It seems that the sdhost interface returns CRC7 errors for CMD1, which is the MMC-specific SEND_OP_COND. Returning these errors to the MMC layer causes a downward spiral, but ignoring them seems to be harmless. bcm2835-mmc/sdhost: Remove ARCH_BCM2835 differences The bcm2835-mmc driver (and -sdhost driver that copied from it) contains code to handle SDIO interrupts in a threaded interrupt handler rather than waking the MMC framework thread. The change follows a patch from Russell King that adds the facility as the preferred way of working. 
However, the new code path is only present in ARCH_BCM2835 builds, which I have taken to be a way of testing the waters rather than making the change across the board; I can't see any technical reason why it wouldn't be enabled for MACH_BCM270X builds. So this patch standardises on the ARCH_BCM2835 code, removing the old code paths. bcm2835-sdhost: Don't log timeout errors unless debug=1 The MMC card-discovery process generates timeouts. This is expected behaviour, so reporting it to the user serves no purpose. Suppress the reporting of timeout errors unless the debug flag is on. bcm2835-sdhost: Add workaround for odd behaviour on some cards For reasons not understood, the sdhost driver fails when reading sectors very near the end of some SD cards. The problem could be related to the similar issue that reading the final sector of any card as part of a multiple read never completes, and the workaround is an extension of the mechanism introduced to solve that problem which ensures those sectors are always read singly. bcm2835-sdhost: Major revision This is a significant revision of the bcm2835-sdhost driver. It improves on the original in a number of ways: 1) Through the use of CMD23 for reads it appears to avoid problems reading some sectors on certain high speed cards. 2) Better atomicity to prevent crashes. 3) Higher performance. 4) Activity logging included, for easier diagnosis in the event of a problem. Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-sdhost: Restore ATOMIC flag to PIO sg mapping Allocation problems have been seen in a wireless driver, and this is the only change which might have been responsible. SQUASH: bcm2835-sdhost: Only claim one DMA channel With both MMC controllers enabled there are few DMA channels left. The bcm2835-sdhost driver only uses DMA in one direction at a time, so it doesn't need to claim two channels. 
See: https://github.com/raspberrypi/linux/issues/1327 Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-sdhost: Workaround for "slow" sectors Some cards have been seen to cause timeouts after certain sectors are read. This workaround enforces a minimum delay between the stop after reading one of those sectors and a subsequent data command. Using CMD23 (SET_BLOCK_COUNT) avoids this problem, so good cards will not be penalised by this workaround. Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-sdhost: Firmware manages the clock divisor The bcm2835-sdhost driver hands control of the CDIV clock divisor register to matching firmware, allowing it to adjust to a changing core clock. This removes the need to use the performance governor or to enable io_is_busy on the on-demand governor in order to get the best SD performance. N.B. As SD clocks must be an integer divisor of the core clock, it is possible that the SD clock for "turbo" mode can be different (even lower) than "normal" mode. Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-sdhost: Reset the clock in task context Since reprogramming the clock can now involve a round-trip to the firmware it must not be done at atomic context, and a tasklet is not a task. Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-sdhost: Don't exit cmd wait loop on error The FAIL flag can be set in the CMD register before command processing is complete, leading to spurious "failed to complete" errors. This has the effect of promoting harmless CRC7 errors during CMD1 processing into errors that can delay and even prevent booting. Also: 1) Convert the last KERN_ERROR message in the register dumping to KERN_INFO. 2) Remove an unnecessary reset call from bcm2835_sdhost_add_host. See: https://github.com/raspberrypi/linux/pull/1492 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:15 +01:00
gellert	82062bd37a	MMC: added alternative MMC driver mmc: Disable CMD23 transfers on all cards Pending wire-level investigation of these types of transfers and associated errors on bcm2835-mmc, disable for now. Fallback of CMD18/CMD25 transfers will be used automatically by the MMC layer. Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org> mmc: bcm2835-mmc: enable DT support for all architectures Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now. Enable Device Tree support for all architectures. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> mmc: bcm2835-mmc: fix probe error handling Probe error handling is broken in several places. Simplify error handling by using device managed functions. Replace pr_{err,info} with dev_{err,info}. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2835-mmc: Add locks when accessing sdhost registers bcm2835-mmc: Add range of debug options for slowing things down bcm2835-mmc: Add option to disable some delays bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23 bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23 bcm2835-mmc: Adding overclocking option Allow a different clock speed to be substitued for a requested 50MHz. This option is exposed using the "overclock_50" DT parameter. Note that the mmc interface is restricted to EVEN integer divisions of 250MHz, and the highest sensible option is 63 (250/4 = 62.5), the next being 125 (250/2) which is much too high. Use at your own risk. bcm2835-mmc: Round up the overclock, so 62 works for 62.5Mhz Also only warn once for each overclock setting. mmc: bcm2835-mmc: Make available on ARCH_BCM2835 Make the bcm2835-mmc driver available for use on ARCH_BCM2835. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270x_DT: add bcm2835-mmc entry Add Device Tree entry for bcm2835-mmc. In non-DT mode, don't add the device in the board file. 
Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2835-mmc: Don't overwrite MMC capabilities from DT bcm2835-mmc: Don't override bus width capabilities from devicetree Take out the force setting of the MMC_CAP_4_BIT_DATA host capability so that the result read from devicetree via mmc_of_parse() is preserved. bcm2835-mmc: Only claim one DMA channel With both MMC controllers enabled there are few DMA channels left. The bcm2835-mmc driver only uses DMA in one direction at a time, so it doesn't need to claim two channels. See: https://github.com/raspberrypi/linux/issues/1327 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:15 +01:00
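The overclock notes in the bcm2835-sdhost and bcm2835-mmc commits above describe clocks that must be integer (sdhost) or even-integer (mmc) divisions of a 250 MHz core clock. The sketch below lists what a given request resolves to under those two rules. It is a model of the constraint as stated in the commit messages, not the drivers' actual clock code, and the function name and `even_only` flag are hypothetical.

```python
from fractions import Fraction

# Hypothetical model of the divisor constraints quoted in the log.


def best_clock_mhz(request_mhz, core_mhz=250, even_only=False):
    """Highest core_mhz/divisor that does not exceed request_mhz.

    even_only=True models the "EVEN integer divisions" rule quoted for
    bcm2835-mmc; False models the plain integer rule quoted for sdhost.
    """
    div = -(-core_mhz // request_mhz)  # ceiling division on integers
    if even_only and div % 2:
        div += 1                       # step to the next even divisor
    return Fraction(core_mhz, div)


for request, even in ((84, False), (125, False), (63, True), (62, True)):
    clock = best_clock_mhz(request, even_only=even)
    print(f"request {request} MHz, even_only={even}: {float(clock):.2f} MHz")
```

Under this strict model a request of 62 with even divisors falls to 250/6 rather than 62.5 MHz, which suggests why the "Round up the overclock, so 62 works for 62.5Mhz" changes were needed; how the drivers implement that rounding is not shown in this log.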
Florian Meier	f26687e72f	dmaengine: Add support for BCM2708 Add support for DMA controller of BCM2708 as used in the Raspberry Pi. Currently it only supports cyclic DMA. Signed-off-by: Florian Meier <florian.meier@koalo.de> dmaengine: expand functionality by supporting scatter/gather transfers sdhci-bcm2708 and dma.c: fix for LITE channels DMA: fix cyclic LITE length overflow bug dmaengine: bcm2708: Remove chancnt affectations Mirror bcm2835-dma.c commit `9eba5536a7`: chancnt is already filled by dma_async_device_register, which uses the channel list to know how much channels there is. Since it's already filled, we can safely remove it from the drivers' probe function. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: overwrite dreq only if it is not set dreq is set when the DMA channel is fetched from Device Tree. slave_id is set using dmaengine_slave_config(). Only overwrite dreq with slave_id if it is not set. dreq/slave_id in the cyclic DMA case is not touched, because I don't have hardware to test with. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: do device registration in the board file Don't register the device in the driver. Do it in the board file. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835 Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now. Add Device Tree support to the non ARCH_BCM2835 case. Use the same driver name regardless of architecture. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270x_DT: add bcm2835-dma entry Add Device Tree entry for bcm2835-dma. The entry doesn't contain any resources since they are handled by the arch/arm/mach-bcm270x/dma.c driver. In non-DT mode, don't add the device in the board file. 
Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2708-dmaengine: Add debug options BCM270x: Add memory and irq resources to dmaengine device and DT Prepare for merging of the legacy DMA API arch driver dma.c with bcm2708-dmaengine by adding memory and irq resources both to platform file device and Device Tree node. Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c Merge the legacy DMA API driver with bcm2708-dmaengine. This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox driver is also needed). Changes to the dma.c code: - Use BIT() macro. - Cutdown some comments to one line. - Add mutex to vc_dmaman and use this, since the dev lock is locked during probing of the engine part. - Add global g_dmaman variable since drvdata is used by the engine part. - Restructure for readability: vc_dmaman_chan_alloc() vc_dmaman_chan_free() bcm_dma_chan_free() - Restructure bcm_dma_chan_alloc() to simplify error handling. - Use device irq resources instead of hardcoded bcm_dma_irqs table. - Remove dev_dmaman_register() and code it directly. - Remove dev_dmaman_deregister() and code it directly. - Simplify bcm_dmaman_probe() using devm_* functions. - Get dmachans from DT if available. - Keep 'dma.dmachans' module argument name for backwards compatibility. Make it available on ARCH_BCM2835 as well. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: set residue_granularity field bcm2708-dmaengine supports residue reporting at burst level but didn't report this via the residue_granularity field. Without this field set properly we get playback issues with I2S cards. 
dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer bcm2708-dmaengine: Use more DMA channels (but not 12) 1) Only the bcm2708_fb drivers uses the legacy DMA API, and it requires a BULK-capable channel, so all other types (FAST, NORMAL and LITE) can be made available to the regular DMA API. 2) DMA channels 11-14 share an interrupt. The driver can't handle this, so don't use channels 12-14 (12 was used, probably because it appears to have an interrupt, but in reality that interrupt is for activity on ANY channel). This may explain a lockup encountered when running out of DMA channels. The combined effect of this patch is to leave 7 DMA channels available + channel 0 for bcm2708_fb via the legacy API. See: https://github.com/raspberrypi/linux/issues/1110 https://github.com/raspberrypi/linux/issues/1108 dmaengine: bcm2708: Make legacy API available for bcm2835-dma bcm2708_fb uses the legacy DMA API, so in order to start using bcm2835-dma, bcm2835-dma has to support the legacy API. Make this possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove(). Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: Change DT compatible string Both bcm2835-dma and bcm2708-dmaengine have the same compatible string. So change compatible to "brcm,bcm2708-dma". Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: Remove driver but keep legacy API Dropping non-DT support means we don't need this driver, but we still need the legacy DMA API. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2708-dmaengine - Fix arm64 portability/build issues	2017-07-21 15:29:14 +01:00
popcornmix	041d570b26	bcm2708 framebuffer driver Signed-off-by: popcornmix <popcornmix@gmail.com> bcm2708_fb : Implement blanking support using the mailbox property interface bcm2708_fb: Add pan and vsync controls bcm2708_fb: DMA acceleration for fb_copyarea Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425 Also used Simon's dmaer_master module as a reference for tweaking DMA settings for better performance. For now busylooping only. IRQ support might be added later. With non-overclocked Raspberry Pi, the performance is ~360 MB/s for simple copy or ~260 MB/s for two-pass copy (used when dragging windows to the right). In the case of using DMA channel 0, the performance improves to ~440 MB/s. For comparison, VFP optimized CPU copy can only do ~114 MB/s in the same conditions (hindered by reading uncached source buffer). Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com> bcm2708_fb: report number of dma copies Add a counter (exported via debugfs) reporting the number of dma copies that the framebuffer driver has done, in order to help evaluate different optimization strategies. Signed-off-by: Luke Diamand <luked@broadcom.com> bcm2708_fb: use IRQ for DMA copies The copyarea ioctl() uses DMA to speed things along. This was busy-waiting for completion. This change supports using an interrupt instead for larger transfers. For small transfers, busy-waiting is still likely to be faster. Signed-off-by: Luke Diamand <luke@diamand.org> bcm2708: Make ioctl logging quieter video: fbdev: bcm2708_fb: Don't panic on error No need to panic the kernel if the video driver fails. Just print a message and return an error. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> fbdev: bcm2708_fb: Add ARCH_BCM2835 support Add Device Tree support. Pass the device to dma_alloc_coherent() in order to get the correct bus address on ARCH_BCM2835. Use the new DMA legacy API header file. Including <mach/platform.h> is not necessary. 
Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270x_DT: Add bcm2708-fb device Add bcm2708-fb to Device Tree and don't add the platform device when booting in DT mode. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:13 +01:00
popcornmix	300f541a54	dwcotg: Allow to build without FIQ on ARM64 Signed-off-by: popcornmix <popcornmix@gmail.com>	2017-07-21 15:29:13 +01:00
popcornmix	b85eec7b28	Add dwc_otg driver Signed-off-by: popcornmix <popcornmix@gmail.com> usb: dwc: fix lockdep false positive Signed-off-by: Kari Suvanto <karis79@gmail.com> usb: dwc: fix inconsistent lock state Signed-off-by: Kari Suvanto <karis79@gmail.com> Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance. Thanks to Gordon and Costas Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005. Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh Make sure we wait for the reset to finish dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel memory corruption, escalating to OOPS under high USB load. dwc_otg: Fix unsafe access of QTD during URB enqueue In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the transaction could complete almost immediately after the qtd was assigned to a host channel during URB enqueue, which meant the qtd pointer was no longer valid having been completed and removed. Usually, this resulted in an OOPS during URB submission. By predetermining whether transactions need to be queued or not, this unsafe pointer access is avoided. This bug was only evident on the Pi model A where a device was attached that had no periodic endpoints (e.g. USB pendrive or some wlan devices). dwc_otg: Fix incorrect URB allocation error handling If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS because for some reason a member of the unallocated struct was set to zero. Error handling changed to fail correctly. dwc_otg: fix potential use-after-free case in interrupt handler If a transaction had previously aborted, certain interrupts are enabled to track error counts and reset where necessary. On IN endpoints the host generates an ACK interrupt near-simultaneously with completion of transfer. 
In the case where this transfer had previously had an error, this results in a use-after-free on the QTD memory space with a 1-byte length being overwritten to 0x00. dwc_otg: add handling of SPLIT transaction data toggle errors Previously a data toggle error on packets from a USB1.1 device behind a TT would result in the Pi locking up as the driver never handled the associated interrupt. Patch adds basic retry mechanism and interrupt acknowledgement to cater for either a chance toggle error or for devices that have a broken initial toggle state (FT8U232/FT232BM). dwc_otg: implement tasklet for returning URBs to usbcore hcd layer The dwc_otg driver interrupt handler for transfer completion will spend a very long time with interrupts disabled when a URB is completed - this is because usb_hcd_giveback_urb is called from within the handler which for a USB device driver with complicated processing (e.g. webcam) will take an exorbitant amount of time to complete. This results in missed completion interrupts for other USB packets which lead to them being dropped due to microframe overruns. This patch splits returning the URB to the usb hcd layer into a high-priority tasklet. This will have most benefit for isochronous IN transfers but will also have incidental benefit where multiple periodic devices are active at once. dwc_otg: fix NAK holdoff and allow on split transactions only This corrects a bug where if a single active non-periodic endpoint had at least one transaction in its qh, on frnum == MAX_FRNUM the qh would get skipped and never get queued again. This would result in a silent device until error detection (automatic or otherwise) would either reset the device or flush and requeue the URBs. Additionally the NAK holdoff was enabled for all transactions - this would potentially stall a HS endpoint for 1ms if a previous error state enabled this interrupt and the next response was a NAK. Fix so that only split transactions get held off. 
dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it asynchronously in the tasklet was not safe (regression in `c4564d4a1a`). This change unlinks it from the endpoint prior to queueing it for handling in the tasklet, and also adds a check to ensure the urb is OK to be unlinked before doing so. NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb when a USB device was unplugged/replugged during data transfer. This effect was reproduced using automated USB port power control, hundreds of replug events were performed during active transfers to confirm that the problem was eliminated. USB fix using a FIQ to implement split transactions This commit adds a FIQ implementation that schedules the split transactions using a FIQ so we don't get held off by the interrupt latency of Linux dwc_otg: fix device attributes and avoid kernel warnings on boot dcw_otg: avoid logging function that can cause panics See: https://github.com/raspberrypi/firmware/issues/21 Thanks to cleverca22 for fix dwc_otg: mask correct interrupts after transaction error recovery The dwc_otg driver will unmask certain interrupts on a transaction that previously halted in the error state in order to reset the QTD error count. The various fine-grained interrupt handlers do not consider that other interrupts besides themselves were unmasked. By disabling the two other interrupts only ever enabled in DMA mode for this purpose, we can avoid unnecessary function calls in the IRQ handler. This will also prevent an unnecessary FIQ interrupt from being generated if the FIQ is enabled. dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ In the case of a transaction to a device that had previously aborted due to an error, several interrupts are enabled to reset the error count when a device responds.
This has the side-effect of making the FIQ thrash because the hardware will generate multiple instances of a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the associated interrupts. Additionally, on non-split transactions make sure that only unmasked interrupts are cleared. This caused a hard-to-trigger but serious race condition when you had the combination of an endpoint awaiting error recovery and a transaction completed on an endpoint - due to the sequencing and timing of interrupts generated by the dwc_otg core, it was possible to confuse the IRQ handler. Fix function tracing dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue dwc_otg: prevent OOPSes during device disconnects The dwc_otg_urb_enqueue function is thread-unsafe. In particular the access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and friends does not occur within a critical section and so if a device was unplugged during activity there was a high chance that the usbcore hub_thread would try to disable the endpoint with partially-formed entries in the URB queue. This would result in BUG() or null pointer dereferences. Fix so that access of urb->hcpriv, enqueuing to the hardware and adding to usbcore endpoint URB lists is contained within a single critical section. dwc_otg: prevent BUG() in TT allocation if hub address is > 16 A fixed-size array is used to track TT allocation. This was previously set to 16 which caused a crash because dwc_otg_hcd_allocate_port would read past the end of the array. This was hit if a hub was plugged in which enumerated as addr > 16, due to previous device resets or unplugs. Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows to a large size if 128 hub addresses are supported. This field is for debug only for tracking which frame an allocate happened in.
dwc_otg: make channel halts with unknown state less damaging If the IRQ received a channel halt interrupt through the FIQ with no other bits set, the IRQ would not release the host channel and never complete the URB. Add catchall handling to treat as a transaction error and retry. dwc_otg: fiq_split: use TTs with more granularity This fixes certain issues with split transaction scheduling. - Isochronous multi-packet OUT transactions now hog the TT until they are completed - this prevents hubs aborting transactions if they get a periodic start-split out-of-order - Don't perform TT allocation on non-periodic endpoints - this allows simultaneous use of the TT's bulk/control and periodic transaction buffers This commit will mainly affect USB audio playback. dwc_otg: fix potential sleep while atomic during urb enqueue Fixes a regression introduced with `eb1b482a`. Kmalloc called from dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have the GFP_ATOMIC flag set. Force this flag when inside the larger critical section. dwc_otg: make fiq_split_enable imply fiq_fix_enable Failing to set up the FIQ correctly would result in "IRQ 32: nobody cared" errors in dmesg. dwc_otg: prevent crashes on host port disconnects Fix several issues resulting in crashes or inconsistent state if a Model A root port was disconnected.
- Clean up queue heads properly in kill_urbs_in_qh_list by removing the empty QHs from the schedule lists - Set the halt status properly to prevent IRQ handlers from using freed memory - Add fiq_split related cleanup for saved registers - Make microframe scheduling reclaim host channels if active during a disconnect - Abort URBs with -ESHUTDOWN status response, informing device drivers so they respond in a more correct fashion and don't try to resubmit URBs - Prevent IRQ handlers from attempting to handle channel interrupts if the associated URB was dequeued (and the driver state was cleared) dwc_otg: prevent leaking URBs during enqueue A dwc_otg_urb would get leaked if the HCD enqueue function failed for any reason. Free the URB at the appropriate points. dwc_otg: Enable NAK holdoff for control split transactions Certain low-speed devices take a very long time to complete a data or status stage of a control transaction, producing NAK responses until they complete internal processing - the USB2.0 spec limit is up to 500mS. This causes the same type of interrupt storm as seen with USB-serial dongles prior to `c8edb238`. In certain circumstances, usually while booting, this interrupt storm could cause SD card timeouts. dwc_otg: Fix for occasional lockup on boot when doing a USB reset dwc_otg: Don't issue traffic to LS devices in FS mode Issuing low-speed packets when the root port is in full-speed mode causes the root port to stop responding. Explicitly fail when enqueuing URBs to a LS endpoint on a FS bus. Fix ARM architecture issue with local_irq_restore() If local_fiq_enable() is called before a local_irq_restore(flags) where the flags variable has the F bit set, the FIQ will be erroneously disabled. Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR. Also fix some of the hacks previously implemented for previous dwc_otg incarnations. 
dwc_otg: fiq_fsm: Base commit for driver rewrite This commit removes the previous FIQ fixes entirely and adds fiq_fsm. This rewrite features much more complete support for split transactions and takes into account several OTG hardware bugs. High-speed isochronous transactions are also capable of being performed by fiq_fsm. All driver options have been removed and replaced with: - dwc_otg.fiq_enable (bool) - dwc_otg.fiq_fsm_enable (bool) - dwc_otg.fiq_fsm_mask (bitmask) - dwc_otg.nak_holdoff (unsigned int) Defaults are specified such that fiq_fsm behaves similarly to the previously implemented FIQ fixes. fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used If the transfer associated with a QTD failed due to a bus error, the HCD would retry the transfer up to 3 times (implementing the USB2.0 three-strikes retry in software). Due to the masking mechanism used by fiq_fsm, it is only possible to pass a single interrupt through to the HCD per-transfer. In this instance host channels would fall off the radar because the error reset would function, but the subsequent channel halt would be lost. Push the error count reset into the FIQ handler. fiq_fsm: Implement timeout mechanism For full-speed endpoints with a large packet size, interrupt latency runs the risk of the FIQ starting a transaction too late in a full-speed frame. If the device is still transmitting data when EOF2 for the downstream frame occurs, the hub will disable the port. This change is not reflected in the hub status endpoint and the device becomes unresponsive. Prevent high-bandwidth transactions from being started too late in a frame. The mechanism is not guaranteed: a combination of bit stuffing and hub latency may still result in a device overrunning. fiq_fsm: fix bounce buffer utilisation for Isochronous OUT Multi-packet isochronous OUT transactions were subject to a few boundary bugs. Fix them.
Audio playback is now much more robust: however, an issue stands with devices that have adaptive sinks - ALSA plays samples too fast. dwc_otg: Return full-speed frame numbers in HS mode The frame counter increments on every microframe in high-speed mode. Most device drivers expect this number to be in full-speed frames - this caused considerable confusion to e.g. snd_usb_audio which uses the frame counter to estimate the number of samples played. fiq_fsm: save PID on completion of interrupt OUT transfers Also add edge case handling for interrupt transports. Note that for periodic split IN, data toggles are unimplemented in the OTG host hardware - it unconditionally accepts any PID. fiq_fsm: add missing case for fiq_fsm_tt_in_use() Certain combinations of bitrate and endpoint activity could result in a periodic transaction erroneously getting started while the previous Isochronous OUT was still active. fiq_fsm: clear hcintmsk for aborted transactions Prevents the FIQ from erroneously handling interrupts on a timed out channel. fiq_fsm: enable by default fiq_fsm: fix dequeues for non-periodic split transactions If a dequeue happened between the SSPLIT and CSPLIT phases of the transaction, the HCD would never receive an interrupt. fiq_fsm: Disable by default fiq_fsm: Handle HC babble errors The HCTSIZ transfer size field raises a babble interrupt if the counter wraps. Handle the resulting interrupt in this case. dwc_otg: fix interrupt registration for fiq_enable=0 Additionally make the module parameter conditional for wherever hcd->fiq_state is touched. fiq_fsm: Enable by default dwc_otg: Fix various issues with root port and transaction errors Process the host port interrupts correctly (and don't trample them). Root port hotplug now functional again. Fix a few thinkos with the transaction error passthrough for fiq_fsm. fiq_fsm: Implement hack for Split Interrupt transactions Hubs aren't too picky about which endpoint we send Control type split transactions to. 
By treating Interrupt transfers as Control, it is possible to use the non-periodic queue in the OTG core as well as the non-periodic FIFOs in the hub itself. This massively reduces the microframe exclusivity/contention that periodic split transactions otherwise have to enforce. It goes without saying that this is a fairly egregious USB specification violation, but it works. Original idea by Hans Petter Selasky @ FreeBSD.org. dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only. dwc_otg: introduce fiq_fsm_spin(un\|)lock() SMP safety for the FIQ relies on register read-modify write cycles being completed in the correct order. Several places in the DWC code modify registers also touched by the FIQ. Protect these by a bare-bones lock mechanism. This also makes it possible to run the FIQ and IRQ handlers on different cores. fiq_fsm: fix build on bcm2708 and bcm2709 platforms dwc_otg: put some barriers back where they should be for UP bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active dwc_otg: fixup read-modify-write in critical paths Be more careful about read-modify-write on registers that the FIQ also touches. Guard fiq_fsm_spin_lock with fiq_enable check fiq_fsm: Falling out of the state machine isn't fatal This edge case can be hit if the port is disabled while the FIQ is in the middle of a transaction. Make the effects less severe. Also get rid of the useless return value. squash: dwc_otg: Allow to build without SMP usb: core: make overcurrent messages more prominent Hub overcurrent messages are more serious than "debug". Increase loglevel. usb: dwc_otg: Don't use dma_to_virt() Commit `6ce0d20` changes dma_to_virt() which breaks this driver. Open code the old dma_to_virt() implementation to work around this. Limit the use of __bus_to_virt() to cases where transfer_buffer_length is set and transfer_buffer is not set. This is done to increase the chance that this driver will also work on ARCH_BCM2835. 
transfer_buffer should not be NULL if the length is set, but the comment in the code indicates that there are situations where this might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar comment pointing to a possible: 'usb storage / SCSI bug'. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Fix crash when fiq_enable=0 dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly Certain low-bandwidth high-speed USB devices (specialist audio devices, compressed-frame webcams) have packet intervals > 1 microframe. Stride these transfers in the FIQ by using the start-of-frame interrupt to restart the channel at the right time. dwc_otg: Force host mode to fix incorrect compute module boards dwc_otg: Add ARCH_BCM2835 support Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Simplify FIQ irq number code Dropping ATAGS means we can simplify the FIQ irq number code. Also add error checking on the returned irq number. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: Remove duplicate gadget probe/unregister function dwc_otg: Properly set the HFIR Douglas Anderson reported: According to the most up to date version of the dwc2 databook, the FRINT field of the HFIR register should be programmed to: * 125 us * (PHY clock freq for HS) - 1 * 1000 us * (PHY clock freq for FS/LS) - 1 This is opposed to older versions of the doc that claimed it should be: * 125 us * (PHY clock freq for HS) * 1000 us * (PHY clock freq for FS/LS) and reported lower timing jitter on a USB analyser dcw_otg: trim xfer length when buffer larger than allocated size is received dwc_otg: Don't free qh align buffers in atomic context dwc_otg: Enable the hack for Split Interrupt transactions by default dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues. So far we are aware of many success stories but no failure caused by this setting. Make it a default to learn more. 
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437 Signed-off-by: popcornmix <popcornmix@gmail.com> dwc_otg: Use kzalloc when suitable dwc_otg: Pass struct device to dma_alloc*() This makes it possible to get the bus address from Device Tree. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dwc_otg: fix summarize urb->actual_length for isochronous transfers Kernel does not copy input data of ISO transfers to userspace if actual_length is set only in ISO transfers and not summarized in urb->actual_length. Fixes raspberrypi/linux#903	2017-07-21 15:29:12 +01:00
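The HFIR rule quoted in the "dwc_otg: Properly set the HFIR" message above can be checked with a little arithmetic. The following is only a sketch of that formula, not the driver's code: the function name `hfir_frint` and the example PHY clock frequencies (60 MHz for high speed, 48 MHz for full/low speed) are assumptions chosen for illustration and do not come from the commit.

```python
def hfir_frint(phy_clock_hz, high_speed):
    """Return a FRINT value following the rule quoted in the commit message.

    HS:    125 us  * (PHY clock freq) - 1
    FS/LS: 1000 us * (PHY clock freq) - 1
    """
    interval_us = 125 if high_speed else 1000
    # PHY clock ticks in one (micro)frame, using integer arithmetic.
    ticks = interval_us * phy_clock_hz // 1_000_000
    # The newer databook wording subtracts one; older versions did not.
    return ticks - 1


# Assumed example clocks, for illustration only.
print(hfir_frint(60_000_000, high_speed=True))    # 7499
print(hfir_frint(48_000_000, high_speed=False))   # 47999
```

Under the older documentation the same inputs would give 7500 and 48000, so the two readings differ by exactly one PHY clock tick per (micro)frame.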
popcornmix	db760a57f2	Main bcm2708/bcm2709 linux port Signed-off-by: popcornmix <popcornmix@gmail.com> Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2709: Drop platform smp and timer init code irq-bcm2836 handles this through these functions: bcm2835_init_local_timer_frequency() bcm2836_arm_irqchip_smp_init() Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm270x: Use watchdog for reboot/poweroff The watchdog driver already has support for reboot/poweroff. Make use of this and remove the code from the platform files. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:12 +01:00
Noralf Trønnes	3ede2fae92	i2c: bcm2835: Add debug support This adds a debug module parameter to aid in debugging transfer issues by printing info to the kernel log. When enabled, status values are collected in the interrupt routine and msg info in bcm2835_i2c_start_transfer(). This is done in a way that tries to avoid affecting timing. Having printk in the isr can mask issues. debug values (additive): 1: Print info on error 2: Print info on all transfers 3: Print messages before transfer is started The value can be changed at runtime: /sys/module/i2c_bcm2835/parameters/debug Example output, debug=3: [ 747.114448] bcm2835_i2c_xfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1] [ 747.114463] bcm2835_i2c_xfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1] [ 747.117809] start_transfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1] [ 747.117825] isr: remain=2, status=0x30000055 : TA TXW TXD TXE [i2c1] [ 747.117839] start_transfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1] [ 747.117849] isr: remain=32, status=0xd0000039 : TA RXR TXD RXD [i2c1] [ 747.117861] isr: remain=20, status=0xd0000039 : TA RXR TXD RXD [i2c1] [ 747.117870] isr: remain=8, status=0x32 : DONE TXD RXD [i2c1] Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:11 +01:00
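The flag names printed after each status value in the example output above line up with the low bits of the BCM2835 I2C status register. The helper below is a hypothetical userspace sketch for reading such log lines, not part of the driver; `decode_status` and `STATUS_BITS` are names invented here, and the upper bits seen in values such as 0x30000055 are simply ignored because the log does not say what they mean.

```python
# Flag names of the BCM2835 I2C status register, lowest bit first.
STATUS_BITS = ["TA", "DONE", "TXW", "RXR", "TXD", "RXD", "TXE", "RXF", "ERR", "CLKT"]


def decode_status(status):
    """Return the names of the flags set in the low 10 bits of status."""
    return [name for bit, name in enumerate(STATUS_BITS) if status & (1 << bit)]


# Values taken from the example output in the commit message.
for value in (0x30000055, 0xD0000039, 0x32):
    print(hex(value), " ".join(decode_status(value)))
```

For the three sample values this yields "TA TXW TXD TXE", "TA RXR TXD RXD" and "DONE TXD RXD", matching the text the driver printed.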
Matt Flax	4a0d70b90c	ASoC: bcm2835_i2s.c: relax the ch2 register setting for 8 channels This patch allows ch2 registers to be set for 8 channels of audio.	2017-07-21 15:29:11 +01:00
Claggy3	fa3ceea174	Update vfpmodule.c Christopher Alexander Tobias Schulze - May 2, 2015, 11:57 a.m. This patch fixes a problem with VFP state save and restore related to exception handling (panic with message "BUG: unsupported FP instruction in kernel mode") present on VFP11 floating point units (as used with ARM1176JZF-S CPUs, e.g. on first generation Raspberry Pi boards). This patch was developed and discussed on https://github.com/raspberrypi/linux/issues/859 A precondition to see the crashes is that floating point exception traps are enabled. In this case, the VFP11 might determine that a FPU operation needs to trap at a point in time when it is not possible to signal this to the ARM11 core any more. The VFP11 will then set the FPEXC.EX bit and store the trapped opcode in FPINST. (In some cases, a second opcode might have been accepted by the VFP11 before the exception was detected and could be reported to the ARM11 - in this case, the VFP11 also sets FPEXC.FP2V and stores the second opcode in FPINST2.) If FPEXC.EX is set, the VFP11 will "bounce" the next FPU opcode issued by the ARM11 CPU, which will be seen by the ARM11 as an undefined opcode trap. The VFP support code examines the FPEXC.EX and FPEXC.FP2V bits to decide what actions to take, i.e., whether to emulate the opcodes found in FPINST and FPINST2, and whether to retry the bounced instruction. If a user space application has left the VFP11 in this "pending trap" state, the next FPU opcode issued to the VFP11 might actually be the VSTMIA operation vfp_save_state() uses to store the FPU registers to memory (in our test cases, when building the signal stack frame). In this case, the kernel crashes as described above. This patch fixes the problem by making sure that vfp_save_state() is always entered with FPEXC.EX cleared. (The current value of FPEXC has already been saved, so this does not corrupt the context. Clearing FPEXC.EX has no effects on FPINST or FPINST2. 
Also note that many callers already modify FPEXC by setting FPEXC.EN before invoking vfp_save_state().) This patch also addresses a second problem related to FPEXC.EX: After returning from signal handling, the kernel reloads the VFP context from the user mode stack. However, the current code explicitly clears both FPEXC.EX and FPEXC.FP2V during reload. As VFP11 requires these bits to be preserved, this patch disables clearing them for VFP implementations belonging to architecture 1. There should be no negative side effects: the user can set both bits by executing FPU opcodes anyway, and while user code may now place arbitrary values into FPINST and FPINST2 (e.g., non-VFP ARM opcodes) the VFP support code knows which instructions can be emulated, and rejects other opcodes with "unhandled bounce" messages, so there should be no security impact from allowing reloading FPEXC.EX and FPEXC.FP2V. Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>	2017-07-21 15:29:10 +01:00
Phil Elwell	c771f07139	sound: Demote deferral errors to INFO level At present there is no mechanism to specify driver load order, which can lead to deferrals and repeated retries until successful. Since this situation is expected, reduce the dmesg level to INFO and mention that the operation will be retried. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:10 +01:00
Phil Elwell	5b11072f51	clk-bcm2835: Read max core clock from firmware The VPU is responsible for managing the core clock, usually under direction from the bcm2835-cpufreq driver but not via the clk-bcm2835 driver. Since the core frequency can change without warning, it is safer to report the maximum clock rate to users of the core clock - I2C, SPI and the mini UART - to err on the safe side when calculating clock divisors. If the DT node for the clock driver includes a reference to the firmware node, use the firmware API to query the maximum core clock instead of reading the divider registers. Prior to this patch, a "100KHz" I2C bus was sometimes clocked at about 160KHz. In particular, switching to the 4.9 kernel was likely to break SenseHAT usage on a Pi3. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:09 +01:00
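The "100KHz bus clocked at about 160KHz" symptom described above follows from computing a divisor against one core clock rate and then running at another. A minimal sketch of that arithmetic follows; the 250 MHz and 400 MHz core clock figures and the helper names are assumptions for illustration, and the real driver may apply extra rounding to the divisor.

```python
def i2c_divider(core_clock_hz, bus_hz):
    """Smallest integer divisor that keeps the bus at or below bus_hz."""
    return -(-core_clock_hz // bus_hz)  # ceiling division


def actual_bus_hz(core_clock_hz, divider):
    """Bus frequency produced when the core really runs at core_clock_hz."""
    return core_clock_hz / divider


REQUESTED = 100_000        # "100KHz" I2C bus
ASSUMED_LOW = 250_000_000  # assumed rate read back from the divider registers
ASSUMED_MAX = 400_000_000  # assumed maximum core clock reported by firmware

# Divisor computed from the lower rate, core then runs at the maximum.
print(actual_bus_hz(ASSUMED_MAX, i2c_divider(ASSUMED_LOW, REQUESTED)))  # 160000.0

# Divisor computed from the maximum rate: never faster than requested.
print(actual_bus_hz(ASSUMED_MAX, i2c_divider(ASSUMED_MAX, REQUESTED)))  # 100000.0
print(actual_bus_hz(ASSUMED_LOW, i2c_divider(ASSUMED_MAX, REQUESTED)))  # 62500.0
```

Reporting the maximum rate trades some bus speed at lower core clocks for the guarantee that the bus never exceeds the requested frequency.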
Phil Elwell	3bd24ca5c2	clk-bcm2835: Correct the prediv logic If a clock has the prediv flag set, both the integer and fractional parts must be scaled when calculating the resulting frequency. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:09 +01:00
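To see why scaling only the integer part matters, here is a small fixed-point sketch of a fractional PLL rate calculation. It is an illustration under stated assumptions, not the clk-bcm2835 code: the 20-bit fraction width, the prediv factor of 2, the 19.2 MHz parent and the `scale_fraction` switch (present only to reproduce the incorrect behaviour) are all assumed here.

```python
FRAC_BITS = 20  # assumed width of the fractional divider field


def pll_rate(parent_hz, ndiv, fdiv, pdiv, prediv, scale_fraction=True):
    """Rate of a PLL with integer part ndiv and fractional part fdiv.

    With prediv set, both parts are scaled (assumed factor: 2).
    scale_fraction=False leaves the fraction unscaled to show the error.
    """
    if prediv:
        ndiv *= 2
        if scale_fraction:
            fdiv *= 2
    rate = parent_hz * ((ndiv << FRAC_BITS) + fdiv)
    rate //= pdiv
    return rate >> FRAC_BITS


PARENT = 19_200_000   # assumed oscillator frequency
HALF = 1 << 19        # fractional value 0.5 in 20-bit fixed point

print(pll_rate(PARENT, 10, HALF, 1, prediv=True))                        # 403200000
print(pll_rate(PARENT, 10, HALF, 1, prediv=True, scale_fraction=False))  # 393600000
```

With a multiplier of 10.5 the correct result is 2 x 10.5 x 19.2 MHz = 403.2 MHz; leaving the fraction unscaled gives 20.5 x 19.2 MHz = 393.6 MHz instead.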
Phil Elwell	018cf56051	clk-bcm2835: Add claim-clocks property The claim-clocks property can be used to prevent PLLs and dividers from being marked as critical. It contains a vector of clock IDs, as defined by dt-bindings/clock/bcm2835.h. Use this mechanism to claim PLLD_DSI0, PLLD_DSI1, PLLH_AUX and PLLH_PIX for the vc4_kms_v3d driver. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:08 +01:00
Phil Elwell	4e3150002b	clk-bcm2835: Mark used PLLs and dividers CRITICAL The VPU configures and relies on several PLLs and dividers. Mark all enabled dividers and their PLLs as CRITICAL to prevent the kernel from switching them off. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:07 +01:00
Robert Tiemann	15c90bc61d	BCM2835_DT: Fix I2S register map	2017-07-21 15:29:07 +01:00
Phil Elwell	4c99bd7497	kbuild: Ignore dtco targets when filtering symbols	2017-07-21 15:29:06 +01:00
popcornmix	9bb1f3aabf	bcm2835-rng: Avoid initialising if already enabled Avoids the 0x40000 cycles of warmup again if firmware has already used it	2017-07-21 15:29:06 +01:00
Martin Sperl	0c9c9fa8bb	Register the clocks early during the boot process, so that special/critical clocks can get enabled early on in the boot process avoiding the risk of disabling a clock, pll_divider or pll when a claiming driver fails to install properly - maybe it needs to defer. Signed-off-by: Martin Sperl <kernel@martin.sperl.org>	2017-07-21 15:29:05 +01:00
popcornmix	8d0af7752c	bcm: Make RASPBERRYPI_POWER depend on PM	2017-07-21 15:29:04 +01:00
popcornmix	6035819ab1	reboot: Use power off rather than busy spinning when halt is requested	2017-07-21 15:29:04 +01:00
Noralf Trønnes	c030818274	watchdog: bcm2835: Support setting reboot partition The Raspberry Pi firmware looks at the RSTS register to know which partition to boot from. The reboot syscall command LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument. Add support for passing in a partition number 0..63 to boot from. Partition 63 is a special partition indicating halt. If the partition doesn't exist, the firmware falls back to partition 0. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:03 +01:00
Phil Elwell	5570c7223d	rtc: Add SPI alias for pcf2123 driver Without this alias, Device Tree won't cause the driver to be loaded. See: https://github.com/raspberrypi/linux/pull/1510	2017-07-21 15:29:03 +01:00
popcornmix	adbadb0d03	firmware: Updated mailbox header	2017-07-21 15:29:02 +01:00
Noralf Trønnes	7f99c9a709	dmaengine: bcm2835: Load driver early and support legacy API Load driver early since at least bcm2708_fb doesn't support deferred probing and even if it did, we don't want the video driver deferred. Support the legacy DMA API which is needed by bcm2708_fb. Don't mask out channel 2. Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:01 +01:00
Noralf Trønnes	48a6e815c7	ARM: bcm2835: Set Serial number and Revision The VideoCore bootloader passes in Serial number and Revision number through Device Tree. Make these available to userspace through /proc/cpuinfo. Mainline status: There is a commit in linux-next that standardizes passing the serial number through Device Tree (string: /serial-number): ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo There was an attempt to do the same with the revision number, but it didn't get in: [PATCH v2 1/2] arm: devtree: Set system_rev from DT revision Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:29:01 +01:00
Phil Elwell	898a71fd37	spi-bcm2835: Remove unused code	2017-07-21 15:29:00 +01:00
Phil Elwell	54c0e31bb7	spi-bcm2835: Disable forced software CS Select software CS in bcm2708_common.dtsi, and disable the automatic conversion in the driver to allow hardware CS to be re-enabled with an overlay. See: https://github.com/raspberrypi/linux/issues/1547 Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:29:00 +01:00
Phil Elwell	74bc50a52b	spi-bcm2835: Support pin groups other than 7-11 The spi-bcm2835 driver automatically uses GPIO chip-selects due to some unreliability of the native ones. In doing so it chooses the same pins as the native chip-selects would use, but the existing code always uses pins 7 and 8, wherever the SPI function is mapped. Search the pinctrl group assigned to the driver for pins that correspond to native chip-selects, and use those for GPIO chip-selects. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:28:59 +01:00
Phil Elwell	a566ac2dec	pinctrl-bcm2835: Only request the interrupts listed in the DTB Although the GPIO controller can generate three interrupts (four counting the common one), the device tree files currently only specify two. In the absence of the third, simply don't register that interrupt (as opposed to registering 0), which has the effect of making it impossible to generate interrupts for GPIOs 46-53 which, since they share pins with the SD card interface, is unlikely to be a problem.	2017-07-21 15:28:59 +01:00
notro	626e3c7273	pinctrl-bcm2835: Set base to 0 give expected gpio numbering Signed-off-by: Noralf Tronnes <notro@tronnes.org>	2017-07-21 15:28:58 +01:00
popcornmix	ed66a12bf9	Revert "pinctrl: bcm2835: switch to GPIOLIB_IRQCHIP" This reverts commit `85ae9e512f`.	2017-07-21 15:28:58 +01:00
Phil Elwell	540b2f0bf0	spidev: Add "spidev" compatible string to silence warning See: https://github.com/raspberrypi/linux/issues/1054	2017-07-21 15:28:57 +01:00
Noralf Trønnes	03305eae47	irqchip: irq-bcm2835: Add 2836 FIQ support Signed-off-by: Noralf Trønnes <noralf@tronnes.org>	2017-07-21 15:28:56 +01:00
Noralf Trønnes	adf041275e	irqchip: bcm2835: Add FIQ support Add a duplicate irq range with an offset on the hwirq's so the driver can detect that enable_fiq() is used. Tested with downstream dwc_otg USB controller driver. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> Reviewed-by: Eric Anholt <eric@anholt.net> Acked-by: Stephen Warren <swarren@wwwdotorg.org>	2017-07-21 15:28:56 +01:00
Phil Elwell	2cc90f63f0	irq-bcm2836: Avoid "Invalid trigger warning" Initialise the level for each IRQ to avoid a warning from the arm arch timer code. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:28:55 +01:00
Phil Elwell	2424c93406	irq-bcm2836: Prevent spurious interrupts, and trap them early The old arch-specific IRQ macros included a dsb to ensure the write to clear the mailbox interrupt completed before returning from the interrupt. The BCM2836 irqchip driver needs the same precaution to avoid spurious interrupts. Spurious interrupts are still possible for other reasons, though, so trap them early.	2017-07-21 15:28:55 +01:00
Eric Anholt	86b3753625	mm: Remove the PFN busy warning See commit `dae803e165` -- the warning is expected sometimes when using CMA. However, that commit still spams my kernel log with these warnings. Signed-off-by: Eric Anholt <eric@anholt.net>	2017-07-21 15:28:54 +01:00
Phil Elwell	6ab7d91624	Protect __release_resource against resources without parents Without this patch, removing a device tree overlay can crash here. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2017-07-21 15:28:54 +01:00
popcornmix	83f373703b	Allow mac address to be set in smsc95xx Signed-off-by: popcornmix <popcornmix@gmail.com>	2017-07-21 15:28:53 +01:00
Sam Nazarko	307fc898e0	smsc95xx: Experimental: Enable turbo_mode and packetsize=2560 by default See: http://forum.kodi.tv/showthread.php?tid=285288	2017-07-21 15:28:53 +01:00
Steve Glendinning	be705ca98e	smsx95xx: fix crimes against truesize smsc95xx is adjusting truesize when it shouldn't, and following a recent patch from Eric this is now triggering warnings. This patch stops smsc95xx from changing truesize. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>	2017-07-21 15:28:52 +01:00
Greg Kroah-Hartman	bd1a9eb6a7	Linux 4.11.12	2017-07-21 07:19:02 +02:00
Haozhong Zhang	c69bb56712	kvm: vmx: allow host to access guest MSR_IA32_BNDCFGS commit `691bd4340b` upstream. It's easier for host applications, such as QEMU, if they can always access guest MSR_IA32_BNDCFGS in VMCS, even though MPX is disabled in guest cpuid. Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:16 +02:00
Jim Mattson	bedb27f748	kvm: vmx: Check value written to IA32_BNDCFGS commit `4531662d1a` upstream. Bits 11:2 must be zero and the linear address in bits 63:12 must be canonical. Otherwise, WRMSR(BNDCFGS) should raise #GP. Fixes: `0dd376e709` ("KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save") Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:16 +02:00
Jim Mattson	e0c0372d43	kvm: x86: Guest BNDCFGS requires guest MPX support commit `4439af9f91` upstream. The BNDCFGS MSR should only be exposed to the guest if the guest supports MPX. (cf. the TSC_AUX MSR and RDTSCP.) Fixes: `0dd376e709` ("KVM: x86: add MSR_IA32_BNDCFGS to msrs_to_save") Change-Id: I3ad7c01bda616715137ceac878f3fa7e66b6b387 Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:16 +02:00
Jim Mattson	e36fbd0c09	kvm: vmx: Do not disable intercepts for BNDCFGS commit `a8b6fda38f` upstream. The MSR permission bitmaps are shared by all VMs. However, some VMs may not be configured to support MPX, even when the host does. If the host supports VMX and the guest does not, we should intercept accesses to the BNDCFGS MSR, so that we can synthesize a #GP fault. Furthermore, if the host does not support MPX and the "ignore_msrs" kvm kernel parameter is set, then we should intercept accesses to the BNDCFGS MSR, so that we can skip over the rdmsr/wrmsr without raising a #GP fault. Fixes: `da8999d318` ("KVM: x86: Intel MPX vmx and msr handle") Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Dan Carpenter	27a231e2cb	PM / QoS: return -EINVAL for bogus strings commit `2ca30331c1` upstream. In the current code, if the user accidentally writes a bogus command to this sysfs file, then we set the latency tolerance to an uninitialized variable. Fixes: `2d984ad132` (PM / QoS: Introcuce latency tolerance device PM QoS type) Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Ville Syrjälä	81d6d6cff7	ALSA: x86: Clear the pdata.notify_lpe_audio pointer before teardown commit `8d5c30308d` upstream. Clear the notify function pointer in the platform data before we tear down the driver. Otherwise i915 would end up calling a stale function pointer and possibly explode. Cc: Takashi Iwai <tiwai@suse.de> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-3-ville.syrjala@linux.intel.com Reviewed-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Thomas Gleixner	4ad8c2aa7b	PM / wakeirq: Convert to SRCU commit `ea0212f40c` upstream. The wakeirq infrastructure uses RCU to protect the list of wakeirqs. That breaks the irq bus locking infrastructure, which allows sleeping functions to be called so interrupt controllers behind slow busses, e.g. i2c, can be handled. The wakeirq functions hold rcu_read_lock and call into irq functions, which in case of interrupts using the irq bus locking will trigger a might_sleep() splat. Convert the wakeirq infrastructure to Sleepable RCU and unbreak it. Fixes: `4990d4fe32` (PM / Wakeirq: Add automated device wake IRQ handling) Reported-by: Brian Norris <briannorris@chromium.org> Suggested-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Tony Lindgren <tony@atomide.com> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Peter Zijlstra	f8d5166576	sched/topology: Fix overlapping sched_group_mask commit `73bb059f9b` upstream. The point of sched_group_mask is to select those CPUs from sched_group_cpus that can actually arrive at this balance domain. The current code gets it wrong, as can be readily demonstrated with a topology like: node 0 1 2 3 0: 10 20 30 20 1: 20 10 20 30 2: 30 20 10 20 3: 20 30 20 10 Where (for example) domain 1 on CPU1 ends up with a mask that includes CPU0: [] CPU1 attaching sched-domain: [] domain 0: span 0-2 level NUMA [] groups: 1 (mask: 1), 2, 0 [] domain 1: span 0-3 level NUMA [] groups: 0-2 (mask: 0-2) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072) This causes sched_balance_cpu() to compute the wrong CPU and consequently should_we_balance() will terminate early resulting in missed load-balance opportunities. The fixed topology looks like: [] CPU1 attaching sched-domain: [] domain 0: span 0-2 level NUMA [] groups: 1 (mask: 1), 2, 0 [] domain 1: span 0-3 level NUMA [] groups: 0-2 (mask: 1) (cpu_capacity: 3072), 0,2-3 (cpu_capacity: 3072) (note: this relies on OVERLAP domains to always have children, this is true because the regular topology domains are still here -- this is before degenerate trimming) Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: `e3589f6c81` ("sched: Allow for overlapping sched_domain spans") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Lauro Ramos Venancio	25816b4f1c	sched/topology: Optimize build_group_mask() commit `f32d782e31` upstream. The group mask is always used in intersection with the group CPUs. So, when building the group mask, we don't have to care about CPUs that are not part of the group. Signed-off-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: lwang@redhat.com Cc: riel@redhat.com Link: http://lkml.kernel.org/r/1492717903-5195-2-git-send-email-lvenanci@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Peter Zijlstra	5d2515d4fd	sched/topology: Fix building of overlapping sched-groups commit `0372dd2736` upstream. When building the overlapping groups, we very obviously should start with the previous domain of _this_ @cpu, not CPU-0. This can be readily demonstrated with a topology like: node 0 1 2 3 0: 10 20 30 20 1: 20 10 20 30 2: 30 20 10 20 3: 20 30 20 10 Where (for example) CPU1 ends up generating the following nonsensical groups: [] CPU1 attaching sched-domain: [] domain 0: span 0-2 level NUMA [] groups: 1 2 0 [] domain 1: span 0-3 level NUMA [] groups: 1-3 (cpu_capacity = 3072) 0-1,3 (cpu_capacity = 3072) Where the fact that domain 1 doesn't include a group with span 0-2 is the obvious fail. With patch this looks like: [] CPU1 attaching sched-domain: [] domain 0: span 0-2 level NUMA [] groups: 1 0 2 [] domain 1: span 0-3 level NUMA [] groups: 0-2 (cpu_capacity = 3072) 0,2-3 (cpu_capacity = 3072) Debugged-by: Lauro Ramos Venancio <lvenanci@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Fixes: `e3589f6c81` ("sched: Allow for overlapping sched_domain spans") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Peter Zijlstra	497deefb42	sched/fair, cpumask: Export for_each_cpu_wrap() commit `c6508a3964` upstream. commit `c743f0a5c5` upstream. More users for for_each_cpu_wrap() have appeared. Promote the construct to generic cpumask interface. The implementation is slightly modified to reduce arguments. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Lauro Ramos Venancio <lvenanci@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: lwang@redhat.com Link: http://lkml.kernel.org/r/20170414122005.o35me2h5nowqkxbv@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:15 +02:00
Horia Geantă	5e256be7bb	crypto: caam - fix signals handling commit `7459e1d25f` upstream. Driver does not properly handle the case when signals interrupt wait_for_completion_interruptible(): -it does not check for return value -completion structure is allocated on stack; in case a signal interrupts the sleep, it will go out of scope, causing the worker thread (caam_jr_dequeue) to fail when it accesses it wait_for_completion_interruptible() is replaced with uninterruptable wait_for_completion(). We choose to block all signals while waiting for I/O (device executing the split key generation job descriptor) since the alternative - in order to have a deterministic device state - would be to flush the job ring (aborting all in-progress jobs). Fixes: `045e36780f` ("crypto: caam - ahash hmac support") Fixes: `4c1ec1f930` ("crypto: caam - refactor key_gen, sg") Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
David Gstir	d060c2f05f	crypto: caam - properly set IV after {en,de}crypt commit `854b06f768` upstream. Certain cipher modes like CTS expect the IV (req->info) of ablkcipher_request (or equivalently req->iv of skcipher_request) to contain the last ciphertext block when the {en,de}crypt operation is done. This is currently not the case for the CAAM driver which in turn breaks e.g. cts(cbc(aes)) when the CAAM driver is enabled. This patch fixes the CAAM driver to properly set the IV after the {en,de}crypt operation of ablkcipher finishes. This issue was revealed by the changes in the SW CTS mode in commit `0605c41cc5` ("crypto: cts - Convert to skcipher") Signed-off-by: David Gstir <david@sigma-star.at> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
Herbert Xu	96a5ede157	crypto: sha1-ssse3 - Disable avx2 commit `b82ce24426` upstream. It has been reported that sha1-avx2 can cause page faults by reading beyond the end of the input. This patch disables it until it can be fixed. Fixes: `7c1da8d0d0` ("crypto: sha - SHA1 transform x86_64 AVX2") Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
Gilad Ben-Yossef	d7e274720a	crypto: atmel - only treat EBUSY as transient if backlog commit `1606043f21` upstream. The Atmel SHA driver was treating -EBUSY as indication of queueing to backlog without checking that backlog is enabled for the request. Fix it by checking request flags. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
Martin Hicks	a7439186c9	crypto: talitos - Extend max key length for SHA384/512-HMAC and AEAD commit `03d2c5114c` upstream. An updated patch that also handles the additional key length requirements for the AEAD algorithms. The max keysize is not 96. For SHA384/512 it's 128, and for the AEAD algorithms it's longer still. Extend the max keysize for the AEAD size for AES256 + HMAC(SHA512). Fixes: `357fb60502` ("crypto: talitos - add sha224, sha384 and sha512 to existing AEAD algorithms") Signed-off-by: Martin Hicks <mort@bork.org> Acked-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
Helge Deller	87dde1dce9	mm: fix overflow check in expand_upwards() commit `37511fb5c9` upstream. Jörn Engel noticed that the expand_upwards() function might not return -ENOMEM in case the requested address is (unsigned long)-PAGE_SIZE and if the architecture didn't defined TASK_SIZE as multiple of PAGE_SIZE. Affected architectures are arm, frv, m68k, blackfin, h8300 and xtensa which all define TASK_SIZE as 0xffffffff, but since none of those have an upwards-growing stack we currently have no actual issue. Nevertheless let's fix this just in case any of the architectures with an upward-growing stack (currently parisc, metag and partly ia64) define TASK_SIZE similar. Link: http://lkml.kernel.org/r/20170702192452.GA11868@p100.box Fixes: `bd726c90b6` ("Allow stack to grow up to address space limit") Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: Jörn Engel <joern@purestorage.com> Cc: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
Andy Lutomirski	defb6f0bee	selftests/capabilities: Fix the test_execve test commit `796a3bae2f` upstream. test_execve does rather odd mount manipulations to safely create temporary setuid and setgid executables that aren't visible to the rest of the system. Those executables end up in the test's cwd, but that cwd is MNT_DETACHed. The core namespace code considers MNT_DETACHed trees to belong to no mount namespace at all and, in general, MNT_DETACHed trees are only barely function. This interacted with commit `380cf5ba6b` ("fs: Treat foreign mounts as nosuid") to cause all MNT_DETACHed trees to act as though they're nosuid, breaking the test. Fix it by just not detaching the tree. It's still in a private mount namespace and is therefore still invisible to the rest of the system (except via /proc, and the same nosuid logic will protect all other programs on the system from believing in test_execve's setuid bits). While we're at it, fix some blatant whitespace problems. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: `380cf5ba6b` ("fs: Treat foreign mounts as nosuid") Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuahkh@osg.samsung.com> Cc: Greg KH <greg@kroah.com> Cc: linux-kselftest@vger.kernel.org Signed-off-by: Andy Lutomirski <luto@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:14 +02:00
Eric W. Biederman	d61020efab	mnt: Make propagate_umount less slow for overlapping mount propagation trees commit `296990deb3` upstream. Andrei Vagin pointed out that time to execute propagate_umount can go non-linear (and take a ludicrous amount of time) when the mount propagation trees of the mounts to be unmounted by a lazy unmount overlap. Make the walk of the mount propagation trees nearly linear by remembering which mounts have already been visited, allowing subsequent walks to detect when walking a mount propagation tree or a subtree of a mount propagation tree would be duplicate work and to skip them entirely. Walk the list of mounts whose propagation trees need to be traversed from the mount highest in the mount tree to mounts lower in the mount tree so that odds are higher that the code will walk the largest trees first, allowing later tree walks to be skipped entirely. Add cleanup_umount_visitation to remove the code's memory of which mounts have been visited. Add the functions last_slave and skip_propagation_subtree to allow skipping appropriate parts of the mount propagation tree without needing to change the logic of the rest of the code. 
A script to generate overlapping mount propagation trees: $ cat runs.h set -e mount -t tmpfs zdtm /mnt mkdir -p /mnt/1 /mnt/2 mount -t tmpfs zdtm /mnt/1 mount --make-shared /mnt/1 mkdir /mnt/1/1 iteration=10 if [ -n "$1" ] ; then iteration=$1 fi for i in $(seq $iteration); do mount --bind /mnt/1/1 /mnt/1/1 done mount --rbind /mnt/1 /mnt/2 TIMEFORMAT='%Rs' nr=$(( ( 2 ** ( $iteration + 1 ) ) + 1 )) echo -n "umount -l /mnt/1 -> $nr " time umount -l /mnt/1 nr=$(cat /proc/self/mountinfo | grep zdtm | wc -l ) time umount -l /mnt/2 $ for i in $(seq 9 19); do echo $i; unshare -Urm bash ./run.sh $i; done Here are the performance numbers with and without the patch: mhash | 8192 | 8192 | 1048576 | 1048576 mounts | before | after | before | after ------------------------------------------------ 1025 | 0.040s | 0.016s | 0.038s | 0.019s 2049 | 0.094s | 0.017s | 0.080s | 0.018s 4097 | 0.243s | 0.019s | 0.206s | 0.023s 8193 | 1.202s | 0.028s | 1.562s | 0.032s 16385 | 9.635s | 0.036s | 9.952s | 0.041s 32769 | 60.928s | 0.063s | 44.321s | 0.064s 65537 | | 0.097s | | 0.097s 131073 | | 0.233s | | 0.176s 262145 | | 0.653s | | 0.344s 524289 | | 2.305s | | 0.735s 1048577 | | 7.107s | | 2.603s Andrei Vagin reports fixing the performance problem is part of the work to fix CVE-2016-6213. Fixes: `a05964f391` ("[PATCH] shared mounts handling: umount") Reported-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Eric W. Biederman	b43f81ef0b	mnt: In propgate_umount handle visiting mounts in any order commit `99b19d1647` upstream. While investigating some poor umount performance I realized that in the case of overlapping mount trees where some of the mounts are locked the code has been failing to unmount all of the mounts it should have been unmounting. This failure to unmount all of the necessary mounts can be reproduced with: $ cat locked_mounts_test.sh mount -t tmpfs test-base /mnt mount --make-shared /mnt mkdir -p /mnt/b mount -t tmpfs test1 /mnt/b mount --make-shared /mnt/b mkdir -p /mnt/b/10 mount -t tmpfs test2 /mnt/b/10 mount --make-shared /mnt/b/10 mkdir -p /mnt/b/10/20 mount --rbind /mnt/b /mnt/b/10/20 unshare -Urm --propagation unchanged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi' sleep 1 umount -l /mnt/b wait %% $ unshare -Urm ./locked_mounts_test.sh This failure is corrected by removing the prepass that marks mounts that may be umounted. A first pass is added that umounts mounts if possible and if not sets mount mark if they could be unmounted if they weren't locked and adds them to a list to umount possibilities. This first pass reconsiders the mount's parent if it is on the list of umount possibilities, ensuring that information of umountability will pass from child to mount parent. A second pass then walks through all mounts that are umounted and processes their children unmounting them or marking them for reparenting. A last pass cleans up the state on the mounts that could not be umounted and if applicable reparents them to their first parent that remained mounted. While a bit longer than the old code this code is much more robust as it allows information to flow up from the leaves and down from the trunk making the order in which mounts are encountered in the umount propagation tree irrelevant. 
Fixes: `0c56fe3142` ("mnt: Don't propagate unmounts to locked mounts") Reviewed-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Eric W. Biederman	2d3d57171b	mnt: In umount propagation reparent in a separate pass commit `570487d3fa` upstream. It was observed that in some pathological cases the current code does not unmount everything it should. After investigation it was determined that the issue is that mnt_change_mountpoint can change which mounts are available to be unmounted during mount propagation which is wrong. The trivial reproducer is: $ cat ./pathological.sh mount -t tmpfs test-base /mnt cd /mnt mkdir 1 2 1/1 mount --bind 1 1 mount --make-shared 1 mount --bind 1 2 mount --bind 1/1 1/1 mount --bind 1/1 1/1 echo grep test-base /proc/self/mountinfo umount 1/1 echo grep test-base /proc/self/mountinfo $ unshare -Urm ./pathological.sh The expected output looks like: 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 49 54 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 50 53 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 51 49 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 54 47 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 53 48 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 52 50 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 The output without the fix looks like: 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base 
rw,uid=1000,gid=1000 49 54 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 50 53 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 51 49 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 54 47 0:25 /1/1 /mnt/1/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 53 48 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 52 50 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 46 31 0:25 / /mnt rw,relatime - tmpfs test-base rw,uid=1000,gid=1000 47 46 0:25 /1 /mnt/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 48 46 0:25 /1 /mnt/2 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 52 48 0:25 /1/1 /mnt/2/1 rw,relatime shared:1 - tmpfs test-base rw,uid=1000,gid=1000 That last mount in the output was in the propagation tree to be unmounted but was missed because the mnt_change_mountpoint changed its parent before the walk through the mount propagation tree observed it. Fixes: `1064f874ab` ("mnt: Tuck mounts under others instead of creating shadow/side mounts.") Acked-by: Andrei Vagin <avagin@virtuozzo.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Michael Kelley	ec39c02d39	Drivers: hv: vmbus: Close timing hole that can corrupt per-cpu page commit `13b9abfc92` upstream. Extend the disabling of preemption to include the hypercall so that another thread can't get the CPU and corrupt the per-cpu page used for hypercall arguments. Signed-off-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Johan Hovold	9b5eaef15c	nvmem: core: fix leaks on registration errors commit `3360acdf83` upstream. Make sure to deregister and release the nvmem device and underlying memory on registration errors. Note that the private data must be freed using put_device() once the struct device has been initialised. Also note that there's a related reference leak in the deregistration function as reported by Mika Westerberg which is being fixed separately. Fixes: `b6c217ab9b` ("nvmem: Add backwards compatibility support for older EEPROM drivers.") Fixes: `eace75cfdc` ("nvmem: Add a simple NVMEM framework for nvmem providers") Cc: Andrew Lunn <andrew@lunn.ch> Cc: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Paul E. McKenney	98ea29734e	rcu: Add memory barriers for NOCB leader wakeup commit `6b5fc3a133` upstream. Wait/wakeup operations do not guarantee ordering on their own. Instead, either locking or memory barriers are required. This commit therefore adds memory barriers to wake_nocb_leader() and nocb_leader_wait(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Krister Johansen <kjlx@templeofstupid.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Adam Borowski	994b860857	vt: fix unchecked __put_user() in tioclinux ioctls commit `6987dc8a70` upstream. Only read access is checked before this call. Actually, at the moment this is not an issue, as every in-tree arch does the same manual checks for VERIFY_READ vs VERIFY_WRITE, relying on the MMU to tell them apart, but this wasn't the case in the past and may happen again on some odd arch in the future. If anyone cares about 3.7 and earlier, this is a security hole (untested) on real 80386 CPUs. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:13 +02:00
Dong Bo	a6b177129a	arm64: Preventing READ_IMPLIES_EXEC propagation commit `48f99c8ec0` upstream. Like arch/arm/, we inherit the READ_IMPLIES_EXEC personality flag across fork(). This is undesirable for a number of reasons: * ELF files that don't require executable stack can end up with it anyway * We end up performing un-necessary I-cache maintenance when mapping what should be non-executable pages * Restricting what is executable is generally desirable when defending against overflow attacks This patch clears the personality flag when setting up the personality for newly spawned native tasks. Given that semi-recent AArch64 toolchains emit a non-executable PT_GNU_STACK header, userspace applications can already not rely on READ_IMPLIES_EXEC so shouldn't be adversely affected by this change. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Dong Bo <dongbo4@huawei.com> [will: added comment to compat code, rewrote commit message] Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Marc Zyngier	618986c4bc	ARM64: dts: marvell: armada37xx: Fix timer interrupt specifiers commit `88cda00733` upstream. Contrary to popular belief, PPIs connected to a GICv3 do not have an affinity field similar to that of GICv2. That is consistent with the fact that GICv3 is designed to accommodate thousands of CPUs, and fitting them as a bitmap in a byte is... difficult. Fixes: `adbc3695d9` ("arm64: dts: add the Marvell Armada 3700 family and a development board") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Balbir Singh	21f81d9022	powerpc/kexec: Fix radix to hash kexec due to IAMR/AMOR commit `1e2a516e89` upstream. This patch fixes a crash seen while doing a kexec from radix mode to hash mode. Key 0 is special in hash and used in the RPN by default, we set the key values to 0 today. In radix mode key 0 is used to control supervisor<->user access. In hash key 0 is used by default, so the first instruction after the switch causes a crash on kexec. Commit `3b10d0095a` ("powerpc/mm/radix: Prevent kernel execution of user space") introduced the setting of IAMR and AMOR values to prevent execution of user mode instructions from supervisor mode. We need to clean up these SPR's on kexec. Fixes: `3b10d0095a` ("powerpc/mm/radix: Prevent kernel execution of user space") Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Kees Cook	2ee500dcfd	exec: Limit arg stack to at most 75% of _STK_LIM commit `da029c11e6` upstream. To avoid pathological stack usage or the need to special-case setuid execs, just limit all arg stack usage to at most 75% of _STK_LIM (6MB). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
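The 75% figure in the commit above can be checked with integer arithmetic. This is only a sketch: it assumes `_STK_LIM` is 8 MiB (which matches the 6MB quoted in the message), and the names `STK_LIM_BYTES`, `ARG_STACK_MAX` and `arg_stack_size_ok()` are hypothetical, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: _STK_LIM is 8 MiB; the commit message's "6MB" is 75% of that. */
#define STK_LIM_BYTES (8UL * 1024 * 1024)

/* 75% of the limit, computed without floating point. */
#define ARG_STACK_MAX (STK_LIM_BYTES / 4 * 3)

static_assert(ARG_STACK_MAX == 6UL * 1024 * 1024, "75% of 8 MiB is 6 MiB");

/* Hypothetical helper: true if an argument area of `size` bytes is allowed. */
static bool arg_stack_size_ok(unsigned long size)
{
	return size <= ARG_STACK_MAX;
}
```

Dividing before multiplying keeps the intermediate value small, which matters on 32-bit targets where `unsigned long` is 32 bits wide.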
Kees Cook	4d5266f108	s390: reduce ELF_ET_DYN_BASE commit `a73dc5370e` upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For s390 the position could be 0x10000, but that is needlessly close to the NULL address. Link: http://lkml.kernel.org/r/1498154792-49952-5-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Kees Cook	fa3a378b22	powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB commit `47ebb09d54` upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, which is the traditional x86 minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). Link: http://lkml.kernel.org/r/1498154792-49952-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Tested-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Kees Cook	4e33130d2c	arm64: move ELF_ET_DYN_BASE to 4GB / 4MB commit `02445990a9` upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. For 64-bit, align to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. On 32-bit use 4MB, to match ARM. This could be 0x8000, the standard ET_EXEC load address, but that is needlessly close to the NULL address, and anyone running arm compat PIE will have an MMU, so the tight mapping is not needed. Link: http://lkml.kernel.org/r/1498251600-132458-4-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Kees Cook	612584f74d	arm: move ELF_ET_DYN_BASE to 4MB commit `6a9af90a3b` upstream. Now that explicitly executed loaders are loaded in the mmap region, we have more freedom to decide where we position PIE binaries in the address space to avoid possible collisions with mmap or stack regions. 4MB is chosen here mainly to have parity with x86, where this is the traditional minimum load location, likely to avoid historically requiring a 4MB page table entry when only a portion of the first 4MB would be used (since the NULL address is avoided). For ARM the position could be 0x8000, the standard ET_EXEC load address, but that is needlessly close to the NULL address, and anyone running PIE on 32-bit ARM will have an MMU, so the tight mapping is not needed. Link: http://lkml.kernel.org/r/1498154792-49952-2-git-send-email-keescook@chromium.org Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Pratyush Anand <panand@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Kees Cook <keescook@chromium.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:12 +02:00
Kees Cook	9b1bbf6ea9	binfmt_elf: use ELF_ET_DYN_BASE only for PIE commit `eab09532d4` upstream. The ELF_ET_DYN_BASE position was originally intended to keep loaders away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2 /bin/cat" might cause the subsequent load of /bin/cat into where the loader had been loaded.) With the advent of PIE (ET_DYN binaries with an INTERP Program Header), ELF_ET_DYN_BASE continued to be used since the kernel was only looking at ET_DYN. However, since ELF_ET_DYN_BASE is traditionally set at the top 1/3rd of the TASK_SIZE, a substantial portion of the address space is unused. For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs are loaded above the mmap region. This means they can be made to collide (CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with pathological stack regions. Lowering ELF_ET_DYN_BASE solves both by moving programs below the mmap region in all cases, and will now additionally avoid programs falling back to the mmap region by enforcing MAP_FIXED for program loads (i.e. if it would have collided with the stack, now it will fail to load instead of falling back to the mmap region). To allow for a lower ELF_ET_DYN_BASE, loaders (ET_DYN without INTERP) are loaded into the mmap region, leaving space available for either an ET_EXEC binary with a fixed location or PIE being loaded into mmap by the loader. Only PIE programs are loaded offset from ELF_ET_DYN_BASE, which means architectures can now safely lower their values without risk of loaders colliding with their subsequently loaded programs. For 64-bit, ELF_ET_DYN_BASE is best set to 4GB to allow runtimes to use the entire 32-bit address space for 32-bit pointers. Thanks to PaX Team, Daniel Micay, and Rik van Riel for inspiration and suggestions on how to implement this solution. 
Fixes: `d1fd836dcf` ("mm: split ET_DYN ASLR from mmap ASLR") Link: http://lkml.kernel.org/r/20170621173201.GA114489@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dmitry Safonov <dsafonov@virtuozzo.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: James Hogan <james.hogan@imgtec.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Pratyush Anand <panand@redhat.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
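The placement rules described in the binfmt_elf commit above can be summarised as a small decision function. The sketch below is an illustration only, not the kernel's loader code: `pick_region()`, `et_dyn_base()`, the `load_region` enum and the `SZ_*` macros are hypothetical names, and the 4GB / 4MB values are the ones quoted in the ELF_ET_DYN_BASE commits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_4M UINT64_C(0x400000)
#define SZ_4G UINT64_C(0x100000000)

/* A 4GB base leaves the whole 32-bit address space below the program. */
static_assert(SZ_4G > UINT32_MAX, "4GB base must sit above all 32-bit addresses");

enum load_region {
	LOAD_FIXED,        /* ET_EXEC: linked at a fixed address */
	LOAD_AT_DYN_BASE,  /* PIE: ET_DYN with an INTERP program header */
	LOAD_IN_MMAP       /* loader: ET_DYN without INTERP */
};

/* Hypothetical helper mirroring the rules in the commit message. */
static enum load_region pick_region(bool is_et_dyn, bool has_interp)
{
	if (!is_et_dyn)
		return LOAD_FIXED;
	return has_interp ? LOAD_AT_DYN_BASE : LOAD_IN_MMAP;
}

/* Base address suggested by the commits: 4GB for 64-bit, 4MB for 32-bit. */
static uint64_t et_dyn_base(bool is_64bit)
{
	return is_64bit ? SZ_4G : SZ_4M;
}
```

Only the PIE case consults `et_dyn_base()`; loaders go to the mmap region, which is what lets architectures lower the base without loaders colliding with the programs they load.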
Cyril Bur	422b6365be	checkpatch: silence perl 5.26.0 unescaped left brace warnings commit `8d81ae05d0` upstream. As of perl 5, version 26, subversion 0 (v5.26.0) some new warnings have occurred when running checkpatch. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s){ <-- HERE \s/ at scripts/checkpatch.pl line 3544. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(.\s){ <-- HERE \s/ at scripts/checkpatch.pl line 3885. Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by <-- HERE in m/^(\+.*(?:do\|\))){ <-- HERE / at scripts/checkpatch.pl line 4374. It seems perfectly reasonable to do as the warning suggests and simply escape the left brace in these three locations. Link: http://lkml.kernel.org/r/20170607060135.17384-1-cyrilbur@gmail.com Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
Sahitya Tummala	29d52923b2	fs/dcache.c: fix spin lockup issue on nlru->lock commit `b17c070fb6` upstream. __list_lru_walk_one() holds the nlru spin lock (nlru->lock) for longer when the lru list contains more items. With the current code, it can hold the spin lock for up to UINT_MAX entries at a time. So when the lru list contains a large number of items, "BUG: spinlock lockup suspected" is observed in the path below: spin_bug+0x90 do_raw_spin_lock+0xfc _raw_spin_lock+0x28 list_lru_add+0x28 dput+0x1c8 path_put+0x20 terminate_walk+0x3c path_lookupat+0x100 filename_lookup+0x6c user_path_at_empty+0x54 SyS_faccessat+0xd0 el0_svc_naked+0x24 This nlru->lock is acquired by another CPU in this path - d_lru_shrink_move+0x34 dentry_lru_isolate_shrink+0x48 __list_lru_walk_one.isra.10+0x94 list_lru_walk_node+0x40 shrink_dcache_sb+0x60 do_remount_sb+0xbc do_emergency_remount+0xb0 process_one_work+0x228 worker_thread+0x2e0 kthread+0xf4 ret_from_fork+0x10 Fix this lockup by reducing the number of entries to be shrunk from the lru list to 1024 at once. Also, add cond_resched() before processing the lru list again. Link: http://marc.info/?t=149722864900001&r=1&w=2 Link: http://lkml.kernel.org/r/1498707575-2472-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Suggested-by: Jan Kara <jack@suse.cz> Suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
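The batching idea in the fs/dcache.c commit above can be sketched in plain userspace C. This is an assumption-laden model, not the kernel change: `walk_in_batches()` and `struct walk_stats` are hypothetical, and the lock and `cond_resched()` calls are represented only by comments. It shows that capping each pass at 1024 entries bounds how much work happens per lock hold.

```c
#include <assert.h>

#define WALK_BATCH 1024UL /* batch size named in the commit message */

static_assert(WALK_BATCH > 0, "a zero batch size would never make progress");

struct walk_stats {
	unsigned long batches;  /* number of lock acquire/release cycles */
	unsigned long max_held; /* most entries handled under one lock hold */
};

/* Hypothetical model: process `total` entries in bounded batches. */
static struct walk_stats walk_in_batches(unsigned long total)
{
	struct walk_stats st = { 0, 0 };

	while (total > 0) {
		unsigned long n = total < WALK_BATCH ? total : WALK_BATCH;

		/* lock would be taken here, n entries processed, lock dropped */
		total -= n;
		st.batches++;
		if (n > st.max_held)
			st.max_held = n;
		/* a cond_resched()-style yield would go here */
	}
	return st;
}
```

With 2500 entries the model takes three lock holds, none covering more than 1024 entries, so another CPU waiting on the lock gets a chance between batches.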
Sahitya Tummala	7d212c2078	mm/list_lru.c: fix list_lru_count_node() to be race free commit `2c80cd57c7` upstream. list_lru_count_node() iterates over all memcgs to get the total number of entries on the node but it can race with memcg_drain_all_list_lrus(), which migrates the entries from a dead cgroup to another. This can return incorrect number of entries from list_lru_count_node(). Fix this by keeping track of entries per node and simply return it in list_lru_count_node(). Link: http://lkml.kernel.org/r/1498707555-30525-1-git-send-email-stummala@codeaurora.org Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Jan Kara <jack@suse.cz> Cc: Alexander Polakov <apolyakov@beget.ru> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
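A minimal sketch of the per-node counter approach described in the mm/list_lru.c commit above. All names (`struct lru_node`, `lru_add()`, `lru_drain()`, `lru_count_node()`, `NR_GROUPS`) are hypothetical stand-ins, and locking is omitted. The point is that migrating entries between per-group lists leaves the node total untouched, so the count no longer depends on iterating the groups.

```c
#include <assert.h>

#define NR_GROUPS 4 /* illustrative number of per-cgroup lists */

struct lru_node {
	long per_group[NR_GROUPS]; /* entries on each group's list */
	long nr_items;             /* running total for the whole node */
};

static void lru_add(struct lru_node *n, int group)
{
	assert(group >= 0 && group < NR_GROUPS);
	n->per_group[group]++;
	n->nr_items++;
}

/* Move every entry from a dead group to another; the total is unchanged. */
static void lru_drain(struct lru_node *n, int from, int to)
{
	assert(from >= 0 && from < NR_GROUPS && to >= 0 && to < NR_GROUPS);
	n->per_group[to] += n->per_group[from];
	n->per_group[from] = 0;
}

/* Constant-time count that does not walk the per-group lists. */
static long lru_count_node(const struct lru_node *n)
{
	return n->nr_items;
}
```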
Marcin Nowakowski	35038e4ff3	kernel/extable.c: mark core_kernel_text notrace commit `c0d80ddab8` upstream. core_kernel_text is used by MIPS in its function graph trace processing, so having this method traced leads to an infinite set of recursive calls such as: Call Trace: ftrace_return_to_handler+0x50/0x128 core_kernel_text+0x10/0x1b8 prepare_ftrace_return+0x6c/0x114 ftrace_graph_caller+0x20/0x44 return_to_handler+0x10/0x30 return_to_handler+0x0/0x30 return_to_handler+0x0/0x30 ftrace_ops_no_ops+0x114/0x1bc core_kernel_text+0x10/0x1b8 core_kernel_text+0x10/0x1b8 core_kernel_text+0x10/0x1b8 ftrace_ops_no_ops+0x114/0x1bc core_kernel_text+0x10/0x1b8 prepare_ftrace_return+0x6c/0x114 ftrace_graph_caller+0x20/0x44 (...) Mark the function notrace to avoid it being traced. Link: http://lkml.kernel.org/r/1498028607-6765-1-git-send-email-marcin.nowakowski@imgtec.com Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Meyer <thomas@m3y3r.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
Kirill A. Shutemov	687203d9a7	thp, mm: fix crash due race in MADV_FREE handling commit `bbf29ffc7f` upstream. Reinette reported the following crash: BUG: Bad page state in process log2exe pfn:57600 page:ffffea00015d8000 count:0 mapcount:0 mapping: (null) index:0x20200 flags: 0x4000000000040019(locked\|uptodate\|dirty\|swapbacked) raw: 4000000000040019 0000000000000000 0000000000020200 00000000ffffffff raw: ffffea00015d8020 ffffea00015d8020 0000000000000000 0000000000000000 page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set bad because of flags: 0x1(locked) Modules linked in: rfcomm 8021q bnep intel_rapl x86_pkg_temp_thermal coretemp efivars btusb btrtl btbcm pwm_lpss_pci snd_hda_codec_hdmi btintel pwm_lpss snd_hda_codec_realtek snd_soc_skl snd_hda_codec_generic snd_soc_skl_ipc spi_pxa2xx_platform snd_soc_sst_ipc snd_soc_sst_dsp i2c_designware_platform i2c_designware_core snd_hda_ext_core snd_soc_sst_match snd_hda_intel snd_hda_codec mei_me snd_hda_core mei snd_soc_rt286 snd_soc_rl6347a snd_soc_core efivarfs CPU: 1 PID: 354 Comm: log2exe Not tainted 4.12.0-rc7-test-test #19 Hardware name: Intel corporation NUC6CAYS/NUC6CAYB, BIOS AYAPLCEL.86A.0027.2016.1108.1529 11/08/2016 Call Trace: bad_page+0x16a/0x1f0 free_pages_check_bad+0x117/0x190 free_hot_cold_page+0x7b1/0xad0 __put_page+0x70/0xa0 madvise_free_huge_pmd+0x627/0x7b0 madvise_free_pte_range+0x6f8/0x1150 __walk_page_range+0x6b5/0xe30 walk_page_range+0x13b/0x310 madvise_free_page_range.isra.16+0xad/0xd0 madvise_free_single_vma+0x2e4/0x470 SyS_madvise+0x8ce/0x1450 If somebody frees the page under us and we hold the last reference to it, put_page() would attempt to free the page before unlocking it. The fix is trivial reorder of operations. Dave said: "I came up with the exact same patch. 
For posterity, here's the test case, generated by syzkaller and trimmed down by Reinette: https://www.sr71.net/~dave/intel/log2.c And the config that helps detect this: https://www.sr71.net/~dave/intel/config-log2" Fixes: `b8d3c4c300` ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called") Link: http://lkml.kernel.org/r/20170628101249.17879-1-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Reinette Chatre <reinette.chatre@intel.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Minchan Kim <minchan@kernel.org> Cc: Huang Ying <ying.huang@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
Ben Hutchings	f795825cd6	tools/lib/lockdep: Reduce MAX_LOCK_DEPTH to avoid overflowing lock_chain::depth commit `98dcea0cfd` upstream. liblockdep has been broken since commit `75dd602a51` ("lockdep: Fix lock_chain::base size"), as that adds a check that MAX_LOCK_DEPTH is within the range of lock_chain::depth and in liblockdep it is much too large. That should have resulted in a compiler error, but didn't because: - the check uses ARRAY_SIZE(), which isn't yet defined in liblockdep so is assumed to be an (undeclared) function - putting a function call inside a BUILD_BUG_ON() expression quietly turns it into some nonsense involving a variable-length array It did produce a compiler warning, but I didn't notice because liblockdep already produces too many warnings if -Wall is enabled (which I'll fix shortly). Even before that commit, which reduced lock_chain::depth from 8 bits to 6, MAX_LOCK_DEPTH was too large. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: a.p.zijlstra@chello.nl Link: http://lkml.kernel.org/r/20170525130005.5947-3-alexander.levin@verizon.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
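The constraint behind the lockdep commit above is that a 6-bit field can only hold depths 0 to 63. The sketch below is an illustration under assumptions: the struct layouts and the value 48 for `MAX_LOCK_DEPTH` are made up for the example, and `ARRAY_SIZE` is defined locally. It uses a C11 `static_assert` as a stand-in for the kernel's build-time check, which, unlike the situation described in the message, fails loudly if the macro is missing.

```c
#include <assert.h>
#include <stdbool.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

#define DEPTH_BITS 6      /* width of the depth field per the commit message */
#define MAX_LOCK_DEPTH 48 /* illustrative value, not taken from the source */

struct lock_chain_sketch {
	unsigned int depth : DEPTH_BITS;
	unsigned int base : 24;
};

struct task_sketch {
	int held_locks[MAX_LOCK_DEPTH];
};

/* Build fails if the held-lock array could exceed what `depth` can count. */
static_assert(ARRAY_SIZE(((struct task_sketch *)0)->held_locks) < (1UL << DEPTH_BITS),
	      "MAX_LOCK_DEPTH does not fit in lock_chain depth");

/* True if `depth` is representable in the bit-field. */
static bool depth_fits(unsigned int depth)
{
	return depth < (1u << DEPTH_BITS);
}

/* Stores a depth and reads it back; oversized values wrap modulo 64. */
static unsigned int store_depth(unsigned int depth)
{
	struct lock_chain_sketch c = { 0, 0 };

	c.depth = depth;
	return c.depth;
}
```

Storing 64 into the field reads back as 0, which is the kind of silent overflow the commit title refers to.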
Helge Deller	e03e4acf59	parisc/mm: Ensure IRQs are off in switch_mm() commit `649aa24254` upstream. This is because of commit `f98db6013c` ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") in which switch_mm_irqs_off() is called by the scheduler, vs switch_mm() which is used by use_mm(). This patch lets the parisc code mirror the x86 and powerpc code, ie. it disables interrupts in switch_mm(), and optimises the scheduler case by defining switch_mm_irqs_off(). Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:11 +02:00
Thomas Bogendoerfer	b05deb7892	parisc: DMA API: return error instead of BUG_ON for dma ops on non dma devs commit `33f9e02495` upstream. Enabling parport pc driver on a B2600 (and probably other 64bit PARISC systems) produced the following BUG: CPU: 0 PID: 1 Comm: swapper Not tainted 4.12.0-rc5-30198-g1132d5e #156 task: 000000009e050000 task.stack: 000000009e04c000 YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI PSW: 00001000000001101111111100001111 Not tainted r00-03 000000ff0806ff0f 000000009e04c990 0000000040871b78 000000009e04cac0 r04-07 0000000040c14de0 ffffffffffffffff 000000009e07f098 000000009d82d200 r08-11 000000009d82d210 0000000000000378 0000000000000000 0000000040c345e0 r12-15 0000000000000005 0000000040c345e0 0000000000000000 0000000040c9d5e0 r16-19 0000000040c345e0 00000000f00001c4 00000000f00001bc 0000000000000061 r20-23 000000009e04ce28 0000000000000010 0000000000000010 0000000040b89e40 r24-27 0000000000000003 0000000000ffffff 000000009d82d210 0000000040c14de0 r28-31 0000000000000000 000000009e04ca90 000000009e04cb40 0000000000000000 sr00-03 0000000000000000 0000000000000000 0000000000000000 0000000000000000 sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000404aece0 00000000404aece4 IIR: 03ffe01f ISR: 0000000010340000 IOR: 000001781304cac8 CPU: 0 CR30: 000000009e04c000 CR31: 00000000e2976de2 ORIG_R28: 0000000000000200 IAOQ[0]: sba_dma_supported+0x80/0xd0 IAOQ[1]: sba_dma_supported+0x84/0xd0 RP(r2): parport_pc_probe_port+0x178/0x1200 The cause is a call to dma_coerce_mask_and_coherent in parport_pc_probe_port, which PARISC DMA API doesn't handle very nicely. This commit gives back DMA_ERROR_CODE for DMA API calls if the device isn't capable of DMA transactions. Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
Eric Biggers	e8de691feb	parisc: use compat_sys_keyctl() commit `b0f94efd5a` upstream. Architectures with a compat syscall table must put compat_sys_keyctl() in it, not sys_keyctl(). The parisc architecture was not doing this; fix it. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
Helge Deller	1d4e20497b	parisc: Report SIGSEGV instead of SIGBUS when running out of stack commit `247462316f` upstream. When a process runs out of stack the parisc kernel wrongly faults with SIGBUS instead of the expected SIGSEGV signal. This example shows how the kernel faults: do_page_fault() command='a.out' type=15 address=0xfaac2000 in libc-2.24.so[f8308000+16c000] trap #15: Data TLB miss fault, vm_start = 0xfa2c2000, vm_end = 0xfaac2000 The vma->vm_end value is the first address which does not belong to the vma, so adjust the check to include vma->vm_end to the range for which to send the SIGSEGV signal. This patch unbreaks building the debian libsigsegv package. Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
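The off-by-one in the parisc SIGSEGV commit above comes from `vm_end` being the first address past the mapping. The sketch below contrasts an inclusive and an exclusive upper bound using the addresses from the fault log. The "old" predicate is a reconstruction from the commit description, not quoted kernel code, and `struct vma_sketch` and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct vma_sketch {
	unsigned long vm_start; /* first address inside the mapping */
	unsigned long vm_end;   /* first address past the mapping */
};

/* Assumed pre-fix behaviour: vm_end itself is treated as inside. */
static bool outside_vma_old(const struct vma_sketch *v, unsigned long addr)
{
	assert(v->vm_start <= v->vm_end);
	return addr < v->vm_start || addr > v->vm_end;
}

/* Post-fix behaviour: vm_end counts as outside the mapping. */
static bool outside_vma_fixed(const struct vma_sketch *v, unsigned long addr)
{
	assert(v->vm_start <= v->vm_end);
	return addr < v->vm_start || addr >= v->vm_end;
}
```

For the logged fault at 0xfaac2000 with `vm_end = 0xfaac2000`, only the second predicate reports the address as outside the mapping, which is the case where SIGSEGV is expected.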
Suzuki K Poulose	46bed2221c	irqchip/gic-v3: Fix out-of-bound access in gic_set_affinity commit `866d7c1b0a` upstream. The GICv3 driver doesn't check if the target CPU for gic_set_affinity is valid before going ahead and making the changes. This triggers the following splat with KASAN: [ 141.189434] BUG: KASAN: global-out-of-bounds in gic_set_affinity+0x8c/0x140 [ 141.189704] Read of size 8 at addr ffff200009741d20 by task swapper/1/0 [ 141.189958] [ 141.190158] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.12.0-rc7 [ 141.190458] Hardware name: Foundation-v8A (DT) [ 141.190658] Call trace: [ 141.190908] [<ffff200008089d70>] dump_backtrace+0x0/0x328 [ 141.191224] [<ffff20000808a1b4>] show_stack+0x14/0x20 [ 141.191507] [<ffff200008504c3c>] dump_stack+0xa4/0xc8 [ 141.191858] [<ffff20000826c19c>] print_address_description+0x13c/0x250 [ 141.192219] [<ffff20000826c5c8>] kasan_report+0x210/0x300 [ 141.192547] [<ffff20000826ad54>] __asan_load8+0x84/0x98 [ 141.192874] [<ffff20000854eeec>] gic_set_affinity+0x8c/0x140 [ 141.193158] [<ffff200008148b14>] irq_do_set_affinity+0x54/0xb8 [ 141.193473] [<ffff200008148d2c>] irq_set_affinity_locked+0x64/0xf0 [ 141.193828] [<ffff200008148e00>] __irq_set_affinity+0x48/0x78 [ 141.194158] [<ffff200008bc48a4>] arm_perf_starting_cpu+0x104/0x150 [ 141.194513] [<ffff2000080d73bc>] cpuhp_invoke_callback+0x17c/0x1f8 [ 141.194783] [<ffff2000080d94ec>] notify_cpu_starting+0x8c/0xb8 [ 141.195130] [<ffff2000080911ec>] secondary_start_kernel+0x15c/0x200 [ 141.195390] [<0000000080db81b4>] 0x80db81b4 [ 141.195603] [ 141.195685] The buggy address belongs to the variable: [ 141.196012] __cpu_logical_map+0x200/0x220 [ 141.196176] [ 141.196315] Memory state around the buggy address: [ 141.196586] ffff200009741c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 141.196913] ffff200009741c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 141.197158] >ffff200009741d00: 00 00 00 00 fa fa fa fa 00 00 00 00 00 00 00 00 [ 141.197487] ^ [ 141.197758] 
ffff200009741d80: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00 [ 141.198060] ffff200009741e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 141.198358] ================================================================== [ 141.198609] Disabling lock debugging due to kernel taint [ 141.198961] CPU1: Booted secondary processor [410fd051] This patch adds the check to make sure the cpu is valid. Fixes: commit `021f653791` ("irqchip: gic-v3: Initial support for GICv3") Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
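The class of fix described in the gic_set_affinity commit above is a bounds check before indexing a per-CPU table. The sketch below is a generic illustration, not the driver code: `cpu_logical_map_sketch`, `NR_CPUS_SKETCH`, the table contents and `lookup_affinity()` are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4 /* illustrative table size */

/* Stand-in for a per-CPU hardware ID table; the values are made up. */
static const uint64_t cpu_logical_map_sketch[NR_CPUS_SKETCH] = {
	0x000, 0x001, 0x100, 0x101
};

/* Returns 0 and stores the ID, or -EINVAL if `cpu` is out of range. */
static int lookup_affinity(unsigned int cpu, uint64_t *out)
{
	assert(out);
	if (cpu >= NR_CPUS_SKETCH)
		return -EINVAL; /* reject before touching the array */
	*out = cpu_logical_map_sketch[cpu];
	return 0;
}
```

Without the range check, an index equal to the table size reads just past the array, which is the kind of global-out-of-bounds access KASAN reported in the splat.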
Alex Deucher	da3538c859	drm/amdgpu/gfx6: properly cache mc_arb_ramcfg commit `6653ebd48f` upstream. This was missing for gfx6. Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
Srinivas Dasari	fc0a2569a0	cfg80211: Check if NAN service ID is of expected size commit `0a27844ce8` upstream. nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, cfg80211 may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_NAN_FUNC_SERVICE_ID to make these NLA_UNSPEC and to make sure minimum NL80211_NAN_FUNC_SERVICE_ID_LEN bytes are received from userspace with NL80211_NAN_FUNC_SERVICE_ID. Fixes: `a442b761b2` ("cfg80211: add add_nan_func / del_nan_func") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
Srinivas Dasari	0e60de601f	cfg80211: Check if PMKID attribute is of expected size commit `9361df14d1` upstream. nla policy checks for only maximum length of the attribute data when the attribute type is NLA_BINARY. If userspace sends less data than specified, the wireless drivers may access illegal memory. When type is NLA_UNSPEC, nla policy check ensures that userspace sends minimum specified length number of bytes. Remove type assignment to NLA_BINARY from nla_policy of NL80211_ATTR_PMKID to make this NLA_UNSPEC and to make sure minimum WLAN_PMKID_LEN bytes are received from userspace with NL80211_ATTR_PMKID. Fixes: `67fbb16be6` ("nl80211: PMKSA caching support") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
Srinivas Dasari	880b98c339	cfg80211: Validate frequencies nested in NL80211_ATTR_SCAN_FREQUENCIES commit `d7f13f7450` upstream. validate_scan_freqs() retrieves frequencies from attributes nested in the attribute NL80211_ATTR_SCAN_FREQUENCIES with nla_get_u32(), which reads 4 bytes from each attribute without validating the size of data received. Attributes nested in NL80211_ATTR_SCAN_FREQUENCIES don't have an nla policy. Validate size of each attribute before parsing to avoid potential buffer overread. Fixes: `2a51931192` ("cfg80211/nl80211: scanning (and mac80211 update to use it)") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:10 +02:00
Srinivas Dasari	61d3f24df7	cfg80211: Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE commit `8feb69c7bd` upstream. Buffer overread may happen as nl80211_set_station() reads 4 bytes from the attribute NL80211_ATTR_LOCAL_MESH_POWER_MODE without validating the size of data received when userspace sends less than 4 bytes of data with NL80211_ATTR_LOCAL_MESH_POWER_MODE. Define nla_policy for NL80211_ATTR_LOCAL_MESH_POWER_MODE to avoid the buffer overread. Fixes: `3b1c5a5307` ("{cfg,nl}80211: mesh power mode primitives and userspace access") Signed-off-by: Srinivas Dasari <dasaris@qti.qualcomm.com> Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
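The four cfg80211 commits above share one pattern: check that an attribute payload is at least as long as what will be read from it. The sketch below models that pattern in plain C. It is not the netlink API: `struct attr_sketch` and `attr_get_u32_checked()` are hypothetical stand-ins for an attribute and a validated read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute: a payload pointer plus its length in bytes. */
struct attr_sketch {
	const uint8_t *data;
	size_t len;
};

/* Reads a u32 only if the payload holds at least four bytes. */
static bool attr_get_u32_checked(const struct attr_sketch *a, uint32_t *out)
{
	assert(a && out);
	if (a->len < sizeof(*out))
		return false; /* too short: reading would overrun the payload */
	memcpy(out, a->data, sizeof(*out));
	return true;
}
```

An unchecked four-byte read from a two-byte payload would consume two bytes that userspace never supplied, which is the overread the commit messages describe.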
Daniel Kiper	5c5b4bff5f	efi: Process the MEMATTR table only if EFI_MEMMAP is enabled commit `457ea3f7e9` upstream. Otherwise e.g. Xen dom0 on x86_64 EFI platforms crashes. In theory we can check EFI_PARAVIRT too, however, EFI_MEMMAP looks more targeted and covers more cases. Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: andrew.cooper3@citrix.com Cc: boris.ostrovsky@oracle.com Cc: jgross@suse.com Cc: linux-efi@vger.kernel.org Cc: matt@codeblueprint.co.uk Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1498128697-12943-2-git-send-email-daniel.kiper@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Peter S. Housel	046f5b7733	brcmfmac: Fix glom_skb leak in brcmf_sdiod_recv_chain commit `5ea59db8a3` upstream. An earlier change to this function (`3bdae81072`) fixed a leak in the case of an unsuccessful call to brcmf_sdiod_buffrw(). However, the glom_skb buffer, used for emulating a scattering read, is never used or referenced after its contents are copied into the destination buffers, and therefore always needs to be freed by the end of the function. Fixes: `3bdae81072` ("brcmfmac: Fix glob_skb leak in brcmf_sdiod_recv_chain") Fixes: `a413e39a38` ("brcmfmac: fix brcmf_sdcard_recv_chain() for host without sg support") Signed-off-by: Peter S. Housel <housel@acm.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Christophe Jaillet	305b4b1b5b	brcmfmac: Fix a memory leak in error handling path in 'brcmf_cfg80211_attach' commit `57c00f2fac` upstream. If 'wiphy_new()' fails, we leak 'ops'. Add a new label in the error handling path to free it in such a case. Fixes: `5c22fb8510` ("brcmfmac: add wowl gtk rekeying offload support") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Bart Van Assche	a6378366f6	block: Fix a blk_exit_rl() regression commit `dc9edc44de` upstream. Avoid that the following complaint is reported: BUG: sleeping function called from invalid context at kernel/workqueue.c:2790 in_atomic(): 1, irqs_disabled(): 0, pid: 41, name: rcuop/3 1 lock held by rcuop/3/41: #0: (rcu_callback){......}, at: [<ffffffff8111f9a2>] rcu_nocb_kthread+0x282/0x500 Call Trace: dump_stack+0x86/0xcf ___might_sleep+0x174/0x260 __might_sleep+0x4a/0x80 flush_work+0x7e/0x2e0 __cancel_work_timer+0x143/0x1c0 cancel_work_sync+0x10/0x20 blk_throtl_exit+0x25/0x60 blkcg_exit_queue+0x35/0x40 blk_release_queue+0x42/0x130 kobject_put+0xa9/0x190 This happens since we invoke callbacks that need to block from the queue release handler. Fix this by pushing the final release to a workqueue. Reported-by: Ross Zwisler <zwisler@gmail.com> Fixes: commit `b425e50492` ("block: Avoid that blk_exit_rl() triggers a use-after-free") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Updated changelog Signed-off-by: Jens Axboe <axboe@fb.com> Cc: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Nitin Gupta	27a85c4acc	sparc64: Fix gup_huge_pmd [ Upstream commit `dbd2667a4f` ] The function assumes that each PMD points to head of a huge page. This is not correct as a PMD can point to start of any 8M region with a, say 256M, hugepage. The fix ensures that it points to the correct head of any PMD huge page. Cc: Julian Calaby <julian.calaby@gmail.com> Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Nagarathnam Muthusamy	64e3d66451	Adding the type of exported symbols [ Upstream commit `f5a651f1d5` ] Missing symbol type for few functions prevents genksyms from generating symbol versions for those functions. This patch fixes them. Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Nagarathnam Muthusamy	2ea8b8188f	sed regex in Makefile.build requires line break between exported symbols [ Upstream commit `d16c0649fe` ] The following regex in Makefile.build matches only one ___EXPORT_SYMBOL per line. sed 's/.*___EXPORT_SYMBOL[[:space:]]*\([a-zA-Z0-9_]*\)[[:space:]]*,.*/EXPORT_SYMBOL(\1);/' ATOMIC_OPS macro in atomic_64.S expands multiple symbols in same line hence version generation is done only for the last matched symbol. This patch adds new line between the symbol expansions. Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:09 +02:00
Nagarathnam Muthusamy	4aac8676b8	Adding asm-prototypes.h for genksyms to generate crc [ Upstream commit `bdca8cc096` ] This patch adds the prototypes of assembly defined functions to asm-prototypes.h. Some prototypes are directly added as they are not present in any existing header files. Signed-off-by: Nagarathnam Muthusamy <nagarathnam.muthusamy@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
Bert Kenward	065d0643c0	sfc: don't read beyond unicast address list [ Upstream commit `c70d68150f` ] If we have more than 32 unicast MAC addresses assigned to an interface we will read beyond the end of the address table in the driver when adding filters. The next 256 entries store multicast addresses, so we will end up attempting to insert duplicate filters, which is mostly harmless. If we add more than 288 unicast addresses we will then read past the multicast address table, which is likely to be more exciting. Fixes: `12fb0da45c` ("sfc: clean fallbacks between promisc/normal in efx_ef10_filter_sync_rx_mode") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
Arend van Spriel	0dc4be778d	brcmfmac: fix possible buffer overflow in brcmf_cfg80211_mgmt_tx() [ Upstream commit `8f44c9a413` ] The lower level nl80211 code in cfg80211 ensures that "len" is between 25 and NL80211_ATTR_FRAME (2304). We subtract DOT11_MGMT_HDR_LEN (24) from "len" so thats's max of 2280. However, the action_frame->data[] buffer is only BRCMF_FIL_ACTION_FRAME_SIZE (1800) bytes long so this memcpy() can overflow. memcpy(action_frame->data, &buf[DOT11_MGMT_HDR_LEN], le16_to_cpu(action_frame->len)); Cc: stable@vger.kernel.org # 3.9.x Fixes: `18e2f61db3` ("brcmfmac: P2P action frame tx.") Reported-by: "freenerguo(郭大兴)" <freenerguo@tencent.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
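The length arithmetic in the brcmfmac entry above can be checked with a small standalone C sketch. The constants mirror the values quoted in the commit message; the macro names and `frame_payload_fits()` are hypothetical, chosen for illustration, and are not the driver's actual fix.

```c
#include <stdbool.h>
#include <stddef.h>

/* Values quoted in the commit message above (names are illustrative). */
#define MAX_MGMT_FRAME_LEN 2304 /* upper bound on "len" enforced by nl80211 */
#define MGMT_HDR_LEN         24 /* DOT11_MGMT_HDR_LEN */
#define ACTION_FRAME_SIZE  1800 /* BRCMF_FIL_ACTION_FRAME_SIZE */

/* Hypothetical helper: true when the payload that remains after stripping
 * the management header fits in the fixed-size action frame buffer. */
static bool frame_payload_fits(size_t len)
{
    if (len <= MGMT_HDR_LEN)
        return false; /* no payload at all */
    return len - MGMT_HDR_LEN <= ACTION_FRAME_SIZE;
}
```

With these numbers, the largest permitted frame leaves 2304 - 24 = 2280 payload bytes, which exceeds the 1800-byte buffer, so a check of this kind has to run before the memcpy().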
Eduardo Valentin	a470f5fb25	bridge: mdb: fix leak on complete_info ptr on fail path [ Upstream commit `1bfb159673` ] We currently get the following kmemleak report: unreferenced object 0xffff8800039d9820 (size 32): comm "softirq", pid 0, jiffies 4295212383 (age 792.416s) hex dump (first 32 bytes): 00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00 ................ 00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff ................ backtrace: [<ffffffff8152b4aa>] kmemleak_alloc+0x4a/0xa0 [<ffffffff811d8ec8>] kmem_cache_alloc_trace+0xb8/0x1c0 [<ffffffffa0389683>] __br_mdb_notify+0x2a3/0x300 [bridge] [<ffffffffa038a0ce>] br_mdb_notify+0x6e/0x70 [bridge] [<ffffffffa0386479>] br_multicast_add_group+0x109/0x150 [bridge] [<ffffffffa0386518>] br_ip6_multicast_add_group+0x58/0x60 [bridge] [<ffffffffa0387fb5>] br_multicast_rcv+0x1d5/0xdb0 [bridge] [<ffffffffa037d7cf>] br_handle_frame_finish+0xcf/0x510 [bridge] [<ffffffffa03a236b>] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter] [<ffffffffa03a3738>] br_nf_hook_thresh+0x48/0xb0 [br_netfilter] [<ffffffffa03a3fb9>] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 [br_netfilter] [<ffffffffa03a4400>] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter] [<ffffffffa03a3c27>] br_nf_pre_routing+0x197/0x3d0 [br_netfilter] [<ffffffff814a2952>] nf_iterate+0x52/0x60 [<ffffffff814a29bc>] nf_hook_slow+0x5c/0xb0 [<ffffffffa037ddf4>] br_handle_frame+0x1a4/0x2c0 [bridge] This happens when switchdev_port_obj_add() fails. This patch frees complete_info object in the fail path. Reviewed-by: Vallish Vaidyeshwara <vallish@amazon.com> Signed-off-by: Eduardo Valentin <eduval@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
WANG Cong	c5363b1aa8	tap: convert a mutex to a spinlock [ Upstream commit `ffa423fb32` ] We are not allowed to block on the RCU reader side, so can't just hold the mutex as before. As a quick fix, convert it to a spinlock. Fixes: `d9f1f61c08` ("tap: Extending tap device create/destroy APIs") Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Sainath Grandhi <sainath.grandhi@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
Guilherme G. Piccoli	5af29e937e	cxgb4: fix BUG() on interrupt deallocating path of ULD [ Upstream commit `6a146f3a58` ] Since the introduction of ULD (Upper-Layer Drivers), the MSI-X deallocating path changed in cxgb4: the driver frees the interrupts of ULD when unregistering it or on shutdown PCI handler. Problem is that if a MSI-X is not freed before deallocated in the PCI layer, it will trigger a BUG() due to still "alive" interrupt being tentatively quiesced. The below trace was observed when doing a simple unbind of Chelsio's adapter PCI function, like: "echo 001e:80:00.4 > /sys/bus/pci/drivers/cxgb4/unbind" Trace: kernel BUG at drivers/pci/msi.c:352! Oops: Exception in kernel mode, sig: 5 [#1] ... NIP [c0000000005a5e60] free_msi_irqs+0xa0/0x250 LR [c0000000005a5e50] free_msi_irqs+0x90/0x250 Call Trace: [c0000000005a5e50] free_msi_irqs+0x90/0x250 (unreliable) [c0000000005a72c4] pci_disable_msix+0x124/0x180 [d000000011e06708] disable_msi+0x88/0xb0 [cxgb4] [d000000011e06948] free_some_resources+0xa8/0x160 [cxgb4] [d000000011e06d60] remove_one+0x170/0x3c0 [cxgb4] [c00000000058a910] pci_device_remove+0x70/0x110 [c00000000064ef04] device_release_driver_internal+0x1f4/0x2c0 ... This patch fixes the issue by refactoring the shutdown path of ULD on cxgb4 driver, by properly freeing and disabling interrupts on PCI remove handler too. Fixes: `0fbc81b3ad` ("Allocate resources dynamically for all cxgb4 ULD's") Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
Huy Nguyen	9192b9c71e	net/mlx5e: Initialize CEE's getpermhwaddr address buffer to 0xff [ Upstream commit `d968f0f2e4` ] Latest change in open-lldp code uses bytes 6-11 of perm_addr buffer as the Ethernet source address for the host TLV packet. Since our driver does not fill these bytes, they stay at zero and the open-lldp code ends up sending the TLV packet with zero source address and the switch drops this packet. The fix is to initialize these bytes to 0xff. The open-lldp code considers 0xff:ff:ff:ff:ff:ff as the invalid address and falls back to use the host's mac address as the Ethernet source address. Fixes: `3a6a931dfb` ("net/mlx5e: Support DCBX CEE API") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
Sowmini Varadhan	1567144c57	rds: tcp: use sock_create_lite() to create the accept socket [ Upstream commit `0933a578cd` ] There are two problems with calling sock_create_kern() from rds_tcp_accept_one() 1. it sets up a new_sock->sk that is wasteful, because this ->sk is going to get replaced by inet_accept() in the subsequent ->accept() 2. The new_sock->sk is a leaked reference in sock_graft() which expects to find a null parent->sk Avoid these problems by calling sock_create_lite(). Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
Nikolay Aleksandrov	372cc0d29e	vrf: fix bug_on triggered by rx when destroying a vrf [ Upstream commit `f630c38ef0` ] When destroying a VRF device we cleanup the slaves in its ndo_uninit() function, but that causes packets to be switched (skb->dev == vrf being destroyed) even though we're pass the point where the VRF should be receiving any packets while it is being dismantled. This causes a BUG_ON to trigger if we have raw sockets (trace below). The reason is that the inetdev of the VRF has been destroyed but we're still sending packets up the stack with it, so let's free the slaves in the dellink callback as David Ahern suggested. Note that this fix doesn't prevent packets from going up when the VRF device is admin down. [ 35.631371] ------------[ cut here ]------------ [ 35.631603] kernel BUG at net/ipv4/fib_frontend.c:285! [ 35.631854] invalid opcode: 0000 [#1] SMP [ 35.631977] Modules linked in: [ 35.632081] CPU: 2 PID: 22 Comm: ksoftirqd/2 Not tainted 4.12.0-rc7+ #45 [ 35.632247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014 [ 35.632477] task: ffff88005ad68000 task.stack: ffff88005ad64000 [ 35.632632] RIP: 0010:fib_compute_spec_dst+0xfc/0x1ee [ 35.632769] RSP: 0018:ffff88005ad67978 EFLAGS: 00010202 [ 35.632910] RAX: 0000000000000001 RBX: ffff880059a7f200 RCX: 0000000000000000 [ 35.633084] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff82274af0 [ 35.633256] RBP: ffff88005ad679f8 R08: 000000000001ef70 R09: 0000000000000046 [ 35.633430] R10: ffff88005ad679f8 R11: ffff880037731cb0 R12: 0000000000000001 [ 35.633603] R13: ffff8800599e3000 R14: 0000000000000000 R15: ffff8800599cb852 [ 35.634114] FS: 0000000000000000(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000 [ 35.634306] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.634456] CR2: 00007f3563227095 CR3: 000000000201d000 CR4: 00000000000406e0 [ 35.634632] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 35.634865] 
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 35.635055] Call Trace: [ 35.635271] ? __lock_acquire+0xf0d/0x1117 [ 35.635522] ipv4_pktinfo_prepare+0x82/0x151 [ 35.635831] raw_rcv_skb+0x17/0x3c [ 35.636062] raw_rcv+0xe5/0xf7 [ 35.636287] raw_local_deliver+0x169/0x1d9 [ 35.636534] ip_local_deliver_finish+0x87/0x1c4 [ 35.636820] ip_local_deliver+0x63/0x7f [ 35.637058] ip_rcv_finish+0x340/0x3a1 [ 35.637295] ip_rcv+0x314/0x34a [ 35.637525] __netif_receive_skb_core+0x49f/0x7c5 [ 35.637780] ? lock_acquire+0x13f/0x1d7 [ 35.638018] ? lock_acquire+0x15e/0x1d7 [ 35.638259] __netif_receive_skb+0x1e/0x94 [ 35.638502] ? __netif_receive_skb+0x1e/0x94 [ 35.638748] netif_receive_skb_internal+0x74/0x300 [ 35.639002] ? dev_gro_receive+0x2ed/0x411 [ 35.639246] ? lock_is_held_type+0xc4/0xd2 [ 35.639491] napi_gro_receive+0x105/0x1a0 [ 35.639736] receive_buf+0xc32/0xc74 [ 35.639965] ? detach_buf+0x67/0x153 [ 35.640201] ? virtqueue_get_buf_ctx+0x120/0x176 [ 35.640453] virtnet_poll+0x128/0x1c5 [ 35.640690] net_rx_action+0x103/0x343 [ 35.640932] __do_softirq+0x1c7/0x4b7 [ 35.641171] run_ksoftirqd+0x23/0x5c [ 35.641403] smpboot_thread_fn+0x24f/0x26d [ 35.641646] ? sort_range+0x22/0x22 [ 35.641878] kthread+0x129/0x131 [ 35.642104] ? __list_add+0x31/0x31 [ 35.642335] ? __list_add+0x31/0x31 [ 35.642568] ret_from_fork+0x2a/0x40 [ 35.642804] Code: 05 bd 87 a3 00 01 e8 1f ef 98 ff 4d 85 f6 48 c7 c7 f0 4a 27 82 41 0f 94 c4 31 c9 31 d2 41 0f b6 f4 e8 04 71 a1 ff 45 84 e4 74 02 <0f> 0b 0f b7 93 c4 00 00 00 4d 8b a5 80 05 00 00 48 03 93 d0 00 [ 35.644342] RIP: fib_compute_spec_dst+0xfc/0x1ee RSP: ffff88005ad67978 Fixes: `193125dbd8` ("net: Introduce VRF device driver") Reported-by: Chris Cormier <chriscormier@cumulusnetworks.com> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:08 +02:00
David Ahern	04e0f78f87	net: ipv6: Compare lwstate in detecting duplicate nexthops [ Upstream commit `f06b7549b7` ] Lennert reported a failure to add different mpls encaps in a multipath route: $ ip -6 route add 1234::/16 \ nexthop encap mpls 10 via fe80::1 dev ens3 \ nexthop encap mpls 20 via fe80::1 dev ens3 RTNETLINK answers: File exists The problem is that the duplicate nexthop detection does not compare lwtunnel configuration. Add it. Fixes: `19e42e4515` ("ipv6: support for fib route lwtunnel encap attributes") Signed-off-by: David Ahern <dsahern@gmail.com> Reported-by: João Taveira Araújo <joao.taveira@gmail.com> Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Tested-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Derek Chickles	5269faa455	liquidio: fix bug in soft reset failure detection [ Upstream commit `05a6b4cae8` ] The code that detects a failed soft reset of Octeon is comparing the wrong value against the reset value of the Octeon SLI_SCRATCH_1 register, resulting in an inability to detect a soft reset failure. Fix it by using the correct value in the comparison, which is any non-zero value. Fixes: `f21fb3ed36` ("Add support of Cavium Liquidio ethernet adapters") Fixes: `c0eab5b358` ("liquidio: CN23XX firmware download") Signed-off-by: Derek Chickles <derek.chickles@cavium.com> Signed-off-by: Satanand Burla <satananda.burla@cavium.com> Signed-off-by: Raghu Vatsavayi <raghu.vatsavayi@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Alban Browaeys	8a2f02b890	net: core: Fix slab-out-of-bounds in netdev_stats_to_stats64 [ Upstream commit `9af9959e14` ] commit `9256645af0` ("net/core: relax BUILD_BUG_ON in netdev_stats_to_stats64") made an attempt to read beyond the size of the source a possibility. Fix to only copy src size to dest. As dest might be bigger than src. ================================================================== BUG: KASAN: slab-out-of-bounds in netdev_stats_to_stats64+0xe/0x30 at addr ffff8801be248b20 Read of size 192 by task VBoxNetAdpCtl/6734 CPU: 1 PID: 6734 Comm: VBoxNetAdpCtl Tainted: G O 4.11.4prahal+intel+ #118 Hardware name: LENOVO 20CDCTO1WW/20CDCTO1WW, BIOS GQET52WW (1.32 ) 05/04/2017 Call Trace: dump_stack+0x63/0x86 kasan_object_err+0x1c/0x70 kasan_report+0x270/0x520 ? netdev_stats_to_stats64+0xe/0x30 ? sched_clock_cpu+0x1b/0x190 ? __module_address+0x3e/0x3b0 ? unwind_next_frame+0x1ea/0xb00 check_memory_region+0x13c/0x1a0 memcpy+0x23/0x50 netdev_stats_to_stats64+0xe/0x30 dev_get_stats+0x1b9/0x230 rtnl_fill_stats+0x44/0xc00 ? nla_put+0xc6/0x130 rtnl_fill_ifinfo+0xe9e/0x3700 ? rtnl_fill_vfinfo+0xde0/0xde0 ? sched_clock+0x9/0x10 ? sched_clock+0x9/0x10 ? sched_clock_local+0x120/0x130 ? __module_address+0x3e/0x3b0 ? unwind_next_frame+0x1ea/0xb00 ? sched_clock+0x9/0x10 ? sched_clock+0x9/0x10 ? sched_clock_cpu+0x1b/0x190 ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] ? depot_save_stack+0x1d8/0x4a0 ? depot_save_stack+0x34f/0x4a0 ? depot_save_stack+0x34f/0x4a0 ? save_stack+0xb1/0xd0 ? save_stack_trace+0x16/0x20 ? save_stack+0x46/0xd0 ? kasan_slab_alloc+0x12/0x20 ? __kmalloc_node_track_caller+0x10d/0x350 ? __kmalloc_reserve.isra.36+0x2c/0xc0 ? __alloc_skb+0xd0/0x560 ? rtmsg_ifinfo_build_skb+0x61/0x120 ? rtmsg_ifinfo.part.25+0x16/0xb0 ? rtmsg_ifinfo+0x47/0x70 ? register_netdev+0x15/0x30 ? vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp] ? vboxNetAdpCreate+0x210/0x400 [vboxnetadp] ? VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] ? do_vfs_ioctl+0x17f/0xff0 ? 
SyS_ioctl+0x74/0x80 ? do_syscall_64+0x182/0x390 ? __alloc_skb+0xd0/0x560 ? __alloc_skb+0xd0/0x560 ? save_stack_trace+0x16/0x20 ? init_object+0x64/0xa0 ? ___slab_alloc+0x1ae/0x5c0 ? ___slab_alloc+0x1ae/0x5c0 ? __alloc_skb+0xd0/0x560 ? sched_clock+0x9/0x10 ? kasan_unpoison_shadow+0x35/0x50 ? kasan_kmalloc+0xad/0xe0 ? __kmalloc_node_track_caller+0x246/0x350 ? __alloc_skb+0xd0/0x560 ? kasan_unpoison_shadow+0x35/0x50 ? memset+0x31/0x40 ? __alloc_skb+0x31f/0x560 ? napi_consume_skb+0x320/0x320 ? br_get_link_af_size_filtered+0xb7/0x120 [bridge] ? if_nlmsg_size+0x440/0x630 rtmsg_ifinfo_build_skb+0x83/0x120 rtmsg_ifinfo.part.25+0x16/0xb0 rtmsg_ifinfo+0x47/0x70 register_netdevice+0xa2b/0xe50 ? __kmalloc+0x171/0x2d0 ? netdev_change_features+0x80/0x80 register_netdev+0x15/0x30 vboxNetAdpOsCreate+0xc0/0x1c0 [vboxnetadp] vboxNetAdpCreate+0x210/0x400 [vboxnetadp] ? vboxNetAdpComposeMACAddress+0x1d0/0x1d0 [vboxnetadp] ? kasan_check_write+0x14/0x20 VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] ? VBoxNetAdpLinuxOpen+0x20/0x20 [vboxnetadp] ? lock_acquire+0x11c/0x270 ? __audit_syscall_entry+0x2fb/0x660 do_vfs_ioctl+0x17f/0xff0 ? __audit_syscall_entry+0x2fb/0x660 ? ioctl_preallocate+0x1d0/0x1d0 ? __audit_syscall_entry+0x2fb/0x660 ? kmem_cache_free+0xb2/0x250 ? syscall_trace_enter+0x537/0xd00 ? exit_to_usermode_loop+0x100/0x100 SyS_ioctl+0x74/0x80 ? do_sys_open+0x350/0x350 ? 
do_vfs_ioctl+0xff0/0xff0 do_syscall_64+0x182/0x390 entry_SYSCALL64_slow_path+0x25/0x25 RIP: 0033:0x7f7e39a1ae07 RSP: 002b:00007ffc6f04c6d8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 00007ffc6f04c730 RCX: 00007f7e39a1ae07 RDX: 00007ffc6f04c730 RSI: 00000000c0207601 RDI: 0000000000000007 RBP: 00007ffc6f04c700 R08: 00007ffc6f04c780 R09: 0000000000000008 R10: 0000000000000541 R11: 0000000000000206 R12: 0000000000000007 R13: 00000000c0207601 R14: 00007ffc6f04c730 R15: 0000000000000012 Object at ffff8801be248008, in cache kmalloc-4096 size: 4096 Allocated: PID = 6734 save_stack_trace+0x16/0x20 save_stack+0x46/0xd0 kasan_kmalloc+0xad/0xe0 __kmalloc+0x171/0x2d0 alloc_netdev_mqs+0x8a7/0xbe0 vboxNetAdpOsCreate+0x65/0x1c0 [vboxnetadp] vboxNetAdpCreate+0x210/0x400 [vboxnetadp] VBoxNetAdpLinuxIOCtlUnlocked+0x14b/0x280 [vboxnetadp] do_vfs_ioctl+0x17f/0xff0 SyS_ioctl+0x74/0x80 do_syscall_64+0x182/0x390 return_from_SYSCALL_64+0x0/0x6a Freed: PID = 5600 save_stack_trace+0x16/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x73/0xc0 kfree+0xe4/0x220 kvfree+0x25/0x30 single_release+0x74/0xb0 __fput+0x265/0x6b0 ____fput+0x9/0x10 task_work_run+0xd5/0x150 exit_to_usermode_loop+0xe2/0x100 do_syscall_64+0x26c/0x390 return_from_SYSCALL_64+0x0/0x6a Memory state around the buggy address: ffff8801be248a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff8801be248b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff8801be248b80: 00 00 00 00 00 00 00 00 00 00 00 07 fc fc fc fc ^ ffff8801be248c00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8801be248c80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ================================================================== Signed-off-by: Alban Browaeys <alban.browaeys@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Jiri Benc	c1daa73dac	geneve: fix hlist corruption [ Upstream commit `4b4c21fad6` ] It's not a good idea to add the same hlist_node to two different hash lists. This leads to various hard to debug memory corruptions. Fixes: `8ed66f0e82` ("geneve: implement support for IPv6-based tunnels") Cc: John W. Linville <linville@tuxdriver.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Jiri Benc	173ced803d	vxlan: fix hlist corruption [ Upstream commit `69e766612c` ] It's not a good idea to add the same hlist_node to two different hash lists. This leads to various hard to debug memory corruptions. Fixes: `b1be00a6c3` ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device") Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Sabrina Dubroca	3ca557c85a	ipv6: dad: don't remove dynamic addresses if link is down [ Upstream commit `ec8add2a4c` ] Currently, when the link for $DEV is down, this command succeeds but the address is removed immediately by DAD (1): ip addr add 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800 In the same situation, this will succeed and not remove the address (2): ip addr add 1111::12/64 dev $DEV ip addr change 1111::12/64 dev $DEV valid_lft 3600 preferred_lft 1800 The comment in addrconf_dad_begin() when !IF_READY makes it look like this is the intended behavior, but doesn't explain why: * If the device is not ready: * - keep it tentative if it is a permanent address. * - otherwise, kill it. We clearly cannot prevent userspace from doing (2), but we can make (1) work consistently with (2). addrconf_dad_stop() is only called in two cases: if DAD failed, or to skip DAD when the link is down. In that second case, the fix is to avoid deleting the address, like we already do for permanent addresses. Fixes: `3c21edbd11` ("[IPV6]: Defer IPv6 device initialization until the link becomes ready.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Gal Pressman	febc276df9	net/mlx5e: Fix TX carrier errors report in get stats ndo [ Upstream commit `8ff93de766` ] Symbol error during carrier counter from PPCNT was mistakenly reported as TX carrier errors in get_stats ndo, although it's an RX counter. Fixes: `269e6b3af3` ("net/mlx5e: Report additional error statistics in get stats ndo") Signed-off-by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:07 +02:00
Mohamad Haj Yahia	fa8b93e1dc	net/mlx5: Cancel delayed recovery work when unloading the driver [ Upstream commit `2a0165a034` ] Draining the health workqueue will ignore future health works including the one that report hardware failure and thus we can't enter error state Instead cancel the recovery flow and make sure only recovery flow won't be scheduled. Fixes: `5e44fca504` ('net/mlx5: Only cancel recovery work when cleaning up device') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Michal Kubeček	0a3eafac6c	net: handle NAPI_GRO_FREE_STOLEN_HEAD case also in napi_frags_finish() [ Upstream commit `e44699d2c2` ] Recently I started seeing warnings about pages with refcount -1. The problem was traced to packets being reused after their head was merged into a GRO packet by skb_gro_receive(). While bisecting the issue pointed to commit `c21b48cc1b` ("net: adjust skb->truesize in ___pskb_trim()") and I have never seen it on a kernel with it reverted, I believe the real problem appeared earlier when the option to merge head frag in GRO was implemented. Handling NAPI_GRO_FREE_STOLEN_HEAD state was only added to GRO_MERGED_FREE branch of napi_skb_finish() so that if the driver uses napi_gro_frags() and head is merged (which in my case happens after the skb_condense() call added by the commit mentioned above), the skb is reused including the head that has been merged. As a result, we release the page reference twice and eventually end up with negative page refcount. To fix the problem, handle NAPI_GRO_FREE_STOLEN_HEAD in napi_frags_finish() the same way it's done in napi_skb_finish(). Fixes: `d7e8883cfc` ("net: make GRO aware of skb->head_frag") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Daniel Borkmann	87a82d2505	bpf: prevent leaking pointer via xadd on unpriviledged [ Upstream commit `6bdf6abc56` ] Leaking kernel addresses on unpriviledged is generally disallowed, for example, verifier rejects the following: 0: (b7) r0 = 0 1: (18) r2 = 0xffff897e82304400 3: (7b) *(u64 *)(r1 +48) = r2 R2 leaks addr into ctx Doing pointer arithmetic on them is also forbidden, so that they don't turn into unknown value and then get leaked out. However, there's xadd as a special case, where we don't check the src reg for being a pointer register, e.g. the following will pass: 0: (b7) r0 = 0 1: (7b) *(u64 *)(r1 +48) = r0 2: (18) r2 = 0xffff897e82304400 ; map 4: (db) lock *(u64 *)(r1 +48) += r2 5: (95) exit We could store the pointer into skb->cb, loose the type context, and then read it out from there again to leak it eventually out of a map value. Or more easily in a different variant, too: 0: (bf) r6 = r1 1: (7a) *(u64 *)(r10 -8) = 0 2: (bf) r2 = r10 3: (07) r2 += -8 4: (18) r1 = 0x0 6: (85) call bpf_map_lookup_elem#1 7: (15) if r0 == 0x0 goto pc+3 R0=map_value(ks=8,vs=8,id=0),min_value=0,max_value=0 R6=ctx R10=fp 8: (b7) r3 = 0 9: (7b) *(u64 *)(r0 +0) = r3 10: (db) lock *(u64 *)(r0 +0) += r6 11: (b7) r0 = 0 12: (95) exit from 7 to 11: R0=inv,min_value=0,max_value=0 R6=ctx R10=fp 11: (b7) r0 = 0 12: (95) exit Prevent this by checking xadd src reg for pointer types. Also add a couple of test cases related to this. Fixes: `1be7f75d16` ("bpf: enable non-root eBPF programs") Fixes: `17a5267067` ("bpf: verifier (add verifier core)") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Dan Carpenter	dedf7348e8	rocker: move dereference before free [ Upstream commit `acb4b7df48` ] My static checker complains that ofdpa_neigh_del() can sometimes free "found". It just makes sense to use it first before deleting it. Fixes: `ecf244f753` ("rocker: fix maybe-uninitialized warning") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Ido Schimmel	7841af1a3a	mlxsw: spectrum_router: Fix NULL pointer dereference [ Upstream commit `6b27c8adf2` ] In case a VLAN device is enslaved to a bridge we shouldn't create a router interface (RIF) for it when it's configured with an IP address. This is already handled by the driver for other types of netdevs, such as physical ports and LAG devices. If this IP address is then removed and the interface is subsequently unlinked from the bridge, a NULL pointer dereference can happen, as the original 802.1d FID was replaced with an rFID which was then deleted. To reproduce: $ ip link set dev enp3s0np9 up $ ip link add name enp3s0np9.111 link enp3s0np9 type vlan id 111 $ ip link set dev enp3s0np9.111 up $ ip link add name br0 type bridge $ ip link set dev br0 up $ ip link set enp3s0np9.111 master br0 $ ip address add dev enp3s0np9.111 192.168.0.1/24 $ ip address del dev enp3s0np9.111 192.168.0.1/24 $ ip link set dev enp3s0np9.111 nomaster Fixes: `99724c18fc` ("mlxsw: spectrum: Introduce support for router interfaces") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Petr Machata <petrm@mellanox.com> Tested-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Gao Feng	3f4715e60e	net: sched: Fix one possible panic when no destroy callback [ Upstream commit `c1a4872ebf` ] When a qdisc fails to init, qdisc_create invokes the destroy callback to clean up, but it does not check whether the callback actually exists. This causes a panic for qdiscs that define no destroy callback, such as codel and fq. Take codel as an example: when a malicious user constructs an invalid netlink msg, codel_init->codel_change->nla_parse_nested fails. The kernel then invokes the destroy callback directly, but qdisc codel doesn't define one, which causes a panic. Add a check for destroy to avoid the possible panic. Fixes: `87b60cfacf` ("net_sched: fix error recovery at qdisc creation") Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Jason Wang	378345d61f	virtio-net: serialize tx routine during reset [ Upstream commit `713a98d90c` ] We don't hold any tx lock when trying to disable TX during reset, which can lead to a use-after-free since ndo_start_xmit() tries to access the virtqueue which has already been freed. Fix this by using netif_tx_disable() before freeing the vqs, which makes sure there is no tx after the vqs are freed. Reported-by: Jean-Philippe Menil <jpmenil@gmail.com> Tested-by: Jean-Philippe Menil <jpmenil@gmail.com> Fixes commit `f600b69050` ("virtio_net: Add XDP support") Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Robert McCabe <robert.mccabe@rockwellcollins.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:06 +02:00
Eric Dumazet	28b35b3ffb	net: prevent sign extension in dev_get_stats() [ Upstream commit `6f64ec7451` ] Similar to the fix provided by Dominik Heidler in commit `9b3dc0a17d` ("l2tp: cast l2tp traffic counter to unsigned") we need to take care of 32bit kernels in dev_get_stats(). When using atomic_long_read(), we add a 'long' to u64 and might misinterpret high order bit, unless we cast to unsigned. Fixes: `caf586e5f2` ("net: add a core netdev->rx_dropped counter") Fixes: `015f0688f5` ("net: net: add a core netdev->tx_dropped counter") Fixes: `6e7333d315` ("net: add rx_nohandler stat counter") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
WANG Cong	85c1186f5d	tcp: reset sk_rx_dst in tcp_disconnect() [ Upstream commit `d747a7a51b` ] We have to reset the sk->sk_rx_dst when we disconnect a TCP connection, because otherwise when we re-connect it this dst reference is simply overridden in tcp_finish_connect(). This fixes a dst leak which leads to a loopback dev refcnt leak. It is a long-standing bug, Kevin reported a very similar (if not same) bug before. Thanks to Andrei for providing such a reliable reproducer which greatly narrows down the problem. Fixes: `41063e9dd1` ("ipv4: Early TCP socket demux.") Reported-by: Andrei Vagin <avagin@gmail.com> Reported-by: Kevin Xu <kaiwen.xu@hulu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
Richard Cochran	a700bd8dec	net: dp83640: Avoid NULL pointer dereference. [ Upstream commit `db9d8b29d1` ] The function, skb_complete_tx_timestamp(), used to allow passing in a NULL pointer for the time stamps, but that was changed in commit `62bccb8cdb` ("net-timestamp: Make the clone operation stand-alone from phy timestamping"), and the existing call sites, all of which are in the dp83640 driver, were fixed up. Even though the kernel-doc was subsequently updated in commit `7a76a021cd` ("net-timestamp: Update skb_complete_tx_timestamp comment"), still a bug fix from Manfred Rudigier came into the driver using the old semantics. Probably Manfred derived that patch from an older kernel version. This fix should be applied to the stable trees as well. Fixes: `81e8f2e930` ("net: dp83640: Fix tx timestamp overflow handling.") Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
Michal Kubeček	8a0cd5660d	net: account for current skb length when deciding about UFO [ Upstream commit `a5cb659bbc` ] Our customer encountered stuck NFS writes for blocks starting at specific offsets w.r.t. page boundary caused by networking stack sending packets via UFO enabled device with wrong checksum. The problem can be reproduced by composing a long UDP datagram from multiple parts using MSG_MORE flag: sendto(sd, buff, 1000, MSG_MORE, ...); sendto(sd, buff, 1000, MSG_MORE, ...); sendto(sd, buff, 3000, 0, ...); Assume this packet is to be routed via a device with MTU 1500 and NETIF_F_UFO enabled. When second sendto() gets into __ip_append_data(), this condition is tested (among others) to decide whether to call ip_ufo_append_data(): ((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb)) At the moment, we already have skb with 1028 bytes of data which is not marked for GSO so that the test is false (fragheaderlen is usually 20). Thus we append second 1000 bytes to this skb without invoking UFO. Third sendto(), however, has sufficient length to trigger the UFO path so that we end up with non-UFO skb followed by a UFO one. Later on, udp_send_skb() uses udp_csum() to calculate the checksum but that assumes all fragments have correct checksum in skb->csum which is not true for UFO fragments. When checking against MTU, we need to add skb->len to length of new segment if we already have a partially filled skb and fragheaderlen only if there isn't one. In the IPv6 case, skb can only be null if this is the first segment so that we have to use headersize (length of the first IPv6 header) rather than fragheaderlen (length of IPv6 header of further fragments) for skb == NULL.
Fixes: `e89e9cf539` ("[IPv4/IPv6]: UFO Scatter-gather approach") Fixes: `e4c5e13aa4` ("ipv6: Should use consistent conditional judgement for ip6 fragment between __ip6_append_data and ip6_finish_output") Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
Martin Habets	e2894b8778	sfc: Fix MCDI command size for filter operations [ Upstream commit `bb53f4d4f5` ] The 8000 series adapters use catch-all filters for encapsulated traffic to support filtering VXLAN, NVGRE and GENEVE traffic. This new filter functionality requires a longer MCDI command. This patch increases the size of buffers on stack that were missed, which fixes a kernel panic from the stack protector. Fixes: `9b41080125` ("sfc: insert catch-all filters for encapsulated traffic") Signed-off-by: Martin Habets <mhabets@solarflare.com> Acked-by: Edward Cree <ecree@solarflare.com> Acked-by: Bert Kenward bkenward@solarflare.com Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
Arnd Bergmann	527bad77dd	netvsc: don't access netdev->num_rx_queues directly [ Upstream commit `b92b7d3312` ] This structure member is hidden behind CONFIG_SYSFS, and we get a build error when that is disabled: drivers/net/hyperv/netvsc_drv.c: In function 'netvsc_set_channels': drivers/net/hyperv/netvsc_drv.c:754:49: error: 'struct net_device' has no member named 'num_rx_queues'; did you mean 'num_tx_queues'? drivers/net/hyperv/netvsc_drv.c: In function 'netvsc_set_rxfh': drivers/net/hyperv/netvsc_drv.c:1181:25: error: 'struct net_device' has no member named 'num_rx_queues'; did you mean 'num_tx_queues'? As the value is only set once to the argument of alloc_netdev_mq(), we can compare against that constant directly. Fixes: `ff4a441990` ("netvsc: allow get/set of RSS indirection table") Fixes: `2b01888d1b` ("netvsc: allow more flexible setting of number of channels") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
WANG Cong	72056dab81	ipv6: avoid unregistering inet6_dev for loopback [ Upstream commit `60abc0be96` ] The per netns loopback_dev->ip6_ptr is unregistered and set to NULL when its mtu is set to smaller than IPV6_MIN_MTU. As a result, we could set rt->rt6i_idev to NULL after a rt6_uncached_list_flush_dev() and then crash after another call. In this case we should just bring its inet6_dev down, rather than unregistering it, at least prior to commit `176c39af29` ("netns: fix addrconf_ifdown kernel panic") we always override the case for loopback. Thanks a lot to Andrey for finding a reliable reproducer. Fixes: `176c39af29` ("netns: fix addrconf_ifdown kernel panic") Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Daniel Lezcano <dlezcano@fr.ibm.com> Cc: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:05 +02:00
Zach Brown	5dc86c8d78	net/phy: micrel: configure intterupts after autoneg workaround [ Upstream commit `b866203d87` ] The commit ("net/phy: micrel: Add workaround for bad autoneg") fixes an autoneg failure case by resetting the hardware. This turns off interrupts. Things will work themselves out if the phy polls, as it will figure out its state during a poll. However, if the phy uses only interrupts, the phy will stall, since interrupts are off. This patch fixes the issue by calling config_intr after resetting the phy. Fixes: `d2fd719bcb` ("net/phy: micrel: Add workaround for bad autoneg ") Signed-off-by: Zach Brown <zach.brown@ni.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-21 07:00:04 +02:00
Greg Kroah-Hartman	50a05be0a0	Linux 4.11.11	2017-07-15 13:04:51 +02:00
Mikulas Patocka	f6d5ffafb0	x86/mm/pat: Don't report PAT on CPUs that don't support it commit `99c13b8c88` upstream. The pat_enabled() logic is broken on CPUs which do not support PAT and where the initialization code fails to call pat_init(). Due to that the enabled flag stays true and pat_enabled() returns true wrongfully. As a consequence the mappings, e.g. for Xorg, are set up with the wrong caching mode and the required MTRR setups are omitted. To cure this the following changes are required: 1) Make pat_enabled() return true only if PAT initialization was invoked and successful. 2) Invoke init_cache_modes() unconditionally in setup_arch() and remove the extra callsites in pat_disable() and the pat disabled code path in pat_init(). Also rename __pat_enabled to pat_disabled to reflect the real purpose of this variable. Fixes: `9cd25aac1f` ("x86/mm/pat: Emulate PAT when it is disabled") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bernhard Held <berny156@gmx.de> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: "Luis R. Rodriguez" <mcgrof@suse.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/alpine.LRH.2.02.1707041749300.3456@file01.intranet.prod.int.rdu2.redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Chao Yu	2f7921d8de	ext4: check return value of kstrtoull correctly in reserved_clusters_store commit `1ea1516fbb` upstream. kstrtoull returns 0 on success. However, reserved_clusters_store returns -EINVAL if kstrtoull returns 0, which makes us fail to update the reserved_clusters value through sysfs. Fixes: `76d33bca55` Signed-off-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Miao Xie <miaoxie@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Jason A. Donenfeld	ab28946bbc	crypto: rsa-pkcs1pad - use constant time memory comparison for MACs commit `fec17cb223` upstream. Otherwise, we enable all sorts of forgeries via timing attack. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Suggested-by: Stephan Müller <smueller@chronox.de> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: linux-crypto@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Horia Geantă	34deb5387e	crypto: caam - fix gfp allocation flags (part I) commit `42cfcafb91` upstream. Changes in the SW cts (ciphertext stealing) code in commit `0605c41cc5` ("crypto: cts - Convert to skcipher") revealed a problem in the CAAM driver: when cts(cbc(aes)) is executed and cts runs in SW, cbc(aes) is offloaded in CAAM; cts encrypts the last block in atomic context and CAAM incorrectly decides to use GFP_KERNEL for memory allocation. Fix this by allowing GFP_KERNEL (sleeping) only when MAY_SLEEP flag is set, i.e. remove MAY_BACKLOG flag. We split the fix in two parts - first is sent to -stable, while the second is not (since there is no known failure case). Link: http://lkml.kernel.org/g/20170602122446.2427-1-david@sigma-star.at Reported-by: David Gstir <david@sigma-star.at> Signed-off-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Ian Abbott	eb844f27b7	staging: comedi: fix clean-up of comedi_class in comedi_init() commit `a9332e9ad0` upstream. There is a clean-up bug in the core comedi module initialization functions, `comedi_init()`. If the `comedi_num_legacy_minors` module parameter is non-zero (and valid), it creates that many "legacy" devices and registers them in SysFS. A failure causes the function to clean up and return an error. Unfortunately, it fails to destroy the "comedi" class that was created earlier. Fix it by adding a call to `class_destroy(comedi_class)` at the appropriate place in the clean-up sequence. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Malcolm Priestley	ed53d437d6	staging: vt6556: vnt_start Fix missing call to vnt_key_init_table. commit `dc32190f2c` upstream. The key table is not initialized correctly without this call. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Kirill Tkhai	3350230a54	locking/rwsem-spinlock: Fix EINTR branch in __down_write_common() commit `a0c4acd2c2` upstream. If a writer could have been woken up, the above branch if (sem->count == 0) break; would have moved us to taking the sem. So it's not the time to wake a writer now, and only readers are allowed. Thus, 0 must be passed to __rwsem_do_wake(). Next, __rwsem_do_wake() wakes readers unconditionally. But we mustn't do that if the sem is owned by a writer at the moment. Otherwise, a writer and a reader own the sem at the same time, which leads to memory corruption in callers. rwsem-xadd.c does not need that, as: 1) the similar check is made lockless there, 2) in __rwsem_mark_wake::try_reader_grant we test that the sem is not owned by a writer. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Niklas Cassel <niklas.cassel@axis.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `17fcbd590d` "locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y" Link: http://lkml.kernel.org/r/149762063282.19811.9129615532201147826.stgit@localhost.localdomain Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:41 +02:00
Eric W. Biederman	43fac435b2	proc: Fix proc_sys_prune_dcache to hold a sb reference commit `2fd1d2c4ce` upstream. Andrei Vagin writes: FYI: This bug has been reproduced on 4.11.7 > BUG: Dentry ffff895a3dd01240{i=4e7c09a,n=lo} still in use (1) [unmount of proc proc] > ------------[ cut here ]------------ > WARNING: CPU: 1 PID: 13588 at fs/dcache.c:1445 umount_check+0x6e/0x80 > CPU: 1 PID: 13588 Comm: kworker/1:1 Not tainted 4.11.7-200.fc25.x86_64 #1 > Hardware name: CompuLab sbc-flt1/fitlet, BIOS SBCFLT_0.08.04 06/27/2015 > Workqueue: events proc_cleanup_work > Call Trace: > dump_stack+0x63/0x86 > __warn+0xcb/0xf0 > warn_slowpath_null+0x1d/0x20 > umount_check+0x6e/0x80 > d_walk+0xc6/0x270 > ? dentry_free+0x80/0x80 > do_one_tree+0x26/0x40 > shrink_dcache_for_umount+0x2d/0x90 > generic_shutdown_super+0x1f/0xf0 > kill_anon_super+0x12/0x20 > proc_kill_sb+0x40/0x50 > deactivate_locked_super+0x43/0x70 > deactivate_super+0x5a/0x60 > cleanup_mnt+0x3f/0x90 > mntput_no_expire+0x13b/0x190 > kern_unmount+0x3e/0x50 > pid_ns_release_proc+0x15/0x20 > proc_cleanup_work+0x15/0x20 > process_one_work+0x197/0x450 > worker_thread+0x4e/0x4a0 > kthread+0x109/0x140 > ? process_one_work+0x450/0x450 > ? kthread_park+0x90/0x90 > ret_from_fork+0x2c/0x40 > ---[ end trace e1c109611e5d0b41 ]--- > VFS: Busy inodes after unmount of proc. Self-destruct in 5 seconds. Have a nice day... > BUG: unable to handle kernel NULL pointer dereference at (null) > IP: _raw_spin_lock+0xc/0x30 > PGD 0 Fix this by taking a reference to the super block in proc_sys_prune_dcache. The superblock reference is the core of the fix; however, the sysctl_inodes list is converted to a hlist so that hlist_del_init_rcu may be used. This allows proc_sys_prune_dcache to remove inodes from the sysctl_inodes list, while not causing problems for proc_sys_evict_inode if it later chooses to remove the inode from the sysctl_inodes list.
Removing inodes from the sysctl_inodes list allows proc_sys_prune_dcache to have a progress guarantee, while still being able to drop all locks. The fact that head->unregistering is set in start_unregistering ensures that no more inodes will be added to the sysctl_inodes list. Previously the code did a dance where it delayed calling iput until the next entry in the list was being considered to ensure the inode remained on the sysctl_inodes list until the next entry was walked to. The structure of the loop in this patch does not need that so is much easier to understand and maintain. Reported-by: Andrei Vagin <avagin@gmail.com> Tested-by: Andrei Vagin <avagin@openvz.org> Fixes: `ace0c791e6` ("proc/sysctl: Don't grab i_lock under sysctl_lock.") Fixes: `d6cffbbe9a` ("proc/sysctl: prune stale dentries during unregistering") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:40 +02:00
Cong Wang	c353aee3bc	mqueue: fix a use-after-free in sys_mq_notify() commit `f991af3daa` upstream. The retry logic for netlink_attachskb() inside sys_mq_notify() is nasty and vulnerable: 1) The sock refcnt is already released when retry is needed 2) The fd is controllable by user-space because we already release the file refcnt. So when we retry and the fd has just been closed by user-space during this small window, we end up calling netlink_detachskb() on the error path, which releases the sock again; later, when user-space closes this socket, a use-after-free could be triggered. Setting 'sock' to NULL here should be sufficient to fix it. Reported-by: GeneBlue <geneblue.mail@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-15 13:04:40 +02:00
Greg Kroah-Hartman	e3d9e2b278	Linux 4.11.10	2017-07-12 16:54:12 +02:00
Yifeng Li	a6ed89c62d	rt286: add Thinkpad Helix 2 to force_combo_jack_table commit `fe0dfd6358` upstream. Thinkpad Helix 2 is a tablet PC, the audio is powered by Core M broadwell-audio and rt286 codec. For all versions of the Linux kernel, the stereo output doesn't work properly when earphones are plugged in: the sound comes out of both channels even if the audio contains only the left or right channel. Furthermore, if music recorded in stereo is played, the two channels cancel each other out; as a result, no voice but only distorted background music can be heard, like a sound card with a built-in karaoke sound effect. Apparently this tablet uses a combo jack with polarity incorrectly set by the rt286 driver. This patch adds DMI information of Thinkpad Helix 2 to force_combo_jack_table[] and the issue is resolved. The microphone input doesn't work regardless of the presence of this patch and still needs help from other developers to investigate. This is my first patch to LKML directly, sorry for CC-ing too many people here. Link: https://bugzilla.kernel.org/show_bug.cgi?id=93841 Signed-off-by: Yifeng Li <tomli@tomli.me> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:58 +02:00
Stephan Mueller	4714754028	crypto: drbg - Fixes panic in wait_for_completion call commit `b61929c654` upstream. Initialise ctr_completion variable before use. Cc: <stable@vger.kernel.org> Signed-off-by: Harsh Jain <harshjain.prof@gmail.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:58 +02:00
Juergen Gross	f7995dfdfa	xen: avoid deadlock in xenbus driver commit `1a3fc2c402` upstream. There has been a report about a deadlock in the xenbus driver: [ 247.979498] ====================================================== [ 247.985688] WARNING: possible circular locking dependency detected [ 247.991882] 4.12.0-rc4-00022-gc4b25c0 #575 Not tainted [ 247.997040] ------------------------------------------------------ [ 248.003232] xenbus/91 is trying to acquire lock: [ 248.007875] (&u->msgbuffer_mutex){+.+.+.}, at: [<ffff00000863e904>] xenbus_dev_queue_reply+0x3c/0x230 [ 248.017163] [ 248.017163] but task is already holding lock: [ 248.023096] (xb_write_mutex){+.+...}, at: [<ffff00000863a940>] xenbus_thread+0x5f0/0x798 [ 248.031267] [ 248.031267] which lock already depends on the new lock. [ 248.031267] [ 248.039615] [ 248.039615] the existing dependency chain (in reverse order) is: [ 248.047176] [ 248.047176] -> #1 (xb_write_mutex){+.+...}: [ 248.052943] __lock_acquire+0x1728/0x1778 [ 248.057498] lock_acquire+0xc4/0x288 [ 248.061630] __mutex_lock+0x84/0x868 [ 248.065755] mutex_lock_nested+0x3c/0x50 [ 248.070227] xs_send+0x164/0x1f8 [ 248.074015] xenbus_dev_request_and_reply+0x6c/0x88 [ 248.079427] xenbus_file_write+0x260/0x420 [ 248.084073] __vfs_write+0x48/0x138 [ 248.088113] vfs_write+0xa8/0x1b8 [ 248.091983] SyS_write+0x54/0xb0 [ 248.095768] el0_svc_naked+0x24/0x28 [ 248.099897] [ 248.099897] -> #0 (&u->msgbuffer_mutex){+.+.+.}: [ 248.106088] print_circular_bug+0x80/0x2e0 [ 248.110730] __lock_acquire+0x1768/0x1778 [ 248.115288] lock_acquire+0xc4/0x288 [ 248.119417] __mutex_lock+0x84/0x868 [ 248.123545] mutex_lock_nested+0x3c/0x50 [ 248.128016] xenbus_dev_queue_reply+0x3c/0x230 [ 248.133005] xenbus_thread+0x788/0x798 [ 248.137306] kthread+0x110/0x140 [ 248.141087] ret_from_fork+0x10/0x40 It is rather easy to avoid by dropping xb_write_mutex before calling xenbus_dev_queue_reply(). 
Fixes: `fd8aa9095a` ("xen: optimize xenbus driver for multiple concurrent xenstore accesses"). Reported-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:58 +02:00
Paolo Abeni	060be7f75e	x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings commit `236222d393` upstream. According to the Intel datasheet, the REP MOVSB instruction exposes a pretty heavy setup cost (50 ticks), which hurts short string copy operations. This change tries to avoid this cost by calling the explicit loop available in the unrolled code for strings shorter than 64 bytes. The 64 bytes cutoff value is arbitrary from the code logic point of view - it has been selected based on measurements, as the largest value that still ensures a measurable gain. Micro benchmarks of the __copy_from_user() function with lengths in the [0-63] range show this performance gain (shorter the string, larger the gain): - in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4 - in the [72%-9%] range on Intel Core i7-4810MQ Other tested CPUs - namely Intel Atom S1260 and AMD Opteron 8216 - show no difference, because they do not expose the ERMS feature bit. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com [ Clarified the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>	2017-07-12 16:53:58 +02:00
Jarkko Sakkinen	062c8518b9	tpm: fix a kernel memory leak in tpm-sysfs.c commit `13b47cfcfc` upstream. While cleaning up sysfs callback that prints EK we discovered a kernel memory leak. This commit fixes the issue by zeroing the buffer used for TPM command/response. The leak happens when we use either tpm_vtpm_proxy, tpm_ibmvtpm or xen-tpmfront. Fixes: `0883743825` ("TPM: sysfs functions consolidation") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Tested-by: Stefan Berger <stefanb@linux.vnet.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:58 +02:00
Josh Zimmerman	f008ae10ad	tpm: Issue a TPM2_Shutdown for TPM2 devices. commit `d1bd4a792d` upstream. If a TPM2 loses power without a TPM2_Shutdown command being issued (a "disorderly reboot"), it may lose some state that has yet to be persisted to NVRam, and will increment the DA counter. After the DA counter gets sufficiently large, the TPM will lock the user out. NOTE: This only changes behavior on TPM2 devices. Since TPM1 uses sysfs, and sysfs relies on implicit locking on chip->ops, it is not safe to allow this code to run in TPM1, or to add sysfs support to TPM2, until that locking is made explicit. Signed-off-by: Josh Zimmerman <joshz@google.com> Fixes: `74d6b3ceaa` ("tpm: fix suspend/resume paths for TPM 2.0") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:58 +02:00
Josh Zimmerman	dc2ab45eed	Add "shutdown" to "struct class". commit `f77af15165` upstream. The TPM class has some common shutdown code that must be executed for all drivers. This adds some needed functionality for that. Signed-off-by: Josh Zimmerman <joshz@google.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Fixes: `74d6b3ceaa` ("tpm: fix suspend/resume paths for TPM 2.0") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Andreas Gruenbacher	78256de9a8	gfs2: Fix glock rhashtable rcu bug commit `961ae1d83d` upstream. Before commit `88ffbf3e03` "GFS2: Use resizable hash table for glocks", glocks were freed via call_rcu to allow reading the glock hashtable locklessly using rcu. This was then changed to free glocks immediately, which made reading the glock hashtable unsafe. Bring back the original code for freeing glocks via call_rcu. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Jiahau Chang	9b4adf4902	xhci: Limit USB2 port wake support for AMD Promontory hosts commit `dec08194ff` upstream. For AMD Promontory xHCI host, although you can disable USB 2.0 ports in BIOS settings, those ports will be enabled anyway after you remove a device on that port and re-plug it in again. It's a known limitation of the chip. As a workaround we can clear the PORT_WAKE_BITS. This will disable wake on connect, disconnect and overcurrent on AMD Promontory USB2 ports [checkpatch cleanup and commit message reword -Mathias] Cc: Tsai Nicholas <nicholas.tsai@amd.com> Signed-off-by: Jiahau Chang <Lars_Chang@asmedia.com.tw> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Bjørn Mork	12f5ef4ef5	USB: serial: qcserial: new Sierra Wireless EM7305 device ID commit `996fab55d8` upstream. A new Sierra Wireless EM7305 device ID used in a Toshiba laptop. Reported-by: Petr Kloc <petr_kloc@yahoo.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Johan Hovold	364e52124c	USB: serial: option: add two Longcheer device ids commit `8fb060da71` upstream. Add two Longcheer device-id entries which specifically enables a Telewell TW-3G HSPA+ branded modem (0x9801). Reported-by: Teemu Likonen <tlikonen@iki.fi> Reported-by: Bjørn Mork <bjorn@mork.no> Reported-by: Lars Melin <larsm17@gmail.com> Tested-by: Teemu Likonen <tlikonen@iki.fi> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Geert Uytterhoeven	a9c5e5330c	pinctrl: sh-pfc: Update info pointer after SoC-specific init commit `3091ae775f` upstream. Update the sh_pfc_soc_info pointer after calling the SoC-specific initialization function, as it may have been updated to e.g. handle different SoC revisions. This makes sure the correct subdriver name is printed later. Fixes: `0c151062f3` ("sh-pfc: Add support for SoC-specific initialization") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Sergei Shtylyov	59326554d9	pinctrl: sh-pfc: r8a7791: Add missing HSCIF1 pinmux data commit `da7a692fbb` upstream. The R8A7791 PFC driver was apparently based on the preliminary revisions of the user's manual, which omitted the HSCIF1 group E signals in the IPSR4 register description. This would cause HSCIF1's probe to fail with messages like those below: sh-pfc e6060000.pfc: cannot locate data/mark enum_id for mark 1989 sh-sci e62c8000.serial: Error applying setting, reverse things back sh-sci: probe of e62c8000.serial failed with error -22 Add the necessary PINMUX_IPSR_MSEL() invocations for the HSCK1_E, HCTS1#_E, and HRTS1#_E signals... Fixes: `5088451962` ("pinctrl: sh-pfc: r8a7791 PFC support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
Uwe Kleine-König	53f806e46b	pinctrl: mxs: atomically switch mux and drive strength config commit `da6c2addf6` upstream. To set the mux mode of a pin two bits must be set. Up to now this is implemented using the following idiom: writel(mask, reg + CLR); writel(value, reg + SET); . This however results in the mux mode being 0 between the two writes. On my machine there is an IC's reset pin connected to LCD_D20. The bootloader configures this pin as GPIO output-high (i.e. not holding the IC in reset). When Linux reconfigures the pin to GPIO the short time LCD_D20 is muxed as LCD_D20 instead of GPIO_1_20 is enough to confuse the connected IC. The same problem is present for the pin's drive strength setting which is reset to low drive strength before using the right value. So instead of relying on the hardware to modify the register setting using two writes implement the bit toggling using read-modify-write. Fixes: `17723111e6` ("pinctrl: add pinctrl-mxs support") Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:57 +02:00
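The glitch described in the pinctrl-mxs commit above can be modelled in plain C. This is a minimal userspace sketch, not the driver code: the register is an ordinary variable, and the 2-bit field layout, the function names and the `intermediate` out-parameter are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical layout: each pin owns a 2-bit mux field in a 32-bit register. */
static uint32_t mux_mask(unsigned int shift)
{
	return UINT32_C(0x3) << shift;
}

/*
 * Models the old idiom: writel(mask, reg + CLR); writel(value, reg + SET);
 * *intermediate receives the register value visible between the two writes.
 */
static uint32_t mux_clear_then_set(uint32_t *reg, unsigned int shift,
				   uint32_t mode, uint32_t *intermediate)
{
	uint32_t mask = mux_mask(shift);

	*reg &= ~mask;                  /* first write: field becomes 0 */
	*intermediate = *reg;
	*reg |= (mode << shift) & mask; /* second write: field gets its final value */
	return *reg;
}

/* Models the fix: compute the new value first, then store it in one write. */
static uint32_t mux_read_modify_write(uint32_t *reg, unsigned int shift,
				      uint32_t mode)
{
	uint32_t mask = mux_mask(shift);
	uint32_t val = *reg;

	val &= ~mask;
	val |= (mode << shift) & mask;
	*reg = val;                     /* the only write; no transient mode 0 */
	return val;
}
```

If the field already holds mode 3 and is reprogrammed to mode 3, `mux_clear_then_set()` still passes through mode 0 in `*intermediate`, which is the window the commit message describes. `mux_read_modify_write()` never exposes that state.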
Tony Lindgren	9c89b4cc99	pinctrl: core: Fix warning by removing bogus code commit `664b7c4728` upstream. Andre Przywara <andre.przywara@arm.com> noticed that we can get the following warning with -EPROBE_DEFER: "WARNING: CPU: 1 PID: 89 at drivers/base/dd.c:349 driver_probe_device+0x2ac/0x2e8" Let's fix the issue by removing the indices as suggested by Tejun Heo <tj@kernel.org>. All we have to do here is kill the radix tree. I probably ended up with the indices after grepping for removal of all entries using radix_tree_for_each_slot() and the first match found was gmap_radix_tree_free(). Anyways, no need for indices here, and we can just do remove all the entries using radix_tree_for_each_slot() along how the item_kill_tree() test case does. Fixes: `c7059c5ac7` ("pinctrl: core: Add generic pinctrl functions for managing groups") Fixes: `a76edc89b1` ("pinctrl: core: Add generic pinctrl functions for managing groups") Reported-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Chen-Yu Tsai	789aa0c0b2	pinctrl: sunxi: Fix SPDIF function name for A83T commit `7903d4f5e1` upstream. We use well known standard names for functions that have name, such as I2C, SPI, SPDIF, etc.. Fix the function name of SPDIF, which was named OWA (One Wire Audio) based on Allwinner datasheets. Fixes: `4730f33f0d` ("pinctrl: sunxi: add allwinner A83T PIO controller support") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Alexandre TORGUE	6fc9a147d8	pinctrl: stm32: Fix bad function call commit `b7c747d462` upstream. In stm32_pconf_parse_conf function, stm32_pmx_gpio_set_direction is called with wrong parameter value. Indeed, using NULL value for range will raise an oops. Fixes: `aceb16dc2d` ("pinctrl: Add STM32 MCUs support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Martin Blumenstingl	5756449737	pinctrl: meson: meson8b: fix the NAND DQS pins commit `97ba26b8a9` upstream. The nand_groups table uses different names for the NAND DQS pins than the GROUP() definition in meson8b_cbus_groups (nand_dqs_0 vs nand_dqs0). This prevents using the NAND DQS pins in the devicetree. Fix this by ensuring that the GROUP() definition and the meson8b_cbus_groups use the same name for these pins. Fixes: `0fefcb6876` ("pinctrl: Add support for Meson8b") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Kevin Hilman <khilman@baylibre.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Geert Uytterhoeven	281e3b2914	pinctrl: sh-pfc: r8a7795: Fix hscif2_clk_b and hscif4_ctrl commit `4324b6084f` upstream. Fix typos in hscif2_clk_b_mux[] and hscif4_ctrl_mux[]. Fixes: `a56069c46c` ("pinctrl: sh-pfc: r8a7795: Add HSCIF pins, groups, and functions") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Sergei Shtylyov	5aff152fa5	pinctrl: sh-pfc: r8a7791: Add missing DVC_MUTE signal commit `3908632fb8` upstream. The R8A7791 PFC driver was apparently based on the preliminary revisions of the user's manual, which omitted the DVC_MUTE signal altogether in the PFC section. The modern manual has the signal described, so just add the necessary data to the driver... Fixes: `5088451962` ("pinctrl: sh-pfc: r8a7791 PFC support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Sergei Shtylyov	692a0484da	pinctrl: sh-pfc: r8a7791: Fix SCIF2 pinmux data commit `58439280f8` upstream. PINMUX_IPSR_MSEL() macro invocation for the TX2 signal has apparently wrong 1st argument -- most probably a result of cut&paste programming... Fixes: `5088451962` ("pinctrl: sh-pfc: r8a7791 PFC support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Sergei Shtylyov	3c6e293668	pinctrl: sh-pfc: r8a7794: Swap ATA signals commit `5f4c8cafe1` upstream. All R8A7794 manuals I have here (0.50 and 1.10) agree that the PFC driver has ATAG0# and ATAWR0# signals in IPSR12 swapped -- fix this. Fixes: `43c4436e2f` ("pinctrl: sh-pfc: add R8A7794 PFC support") Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:56 +02:00
Juri Lelli	da49ee24e1	arm: remove wrong CONFIG_PROC_SYSCTL ifdef commit `f70b281b59` upstream. The sysfs cpu_capacity entry for each CPU has nothing to do with PROC_FS, nor it's in /proc/sys path. Remove such ifdef. Cc: Russell King <linux@arm.linux.org.uk> Reported-and-suggested-by: Sudeep Holla <sudeep.holla@arm.com> Fixes: `7e5930aaef` ('ARM: 8622/3: add sysfs cpu_capacity attribute') Signed-off-by: Juri Lelli <juri.lelli@arm.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Johan Hovold	afcf26c845	USB: core: fix device node leak commit `e271b2c909` upstream. Make sure to release any OF device-node reference taken when creating the USB device. Note that we currently do not hold a reference to the root hub device-tree node (i.e. the parent controller node). Fixes: `69bec72598` ("USB: core: let USB device know device node") Acked-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Benjamin Herrenschmidt	64e9992d22	usb: Fix typo in the definition of Endpoint[out]Request commit `7cf916bd63` upstream. The current definition is wrong. This breaks my upcoming Aspeed virtual hub driver. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Michael Grzeschik	e3954b4c00	usb: usbip: set buffer pointers to NULL after free commit `b3b51417d0` upstream. The usbip stack dynamically allocates the transfer_buffer and setup_packet of each urb that got generated by the tcp to usb stub code. As these pointers are always used only once we will set them to NULL after use. This is done likewise to the free_urb code in vudc_dev.c. This patch fixes double kfree situations where the usbip remote side added the URB_FREE_BUFFER. Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
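The pattern in the usbip commit above (clear a pointer right after freeing it so that a later free is harmless) can be sketched in userspace C. `struct fake_urb` and `fake_urb_release_buffers()` are hypothetical names, and `free()` stands in for `kfree()`; this is not the usbip code itself.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the two dynamically allocated urb buffers. */
struct fake_urb {
	void *transfer_buffer;
	void *setup_packet;
};

/* Free both buffers and clear the pointers so a second call is harmless. */
static void fake_urb_release_buffers(struct fake_urb *urb)
{
	free(urb->transfer_buffer);
	urb->transfer_buffer = NULL;   /* free(NULL) is a no-op on the next call */

	free(urb->setup_packet);
	urb->setup_packet = NULL;
}
```

Calling `fake_urb_release_buffers()` twice on the same object is safe here, whereas without the `NULL` assignments the second call would free each buffer again.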
Devin Heitmueller	2727982681	Add USB quirk for HVR-950q to avoid intermittent device resets commit `6836796de4` upstream. The USB core and sysfs will attempt to enumerate certain parameters which are unsupported by the au0828 - causing inconsistent behavior and sometimes causing the chip to reset. Avoid making these calls. This problem manifested as intermittent cases where the au8522 would be reset on analog video startup, in particular when starting up ALSA audio streaming in parallel - the sysfs entries created by snd-usb-audio on streaming startup would result in unsupported control messages being sent during tuning which would put the chip into an unknown state. Signed-off-by: Devin Heitmueller <dheitmueller@kernellabs.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Jeremie Rapin	2c5f0e7040	USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick commit `fd90f73a99` upstream. Added the USB serial device ID for the CEL ZigBee EM3588 radio stick. Signed-off-by: Jeremie Rapin <rapinj@gmail.com> Acked-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Felipe Balbi	da39242fe9	usb: dwc3: replace %p with %pK commit `04fb365c45` upstream. %p will leak kernel pointers, so let's not expose the information on dmesg and instead use %pK. %pK will only show the actual addresses if explicitly enabled under /proc/sys/kernel/kptr_restrict. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Gerd Hoffmann	a2746d8b78	drm/virtio: don't leak bo on drm_gem_object_init failure commit `385aee965b` upstream. Reported-by: 李强 <liqiang6-s@360.cn> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170406155941.458-1-kraxel@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
Sabrina Dubroca	7dde693403	tracing/kprobes: Allow to create probe with a module name starting with a digit commit `9e52b32567` upstream. Always try to parse an address, since kstrtoul() will safely fail when given a symbol as input. If that fails (which will be the case for a symbol), try to parse a symbol instead. This allows creating a probe such as: p:probe/vlan_gro_receive 8021q:vlan_gro_receive+0 Which is necessary for this command to work: perf probe -m 8021q -a vlan_gro_receive Link: http://lkml.kernel.org/r/fd72d666f45b114e2c5b9cf7e27b91de1ec966f1.1498122881.git.sd@queasysnail.net Fixes: `413d37d1e` ("tracing: Add kprobe-based event tracer") Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:55 +02:00
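The "try an address first, fall back to a symbol" order from the kprobes commit above can be approximated in userspace. This sketch uses `strtoul()` with a whole-string check as a rough stand-in for `kstrtoul()`; the helper names and the enum are assumptions, and `strtoul()` is more lenient than the kernel helper about leading whitespace and signs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Userspace approximation of a strict numeric parse: succeed only if the
 * whole string is a number. Stores the value in *out on success.
 */
static bool parse_whole_ulong(const char *s, unsigned long *out)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(s, &end, 0);
	if (errno != 0 || end == s || *end != '\0')
		return false;
	*out = val;
	return true;
}

enum probe_target { TARGET_ADDRESS, TARGET_SYMBOL };

/* Always try the address parse first; anything that fails is a symbol. */
static enum probe_target classify_probe_target(const char *s,
					       unsigned long *addr)
{
	if (parse_whole_ulong(s, addr))
		return TARGET_ADDRESS;
	return TARGET_SYMBOL;
}
```

A string such as `8021q:vlan_gro_receive+0` starts with digits but is not a complete number, so the numeric parse fails and it is treated as a symbol, which is the case the commit enables.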
Yan, Zheng	8e7dedf725	ceph: choose readdir frag based on previous readdir reply commit `b50c2de51e` upstream. The dirfragtree is lazily updated, so it is not always accurate. An infinite loop can happen in the following circumstance: - client sends a request to read frag A - frag A has been fragmented into frag B and C, so the mds fills the reply with the contents of frag B - client wants to read the next frag, C, but ceph_choose_frag(frag value of C) returns frag A. The fix is to use the previous readdir reply to calculate the next readdir frag when possible. Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:54 +02:00
Boris Pismenny	29f0189eb1	RDMA/uverbs: Check port number supplied by user verbs cmds commit `5ecce4c9b1` upstream. The ib_uverbs_create_ah() and ib_uverbs_modify_qp() calls receive the port number from user input as part of their attributes and assume it is valid. Down on the stack, that parameter is used to access kernel data structures. If the value is invalid, the kernel accesses memory it should not. To prevent this, verify the port number before using it. BUG: KASAN: use-after-free in ib_uverbs_create_ah+0x6d5/0x7b0 Read of size 4 at addr ffff880018d67ab8 by task syz-executor/313 BUG: KASAN: slab-out-of-bounds in modify_qp.isra.4+0x19d0/0x1ef0 Read of size 4 at addr ffff88006c40ec58 by task syz-executor/819 Fixes: `67cdb40ca4` ("[IB] uverbs: Implement more commands") Fixes: `189aba99e7` ("IB/uverbs: Extend modify_qp and support packet pacing") Cc: Yevgeny Kliteynik <kliteyn@mellanox.com> Cc: Tziporet Koren <tziporet@mellanox.com> Cc: Alex Polak <alexpo@mellanox.com> Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:54 +02:00
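The idea of validating a user-supplied port number before it is used as an index can be sketched as follows. The function name `check_port()` is hypothetical, and the assumption that valid ports are numbered 1 through `num_ports` is made for illustration; the kernel uses its own helper for this check.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical validation helper.
 * Assumption: valid ports are numbered 1..num_ports inclusive.
 * Returns 0 if the port may be used, -EINVAL otherwise.
 */
static int check_port(uint8_t port, uint8_t num_ports)
{
	if (port < 1 || port > num_ports)
		return -EINVAL;
	return 0;
}
```

The caller rejects the command as soon as `check_port()` fails, so code further down the stack never sees an out-of-range value.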
Adrian Salido	89488f3193	driver core: platform: fix race condition with driver_override commit `6265539776` upstream. The driver_override implementation is susceptible to race condition when different threads are reading vs storing a different driver override. Add locking to avoid race condition. Fixes: `3d713e0e38` ("driver core: platform: add device binding path 'driver_override'") Cc: stable@vger.kernel.org Signed-off-by: Adrian Salido <salidoa@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:54 +02:00
Christoph Hellwig	47a82dad34	fs: completely ignore unknown open flags commit `629e014bb8` upstream. Currently we just stash anything we got into file->f_flags, and then report it in fcntl(F_GETFD). This patch just clears out all unknown flags so that we don't pass them to the fs or report them. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:54 +02:00
Christoph Hellwig	c012328136	fs: add a VALID_OPEN_FLAGS commit `80f18379a7` upstream. Add a central define for all valid open flags, and use it in the uniqueness check. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-12 16:53:54 +02:00
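The two fs commits above amount to defining one mask of known open flags and discarding every other bit. A minimal sketch, assuming made-up flag values (the real `O_*` constants are architecture specific) and a hypothetical `sanitize_open_flags()` helper:

```c
#include <stdint.h>

/* Hypothetical flag bits; the real O_* values are architecture specific. */
#define EX_O_WRONLY  UINT32_C(0x0001)
#define EX_O_RDWR    UINT32_C(0x0002)
#define EX_O_CREAT   UINT32_C(0x0040)
#define EX_O_APPEND  UINT32_C(0x0400)

/* Central define listing every flag this example treats as valid. */
#define EX_VALID_OPEN_FLAGS \
	(EX_O_WRONLY | EX_O_RDWR | EX_O_CREAT | EX_O_APPEND)

/* Drop unknown bits so they are neither stored nor reported back. */
static uint32_t sanitize_open_flags(uint32_t flags)
{
	return flags & EX_VALID_OPEN_FLAGS;
}
```

Keeping the list in a single define means the same mask can serve both the sanitizing step and any consistency check on the flag values.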
Greg Kroah-Hartman	f82a53b875	Linux 4.11.9	2017-07-05 14:41:57 +02:00
David S. Miller	f29125639b	hsi: Fix build regression due to netdev destructor fix. commit `ed66e50d95` upstream. > ../drivers/hsi/clients/ssi_protocol.c:1069:5: error: 'struct net_device' has no member named 'destructor' Reported-by: Mark Brown <broonie@kernel.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:43 +02:00
Steffen Klassert	a5afcf8553	esp4: Fix udpencap for local TCP packets. [ Upstream commit `0e78a87306` ] Locally generated TCP packets are usually cloned, so we do skb_cow_data() on this packets. After that we need to reload the pointer to the esp header. On udpencap this header has an offset to skb_transport_header, so take this offset into account. This is a backport of: commit `0e78a87306` ("esp4: Fix udpencap for local TCP packets.") Fixes: `67d349ed60` ("net/esp4: Fix invalid esph pointer crash") Fixes: `fca11ebde3` ("esp4: Reorganize esp_output") Reported-by: Don Bowman <db@donbowman.ca> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:43 +02:00
Wanpeng Li	b9a1254c31	KVM: nVMX: Fix exception injection commit `d4912215d1` upstream. WARNING: CPU: 3 PID: 2840 at arch/x86/kvm/vmx.c:10966 nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel] CPU: 3 PID: 2840 Comm: qemu-system-x86 Tainted: G OE 4.12.0-rc3+ #23 RIP: 0010:nested_vmx_vmexit+0xdcd/0xde0 [kvm_intel] Call Trace: ? kvm_check_async_pf_completion+0xef/0x120 [kvm] ? rcu_read_lock_sched_held+0x79/0x80 vmx_queue_exception+0x104/0x160 [kvm_intel] ? vmx_queue_exception+0x104/0x160 [kvm_intel] kvm_arch_vcpu_ioctl_run+0x1171/0x1ce0 [kvm] ? kvm_arch_vcpu_load+0x47/0x240 [kvm] ? kvm_arch_vcpu_load+0x62/0x240 [kvm] kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? __fget+0xf3/0x210 do_vfs_ioctl+0xa4/0x700 ? __fget+0x114/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This is triggered occasionally by running both win7 and win2016 in L2, in addition, EPT is disabled on both L1 and L2. It can't be reproduced easily. Commit `0b6ac343fc` (KVM: nVMX: Correct handling of exception injection) mentioned that "KVM wants to inject page-faults which it got to the guest. This function assumes it is called with the exit reason in vmcs02 being a #PF exception". Commit `e011c663` (KVM: nVMX: Check all exceptions for intercept during delivery to L2) allows to check all exceptions for intercept during delivery to L2. However, there is no guarantee the exit reason is exception currently, when there is an external interrupt occurred on host, maybe a time interrupt for host which should not be injected to guest, and somewhere queues an exception, then the function nested_vmx_check_exception() will be called and the vmexit emulation codes will try to emulate the "Acknowledge interrupt on exit" behavior, the warning is triggered. Reusing the exit reason from the L2->L0 vmexit is wrong in this case, the reason must always be EXCEPTION_NMI when injecting an exception into L1 as a nested vmexit. 
Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Fixes: `e011c663b9` ("KVM: nVMX: Check all exceptions for intercept during delivery to L2") Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:43 +02:00
Radim Krčmář	bde6903736	KVM: x86: zero base3 of unusable segments commit `f0367ee1d6` upstream. Static checker noticed that base3 could be used uninitialized if the segment was not present (useable). Random stack values probably would not pass VMCS entry checks. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `1aa366163b` ("KVM: x86 emulator: consolidate segment accessors") Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Radim Krčmář	02800188d7	KVM: x86/vPMU: fix undefined shift in intel_pmu_refresh() commit `34b0dadbdf` upstream. Static analysis noticed that pmu->nr_arch_gp_counters can be 32 (INTEL_PMC_MAX_GENERIC) and therefore cannot be used to shift 'int'. I didn't add BUILD_BUG_ON for it as we have a better checker. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `25462f7f52` ("KVM: x86/vPMU: Define kvm_pmu_ops to support vPMU function dispatch") Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
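The undefined shift mentioned in the vPMU commit above comes from shifting a 32-bit `int` by 32. A minimal sketch of a well-defined alternative, using a hypothetical `counter_mask()` helper rather than the KVM code:

```c
#include <stdint.h>

/*
 * Bitmask with the low nr_counters bits set, for nr_counters in 0..32.
 * Shifting a 64-bit one keeps the nr_counters == 32 case well defined;
 * "1 << 32" on a 32-bit int would be undefined behaviour.
 */
static uint64_t counter_mask(unsigned int nr_counters)
{
	return (UINT64_C(1) << nr_counters) - 1;
}
```

Widening the operand before the shift is the whole fix in this sketch; the subtraction then yields the expected all-ones mask for 32 counters.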
Ladi Prosek	3d2a8efd7d	KVM: x86: fix emulation of RSM and IRET instructions commit `6ed071f051` upstream. On AMD, the effect of set_nmi_mask called by emulate_iret_real and em_rsm on hflags is reverted later on in x86_emulate_instruction where hflags are overwritten with ctxt->emul_flags (the kvm_set_hflags call). This manifests as a hang when rebooting Windows VMs with QEMU, OVMF, and >1 vcpu. Instead of trying to merge ctxt->emul_flags into vcpu->arch.hflags after an instruction is emulated, this commit deletes emul_flags altogether and makes the emulator access vcpu->arch.hflags using two new accessors. This way all changes, on the emulator side as well as in functions called from the emulator and accessing vcpu state with emul_to_vcpu, are preserved. More details on the bug and its manifestation with Windows and OVMF: It's a KVM bug in the interaction between SMI/SMM and NMI, specific to AMD. I believe that the SMM part explains why we started seeing this only with OVMF. KVM masks and unmasks NMI when entering and leaving SMM. When KVM emulates the RSM instruction in em_rsm, the set_nmi_mask call doesn't stick because later on in x86_emulate_instruction we overwrite arch.hflags with ctxt->emul_flags, effectively reverting the effect of the set_nmi_mask call. The AMD-specific hflag of interest here is HF_NMI_MASK. When rebooting the system, Windows sends an NMI IPI to all but the current cpu to shut them down. Only after all of them are parked in HLT will the initiating cpu finish the restart. If NMI is masked, other cpus never get the memo and the initiating cpu spins forever, waiting for hal!HalpInterruptProcessorsStarted to drop. That's the symptom we observe. Fixes: `a584539b24` ("KVM: x86: pass the whole hflags field to emulator and back") Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Thomas Petazzoni	51c3bb1d99	mtd: nand: fsmc: fix NAND width handling commit `ee56874f23` upstream. In commit `eea628199d` ("mtd: Add device-tree support to fsmc_nand"), Device Tree support was added to the fsmc_nand driver. However, this code has a bug in how it handles the bank-width DT property to set the bus width. Indeed, in the function fsmc_nand_probe_config_dt() that parses the Device Tree, it sets pdata->width to either 8 or 16 depending on the value of the bank-width DT property. Then, the ->probe() function will test if pdata->width is equal to FSMC_NAND_BW16 (which is 2) to set NAND_BUSWIDTH_16 in nand->options. Therefore, with the DT probing, this condition will never match. This commit fixes that by removing the "width" field from fsmc_nand_platform_data and instead have the fsmc_nand_probe_config_dt() function directly set the appropriate nand->options value. It is worth mentioning that if this commit gets backported to older kernels, prior to the drop of non-DT probing, then non-DT probing will be broken because nand->options will no longer be set to NAND_BUSWIDTH_16. Fixes: `eea628199d` ("mtd: Add device-tree support to fsmc_nand") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Kamal Dasu	f34b2f6f68	mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program commit `9d2ee0a60b` upstream. On brcmnand controller v6.x and v7.x, the #WP pin is controlled through the NAND_WP bit in CS_SELECT register. The driver currently assumes that toggling the #WP pin is instantaneously enabling/disabling write-protection, but it actually takes some time to propagate the new state to the internal NAND chip logic. This behavior is sometime causing data corruptions when an erase/program operation is executed before write-protection has really been disabled. Fixes: `27c5b17cd1` ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Signed-off-by: Kamal Dasu <kdasu.kdev@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Arnd Bergmann	3b422c3c1f	infiniband: hns: avoid gcc-7.0.1 warning for uninitialized data commit `5b0ff9a007` upstream. hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field, which will then change only a few of its bits, causing a warning with the latest gcc: infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci': infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized] roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1); The code is actually correct since we always set all bits of the port_vlan field, but gcc correctly points out that the first access does contain uninitialized data. This initializes the field to zero first before setting the individual bits. Fixes: `9a4435375c` ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Suravee Suthikulpanit	6af65c535d	iommu/amd: Fix interrupt remapping when disable guest_mode commit `84a21dbdef` upstream. Pass-through devices to VM guest can get updated IRQ affinity information via irq_set_affinity() when not running in guest mode. Currently, AMD IOMMU driver in GA mode ignores the updated information if the pass-through device is setup to use vAPIC regardless of guest_mode. This could cause invalid interrupt remapping. Also, the guest_mode bit should be set and cleared only when SVM updates posted-interrupt interrupt remapping information. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Joerg Roedel <jroedel@suse.de> Fixes: `d98de49a53` ('iommu/amd: Enable vAPIC interrupt remapping mode by default') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Pan Bian	fb6237c733	iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid() commit `73dbd4a423` upstream. In function amd_iommu_bind_pasid(), the control flow jumps to label out_free when pasid_state->mm and mm is NULL. And mmput(mm) is called. In function mmput(mm), mm is referenced without validation. This will result in a NULL dereference bug. This patch fixes the bug. Signed-off-by: Pan Bian <bianpan2016@163.com> Fixes: `f0aac63b87` ('iommu/amd: Don't hold a reference to mm_struct') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Robin Murphy	3d06032fb2	iommu/dma: Don't reserve PCI I/O windows commit `938f1bbe35` upstream. Even if a host controller's CPU-side MMIO windows into PCI I/O space do happen to leak into PCI memory space such that it might treat them as peer addresses, trying to reserve the corresponding I/O space addresses doesn't do anything to help solve that problem. Stop doing a silly thing. Fixes: `fade1ec055` ("iommu/dma: Avoid PCI host bridge windows") Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Eric Ren	092702fa4d	ocfs2: fix deadlock caused by recursive locking in xattr commit `8818efaaac` upstream. Another deadlock path caused by recursive locking is reported. This kind of issue was introduced since commit `743b5f1434` ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been fixed by commit `b891fa5024` ("ocfs2: fix deadlock issue when taking inode lock at vfs entry points"). Yes, we intend to fix this kind of case in incremental way, because it's hard to find out all possible paths at once. This one can be reproduced like this. On node1, cp a large file from home directory to ocfs2 mountpoint. While on node2, run setfacl/getfacl. Both nodes will hang up there. The backtraces: On node1: __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2] ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2] ocfs2_write_begin+0x43/0x1a0 [ocfs2] generic_perform_write+0xa9/0x180 __generic_file_write_iter+0x1aa/0x1d0 ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2] __vfs_write+0xc3/0x130 vfs_write+0xb1/0x1a0 SyS_write+0x46/0xa0 On node2: __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2] ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2] ocfs2_xattr_set+0x12e/0xe80 [ocfs2] ocfs2_set_acl+0x22d/0x260 [ocfs2] ocfs2_iop_set_acl+0x65/0xb0 [ocfs2] set_posix_acl+0x75/0xb0 posix_acl_xattr_set+0x49/0xa0 __vfs_setxattr+0x69/0x80 __vfs_setxattr_noperm+0x72/0x1a0 vfs_setxattr+0xa7/0xb0 setxattr+0x12d/0x190 path_setxattr+0x9f/0xb0 SyS_setxattr+0x14/0x20 Fix this one by using ocfs2_inode_{lock\|unlock}_tracker, which is exported by commit `439a36b8ef` ("ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock"). 
Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com Fixes: `743b5f1434` ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Eric Ren <zren@suse.com> Reported-by: Thomas Voegtle <tv@lio96.de> Tested-by: Thomas Voegtle <tv@lio96.de> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:42 +02:00
Junxiao Bi	404dfb7533	ocfs2: o2hb: revert hb threshold to keep compatible commit `33496c3c3d` upstream. Configfs is the interface for ocfs2-tools to set configure to kernel and $configfs_dir/cluster/$clustername/heartbeat/dead_threshold is the one used to configure heartbeat dead threshold. Kernel has a default value of it but user can set O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb to override it. Commit `45b997737a` ("ocfs2/cluster: use per-attribute show and store methods") changed heartbeat dead threshold name while ocfs2-tools did not, so ocfs2-tools won't set this configurable and the default value is always used. So revert it. Fixes: `45b997737a` ("ocfs2/cluster: use per-attribute show and store methods") Link: http://lkml.kernel.org/r/1490665245-15374-1-git-send-email-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Acked-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Andy Lutomirski	b363931e89	x86/mm: Fix flush_tlb_page() on Xen commit `dbd68d8e84` upstream. flush_tlb_page() passes a bogus range to flush_tlb_others() and expects the latter to fix it up. native_flush_tlb_others() has the fixup but Xen's version doesn't. Move the fixup to flush_tlb_others(). AFAICS the only real effect is that, without this fix, Xen would flush everything instead of just the one page on remote vCPUs when flush_tlb_page() was called. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Nadav Amit <namit@vmware.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `e7b52ffd45` ("x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range") Link: http://lkml.kernel.org/r/10ed0e4dfea64daef10b87fb85df1746999b4dba.1492844372.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Joerg Roedel	ed815afd32	x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space commit `5ed386ec09` upstream. When this function fails it just sends a SIGSEGV signal to user-space using force_sig(). This signal is missing essential information about the cause, e.g. the trap_nr or an error code. Fix this by propagating the error to the only caller of mpx_handle_bd_fault(), do_bounds(), which sends the correct SIGSEGV signal to the process. Signed-off-by: Joerg Roedel <jroedel@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `fe3d197f84` ('x86, mpx: On-demand kernel allocation of bounds tables') Link: http://lkml.kernel.org/r/1491488362-27198-1-git-send-email-joro@8bytes.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Kan Liang	92ff9b6ce8	perf/x86: Fix spurious NMI with PEBS Load Latency event commit `fd583ad156` upstream. Spurious NMIs will be observed with the following command: while :; do perf record -bae "cpu/umask=0x01,event=0xcd,ldlat=0x80/pp" -e "cpu/umask=0x03,event=0x0/" -e "cpu/umask=0x02,event=0x0/" -e cycles,branches,cache-misses -e cache-references -- sleep 10 done The bug was introduced by commit: `8077eca079` ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+") That commit clears the status bits for the counters used for PEBS events, by masking the whole 64 bits pebs_enabled. However, only the low 32 bits of both status and pebs_enabled are reserved for PEBS-able counters. For status bits 32-34 are fixed counter overflow bits. For pebs_enabled bits 32-34 are for PEBS Load Latency. In the test case, the PEBS Load Latency event and fixed counter event could overflow at the same time. The fixed counter overflow bit will be cleared by mistake. Once it is cleared, the fixed counter overflow never be processed, which finally trigger spurious NMI. Correct the PEBS enabled mask by ignoring the non-PEBS bits. Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: `8077eca079` ("perf/x86/pebs: Add workaround for broken OVFL status on HSW+") Link: http://lkml.kernel.org/r/1491333246-3965-1-git-send-email-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
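The masking problem described in the row above can be modelled in user space. This is only a sketch under the assumption stated in the commit text (the low 32 bits of both registers describe PEBS-capable counters); the macro and function names are hypothetical and are not the kernel's own.

```c
#include <stdint.h>

/* Hypothetical mask: per the commit text, only the low 32 bits of
 * status and pebs_enabled refer to PEBS-capable counters. */
#define SKETCH_PEBS_COUNTER_MASK 0xffffffffULL

/* Buggy variant: clears status using all 64 bits of pebs_enabled, so a
 * Load Latency enable bit (32-34) also clears the fixed-counter
 * overflow bit at the same position in status. */
uint64_t clear_status_buggy(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~pebs_enabled;
}

/* Fixed variant: drop the non-PEBS bits before masking. */
uint64_t clear_status_fixed(uint64_t status, uint64_t pebs_enabled)
{
	return status & ~(pebs_enabled & SKETCH_PEBS_COUNTER_MASK);
}
```

With bit 32 set in both arguments, the buggy variant loses the fixed-counter overflow bit while the fixed variant keeps it.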
Baoquan He	cc6b8fbcc6	x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug commit `8eabf42ae5` upstream. Kernel text KASLR is separated into physical address and virtual address randomization. And for virtual address randomization, we only randomize to get an offset between 16M and KERNEL_IMAGE_SIZE. So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR, but not the original kernel loading address 'output'. The bug will cause kernel boot failure if kernel is loaded at a different position than the address, 16M, which is decided at compile time. Kexec/kdump is one such practical case. To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial value. Tested-by: Dave Young <dyoung@redhat.com> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `8391c73` ("x86/KASLR: Randomize virtual address separately") Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Thomas Gleixner	875cfdbe15	x86/mshyperv: Remove excess #includes from mshyperv.h commit `26fcd952d5` upstream. A recent commit included linux/slab.h in linux/irq.h. This breaks the build of vdso32 on a 64-bit kernel. The reason is that linux/irq.h gets included into the vdso code via linux/interrupt.h which is included from asm/mshyperv.h. That makes the 32-bit vdso compile fail, because slab.h includes the pgtable headers for 64-bit on a 64-bit build. Neither linux/clocksource.h nor linux/interrupt.h are needed in the mshyperv.h header file itself - it has a dependency on <linux/atomic.h>. Remove the includes and unbreak the build. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: devel@linuxdriverproject.org Fixes: `dee863b571` ("hv: export current Hyper-V clocksource") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Josh Poimboeuf	b3bc81143c	Revert "x86/entry: Fix the end of the stack for newly forked tasks" commit `ebd574994c` upstream. Petr Mladek reported the following warning when loading the livepatch sample module: WARNING: CPU: 1 PID: 3699 at arch/x86/kernel/stacktrace.c:132 save_stack_trace_tsk_reliable+0x133/0x1a0 ... Call Trace: __schedule+0x273/0x820 schedule+0x36/0x80 kthreadd+0x305/0x310 ? kthread_create_on_cpu+0x80/0x80 ? icmp_echo.part.32+0x50/0x50 ret_from_fork+0x2c/0x40 That warning means the end of the stack is no longer recognized as such for newly forked tasks. The problem was introduced with the following commit: `ff3f7e2475` ("x86/entry: Fix the end of the stack for newly forked tasks") ... which was completely misguided. It only partially fixed the reported issue, and it introduced another bug in the process. None of the other entry code saves the frame pointer before calling into C code, so it doesn't make sense for ret_from_fork to do so either. Contrary to what I originally thought, the original issue wasn't related to newly forked tasks. It was actually related to ftrace. When entry code calls into a function which then calls into an ftrace handler, the stack frame looks different than normal. The original issue will be fixed in the unwinder, in a subsequent patch. Reported-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Dave Jones <davej@codemonkey.org.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: live-patching@vger.kernel.org Fixes: `ff3f7e2475` ("x86/entry: Fix the end of the stack for newly forked tasks") Link: http://lkml.kernel.org/r/f350760f7e82f0750c8d1dd093456eb212751caa.1495553739.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Arnaldo Carvalho de Melo	839fe2c416	tools arch: Sync arch/x86/lib/memcpy_64.S with the kernel commit `e883d09c9e` upstream. Just a minor fix done in: Fixes: `26a37ab319` ("x86/mce: Fix copy/paste error in exception table entries") Cc: Tony Luck <tony.luck@intel.com> Link: http://lkml.kernel.org/n/tip-ni9jzdd5yxlail6pq8cuexw2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Christophe JAILLET	b856d45c71	ARM: davinci: PM: Do not free useful resources in normal path in 'davinci_pm_init' commit `95d7c1f18b` upstream. It is wrong to iounmap resources in the normal path of davinci_pm_init() The 3 ioremap'ed fields of 'pm_config' can be accessed later on in other functions, so we should return 'success' instead of unrolling everything. Fixes: `aa9aa1ec2d` ("ARM: davinci: PM: rework init, remove platform device") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [nsekhar@ti.com: commit message and minor style fixes] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Christophe JAILLET	b0ed471883	ARM: davinci: PM: Free resources in error handling path in 'davinci_pm_init' commit `f3f6cc814f` upstream. If 'sram_alloc' fails, we need to free already allocated resources. Fixes: `aa9aa1ec2d` ("ARM: davinci: PM: rework init, remove platform device") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:41 +02:00
Doug Berger	0afbd9fd39	ARM: 8685/1: ensure memblock-limit is pmd-aligned commit `9e25ebfe56` upstream. The pmd containing memblock_limit is cleared by prepare_page_table() which creates the opportunity for early_alloc() to allocate unmapped memory if memblock_limit is not pmd aligned causing a boot-time hang. Commit `965278dcb8` ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM") attempted to resolve this problem, but there is a path through the adjust_lowmem_bounds() routine where if all memory regions start and end on pmd-aligned addresses the memblock_limit will be set to arm_lowmem_limit. Since arm_lowmem_limit can be affected by the vmalloc early parameter, the value of arm_lowmem_limit may not be pmd-aligned. This commit corrects this oversight such that memblock_limit is always rounded down to pmd-alignment. Fixes: `965278dcb8` ("ARM: 8356/1: mm: handle non-pmd-aligned end of RAM") Signed-off-by: Doug Berger <opendmb@gmail.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
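Rounding a limit down to pmd alignment, as the row above describes, is plain power-of-two arithmetic. The following is a minimal sketch, assuming a 2 MiB pmd purely for illustration (the real size depends on the ARM page table configuration); all names are hypothetical.

```c
#include <stdint.h>

/* Assumed for illustration only: a 2 MiB pmd. */
#define SKETCH_PMD_SIZE (2ULL << 20)

/* Round value down to a multiple of align; align must be a power of two. */
uint64_t sketch_round_down(uint64_t value, uint64_t align)
{
	return value & ~(align - 1);
}

/* Nonzero when value is already a multiple of align (power of two). */
int sketch_is_aligned(uint64_t value, uint64_t align)
{
	return (value & (align - 1)) == 0;
}
```

A limit that is already aligned is returned unchanged, so applying the rounding unconditionally is safe.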
Lorenzo Pieralisi	16dfde4831	ARM64/ACPI: Fix BAD_MADT_GICC_ENTRY() macro implementation commit `cb7cf772d8` upstream. The BAD_MADT_GICC_ENTRY() macro checks if a GICC MADT entry passes muster from an ACPI specification standpoint. Current macro detects the MADT GICC entry length through ACPI firmware version (it changed from 76 to 80 bytes in the transition from ACPI 5.1 to ACPI 6.0 specification) but always uses (erroneously) the ACPICA (latest) struct (ie struct acpi_madt_generic_interrupt - that is 80-bytes long) length to check if the current GICC entry memory record exceeds the MADT table end in memory as defined by the MADT table header itself, which may result in false negatives depending on the ACPI firmware version and how the MADT entries are laid out in memory (ie on ACPI 5.1 firmware MADT GICC entries are 76 bytes long, so by adding 80 to a GICC entry start address in memory the resulting address may well be past the actual MADT end, triggering a false negative). Fix the BAD_MADT_GICC_ENTRY() macro by reshuffling the condition checks and update them to always use the firmware version specific MADT GICC entry length in order to carry out boundary checks. Fixes: `b6cfb27737` ("ACPI / ARM64: add BAD_MADT_GICC_ENTRY() macro") Reported-by: Julien Grall <julien.grall@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Julien Grall <julien.grall@arm.com> Cc: Hanjun Guo <hanjun.guo@linaro.org> Cc: Al Stone <ahs3@redhat.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
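The boundary check at issue in the row above can be illustrated with a small user-space sketch. The two entry lengths (76 and 80 bytes) come from the commit text; the helper name and the table layout used with it are hypothetical.

```c
#include <stddef.h>

/* GICC entry lengths quoted in the commit message. */
#define SKETCH_GICC_LEN_ACPI_5_1 76
#define SKETCH_GICC_LEN_ACPI_6_0 80

/* Nonzero when an entry of entry_len bytes starting at byte offset
 * entry_off would extend past a table that is table_len bytes long. */
int sketch_entry_overruns(size_t entry_off, size_t entry_len, size_t table_len)
{
	return entry_off + entry_len > table_len;
}
```

If the last entry of a table written by ACPI 5.1 firmware ends exactly at the table end, checking it with the 80-byte length reports an overrun, while checking it with the firmware-specific 76-byte length does not.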
Timmy Li	5e1c6a5d7f	ARM64: PCI: Fix struct acpi_pci_root_ops allocation failure path commit `717902cc93` upstream. Commit `093d24a204` ("arm64: PCI: Manage controller-specific data on per-controller basis") added code to allocate ACPI PCI root_ops dynamically on a per host bridge basis but failed to update the corresponding memory allocation failure path in pci_acpi_scan_root() leading to a potential memory leakage. Fix it by adding the required kfree call. Fixes: `093d24a204` ("arm64: PCI: Manage controller-specific data on per-controller basis") Reviewed-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Timmy Li <lixiaoping3@huawei.com> [lorenzo.pieralisi@arm.com: refactored code, rewrote commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> CC: Will Deacon <will.deacon@arm.com> CC: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
Eric Anholt	961010353d	watchdog: bcm281xx: Fix use of uninitialized spinlock. commit `fedf266f99` upstream. The bcm_kona_wdt_set_resolution_reg() call takes the spinlock, so initialize it earlier. Fixes a warning at boot with lock debugging enabled. Fixes: `6adb730dc2` ("watchdog: bcm281xx: Watchdog Driver") Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
Dan Carpenter	662bb18efe	xfrm: Oops on error in pfkey_msg2xfrm_state() commit `1e3d0c2c70` upstream. There are some missing error codes here so we accidentally return NULL instead of an error pointer. It results in a NULL pointer dereference. Fixes: `df71837d50` ("[LSM-IPSec]: Security association restriction.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
Dan Carpenter	f37a5bfa5c	xfrm: NULL dereference on allocation failure commit `e747f64336` upstream. The default error code in pfkey_msg2xfrm_state() is -ENOBUFS. We added a new call to security_xfrm_state_alloc() which sets "err" to zero so there several places where we can return ERR_PTR(0) if kmalloc() fails. The caller is expecting error pointers so it leads to a NULL dereference. Fixes: `df71837d50` ("[LSM-IPSec]: Security association restriction.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
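Why ERR_PTR(0) is dangerous can be shown with a user-space model of the error-pointer convention. This is a sketch, not the kernel's implementation: the helpers below are hypothetical stand-ins that assume error codes are encoded as the top 4095 pointer values.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on errno values encoded in a pointer. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
void *sketch_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Nonzero when ptr falls in the range reserved for encoded errors. */
int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-SKETCH_MAX_ERRNO);
}
```

Encoding an error code of zero yields a NULL pointer that the error check does not recognise, so a caller that only tests for error pointers goes on to dereference NULL.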
Sabrina Dubroca	29be0c1aef	xfrm: fix stack access out of bounds with CONFIG_XFRM_SUB_POLICY commit `9b3eb54106` upstream. When CONFIG_XFRM_SUB_POLICY=y, xfrm_dst stores a copy of the flowi for that dst. Unfortunately, the code that allocates and fills this copy doesn't care about what type of flowi (flowi, flowi4, flowi6) gets passed. In multiple code paths (from raw_sendmsg, from TCP when replying to a FIN, in vxlan, geneve, and gre), the flowi that gets passed to xfrm is actually an on-stack flowi4, so we end up reading stuff from the stack past the end of the flowi4 struct. Since xfrm_dst->origin isn't used anywhere following commit `ca116922af` ("xfrm: Eliminate "fl" and "pol" args to xfrm_bundle_ok()."), just get rid of it. xfrm_dst->partner isn't used either, so get rid of that too. Fixes: `9d6ec93801` ("ipv4: Use flowi4 in public route lookup interfaces.") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
Hangbin Liu	e0cee9f3bf	xfrm: move xfrm_garbage_collect out of xfrm_policy_flush commit `138437f591` upstream. Now we will force garbage collection if any policy is removed in xfrm_policy_flush(). But during xfrm_net_exit(), we call flow_cache_fini() first and set fc->percpu to NULL. Then after we call xfrm_policy_fini() -> xfrm_policy_flush() -> flow_cache_flush(), we will get a NULL pointer dereference when checking percpu_empty. The code path looks like: flow_cache_fini() - fc->percpu = NULL xfrm_policy_fini() - xfrm_policy_flush() - xfrm_garbage_collect() - flow_cache_flush() - flow_cache_percpu_empty() - fcp = per_cpu_ptr(fc->percpu, cpu) To reproduce, just add ipsec in netns and then remove the netns. v2: As Xin Long suggested, since only two other places need to call it, move xfrm_garbage_collect() outside xfrm_policy_flush(). v3: Fix subject mismatch after v2 fix. Fixes: `35db069121` ("xfrm: do the garbage collection after flushing policy") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
Yossi Kuperman	4d03c61711	xfrm6: Fix IPv6 payload_len in xfrm6_transport_finish commit `7c88e21aef` upstream. IPv6 payload length indicates the size of the payload, including any extension headers. In xfrm6_transport_finish, ipv6_hdr(skb)->payload_len is set to the payload size only, regardless of the presence of any extension headers. After ESP GRO transport mode decapsulation, ipv6_rcv trims the packet according to the wrong payload_len, thus corrupting the packet. Set payload_len to account for extension headers as well. Fixes: `7785bba299` ("esp: Add a software GRO codepath") Signed-off-by: Yossi Kuperman <yossiku@mellanox.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:40 +02:00
Juergen Gross	c4ed418dc9	xen/blkback: don't free be structure too early commit `71df1d7cca` upstream. The be structure must not be freed when freeing the blkif structure isn't done. Otherwise a use-after-free of be when unmapping the ring used for communicating with the frontend will occur in case of a late call of xenblk_disconnect() (e.g. due to an I/O still active when trying to disconnect). Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Steven Haigh <netwiz@crc.id.au> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Ard Biesheuvel	f21d9eda32	mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings commit `029c54b095` upstream. Existing code that uses vmalloc_to_page() may assume that any address for which is_vmalloc_addr() returns true may be passed into vmalloc_to_page() to retrieve the associated struct page. This is not an unreasonable assumption to make, but on architectures that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need to ensure that vmalloc_to_page() does not go off into the weeds trying to dereference huge PUDs or PMDs as table entries. Given that vmalloc() and vmap() themselves never create huge mappings or deal with compound pages at all, there is no correct answer in this case, so return NULL instead, and issue a warning. When reading /proc/kcore on arm64, you will hit an oops as soon as you hit the huge mappings used for the various segments that make up the mapping of vmlinux. With this patch applied, you will no longer hit the oops, but the kcore contents will be incorrect (these regions will be zeroed out). We are fixing this for kcore specifically, so it avoids vread() for those regions. At least one other problematic user exists, i.e., /dev/kmem, but that is currently broken on arm64 for other reasons. Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Laura Abbott <labbott@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: zhong jiang <zhongjiang@huawei.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Thomas Gleixner	0ec03ce7d7	pinctrl/amd: Use regular interrupt instead of chained commit `ba714a9c1d` upstream. The AMD pinctrl driver uses a chained interrupt to demultiplex the GPIO interrupts. Kevin Vandeventer reported that his new AMD Ryzen locks up hard on boot when the AMD pinctrl driver is initialized. The reason is an interrupt storm. It's not clear whether that's caused by hardware or firmware or both. Using chained interrupts on X86 is a dangerous endeavour. If a system is misconfigured or the hardware buggy there is no safety net to catch an interrupt storm. Convert the driver to use a regular interrupt for the demultiplex handler. This allows the interrupt storm detector to catch the malfunction and lets the system boot up. This should be backported to stable because it's likely that more users run into this problem as the AMD Ryzen machines are spreading. Reported-by: Kevin Vandeventer Link: https://bugzilla.suse.com/show_bug.cgi?id=1034261 Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Baoquan He	d9efa9db58	x86/mm: Fix boot crash caused by incorrect loop count calculation in sync_global_pgds() commit `fc5f9d5f15` upstream. Jeff Moyer reported that on his system with two memory regions 0~64G and 1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR will make the system hang intermittently during boot. While adding 'nokaslr' won't. The back trace is: Oops: 0000 [#1] SMP RIP: memcpy_erms() [ .... ] Call Trace: pmem_rw_page() bdev_read_page() do_mpage_readpage() mpage_readpages() blkdev_readpages() __do_page_cache_readahead() force_page_cache_readahead() page_cache_sync_readahead() generic_file_read_iter() blkdev_read_iter() __vfs_read() vfs_read() SyS_read() entry_SYSCALL_64_fastpath() This crash happens because the for loop count calculation in sync_global_pgds() is not correct. When a mapping area crosses PGD entries, we should calculate the starting address of region which next PGD covers and assign it to next for loop count, but not add PGDIR_SIZE directly. The old code works right only if the mapping area is an exact multiple of PGDIR_SIZE, otherwise the end region could be skipped so that it can't be synchronized to all other processes from kernel PGD init_mm.pgd. In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it makes this area be mapped inside one PGD entry. With KASLR enabled, this area could cross two PGD entries, then the next PGD entry won't be synced to all other processes. That is why we saw empty PGD. Fix it. Reported-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dave Young <dyoung@redhat.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Jinbum Park <jinb.park7@gmail.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com> Cc: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1493864747-8506-1-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
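The loop-stepping bug in sync_global_pgds() can be reproduced with a small counting model. This is a sketch only: it assumes a 512 GiB PGD entry (shift of 39) for illustration, uses made-up region addresses, and the function names are hypothetical rather than taken from the kernel.

```c
#include <stdint.h>

/* Assumed for illustration: each PGD entry covers 512 GiB. */
#define SKETCH_PGDIR_SIZE (1ULL << 39)
#define SKETCH_GIB        (1ULL << 30)

/* Round value up to a multiple of align; align must be a power of two. */
uint64_t sketch_align_up(uint64_t value, uint64_t align)
{
	return (value + align - 1) & ~(align - 1);
}

/* Old stepping: add PGDIR_SIZE directly. Counts the PGD entries visited
 * for the inclusive range [start, end]. */
unsigned sketch_visits_fixed_step(uint64_t start, uint64_t end)
{
	uint64_t addr;
	unsigned visits = 0;

	for (addr = start; addr <= end; addr += SKETCH_PGDIR_SIZE)
		visits++;
	return visits;
}

/* New stepping: jump to the start of the region the next PGD entry covers. */
unsigned sketch_visits_aligned_step(uint64_t start, uint64_t end)
{
	uint64_t addr;
	unsigned visits = 0;

	for (addr = start; addr <= end;
	     addr = sketch_align_up(addr + 1, SKETCH_PGDIR_SIZE))
		visits++;
	return visits;
}
```

For a 192 GiB area starting at 400 GiB, which straddles the 512 GiB boundary, the old stepping visits one entry and skips the second, while the aligned stepping visits both.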
Vallish Vaidyeshwara	a340661a36	dm thin: do not queue freed thin mapping for next stage processing commit `00a0ea33b4` upstream. process_prepared_discard_passdown_pt1() should cleanup dm_thin_new_mapping in cases of error. dm_pool_inc_data_range() can fail trying to get a block reference: metadata operation 'dm_pool_inc_data_range' failed: error = -61 When dm_pool_inc_data_range() fails, dm thin aborts current metadata transaction and marks pool as PM_READ_ONLY. Memory for thin mapping is released as well. However, current thin mapping will be queued onto next stage as part of queue_passdown_pt2() or passdown_endio(). This dangling thin mapping memory when processed and accessed in next stage will lead to device mapper crashing. Code flow without fix: -> process_prepared_discard_passdown_pt1(m) -> dm_thin_remove_range() -> discard passdown --> passdown_endio(m) queues m onto next stage -> dm_pool_inc_data_range() fails, frees memory m but does not remove it from next stage queue -> process_prepared_discard_passdown_pt2(m) -> processes freed memory m and crashes One such stack: Call Trace: [<ffffffffa037a46f>] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison] [<ffffffffa039b6dc>] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool] [<ffffffffa039b88b>] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool] [<ffffffffa0399611>] process_prepared+0x81/0xa0 [dm_thin_pool] [<ffffffffa039e735>] do_worker+0xc5/0x820 [dm_thin_pool] [<ffffffff8152bf54>] ? __schedule+0x244/0x680 [<ffffffff81087e72>] ? pwq_activate_delayed_work+0x42/0xb0 [<ffffffff81089f53>] process_one_work+0x153/0x3f0 [<ffffffff8108a71b>] worker_thread+0x12b/0x4b0 [<ffffffff8108a5f0>] ? rescuer_thread+0x350/0x350 [<ffffffff8108fd6a>] kthread+0xca/0xe0 [<ffffffff8108fca0>] ? kthread_park+0x60/0x60 [<ffffffff81530b45>] ret_from_fork+0x25/0x30 The fix is to first take the block ref count for discarded block and then do a passdown discard of this block. 
If block ref count fails, then bail out aborting current metadata transaction, mark pool as PM_READ_ONLY and also free current thin mapping memory (existing error handling code) without queueing this thin mapping onto next stage of processing. If block ref count succeeds, then passdown discard of this block. Discard callback of passdown_endio() will queue this thin mapping onto next stage of processing. Code flow with fix: -> process_prepared_discard_passdown_pt1(m) -> dm_thin_remove_range() -> dm_pool_inc_data_range() --> if fails, free memory m and bail out -> discard passdown --> passdown_endio(m) queues m onto next stage Reviewed-by: Eduardo Valentin <eduval@amazon.com> Reviewed-by: Cristian Gafton <gafton@amazon.com> Reviewed-by: Anchal Agarwal <anchalag@amazon.com> Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com> Reviewed-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Deepak Rawat	d7822ccb7f	drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgr commit `82fcee526b` upstream. The hash table created during vmw_cmdbuf_res_man_create was never freed. This causes memory leak in context creation. Added the corresponding drm_ht_remove in vmw_cmdbuf_res_man_destroy. Tested for memory leak by running piglit overnight and kernel memory is not inflated which earlier was. Signed-off-by: Deepak Rawat <drawat@vmware.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Kan Liang	bdbe850337	perf/x86/intel/uncore: Fix wrong box pointer check commit `80c65fdb4c` upstream. Should not init a NULL box. It will cause system crash. The issue looks like caused by a typo. This was not noticed because there is no NULL box. Also, for most boxes, they are enabled by default. The init code is not critical. Fixes: `fff4b87e59` ("perf/x86/intel/uncore: Make package handling more robust") Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Vikas Shivappa	60c9685b42	x86/intel_rdt: Fix memory leak on mount failure commit `79298acc4b` upstream. If mount fails, the kn_info directory is not freed causing memory leak. Add the missing error handling path. Fixes: `4e978d06de` ("x86/intel_rdt: Add "info" files to resctrl file system") Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: vikas.shivappa@intel.com Cc: andi.kleen@intel.com Link: http://lkml.kernel.org/r/1498503368-20173-3-git-send-email-vikas.shivappa@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Bartosz Golaszewski	dfa1f24487	gpiolib: fix filtering out unwanted events commit `ad537b8225` upstream. GPIOEVENT_REQUEST_BOTH_EDGES is not a single flag, but a binary OR of GPIOEVENT_REQUEST_RISING_EDGE and GPIOEVENT_REQUEST_FALLING_EDGE. The expression 'le->eflags & GPIOEVENT_REQUEST_BOTH_EDGES' we'll get evaluated to true even if only one event type was requested. Fix it by checking both RISING & FALLING flags explicitly. Fixes: `61f922db72` ("gpio: userspace ABI for reading GPIO line events") Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
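The flag test described in the row above is easy to reproduce outside the kernel. The sketch below defines its own constants with the same shape the commit text describes (BOTH_EDGES being the OR of the two single-edge flags); the names and helper functions are hypothetical.

```c
/* Stand-ins shaped like the flags in the commit text: BOTH is the
 * bitwise OR of RISING and FALLING, not a separate bit. */
#define SKETCH_RISING_EDGE  (1UL << 0)
#define SKETCH_FALLING_EDGE (1UL << 1)
#define SKETCH_BOTH_EDGES   (SKETCH_RISING_EDGE | SKETCH_FALLING_EDGE)

/* Buggy test: true as soon as either edge bit is set. */
int sketch_wants_both_buggy(unsigned long eflags)
{
	return (eflags & SKETCH_BOTH_EDGES) != 0;
}

/* Fixed test: check the rising and falling bits explicitly. */
int sketch_wants_both_fixed(unsigned long eflags)
{
	return (eflags & SKETCH_RISING_EDGE) && (eflags & SKETCH_FALLING_EDGE);
}
```

A request for rising edges only passes the buggy test but fails the fixed one, which is the filtering behaviour the commit restores.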
Miklos Szeredi	515a95fafa	ovl: copy-up: don't unlock between lookup and link commit `e85f82ff9b` upstream. Nothing prevents mischief on upper layer while we are busy copying up the data. Move the lookup right before the looked up dentry is actually used. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: `01ad3eb8a0` ("ovl: concurrent copy up of regular files") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:39 +02:00
Benjamin Coddington	003192c3d3	Revert "NFS: nfs_rename() handle -ERESTARTSYS dentry left behind" commit `d9f2950006` upstream. This reverts commit `920b4530fb` which could call d_move() without holding the directory's i_mutex, and reverts commit `d4ea7e3c5c` "NFS: Fix old dentry rehash after move", which was a follow-up fix. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `920b4530fb` ("NFS: nfs_rename() handle -ERESTARTSYS dentry left behind") Reviewed-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Trond Myklebust	95b2e0882b	NFSv4.1: Fix a race in nfs4_proc_layoutget commit `bd171930e6` upstream. If the task calling layoutget is signalled, then it is possible for the calls to nfs4_sequence_free_slot() and nfs4_layoutget_prepare() to race, in which case we leak a slot. The fix is to move the call to nfs4_sequence_free_slot() into the nfs4_layoutget_release() so that it gets called at task teardown time. Fixes: `2e80dbe7ac` ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Benjamin Coddington	f8da5dee09	NFSv4.2: Don't send mode again in post-EXCLUSIVE4_1 SETATTR with umask commit `501e7a4689` upstream. Now that we have umask support, we shouldn't re-send the mode in a SETATTR following an exclusive CREATE, or we risk having the same problem fixed in commit `5334c5bdac` ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1"), which is that files with S_ISGID will have that bit stripped away. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `dff25ddb48` ("nfs: add support for the umask attribute") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Hui Wang	f521e0bfcb	ALSA: hda - set input_path bitmap to zero after moving it to new place commit `a8f20fd25b` upstream. Recently we met a problem: the codec has valid adcs and input pins, and they can form valid input paths, but the driver does not build valid controls for them like "Mic boost", "Capture Volume" and "Capture Switch". Through debugging, I found the driver needs to shrink the invalid adcs and input paths for this machine, so it moves the whole column bitmap value to the previous column. After moving it, the driver forgets to set the original column bitmap value to zero; as a result, the driver will invalidate the path whose index value is the original column bitmap value. After executing this function, all valid input paths are invalidated by mistake and no valid input paths remain, so the driver won't build controls for them. Fixes: `3a65bcdc57` ("ALSA: hda - Fix inconsistent input_paths after ADC reduction") Signed-off-by: Hui Wang <hui.wang@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Takashi Iwai	a189e2f294	ALSA: hda - Fix endless loop of codec configure commit `d94815f917` upstream. azx_codec_configure() loops over the codecs found on the given controller via a linked list. The code used to work in the past, but in the current version, this may lead to an endless loop when a codec binding returns an error. The culprit is that snd_hda_codec_configure() unregisters the device upon error, and this eventually deletes the given codec object from the bus. Since the list is initialized via list_del_init(), the next object points to the same device itself. This behavior change was introduced when splitting the HD-audio code, and this loop was not adapted at the time. To fix this bug, just use a *_safe() version of list iteration. Fixes: `d068ebc25e` ("ALSA: hda - Move some codes up to hdac_bus struct") Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Paul Burton	f1a8a4fe83	MIPS: Fix IRQ tracing & lockdep when rescheduling commit `d8550860d9` upstream. When the scheduler sets TIF_NEED_RESCHED & we call into the scheduler from arch/mips/kernel/entry.S we disable interrupts. This is true regardless of whether we reach work_resched from syscall_exit_work, resume_userspace or by looping after calling schedule(). Although we disable interrupts in these paths we don't call trace_hardirqs_off() before calling into C code which may acquire locks, and we therefore leave lockdep with an inconsistent view of whether interrupts are disabled or not when CONFIG_PROVE_LOCKING & CONFIG_DEBUG_LOCKDEP are both enabled. Without tracing this interrupt state lockdep will print warnings such as the following once a task returns from a syscall via syscall_exit_partial with TIF_NEED_RESCHED set: [ 49.927678] ------------[ cut here ]------------ [ 49.934445] WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3687 check_flags.part.41+0x1dc/0x1e8 [ 49.946031] DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) [ 49.946355] CPU: 0 PID: 1 Comm: init Not tainted 4.10.0-00439-gc9fd5d362289-dirty #197 [ 49.963505] Stack : 0000000000000000 ffffffff81bb5d6a 0000000000000006 ffffffff801ce9c4 [ 49.974431] 0000000000000000 0000000000000000 0000000000000000 000000000000004a [ 49.985300] ffffffff80b7e487 ffffffff80a24498 a8000000ff160000 ffffffff80ede8b8 [ 49.996194] 0000000000000001 0000000000000000 0000000000000000 0000000077c8030c [ 50.007063] 000000007fd8a510 ffffffff801cd45c 0000000000000000 a8000000ff127c88 [ 50.017945] 0000000000000000 ffffffff801cf928 0000000000000001 ffffffff80a24498 [ 50.028827] 0000000000000000 0000000000000001 0000000000000000 0000000000000000 [ 50.039688] 0000000000000000 a8000000ff127bd0 0000000000000000 ffffffff805509bc [ 50.050575] 00000000140084e0 0000000000000000 0000000000000000 0000000000040a00 [ 50.061448] 0000000000000000 ffffffff8010e1b0 0000000000000000 ffffffff805509bc [ 50.072327] ... 
[ 50.076087] Call Trace: [ 50.079869] [<ffffffff8010e1b0>] show_stack+0x80/0xa8 [ 50.086577] [<ffffffff805509bc>] dump_stack+0x10c/0x190 [ 50.093498] [<ffffffff8015dde0>] __warn+0xf0/0x108 [ 50.099889] [<ffffffff8015de34>] warn_slowpath_fmt+0x3c/0x48 [ 50.107241] [<ffffffff801c15b4>] check_flags.part.41+0x1dc/0x1e8 [ 50.114961] [<ffffffff801c239c>] lock_is_held_type+0x8c/0xb0 [ 50.122291] [<ffffffff809461b8>] __schedule+0x8c0/0x10f8 [ 50.129221] [<ffffffff80946a60>] schedule+0x30/0x98 [ 50.135659] [<ffffffff80106278>] work_resched+0x8/0x34 [ 50.142397] ---[ end trace 0cb4f6ef5b99fe21 ]--- [ 50.148405] possible reason: unannotated irqs-off. [ 50.154600] irq event stamp: 400463 [ 50.159566] hardirqs last enabled at (400463): [<ffffffff8094edc8>] _raw_spin_unlock_irqrestore+0x40/0xa8 [ 50.171981] hardirqs last disabled at (400462): [<ffffffff8094eb98>] _raw_spin_lock_irqsave+0x30/0xb0 [ 50.183897] softirqs last enabled at (400450): [<ffffffff8016580c>] __do_softirq+0x4ac/0x6a8 [ 50.195015] softirqs last disabled at (400425): [<ffffffff80165e78>] irq_exit+0x110/0x128 Fix this by using the TRACE_IRQS_OFF macro to call trace_hardirqs_off() when CONFIG_TRACE_IRQFLAGS is enabled. This is done before invoking schedule() following the work_resched label because: 1) Interrupts are disabled regardless of the path we take to reach work_resched() & schedule(). 2) Performing the tracing here avoids the need to do it in paths which disable interrupts but don't call out to C code before hitting a path which uses the RESTORE_SOME macro that will call trace_hardirqs_on() or trace_hardirqs_off() as appropriate. We call trace_hardirqs_on() using the TRACE_IRQS_ON macro before calling syscall_trace_leave() for similar reasons, ensuring that lockdep has a consistent view of state after we re-enable interrupts. 
Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15385/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Paul Burton	5033e62248	MIPS: pm-cps: Drop manual cache-line alignment of ready_count commit `161c51ccb7` upstream. We allocate memory for a ready_count variable per-CPU, which is accessed via a cached non-coherent TLB mapping to perform synchronisation between threads within the core using LL/SC instructions. In order to ensure that the variable is contained within its own data cache line we allocate 2 lines worth of memory & align the resulting pointer to a line boundary. This is however unnecessary, since kmalloc is guaranteed to return memory which is at least cache-line aligned (see ARCH_DMA_MINALIGN). Stop the redundant manual alignment. Besides cleaning up the code & avoiding needless work, this has the side effect of avoiding an arithmetic error found by Bryan on 64 bit systems due to the 32 bit size of the former dlinesz. This led the ready_count variable to have its upper 32b cleared erroneously for MIPS64 kernels, causing problems when ready_count was later used on MIPS64 via cpuidle. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: `3179d37ee1` ("MIPS: pm-cps: add PM state entry code for CPS systems") Reported-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com> Tested-by: Bryan O'Donoghue <bryan.odonoghue@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15383/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
James Hogan	dba185d7a5	MIPS: Avoid accidental raw backtrace commit `8542363633` upstream. Since commit `81a76d7119` ("MIPS: Avoid using unwind_stack() with usermode") show_backtrace() invokes the raw backtracer when cp0_status & ST0_KSU indicates user mode to fix issues on EVA kernels where user and kernel address spaces overlap. However this is used by show_stack() which creates its own pt_regs on the stack and leaves cp0_status uninitialised in most of the code paths. This results in the non deterministic use of the raw back tracer depending on the previous stack content. show_stack() deals exclusively with kernel mode stacks anyway, so explicitly initialise regs.cp0_status to KSU_KERNEL (i.e. 0) to ensure we get a useful backtrace. Fixes: `81a76d7119` ("MIPS: Avoid using unwind_stack() with usermode") Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16656/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Karl Beldan	9de0e07dfb	MIPS: head: Reorder instructions missing a delay slot commit `25d8b92e0a` upstream. In this sequence the 'move' is assumed in the delay slot of the 'beq', but head.S is in reorder mode and the former gets pushed one 'nop' farther by the assembler. The corrected behavior made booting with an UHI supplied dtb erratic. Fixes: `15f37e1588` ("MIPS: store the appended dtb address in a variable") Signed-off-by: Karl Beldan <karl.beldan+oss@gmail.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Cc: Jonas Gorski <jogo@openwrt.org> Cc: linux-mips@linux-mips.org Cc: linux-kernel@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/16614/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Juergen Gross	c806e0188d	xen/blkback: don't use xen_blkif_get() in xen-blkback kthread commit `a24fa22ce2` upstream. There is no need to use xen_blkif_get()/xen_blkif_put() in the kthread of xen-blkback. Thread stopping is synchronous, and using the blkif reference counting in the kthread will prevent the reference count from ever dropping to zero at the end of an I/O running concurrently with disconnecting and multiple rings. Setting ring->xenblkd to NULL after stopping the kthread isn't needed as the kthread does this already. Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Steven Haigh <netwiz@crc.id.au> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:38 +02:00
Kinglong Mee	887e338c2e	NFSv4.x/callback: Create the callback service through svc_create_pooled commit `df807fffaa` upstream. As the comments for svc_set_num_threads() said, " Destroying threads relies on the service threads filling in rqstp->rq_task, which only the nfs ones do. Assumes the serv has been created using svc_create_pooled()." If creating service through svc_create(), the svc_pool_map_put() will be called in svc_destroy(), but the pool map isn't used. So that, the reference of pool map will be drop, the next using of pool map will get a zero npools. [ 137.992130] divide error: 0000 [#1] SMP [ 137.992148] Modules linked in: nfsd(E) nfsv4 nfs fscache fuse tun bridge stp llc ip_set nfnetlink vmw_vsock_vmci_transport vsock snd_seq_midi snd_seq_midi_event vmw_balloon coretemp crct10dif_pclmul crc32_pclmul ppdev ghash_clmulni_intel intel_rapl_perf joydev snd_ens1371 gameport snd_ac97_codec ac97_bus snd_seq snd_pcm snd_rawmidi snd_timer snd_seq_device snd soundcore parport_pc parport nfit acpi_cpufreq tpm_tis tpm_tis_core tpm vmw_vmci i2c_piix4 shpchp auth_rpcgss nfs_acl lockd(E) grace sunrpc(E) xfs libcrc32c vmwgfx drm_kms_helper ttm crc32c_intel drm e1000 mptspi scsi_transport_spi serio_raw mptscsih mptbase ata_generic pata_acpi [last unloaded: nfsd] [ 137.992336] CPU: 0 PID: 4514 Comm: rpc.nfsd Tainted: G E 4.11.0-rc8+ #536 [ 137.992777] Hardware name: VMware, Inc. 
VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 137.993757] task: ffff955984101d00 task.stack: ffff9873c2604000 [ 137.994231] RIP: 0010:svc_pool_for_cpu+0x2b/0x80 [sunrpc] [ 137.994768] RSP: 0018:ffff9873c2607c18 EFLAGS: 00010246 [ 137.995227] RAX: 0000000000000000 RBX: ffff95598376f000 RCX: 0000000000000002 [ 137.995673] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9559944aec00 [ 137.996156] RBP: ffff9873c2607c18 R08: ffff9559944aec28 R09: 0000000000000000 [ 137.996609] R10: 0000000001080002 R11: 0000000000000000 R12: ffff95598376f010 [ 137.997063] R13: ffff95598376f018 R14: ffff9559944aec28 R15: ffff9559944aec00 [ 137.997584] FS: 00007f755529eb40(0000) GS:ffff9559bb600000(0000) knlGS:0000000000000000 [ 137.998048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 137.998548] CR2: 000055f3aecd9660 CR3: 0000000084290000 CR4: 00000000001406f0 [ 137.999052] Call Trace: [ 137.999517] svc_xprt_do_enqueue+0xef/0x260 [sunrpc] [ 138.000028] svc_xprt_received+0x47/0x90 [sunrpc] [ 138.000487] svc_add_new_perm_xprt+0x76/0x90 [sunrpc] [ 138.000981] svc_addsock+0x14b/0x200 [sunrpc] [ 138.001424] ? recalc_sigpending+0x1b/0x50 [ 138.001860] ? __getnstimeofday64+0x41/0xd0 [ 138.002346] ? do_gettimeofday+0x29/0x90 [ 138.002779] write_ports+0x255/0x2c0 [nfsd] [ 138.003202] ? _copy_from_user+0x4e/0x80 [ 138.003676] ? write_recoverydir+0x100/0x100 [nfsd] [ 138.004098] nfsctl_transaction_write+0x48/0x80 [nfsd] [ 138.004544] __vfs_write+0x37/0x160 [ 138.004982] ? selinux_file_permission+0xd7/0x110 [ 138.005401] ? 
security_file_permission+0x3b/0xc0 [ 138.005865] vfs_write+0xb5/0x1a0 [ 138.006267] SyS_write+0x55/0xc0 [ 138.006654] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 138.007071] RIP: 0033:0x7f7554b9dc30 [ 138.007437] RSP: 002b:00007ffc9f92c788 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 138.007807] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f7554b9dc30 [ 138.008168] RDX: 0000000000000002 RSI: 00005640cd536640 RDI: 0000000000000003 [ 138.008573] RBP: 00007ffc9f92c780 R08: 0000000000000001 R09: 0000000000000002 [ 138.008918] R10: 0000000000000064 R11: 0000000000000246 R12: 0000000000000004 [ 138.009254] R13: 00005640cdbf77a0 R14: 00005640cdbf7720 R15: 00007ffc9f92c238 [ 138.009610] Code: 0f 1f 44 00 00 48 8b 87 98 00 00 00 55 48 89 e5 48 83 78 08 00 74 10 8b 05 07 42 02 00 83 f8 01 74 40 83 f8 02 74 19 31 c0 31 d2 <f7> b7 88 00 00 00 5d 89 d0 48 c1 e0 07 48 03 87 90 00 00 00 c3 [ 138.010664] RIP: svc_pool_for_cpu+0x2b/0x80 [sunrpc] RSP: ffff9873c2607c18 [ 138.011061] ---[ end trace b3468224cafa7d11 ]--- Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Eric Leblond	a3042f8a0c	netfilter: synproxy: fix conntrackd interaction commit `87e94dbc21` upstream. This patch fixes the creation of connection tracking entry from netlink when synproxy is used. It was missing the addition of the synproxy extension. This was causing kernel crashes when a conntrack entry created by conntrackd was used after the switch of traffic from active node to the passive node. Signed-off-by: Eric Leblond <eric@regit.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Serhey Popovych	f19613afaf	rtnetlink: add IFLA_GROUP to ifla_policy [ Upstream commit `db833d40ad` ] Network interface group support was added a while ago, but until now there has been no IFLA_GROUP attribute description in the policy or in the netlink message size calculations. Add IFLA_GROUP attribute to the policy. Fixes: `cbda10fa97` ("net_device: add support for network device groups") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Serhey Popovych	4ade61f463	ipv6: Do not leak throw route references [ Upstream commit `07f615574f` ] While commit `73ba57bfae` ("ipv6: fix backtracking for throw routes") does a good job of propagating errors to fib_rules_lookup() in the fib rules core framework, which also corrects throw route handling, it does not solve the route reference leak that occurs when we return -EAGAIN to fib_rules_lookup() and leave the routing table entry referenced in arg->result. If the rule with the matched throw route isn't the last one matched in the list, we overwrite arg->result and permanently lose the reference on the throw route stored previously. We also partially revert commit `ab997ad408` ("ipv6: fix the incorrect return value of throw route") since we never return a routing table entry with dst.error == -EAGAIN when CONFIG_IPV6_MULTIPLE_TABLES is on. There is also no point in checking for the RTF_REJECT flag, since it is always set for a throw route. Fixes: `73ba57bfae` ("ipv6: fix backtracking for throw routes") Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Gao Feng	16c4d1be8f	net: 8021q: Fix one possible panic caused by BUG_ON in free_netdev [ Upstream commit `9745e362ad` ] The register_vlan_device would invoke free_netdev directly, when register_vlan_dev failed. It would trigger the BUG_ON in free_netdev if the dev was already registered. In this case, the netdev would be freed in netdev_run_todo later. So add one condition check now. Only when dev is not registered, then free it directly. The following is the part coredump when netdev_upper_dev_link failed in register_vlan_dev. I removed the lines which are too long. [ 411.237457] ------------[ cut here ]------------ [ 411.237458] kernel BUG at net/core/dev.c:7998! [ 411.237484] invalid opcode: 0000 [#1] SMP [ 411.237705] [last unloaded: 8021q] [ 411.237718] CPU: 1 PID: 12845 Comm: vconfig Tainted: G E 4.12.0-rc5+ #6 [ 411.237737] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 [ 411.237764] task: ffff9cbeb6685580 task.stack: ffffa7d2807d8000 [ 411.237782] RIP: 0010:free_netdev+0x116/0x120 [ 411.237794] RSP: 0018:ffffa7d2807dbdb0 EFLAGS: 00010297 [ 411.237808] RAX: 0000000000000002 RBX: ffff9cbeb6ba8fd8 RCX: 0000000000001878 [ 411.237826] RDX: 0000000000000001 RSI: 0000000000000282 RDI: 0000000000000000 [ 411.237844] RBP: ffffa7d2807dbdc8 R08: 0002986100029841 R09: 0002982100029801 [ 411.237861] R10: 0004000100029980 R11: 0004000100029980 R12: ffff9cbeb6ba9000 [ 411.238761] R13: ffff9cbeb6ba9060 R14: ffff9cbe60f1a000 R15: ffff9cbeb6ba9000 [ 411.239518] FS: 00007fb690d81700(0000) GS:ffff9cbebb640000(0000) knlGS:0000000000000000 [ 411.239949] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 411.240454] CR2: 00007f7115624000 CR3: 0000000077cdf000 CR4: 00000000003406e0 [ 411.240936] Call Trace: [ 411.241462] vlan_ioctl_handler+0x3f1/0x400 [8021q] [ 411.241910] sock_ioctl+0x18b/0x2c0 [ 411.242394] do_vfs_ioctl+0xa1/0x5d0 [ 411.242853] ? 
sock_alloc_file+0xa6/0x130 [ 411.243465] SyS_ioctl+0x79/0x90 [ 411.243900] entry_SYSCALL_64_fastpath+0x1e/0xa9 [ 411.244425] RIP: 0033:0x7fb69089a357 [ 411.244863] RSP: 002b:00007ffcd04e0fc8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 411.245445] RAX: ffffffffffffffda RBX: 00007ffcd04e2884 RCX: 00007fb69089a357 [ 411.245903] RDX: 00007ffcd04e0fd0 RSI: 0000000000008983 RDI: 0000000000000003 [ 411.246527] RBP: 00007ffcd04e0fd0 R08: 0000000000000000 R09: 1999999999999999 [ 411.246976] R10: 000000000000053f R11: 0000000000000202 R12: 0000000000000004 [ 411.247414] R13: 00007ffcd04e1128 R14: 00007ffcd04e2888 R15: 0000000000000001 [ 411.249129] RIP: free_netdev+0x116/0x120 RSP: ffffa7d2807dbdb0 Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Wei Wang	c207e0594f	decnet: always not take dst->__refcnt when inserting dst into hash table [ Upstream commit `76371d2e3a` ] In the existing dn_route.c code, dn_route_output_slow() takes dst->__refcnt before calling dn_insert_route() while dn_route_input_slow() does not take dst->__refcnt before calling dn_insert_route(). This makes the whole routing code very buggy. In dn_dst_check_expire(), dnrt_free() is called when rt expires. This makes the routes inserted by dn_route_output_slow() not able to be freed as the refcnt is not released. In dn_dst_gc(), dnrt_drop() is called to release rt which could potentially cause the dst->__refcnt to be dropped to -1. In dn_run_flush(), dst_free() is called to release all the dst. Again, it makes the dst inserted by dn_route_output_slow() not able to be released and also, it does not wait on the rcu and could potentially cause crash in the path where other users still refer to this dst. This patch makes sure both input and output path do not take dst->__refcnt before calling dn_insert_route() and also makes sure dnrt_free()/dst_free() is called when removing dst from the hash table. The only difference between those 2 calls is that dnrt_free() waits on the rcu while dst_free() does not. Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Maor Dickman	941bdec095	net/mlx5e: Fix timestamping capabilities reporting [ Upstream commit `f0b381178b` ] Misuse of (BIT) macro caused to report wrong flags for "Hardware Transmit Timestamp Modes" and "Hardware Receive Filter Modes" Fixes: `ef9814deaf` ('net/mlx5e: Add HW timestamping (TS) support') Signed-off-by: Maor Dickman <maord@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Eli Cohen	f464aace78	net/mlx5: Wait for FW readiness before initializing command interface [ Upstream commit `6c780a0267` ] Before attempting to initialize the command interface we must wait till the fw_initializing bit is clear. If we fail to meet this condition the hardware will drop our configuration, specifically the descriptors page address. This scenario can happen when the firmware is still executing an FLR flow and did not finish yet so the driver needs to wait for that to finish. Fixes: `e3297246c2` ('net/mlx5_core: Wait for FW readiness on startup') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Or Gerlitz	c7d1260afb	net/mlx5e: Avoid doing a cleanup call if the profile doesn't have it [ Upstream commit `31ac93386d` ] The error flow of mlx5e_create_netdev calls the cleanup call of the given profile without checking if it exists, fix that. Currently the VF reps don't register that callback and we crash if getting into error -- can be reproduced by the user doing ctrl^C while attempting to change the sriov mode from legacy to switchdev. Fixes: `26e59d8077` '(net/mlx5e: Implement mlx5e interface attach/detach callbacks') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Sabrina Dubroca <sdubroca@redhat.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:37 +02:00
Chris Mi	050efbe129	net/mlx5e: Fix min inline value for VF rep SQs [ Upstream commit `5f195c2c5c` ] The offending commit only changed the code path for PF/VF, but it didn't take care of VF representors. As a result, since params->tx_min_inline_mode for VF representors is kzalloced to 0 (MLX5_INLINE_MODE_NONE), all VF reps SQs were set to that mode. This actually works on CX5 by default but broke CX4. Fix that by adding a call to query the min inline mode from the VF rep build up code. Fixes: `a6f402e499` ("net/mlx5e: Tx, no inline copy on ConnectX-5") Signed-off-by: Chris Mi <chrism@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Xin Long	eb0d418f2e	sctp: return next obj by passing pos + 1 into sctp_transport_get_idx [ Upstream commit `988c732211` ] In sctp_for_each_transport, pos is used to save how many objs it has dumped. Now it gets the last obj by sctp_transport_get_idx, then gets the next obj by sctp_transport_get_next. The issue is that in the meanwhile if some objs in transport hashtable are removed and the objs nums are less than pos, sctp_transport_get_idx would return NULL and hti.walker.tbl is NULL as well. At this moment it should stop hti, instead of continue getting the next obj. Or it would cause a NULL pointer dereference in sctp_transport_get_next. This patch is to pass pos + 1 into sctp_transport_get_idx to get the next obj directly, even if pos > objs nums, it would return NULL and stop hti. Fixes: `626d16f50f` ("sctp: export some apis or variables for sctp_diag and reuse some for proc") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Xin Long	247ab3c1c1	ipv6: fix calling in6_ifa_hold incorrectly for dad work [ Upstream commit `f8a894b218` ] Now when starting the dad work in addrconf_mod_dad_work, if the dad work is idle and queued, it needs to hold ifa. The problem is that there is a gap at [1], during which the pending dad work may be removed elsewhere. In that case it will fail to hold ifa, even though the dad work is still idle and queued. if (!delayed_work_pending(&ifp->dad_work)) in6_ifa_hold(ifp); <--------------[1] mod_delayed_work(addrconf_wq, &ifp->dad_work, delay); A use-after-free issue can be caused by this. Chen Wei found this issue when WARN_ON(!hlist_unhashed(&ifp->addr_lst)) in inet6_ifa_finish_destroy was hit because of it. Following Hannes' suggestion, this patch fixes it by holding ifa first in addrconf_mod_dad_work, then calling mod_delayed_work and putting ifa if the dad_work is already in the queue. Note that this patch did not choose to fix it with: if (!mod_delayed_work(delay)) in6_ifa_hold(ifp); With that approach, when delay == 0, dad_work would be scheduled immediately, and all addrconf_mod_dad_work(0) calls would have to be moved under ifp->lock. Reported-by: Wei Chen <weichen@redhat.com> Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Jesper Dangaard Brouer	bdfae324ba	net: don't global ICMP rate limit packets originating from loopback [ Upstream commit `849a44de91` ] Florian Weimer seems to have a glibc test-case which requires that loopback interfaces do not get ICMP ratelimited. This was broken by commit `c0303efeab` ("net: reduce cycles spend on ICMP replies that gets rate limited"). An ICMP response will usually be routed back out the same incoming interface. Thus, take advantage of this and skip global ICMP ratelimit when the incoming device is loopback. In the unlikely event that the outgoing device is not loopback, due to strange routing policy rules, ICMP rate limiting still works via peer ratelimiting via icmpv4_xrlim_allow(). Thus, we should still comply with RFC1812 (section 4.3.2.8 "Rate Limiting"). This seems to fix the reproducer given by Florian, while still avoiding the expensive and unneeded outgoing route lookup for rate limited packets (in the non-loopback case). Fixes: `c0303efeab` ("net: reduce cycles spend on ICMP replies that gets rate limited") Reported-by: Florian Weimer <fweimer@redhat.com> Reported-by: "H.J. Lu" <hjl.tools@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Bjørn Mork	487dd0ab72	qmi_wwan: new Telewell and Sierra device IDs [ Upstream commit `60cfe1eacc` ] A new Sierra Wireless EM7305 device ID used in a Toshiba laptop, and two Longcheer device IDs entries used by Telewell TW-3G HSPA+ branded modems. Reported-by: Petr Kloc <petr_kloc@yahoo.com> Reported-by: Teemu Likonen <tlikonen@iki.fi> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
WANG Cong	5113c2dcb9	igmp: add a missing spin_lock_init() [ Upstream commit `b4846fc3c8` ] Andrey reported a lockdep warning on non-initialized spinlock: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 1 PID: 4099 Comm: a.out Not tainted 4.12.0-rc6+ #9 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x292/0x395 lib/dump_stack.c:52 register_lock_class+0x717/0x1aa0 kernel/locking/lockdep.c:755 ? 0xffffffffa0000000 __lock_acquire+0x269/0x3690 kernel/locking/lockdep.c:3255 lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855 __raw_spin_lock_bh ./include/linux/spinlock_api_smp.h:135 _raw_spin_lock_bh+0x36/0x50 kernel/locking/spinlock.c:175 spin_lock_bh ./include/linux/spinlock.h:304 ip_mc_clear_src+0x27/0x1e0 net/ipv4/igmp.c:2076 igmpv3_clear_delrec+0xee/0x4f0 net/ipv4/igmp.c:1194 ip_mc_destroy_dev+0x4e/0x190 net/ipv4/igmp.c:1736 We miss a spin_lock_init() in igmpv3_add_delrec(), probably because previously we never use it on this code path. Since we already unlink it from the global mc_tomb list, it is probably safe not to acquire this spinlock here. It does not harm to have it although, to avoid conditional locking. Fixes: `c38b7d327a` ("igmp: acquire pmc lock for ip_mc_clear_src()") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
WANG Cong	20407b1d4a	igmp: acquire pmc lock for ip_mc_clear_src() [ Upstream commit `c38b7d327a` ] Andrey reported a use-after-free in add_grec(): for (psf = *psf_list; psf; psf = psf_next) { ... psf_next = psf->sf_next; where the struct ip_sf_list's were already freed by: kfree+0xe8/0x2b0 mm/slub.c:3882 ip_mc_clear_src+0x69/0x1c0 net/ipv4/igmp.c:2078 ip_mc_dec_group+0x19a/0x470 net/ipv4/igmp.c:1618 ip_mc_drop_socket+0x145/0x230 net/ipv4/igmp.c:2609 inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:411 sock_release+0x8d/0x1e0 net/socket.c:597 sock_close+0x16/0x20 net/socket.c:1072 This happens because we don't hold pmc->lock in ip_mc_clear_src() and a parallel mr_ifc_timer timer could jump in and access them. The RCU lock is there but it is merely for pmc itself, this spinlock could actually ensure we don't access them in parallel. Thanks to Eric and Long for discussion on this bug. Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Xin Long <lucien.xin@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Christian Perle	42b0540a98	proc: snmp6: Use correct type in memset [ Upstream commit `3500cd73df` ] Reading /proc/net/snmp6 yields bogus values on 32 bit kernels. Use "u64" instead of "unsigned long" in sizeof(). Fixes: `4a4857b1c8` ("proc: Reduce cache miss in snmp6_seq_show") Signed-off-by: Christian Perle <christian.perle@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Majd Dibbiny	8fc4f0e6c3	net/mlx5: Enable 4K UAR only when page size is bigger than 4K [ Upstream commit `91828bd899` ] When the page size isn't bigger than 4K, there is no added value of enabling 4K UAR feature in the Firmware. Modified the condition of enabling the 4K UAR accordingly. Fixes: `f502d83495` ("net/mlx5: Activate support for 4K UARs") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
Tal Gilboa	f0e6f2314d	net/mlx5e: Fix wrong indications in DIM due to counter wraparound [ Upstream commit `53acd76ce5` ] DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for changing the channel interrupt moderation values in order to reduce CPU overhead for all traffic types. Each iteration of the algorithm, DIM calculates the difference in throughput, packet rate and interrupt rate from last iteration in order to make a decision. DIM relies on counters for each metric. When these counters get to their type's max value they wraparound. In this case the delta between 'end' and 'start' samples is negative and when translated to unsigned integers - very high. This results in a false indication to the algorithm and might result in a wrong decision. The fix calculates the 'distance' between 'end' and 'start' samples in a cyclic way around the relevant type's max value. It can also be viewed as an absolute value around the type's max value instead of around 0. Testing show higher stability in DIM profile selection and no wraparound issues. Fixes: `cb3c7fd4f8` ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:36 +02:00
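The cyclic "distance" between two counter samples described in the commit above can be sketched in plain C. This is only an illustration under stated assumptions, not the code from the mlx5e patch: the helper name `wrap_delta` and its `bits` parameter are hypothetical, and the counter is assumed to be a free-running unsigned value that wraps modulo 2^bits at most once between samples.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of increments from 'start' to 'end' on a
 * counter that is 'bits' wide and wraps modulo 2^bits.
 * Assumes the counter wrapped at most once between the two samples.
 */
static uint64_t wrap_delta(uint64_t end, uint64_t start, unsigned int bits)
{
	/* Mask selecting the low 'bits' bits; a shift by 64 would be undefined. */
	uint64_t mask = (bits >= 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);

	/* Unsigned subtraction is modular, so masking yields the cyclic distance. */
	return (end - start) & mask;
}
```

For a 16-bit counter that moved from 65530 to 5, `wrap_delta(5, 65530, 16)` yields 11 instead of the very large value a plain unsigned subtraction would give.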
Tal Gilboa	f98a0883af	net/mlx5e: Added BW check for DIM decision mechanism [ Upstream commit `c3164d2fc4` ] DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for changing the channel interrupt moderation values in order to reduce CPU overhead for all traffic types. Until now only interrupt and packet rate were sampled. We found a scenario in which we get a false indication since a change in DIM caused more aggregation and reduced packet rate while increasing BW. We now regard a change as successful iff: current_BW > (prev_BW + threshold) or current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or current_BW ~= prev_BW and current_PR ~= prev_PR and current_IR < (prev_IR - threshold) Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off) -------------------------------------------------- packet size \| before[Mb/s] \| after[Mb/s] \| gain \| 2B \| 343.4 \| 359.4 \| 4.5% \| 16B \| 2739.7 \| 2814.8 \| 2.7% \| 64B \| 9739 \| 10185.3 \| 4.5% \| Fixes: `cb3c7fd4f8` ("net/mlx5e: Support adaptive RX coalescing") Signed-off-by: Tal Gilboa <talgi@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
Huy Nguyen	ff8cf391fe	net/mlx5: Remove several module events out of ethtool stats [ Upstream commit `f729860a17` ] Remove the following module event counters out of ethtool stats. The reason for removing these event counters is that these events do not occur without a technician's intervention. module_pwr_budget_exd module_long_range module_no_eeprom module_enforce_part module_unknown_id module_unknown_status module_plug Fixes: `bedb7c909c` ("net/mlx5e: Add port module event counters to ethtool stats") Signed-off-by: Huy Nguyen <huyn@mellanox.com> Reviewed by: Gal Pressman <galp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
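The three-way rule in the commit above (bandwidth first, then packet rate, then interrupt rate) can be written out as a small C sketch. Everything named here is hypothetical: `struct dim_sample`, `exceeds` and `dim_change_is_better` are not taken from the driver, a single threshold is assumed for all three metrics, and "~=" is interpreted as "neither value exceeds the other by more than the threshold".

```c
#include <stdbool.h>
#include <stdint.h>

/* One sample of the three metrics named in the commit message. */
struct dim_sample {
	uint64_t bw; /* bandwidth */
	uint64_t pr; /* packet rate */
	uint64_t ir; /* interrupt rate */
};

/* True when 'a' exceeds 'b' by more than 'thr' (written to avoid overflow). */
static bool exceeds(uint64_t a, uint64_t b, uint64_t thr)
{
	return a > b && a - b > thr;
}

/*
 * Hypothetical decision helper. "~=" is interpreted as "neither value
 * exceeds the other by more than the threshold", which is an assumption.
 */
static bool dim_change_is_better(struct dim_sample cur, struct dim_sample prev,
				 uint64_t thr)
{
	if (exceeds(cur.bw, prev.bw, thr))
		return true;	/* bandwidth clearly improved */
	if (exceeds(prev.bw, cur.bw, thr))
		return false;	/* bandwidth clearly dropped */

	if (exceeds(cur.pr, prev.pr, thr))
		return true;	/* similar bandwidth, higher packet rate */
	if (exceeds(prev.pr, cur.pr, thr))
		return false;	/* similar bandwidth, lower packet rate */

	/* Similar bandwidth and packet rate: fewer interrupts counts as a win. */
	return exceeds(prev.ir, cur.ir, thr);
}
```

Checking bandwidth before packet rate is what addresses the scenario in the message: more aggregation lowers the packet rate, but the change still counts as an improvement when bandwidth rises.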
Jia-Ju Bai	e938c05e06	net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse [ Upstream commit `343eba69c6` ] The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the function call path is: tipc_l2_rcv_msg (acquire the lock by rcu_read_lock) tipc_rcv tipc_sk_rcv tipc_msg_reverse pskb_expand_head(GFP_KERNEL) --> may sleep tipc_node_broadcast tipc_node_xmit_skb tipc_node_xmit tipc_sk_rcv tipc_msg_reverse pskb_expand_head(GFP_KERNEL) --> may sleep To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC". Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
Jia-Ju Bai	a89d15fc8b	net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx [ Upstream commit `f146e872eb` ] The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the function call path is: cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock) cfctrl_linkdown_req cfpkt_create cfpkt_create_pfx alloc_skb(GFP_KERNEL) --> may sleep cfserl_receive (acquire the lock by rcu_read_lock) cfpkt_split cfpkt_create_pfx alloc_skb(GFP_KERNEL) --> may sleep cfpkt_create_pfx uses "in_interrupt" to decide whether to use "GFP_KERNEL" or "GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function is called under a rcu read lock rather than in interrupt context. To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx. Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
Xin Long	29b8ea35b0	sctp: disable BH in sctp_for_each_endpoint [ Upstream commit `581409dacc` ] sctp currently holds the read_lock while iterating over sctp_ep_hashtable without disabling BH. If the CPU schedules another thread A at this moment, thread A may try to take the write_lock with BH disabled. Because BH is disabled, the CPU cannot schedule back to the thread holding the read_lock, while thread A keeps waiting for the read_lock, which results in a deadlock. This patch fixes the deadlock by calling read_lock_bh instead, so that BH is disabled while the read_lock is held in sctp_for_each_endpoint. Fixes: `626d16f50f` ("sctp: export some apis or variables for sctp_diag and reuse some for proc") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
Krister Johansen	e607742172	Fix an intermittent pr_emerg warning about lo becoming free. [ Upstream commit `f186ce61bb` ] It looks like this: Message from syslogd@flamingo at Apr 26 00:45:00 ... kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4 They seem to coincide with net namespace teardown. The message is emitted by netdev_wait_allrefs(). Forced a kdump in netdev_run_todo, but found that the refcount on the lo device was already 0 at the time we got to the panic. Used bcc to check the blocking in netdev_run_todo. The only places where we're off cpu there are in the rcu_barrier() and msleep() calls. That behavior is expected. The msleep time coincides with the amount of time we spend waiting for the refcount to reach zero; the rcu_barrier() wait times are not excessive. After looking through the list of callbacks that the netdevice notifiers invoke in this path, it appears that the dst_dev_event is the most interesting. The dst_ifdown path places a hold on the loopback_dev as part of releasing the dev associated with the original dst cache entry. Most of our notifier callbacks are straight-forward, but this one a) looks complex, and b) places a hold on the network interface in question. I constructed a new bcc script that watches various events in the lifetime of a dst cache entry. Note that dst_ifdown will take a hold on the loopback device until the invalidated dst entry gets freed. [ __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183 __dst_free rcu_nocb_kthread kthread ret_from_fork Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
Mateusz Jurczyk	6e8f72327b	af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers [ Upstream commit `defbcf2dec` ] Verify that the caller-provided sockaddr structure is large enough to contain the sa_family field, before accessing it in bind() and connect() handlers of the AF_UNIX socket. Since neither syscall enforces a minimum size of the corresponding memory region, very short sockaddrs (zero or one byte long) result in operating on uninitialized memory while referencing .sa_family. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
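The length check described in the commit above amounts to confirming that the caller-supplied length reaches past the end of the `sa_family` field before that field is read. A minimal userspace sketch follows; the helper name `sockaddr_covers_family` is hypothetical and this is not the kernel code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: true when 'addr_len' bytes are enough to contain
 * the sa_family field of a struct sockaddr.
 */
static bool sockaddr_covers_family(size_t addr_len)
{
	/* Offset of the field plus its size gives the first byte past it. */
	return addr_len >= offsetof(struct sockaddr, sa_family) +
			   sizeof(((struct sockaddr *)0)->sa_family);
}
```

A handler would call this first and reject the request (for example with `EINVAL`) when it returns false, so that zero- or one-byte addresses never reach the `sa_family` comparison.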
David Ahern	83cb92f4cf	net: vrf: Make add_fib_rules per network namespace flag [ Upstream commit `097d3c9508` ] Commit `1aa6c4f6b8` ("net: vrf: Add l3mdev rules on first device create") adds the l3mdev FIB rule the first time a VRF device is created. However, it only creates the rule once and only in the namespace the first device is created - which may not be init_net. Fix by using the net_generic capability to make the add_fib_rules flag per network namespace. Fixes: `1aa6c4f6b8` ("net: vrf: Add l3mdev rules on first device create") Reported-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
David Ahern	5586883813	net: ipv6: Release route when device is unregistering [ Upstream commit `8397ed36b7` ] Roopa reported attempts to delete a bond device that is referenced in a multipath route is hanging: $ ifdown bond2 # ifupdown2 command that deletes virtual devices unregister_netdevice: waiting for bond2 to become free. Usage count = 2 Steps to reproduce: echo 1 > /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown ip link add dev bond12 type bond ip link add dev bond13 type bond ip addr add 2001:db8:2::0/64 dev bond12 ip addr add 2001:db8:3::0/64 dev bond13 ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2 ip link del dev bond12 ip link del dev bond13 The root cause is the recent change to keep routes on a linkdown. Update the check to detect when the device is unregistering and release the route for that case. Fixes: `a1a22c1206` ("net: ipv6: Keep nexthop of multipath route on admin down") Reported-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:35 +02:00
Mintz, Yuval	199f4baff6	net: Zero ifla_vf_info in rtnl_fill_vfinfo() [ Upstream commit `0eed9cf584` ] Some of the structure's fields are not initialized by the rtnetlink. If driver doesn't set those in ndo_get_vf_config(), they'd leak memory to user. Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> CC: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:34 +02:00
Mateusz Jurczyk	d8d01fc9ba	decnet: dn_rtmsg: Improve input length sanitization in dnrmg_receive_user_skb [ Upstream commit `dd0da17b20` ] Verify that the length of the socket buffer is sufficient to cover the nlmsghdr structure before accessing the nlh->nlmsg_len field for further input sanitization. If the client only supplies 1-3 bytes of data in sk_buff, then nlh->nlmsg_len remains partially uninitialized and contains leftover memory from the corresponding kernel allocation. Operating on such data may result in indeterminate evaluation of the nlmsg_len < sizeof(*nlh) expression. The bug was discovered by a runtime instrumentation designed to detect use of uninitialized memory in the kernel. The patch prevents this and other similar tools (e.g. KMSAN) from flagging this behavior in the future. Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:34 +02:00
Johannes Berg	e34cacd27f	mac80211: free netdev on dev_alloc_name() error [ Upstream commit `c7a61cba71` ] The change to remove free_netdev() from ieee80211_if_free() erroneously didn't add the necessary free_netdev() for when ieee80211_if_free() is called directly in one place, rather than as the priv_destructor. Add the missing call. Fixes: `cf124db566` ("net: Fix inconsistent teardown and release of private netdev state.") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:34 +02:00
Stephen Rothwell	64603b75f8	net: s390: fix up for "Fix inconsistent teardown and release of private netdev state" [ Upstream commit `cd1997f6c1` ] Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:34 +02:00
David S. Miller	95876855a5	net: Fix inconsistent teardown and release of private netdev state. [ Upstream commit `cf124db566` ] Network devices can allocate resources and private memory using netdev_ops->ndo_init(). However, the release of these resources can occur in one of two different places. Either netdev_ops->ndo_uninit() or netdev->destructor(). The decision of which operation frees the resources depends upon whether it is necessary for all netdev refs to be released before it is safe to perform the freeing. netdev_ops->ndo_uninit() presumably can occur right after the NETDEV_UNREGISTER notifier completes and the unicast and multicast address lists are flushed. netdev->destructor(), on the other hand, does not run until the netdev references all go away. Further complicating the situation is that netdev->destructor() almost universally also does a free_netdev(). This creates a problem for the logic in register_netdevice(). Because all callers of register_netdevice() manage the freeing of the netdev, and invoke free_netdev(dev) if register_netdevice() fails. If netdev_ops->ndo_init() succeeds, but something else fails inside of register_netdevice(), it does call ndo_ops->ndo_uninit(). But it is not able to invoke netdev->destructor(). This is because netdev->destructor() will do a free_netdev() and then the caller of register_netdevice() will do the same. However, this means that the resources that would normally be released by netdev->destructor() will not be. Over the years drivers have added local hacks to deal with this, by invoking their destructor parts by hand when register_netdevice() fails. Many drivers do not try to deal with this, and instead we have leaks. Let's close this hole by formalizing the distinction between what private things need to be freed up by netdev->destructor() and whether the driver needs unregister_netdevice() to perform the free_netdev().
netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Now, register_netdevice() can sanely release all resources after ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit() and netdev->priv_destructor(). And at the end of unregister_netdevice(), we invoke netdev->priv_destructor() and optionally call free_netdev(). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:34 +02:00
Alexander Potapenko	3227b51e72	net: don't call strlen on non-terminated string in dev_set_alias() [ Upstream commit `c28294b941` ] KMSAN reported a use of uninitialized memory in dev_set_alias(), which was caused by calling strlcpy() (which in turn called strlen()) on the user-supplied non-terminated string. Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-07-05 14:41:33 +02:00
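The problem class in the commit above is a copy routine that searches for a terminating NUL in a buffer that may not contain one. One way to avoid that, sketched here in userspace C, is to copy by explicit length and add the terminator by hand. The helper name `copy_unterminated` is hypothetical and this is not the code from the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy at most 'src_len' bytes from a buffer that is
 * NOT guaranteed to be NUL-terminated into 'dst', always terminating 'dst'.
 * Never reads past src[src_len - 1]. Returns the number of bytes copied,
 * not counting the terminator.
 */
static size_t copy_unterminated(char *dst, size_t dst_size,
				const char *src, size_t src_len)
{
	size_t n;

	if (dst_size == 0)
		return 0;	/* no room even for the terminator */

	/* Leave one byte of 'dst' for the terminating NUL. */
	n = src_len < dst_size - 1 ? src_len : dst_size - 1;
	memcpy(dst, src, n);
	dst[n] = '\0';
	return n;
}
```

Because the length comes from the caller rather than from scanning the source, no uninitialized bytes beyond the supplied data are ever inspected.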
Greg Kroah-Hartman	8afcfa55e6	Linux 4.11.8	2017-06-29 13:03:17 +02:00
Arend Van Spriel	bc3512188f	brcmfmac: fix uninitialized warning in brcmf_usb_probe_phase2() commit `35abcd4f9f` upstream. This fixes the following warning: drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c: In function 'brcmf_usb_probe_phase2': drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c:1198:2: warning: 'devinfo' may be used uninitialized in this function [-Wmaybe-uninitialized] mutex_unlock(&devinfo->dev_init_lock); Fixes: `6d0507a777` ("brcmfmac: add parameter to pass error code in firmware callback") Cc: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:56 +02:00
Willem de Bruijn	e5d12fe110	netfilter: xtables: fix build failure from COMPAT_XT_ALIGN outside CONFIG_COMPAT commit `751a9c7638` upstream. The patch in the Fixes references COMPAT_XT_ALIGN in the definition of XT_DATA_TO_USER, outside an #ifdef CONFIG_COMPAT block. Split XT_DATA_TO_USER into separate compat and non compat variants and define the first inside an CONFIG_COMPAT block. This simplifies both variants by removing branches inside the macro. Fixes: `324318f024` ("netfilter: xtables: zero padding in data_to_user") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:56 +02:00
Willem de Bruijn	865cdeeae7	netfilter: xtables: zero padding in data_to_user commit `324318f024` upstream. When looking up an iptables rule, the iptables binary compares the aligned match and target data (XT_ALIGN). In some cases this can exceed the actual data size to include padding bytes. Before commit `f77bc5b23f` ("iptables: use match, target and data copy_to_user helpers") the malloc()ed bytes were overwritten by the kernel with kzalloced contents, zeroing the padding and making the comparison succeed. After this patch, the kernel copies and clears only data, leaving the padding bytes undefined. Extend the clear operation from data size to aligned data size to include the padding bytes, if any. Padding bytes can be observed in both match and target, and the bug triggered, by issuing a rule with match icmp and target ACCEPT: iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT Fixes: `f77bc5b23f` ("iptables: use match, target and data copy_to_user helpers") Reported-by: Paul Moore <pmoore@redhat.com> Reported-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:55 +02:00
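The fix described above extends the cleared region from the data size to the aligned data size. A userspace analogue, using `memcpy` and `memset` in place of the kernel's user-copy helpers, might look like the sketch below. The helper names `align_up` and `copy_with_zero_padding` are hypothetical, and the alignment is assumed to be a power of two.

```c
#include <stddef.h>
#include <string.h>

/* Round 'size' up to a multiple of 'align' (align must be a power of two). */
static size_t align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: copy 'size' bytes into 'dst', then zero the padding
 * up to the aligned size so that a byte-wise comparison of the aligned
 * region is deterministic. 'dst' must have room for align_up(size, align)
 * bytes. Returns the number of bytes written.
 */
static size_t copy_with_zero_padding(void *dst, const void *src,
				     size_t size, size_t align)
{
	size_t aligned = align_up(size, align);

	memcpy(dst, src, size);
	memset((unsigned char *)dst + size, 0, aligned - size);
	return aligned;
}
```

With the padding zeroed, comparing the full aligned region of two copies of the same rule data succeeds, which is the comparison the commit message says the iptables binary performs.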
Russell King	958c8a07d5	net: phy: fix marvell phy status reading commit `898805e0cd` upstream. The Marvell driver incorrectly provides phydev->lp_advertising as the logical AND of the link partner's advert and our advert. This is incorrect - this field is supposed to store the link partner's unmodified advertisement. This allows ethtool to report the correct link partner auto-negotiation status. Fixes: `be937f1f89` ("Marvell PHY m88e1111 driver fix") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:55 +02:00
Hauke Mehrtens	dc95f4a320	spi: double time out tolerance commit `833bfade96` upstream. The generic SPI code calculates how long the issued transfer would take and adds 100ms in addition to the timeout as tolerance. On my 500 MHz Lantiq Mips SoC I am getting timeouts from the SPI like this when the system boots up: m25p80 spi32766.4: SPI transfer timed out blk_update_request: I/O error, dev mtdblock3, sector 2 SQUASHFS error: squashfs_read_data failed to read block 0x6e After increasing the tolerance for the timeout to 200ms I haven't seen these SPI transfer time outs any more. The Lantiq SPI driver in use here has an extra work queue in between, which gets triggered when the controller send the last word and the hardware FIFOs used for reading and writing are only 8 words long. Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Amit Pundir <amit.pundir@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:55 +02:00
William Wu	87e67ff633	usb: gadget: f_fs: avoid out of bounds access on comp_desc commit `b7f73850bb` upstream. Companion descriptor is only used for SuperSpeed endpoints, if the endpoints are HighSpeed or FullSpeed, the Companion descriptor will not allocated, so we can only access it if gadget is SuperSpeed. I can reproduce this issue on Rockchip platform rk3368 SoC which supports USB 2.0, and use functionfs for ADB. Kernel build with CONFIG_KASAN=y and CONFIG_SLUB_DEBUG=y report the following BUG: ================================================================== BUG: KASAN: slab-out-of-bounds in ffs_func_set_alt+0x224/0x3a0 at addr ffffffc0601f6509 Read of size 1 by task swapper/0/0 ============================================================================ BUG kmalloc-256 (Not tainted): kasan: bad access detected ---------------------------------------------------------------------------- Disabling lock debugging due to kernel taint INFO: Allocated in ffs_func_bind+0x52c/0x99c age=1275 cpu=0 pid=1 alloc_debug_processing+0x128/0x17c ___slab_alloc.constprop.58+0x50c/0x610 __slab_alloc.isra.55.constprop.57+0x24/0x34 __kmalloc+0xe0/0x250 ffs_func_bind+0x52c/0x99c usb_add_function+0xd8/0x1d4 configfs_composite_bind+0x48c/0x570 udc_bind_to_driver+0x6c/0x170 usb_udc_attach_driver+0xa4/0xd0 gadget_dev_desc_UDC_store+0xcc/0x118 configfs_write_file+0x1a0/0x1f8 __vfs_write+0x64/0x174 vfs_write+0xe4/0x200 SyS_write+0x68/0xc8 el0_svc_naked+0x24/0x28 INFO: Freed in inode_doinit_with_dentry+0x3f0/0x7c4 age=1275 cpu=7 pid=247 ... 
Call trace: [<ffffff900808aab4>] dump_backtrace+0x0/0x230 [<ffffff900808acf8>] show_stack+0x14/0x1c [<ffffff90084ad420>] dump_stack+0xa0/0xc8 [<ffffff90082157cc>] print_trailer+0x188/0x198 [<ffffff9008215948>] object_err+0x3c/0x4c [<ffffff900821b5ac>] kasan_report+0x324/0x4dc [<ffffff900821aa38>] __asan_load1+0x24/0x50 [<ffffff90089eb750>] ffs_func_set_alt+0x224/0x3a0 [<ffffff90089d3760>] composite_setup+0xdcc/0x1ac8 [<ffffff90089d7394>] android_setup+0x124/0x1a0 [<ffffff90089acd18>] _setup+0x54/0x74 [<ffffff90089b6b98>] handle_ep0+0x3288/0x4390 [<ffffff90089b9b44>] dwc_otg_pcd_handle_out_ep_intr+0x14dc/0x2ae4 [<ffffff90089be85c>] dwc_otg_pcd_handle_intr+0x1ec/0x298 [<ffffff90089ad680>] dwc_otg_pcd_irq+0x10/0x20 [<ffffff9008116328>] handle_irq_event_percpu+0x124/0x3ac [<ffffff9008116610>] handle_irq_event+0x60/0xa0 [<ffffff900811af30>] handle_fasteoi_irq+0x10c/0x1d4 [<ffffff9008115568>] generic_handle_irq+0x30/0x40 [<ffffff90081159b4>] __handle_domain_irq+0xac/0xdc [<ffffff9008080e9c>] gic_handle_irq+0x64/0xa4 ... Memory state around the buggy address: ffffffc0601f6400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffffffc0601f6480: 00 00 00 00 00 00 00 00 00 00 06 fc fc fc fc fc >ffffffc0601f6500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffffffc0601f6580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffffc0601f6600: fc fc fc fc fc fc fc fc 00 00 00 00 00 00 00 00 ================================================================== Signed-off-by: William Wu <william.wu@rock-chips.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Cc: Jerry Zhang <zhangjerry@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:55 +02:00
Daniel Vetter	5693426a5c	drm: Fix GETCONNECTOR regression commit `e94ac3510b` upstream. In commit `91eefc05f0` Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Wed Dec 14 00:08:10 2016 +0100 drm: Tighten locking in drm_mode_getconnector I reordered the logic a bit in that IOCTL, but that broke userspace since it'll get the new mode list, but not the new property values. Fix that again. v2: Fix up the error path handling when copy_to_user for the modes fails (Dhinakaran). Fixes: `91eefc05f0` ("drm: Tighten locking in drm_mode_getconnector") Cc: Sean Paul <seanpaul@chromium.org> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: David Airlie <airlied@linux.ie> Cc: dri-devel@lists.freedesktop.org Reported-by: "H.J. Lu" <hjl.tools@gmail.com> Tested-by: "H.J. Lu" <hjl.tools@gmail.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100576 Cc: "H.J. Lu" <hjl.tools@gmail.com> Cc: "Pandiyan, Dhinakaran" <dhinakaran.pandiyan@intel.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170620202837.1701-1-daniel.vetter@ffwll.ch Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:54 +02:00
David Howells	575cd7d4ce	rxrpc: Fix several cases where a padded len isn't checked in ticket decode commit `5f2f97656a` upstream. This fixes CVE-2017-7482. When a kerberos 5 ticket is being decoded so that it can be loaded into an rxrpc-type key, there are several places in which the length of a variable-length field is checked to make sure that it's not going to overrun the available data - but the data is padded to the nearest four-byte boundary and the code doesn't check for this extra. This could lead to the size-remaining variable wrapping and the data pointer going over the end of the buffer. Fix this by making the various variable-length data checks use the padded length. Reported-by: 石磊 <shilei-c@360.cn> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.c.dionne@auristor.com> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:54 +02:00
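The check described in the commit above compares the padded length of a variable-length field, rounded up to the next four-byte boundary, against the data remaining, rather than the raw length. The sketch below illustrates that idea in standalone C. The helper name `padded_field_fits` is hypothetical and this is not the rxrpc code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: decide whether a field of 'len' bytes, padded to a
 * four-byte boundary, fits in 'remaining' bytes. On success the padded
 * length is stored in '*padded_out' so the caller can advance by it.
 */
static bool padded_field_fits(uint32_t len, uint32_t remaining,
			      uint32_t *padded_out)
{
	uint32_t padded;

	/* Rounding up must not wrap around to a small value. */
	if (len > UINT32_MAX - 3)
		return false;

	padded = (len + 3) & ~UINT32_C(3);
	if (padded > remaining)
		return false;	/* padding would run past the buffer */

	*padded_out = padded;
	return true;
}
```

Subtracting the padded length, not the raw length, from the remaining size keeps the size-remaining counter from wrapping: a 5-byte field needs 8 bytes, so it is rejected when only 7 remain.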
Jarkko Nikula	017294bd93	ACPI / scan: Fix enumeration for special SPI and I2C devices commit `e4330d8bf6` upstream. Commit `f406270bf7` ("ACPI / scan: Set the visited flag for all enumerated devices") caused two groups of special SPI or I2C devices to no longer enumerate. SPI and I2C devices are expected to be enumerated by the SPI and I2C subsystems, but that change causes acpi_bus_attach() to mark those devices with acpi_device_set_enumerated(). The first group of devices is matched using the Device Tree compatible property with the special _HID "PRP0001". Those devices have a matched scan handler, acpi_scan_attach_handler() returns 1 and acpi_bus_attach() marks them with acpi_device_set_enumerated(). The second group of devices, those without a valid _HID such as "LNXVIDEO", have device->pnp.type.platform_id set to zero and the change again marks them with acpi_device_set_enumerated(). Fix this by flagging the SPI and I2C devices during struct acpi_device object initialization time and letting the code in acpi_bus_attach() go through the device_attach() and acpi_default_enumeration() path for all SPI and I2C devices. Fixes: `f406270bf7` (ACPI / scan: Set the visited flag for all enumerated devices) Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:54 +02:00
Rafael J. Wysocki	3645b8f1fb	ACPI / scan: Apply default enumeration to devices with ACPI drivers commit `f5beabfe61` upstream. The current code in acpi_bus_attach() is inconsistent with respect to device objects with ACPI drivers bound to them, as it allows ACPI drivers to bind to device objects with existing "physical" device companions, but it doesn't allow "physical" device objects to be created for ACPI device objects with ACPI drivers bound to them. Thus, in some cases, the outcome depends on the ordering of events which is confusing at best. For this reason, modify acpi_bus_attach() to call acpi_default_enumeration() for device objects with the pnp.type.platform_id flag set regardless of whether or not any ACPI drivers are bound to them. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Joey Lee <jlee@suse.com> Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:54 +02:00
Junshan Fang	e79fccb184	drm/amdgpu: add Polaris12 DID commit `6e88491cf2` upstream. Signed-off-by: Junshan Fang <Junshan.Fang@amd.com> Reviewed-by: Roger.He <Hongbo.He@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:54 +02:00
Alex Deucher	033b267d4f	drm/amdgpu: adjust default display clock commit `52b482b0f4` upstream. Increase the default display clock on newer asics to accommodate some high res modes with really high refresh rates. bug: https://bugs.freedesktop.org/show_bug.cgi?id=93826 Acked-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:53 +02:00
Alex Deucher	0e01b65960	drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating commit `05b4017b37` upstream. We were using the wrong structure which led to an overflow on some boards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=101387 Acked-by: Chunming Zhou <david1.zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:53 +02:00
Alex Deucher	08a02c2c3f	drm/radeon: add a quirk for Toshiba Satellite L20-183 commit `acfd6ee4fa` upstream. Fixes resume from suspend. bug: https://bugzilla.kernel.org/show_bug.cgi?id=196121 Reported-by: Przemek <soprwa@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:53 +02:00
Alex Deucher	b54ffeb713	drm/radeon: add a PX quirk for another K53TK variant commit `4eb59793cc` upstream. Disable PX on these systems. bug: https://bugs.freedesktop.org/show_bug.cgi?id=101491 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:53 +02:00
Nicholas Bellinger	47a1f8d951	iscsi-target: Reject immediate data underflow larger than SCSI transfer length commit `abb85a9b51` upstream. When iscsi WRITE underflow occurs there are two different scenarios that can happen. Normally in practice, when an EDTL vs. SCSI CDB TRANSFER LENGTH underflow is detected, the iscsi immediate data payload is the smaller SCSI CDB TRANSFER LENGTH. That is, when a host fabric LLD is using a fixed size EDTL for a specific control CDB, the SCSI CDB TRANSFER LENGTH and actual SCSI payload ends up being smaller than EDTL. In iscsi, this means the received iscsi immediate data payload matches the smaller SCSI CDB TRANSFER LENGTH, because there is no more SCSI payload to accept beyond SCSI CDB TRANSFER LENGTH. However, it's possible for a malicious host to send a WRITE underflow where EDTL is larger than SCSI CDB TRANSFER LENGTH, but incoming iscsi immediate data actually matches EDTL. In the wild, we've never had an iscsi host environment actually try to do this. For this special case, it's wrong to truncate part of the control CDB payload and continue to process the command during underflow when immediate data payload received was larger than SCSI CDB TRANSFER LENGTH, so go ahead and reject and drop the bogus payload as a defensive action. Note this potential bug was originally relaxed by the following for allowing WRITE underflow in MSFT FCP host environments: commit `c72c525022` Author: Roland Dreier <roland@purestorage.com> Date: Wed Jul 22 15:08:18 2015 -0700 target: allow underflow/overflow for PR OUT etc. commands Cc: Roland Dreier <roland@purestorage.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:52 +02:00
Nicholas Bellinger	600966d6bc	iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP commit `105fa2f44e` upstream. This patch fixes a BUG() in iscsit_close_session() that could be triggered when iscsit_logout_post_handler() execution from within tx thread context was not run for more than SECONDS_FOR_LOGOUT_COMP (15 seconds), and the TCP connection didn't already close before then forcing tx thread context to automatically exit. This would manifest itself during explicit logout as: [33206.974254] 1 connection(s) still exist for iSCSI session to iqn.1993-08.org.debian:01:3f5523242179 [33206.980184] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 2100.772 msecs [33209.078643] ------------[ cut here ]------------ [33209.078646] kernel BUG at drivers/target/iscsi/iscsi_target.c:4346! Normally when explicit logout attempt fails, the tx thread context exits and iscsit_close_connection() from rx thread context does the extra cleanup once it detects conn->conn_logout_remove has not been cleared by the logout type specific post handlers. To address this special case, if the logout post handler in tx thread context detects conn->tx_thread_active has already been cleared, simply return and exit in order for existing iscsit_close_connection() logic from rx thread context do failed logout cleanup. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagig@mellanox.com> Tested-by: Gary Guo <ghg@datera.io> Tested-by: Chu Yuan Lin <cyl@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:52 +02:00
Nicholas Bellinger	852a804236	target: Fix kref->refcount underflow in transport_cmd_finish_abort commit `73d4e580cc` upstream. This patch fixes a se_cmd->cmd_kref underflow during CMD_T_ABORTED when a fabric driver drops its second reference from below the target_core_tmr.c based callers of transport_cmd_finish_abort(). Recently with the conversion of kref to refcount_t, this bug was manifesting itself as: [705519.601034] refcount_t: underflow; use-after-free. [705519.604034] INFO: NMI handler (kgdb_nmi_handler) took too long to run: 20116.512 msecs [705539.719111] ------------[ cut here ]------------ [705539.719117] WARNING: CPU: 3 PID: 26510 at lib/refcount.c:184 refcount_sub_and_test+0x33/0x51 Since the original kref atomic_t based kref_put() didn't check for underflow and only invoked the final callback when zero was reached, this bug did not manifest in practice since all se_cmd memory is using preallocated tags. To address this, go ahead and propagate the existing return from transport_put_cmd() up via transport_cmd_finish_abort(), and change transport_cmd_finish_abort() + core_tmr_handle_tas_abort() callers to only do their local target_put_sess_cmd() if necessary. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Christie <mchristi@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Himanshu Madhani <himanshu.madhani@qlogic.com> Cc: Sagi Grimberg <sagig@mellanox.com> Tested-by: Gary Guo <ghg@datera.io> Tested-by: Chu Yuan Lin <cyl@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:52 +02:00
Will Deacon	830146b639	arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW commit `dbb236c1ce` upstream. Recently vDSO support for CLOCK_MONOTONIC_RAW was added in `49eea433b3` ("arm64: Add support for CLOCK_MONOTONIC_RAW in clock_gettime() vDSO"). Noticing that the core timekeeping code never set tkr_raw.xtime_nsec, the vDSO implementation didn't bother exposing it via the data page and instead took the unshifted tk->raw_time.tv_nsec value which was then immediately shifted left in the vDSO code. Unfortunately, by accelerating the MONOTONIC_RAW clockid, it uncovered potential 1ns time inconsistencies caused by the timekeeping core not handling sub-ns resolution. Now that the core code has been fixed and is actually setting tkr_raw.xtime_nsec, we need to take that into account in the vDSO by adding it to the shifted raw_time value, in order to fix the user-visible inconsistency. Rather than do that at each use (and expand the data page in the process), instead perform the shift/addition operation when populating the data page and remove the shift from the vDSO code entirely. [jstultz: minor whitespace tweak, tried to improve commit message to make it more clear this fixes a regression] Reported-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Daniel Mentz <danielmentz@google.com> Acked-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Link: http://lkml.kernel.org/r/1496965462-20003-4-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:52 +02:00
John Stultz	102d12f156	time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting commit `3d88d56c58` upstream. Due to how the MONOTONIC_RAW accumulation logic was handled, there is the potential for a 1ns discontinuity when we do accumulations. This small discontinuity has for the most part gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW in their vDSO clock_gettime implementation, we've seen failures with the inconsistency-check test in kselftest. This patch addresses the issue by using the same sub-ns accumulation handling that CLOCK_MONOTONIC uses, which avoids the issue for in-kernel users. Since the ARM64 vDSO implementation has its own clock_gettime calculation logic, this patch reduces the frequency of errors, but failures are still seen. The ARM64 vDSO will need to be updated to include the sub-nanosecond xtime_nsec values in its calculation for this issue to be completely fixed. Signed-off-by: John Stultz <john.stultz@linaro.org> Tested-by: Daniel Mentz <danielmentz@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Will Deacon <will.deacon@arm.com> Cc: Miroslav Lichvar <mlichvar@redhat.com> Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:52 +02:00
John Stultz	fa79bdd4c3	time: Fix clock->read(clock) race around clocksource changes commit `ceea5e3771` upstream. In tests, which exercise switching of clocksources, a NULL pointer dereference can be observed on ARM64 platforms in the clocksource read() function: u64 clocksource_mmio_readl_down(struct clocksource *c) { return ~(u64)readl_relaxed(to_mmio_clksrc(c)->reg) & c->mask; } This is called from the core timekeeping code via: cycle_now = tkr->read(tkr->clock); tkr->read is the cached tkr->clock->read() function pointer. When the clocksource is changed then tkr->clock and tkr->read are updated sequentially. The code above results in a sequential load operation of tkr->read and tkr->clock as well. If the store to tkr->clock hits between the loads of tkr->read and tkr->clock, then the old read() function is called with the new clock pointer. As a consequence the read() function dereferences a different data structure and the resulting 'reg' pointer can point anywhere including NULL. This problem was introduced when the timekeeping code was switched over to use struct tk_read_base. Before that, it was theoretically possible as well when the compiler decided to reload clock in the code sequence: now = tk->clock->read(tk->clock); Add a helper function which avoids the issue by reading tk_read_base->clock once into a local variable clk and then issue the read function via clk->read(clk). This guarantees that the read() function always gets the proper clocksource pointer handed in. Since there is now no use for the tkr.read pointer, this patch also removes it, and to address stopping the fast timekeeper during suspend/resume, it introduces a dummy clocksource to use rather than just a dummy read function.
Signed-off-by: John Stultz <john.stultz@linaro.org> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Stephen Boyd <stephen.boyd@linaro.org> Cc: Miroslav Lichvar <mlichvar@redhat.com> Cc: Daniel Mentz <danielmentz@google.com> Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:51 +02:00
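The single-load helper described in commit fa79bdd4c3 above can be sketched in userspace C roughly as follows. This is an illustration only, not the kernel patch: the struct layouts are trimmed, `tk_clock_read_sketch()` and `demo_read()` are hypothetical names, and a C11 relaxed atomic load stands in for whatever read-once primitive the kernel code actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-in for the kernel's struct clocksource (assumption). */
struct clocksource {
    uint64_t (*read)(struct clocksource *cs);
    uint64_t mask;
    uint64_t value;   /* stand-in for a hardware counter register */
};

/* Trimmed stand-in for struct tk_read_base: only the clock pointer is kept. */
struct tk_read_base {
    _Atomic(struct clocksource *) clock;
};

/* Hypothetical read() callback: returns the masked counter value. */
static uint64_t demo_read(struct clocksource *cs)
{
    return cs->value & cs->mask;
}

/*
 * Racy pattern from the commit message: tkr->read(tkr->clock) performs two
 * separate loads, so a concurrent clocksource switch can pair the old read()
 * with the new clock. Loading the pointer once and calling through that same
 * pointer keeps the function and its argument consistent.
 */
static inline uint64_t tk_clock_read_sketch(struct tk_read_base *tkr)
{
    struct clocksource *clk =
        atomic_load_explicit(&tkr->clock, memory_order_relaxed);

    assert(clk != NULL);
    return clk->read(clk);
}
```

The point of the design is that the callee and its argument come from one snapshot of the pointer, which is also why a separately cached `read` pointer is no longer needed.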
Arend Van Spriel	0f3ae212fc	brcmfmac: unbind all devices upon failure in firmware callback commit `7a51461fc2` upstream. When request firmware fails, brcmf_ops_sdio_remove is being called and brcmf_bus freed. In such circumstances if you do a suspend/resume cycle the kernel hangs on resume due to a NULL pointer dereference in the resume function. So in brcmf_sdio_firmware_callback() we need to unbind the driver from both sdio_func devices when firmware load failure is indicated. Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:51 +02:00
Arend Van Spriel	2cb9b88c7d	brcmfmac: use firmware callback upon failure to load commit `03fb0e8393` upstream. When firmware loading failed the code used to unbind the device provided by the calling code. However, for the sdio driver two devices are bound and both need to be released upon failure. The callback has been extended with parameter to pass error code so add that in this commit upon firmware loading failure. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:51 +02:00
Arend Van Spriel	1e5c71934d	brcmfmac: add parameter to pass error code in firmware callback commit `6d0507a777` upstream. Extend the parameters in the firmware callback so it can be called upon success and failure. This allows the caller to properly clear all resources in the failure path. Right now the error code is always zero, ie. success. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:51 +02:00
Daniel Drake	b6bfede20b	Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list commit `817ae460c7` upstream. Without this quirk, the touchpad is not responsive on this product, with the following message repeated in the logs: psmouse serio1: bad data from KBC - timeout Add it to the notimeout list alongside other similar Fujitsu laptops. Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:50 +02:00
Naveen N. Rao	fc1c180ba6	powerpc/64s: Handle data breakpoints in Radix mode commit `d89ba5353f` upstream. On Power9, trying to use data breakpoints throws the splat shown below. This is because the check for a data breakpoint in DSISR is in do_hash_page(), which is not called when in Radix mode. Unable to handle kernel paging request for data at address 0xc000000000e19218 Faulting instruction address: 0xc0000000001155e8 cpu 0x0: Vector: 300 (Data Access) at [c0000000ef1e7b20] pc: c0000000001155e8: find_pid_ns+0x48/0xe0 lr: c000000000116ac4: find_task_by_vpid+0x44/0x90 sp: c0000000ef1e7da0 msr: 9000000000009033 dar: c000000000e19218 dsisr: 400000 Move the check to handle_page_fault() so as to catch data breakpoints in both Hash and Radix MMU modes. We have to change the check in do_hash_page() against 0xa410 to use 0xa450, so as to include the value of (DSISR_DABRMATCH << 16). There are two sites that call handle_page_fault() when in Radix, both already pass DSISR in r4. Fixes: `caca285e5a` ("powerpc/mm/radix: Use STD_MMU_64 to properly isolate hash related code") Reported-by: Shriya R. Kulkarni <shriykul@in.ibm.com> Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> [mpe: Fix the fall-through case on hash, we need to reload DSISR] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:50 +02:00
Naveen N. Rao	7a6c505366	powerpc/kprobes: Pause function_graph tracing during jprobes handling commit `a9f8553e93` upstream. This fixes a crash when function_graph and jprobes are used together. This is essentially commit `237d28db03` ("ftrace/jprobes/x86: Fix conflict between jprobes and function graph tracing"), but for powerpc. Jprobes breaks function_graph tracing since the jprobe hook needs to use jprobe_return(), which never returns back to the hook, but instead to the original jprobe'd function. The solution is to momentarily pause function_graph tracing before invoking the jprobe hook and re-enable it when returning back to the original jprobe'd function. Fixes: `6794c78243` ("powerpc64: port of the function graph tracer") Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:50 +02:00
Eric W. Biederman	e6281e0df1	signal: Only reschedule timers on signals timers have sent commit `57db7e4a2d` upstream. Thomas Gleixner wrote: > The CRIU support added a 'feature' which allows a user space task to send > arbitrary (kernel) signals to itself. The changelog says: > > The kernel prevents sending of siginfo with positive si_code, because > these codes are reserved for kernel. I think we can allow a task to > send such a siginfo to itself. This operation should not be dangerous. > > Quite contrary to that claim, it turns out that it is outright dangerous > for signals with info->si_code == SI_TIMER. The following code sequence in > a user space task allows to crash the kernel: > > id = timer_create(CLOCK_XXX, ..... signo = SIGX); > timer_set(id, ....); > info->si_signo = SIGX; > info->si_code = SI_TIMER: > info->_sifields._timer._tid = id; > info->_sifields._timer._sys_private = 2; > rt_[tg]sigqueueinfo(..., SIGX, info); > sigemptyset(&sigset); > sigaddset(&sigset, SIGX); > rt_sigtimedwait(sigset, info); > > For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this > results in a kernel crash because sigwait() dequeues the signal and the > dequeue code observes: > > info->si_code == SI_TIMER && info->_sifields._timer._sys_private != 0 > > which triggers the following callchain: > > do_schedule_next_timer() -> posix_cpu_timer_schedule() -> arm_timer() > > arm_timer() executes a list_add() on the timer, which is already armed via > the timer_set() syscall. That's a double list add which corrupts the posix > cpu timer list. As a consequence the kernel crashes on the next operation > touching the posix cpu timer list. > > Posix clocks which are internally implemented based on hrtimers are not > affected by this because hrtimer_start() can handle already armed timers > nicely, but it's a reliable way to trigger the WARN_ON() in > hrtimer_forward(), which complains about calling that function on an > already armed timer. 
This problem has existed since the posix timer code was merged into 2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to inject not just a signal (which linux has supported since 1.0) but the full siginfo of a signal. The core problem is that the code will reschedule in response to signals getting dequeued not just for signals the timers sent but for other signals that happen to have a si_code of SI_TIMER. Avoid this confusion by testing to see if the queued signal was preallocated as all timer signals are preallocated, and so far only the timer code preallocates signals. Move the check for if a timer needs to be rescheduled up into collect_signal where the preallocation check must be performed, and pass the result back to dequeue_signal where the code reschedules timers. This makes it clear why the code cares about preallocated timers. Reported-by: Thomas Gleixner <tglx@linutronix.de> History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Reference: `66dd34ad31` ("signal: allow to send any siginfo to itself") Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO") Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks & timers") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:50 +02:00
Jason A. Donenfeld	3a182635e0	random: silence compiler warnings and fix race commit `4a072c71f4` upstream. Odd versions of gcc for the sh4 architecture will actually warn about flags being used while uninitialized, so we set them to zero. Non crazy gccs will optimize that out again, so it doesn't make a difference. Next, over aggressive gccs could inline the expression that defines use_lock, which could then introduce a race resulting in a lock imbalance. By using READ_ONCE, we prevent that fate. Finally, we make that assignment const, so that gcc can still optimize a nice amount. Finally, we fix a potential deadlock between primary_crng.lock and batched_entropy_reset_lock, where they could be called in opposite order. Moving the call to invalidate_batched_entropy to outside the lock rectifies this issue. Fixes: `b169c13de4` Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:49 +02:00
Sebastian Parschauer	1429fdb15f	HID: Add quirk for Dell PIXART OEM mouse commit `3db28271f0` upstream. This mouse is also known under other IDs. It needs the quirk ALWAYS_POLL or will disconnect in runlevel 1 or 3. Signed-off-by: Sebastian Parschauer <sparschauer@suse.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:49 +02:00
Raju Rangoju	abef7afb35	cxgb4: notify uP to route ctrlq compl to rdma rspq commit `dec6b33163` upstream. During the module initialisation there is a possible race (basically race between uld and lld) where neither the uld nor lld notifies the uP about where to route the ctrl queue completions. LLD skips notifying uP as the rdma queues were not created by then (will leave it to ULD to notify the uP). As the ULD comes up, it also skips notifying the uP as the flag FULL_INIT_DONE is not set yet (ULD assumes that the interface is not up yet). Consequently, this race between uld and lld leaves uP unnotified about where to send the ctrl queue completions to, leading to iwarp RI_RES WR failure. Here is the race: CPU 0 CPU1 - allocates nic rx queues - t4_sge_alloc_ctrl_txq() (if rdma rsp queues exist, tell uP to route ctrl queue compl to rdma rspq) - acquires the mutex_lock - allocates rdma response queues - if FULL_INIT_DONE set, tell uP to route ctrl queue compl to rdma rspq - relinquishes mutex_lock - acquires the mutex_lock - enable_rx() - set FULL_INIT_DONE - relinquishes mutex_lock This patch fixes the above issue. Fixes: e7519f9926f1('cxgb4: avoid enabling napi twice to the same queue') Signed-off-by: Raju Rangoju <rajur@chelsio.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:49 +02:00
Christophe Jaillet	4eec15d336	CIFS: Fix some return values in case of error in 'crypt_message' commit `517a6e43c4` upstream. 'rc' is known to be 0 at this point. So if 'init_sg' or 'kzalloc' fails, we should return -ENOMEM instead. Also remove a useless 'rc' in a debug message as it is meaningless here. Fixes: `026e93dc0a` ("CIFS: Encrypt SMB3 requests before sending") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Reviewed-by: Aurelien Aptel <aaptel@suse.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:48 +02:00
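The error-path issue described in commit 4eec15d336 above (a status variable that is still 0 when an allocation fails) can be illustrated with a small userspace sketch. It is not the CIFS code: `crypt_message_sketch()` and its `force_alloc_failure` parameter are hypothetical, and `malloc()` stands in for the kernel allocators named in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a function with a single exit label.
 * 'rc' is 0 when the allocation is attempted, so a bare "goto out"
 * on failure would report success to the caller.
 */
static int crypt_message_sketch(size_t len, int force_alloc_failure)
{
    int rc = 0;
    unsigned char *buf;

    assert(len > 0);

    /* force_alloc_failure simulates the allocator returning NULL. */
    buf = force_alloc_failure ? NULL : malloc(len);
    if (!buf) {
        rc = -ENOMEM;   /* set the error explicitly before jumping out */
        goto out;
    }

    /* ... the real work on buf would go here ... */

    free(buf);
out:
    return rc;
}
```

A caller that checks `rc != 0` now sees the allocation failure instead of proceeding as if the message had been processed.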
Pavel Shilovsky	dcaa5a53cc	CIFS: Improve readdir verbosity commit `dcd87838c0` upstream. Downgrade the loglevel for SMB2 to prevent filling the log with messages if e.g. readdir was interrupted. Also make SMB2 and SMB1 codepaths do the same logging during readdir. Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:48 +02:00
Paul Mackerras	3ff8eda23a	KVM: PPC: Book3S HV: Save/restore host values of debug registers commit `7ceaa6dcd8` upstream. At present, HV KVM on POWER8 and POWER9 machines loses any instruction or data breakpoint set in the host whenever a guest is run. Instruction breakpoints are currently only used by xmon, but ptrace and the perf_event subsystem can set data breakpoints as well as xmon. To fix this, we save the host values of the debug registers (CIABR, DAWR and DAWRX) before entering the guest and restore them on exit. To provide space to save them in the stack frame, we expand the stack frame allocated by kvmppc_hv_entry() from 112 to 144 bytes. Fixes: `b005255e12` ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:48 +02:00
Paul Mackerras	324df574b9	KVM: PPC: Book3S HV: Restore critical SPRs to host values on guest exit commit `4c3bb4ccd0` upstream. This restores several special-purpose registers (SPRs) to sane values on guest exit that were missed before. TAR and VRSAVE are readable and writable by userspace, and we need to save and restore them to prevent the guest from potentially affecting userspace execution (not that TAR or VRSAVE are used by any known program that uses the KVM_RUN ioctl). We save/restore these in kvmppc_vcpu_run_hv() rather than on every guest entry/exit. FSCR affects userspace execution in that it can prohibit access to certain facilities by userspace. We restore it to the normal value for the task on exit from the KVM_RUN ioctl. IAMR is normally 0, and is restored to 0 on guest exit. However, with a radix host on POWER9, it is set to a value that prevents the kernel from executing user-accessible memory. On POWER9, we save IAMR on guest entry and restore it on guest exit to the saved value rather than 0. On POWER8 we continue to set it to 0 on guest exit. PSPB is normally 0. We restore it to 0 on guest exit to prevent userspace taking advantage of the guest having set it non-zero (which would allow userspace to set its SMT priority to high). UAMOR is normally 0. We restore it to 0 on guest exit to prevent the AMR from being used as a covert channel between userspace processes, since the AMR is not context-switched at present. Fixes: `b005255e12` ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:48 +02:00
Paul Mackerras	4ae3f3581f	KVM: PPC: Book3S HV: Context-switch EBB registers properly commit `ca8efa1df1` upstream. This adds code to save the values of three SPRs (special-purpose registers) used by userspace to control event-based branches (EBBs), which are essentially interrupts that get delivered directly to userspace. These registers are loaded up with guest values when entering the guest, and their values are saved when exiting the guest, but we were not saving the host values and restoring them before going back to userspace. On POWER8 this would only affect userspace programs which explicitly request the use of EBBs and also use the KVM_RUN ioctl, since the only source of EBBs on POWER8 is the PMU, and there is an explicit enable bit in the PMU registers (and those PMU registers do get properly context-switched between host and guest). On POWER9 there is provision for externally-generated EBBs, and these are not subject to the control in the PMU registers. Since these registers only affect userspace, we can save them when we first come in from userspace and restore them before returning to userspace, rather than saving/restoring the host values on every guest entry/exit. Similarly, we don't need to worry about their values on offline secondary threads since they execute in the context of the idle task, which never executes in userspace. Fixes: `b005255e12` ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:48 +02:00
Paul Mackerras	a7d29e276e	KVM: PPC: Book3S HV: Ignore timebase offset on POWER9 DD1 commit `3d3efb68c1` upstream. POWER9 DD1 has an erratum where writing to the TBU40 register, which is used to apply an offset to the timebase, can cause the timebase to lose counts. This results in the timebase on some CPUs getting out of sync with other CPUs, which then results in misbehaviour of the timekeeping code. To work around the problem, we make KVM ignore the timebase offset for all guests on POWER9 DD1 machines. This means that live migration cannot be supported on POWER9 DD1 machines. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:47 +02:00
Paul Mackerras	4d9c6828d0	KVM: PPC: Book3S HV: Preserve userspace HTM state properly commit `46a704f840` upstream. If userspace attempts to call the KVM_RUN ioctl when it has hardware transactional memory (HTM) enabled, the values that it has put in the HTM-related SPRs TFHAR, TFIAR and TEXASR will get overwritten by guest values. To fix this, we detect this condition and save those SPR values in the thread struct, and disable HTM for the task. If userspace goes to access those SPRs or the HTM facility in future, a TM-unavailable interrupt will occur and the handler will reload those SPRs and re-enable HTM. If userspace has started a transaction and suspended it, we would currently lose the transactional state in the guest entry path and would almost certainly get a "TM Bad Thing" interrupt, which would cause the host to crash. To avoid this, we detect this case and return from the KVM_RUN ioctl with an EINVAL error, with the KVM exit reason set to KVM_EXIT_FAIL_ENTRY. Fixes: `b005255e12` ("KVM: PPC: Book3S HV: Context-switch new POWER8 SPRs", 2014-01-08) Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:47 +02:00
Paul Mackerras	1e0d799688	KVM: PPC: Book3S HV: Cope with host using large decrementer mode commit `2f2724630f` upstream. POWER9 introduces a new mode for the decrementer register, called large decrementer mode, in which the decrementer counter is 56 bits wide rather than 32, and reads are sign-extended rather than zero-extended. For the decrementer, this new mode is optional and controlled by a bit in the LPCR. The hypervisor decrementer (HDEC) is 56 bits wide on POWER9 and has no mode control. Since KVM code reads and writes the decrementer and hypervisor decrementer registers in a few places, it needs to be aware of the need to treat the decrementer value as a 64-bit quantity, and only do a 32-bit sign extension when large decrementer mode is not in effect. Similarly, the HDEC should always be treated as a 64-bit quantity on POWER9. We define a new EXTEND_HDEC macro to encapsulate the feature test for POWER9 and the sign extension. To enable the sign extension to be removed in large decrementer mode, we test the LPCR_LD bit in the host LPCR image stored in the struct kvm for the guest. If it is set, then large decrementer mode is enabled and the sign extension should be skipped. This is partly based on an earlier patch by Oliver O'Halloran. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:47 +02:00
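The conditional sign extension described in commit 1e0d799688 above can be sketched as plain C. This is a userspace illustration under stated assumptions, not the kernel's EXTEND_HDEC macro or its assembly: `extend_dec_sketch()` is a hypothetical name, and the `large_dec_mode` flag stands in for the LPCR_LD test mentioned in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Interpret a raw decrementer read as a signed 64-bit quantity.
 *
 * - Large decrementer mode: the commit message says reads are already
 *   sign-extended, so the value is used as-is.
 * - Otherwise: reads are zero-extended 32-bit values, so software has to
 *   sign-extend from bit 31 to see "expired" (negative) counts.
 *
 * The unsigned-to-signed conversions below are implementation-defined in
 * older C standards; they behave as two's-complement wraparound on the
 * mainstream compilers this sketch assumes.
 */
static inline int64_t extend_dec_sketch(uint64_t raw, bool large_dec_mode)
{
    if (large_dec_mode)
        return (int64_t)raw;

    /* A zero-extended 32-bit read must have no upper bits set. */
    assert((raw >> 32) == 0);
    return (int64_t)(int32_t)(uint32_t)raw;
}
```

For instance, a 32-bit read of `0xFFFFFFFF` becomes a small negative count once extended, while the same bit pattern in large decrementer mode stays a large positive value.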
Heiko Carstens	5d34bddddc	KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows commit `addb63c18a` upstream. For real-space designation asces the asce origin part is only a token. The asce token origin must not be used to generate an effective address for storage references. This however is erroneously done within kvm_s390_shadow_tables(). Furthermore within the same function the wrong parts of virtual addresses are used to generate a corresponding real address (e.g. the region second index is used as region first index). Both of the above can result in incorrect address translations. Only for real space designations with a token origin of zero and addresses below one megabyte the translation was correct. Furthermore replace a "!asce.r" statement with a "!*fake" statement to make it more obvious that a specific condition has nothing to do with the architecture, but with the fake handling of real space designations. Fixes: `3218f7094b` ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:47 +02:00
James Cowgill	87ee4cdd8c	KVM: MIPS: Fix maybe-uninitialized build failure commit `e27a9eca5d` upstream. This commit fixes a "maybe-uninitialized" build failure in arch/mips/kvm/tlb.c when KVM, DYNAMIC_DEBUG and JUMP_LABEL are all enabled. The failure is: In file included from ./include/linux/printk.h:329:0, from ./include/linux/kernel.h:13, from ./include/asm-generic/bug.h:15, from ./arch/mips/include/asm/bug.h:41, from ./include/linux/bug.h:4, from ./include/linux/thread_info.h:11, from ./include/asm-generic/current.h:4, from ./arch/mips/include/generated/asm/current.h:1, from ./include/linux/sched.h:11, from arch/mips/kvm/tlb.c:13: arch/mips/kvm/tlb.c: In function ‘kvm_mips_host_tlb_inv’: ./include/linux/dynamic_debug.h:126:3: error: ‘idx_kernel’ may be used uninitialized in this function [-Werror=maybe-uninitialized] __dynamic_pr_debug(&descriptor, pr_fmt(fmt), \ ^~~~~~~~~~~~~~~~~~ arch/mips/kvm/tlb.c:169:16: note: ‘idx_kernel’ was declared here int idx_user, idx_kernel; ^~~~~~~~~~ There is a similar error relating to "idx_user". Both errors were observed with GCC 6. As far as I can tell, it is impossible for either idx_user or idx_kernel to be uninitialized when they are later read in the calls to kvm_debug, but to satisfy the compiler, add zero initializers to both variables. Signed-off-by: James Cowgill <James.Cowgill@imgtec.com> Fixes: `57e3869cfa` ("KVM: MIPS/TLB: Generalise host TLB invalidate to kernel ASID") Acked-by: James Hogan <james.hogan@imgtec.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:46 +02:00
Paolo Bonzini	3af2b32a50	KVM: x86: fix singlestepping over syscall commit `c8401dda2f` upstream. TF is handled a bit differently for syscall and sysret, compared to the other instructions: TF is checked after the instruction completes, so that the OS can disable #DB at a syscall by adding TF to FMASK. When the sysret is executed the #DB is taken "as if" the syscall insn just completed. KVM emulates syscall so that it can trap 32-bit syscall on Intel processors. Fix the behavior, otherwise you could get #DB on a user stack which is not nice. This does not affect Linux guests, as they use an IST or task gate for #DB. This fixes CVE-2017-7518. Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:46 +02:00
Björn Töpel	75d7353890	perf probe: Fix probe definition for inlined functions commit `7598f8bc13` upstream. In commit `613f050d68` ("perf probe: Fix to probe on gcc generated functions in modules"), the offset from symbol is, incorrectly, added to the trace point address. This leads to incorrect probe trace points for inlined functions and when using relative line number on symbols. Prior to this patch: $ perf probe -m nf_nat -D in_range p:probe/in_range nf_nat:in_range.isra.9+0 $ perf probe -m i40e -D i40e_clean_rx_irq p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2212 $ perf probe -m i40e -D i40e_clean_rx_irq:16 p:probe/i40e_clean_rx_irq i40e:i40e_lan_xmit_frame+626 After: $ perf probe -m nf_nat -D in_range p:probe/in_range nf_nat:in_range.isra.9+0 $ perf probe -m i40e -D i40e_clean_rx_irq p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+1106 $ perf probe -m i40e -D i40e_clean_rx_irq:16 p:probe/i40e_clean_rx_irq i40e:i40e_napi_poll+2665 Committer testing: Using 'pfunct', a tool found in the 'dwarves' package [1], one can ask which functions, while not being explicitly marked as inline, were inlined by the compiler: # pfunct --cc_inlined /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko | head __ew32 e1000_regdump e1000e_dump_ps_pages e1000_desc_unused e1000e_systim_to_hwtstamp e1000e_rx_hwtstamp e1000e_update_rdt_wa e1000e_update_tdt_wa e1000_put_txbuf e1000_consume_page Then ask 'perf probe' to produce the kprobe_tracer probe definitions for two of them: # perf probe -m e1000e -D e1000e_rx_hwtstamp p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+74 # perf probe -m e1000e -D e1000_consume_page p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+876 p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+1506 p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074 Now let's concentrate on the 'e1000_consume_page' one, that was inlined twice in e1000_clean_jumbo_rx_irq(), let's see what readelf says about the
DWARF tags for that function: $ readelf -wi /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko <SNIP> <1><13e27b>: Abbrev Number: 121 (DW_TAG_subprogram) <13e27c> DW_AT_name : (indirect string, offset: 0xa8945): e1000_clean_jumbo_rx_irq <13e287> DW_AT_low_pc : 0x17a30 <3><13e6ef>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13e6f0> DW_AT_abstract_origin: <0x13ed2c> <13e6f4> DW_AT_low_pc : 0x17be6 <SNIP> <1><13ed2c>: Abbrev Number: 142 (DW_TAG_subprogram) <13ed2e> DW_AT_name : (indirect string, offset: 0xa54c3): e1000_consume_page So, the first time in e1000_clean_jumbo_rx_irq() where e1000_consume_page() is inlined is at PC 0x17be6, which subtracted from e1000_clean_jumbo_rx_irq()'s address, gives us the offset we should use in the probe definition: 0x17be6 - 0x17a30 = 438 but above we have 876, which is twice as much. Let's see the second inline expansion of e1000_consume_page() in e1000_clean_jumbo_rx_irq(): <3><13e86e>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13e86f> DW_AT_abstract_origin: <0x13ed2c> <13e873> DW_AT_low_pc : 0x17d21 0x17d21 - 0x17a30 = 753 So we were adding it at twice the offset from the containing function as we should.
And then after this patch: # perf probe -m e1000e -D e1000e_rx_hwtstamp p:probe/e1000e_rx_hwtstamp e1000e:e1000_receive_skb+37 # perf probe -m e1000e -D e1000_consume_page p:probe/e1000_consume_page e1000e:e1000_clean_jumbo_rx_irq+438 p:probe/e1000_consume_page_1 e1000e:e1000_clean_jumbo_rx_irq+753 p:probe/e1000_consume_page_2 e1000e:e1000_clean_jumbo_rx_irq+1353 # Which matches the two first expansions and shows that because we were doubling the offset it would spill over the next function: readelf -sw /lib/modules/4.12.0-rc4+/kernel/drivers/net/ethernet/intel/e1000e/e1000e.ko 673: 0000000000017a30 1626 FUNC LOCAL DEFAULT 2 e1000_clean_jumbo_rx_irq 674: 0000000000018090 2013 FUNC LOCAL DEFAULT 2 e1000_clean_rx_irq_ps This is the 3rd inline expansion of e1000_consume_page() in e1000_clean_jumbo_rx_irq(): <3><13ec77>: Abbrev Number: 119 (DW_TAG_inlined_subroutine) <13ec78> DW_AT_abstract_origin: <0x13ed2c> <13ec7c> DW_AT_low_pc : 0x17f79 0x17f79 - 0x17a30 = 1353 So: 0x17a30 + 2 * 1353 = 0x184c2 And: 0x184c2 - 0x18090 = 1074 Which explains the bogus third expansion for e1000_consume_page() to end up at: p:probe/e1000_consume_page_2 e1000e:e1000_clean_rx_irq_ps+1074 All fixed now :-) [1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/ Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `613f050d68` ("perf probe: Fix to probe on gcc generated functions in modules") Link: http://lkml.kernel.org/r/20170621164134.5701-1-bjorn.topel@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:46 +02:00
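The offset arithmetic in the perf probe commit above can be replayed with a few lines of C. This is only a sketch: the helper names `probe_offset` and `doubled_probe_addr` are hypothetical and are not part of perf; the addresses are the DW_AT_low_pc and symbol values quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: offset of an inlined call site from the start of
 * its containing function, as it should appear in the probe definition.
 */
static uint64_t probe_offset(uint64_t func_low_pc, uint64_t inline_low_pc)
{
    /* An inlined instance cannot start before its containing function. */
    assert(inline_low_pc >= func_low_pc);
    return inline_low_pc - func_low_pc;
}

/*
 * Hypothetical helper: address the pre-fix code effectively probed,
 * because the offset was added to the function address twice.
 */
static uint64_t doubled_probe_addr(uint64_t func_low_pc, uint64_t inline_low_pc)
{
    return func_low_pc + 2 * probe_offset(func_low_pc, inline_low_pc);
}
```

With e1000_clean_jumbo_rx_irq at 0x17a30, the three inline sites give offsets 438, 753 and 1353. Doubling the third lands at 0x184c2, which is 1074 bytes into e1000_clean_rx_irq_ps (0x18090), matching the bogus probe shown in the message.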
Kan Liang	d46eda1978	perf/x86/intel: Add 1G DTLB load/store miss support for SKL commit `fb3a5055cd` upstream. The current DTLB load/store miss events (0x608/0x649) only count 4K, 2M and 4M page sizes. The events need to be extended to support any page size (4K/2M/4M/1G). The complete DTLB load/store miss events are: DTLB_LOAD_MISSES.WALK_COMPLETED 0xe08 DTLB_STORE_MISSES.WALK_COMPLETED 0xe49 Signed-off-by: Kan Liang <Kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: eranian@google.com Link: http://lkml.kernel.org/r/20170619142609.11058-1-kan.liang@intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:46 +02:00
Ilya Matveychikov	1a640581c8	lib/cmdline.c: fix get_options() overflow while parsing ranges commit `a91e0f680b` upstream. When using get_options() it's possible to specify a range of numbers, like 1-100500. The problem is that it doesn't track array size while calling internally to get_range() which iterates over the range and fills the memory with numbers. Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.com Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:45 +02:00
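The get_options() commit above describes a range such as 1-100500 being expanded without regard to the size of the destination array. A minimal sketch of the bounded alternative follows; `expand_range_bounded` is a hypothetical name and signature chosen for illustration, not the kernel's get_range().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: expand the inclusive range [first, last] into out[],
 * writing at most `room` entries. Returns the number of entries written.
 */
static size_t expand_range_bounded(int first, int last, int *out, size_t room)
{
    size_t n = 0;

    assert(out != NULL || room == 0);

    /* A wider counter avoids signed overflow when last is INT_MAX. */
    for (long long x = first; x <= last && n < room; x++)
        out[n++] = (int)x;

    return n;
}
```

The key design point is that the remaining capacity travels with the call, so an oversized range is truncated instead of writing past the end of the array.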
Jan Kara	5f83a74414	fs/dax.c: fix inefficiency in dax_writeback_mapping_range() commit `1eb643d02b` upstream. dax_writeback_mapping_range() fails to update iteration index when searching radix tree for entries needing cache flushing. Thus each pagevec worth of entries is searched starting from the start which is inefficient and prone to livelocks. Update index properly. Link: http://lkml.kernel.org/r/20170619124531.21491-1-jack@suse.cz Fixes: `9973c98ecf` ("dax: add support for fsync/sync") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:45 +02:00
NeilBrown	909c25623a	autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL commit `9fa4eb8e49` upstream. If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL ioctl, autofs4_d_automount() will return ERR_PTR(status) with that status to follow_automount(), which will then dereference an invalid pointer. So treat a positive status the same as zero, and map to ENOENT. See comment in systemd src/core/automount.c::automount_send_ready(). Link: http://lkml.kernel.org/r/871sqwczx5.fsf@notabene.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.com> Cc: Ian Kent <raven@themaw.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:45 +02:00
Ravi Bangoria	a2063a2054	powerpc/perf: Fix oops when kthread execs user process commit `bf05fc25f2` upstream. When a kthread calls call_usermodehelper() the steps are: 1. allocate current->mm 2. load_elf_binary() 3. populate current->thread.regs While doing this, interrupts are not disabled. If there is a perf interrupt in the middle of this process (i.e. step 1 has completed but not yet reached to step 3) and if perf tries to read userspace regs, kernel oops with following log: Unable to handle kernel paging request for data at address 0x00000000 Faulting instruction address: 0xc0000000000da0fc ... Call Trace: perf_output_sample_regs+0x6c/0xd0 perf_output_sample+0x4e4/0x830 perf_event_output_forward+0x64/0x90 __perf_event_overflow+0x8c/0x1e0 record_and_restart+0x220/0x5c0 perf_event_interrupt+0x2d8/0x4d0 performance_monitor_exception+0x54/0x70 performance_monitor_common+0x158/0x160 --- interrupt: f01 at avtab_search_node+0x150/0x1a0 LR = avtab_search_node+0x100/0x1a0 ... load_elf_binary+0x6e8/0x15a0 search_binary_handler+0xe8/0x290 do_execveat_common.isra.14+0x5f4/0x840 call_usermodehelper_exec_async+0x170/0x210 ret_from_kernel_thread+0x5c/0x7c Fix it by setting abi to PERF_SAMPLE_REGS_ABI_NONE when userspace pt_regs are not set. Fixes: `ed4a4ef85c` ("powerpc/perf: Add support for sampling interrupt register state") Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:45 +02:00
Kees Cook	fed07e8907	fs/exec.c: account for argv/envp pointers commit `98da7d0885` upstream. When limiting the argv/envp strings during exec to 1/4 of the stack limit, the storage of the pointers to the strings was not included. This means that an exec with huge numbers of tiny strings could eat 1/4 of the stack limit in strings and then additional space would be later used by the pointers to the strings. For example, on 32-bit with an 8MB stack rlimit, an exec with 1677721 single-byte strings would consume less than 2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the strings would consume the remaining additional stack space (1677721 * 4 == 6710884). The result (1677721 + 6710884 == 8388605) would exhaust stack space entirely. Controlling this stack exhaustion could result in pathological behavior in setuid binaries (CVE-2017-1000365). [akpm@linux-foundation.org: additional commenting from Kees] Fixes: `b6a2fea393` ("mm: variable length argument support") Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Qualys Security Advisory <qsa@qualys.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:44 +02:00
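The worked numbers in the fs/exec.c commit above can be checked with a small calculation. The struct and function below are hypothetical illustrations, not kernel code; they only add up string bytes and pointer bytes for a given argument count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical summary of stack bytes consumed by argv/envp at exec time. */
struct exec_stack_use {
    uint64_t strings;   /* bytes taken by the strings themselves */
    uint64_t pointers;  /* bytes taken by the pointers to the strings */
    uint64_t total;     /* strings + pointers */
};

static struct exec_stack_use
compute_exec_stack_use(uint64_t nstrings, uint64_t bytes_per_string,
                       uint64_t ptr_size)
{
    struct exec_stack_use u;

    /* Only 32-bit and 64-bit pointer sizes are meaningful here. */
    assert(ptr_size == 4 || ptr_size == 8);

    u.strings = nstrings * bytes_per_string;
    u.pointers = nstrings * ptr_size;
    u.total = u.strings + u.pointers;
    return u;
}
```

For 1677721 single-byte strings with 4-byte pointers this gives 1677721 bytes of strings (under the 2MB quarter-of-stack limit), 6710884 bytes of pointers, and 8388605 bytes in total, just under the 8MB (8388608 byte) rlimit.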
Takashi Iwai	c8bfdd083e	ALSA: hda - Apply quirks to Broxton-T, too commit `c7ecb9068e` upstream. Broxton-T was a forgotten child and we didn't apply the quirks for Skylake+ properly. Meanwhile, a quirk for reducing the DMA latency seems specific to the early Broxton model, so we leave as is. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:44 +02:00
Megha Dey	14487f9f6e	ALSA: hda - Add Coffelake PCI ID commit `e79b0006c4` upstream. Coffelake is another Intel part, so need to add PCI ID for it. Signed-off-by: Megha Dey <megha.dey@intel.com> Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:44 +02:00
Takashi Iwai	e6dc243c28	ALSA: pcm: Don't treat NULL chmap as a fatal error commit `2deaeaf102` upstream. The standard PCM chmap helper callbacks treat the NULL info->chmap as a fatal error and spews the kernel warning with stack trace when CONFIG_SND_DEBUG is on. This was OK, originally it was supposed to be always static and non-NULL. But, as the recent addition of Intel LPE audio driver shows, the chmap content may vary dynamically, and it can be even NULL when disconnected. The user still sees the kernel warning unnecessarily. For clearing such a confusion, this patch simply removes the snd_BUG_ON() in each place, just returns an error without warning. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:44 +02:00
Takashi Sakamoto	c06718bcbd	ALSA: firewire-lib: Fix stall of process context at packet error commit `4a9bfafc64` upstream. At Linux v3.5, packet processing can be done in process context of ALSA PCM application as well as software IRQ context for OHCI 1394. Below is an example of the callgraph (some calls are omitted). ioctl(2) with e.g. HWSYNC (sound/core/pcm_native.c) ->snd_pcm_common_ioctl1() ->snd_pcm_hwsync() ->snd_pcm_stream_lock_irq (sound/core/pcm_lib.c) ->snd_pcm_update_hw_ptr() ->snd_pcm_update_hw_ptr0() ->struct snd_pcm_ops.pointer() (sound/firewire/*) = Each handler on drivers in ALSA firewire stack (sound/firewire/amdtp-stream.c) ->amdtp_stream_pcm_pointer() (drivers/firewire/core-iso.c) ->fw_iso_context_flush_completions() ->struct fw_card_driver.flush_iso_completion() (drivers/firewire/ohci.c) = flush_iso_completions() ->struct fw_iso_context.callback.sc (sound/firewire/amdtp-stream.c) = in_stream_callback() or out_stream_callback() ->... ->snd_pcm_stream_unlock_irq When a packet queueing error occurs or invalid packets are detected in 'in_stream_callback()' or 'out_stream_callback()', 'snd_pcm_stop_xrun()' is called on local CPU with disabled IRQ. (sound/firewire/amdtp-stream.c) in_stream_callback() or out_stream_callback() ->amdtp_stream_pcm_abort() ->snd_pcm_stop_xrun() ->snd_pcm_stream_lock_irqsave() ->snd_pcm_stop() ->snd_pcm_stream_unlock_irqrestore() The process is stalled on the CPU due to an attempt to acquire the lock recursively. [ 562.630853] INFO: rcu_sched detected stalls on CPUs/tasks: [ 562.630861] 2-...: (1 GPs behind) idle=37d/140000000000000/0 softirq=38323/38323 fqs=7140 [ 562.630862] (detected by 3, t=15002 jiffies, g=21036, c=21035, q=5933) [ 562.630866] Task dump for CPU 2: [ 562.630867] alsa-source-OXF R running task 0 6619 1 0x00000008 [ 562.630870] Call Trace: [ 562.630876] ? vt_console_print+0x79/0x3e0 [ 562.630880] ? msg_print_text+0x9d/0x100 [ 562.630883] ? up+0x32/0x50 [ 562.630885] ? irq_work_queue+0x8d/0xa0 [ 562.630886] ?
console_unlock+0x2b6/0x4b0 [ 562.630888] ? vprintk_emit+0x312/0x4a0 [ 562.630892] ? dev_vprintk_emit+0xbf/0x230 [ 562.630895] ? do_sys_poll+0x37a/0x550 [ 562.630897] ? dev_printk_emit+0x4e/0x70 [ 562.630900] ? __dev_printk+0x3c/0x80 [ 562.630903] ? _raw_spin_lock+0x20/0x30 [ 562.630909] ? snd_pcm_stream_lock+0x31/0x50 [snd_pcm] [ 562.630914] ? _snd_pcm_stream_lock_irqsave+0x2e/0x40 [snd_pcm] [ 562.630918] ? snd_pcm_stop_xrun+0x16/0x70 [snd_pcm] [ 562.630922] ? in_stream_callback+0x3e6/0x450 [snd_firewire_lib] [ 562.630925] ? handle_ir_packet_per_buffer+0x8e/0x1a0 [firewire_ohci] [ 562.630928] ? ohci_flush_iso_completions+0xa3/0x130 [firewire_ohci] [ 562.630932] ? fw_iso_context_flush_completions+0x15/0x20 [firewire_core] [ 562.630935] ? amdtp_stream_pcm_pointer+0x2d/0x40 [snd_firewire_lib] [ 562.630938] ? pcm_capture_pointer+0x19/0x20 [snd_oxfw] [ 562.630943] ? snd_pcm_update_hw_ptr0+0x47/0x3d0 [snd_pcm] [ 562.630945] ? poll_select_copy_remaining+0x150/0x150 [ 562.630947] ? poll_select_copy_remaining+0x150/0x150 [ 562.630952] ? snd_pcm_update_hw_ptr+0x10/0x20 [snd_pcm] [ 562.630956] ? snd_pcm_hwsync+0x45/0xb0 [snd_pcm] [ 562.630960] ? snd_pcm_common_ioctl1+0x1ff/0xc90 [snd_pcm] [ 562.630962] ? futex_wake+0x90/0x170 [ 562.630966] ? snd_pcm_capture_ioctl1+0x136/0x260 [snd_pcm] [ 562.630970] ? snd_pcm_capture_ioctl+0x27/0x40 [snd_pcm] [ 562.630972] ? do_vfs_ioctl+0xa3/0x610 [ 562.630974] ? vfs_read+0x11b/0x130 [ 562.630976] ? SyS_ioctl+0x79/0x90 [ 562.630978] ? entry_SYSCALL_64_fastpath+0x1e/0xad This commit fixes the above bug. This assumes two cases: 1. Any error is detected in software IRQ context of OHCI 1394 context. In this case, PCM substream should be aborted in packet handler. On the other hand, it should not be done in any process context. To distinguish these two contexts, use 'in_interrupt()' macro. 2. Any error is detected in process context of ALSA PCM application.
In this case, PCM substream should not be aborted in packet handler because PCM substream lock is acquired. The task to abort PCM substream should be done in ALSA PCM core. For this purpose, SNDRV_PCM_POS_XRUN is returned at 'struct snd_pcm_ops.pointer()'. Suggested-by: Clemens Ladisch <clemens@ladisch.de> Fixes: e9148dddc3c7 ("ALSA: firewire-lib: flush completed packets when reading PCM position") Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:44 +02:00
Jan Beulich	b919d2dc59	xen-blkback: don't leak stack data via response ring commit `089bc0143f` upstream. Rather than constructing a local structure instance on the stack, fill the fields directly on the shared ring, just like other backends do. Build on the fact that all response structure flavors are actually identical (the old code did make this assumption too). This is XSA-216. Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:43 +02:00
Juergen Gross	7bffd6bb0c	xen/blkback: fix disconnect while I/Os in flight commit `4646441130` upstream. Today disconnecting xen-blkback is broken in case there are still I/Os in flight: xen_blkif_disconnect() will bail out early without releasing all resources in the hope it will be called again when the last request has terminated. This, however, won't happen as xen_blkif_free() won't be called on termination of the last running request: xen_blkif_put() won't decrement the blkif refcnt to 0 as xen_blkif_disconnect() didn't finish before thus some xen_blkif_put() calls in xen_blkif_disconnect() didn't happen. To solve this deadlock xen_blkif_disconnect() and xen_blkif_alloc_rings() shouldn't use xen_blkif_put() and xen_blkif_get() but use some other way to do their accounting of resources. This at once fixes another error in xen_blkif_disconnect(): when it returned early with -EBUSY for another ring than 0 it would call xen_blkif_put() again for already handled rings on a subsequent call. This will lead to inconsistencies in the refcnt handling. Signed-off-by: Juergen Gross <jgross@suse.com> Tested-by: Steven Haigh <netwiz@crc.id.au> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:43 +02:00
Boris Brezillon	61ab4c4a85	clk: sunxi-ng: sun5i: Fix ahb_bist_clk definition commit `370d919271` upstream. AHB BIST gate is actually controlled with bit 7. This bug was detected while trying to use the NAND controller which is using the DMA engine to transfer data to the NAND. Since the ahb_bist_clk gate bit conflicts with the ahb_dma_clk gate bit, the core was disabling the DMA engine clock as part of its 'disable unused clks' procedure, which was causing all DMA transfers to fail after this point. Fixes: `5e73761786` ("clk: sunxi-ng: Add sun5i CCU driver") Reported-by: Angus Ainslie <angus@akkea.ca> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Tested-by: Angus Ainslie <angus@akkea.ca> Reviewed-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Michael Turquette <mturquette@baylibre.com> Link: lkml.kernel.org/r/1495643669-28221-1-git-send-email-boris.brezillon@free-electrons.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:43 +02:00
Yong Deng	744b0f61b4	clk: sunxi-ng: v3s: Fix usb otg device reset bit commit `7ffc781ec4` upstream. V3S's usb otg device reset bit should be 24, not 23. Signed-off-by: Yong Deng <iemdey@gmail.com> Reviewed-By: Icenowy Zheng <icenowy@aosc.io> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:43 +02:00
Chen-Yu Tsai	861b4d9542	clk: sunxi-ng: a31: Correct lcd1-ch1 clock register offset commit `38b8f82386` upstream. The register offset for the lcd1-ch1 clock was incorrectly pointing to the lcd0-ch1 clock. This resulted in the lcd0-ch1 clock being disabled when the clk core disables unused clocks. This then stops the simplefb HDMI output path. Reported-by: Bob Ham <rah@settrans.net> Fixes: `c6e6c96d8f` ("clk: sunxi-ng: Add A31/A31s clocks") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-29 13:02:42 +02:00
Greg Kroah-Hartman	9f3069116e	Linux 4.11.7	2017-06-24 07:06:40 +02:00
Hugh Dickins	f5094f2d1a	mm: fix new crash in unmapped_area_topdown() commit `f4cb767d76` upstream. Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of mmap testing. That's the VM_BUG_ON(gap_end < gap_start) at the end of unmapped_area_topdown(). Linus points out how MAP_FIXED (which does not have to respect our stack guard gap intentions) could result in gap_end below gap_start there. Fix that, and the similar case in its alternative, unmapped_area(). Fixes: `1be7107fbe` ("mm: larger stack guard gap, between vmas") Reported-by: Dave Jones <davej@codemonkey.org.uk> Debugged-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:22 +02:00
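The crash described in the commit above comes from a gap whose end lies below its start. A minimal sketch of a gap test that tolerates that case is shown here; `gap_fits` is a hypothetical helper written for illustration and is not claimed to be the code in mm/mmap.c.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: does the gap [gap_start, gap_end) have room for
 * `length` bytes? Ordering is checked before subtracting, because
 * gap_end - gap_start would wrap around if gap_end < gap_start.
 */
static bool gap_fits(unsigned long gap_start, unsigned long gap_end,
                     unsigned long length)
{
    /* A zero-length request is not meaningful in this sketch. */
    assert(length > 0);
    return gap_end > gap_start && gap_end - gap_start >= length;
}
```

Without the ordering check, an inverted gap such as start 0x3000 and end 0x1000 would yield a huge unsigned difference and be accepted as large enough.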
Helge Deller	89d3c6457e	Allow stack to grow up to address space limit commit `bd726c90b6` upstream. Fix expand_upwards() on architectures with an upward-growing stack (parisc, metag and partly IA-64) to allow the stack to reliably grow exactly up to the address space limit given by TASK_SIZE. Signed-off-by: Helge Deller <deller@gmx.de> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:22 +02:00
Hugh Dickins	27f9070614	mm: larger stack guard gap, between vmas commit `1be7107fbe` upstream. Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs like gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunately. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode.
Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [wt: backport to 4.11: adjust context] Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:22 +02:00
Enric Balletbo i Serra	ad7b76458e	ARM: dts: am335x-sl50: Fix cannot claim requested pins for spi0 commit `db145db99f` upstream. We don't need to bitbang these pins anymore; instead we muxed them as SPI. That change, made in commit 6c69f726, introduced the following error: pinctrl-single 44e10800.pinmux: pin PIN85 already requested \ by 44e10800.pinmux; cannot claim for 48030000.spi pinctrl-single 44e10800.pinmux: pin-85 (48030000.spi) status -22 Fixes: 6c69f726 ("ARM: dts: am335x-sl50: Enable SPI0 interface and Flash Memory") Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:22 +02:00
Enric Balletbo i Serra	25568ceca8	ARM: dts: am335x-sl50: Fix card detect pin for mmc1 commit `56b74ed9c1` upstream. The second version of the hardware moved the card detect pin from gpio0_6 to gpio1_9. As we won't support the first hardware version, fix the pinmux configuration of this pin. Fixes: `8584d4fc` ("ARM: dts: am335x-sl50: Add Toby-Churchill SL50 board support.") Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
David Miller	b581da8c12	crypto: Work around deallocated stack frame reference gcc bug on sparc. commit `d41519a69b` upstream. On sparc, if we have an alloca() like situation, as is the case with SHASH_DESC_ON_STACK(), we can end up referencing deallocated stack memory. The result can be that the value is clobbered if a trap or interrupt arrives at just the right instruction. It only occurs if the function ends up returning a value from that alloca() area and that value can be placed into the return value register using a single instruction. For example, in lib/libcrc32c.c:crc32c() we end up with a return sequence like: return %i7+8 lduw [%o5+16], %o0 ! MEM[(u32 *)__shash_desc.1_10 + 16B], %o5 holds the base of the on-stack area allocated for the shash descriptor. But the return released the stack frame and the register window. So if an interrupt arrives between 'return' and 'lduw', then the value read at %o5+16 can be corrupted. Add a data compiler barrier to work around this problem. This is exactly what the gcc fix will end up doing as well, and it absolutely should not change the code generated for other cpus (unless gcc on them has the same bug :-) With crucial insight from Eric Sandeen. Reported-by: Anatoly Pugachev <matorola@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Paul Burton	be071927ab	MIPS: .its targets depend on vmlinux commit `bcd7c45e0d` upstream. The .its targets require information about the kernel binary, such as its entry point, which is extracted from the vmlinux ELF. We therefore require that the ELF is built before the .its files are generated. Declare this requirement in the Makefile such that make will ensure this is always the case, otherwise in corner cases we can hit issues as the .its is generated with an incorrect (either invalid or stale) entry point. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: `cf2a5e0bb4` ("MIPS: Support generating Flattened Image Trees (.itb)") Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16179/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Paul Burton	e01d01337a	MIPS: Fix bnezc/jialc return address calculation commit `1a73d9310e` upstream. The code handling the pop76 opcode (ie. bnezc & jialc instructions) in __compute_return_epc_for_insn() needs to set the value of $31 in the jialc case, which is encoded with rs = 0. However its check to differentiate bnezc (rs != 0) from jialc (rs = 0) was unfortunately backwards, meaning that if we emulate a bnezc instruction we clobber $31 & if we emulate a jialc instruction it actually behaves like a jic instruction. Fix this by inverting the check of rs to match the way the instructions are actually encoded. Signed-off-by: Paul Burton <paul.burton@imgtec.com> Fixes: `28d6f93d20` ("MIPS: Emulate the new MIPS R6 BNEZC and JIALC instructions") Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/16178/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Michael S. Tsirkin	78f72c1e77	virtio_balloon: disable VIOMMU support commit `e41b135550` upstream. virtio balloon bypasses the DMA API entirely so does not support the VIOMMU right now. It's not clear we need that support, for now let's just make sure we don't pretend to support it. Cc: Wei Wang <wei.w.wang@intel.com> Fixes: `1a93769399` ("virtio: new feature to detect IOMMU device quirk") Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Thomas Gleixner	08ddb8f0e5	alarmtimer: Rate limit periodic intervals commit `ff86bf0c65` upstream. The alarmtimer code has another source of potentially rearming itself too fast. Interval timers with a very small interval have a similar CPU hog effect as the previously fixed overflow issue. The reason is that alarmtimers do not implement the normal protection against this kind of problem which the other posix timers use: timer expires -> queue signal -> deliver signal -> rearm timer This scheme brings the rearming under scheduler control and prevents permanently firing timers which hog the CPU. Bringing this scheme to the alarm timer code is a major overhaul because it lacks all the necessary mechanisms completely. So for a quick fix limit the interval to one jiffy. This is not problematic in practice as alarmtimers are usually backed by an RTC for suspend which have 1 second resolution. It could therefore be argued that the resolution of this clock should be set to 1 second in general, but that's outside the scope of this fix. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Thomas Gleixner	1b00aad2cf	alarmtimer: Prevent overflow of relative timers commit `f4781e76f9` upstream. Andrey reported an alarmtimer-related RCU stall while fuzzing the kernel with syzkaller. The reason for this is an overflow in ktime_add() which brings the resulting time into negative space and causes immediate expiry of the timer. The following rearm with a small interval does not bring the timer back into positive space due to the same issue. This results in a permanently firing alarmtimer which hogs the CPU. Use ktime_add_safe() instead which detects the overflow and clamps the result to KTIME_SEC_MAX. Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kostya Serebryany <kcc@google.com> Cc: syzkaller <syzkaller@googlegroups.com> Cc: John Stultz <john.stultz@linaro.org> Cc: Dmitry Vyukov <dvyukov@google.com> Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
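The clamping idea described in the commit above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel's ktime_add_safe(): `sketch_time_add_clamped()` and `SKETCH_TIME_MAX` are hypothetical names, time is modeled as a signed 64-bit nanosecond count, and the current time is assumed to be non-negative.

```c
#include <stdint.h>

/* Hypothetical upper bound standing in for the kernel's clamp value. */
#define SKETCH_TIME_MAX INT64_MAX

/*
 * Add a relative interval to an absolute time without wrapping into
 * negative space. Assumes now >= 0. If the sum would exceed the
 * maximum, return the maximum instead of overflowing.
 */
static int64_t sketch_time_add_clamped(int64_t now, int64_t delta)
{
    if (delta > 0 && now > SKETCH_TIME_MAX - delta)
        return SKETCH_TIME_MAX;   /* would overflow: clamp */
    return now + delta;           /* safe: no overflow possible */
}
```

With a check of this kind, a timer armed with a huge relative value expires far in the future instead of immediately, so a subsequent rearm cannot keep it firing back to back.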
Heiner Kallweit	40dad0b041	genirq: Release resources in __setup_irq() error path commit `fa07ab72cb` upstream. In case __irq_set_trigger() fails the resources requested via irq_request_resources() are not released. Add the missing release call into the error handling path. Fixes: `c1bacbae81` ("genirq: Provide irq_request/release_resources chip callbacks") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/655538f5-cb20-a892-ff15-fbd2dd1fa4ec@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Andy Lutomirski	cc72dfdecc	sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off() commit `252d2a4117` upstream. idle_task_exit() can be called with IRQs enabled on x86 and therefore should use switch_mm(), not switch_mm_irqs_off(). This doesn't seem to cause any problems right now, but it will confuse my upcoming TLB flush changes. Nonetheless, I think it should be backported because it's trivial. There won't be any meaningful performance impact because idle_task_exit() is only used when offlining a CPU. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `f98db6013c` ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") Link: http://lkml.kernel.org/r/ca3d1a9fa93a0b49f5a8ff729eda3640fb6abdf9.1497034141.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Martin Blumenstingl	947af98310	iio: adc: meson-saradc: fix potential crash in meson_sar_adc_clear_fifo commit `103a07d427` upstream. meson_sar_adc_clear_fifo passes a 0 as value-pointer to regmap_read(). In case of the meson-saradc driver this ends up in regmap_mmio_read(), where the value-pointer is de-referenced unconditionally to assign the value which was read. Fix this by passing an actual pointer, even though all we want to do is to discard the value. As a side-effect this fixes a sparse warning ("Using plain integer as NULL pointer") as reported by Paolo Cretaro. Fixes: `3adbf34273` ("iio: adc: add a driver for the SAR ADC found in Amlogic Meson SoCs") Reported-by: Paolo Cretaro <paolocretaro@gmail.com> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:21 +02:00
Alexey Khoroshilov	7aeda39ef5	staging: iio: ad7152: Fix deadlock in ad7152_write_raw_samp_freq() commit `95264c8c6a` upstream. ad7152_write_raw_samp_freq() is called by ad7152_write_raw() with chip->state_lock held. So, there is an unavoidable deadlock when ad7152_write_raw_samp_freq() locks the mutex itself. The patch removes the unneeded locking. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Fixes: `6572389bcc` ("staging: iio: cdc: ad7152: Implement IIO_CHAN_INFO_SAMP_FREQ attribute") Acked-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Jean-Baptiste Maneyrol	dcf8e82942	iio: imu: inv_mpu6050: add accel lpf setting for chip >= MPU6500 commit `948588e25b` upstream. Starting from MPU6500, accelerometer dlpf is set in a separate register named ACCEL_CONFIG_2. Add this new register in the map and set it for the corresponding chips. Signed-off-by: Jean-Baptiste Maneyrol <jmaneyrol@invensense.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Andrea Arcangeli	8d96cfd1e3	userfaultfd: shmem: handle coredumping in handle_userfault() commit `64c2b20301` upstream. Anon and hugetlbfs handle FOLL_DUMP set by get_dump_page() internally to __get_user_pages(). shmem, by contrast, has no special FOLL_DUMP handling there so handle_mm_fault() is invoked without mmap_sem and ends up calling handle_userfault() that isn't expecting to be invoked without mmap_sem held. This makes handle_userfault() fail immediately if invoked through shmem_vm_ops->fault during coredumping and solves the problem. The side effect is a BUG_ON with no lock held triggered by the coredumping process which exits. Only 4.11 is affected; pre-4.11, anon memory holes are skipped in __get_user_pages by checking FOLL_DUMP explicitly against empty pagetables (mm/gup.c:no_page_table()). It's zero cost as we already had a check for current->flags to prevent futex to trigger userfaults during exit (PF_EXITING). Link: http://lkml.kernel.org/r/20170615214838.27429-1-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Mark Rutland	5daec00b8b	mm: numa: avoid waiting on freed migrated pages commit `3c226c637b` upstream. In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by waiting until the pmd is unlocked before we return and retry. However, we can race with migrate_misplaced_transhuge_page(): // do_huge_pmd_numa_page // migrate_misplaced_transhuge_page() // Holds 0 refs on page // Holds 2 refs on page vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); /* ... */ if (pmd_trans_migrating(*vmf->pmd)) { page = pmd_page(*vmf->pmd); spin_unlock(vmf->ptl); ptl = pmd_lock(mm, pmd); if (page_count(page) != 2) { /* roll back */ } /* ... */ mlock_migrate_page(new_page, page); /* ... */ spin_unlock(ptl); put_page(page); put_page(page); // page freed here wait_on_page_locked(page); goto out; } This can result in the freed page having its waiters flag set unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the page alloc/free functions. This has been observed on arm64 KVM guests. We can avoid this by having do_huge_pmd_numa_page() take a reference on the page before dropping the pmd lock, mirroring what we do in __migration_entry_wait(). When we hit the race, migrate_misplaced_transhuge_page() will see the reference and abort the migration, as it may do today in other cases. Fixes: `b8916634b7` ("mm: Prevent parallel splits during THP migration") Link: http://lkml.kernel.org/r/1497349722-6731-2-git-send-email-will.deacon@arm.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Steve Capper <steve.capper@arm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Yu Zhao	490fdcdadf	swap: cond_resched in swap_cgroup_prepare() commit `ef70762948` upstream. I saw need_resched() warnings when swapping on large swapfile (TBs) because continuously allocating many pages in swap_cgroup_prepare() took too long. We already cond_resched when freeing page in swap_cgroup_swapoff(). Do the same for the page allocation. Link: http://lkml.kernel.org/r/20170604200109.17606-1-yuzhao@google.com Signed-off-by: Yu Zhao <yuzhao@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
James Morse	e163829260	mm/memory-failure.c: use compound_head() flags for huge pages commit `7258ae5c5a` upstream. memory_failure() chooses a recovery action function based on the page flags. For huge pages it uses the tail page flags which don't have anything interesting set, resulting in: > Memory failure: 0x9be3b4: Unknown page state > Memory failure: 0x9be3b4: recovery action for unknown page: Failed Instead, save a copy of the head page's flags if this is a huge page; this means that if there are no relevant flags for this tail page, we use the head page's flags instead. This results in the me_huge_page() recovery action being called: > Memory failure: 0x9b7969: recovery action for huge page: Delayed For hugepages that have not yet been allocated, this allows the hugepage to be dequeued. Fixes: `524fca1e73` ("HWPOISON: fix misjudgement of page_action() for errors on mlocked pages") Link: http://lkml.kernel.org/r/20170524130204.21845-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Tested-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Alan Stern	20360f1af7	USB: gadgetfs, dummy-hcd, net2280: fix locking for callbacks commit `f16443a034` upstream. Using the syzkaller kernel fuzzer, Andrey Konovalov generated the following error in gadgetfs: > BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690 > kernel/locking/lockdep.c:3246 > Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903 > > CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Workqueue: usb_hub_wq hub_event > Call Trace: > __dump_stack lib/dump_stack.c:16 [inline] > dump_stack+0x292/0x395 lib/dump_stack.c:52 > print_address_description+0x78/0x280 mm/kasan/report.c:252 > kasan_report_error mm/kasan/report.c:351 [inline] > kasan_report+0x230/0x340 mm/kasan/report.c:408 > __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429 > __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246 > lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855 > __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] > _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151 > spin_lock include/linux/spinlock.h:299 [inline] > gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682 > set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455 > dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074 > rh_call_control drivers/usb/core/hcd.c:689 [inline] > rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline] > usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650 > usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542 > usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56 > usb_internal_control_msg drivers/usb/core/message.c:100 [inline] > usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151 > usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412 > hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177 > hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648 > hub_port_connect 
drivers/usb/core/hub.c:4826 [inline] > hub_port_connect_change drivers/usb/core/hub.c:4999 [inline] > port_event drivers/usb/core/hub.c:5105 [inline] > hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185 > process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097 > process_scheduled_works kernel/workqueue.c:2157 [inline] > worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233 > kthread+0x363/0x440 kernel/kthread.c:231 > ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424 > > Allocated by task 9958: > save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59 > save_stack+0x43/0xd0 mm/kasan/kasan.c:513 > set_track mm/kasan/kasan.c:525 [inline] > kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617 > kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745 > kmalloc include/linux/slab.h:492 [inline] > kzalloc include/linux/slab.h:665 [inline] > dev_new drivers/usb/gadget/legacy/inode.c:170 [inline] > gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993 > mount_single+0xf6/0x160 fs/super.c:1192 > gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019 > mount_fs+0x9c/0x2d0 fs/super.c:1223 > vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976 > vfs_kern_mount fs/namespace.c:2509 [inline] > do_new_mount fs/namespace.c:2512 [inline] > do_mount+0x41b/0x2d90 fs/namespace.c:2834 > SYSC_mount fs/namespace.c:3050 [inline] > SyS_mount+0xb0/0x120 fs/namespace.c:3027 > entry_SYSCALL_64_fastpath+0x1f/0xbe > > Freed by task 9960: > save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59 > save_stack+0x43/0xd0 mm/kasan/kasan.c:513 > set_track mm/kasan/kasan.c:525 [inline] > kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590 > slab_free_hook mm/slub.c:1357 [inline] > slab_free_freelist_hook mm/slub.c:1379 [inline] > slab_free mm/slub.c:2961 [inline] > kfree+0xed/0x2b0 mm/slub.c:3882 > put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163 > gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027 > deactivate_locked_super+0x8d/0xd0 fs/super.c:309 > 
deactivate_super+0x21e/0x310 fs/super.c:340 > cleanup_mnt+0xb7/0x150 fs/namespace.c:1112 > __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119 > task_work_run+0x1a0/0x280 kernel/task_work.c:116 > exit_task_work include/linux/task_work.h:21 [inline] > do_exit+0x18a8/0x2820 kernel/exit.c:878 > do_group_exit+0x14e/0x420 kernel/exit.c:982 > get_signal+0x784/0x1780 kernel/signal.c:2318 > do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808 > exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157 > prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline] > syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263 > entry_SYSCALL_64_fastpath+0xbc/0xbe > > The buggy address belongs to the object at ffff88003a2bdae0 > which belongs to the cache kmalloc-1024 of size 1024 > The buggy address is located 24 bytes inside of > 1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0) > The buggy address belongs to the page: > page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null) > index:0x0 compound_mapcount: 0 > flags: 0x100000000008100(slab\|head) > raw: 0100000000008100 0000000000000000 0000000000000000 0000000100170017 > raw: ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000 > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc > >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb > ^ > ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ================================================================== What this means is that the gadgetfs_suspend() routine was trying to access dev->lock after it had been deallocated. The root cause is a race in the dummy_hcd driver; the dummy_udc_stop() routine can race with the rest of the driver because it contains no locking. 
And even when proper locking is added, it can still race with the set_link_state() function because that function incorrectly drops the private spinlock before invoking any gadget driver callbacks. The result of this race, as seen above, is that set_link_state() can invoke a callback in gadgetfs even after gadgetfs has been unbound from dummy_hcd's UDC and its private data structures have been deallocated. include/linux/usb/gadget.h documents that the ->reset, ->disconnect, ->suspend, and ->resume callbacks may be invoked in interrupt context. In general this is necessary, to prevent races with gadget driver removal. This patch fixes dummy_hcd to retain the spinlock across these calls, and it adds a spinlock acquisition to dummy_udc_stop() to prevent the race. The net2280 driver makes the same mistake of dropping the private spinlock for its ->disconnect and ->reset callback invocations. The patch fixes it too. Lastly, since gadgetfs_suspend() may be invoked in interrupt context, it cannot assume that interrupts are enabled when it runs. It must use spin_lock_irqsave() instead of spin_lock_irq(). The patch fixes that bug as well. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Alan Stern	6097614a9e	USB: gadget: fix GPF in gadgetfs commit `f50b878fed` upstream. A NULL-pointer dereference bug in gadgetfs was uncovered by syzkaller: > kasan: GPF could be caused by NULL-ptr deref or user memory access > general protection fault: 0000 [#1] SMP KASAN > Dumping ftrace buffer: > (ftrace buffer empty) > Modules linked in: > CPU: 2 PID: 4820 Comm: syz-executor0 Not tainted 4.12.0-rc4+ #5 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > task: ffff880039542dc0 task.stack: ffff88003bdd0000 > RIP: 0010:__list_del_entry_valid+0x7e/0x170 lib/list_debug.c:51 > RSP: 0018:ffff88003bdd6e50 EFLAGS: 00010246 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000010000 > RDX: 0000000000000000 RSI: ffffffff86504948 RDI: ffffffff86504950 > RBP: ffff88003bdd6e68 R08: ffff880039542dc0 R09: ffffffff8778ce00 > R10: ffff88003bdd6e68 R11: dffffc0000000000 R12: 0000000000000000 > R13: dffffc0000000000 R14: 1ffff100077badd2 R15: ffffffff864d2e40 > FS: 0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 000000002014aff9 CR3: 0000000006022000 CR4: 00000000000006e0 > Call Trace: > __list_del_entry include/linux/list.h:116 [inline] > list_del include/linux/list.h:124 [inline] > usb_gadget_unregister_driver+0x166/0x4c0 drivers/usb/gadget/udc/core.c:1387 > dev_release+0x80/0x160 drivers/usb/gadget/legacy/inode.c:1187 > __fput+0x332/0x7f0 fs/file_table.c:209 > ____fput+0x15/0x20 fs/file_table.c:245 > task_work_run+0x19b/0x270 kernel/task_work.c:116 > exit_task_work include/linux/task_work.h:21 [inline] > do_exit+0x18a3/0x2820 kernel/exit.c:878 > do_group_exit+0x149/0x420 kernel/exit.c:982 > get_signal+0x77f/0x1780 kernel/signal.c:2318 > do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808 > exit_to_usermode_loop+0x1a7/0x240 arch/x86/entry/common.c:157 > prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline] > syscall_return_slowpath+0x3ba/0x410 
arch/x86/entry/common.c:263 > entry_SYSCALL_64_fastpath+0xbc/0xbe > RIP: 0033:0x4461f9 > RSP: 002b:00007fdac2b1ecf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca > RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004461f9 > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8 > RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000 > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 > R13: 0000000000000000 R14: 00007fdac2b1f9c0 R15: 00007fdac2b1f700 > Code: 00 00 00 00 ad de 49 39 c4 74 6a 48 b8 00 02 00 00 00 00 ad de > 48 89 da 48 39 c3 74 74 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80> > 3c 02 00 0f 85 92 00 00 00 48 8b 13 48 39 f2 75 66 49 8d 7c > RIP: __list_del_entry_valid+0x7e/0x170 lib/list_debug.c:51 RSP: ffff88003bdd6e50 > ---[ end trace 30e94b1eec4831c8 ]--- > Kernel panic - not syncing: Fatal exception The bug was caused by dev_release() failing to turn off its gadget_registered flag after unregistering the gadget driver. As a result, when a later user closed the device file before writing a valid set of descriptors, dev_release() thought the gadget had been registered and tried to unregister it, even though it had not been. This led to the NULL pointer dereference. The fix is simple: turn off the flag when the gadget is unregistered. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
Corentin Labbe	be8fec3b73	usb: xhci: ASMedia ASM1042A chipset need shorts TX quirk commit `d2f48f05cd` upstream. When plugging in a USB webcam I see the following message: [106385.615559] xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs XHCI_TRUST_TX_LENGTH quirk? [106390.583860] handle_tx_event: 913 callbacks suppressed With this patch applied, the message is no longer printed. Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
YD Tseng	51f8c53431	usb: xhci: Fix USB 3.1 supported protocol parsing commit `b72eb8435b` upstream. xHCI host controllers can have both USB 3.1 and 3.0 extended speed protocol lists. If the USB 3.1 speed is parsed first and 3.0 second then the minor revision supported will be overwritten by the 3.0 speeds and the USB3 roothub will only show support for USB 3.0 speeds. This was the case with an xHCI controller with the supported protocol capability listed below. In xhci-mem.c, the USB 3.1 speed is parsed first, and the min_rev of usb3_rhub is set to 0x10. Then USB 3.0 is parsed, and the min_rev of usb3_rhub is changed to 0x00. If a USB 3.1 device is connected behind this host controller, lsusb reports only 5G speed for it. 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F 00 01 08 00 00 00 00 00 40 00 00 00 00 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20 02 08 10 03 55 53 42 20 01 02 00 00 00 00 00 00 //USB 3.1 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 02 08 00 03 55 53 42 20 03 06 00 00 00 00 00 00 //USB 3.0 50 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 60 02 08 00 02 55 53 42 20 09 0E 19 00 00 00 00 00 //USB 2.0 70 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 This patch fixes the issue by only overwriting the minor revision if it is higher than the existing one. [reword commit message -Mathias] Signed-off-by: YD Tseng <yd_tseng@asmedia.com.tw> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
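The rule stated in the commit above (keep the highest minor revision seen, regardless of parse order) might look roughly like this. It is a simplified sketch, not the code in xhci-mem.c: `struct sketch_rhub` and `sketch_note_protocol()` are hypothetical names, and only the two revision fields from the commit message are modeled.

```c
#include <stdint.h>

/* Hypothetical, reduced model of a root hub's protocol revision. */
struct sketch_rhub {
    uint8_t maj_rev;
    uint8_t min_rev;
};

/*
 * Record one "supported protocol" capability entry. The minor
 * revision is only raised, never lowered, so parsing a USB 3.0
 * entry after a USB 3.1 entry does not discard the 3.1 support.
 */
static void sketch_note_protocol(struct sketch_rhub *rhub,
                                 uint8_t major, uint8_t minor)
{
    rhub->maj_rev = major;
    if (minor > rhub->min_rev)
        rhub->min_rev = minor;
}
```

Feeding in the entries from the capability dump in the order given (minor 0x10 for USB 3.1, then 0x00 for USB 3.0) leaves `min_rev` at 0x10 under this rule.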
Dan Carpenter	fb293ea22d	drivers/misc/c2port/c2port-duramar2150.c: checking for NULL instead of IS_ERR() commit `8128a31eaa` upstream. c2port_device_register() never returns NULL, it uses error pointers. Link: http://lkml.kernel.org/r/20170412083321.GC3250@mwanda Fixes: `65131cd52b` ("c2port: add c2port support for Eurotech Duramar 2150") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Rodolfo Giometti <giometti@linux.it> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:20 +02:00
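Why a NULL check misses an error pointer can be shown with a small userspace model of the error-pointer convention. This is a sketch under assumptions: the `sketch_*` helpers are hypothetical stand-ins modeled on the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idea, and `sketch_register()` is an invented function, not c2port_device_register().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bound: the top 4095 addresses encode error codes. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer value lies in the reserved error range. */
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the negative errno value from an error pointer. */
static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Invented registration function that never returns NULL. */
static void *sketch_register(int fail)
{
    static int device;
    return fail ? sketch_err_ptr(-ENOMEM) : (void *)&device;
}
```

A caller that tests only `if (!p)` treats the failed registration as success, because the encoded error is a non-NULL value; the range check is what detects it.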
Philipp Zabel	08299403f9	coda: restore original firmware locations commit `1e9b71d53d` upstream. Recently, an unfinished patch was merged that added a third entry to the beginning of the array of firmware locations without changing the code to also look at the third element, thus pushing an old firmware location off the list. Fixes: `8af7779f3c` ("[media] coda: add Freescale firmware compatibility location") Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Acked-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Chris Brandt	b26ceaabac	usb: r8a66597-hcd: decrease timeout commit `dd14a3e9b9` upstream. The timeout for BULK packets was 300ms which is a long time if other endpoints or devices are waiting for their turn. Changing it to 50ms greatly increased the overall performance for multi-endpoint devices. Fixes: `5d3043586d` ("usb: r8a66597-hcd: host controller driver for R8A6659") Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Chris Brandt	545ae0a920	usb: r8a66597-hcd: select a different endpoint on timeout commit `1f873d857b` upstream. If multiple endpoints on a single device have pending IN URBs and one endpoint times out due to NAKs (perfectly legal), select a different endpoint URB to try. The existing code only checked to see another device address has pending URBs and ignores other IN endpoints on the current device address. This leads to endpoints never getting serviced if one endpoint is using NAK as a flow control method. Fixes: `5d3043586d` ("usb: r8a66597-hcd: host controller driver for R8A6659") Signed-off-by: Chris Brandt <chris.brandt@renesas.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Johan Hovold	6fa08ee0c3	USB: gadget: dummy_hcd: fix hub-descriptor removable fields commit `d81182ce30` upstream. Flag the first and only port as removable while also leaving the remaining bits (including the reserved bit zero) unset in accordance with the specifications: "Within a byte, if no port exists for a given location, the bit field representing the port characteristics shall be 0." Also add a comment marking the legacy PortPwrCtrlMask field. Fixes: `1cd8fd2887` ("usb: gadget: dummy_hcd: add SuperSpeed support") Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: Tatyana Brokhman <tlinder@codeaurora.org> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Arnd Bergmann	7cd9e91c87	pvrusb2: reduce stack usage pvr2_eeprom_analyze() commit `6830733d53` upstream. The driver uses a relatively large data structure on the stack, which showed up on my radar as we get a warning with the "latent entropy" GCC plugin: drivers/media/usb/pvrusb2/pvrusb2-eeprom.c:153:1: error: the frame size of 1376 bytes is larger than 1152 bytes [-Werror=frame-larger-than=] The warning is usually hidden as we raise the warning limit to 2048 when the plugin is enabled, but I'd like to lower that again in the future, and making this function smaller helps to do that without build regressions. Further analysis shows that putting an 'i2c_client' structure on the stack is not really supported, as the embedded 'struct device' is not initialized here, and we are only saved by the fact that the function that is called here does not use the pointer at all. Fixes: `d855497edb` ("V4L/DVB (4228a): pvrusb2 to kernel 2.6.18") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Roger Quadros	5f852eaf6c	usb: dwc3: gadget: Fix ISO transfer performance commit `f1d6826cae` upstream. Commit `08a36b5438` ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()") caused a small change in the way ISO transfer is handled in the case when XferInProgress event happens on Isoc EP with an active transfer. This caused a performance degradation of 50%. e.g. using g_webcam on DUT and luvcview on host the video frame rate dropped from 16fps to 8fps @high-speed. Make the ISO transfer handling equivalent to that prior to that commit to get back the original ISO performance numbers. Fixes: `08a36b5438` ("usb: dwc3: gadget: simplify __dwc3_gadget_ep_queue()") Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Johan Hovold	20a204f4f2	USB: usbip: fix nonconforming hub descriptor commit `ec963b412a` upstream. Fix up the root-hub descriptor to accommodate the variable-length DeviceRemovable and PortPwrCtrlMask fields, while marking all ports as removable (and leaving the reserved bit zero unset). Also add a build-time constraint on VHCI_HC_PORTS which must never be greater than USB_MAXCHILDREN (but this was only enforced through a KConfig constant). This specifically fixes the descriptor layout whenever VHCI_HC_PORTS is greater than seven (default is 8). Fixes: `04679b3489` ("Staging: USB/IP: add client driver") Cc: Takahiro Hirofuchi <hirofuchi@users.sourceforge.net> Cc: Valentina Manea <valentina.manea.m@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
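The remark about more than seven ports follows from how the variable-length fields are sized: bit zero of the DeviceRemovable bitmap is reserved, so eight ports need nine bits and therefore two bytes. A small sketch of that arithmetic, assuming the USB 2.0 hub descriptor layout with a 7-byte fixed part; the names `SKETCH_HUB_NONVAR_SIZE`, `sketch_hub_bitmap_width()` and `sketch_hub_desc_length()` are hypothetical.

```c
/* Assumed size in bytes of the fixed part of a USB 2.0 hub descriptor. */
#define SKETCH_HUB_NONVAR_SIZE 7u

/*
 * Bytes needed for one per-port bitmap. Bit 0 is reserved, so
 * ports 1..n occupy bits 1..n, which needs n / 8 + 1 bytes.
 */
static unsigned int sketch_hub_bitmap_width(unsigned int nports)
{
    return nports / 8 + 1;
}

/* Total length: fixed part plus DeviceRemovable plus PortPwrCtrlMask. */
static unsigned int sketch_hub_desc_length(unsigned int nports)
{
    return SKETCH_HUB_NONVAR_SIZE + 2 * sketch_hub_bitmap_width(nports);
}
```

Under these assumptions a 7-port hub fits each bitmap in one byte (total length 9), while the default of 8 ports needs two bytes per bitmap (total length 11).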
Anton Bondarenko	85b33106cd	usb: core: fix potential memory leak in error path during hcd creation commit `1a744d2eb7` upstream. Free memory allocated for address0_mutex if allocation of bandwidth_mutex failed. Fixes: `feb26ac31a` ("usb: core: hub: hub_port_init lock controller instead of bus") Signed-off-by: Anton Bondarenko <anton.bondarenko.sama@gmail.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Johan Hovold	3c07a67d6c	USB: hub: fix SS max number of ports commit `93491ced3c` upstream. Add define for the maximum number of ports on a SuperSpeed hub as per USB 3.1 spec Table 10-5, and use it when verifying the retrieved hub descriptor. This specifically avoids benign attempts to update the DeviceRemovable mask for non-existing ports (should we get that far). Fixes: `dbe79bbe9d` ("USB 3.0 Hub Changes") Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Yoshihiro Shimoda	3610490d4a	usb: gadget: udc: renesas_usb3: lock for PN_ registers access commit `940f538a10` upstream. This controller does not allow the PIPE to be changed until reading/writing a packet finishes. However, the previous code did not hold the lock in some functions. So, this patch fixes it. Fixes: `746bfe63bb` ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Yoshihiro Shimoda	fc8a41b78a	usb: gadget: udc: renesas_usb3: fix deadlock by spinlock commit `067d6fdc55` upstream. This patch fixes an issue where this driver can deadlock by taking the spinlock twice in renesas_usb3_stop_controller(). So, this patch removes the spinlock API calls in renesas_usb3_stop(). (In other words, the previous code had a redundant lock.) Fixes: `746bfe63bb` ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:19 +02:00
Yoshihiro Shimoda	0cd80a1595	usb: gadget: udc: renesas_usb3: fix pm_runtime functions calling commit `cdc876877e` upstream. This patch fixes an issue that this driver is possible to access the registers before pm_runtime_get_sync() if a gadget driver is installed first. After that, oops happens on R-Car Gen3 environment. To avoid it, this patch changes the pm_runtime call timing from probe/remove to udc_start/udc_stop. Fixes: `746bfe63bb` ("usb: gadget: renesas_usb3: add support for Renesas USB3.0 peripheral controller") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Johan Hovold	73d6c0f9d3	ALSA: usb-audio: fix Amanero Combo384 quirk on big-endian hosts commit `f83914fdfc` upstream. Add missing endianness conversion when using the USB device-descriptor bcdDevice field when applying the Amanero Combo384 (endianness!) quirk. Fixes: `3eff682d76` ("ALSA: usb-audio: Support both DSD LE/BE Amanero firmware versions") Cc: Jussi Laako <jussi@sonarnerd.net> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Subhransu S. Prusty	46efbffcf9	ALSA: hda: Add Geminilake id to SKL_PLUS commit `12ee4022f6` upstream. Geminilake is a Skylake family platform, so add its ID to the skl_plus check. Fixes: `126cfa2f5e` ("ALSA: hda: Add Geminilake HDMI codec ID") Signed-off-by: Subhransu S. Prusty <subhransu.s.prusty@intel.com> Cc: Senthilnathan Veppur <senthilnathanx.veppur@intel.com> Cc: Vinod Koul <vinod.koul@intel.com> Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Dan Carpenter	e5ea4d540d	iio: adc: ti_am335x_adc: allocating too much in probe commit `5ba5b437ef` upstream. We should be allocating enough information for a tiadc_device struct which is about 400 bytes but instead we allocate enough for a second iio_dev struct which is over 2000 bytes. Fixes: `fea89e2dfc` ("iio: adc: ti_am335x_adc: use variable names for sizeof() operator") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Matt Ranostay	2fc7dbb7b8	iio: proximity: as3935: recalibrate RCO after resume commit `6272c0de13` upstream. According to the datasheet the RCO must be recalibrated on every power-on-reset. Also remove mutex locking in the calibration function since callers other than the probe function (which doesn't need it) will have a lock. Fixes: `24ddb0e4bb` ("iio: Add AS3935 lightning sensor support") Cc: George McCollister <george.mccollister@gmail.com> Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Lorenzo Bianconi	387ea2eac9	iio: imu: st_lsm6dsx: do not apply ODR configuration in write_raw handler commit `2ccc15036d` upstream. This patch avoids a transient condition that occurs when a given sensor has already been enabled (e.g. gyroscope) and the user is configuring the sample frequency of the other one (e.g. accelerometer). The transient lasts until the accelerometer is enabled. During that time slice the gyroscope ODR is incorrectly modified as well. At the end of the transient both sensors work at the right frequency. Fix it by introducing the st_lsm6dsx_check_odr() routine to check ODR consistency in the write_raw handler, so that the frequency configuration is applied only in st_lsm6dsx_set_odr(). Fixes: `290a6ce11d` (iio: imu: add support to lsm6dsx driver) Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Eva Rachel Retuya	51119b7a9a	staging: iio: tsl2x7x_core: Fix standard deviation calculation commit `cf6c77323a` upstream. Standard deviation is calculated as the square root of the variance where variance is the mean of sample_sum and length. Correct the computation of statP->stddev in accordance to the proper calculation. Fixes: `3c97c08b57` ("staging: iio: add TAOS tsl2x7x driver") Reported-by: Abhiram Balasubramanian <abhiram@cs.utah.edu> Signed-off-by: Eva Rachel Retuya <eraretuya@gmail.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Dan Carpenter	0b125e0a60	staging: bcm2835-camera: fix error handling in init commit `8e17858a88` upstream. The unwinding here isn't right. We don't free gdev[0] and instead free 1 step past what was allocated. Also, if we can't allocate "dev" then we should unwind instead of returning directly. Fixes: `7b3ad5abf0` ("staging: Import the BCM2835 MMAL-based V4L2 camera driver.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: walter harms <wharms@bfs.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Dan Carpenter	8147e5d1b4	staging: rtl8188eu: prevent an underflow in rtw_check_beacon_data() commit `784047eb2d` upstream. The "len" could be as low as -14 so we should check for negatives. Fixes: `9a7fe54ddc` ("staging: r8188eu: Add source files for new driver - part 1") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Oliver O'Halloran	0393c1c8a6	powerpc/mm: Add physical address to Linux page table dump commit `aaa2295292` upstream. The current page table dumper scans the Linux page tables and coalesces mappings with adjacent virtual addresses and similar PTE flags. This behaviour is somewhat broken when you consider the IOREMAP space where entirely unrelated mappings will appear to be virtually contiguous. This patch modifies the range coalescing so that only ranges that are both physically and virtually contiguous are combined. This patch also adds to the dump output the physical address at the start of each range. Fixes: `8eb07b1870` ("powerpc/mm: Dump linux pagetables") Signed-off-by: Oliver O'Halloran <oohall@gmail.com> [mpe: Print the physical address with 0x like the other addresses] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:18 +02:00
Linus Walleij	35434c87b1	mtd: physmap_of: really fix the physmap add-ons commit `8c925b2635` upstream. The current way of building the of_physmap add-ons result in just the add-on being in the object code, and not the actual core implementation and regress the Gemini and Versatile. Bake the physmap_of.o object by baking physmap_of_core.o and adding the Versatile and/or Gemini add-ons to the final object. Rename the source file physmap_of_core.c to get the desired build components. Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Fixes: `4f04f68e15` ("mtd: physmap_of: fixup gemini/versatile dependencies") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Yoshihiro Shimoda	50bc4e5e6b	phy: rcar-gen3-usb2: fix implementation for runtime PM commit `441a681b88` upstream. This patch fixes an issue that this driver doesn't take care of the runtime PM. This code assumed that devm_phy_create() called pm_runtime_enable(dev), but that misreads the specification of devm_phy_create(). This driver should call its own pm_runtime_enable() before devm_phy_create(). Fixes: `f3b5a8d9b5` ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Tony Lindgren	902ef33f27	mfd: cpcap: Fix bad use of IRQ sense register commit `be269180c9` upstream. The cpcap INTS registers are for getting the value of the line, not for configuring the type. Fixes: `56e1d40d3b` ("mfd: cpcap: Add minimal support") Reviewed-By: Sebastian Reichel <sre@kernel.org> Tested-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Tony Lindgren	d9eb87dbb2	mfd: cpcap: Use ack_invert interrupts commit `5a88d41200` upstream. We should use ack_invert as the int_read_and_clear() in the Motorola kernel tree does "ireg_val & ~mreg_val" before writing to the mask register. Fixes: `56e1d40d3b` ("mfd: cpcap: Add minimal support") Tested-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Tony Lindgren	bce0fb9071	mfd: cpcap: Fix interrupt to use level interrupt commit `ac89473213` upstream. I made a mistake assuming the device tree configuration for interrupt triggering was somehow passed to the SPI device but it's not. In the Motorola Linux kernel tree CPCAP PMIC is configured as a rising edge triggered interrupt, but then its interrupt handler keeps looping until the GPIO line goes down. So the CPCAP interrupt is clearly a level interrupt and not an edge interrupt. Earlier when I tried to configure it as level interrupt using the device tree, I did not account that the triggering only gets passed to the SPI core and it also needs to be specified in the CPCAP driver when we do devm_regmap_add_irq_chip(). Fixes: `56e1d40d3b` ("mfd: cpcap: Add minimal support") Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Rask Ingemann Lambertsen	968b2a6e6a	dt-bindings: mfd: axp20x: Add "xpowers,master-mode" property for AXP806 PMICs commit `8461cf20d1` upstream. commit b101829a029a ("mfd: axp20x: Fix AXP806 access errors on cold boot") was intended to fix the case where a board uses an AXP806 in slave mode, but the boot loader leaves it in master mode for lack of AXP806 support. But now the driver breaks on boards where the PMIC is operating in master mode. To let the device tree describe which mode of operation is needed, this patch introduces a new property "xpowers,master-mode". Fixes: `204ae2963e` ("mfd: axp20x: Add bindings for AXP806 PMIC") Signed-off-by: Rask Ingemann Lambertsen <rask@formelder.dk> Acked-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Rask Ingemann Lambertsen	e5cc5a8a6f	mfd: axp20x: Add support for dts property "xpowers,master-mode" commit `c0369698e6` upstream. commit b101829a029a ("mfd: axp20x: Fix AXP806 access errors on cold boot") was intended to fix the case where a board uses an AXP806 in slave mode, but the boot loader leaves it in master mode for lack of AXP806 support. But now the driver breaks on boards where the PMIC is operating in master mode. This patch lets the driver use the new device tree property "xpowers,master-mode" to set the correct operating mode for the board. Fixes: `8824ee8573` ("mfd: axp20x: Add support for AXP806 PMIC") Signed-off-by: Rask Ingemann Lambertsen <rask@formelder.dk> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Tony Lindgren	78f2256a34	mfd: omap-usb-tll: Fix inverted bit use for USB TLL mode commit `8b8a84c54a` upstream. Commit `16fa3dc75c` ("mfd: omap-usb-tll: HOST TLL platform driver") added support for USB TLL, but uses OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF bit the wrong way. The comments in the code are correct, but the inverted use of OMAP_TLL_CHANNEL_CONF_ULPINOBITSTUFF causes the register to be enabled instead of disabled unlike what the comments say. Without this change the Wrigley 3G LTE modem on droid 4 EHCI bus can only be pinged a few times before it stops responding. Fixes: `16fa3dc75c` ("mfd: omap-usb-tll: HOST TLL platform driver") Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Laura Abbott	e41d895954	x86/mm/32: Set the '__vmalloc_start_set' flag in initmem_init() commit `861ce4a324` upstream. '__vmalloc_start_set' currently only gets set in initmem_init() when !CONFIG_NEED_MULTIPLE_NODES. This breaks detection of vmalloc address with virt_addr_valid() with CONFIG_NEED_MULTIPLE_NODES=y, causing a kernel crash: [mm/usercopy] `517e1fbeb6`: kernel BUG at arch/x86/mm/physaddr.c:78! Set '__vmalloc_start_set' appropriately for that case as well. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: `dc16ecf7fd` ("x86-32: use specific __vmalloc_start_set flag in __virt_addr_valid") Link: http://lkml.kernel.org/r/1494278596-30373-1-git-send-email-labbott@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:17 +02:00
Geert Uytterhoeven	664106a33d	serial: sh-sci: Fix late enablement of AUTORTS commit `5f76895e4c` upstream. When changing hardware control flow for a UART with dedicated RTS/CTS pins, the new AUTORTS state is not immediately reflected in the hardware, but only when RTS is raised. However, the serial core does not call .set_mctrl() after .set_termios(), hence AUTORTS may only become effective when the port is closed, and reopened later. Note that this problem does not happen when manually using stty to change CRTSCTS, as AUTORTS will work fine on next open. To fix this, call .set_mctrl() from .set_termios() when dedicated RTS/CTS pins are present, to refresh the AUTORTS or RTS state. This is similar to what other drivers supporting AUTORTS do (e.g. omap-serial). Reported-by: Baumann, Christoph (C.) <cbaumann@visteon.com> Fixes: `33f50ffc25` ("serial: sh-sci: Fix support for hardware-assisted RTS/CTS") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Geert Uytterhoeven	263b39a07e	serial: sh-sci: Fix (AUTO)RTS in sci_init_pins() commit `cfa6eb2391` upstream. If a UART has dedicated RTS/CTS pins, and hardware control flow is disabled (or AUTORTS is not yet effective), changing any serial port configuration deasserts RTS, as .set_termios() calls sci_init_pins(). To fix this, consider the current (AUTO)RTS state when (re)initializing the pins. Note that for SCIFA/SCIFB, AUTORTS needs explicit configuration of the RTS# pin function, while (H)SCIF handles this automatically. Fixes: `d2b9775d79` ("serial: sh-sci: Correct pin initialization on (H)SCIF") Fixes: `e9d7a45a03` ("serial: sh-sci: Add pin initialization for SCIFA/SCIFB") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Jan Kiszka	a7b0bc2cb0	serial: 8250_lpss: Unconditionally set PCI master for Quark commit `7cd3e9dbdd` upstream. MSI needs it as well. Should have no practical impact, though, as DMA is always available on the Quark. But given the few users of pci_alloc_irq_vectors so far, this incorrect pattern may spread otherwise. Fixes: `3f3a46951e` ("serial: 8250_lpss: set PCI master only for private DMA") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Christophe JAILLET	01c434c6a8	serial: efm32: Fix parity management in 'efm32_uart_console_get_options()' commit `be40597a1b` upstream. UARTn_FRAME_PARITY_ODD is 0x0300 UARTn_FRAME_PARITY_EVEN is 0x0200 So if the UART is configured for EVEN parity, it would be reported as ODD. Fix it by correctly testing if the 2 bits are set. Fixes: `3afbd89c96` ("serial/efm32: add new driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Eric Anholt	a641ff2fbb	drm/vc4: Fix OOPSes from trying to cache a partially constructed BO. commit `ca39b449f6` upstream. If a CMA allocation failed, the partially constructed BO would be unreferenced through the normal path, and we might choose to put it in the BO cache. If we then reused it before it expired from the cache, the kernel would OOPS. Signed-off-by: Eric Anholt <eric@anholt.net> Fixes: `c826a6e106` ("drm/vc4: Add a BO cache.") Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170301185602.6873-2-eric@anholt.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
YYS	1e1aad3880	drm/mediatek: fix mtk_hdmi_setup_vendor_specific_infoframe mistake commit `014580ffab` upstream. mtk_hdmi_setup_vendor_specific_infoframe will return before handling mtk_hdmi_hw_send_info_frame, because hdmi_vendor_infoframe_pack returns the number of bytes packed into the binary buffer or a negative error code on failure. So correct it. Fixes: `8f83f26891` ("drm/mediatek: Add HDMI support") Signed-off-by: Nickey Yang <nickey.yang@rock-chips.com> Signed-off-by: CK Hu <ck.hu@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Emmanuel Grumbach	46f5277dec	mac80211: don't send SMPS action frame in AP mode when not needed commit `b3dd827965` upstream. mac80211 allows to modify the SMPS state of an AP both, when it is started, and after it has been started. Such a change will trigger an action frame to all the peers that are currently connected, and will be remembered so that new peers will get notified as soon as they connect (since the SMPS setting in the beacon may not be the right one). This means that we need to remember the SMPS state currently requested as well as the SMPS state that was configured initially (and advertised in the beacon). The former is bss->req_smps and the latter is sdata->smps_mode. Initially, the AP interface could only be started with SMPS_OFF, which means that sdata->smps_mode was SMPS_OFF always. Later, a nl80211 API was added to be able to start an AP with a different AP mode. That code forgot to update bss->req_smps and because of that, if the AP interface was started with SMPS_DYNAMIC, we had: sdata->smps_mode = SMPS_DYNAMIC bss->req_smps = SMPS_OFF That configuration made mac80211 think it needs to fire off an action frame to any new station connecting to the AP in order to let it know that the actual SMPS configuration is SMPS_OFF. Fix that by properly setting bss->req_smps in ieee80211_start_ap. Fixes: `f699317487` ("mac80211: set smps_mode according to ap params") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Johannes Berg	1e11f87562	mac80211: fix dropped counter in multiqueue RX commit `e165bc02a0` upstream. In the commit enabling per-CPU station statistics, I inadvertently copy-pasted some code to update rx_packets and forgot to change it to update rx_dropped_misc. Fix that. This addresses https://bugzilla.kernel.org/show_bug.cgi?id=195953. Fixes: `c9c5962b56` ("mac80211: enable collecting station statistics per-CPU") Reported-by: Petru-Florin Mihancea <petrum@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:16 +02:00
Rajkumar Manoharan	4d3f95fc26	mac80211: strictly check mesh address extension mode commit `5667c86acf` upstream. Mesh forwarding path checks for address extension mode to fetch appropriate proxied address and MPP address. Existing condition that looks for 6 address format is not strict enough so that frames with improper values are processed and invalid entries are added into MPP table. Fix that by adding a stricter check before processing the packet. Per IEEE Std 802.11s-2011 spec. Table 7-6g1 lists address extension mode 0x3 as reserved one. And also Table 9-13 does not specify 0x3 as valid address field. Fixes: `9b395bc3be` ("mac80211: verify that skb data is present") Signed-off-by: Rajkumar Manoharan <rmanohar@qti.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Johannes Berg	3fe1602b11	mac80211: fix IBSS presp allocation size commit `f1f3e9e2a5` upstream. When VHT IBSS support was added, the size of the extra elements wasn't considered in ieee80211_ibss_build_presp(), which makes it possible that it would overrun the allocated buffer. Fix it by allocating the necessary space. Fixes: `abcff6ef01` ("mac80211: add VHT support for IBSS") Reported-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Joonas Lahtinen	1d9fc42fc3	drm/i915: Do not sync RCU during shrinking commit `4681ee21d6` upstream. Due to the complex dependencies between workqueues and RCU, which are not easily detected by lockdep, do not synchronize RCU during shrinking. On low-on-memory systems (mem=1G for example), the RCU sync leads to all system workqueues freezing and unrelated lockdep splats are displayed according to reports. GIT bisecting done by J. R. Okajima points to the commit where RCU syncing was extended. RCU sync gains us very little benefit in real life scenarios where the amount of memory used by object backing storage is dominant over the metadata under RCU, so drop it altogether. " Yeeeaah, if core could just, go ahead and reclaim RCU queues, that'd be great. " - Chris Wilson, 2016 (`0eafec6d32`) v2: More information to commit message. v3: Remove "grep _rcu_" escapee from i915_gem_shrink_all (Andrea) Fixes: `c053b5a506` ("drm/i915: Don't call synchronize_rcu_expedited under struct_mutex") Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-by: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Hugh Dickins <hughd@google.com> Tested-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit `73cc0b9aa9`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1495097379-573-1-git-send-email-joonas.lahtinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Ville Syrjälä	d4ed6e67f8	drm/i915: Fix scaling check for 90/270 degree plane rotation commit `9a775e0308` upstream. Starting from commit `b63a16f6cd` ("drm/i915: Compute display surface offset in the plane check hook for SKL+") we've already rotated the src coordinates by 270 degrees by the time we check if a scaler is needed or not, so we must not account for the rotation a second time. Previously we did these steps in the opposite order and hence the scaler check had to deal with rotation itself. The double rotation handling causes us to enable a scaler pretty much every time 90/270 degree plane rotation is requested, leading to fuzzier fonts and whatnot. v2: s/unsigned/unsigned int/ to appease checkpatch v3: s/DRM_ROTATE_0/DRM_MODE_ROTATE_0/ Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `b63a16f6cd` ("drm/i915: Compute display surface offset in the plane check hook for SKL+") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170331180056.14086-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit `d96a7d2adb`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170608144002.1605-1-ville.syrjala@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Zhenyu Wang	e146d9ec25	drm/i915: Fix GVT-g PVINFO version compatibility check commit `c380f68124` upstream. Currently, the GVT-g i915 guest strictly checks that the PVINFO version matches 1.0, which doesn't help compatibility at all and prevents the GVT-g host from easily extending PVINFO with a version bump for a real compatibility check. This fixes that by checking the minimal required PVINFO version instead. v2: - drop unneeded version macro - use only major version for sanity check v3: - fix up PVInfo value with kernel type - one indent fix Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chuanxiao Dong <chuanxiao.dong@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170609074805.5101-1-zhenyuw@linux.intel.com (cherry picked from commit `0c8792d00d`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Mario Kleiner	231a83af9b	drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions. commit `bea1041393` upstream. Commit `d63c277dc6` ("drm/amdgpu: Make display watermark calculations more accurate") made watermark calculations more accurate, but not for > 4k resolutions on 32-Bit architectures, as it introduced an integer overflow for those setups and resolutions. Fix this by proper u64 casting and division. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Fixes: `d63c277dc6` ("drm/amdgpu: Make display watermark calculations more accurate") Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Fabio Estevam	a3599c07f4	drm: mxsfb_crtc: Reset the eLCDIF controller commit `0f933328f0` upstream. According to the eLCDIF initialization steps listed in the MX6SX Reference Manual the eLCDIF block reset is mandatory. Without performing the eLCDIF reset the display shows garbage content when the kernel boots. In earlier tests this issue has not been observed because the bootloader was previously showing a splash screen and the bootloader display driver does properly implement the eLCDIF reset. Add the eLCDIF reset to the driver, so that it can operate correctly independently of the bootloader. Tested on a imx6sx-sdb board. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/1494007301-14535-1-git-send-email-fabio.estevam@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Jason A. Donenfeld	3891a5fc65	mac80211/wpa: use constant time memory comparison for MACs commit `98c67d187d` upstream. Otherwise, we enable all sorts of forgeries via timing attack. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: linux-wireless@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Emmanuel Grumbach	6a70d3bef9	mac80211: don't look at the PM bit of BAR frames commit `769dc04db3` upstream. When a peer sends a BAR frame with PM bit clear, we should not modify its PM state as mandated by the spec in 802.11-2012 10.2.1.2. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:15 +02:00
Paul Moore	fcf0d8904a	selinux: fix double free in selinux_parse_opts_str() commit `023f108dcc` upstream. This patch is based on a discussion generated by an earlier patch from Tetsuo Handa: * https://marc.info/?t=149035659300001&r=1&w=2 The double free problem involves the mnt_opts field of the security_mnt_opts struct, selinux_parse_opts_str() frees the memory on error, but doesn't set the field to NULL so if the caller later attempts to call security_free_mnt_opts() we trigger the problem. In order to play it safe we change selinux_parse_opts_str() to call security_free_mnt_opts() on error instead of free'ing the memory directly. This should ensure that everything is handled correctly, regardless of what the caller may do. Fixes: `e000752989` ("LSM/SELinux: Interfaces to allow FS to control mount options") Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
Hans Verkuil	ae505d7113	cec: race fix: don't return -ENONET in cec_receive() commit `b94aac64a4` upstream. When calling CEC_RECEIVE do not check if the adapter is configured. Typically CEC_RECEIVE is called after a select() and if that indicates that there are messages in the receive queue, then you should always be able to dequeue a message. The race condition here is that a message has been received and is queued, so select() tells userspace that a message is available. But before the application calls CEC_RECEIVE the adapter is unconfigured (e.g. the HDMI cable is removed). Now select will always report that there is a message, but calling CEC_RECEIVE will always return -ENONET because the adapter is no longer configured and so will never actually dequeue the message. There is really no need for this check, and in fact the ENONET error code was never documented for CEC_RECEIVE. This may have been a left-over of old code that was never updated. Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
Christophe JAILLET	42e3d6f587	vb2: Fix an off by one error in 'vb2_plane_vaddr' commit `5ebb6dd36c` upstream. We should ensure that 'plane_no' is '< vb->num_planes' as done in 'vb2_plane_cookie' just a few lines below. Fixes: `e23ccc0ad9` ("[media] v4l: add videobuf2 Video for Linux 2 driver framework") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Sakari Ailus <sakari.ailus@linux.intel.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
Tomasz Wilczyński	72d0ebe138	cpufreq: conservative: Allow down_threshold to take values from 1 to 10 commit `b8e11f7d27` upstream. Commit `27ed3cd2eb` (cpufreq: conservative: Fix the logic in frequency decrease checking) removed the 10 point subtraction when comparing the load against down_threshold but did not remove the related limit for the down_threshold value. As a result, down_threshold lower than 11 is not allowed even though values from 1 to 10 do work correctly too. The comment ("cannot be lower than 11 otherwise freq will not fall") also does not apply after removing the subtraction. For this reason, allow down_threshold to take any value from 1 to 99 and fix the related comment. Fixes: `27ed3cd2eb` (cpufreq: conservative: Fix the logic in frequency decrease checking) Signed-off-by: Tomasz Wilczyński <twilczynski@naver.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
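The relaxed limit can be sketched as a plain range check. This is only an illustration under the 1 to 99 range stated in the message; `store_down_threshold()` and the two macros are hypothetical names, and any additional constraint the governor applies (for example relative to another tunable) is not modelled here.

```c
/* Bounds taken from the commit message; the names are assumptions. */
#define DOWN_THRESHOLD_MIN 1
#define DOWN_THRESHOLD_MAX 99

/* Stores the new value and returns 0, or returns -1 and leaves the
 * current value untouched when the input is out of range. */
static int store_down_threshold(unsigned int *down_threshold,
				unsigned int input)
{
	if (input < DOWN_THRESHOLD_MIN || input > DOWN_THRESHOLD_MAX)
		return -1;
	*down_threshold = input;
	return 0;
}
```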
Arnd Bergmann	7f7bc8bf7f	ila_xlat: add missing hash secret initialization commit `0db47e3d32` upstream. While discussing the possible merits of clang warning about unused initialized functions, I found one function that was clearly meant to be called but never actually is. __ila_hash_secret_init() initializes the hash value for the ila locator, apparently this is intended to prevent hash collision attacks, but this ends up being a read-only zero constant since there is no caller. I could find no indication of why it was never called, the earliest patch submission for the module already was like this. If my interpretation is right, we certainly want to backport the patch to stable kernels as well. I considered adding it to the ila_xlat_init callback, but for best effect the random data is read as late as possible, just before it is first used. The underlying net_get_random_once() is already highly optimized to avoid overhead when called frequently. Fixes: `7f00feaf10` ("ila: Add generic ILA translation facility") Link: https://www.spinics.net/lists/kernel/msg2527243.html Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
Marc Kleine-Budde	814001e796	can: gs_usb: fix memory leak in gs_cmd_reset() commit `5cda3ee513` upstream. This patch adds the missing kfree() in gs_cmd_reset() to free the memory that is not used anymore after usb_control_msg(). Cc: Maximilian Schneider <max@schneidersoft.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
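The leak pattern described above, a buffer allocated for a transfer and never released afterwards, can be sketched in userspace as follows. Everything here is hypothetical: `struct reset_msg`, `cmd_reset()`, the `send_ok`/`send_fail` callbacks and the allocation counter stand in for the driver's message, `usb_control_msg()` and `kfree()`, and only demonstrate freeing the buffer on both the success and the failure path.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical message; not the gs_usb structure. */
struct reset_msg {
	unsigned int mode;
};

/* Counts buffers that are still allocated, to make leaks observable. */
static int live_allocs;

static void *tracked_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void tracked_free(void *p)
{
	if (p)
		live_allocs--;
	free(p);
}

/* Example transfer callbacks standing in for the real transfer call. */
static int send_ok(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return 0;
}

static int send_fail(const void *buf, size_t len)
{
	(void)buf;
	(void)len;
	return -5;
}

/* Sends a reset message and releases the buffer whatever the result. */
static int cmd_reset(int (*send)(const void *, size_t))
{
	struct reset_msg *m = tracked_alloc(sizeof(*m));
	int rc;

	if (!m)
		return -1;
	m->mode = 0;
	rc = send(m, sizeof(*m));
	tracked_free(m);	/* the buffer is unused once the call returns */
	return rc;
}
```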
Nicholas Bellinger	360f227b38	configfs: Fix race between create_link and configfs_rmdir commit `ba80aa909c` upstream. This patch closes a long standing race in configfs between the creation of a new symlink in create_link(), while the symlink target's config_item is being concurrently removed via configfs_rmdir(). This can happen because the symlink target's reference is obtained by config_item_get() in create_link() before the CONFIGFS_USET_DROPPING bit set by configfs_detach_prep() during configfs_rmdir() shutdown is actually checked. This originally manifested itself on ppc64 on v4.8.y under heavy load using ibmvscsi target ports with Novalink API: [ 7877.289863] rpadlpar_io: slot U8247.22L.212A91A-V1-C8 added [ 7879.893760] ------------[ cut here ]------------ [ 7879.893768] WARNING: CPU: 15 PID: 17585 at ./include/linux/kref.h:46 config_item_get+0x7c/0x90 [configfs] [ 7879.893811] CPU: 15 PID: 17585 Comm: targetcli Tainted: G O 4.8.17-customv2.22 #12 [ 7879.893812] task: c00000018a0d3400 task.stack: c0000001f3b40000 [ 7879.893813] NIP: d000000002c664ec LR: d000000002c60980 CTR: c000000000b70870 [ 7879.893814] REGS: c0000001f3b43810 TRAP: 0700 Tainted: G O (4.8.17-customv2.22) [ 7879.893815] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 28222242 XER: 00000000 [ 7879.893820] CFAR: d000000002c664bc SOFTE: 1 GPR00: d000000002c60980 c0000001f3b43a90 d000000002c70908 c0000000fbc06820 GPR04: c0000001ef1bd900 0000000000000004 0000000000000001 0000000000000000 GPR08: 0000000000000000 0000000000000001 d000000002c69560 d000000002c66d80 GPR12: c000000000b70870 c00000000e798700 c0000001f3b43ca0 c0000001d4949d40 GPR16: c00000014637e1c0 0000000000000000 0000000000000000 c0000000f2392940 GPR20: c0000001f3b43b98 0000000000000041 0000000000600000 0000000000000000 GPR24: fffffffffffff000 0000000000000000 d000000002c60be0 c0000001f1dac490 GPR28: 0000000000000004 0000000000000000 c0000001ef1bd900 c0000000f2392940 [ 7879.893839] NIP [d000000002c664ec] config_item_get+0x7c/0x90 
[configfs] [ 7879.893841] LR [d000000002c60980] check_perm+0x80/0x2e0 [configfs] [ 7879.893842] Call Trace: [ 7879.893844] [c0000001f3b43ac0] [d000000002c60980] check_perm+0x80/0x2e0 [configfs] [ 7879.893847] [c0000001f3b43b10] [c000000000329770] do_dentry_open+0x2c0/0x460 [ 7879.893849] [c0000001f3b43b70] [c000000000344480] path_openat+0x210/0x1490 [ 7879.893851] [c0000001f3b43c80] [c00000000034708c] do_filp_open+0xfc/0x170 [ 7879.893853] [c0000001f3b43db0] [c00000000032b5bc] do_sys_open+0x1cc/0x390 [ 7879.893856] [c0000001f3b43e30] [c000000000009584] system_call+0x38/0xec [ 7879.893856] Instruction dump: [ 7879.893858] 409d0014 38210030 e8010010 7c0803a6 4e800020 3d220000 e94981e0 892a0000 [ 7879.893861] 2f890000 409effe0 39200001 992a0000 <0fe00000> 4bffffd0 60000000 60000000 [ 7879.893866] ---[ end trace 14078f0b3b5ad0aa ]--- To close this race, go ahead and obtain the symlink's target config_item reference only after the existing CONFIGFS_USET_DROPPING check succeeds. This way, if configfs_rmdir() wins create_link() will return -ENOENT, and if create_link() wins configfs_rmdir() will return -EBUSY. Reported-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
Christoph Hellwig	6b49f163f0	fs: pass on flags in compat_writev commit `20223f0f39` upstream. Fixes: `793b80ef14` ("vfs: pass a flags argument to vfs_readv/vfs_writev") Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-24 07:06:14 +02:00
Greg Kroah-Hartman	abb04342fc	Linux 4.11.6	2017-06-17 06:47:27 +02:00
Kai Chen	6a9a78ba17	drm/i915: Disable decoupled MMIO commit `4c4c565513` upstream. The decoupled MMIO feature doesn't work as intended by HW team. Enabling it with forcewake will only make debugging efforts more difficult, so let's disable it. Fixes: `85ee17ebee` ("drm/i915/bxt: Broxton decoupled MMIO") Cc: Zhe Wang <zhe1.wang@intel.com> Cc: Praveen Paneri <praveen.paneri@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Signed-off-by: Kai Chen <kai.chen@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170523215812.18328-2-kai.chen@intel.com (cherry picked from commit `0051c10aca`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Maarten Lankhorst	b094fd12b9	drm/i915: Always recompute watermarks when distrust_bios_wm is set, v2. commit `4e3aed8445` upstream. On some systems there can be a race condition in which no crtc state is added to the first atomic commit. This results in all crtc's having a null DDB allocation, causing a FIFO underrun on any update until the first modeset. Changes since v1: - Do not take the connection_mutex, this is already done below. Reported-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Inspired-by: Mahesh Kumar <mahesh1.kumar@intel.com> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: `98d39494d3` ("drm/i915/gen9: Compute DDB allocation at atomic check time (v4)") Cc: Mahesh Kumar <mahesh1.kumar@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170531154236.27180-1-maarten.lankhorst@linux.intel.com Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `367d73d280`) Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2017-06-17 06:44:18 +02:00
Chris Wilson	dcdac1c29f	drm/i915: Guard against i915_ggtt_disable_guc() being invoked unconditionally commit `d90c98905a` upstream. Commit `7c3f86b6dc` ("drm/i915: Invalidate the guc ggtt TLB upon insertion") added the restoration of the invalidation routine after the GuC was disabled, but missed that the GuC was unconditionally disabled when not used. This then overwrites the invalidate routine for the older chipsets, causing havoc and breaking resume as the most obvious victim. We place the guard inside i915_ggtt_disable_guc() to be backport friendly (the bug was introduced into v4.11) but it would be preferred to be in more control over when this was guard (i.e. do not try and teardown the data structures before we have enabled them). That should be true with the reorganisation of the guc loaders. Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Fixes: `7c3f86b6dc` ("drm/i915: Invalidate the guc ggtt TLB upon insertion") Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170531190514.3691-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com> (cherry picked from commit `cb60606d83`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Ville Syrjälä	4d2c473f9f	drm/i915: Workaround VLV/CHV DSI scanline counter hardware fail commit `8f4d38099b` upstream. The scanline counter is bonkers on VLV/CHV DSI. The scanline counter increment is not lined up with the start of vblank like it is on every other platform and output type. This causes problems for both the vblank timestamping and atomic update vblank evasion. On my FFRD8 machine at least, the scanline counter increment happens about 1/3 of a scanline ahead of the start of vblank (which is where all register latching happens still). That means we can't trust the scanline counter to tell us whether we're in vblank or not while we're on that particular line. In order to keep vblank timestamping in working condition when called from the vblank irq, we'll leave scanline_offset at one, which means that the entire line containing the start of vblank is considered to be inside the vblank. For the vblank evasion we'll need to consider that entire line to be bad, since we can't tell whether the registers already got latched or not. And we can't actually use the start of vblank interrupt to get us past that line as the interrupt would fire too soon, and then we'd end up waiting for the next start of vblank instead. One way around that would be using the frame start interrupt instead since that wouldn't fire until the next scanline, but that would require some bigger changes in the interrupt code. So for simplicity we'll just poll until we get past the bad line. 
v2: Adjust the comments a bit Cc: Jonas Aaberg <cja@gmx.net> Tested-by: Jonas Aaberg <cja@gmx.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99086 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161215174734.28779-1-ville.syrjala@linux.intel.com Tested-by: Mika Kahola <mika.kahola@intel.com> Reviewed-by: Mika Kahola <mika.kahola@intel.com> (cherry picked from commit `ec1b4ee283`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Ville Syrjälä	3b981b2388	drm/i915: Fix 90/270 rotated coordinates for FBC commit `1065467ed8` upstream. The clipped src coordinates have already been rotated by 270 degrees for when the plane rotation is 90/270 degrees, hence the FBC code should no longer swap the width and height. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Fixes: `b63a16f6cd` ("drm/i915: Compute display surface offset in the plane check hook for SKL+") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170331180056.14086-4-ville.syrjala@linux.intel.com Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Tested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit `73714c05df`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Daniel Vetter	5b31ae00ac	Revert "drm/i915: Restore lost "Initialized i915" welcome message" commit `d38162e4b5` upstream. This reverts commit `bc5ca47c0a`. Gabriel put this back into generic code with commit `75f6dfe3e6` Author: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Date: Wed Dec 28 12:32:11 2016 -0200 drm: Deduplicate driver initialization message but somehow he missed Chris' patch to add the message meanwhile. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101025 Fixes: `75f6dfe3e6` ("drm: Deduplicate driver initialization message") Cc: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170517131557.7836-1-daniel.vetter@ffwll.ch (cherry picked from commit `6bdba81979`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Christian Borntraeger	c094b10c3c	s390/kvm: do not rely on the ILC on kvm host protection faults commit `c0e7bb38c0` upstream. For most cases a protection exception in the host (e.g. copy on write or dirty tracking) on the sie instruction will indicate an instruction length of 4. Turns out that there are some corner cases (e.g. runtime instrumentation) where this is not necessarily true and the ILC is unpredictable. Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for all possible ILCs. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Max Filippov	2273302258	xtensa: don't use linux IRQ #0 commit `e5c86679d5` upstream. Linux IRQ #0 is reserved for error reporting and may not be used. Increase NR_IRQS for one additional slot and increase irq_domain_add_legacy parameter first_irq value to 1, so that linux IRQ #0 is not associated with hardware IRQ #0 in legacy IRQ domains. Introduce macro XTENSA_PIC_LINUX_IRQ for static translation of xtensa PIC hardware IRQ # to linux IRQ #. Use this macro in XTFPGA platform data definitions. This fixes inability to use hardware IRQ #0 in configurations that don't use device tree and allows for non-identity mapping between linux IRQ # and hardware IRQ #. Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Dave Young	2f3271eb10	efi: Fix boot panic because of invalid BGRT image address commit `792ef14df5` upstream. Maniaxx reported a kernel boot crash in the EFI code, which I emulated by using same invalid phys addr in code: BUG: unable to handle kernel paging request at ffffffffff280001 IP: efi_bgrt_init+0xfb/0x153 ... Call Trace: ? bgrt_init+0xbc/0xbc acpi_parse_bgrt+0xe/0x12 acpi_table_parse+0x89/0xb8 acpi_boot_init+0x445/0x4e2 ? acpi_parse_x2apic+0x79/0x79 ? dmi_ignore_irq0_timer_override+0x33/0x33 setup_arch+0xb63/0xc82 ? early_idt_handler_array+0x120/0x120 start_kernel+0xb7/0x443 ? early_idt_handler_array+0x120/0x120 x86_64_start_reservations+0x29/0x2b x86_64_start_kernel+0x154/0x177 secondary_startup_64+0x9f/0x9f There is also a similar bug filed in bugzilla.kernel.org: https://bugzilla.kernel.org/show_bug.cgi?id=195633 The crash is caused by this commit: `7b0a911478` efi/x86: Move the EFI BGRT init code to early init code The root cause is the firmware on those machines provides invalid BGRT image addresses. In a kernel before above commit BGRT initializes late and uses ioremap() to map the image address. Ioremap validates the address, if it is not a valid physical address ioremap() just fails and returns. However in current kernel EFI BGRT initializes early and uses early_memremap() which does not validate the image address, and kernel panic happens. According to ACPI spec the BGRT image address should fall into EFI_BOOT_SERVICES_DATA, see the section 5.2.22.4 of below document: http://www.uefi.org/sites/default/files/resources/ACPI_6_1.pdf Fix this issue by validating the image address in efi_bgrt_init(). If the image address does not fall into any EFI_BOOT_SERVICES_DATA areas we just bail out with a warning message. 
Reported-by: Maniaxx <tripleshiftone@gmail.com> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: `7b0a911478` ("efi/x86: Move the EFI BGRT init code to early init code") Link: http://lkml.kernel.org/r/20170609084558.26766-2-ard.biesheuvel@linaro.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:18 +02:00
Richard	041d665eb1	partitions/msdos: FreeBSD UFS2 file systems are not recognized commit `223220356d` upstream. The code in block/partitions/msdos.c recognizes FreeBSD, OpenBSD and NetBSD partitions and does a reasonable job picking out OpenBSD and NetBSD UFS subpartitions. But for FreeBSD the subpartitions are always "bad". Kernel: <bsd:bad subpartition - ignored Though all 3 of these BSD systems use UFS as a file system, only FreeBSD uses relative start addresses in the subpartition declarations. The following patch fixes this for FreeBSD partitions and leaves the code for OpenBSD and NetBSD intact: Signed-off-by: Richard Narron <comet.berkeley@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Imre Deak	99f5ba009e	drm/i915: Prevent the system suspend complete optimization commit `6ab92afc95` upstream. Since commit `bac2a909a0` Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Date: Wed Jan 21 02:17:42 2015 +0100 PCI / PM: Avoid resuming PCI devices during system suspend PCI devices will default to allowing the system suspend complete optimization where devices are not woken up during system suspend if they were already runtime suspended. This however breaks the i915/HDA drivers for two reasons: - The i915 driver has system suspend specific steps that it needs to run, that bring the device to a different state than its runtime suspended state. - The HDA driver's suspend handler requires power that it will request from the i915 driver's power domain handler. This in turn requires the i915 driver to runtime resume itself, but this won't be possible if the suspend complete optimization is in effect: in this case the i915 runtime PM is disabled and trying to get an RPM reference returns -EACCES. Solve this by requiring the PCI/PM core to resume the device during system suspend which in effect disables the suspend complete optimization. Regardless of the above commit the optimization stayed disabled for DRM devices until commit `d14d2a8453` Author: Lukas Wunner <lukas@wunner.de> Date: Wed Jun 8 12:49:29 2016 +0200 drm: Remove dev_pm_ops from drm_class so this patch is in practice a fix for this commit. Another reason for the bug staying hidden for so long is that the optimization for a device is disabled if it's disabled for any of its children devices. i915 may have a backlight device as its child which doesn't support runtime PM and so doesn't allow the optimization either. So if this backlight device got registered the bug stayed hidden. 
Credits to Marta, Tomi and David who enabled pstore logging, that caught one instance of this issue across a suspend/resume-to-ram and Ville who remembered that the optimization was enabled for some devices at one point. The first WARN triggered by the problem: [ 6250.746445] WARNING: CPU: 2 PID: 17384 at drivers/gpu/drm/i915/intel_runtime_pm.c:2846 intel_runtime_pm_get+0x6b/0xd0 [i915] [ 6250.746448] pm_runtime_get_sync() failed: -13 [ 6250.746451] Modules linked in: snd_hda_intel i915 vgem snd_hda_codec_hdmi x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic ghash_clmulni_intel e1000e snd_hda_codec snd_hwdep snd_hda_core ptp mei_me pps_core snd_pcm lpc_ich mei prime_numbers i2c_hid i2c_designware_platform i2c_designware_core [last unloaded: i915] [ 6250.746512] CPU: 2 PID: 17384 Comm: kworker/u8:0 Tainted: G U W 4.11.0-rc5-CI-CI_DRM_334+ #1 [ 6250.746515] Hardware name: /NUC5i5RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017 [ 6250.746521] Workqueue: events_unbound async_run_entry_fn [ 6250.746525] Call Trace: [ 6250.746530] dump_stack+0x67/0x92 [ 6250.746536] __warn+0xc6/0xe0 [ 6250.746542] ? pci_restore_standard_config+0x40/0x40 [ 6250.746546] warn_slowpath_fmt+0x46/0x50 [ 6250.746553] ? __pm_runtime_resume+0x56/0x80 [ 6250.746584] intel_runtime_pm_get+0x6b/0xd0 [i915] [ 6250.746610] intel_display_power_get+0x1b/0x40 [i915] [ 6250.746646] i915_audio_component_get_power+0x15/0x20 [i915] [ 6250.746654] snd_hdac_display_power+0xc8/0x110 [snd_hda_core] [ 6250.746661] azx_runtime_resume+0x218/0x280 [snd_hda_intel] [ 6250.746667] pci_pm_runtime_resume+0x76/0xa0 [ 6250.746672] __rpm_callback+0xb4/0x1f0 [ 6250.746677] ? pci_restore_standard_config+0x40/0x40 [ 6250.746682] rpm_callback+0x1f/0x80 [ 6250.746686] ? 
pci_restore_standard_config+0x40/0x40 [ 6250.746690] rpm_resume+0x4ba/0x740 [ 6250.746698] __pm_runtime_resume+0x49/0x80 [ 6250.746703] pci_pm_suspend+0x57/0x140 [ 6250.746709] dpm_run_callback+0x6f/0x330 [ 6250.746713] ? pci_pm_freeze+0xe0/0xe0 [ 6250.746718] __device_suspend+0xf9/0x370 [ 6250.746724] ? dpm_watchdog_set+0x60/0x60 [ 6250.746730] async_suspend+0x1a/0x90 [ 6250.746735] async_run_entry_fn+0x34/0x160 [ 6250.746741] process_one_work+0x1f2/0x6d0 [ 6250.746749] worker_thread+0x49/0x4a0 [ 6250.746755] kthread+0x107/0x140 [ 6250.746759] ? process_one_work+0x6d0/0x6d0 [ 6250.746763] ? kthread_create_on_node+0x40/0x40 [ 6250.746768] ret_from_fork+0x2e/0x40 [ 6250.746778] ---[ end trace 102a62fd2160f5e6 ]--- v2: - Use the new pci_dev->needs_resume flag, to avoid any overhead during the ->pm_prepare hook. (Rafael) v3: - Update commit message to reference the actual regressing commit. (Lukas) v4: - Rebase on v4 of patch 1/2. Fixes: `d14d2a8453` ("drm: Remove dev_pm_ops from drm_class") References: https://bugs.freedesktop.org/show_bug.cgi?id=100378 References: https://bugs.freedesktop.org/show_bug.cgi?id=100770 Cc: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Cc: David Weinehall <david.weinehall@linux.intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Takashi Iwai <tiwai@suse.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: linux-pci@vger.kernel.org Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reported-and-tested-by: Marta Lofstedt <marta.lofstedt@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1493726649-32094-2-git-send-email-imre.deak@intel.com (cherry picked from commit `adfdf85d79`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Imre Deak	d7c0a50b21	PCI/PM: Add needs_resume flag to avoid suspend complete optimization commit `4d071c3238` upstream. Some drivers - like i915 - may not support the system suspend direct complete optimization due to differences in their runtime and system suspend sequence. Add a flag that when set resumes the device before calling the driver's system suspend handlers which effectively disables the optimization. Needed by a future patch fixing suspend/resume on i915. Suggested by Rafael. Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: stable@vger.kernel.org (rebased on v4.8, added kernel version to commit message stable tag) Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Chris Wilson	92220696d5	drm/i915: Do not drop pagetables when empty This is the minimal backport for stable of the upstream commit: commit `dd19674bac` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Wed Feb 15 08:43:46 2017 +0000 drm/i915: Remove bitmap tracking for used-ptes Due to a race with the shrinker, when we try to allocate a pagetable, we may end up shrinking it instead. This comes as a nasty surprise as we try to dereference it to fill in the pagetable entries for the object. In linus/master this is fixed by pinning the pagetables prior to allocation, but that backport is roughly drivers/gpu/drm/i915/i915_gem_gtt.c | 10 ---------- 1 file changed, 10 deletions(-) i.e. unsuitable for stable. Instead we neuter the code that tried to free the pagetables. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99295 Fixes: `2ce5179fe8` ("drm/i915/gtt: Free unused lower-level page tables") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: intel-gfx@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v4.10+ Tested-by: Maël Lavault <mael.lavault@protonmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-17 06:44:17 +02:00
Greg Kroah-Hartman	be8335f6b5	Linux 4.11.5	2017-06-14 15:08:04 +02:00
Vegard Nossum	2c2c503792	kthread: fix boot hang (regression) on MIPS/OpenRISC commit `b0f5a8f32e` upstream. This fixes a regression in commit `4d6501dce0` where I didn't notice that MIPS and OpenRISC were reinitialising p->{set,clear}_child_tid to NULL after our initialisation in copy_process(). We can simply get rid of the arch-specific initialisation here since it is now always done in copy_process() before hitting copy_thread{,_tls}(). Review notes: - As far as I can tell, copy_process() is the only user of copy_thread_tls(), which is the only caller of copy_thread() for architectures that don't implement copy_thread_tls(). - After this patch, there is no arch-specific code touching p->set_child_tid or p->clear_child_tid whatsoever. - It may look like MIPS/OpenRISC wanted to always have these fields be NULL, but that's not true, as copy_process() would unconditionally set them again _after_ calling copy_thread_tls() before commit `4d6501dce0`. Fixes: `4d6501dce0` ("kthread: Fix use-after-free if kthread fork fails") Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> # MIPS only Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: openrisc@lists.librecores.org Cc: Jamie Iles <jamie.iles@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Pablo Neira Ayuso	1a422e547a	netfilter: nft_set_rbtree: handle element re-addition after deletion commit `d2df92e98a` upstream. The existing code selects no next branch to be inspected when re-inserting an inactive element into the rb-tree, looping endlessly. This patch restricts the check for active elements to the EEXIST case only. Fixes: `e701001e7c` ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates") Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Tested-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Jani Nikula	6b183d1b84	drm/i915/vbt: split out defaults that are set when there is no VBT commit `bb1d132935` upstream. The main thing is the DDI ports. If there's a VBT that says there are no outputs, we should trust that, and not have semi-random defaults. Unfortunately, the defaults have resulted in some Chromebooks without VBT relying on this behaviour, so we split out the defaults for the missing VBT case. Reviewed-by: Manasi Navare <manasi.d.navare@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/95c26079ff640d43f53b944f17e9fc356b36daec.1489152288.git.jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Jani Nikula	07f0aa40fc	drm/i915/vbt: don't propagate errors from intel_bios_init() commit `665788572c` upstream. We don't use the error return for anything other than reporting and logging that there is no VBT. We can pull the logging in the function, and remove the error status return. Moreover, if we needed the information for something later on, we'd probably be better off storing the bit in dev_priv, and using it where it's needed, instead of using the error return. While at it, improve the comments. Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/438ebbb0d5f0d321c625065b9cc78532a1dab24f.1489152288.git.jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Paul Moore	1df30f8264	audit: fix the RCU locking for the auditd_connection structure commit `48d0e023af` upstream. Cong Wang correctly pointed out that the RCU read locking of the auditd_connection struct was wrong, this patch corrects this by adopting a more traditional, and correct RCU locking model. This patch is heavily based on an earlier prototype by Cong Wang. Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Thomas Gleixner	f17b15c896	hwmon: (coretemp) Handle frozen hotplug state correctly commit `90b4f30b6d` upstream. The recent conversion to the hotplug state machine missed that the original hotplug notifiers did not execute in the frozen state, which is used on suspend on resume. This does not matter on single socket machines, but on multi socket systems this breaks when the device for a non-boot socket is removed when the last CPU of that socket is brought offline. The device removal locks up the machine hard w/o any debug output. Prevent executing the hotplug callbacks when cpuhp_tasks_frozen is true. Thanks to Tommi for providing debug information patiently while I failed to spot the obvious. Fixes: `e00ca5df37` ("hwmon: (coretemp) Convert to hotplug state machine") Reported-by: Tommi Rantala <tt.rantala@gmail.com> Tested-by: Tommi Rantala <tt.rantala@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: "Chen, Yu C" <yu.c.chen@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:51 +02:00
Chandan Rajendra	dcaa3c1ec9	iomap_dio_rw: Prevent reading file data beyond iomap_dio->i_size commit `a008c31c7e` upstream. On a ppc64 machine executing overlayfs/019 with xfs as the lower and upper filesystem causes the following call trace, WARNING: CPU: 2 PID: 8034 at /root/repos/linux/fs/iomap.c:765 .iomap_dio_actor+0xcc/0x420 Modules linked in: CPU: 2 PID: 8034 Comm: fsstress Tainted: G L 4.11.0-rc5-next-20170405 #100 task: c000000631314880 task.stack: c0000003915d4000 NIP: c00000000035a72c LR: c00000000035a6f4 CTR: c00000000035a660 REGS: c0000003915d7570 TRAP: 0700 Tainted: G L (4.11.0-rc5-next-20170405) MSR: 800000000282b032 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 24004284 XER: 00000000 CFAR: c0000000006f7190 SOFTE: 1 GPR00: c00000000035a6f4 c0000003915d77f0 c0000000015a3f00 000000007c22f600 GPR04: 000000000022d000 0000000000002600 c0000003b2d56360 c0000003915d7960 GPR08: c0000003915d7cd0 0000000000000002 0000000000002600 c000000000521cc0 GPR12: 0000000024004284 c00000000fd80a00 000000004b04ae64 ffffffffffffffff GPR16: 000000001000ca70 0000000000000000 c0000003b2d56380 c00000000153d2b8 GPR20: 0000000000000010 c0000003bc87bac8 0000000000223000 000000000022f5ff GPR24: c0000003b2d56360 000000000000000c 0000000000002600 000000000022d000 GPR28: 0000000000000000 c0000003915d7960 c0000003b2d56360 00000000000001ff NIP [c00000000035a72c] .iomap_dio_actor+0xcc/0x420 LR [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 Call Trace: [c0000003915d77f0] [c00000000035a6f4] .iomap_dio_actor+0x94/0x420 (unreliable) [c0000003915d78f0] [c00000000035b9f4] .iomap_apply+0xf4/0x1f0 [c0000003915d79d0] [c00000000035c320] .iomap_dio_rw+0x230/0x420 [c0000003915d7ae0] [c000000000512a14] .xfs_file_dio_aio_read+0x84/0x160 [c0000003915d7b80] [c000000000512d24] .xfs_file_read_iter+0x104/0x130 [c0000003915d7c10] [c0000000002d6234] .__vfs_read+0x114/0x1a0 [c0000003915d7cf0] [c0000000002d7a8c] .vfs_read+0xac/0x1a0 [c0000003915d7d90] [c0000000002d96b8] .SyS_read+0x58/0x100 [c0000003915d7e30] 
[c00000000000b8e0] system_call+0x38/0xfc Instruction dump: 78630020 7f831b78 7ffc07b4 7c7ce039 40820360 a13d0018 2f890003 419e0288 2f890004 419e00a0 2f890001 419e02a8 <0fe00000> 3b80fffb 38210100 7f83e378 The above problem can also be recreated on a regular xfs filesystem using the command, $ fsstress -d /mnt -l 1000 -n 1000 -p 1000 The reason for the call trace is, 1. When 'reserving' blocks for delayed allocation, XFS reserves more blocks (i.e. past file's current EOF) than required. This is done because XFS assumes that userspace might write more data and hence 'reserving' more blocks might lead to the file's new data being stored contiguously on disk. 2. The in-memory 'struct xfs_bmbt_irec' mapping the file's last extent would then cover the prealloc-ed EOF blocks in addition to the regular blocks. 3. When flushing the dirty blocks to disk, we only flush data till the file's EOF. But before writing out the dirty data, we allocate blocks on the disk for holding the file's new data. This allocation includes the blocks that are part of the 'prealloc EOF blocks'. 4. Later, when the last reference to the inode is being closed, XFS frees the unused 'prealloc EOF blocks' in xfs_inactive(). In step 3 above, when allocating space on disk for the delayed allocation range, the space allocator might sometimes allocate fewer blocks than required. If such an allocation ends right at the current EOF of the file, we will not be able to clear the "delayed allocation" flag for the 'prealloc EOF blocks', since we won't have dirty buffer heads associated with that range of the file. In such a situation if a Direct I/O read operation is performed on file range [X, Y] (where X < EOF and Y > EOF), we flush dirty data in the range [X, Y] and invalidate page cache for that range (Refer to iomap_dio_rw()). Later for performing the Direct I/O read, XFS obtains the extent items (which are still cached in memory) for the file range.
When doing so we are not supposed to get an extent item with IOMAP_DELALLOC flag set, since the previous "flush" operation should have converted any delayed allocation data in the range [X, Y]. Hence we end up hitting a WARN_ON_ONCE(1) statement in iomap_dio_actor(). This commit fixes the bug by preventing the read operation from going beyond iomap_dio->i_size. Reported-by: Santhosh G <santhog4@linux.vnet.ibm.com> Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Tejun Heo	695b2bd64b	cgroup: mark cgroup_get() with __maybe_unused commit `310b4816a5` upstream. `a590b90d47` ("cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc()") converted most cgroup_get() usages to cgroup_get_live() leaving cgroup_sk_alloc() the sole user of cgroup_get(). When !CONFIG_SOCK_CGROUP_DATA, this ends up triggering unused warning for cgroup_get(). Silence the warning by adding __maybe_unused to cgroup_get(). Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20170501145340.17e8ef86@canb.auug.org.au Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Wei Yongjun	ba301861f6	pinctrl: cherryview: Add terminate entry for dmi_system_id tables commit `a9de080bbc` upstream. Make sure dmi_system_id tables are NULL terminated. Fixes: `7036502783` ("pinctrl: cherryview: Add a quirk to make Acer Chromebook keyboard work again") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Cc: Jean Delvare <jdelvare@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
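The reason a table walker needs a terminating entry can be sketched in plain user-space C. This is only an illustration under stated assumptions: `struct example_id`, `example_table`, and `example_match()` are hypothetical names, not the kernel's `dmi_system_id` API. The loop has no length argument, so it stops only when it reaches the all-NULL sentinel; without that entry it would read past the end of the array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an ID table entry (not the kernel struct). */
struct example_id {
    const char *vendor;
    const char *product;
};

/* The final all-NULL entry marks the end of the table. */
const struct example_id example_table[] = {
    { "Acme", "Model A" },
    { "Acme", "Model B" },
    { NULL, NULL }, /* terminating entry */
};

/* Walk the table until the sentinel; return 1 on a match, 0 otherwise. */
int example_match(const struct example_id *table,
                  const char *vendor, const char *product)
{
    const struct example_id *id;

    for (id = table; id->vendor != NULL; id++) {
        if (strcmp(id->vendor, vendor) == 0 &&
            strcmp(id->product, product) == 0)
            return 1;
    }
    return 0;
}
```

A lookup for an entry that is not present only terminates safely because the sentinel is there.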
Takatoshi Akiyama	944601cf16	serial: sh-sci: Fix panic when serial console and DMA are enabled commit `3c9101766b` upstream. This patch fixes a kernel panic that happens when DMA is enabled and the enter key is pressed on the serial console while the kernel is booting. * An interrupt may occur after sci_request_irq(). * DMA transfer area is initialized by setup_timer() in sci_request_dma() and used in interrupt. If an interrupt occurs between sci_request_irq() and setup_timer() in sci_request_dma(), the DMA transfer area has not been initialized yet. So, this patch changes the order of sci_request_irq() and sci_request_dma(). Fixes: `73a19e4c03` ("serial: sh-sci: Add DMA support.") Signed-off-by: Takatoshi Akiyama <takatoshi.akiyama.kj@ps.hitachi-solutions.com> [Shimoda changes the commit log] Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Michał Winiarski	4e979ac972	drm/i915/skl: Add missing SKL ID commit `ca7a45ba6f` upstream. Used by production device: Intel(R) Iris(TM) Graphics P555 Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170227112256.20060-1-michal.winiarski@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Ville Syrjälä	fa8a1ce792	drm/i915: Fix runtime PM for LPE audio commit `668e3b014a` upstream. Not calling pm_runtime_enable() means that runtime PM can't be enabled at all via sysfs. So we definitely need to call it from somewhere. Calling it from the driver seems like a bad idea because it would have to be paired with a pm_runtime_disable() at driver unload time, otherwise the core gets upset. Also if there's no LPE audio driver loaded then we couldn't runtime suspend i915 either. So it looks like a better plan is to call it from i915 when we register the platform device. That seems to match how pci generally does things. I cargo culted the pm_runtime_forbid() and pm_runtime_set_active() calls from pci as well. The exposed runtime PM API is massive and thoroughly misleading, so I don't actually know if this is how you're supposed to use the API or not. But it seems to work. I can now runtime suspend i915 again with or without the LPE audio driver loaded, and reloading the LPE audio driver also seems to work. Note that powertop won't auto-tune runtime PM for platform devices, which is a little annoying. So I'm not sure that leaving runtime PM in "on" mode by default is the best choice here. But I've left it like that for now at least. Also remove the comment about there not being much benefit from LPE audio runtime PM. Not allowing runtime PM blocks i915 runtime PM, which will also block s0ix, and that could have a measurable impact on power consumption.
Cc: Takashi Iwai <tiwai@suse.de> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Fixes: `0b6b524f39` ("ALSA: x86: Don't enable runtime PM as default") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170427160231.13337-2-ville.syrjala@linux.intel.com Reviewed-by: Takashi Iwai <tiwai@suse.de> (cherry picked from commit `183c00350c`) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Julius Werner	fd6c3d0a3b	drivers: char: mem: Fix wraparound check to allow mappings up to the end commit `32829da54d` upstream. A recent fix to /dev/mem prevents mappings from wrapping around the end of physical address space. However, the check was written in a way that also prevents a mapping reaching just up to the end of physical address space, which may be a valid use case (especially on 32-bit systems). This patch fixes it by checking the last mapped address (instead of the first address behind that) for overflow. Fixes: `b299cde245` ("drivers: char: mem: Check for address space wraparound with mmap()") Reported-by: Nico Huber <nico.h@gmx.de> Signed-off-by: Julius Werner <jwerner@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
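The difference between the two overflow checks described above can be sketched in user-space C. This is a minimal illustration, assuming a 32-bit physical address space; `phys_addr32_t`, `wraps_strict()`, and `wraps_last_byte()` are hypothetical names and this is not the driver's actual code. The strict form tests the first address behind the mapping, so it also rejects a mapping that ends exactly at the top of the address space; the relaxed form tests the last mapped address instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit stand-in for a physical address type. */
typedef uint32_t phys_addr32_t;

/* Too strict: checks the first address *behind* the mapping. */
bool wraps_strict(phys_addr32_t offset, phys_addr32_t size)
{
    return (phys_addr32_t)(offset + size) < offset;
}

/*
 * Relaxed: checks the last mapped address.
 * Assumes size is non-zero, as it is for a real mapping request.
 */
bool wraps_last_byte(phys_addr32_t offset, phys_addr32_t size)
{
    return (phys_addr32_t)(offset + size - 1) < offset;
}
```

For a mapping at `0xFFFFF000` of size `0x1000`, the strict check reports a wraparound while the relaxed check accepts it; a size of `0x2000` at the same offset is rejected by both.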
Sebastian Andrzej Siewior	fc337d0e00	cpu/hotplug: Drop the device lock on error commit `40da1b11f0` upstream. If a custom CPU target is specified and that one is not available _or_ can't be interrupted then the code returns to userland without dropping a lock as noticed by lockdep: |echo 133 > /sys/devices/system/cpu/cpu7/hotplug/target | ================================================ | [ BUG: lock held when returning to user space! ] | ------------------------------------------------ | bash/503 is leaving the kernel with locks still held! | 1 lock held by bash/503: | #0: (device_hotplug_lock){+.+...}, at: [<ffffffff815b5650>] lock_device_hotplug_sysfs+0x10/0x40 So release the lock then. Fixes: `757c989b99` ("cpu/hotplug: Make target state writeable") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170602142714.3ogo25f2wbq6fjpj@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:50 +02:00
Takashi Iwai	4fa867a37a	ASoC: Fix use-after-free at card unregistration commit `4efda5f213` upstream. soc_cleanup_card_resources() calls snd_card_free() at the end of its procedure. This turned out to lead to a use-after-free. PCM runtimes have been already removed via soc_remove_pcm_runtimes(), while they are dereferenced later in soc_pcm_free() called via snd_card_free(). The fix is simple: just move the snd_card_free() call to the beginning of the whole procedure. This also gives another benefit: it guarantees that all operations have been shut down before actually releasing the resources, which was racy until now. Reported-and-tested-by: Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Takashi Iwai	6d4bee6600	ALSA: timer: Fix missing queue indices reset at SNDRV_TIMER_IOCTL_SELECT commit `ba3021b2c7` upstream. snd_timer_user_tselect() reallocates the queue buffer dynamically, but it forgot to reset its indices. Since the read may happen concurrently with ioctl and snd_timer_user_tselect() allocates the buffer via kmalloc(), this may lead to the leak of uninitialized kernel-space data, as spotted via KMSAN: BUG: KMSAN: use of unitialized memory in snd_timer_user_read+0x6c4/0xa10 CPU: 0 PID: 1037 Comm: probe Not tainted 4.11.0-rc5+ #2739 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:1007 kmsan_check_memory+0xc2/0x140 mm/kmsan/kmsan.c:1086 copy_to_user ./arch/x86/include/asm/uaccess.h:725 snd_timer_user_read+0x6c4/0xa10 sound/core/timer.c:2004 do_loop_readv_writev fs/read_write.c:716 __do_readv_writev+0x94c/0x1380 fs/read_write.c:864 do_readv_writev fs/read_write.c:894 vfs_readv fs/read_write.c:908 do_readv+0x52a/0x5d0 fs/read_write.c:934 SYSC_readv+0xb6/0xd0 fs/read_write.c:1021 SyS_readv+0x87/0xb0 fs/read_write.c:1018 This patch adds the missing reset of queue indices. Together with the previous fix for the ioctl/read race, we cover the whole problem. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Takashi Iwai	9018818b24	ALSA: timer: Fix race between read and ioctl commit `d11662f4f7` upstream. The read from ALSA timer device, the function snd_timer_user_tread(), may access uninitialized struct snd_timer_user fields when the read is performed concurrently with an ioctl such as snd_timer_user_tselect(). We have already fixed the races among ioctls via a mutex, but we seem to have forgotten the race between read vs ioctl. This patch simply applies (more exactly extends the already applied range of) tu->ioctl_lock in snd_timer_user_tread() for closing the race window. Reported-by: Alexander Potapenko <glider@google.com> Tested-by: Alexander Potapenko <glider@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Ben Skeggs	a8282f018b	drm/nouveau/tmr: fully separate alarm execution/pending lists commit `b4e382ca75` upstream. Reusing the list_head for both is a bad idea. Callback execution is done with the lock dropped so that alarms can be rescheduled from the callback, which means that with some unfortunate timing, lists can get corrupted. The execution list should not require its own locking, the single function that uses it can only be called from a single context. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Dominik Brodowski	a7e7ade1b4	x86/microcode/intel: Clear patch pointer before jettisoning the initrd commit `5b0bc9ac2c` upstream. During early boot, load_ucode_intel_ap() uses __load_ucode_intel() to obtain a pointer to the relevant microcode patch (embedded in the initrd), and stores this value in 'intel_ucode_patch' to speed up the microcode patch application for subsequent CPUs. On resuming from suspend-to-RAM, however, load_ucode_ap() calls load_ucode_intel_ap() for each non-boot-CPU. By then the initramfs is long gone so the pointer stored in 'intel_ucode_patch' no longer points to a valid microcode patch. Clear that pointer so that we effectively fall back to the CPU hotplug notifier callbacks to update the microcode. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> [ Edit and massage commit message. ] Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170607095819.9754-1-bp@alien8.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Sinclair Yeh	3bc7a4a564	drm/vmwgfx: Make sure backup_handle is always valid commit `07678eca2c` upstream. When vmw_gb_surface_define_ioctl() is called with an existing buffer, we end up returning an uninitialized variable in the backup_handle. The fix is to first initialize backup_handle to 0 just to be sure, and second, when a user-provided buffer is found, we will use the req->buffer_handle as the backup_handle. Reported-by: Murray McAllister <murray.mcallister@insomniasec.com> Signed-off-by: Sinclair Yeh <syeh@vmware.com> Reviewed-by: Deepak Rawat <drawat@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Vladis Dronov	6a6a485719	drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() commit `ee9c4e681e` upstream. The 'req->mip_levels' parameter in vmw_gb_surface_define_ioctl() is a user-controlled 'uint32_t' value which is used as a loop count limit. This can lead to a kernel lockup and DoS. Add check for 'req->mip_levels'. References: https://bugzilla.redhat.com/show_bug.cgi?id=1437431 Signed-off-by: Vladis Dronov <vdronov@redhat.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
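The class of fix described above, bounding a user-controlled loop count before using it, can be sketched as follows. This is a user-space illustration only: `EXAMPLE_MAX_MIP_LEVELS` and `total_mip_bytes()` are hypothetical, and the per-level size model (each level a quarter of the previous one) is an assumption made for the example, not taken from the driver.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical upper bound; the real driver defines its own limit. */
#define EXAMPLE_MAX_MIP_LEVELS 24u

/*
 * Sum the byte sizes of all mip levels.
 * Rejects an out-of-range level count up front, so the loop below can
 * never run for an attacker-chosen number of iterations.
 */
int total_mip_bytes(uint32_t mip_levels, uint32_t base_bytes, uint64_t *out)
{
    uint64_t total = 0;
    uint32_t level_bytes = base_bytes;
    uint32_t i;

    if (mip_levels > EXAMPLE_MAX_MIP_LEVELS)
        return -EINVAL;

    for (i = 0; i < mip_levels; i++) {
        total += level_bytes;
        level_bytes /= 4; /* assumed: each level is a quarter the size */
    }

    *out = total;
    return 0;
}
```

The key design point is that validation happens before any work is done with the value, so an invalid request fails fast with an error code.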
Dan Carpenter	8c704276e2	drm/vmwgfx: Handle vmalloc() failure in vmw_local_fifo_reserve() commit `f0c62e9878` upstream. If vmalloc() fails then we need to do a bit of cleanup before returning. Fixes: `fb1d9738ca` ("drm/vmwgfx: Add DRM driver for VMware Virtual GPU") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Sinclair Yeh <syeh@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:49 +02:00
Timur Tabi	3e857aaa8f	net: qcom/emac: do not use hardware mdio automatic polling commit `246096690b` upstream. Use software polling (PHY_POLL) to check for link state changes instead of relying on the EMAC's hardware polling feature. Some PHY drivers are unable to get a functioning link because the HW polling is not robust enough. The EMAC is able to poll the PHY on the MDIO bus looking for link state changes (via the Link Status bit in the Status Register at address 0x1). When the link state changes, the EMAC triggers an interrupt and tells the driver what the new state is. The feature eliminates the need for software to poll the MDIO bus. Unfortunately, this feature is incompatible with phylib, because it ignores everything that the PHY core and PHY drivers are trying to do. In particular: 1. It assumes a compatible register set, so PHYs with different registers may not work. 2. It doesn't allow for hardware errata that have work-arounds implemented in the PHY driver. 3. It doesn't support multiple register pages. If the PHY core switches the register set to another page, the EMAC won't know the page has changed and will still attempt to read the same PHY register. 4. It only checks the copper side of the link, not the SGMII side. Some PHY drivers (e.g. at803x) may also check the SGMII side, and report the link as not ready during autonegotiation if the SGMII link is still down. Phylib then waits for another interrupt to query the PHY again, but the EMAC won't send another interrupt because it thinks the link is up. Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Signed-off-by: Timur Tabi <timur@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Paolo Bonzini	cadb844501	srcu: Allow use of Classic SRCU from both process and interrupt context commit `1123a60416` upstream. Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit `719d93cd5f` ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance that only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit `5a41344a3d` ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise.
Fixes: `719d93cd5f` ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Jin Yao	574f76fc58	perf/core: Drop kernel samples even though :u is specified commit `cc1582c231` upstream. When doing sampling, for example: perf record -e cycles:u ... On workloads that do a lot of kernel entry/exits we see kernel samples, even though :u is specified. This is due to skid existing. This might be a security issue because it can leak kernel addresses even though kernel sampling support is disabled. The patch drops the kernel samples if exclude_kernel is specified. For example, test on Haswell desktop: perf record -e cycles:u <mgen> perf report --stdio Before patch applied: 99.77% mgen mgen [.] buf_read 0.20% mgen mgen [.] rand_buf_init 0.01% mgen [kernel.vmlinux] [k] apic_timer_interrupt 0.00% mgen mgen [.] last_free_elem 0.00% mgen libc-2.23.so [.] __random_r 0.00% mgen libc-2.23.so [.] _int_malloc 0.00% mgen mgen [.] rand_array_init 0.00% mgen [kernel.vmlinux] [k] page_fault 0.00% mgen libc-2.23.so [.] __random 0.00% mgen libc-2.23.so [.] __strcasestr 0.00% mgen ld-2.23.so [.] strcmp 0.00% mgen ld-2.23.so [.] _dl_start 0.00% mgen libc-2.23.so [.] sched_setaffinity@@GLIBC_2.3.4 0.00% mgen ld-2.23.so [.] _start We can see kernel symbols apic_timer_interrupt and page_fault. After patch applied: 99.79% mgen mgen [.] buf_read 0.19% mgen mgen [.] rand_buf_init 0.00% mgen libc-2.23.so [.] __random_r 0.00% mgen mgen [.] rand_array_init 0.00% mgen mgen [.] last_free_elem 0.00% mgen libc-2.23.so [.] vfprintf 0.00% mgen libc-2.23.so [.] rand 0.00% mgen libc-2.23.so [.] __random 0.00% mgen libc-2.23.so [.] _int_malloc 0.00% mgen libc-2.23.so [.] _IO_doallocbuf 0.00% mgen ld-2.23.so [.] do_lookup_x 0.00% mgen ld-2.23.so [.] open_verify.constprop.7 0.00% mgen ld-2.23.so [.] _dl_important_hwcaps 0.00% mgen libc-2.23.so [.] sched_setaffinity@@GLIBC_2.3.4 0.00% mgen ld-2.23.so [.] _start There are only userspace symbols. 
Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: acme@kernel.org Cc: jolsa@kernel.org Cc: kan.liang@intel.com Cc: mark.rutland@arm.com Cc: will.deacon@arm.com Cc: yao.jin@intel.com Link: http://lkml.kernel.org/r/1495706947-3744-1-git-send-email-yao.jin@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Andrew Lunn	52782a1e16	Revert "ata: sata_mv: Convert to devm_ioremap_resource()" commit `3e4240da0e` upstream. This reverts commit `368e5fbdfc`. devm_ioremap_resource() enforces that there are no overlapping resources, whereas devm_ioremap() does not. The sata phy driver needs a subset of the sata IO address space, so maps some of the sata address space. As a result, sata_mv now fails to probe, reporting it cannot get its resources, and so we don't have any SATA disks. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Breno Leitao	93dfb4e6a7	powerpc/kernel: Initialize load_tm on task creation commit `7f22ced437` upstream. Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: `5d176f751e` ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Breno Leitao	154b657008	powerpc/kernel: Fix FP and vector register restoration commit `1195892c09` upstream. Currently tsk->thread->load_vec and load_fp are not initialized during task creation, which can lead to garbage values in these variables (non-zero values). These variables will be checked later in restore_math() to validate if the FP and vector registers are being utilized. Since these values might be non-zero, the restore_math() will continue to save the FP and vectors even if they were never utilized by the userspace application. load_fp and load_vec counters will then overflow (they wrap at 255) and the FP and Altivec will be finally disabled, but before that condition is reached (counter overflow) several context switches will have restored FP and vector registers without need, causing a performance degradation. Fixes: `70fe3d980f` ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Gustavo Romero <gusbromero@gmail.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
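The cost of starting with a garbage counter can be illustrated with a simplified model. This is a sketch based only on the description above, not the kernel's restore_math() code: it assumes an 8-bit counter that is incremented on every restore while it is non-zero, and that restoring stops once the counter wraps to zero. `needless_restores()` is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Simplified model: count how many restores happen before an 8-bit
 * counter that starts at `initial` wraps around to zero.
 * A task that never used FP should start at 0 and restore nothing.
 */
unsigned int needless_restores(uint8_t initial)
{
    uint8_t counter = initial;
    unsigned int restores = 0;

    while (counter != 0) {
        restores++;
        counter++; /* unsigned 8-bit arithmetic wraps from 255 to 0 */
    }
    return restores;
}
```

Under this model an initialized counter of 0 costs nothing, while a garbage value of 1 would cause 255 unnecessary restores before the counter wraps.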
Michael Bringmann	c3e2874653	powerpc/hotplug-mem: Fix missing endian conversion of aa_index commit `dc421b200f` upstream. When adding or removing memory, the aa_index (affinity value) for the memblock must also be converted to match the endianness of the rest of the 'ibm,dynamic-memory' property. Otherwise, subsequent retrieval of the attribute will likely lead to non-existent nodes, followed by using the default node in the code inappropriately. Fixes: `5f97b2a0d1` ("powerpc/pseries: Implement memory hotplug add in the kernel") Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
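Why the conversion matters can be shown with a small user-space sketch. The helpers `put_be32()` and `get_be32()` are hypothetical and are not the kernel's byte-order macros; they assume only that the property stores each 32-bit cell in big-endian byte order. Writing a value with explicit big-endian byte placement makes the stored bytes independent of host endianness, so a later big-endian read recovers the same number.

```c
#include <stdint.h>

/* Store a 32-bit value into a buffer in big-endian byte order. */
void put_be32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Read a 32-bit big-endian value back from a buffer. */
uint32_t get_be32(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) |
           (uint32_t)in[3];
}
```

If a little-endian host copied the raw in-memory value into the buffer instead, an index of 1 would be stored as bytes `01 00 00 00` and read back as 0x01000000, which matches the "non-existent nodes" symptom described above.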
Michael Ellerman	840407e3da	powerpc/numa: Fix percpu allocations to be NUMA aware commit `ba4a648f12` upstream. In commit `8c27226119` ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. 
With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Fixes: `8c27226119` ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:48 +02:00
Christophe Leroy	2bcd25cbc6	powerpc/sysdev/simple_gpio: Fix oops in gpio save_regs function commit `6f553912ee` upstream. of_mm_gpiochip_add_data() generates an oops for NULL pointer dereference. of_mm_gpiochip_add_data() calls mm_gc->save_regs() before setting the data, therefore ->save_regs() cannot use gpiochip_get_data() Fixes: `937daafca7` ("powerpc: simple-gpio: use gpiochip data pointer") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Joe Carnuccio	0ce1692273	scsi: qla2xxx: Fix mailbox pointer error in fwdump capture commit `74939a0bc7` upstream. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Joe Carnuccio	223b1549f0	scsi: qla2xxx: Set bit 15 for DIAG_ECHO_TEST MBC commit `1d63496516` upstream. Set bit (BIT_15) to send right ECHO payload information for Diagnostic Echo Test command. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Joe Carnuccio	67034eaa0d	scsi: qla2xxx: Modify T262 FW dump template to specify same start/end to debug customer issues commit `ce6c668b14` upstream. Firmware dump allows for debugging customer issues. This patch fixes start/end pointer calculation to capture T262 template entry for dump tool. Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Quinn Tran	21fffaa1b7	scsi: qla2xxx: Fix NULL pointer access due to redundant fc_host_port_name call commit `0ea88662b5` upstream. Remove redundant fc_host_port_name calls to prevent early access of scsi_host->shost_data buffer. This prevents a NULL pointer access. The following stack trace is seen: BUG: unable to handle kernel NULL pointer dereference at 00000000000008 IP: qla24xx_report_id_acquisition+0x22d/0x3a0 [qla2xxx] Signed-off-by: Quinn Tran <quinn.tran@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Sawan Chandak	96e0d45552	scsi: qla2xxx: Fix crash due to mismatch number of Q-pair creation for Multi queue commit `b95b9452aa` upstream. When the driver is loaded with Multi Queue enabled, it was noticed that there was one less queue pair created. The following message would indicate this: "No resources to create additional q pair." The result of one less queue pair means that the system can crash if the block mq layer thinks there is an extra hardware queue available, and the driver will use a NULL ptr qpair in that instance. The following stack trace is seen in one of the crashes: irq_create_affinity_masks+0x98/0x530 irq_create_affinity_masks+0x98/0x530 __pci_enable_msix+0x321/0x4e0 mutex_lock+0x12/0x40 pci_alloc_irq_vectors_affinity+0xb5/0x140 qla24xx_enable_msix+0x79/0x530 [qla2xxx] qla2x00_request_irqs+0x61/0x2d0 [qla2xxx] qla2x00_probe_one+0xc73/0x2390 [qla2xxx] ida_simple_get+0x98/0x100 kernfs_next_descendant_post+0x40/0x50 local_pci_probe+0x45/0xa0 pci_device_probe+0xfc/0x140 driver_probe_device+0x2c5/0x470 __driver_attach+0xdd/0xe0 driver_probe_device+0x470/0x470 bus_for_each_dev+0x6c/0xc0 driver_attach+0x1e/0x20 bus_add_driver+0x45/0x270 driver_register+0x60/0xe0 __pci_register_driver+0x4c/0x50 qla2x00_module_init+0x1ce/0x21e [qla2xxx] Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
himanshu.madhani@cavium.com	083dca440b	scsi: qla2xxx: Fix recursive loop during target mode configuration for ISP25XX leaving system unresponsive commit `cb590700e0` upstream. The following messages are seen in the system logs: qla2xxx [0000:09:00.0]-00af:9: Performing ISP error recovery - ha=ffff98315ee30000. qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware. qla2xxx [0000:09:00.0]-d009:9: Firmware has been previously dumped (ffffba488c001000) -- ignoring request. qla2xxx [0000:09:00.0]-504b:9: RISC paused -- HCCR=40, Dumping firmware. See Bugzilla for details: https://bugzilla.kernel.org/show_bug.cgi?id=195285 Fixes: `d74595278f` ("scsi: qla2xxx: Add multiple queue pair functionality.") Reported-by: Laurence Oberman <loberman@redhat.com> Reported-by: Anthony Bloodoff <anthony.bloodoff@gmail.com> Tested-by: Laurence Oberman <loberman@redhat.com> Tested-by: Anthony Bloodoff <anthony.bloodoff@gmail.com> Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com> Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Johannes Thumshirn	daa15f6f84	scsi: qla2xxx: don't disable a not previously enabled PCI device commit `ddff7ed45e` upstream. When pci_enable_device() or pci_enable_device_mem() fail in qla2x00_probe_one() we bail out but do a call to pci_disable_device(). This causes the dev_WARN_ON() in pci_disable_device() to trigger, as the device wasn't enabled previously. So instead of taking the 'probe_out' error path we can directly return iff one of the pci_enable_device() calls fails. Additionally rename the 'probe_out' goto label's name to the more descriptive 'disable_device'. Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Fixes: `e315cd28b9` ("[SCSI] qla2xxx: Code changes for qla data structure refactoring") Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Giridhar Malavali <giridhar.malavali@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:47 +02:00
Marc Zyngier	2bae71aa85	KVM: arm/arm64: Handle possible NULL stage2 pud when ageing pages commit `d6dbdd3c85` upstream. Under memory pressure, we start ageing pages, which amounts to parsing the page tables. Since we don't want to allocate any extra level, we pass NULL for our private allocation cache. Which means that stage2_get_pud() is allowed to fail. This results in the following splat: [ 1520.409577] Unable to handle kernel NULL pointer dereference at virtual address 00000008 [ 1520.417741] pgd = ffff810f52fef000 [ 1520.421201] [00000008] pgd=0000010f636c5003, pud=0000010f56f48003, *pmd=0000000000000000 [ 1520.429546] Internal error: Oops: 96000006 [#1] PREEMPT SMP [ 1520.435156] Modules linked in: [ 1520.438246] CPU: 15 PID: 53550 Comm: qemu-system-aar Tainted: G W 4.12.0-rc4-00027-g1885c397eaec #7205 [ 1520.448705] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB12A 10/26/2016 [ 1520.463726] task: ffff800ac5fb4e00 task.stack: ffff800ce04e0000 [ 1520.469666] PC is at stage2_get_pmd+0x34/0x110 [ 1520.474119] LR is at kvm_age_hva_handler+0x44/0xf0 [ 1520.478917] pc : [<ffff0000080b137c>] lr : [<ffff0000080b149c>] pstate: 40000145 [ 1520.486325] sp : ffff800ce04e33d0 [ 1520.489644] x29: ffff800ce04e33d0 x28: 0000000ffff40064 [ 1520.494967] x27: 0000ffff27e00000 x26: 0000000000000000 [ 1520.500289] x25: ffff81051ba65008 x24: 0000ffff40065000 [ 1520.505618] x23: 0000ffff40064000 x22: 0000000000000000 [ 1520.510947] x21: ffff810f52b20000 x20: 0000000000000000 [ 1520.516274] x19: 0000000058264000 x18: 0000000000000000 [ 1520.521603] x17: 0000ffffa6fe7438 x16: ffff000008278b70 [ 1520.526940] x15: 000028ccd8000000 x14: 0000000000000008 [ 1520.532264] x13: ffff7e0018298000 x12: 0000000000000002 [ 1520.537582] x11: ffff000009241b93 x10: 0000000000000940 [ 1520.542908] x9 : ffff0000092ef800 x8 : 0000000000000200 [ 1520.548229] x7 : ffff800ce04e36a8 x6 : 0000000000000000 [ 1520.553552] x5 : 0000000000000001 x4 : 0000000000000000 [ 1520.558873] x3 : 
0000000000000000 x2 : 0000000000000008 [ 1520.571696] x1 : ffff000008fd5000 x0 : ffff0000080b149c [ 1520.577039] Process qemu-system-aar (pid: 53550, stack limit = 0xffff800ce04e0000) [...] [ 1521.510735] [<ffff0000080b137c>] stage2_get_pmd+0x34/0x110 [ 1521.516221] [<ffff0000080b149c>] kvm_age_hva_handler+0x44/0xf0 [ 1521.522054] [<ffff0000080b0610>] handle_hva_to_gpa+0xb8/0xe8 [ 1521.527716] [<ffff0000080b3434>] kvm_age_hva+0x44/0xf0 [ 1521.532854] [<ffff0000080a58b0>] kvm_mmu_notifier_clear_flush_young+0x70/0xc0 [ 1521.539992] [<ffff000008238378>] __mmu_notifier_clear_flush_young+0x88/0xd0 [ 1521.546958] [<ffff00000821eca0>] page_referenced_one+0xf0/0x188 [ 1521.552881] [<ffff00000821f36c>] rmap_walk_anon+0xec/0x250 [ 1521.558370] [<ffff000008220f78>] rmap_walk+0x78/0xa0 [ 1521.563337] [<ffff000008221104>] page_referenced+0x164/0x180 [ 1521.569002] [<ffff0000081f1af0>] shrink_active_list+0x178/0x3b8 [ 1521.574922] [<ffff0000081f2058>] shrink_node_memcg+0x328/0x600 [ 1521.580758] [<ffff0000081f23f4>] shrink_node+0xc4/0x328 [ 1521.585986] [<ffff0000081f2718>] do_try_to_free_pages+0xc0/0x340 [ 1521.592000] [<ffff0000081f2a64>] try_to_free_pages+0xcc/0x240 [...] The trivial fix is to handle this NULL pud value early, rather than dereferencing it blindly. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Omar Sandoval	7751d94da9	Btrfs: fix delalloc accounting leak caused by u32 overflow commit `70e7af244f` upstream. btrfs_calc_trans_metadata_size() does an unsigned 32-bit multiplication, which can overflow if num_items >= 4 GB / (nodesize * BTRFS_MAX_LEVEL * 2). For a nodesize of 16kB, this overflow happens at 16k items. Usually, num_items is a small constant passed to btrfs_start_transaction(), but we also use btrfs_calc_trans_metadata_size() for metadata reservations for extent items in btrfs_delalloc_{reserve,release}_metadata(). In drop_outstanding_extents(), num_items is calculated as inode->reserved_extents - inode->outstanding_extents. The difference between these two counters is usually small, but if many delalloc extents are reserved and then the outstanding extents are merged in btrfs_merge_extent_hook(), the difference can become large enough to overflow in btrfs_calc_trans_metadata_size(). The overflow manifests itself as a leak of a multiple of 4 GB in delalloc_block_rsv and the metadata bytes_may_use counter. This in turn can cause early ENOSPC errors. Additionally, these WARN_ONs in extent-tree.c will be hit when unmounting: WARN_ON(fs_info->delalloc_block_rsv.size > 0); WARN_ON(fs_info->delalloc_block_rsv.reserved > 0); WARN_ON(space_info->bytes_pinned > 0 \|\| space_info->bytes_reserved > 0 \|\| space_info->bytes_may_use > 0); Fix it by casting nodesize to a u64 so that btrfs_calc_trans_metadata_size() does a full 64-bit multiplication. While we're here, do the same in btrfs_calc_trunc_metadata_size(); this can't overflow with any existing uses, but it's better to be safe here than have another hard-to-debug problem later on. Signed-off-by: Omar Sandoval <osandov@fb.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Jeff Mahoney	acccdbef2c	btrfs: fix race with relocation recovery and fs_root setup commit `a9b3311ef3` upstream. If we have to recover relocation during mount, we'll ultimately have to evict the orphan inode. That goes through the reservation dance, where priority_reclaim_metadata_space and flush_space expect fs_info->fs_root to be valid. That's the next thing to be set up during mount, so we crash, almost always in flush_space trying to join the transaction but priority_reclaim_metadata_space is possible as well. This call path has been problematic in the past WRT whether ->fs_root is valid yet. Commit `957780eb27` (Btrfs: introduce ticketed enospc infrastructure) added new users that are called in the direct path instead of the async path that had already been worked around. The thing is that we don't actually need the fs_root, specifically, for anything. We either use it to determine whether the root is the chunk_root for use in choosing an allocation profile or as a root to pass btrfs_join_transaction before immediately committing it. Anything that isn't the chunk root works in the former case and any root works in the latter. A simple fix is to use a root we know will always be there: the extent_root. Fixes: `957780eb27` (Btrfs: introduce ticketed enospc infrastructure) Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Jeff Mahoney	5ca9daf722	btrfs: fix memory leak in update_space_info failure path commit `896533a7da` upstream. If we fail to add the space_info kobject, we'll leak the memory for the percpu counter. Fixes: `6ab0a2029c` (btrfs: publish allocation data in sysfs) Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
David Sterba	6bc3d6a633	btrfs: use correct types for page indices in btrfs_page_exists_in_range commit `cc2b702c52` upstream. Variables start_idx and end_idx are supposed to hold a page index derived from the file offsets. The int type is not the right one though, offsets larger than 1 << 44 will get silently trimmed off the high bits. (1 << 44 is 16TiB) What can go wrong, if start is below the boundary and end gets trimmed: - if there's a page after start, we'll find it (radix_tree_gang_lookup_slot) - the final check "if (page->index <= end_idx)" will unexpectedly fail The function will return false, i.e. "there's no page in the range", although there is at least one. btrfs_page_exists_in_range is used to prevent races in: * hole punching, where we make sure there are no pages in the truncated range, otherwise we'll wait for them to finish and redo truncation, but we're going to replace the pages with holes anyway so the only problem is the intermediate state * lock_extent_direct: we want to make sure there are no pages before we lock and start DIO, to prevent stale data reads For the bug to occur in practice, there are several constraints. The file must be quite large, the affected range must cross the 16TiB boundary and the internal state of the file pages and pending operations must match. Also, we must not have started any ordered data in the range, otherwise we don't even reach the buggy function check. DIO locking tries hard in several places to avoid deadlocks with buffered IO and avoids waiting for ranges. The worst consequence seems to be a stale data read. CC: Liu Bo <bo.li.liu@oracle.com> Fixes: `fc4adbff82` ("btrfs: Drop EXTENT_UPTODATE check in hole punching and direct locking") Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Vaibhav Jain	574ab1b8fb	cxl: Avoid double free_irq() for psl,slice interrupts commit `b3aa20ba2b` upstream. During an EEH event, a call to cxl_remove() can result in a double free_irq() of the psl,slice interrupts. This can happen if perst_reloads_same_image == 1 and the call to cxl_configure_adapter() fails during the slot_reset callback. In such a case we see a kernel oops with the following back-trace: Oops: Kernel access of bad area, sig: 11 [#1] Call Trace: free_irq+0x88/0xd0 (unreliable) cxl_unmap_irq+0x20/0x40 [cxl] cxl_native_release_psl_irq+0x78/0xd8 [cxl] pci_deconfigure_afu+0xac/0x110 [cxl] cxl_remove+0x104/0x210 [cxl] pci_device_remove+0x6c/0x110 device_release_driver_internal+0x204/0x2e0 pci_stop_bus_device+0xa0/0xd0 pci_stop_and_remove_bus_device+0x28/0x40 pci_hp_remove_devices+0xb0/0x150 pci_hp_remove_devices+0x68/0x150 eeh_handle_normal_event+0x140/0x580 eeh_handle_event+0x174/0x360 eeh_event_handler+0x1e8/0x1f0 This patch fixes the double free_irq() by checking that the variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are not '0' before un-mapping, and by resetting these variables to '0' when they are un-mapped. Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Frederic Barrat	0c94348b23	cxl: Fix error path on bad ioctl commit `cec422c11c` upstream. Fix error path if we can't copy user structure on CXL_IOCTL_START_WORK ioctl. We shouldn't unlock the context status mutex as it was not locked (yet). Fixes: `0712dc7e73` ("cxl: Fix issues when unmapping contexts") Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Al Viro	4907e3bb67	excessive checks in ufs_write_failed() and ufs_evict_inode() commit `babef37dcc` upstream. As it is, a short copy in write() to an append-only file will fail to truncate the excess allocated blocks. As a matter of fact, all checks in ufs_truncate_blocks() are either redundant or wrong for that caller. As for the only other caller (ufs_evict_inode()), we only need the file type checks there. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:46 +02:00
Al Viro	6af5db5d39	ufs_getfrag_block(): we only grab ->truncate_mutex on block creation path commit `006351ac8e` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	c12c0c4ff5	ufs_extend_tail(): fix the braino in calling conventions of ufs_new_fragments() commit `940ef1a0ed` upstream. ... and it really needs splitting into "new" and "extend" cases, but that's for later Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	728154e963	ufs: set correct ->s_maxsize commit `6b0d144fa7` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	d426b9575f	ufs: restore maintaining ->i_blocks commit `eb315d2ae6` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	386e884c85	fix ufs_isblockset() commit `414cf7186d` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Al Viro	823c065a40	ufs: restore proper tail allocation commit `8785d84d00` upstream. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Tejun Heo	9be0c9d62d	cpuset: consider dying css as offline commit `41c25707d2` upstream. In most cases, a cgroup controller doesn't care about the lifetimes of cgroups. For the controller, a css becomes online when ->css_online() is called on it and offline when ->css_offline() is called. However, cpuset is special in that the user interface it exposes cares whether certain cgroups exist or not. Combined with the RCU delay between cgroup removal and css offlining, this can lead to user-visible behavior oddities where operations which should succeed after cgroup removals fail for some time period. The effects of cgroup removals are delayed when seen from userland. This patch adds css_is_dying() which tests whether offline is pending and updates is_cpuset_online() so that the function returns false also while offline is pending. This gets rid of the userland-visible delays. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Daniel Jordan <daniel.m.jordan@oracle.com> Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Ulrik De Bie	48b2c7c865	Input: elantech - add Fujitsu Lifebook E546/E557 to force crc_enabled commit `47eb0c8b4d` upstream. The Lifebook E546 and E557 touchpad were also not functioning and worked after running: echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled Add them to the list of machines that need this workaround. Signed-off-by: Ulrik De Bie <ulrik.debie-os@e2big.org> Reviewed-by: Arjan Opmeer <arjan@opmeer.net> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:45 +02:00
Waiman Long	f3c1dfa84d	cgroup: Prevent kill_css() from being called more than once commit `33c35aa481` upstream. The kill_css() function may be called more than once under the condition that the css was killed but not physically removed yet followed by the removal of the cgroup that is hosting the css. This patch prevents any harm from being done when that happens. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Sean Young	d31fff8cbc	rc-core: race condition during ir_raw_event_register() commit `963761a0b2` upstream. An rc device can call ir_raw_event_handle() after rc_allocate_device(), but before rc_register_device() has completed. This is racy because rcdev->raw is set before rcdev->raw->thread has a valid value. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Sui Chen	d9f48c4661	ahci: Acer SA5-271 SSD Not Detected Fix commit `8bfd174312` upstream. (Correction in this resend: fixed function name acer_sa5_271_workaround; fixed the always-true condition in the function; fixed description.) On the Acer Switch Alpha 12 (model number: SA5-271), the internal SSD may not get detected because the port_map and CAP.nr_ports combination causes the driver to skip the port that is actually connected to the SSD. More specifically, either all SATA ports are identified as DUMMY, or all ports get ``link down'' and never get up again. This problem occurs occasionally. When this problem occurs, CAP may hold a value of 0xC734FF00 or 0xC734FF01 and port_map may hold a value of 0x00 or 0x01. When this problem does not occur, CAP holds a value of 0xC734FF02 and port_map may hold a value of 0x07. Overriding the CAP value to 0xC734FF02 and port_map to 0x7 significantly reduces the occurrence of this problem. Link: https://bugzilla.kernel.org/attachment.cgi?id=253091 Signed-off-by: Sui Chen <suichen6@gmail.com> Tested-by: Damian Ivanov <damianatorrpm@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Rob Clark	39d584db44	drm/msm/mdp5: use __drm_atomic_helper_plane_duplicate_state() commit `786813c343` upstream. Somehow the helper was never retrofitted for mdp5. Which meant when plane_state->fence was added, it could get copied into new state in mdp5_plane_duplicate_state(). If an update to disable the plane (for example on rmfb) managed to sneak in after an nonblock update had swapped state, but before it was committed, we'd get a splat: WARNING: CPU: 1 PID: 69 at ../drivers/gpu/drm/drm_atomic_helper.c:1061 drm_atomic_helper_wait_for_fences+0xe0/0xf8 Modules linked in: CPU: 1 PID: 69 Comm: kworker/1:1 Tainted: G W 4.11.0-rc8+ #1187 Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) Workqueue: events drm_mode_rmfb_work_fn task: ffffffc036560d00 task.stack: ffffffc036550000 PC is at drm_atomic_helper_wait_for_fences+0xe0/0xf8 LR is at complete_commit.isra.1+0x44/0x1c0 pc : [<ffffff80084f6040>] lr : [<ffffff800854176c>] pstate: 20000145 sp : ffffffc036553b60 x29: ffffffc036553b60 x28: ffffffc0264e6a00 x27: ffffffc035659000 x26: 0000000000000000 x25: ffffffc0240e8000 x24: 0000000000000038 x23: 0000000000000000 x22: ffffff800858f200 x21: ffffffc0240e8000 x20: ffffffc02f56a800 x19: 0000000000000000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: ffffffc00a192700 x13: 0000000000000004 x12: 0000000000000000 x11: ffffff80089a1690 x10: 00000000000008f0 x9 : ffffffc036553b20 x8 : ffffffc036561650 x7 : ffffffc03fe6cb40 x6 : 0000000000000000 x5 : 0000000000000001 x4 : 0000000000000002 x3 : ffffffc035659000 x2 : ffffffc0240e8c80 x1 : 0000000000000000 x0 : ffffffc02adbe588 ---[ end trace 13aeec77c3fb55e2 ]--- Call trace: Exception stack(0xffffffc036553990 to 0xffffffc036553ac0) 3980: 0000000000000000 0000008000000000 39a0: ffffffc036553b60 ffffff80084f6040 0000000000004ff0 0000000000000038 39c0: ffffffc0365539d0 ffffff800857e098 ffffffc036553a00 ffffff800857e1b0 39e0: ffffffc036553a10 ffffff800857c554 ffffffc0365e8400 
ffffffc0365e8400 3a00: ffffffc036553a20 ffffff8008103358 000000000001aad7 ffffff800851b72c 3a20: ffffffc036553a50 ffffff80080e9228 ffffffc02adbe588 0000000000000000 3a40: ffffffc0240e8c80 ffffffc035659000 0000000000000002 0000000000000001 3a60: 0000000000000000 ffffffc03fe6cb40 ffffffc036561650 ffffffc036553b20 3a80: 00000000000008f0 ffffff80089a1690 0000000000000000 0000000000000004 3aa0: ffffffc00a192700 0000000000000000 0000000000000000 0000000000000000 [<ffffff80084f6040>] drm_atomic_helper_wait_for_fences+0xe0/0xf8 [<ffffff800854176c>] complete_commit.isra.1+0x44/0x1c0 [<ffffff8008541c64>] msm_atomic_commit+0x32c/0x350 [<ffffff8008516230>] drm_atomic_commit+0x50/0x60 [<ffffff8008517548>] drm_atomic_remove_fb+0x158/0x250 [<ffffff80085186d0>] drm_framebuffer_remove+0x50/0x158 [<ffffff8008518818>] drm_mode_rmfb_work_fn+0x40/0x58 [<ffffff80080d5668>] process_one_work+0x1d0/0x378 [<ffffff80080d5a54>] worker_thread+0x244/0x488 [<ffffff80080db7fc>] kthread+0xfc/0x128 [<ffffff8008082ec0>] ret_from_fork+0x10/0x50 Fixes: 9626014 ("drm/fence: add in-fences support") Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reported-by: Stanimir Varbanov <stanimir.varbanov@linaro.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Eric Anholt	a5ab52b38f	drm/msm: Expose our reservation object when exporting a dmabuf. commit `43523eba79` upstream. Without this, polling on the dma-buf (and presumably other devices synchronizing against our rendering) would return immediately, even while the BO was busy. Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Rob Clark <robdclark@gmail.com> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Nicholas Bellinger	0354d1d64f	target: Re-add check to reject control WRITEs with overflow data commit `4ff83daa02` upstream. During v4.3 when the overflow/underflow check was relaxed by commit `c72c525022`: commit `c72c525022` Author: Roland Dreier <roland@purestorage.com> Date: Wed Jul 22 15:08:18 2015 -0700 target: allow underflow/overflow for PR OUT etc. commands to allow underflow/overflow for Windows compliance + FCP, a consequence was to allow control CDBs to process overflow data for iscsi-target with immediate data as well. As per Roland's original change, continue to allow underflow cases for control CDBs to make Windows compliance + FCP happy, but until overflow for control CDBs is supported tree-wide, explicitly reject all control WRITEs with overflow following pre v4.3.y logic. Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
David Arcari	0eedb7832e	cpufreq: cpufreq_register_driver() should return -ENODEV if init fails commit `6c77003677` upstream. For a driver that does not set the CPUFREQ_STICKY flag, if all of the ->init() calls fail, cpufreq_register_driver() should return an error. This will prevent the driver from loading. Fixes: `ce1bcfe94d` (cpufreq: check cpufreq_policy_list instead of scanning policies for all CPUs) Signed-off-by: David Arcari <darcari@redhat.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Jason A. Donenfeld	86f95e53ed	random: invalidate batched entropy after crng init commit `b169c13de4` upstream. It's possible that get_random_{u32,u64} is used before the crng has initialized, in which case, its output might not be cryptographically secure. For this problem, directly, this patch set is introducing the *_wait variety of functions, but even with that, there's a subtle issue: what happens to our batched entropy that was generated before initialization. Prior to this commit, it'd stick around, supplying bad numbers. After this commit, we force the entropy to be re-extracted after each phase of the crng has initialized. In order to avoid a race condition with the position counter, we introduce a simple rwlock for this invalidation. Since it's only during this awkward transition period, after things are all set up, we stop using it, so that it doesn't have an impact on performance. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:44 +02:00
Pratyush Anand	0524867ee2	mei: make sysfs modalias format similar as uevent modalias commit `6f9193ec04` upstream. modprobe is not able to resolve sysfs modalias for mei devices. # cat /sys/class/watchdog/watchdog0/device/watchdog/watchdog0/device/modalias mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: # modprobe --set-version 4.9.6-200.fc25.x86_64 -R mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: modprobe: FATAL: Module mei::05b79a6f-4628-4d7f-899d-a91514cb32ab: not found in directory /lib/modules/4.9.6-200.fc25.x86_64 # cat /lib/modules/4.9.6-200.fc25.x86_64/modules.alias \| grep 05b79a6f-4628-4d7f-899d-a91514cb32ab alias mei::05b79a6f-4628-4d7f-899d-a91514cb32ab::* mei_wdt commit `b26864cad1` ("mei: bus: add client protocol version to the device alias"), however sysfs modalias is still in format mei:S:uuid:*. This patch equates format of uevent and sysfs modalias so that modprobe is able to resolve the aliases. Fixes: commit `b26864cad1` ("mei: bus: add client protocol version to the device alias") Signed-off-by: Pratyush Anand <panand@redhat.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Bart Van Assche	6712554818	block: Avoid that blk_exit_rl() triggers a use-after-free commit `b425e50492` upstream. Since the introduction of .init_rq_fn() and .exit_rq_fn() it is essential that the memory allocated for struct request_queue stays around until all blk_exit_rl() calls have finished. Hence make blk_init_rl() take a reference on struct request_queue. This patch fixes the following crash: general protection fault: 0000 [#2] SMP CPU: 3 PID: 28 Comm: ksoftirqd/3 Tainted: G D 4.12.0-rc2-dbg+ #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 task: ffff88013a108040 task.stack: ffffc9000071c000 RIP: 0010:free_request_size+0x1a/0x30 RSP: 0018:ffffc9000071fd38 EFLAGS: 00010202 RAX: 6b6b6b6b6b6b6b6b RBX: ffff880067362a88 RCX: 0000000000000003 RDX: ffff880067464178 RSI: ffff880067362a88 RDI: ffff880135ea4418 RBP: ffffc9000071fd40 R08: 0000000000000000 R09: 0000000100180009 R10: ffffc9000071fd38 R11: ffffffff81110800 R12: ffff88006752d3d8 R13: ffff88006752d3d8 R14: ffff88013a108040 R15: 000000000000000a FS: 0000000000000000(0000) GS:ffff88013fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa8ec1edb00 CR3: 0000000138ee8000 CR4: 00000000001406e0 Call Trace: mempool_destroy.part.10+0x21/0x40 mempool_destroy+0xe/0x10 blk_exit_rl+0x12/0x20 blkg_free+0x4d/0xa0 __blkg_release_rcu+0x59/0x170 rcu_process_callbacks+0x260/0x4e0 __do_softirq+0x116/0x250 smpboot_thread_fn+0x123/0x1e0 kthread+0x109/0x140 ret_from_fork+0x31/0x40 Fixes: commit `e9c787e65c` ("scsi: allocate scsi_cmnd structures as part of struct request") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Matt Ranostay	698aa72067	iio: proximity: as3935: fix iio_trigger_poll issue commit `9122b54f26` upstream. Using iio_trigger_poll() can oops when multiple interrupts happen before the first is handled. Use iio_trigger_poll_chained() instead and use the timestamp when processed, since it will in theory be 2 ms max latency. Fixes: `24ddb0e4bb` ("iio: Add AS3935 lightning sensor support") Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Matt Ranostay	71c0950cd7	iio: proximity: as3935: fix AS3935_INT mask commit `275292d3a3` upstream. AS3935 interrupt mask has been incorrect so valid lightning events would never trigger a buffer event. Also noise interrupt should be BIT(0). Fixes: `24ddb0e4bb` ("iio: Add AS3935 lightning sensor support") Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Marcin Niestroj	7b5d3c1a14	iio: trigger: fix NULL pointer dereference in iio_trigger_write_current() commit `4eecbe8188` upstream. In case oldtrig == trig == NULL (which happens when we set none trigger, when there is already none set) there is a NULL pointer dereference during iio_trigger_put(trig). Below is kernel output when this occurs: [ 26.741790] Unable to handle kernel NULL pointer dereference at virtual address 00000000 [ 26.750179] pgd = cacc0000 [ 26.752936] [00000000] pgd=8adc6835, pte=00000000, *ppte=00000000 [ 26.759531] Internal error: Oops: 17 [#1] SMP ARM [ 26.764261] Modules linked in: usb_f_ncm u_ether usb_f_acm u_serial usb_f_fs libcomposite configfs evbug [ 26.773844] CPU: 0 PID: 152 Comm: synchro Not tainted 4.12.0-rc1 #2 [ 26.780128] Hardware name: Freescale i.MX6 Ultralite (Device Tree) [ 26.786329] task: cb1de200 task.stack: cac92000 [ 26.790892] PC is at iio_trigger_write_current+0x188/0x1f4 [ 26.796403] LR is at lock_release+0xf8/0x20c [ 26.800696] pc : [<c0736f34>] lr : [<c016efb0>] psr: 600d0013 [ 26.800696] sp : cac93e30 ip : cac93db0 fp : cac93e5c [ 26.812193] r10: c0e64fe8 r9 : 00000000 r8 : 00000001 [ 26.817436] r7 : cb190810 r6 : 00000010 r5 : 00000001 r4 : 00000000 [ 26.823982] r3 : 00000000 r2 : 00000000 r1 : cb1de200 r0 : 00000000 [ 26.830528] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 26.837683] Control: 10c5387d Table: 8acc006a DAC: 00000051 [ 26.843448] Process synchro (pid: 152, stack limit = 0xcac92210) [ 26.849475] Stack: (0xcac93e30 to 0xcac94000) [ 26.853857] 3e20: 00000001 c0736dac c054033c cae6b680 [ 26.862060] 3e40: cae6b680 00000000 00000001 cb3f8610 cac93e74 cac93e60 c054035c c0736db8 [ 26.870264] 3e60: 00000001 c054033c cac93e94 cac93e78 c029bf34 c0540348 00000000 00000000 [ 26.878469] 3e80: cb3f8600 cae6b680 cac93ed4 cac93e98 c029b320 c029bef0 00000000 00000000 [ 26.886672] 3ea0: 00000000 cac93f78 cb2d41fc caed3280 c029b214 cac93f78 00000001 000e20f8 [ 26.894874] 3ec0: 00000001 00000000 
cac93f44 cac93ed8 c0221dcc c029b220 c0e1ca39 cb2d41fc [ 26.903079] 3ee0: cac93f04 cac93ef0 c0183ef0 c0183ab0 cb2d41fc 00000000 cac93f44 cac93f08 [ 26.911282] 3f00: c0225eec c0183ebc 00000001 00000000 c0223728 00000000 c0245454 00000001 [ 26.919485] 3f20: 00000001 caed3280 000e20f8 cac93f78 000e20f8 00000001 cac93f74 cac93f48 [ 26.927690] 3f40: c0223680 c0221da4 c0246520 c0245460 caed3283 caed3280 00000000 00000000 [ 26.935893] 3f60: 000e20f8 00000001 cac93fa4 cac93f78 c0224520 c02235e4 00000000 00000000 [ 26.944096] 3f80: 00000001 000e20f8 00000001 00000004 c0107f84 cac92000 00000000 cac93fa8 [ 26.952299] 3fa0: c0107de0 c02244e8 00000001 000e20f8 0000000e 000e20f8 00000001 fbad2484 [ 26.960502] 3fc0: 00000001 000e20f8 00000001 00000004 beb6b698 00064260 0006421c beb6b4b4 [ 26.968705] 3fe0: 00000000 beb6b450 b6f219a0 b6e2f268 800d0010 0000000e cac93ff4 cac93ffc [ 26.976896] Backtrace: [ 26.979388] [<c0736dac>] (iio_trigger_write_current) from [<c054035c>] (dev_attr_store+0x20/0x2c) [ 26.988289] r10:cb3f8610 r9:00000001 r8:00000000 r7:cae6b680 r6:cae6b680 r5:c054033c [ 26.996138] r4:c0736dac r3:00000001 [ 26.999747] [<c054033c>] (dev_attr_store) from [<c029bf34>] (sysfs_kf_write+0x50/0x54) [ 27.007686] r5:c054033c r4:00000001 [ 27.011290] [<c029bee4>] (sysfs_kf_write) from [<c029b320>] (kernfs_fop_write+0x10c/0x224) [ 27.019579] r7:cae6b680 r6:cb3f8600 r5:00000000 r4:00000000 [ 27.025271] [<c029b214>] (kernfs_fop_write) from [<c0221dcc>] (__vfs_write+0x34/0x120) [ 27.033214] r10:00000000 r9:00000001 r8:000e20f8 r7:00000001 r6:cac93f78 r5:c029b214 [ 27.041059] r4:caed3280 [ 27.043622] [<c0221d98>] (__vfs_write) from [<c0223680>] (vfs_write+0xa8/0x170) [ 27.050959] r9:00000001 r8:000e20f8 r7:cac93f78 r6:000e20f8 r5:caed3280 r4:00000001 [ 27.058731] [<c02235d8>] (vfs_write) from [<c0224520>] (SyS_write+0x44/0x98) [ 27.065806] r9:00000001 r8:000e20f8 r7:00000000 r6:00000000 r5:caed3280 r4:caed3283 [ 27.073582] [<c02244dc>] (SyS_write) from [<c0107de0>] 
(ret_fast_syscall+0x0/0x1c) [ 27.081179] r9:cac92000 r8:c0107f84 r7:00000004 r6:00000001 r5:000e20f8 r4:00000001 [ 27.088947] Code: 1a000009 e1a04009 e3a06010 e1a05008 (e5943000) [ 27.095244] ---[ end trace 06d1dab86d6e6bab ]--- To fix that problem call iio_trigger_put(trig) only when trig is not NULL. Fixes: `d5d24bcc0a` ("iio: trigger: close race condition in acquiring trigger reference") Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Franziska Naepelt	80c8ac6b9b	iio: light: ltr501 Fix interchanged als/ps register field commit `7cc3bff4ef` upstream. The register mapping for the IIO driver for the Liteon Light and Proximity sensor LTR501 interrupt mode is interchanged (ALS/PS). There is a register called INTERRUPT register (address 0x8F). Bit 0 represents PS measurement trigger. Bit 1 represents ALS measurement trigger. These two bit fields are interchanged within the driver. See datasheet page 24: http://optoelectronics.liteon.com/upload/download/DS86-2012-0006/S_110_LTR-501ALS-01_PrelimDS_ver1%5B1%5D.pdf Signed-off-by: Franziska Naepelt <franziska.naepelt@idt.com> Fixes: `7ac702b314` ("iio: ltr501: Add interrupt support") Acked-by: Peter Meerwald-Stadler <pmeerw@pmeerw.net> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Raveendra Padasalagi	1cb7bbe7b5	iio: adc: bcm_iproc_adc: swap primary and secondary isr handler's commit `f7d86ecf83` upstream. The third argument of devm_request_threaded_irq() is the primary handler. It is called in hardirq context and checks whether the interrupt is relevant to the device. If the primary handler returns IRQ_WAKE_THREAD, the secondary handler (a.k.a. handler thread) is scheduled to run in process context. bcm_iproc_adc.c uses the secondary handler as the primary one and the other way around. So this patch fixes the same, along with re-naming the secondary handler and primary handler names properly. Tested on the BCM9583XX iProc SoC based boards. Fixes: `4324c97ece` ("iio: Add driver for Broadcom iproc-static-adc") Reported-by: Pavel Roskin <plroskin@gmail.com> Signed-off-by: Raveendra Padasalagi <raveendra.padasalagi@broadcom.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Oleg Drokin	d9b0c95585	staging/lustre/lov: remove set_fs() call from lov_getstripe() commit `0a33252e06` upstream. lov_getstripe() calls set_fs(KERNEL_DS) so that it can handle a struct lov_user_md pointer from user- or kernel-space. This changes the behavior of copy_from_user() on SPARC and may result in a misaligned access exception which in turn oopses the kernel. In fact the relevant argument to lov_getstripe() is never called with a kernel-space pointer and so changing the address limits is unnecessary and so we remove the calls to save, set, and restore the address limits. Signed-off-by: John L. Hammond <john.hammond@intel.com> Reviewed-on: http://review.whamcloud.com/6150 Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3221 Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Li Wei <wei.g.li@intel.com> Signed-off-by: Oleg Drokin <green@linuxhacker.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Michael Thalmeier	dd8980bb03	usb: chipidea: debug: check before accessing ci_role commit `0340ff83cd` upstream. ci_role BUGs when the role is >= CI_ROLE_END. Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:43 +02:00
Jisheng Zhang	f2967b72b6	usb: chipidea: udc: fix NULL pointer dereference if udc_start failed commit `aa1f058d7d` upstream. Fix below NULL pointer dereference. we set ci->roles[CI_ROLE_GADGET] too early in ci_hdrc_gadget_init(), if udc_start() fails due to some reason, the ci->roles[CI_ROLE_GADGET] check in ci_hdrc_gadget_destroy can't protect us. We fix this issue by only setting ci->roles[CI_ROLE_GADGET] if udc_start() succeed. [ 1.398550] Unable to handle kernel NULL pointer dereference at virtual address 00000000 ... [ 1.448600] PC is at dma_pool_free+0xb8/0xf0 [ 1.453012] LR is at dma_pool_free+0x28/0xf0 [ 2.113369] [<ffffff80081817d8>] dma_pool_free+0xb8/0xf0 [ 2.118857] [<ffffff800841209c>] destroy_eps+0x4c/0x68 [ 2.124165] [<ffffff8008413770>] ci_hdrc_gadget_destroy+0x28/0x50 [ 2.130461] [<ffffff800840fa30>] ci_hdrc_probe+0x588/0x7e8 [ 2.136129] [<ffffff8008380fb8>] platform_drv_probe+0x50/0xb8 [ 2.142066] [<ffffff800837f494>] driver_probe_device+0x1fc/0x2a8 [ 2.148270] [<ffffff800837f68c>] __device_attach_driver+0x9c/0xf8 [ 2.154563] [<ffffff800837d570>] bus_for_each_drv+0x58/0x98 [ 2.160317] [<ffffff800837f174>] __device_attach+0xc4/0x138 [ 2.166072] [<ffffff800837f738>] device_initial_probe+0x10/0x18 [ 2.172185] [<ffffff800837e58c>] bus_probe_device+0x94/0xa0 [ 2.177940] [<ffffff800837c560>] device_add+0x3f0/0x560 [ 2.183337] [<ffffff8008380d20>] platform_device_add+0x180/0x240 [ 2.189541] [<ffffff800840f0e8>] ci_hdrc_add_device+0x440/0x4f8 [ 2.195654] [<ffffff8008414194>] ci_hdrc_usb2_probe+0x13c/0x2d8 [ 2.201769] [<ffffff8008380fb8>] platform_drv_probe+0x50/0xb8 [ 2.207705] [<ffffff800837f494>] driver_probe_device+0x1fc/0x2a8 [ 2.213910] [<ffffff800837f5ec>] __driver_attach+0xac/0xb0 [ 2.219575] [<ffffff800837d4b0>] bus_for_each_dev+0x60/0xa0 [ 2.225329] [<ffffff800837ec80>] driver_attach+0x20/0x28 [ 2.230816] [<ffffff800837e880>] bus_add_driver+0x1d0/0x238 [ 2.236571] [<ffffff800837fdb0>] driver_register+0x60/0xf8 [ 2.242237] [<ffffff8008380ef4>] 
__platform_driver_register+0x44/0x50 [ 2.248891] [<ffffff80086fd440>] ci_hdrc_usb2_driver_init+0x18/0x20 [ 2.255365] [<ffffff8008082950>] do_one_initcall+0x38/0x128 [ 2.261121] [<ffffff80086e0d00>] kernel_init_freeable+0x1ac/0x250 [ 2.267414] [<ffffff800852f0b8>] kernel_init+0x10/0x100 [ 2.272810] [<ffffff8008082680>] ret_from_fork+0x10/0x50 Fixes: `3f124d233e` ("usb: chipidea: add role init and destroy APIs") Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Andrey Smirnov	f26ac1fc9f	usb: chipidea: imx: Do not access CLKONOFF on i.MX51 commit `62b97d502b` upstream. Unlike i.MX53, i.MX51's USBOH3 register file does not implement registers past offset 0x018, which includes MX53_USB_CLKONOFF_CTRL_OFFSET and trying to access that register on said platform results in external abort. Fix it by enabling CLKONOFF accessing codepath only for i.MX53. Fixes `3be3251db0` ("usb: chipidea: imx: Disable internal 60Mhz clock with ULPI PHY") Cc: cphealy@gmail.com Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: linux-usb@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Bin Liu	b89de0406a	usb: musb: dsps: keep VBUS on for host-only mode commit `b3addcf0d1` upstream. Currently VBUS is turned off while a usb device is detached, and turned on again by the polling routine. This short period VBUS loss prevents usb modem to switch mode. VBUS should be constantly on for host-only mode, so this changes the driver to not turn off VBUS for host-only mode. Fixes: `2f3fd2c5bd` ("usb: musb: Prepare dsps glue layer for PM runtime support") Reported-by: Moreno Bartalucci <moreno.bartalucci@tecnorama.it> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Thinh Nguyen	c33b087c83	usb: gadget: f_mass_storage: Serialize wake and sleep execution commit `dc9217b69d` upstream. f_mass_storage has a memory barrier issue with the sleep and wake functions that can cause a deadlock. This results in intermittent hangs during MSC file transfer. The host will reset the device after receiving no response to resume the transfer. This issue is seen when dwc3 is processing 2 transfer-in-progress events at the same time, invoking completion handlers for CSW and CBW. Also this issue occurs depending on the system timing and latency. To increase the chance to hit this issue, you can force dwc3 driver to wait and process those 2 events at once by adding a small delay (~100us) in dwc3_check_event_buf() whenever the request is for CSW and read the event count again. Avoid debugging with printk and ftrace as extra delays and memory barrier will mask this issue. Scenario which can lead to failure: ----------------------------------- 1) The main thread sleeps and waits for the next command in get_next_command(). 2) bulk_in_complete() wakes up main thread for CSW. 3) bulk_out_complete() tries to wake up the running main thread for CBW. 4) thread_wakeup_needed is not loaded with correct value in sleep_thread(). 5) Main thread goes to sleep again. The pattern is shown below. Note the 2 critical variables. * common->thread_wakeup_needed * bh->state CPU 0 (sleep_thread) CPU 1 (wakeup_thread) ============================== =============================== bh->state = BH_STATE_FULL; smp_wmb(); thread_wakeup_needed = 0; thread_wakeup_needed = 1; smp_rmb(); if (bh->state != BH_STATE_FULL) sleep again ... As pointed out by Alan Stern, this is an R-pattern issue. The issue can be seen when there are two wakeups in quick succession. The thread_wakeup_needed can be overwritten in sleep_thread, and the read of the bh->state may be reordered before the write to thread_wakeup_needed. 
This patch applies full memory barrier smp_mb() in both sleep_thread() and wakeup_thread() to ensure the order which the thread_wakeup_needed and bh->state are written and loaded. However, a better solution in the future would be to use wait_queue method that takes care of managing memory barrier between waker and waiter. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Hans de Goede	3d7ba52b5f	drm: Fix oops + Xserver hang when unplugging USB drm devices commit `75fb636324` upstream. commit `a39be606f9` ("drm: Do a full device unregister when unplugging") causes backtraces like this one when unplugging an usb drm device while it is in use: usb 2-3: USB disconnect, device number 25 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:424 drm_mode_config_cleanup+0x220/0x280 [drm] ... RIP: 0010:drm_mode_config_cleanup+0x220/0x280 [drm] ... Call Trace: gm12u320_modeset_cleanup+0xe/0x10 [gm12u320] gm12u320_driver_unload+0x35/0x70 [gm12u320] drm_dev_unregister+0x3c/0xe0 [drm] drm_unplug_dev+0x12/0x60 [drm] gm12u320_usb_disconnect+0x36/0x40 [gm12u320] usb_unbind_interface+0x72/0x280 device_release_driver_internal+0x158/0x210 device_release_driver+0x12/0x20 bus_remove_device+0x104/0x180 device_del+0x1d2/0x350 usb_disable_device+0x9f/0x270 usb_disconnect+0xc6/0x260 ... [drm:drm_mode_config_cleanup [drm]] ERROR connector Unknown-1 leaked! ------------[ cut here ]------------ WARNING: CPU: 0 PID: 242 at drivers/gpu/drm/drm_mode_config.c:458 drm_mode_config_cleanup+0x268/0x280 [drm] ... <same Call Trace> ---[ end trace 80df975dae439ed6 ]--- general protection fault: 0000 [#1] SMP ... Call Trace: ? __switch_to+0x225/0x450 drm_mode_rmfb_work_fn+0x55/0x70 [drm] process_one_work+0x193/0x3c0 worker_thread+0x4a/0x3a0 ... RIP: drm_framebuffer_remove+0x62/0x3f0 [drm] RSP: ffffb776c39dfd98 ---[ end trace 80df975dae439ed7 ]--- After which the system is unusable this is caused by drm_dev_unregister getting called immediately on unplug, which calls the drivers unload function which calls drm_mode_config_cleanup which removes the framebuffer object while userspace is still holding a reference to it. 
Reverting commit `a39be606f9` ("drm: Do a full device unregister when unplugging") leads to the following oops on unplug instead, when userspace closes the last fd referencing the drm_dev: sysfs group 'power' not found for kobject 'card1-Unknown-1' ------------[ cut here ]------------ WARNING: CPU: 0 PID: 2459 at fs/sysfs/group.c:237 sysfs_remove_group+0x80/0x90 ... RIP: 0010:sysfs_remove_group+0x80/0x90 ... Call Trace: dpm_sysfs_remove+0x57/0x60 device_del+0xfd/0x350 device_unregister+0x1a/0x60 drm_sysfs_connector_remove+0x39/0x50 [drm] drm_connector_unregister+0x5a/0x70 [drm] drm_connector_unregister_all+0x45/0xa0 [drm] drm_modeset_unregister_all+0x12/0x30 [drm] drm_dev_unregister+0xca/0xe0 [drm] drm_put_dev+0x32/0x60 [drm] drm_release+0x2f3/0x380 [drm] __fput+0xdf/0x1e0 ... ---[ end trace ecfb91ac85688bbe ]--- BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 IP: down_write+0x1f/0x40 ... Call Trace: debugfs_remove_recursive+0x55/0x1b0 drm_debugfs_connector_remove+0x21/0x40 [drm] drm_connector_unregister+0x62/0x70 [drm] drm_connector_unregister_all+0x45/0xa0 [drm] drm_modeset_unregister_all+0x12/0x30 [drm] drm_dev_unregister+0xca/0xe0 [drm] drm_put_dev+0x32/0x60 [drm] drm_release+0x2f3/0x380 [drm] __fput+0xdf/0x1e0 ... ---[ end trace ecfb91ac85688bbf ]--- This is caused by the revert moving back to drm_unplug_dev calling drm_minor_unregister which does: device_del(minor->kdev); dev_set_drvdata(minor->kdev, NULL); /* safety belt */ drm_debugfs_cleanup(minor); Causing the sysfs entries to already be removed even though we still have references to them in e.g. drm_connector. Note we must call drm_minor_unregister to notify userspace of the unplug of the device, so calling drm_dev_unregister is not completely wrong the problem is that drm_dev_unregister does too much. 
This commit fixes drm_unplug_dev by not only reverting commit `a39be606f9` ("drm: Do a full device unregister when unplugging") but by also adding a call to drm_modeset_unregister_all before the drm_minor_unregister calls to make sure all sysfs entries are removed before calling device_del(minor->kdev) thereby also fixing the second set of oopses caused by just reverting the commit. Fixes: `a39be606f9` ("drm: Do a full device unregister when unplugging") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jeffy <jeffy.chen@rock-chips.com> Cc: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com> Reported-by: Marco Diego Aurélio Mesquita <marcodiegomesquita@gmail.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Sean Paul <seanpaul@chromium.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170601115430.4113-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Jan Kara	de8f4aeaa1	ext4: fix fdatasync(2) after extent manipulation operations commit `67a7d5f561` upstream. Currently, extent manipulation operations such as hole punch, range zeroing, or extent shifting do not record the fact that file data has changed and thus fdatasync(2) has work to do. As a result, if we crash e.g. after a punch hole and fdatasync, the user can still possibly see the punched out data after journal replay. Test generic/392 fails due to these problems. Fix the problem by properly marking that file data has changed in these operations. Fixes: `a4bb6b64e3` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Jan Kara	875d084e97	ext4: fix data corruption with EXT4_GET_BLOCKS_ZERO commit `4f8caa60a5` upstream. When ext4_map_blocks() is called with EXT4_GET_BLOCKS_ZERO to zero-out allocated blocks and these blocks are actually converted from unwritten extent the following race can happen: CPU0 CPU1 page fault page fault ... ... ext4_map_blocks() ext4_ext_map_blocks() ext4_ext_handle_unwritten_extents() ext4_ext_convert_to_initialized() - zero out converted extent ext4_zeroout_es() - inserts extent as initialized in status tree ext4_map_blocks() ext4_es_lookup_extent() - finds initialized extent write data ext4_issue_zeroout() - zeroes out new extent overwriting data This problem can be reproduced by generic/340 for the fallocated case for the last block in the file. Fix the problem by avoiding zeroing out the area we are mapping with ext4_map_blocks() in ext4_ext_convert_to_initialized(). It is pointless to zero out this area in the first place as the caller asked us to convert the area to initialized because he is just going to write data there before the transaction finishes. To achieve this we delete the special case of zeroing out full extent as that will be handled by the cases below zeroing only the part of the extent that needs it. We also instruct ext4_split_extent() that the middle of extent being split contains data so that ext4_split_extent_at() cannot zero out full extent in case of ENOSPC. Fixes: `12735f8819` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Konstantin Khlebnikov	22fb074c67	ext4: keep existing extra fields when inode expands commit `887a973061` upstream. ext4_expand_extra_isize() should clear only space between old and new size. Fixes: `6dd4ee7cab` # v2.6.23 Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:42 +02:00
Jan Kara	699dc1080d	ext4: fix SEEK_HOLE commit `7d95eddf31` upstream. Currently, SEEK_HOLE implementation in ext4 may both return that there's a hole at some offset although that offset already has data and skip some holes during a search for the next hole. The first problem is demonstrated by: xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "seek -h 0" file wrote 57344/57344 bytes at offset 0 56 KiB, 14 ops; 0.0000 sec (2.054 GiB/sec and 538461.5385 ops/sec) Whence Result HOLE 0 Where we can see that SEEK_HOLE wrongly returned offset 0 as containing a hole although we have written data there. The second problem can be demonstrated by: xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k" -c "seek -h 0" file wrote 57344/57344 bytes at offset 0 56 KiB, 14 ops; 0.0000 sec (1.978 GiB/sec and 518518.5185 ops/sec) wrote 8192/8192 bytes at offset 131072 8 KiB, 2 ops; 0.0000 sec (2 GiB/sec and 500000.0000 ops/sec) Whence Result HOLE 139264 Where we can see that hole at offsets 56k..128k has been ignored by the SEEK_HOLE call. The underlying problem is in the ext4_find_unwritten_pgoff() which is just buggy. In some cases it fails to update returned offset when it finds a hole (when no pages are found or when the first found page has higher index than expected), in some cases conditions for detecting hole are just missing (we fail to detect a situation where indices of returned pages are not contiguous). Fix ext4_find_unwritten_pgoff() to properly detect non-contiguous page indices and also handle all cases where we got fewer pages than expected in one place and handle it properly there. Fixes: `c8c0df241c` CC: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Julien Grall	628ee50458	xen/privcmd: Support correctly 64KB page granularity when mapping memory commit `753c09b565` upstream. Commit `5995a68` "xen/privcmd: Add support for Linux 64KB page granularity" did not go far enough to support 64KB in mmap_batch_fn. The variable 'nr' is the number of 4KB chunks to map. However, when Linux is using 64KB page granularity the array of pages (vma->vm_private_data) contains one page per 64KB. Fix it by incrementing st->index correctly. Furthermore, st->va is not correctly incremented as PAGE_SIZE != XEN_PAGE_SIZE. Fixes: `5995a68` ("xen/privcmd: Add support for Linux 64KB page granularity") Reported-by: Feng Kan <fkan@apm.com> Signed-off-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Marc Gonzalez	b9d013c005	mtd: nand: tango: Update ecc_stats.corrected commit `60cf0ce14b` upstream. According to Boris, some user-space tools expect MTD drivers to update ecc_stats.corrected, and it's better to provide a lower bound than to provide no information at all. Fixes: `6956e2385a` ("mtd: nand: add tango NAND flash controller support") Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Andres Galacho	6d916021a1	mtd: nand: tango: Export OF device ID table as module aliases commit `2761b4f12b` upstream. The device table is required to load modules based on modaliases. After adding MODULE_DEVICE_TABLE, below entries for example will be added to module.alias: alias: of:NTCsigma,smp8758-nandC* alias: of:NTCsigma,smp8758-nand Fixes: `6956e2385a` ("mtd: nand: add tango NAND flash controller support") Signed-off-by: Andres Galacho <andresgalacho@gmail.com> Acked-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Jan Kara	f0aa7a0415	reiserfs: Make flush bios explicitely sync commit `d8747d642e` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_{FUA|PREFLUSH|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions. Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: `b685d3d65a` CC: reiserfs-devel@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Hou Tao	f2fee0c4cd	cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode commit `5be6b75610` upstream. When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY as the delay of cfq_group's vdisktime if there have been other cfq_groups already. When cfq is under iops mode, commit `9a7f38c42c` ("cfq-iosched: Convert from jiffies to nanoseconds") could result in a large iops delay and lead to an abnormal io schedule delay for the added cfq_group. To fix it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5 when iops mode is enabled. Despite having the same value, the delay of a cfq_queue in idle class and the delay of cfq_group are different things, so I define two new macros for the delay of a cfq_group under time-slice mode and iops mode. Fixes: `9a7f38c42c` ("cfq-iosched: Convert from jiffies to nanoseconds") Signed-off-by: Hou Tao <houtao1@huawei.com> Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Thomas Petazzoni	067b131e20	dmaengine: mv_xor_v2: set DMA mask to 40 bits commit `b2d3c270f9` upstream. The XORv2 engine on Armada 7K/8K can only access the first 40 bits of the physical address space, so the DMA mask must be set accordingly. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Thomas Petazzoni	6d41a85b29	dmaengine: mv_xor_v2: remove interrupt coalescing commit `9dd4f319ba` upstream. The current implementation of interrupt coalescing doesn't work, because it doesn't configure the coalescing timer, which is needed to make sure we get an interrupt at some point. As a fix for stable, we simply remove the interrupt coalescing functionality. It will be re-introduced properly in a future commit. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:41 +02:00
Thomas Petazzoni	6f4c739de9	dmaengine: mv_xor_v2: fix tx_submit() implementation commit `44d5887a8b` upstream. The mv_xor_v2_tx_submit() gets the next available HW descriptor by calling mv_xor_v2_get_desq_write_ptr(), which reads a HW register telling the next available HW descriptor. This was working fine when HW descriptors were issued for processing directly in tx_submit(). However, as part of the review process of the driver, a change was requested to move the actual kick-off of HW descriptors processing to ->issue_pending(). Due to this, reading the HW register to know the next available HW descriptor no longer works. So instead of using this HW register, we implemented a software index pointing to the next available HW descriptor. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Hanna Hawa	a4634dfb7c	dmaengine: mv_xor_v2: enable XOR engine after its configuration commit `ab2c5f0a77` upstream. The engine was enabled prior to its configuration, which isn't correct. This patch relocates the activation of the XOR engine, to be after the configuration of the XOR engine. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Hanna Hawa <hannah@marvell.com> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Thomas Petazzoni	cba2cd6df4	dmaengine: mv_xor_v2: do not use descriptors not acked by async_tx commit `bc473da1ed` upstream. Descriptors that have not been acknowledged by the async_tx layer should not be re-used, so this commit adjusts the implementation of mv_xor_v2_prep_sw_desc() to skip descriptors for which async_tx_test_ack() is false. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Thomas Petazzoni	2384df2bfe	dmaengine: mv_xor_v2: properly handle wrapping in the array of HW descriptors commit `2aab4e1815` upstream. mv_xor_v2_tasklet() is looping over completed HW descriptors. Before the loop, it initializes 'next_pending_hw_desc' to the first HW descriptor to handle, and then the loop simply increments this point, without taking care of wrapping when we reach the last HW descriptor. The 'pending_ptr' index was being wrapped back to 0 at the end, but it wasn't used in each iteration of the loop to calculate next_pending_hw_desc. This commit fixes that, and makes next_pending_hw_desc a variable local to the loop itself. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
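The wrapping fix described in the commit above can be illustrated with a minimal user-space sketch. It is not the driver's code: `RING_SIZE`, `ring_consume()` and the `visited` array are hypothetical names chosen for illustration. The only point shown is that the slot index is recomputed from the wrapped counter on every iteration instead of being incremented as a raw pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring size; the real queue depth is not stated in the log. */
#define RING_SIZE 8

/*
 * Visit `num_done` completed slots starting at `pending`. The slot index is
 * local to the loop and derived with a modulo on each iteration, so it wraps
 * to 0 after the last slot. Visited indices are stored in `visited`, and the
 * new (wrapped) pending index is returned.
 */
static size_t ring_consume(size_t pending, size_t num_done, size_t *visited)
{
    assert(pending < RING_SIZE);
    for (size_t i = 0; i < num_done; i++) {
        size_t slot = (pending + i) % RING_SIZE;
        visited[i] = slot;
    }
    return (pending + num_done) % RING_SIZE;
}
```

Starting at slot 6 with four completed descriptors, this sketch visits slots 6, 7, 0 and 1 and leaves the pending index at 2.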
Thomas Petazzoni	fdcadb5f70	dmaengine: mv_xor_v2: handle mv_xor_v2_prep_sw_desc() error properly commit `eb8df543e4` upstream. The mv_xor_v2_prep_sw_desc() is called from a few different places in the driver, but we never take into account the fact that it might return NULL. This commit fixes that, ensuring that we don't panic if there are no more descriptors available. Fixes: `19a340b1a8` ("dmaengine: mv_xor_v2: new driver") Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Alexander Sverdlin	5c039176bc	dmaengine: ep93xx: Don't drain the transfers in terminate_all() commit `98f9de366f` upstream. Draining the transfers in terminate_all callback happens with IRQs disabled, therefore induces huge latency: irqsoff latency trace v1.1.5 on 4.11.0 -------------------------------------------------------------------- latency: 39770 us, #57/57, CPU#0 \| (M:preempt VP:0, KP:0, SP:0 HP:0) ----------------- \| task: process-129 (uid:0 nice:0 policy:2 rt_prio:50) ----------------- => started at: _snd_pcm_stream_lock_irqsave => ended at: snd_pcm_stream_unlock_irqrestore _------=> CPU# / _-----=> irqs-off \| / _----=> need-resched \|\| / _---=> hardirq/softirq \|\|\| / _--=> preempt-depth \|\|\|\| / delay cmd pid \|\|\|\|\| time \| caller \ / \|\|\|\|\| \ \| / process-129 0d.s. 3us : _snd_pcm_stream_lock_irqsave process-129 0d.s1 9us : snd_pcm_stream_lock <-_snd_pcm_stream_lock_irqsave process-129 0d.s1 15us : preempt_count_add <-snd_pcm_stream_lock process-129 0d.s2 22us : preempt_count_add <-snd_pcm_stream_lock process-129 0d.s3 32us : snd_pcm_update_hw_ptr0 <-snd_pcm_period_elapsed process-129 0d.s3 41us : soc_pcm_pointer <-snd_pcm_update_hw_ptr0 process-129 0d.s3 50us : dmaengine_pcm_pointer <-soc_pcm_pointer process-129 0d.s3 58us+: snd_dmaengine_pcm_pointer_no_residue <-dmaengine_pcm_pointer process-129 0d.s3 96us : update_audio_tstamp <-snd_pcm_update_hw_ptr0 process-129 0d.s3 103us : snd_pcm_update_state <-snd_pcm_update_hw_ptr0 process-129 0d.s3 112us : xrun <-snd_pcm_update_state process-129 0d.s3 119us : snd_pcm_stop <-xrun process-129 0d.s3 126us : snd_pcm_action <-snd_pcm_stop process-129 0d.s3 134us : snd_pcm_action_single <-snd_pcm_action process-129 0d.s3 141us : snd_pcm_pre_stop <-snd_pcm_action_single process-129 0d.s3 150us : snd_pcm_do_stop <-snd_pcm_action_single process-129 0d.s3 157us : soc_pcm_trigger <-snd_pcm_do_stop process-129 0d.s3 166us : snd_dmaengine_pcm_trigger <-soc_pcm_trigger process-129 0d.s3 175us : 
ep93xx_dma_terminate_all <-snd_dmaengine_pcm_trigger process-129 0d.s3 182us : preempt_count_add <-ep93xx_dma_terminate_all process-129 0d.s4 189us*: m2p_hw_shutdown <-ep93xx_dma_terminate_all process-129 0d.s4 39472us : m2p_hw_setup <-ep93xx_dma_terminate_all ... rest skipped... process-129 0d.s. 40080us : <stack trace> => ep93xx_dma_tasklet => tasklet_action => __do_softirq => irq_exit => __handle_domain_irq => vic_handle_irq => __irq_usr => 0xb66c6668 Just abort the transfers and warn if the HW state is not what we expect. Move draining into device_synchronize callback. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Alexander Sverdlin	81a38a595d	dmaengine: ep93xx: Always start from BASE0 commit `0037ae4781` upstream. The current buffer is being reset to zero on device_free_chan_resources() but not on device_terminate_all(). It could happen that HW is restarted and expects BASE0 to be used, but the driver is not synchronized and will start from BASE1. One solution is to reset the buffer explicitly in m2p_hw_setup(). Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Hiroyuki Yokoyama	9812dc2abf	dmaengine: usb-dmac: Fix DMAOR AE bit definition commit `9a445bbb16` upstream. This patch fixes the register definition of AE (Address Error flag) bit. Fixes: `0c1c8ff32f` ("dmaengine: usb-dmac: Add Renesas USB DMA Controller (USB-DMAC) driver") Signed-off-by: Hiroyuki Yokoyama <hiroyuki.yokoyama.vx@renesas.com> [Shimoda: add Fixes and Cc tags in the commit log] Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:40 +02:00
Wanpeng Li	3e456059a4	KVM: async_pf: avoid async pf injection when in guest mode commit `9bc1f09f6f` upstream. INFO: task gnome-terminal-:1734 blocked for more than 120 seconds. Not tainted 4.12.0-rc4+ #8 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. gnome-terminal- D 0 1734 1015 0x00000000 Call Trace: __schedule+0x3cd/0xb30 schedule+0x40/0x90 kvm_async_pf_task_wait+0x1cc/0x270 ? __vfs_read+0x37/0x150 ? prepare_to_swait+0x22/0x70 do_async_page_fault+0x77/0xb0 ? do_async_page_fault+0x77/0xb0 async_page_fault+0x28/0x30 This is triggered by running both win7 and win2016 on L1 KVM simultaneously and then stressing memory on L1; the hang can be observed on L1 when at least ~70% of the swap area is occupied on L0. It occurs because an async pf that should have been injected into L1 was injected into L2: the L2 guest starts receiving page faults with a bogus %cr2 (actually the apf token from the host), and the L1 guest starts accumulating tasks stuck in D state in kvm_async_pf_task_wait() because the PAGE_READY async_pfs are missing. This patch fixes the hang by doing async pf only when executing the L1 guest. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Marc Zyngier	37b1521501	arm: KVM: Allow unaligned accesses at HYP commit `33b5c38852` upstream. We currently have the HSCTLR.A bit set, trapping unaligned accesses at HYP, but we're not really prepared to deal with it. Since the rest of the kernel is pretty happy about that, let's follow its example and set HSCTLR.A to zero. Modern CPUs don't really care. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Marc Zyngier	d06ca326dd	arm64: KVM: Allow unaligned accesses at EL2 commit `78fd6dcf11` upstream. We currently have the SCTLR_EL2.A bit set, trapping unaligned accesses at EL2, but we're not really prepared to deal with it. So far, this has been unnoticed, until GCC 7 started emitting those (in particular 64bit writes on a 32bit boundary). Since the rest of the kernel is pretty happy about that, let's follow its example and set SCTLR_EL2.A to zero. Modern CPUs don't really care. Reported-by: Alexander Graf <agraf@suse.de> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Marc Zyngier	ff384c499b	arm64: KVM: Preserve RES1 bits in SCTLR_EL2 commit `d68c1f7fd1` upstream. __do_hyp_init has the rather bad habit of ignoring RES1 bits and writing them back as zero. On a v8.0-8.2 CPU, this doesn't do anything bad, but may end-up being pretty nasty on future revisions of the architecture. Let's preserve those bits so that we don't have to fix this later on. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Wanpeng Li	0d2c539ead	KVM: cpuid: Fix read/write out-of-bounds vulnerability in cpuid emulation commit `a3641631d1` upstream. If "i" is the last element in the vcpu->arch.cpuid_entries[] array, the index can run past the end of the array, resulting in an out-of-bounds read and write. Luckily, the effect is small: /* when no next entry is found, the current entry[i] is reselected */ for (j = i + 1; ; j = (j + 1) % nent) { struct kvm_cpuid_entry2 *ej = &vcpu->arch.cpuid_entries[j]; if (ej->function == e->function) { It reads ej->maxphyaddr, which is user controlled. However... ej->flags \|= KVM_CPUID_FLAG_STATE_READ_NEXT; After cpuid_entries there is int maxphyaddr; struct x86_emulate_ctxt emulate_ctxt; /* 16-byte aligned */ So we have: - cpuid_entries at offset 1B50 (6992) - maxphyaddr at offset 27D0 (6992 + 3200 = 10192) - padding at 27D4...27DF - emulate_ctxt at 27E0 And it writes in the padding. Pfew, writing the ops field of emulate_ctxt would have been much worse. This patch fixes it by modding the index to avoid the out-of-bounds access. Worst case, i == j and ej->function == e->function, the loop can bail out. Reported-by: Moguofang <moguofang@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Guofang Mo <moguofang@huawei.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
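The "modding the index" fix in the commit above can be sketched in plain C. This is an illustration under stated assumptions, not the KVM code: `struct entry` is a simplified stand-in for `struct kvm_cpuid_entry2`, and `next_same_function()` is a hypothetical helper. It shows why starting at `(i + 1) % nent` keeps every access inside the array and why the loop always terminates, in the worst case by arriving back at `i`.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct kvm_cpuid_entry2; only one field is kept. */
struct entry {
    unsigned int function;
};

/*
 * Starting just after entry i, return the index of the next entry with the
 * same function number. The start index and every step are taken modulo
 * nent, so j never indexes past the array. If no other entry matches, the
 * search wraps around to i itself, which trivially matches.
 */
static size_t next_same_function(const struct entry *e, size_t nent, size_t i)
{
    assert(nent > 0 && i < nent);
    for (size_t j = (i + 1) % nent; ; j = (j + 1) % nent) {
        if (e[j].function == e[i].function)
            return j;
    }
}
```

With entries whose function numbers are 4, 2, 4, a search from the last index wraps to index 0 rather than reading index 3, and a search from index 1 finds no other match and returns 1.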
Paolo Bonzini	406aa22b29	kvm: async_pf: fix rcu_irq_enter() with irqs enabled commit `bbaf0e2b1c` upstream. native_safe_halt enables interrupts, and you just shouldn't call rcu_irq_enter() with interrupts enabled. Reorder the call with the following local_irq_disable() to respect the invariant. Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Dave Young	f9ecf4222a	efi/bgrt: Skip efi_bgrt_init() in case of non-EFI boot commit `7425826f4f` upstream. Sabrina Dubroca reported an early panic: BUG: unable to handle kernel paging request at ffffffffff240001 IP: efi_bgrt_init+0xdc/0x134 [...] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ... which was introduced by: `7b0a911478` ("efi/x86: Move the EFI BGRT init code to early init code") The cause is that on this machine the firmware provides the EFI ACPI BGRT table even on legacy non-EFI bootups - which table should be EFI only. The garbage BGRT data causes the efi_bgrt_init() panic. Add a check to skip efi_bgrt_init() in case non-EFI bootup to work around this firmware bug. Tested-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Dave Young <dyoung@redhat.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Fixes: `7b0a911478` ("efi/x86: Move the EFI BGRT init code to early init code") Link: http://lkml.kernel.org/r/20170526113652.21339-6-matt@codeblueprint.co.uk [ Rewrote the changelog to be more readable. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Juergen Gross	c76a4a0771	efi: Don't issue error message when booted under Xen commit `1ea34adb87` upstream. When booted as Xen dom0 there won't be an EFI memmap allocated. Avoid issuing an error message in this case: [ 0.144079] efi: Failed to allocate new EFI memmap Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-efi@vger.kernel.org Link: http://lkml.kernel.org/r/20170526113652.21339-2-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:39 +02:00
Jan Kara	b8745dbb65	gfs2: Make flush bios explicitely sync commit `0f0b9b63e1` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed the REQ_SYNC flag from the WRITE_{FUA\|PREFLUSH\|...} definitions. generic_make_request_checks(), however, strips the REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report a volatile write cache, so the write effectively becomes asynchronous, which can lead to performance regressions. Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: `b685d3d65a` CC: Steven Whitehouse <swhiteho@redhat.com> CC: cluster-devel@redhat.com Acked-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
J. Bruce Fields	836fb216da	nfsd4: fix null dereference on replay commit `9a307403d3` upstream. If we receive a compound such that: - the sessionid, slot, and sequence number in the SEQUENCE op match a cached successful reply with N ops, and - the Nth operation of the compound is a PUTFH, PUTPUBFH, PUTROOTFH, or RESTOREFH, then nfsd4_sequence will return 0 and set cstate->status to nfserr_replay_cache. The current filehandle will not be set. This will cause us to call check_nfsd_access with first argument NULL. To nfsd4_compound it looks like we just successfully executed an operation that set a filehandle, but the current filehandle is not set. Fix this by moving the nfserr_replay_cache earlier. There was never any reason to have it after the encode_op label, since the only case where we hit that is when opdesc->op_func sets it. Note that there are two ways we could hit this case: - a client is resending a previously sent compound that ended with one of the four PUTFH-like operations, or - a client is sending a new compound that (incorrectly) shares sessionid, slot, and sequence number with a previously sent compound, and the length of the previously sent compound happens to match the position of a PUTFH-like operation in the new compound. The second is obviously incorrect client behavior. The first is also very strange--the only purpose of a PUTFH-like operation is to set the current filehandle to be used by the following operation, so there's no point in having it as the last in a compound. So it's likely this requires a buggy or malicious client to reproduce. Reported-by: Scott Mayhew <smayhew@redhat.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Alex Deucher	8015cc81a6	drm/amdgpu/ci: disable mclk switching for high refresh rates (v2) commit `0a646f331d` upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. v2: fix logic inversion (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Vegard Nossum	0f7ff02f7d	kthread: Fix use-after-free if kthread fork fails commit `4d6501dce0` upstream. If a kthread forks (e.g. usermodehelper since commit `1da5c46fa9`) but fails in copy_process() between calling dup_task_struct() and setting p->set_child_tid, then the value of p->set_child_tid will be inherited from the parent and get prematurely freed by free_kthread_struct(). kthread() - worker_thread() - process_one_work() \| - call_usermodehelper_exec_work() \| - kernel_thread() \| - _do_fork() \| - copy_process() \| - dup_task_struct() \| - arch_dup_task_struct() \| - tsk->set_child_tid = current->set_child_tid // implied \| - ... \| - goto bad_fork_* \| - ... \| - free_task(tsk) \| - free_kthread_struct(tsk) \| - kfree(tsk->set_child_tid) - ... - schedule() - __schedule() - wq_worker_sleeping() - kthread_data(task)->flags // UAF The problem started showing up with commit `1da5c46fa9` since it reused ->set_child_tid for the kthread worker data. A better long-term solution might be to get rid of the ->set_child_tid abuse. The comment in set_kthread_struct() also looks slightly wrong. Debugged-by: Jamie Iles <jamie.iles@oracle.com> Fixes: `1da5c46fa9` ("kthread: Make struct kthread kmalloc'ed") Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jamie Iles <jamie.iles@oracle.com> Link: http://lkml.kernel.org/r/20170509073959.17858-1-vegard.nossum@oracle.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Amir Goldstein	a5505a656f	ovl: fix creds leak in copy up error path commit `8137ae26d2` upstream. Fixes: `42f269b925` ("ovl: rearrange code in ovl_copy_up_locked()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Gilad Ben-Yossef	f59fdb278e	crypto: gcm - wait for crypto op not signal safe commit `f3ad587070` upstream. crypto_gcm_setkey() was using wait_for_completion_interruptible() to wait for completion of async crypto op but if a signal occurs it may return before DMA ops of HW crypto provider finish, thus corrupting the data buffer that is kfree'ed in this case. Resolve this by using wait_for_completion() instead. Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Gilad Ben-Yossef	1286652e80	crypto: drbg - wait for crypto op not signal safe commit `a5dfefb1c3` upstream. drbg_kcapi_sym_ctr() was using wait_for_completion_interruptible() to wait for completion of async crypto op but if a signal occurs it may return before DMA ops of HW crypto provider finish, thus corrupting the output buffer. Resolve this by using wait_for_completion() instead. Reported-by: Eric Biggers <ebiggers3@gmail.com> Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Eric Biggers	7e4c3a6da3	KEYS: encrypted: avoid encrypting/decrypting stack buffers commit `e9ff56ac35` upstream. Since v4.9, the crypto API cannot (normally) be used to encrypt/decrypt stack buffers because the stack may be virtually mapped. Fix this for the padding buffers in encrypted-keys by using ZERO_PAGE for the encryption padding and by allocating a temporary heap buffer for the decryption padding. Tested with CONFIG_DEBUG_SG=y: keyctl new_session keyctl add user master "abcdefghijklmnop" @s keyid=$(keyctl add encrypted desc "new user:master 25" @s) datablob="$(keyctl pipe $keyid)" keyctl unlink $keyid keyid=$(keyctl add encrypted desc "load $datablob" @s) datablob2="$(keyctl pipe $keyid)" [ "$datablob" = "$datablob2" ] && echo "Success!" Cc: Andy Lutomirski <luto@kernel.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:38 +02:00
Eric Biggers	8a28f221a0	KEYS: fix freeing uninitialized memory in key_update() commit `63a0b0509e` upstream. key_update() freed the key_preparsed_payload even if it was not initialized first. This would cause a crash if userspace called keyctl_update() on a key with type like "asymmetric" that has a ->preparse() method but not an ->update() method. Possibly it could even be triggered for other key types by racing with keyctl_setperm() to make the KEY_NEED_WRITE check fail (the permission was already checked, so normally it wouldn't fail there). Reproducer with key type "asymmetric", given a valid cert.der: keyctl new_session keyid=$(keyctl padd asymmetric desc @s < cert.der) keyctl setperm $keyid 0x3f000000 keyctl update $keyid data [ 150.686666] BUG: unable to handle kernel NULL pointer dereference at 0000000000000001 [ 150.687601] IP: asymmetric_key_free_kids+0x12/0x30 [ 150.688139] PGD 38a3d067 [ 150.688141] PUD 3b3de067 [ 150.688447] PMD 0 [ 150.688745] [ 150.689160] Oops: 0000 [#1] SMP [ 150.689455] Modules linked in: [ 150.689769] CPU: 1 PID: 2478 Comm: keyctl Not tainted 4.11.0-rc4-xfstests-00187-ga9f6b6b8cd2f #742 [ 150.690916] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 [ 150.692199] task: ffff88003b30c480 task.stack: ffffc90000350000 [ 150.692952] RIP: 0010:asymmetric_key_free_kids+0x12/0x30 [ 150.693556] RSP: 0018:ffffc90000353e58 EFLAGS: 00010202 [ 150.694142] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000004 [ 150.694845] RDX: ffffffff81ee3920 RSI: ffff88003d4b0700 RDI: 0000000000000001 [ 150.697569] RBP: ffffc90000353e60 R08: ffff88003d5d2140 R09: 0000000000000000 [ 150.702483] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 [ 150.707393] R13: 0000000000000004 R14: ffff880038a4d2d8 R15: 000000000040411f [ 150.709720] FS: 00007fcbcee35700(0000) GS:ffff88003fd00000(0000) knlGS:0000000000000000 [ 150.711504] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 
150.712733] CR2: 0000000000000001 CR3: 0000000039eab000 CR4: 00000000003406e0 [ 150.714487] Call Trace: [ 150.714975] asymmetric_key_free_preparse+0x2f/0x40 [ 150.715907] key_update+0xf7/0x140 [ 150.716560] ? key_default_cmp+0x20/0x20 [ 150.717319] keyctl_update_key+0xb0/0xe0 [ 150.718066] SyS_keyctl+0x109/0x130 [ 150.718663] entry_SYSCALL_64_fastpath+0x1f/0xc2 [ 150.719440] RIP: 0033:0x7fcbce75ff19 [ 150.719926] RSP: 002b:00007ffd5d167088 EFLAGS: 00000206 ORIG_RAX: 00000000000000fa [ 150.720918] RAX: ffffffffffffffda RBX: 0000000000404d80 RCX: 00007fcbce75ff19 [ 150.721874] RDX: 00007ffd5d16785e RSI: 000000002866cd36 RDI: 0000000000000002 [ 150.722827] RBP: 0000000000000006 R08: 000000002866cd36 R09: 00007ffd5d16785e [ 150.723781] R10: 0000000000000004 R11: 0000000000000206 R12: 0000000000404d80 [ 150.724650] R13: 00007ffd5d16784d R14: 00007ffd5d167238 R15: 000000000040411f [ 150.725447] Code: 83 c4 08 31 c0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 85 ff 74 23 55 48 89 e5 53 48 89 fb <48> 8b 3f e8 06 21 c5 ff 48 8b 7b 08 e8 fd 20 c5 ff 48 89 df e8 [ 150.727489] RIP: asymmetric_key_free_kids+0x12/0x30 RSP: ffffc90000353e58 [ 150.728117] CR2: 0000000000000001 [ 150.728430] ---[ end trace f7f8fe1da2d5ae8d ]--- Fixes: `4d8c0250b8` ("KEYS: Call ->free_preparse() even after ->preparse() returns an error") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Eric Biggers	5def69023a	KEYS: fix dereferencing NULL payload with nonzero length commit `5649645d72` upstream. sys_add_key() and the KEYCTL_UPDATE operation of sys_keyctl() allowed a NULL payload with nonzero length to be passed to the key type's ->preparse(), ->instantiate(), and/or ->update() methods. Various key types including asymmetric, cifs.idmap, cifs.spnego, and pkcs7_test did not handle this case, allowing an unprivileged user to trivially cause a NULL pointer dereference (kernel oops) if one of these key types was present. Fix it by doing the copy_from_user() when 'plen' is nonzero rather than when '_payload' is non-NULL, causing the syscall to fail with EFAULT as expected when an invalid buffer is specified. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
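The check described in the commit above (decide whether to copy based on the length, not on whether the pointer is non-NULL) can be sketched in user space. This is an illustration only: `fetch_payload()` is a hypothetical helper, `malloc()`/`memcpy()` stand in for the kernel's allocation and `copy_from_user()`, and the explicit NULL test simulates the fault that `copy_from_user()` would report for an invalid buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy `plen` bytes from `src` into a freshly allocated buffer returned via
 * `out`. The decision to copy is keyed on plen being nonzero, so a NULL
 * source with a nonzero length is rejected with -EFAULT instead of being
 * passed on as a NULL payload with a length attached.
 */
static int fetch_payload(const void *src, size_t plen, void **out)
{
    assert(out != NULL);
    *out = NULL;
    if (plen == 0)
        return 0;          /* nothing to copy; a NULL payload is fine here */
    if (src == NULL)
        return -EFAULT;    /* simulated copy_from_user() failure */
    *out = malloc(plen);
    if (*out == NULL)
        return -ENOMEM;
    memcpy(*out, src, plen);
    return 0;
}
```

The caller owns the returned buffer and is expected to `free()` it; a zero-length request succeeds and leaves `*out` as NULL.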
Gilad Ben-Yossef	e423898fd8	crypto: asymmetric_keys - handle EBUSY due to backlog correctly commit `e68368aed5` upstream. public_key_verify_signature() was passing the CRYPTO_TFM_REQ_MAY_BACKLOG flag to akcipher_request_set_callback() but was not correctly handling the case where a -EBUSY error could be returned from the call to crypto_akcipher_verify() if backlog was used, possibly causing data corruption due to use-after-free of buffers. Resolve this by handling -EBUSY correctly. Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Murali Karicheri	c10acee89d	ARM: dts: keystone-k2l: fix broken Ethernet due to disabled OSR commit `791229f1d5` upstream. Ethernet networking on K2L has been broken since v4.11-rc1. This was caused by commit `32a34441a9` ("ARM: keystone: dts: fix netcp clocks and add names"). This commit inadvertently moves the on-chip static RAM clock to the end of the list of clocks provided for netcp. Since keystone PM domain support does not have a list of recognized con_ids, only the first clock in the list comes under runtime PM management. This means the OSR (On-chip Static RAM) clock remains disabled and that broke networking on K2L. The OSR is used by QMSS on K2L as an external linking RAM. However this is a standalone RAM that can be used for non-QMSS usage (as well as from DSP side). So add a SRAM device node for the same and add the OSR clock to the node. Remove the now redundant OSR clock node from netcp. Managing all clocks defined for NetCP's use through runtime PM needs keystone generic power domain (genpd) driver support, which is in the works. Meanwhile, this patch restores K2L networking and is correct irrespective of any future genpd work since OSR is an independent module and not part of NetCP anyway. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> [nsekhar@ti.com: commit message updates, port to latest mainline] Signed-off-by: Sekhar Nori <nsekhar@ti.com> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Eric W. Biederman	ff6c1649b4	ptrace: Properly initialize ptracer_cred on fork commit `c70d9d809f` upstream. When I introduced ptracer_cred I failed to consider the weirdness of fork where the task_struct copies the old value by default. This winds up leaving ptracer_cred set even when a process forks and the child process does not wind up being ptraced. Because ptracer_cred is not set on non-ptraced processes whose parents were ptraced this has broken the ability of the enlightenment window manager to start setuid children. Fix this by properly initializing ptracer_cred in ptrace_init_task This must be done with a little bit of care to preserve the current value of ptracer_cred when ptrace carries through fork. Re-reading the ptracer_cred from the ptracing process at this point is inconsistent with how PT_PTRACE_CAP has been maintained all of these years. Tested-by: Takashi Iwai <tiwai@suse.de> Fixes: `64b875f7ac` ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Ralph Sennhauser <ralph.sennhauser@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Lucas Stach	1c8faeebd5	serial: core: fix crash in uart_suspend_port commit `88e2582e90` upstream. With serdev we might end up with serial ports that have no cdev exported to userspace, as they are used as the bus interface to other devices. In that case serial_match_port() won't be able to find a matching tty_dev. Skip the irq wakeup enabling in that case, as serdev will make sure to keep the port active, as long as there are devices depending on it. Fixes: `8ee3fde047` (tty_port: register tty ports with serdev bus) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Johan Hovold	70876e7c02	serial: ifx6x60: fix use-after-free on module unload commit `1e948479b3` upstream. Make sure to deregister the SPI driver before releasing the tty driver to avoid use-after-free in the SPI remove callback where the tty devices are deregistered. Fixes: `72d4724ea5` ("serial: ifx6x60: Add modem power off function in the platform reboot process") Cc: Jun Chen <jun.d.chen@intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:37 +02:00
Jan Kiszka	7faf3f5f89	serial: exar: Fix stuck MSIs commit `2c0ac5b48a` upstream. After migrating 8250_exar to MSI in `172c33cb61`, we can get stuck without further interrupts because of the special wake-up event these chips send. They are only cleared by reading INT0. As we fail to do so during startup and shutdown, we can leave the interrupt line asserted, which is fatal with edge-triggered MSIs. Add the required reading of INT0 to startup and shutdown. Also account for the fact that a pending wake-up interrupt means we have to return 1 from exar_handle_irq. Drop the unneeded reading of INT1..3 along with this - those never reset anything. An alternative approach would have been disabling the wake-up interrupt. Unfortunately, this feature (REGB[17] = 1) is not available on the XR17D15X. Fixes: `172c33cb61` ("serial: exar: Enable MSI support") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Luis Henriques	bc0734aac3	ftrace: Fix memory leak in ftrace_graph_release() commit `f9797c2f20` upstream. ftrace_hash is being kfree'ed in ftrace_graph_release(), however the ->buckets field is not. This results in a memory leak that is easily captured by kmemleak: unreferenced object 0xffff880038afe000 (size 8192): comm "trace-cmd", pid 238, jiffies 4294916898 (age 9.736s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff815f561e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff8113964d>] __kmalloc+0x12d/0x1a0 [<ffffffff810bf6d1>] alloc_ftrace_hash+0x51/0x80 [<ffffffff810c0523>] __ftrace_graph_open.isra.39.constprop.46+0xa3/0x100 [<ffffffff810c05e8>] ftrace_graph_open+0x68/0xa0 [<ffffffff8114003d>] do_dentry_open.isra.1+0x1bd/0x2d0 [<ffffffff81140df7>] vfs_open+0x47/0x60 [<ffffffff81150f95>] path_openat+0x2a5/0x1020 [<ffffffff81152d6a>] do_filp_open+0x8a/0xf0 [<ffffffff811411df>] do_sys_open+0x12f/0x200 [<ffffffff811412ce>] SyS_open+0x1e/0x20 [<ffffffff815fa6e0>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Link: http://lkml.kernel.org/r/20170525152038.7661-1-lhenriques@suse.com Fixes: `b9b0c831be` ("ftrace: Convert graph filter to use hash tables") Signed-off-by: Luis Henriques <lhenriques@suse.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Jane Chu	1a223727ae	arch/sparc: support NR_CPUS = 4096 [ Upstream commit `c79a13734d` ] Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info() only allocates a single page for NR_CPUS mondo entries. Thus we cannot use all 4096 CPUs on some SPARC platforms. To fix, allocate (2^order) pages where order is set according to the size of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa are not used in asm code, there are no imm13 offsets from the base PA that will break because they can only reach one page. Orabug: 25505750 Signed-off-by: Jane Chu <jane.chu@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Atish Patra <atish.patra@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	10e3a2945e	sparc64: delete old wrap code [ Upstream commit `0197e41ce7` ] The old method that is using xcall and softint to get new context id is deleted, as it is replaced by a method of using per_cpu_secondary_mm without xcall to perform the context wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	b60d9051cc	sparc64: new context wrap [ Upstream commit `a0582f26ec` ] The current wrap implementation has a race issue: it is called outside of the ctx_alloc_lock, and also does not wait for all CPUs to complete the wrap. This means that a thread can get a new context with a new version and another thread might still be running with the same context. The problem is especially severe on CPUs with shared TLBs, like sun4v. I used the following test to very quickly reproduce the problem: - start over 8K processes (must be more than context IDs) - write and read values at a memory location in every process. Very quickly memory corruptions start happening, and what we read back does not equal what we wrote. Several approaches were explored before settling on this one: Approach 1: Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for every process to complete the wrap. (Note: every CPU must WAIT before leaving smp_new_mmu_context_version_client() until every one arrives). This approach ends up with deadlocks, as some threads own locks which other threads are waiting for, and they never receive softint until these threads exit smp_new_mmu_context_version_client(). Since we do not allow the exit, deadlock happens. Approach 2: Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into C code, and issue new versions to every CPU. This approach adds some overhead to runtime: in switch_mm() we must add some checks to make sure that versions have not changed due to wrap while we were loading the new secondary context. (could be protected by PSTATE_IE but that degrades performance on M7 and older CPUs, as it takes 50 cycles for each access). Also, we still need a global per-cpu array of MMs to know where we need to load new contexts, otherwise we can change context to a thread that is going away (if we received mondo between switch_mm() and switch_to() time). 
Finally, there are some issues with window registers in rtrap() when context IDs are changed during CPU mondo time. The approach in this patch is the simplest and has almost no impact on runtime. We use the array with mm's where last secondary contexts were loaded onto CPUs and bump their versions to the new generation without changing context IDs. If a new process comes in to get a context ID, it will go through get_new_mmu_context() because of version mismatch. But the running processes do not need to be interrupted. And wrap is quicker as we do not need to xcall and wait for everyone to receive and complete wrap. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	be5e2c7026	sparc64: add per-cpu mm of secondary contexts [ Upstream commit `7a5b4bbf49` ] The new wrap is going to use information from this array to figure out mm's that currently have valid secondary contexts setup. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	d5fb553c51	sparc64: redefine first version [ Upstream commit `c4415235b2` ] CTX_FIRST_VERSION defines the first context version, but also it defines first context. This patch redefines it to only include the first context version. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	557145f44c	sparc64: combine activate_mm and switch_mm [ Upstream commit `14d0334c67` ] The only difference between these two functions is that in activate_mm we unconditionally flush context. However, there is no need to keep this difference after fixing a bug where cpumask was not reset on a wrap. So, in this patch we combine these. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:36 +02:00
Pavel Tatashin	aa6349e030	sparc64: reset mm cpumask after wrap [ Upstream commit `5889748573` ] After a wrap (getting a new context version) a process must get a new context id, which means that we would need to flush the context id from the TLB before running for the first time with this ID on every CPU. But, we use mm_cpumask to determine if this process has been running on this CPU before, and this mask is not reset after a wrap. So, there are two possible fixes for this issue: 1. Clear mm cpumask whenever mm gets a new context id. 2. Unconditionally flush context every time the process is running on a CPU. This patch implements the first solution. Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com> Reviewed-by: Bob Picco <bob.picco@oracle.com> Reviewed-by: Steven Sistare <steven.sistare@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Liam R. Howlett	c8c7bb2f5b	sparc/mm/hugepages: Fix setup_hugepagesz for invalid values. [ Upstream commit `f322980b74` ] hugetlb_bad_size needs to be called on invalid values. Also change the pr_warn to a pr_err to better align with other platforms. Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
James Clarke	a091e625ed	sparc: Machine description indices can vary [ Upstream commit `c982aa9c30` ] VIO devices were being looked up by their index in the machine description node block, but this often varies over time as devices are added and removed. Instead, store the ID and look up using the type, config handle and ID. Signed-off-by: James Clarke <jrtc27@jrtc27.com> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541 Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Mike Kravetz	376452d48e	sparc64: mm: fix copy_tsb to correctly copy huge page TSBs [ Upstream commit `654f480762` ] When a TSB grows beyond its current capacity, a new TSB is allocated and copy_tsb is called to copy entries from the old TSB to the new. A hash shift based on page size is used to calculate the index of an entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT should be used. As a result, when copy_tsb is called for a huge page TSB the entries are placed at the incorrect index in the newly allocated TSB. When doing hardware table walk, the MMU does not match these entries and we end up in the TSB miss handling code. This code will then create and write an entry to the correct index in the TSB. We take a performance hit for the table walk miss and recreation of these entries. Pass a new parameter to copy_tsb that is the page size shift to be used when copying the TSB. Suggested-by: Anthony Yznaga <anthony.yznaga@oracle.com> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
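The indexing problem described in the copy_tsb commit above can be illustrated with a small sketch. It is not the kernel's copy_tsb code: the helper `tsb_index`, the `SKETCH_` macros, and the shift values (8 KiB base pages, 4 MiB real huge pages) are assumptions chosen for illustration. It only shows that hashing the same address with the wrong page-size shift selects a different slot.

```c
#include <stdint.h>

/* Assumed values for illustration: 8 KiB base pages, 4 MiB real huge pages. */
#define SKETCH_PAGE_SHIFT        13
#define SKETCH_REAL_HPAGE_SHIFT  22

/*
 * Hypothetical helper: slot of a virtual address in a table of
 * nentries entries, where nentries is a power of two.
 */
static uint64_t tsb_index(uint64_t vaddr, unsigned int shift, uint64_t nentries)
{
    return (vaddr >> shift) & (nentries - 1);
}
```

With a 512-entry table and the address 0x1400000 (the sixth 4 MiB page), the huge-page shift selects slot 5 while the base-page shift selects slot 0, so an entry copied with the wrong shift would not be found where a lookup using the huge-page shift expects it.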
David S. Miller	35b284c739	sparc64: Add __multi3 for gcc 7.x and later. [ Upstream commit `1b4af13ff2` ] Reported-by: Waldemar Brodkorb <wbx@openadk.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Niklas Cassel	6bf499d388	net: stmmac: fix completely hung TX when using TSO [ Upstream commit `426849e661` ] stmmac_tso_allocator can fail to set the Last Descriptor bit on a descriptor that actually was the last descriptor. This happens when the buffer of the last descriptor ends up having a size of exactly TSO_MAX_BUFF_SIZE. When the IP eventually reaches the next last descriptor, which actually has the bit set, the DMA will hang. When the DMA hangs, we get a tx timeout, however, since stmmac does not do a complete reset of the IP in stmmac_tx_timeout, we end up in a state with completely hung TX. Signed-off-by: Niklas Cassel <niklas.cassel@axis.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Acked-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Max Filippov	d44184237a	net: ethoc: enable NAPI before poll may be scheduled [ Upstream commit `d220b942a4` ] ethoc_reset enables device interrupts, ethoc_interrupt may schedule a NAPI poll before NAPI is enabled in the ethoc_open, which results in device being unable to send or receive anything until it's closed and reopened. In case the device is flooded with ingress packets it may be unable to recover at all. Move napi_enable above ethoc_reset in the ethoc_open to fix that. Fixes: `a170285772` ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Nikolay Aleksandrov	9719e31a96	net: bridge: fix a null pointer dereference in br_afspec [ Upstream commit `1020ce3108` ] We might call br_afspec() with p == NULL which is a valid use case if the action is on the bridge device itself, but the bridge tunnel code dereferences the p pointer without checking, so check if p is null first. Reported-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Fixes: `efa5356b0d` ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:35 +02:00
Eugeniu Rosca	d673c7641c	ravb: Fix use-after-free on `ifconfig eth0 down` [ Upstream commit `79514ef670` ] Commit `a47b70ea86` ("ravb: unmap descriptors when freeing rings") has introduced the issue seen in [1] reproduced on H3ULCB board. Fix this by relocating the RX skb ringbuffer free operation, so that swiotlb page unmapping can be done first. Freeing of aligned TX buffers is not relevant to the issue seen in [1]. Still, reposition TX free calls as well, to have all kfree() operations performed consistently _after_ dma_unmap_()/dma_free_(). [1] Console screenshot with the problem reproduced: salvator-x login: root root@salvator-x:~# ifconfig eth0 up Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: \ attached PHY driver [Micrel KSZ9031 Gigabit PHY] \ (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=235) IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready root@salvator-x:~# root@salvator-x:~# ifconfig eth0 down ================================================================== BUG: KASAN: use-after-free in swiotlb_tbl_unmap_single+0xc4/0x35c Write of size 1538 at addr ffff8006d884f780 by task ifconfig/1649 CPU: 0 PID: 1649 Comm: ifconfig Not tainted 4.12.0-rc4-00004-g112eb07287d1 #32 Hardware name: Renesas H3ULCB board based on r8a7795 (DT) Call trace: [<ffff20000808f11c>] dump_backtrace+0x0/0x3a4 [<ffff20000808f4d4>] show_stack+0x14/0x1c [<ffff20000865970c>] dump_stack+0xf8/0x150 [<ffff20000831f8b0>] print_address_description+0x7c/0x330 [<ffff200008320010>] kasan_report+0x2e0/0x2f4 [<ffff20000831eac0>] check_memory_region+0x20/0x14c [<ffff20000831f054>] memcpy+0x48/0x68 [<ffff20000869ed50>] swiotlb_tbl_unmap_single+0xc4/0x35c [<ffff20000869fcf4>] unmap_single+0x90/0xa4 [<ffff20000869fd14>] swiotlb_unmap_page+0xc/0x14 [<ffff2000080a2974>] __swiotlb_unmap_page+0xcc/0xe4 [<ffff2000088acdb8>] ravb_ring_free+0x514/0x870 [<ffff2000088b25dc>] ravb_close+0x288/0x36c [<ffff200008aaf8c4>] __dev_close_many+0x14c/0x174 [<ffff200008aaf9b4>] __dev_close+0xc8/0x144 
[<ffff200008ac2100>] __dev_change_flags+0xd8/0x194 [<ffff200008ac221c>] dev_change_flags+0x60/0xb0 [<ffff200008ba2dec>] devinet_ioctl+0x484/0x9d4 [<ffff200008ba7b78>] inet_ioctl+0x190/0x194 [<ffff200008a78c44>] sock_do_ioctl+0x78/0xa8 [<ffff200008a7a128>] sock_ioctl+0x110/0x3c4 [<ffff200008365a70>] vfs_ioctl+0x90/0xa0 [<ffff200008365dbc>] do_vfs_ioctl+0x148/0xc38 [<ffff2000083668f0>] SyS_ioctl+0x44/0x74 [<ffff200008083770>] el0_svc_naked+0x24/0x28 The buggy address belongs to the page: page:ffff7e001b6213c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: 0000000000000000 ffff7e001b6213e0 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8006d884f680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8006d884f700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff8006d884f780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff8006d884f800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8006d884f880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint root@salvator-x:~# Fixes: `a47b70ea86` ("ravb: unmap descriptors when freeing rings") Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Richard Haines	4990610df0	net/ipv6: Fix CALIPSO causing GPF with datagram support [ Upstream commit `e3ebdb20fd` ] When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the IP header may have moved. Also update the payload length after adding the CALIPSO option. Signed-off-by: Richard Haines <richard_c_haines@btinternet.com> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Huw Davies <huw@codeweavers.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Eric Dumazet	76d36dd1b1	net: ping: do not abuse udp_poll() [ Upstream commit `77d4b1d369` ] Alexander reported various KASAN messages triggered in recent kernels. The problem is that ping sockets should not use udp_poll() in the first place, and recent changes in UDP stack finally exposed this old bug. Fixes: `c319b4d76b` ("net: ipv4: add IPPROTO_ICMP socket kind") Fixes: `6d0bfe2261` ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <alexander.levin@verizon.com> Cc: Solar Designer <solar@openwall.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Lorenzo Colitti <lorenzo@google.com> Acked-By: Lorenzo Colitti <lorenzo@google.com> Tested-By: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Florian Fainelli	223313c9be	net: dsa: Fix stale cpu_switch reference after unbind then bind [ Upstream commit `b07ac98946` ] Commit `9520ed8fb8` ("net: dsa: use cpu_switch instead of ds[0]") replaced the use of dst->ds[0] with dst->cpu_switch since that is functionally equivalent, however, we can now run into a use-after-free scenario after unbinding then rebinding the switch driver. The use-after-free happens because we do correctly initialize dst->cpu_switch the first time we probe in dsa_cpu_parse(), then we unbind the driver: dsa_dst_unapply() is called, and we rebind again. dst->cpu_switch now points to a freed "ds" structure, and so when we finally dereference it in dsa_cpu_port_ethtool_setup(), we oops. To fix this, simply set dst->cpu_switch to NULL in dsa_dst_unapply() which guarantees that we always correctly re-assign dst->cpu_switch in dsa_cpu_parse(). Fixes: `9520ed8fb8` ("net: dsa: use cpu_switch instead of ds[0]") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
David S. Miller	a32f7198b2	ipv6: Fix leak in ipv6_gso_segment(). [ Upstream commit `e3e86b5119` ] If ip6_find_1stfragopt() fails and we return an error we have to free up 'segs' because nobody else is going to. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Eric Garver	f7f87871c4	geneve: fix needed_headroom and max_mtu for collect_metadata [ Upstream commit `9a1c44d989` ] Since commit `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") when using COLLECT_METADATA geneve devices are created with too small of a needed_headroom and too large of a max_mtu. This is because ip_tunnel_info_af() is not valid with the device level info when using COLLECT_METADATA and we mistakenly fall into the IPv4 case. For COLLECT_METADATA, always use the worst case of ipv6 since both sockets are created. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Soheil Hassas Yeganeh	19456d4526	sock: reset sk_err when the error queue is empty [ Upstream commit `38b257938a` ] Prior to `f5f99309fa` (sock: do not set sk_err in sock_dequeue_err_skb), sk_err was reset to the error of the skb on the head of the error queue. Applications, most notably ping, are relying on this behavior to reset sk_err for ICMP packets. Set sk_err to the ICMP error when there is an ICMP packet at the head of the error queue. Fixes: `f5f99309fa` (sock: do not set sk_err in sock_dequeue_err_skb) Reported-by: Cyril Hrubis <chrubis@suse.cz> Tested-by: Cyril Hrubis <chrubis@suse.cz> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:34 +02:00
Liam McBirnie	3ab7563eed	ip6_tunnel: fix traffic class routing for tunnels [ Upstream commit `5f733ee68f` ] ip6_route_output() requires that the flowlabel contains the traffic class for policy routing. Commit `0e9a709560` ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets") removed the code which previously added the traffic class to the flowlabel. The traffic class is added here because only route lookup needs the flowlabel to contain the traffic class. Fixes: `0e9a709560` ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets") Signed-off-by: Liam McBirnie <liam.mcbirnie@boeing.com> Acked-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Mark Bloch	35e08f6e2b	vxlan: fix use-after-free on deletion [ Upstream commit `a53cb29b0a` ] Adding a vxlan interface to a socket isn't symmetrical, while adding is done in vxlan_open() the deletion is done in vxlan_dellink(). This can cause a use-after-free error when we close the vxlan interface before deleting it. We add vxlan_vs_del_dev() to match vxlan_vs_add_dev() and call it from vxlan_stop() to match the call from vxlan_open(). Fixes: `56ef9c909b` ("vxlan: Move socket initialization to within rtnl scope") Acked-by: Jiri Benc <jbenc@redhat.com> Tested-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Yuchung Cheng	905ae1b1e6	tcp: disallow cwnd undo when switching congestion control [ Upstream commit `44abafc4cc` ] When the sender switches its congestion control during loss recovery, if the recovery is spurious then it may incorrectly revert cwnd and ssthresh to the older values set by a previous congestion control. Consider a congestion control (like BBR) that does not use ssthresh and keeps it infinite: the connection may incorrectly revert cwnd to an infinite value when switching from BBR to another congestion control. This patch fixes it by disallowing such cwnd undo operation upon switching congestion control. Note that undo_marker is not reset s.t. the packets that were incorrectly marked lost would be corrected. We only avoid undoing the cwnd in tcp_undo_cwnd_reduction(). Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Ganesh Goudar	fb62fe105b	cxgb4: avoid enabling napi twice to the same queue [ Upstream commit `e7519f9926` ] Take uld mutex to avoid race between cxgb_up() and cxgb4_register_uld() to enable napi for the same uld queue. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Ben Hutchings	d935063ac4	ipv6: xfrm: Handle errors reported by xfrm6_find_1stfragopt() [ Upstream commit `6e80ac5cc9` ] xfrm6_find_1stfragopt() may now return an error code and we must not treat it as a length. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Florian Fainelli	ce45e4f995	net: systemport: Fix missing Wake-on-LAN interrupt for SYSTEMPORT Lite [ Upstream commit `d31353cd75` ] On SYSTEMPORT Lite, since we have the main interrupt source in the first cell, the second cell is the Wake-on-LAN interrupt, yet the code was not properly updated to fetch the second cell, and instead looked at the third and non-existing cell for Wake-on-LAN. Fixes: `44a4524c54` ("net: systemport: Add support for SYSTEMPORT Lite") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Lance Richardson	1f9f0e2822	vxlan: eliminate cached dst leak [ Upstream commit `35cf284556` ] After commit `0c1d70af92` ("net: use dst_cache for vxlan device"), cached dst entries could be leaked when more than one remote was present for a given vxlan_fdb entry, causing subsequent netns operations to block indefinitely and "unregister_netdevice: waiting for lo to become free." messages to appear in the kernel log. Fix by properly releasing cached dst and freeing resources in this case. Fixes: `0c1d70af92` ("net: use dst_cache for vxlan device") Signed-off-by: Lance Richardson <lrichard@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:33 +02:00
Nikolay Aleksandrov	8b8fd1831b	net: bridge: start hello timer only if device is up [ Upstream commit `aeb073241f` ] When the transition of NO_STP -> KERNEL_STP was fixed by always calling mod_timer in br_stp_start, it introduced a new regression which causes the timer to be armed even when the bridge is down, and since we stop the timers in its ndo_stop() function, they never get disabled if the device is destroyed before it's upped. To reproduce: $ while :; do ip l add br0 type bridge hello_time 100; brctl stp br0 on; ip l del br0; done; CC: Xin Long <lucien.xin@gmail.com> CC: Ivan Vecera <cera@cera.cz> CC: Sebastian Ott <sebott@linux.vnet.ibm.com> Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Fixes: `6d18c732b9` ("bridge: start hello_timer when enabling KERNEL_STP in br_stp_start") Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:32 +02:00
Mintz, Yuval	f5f67e441e	bnx2x: Fix Multi-Cos [ Upstream commit `3968d38917` ] Apparently multi-cos hasn't been working for bnx2x for quite some time - the driver implements ndo_select_queue() to allow queue-selection for FCoE, but the regular L2 flow would cause it to modulo the fallback's result by the number of queues. The fallback would return a queue matching the needed tc [via __skb_tx_hash()], but since the modulo is by the number of TSS queues where the number of TCs is not accounted for, transmission would always be done by a queue configured into using TC0. Fixes: `ada7c19e6d` ("bnx2x: use XPS if possible for bnx2x_select_queue instead of pure hash") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-14 15:07:32 +02:00
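The arithmetic behind the bnx2x multi-CoS failure above can be sketched as follows. This is a simplified model, not the driver's code: the function names are hypothetical, and it assumes the queues of traffic class k occupy the contiguous range [k * n, (k + 1) * n), where n is the number of TSS queues.

```c
/* Hypothetical model: n TSS queues per traffic class, laid out contiguously. */

/* What a tc-aware fallback would pick: a queue inside the range of class tc. */
static unsigned int fallback_queue(unsigned int tc, unsigned int hash, unsigned int n)
{
    return tc * n + hash % n;
}

/* The problematic step: reducing the result modulo the TSS queue count only. */
static unsigned int reduce_by_tss_count(unsigned int queue, unsigned int n)
{
    return queue % n;
}

/* Traffic class that a queue index belongs to under the assumed layout. */
static unsigned int queue_tc(unsigned int queue, unsigned int n)
{
    return queue / n;
}
```

For example, with n = 4, traffic class 2 and hash 13, the fallback picks queue 9, which belongs to class 2. Reducing 9 modulo 4 yields queue 1, which belongs to class 0, matching the commit's observation that transmission always ended up on a TC0 queue.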
Greg Kroah-Hartman	553c942bef	Linux 4.11.4	2017-06-07 12:10:31 +02:00
Jan Kara	b5ff97c774	xfs: Fix off-by-one in loop termination in xfs_find_get_desired_pgoff() commit `d7fd24257a` upstream. There is an off-by-one error in loop termination conditions in xfs_find_get_desired_pgoff() since 'end' may index a page beyond end of desired range if 'endoff' is page aligned. It doesn't have any visible effects but still it is good to fix it. Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:17 +02:00
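The page-aligned case mentioned in the commit above can be shown with a minimal sketch. The helper names and the 4 KiB page size are assumptions for illustration and are not taken from the XFS source. The sketch compares two ways of computing the last page index of a byte range that ends (exclusively) at `endoff`.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4 KiB pages */

/* Index of the last page that the byte range [0, endoff) touches; endoff > 0. */
static uint64_t last_page_inclusive(uint64_t endoff)
{
    return (endoff - 1) >> SKETCH_PAGE_SHIFT;
}

/* Off-by-one variant: points one page too far when endoff is page aligned. */
static uint64_t last_page_off_by_one(uint64_t endoff)
{
    return endoff >> SKETCH_PAGE_SHIFT;
}
```

For endoff = 8192 the range covers pages 0 and 1, yet the second helper returns 2. For an unaligned endoff such as 8193 both helpers agree on page 2, which is why the error only shows up in the aligned case.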
Eric Sandeen	d514c634a4	xfs: fix unaligned access in xfs_btree_visit_blocks commit `a4d768e702` upstream. This structure copy was throwing unaligned access warnings on sparc64: Kernel unaligned access at TPC[1043c088] xfs_btree_visit_blocks+0x88/0xe0 [xfs] xfs_btree_copy_ptrs does a memcpy, which avoids it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:17 +02:00
Darrick J. Wong	fff5729ce2	xfs: avoid mount-time deadlock in CoW extent recovery commit `3ecb3ac7b9` upstream. If a malicious user corrupts the refcount btree to cause a cycle between different levels of the tree, the next mount attempt will deadlock in the CoW recovery routine while grabbing buffer locks. We can use the ability to re-grab a buffer that was previously locked to a transaction to avoid deadlocks, so do that here. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Christoph Hellwig	ecb4261526	xfs: xfs_trans_alloc_empty This is a partial cherry-pick of commit `e89c041338` ("xfs: implement the GETFSMAP ioctl"), which also adds this helper, and a great example of why feature patches should be properly split into their parts. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> [hch: split from the larger patch for -stable] Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-07 12:10:16 +02:00
Zorro Lang	2e08bd63dc	xfs: bad assertion for delalloc an extent that start at i_size commit `892d2a5f70` upstream. By running fsstress for long enough in RHEL-7, I find an assertion failure (harder to reproduce on linux-4.11, but problem is still there): XFS: Assertion failed: (iflags & BMV_IF_DELALLOC) != 0, file: fs/xfs/xfs_bmap_util.c The assertion is in xfs_getbmap() function: if (map[i].br_startblock == DELAYSTARTBLOCK && --> map[i].br_startoff <= XFS_B_TO_FSB(mp, XFS_ISIZE(ip))) ASSERT((iflags & BMV_IF_DELALLOC) != 0); When map[i].br_startoff == XFS_B_TO_FSB(mp, XFS_ISIZE(ip)), the startoff is just at EOF. But we only need to check delalloc extents that are within EOF, not including EOF. Signed-off-by: Zorro Lang <zlang@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
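The boundary condition discussed in the commit above comes down to `<=` versus `<`. The sketch below is an illustration under assumptions rather than XFS code: `b_to_fsb` is a hypothetical stand-in for a round-up bytes-to-blocks conversion, and the 4 KiB block size is assumed.

```c
#include <stdint.h>

#define SKETCH_BLOCK_SIZE 4096u  /* assumed filesystem block size */

/* Hypothetical stand-in: bytes to filesystem blocks, rounding up. */
static uint64_t b_to_fsb(uint64_t bytes)
{
    return (bytes + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

/* An extent lies within EOF only if it starts strictly before the EOF block. */
static int starts_within_eof(uint64_t startoff, uint64_t isize)
{
    return startoff < b_to_fsb(isize);
}

/* The looser check, which also accepts an extent starting exactly at EOF. */
static int starts_within_or_at_eof(uint64_t startoff, uint64_t isize)
{
    return startoff <= b_to_fsb(isize);
}
```

With an 8192-byte file the EOF block is 2. An extent starting at block 2 passes the looser check but not the strict one, which is the case the commit message describes as "just at EOF".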
Darrick J. Wong	2dc6e27120	xfs: BMAPX shouldn't barf on inline-format directories commit `6eadbf4c8b` upstream. When we're fulfilling a BMAPX request, jump out early if the data fork is in local format. This prevents us from hitting a debugging check in bmapi_read and barfing errors back to userspace. The on-disk extent count check later isn't sufficient for IF_DELALLOC mode because da extents are in memory and not on disk. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Brian Foster	1ae26380c8	xfs: fix indlen accounting error on partial delalloc conversion commit `0daaecacb8` upstream. The delalloc -> real block conversion path uses an incorrect calculation in the case where the middle part of a delalloc extent is being converted. This is documented as a rare situation because XFS generally attempts to maximize contiguity by converting as much of a delalloc extent as possible. If this situation does occur, the indlen reservation for the two new delalloc extents left behind by the conversion of the middle range is calculated and compared with the original reservation. If more blocks are required, the delta is allocated from the global block pool. This delta value can be characterized as the difference between the new total requirement (temp + temp2) and the currently available reservation minus those blocks that have already been allocated (startblockval(PREV.br_startblock) - allocated). The problem is that the current code does not account for previously allocated blocks correctly. It subtracts the current allocation count from the (new - old) delta rather than the old indlen reservation. This means that more indlen blocks than have been allocated end up stashed in the remaining extents and free space accounting is broken as a result. Fix up the calculation to subtract the allocated block count from the original extent indlen and thus correctly allocate the reservation delta based on the difference between the new total requirement and the unused blocks from the original reservation. Also remove a bogus assert that contradicts the fact that the new indlen reservation can be larger than the original indlen reservation. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:16 +02:00
Eryu Guan	d1a8ae21c3	xfs: fix use-after-free in xfs_finish_page_writeback commit `161f55efba` upstream. Commit `28b783e47a` ("xfs: bufferhead chains are invalid after end_page_writeback") fixed one use-after-free issue by pre-calculating the loop conditionals before calling bh->b_end_io() in the end_io processing loop, but it assigned 'next' pointer before checking end offset boundary & breaking the loop, at which point the bh might be freed already, and caused use-after-free. This is caught by KASAN when running fstests generic/127 on sub-page block size XFS. [ 2517.244502] run fstests generic/127 at 2017-04-27 07:30:50 [ 2747.868840] ================================================================== [ 2747.876949] BUG: KASAN: use-after-free in xfs_destroy_ioend+0x3d3/0x4e0 [xfs] at addr ffff8801395ae698 ... [ 2747.918245] Call Trace: [ 2747.920975] dump_stack+0x63/0x84 [ 2747.924673] kasan_object_err+0x21/0x70 [ 2747.928950] kasan_report+0x271/0x530 [ 2747.933064] ? xfs_destroy_ioend+0x3d3/0x4e0 [xfs] [ 2747.938409] ? end_page_writeback+0xce/0x110 [ 2747.943171] __asan_report_load8_noabort+0x19/0x20 [ 2747.948545] xfs_destroy_ioend+0x3d3/0x4e0 [xfs] [ 2747.953724] xfs_end_io+0x1af/0x2b0 [xfs] [ 2747.958197] process_one_work+0x5ff/0x1000 [ 2747.962766] worker_thread+0xe4/0x10e0 [ 2747.966946] kthread+0x2d3/0x3d0 [ 2747.970546] ? process_one_work+0x1000/0x1000 [ 2747.975405] ? kthread_create_on_node+0xc0/0xc0 [ 2747.980457] ? syscall_return_slowpath+0xe6/0x140 [ 2747.985706] ? 
do_page_fault+0x30/0x80 [ 2747.989887] ret_from_fork+0x2c/0x40 [ 2747.993874] Object at ffff8801395ae690, in cache buffer_head size: 104 [ 2748.001155] Allocated: [ 2748.003782] PID = 8327 [ 2748.006411] save_stack_trace+0x1b/0x20 [ 2748.010688] save_stack+0x46/0xd0 [ 2748.014383] kasan_kmalloc+0xad/0xe0 [ 2748.018370] kasan_slab_alloc+0x12/0x20 [ 2748.022648] kmem_cache_alloc+0xb8/0x1b0 [ 2748.027024] alloc_buffer_head+0x22/0xc0 [ 2748.031399] alloc_page_buffers+0xd1/0x250 [ 2748.035968] create_empty_buffers+0x30/0x410 [ 2748.040730] create_page_buffers+0x120/0x1b0 [ 2748.045493] __block_write_begin_int+0x17a/0x1800 [ 2748.050740] iomap_write_begin+0x100/0x2f0 [ 2748.055308] iomap_zero_range_actor+0x253/0x5c0 [ 2748.060362] iomap_apply+0x157/0x270 [ 2748.064347] iomap_zero_range+0x5a/0x80 [ 2748.068624] iomap_truncate_page+0x6b/0xa0 [ 2748.073227] xfs_setattr_size+0x1f7/0xa10 [xfs] [ 2748.078312] xfs_vn_setattr_size+0x68/0x140 [xfs] [ 2748.083589] xfs_file_fallocate+0x4ac/0x820 [xfs] [ 2748.088838] vfs_fallocate+0x2cf/0x780 [ 2748.093021] SyS_fallocate+0x48/0x80 [ 2748.097006] do_syscall_64+0x18a/0x430 [ 2748.101186] return_from_SYSCALL_64+0x0/0x6a [ 2748.105948] Freed: [ 2748.108189] PID = 8327 [ 2748.110816] save_stack_trace+0x1b/0x20 [ 2748.115093] save_stack+0x46/0xd0 [ 2748.118788] kasan_slab_free+0x73/0xc0 [ 2748.122969] kmem_cache_free+0x7a/0x200 [ 2748.127247] free_buffer_head+0x41/0x80 [ 2748.131524] try_to_free_buffers+0x178/0x250 [ 2748.136316] xfs_vm_releasepage+0x2e9/0x3d0 [xfs] [ 2748.141563] try_to_release_page+0x100/0x180 [ 2748.146325] invalidate_inode_pages2_range+0x7da/0xcf0 [ 2748.152087] xfs_shift_file_space+0x37d/0x6e0 [xfs] [ 2748.157557] xfs_collapse_file_space+0x49/0x120 [xfs] [ 2748.163223] xfs_file_fallocate+0x2a7/0x820 [xfs] [ 2748.168462] vfs_fallocate+0x2cf/0x780 [ 2748.172642] SyS_fallocate+0x48/0x80 [ 2748.176629] do_syscall_64+0x18a/0x430 [ 2748.180810] return_from_SYSCALL_64+0x0/0x6a Fixed it by checking on offset against end & 
breaking out first, and dereferencing bh only if there are still bufferheads to process. Signed-off-by: Eryu Guan <eguan@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Darrick J. Wong	365625a670	xfs: reserve enough blocks to handle btree splits when remapping commit `fe0be23e68` upstream. In xfs_reflink_end_cow, we erroneously reserve only enough blocks to handle adding 1 extent. This is problematic if we fragment free space, have to do CoW, and then have to perform multiple bmap btree expansions. Furthermore, the BUI recovery routine doesn't reserve /any/ blocks to handle btree splits, so log recovery fails after our first error causes the filesystem to go down. Therefore, refactor the transaction block reservation macros until we have a macro that works for our deferred (re)mapping activities, and fix both problems by using that macro. With 1k blocks we can hit this fairly often in g/187 if the scratch fs is big enough. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	67e439ccfe	xfs: wait on new inodes during quotaoff dquot release commit `e20c8a517f` upstream. The quotaoff operation has a race with inode allocation that results in a livelock. An inode allocation that occurs before the quota status flags are updated acquires the appropriate dquots for the inode via xfs_qm_vop_dqalloc(). It then inserts the XFS_INEW inode into the perag radix tree, sometime later attaches the dquots to the inode and finally clears the XFS_INEW flag. Quotaoff expects to release the dquots from all inodes in the filesystem via xfs_qm_dqrele_all_inodes(). This invokes the AG inode iterator, which skips inodes in the XFS_INEW state because they are not fully constructed. If the scan occurs after dquots have been attached to an inode, but before XFS_INEW is cleared, the newly allocated inode will continue to hold a reference to the applicable dquots. When quotaoff invokes xfs_qm_dqpurge_all(), the reference count of those dquot(s) remains elevated and the dqpurge scan spins indefinitely. To address this problem, update the xfs_qm_dqrele_all_inodes() scan to wait on inodes in the XFS_INEW state. We wait on the inodes explicitly rather than skip and retry to avoid continuous retry loops due to a parallel inode allocation workload. Since quotaoff updates the quota state flags and uses a synchronous transaction before the dqrele scan, and dquots are attached to inodes after radix tree insertion iff quota is enabled, one INEW waiting pass through the AG guarantees that the scan has processed all inodes that could possibly hold dquot references. Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	f8c68633bb	xfs: update ag iterator to support wait on new inodes commit `ae2c4ac2dd` upstream. The AG inode iterator currently skips new inodes as such inodes are inserted into the inode radix tree before they are fully constructed. Certain contexts require the ability to wait on the construction of new inodes, however. The fs-wide dquot release from the quotaoff sequence is an example of this. Update the AG inode iterator to support the ability to wait on inodes flagged with XFS_INEW upon request. Create a new xfs_inode_ag_iterator_flags() interface and support a set of iteration flags to modify the iteration behavior. When the XFS_AGITER_INEW_WAIT flag is set, include XFS_INEW flags in the radix tree inode lookup and wait on them before the callback is executed. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	56aab3095b	xfs: support ability to wait on new inodes commit `756baca27f` upstream. Inodes that are inserted into the perag tree but still under construction are flagged with the XFS_INEW bit. Most contexts either skip such inodes when they are encountered or have the ability to handle them. The runtime quotaoff sequence introduces a context that must wait for construction of such inodes to correctly ensure that all dquots in the fs are released. In anticipation of this, support the ability to wait on new inodes. Wake the appropriate bit when XFS_INEW is cleared. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	bf16242614	xfs: fix up quotacheck buffer list error handling commit `20e8a06378` upstream. The quotacheck error handling of the delwri buffer list assumes the resident buffers are locked and doesn't clear the _XBF_DELWRI_Q flag on the buffers that are dequeued. This can lead to assert failures on buffer release and possibly other locking problems. Move this code to a delwri queue cancel helper function to encapsulate the logic required to properly release buffers from a delwri queue. Update the helper to clear the delwri queue flag and call it from quotacheck. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:15 +02:00
Brian Foster	89aab40e12	xfs: prevent multi-fsb dir readahead from reading random blocks commit `cb52ee334a` upstream. Directory block readahead uses a complex iteration mechanism to map between high-level directory blocks and underlying physical extents. This mechanism attempts to traverse the higher-level dir blocks in a manner that handles multi-fsb directory blocks and simultaneously maintains a reference to the corresponding physical blocks. This logic doesn't handle certain (discontiguous) physical extent layouts correctly with multi-fsb directory blocks. For example, consider the case of a 4k FSB filesystem with a 2 FSB (8k) directory block size and a directory with the following extent layout: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL 0: [0..7]: 88..95 0 (88..95) 8 1: [8..15]: 80..87 0 (80..87) 8 2: [16..39]: 168..191 0 (168..191) 24 3: [40..63]: 5242952..5242975 1 (72..95) 24 Directory block 0 spans physical extents 0 and 1, dirblk 1 lies entirely within extent 2 and dirblk 2 spans extents 2 and 3. Because extent 2 is larger than the directory block size, the readahead code erroneously assumes the block is contiguous and issues a readahead based on the physical mapping of the first fsb of the dirblk. This results in read verifier failure and a spurious corruption or crc failure, depending on the filesystem format. Further, the subsequent readahead code responsible for walking through the physical table doesn't correctly advance the physical block reference for dirblk 2. Instead of advancing two physical filesystem blocks, the first iteration of the loop advances 1 block (correctly), but the subsequent iteration advances 2 more physical blocks because the next physical extent (extent 3, above) happens to cover more than dirblk 2. At this point, the higher-level directory block walking is completely off the rails of the actual physical layout of the directory for the respective mapping table. 
Update the contiguous dirblock logic to consider the current offset in the physical extent to avoid issuing directory readahead to unrelated blocks. Also, update the mapping table advancing code to consider the current offset within the current dirblock to avoid advancing the mapping reference too far beyond the dirblock. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Eric Sandeen	91bb4f7da5	xfs: handle array index overrun in xfs_dir2_leaf_readbuf() commit `023cc840b4` upstream. Carlos had a case where "find" seemed to start spinning forever and never return. This was on a filesystem with non-default multi-fsb (8k) directory blocks, and a fragmented directory with extents like this: 0:[0,133646,2,0] 1:[2,195888,1,0] 2:[3,195890,1,0] 3:[4,195892,1,0] 4:[5,195894,1,0] 5:[6,195896,1,0] 6:[7,195898,1,0] 7:[8,195900,1,0] 8:[9,195902,1,0] 9:[10,195908,1,0] 10:[11,195910,1,0] 11:[12,195912,1,0] 12:[13,195914,1,0] ... i.e. the first extent is a contiguous 2-fsb dir block, but after that it is fragmented into 1 block extents. At the top of the readdir path, we allocate a mapping array which (for this filesystem geometry) can hold 10 extents; see the assignment to map_info->map_size. During readdir, we are therefore able to map extents 0 through 9 above into the array for readahead purposes. If we count by 2, we see that the last mapped index (9) is the first block of a 2-fsb directory block. At the end of xfs_dir2_leaf_readbuf() we have 2 loops to fill more readahead; the outer loop assumes one full dir block is processed each loop iteration, and an inner loop that ensures that this is so by advancing to the next extent until a full directory block is mapped. The problem is that this inner loop may step past the last extent in the mapping array as it tries to reach the end of the directory block. This will read garbage for the extent length, and as a result the loop control variable 'j' may become corrupted and never fail the loop conditional. The number of valid mappings we have in our array is stored in map->map_valid, so stop this inner loop based on that limit. There is an ASSERT at the top of the outer loop for this same condition, but we never made it out of the inner loop, so the ASSERT never fired. Huge appreciation for Carlos for debugging and isolating the problem. 
Debugged-and-analyzed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Tested-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Bill O'Donnell <billodo@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
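The bounded inner loop described in the entry above can be sketched as follows. This is a simplified illustration, not the code of xfs_dir2_leaf_readbuf(): the `struct extent` type and the function name are hypothetical, and each extent is treated as consumed whole. The point is only that the walk stops at the number of valid mappings instead of reading past the end of the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one mapped extent: only its length in
 * filesystem blocks matters for this sketch. */
struct extent {
    unsigned len;
};

/* Advance from extent index `start` until one directory block worth of
 * filesystem blocks is covered, or until the valid mappings run out.
 * Returns the index of the first extent that was not consumed. */
static size_t advance_one_dirblock(const struct extent *map, size_t valid,
                                   size_t start, unsigned fsbs_per_dirblock)
{
    unsigned covered = 0;
    size_t i = start;

    assert(map != NULL && start <= valid);

    /* Checking `i < valid` first keeps the loop from reading garbage
     * beyond the last valid mapping. */
    while (i < valid && covered < fsbs_per_dirblock) {
        covered += map[i].len;
        i++;
    }
    return i;
}
```

With extents of length 2, 1, 1, 1 and a 2-block directory block, starting at the last extent returns the valid count (4) rather than stepping to a fifth, nonexistent entry.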
Christoph Hellwig	23da04dcc3	xfs: fix integer truncation in xfs_bmap_remap_alloc commit `52813fb13f` upstream. bno should be an xfs_fsblock_t, which is 64 bits wide, instead of an xfs_agblock_t, which truncates the value to 32 bits. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
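The truncation described in the entry above is easy to demonstrate in isolation. The typedefs below are stand-ins chosen for this sketch (a 64-bit and a 32-bit unsigned type, matching the widths the commit message states), not the kernel's definitions, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t fsblock_t; /* stand-in for a 64-bit block number type */
typedef uint32_t agblock_t; /* stand-in for a 32-bit block number type */

/* Returns 1 if `bno` survives a round trip through the 32-bit type,
 * 0 if the upper bits would be silently dropped. */
static int fits_in_agblock(fsblock_t bno)
{
    agblock_t narrow = (agblock_t)bno; /* keeps only the low 32 bits */
    return (fsblock_t)narrow == bno;
}
```

Any block number at or above 2^32 fails the round trip, which is the class of value the wider type preserves.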
Brian Foster	eff615a670	xfs: drop iolock from reclaim context to appease lockdep commit `3b4683c294` upstream. Lockdep complains about use of the iolock in inode reclaim context because it doesn't understand that reclaim has the last reference to the inode, and thus an iolock->reclaim->iolock deadlock is not possible. The iolock is technically not necessary in xfs_inactive() and was only added to appease an assert in xfs_free_eofblocks(), which can be called from other non-reclaim contexts. Therefore, just kill the assert and drop the use of the iolock from reclaim context to quiet lockdep. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Darrick J. Wong	8a1f785887	xfs: actually report xattr extents via iomap commit `84358536dc` upstream. Apparently FIEMAP for xattrs has been broken since we switched to the iomap backend because of an incorrect check for xattr presence. Also fix the broken locking. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Darrick J. Wong	1c862f5a3e	xfs: fix over-copying of getbmap parameters from userspace commit `be6324c00c` upstream. In xfs_ioc_getbmap, we should only copy the fields of struct getbmap from userspace, or else we end up copying random stack contents into the kernel. struct getbmap is a strict subset of getbmapx, so a partial structure copy should work fine. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
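The partial structure copy mentioned in the entry above can be sketched in user space. The two structs here are hypothetical miniatures, not the real getbmap and getbmapx layouts, and `memcpy` stands in for the kernel's copy from userspace. The sketch assumes, as the commit message states for the real structures, that the smaller struct is a leading subset of the larger one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical small request, as a caller would pass it in. */
struct req {
    int64_t offset;
    int64_t length;
};

/* Hypothetical extended request whose first fields match `struct req`. */
struct req_ext {
    int64_t offset;
    int64_t length;
    int32_t extra_flags;
    int32_t pad;
};

/* Copy only the bytes the caller actually supplied and leave the
 * extended fields zeroed, instead of reading sizeof(struct req_ext)
 * bytes from a buffer that holds only a `struct req`. */
static void load_req(struct req_ext *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
    memset(dst, 0, sizeof(*dst));
    memcpy(dst, src, sizeof(struct req));
}
```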
Brian Foster	2cd6bd867a	xfs: use dedicated log worker wq to avoid deadlock with cil wq commit `696a562072` upstream. The log covering background task used to be part of the xfssyncd workqueue. That workqueue was removed as of commit `5889608df` ("xfs: syncd workqueue is no more") and the associated work item scheduled to the xfs-log wq. The latter is used for log buffer I/O completion. Since xfs_log_worker() can invoke a log flush, a deadlock is possible between the xfs-log and xfs-cil workqueues. Consider the following codepath from xfs_log_worker(): xfs_log_worker() xfs_log_force() _xfs_log_force() xlog_cil_force() xlog_cil_force_lsn() xlog_cil_push_now() flush_work() The above is in xfs-log wq context and blocked waiting on the completion of an xfs-cil work item. Concurrently, the cil push in progress can end up blocked here: xlog_cil_push_work() xlog_cil_push() xlog_write() xlog_state_get_iclog_space() xlog_wait(&log->l_flush_wait, ...) The above is in xfs-cil context waiting on log buffer I/O completion, which executes in xfs-log wq context. In this scenario both workqueues are deadlocked waiting on each other. Add a new workqueue specifically for the high level log covering and ail pushing worker, as was the case prior to commit `5889608df`. Diagnosed-by: David Jeffery <djeffery@redhat.com> Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:14 +02:00
Eryu Guan	a19348b729	xfs: fix off-by-one on max nr_pages in xfs_find_get_desired_pgoff() commit `8affebe16d` upstream. xfs_find_get_desired_pgoff() is used to search for offset of hole or data in page range [index, end] (both inclusive), and the max number of pages to search should be at least one, if end == index. Otherwise the only page is missed and no hole or data is found, which is not correct. When block size is smaller than page size, this can be demonstrated by preallocating a file with size smaller than page size and writing data to the last block. E.g. run this xfs_io command on a 1k block size XFS on x86_64 host. # xfs_io -fc "falloc 0 3k" -c "pwrite 2k 1k" \ -c "seek -d 0" /mnt/xfs/testfile wrote 1024/1024 bytes at offset 2048 1 KiB, 1 ops; 0.0000 sec (33.675 MiB/sec and 34482.7586 ops/sec) Whence Result DATA EOF Data at offset 2k was missed, and lseek(2) returned ENXIO. This was uncovered by generic/285 subtests 07 and 08 on a ppc64 host, where the page size is 64k, because a recent change to generic/285 reduced the preallocated file size to smaller than 64k. Signed-off-by: Eryu Guan <eguan@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
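The off-by-one in the entry above comes down to counting pages in a range whose two ends are both inclusive. The helper below is a hypothetical sketch of that counting, with an assumed per-lookup batch limit `batch_max`; it is not the code of xfs_find_get_desired_pgoff().

```c
#include <assert.h>
#include <stdint.h>

/* Number of pages to look up in the inclusive range [index, end],
 * capped at `batch_max` pages per lookup. The "+ 1" is applied after
 * the cap so that end == index still yields one page and the sum
 * cannot overflow. */
static uint64_t pages_to_search(uint64_t index, uint64_t end,
                                uint64_t batch_max)
{
    uint64_t span, capped;

    assert(end >= index && batch_max >= 1);

    span = end - index; /* 0 when the range holds exactly one page */
    capped = span < batch_max - 1 ? span : batch_max - 1;
    return capped + 1;
}
```

For a single-page range such as [5, 5] this returns 1, whereas a count computed as `end - index` would be 0 and the only page would be skipped.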
Brian Foster	0364c225a5	xfs: use ->b_state to fix buffer I/O accounting release race commit `63db7c815b` upstream. We've had user reports of unmount hangs in xfs_wait_buftarg() that analysis shows are due to btp->bt_io_count == -1. bt_io_count represents the count of in-flight asynchronous buffers and thus should always be >= 0. xfs_wait_buftarg() waits for this value to stabilize to zero in order to ensure that all untracked (with respect to the lru) buffers have completed I/O processing before unmount proceeds to tear down in-core data structures. The value of -1 implies an I/O accounting decrement race. Indeed, the fact that xfs_buf_ioacct_dec() is called from xfs_buf_rele() (where the buffer lock is no longer held) means that bp->b_flags can be updated from an unsafe context. While a user-level reproducer is currently not available, some intrusive hacks to run racing buffer lookups/ioacct/releases from multiple threads were used to successfully manufacture this problem. Existing callers do not expect to acquire the buffer lock from xfs_buf_rele(). Therefore, we can not safely update ->b_flags from this context. It turns out that we already have separate buffer state bits and associated serialization for dealing with buffer LRU state in the form of ->b_state and ->b_lock. Therefore, replace the _XBF_IN_FLIGHT flag with a ->b_state variant, update the I/O accounting wrappers appropriately and make sure they are used with the correct locking. This ensures that buffer in-flight state can be modified at buffer release time without racing with modifications from a buffer lock holder. Fixes: `9c7504aa72` ("xfs: track and serialize in-flight async buffers against unmount") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Tested-by: Libor Pechacek <lpechacek@suse.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. 
Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Jan Kara	34836549fb	xfs: Fix missed holes in SEEK_HOLE implementation commit `5375023ae1` upstream. XFS SEEK_HOLE implementation could miss a hole in an unwritten extent as can be seen by the following command: xfs_io -c "falloc 0 256k" -c "pwrite 0 56k" -c "pwrite 128k 8k" -c "seek -h 0" file wrote 57344/57344 bytes at offset 0 56 KiB, 14 ops; 0.0000 sec (49.312 MiB/sec and 12623.9856 ops/sec) wrote 8192/8192 bytes at offset 131072 8 KiB, 2 ops; 0.0000 sec (70.383 MiB/sec and 18018.0180 ops/sec) Whence Result HOLE 139264 Where we can see that hole at offset 56k was just ignored by SEEK_HOLE implementation. The bug is in xfs_find_get_desired_pgoff() which does not properly detect the case when pages are not contiguous. Fix the problem by properly detecting when found page has larger offset than expected. Fixes: `d126d43f63` Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Patrik Jakobsson	4ec50822a5	drm/gma500/psb: Actually use VBT mode when it is found commit `82bc9a42cf` upstream. With LVDS we were incorrectly picking the pre-programmed mode instead of the preferred mode provided by VBT. Make sure we pick the VBT mode if one is provided. It is likely that the mode read-out code is still wrong but this patch fixes the immediate problem on most machines. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78562 Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170418114332.12183-1-patrik.r.jakobsson@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Thomas Gleixner	74b4db844f	slub/memcg: cure the brainless abuse of sysfs attributes commit `478fe3037b` upstream. memcg_propagate_slab_attrs() abuses the sysfs attribute file functions to propagate settings from the root kmem_cache to a newly created kmem_cache. It does that with: attr->show(root, buf); attr->store(new, buf, strlen(buf)); Aside from being lazy and absurd hackery, this is broken because it does not check the return value of the show() function. Some of the show() functions return 0 w/o touching the buffer. That means in such a case the store function is called with the stale content of the previous show(). That causes nonsense like invoking kmem_cache_shrink() on a newly created kmem_cache. In the worst case it would cause handing in an uninitialized buffer. This should be rewritten properly by adding a propagate() callback to those slub_attributes which must be propagated and avoid that insane conversion to and from ASCII, but that's too large for a hot fix. Check at least the return value of the show() function, so calling store() with stale content is prevented. Steven said: "It can cause a deadlock with get_online_cpus() that has been uncovered by recent cpu hotplug and lockdep changes that Thomas and Peter have been doing. 
Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(cpu_hotplug.lock); lock(slab_mutex); lock(cpu_hotplug.lock); lock(slab_mutex); * DEADLOCK *" Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1705201244540.2255@nanos Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
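The check described in the slub/memcg entry above (only call store() when show() actually produced something) can be sketched outside the kernel. Everything here is hypothetical scaffolding for illustration: `struct cache`, the callback typedefs and the sample callbacks are not the slub attribute API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical object with one string setting and a store counter. */
struct cache {
    char value[32];
    int stores;
};

typedef long (*show_fn)(const struct cache *c, char *buf);
typedef long (*store_fn)(struct cache *c, const char *buf, size_t len);

/* Sample show(): writes the setting into buf and returns its length. */
static long show_value(const struct cache *c, char *buf)
{
    size_t len = strlen(c->value);
    memcpy(buf, c->value, len + 1);
    return (long)len;
}

/* Sample show() that returns 0 without touching the buffer. */
static long show_nothing(const struct cache *c, char *buf)
{
    (void)c;
    (void)buf;
    return 0;
}

/* Sample store(): copies at most sizeof(value) - 1 bytes. */
static long store_value(struct cache *c, const char *buf, size_t len)
{
    if (len >= sizeof(c->value))
        len = sizeof(c->value) - 1;
    memcpy(c->value, buf, len);
    c->value[len] = '\0';
    c->stores++;
    return (long)len;
}

/* Propagate a setting from root to a new cache, skipping store() when
 * show() reported nothing, so stale buffer content is never applied. */
static void propagate(const struct cache *root, struct cache *fresh,
                      show_fn show, store_fn store, char *buf)
{
    long len;

    assert(root != NULL && fresh != NULL && buf != NULL);
    len = show(root, buf);
    if (len > 0)
        store(fresh, buf, (size_t)len);
}
```

Using the length returned by show() rather than strlen() of the buffer also avoids depending on whatever an earlier call left behind.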
Andrea Arcangeli	6caa2db34d	ksm: prevent crash after write_protect_page fails commit `a7306c3436` upstream. "err" needs to be left set to -EFAULT if split_huge_page succeeds. Otherwise if "err" gets clobbered with zero and write_protect_page fails, try_to_merge_one_page() will succeed instead of returning -EFAULT and then try_to_merge_with_ksm_page() will continue thinking kpage is a PageKsm when in fact it's still an anonymous page. Eventually it'll crash in page_add_anon_rmap. This has been reproduced on a Fedora25 kernel but I can reproduce with upstream too. The bug was introduced in v4.5 by commit `f765f54059` ("ksm: prepare to new THP semantics"). page:fffff67546ce1cc0 count:4 mapcount:2 mapping:ffffa094551e36e1 index:0x7f0f46673 flags: 0x2ffffc0004007c(referenced|uptodate|dirty|lru|active|swapbacked) page dumped because: VM_BUG_ON_PAGE(!PageLocked(page)) page->mem_cgroup:ffffa09674bf0000 ------------[ cut here ]------------ kernel BUG at mm/rmap.c:1222! CPU: 1 PID: 76 Comm: ksmd Not tainted 4.9.3-200.fc25.x86_64 #1 RIP: do_page_add_anon_rmap+0x1c4/0x240 Call Trace: page_add_anon_rmap+0x18/0x20 try_to_merge_with_ksm_page+0x50b/0x780 ksm_scan_thread+0x1211/0x1410 ? prepare_to_wait_event+0x100/0x100 ? try_to_merge_with_ksm_page+0x780/0x780 kthread+0xd9/0xf0 ? kthread_park+0x60/0x60 ret_from_fork+0x25/0x30 Fixes: `f765f54059` ("ksm: prepare to new THP semantics") Link: http://lkml.kernel.org/r/20170513131040.21732-1-aarcange@redhat.com Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Reported-by: Federico Simoncelli <fsimonce@redhat.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:13 +02:00
Rob Landley	d6eaf7a4d6	x86/boot: Use CROSS_COMPILE prefix for readelf commit `3780578761` upstream. The boot code Makefile contains a straight 'readelf' invocation. This causes build warnings in cross compile environments, when there is no unprefixed readelf accessible via $PATH. Add the missing $(CROSS_COMPILE) prefix. [ tglx: Rewrote changelog ] Fixes: `98f7852537` ("x86/boot: Refuse to build with data relocations") Signed-off-by: Rob Landley <rob@landley.net> Acked-by: Kees Cook <keescook@chromium.org> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Paul Bolle <pebolle@tiscali.nl> Cc: "H.J. Lu" <hjl.tools@gmail.com> Link: http://lkml.kernel.org/r/ced18878-693a-9576-a024-113ef39a22c0@landley.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Mike Marciniszyn	f7e82ab3b6	RDMA/qib,hfi1: Fix MR reference count leak on write with immediate commit `1feb40067c` upstream. The handling of IB_RDMA_WRITE_ONLY_WITH_IMMEDIATE will leak a memory reference when a buffer cannot be allocated for returning the immediate data. The issue is that the rkey validation has already occurred and the RNR nak fails to release the reference that was fruitlessly gotten. The peer will then send the identical single packet request when its RNR timer pops. The fix is to release the held reference prior to the rnr nak exit. This is the only sequence that requires both rkey validation and the buffer allocation on the same packet. Tested-by: Tadeusz Struk <tadeusz.struk@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Israel Rukshin	68c98967e7	RDMA/srp: Fix NULL deref at srp_destroy_qp() commit `95c2ef50c7` upstream. If srp_init_qp() fails at srp_create_ch_ib() then ch->send_cq may be NULL. Calling directly to ib_destroy_qp() is sufficient because no work requests were posted on the created qp. Fixes: `9294000d6d` ("IB/srp: Drain the send queue before destroying a QP") Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Bart van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
Michal Hocko	dbab023245	mm: consider memblock reservations for deferred memory initialization sizing commit `864b9a393d` upstream. We have seen an early OOM killer invocation on ppc64 systems with crashkernel=4096M: kthreadd invoked oom-killer: gfp_mask=0x16040c0(GFP_KERNEL|__GFP_COMP|__GFP_NOTRACK), nodemask=7, order=0, oom_score_adj=0 kthreadd cpuset=/ mems_allowed=7 CPU: 0 PID: 2 Comm: kthreadd Not tainted 4.4.68-1.gd7fe927-default #1 Call Trace: dump_stack+0xb0/0xf0 (unreliable) dump_header+0xb0/0x258 out_of_memory+0x5f0/0x640 __alloc_pages_nodemask+0xa8c/0xc80 kmem_getpages+0x84/0x1a0 fallback_alloc+0x2a4/0x320 kmem_cache_alloc_node+0xc0/0x2e0 copy_process.isra.25+0x260/0x1b30 _do_fork+0x94/0x470 kernel_thread+0x48/0x60 kthreadd+0x264/0x330 ret_from_kernel_thread+0x5c/0xa4 Mem-Info: active_anon:0 inactive_anon:0 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 slab_reclaimable:5 slab_unreclaimable:73 mapped:0 shmem:0 pagetables:0 bounce:0 free:0 free_pcp:0 free_cma:0 Node 7 DMA free:0kB min:0kB low:0kB high:0kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:52428800kB managed:110016kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:320kB slab_unreclaimable:4672kB kernel_stack:1152kB pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
yes lowmem_reserve[]: 0 0 0 0 Node 7 DMA: 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB 0*8192kB 0*16384kB = 0kB 0 total pagecache pages 0 pages in swap cache Swap cache stats: add 0, delete 0, find 0/0 Free swap = 0kB Total swap = 0kB 819200 pages RAM 0 pages HighMem/MovableOnly 817481 pages reserved 0 pages cma reserved 0 pages hwpoisoned The reason is that the managed memory is too low (only 110MB) while the rest of the 50GB is still waiting for the deferred initialization to be done. update_defer_init estimates the initial memory to initialize to 2GB at least but it doesn't consider any memory allocated in that range. In this particular case we've had Reserving 4096MB of memory at 128MB for crashkernel (System RAM: 51200MB) so the low 2GB is mostly depleted. Fix this by considering memblock allocations in the initial static initialization estimation. Move the max_initialise to reset_deferred_meminit and implement a simple memblock_reserved_memory helper which iterates all reserved blocks and sums the size of all that start below the given address. The cumulative size is then added on top of the initial estimation. This is still not ideal because reset_deferred_meminit doesn't consider holes and so reservation might be above the initial estimation which we ignore but let's make the logic simpler until we really need to handle more complicated cases. Fixes: `3a80a7fa79` ("mm: meminit: initialise a subset of struct pages if CONFIG_DEFERRED_STRUCT_PAGE_INIT is set") Link: http://lkml.kernel.org/r/20170531104010.GI27783@dhcp22.suse.cz Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
James Morse	1cc8926344	mm/hugetlb: report -EHWPOISON not -EFAULT when FOLL_HWPOISON is specified commit `9a291a7c94` upstream. KVM uses get_user_pages() to resolve its stage2 faults. KVM sets the FOLL_HWPOISON flag causing faultin_page() to return -EHWPOISON when it finds a VM_FAULT_HWPOISON. KVM handles these hwpoison pages as a special case. (check_user_page_hwpoison()) When huge pages are involved, this doesn't work so well. get_user_pages() calls follow_hugetlb_page(), which stops early if it receives VM_FAULT_HWPOISON from hugetlb_fault(), eventually returning -EFAULT to the caller. The step to map this to -EHWPOISON based on the FOLL_ flags is missing. The hwpoison special case is skipped, and -EFAULT is returned to user-space, causing Qemu or kvmtool to exit. Instead, move this VM_FAULT_ to errno mapping code into a header file and use it from faultin_page() and follow_hugetlb_page(). With this, KVM works as expected. This isn't a problem for arm64 today as we haven't enabled MEMORY_FAILURE, but I can't see any reason this doesn't happen on x86 too, so I think this should be a fix. This doesn't apply earlier than stable's v4.11.1 due to all sorts of cleanup. [james.morse@arm.com: add vm_fault_to_errno() call to faultin_page()] suggested. Link: http://lkml.kernel.org/r/20170525171035.16359-1-james.morse@arm.com [akpm@linux-foundation.org: coding-style fixes] Link: http://lkml.kernel.org/r/20170524160900.28786-1-james.morse@arm.com Signed-off-by: James Morse <james.morse@arm.com> Acked-by: Punit Agrawal <punit.agrawal@arm.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
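The fault-to-errno mapping that the commit above moves into a shared helper can be illustrated with a small stand-alone model. This is a sketch under stated assumptions: the `SKETCH_*` flag values and the function name `sketch_fault_to_errno()` are invented for illustration and do not match the kernel's real definitions.

```c
#include <errno.h>

/* EHWPOISON is Linux-specific; fall back to its Linux value elsewhere. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/* Illustrative flag values only; the kernel's real values differ. */
#define SKETCH_FAULT_OOM      0x1
#define SKETCH_FAULT_SIGBUS   0x2
#define SKETCH_FAULT_HWPOISON 0x4
#define SKETCH_FOLL_HWPOISON  0x100

/*
 * Map a fault result to a negative errno. A hwpoison fault becomes
 * -EHWPOISON only when the caller asked for it via the FOLL-style flag;
 * otherwise it degrades to -EFAULT.
 */
int sketch_fault_to_errno(int fault, int foll_flags)
{
    if (fault & SKETCH_FAULT_OOM)
        return -ENOMEM;
    if (fault & SKETCH_FAULT_HWPOISON)
        return (foll_flags & SKETCH_FOLL_HWPOISON) ? -EHWPOISON : -EFAULT;
    if (fault & SKETCH_FAULT_SIGBUS)
        return -EFAULT;
    return 0;
}
```

Sharing one such function between the normal and the huge-page paths is what keeps the two from returning different errors for the same fault.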
Yisheng Xie	f814bf4655	mlock: fix mlock count can not decrease in race condition commit `70feee0e1e` upstream. Kefeng reported that when running the follow test, the mlock count in meminfo will increase permanently: [1] testcase linux:~ # cat test_mlockal grep Mlocked /proc/meminfo for j in `seq 0 10` do for i in `seq 4 15` do ./p_mlockall >> log & done sleep 0.2 done # wait some time to let mlock counter decrease and 5s may not enough sleep 5 grep Mlocked /proc/meminfo linux:~ # cat p_mlockall.c #include <sys/mman.h> #include <stdlib.h> #include <stdio.h> #define SPACE_LEN 4096 int main(int argc, char ** argv) { int ret; void *adr = malloc(SPACE_LEN); if (!adr) return -1; ret = mlockall(MCL_CURRENT \| MCL_FUTURE); printf("mlcokall ret = %d\n", ret); ret = munlockall(); printf("munlcokall ret = %d\n", ret); free(adr); return 0; } In __munlock_pagevec() we should decrement NR_MLOCK for each page where we clear the PageMlocked flag. Commit `1ebb7cc6a5` ("mm: munlock: batch NR_MLOCK zone state updates") has introduced a bug where we don't decrement NR_MLOCK for pages where we clear the flag, but fail to isolate them from the lru list (e.g. when the pages are on some other cpu's percpu pagevec). Since PageMlocked stays cleared, the NR_MLOCK accounting gets permanently disrupted by this. Fix it by counting the number of page whose PageMlock flag is cleared. 
Fixes: `1ebb7cc6a5` (" mm: munlock: batch NR_MLOCK zone state updates") Link: http://lkml.kernel.org/r/1495678405-54569-1-git-send-email-xieyisheng1@huawei.com Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com> Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com> Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Joern Engel <joern@logfs.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michel Lespinasse <walken@google.com> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Xishi Qiu <qiuxishi@huawei.com> Cc: zhongjiang <zhongjiang@huawei.com> Cc: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:12 +02:00
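The accounting rule from the commit above (decrement the counter once per page whose flag was cleared, whether or not the page could be isolated) can be shown with a toy model. `struct page_model` and `munlock_batch()` are hypothetical names; nothing here is kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

struct page_model {
    bool mlocked;    /* models the PageMlocked flag */
    bool isolatable; /* false models a page sitting on another CPU's pagevec */
};

/*
 * Clear the mlocked flag on every page in the batch and return how many
 * flags were cleared. The count is taken at the moment the flag is
 * cleared, so it cannot drift from the flag state when isolation fails.
 * *isolated receives the number of cleared pages that could be isolated.
 */
size_t munlock_batch(struct page_model *pages, size_t count, size_t *isolated)
{
    size_t cleared = 0;

    *isolated = 0;
    for (size_t i = 0; i < count; i++) {
        if (!pages[i].mlocked)
            continue;
        pages[i].mlocked = false;
        cleared++;
        if (pages[i].isolatable)
            (*isolated)++;
    }
    return cleared;
}
```

A caller that subtracted `*isolated` from the global counter would reproduce the leak described in the report; subtracting the return value keeps the counter consistent.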
Punit Agrawal	a0189db30d	mm/migrate: fix refcount handling when !hugepage_migration_supported() commit `30809f559a` upstream. On failing to migrate a page, soft_offline_huge_page() performs the necessary update to the hugepage ref-count. But when !hugepage_migration_supported() , unmap_and_move_hugepage() also decrements the page ref-count for the hugepage. The combined behaviour leaves the ref-count in an inconsistent state. This leads to soft lockups when running the overcommitted hugepage test from mce-tests suite. Soft offlining pfn 0x83ed600 at process virtual address 0x400000000000 soft offline: 0x83ed600: migration failed 1, type 1fffc00000008008 (uptodate\|head) INFO: rcu_preempt detected stalls on CPUs/tasks: Tasks blocked on level-0 rcu_node (CPUs 0-7): P2715 (detected by 7, t=5254 jiffies, g=963, c=962, q=321) thugetlb_overco R running task 0 2715 2685 0x00000008 Call trace: dump_backtrace+0x0/0x268 show_stack+0x24/0x30 sched_show_task+0x134/0x180 rcu_print_detail_task_stall_rnp+0x54/0x7c rcu_check_callbacks+0xa74/0xb08 update_process_times+0x34/0x60 tick_sched_handle.isra.7+0x38/0x70 tick_sched_timer+0x4c/0x98 __hrtimer_run_queues+0xc0/0x300 hrtimer_interrupt+0xac/0x228 arch_timer_handler_phys+0x3c/0x50 handle_percpu_devid_irq+0x8c/0x290 generic_handle_irq+0x34/0x50 __handle_domain_irq+0x68/0xc0 gic_handle_irq+0x5c/0xb0 Address this by changing the putback_active_hugepage() in soft_offline_huge_page() to putback_movable_pages(). This only triggers on systems that enable memory failure handling (ARCH_SUPPORTS_MEMORY_FAILURE) but not hugepage migration (!ARCH_ENABLE_HUGEPAGE_MIGRATION). I imagine this wasn't triggered as there aren't many systems running this configuration. 
[akpm@linux-foundation.org: remove dead comment, per Naoya] Link: http://lkml.kernel.org/r/20170525135146.32011-1-punit.agrawal@arm.com Reported-by: Manoj Iyer <manoj.iyer@canonical.com> Tested-by: Manoj Iyer <manoj.iyer@canonical.com> Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Ross Zwisler	bdaac1fe71	dax: fix race between colliding PMD & PTE entries commit `e2093926a0` upstream. We currently have two related PMD vs PTE races in the DAX code. These can both be easily triggered by having two threads reading and writing simultaneously to the same private mapping, with the key being that private mapping reads can be handled with PMDs but private mapping writes are always handled with PTEs so that we can COW. Here is the first race: CPU 0 CPU 1 (private mapping write) __handle_mm_fault() create_huge_pmd() - FALLBACK handle_pte_fault() passes check for pmd_devmap() (private mapping read) __handle_mm_fault() create_huge_pmd() dax_iomap_pmd_fault() inserts PMD dax_iomap_pte_fault() does a PTE fault, but we already have a DAX PMD installed in our page tables at this spot. Here's the second race: CPU 0 CPU 1 (private mapping read) __handle_mm_fault() passes check for pmd_none() create_huge_pmd() dax_iomap_pmd_fault() inserts PMD (private mapping write) __handle_mm_fault() create_huge_pmd() - FALLBACK (private mapping read) __handle_mm_fault() passes check for pmd_none() create_huge_pmd() handle_pte_fault() dax_iomap_pte_fault() inserts PTE dax_iomap_pmd_fault() inserts PMD, but we already have a PTE at this spot. The core of the issue is that while there is isolation between faults to the same range in the DAX fault handlers via our DAX entry locking, there is no isolation between faults in the code in mm/memory.c. This means for instance that this code in __handle_mm_fault() can run: if (pmd_none(vmf.pmd) && transparent_hugepage_enabled(vma)) { ret = create_huge_pmd(&vmf); But by the time we actually get to run the fault handler called by create_huge_pmd(), the PMD is no longer pmd_none() because a racing PTE fault has installed a normal PMD here as a parent. This is the cause of the 2nd race. 
The first race is similar - there is the following check in handle_pte_fault(): } else { /* See comment in pte_alloc_one_map() */ if (pmd_devmap(vmf->pmd) \|\| pmd_trans_unstable(vmf->pmd)) return 0; So if a pmd_devmap() PMD (a DAX PMD) has been installed at vmf->pmd, we will bail and retry the fault. This is correct, but there is nothing preventing the PMD from being installed after this check but before we actually get to the DAX PTE fault handlers. In my testing these races result in the following types of errors: BUG: Bad rss-counter state mm:ffff8800a817d280 idx:1 val:1 BUG: non-zero nr_ptes on freeing mm: 15 Fix this issue by having the DAX fault handlers verify that it is safe to continue their fault after they have taken an entry lock to block other racing faults. [ross.zwisler@linux.intel.com: improve fix for colliding PMD & PTE entries] Link: http://lkml.kernel.org/r/20170526195932.32178-1-ross.zwisler@linux.intel.com Link: http://lkml.kernel.org/r/20170522215749.23516-2-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reported-by: Pawel Lebioda <pawel.lebioda@intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Ross Zwisler	b16e6ab5ad	mm: avoid spurious 'bad pmd' warning messages commit `d0f0931de9` upstream. When the pmd_devmap() checks were added by `5c7fb56e5e` ("mm, dax: dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge pages, they were all added to the end of if() statements after existing pmd_trans_huge() checks. So, things like: - if (pmd_trans_huge(pmd)) + if (pmd_trans_huge(pmd) \|\| pmd_devmap(pmd)) When further checks were added after pmd_trans_unstable() checks by commit `7267ec008b` ("mm: postpone page table allocation until we have page to map") they were also added at the end of the conditional: + if (pmd_trans_unstable(fe->pmd) \|\| pmd_devmap(fe->pmd)) This ordering is fine for pmd_trans_huge(), but doesn't work for pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd() check inside of pmd_none_or_trans_huge_or_clear_bad() (called by pmd_trans_unstable()), which prints out a warning and returns 1. So, we do end up doing the right thing, but only after spamming dmesg with suspicious looking messages: mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5) Reorder these checks in a helper so that pmd_devmap() is checked first, avoiding the error messages, and add a comment explaining why the ordering is important. Fixes: commit `7267ec008b` ("mm: postpone page table allocation until we have page to map") Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Pawel Lebioda <pawel.lebioda@intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: "Kirill A . 
Shutemov" <kirill.shutemov@linux.intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Xiong Zhou <xzhou@redhat.com> Cc: Eryu Guan <eguan@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
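The ordering argument in the commit above relies on `||` short-circuiting: if the quiet check runs first and succeeds, the noisy check is never evaluated. A minimal model of that idea follows; the `model_*` names, `struct pmd_model`, and the warning counter are all invented for illustration and only mimic the behaviour the commit message describes.

```c
#include <stdbool.h>

struct pmd_model {
    bool devmap; /* models a DAX huge-page entry */
    bool huge;   /* models a transparent huge page entry */
};

/* Number of "bad pmd" warnings the noisy check has emitted so far. */
int bad_pmd_warnings;

/* Quiet check: has no side effects. */
bool model_devmap(const struct pmd_model *pmd)
{
    return pmd->devmap;
}

/*
 * Noisy check: in this model a devmap entry is treated as unrecognised,
 * so it bumps the warning counter before reporting "unstable".
 */
bool model_trans_unstable(const struct pmd_model *pmd)
{
    if (pmd->devmap) {
        bad_pmd_warnings++;
        return true;
    }
    return pmd->huge;
}

/* devmap first: the noisy check never sees a devmap entry. */
bool model_devmap_trans_unstable(const struct pmd_model *pmd)
{
    return model_devmap(pmd) || model_trans_unstable(pmd);
}
```

Both orderings return the same boolean; only the side effect differs, which is why the original code did the right thing while still filling the log.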
Tetsuo Handa	de12c73fa2	mm/page_alloc.c: make sure OOM victim can try allocations with no watermarks once commit `c288983ddd` upstream. Roman Gushchin has reported that the OOM killer can trivially selects next OOM victim when a thread doing memory allocation from page fault path was selected as first OOM victim. allocate invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE\|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0 allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: oom_kill_process+0x219/0x3e0 out_of_memory+0x11d/0x480 __alloc_pages_slowpath+0xc84/0xd40 __alloc_pages_nodemask+0x245/0x260 alloc_pages_vma+0xa2/0x270 __handle_mm_fault+0xca9/0x10c0 handle_mm_fault+0xf3/0x210 __do_page_fault+0x240/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... Out of memory: Kill process 492 (allocate) score 899 or sacrifice child Killed process 492 (allocate) total-vm:2052368kB, anon-rss:1894576kB, file-rss:4kB, shmem-rss:0kB allocate: page allocation failure: order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE\|__GFP_ZERO), nodemask=(null) allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: __alloc_pages_slowpath+0xd32/0xd40 __alloc_pages_nodemask+0x245/0x260 alloc_pages_vma+0xa2/0x270 __handle_mm_fault+0xca9/0x10c0 handle_mm_fault+0xf3/0x210 __do_page_fault+0x240/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... oom_reaper: reaped process 492 (allocate), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB ... 
allocate invoked oom-killer: gfp_mask=0x0(), nodemask=(null), order=0, oom_score_adj=0 allocate cpuset=/ mems_allowed=0 CPU: 1 PID: 492 Comm: allocate Not tainted 4.12.0-rc1-mm1+ #181 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014 Call Trace: oom_kill_process+0x219/0x3e0 out_of_memory+0x11d/0x480 pagefault_out_of_memory+0x68/0x80 mm_fault_error+0x8f/0x190 ? handle_mm_fault+0xf3/0x210 __do_page_fault+0x4b2/0x4e0 trace_do_page_fault+0x37/0xe0 do_async_page_fault+0x19/0x70 async_page_fault+0x28/0x30 ... Out of memory: Kill process 233 (firewalld) score 10 or sacrifice child Killed process 233 (firewalld) total-vm:246076kB, anon-rss:20956kB, file-rss:0kB, shmem-rss:0kB There is a race window in which the OOM reaper completes reclaiming the first victim's memory while nothing but mutex_trylock() prevents the first victim from calling out_of_memory() from pagefault_out_of_memory() after memory allocation for page fault path failed due to being selected as an OOM victim. This is a side effect of commit `9a67f6488e` ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath") because that commit silently changed the behavior from /* Avoid allocations with no watermarks from looping endlessly */ to /* Give up allocations without trying memory reserves if selected as an OOM victim */ in __alloc_pages_slowpath() by moving the location to check TIF_MEMDIE flag. I have noticed this change but I didn't post a patch because I thought it is an acceptable change other than noise by warn_alloc() because !__GFP_NOFAIL allocations are allowed to fail. But we overlooked that failing memory allocation from page fault path makes a difference due to the race window explained above. 
While it might be possible to add a check to pagefault_out_of_memory() that prevents the first victim from calling out_of_memory() or remove out_of_memory() from pagefault_out_of_memory(), changing pagefault_out_of_memory() does not suppress noise by warn_alloc() when allocating thread was selected as an OOM victim. There is little point with printing similar backtraces and memory information from both out_of_memory() and warn_alloc(). Instead, if we guarantee that current thread can try allocations with no watermarks once when current thread looping inside __alloc_pages_slowpath() was selected as an OOM victim, we can follow "who can use memory reserves" rules and suppress noise by warn_alloc() and prevent memory allocations from page fault path from calling pagefault_out_of_memory(). If we take the comment literally, this patch would do - if (test_thread_flag(TIF_MEMDIE)) - goto nopage; + if (alloc_flags == ALLOC_NO_WATERMARKS \|\| (gfp_mask & __GFP_NOMEMALLOC)) + goto nopage; because gfp_pfmemalloc_allowed() returns false if __GFP_NOMEMALLOC is given. But if I recall correctly (I couldn't find the message), the condition is meant to apply to only OOM victims despite the comment. Therefore, this patch preserves TIF_MEMDIE check. Fixes: `9a67f6488e` ("mm: consolidate GFP_NOFAIL checks in the allocator slowpath") Link: http://lkml.kernel.org/r/201705192112.IAF69238.OQOHSJLFOFFMtV@I-love.SAKURA.ne.jp Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-by: Roman Gushchin <guro@fb.com> Tested-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Takashi Iwai	af03bb0cab	ALSA: usb: Fix a typo in Tascam US-16x08 mixer element commit `617163fc25` upstream. A mixer element created in a quirk for Tascam US-16x08 contains a typo: it should be "EQ MidLow Q" instead of "EQ MidQLow Q". Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875 Fixes: `d2bb390a20` ("ALSA: usb-audio: Tascam US-16x08 DSP mixer quirk") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Takashi Iwai	0be9a9a422	Revert "ALSA: usb-audio: purge needless variable length array" commit `64188cfbe5` upstream. This reverts commit `89b593c30e` ("ALSA: usb-audio: purge needless variable length array"). The patch turned out to cause a severe regression, triggering an Oops at snd_usb_ctl_msg(). It was overlooked that snd_usb_ctl_msg() writes back the response to the given buffer, while the patch changed it to a read-only const buffer. (One should always double-check when an extra pointer cast is present...) As a simple fix, just revert the affected commit. It was merely a cleanup. Although it brings the VLA back, it's clearer as a fix. We'll address the VLA later in another patch. Fixes: `89b593c30e` ("ALSA: usb-audio: purge needless variable length array") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195875 Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Alexander Tsoy	ffb97b001b	ALSA: hda - apply STAC_9200_DELL_M22 quirk for Dell Latitude D430 commit `1fc2e41f7a` upstream. This model is actually called 92XXM2-8 in Windows driver. But since pin configs for M22 and M28 are identical, just reuse M22 quirk. Fixes external microphone (tested) and probably docking station ports (not tested). Signed-off-by: Alexander Tsoy <alexander@tsoy.me> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:11 +02:00
Takashi Iwai	0c4afdc6d8	ALSA: hda - No loopback on ALC299 codec commit `fa16b69f12` upstream. ALC299 has no loopback mixer, but the driver still tries to add a beep control over the mixer NID which leads to the error at accessing it. This patch fixes it by properly declaring mixer_nid=0 for this codec. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195775 Fixes: `28f1f9b26c` ("ALSA: hda/realtek - Add new codec ID ALC299") Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Nicolas Iooss	d5bc54d0a3	pcmcia: remove left-over %Z format commit `ff5a20169b` upstream. Commit `5b5e0928f7` ("lib/vsprintf.c: remove %Z support") removed some usages of format %Z but forgot "%.2Zx". This makes clang 4.0 reports a -Wformat-extra-args warning because it does not know about %Z. Replace %Z with %z. Link: http://lkml.kernel.org/r/20170520090946.22562-1-nicolas.iooss_linux@m4x.org Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Cc: Harald Welte <laforge@gnumonks.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
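For reference, the standard C99 length modifier for `size_t` is lowercase `z`, which is what the commit above switches to. The sketch below shows the `%.2zx` form in plain user-space C with `snprintf`; the function name `format_size_hex()` is made up for this example and is not taken from the pcmcia driver.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t as lowercase hex with at least two digits.
 * "z" is the C99 length modifier for size_t; ".2" is the minimum
 * number of digits, padded with leading zeros.
 * Returns the number of characters that were (or would be) written.
 */
int format_size_hex(char *buf, size_t buflen, size_t value)
{
    return snprintf(buf, buflen, "%.2zx", value);
}
```

Because `%z` is part of the C standard, compilers such as clang can type-check it, whereas the non-standard `%Z` is what produced the -Wformat-extra-args warning mentioned in the commit.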
Lyude	2609770993	drm/radeon: Unbreak HPD handling for r600+ commit `3d18e33735` upstream. We end up reading the interrupt register for HPD5, and then writing it to HPD6 which on systems without anything using HPD5 results in permanently disabling hotplug on one of the display outputs after the first time we acknowledge a hotplug interrupt from the GPU. This code is really bad. But for now, let's just fix this. I will hopefully have a large patch series to refactor all of this soon. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Lyude <lyude@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Alex Deucher	6b693bbf9c	drm/radeon/ci: disable mclk switching for high refresh rates (v2) commit `58d7e3e427` upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. v2: fix logic inversion (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Alex Deucher	d6ba1a4407	drm/amd/powerplay/smu7: disable mclk switching for high refresh rates commit `2275a3a2fe` upstream. Even if the vblank period would allow it, it still seems to be problematic on some cards. bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Alex Deucher	662dbfcc66	drm/amd/powerplay/smu7: add vblank check for mclk switching (v2) commit `09be4a5219` upstream. Check to make sure the vblank period is long enough to support mclk switching. v2: drop needless initial assignment (Nils) bug: https://bugs.freedesktop.org/show_bug.cgi?id=96868 Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Ming Lei	a8aa8a0c10	nvme: avoid to use blk_mq_abort_requeue_list() commit `986f75c876` upstream. NVMe may add request into requeue list simply and not kick off the requeue if hw queues are stopped. Then blk_mq_abort_requeue_list() is called in both nvme_kill_queues() and nvme_ns_remove() for dealing with this issue. Unfortunately blk_mq_abort_requeue_list() is absolutely a race maker, for example, one request may be requeued during the aborting. So this patch just calls blk_mq_kick_requeue_list() in nvme_kill_queues() to handle this issue like what nvme_start_queues() does. Now all requests in requeue list when queues are stopped will be handled by blk_mq_kick_requeue_list() when queues are restarted, either in nvme_start_queues() or in nvme_kill_queues(). Reported-by: Zhang Yi <yizhan@redhat.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:10 +02:00
Ming Lei	20c03f455c	nvme: use blk_mq_start_hw_queues() in nvme_kill_queues() commit `806f026f9b` upstream. Inside nvme_kill_queues(), we have to start hw queues for draining requests in sw queues, .dispatch list and requeue list, so use blk_mq_start_hw_queues() instead of blk_mq_start_stopped_hw_queues() which only runs queues if they are stopped, but the queues may have been started already, for example nvme_start_queues() is called in reset work function. blk_mq_start_hw_queues() runs hw queues in current context, instead of running asynchronously like before. Given nvme_kill_queues() is run from either remove context or reset worker context, both are fine to run hw queue directly. And the namespaces_mutex isn't a problem either because nvme_start_freeze() runs hw queue in this way already. Reported-by: Zhang Yi <yizhan@redhat.com> Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Marta Rybczynska	0fe9c55195	nvme-rdma: support devices with queue size < 32 commit `0544f5494a` upstream. In the case of small NVMe-oF queue size (<32) we may enter a deadlock caused by the fact that the IB completions aren't sent waiting for 32 and the send queue will fill up. The error is seen as (using mlx5): [ 2048.693355] mlx5_0:mlx5_ib_post_send:3765:(pid 7273): [ 2048.693360] nvme nvme1: nvme_rdma_post_send failed with error code -12 This patch changes the way the signaling is done so that it depends on the queue depth now. The magic define has been removed completely. Signed-off-by: Marta Rybczynska <marta.rybczynska@kalray.eu> Signed-off-by: Samuel Jones <sjones@kalray.eu> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
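The commit above says only that completion signaling now depends on the queue depth instead of a fixed constant. One way such a policy could look is sketched below. The specific rule used here (signal every `queue_size / 2` sends, and every send for very small queues) is an assumption made for illustration, as are the names `struct send_queue_model` and `should_signal()`.

```c
struct send_queue_model {
    int queue_size; /* negotiated queue depth, at least 1 */
    int sig_count;  /* number of sends posted so far */
};

/*
 * Return nonzero when the next send should request a signaled completion.
 * Assumed policy: signal every queue_size / 2 sends, and every send when
 * the queue is so small that queue_size / 2 would be zero. This keeps the
 * interval between signaled completions below the queue depth, so the
 * send queue can always be drained before it fills up.
 */
int should_signal(struct send_queue_model *q)
{
    int limit = q->queue_size / 2;

    if (limit < 1)
        limit = 1;
    return (++q->sig_count % limit) == 0;
}
```

With a fixed interval of 32, a queue of depth 8 would fill before any completion was requested; tying the interval to the depth avoids that regardless of how small the queue is.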
Jason Gerecke	f88d3d6e4f	HID: wacom: Have wacom_tpc_irq guard against possible NULL dereference commit `2ac97f0f66` upstream. The following Smatch complaint was generated in response to commit `2a6cdbd` ("HID: wacom: Introduce new 'touch_input' device"): drivers/hid/wacom_wac.c:1586 wacom_tpc_irq() error: we previously assumed 'wacom->touch_input' could be null (see line 1577) The 'touch_input' and 'pen_input' variables point to the 'struct input_dev' used for relaying touch and pen events to userspace, respectively. If a device does not have a touch interface or pen interface, the associated input variable is NULL. The 'wacom_tpc_irq()' function is responsible for forwarding input reports to a more-specific IRQ handler function. An unknown report could theoretically be mistaken as e.g. a touch report on a device which does not have a touch interface. This can be prevented by only calling the pen/touch functions when the pen/touch pointers are valid. Fixes: `2a6cdbd` ("HID: wacom: Introduce new 'touch_input' device") Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com> Reviewed-by: Ping Cheng <ping.cheng@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
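The guard described in the commit above (only forward a report when the matching input device pointer is valid) follows a common dispatch pattern, sketched here in stand-alone C. All types and names (`struct tablet_model`, `dispatch_report()`, and so on) are hypothetical and are not taken from the wacom driver.

```c
#include <stddef.h>

struct input_model {
    int events; /* number of reports delivered to this device */
};

struct tablet_model {
    struct input_model *pen_input;   /* NULL if there is no pen interface */
    struct input_model *touch_input; /* NULL if there is no touch interface */
};

enum report_kind { REPORT_PEN, REPORT_TOUCH };

/*
 * Forward a report to the matching input device.
 * Returns 1 if the report was delivered, or 0 if it was dropped because
 * the device has no such interface (the pointer is NULL).
 */
int dispatch_report(struct tablet_model *tablet, enum report_kind kind)
{
    struct input_model *dev =
        (kind == REPORT_PEN) ? tablet->pen_input : tablet->touch_input;

    if (dev == NULL)
        return 0;
    dev->events++;
    return 1;
}
```

Dropping the report is the conservative choice in this sketch: a misclassified report on a pen-only device is ignored instead of dereferencing a NULL touch pointer.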
Bryant G. Ly	8d975ebd0a	ibmvscsis: Fix the incorrect req_lim_delta commit `75dbf2d36f` upstream. The current code is not correctly calculating the req_lim_delta. We want to make sure vscsi->credit is always incremented when we do not send a response for the scsi op. Thus for the case where there is a successfully aborted task we need to make sure the vscsi->credit is incremented. v2 - Moves the original location of the vscsi->credit increment to a better spot. Since if we increment credit, the next command we send back will have increased req_lim_delta. But we probably shouldn't be doing that until the aborted cmd is actually released. Otherwise the client will think that it can send a new command, and we could find ourselves short of command elements. Not likely, but could happen. This patch depends on both: commit `25e7853126` ("ibmvscsis: Do not send aborted task response") commit `98883f1b54` ("ibmvscsis: Clear left-over abort_cmd pointers") Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Bryant G. Ly	e920be8367	ibmvscsis: Clear left-over abort_cmd pointers commit `98883f1b54` upstream. With the addition of ibmvscsis->abort_cmd pointer within commit `25e7853126` ("ibmvscsis: Do not send aborted task response"), make sure to explicitly NULL these pointers when clearing DELAY_SEND flag. Do this for two cases, when getting the new ibmvscsis descriptor in ibmvscsis_get_free_cmd() and before posting the response completion in ibmvscsis_send_messages(). Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Reviewed-by: Michael Cyr <mikecyr@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Artem Savkov	1fb66c6aad	scsi: scsi_dh_rdac: Use ctlr directly in rdac_failover_get() commit `0648a07c9b` upstream. rdac_failover_get references struct rdac_controller as ctlr->ms_sdev->handler_data->ctlr for no apparent reason. Besides being inefficient this also introduces a null-pointer dereference as send_mode_select() sets ctlr->ms_sdev to NULL before calling rdac_failover_get(): [ 18.432550] device-mapper: multipath service-time: version 0.3.0 loaded [ 18.436124] BUG: unable to handle kernel NULL pointer dereference at 0000000000000790 [ 18.436129] IP: send_mode_select+0xca/0x560 [ 18.436129] PGD 0 [ 18.436130] P4D 0 [ 18.436130] [ 18.436132] Oops: 0000 [#1] SMP [ 18.436133] Modules linked in: dm_service_time sd_mod dm_multipath amdkfd amd_iommu_v2 radeon(+) i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qla2xxx drm serio_raw scsi_transport_fc bnx2 i2c_core dm_mirror dm_region_hash dm_log dm_mod [ 18.436143] CPU: 4 PID: 443 Comm: kworker/u16:2 Not tainted 4.12.0-rc1.1.el7.test.x86_64 #1 [ 18.436144] Hardware name: IBM BladeCenter LS22 -[79013SG]-/Server Blade, BIOS -[L8E164AUS-1.07]- 05/25/2011 [ 18.436145] Workqueue: kmpath_rdacd send_mode_select [ 18.436146] task: ffff880225116a40 task.stack: ffffc90002bd8000 [ 18.436148] RIP: 0010:send_mode_select+0xca/0x560 [ 18.436148] RSP: 0018:ffffc90002bdbda8 EFLAGS: 00010246 [ 18.436149] RAX: 0000000000000000 RBX: ffffc90002bdbe08 RCX: ffff88017ef04a80 [ 18.436150] RDX: ffffc90002bdbe08 RSI: ffff88017ef04a80 RDI: ffff8802248e4388 [ 18.436151] RBP: ffffc90002bdbe48 R08: 0000000000000000 R09: ffffffff81c104c0 [ 18.436151] R10: 00000000000001ff R11: 000000000000035a R12: ffffc90002bdbdd8 [ 18.436152] R13: ffff8802248e4390 R14: ffff880225152800 R15: ffff8802248e4400 [ 18.436153] FS: 0000000000000000(0000) GS:ffff880227d00000(0000) knlGS:0000000000000000 [ 18.436154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 18.436154] CR2: 0000000000000790 CR3: 000000042535b000 CR4: 00000000000006e0 
[ 18.436155] Call Trace: [ 18.436159] ? rdac_activate+0x14e/0x150 [ 18.436161] ? refcount_dec_and_test+0x11/0x20 [ 18.436162] ? kobject_put+0x1c/0x50 [ 18.436165] ? scsi_dh_activate+0x6f/0xd0 [ 18.436168] process_one_work+0x149/0x360 [ 18.436170] worker_thread+0x4d/0x3c0 [ 18.436172] kthread+0x109/0x140 [ 18.436173] ? rescuer_thread+0x380/0x380 [ 18.436174] ? kthread_park+0x60/0x60 [ 18.436176] ret_from_fork+0x2c/0x40 [ 18.436177] Code: 49 c7 46 20 00 00 00 00 4c 89 ef c6 07 00 0f 1f 40 00 45 31 ed c7 45 b0 05 00 00 00 44 89 6d b4 4d 89 f5 4c 8b 75 a8 49 8b 45 20 <48> 8b b0 90 07 00 00 48 8b 56 10 8b 42 10 48 8d 7a 28 85 c0 0f [ 18.436192] RIP: send_mode_select+0xca/0x560 RSP: ffffc90002bdbda8 [ 18.436192] CR2: 0000000000000790 [ 18.436198] ---[ end trace 40f3e4dca1ffabdd ]--- [ 18.436199] Kernel panic - not syncing: Fatal exception [ 18.436222] Kernel Offset: disabled [-- MARK -- Thu May 18 11:45:00 2017] Fixes: `3278255741` scsi_dh_rdac: switch to scsi_execute_req_flags() Signed-off-by: Artem Savkov <asavkov@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Nicholas Bellinger	14ba78937e	iscsi-target: Fix initial login PDU asynchronous socket close OOPs commit `25cdda95fd` upstream. This patch fixes an OOPs originally introduced by: commit `bb048357da` Author: Nicholas Bellinger <nab@linux-iscsi.org> Date: Thu Sep 5 14:54:04 2013 -0700 iscsi-target: Add sk->sk_state_change to cleanup after TCP failure which would trigger a NULL pointer dereference when a TCP connection was closed asynchronously via iscsi_target_sk_state_change(), but only when the initial PDU processing in iscsi_target_do_login() from iscsi_np process context was blocked waiting for backend I/O to complete. To address this issue, this patch makes the following changes. First, it introduces some common helper functions used for checking socket closing state, checking login_flags, and atomically checking socket closing state + setting login_flags. Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP connection has dropped via iscsi_target_sk_state_change(), but the initial PDU processing within iscsi_target_do_login() in iscsi_np context is still running. For this case, it sets LOGIN_FLAGS_CLOSED, but doesn't invoke schedule_delayed_work(). The original NULL pointer dereference case reported by MNC is now handled by iscsi_target_do_login() doing an iscsi_target_sk_check_close() before transitioning to FFP to determine when the socket has already closed, or iscsi_target_start_negotiation() if the login needs to exchange more PDUs (eg: iscsi_target_do_login returned 0) but the socket has closed. For both of these cases, the cleanup of remaining connection resources will occur in iscsi_target_start_negotiation() from iscsi_np process context once the failure is detected. 
Finally, to handle the case where iscsi_target_sk_state_change() is called after the initial PDU processing is complete, it now invokes conn->login_work -> iscsi_target_do_login_rx() to perform cleanup once existing iscsi_target_sk_check_close() checks detect connection failure. For this case, the cleanup of remaining connection resources will occur in iscsi_target_do_login_rx() from delayed workqueue process context once the failure is detected. Reported-by: Mike Christie <mchristi@redhat.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Tested-by: Mike Christie <mchristi@redhat.com> Cc: Mike Christie <mchristi@redhat.com> Reported-by: Hannes Reinecke <hare@suse.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Varun Prakash <varun@chelsio.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:09 +02:00
Jiang Yi	c732f30887	iscsi-target: Always wait for kthread_should_stop() before kthread exit commit `5e0cf5e6c4` upstream. There are three timing problems in the kthread usages of iscsi_target_mod: - np_thread of struct iscsi_np - rx_thread and tx_thread of struct iscsi_conn In iscsit_close_connection(), it calls send_sig(SIGINT, conn->tx_thread, 1); kthread_stop(conn->tx_thread); In conn->tx_thread, which is iscsi_target_tx_thread(), when it receives SIGINT the kthread will exit without checking the return value of kthread_should_stop(). So if iscsi_target_tx_thread() exits right between send_sig(SIGINT...) and kthread_stop(...), the kthread_stop() will try to stop an already stopped kthread. This is invalid according to the documentation of kthread_stop(). (Fix -ECONNRESET logout handling in iscsi_target_tx_thread and early iscsi_target_rx_thread failure case - nab) Signed-off-by: Jiang Yi <jiangyilism@gmail.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Long Li	a168ac5b24	scsi: zero per-cmd private driver data for each MQ I/O commit `1bad6c4a57` upstream. In lower layer driver's (LLD) scsi_host_template, the driver may optionally ask SCSI to allocate its private driver memory for each command, by specifying cmd_size. This memory is allocated at the end of scsi_cmnd by SCSI. Later when SCSI queues a command, the LLD can use scsi_cmd_priv to get to its private data. Some LLD, e.g. hv_storvsc, doesn't clear its private data before use. In this case, the LLD may get to stale or uninitialized data in its private driver memory. This may result in unexpected driver and hardware behavior. Fix this problem by also zeroing the private driver memory before passing them to LLD. Signed-off-by: Long Li <longli@microsoft.com> Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com> Reviewed-by: KY Srinivasan <kys@microsoft.com> Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
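The zeroing described in the entry above can be illustrated with a small userspace sketch. It is not the kernel change itself: `struct demo_cmd`, `demo_cmd_alloc()`, `demo_cmd_priv()` and `demo_cmd_prepare()` are hypothetical names that only model a command with `cmd_size` private bytes allocated directly behind it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a SCSI command; not the kernel's struct scsi_cmnd. */
struct demo_cmd {
    int tag;
    /* cmd_size driver-private bytes follow in the same allocation */
};

/* Return the private area that sits directly after the command header. */
void *demo_cmd_priv(struct demo_cmd *cmd)
{
    return cmd + 1;
}

/* Allocate a command plus cmd_size private bytes; contents are left uninitialised. */
struct demo_cmd *demo_cmd_alloc(size_t cmd_size)
{
    return malloc(sizeof(struct demo_cmd) + cmd_size);
}

/* Prepare a command for (re)use: clear header and private area, then set the tag. */
void demo_cmd_prepare(struct demo_cmd *cmd, size_t cmd_size, int tag)
{
    assert(cmd != NULL);
    memset(cmd, 0, sizeof(*cmd) + cmd_size);
    cmd->tag = tag;
}
```

With the private area cleared on every preparation, a driver that never initialises its own data still starts from zeros instead of whatever the previous command left behind.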
Srinath Mannam	21f8aa4cfc	mmc: sdhci-iproc: suppress spurious interrupt with Multiblock read commit `f5f968f237` upstream. The stingray SDHCI hardware supports ACMD12 and automatically issues it after a multi-block transfer completes. If ACMD12 in SDHCI is disabled, spurious tx done interrupts are seen on multi-block read commands with the error message below: Got data interrupt 0x00000002 even though no data operation was in progress. This patch uses SDHCI_QUIRK_MULTIBLOCK_READ_ACMD12 to enable ACMD12 support in SDHCI hardware and suppress the spurious interrupt. Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com> Reviewed-by: Ray Jui <ray.jui@broadcom.com> Reviewed-by: Scott Branden <scott.branden@broadcom.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: `b580c52d58` ("mmc: sdhci-iproc: add IPROC SDHCI driver") Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Benjamin Tissoires	4c5681afdf	Revert "ACPI / button: Change default behavior to lid_init_state=open" commit `878d8db039` upstream. Revert commit `77e9a4aa9d` (ACPI / button: Change default behavior to lid_init_state=open) which changed the kernel's behavior on laptops that boot with closed lids and expect the lid switch state to be reported accurately by the kernel. If you boot or resume your laptop with the lid closed on a docking station while using an external monitor connected to it, both internal and external displays will light on, while only the external should. There is a design choice in gdm to only provide the greeter on the internal display when lit on, so users only see a gray area on the external monitor. Also, the cursor will not show up as it's by default on the internal display too. To "fix" that, users have to open the laptop once and close it once again to sync the state of the switch with the hardware state. Even if the "method" operation mode implementation can be buggy on some platforms, the "open" choice is worse. It breaks docking stations basically and there is no way to have a user-space hwdb to fix that. On the contrary, it's rather easy in user-space to have a hwdb with the problematic platforms. Then, libinput (1.7.0+) can fix the state of the lid switch for us: you need to set the udev property LIBINPUT_ATTR_LID_SWITCH_RELIABILITY to 'write_open'. When libinput detects internal keyboard events, it will overwrite the state of the switch to open, making it reliable again. Given that logind only checks the lid switch value after a timeout, we can assume the user will use the internal keyboard before this timeout expires. For example, such a hwdb entry is: libinput:name:Lid Switch:dmi:svnMicrosoftCorporation:pnSurface3: LIBINPUT_ATTR_LID_SWITCH_RELIABILITY=write_open Link: https://bugzilla.gnome.org/show_bug.cgi?id=782380 Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Rafael J. 
Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:08 +02:00
Lv Zheng	5d0e4205ea	ACPICA: Tables: Fix regression introduced by a too early mechanism enabling commit `2ea65321b8` upstream. In the Linux kernel, acpi_get_table() "clones" haven't been fully balanced by acpi_put_table() invocations. In upstream ACPICA, due to the design change, there are also unbalanced acpi_get_table_by_index() invocations requiring special care. acpi_get_table() reference counting mismatches may occur due to that and printing error messages related to them is not useful at this point. The strict balanced validation count check should only be enabled after confirming that all invocations are safe and aligned with their designed purposes. Thus this patch removes the error value returned by acpi_tb_get_table() in that case along with the accompanying error message to fix the issue. Fixes: `174cc7187e` (ACPICA: Tables: Back port acpi_get_table_with_size() and early_acpi_os_unmap_memory() from Linux kernel) Reported-by: Anush Seetharaman <anush.seetharaman@intel.com> Reported-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> [ rjw: Changelog ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Dan Williams	34211cbf94	ACPI / sysfs: fix acpi_get_table() leak / acpi-sysfs denial of service commit `0de0e198bc` upstream. Reading an ACPI table through the /sys/firmware/acpi/tables interface more than 65,536 times leads to the following log message: ACPI Error: Table ffff88033595eaa8, Validation count is zero after increment (20170119/tbutils-423) ...and the table being unavailable until the next reboot. Add the missing acpi_put_table() so the table ->validation_count is decremented after each read. Reported-by: Anush Seetharaman <anush.seetharaman@intel.com> Fixes: `174cc7187e` "ACPICA: Tables: Back port acpi_get_table_with_size() ..." Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Vishal Verma	93da4e6c45	acpi, nfit: Fix the memory error check in nfit_handle_mce() commit `fc08a4703a` upstream. The check for an MCE being a memory error in the NFIT mce handler was bogus. Use the new mce_is_memory_error() helper to detect the error properly. Reported-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: http://lkml.kernel.org/r/20170519093915.15413-3-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Borislav Petkov	9183980a9e	x86/MCE: Export memory_error() commit `2d1f406139` upstream. Export the function which checks whether an MCE is a memory error to other users so that we can reuse the logic. Drop the boot_cpu_data use, while at it, as mce.cpuvendor already has the CPU vendor in there. Integrate a piece from a patch from Vishal Verma <vishal.l.verma@intel.com> to export it for modules (nfit). The main reason we're exporting it is that the nfit handler nfit_handle_mce() needs to detect a memory error properly before doing its recovery actions. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Link: http://lkml.kernel.org/r/20170519093915.15413-2-bp@alien8.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Lv Zheng	8f8dca3c86	Revert "ACPI / button: Remove lid_init_state=method mode" commit `f369fdf4f6` upstream. This reverts commit `ecb10b694b`. The only expected ACPI control method lid device's usage model is 1. Listen to the lid notification, 2. Evaluate _LID after being notified by BIOS, 3. Suspend the system (if users configure to do so) after seeing "close". It's not ensured that BIOS will notify OS after boot/resume, and it's not ensured that BIOS will always generate "open" event upon opening the lid. But there are 2 wrong usage models: 1. When the lid device is responsible for suspending/resuming the system, userspace requires to see "open" event to be paired with "close" after the system is resumed, or it will suspend the system again. 2. When an external monitor connects to the laptop attached docks, userspace requires to see "close" event after the system is resumed so that it can determine whether the internal display should remain dark and the external display should be lit on. After we made default kernel behavior to be suitable for usage model 1, users of usage model 2 started to report regressions for such behavior change. Reversion of button.lid_init_state=method doesn't actually revert to the old default behavior as doing so can enter a regression loop, but facilitates users to work the reported regressions around with button.lid_init_state=method. Fixes: `ecb10b694b` (ACPI / button: Remove lid_init_state=method mode) Link: https://bugzilla.kernel.org/show_bug.cgi?id=195455 Link: https://bugzilla.redhat.com/show_bug.cgi?id=1430259 Tested-by: Steffen Weber <steffen.weber@gmail.com> Tested-by: Julian Wiedmann <julian.wiedmann@jwi.name> Reported-by: Joachim Frieben <jfrieben@hotmail.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Herbert Xu	f5eef8d245	crypto: skcipher - Add missing API setkey checks commit `9933e113c2` upstream. The API setkey checks for key sizes and alignment went AWOL during the skcipher conversion. This patch restores them. Fixes: `4e6c3df4d7` ("crypto: skcipher - Add low-level skcipher...") Reported-by: Baozeng <sploving1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:07 +02:00
Sebastian Reichel	2da7518890	i2c: i2c-tiny-usb: fix buffer not being DMA capable commit `5165da5923` upstream. Since v4.9 i2c-tiny-usb generates the below call trace and no longer works, since it can't communicate with the USB device. The reason is, that since v4.9 the USB stack checks, that the buffer it should transfer is DMA capable. This was a requirement since v2.2 days, but it usually worked nevertheless. [ 17.504959] ------------[ cut here ]------------ [ 17.505488] WARNING: CPU: 0 PID: 93 at drivers/usb/core/hcd.c:1587 usb_hcd_map_urb_for_dma+0x37c/0x570 [ 17.506545] transfer buffer not dma capable [ 17.507022] Modules linked in: [ 17.507370] CPU: 0 PID: 93 Comm: i2cdetect Not tainted 4.11.0-rc8+ #10 [ 17.508103] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 17.509039] Call Trace: [ 17.509320] ? dump_stack+0x5c/0x78 [ 17.509714] ? __warn+0xbe/0xe0 [ 17.510073] ? warn_slowpath_fmt+0x5a/0x80 [ 17.510532] ? nommu_map_sg+0xb0/0xb0 [ 17.510949] ? usb_hcd_map_urb_for_dma+0x37c/0x570 [ 17.511482] ? usb_hcd_submit_urb+0x336/0xab0 [ 17.511976] ? wait_for_completion_timeout+0x12f/0x1a0 [ 17.512549] ? wait_for_completion_timeout+0x65/0x1a0 [ 17.513125] ? usb_start_wait_urb+0x65/0x160 [ 17.513604] ? usb_control_msg+0xdc/0x130 [ 17.514061] ? usb_xfer+0xa4/0x2a0 [ 17.514445] ? __i2c_transfer+0x108/0x3c0 [ 17.514899] ? i2c_transfer+0x57/0xb0 [ 17.515310] ? i2c_smbus_xfer_emulated+0x12f/0x590 [ 17.515851] ? _raw_spin_unlock_irqrestore+0x11/0x20 [ 17.516408] ? i2c_smbus_xfer+0x125/0x330 [ 17.516876] ? i2c_smbus_xfer+0x125/0x330 [ 17.517329] ? i2cdev_ioctl_smbus+0x1c1/0x2b0 [ 17.517824] ? i2cdev_ioctl+0x75/0x1c0 [ 17.518248] ? do_vfs_ioctl+0x9f/0x600 [ 17.518671] ? vfs_write+0x144/0x190 [ 17.519078] ? SyS_ioctl+0x74/0x80 [ 17.519463] ? 
entry_SYSCALL_64_fastpath+0x1e/0xad [ 17.519959] ---[ end trace d047c04982f5ac50 ]--- Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Till Harbaum <till@harbaum.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Ard Biesheuvel	e4bab31cf9	drivers/tty: 8250: only call fintek_8250_probe when doing port I/O commit `4c4fc90964` upstream. Commit `fa01e2ca9f` ("serial: 8250: Integrate Fintek into 8250_base") modified the probing logic for PNP0501 devices, to remove a collision between the generic 16550A driver and the Fintek driver, which reused the same ACPI _HID. The Fintek device probe is now incorporated into the common 8250 probe path, and gets called for all discovered 16550A compatible devices, including ones that are MMIO mapped rather than IO mapped. However, the Fintek driver assumes the port base is a I/O address, and proceeds to probe some arbitrary offsets above it. This is generally a wrong thing to do, but on ARM systems (having no native port I/O), this may result in faulting accesses of completely unrelated MMIO regions in the PCI I/O space. Given that this is at serial probe time, this results in hard to diagnose crashes at boot. So let's restrict the Fintek probe to devices that we know are using port I/O in the first place. Fixes: `fa01e2ca9f` ("serial: 8250: Integrate Fintek into 8250_base") Suggested-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ricardo Ribalda <ricardo.ribalda@gmail.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Johan Hovold	84ac7693f4	serdev: fix tty-port client deregistration commit `aee5da7838` upstream. The port client data must be set when registering the serdev controller or client deregistration will fail (and the serdev devices are left registered and allocated) if the port was never opened in between. Make sure to clear the port client data on any probe errors to avoid a use-after-free when the client is later deregistered unconditionally (e.g. in a tty-port deregistration helper). Also move port client operation initialisation to registration. Note that the client ops must be restored on failed probe. Fixes: `bed35c6dfa` ("serdev: add a tty port controller driver") Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Johan Hovold	427fa8e391	Revert "tty_port: register tty ports with serdev bus" commit `d3ba126a22` upstream. This reverts commit `8ee3fde047`. The new serdev bus hooked into the tty layer in tty_port_register_device() by registering a serdev controller instead of a tty device whenever a serdev client is present, and by deregistering the controller in the tty-port destructor. This is broken in several ways: Firstly, it leads to a NULL-pointer dereference whenever a tty driver later deregisters its devices as no corresponding character device will exist. Secondly, far from every tty driver uses tty-port refcounting (e.g. serial core) so the serdev devices might never be deregistered or deallocated. Thirdly, deregistering at tty-port destruction is too late as the underlying device and structures may be long gone by then. A port is not released before an open tty device is closed, something which a registered serdev client can prevent from ever happening. A driver callback while the device is gone typically also leads to crashes. Many tty drivers even keep their ports around until the driver is unloaded (e.g. serial core), something which even if a late callback never happens, leads to leaks if a device is unbound from its driver and is later rebound. The right solution here is to add a new tty_port_unregister_device() helper and to never call tty_device_unregister() whenever the port has been claimed by serdev, but since this requires modifying just about every tty driver (and multiple subsystems) it will need to be done incrementally. Reverting the offending patch is the first step in fixing the broken lifetime assumptions. A follow-up patch will add a new pair of tty-device registration helpers, which a vetted tty driver can use to support serdev (initially serial core). When every tty driver uses the serdev helpers (at least for deregistration), we can add serdev registration to tty_port_register_device() again. 
Note that this also fixes another issue with serdev, which currently allocates and registers a serdev controller for every tty device registered using tty_port_device_register() only to immediately deregister and deallocate it when the corresponding OF node or serdev child node is missing. This should be addressed before enabling serdev for hot-pluggable buses. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Jeremy Kerr	baa4d4112e	powerpc/spufs: Fix hash faults for kernel regions commit `d75e4919cc` upstream. Commit `ac29c64089` ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED") swapped _PAGE_USER for _PAGE_PRIVILEGED, and introduced check_pte_access() which denied kernel access to non-_PAGE_PRIVILEGED pages. However, it didn't add _PAGE_PRIVILEGED to the hash fault handler for spufs' kernel accesses, so the DMAs required to establish SPE memory no longer work. This change adds _PAGE_PRIVILEGED to the hash fault handler for kernel accesses. Fixes: `ac29c64089` ("powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED") Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Reported-by: Sombat Tragolgosol <sombat3960@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Michael Neuling	919c7173e0	powerpc: Fix booting P9 hash with CONFIG_PPC_RADIX_MMU=N commit `d957fb4d17` upstream. Currently if you disable CONFIG_PPC_RADIX_MMU you'll crash on boot on a P9. This is because we still set MMU_FTR_TYPE_RADIX via ibm,pa-features and MMU_FTR_TYPE_RADIX is what's used for code patching in much of the asm code (ie. slb_miss_realmode). This patch fixes the problem by stopping MMU_FTR_TYPE_RADIX from being set from ibm,pa-features. We may eventually end up removing the CONFIG_PPC_RADIX_MMU option completely but until then this fixes the issue. Fixes: `17a3dd2f5f` ("powerpc/mm/radix: Use firmware feature to enable Radix MMU") Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:06 +02:00
Richard Narron	72351ac5cd	fs/ufs: Set UFS default maximum bytes per file commit `239e250e4a` upstream. This fixes a problem with reading files larger than 2GB from a UFS-2 file system: https://bugzilla.kernel.org/show_bug.cgi?id=195721 The incorrect UFS s_maxbytes limit became a problem as of commit `c2a9737f45` ("vfs,mm: fix a dead loop in truncate_inode_pages_range()") which started using s_maxbytes to avoid a page index overflow in do_generic_file_read(). That caused files to be truncated on UFS-2 file systems because the default maximum file size is 2GB (MAX_NON_LFS) and UFS didn't update it. Here I simply increase the default to a common value used by other file systems. Signed-off-by: Richard Narron <comet.berkeley@gmail.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Will B <will.brokenbourgh2877@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
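The truncation described in the UFS entry above can be reproduced with a little arithmetic. This is a simplified model rather than the VFS read path: `demo_read_len()` is a hypothetical helper, and it assumes that reads at or beyond the per-filesystem limit return nothing.

```c
#include <assert.h>
#include <stdint.h>

/* 2 GiB - 1: the MAX_NON_LFS default limit named in the commit message. */
#define DEMO_MAX_NON_LFS ((UINT64_C(1) << 31) - 1)

/*
 * Bytes a read of `want` bytes at offset `pos` may return for a file of
 * `file_size` bytes when offsets at or beyond `maxbytes` are refused.
 */
uint64_t demo_read_len(uint64_t pos, uint64_t want,
                       uint64_t file_size, uint64_t maxbytes)
{
    uint64_t end;

    assert(maxbytes > 0);
    end = file_size < maxbytes ? file_size : maxbytes;
    if (pos >= end)
        return 0; /* looks like end-of-file to the caller */
    return want < end - pos ? want : end - pos;
}
```

For a 3 GiB file, a read just past the 2 GiB mark returns zero bytes under the default limit and the full request once the limit is raised; the larger limit used when trying this out is arbitrary, not the value the patch chose.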
Liam R. Howlett	f351b1226d	sparc/ftrace: Fix ftrace graph time measurement [ Upstream commit `48078d2dac` ] The ftrace function_graph time measurements of a given function is not accurate according to those recorded by ftrace using the function filters. This change pulls the x86_64 fix from 'commit `722b3c7469` ("ftrace/graph: Trace function entry before updating index")' into the sparc specific prepare_ftrace_return which stops ftrace from counting interrupted tasks in the time measurement. Example measurements for select_task_rq_fair running "hackbench 100 process 1000": \| tracing/trace_stat/function0 \| function_graph Before patch \| 2.802 us \| 4.255 us After patch \| 2.749 us \| 3.094 us Signed-off-by: Liam R. Howlett <Liam.Howlett@Oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Orlando Arias	76037bf9e1	sparc: Fix -Wstringop-overflow warning [ Upstream commit `deba804c90` ] Greetings, GCC 7 introduced the -Wstringop-overflow flag to detect buffer overflows in calls to string handling functions [1][2]. Due to the way ``empty_zero_page'' is declared in arch/sparc/include/setup.h, this causes a warning to trigger at compile time in the function mem_init(), which is subsequently converted to an error. The ensuing patch fixes this issue and aligns the declaration of empty_zero_page to that of other architectures. Thank you. Cheers, Orlando. [1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg02308.html [2] https://gcc.gnu.org/gcc-7/changes.html Signed-off-by: Orlando Arias <oarias@knights.ucf.edu> -------------------------------------------------------------------------------- Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Nitin Gupta	e346489fac	sparc64: Fix mapping of 64k pages with MAP_FIXED [ Upstream commit `b6c41cb050` ] An incorrect huge page alignment check caused mmap failure for 64K pages when MAP_FIXED is used with address not aligned to HPAGE_SIZE. Orabug: 25885991 Fixes: `dcd1912d21` ("sparc64: Add 64K page size support") Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Daniel Borkmann	21dccb0f7c	bpf: adjust verifier heuristics [ Upstream commit `3c2ce60bdd` ] Current limits with regards to processing program paths do not really reflect today's needs anymore due to programs becoming more complex and verifier smarter, keeping track of more data such as const ALU operations, alignment tracking, spilling of PTR_TO_MAP_VALUE_ADJ registers, and other features allowing for smarter matching of what LLVM generates. This also comes with the side-effect that we result in fewer opportunities to prune search states and thus often need to do more work to prove safety than in the past due to different register states and stack layout where we mismatch. Generally, it's quite hard to determine what caused a sudden increase in complexity, it could be caused by something as trivial as a single branch somewhere at the beginning of the program where LLVM assigned a stack slot that is marked differently throughout other branches and thus causing a mismatch, where verifier then needs to prove safety for the whole rest of the program. Subsequently, programs with even less than half the insn size limit can get rejected. We noticed that while some programs load fine under pre 4.11, they get rejected due to hitting limits on more recent kernels. We saw that in the vast majority of cases (90+%) pruning failed due to register mismatches. In case of stack mismatches, majority of cases failed due to different stack slot types (invalid, spill, misc) rather than differences in spilled registers. This patch makes pruning more aggressive by also adding markers that sit at conditional jumps as well. Currently, we only mark jump targets for pruning. For example in direct packet access, these are usually error paths where we bail out. We found that adding these markers, it can reduce number of processed insns by up to 30%. 
Another option is to ignore reg->id in probing PTR_TO_MAP_VALUE_OR_NULL registers, which can help pruning slightly as well by up to 7% observed complexity reduction as stand-alone. Meaning, if a previous path with register type PTR_TO_MAP_VALUE_OR_NULL for map X was found to be safe, then in the current state a PTR_TO_MAP_VALUE_OR_NULL register for the same map X must be safe as well. Last but not least the patch also adds a scheduling point and bumps the current limit for instructions to be processed to a more adequate value. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Daniel Borkmann	87cebd0f19	bpf: fix wrong exposure of map_flags into fdinfo for lpm [ Upstream commit `a316338cb7` ] trie_alloc() always needs to have BPF_F_NO_PREALLOC passed in via attr->map_flags, since it does not support preallocation yet. We check the flag, but we never copy the flag into trie->map.map_flags, which is later on exposed into fdinfo and used by loaders such as iproute2. Latter uses this in bpf_map_selfcheck_pinned() to test whether a pinned map has the same spec as the one from the BPF obj file and if not, bails out, which is currently the case for lpm since it exposes always 0 as flags. Also copy over flags in array_map_alloc() and stack_map_alloc(). They always have to be 0 right now, but we should make sure to not miss to copy them over at a later point in time when we add actual flags for them to use. Fixes: `b95a5c4db0` ("bpf: add a longest prefix match trie map implementation") Reported-by: Jarno Rajahalme <jarno@covalent.io> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Daniel Borkmann	d6d2860eee	bpf: add bpf_clone_redirect to bpf_helper_changes_pkt_data [ Upstream commit `41703a7310` ] The bpf_clone_redirect() still needs to be listed in bpf_helper_changes_pkt_data() since we call into bpf_try_make_head_writable() from there, thus we need to invalidate prior pkt regs as well. Fixes: `36bbef52c7` ("bpf: direct packet write and access for helpers for clsact progs") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:05 +02:00
Eric Dumazet	3b69d6516e	ipv4: add reference counting to metrics [ Upstream commit `3fb07daff8` ] Andrey Konovalov reported crashes in ipv4_mtu() I could reproduce the issue with KASAN kernels, between 10.246.7.151 and 10.246.7.152 : 1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 & 2) At the same time run following loop : while : do ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500 done Cong Wang attempted to add back rt->fi in commit `82486aa6f1` ("ipv4: restore rt->fi for reference counting") but this proved to add some issues that were complex to solve. Instead, I suggested to add a refcount to the metrics themselves, being a standalone object (in particular, no reference to other objects) I tried to make this patch as small as possible to ease its backport, instead of being super clean. Note that we believe that only ipv4 dst need to take care of the metric refcount. But if this is wrong, this patch adds the basic infrastructure to extend this to other families. Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang for his efforts on this problem. Fixes: `2860583fe8` ("ipv4: Kill rt->fi") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Peter Dawson	d3edf403e2	ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets [ Upstream commit `0e9a709560` ] This fix addresses two problems in the way the DSCP field is formulated on the encapsulating header of IPv6 tunnels. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195661 1) The IPv6 tunneling code was manipulating the DSCP field of the encapsulating packet using the 32b flowlabel. Since the flowlabel is only the lower 20b it was incorrect to assume that the upper 12b containing the DSCP and ECN fields would remain intact when formulating the encapsulating header. This fix handles the 'inherit' and 'fixed-value' DSCP cases explicitly using the extant dsfield u8 variable. 2) The use of INET_ECN_encapsulate(0, dsfield) in ip6_tnl_xmit was incorrect and resulted in the DSCP value always being set to 0. Commit `90427ef5d2` ("ipv6: fix flow labels when the traffic class is non-0") caused the regression by masking out the flowlabel which exposed the incorrect handling of the DSCP portion of the flowlabel in ip6_tunnel and ip6_gre. Fixes: `90427ef5d2` ("ipv6: fix flow labels when the traffic class is non-0") Signed-off-by: Peter Dawson <peter.a.dawson@boeing.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
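The bit layout behind the first problem in the entry above can be shown in a few lines. This is a host-byte-order sketch, not the tunnel code: the `demo_` names are hypothetical, and the layout (4-bit version, 8-bit traffic class holding DSCP and ECN, 20-bit flow label) is the standard IPv6 header layout.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FLOWLABEL_MASK 0x000FFFFFu /* low 20 bits: flow label */

/*
 * Build the first 32-bit word of an IPv6 header (host byte order) from a
 * separate dsfield byte and a bare 20-bit flow label.
 */
uint32_t demo_ip6_first_word(uint8_t dsfield, uint32_t flowlabel)
{
    assert(flowlabel <= DEMO_FLOWLABEL_MASK); /* caller passes a bare label */
    return (UINT32_C(6) << 28) | ((uint32_t)dsfield << 20) | flowlabel;
}

/* Extract the 8-bit traffic class (DSCP + ECN) from the first word. */
uint8_t demo_ip6_dsfield(uint32_t word)
{
    return (uint8_t)((word >> 20) & 0xFF);
}

/* DSCP is the upper six bits of the traffic class. */
uint8_t demo_dscp(uint8_t dsfield)
{
    return (uint8_t)(dsfield >> 2);
}
```

Once a packed value is masked down to the 20-bit label, the traffic class is gone, so it has to travel in its own variable and be put back explicitly when the outer header is built.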
Davide Caratti	90e7c3322d	sctp: fix ICMP processing if skb is non-linear [ Upstream commit `804ec7ebe8` ] sometimes ICMP replies to INIT chunks are ignored by the client, even if the encapsulated SCTP headers match an open socket. This happens when the ICMP packet is carried by a paged skb: use skb_header_pointer() to read packet contents beyond the SCTP header, so that chunk header and initiate tag are validated correctly. v2: - don't use skb_header_pointer() to read the transport header, since icmp_socket_deliver() already puts these 8 bytes in the linear area. - change commit message to make specific reference to INIT chunks. Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Wei Wang	0236d8c44e	tcp: avoid fastopen API to be used on AF_UNSPEC [ Upstream commit `ba615f6752` ] Fastopen API should be used to perform fastopen operations on the TCP socket. It does not make sense to use fastopen API to perform disconnect by calling it with AF_UNSPEC. The fastopen data path is also prone to race conditions and bugs when using with AF_UNSPEC. One issue reported and analyzed by Vegard Nossum is as follows: +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Thread A: Thread B: ------------------------------------------------------------------------ sendto() - tcp_sendmsg() - sk_stream_memory_free() = 0 - goto wait_for_sndbuf - sk_stream_wait_memory() - sk_wait_event() // sleep \| sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC) \| - tcp_sendmsg() \| - tcp_sendmsg_fastopen() \| - __inet_stream_connect() \| - tcp_disconnect() //because of AF_UNSPEC \| - tcp_transmit_skb()// send RST \| - return 0; // no reconnect! \| - sk_stream_wait_connect() \| - sock_error() \| - xchg(&sk->sk_err, 0) \| - return -ECONNRESET - ... // wake up, see sk->sk_err == 0 - skb_entail() on TCP_CLOSE socket If the connection is reopened then we will send a brand new SYN packet after thread A has already queued a buffer. At this point I think the socket internal state (sequence numbers etc.) becomes messed up. When the new connection is closed, the FIN-ACK is rejected because the sequence number is outside the window. The other side tries to retransmit, but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which corrupts the skb data length and hits a BUG() in copy_and_csum_bits(). +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Hence, this patch adds a check for AF_UNSPEC in the fastopen data path and return EOPNOTSUPP to user if such case happens. 
Fixes: `cf60af03ca` ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)") Reported-by: Vegard Nossum <vegard.nossum@oracle.com> Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Eric Garver	1642394fff	geneve: fix fill_info when using collect_metadata [ Upstream commit `11387fe4a9` ] Since `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") fill_info does not return UDP_ZERO_CSUM6_RX when using COLLECT_METADATA. This is because it uses ip_tunnel_info_af() with the device level info, which is not valid for COLLECT_METADATA. Fix by checking for the presence of the actual sockets. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Vlad Yasevich	4dbbbaad64	virtio-net: enable TSO/checksum offloads for Q-in-Q vlans [ Upstream commit `2836b4f224` ] Since virtio does not provide its own ndo_features_check handler, TSO, and now checksum offload, are disabled for stacked vlans. Re-enable the support and let the host take care of it. This restores/improves Guest-to-Guest performance over Q-in-Q vlans. Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:04 +02:00
Vlad Yasevich	acc866e9b5	be2net: Fix offload features for Q-in-Q packets [ Upstream commit `cc6e9de62a` ] At least some of the be2net cards do not seem to be capable of performing checksum offload computations on Q-in-Q packets. In these cases, the received checksum on the remote is invalid and TCP syn packets are dropped. This patch adds a call to check disabled acceleration features on Q-in-Q tagged traffic. CC: Sathya Perla <sathya.perla@broadcom.com> CC: Ajit Khaparde <ajit.khaparde@broadcom.com> CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> CC: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Vlad Yasevich	423c1b4324	vlan: Fix tcp checksum offloads in Q-in-Q vlans [ Upstream commit `35d2f80b07` ] It appears that TCP checksum offloading has been broken for Q-in-Q vlans. The behavior was exacerbated by the series commit `afb0bc972b` ("Merge branch 'stacked_vlan_tso'") that enabled acceleration features on stacked vlans. However, even without that series, it is possible to trigger this issue. It just requires a lot more specialized configuration. The root cause is the interaction between how netdev_intersect_features() works, the features actually set on the vlan devices and HW having the ability to run checksum with longer headers. The issue starts when netdev_intersect_features() replaces NETIF_F_HW_CSUM with a combination of NETIF_F_IP_CSUM \| NETIF_F_IPV6_CSUM, if the HW advertises IP\|IPV6 specific checksums. This happens for tagged and multi-tagged packets. However, HW that enables IP\|IPV6 checksum offloading doesn't guarantee that packets with arbitrarily long headers can be checksummed. This patch disables IP\|IPV6 checksums on the packet for multi-tagged packets. CC: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> CC: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Acked-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Andrew Lunn	f1cd4c6331	net: phy: marvell: Limit errata to 88m1101 [ Upstream commit `f289978835` ] The 88m1101 has an errata when configuring autoneg. However, it was being applied to many other Marvell PHYs as well. Limit its scope to just the 88m1101. Fixes: `76884679c6` ("phylib: Add support for Marvell 88e1111S and 88e1145") Reported-by: Daniel Walker <danielwa@cisco.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Harini Katakam <harinik@xilinx.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Mohamad Haj Yahia	bea278025c	net/mlx5: Avoid using pending command interface slots [ Upstream commit `73dd3a4839` ] Currently when firmware command gets stuck or it takes long time to complete, the driver command will get timeout and the command slot is freed and can be used for new commands, and if the firmware receive new command on the old busy slot its behavior is unexpected and this could be harmful. To fix this when the driver command gets timeout we return failure, but we don't free the command slot and we wait for the firmware to explicitly respond to that command. Once all the entries are busy we will stop processing new firmware commands. Fixes: `9cba4ebcf3` ('net/mlx5: Fix potential deadlock in command mode change') Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Jarod Wilson	1cdcbe7c70	bonding: fix accounting of active ports in 3ad [ Upstream commit `751da2a69b` ] As of `7bb11dc9f5` and `0622cab034`, bond slaves in a 3ad bond are not removed from the aggregator when they are down, and the active slave count is NOT equal to number of ports in the aggregator, but rather the number of ports in the aggregator that are still enabled. The sysfs spew for bonding_show_ad_num_ports() has a comment that says "Show number of active 802.3ad ports.", but it's currently showing total number of ports, both active and inactive. Remedy it by using the same logic introduced in `0622cab034` in __bond_3ad_get_active_agg_info(), so sysfs, procfs and netlink all report the number of active ports. Note that this means that IFLA_BOND_AD_INFO_NUM_PORTS really means NUM_ACTIVE_PORTS instead of NUM_PORTS, and thus perhaps should be renamed for clarity. Lightly tested on a dual i40e lacp bond, simulating link downs with an ip link set dev <slave2> down, was able to produce the state where I could see both in the same aggregator, but a number of ports count of 1. MII Status: up Active Aggregator Info: Aggregator ID: 1 Number of ports: 2 <--- Slave Interface: ens10 MII Status: up <--- Aggregator ID: 1 Slave Interface: ens11 MII Status: up Aggregator ID: 1 MII Status: up Active Aggregator Info: Aggregator ID: 1 Number of ports: 1 <--- Slave Interface: ens10 MII Status: down <--- Aggregator ID: 1 Slave Interface: ens11 MII Status: up Aggregator ID: 1 CC: Jay Vosburgh <j.vosburgh@gmail.com> CC: Veaceslav Falico <vfalico@gmail.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:03 +02:00
Eric Dumazet	827624c3d1	ipv6: fix out of bound writes in __ip6_append_data() [ Upstream commit `232cd35d08` ] Andrey Konovalov and idaifish@gmail.com reported crashes caused by one skb shared_info being overwritten from __ip6_append_data() Andrey's program leads to the following state : copy -4200 datalen 2000 fraglen 2040 maxfraglen 2040 alloclen 2048 transhdrlen 0 offset 0 fraggap 6200 The skb_copy_and_csum_bits(skb_prev, maxfraglen, data + transhdrlen, fraggap, 0); is overwriting skb->head and skb_shared_info Since we apparently detect this rare condition too late, move the code earlier to even avoid allocating skb and risking crashes. Once again, many thanks to Andrey and syzkaller team. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Reported-by: <idaifish@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Xin Long	99c971a56d	bridge: start hello_timer when enabling KERNEL_STP in br_stp_start [ Upstream commit `6d18c732b9` ] Since commit `76b91c32dd` ("bridge: stp: when using userspace stp stop kernel hello and hold timers"), bridge would not start hello_timer if stp_enabled is not KERNEL_STP when br_dev_open. The problem is even if users set stp_enabled with KERNEL_STP later, the timer will still not be started. It causes that KERNEL_STP can not really work. Users have to re-ifup the bridge to avoid this. This patch is to fix it by starting br->hello_timer when enabling KERNEL_STP in br_stp_start. As an improvement, it's also to start hello_timer again only when br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is no reason to start the timer again when it's NO_STP. Fixes: `76b91c32dd` ("bridge: stp: when using userspace stp stop kernel hello and hold timers") Reported-by: Haidong Li <haili@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Reviewed-by: Ivan Vecera <cera@cera.cz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Bjørn Mork	bf97c6bf24	qmi_wwan: add another Lenovo EM74xx device ID [ Upstream commit `486181bcb3` ] In their infinite wisdom, and never ending quest for end user frustration, Lenovo has decided to use a new USB device ID for the wwan modules in their 2017 laptops. The actual hardware is still the Sierra Wireless EM7455 or EM7430, depending on region. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Tobias Jungel	7b570175b3	bridge: netlink: check vlan_default_pvid range [ Upstream commit `a285860211` ] Currently it is allowed to set the default pvid of a bridge to a value above VLAN_VID_MASK (0xfff). This patch adds a check to br_validate and returns -EINVAL in case the pvid is out of bounds. Reproduce by calling: [root@test ~]# ip l a type bridge [root@test ~]# ip l a type dummy [root@test ~]# ip l s bridge0 type bridge vlan_filtering 1 [root@test ~]# ip l s bridge0 type bridge vlan_default_pvid 9999 [root@test ~]# ip l s dummy0 master bridge0 [root@test ~]# bridge vlan port vlan ids bridge0 9999 PVID Egress Untagged dummy0 9999 PVID Egress Untagged Fixes: `0f963b7592` ("bridge: netlink: add support for default_pvid") Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Tobias Jungel <tobias.jungel@bisdn.de> Acked-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
David S. Miller	4b3a6fa35e	ipv6: Check ip6_find_1stfragopt() return value properly. [ Upstream commit `7dd7eb9513` ] Do not use unsigned variables to see if it returns a negative error or not. Fixes: `2423496af3` ("ipv6: Prevent overrun when parsing v6 header options") Reported-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Craig Gallek	9909e4e4ff	ipv6: Prevent overrun when parsing v6 header options [ Upstream commit `2423496af3` ] The KASAN warning reported below was discovered with a syzkaller program. The reproducer is basically: int s = socket(AF_INET6, SOCK_RAW, NEXTHDR_HOP); send(s, &one_byte_of_data, 1, MSG_MORE); send(s, &more_than_mtu_bytes_data, 2000, 0); The socket() call sets the nexthdr field of the v6 header to NEXTHDR_HOP, the first send call primes the payload with a non zero byte of data, and the second send call triggers the fragmentation path. The fragmentation code tries to parse the header options in order to figure out where to insert the fragment option. Since nexthdr points to an invalid option, the calculation of the size of the network header can be made to be much larger than the linear section of the skb and data is read outside of it. This fix makes ip6_find_1stfrag return an error if it detects running out-of-bounds. [ 42.361487] ================================================================== [ 42.364412] BUG: KASAN: slab-out-of-bounds in ip6_fragment+0x11c8/0x3730 [ 42.365471] Read of size 840 at addr ffff88000969e798 by task ip6_fragment-oo/3789 [ 42.366469] [ 42.366696] CPU: 1 PID: 3789 Comm: ip6_fragment-oo Not tainted 4.11.0+ #41 [ 42.367628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1ubuntu1 04/01/2014 [ 42.368824] Call Trace: [ 42.369183] dump_stack+0xb3/0x10b [ 42.369664] print_address_description+0x73/0x290 [ 42.370325] kasan_report+0x252/0x370 [ 42.370839] ? ip6_fragment+0x11c8/0x3730 [ 42.371396] check_memory_region+0x13c/0x1a0 [ 42.371978] memcpy+0x23/0x50 [ 42.372395] ip6_fragment+0x11c8/0x3730 [ 42.372920] ? nf_ct_expect_unregister_notifier+0x110/0x110 [ 42.373681] ? ip6_copy_metadata+0x7f0/0x7f0 [ 42.374263] ? ip6_forward+0x2e30/0x2e30 [ 42.374803] ip6_finish_output+0x584/0x990 [ 42.375350] ip6_output+0x1b7/0x690 [ 42.375836] ? ip6_finish_output+0x990/0x990 [ 42.376411] ? 
ip6_fragment+0x3730/0x3730 [ 42.376968] ip6_local_out+0x95/0x160 [ 42.377471] ip6_send_skb+0xa1/0x330 [ 42.377969] ip6_push_pending_frames+0xb3/0xe0 [ 42.378589] rawv6_sendmsg+0x2051/0x2db0 [ 42.379129] ? rawv6_bind+0x8b0/0x8b0 [ 42.379633] ? _copy_from_user+0x84/0xe0 [ 42.380193] ? debug_check_no_locks_freed+0x290/0x290 [ 42.380878] ? ___sys_sendmsg+0x162/0x930 [ 42.381427] ? rcu_read_lock_sched_held+0xa3/0x120 [ 42.382074] ? sock_has_perm+0x1f6/0x290 [ 42.382614] ? ___sys_sendmsg+0x167/0x930 [ 42.383173] ? lock_downgrade+0x660/0x660 [ 42.383727] inet_sendmsg+0x123/0x500 [ 42.384226] ? inet_sendmsg+0x123/0x500 [ 42.384748] ? inet_recvmsg+0x540/0x540 [ 42.385263] sock_sendmsg+0xca/0x110 [ 42.385758] SYSC_sendto+0x217/0x380 [ 42.386249] ? SYSC_connect+0x310/0x310 [ 42.386783] ? __might_fault+0x110/0x1d0 [ 42.387324] ? lock_downgrade+0x660/0x660 [ 42.387880] ? __fget_light+0xa1/0x1f0 [ 42.388403] ? __fdget+0x18/0x20 [ 42.388851] ? sock_common_setsockopt+0x95/0xd0 [ 42.389472] ? SyS_setsockopt+0x17f/0x260 [ 42.390021] ? 
entry_SYSCALL_64_fastpath+0x5/0xbe [ 42.390650] SyS_sendto+0x40/0x50 [ 42.391103] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.391731] RIP: 0033:0x7fbbb711e383 [ 42.392217] RSP: 002b:00007ffff4d34f28 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 42.393235] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbbb711e383 [ 42.394195] RDX: 0000000000001000 RSI: 00007ffff4d34f60 RDI: 0000000000000003 [ 42.395145] RBP: 0000000000000046 R08: 00007ffff4d34f40 R09: 0000000000000018 [ 42.396056] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000400aad [ 42.396598] R13: 0000000000000066 R14: 00007ffff4d34ee0 R15: 00007fbbb717af00 [ 42.397257] [ 42.397411] Allocated by task 3789: [ 42.397702] save_stack_trace+0x16/0x20 [ 42.398005] save_stack+0x46/0xd0 [ 42.398267] kasan_kmalloc+0xad/0xe0 [ 42.398548] kasan_slab_alloc+0x12/0x20 [ 42.398848] __kmalloc_node_track_caller+0xcb/0x380 [ 42.399224] __kmalloc_reserve.isra.32+0x41/0xe0 [ 42.399654] __alloc_skb+0xf8/0x580 [ 42.400003] sock_wmalloc+0xab/0xf0 [ 42.400346] __ip6_append_data.isra.41+0x2472/0x33d0 [ 42.400813] ip6_append_data+0x1a8/0x2f0 [ 42.401122] rawv6_sendmsg+0x11ee/0x2db0 [ 42.401505] inet_sendmsg+0x123/0x500 [ 42.401860] sock_sendmsg+0xca/0x110 [ 42.402209] ___sys_sendmsg+0x7cb/0x930 [ 42.402582] __sys_sendmsg+0xd9/0x190 [ 42.402941] SyS_sendmsg+0x2d/0x50 [ 42.403273] entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.403718] [ 42.403871] Freed by task 1794: [ 42.404146] save_stack_trace+0x16/0x20 [ 42.404515] save_stack+0x46/0xd0 [ 42.404827] kasan_slab_free+0x72/0xc0 [ 42.405167] kfree+0xe8/0x2b0 [ 42.405462] skb_free_head+0x74/0xb0 [ 42.405806] skb_release_data+0x30e/0x3a0 [ 42.406198] skb_release_all+0x4a/0x60 [ 42.406563] consume_skb+0x113/0x2e0 [ 42.406910] skb_free_datagram+0x1a/0xe0 [ 42.407288] netlink_recvmsg+0x60d/0xe40 [ 42.407667] sock_recvmsg+0xd7/0x110 [ 42.408022] ___sys_recvmsg+0x25c/0x580 [ 42.408395] __sys_recvmsg+0xd6/0x190 [ 42.408753] SyS_recvmsg+0x2d/0x50 [ 42.409086] 
entry_SYSCALL_64_fastpath+0x1f/0xbe [ 42.409513] [ 42.409665] The buggy address belongs to the object at ffff88000969e780 [ 42.409665] which belongs to the cache kmalloc-512 of size 512 [ 42.410846] The buggy address is located 24 bytes inside of [ 42.410846] 512-byte region [ffff88000969e780, ffff88000969e980) [ 42.411941] The buggy address belongs to the page: [ 42.412405] page:ffffea000025a780 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 [ 42.413298] flags: 0x100000000008100(slab\|head) [ 42.413729] raw: 0100000000008100 0000000000000000 0000000000000000 00000001800c000c [ 42.414387] raw: ffffea00002a9500 0000000900000007 ffff88000c401280 0000000000000000 [ 42.415074] page dumped because: kasan: bad access detected [ 42.415604] [ 42.415757] Memory state around the buggy address: [ 42.416222] ffff88000969e880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.416904] ffff88000969e900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 42.417591] >ffff88000969e980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 42.418273] ^ [ 42.418588] ffff88000969ea00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419273] ffff88000969ea80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 42.419882] ================================================================== Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Craig Gallek <kraig@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
David Ahern	df6342be40	net: Improve handling of failures on link and route dumps [ Upstream commit `f6c5775ff0` ] In general, rtnetlink dumps do not anticipate failure to dump a single object (e.g., link or route) on a single pass. As both route and link objects have grown via more attributes, that is no longer a given. netlink dumps can handle a failure if the dump function returns an error; specifically, netlink_dump adds the return code to the response if it is <= 0 so userspace is notified of the failure. The missing piece is the rtnetlink dump functions returning the error. Fix route and link dump functions to return the errors if no object is added to an skb (detected by skb->len != 0). IPv6 route dumps (rt6_dump_route) already return the error; this patch updates IPv4 and link dumps. Other dump functions may need to be adjusted as well. Reported-by: Jan Moskyto Matejka <mq@ucw.cz> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:02 +02:00
Christoph Hellwig	a19f55f9bb	net/smc: Add warning about remote memory exposure [ Upstream commit `19a0f7e37c` ] The driver explicitly bypasses APIs to register all memory once a connection is made, and thus allows remote access to memory. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Ursula Braun	516b3ed999	smc: switch to usage of IB_PD_UNSAFE_GLOBAL_RKEY [ Upstream commit `263eec9b2a` ] Currently, SMC enables remote access to physical memory when a user has successfully configured and established an SMC-connection until ten minutes after the last SMC connection is closed. Because this is considered a security risk, drivers are supposed to use IB_PD_UNSAFE_GLOBAL_RKEY in such a case. This patch changes the current SMC code to use IB_PD_UNSAFE_GLOBAL_RKEY. This improves user awareness, but does not remove the security risk itself. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Soheil Hassas Yeganeh	d6cb41cf30	tcp: eliminate negative reordering in tcp_clean_rtx_queue [ Upstream commit `bafbb9c732` ] tcp_ack() can call tcp_fragment() which may deduct the value tp->fackets_out when MSS changes. When prior_fackets is larger than tp->fackets_out, tcp_clean_rtx_queue() can invoke tcp_update_reordering() with negative values. This results in absurd tp->reordering values higher than sysctl_tcp_max_reordering. Note that tcp_update_reordering indeed sets tp->reordering to min(sysctl_tcp_max_reordering, metric), but because the comparison is signed, a negative metric always wins. Fixes: `c7caf8d3ed` ("[TCP]: Fix reord detection due to snd_una covered holes") Reported-by: Rebecca Isaacs <risaacs@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Gal Pressman	cc1d2a620d	net/mlx5e: Fix ethtool pause support and advertise reporting [ Upstream commit `e3c1950371` ] Pause bit should set when RX pause is on, not TX pause. Also, setting Asym_Pause is incorrect, and should be turned off. Fixes: `665bc53969` ("net/mlx5e: Use new ethtool get/set link ksettings API") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Gal Pressman	c8a52aa79a	net/mlx5e: Use the correct pause values for ethtool advertising [ Upstream commit `b383b544f2` ] Query the operational pause from firmware (PFCC register) instead of always passing zeros. Fixes: `665bc53969` ("net/mlx5e: Use new ethtool get/set link ksettings API") Signed-off-by: Gal Pressman <galp@mellanox.com> Cc: kernel-team@fb.com Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Douglas Caetano dos Santos	03c10a87f1	net/packet: fix missing net_device reference release [ Upstream commit `d19b183cdc` ] When using a TX ring buffer, if an error occurs processing a control message (e.g. invalid message), the net_device reference is not released. Fixes `c14ac9451c` ("sock: enable timestamping using control messages") Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:01 +02:00
Eric Dumazet	703a208274	sctp: do not inherit ipv6_{mc\|ac\|fl}_list from parent [ Upstream commit `fdcee2cbb8` ] SCTP needs fixes similar to `83eaddab43` ("ipv6/dccp: do not inherit ipv6_mc_list from parent"), otherwise bad things can happen. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Xin Long	dc9bf5513e	sctp: fix src address selection if using secondary addresses for ipv6 [ Upstream commit `dbc2b5e9a0` ] Commit `0ca50d12fe` ("sctp: fix src address selection if using secondary addresses") has fixed a src address selection issue when using secondary addresses for ipv4. Now sctp ipv6 also has the similar issue. When using a secondary address, sctp_v6_get_dst tries to choose the saddr which has the most same bits with the daddr by sctp_v6_addr_match_len. It may make some cases not work as expected. hostA: [1] fd21:356b:459a:cf10::11 (eth1) [2] fd21:356b:459a:cf20::11 (eth2) hostB: [a] fd21:356b:459a:cf30::2 (eth1) [b] fd21:356b:459a:cf40::2 (eth2) route from hostA to hostB: fd21:356b:459a:cf30::/64 dev eth1 metric 1024 mtu 1500 The expected path should be: fd21:356b:459a:cf10::11 <-> fd21:356b:459a:cf30::2 But addr[2] matches addr[a] more bits than addr[1] does, according to sctp_v6_addr_match_len. It causes the path to be: fd21:356b:459a:cf20::11 <-> fd21:356b:459a:cf30::2 This patch is to fix it with the same way as Marcelo's fix for sctp ipv4. As no ip_dev_find for ipv6, this patch is to use ipv6_chk_addr to check if the saddr is in a dev instead. Note that for backwards compatibility, it will still do the addr_match_len check here when no optimal is found. Reported-by: Patrick Talbert <ptalbert@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Jon Paul Maloy	d9cb26d6a3	tipc: make macro tipc_wait_for_cond() smp safe [ Upstream commit `844cf763fb` ] The macro tipc_wait_for_cond() is embedding the macro sk_wait_event() to fulfil its task. The latter, in turn, is evaluating the stated condition outside the socket lock context. This is problematic if the condition is accessing non-trivial data structures which may be altered by incoming interrupts, as is the case with the cong_links() linked list, used by socket to keep track of the current set of congested links. We sometimes see crashes when this list is accessed by a condition function at the same time as a SOCK_WAKEUP interrupt is removing an element from the list. We fix this by expanding selected parts of sk_wait_event() into the outer macro, while ensuring that all evaluations of a given condition are performed under socket lock protection. Fixes: commit `365ad353c2` ("tipc: reduce risk of user starvation during link congestion") Reviewed-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Yuchung Cheng	c91cd74b32	tcp: avoid fragmenting peculiar skbs in SACK [ Upstream commit `b451e5d24b` ] This patch fixes a bug in splitting an SKB during SACK processing. Specifically if an skb contains multiple packets and is only partially sacked in the higher sequences, tcp_match_sack_to_skb() splits the skb and marks the second fragment as SACKed. The current code further attempts rounding up the first fragment to MSS boundaries. But it misses a boundary condition when the rounded-up fragment size (pkt_len) is exactly skb size. Splitting such an skb is pointless and causes a kernel warning and aborts the SACK processing. This patch universally checks such over-split before calling tcp_fragment to prevent these unnecessary warnings. Fixes: `adb92db857` ("tcp: Make SACK code to split only at mss boundaries") Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Eric Dumazet	20d699e0ca	net: fix compile error in skb_orphan_partial() [ Upstream commit `9142e9007f` ] If CONFIG_INET is not set, net/core/sock.c can not compile : net/core/sock.c: In function ‘skb_orphan_partial’: net/core/sock.c:1810:2: error: implicit declaration of function ‘skb_is_tcp_pure_ack’ [-Werror=implicit-function-declaration] if (skb_is_tcp_pure_ack(skb)) ^ Fix this by always including <net/tcp.h> Fixes: `f6ba8d33cf` ("netem: fix skb_orphan_partial()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:10:00 +02:00
Eric Dumazet	e13cb6c25b	netem: fix skb_orphan_partial() [ Upstream commit `f6ba8d33cf` ] I should have known that lowering skb->truesize was dangerous :/ In case packets are not leaving the host via a standard Ethernet device, but looped back to local sockets, bad things can happen, as reported by Michael Madsen ( https://bugzilla.kernel.org/show_bug.cgi?id=195713 ) So instead of tweaking skb->truesize, lets change skb->destructor and keep a reference on the owner socket via its sk_refcnt. Fixes: `f2f872f927` ("netem: Introduce skb_orphan_partial() helper") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Michael Madsen <mkm@nabto.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Daniel Borkmann	3bfb04d102	bpf, arm64: fix faulty emission of map access in tail calls [ Upstream commit `d8b54110ee` ] Shubham was recently asking on netdev why in arm64 JIT we don't multiply the index for accessing the tail call map by 8. That led me into testing out arm64 JIT wrt tail calls and it turned out I got a NULL pointer dereference on the tail call. The buggy access is at: prog = array->ptrs[index]; if (prog == NULL) goto out; [...] 00000060: d2800e0a mov x10, #0x70 // #112 00000064: f86a682a ldr x10, [x1,x10] 00000068: f862694b ldr x11, [x10,x2] 0000006c: b40000ab cbz x11, 0x00000080 [...] The code triggering the crash is f862694b. x1 at the time contains the address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning, above we load the pointer to the program at map slot 0 into x10. x10 can then be NULL if the slot is not occupied, which we later on try to access with a user given offset in x2 that is the map index. Fix this by emitting the following instead: [...] 00000060: d2800e0a mov x10, #0x70 // #112 00000064: 8b0a002a add x10, x1, x10 00000068: d37df04b lsl x11, x2, #3 0000006c: f86b694b ldr x11, [x10,x11] 00000070: b40000ab cbz x11, 0x00000084 [...] This basically adds the offset to ptrs to the base address of the bpf array we got and we later on access the map with an index * 8 offset relative to that. The tail call map itself is basically one large area with meta data at the head followed by the array of prog pointers. This makes tail calls working again, tested on Cavium ThunderX ARMv8. Fixes: `ddb55992b0` ("arm64: bpf: implement bpf_tail_call() helper") Reported-by: Shubham Bansal <illusionist.neo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
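The address arithmetic behind 3bfb04d102 can be sketched in plain C. The struct below is a stand-in, not the kernel's struct bpf_array: the 0x70-byte header only mirrors the offset quoted in the commit message, and slot_address() is a hypothetical name. The point is that the slot address is base + offsetof(ptrs) + index * pointer size, rather than a load from slot 0 followed by an unscaled index.

```c
#include <stddef.h>
#include <stdint.h>

struct prog;                      /* opaque stand-in for a BPF program */

/* Stand-in layout (assumption): metadata at the head, then prog pointers. */
struct array_sketch {
    char meta[0x70];
    struct prog *ptrs[4];
};

/* Address of ptrs[index]: base + offset of ptrs + index scaled by slot size. */
uintptr_t slot_address(const struct array_sketch *a, uint64_t index)
{
    return (uintptr_t)a
         + offsetof(struct array_sketch, ptrs)
         + (uintptr_t)(index * sizeof(a->ptrs[0]));
}
```

On a 64-bit target the scale factor is 8, which corresponds to the `lsl x11, x2, #3` in the corrected instruction sequence.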
Ursula Braun	617461cac6	s390/qeth: add missing hash table initializations [ Upstream commit `ebccc7397e` ] commit `5f78e29cee` ("qeth: optimize IP handling in rx_mode callback") added new hash tables, but failed to initialize them. Fixes: `5f78e29cee` ("qeth: optimize IP handling in rx_mode callback") Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Reviewed-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Julian Wiedmann	b8a9a79c89	s390/qeth: avoid null pointer dereference on OSN [ Upstream commit `25e2c341e7` ] Access card->dev only after checking whether it's valid. Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Julian Wiedmann	9b1042aa59	s390/qeth: unbreak OSM and OSN support [ Upstream commit `2d2ebb3ed0` ] commit `b4d72c08b3` ("qeth: bridgeport support - basic control") broke the support for OSM and OSN devices as follows: As OSM and OSN are L2 only, qeth_core_probe_device() does an early setup by loading the l2 discipline and calling qeth_l2_probe_device(). In this context, adding the l2-specific bridgeport sysfs attributes via qeth_l2_create_device_attributes() hits a BUG_ON in fs/sysfs/group.c, since the basic sysfs infrastructure for the device hasn't been established yet. Note that OSN actually has its own unique sysfs attributes (qeth_osn_devtype), so the additional attributes shouldn't be created at all. For OSM, add a new qeth_l2_devtype that contains all the common and l2-specific sysfs attributes. When qeth_core_probe_device() does early setup for OSM or OSN, assign the corresponding devtype so that the ccwgroup probe code creates the full set of sysfs attributes. This allows us to skip qeth_l2_create_device_attributes() in case of an early setup. Any device that can't do early setup will initially have only the generic sysfs attributes, and when it's probed later qeth_l2_probe_device() adds the l2-specific attributes. If an early-setup device is removed (by calling ccwgroup_ungroup()), device_unregister() will - using the devtype - delete the l2-specific attributes before qeth_l2_remove_device() is called. So make sure to not remove them twice. What complicates the issue is that qeth_l2_probe_device() and qeth_l2_remove_device() is also called on a device when its layer2 attribute changes (ie. its layer mode is switched). For early-setup devices this wouldn't work properly - we wouldn't remove the l2-specific attributes when switching to L3. But switching the layer mode doesn't actually make any sense; we already decided that the device can only operate in L2! So just refuse to switch the layer mode on such devices. 
Note that OSN doesn't have a layer2 attribute, so we only need to special-case OSM. Based on an initial patch by Ursula Braun. Fixes: `b4d72c08b3` ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
Ursula Braun	41b81ff056	s390/qeth: handle sysfs error during initialization [ Upstream commit `9111e7880c` ] When setting up the device from within the layer discipline's probe routine, creating the layer-specific sysfs attributes can fail. Report this error back to the caller, and handle it by releasing the layer discipline. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> [jwi: updated commit msg, moved an OSN change to a subsequent patch] Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:59 +02:00
WANG Cong	8e929937f8	ipv6/dccp: do not inherit ipv6_mc_list from parent [ Upstream commit `83eaddab43` ] Like commit `657831ffc3` ("dccp/tcp: do not inherit mc_list from parent") we should clear ipv6_mc_list etc. for IPv6 sockets too. Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:58 +02:00
Gao Feng	03a275f5aa	driver: vrf: Fix one possible use-after-free issue [ Upstream commit `1a4a5bf52a` ] The current code only handles the case where the skb is dropped, so it can hit a use-after-free when NF_HOOK returns 0, which means the skb was stolen by a netfilter rule or hook. When a netfilter rule or hook steals the skb and returns NF_STOLEN, the skb belongs to that rule and other modules must not touch it again; the rule may have queued it or freed it directly. Use nf_hook instead of NF_HOOK to get the netfilter result, and check its return value. Only a return value of 1 means the skb may proceed; otherwise reset the skb to NULL. Because vrf_rcv_finish is an empty function, there is no need to invoke it even when nf_hook returns 1. But vrf_rcv_finish must be modified to deal with the NF_STOLEN case. There are two cases when the skb is stolen. 1. The skb is stolen and freed directly. There is nothing we need to do, and vrf_rcv_finish isn't invoked. 2. The skb is queued and reinjected later. vrf_rcv_finish would then be invoked as okfn, so it needs to free the skb. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:58 +02:00
Eric Dumazet	db8ebc6da8	dccp/tcp: do not inherit mc_list from parent [ Upstream commit `657831ffc3` ] syzkaller found a way to trigger double frees from ip_mc_drop_socket(). It turns out that we leave a copy of the parent mc_list at accept() time, which is very bad. Very similar to commit `8b485ce698` ("tcp: do not inherit fastopen_req from parent") Initial report from Pray3r, completed by Andrey one. Thanks a lot to them ! Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Pray3r <pray3r.z@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-06-07 12:09:58 +02:00
Greg Kroah-Hartman	b43ae25b70	Linux 4.11.3	2017-05-25 15:46:45 +02:00
Tadeusz Struk	8d6d97abb8	IB/hfi1: Protect the global dev_cntr_names and port_cntr_names commit `62eed66e98` upstream. Protect the global dev_cntr_names and port_cntr_names with the global mutex as they are allocated and freed in a function called per device. Otherwise there is a danger of double free and memory leaks. Fixes: Commit `b7481944b0` ("IB/hfi1: Show statistics counters under IB stats interface") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Chris Wilson	b770585918	drm/i915/gvt: Disable access to stolen memory as a guest commit `04a68a35ce` upstream. Explicitly disable stolen memory when running as a guest in a virtual machine, since the memory is not mediated between clients and reserved entirely for the host. The actual size should be reported as zero, but like every other quirk we want to tell the user what is happening. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99028 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20161109103905.17860-1-chris@chris-wilson.co.uk Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Julius Werner	846d6e12d0	drivers: char: mem: Check for address space wraparound with mmap() commit `b299cde245` upstream. /dev/mem currently allows mmap() mappings that wrap around the end of the physical address space, which should probably be illegal. It circumvents the existing STRICT_DEVMEM permission check because the loop immediately terminates (as the start address is already higher than the end address). On the x86_64 architecture it will then cause a panic (from the BUG(start >= end) in arch/x86/mm/pat.c:reserve_memtype()). This patch adds an explicit check to make sure offset + size will not wrap around in the physical address type. Signed-off-by: Julius Werner <jwerner@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
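The wraparound check described in 846d6e12d0 relies on unsigned overflow being well defined in C. A minimal userspace sketch follows; range_wraps() is a hypothetical name, and uint64_t stands in for the kernel's physical address type, which is an assumption about the target.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the byte range [offset, offset + size)
 * wraps around the end of a 64-bit address space.
 */
bool range_wraps(uint64_t offset, uint64_t size)
{
    if (size == 0)
        return false;   /* an empty range cannot wrap */

    /* If the last byte's address is below the first, the sum overflowed. */
    return offset + size - 1 < offset;
}
```

A caller in the position of an mmap() handler would reject the request when range_wraps() returns true, before any per-page permission loop runs.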
Trond Myklebust	326842c019	nfsd: Fix up the "supattr_exclcreat" attributes commit `b26b78cb72` upstream. If an NFSv4 client asks us for the supattr_exclcreat, then we must not return attributes that are unsupported by this minor version. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: `75976de655` ("NFSD: Return word2 bitmask if setting security..,") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
J. Bruce Fields	9a4723626e	nfsd: encoders mustn't use uninitialized values in error cases commit `f961e3f2ac` upstream. In error cases, lgp->lg_layout_type may be out of bounds; so we shouldn't be using it until after the check of nfserr. This was seen to crash nfsd threads when the server receives a LAYOUTGET request with a large layout type. GETDEVICEINFO has the same problem. Reported-by: Ari Kauppi <Ari.Kauppi@synopsys.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Ari Kauppi	06cc61e8f9	nfsd: fix undefined behavior in nfsd4_layout_verify commit `b550a32e60` upstream. UBSAN: Undefined behaviour in fs/nfsd/nfs4proc.c:1262:34 shift exponent 128 is too large for 32-bit type 'int' Depending on compiler+architecture, this may cause the check for layout_type to succeed for overly large values (which seems to be the case with amd64). The large value will be later used in de-referencing nfsd4_layout_ops for function pointers. Reported-by: Jani Tuovila <tuovila@synopsys.com> Signed-off-by: Ari Kauppi <ari@synopsys.com> [colin.king@canonical.com: use LAYOUT_TYPE_MAX instead of 32] Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
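The undefined shift reported in 06cc61e8f9 can be avoided by bounding the exponent before shifting. The sketch below is illustrative only: layout_type_supported() and LAYOUT_TYPE_MAX_SKETCH are hypothetical names, and the bound of 32 is an assumption chosen to match the width of the mask, not the kernel's LAYOUT_TYPE_MAX value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bound (assumption): one bit per layout type in a 32-bit mask. */
#define LAYOUT_TYPE_MAX_SKETCH 32u

/*
 * Return true only when layout_type is in range AND its bit is set.
 * Checking the range first keeps the shift exponent below the type width,
 * so the shift is always well defined.
 */
bool layout_type_supported(uint32_t supported_mask, uint32_t layout_type)
{
    if (layout_type >= LAYOUT_TYPE_MAX_SKETCH)
        return false;

    return (supported_mask & (UINT32_C(1) << layout_type)) != 0;
}
```

With this ordering, a request carrying layout type 128 is rejected outright instead of depending on what the compiler and architecture do with an oversized shift.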
Trond Myklebust	af069f63c5	NFSv4: Fix an rcu lock leak commit `2e84611b3f` upstream. The intention in the original patch was to release the lock when we put the inode, however something got screwed up. Reported-by: Jason Yan <yanaijie@huawei.com> Fixes: `7b410d9ce4` ("pNFS: Delay getting the layout header in..") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:30 +02:00
Trond Myklebust	89207ffd22	pNFS/flexfiles: Check the result of nfs4_pnfs_ds_connect commit `260f32adb8` upstream. The check in nfs4_ff_layout_prepare_ds() seems to be missing. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: `a33e4b036d` ("pNFS: return status from nfs4_pnfs_ds_connect") Cc: Weston Andros Adamson <dros@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Benjamin Coddington	bc38735f21	NFS: Use GFP_NOIO for two allocations in writeback commit `ae97aa524e` upstream. Prevent a deadlock that can occur if we wait on allocations that try to write back our pages. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: `00bfa30abe` ("NFS: Create a common pgio_alloc and pgio_release...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Fred Isaman	2aa6b5f2ac	NFS: Fix use after free in write error path commit `1f84ccdf37` upstream. Signed-off-by: Fred Isaman <fred.isaman@gmail.com> Fixes: `0bcbf039f6` ("nfs: handle request add failure properly") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Trond Myklebust	e7ec98b137	NFSv4: Fix a hang in OPEN related to server reboot commit `56e0d71ef1` upstream. If the server fails to return the attributes as part of an OPEN reply, and then reboots, we can end up hanging. The reason is that the client attempts to send a GETATTR in order to pick up the missing OPEN call, but fails to release the slot first, causing reboot recovery to deadlock. Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Fixes: `2e80dbe7ac` ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Mario Kleiner	acef1b8125	drm/edid: Add 10 bpc quirk for LGD 764 panel in HP zBook 17 G2 commit `e345da82bd` upstream. The builtin eDP panel in the HP zBook 17 G2 supports 10 bpc, as advertised by the laptop's product specs and verified via injecting a fixed edid + photometer measurements, but edid reports unknown depth, so drivers fall back to 6 bpc. Add a quirk to get the full 10 bpc. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1492787108-23959-1-git-send-email-mario.kleiner.de@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Alexander Couzens	9c92bffbd8	mtd: nand: add ooblayout for old hamming layout commit `6a623e0769` upstream. The old 1-bit hamming layout requires ECC data to be placed at a fixed offset, and not necessarily at the end of the OOB area. Add this old layout back in order to fix legacy setups. Fixes: `41b207a70d` ("mtd: nand: implement the default mtd_ooblayout_ops") Signed-off-by: Alexander Couzens <lynxis@fe80.eu> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Brian Norris <computersforpeace@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Roger Quadros	2d3e850373	mtd: nand: omap2: Fix partition creation via cmdline mtdparts commit `2d283ede59` upstream. commit `c9711ec525` ("mtd: nand: omap: Clean up device tree support") caused the parent device name to be changed from "omap2-nand.0" to "<base address>.nand" (e.g. 30000000.nand on omap3 platforms). This caused mtd->name to be changed as well. This breaks partition creation via mtdparts passed by u-boot as it uses "omap2-nand.0" for the mtd-id. Fix this by explicitly setting the mtd->name to "omap2-nand.<CS number>" if it isn't already set by nand_set_flash_node(). CS number is the NAND controller instance ID. Fixes: `c9711ec525` ("mtd: nand: omap: Clean up device tree support") Reported-by: Leto Enrico <enrico.leto@siemens.com> Reported-by: Adam Ford <aford173@gmail.com> Suggested-by: Boris Brezillon <boris.brezillon@free-electrons.com> Tested-by: Adam Ford <aford173@gmail.com> Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Simon Baatz	750d21357e	mtd: nand: orion: fix clk handling commit `675b11d94c` upstream. The clk handling in orion_nand.c had two problems: - In the probe function, clk_put() was called for an enabled clock, which violates the API (see documentation for clk_put() in include/linux/clk.h) - In the error path of the probe function, clk_put() could be called twice for the same clock. In order to clean this up, use the managed function devm_clk_get() and store the pointer to the clk in the driver data. Fixes: `baffab28b1` ('ARM: Orion: fix driver probe error handling with respect to clk') Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
Lukas Wunner	6003ac5842	PCI: Freeze PME scan before suspending devices commit `ea00353f36` upstream. Laurent Pinchart reported that the Renesas R-Car H2 Lager board (r8a7790) crashes during suspend tests. Geert Uytterhoeven managed to reproduce the issue on an M2-W Koelsch board (r8a7791): It occurs when the PME scan runs, once per second. During PME scan, the PCI host bridge (rcar-pci) registers are accessed while its module clock has already been disabled, leading to the crash. One reproducer is to configure s2ram to use "s2idle" instead of "deep" suspend: # echo 0 > /sys/module/printk/parameters/console_suspend # echo s2idle > /sys/power/mem_sleep # echo mem > /sys/power/state Another reproducer is to write either "platform" or "processors" to /sys/power/pm_test. It does not (or is less likely) to happen during full system suspend ("core" or "none") because system suspend also disables timers, and thus the workqueue handling PME scans no longer runs. Geert believes the issue may still happen in the small window between disabling module clocks and disabling timers: # echo 0 > /sys/module/printk/parameters/console_suspend # echo platform > /sys/power/pm_test # Or "processors" # echo mem > /sys/power/state (Make sure CONFIG_PCI_RCAR_GEN2 and CONFIG_USB_OHCI_HCD_PCI are enabled.) Rafael Wysocki agrees that PME scans should be suspended before the host bridge registers become inaccessible. To that end, queue the task on a workqueue that gets frozen before devices suspend. Rafael notes however that as a result, some wakeup events may be missed if they are delivered via PME from a device without working IRQ (which hence must be polled) and occur after the workqueue has been frozen. If that turns out to be an issue in practice, it may be possible to solve it by calling pci_pme_list_scan() once directly from one of the host bridge's pm_ops callbacks. Stacktrace for posterity: PM: Syncing filesystems ... [ 38.566237] done. 
PM: Preparing system for sleep (mem) Freezing user space processes ... [ 38.579813] (elapsed 0.001 seconds) done. Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. PM: Suspending system (mem) PM: suspend of devices complete after 152.456 msecs PM: late suspend of devices complete after 2.809 msecs PM: noirq suspend of devices complete after 29.863 msecs suspend debug: Waiting for 5 second(s). Unhandled fault: asynchronous external abort (0x1211) at 0x00000000 pgd = c0003000 [00000000] pgd=80000040004003, pmd=00000000 Internal error: : 1211 [#1] SMP ARM Modules linked in: CPU: 1 PID: 20 Comm: kworker/1:1 Not tainted 4.9.0-rc1-koelsch-00011-g68db9bc814362e7f #3383 Hardware name: Generic R8A7791 (Flattened Device Tree) Workqueue: events pci_pme_list_scan task: eb56e140 task.stack: eb58e000 PC is at pci_generic_config_read+0x64/0x6c LR is at rcar_pci_cfg_base+0x64/0x84 pc : [<c041d7b4>] lr : [<c04309a0>] psr: 600d0093 sp : eb58fe98 ip : c041d750 fp : 00000008 r10: c0e2283c r9 : 00000000 r8 : 600d0013 r7 : 00000008 r6 : eb58fed6 r5 : 00000002 r4 : eb58feb4 r3 : 00000000 r2 : 00000044 r1 : 00000008 r0 : 00000000 Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 30c5387d Table: 6a9f6c80 DAC: 55555555 Process kworker/1:1 (pid: 20, stack limit = 0xeb58e210) Stack: (0xeb58fe98 to 0xeb590000) fe80: 00000002 00000044 fea0: eb6f5800 c041d9b0 eb58feb4 00000008 00000044 00000000 eb78a000 eb78a000 fec0: 00000044 00000000 eb9aff00 c0424bf0 eb78a000 00000000 eb78a000 c0e22830 fee0: ea8a6fc0 c0424c5c eaae79c0 c0424ce0 eb55f380 c0e22838 eb9a9800 c0235fbc ff00: eb55f380 c0e22838 eb55f380 eb9a9800 eb9a9800 eb58e000 eb9a9824 c0e02100 ff20: eb55f398 c02366c4 eb56e140 eb5631c0 00000000 eb55f380 c023641c 00000000 ff40: 00000000 00000000 00000000 c023a928 cd105598 00000000 40506a34 eb55f380 ff60: 00000000 00000000 dead4ead ffffffff ffffffff eb58ff74 eb58ff74 00000000 ff80: 00000000 dead4ead ffffffff ffffffff eb58ff90 eb58ff90 eb58ffac eb5631c0 ffa0: 
c023a844 00000000 00000000 c0206d68 00000000 00000000 00000000 00000000 ffc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ffe0: 00000000 00000000 00000000 00000000 00000013 00000000 3a81336c 10ccd1dd [<c041d7b4>] (pci_generic_config_read) from [<c041d9b0>] (pci_bus_read_config_word+0x58/0x80) [<c041d9b0>] (pci_bus_read_config_word) from [<c0424bf0>] (pci_check_pme_status+0x34/0x78) [<c0424bf0>] (pci_check_pme_status) from [<c0424c5c>] (pci_pme_wakeup+0x28/0x54) [<c0424c5c>] (pci_pme_wakeup) from [<c0424ce0>] (pci_pme_list_scan+0x58/0xb4) [<c0424ce0>] (pci_pme_list_scan) from [<c0235fbc>] (process_one_work+0x1bc/0x308) [<c0235fbc>] (process_one_work) from [<c02366c4>] (worker_thread+0x2a8/0x3e0) [<c02366c4>] (worker_thread) from [<c023a928>] (kthread+0xe4/0xfc) [<c023a928>] (kthread) from [<c0206d68>] (ret_from_fork+0x14/0x2c) Code: ea000000 e5903000 f57ff04f e3a00000 (e5843000) ---[ end trace 667d43ba3aa9e589 ]--- Fixes: `df17e62e5b` ("PCI: Add support for polling PME state on suspended legacy PCI devices") Reported-and-tested-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Cc: Simon Horman <horms+renesas@verge.net.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:29 +02:00
David Woodhouse	f21143a53f	PCI: Only allow WC mmap on prefetchable resources commit `cef4d02305` upstream. The /proc/bus/pci mmap interface allows the user to specify whether they want WC or not. Don't let them do so on non-prefetchable BARs. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
David Woodhouse	bee38f0f37	PCI: Fix another sanity check bug in /proc/pci mmap commit `17caf56731` upstream. Don't match MMIO maps with I/O BARs and vice versa. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
David Woodhouse	b87e133e3d	PCI: Fix pci_mmap_fits() for HAVE_PCI_RESOURCE_TO_USER platforms commit `6bccc7f426` upstream. In the PCI_MMAP_PROCFS case when the address being passed by the user is a 'user visible' resource address based on the bus window, and not the actual contents of the resource, that's what we need to be checking it against. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
K. Y. Srinivasan	3537380a05	PCI: hv: Specify CPU_AFFINITY_ALL for MSI affinity when >= 32 CPUs commit `433fcf6b7b` upstream. When we have 32 or more CPUs in the affinity mask, we should use a special constant to specify that to the host. Fix this issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
K. Y. Srinivasan	9dc0babcd2	PCI: hv: Allocate interrupt descriptors with GFP_ATOMIC commit `59c58ceeea` upstream. The memory allocation here needs to be non-blocking. Fix the issue. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Long Li <longli@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Tomasz Nowicki	569ed828de	PCI/ACPI: Add ThunderX pass2.x 2nd node MCFG quirk commit `cd18374048` upstream. Currently SoCs pass2.x do not emulate EA headers for ACPI boot method at all. However, for pass2.x some devices (like EDAC) advertise incorrect base addresses in their BARs which results in driver probe failure during resource request. Since all problematic blocks are on 2nd NUMA node under domain 10 add necessary quirk entry to obtain BAR addresses correction using EA header emulation. Fixes: `44f22bd91e` ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller") Signed-off-by: Tomasz Nowicki <tn@semihalf.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Robert Richter <rrichter@cavium.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Bjorn Helgaas	b4447dbb5c	PCI/ACPI: Tidy up MCFG quirk whitespace commit `ced414a14f` upstream. With no blank lines, it's not obvious where the macro definitions end and the uses begin. Add some blank lines and reorder the ThunderX definitions. No functional change intended. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Dawei Chien	d76f918859	thermal: mt8173: minor mtk_thermal.c cleanups commit `05d7839aa2` upstream. If a thermal bank has 4 sensors, the thermal driver should read TEMP_MSR3. However, the thermal driver currently never reads TEMP_MSR3, since the mt8173 thermal driver only uses 3 sensors on each thermal bank at the same time, so this patch does not affect temperature readings. Only if the mt8173 thermal driver used 4 sensors on a thermal bank would it read the third sensor twice and lose the value of the fourth sensor. Fixes: `b7cf005373` ("thermal: Add Mediatek thermal driver for mt2701.") Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Dawei Chien <dawei.chien@mediatek.com> Signed-off-by: Eduardo Valentin <edubezval@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Thomas Gleixner	63ab09e39d	tracing/kprobes: Enforce kprobes teardown after testing commit `30e7d894c1` upstream. Enabling the tracer selftest occasionally triggers the warning in text_poke(), which warns when the page to be modified is not marked reserved. The reason is that the tracer selftest installs kprobes on functions marked __init for testing. These probes are removed after the tests, but that removal schedules the delayed kprobes_optimizer work, which will do the actual text poke. If the work is executed after the init text is freed, then the warning triggers. The bug can be reproduced reliably when the work delay is increased. Flush the optimizer work and wait for the optimizing/unoptimizing lists to become empty before returning from the kprobes tracer selftest. That ensures that all operations which were queued due to the probes removal have completed. Link: http://lkml.kernel.org/r/20170516094802.76a468bb@gandalf.local.home Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `6274de498` ("kprobes: Support delayed unoptimizing") Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:28 +02:00
Arnd Bergmann	18b4b0ab65	firmware: ti_sci: fix strncat length check commit `76cefef8e8` upstream. gcc-7 notices that the length we pass to strncat is wrong: drivers/firmware/ti_sci.c: In function 'ti_sci_probe': drivers/firmware/ti_sci.c:204:32: error: specified bound 50 equals the size of the destination [-Werror=stringop-overflow=] Instead of the total length, we must pass the length of the remaining space here. Fixes: `aa276781a6` ("firmware: Add basic support for TI System Control Interface (TI-SCI) protocol") Acked-by: Nishanth Menon <nm@ti.com> Acked-by: Santosh Shilimkar <ssantosh@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
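The rule stated in 18b4b0ab65, that strncat() takes the remaining space rather than the total buffer size, can be shown with a short sketch. append_bounded() is a hypothetical wrapper, not code from ti_sci.c; it assumes dst already holds a NUL-terminated string that fits within cap bytes.

```c
#include <string.h>

/*
 * Hypothetical wrapper: append src to dst without overflowing a buffer of
 * cap bytes. strncat() writes up to n characters plus a terminating NUL,
 * so n must be the space left after the current string and its terminator.
 */
void append_bounded(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst);

    if (used + 1 >= cap)
        return;                     /* no room for even one more character */

    strncat(dst, src, cap - used - 1);
}
```

Passing cap itself as the bound, as the original code did with 50, would let strncat() write past the end of the destination whenever it is not empty.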
Masami Hiramatsu	e57f049b09	um: Fix to call read_initrd after init_bootmem commit `5b4236e17c` upstream. Since read_initrd() invokes alloc_bootmem() for allocating memory to load initrd image, it must be called after init_bootmem. This makes read_initrd() called directly from setup_arch() after init_bootmem() and mem_total_pages(). Fixes: `b63236972e` ("um: Setup physical memory in setup_arch()") Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Lars Ellenberg	ece6ff2261	drbd: fix request leak introduced by locking/atomic, kref: Kill kref_sub() commit `a00ebd1cf1` upstream. When killing kref_sub(), the unconditional additional kref_get() was not properly paired with the necessary kref_put(), causing a leak of struct drbd_requests (~ 224 Bytes) per submitted bio, and breaking DRBD in general, as the destructor of those "drbd_requests" does more than just the mempool_free(). Fixes: `bdfafc4ffd` ("locking/atomic, kref: Kill kref_sub()") Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Al Viro	7811108277	osf_wait4(): fix infoleak commit `a8c39544a6` upstream. failing sys_wait4() won't fill struct rusage... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Suzuki K Poulose	90c676ffab	kvm: arm/arm64: Force reading uncached stage2 PGD commit `2952a6070e` upstream. Make sure we don't use a cached value of the KVM stage2 PGD while resetting the PGD. Cc: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Suzuki K Poulose	59e1faddb3	kvm: arm/arm64: Fix use after free of stage2 page table commit `0c428a6a92` upstream. We yield the kvm->mmu_lock occasionally while performing an operation (e.g., unmap or permission changes) on a large area of stage2 mappings. However this could possibly cause another thread to clear and free up the stage2 page tables while we were waiting to regain the lock, and thus the original thread could end up accessing memory that was freed. This patch fixes the problem by making sure that the stage2 pagetable is still valid after we regain the lock. The fact that mmu_notifier->release() could be called twice (via __mmu_notifier_release and mmu_notifier_unregister) increases the possibility of hitting this race where there are two threads trying to unmap the entire guest shadow pages. While at it, clean up the redundant checks around cond_resched_lock in stage2_wp_range(), as cond_resched_lock already does the same checks. Cc: Mark Rutland <mark.rutland@arm.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: andreyknvl@google.com Cc: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Suzuki K Poulose	3bbe9f26a0	kvm: arm/arm64: Fix race in resetting stage2 PGD commit `6c0d706b56` upstream. In kvm_free_stage2_pgd() we check the stage2 PGD before holding the lock and proceed to take the lock if it is valid. And we unmap the page tables, followed by releasing the lock. We reset the PGD only after dropping this lock, which could cause a race condition where another thread waiting on or even holding the lock, could potentially see that the PGD is still valid and proceed to perform a stage2 operation and later encounter a NULL PGD. [223090.242280] Unable to handle kernel NULL pointer dereference at virtual address 00000040 [223090.262330] PC is at unmap_stage2_range+0x8c/0x428 [223090.262332] LR is at kvm_unmap_hva_handler+0x2c/0x3c [223090.262531] Call trace: [223090.262533] [<ffff0000080adb78>] unmap_stage2_range+0x8c/0x428 [223090.262535] [<ffff0000080adf40>] kvm_unmap_hva_handler+0x2c/0x3c [223090.262537] [<ffff0000080ace2c>] handle_hva_to_gpa+0xb0/0x104 [223090.262539] [<ffff0000080af988>] kvm_unmap_hva+0x5c/0xbc [223090.262543] [<ffff0000080a2478>] kvm_mmu_notifier_invalidate_page+0x50/0x8c [223090.262547] [<ffff0000082274f8>] __mmu_notifier_invalidate_page+0x5c/0x84 [223090.262551] [<ffff00000820b700>] try_to_unmap_one+0x1d0/0x4a0 [223090.262553] [<ffff00000820c5c8>] rmap_walk+0x1cc/0x2e0 [223090.262555] [<ffff00000820c90c>] try_to_unmap+0x74/0xa4 [223090.262557] [<ffff000008230ce4>] migrate_pages+0x31c/0x5ac [223090.262561] [<ffff0000081f869c>] compact_zone+0x3fc/0x7ac [223090.262563] [<ffff0000081f8ae0>] compact_zone_order+0x94/0xb0 [223090.262564] [<ffff0000081f91c0>] try_to_compact_pages+0x108/0x290 [223090.262569] [<ffff0000081d5108>] __alloc_pages_direct_compact+0x70/0x1ac [223090.262571] [<ffff0000081d64a0>] __alloc_pages_nodemask+0x434/0x9f4 [223090.262572] [<ffff0000082256f0>] alloc_pages_vma+0x230/0x254 [223090.262574] [<ffff000008235e5c>] do_huge_pmd_anonymous_page+0x114/0x538 [223090.262576] [<ffff000008201bec>] handle_mm_fault+0xd40/0x17a4 [223090.262577] [<ffff0000081fb324>] __get_user_pages+0x12c/0x36c [223090.262578] [<ffff0000081fb804>] get_user_pages_unlocked+0xa4/0x1b8 [223090.262579] [<ffff0000080a3ce8>] __gfn_to_pfn_memslot+0x280/0x31c [223090.262580] [<ffff0000080a3dd0>] gfn_to_pfn_prot+0x4c/0x5c [223090.262582] [<ffff0000080af3f8>] kvm_handle_guest_abort+0x240/0x774 [223090.262584] [<ffff0000080b2bac>] handle_exit+0x11c/0x1ac [223090.262586] [<ffff0000080ab99c>] kvm_arch_vcpu_ioctl_run+0x31c/0x648 [223090.262587] [<ffff0000080a1d78>] kvm_vcpu_ioctl+0x378/0x768 [223090.262590] [<ffff00000825df5c>] do_vfs_ioctl+0x324/0x5a4 [223090.262591] [<ffff00000825e26c>] SyS_ioctl+0x90/0xa4 [223090.262595] [<ffff000008085d84>] el0_svc_naked+0x38/0x3c This patch moves the stage2 PGD manipulation under the lock. Reported-by: Alexander Graf <agraf@suse.de> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Huacai Chen	a24a206192	MIPS: Loongson-3: Select MIPS_L1_CACHE_SHIFT_6 commit `17c99d9421` upstream. Some newer Loongson-3 have 64 bytes cache lines, so select MIPS_L1_CACHE_SHIFT_6. Signed-off-by: Huacai Chen <chenhc@lemote.com> Cc: John Crispin <john@phrozen.org> Cc: Steven J . Hill <Steven.Hill@caviumnetworks.com> Cc: Fuxin Zhang <zhangfx@lemote.com> Cc: Zhangjin Wu <wuzhangjin@gmail.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/15755/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Jon Derrick	7ce12d8b97	nvme: unmap CMB and remove sysfs file in reset path commit `f63572dff1` upstream. CMB doesn't get unmapped until removal while getting remapped on every reset. Add the unmapping and sysfs file removal to the reset path in nvme_pci_disable to match the mapping path in nvme_pci_enable. Fixes: `202021c1a` ("nvme : Add sysfs entry for NVMe CMBs when appropriate") Signed-off-by: Jon Derrick <jonathan.derrick@intel.com> Acked-by: Keith Busch <keith.busch@intel.com> Reviewed-By: Stephen Bates <sbates@raithlin.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:27 +02:00
Thomas Gleixner	ef1f1648fe	genirq: Fix chained interrupt data ordering commit `2c4569ca26` upstream. irq_set_chained_handler_and_data() sets up the chained interrupt and then stores the handler data. That's racy against an immediate interrupt which gets handled before the store of the handler data happened. The handler will dereference a NULL pointer and crash. Cure it by storing handler data before installing the chained handler. Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Johan Hovold	3a97bedaa3	uwb: fix device quirk on big-endian hosts commit `41318a2b82` upstream. Add missing endianness conversion when using the USB device-descriptor idProduct field to apply a hardware quirk. Fixes: `1ba47da527` ("uwb: add the i1480 DFU driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Daniel Micay	4c44438b56	stackprotector: Increase the per-task stack canary's random range from 32 bits to 64 bits on 64-bit platforms commit `5ea30e4e58` upstream. The stack canary is an 'unsigned long' and should be fully initialized to random data rather than only 32 bits of random data. Signed-off-by: Daniel Micay <danielmicay@gmail.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Arjan van Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20170504133209.3053-1-danielmicay@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
James Hogan	88f7b253a9	metag/uaccess: Check access_ok in strncpy_from_user commit `3a158a62da` upstream. The metag implementation of strncpy_from_user() doesn't validate the src pointer, which could allow reading of arbitrary kernel memory. Add a short access_ok() check to prevent that. It's still possible for it to read across the user/kernel boundary, but it will invariably reach a NUL character after only 9 bytes, leaking only a static kernel address being loaded into D0Re0 at the beginning of __start, which is acceptable for the immediate fix. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
James Hogan	160f7df55b	metag/uaccess: Fix access_ok() commit `8a8b56638b` upstream. The __user_bad() macro used by access_ok() has a few corner cases noticed by Al Viro where it doesn't behave correctly: - The kernel range check has off by 1 errors which permit access to the first and last byte of the kernel mapped range. - The kernel range check ends at LINCORE_BASE rather than META_MEMORY_LIMIT, which is ineffective when the kernel is in global space (an extremely uncommon configuration). There are a couple of other shortcomings here too: - Access to the whole of the other address space is permitted (i.e. the global half of the address space when the kernel is in local space). This isn't ideal as it could theoretically still contain privileged mappings set up by the bootloader. - The size argument is unused, permitting user copies which start on valid pages at the end of the user address range and cross the boundary into the kernel address space (e.g. addr = 0x3ffffff0, size > 0x10). It isn't very convenient to add size checks when disallowing certain regions, and it seems far safer to be sure and explicit about what userland is able to access, so invert the logic to allow certain regions instead, and fix the off by 1 errors and missing size checks. This also allows the get_fs() == KERNEL_DS check to be more easily optimised into the user address range case. We now have 3 such allowed regions: - The user address range (incorporating the get_fs() == KERNEL_DS check). - NULL (some kernel code expects this to work, and we'll always catch the fault anyway). - The core code memory region. Fixes: `373cd784d0` ("metag: Memory handling") Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: linux-metag@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Li, Fei	4b70137931	cpuidle: check dev before usage in cpuidle_use_deepest_state() commit `41dc750ea6` upstream. If no cpuidle devices are registered, dev will be NULL and a panic is triggered as shown below. This patch adds a check of dev before use, as is done in cpuidle_idle_call. Panic without fix: [ 184.961328] BUG: unable to handle kernel NULL pointer dereference at (null) [ 184.961328] IP: cpuidle_use_deepest_state+0x30/0x60 ... [ 184.961328] play_idle+0x8d/0x210 [ 184.961328] ? __schedule+0x359/0x8e0 [ 184.961328] ? _raw_spin_unlock_irqrestore+0x28/0x50 [ 184.961328] ? kthread_queue_delayed_work+0x41/0x80 [ 184.961328] clamp_idle_injection_func+0x64/0x1e0 Fixes: `bb8313b603` (cpuidle: Allow enforcing deepest idle state selection) Signed-off-by: Li, Fei <fei.li@intel.com> Tested-by: Shi, Feng <fengx.shi@intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
KarimAllah Ahmed	4fd7fd47a0	iommu/vt-d: Flush the IOTLB to get rid of the initial kdump mappings commit `f73a7eee90` upstream. Ever since commit `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") the kdump kernel copies the IOMMU context tables from the previous kernel. Each device mappings will be destroyed once the driver for the respective device takes over. This unfortunately breaks the workflow of mapping and unmapping a new context to the IOMMU. The mapping function assumes that either: 1) Unmapping did the proper IOMMU flushing and it only ever flush if the IOMMU unit supports caching invalid entries. 2) The system just booted and the initialization code took care of flushing all IOMMU caches. This assumption is not true for the kdump kernel since the context tables have been copied from the previous kernel and translations could have been cached ever since. So make sure to flush the IOTLB as well when we destroy these old copied mappings. Cc: Joerg Roedel <joro@8bytes.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Anthony Liguori <aliguori@amazon.com> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Acked-by: David Woodhouse <dwmw@amazon.co.uk> Fixes: `091d42e43d` ("iommu/vt-d: Copy translation tables from old kernel") Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	1cfc2a16c5	staging: rtl8192e: GetTs Fix invalid TID 7 warning. commit `95d93e271d` upstream. TID 7 is a valid value for QoS IEEE 802.11e. The switch statement that follows states 7 is valid. Remove function IsACValid and use the default case to filter invalid TIDs. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	7369a3b5ec	staging: rtl8192e: rtl92e_get_eeprom_size Fix read size of EPROM_CMD. commit `90be652c9f` upstream. EPROM_CMD is 2 byte aligned on PCI map so calling with rtl92e_readl will return invalid data so use rtl92e_readw. The device is unable to select the right eeprom type. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	3e57290a19	staging: rtl8192e: fix 2 byte alignment of register BSSIDR. commit `867510bde1` upstream. BSSIDR has two byte alignment on PCI ioremap correct the write by swapping to 16 bits first. This fixes a problem that the device associates fail because the filter is not set correctly. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:26 +02:00
Malcolm Priestley	d92507d9e5	staging: rtl8192e: rtl92e_fill_tx_desc fix write to mapped out memory. commit `baabd567f8` upstream. The driver attempts to alter memory that is mapped to PCI device. This is because tx_fwinfo_8190pci points to skb->data Move the pci_map_single to when completed buffer is ready to be mapped with psdec is empty to drop on mapping error. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Phil Elwell	e478ae4e63	staging: vc04_services: Fix bulk cache maintenance commit `ff92b9e3c9` upstream. vchiq_arm supports transfers less than one page and at arbitrary alignment, using the dma-mapping API to perform its cache maintenance (even though the VPU drives the DMA hardware). Read (DMA_FROM_DEVICE) operations use cache invalidation for speed, falling back to clean+invalidate on partial cache lines, with writes (DMA_TO_DEVICE) using flushes. If a read transfer has ends which aren't page-aligned, performing cache maintenance as if they were whole pages can lead to memory corruption since the partial cache lines at the ends (and any cache lines before or after the transfer area) will be invalidated. This bug was masked until the disabling of the cache flush in flush_dcache_page(). Honouring the requested transfer start- and end-points prevents the corruption. Fixes: `cf9caf1929` ("staging: vc04_services: Replace dmac_map_area with dmac_map_sg") Signed-off-by: Phil Elwell <phil@raspberrypi.org> Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Tested-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	e3b8dd546d	arm64: documentation: document tagged pointer stack constraints commit `f0e421b1bf` upstream. Some kernel features don't currently work if a task puts a non-zero address tag in its stack pointer, frame pointer, or frame record entries (FP, LR). For example, with a tagged stack pointer, the kernel can't deliver signals to the process, and the task is killed instead. As another example, with a tagged frame pointer or frame records, perf fails to generate call graphs or resolve symbols. For now, just document these limitations, instead of finding and fixing everything that doesn't work, as it's not known if anyone needs to use tags in these places anyway. In addition, as requested by Dave Martin, generalize the limitations into a general kernel address tag policy, and refactor tagged-pointers.txt to include it. Fixes: `d50240a5f6` ("arm64: mm: permit use of tagged pointers at EL0") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	45b1f35406	arm64: entry: improve data abort handling of tagged pointers commit `276e93279a` upstream. When handling a data abort from EL0, we currently zero the top byte of the faulting address, as we assume the address is a TTBR0 address, which may contain a non-zero address tag. However, the address may be a TTBR1 address, in which case we should not zero the top byte. This patch fixes that. The effect is that the full TTBR1 address is passed to the task's signal handler (or printed out in the kernel log). When handling a data abort from EL1, we leave the faulting address intact, as we assume it's either a TTBR1 address or a TTBR0 address with tag 0x00. This is true as far as I'm aware, we don't seem to access a tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to forget about address tags, and code added in the future may not always remember to remove tags from addresses before accessing them. So add tag handling to the EL1 data abort handler as well. This also makes it consistent with the EL0 data abort handler. Fixes: `d50240a5f6` ("arm64: mm: permit use of tagged pointers at EL0") Reviewed-by: Dave Martin <Dave.Martin@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	684f9dee46	arm64: hw_breakpoint: fix watchpoint matching for tagged pointers commit `7dcd9dd8ce` upstream. When we take a watchpoint exception, the address that triggered the watchpoint is found in FAR_EL1. We compare it to the address of each configured watchpoint to see which one was hit. The configured watchpoint addresses are untagged, while the address in FAR_EL1 will have an address tag if the data access was done using a tagged address. The tag needs to be removed to compare the address to the watchpoints. Currently we don't remove it, and as a result can report the wrong watchpoint as being hit (specifically, always either the highest TTBR0 watchpoint or lowest TTBR1 watchpoint). This patch removes the tag. Fixes: `d50240a5f6` ("arm64: mm: permit use of tagged pointers at EL0") Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Kristina Martsenko	5de6ca05b6	arm64: traps: fix userspace cache maintenance emulation on a tagged pointer commit `81cddd65b5` upstream. When we emulate userspace cache maintenance in the kernel, we can currently send the task a SIGSEGV even though the maintenance was done on a valid address. This happens if the address has a non-zero address tag, and happens to not be mapped in. When we get the address from a user register, we don't currently remove the address tag before performing cache maintenance on it. If the maintenance faults, we end up in either __do_page_fault, where find_vma can't find the VMA if the address has a tag, or in do_translation_fault, where the tagged address will appear to be above TASK_SIZE. In both cases, the address is not mapped in, and the task is sent a SIGSEGV. This patch removes the tag from the address before using it. With this patch, the fault is handled correctly, the address gets mapped in, and the cache maintenance succeeds. As a second bug, if cache maintenance (correctly) fails on an invalid tagged address, the address gets passed into arm64_notify_segfault, where find_vma fails to find the VMA due to the tag, and the wrong si_code may be sent as part of the siginfo_t of the segfault. With this patch, the correct si_code is sent. Fixes: `7dd01aef05` ("arm64: trap userspace "dc cvau" cache operation on errata-affected core") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	87e2b8c096	arm64: uaccess: ensure extension of access_ok() addr commit `a06040d7a7` upstream. Our access_ok() simply hands its arguments over to __range_ok(), which implicitly assumes that the addr parameter is 64 bits wide. This isn't necessarily true for compat code, which might pass down a 32-bit address parameter. In these cases, we don't have a guarantee that the address has been zero extended to 64 bits, and the upper bits of the register may contain unknown values, potentially resulting in a spurious failure. Avoid this by explicitly casting the addr parameter to an unsigned long (as is done on other architectures), ensuring that the parameter is widened appropriately. Fixes: `0aea86a217` ("arm64: User access library functions") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	1f13dc4c22	arm64: armv8_deprecated: ensure extension of addr commit `55de49f9aa` upstream. Our compat swp emulation holds the compat user address in an unsigned int, which it passes to __user_swpX_asm(). When a 32-bit value is passed in a register, the upper 32 bits of the register are unknown, and we must extend the value to 64 bits before we can use it as a base address. This patch casts the address to unsigned long to ensure it has been suitably extended, avoiding the potential issue, and silencing a related warning from clang. Fixes: `bd35a4adc4` ("arm64: Port SWP/SWPB emulation support from arm") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	2339f63ef9	arm64: ensure extension of smp_store_release value commit `994870bead` upstream. When an inline assembly operand's type is narrower than the register it is allocated to, the least significant bits of the register (up to the operand type's width) are valid, and any other bits are permitted to contain any arbitrary value. This aligns with the AAPCS64 parameter passing rules. Our __smp_store_release() implementation does not account for this, and implicitly assumes that operands have been zero-extended to the width of the type being stored to. Thus, we may store unknown values to memory when the value type is narrower than the pointer type (e.g. when storing a char to a long). This patch fixes the issue by casting the value operand to the same width as the pointer operand in all cases, which ensures that the value is zero-extended as we expect. We use the same union trickery as __smp_load_acquire and {READ,WRITE}_ONCE() to avoid GCC complaining that pointers are potentially cast to narrower width integers in unreachable paths. A whitespace issue at the top of __smp_store_release() is also corrected. No changes are necessary for __smp_load_acquire(). Load instructions implicitly clear any upper bits of the register, and the compiler will only consider the least significant bits of the register as valid regardless. Fixes: `47933ad41a` ("arch: Introduce smp_load_acquire(), smp_store_release()") Fixes: `878a84d5a8` ("arm64: add missing data types in smp_load_acquire/smp_store_release") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:25 +02:00
Mark Rutland	b55563999f	arm64: xchg: hazard against entire exchange variable commit `fee960bed5` upstream. The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a u8 pointer, and thus the hazard only applies to the first byte of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, as demonstrated with the following test case: union u { unsigned long l; unsigned int i[2]; }; unsigned long update_char_hazard(union u *u) { unsigned int a, b; a = u->i[1]; asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL)); b = u->i[1]; return a ^ b; } unsigned long update_long_hazard(union u *u) { unsigned int a, b; a = u->i[1]; asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL)); b = u->i[1]; return a ^ b; } The linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when using -O2 or above: 0000000000000000 <update_char_hazard>: 0: d2800001 mov x1, #0x0 // #0 4: f9000001 str x1, [x0] 8: d2800000 mov x0, #0x0 // #0 c: d65f03c0 ret 0000000000000010 <update_long_hazard>: 10: b9400401 ldr w1, [x0,#4] 14: d2800002 mov x2, #0x0 // #0 18: f9000002 str x2, [x0] 1c: b9400400 ldr w0, [x0,#4] 20: 4a000020 eor w0, w1, w0 24: d65f03c0 ret This patch fixes the issue by passing an unsigned long pointer into the +Q constraint, as we do for our cmpxchg code. This may hazard against more than is necessary, but this is better than missing a necessary hazard. Fixes: `305d454aaa` ("arm64: atomics: implement native {relaxed, acquire, release} atomics") Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Daniel Lezcano	ee6dfbff78	arm64: dts: hi6220: Reset the mmc hosts commit `0fbdf9953b` upstream. The MMC hosts could be left in an inconsistent or uninitialized state by the firmware. Instead of assuming the firmware did the right thing, let's reset the host controllers. This change fixes a bug when the mmc2/sdio is initialized leading to a hung task: [ 242.704294] INFO: task kworker/7:1:675 blocked for more than 120 seconds. [ 242.711129] Not tainted 4.9.0-rc8-00017-gcf0251f #3 [ 242.716571] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 242.724435] kworker/7:1 D 0 675 2 0x00000000 [ 242.729973] Workqueue: events_freezable mmc_rescan [ 242.734796] Call trace: [ 242.737269] [<ffff00000808611c>] __switch_to+0xa8/0xb4 [ 242.742437] [<ffff000008d07c04>] __schedule+0x1c0/0x67c [ 242.747689] [<ffff000008d08254>] schedule+0x40/0xa0 [ 242.752594] [<ffff000008d0b284>] schedule_timeout+0x1c4/0x35c [ 242.758366] [<ffff000008d08e38>] wait_for_common+0xd0/0x15c [ 242.763964] [<ffff000008d09008>] wait_for_completion+0x28/0x34 [ 242.769825] [<ffff000008a1a9f4>] mmc_wait_for_req_done+0x40/0x124 [ 242.775949] [<ffff000008a1ab98>] mmc_wait_for_req+0xc0/0xf8 [ 242.781549] [<ffff000008a1ac3c>] mmc_wait_for_cmd+0x6c/0x84 [ 242.787149] [<ffff000008a26610>] mmc_io_rw_direct_host+0x9c/0x114 [ 242.793270] [<ffff000008a26aa0>] sdio_reset+0x34/0x7c [ 242.798347] [<ffff000008a1d46c>] mmc_rescan+0x2fc/0x360 [ ... ] Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Wei Xu <xuwei5@hisilicon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Leonard Crestez	1116274777	ARM: dts: imx6sx-sdb: Remove OPP override commit `d8581c7c8b` upstream. The board file for imx6sx-sdb overrides cpufreq operating points to use higher voltages. This is done because the board has a shared rail for VDD_ARM_IN and VDD_SOC_IN and when using LDO bypass the shared voltage needs to be a value suitable for both ARM and SOC. This only applies to LDO bypass mode, a feature not present in upstream. When LDOs are enabled the effect is to use higher voltages than necessary for no good reason. Setting these higher voltages can make some boards fail to boot with ugly semi-random crashes reminiscent of memory corruption. These failures only happen on board rev. C, rev. B is reported to still work. Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com> Fixes: `54183bd7f7` ("ARM: imx6sx-sdb: add revb board and make it default") Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Ludovic Desroches	8539c6b56a	ARM: dts: at91: sama5d3_xplained: not all ADC channels are available commit `d3df1ec063` upstream. Remove ADC channels that are not available by default on the sama5d3_xplained board (resistor not populated) in order to not create confusion. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Ludovic Desroches	87cb676e83	ARM: dts: at91: sama5d3_xplained: fix ADC vref commit `9cdd31e591` upstream. The voltage reference for the ADC is not 3V but 3.3V since it is connected to VDDANA. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Vladimir Murzin	6b9e12331a	ARM: 8670/1: V7M: Do not corrupt vector table around v7m_invalidate_l1 call commit `6d80594936` upstream. We save/restore registers around v7m_invalidate_l1 to the address pointed to by r12, which is the vector table, so the first eight entries are overwritten with garbage. We already have the stack set up at that stage, so use it to save/restore the registers. Fixes: `6a8146f420` ("ARM: 8609/1: V7M: Add support for the Cortex-M7 processor") Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Jon Medhurst	3e9570206d	ARM: 8667/3: Fix memory attribute inconsistencies when using fixmap commit `b089c31c51` upstream. To cope with the variety in ARM architectures and configurations, the pagetable attributes for kernel memory are generated at runtime to match the system the kernel finds itself on. This calculated value is stored in pgprot_kernel. However, when early fixmap support was added for ARM (commit `a5f4c561b3`) the attributes used for mappings were hard coded because pgprot_kernel is not set up early enough. Unfortunately, when fixmap is used after early boot this means the memory being mapped can have different attributes to existing mappings, potentially leading to unpredictable behaviour. A specific problem also exists due to the hard coded values not including the 'shareable' attribute, which means that on systems where this matters (e.g. those with multiple CPU clusters) the cache contents for a memory location can become inconsistent between CPUs. To resolve these issues we change fixmap to use the same memory attributes (from pgprot_kernel) that the rest of the kernel uses. To enable this we need to refactor the initialisation code so build_mem_type_table() is called early enough. Note that this relies on early param parsing for memory type overrides passed via the kernel command line, so we need to make sure this call is still after parse_early_params(). [ardb: keep early_fixmap_init() before param parsing, for earlycon] Fixes: `a5f4c561b3` ("ARM: 8415/1: early fixmap support for earlycon") Tested-by: afzal mohammed <afzal.mohd.ma@gmail.com> Signed-off-by: Jon Medhurst <tixy@linaro.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Ard Biesheuvel	7e2aa72f11	ARM: 8662/1: module: split core and init PLT sections commit `b7ede5a1f5` upstream. Since commit `35fa91eed8` ("ARM: kernel: merge core and init PLTs"), the ARM module PLT code allocates all PLT entries in a single core section, since the overhead of having a separate init PLT section is not justified by the small number of PLT entries usually required for init code. However, the core and init module regions are allocated independently, and there is a corner case where the core region may be allocated from the VMALLOC region if the dedicated module region is exhausted, but the init region, being much smaller, can still be allocated from the module region. This puts the PLT entries out of reach of the relocated branch instructions, defeating the whole purpose of PLTs. So split the core and init PLT regions, and name the latter ".init.plt" so it gets allocated along with (and sufficiently close to) the .init sections that it serves. Also, given that init PLT entries may need to be emitted for branches that target the core module, modify the logic that disregards defined symbols to only disregard symbols that are defined in the same section. Fixes: `35fa91eed8` ("ARM: kernel: merge core and init PLTs") Reported-by: Angus Clark <angus@angusclark.org> Tested-by: Angus Clark <angus@angusclark.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Zhichao Huang	ae9bd04a13	KVM: arm: plug potential guest hardware debug leakage commit `661e6b02b5` upstream. Hardware debugging in guests is not intercepted currently, which means that a malicious guest can bring down the entire machine by writing to the debug registers. This patch enables trapping of all debug registers, preventing guests from accessing them. This includes access to the debug mode (DBGDSCR) in the guest world all the time, which could otherwise mess with the host state. Reads return 0 and writes are ignored (RAZ_WI). The result is that the guest cannot detect any working hardware-based debug support. As debug exceptions are still routed to the guest, normal debugging using software-based breakpoints still works. To support debugging using hardware registers we need to implement a debug register aware world switch as well as special trapping for registers that may affect the host state. Signed-off-by: Zhichao Huang <zhichao.huang@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:24 +02:00
Marc Zyngier	c870815609	KVM: arm/arm64: vgic-v3: Do not use Active+Pending state for a HW interrupt commit `3d6e77ad14` upstream. When an interrupt is injected with the HW bit set (indicating that deactivation should be propagated to the physical distributor), special care must be taken so that we never mark the corresponding LR with the Active+Pending state (as the pending state is kept in the physical distributor). Fixes: `59529f69f5` ("KVM: arm/arm64: vgic-new: Add GICv3 world switch backend") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Marc Zyngier	181339b540	KVM: arm/arm64: vgic-v2: Do not use Active+Pending state for a HW interrupt commit `ddf42d068f` upstream. When an interrupt is injected with the HW bit set (indicating that deactivation should be propagated to the physical distributor), special care must be taken so that we never mark the corresponding LR with the Active+Pending state (as the pending state is kept in the physical distributor). Fixes: `140b086dd1` ("KVM: arm/arm64: vgic-new: Add GICv2 world switch backend") Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Marc Zyngier	50f665c39d	arm: KVM: Do not use stack-protector to compile HYP code commit `501ad27c67` upstream. We like living dangerously. Nothing explicitly forbids stack-protector to be used in the HYP code, while distributions routinely compile their kernel with it. We're just lucky that no code actually triggers the instrumentation. Let's not try our luck for much longer, and disable stack-protector for code living at HYP. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Marc Zyngier	c735d4e3c2	arm64: KVM: Do not use stack-protector to compile EL2 code commit `cde13b5dad` upstream. We like living dangerously. Nothing explicitly forbids stack-protector to be used in the EL2 code, while distributions routinely compile their kernel with it. We're just lucky that no code actually triggers the instrumentation. Let's not try our luck for much longer, and disable stack-protector for code living at EL2. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Michael Neuling	75295195f7	powerpc/tm: Fix FP and VMX register corruption commit `f48e91e87e` upstream. In commit `dc3106690b` ("powerpc: tm: Always use fp_state and vr_state to store live registers"), a section of code was removed that copied the current state to checkpointed state. That code should not have been removed. When an FP (Floating Point) unavailable is taken inside a transaction, we need to abort the transaction. This is because at the time of the tbegin, the FP state is bogus so the state stored in the checkpointed registers is incorrect. To fix this, we treclaim (to get the checkpointed GPRs) and then copy the thread_struct FP live state into the checkpointed state. We then trecheckpoint so that the FP state is correctly restored into the CPU. The copying of the FP registers from live to checkpointed is what was missing. This simplifies the logic slightly from the original patch. tm_reclaim_thread() will now always write the checkpointed FP state. Either the checkpointed FP state will be written as part of the actual treclaim (in tm.S), or it'll be a copy of the live state. Which one we use is based on MSR[FP] from userspace. Similarly for VMX. Fixes: `dc3106690b` ("powerpc: tm: Always use fp_state and vr_state to store live registers") Signed-off-by: Michael Neuling <mikey@neuling.org> Reviewed-by: cyrilbur@gmail.com Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Michael Ellerman	f9fde78726	powerpc/mm: Fix crash in page table dump with huge pages commit `bfb9956ab4` upstream. The page table dump code doesn't know about huge pages, so currently it crashes (or walks random memory, usually leading to a crash), if it finds a huge page. On Book3S we only see huge pages in the Linux page tables when we're using the P9 Radix MMU. Teaching the code to properly handle huge pages is a bit more involved, so for now just prevent the crash. Fixes: `8eb07b1870` ("powerpc/mm: Dump linux pagetables") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
LiuHailong	1d19e3f139	powerpc/64e: Fix hang when debugging programs with relocated kernel commit `fd615f69a1` upstream. Debug interrupts can be taken during interrupt entry, since interrupt entry does not automatically turn them off. The kernel will check whether the faulting instruction is between [interrupt_base_book3e, __end_interrupts], and if so clear MSR[DE] and return. However, when the kernel is built with CONFIG_RELOCATABLE, it can't use LOAD_REG_IMMEDIATE(r14,interrupt_base_book3e) and LOAD_REG_IMMEDIATE(r15,__end_interrupts), as they ignore relocation. Thus, if the kernel is actually running at a different address than it was built at, the address comparison will fail, and the exception entry code will hang at kernel_dbg_exc. r2(toc) is also not usable here, as r2 still holds data from the interrupted context, so LOAD_REG_ADDR() doesn't work either. So we use the name@got to get the EV of two labels directly. Test programs test.c shows as follows: int main(int argc, char *argv[]) { if (access("/proc/sys/kernel/perf_event_paranoid", F_OK) == -1) printf("Kernel doesn't have perf_event support\n"); } Steps to reproduce the bug, for example: 1) ./gdb ./test 2) (gdb) b access 3) (gdb) r 4) (gdb) s Signed-off-by: Liu Hailong <liu.hailong6@zte.com.cn> Signed-off-by: Jiang Xuexin <jiang.xuexin@zte.com.cn> Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn> Reviewed-by: Liu Song <liu.song11@zte.com.cn> Reviewed-by: Huang Jian <huang.jian@zte.com.cn> [scottwood: cleaned up commit message, and specified bad behavior as a hang rather than an oops to correspond to mainline kernel behavior] Fixes: `1cb6e06492` ("powerpc/book3e: support CONFIG_RELOCATABLE") Signed-off-by: Scott Wood <oss@buserror.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Alistair Popple	31f96d860e	powerpc/powernv: Fix TCE kill on NVLink2 commit `6b3d12a948` upstream. Commit `616badd2fb` ("powerpc/powernv: Use OPAL call for TCE kill on NVLink2") forced all TCE kills to go via the OPAL call for NVLink2. However the PHB3 implementation of TCE kill was still being called directly from some functions which in some circumstances caused a machine check. This patch adds an equivalent IODA2 version of the function which uses the correct invalidation method depending on PHB model and changes all external callers to use it instead. Fixes: `616badd2fb` ("powerpc/powernv: Use OPAL call for TCE kill on NVLink2") Signed-off-by: Alistair Popple <alistair@popple.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Alexey Kardashevskiy	8781143e13	powerpc/iommu: Do not call PageTransHuge() on tail pages commit `e889e96e98` upstream. The CMA pages migration code does not support compound pages at the moment, so it performs a few tests before proceeding to actual page migration. One of the tests - PageTransHuge() - has VM_BUG_ON_PAGE(PageTail()) as it is designed to be called on head pages only. Since we also test for PageCompound(), and it contains PageTail() and PageHead(), we can simplify the check by leaving just PageCompound() and therefore avoid possible VM_BUG_ON_PAGE. Fixes: `2e5bbb5461` ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA") Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:23 +02:00
Tyrel Datwyler	eb0807b288	powerpc/sysfs: Fix reference leak of cpu device_nodes present at boot commit `e76ca27790` upstream. For CPUs present at boot each logical CPU acquires a reference to the associated device node of the core. This happens in register_cpu() which is called by topology_init(). The result of this is that we end up with a reference held by each thread of the core. However, these references are never freed if the CPU core is DLPAR removed. This patch fixes the reference leaks by acquiring and releasing the references in the CPU hotplug callbacks un/register_cpu_online(). With this patch symmetric reference counting is observed with both CPUs present at boot, and those DLPAR added after boot. Fixes: `f86e4718f2` ("driver/core: cpu: initialize of_node in cpu's device struture") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Tyrel Datwyler	d731227327	powerpc/pseries: Fix of_node_put() underflow during DLPAR remove commit `68baf692c4` upstream. Historically struct device_node references were tracked using a kref embedded as a struct field. Commit `75b57ecf9d` ("of: Make device nodes kobjects so they show up in sysfs") (Mar 2014) refactored device_nodes to be kobjects such that the device tree could be more simply exposed to userspace using sysfs. Commit `0829f6d1f6` ("of: device_node kobject lifecycle fixes") (Mar 2014) followed up these changes to better control the kobject lifecycle and in particular the reference counting via of_node_get(), of_node_put(), and of_node_init(). A result of this second commit was that it introduced an of_node_put() call when a dynamic node is detached, in of_node_remove(), that removes the initial kobj reference created by of_node_init(). Traditionally, as the original dynamic device node user, the pseries code had assumed responsibility for releasing this final reference in its platform specific DLPAR detach code. This patch fixes a refcount underflow introduced by commit `0829f6d1f6`, and recently exposed by the upstreaming of the refcount API. Messages like the following are no longer seen in the kernel log with this patch following DLPAR remove operations of cpus and pci devices. rpadlpar_io: slot PHB 72 removed refcount_t: underflow; use-after-free. ------------[ cut here ]------------ WARNING: CPU: 5 PID: 3335 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110 Fixes: `0829f6d1f6` ("of: device_node kobject lifecycle fixes") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> [mpe: Make change log commit references more verbose] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Mahesh Salgaonkar	11e382b3b8	powerpc/book3s/mce: Move add_taint() later in virtual mode commit `d93b0ac01a` upstream. machine_check_early() gets called in real mode. The very first time add_taint() is called, it prints a warning, which ends up making an OPAL call (using the OPAL_CALL wrapper) to write it to the console. If we get the very first machine check while we are in OPAL we are doomed. OPAL_CALL overwrites the PACASAVEDMSR in r13, and in this case when we are done with MCE handling the original OPAL call will use this new MSR on its way back to opal_return. This usually leads to unexpected behaviour or a kernel panic. Instead move the add_taint() call later, into virtual mode, where it is safe to call. This is broken with the current FW level. We have been lucky so far not to get the very first MCE while in OPAL, but it is easily reproducible on Mambo. Fixes: `27ea2c420c` ("powerpc: Set the correct kernel taint on machine check errors.") Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Russell Currey	e548d552c0	powerpc/eeh: Avoid use after free in eeh_handle_special_event() commit `daeba2956f` upstream. eeh_handle_special_event() is called when an EEH event is detected but can't be narrowed down to a specific PE. This function looks through every PE to find one in an erroneous state, then calls the regular event handler eeh_handle_normal_event() once it knows which PE has an error. However, if eeh_handle_normal_event() found that the PE cannot possibly be recovered, it will free it, rendering the passed PE stale. This leads to a use after free in eeh_handle_special_event() as it attempts to clear the "recovering" state on the PE after eeh_handle_normal_event() returns. Thus, make sure the PE is valid when attempting to clear state in eeh_handle_special_event(). Fixes: `8a6b1bc70d` ("powerpc/eeh: EEH core to handle special event") Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Russell Currey <ruscur@russell.cc> Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
David Gibson	63f44c113b	powerpc/mm: Ensure IRQs are off in switch_mm() commit `9765ad134a` upstream. powerpc expects IRQs to already be (soft) disabled when switch_mm() is called, as made clear in the commit message of `9c1e105238` ("powerpc: Allow perf_counters to access user memory at interrupt time"). Aside from any race conditions that might exist between switch_mm() and an IRQ, there is also an unconditional hard_irq_disable() in switch_slb(). If that isn't followed at some point by an IRQ enable then interrupts will remain disabled until we return to userspace. It is true that when switch_mm() is called from the scheduler IRQs are off, but not when it's called by use_mm(). Looking closer we see that last year in commit `f98db6013c` ("sched/core: Add switch_mm_irqs_off() and use it in the scheduler") this was made more explicit by the addition of switch_mm_irqs_off() which is now called by the scheduler, vs switch_mm() which is used by use_mm(). Arguably it is a bug in use_mm() to call switch_mm() in a different context than it expects, but fixing that will take time. This was discovered recently when vhost started throwing warnings such as: BUG: sleeping function called from invalid context at kernel/mutex.c:578 in_atomic(): 0, irqs_disabled(): 1, pid: 10768, name: vhost-10760 no locks held by vhost-10760/10768. 
irq event stamp: 10 hardirqs last enabled at (9): _raw_spin_unlock_irq+0x40/0x80 hardirqs last disabled at (10): switch_slb+0x2e4/0x490 softirqs last enabled at (0): copy_process+0x5e8/0x1260 softirqs last disabled at (0): (null) Call Trace: show_stack+0x88/0x390 (unreliable) dump_stack+0x30/0x44 __might_sleep+0x1c4/0x2d0 mutex_lock_nested+0x74/0x5c0 cgroup_attach_task_all+0x5c/0x180 vhost_attach_cgroups_work+0x58/0x80 [vhost] vhost_worker+0x24c/0x3d0 [vhost] kthread+0xec/0x100 ret_from_kernel_thread+0x5c/0xd4 Prior to commit `04b96e5528` ("vhost: lockless enqueuing") (Aug 2016) the vhost_worker() would do a spin_unlock_irq() not long after calling use_mm(), which had the effect of reenabling IRQs. Since that commit removed the locking in vhost_worker() the body of the vhost_worker() loop now runs with interrupts off causing the warnings. This patch addresses the problem by making the powerpc code mirror the x86 code, ie. we disable interrupts in switch_mm(), and optimise the scheduler case by defining switch_mm_irqs_off(). Signed-off-by: David Gibson <david@gibson.dropbear.id.au> [mpe: Flesh out/rewrite change log, add stable] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Johan Hovold	5617775ed1	cx231xx-cards: fix NULL-deref at probe commit `0cd273bb5e` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `e0d3bafd02` ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Johan Hovold	6f623a3490	cx231xx-audio: fix NULL-deref at probe commit `65f921647f` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `e0d3bafd02` ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Johan Hovold	8bb9fa63d0	cx231xx-audio: fix init error path commit `fff1abc4d5` upstream. Make sure to release the snd_card also on a late allocation error. Fixes: `e0d3bafd02` ("V4L/DVB (10954): Add cx231xx USB driver") Cc: Sri Deevi <Srinivasa.Deevi@conexant.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Alyssa Milburn	a28e1f13a6	dw2102: limit messages to buffer size commit `950e252cb4` upstream. Otherwise the i2c transfer functions can read or write beyond the end of stack or heap buffers. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Alyssa Milburn	f953d6b03a	digitv: limit messages to buffer size commit `821117dc21` upstream. Return an error rather than memcpy()ing beyond the end of the buffer. Internal callers use appropriate sizes, but digitv_i2c_xfer may not. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:22 +02:00
Daniel Scheller	9e074c588d	dvb-frontends/cxd2841er: define symbol_rate_min/max in T/C fe-ops commit `158f0328af` upstream. Fixes "w_scan -f c" complaining with This dvb driver is buggy: the symbol rate limits are undefined - please report to linuxtv.org) Signed-off-by: Daniel Scheller <d.scheller@gmx.net> Acked-by: Abylay Ospan <aospan@netup.ru> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Alyssa Milburn	ac6dedddba	zr364xx: enforce minimum size when reading header commit `ee0fe833d9` upstream. This code copies actual_length-128 bytes from the header, which will underflow if the received buffer is too small. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Johan Hovold	20c98053a2	dib0700: fix NULL-deref at probe commit `d5823511c0` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer should a malicious device lack endpoints. Fixes: `c4018fa2e4` ("[media] dib0700: fix RC support on Hauppauge Nova-TD") Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Marek Szyprowski	6af3917b8d	s5p-mfc: Fix unbalanced call to clock management commit `a5cb00eb42` upstream. Clock should be turned off after calling s5p_mfc_init_hw() from the watchdog worker, like it is already done in the s5p_mfc_open() which also calls this function. Fixes: `af93574678` ("[media] MFC: Add MFC 5.1 V4L2 driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Johan Hovold	4f220307a8	gspca: konica: add missing endpoint sanity check commit `aa58fedb8c` upstream. Make sure to check the number of endpoints to avoid accessing memory beyond the endpoint array should a device lack the expected endpoints. Note that, as far as I can tell, the gspca framework has already made sure there is at least one endpoint in the current alternate setting so there should be no risk for a NULL-pointer dereference here. Fixes: `b517af7228` ("V4L/DVB: gspca_konica: New gspca subdriver for konica chipset using cams") Cc: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hansverk@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Marek Szyprowski	d926ceaa72	s5p-mfc: Fix race between interrupt routine and device functions commit `0c32b8ec02` upstream. Interrupt routine must wake process waiting for given interrupt AFTER updating driver's internal structures and contexts. Doing it in-between is a serious bug. This patch moves all calls to the wake() function to the end of the interrupt processing block to avoid potential and real races, especially on multi-core platforms. This also fixes following issue reported from clock core (clocks were disabled in interrupt after being unprepared from the other place in the driver, the stack trace however points to the different place than s5p_mfc driver because of the race): WARNING: CPU: 1 PID: 18 at drivers/clk/clk.c:544 clk_core_unprepare+0xc8/0x108 Modules linked in: CPU: 1 PID: 18 Comm: kworker/1:0 Not tainted 4.10.0-next-20170223-00070-g04e18bc99ab9-dirty #2154 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) Workqueue: pm pm_runtime_work [<c010d8b0>] (unwind_backtrace) from [<c010a534>] (show_stack+0x10/0x14) [<c010a534>] (show_stack) from [<c033292c>] (dump_stack+0x74/0x94) [<c033292c>] (dump_stack) from [<c011cef4>] (__warn+0xd4/0x100) [<c011cef4>] (__warn) from [<c011cf40>] (warn_slowpath_null+0x20/0x28) [<c011cf40>] (warn_slowpath_null) from [<c0387a84>] (clk_core_unprepare+0xc8/0x108) [<c0387a84>] (clk_core_unprepare) from [<c0389d84>] (clk_unprepare+0x24/0x2c) [<c0389d84>] (clk_unprepare) from [<c03d4660>] (exynos_sysmmu_suspend+0x48/0x60) [<c03d4660>] (exynos_sysmmu_suspend) from [<c042b9b0>] (pm_generic_runtime_suspend+0x2c/0x38) [<c042b9b0>] (pm_generic_runtime_suspend) from [<c0437580>] (genpd_runtime_suspend+0x94/0x220) [<c0437580>] (genpd_runtime_suspend) from [<c042e240>] (__rpm_callback+0x134/0x208) [<c042e240>] (__rpm_callback) from [<c042e334>] (rpm_callback+0x20/0x80) [<c042e334>] (rpm_callback) from [<c042d3b8>] (rpm_suspend+0xdc/0x458) [<c042d3b8>] (rpm_suspend) from [<c042ea24>] (pm_runtime_work+0x80/0x90) [<c042ea24>] 
(pm_runtime_work) from [<c01322c4>] (process_one_work+0x120/0x318) [<c01322c4>] (process_one_work) from [<c0132520>] (worker_thread+0x2c/0x4ac) [<c0132520>] (worker_thread) from [<c0137ab0>] (kthread+0xfc/0x134) [<c0137ab0>] (kthread) from [<c0107978>] (ret_from_fork+0x14/0x3c) ---[ end trace 1ead49a7bb83f0d8 ]--- Fixes: `af93574678` ("[media] MFC: Add MFC 5.1 V4L2 driver") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com> Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Lee Jones	776a591f75	cec: Fix runtime BUG when (CONFIG_RC_CORE && !CEC_CAP_RC) commit `43c0c03961` upstream. Currently when the RC Core is enabled (reachable) core code located in cec_register_adapter() attempts to populate the RC structure with a pointer to the 'parent' passed in by the caller. Unfortunately if the caller did not specify RC capability when calling cec_allocate_adapter(), then there will be no RC structure to populate. This causes a "NULL pointer dereference" error. Fixes: `f51e80804f` ("[media] cec: pass parent device in register(), not allocate()") Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Srinivas Pandruvada	0d1015c0e9	iio: hid-sensor: Store restore poll and hysteresis on S3 commit `5d9854eaea` upstream. This change undoes the change done by 'commit `3bec247474` ("iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3")' as this breaks some USB/i2c sensor hubs. Instead of relying on HW for restoring poll and hysteresis, the driver stores them and restores them on resume (S3). In this way user space modified settings are not lost for any kind of sensor hub behavior. In this change, whenever user space modifies the sampling frequency or hysteresis, the driver will get the feature value from the hub and store it in the per device hid_sensor_common data structure. On the resume callback from S3, the system will set the feature on the sensor hub if user space ever modified the feature value. Fixes: `3bec247474` ("iio: hid-sensor-trigger: Change get poll value function order to avoid sensor properties losing after resume from S3") Reported-by: Ritesh Raj Sarraf <rrs@researchut.com> Tested-by: Ritesh Raj Sarraf <rrs@researchut.com> Tested-by: Song, Hongyan <hongyan.song@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:21 +02:00
Matt Ranostay	03652e4884	iio: proximity: as3935: fix as3935_write commit `84ca8e364a` upstream. AS3935_WRITE_DATA macro bit is incorrect and the actual write sequence is two leading zeros. Cc: George McCollister <george.mccollister@gmail.com> Signed-off-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Dan Carpenter	b13b3f3985	ipx: call ipxitf_put() in ioctl error path commit `ee0d8d8482` upstream. We should call ipxitf_put() if the copy_to_user() fails. Reported-by: 李强 <liqiang6-s@360.cn> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	1e2a08069d	USB: hub: fix non-SS hub-descriptor handling commit `bec444cd1c` upstream. Add missing sanity check on the non-SuperSpeed hub-descriptor length in order to avoid parsing and leaking two bytes of uninitialised slab data through sysfs removable-attributes (or a compound-device debug statement). Note that we only make sure that the DeviceRemovable field is always present (and specifically ignore the unused PortPwrCtrlMask field) in order to continue support any hubs with non-compliant descriptors. As a further safeguard, the descriptor buffer is also cleared. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	345477ed11	USB: hub: fix SS hub-descriptor handling commit `2c25a2c818` upstream. A SuperSpeed hub descriptor does not have any variable-length fields so bail out when reading a short descriptor. This avoids parsing and leaking two bytes of uninitialised slab data through sysfs removable-attributes. Fixes: `dbe79bbe9d` ("USB 3.0 Hub Changes") Cc: John Youn <John.Youn@synopsys.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	3a82455292	USB: serial: io_ti: fix div-by-zero in set_termios commit `6aeb75e6ad` upstream. Fix a division-by-zero in set_termios when debugging is enabled and a high-enough speed has been requested so that the divisor value becomes zero. Instead of just fixing the offending debug statement, cap the baud rate at the base as a zero divisor value also appears to crash the firmware. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	e4d3ae36fc	USB: serial: mct_u232: fix big-endian baud-rate handling commit `26cede3436` upstream. Drop erroneous cpu_to_le32 when setting the baud rate, something which corrupted the divisor on big-endian hosts. Found using sparse: warning: incorrect type in argument 1 (different base types) expected unsigned int [unsigned] [usertype] val got restricted __le32 [usertype] <noident> Fixes: `af2ac1a091` ("USB: serial mct_usb232: move DMA buffers to heap") Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-By: Pete Zaitcev <zaitcev@yahoo.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Bjørn Mork	07cf1f9d01	USB: serial: qcserial: add more Lenovo EM74xx device IDs commit `8d7a10dd32` upstream. In their infinite wisdom, and never ending quest for end user frustration, Lenovo has decided to use new USB device IDs for the wwan modules in their 2017 laptops. The actual hardware is still the Sierra Wireless EM7455 or EM7430, depending on region. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Daniele Palmas	d986525b7d	usb: serial: option: add Telit ME910 support commit `40dd46048c` upstream. This patch adds support for Telit ME910 PID 0x1100. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Johan Hovold	b58d7087ce	USB: iowarrior: fix info ioctl on big-endian hosts commit `dd5ca753fa` upstream. Drop erroneous le16_to_cpu when returning the USB device speed which is already in host byte order. Found using sparse: warning: cast to restricted __le16 Fixes: `946b960d13` ("USB: add driver for iowarrior devices.") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:20 +02:00
Tony Lindgren	d69737cb64	usb: musb: Fix trying to suspend while active for OTG configurations commit `3c50ffef25` upstream. Commit `d8e5f0eca1` ("usb: musb: Fix hardirq-safe hardirq-unsafe lock order error") caused a regression where musb keeps trying to enable host mode with no cable connected. This seems to be caused by the fact that now phy is enabled earlier, and we are wrongly trying to force USB host mode on an OTG port. The errors we are getting are "trying to suspend as a_idle while active". For ports configured as OTG, we should not need to do anything to try to force USB host mode on its OTG port. Trying to force host mode in this case just seems to completely confuse the musb state machine. Let's fix the issue by making musb_host_setup() attempt to force the mode only if port_mode is configured for host mode. Fixes: `d8e5f0eca1` ("usb: musb: Fix hardirq-safe hardirq-unsafe lock order error") Cc: Johan Hovold <johan@kernel.org> Reported-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reported-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Peter Ujfalusi	925f7f9244	usb: musb: tusb6010_omap: Do not reset the other direction's packet size commit `6df2b42f7c` upstream. We have one register for each EP to set the maximum packet size for both TX and RX. If for example an RX programming would happen before the previous TX transfer finishes we would reset the TX packet side. To fix this issue, only modify the TX or RX part of the register. Fixes: `550a7375fe` ("USB: Add MUSB and TUSB support") Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Bin Liu <b-liu@ti.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Thinh Nguyen	33c19ef5e2	usb: dwc3: gadget: Prevent losing events in event cache commit `d325a1de49` upstream. The dwc3 driver can overwrite its previous events if its top-half IRQ handler (TH) gets invoked again before processing the events in the cache. We see this as a hang in the file transfer and the host will attempt to reset the device. TH gets the event count and deasserts the interrupt line by writing DWC3_GEVNTSIZ_INTMASK to DWC3_GEVNTSIZ. If there's a new event coming between reading the event count and interrupt deassertion, dwc3 will lose previous pending events. More generally, we will see 0 event count, which should not affect anything. This shouldn't be possible in the current dwc3 implementation. However, through testing and reading the PCIe trace, the TH occasionally still gets invoked one more time after HW interrupt deassertion. (With PCIe legacy interrupts, TH is called repeatedly as long as the interrupt line is asserted). We suspect that there is a small detection delay in the SW. To avoid this issue, check DWC3_EVENT_PENDING flag to determine if the events are processed in the bottom-half IRQ handler. If not, return IRQ_HANDLED and don't process new event. Signed-off-by: Thinh Nguyen <thinhn@synopsys.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Ben Hutchings	79312c5d61	dvb-usb-dibusb-mc-common: Add MODULE_LICENSE commit `bf05b65a9f` upstream. dvb-usb-dibusb-mc-common is licensed under GPLv2, and if we don't say so then it won't even load since it needs a GPL-only symbol. Fixes: `e91455a149` ("[media] dvb-usb: split out common parts of dibusb") Reported-by: Dominique Dumont <dod@debian.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Alyssa Milburn	8d730619ae	ttusb2: limit messages to buffer size commit `a12b8ab8c5` upstream. Otherwise ttusb2_i2c_xfer can read or write beyond the end of static and heap buffers. Signed-off-by: Alyssa Milburn <amilburn@zall.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Johan Hovold	f2b30b2f19	mceusb: fix NULL-deref at probe commit `03eb2a557e` upstream. Make sure to check for the required out endpoint to avoid dereferencing a NULL-pointer in mce_request_packet should a malicious device lack such an endpoint. Note that this path is hit during probe. Fixes: `66e89522af` ("V4L/DVB: IR: add mceusb IR receiver driver") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Johan Hovold	03a7e78a25	usbvision: fix NULL-deref at probe commit `eacb975b48` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `2a9f8b5d25` ("V4L/DVB (5206): Usbvision: set alternate interface modification") Cc: Thierry MERLE <thierry.merle@free.fr> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Johan Hovold	3aade67b76	net: irda: irda-usb: fix firmware name on big-endian hosts commit `75cf067953` upstream. Add missing endianness conversion when using the USB device-descriptor bcdDevice field to construct a firmware file name. Fixes: `8ef80aef11` ("[IRDA]: irda-usb.c: STIR421x cleanups") Cc: Nick Fedchik <nfedchik@atlantic-link.com.ua> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Peter Chen	0cc9aa4c4b	usb: host: xhci-mem: allocate zeroed Scratchpad Buffer commit `7480d912d5` upstream. According to xHCI ch4.20 Scratchpad Buffers, the Scratchpad Buffer needs to be zeroed. ... The following operations take place to allocate Scratchpad Buffers to the xHC: ... b. Software clears the Scratchpad Buffer to '0' Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:19 +02:00
Mathias Nyman	966e88d540	xhci: apply PME_STUCK_QUIRK and MISSING_CAS quirk for Denverton commit `a0c16630d3` upstream. Intel Denverton microserver is Atom based and need the PME and CAS quirks as well. Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Alan Stern	5077513b4e	USB: xhci: fix lock-inversion problem commit `63aea0dbab` upstream. With threaded interrupts, bottom-half handlers are called with interrupts enabled. Therefore they can't safely use spin_lock(); they have to use spin_lock_irqsave(). Lockdep warns about a violation occurring in xhci_irq(): ========================================================= [ INFO: possible irq lock inversion dependency detected ] 4.11.0-rc8-dbg+ #1 Not tainted --------------------------------------------------------- swapper/7/0 just changed the state of lock: (&(&ehci->lock)->rlock){-.-...}, at: [<ffffffffa0130a69>] ehci_hrtimer_func+0x29/0xc0 [ehci_hcd] but this lock took another, HARDIRQ-unsafe lock in the past: (hcd_urb_list_lock){+.....} and interrupts could create inverse lock ordering between them. other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(hcd_urb_list_lock); local_irq_disable(); lock(&(&ehci->lock)->rlock); lock(hcd_urb_list_lock); <Interrupt> lock(&(&ehci->lock)->rlock); * DEADLOCK * no locks held by swapper/7/0. the shortest dependencies between 2nd lock and 1st lock: -> (hcd_urb_list_lock){+.....} ops: 252 { HARDIRQ-ON-W at: __lock_acquire+0x602/0x1280 lock_acquire+0xd5/0x1c0 _raw_spin_lock+0x2f/0x40 usb_hcd_unlink_urb_from_ep+0x1b/0x60 [usbcore] xhci_giveback_urb_in_irq.isra.45+0x70/0x1b0 [xhci_hcd] finish_td.constprop.60+0x1d8/0x2e0 [xhci_hcd] xhci_irq+0xdd6/0x1fa0 [xhci_hcd] usb_hcd_irq+0x26/0x40 [usbcore] irq_forced_thread_fn+0x2f/0x70 irq_thread+0x149/0x1d0 kthread+0x113/0x150 ret_from_fork+0x2e/0x40 This patch fixes the problem. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Thomas Petazzoni	354c85b7ad	usb: host: xhci-plat: propagate return value of platform_get_irq() commit `4b148d5144` upstream. platform_get_irq() returns an error code, but the xhci-plat driver ignores it and always returns -ENODEV. This is not correct, and prevents -EPROBE_DEFER from being propagated properly. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Matthias Lange	c3dc5f5ae5	xhci: remove GFP_DMA flag from allocation commit `5db851cf20` upstream. There is no reason to restrict allocations to the first 16MB ISA DMA addresses. It is causing problems in a virtualization setup with enabled IOMMU (x86_64). The result is that USB is not working in the VM. Signed-off-by: Matthias Lange <matthias.lange@kernkonzept.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Mathias Nyman	d0c8639797	xhci: Fix command ring stop regression in 4.11 commit `604d02a2a6` upstream. In 4.11 TRB completion codes were renamed to match spec. Completion codes for command ring stopped and endpoint stopped were mixed, leading to failures while handling a stopped command ring. Use the correct completion code for command ring stopped events. Fixes: `0b7c105a04` ("usb: host: xhci: rename completion codes to match spec") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Yazen Ghannam	1b8de0a2bb	EDAC, amd64: Fix reporting of Chip Select sizes on Fam17h commit `eb77e6b80f` upstream. The wrong index into the csbases/csmasks arrays was being passed to the function to compute the chip select sizes, which resulted in the wrong size being computed. Address that so that the correct values are computed and printed. Also, redo how we calculate the number of pages in a CS row. Reported-by: Benjamin Bennett <benbennett@gmail.com> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Link: http://lkml.kernel.org/r/1493313114-11260-1-git-send-email-Yazen.Ghannam@amd.com [ Remove unneeded integer math comment, minor cleanups. ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Jan Kara	0a4115da61	dax: fix data corruption when fault races with write commit `13e451fdc1` upstream. Currently DAX read fault can race with write(2) in the following way: CPU1 - write(2) CPU2 - read fault dax_iomap_pte_fault() ->iomap_begin() - sees hole dax_iomap_rw() iomap_apply() ->iomap_begin - allocates blocks dax_iomap_actor() invalidate_inode_pages2_range() - there's nothing to invalidate grab_mapping_entry() - we add zero page in the radix tree and map it to page tables The result is that hole page is mapped into page tables (and thus zeros are seen in mmap) while file has data written in that place. Fix the problem by locking exception entry before mapping blocks for the fault. That way we are sure invalidate_inode_pages2_range() call for racing write will either block on entry lock waiting for the fault to finish (and unmap stale page tables after that) or read fault will see already allocated blocks by write(2). Fixes: `9f141d6ef6` Link: http://lkml.kernel.org/r/20170510085419.27601-5-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Toshi Kani	3ef206af54	libnvdimm: fix clear length of nvdimm_forget_poison() commit `8d13c02906` upstream. ND_CMD_CLEAR_ERROR command returns 'clear_err.cleared', the length of error actually cleared, which may be smaller than its requested 'len'. Change nvdimm_clear_poison() to call nvdimm_forget_poison() with 'clear_err.cleared' when this value is valid. Fixes: `e046114af5` ("libnvdimm: clear the internal poison_list when clearing badblocks") Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
David Howells	7868d9ff2e	Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx() commit `deccf497d8` upstream. stat/lstat/fstatat need to pass AT_NO_AUTOMOUNT to vfs_statx() as the pre-statx code didn't set LOOKUP_AUTOMOUNT, even though fstatat() accepted the AT_NO_AUTOMOUNT flag. Fixes: `a528d35e8b` ("statx: Add a system call to make enhanced file info available") Reported-by: Ian Kent <raven@themaw.net> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ian Kent <raven@themaw.net> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:18 +02:00
Johan Hovold	d08975541f	USB: chaoskey: fix Alea quirk on big-endian hosts commit `63afd5cc78` upstream. Add missing endianness conversion when applying the Alea timeout quirk. Found using sparse: warning: restricted __le16 degrades to integer Fixes: `e4a886e811` ("hwrng: chaoskey - Fix URB warning due to timeout on Alea") Cc: Bob Ham <bob.ham@collabora.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Keith Packard <keithp@keithp.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Andrey Korolyov	7ed35d6ca7	USB: serial: ftdi_sio: add Olimex ARM-USB-TINY(H) PIDs commit `5f63424ab7` upstream. This patch adds support for recognition of ARM-USB-TINY(H) devices which are almost identical to ARM-USB-OCD(H) but lacking separate barrel jack and serial console. By suggestion from Johan Hovold it is possible to replace ftdi_jtag_quirk with a bit more generic construction. Since all Olimex-ARM debuggers have exactly two ports, we could safely always use only second port within the debugger family. Signed-off-by: Andrey Korolyov <andrey@xdel.ru> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Anthony Mallet	14bd189bef	USB: serial: ftdi_sio: fix setting latency for unprivileged users commit `bb246681b3` upstream. Commit `557aaa7ffa` ("ft232: support the ASYNC_LOW_LATENCY flag") enables unprivileged users to set the FTDI latency timer, but there was a logic flaw that skipped sending the corresponding USB control message to the device. Specifically, the device latency timer would not be updated until next open, something which was later also inadvertently broken by commit `c19db4c9e4` ("USB: ftdi_sio: set device latency timeout at port probe"). A recent commit `c6dce26266` ("USB: serial: ftdi_sio: fix extreme low-latency setting") disabled the low-latency mode by default so we now need this fix to allow unprivileged users to again enable it. Signed-off-by: Anthony Mallet <anthony.mallet@laas.fr> [johan: amend commit message] Fixes: `557aaa7ffa` ("ft232: support the ASYNC_LOW_LATENCY flag") Fixes: `c19db4c9e4` ("USB: ftdi_sio: set device latency timeout at port probe"). Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Kirill Tkhai	15e4ed2a46	pid_ns: Fix race between setns'ed fork() and zap_pid_ns_processes() commit `3fd3722621` upstream. Imagine we have a pid namespace and a task from its parent's pid_ns, which made setns() to the pid namespace. The task is doing fork(), while the pid namespace's child reaper is dying. We have the race between them: Task from parent pid_ns Child reaper copy_process() .. alloc_pid() .. .. zap_pid_ns_processes() .. disable_pid_allocation() .. read_lock(&tasklist_lock) .. iterate over pids in pid_ns .. kill tasks linked to pids .. read_unlock(&tasklist_lock) write_lock_irq(&tasklist_lock); .. attach_pid(p, PIDTYPE_PID); .. .. .. So, just created task p won't receive SIGKILL signal, and the pid namespace will be in contradictory state. Only manual kill will help there, but does the userspace care about this? I suppose, the most users just inject a task into a pid namespace and wait a SIGCHLD from it. The patch fixes the problem. It simply checks for (pid_ns->nr_hashed & PIDNS_HASH_ADDING) in copy_process(). We do it under the tasklist_lock, and can't skip PIDNS_HASH_ADDING as noted by Oleg: "zap_pid_ns_processes() does disable_pid_allocation() and then takes tasklist_lock to kill the whole namespace. Given that copy_process() checks PIDNS_HASH_ADDING under write_lock(tasklist) they can't race; if copy_process() takes this lock first, the new child will be killed, otherwise copy_process() can't miss the change in ->nr_hashed." If allocation is disabled, we just return -ENOMEM like it's made for such cases in alloc_pid(). v2: Do not move disable_pid_allocation(), do not introduce a new variable in copy_process() and simplify the patch as suggested by Oleg Nesterov. Account the problem with double irq enabling found by Eric W. Biederman. 
Fixes: `c876ad7682` ("pidns: Stop pid allocation when init dies") Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Ingo Molnar <mingo@kernel.org> CC: Peter Zijlstra <peterz@infradead.org> CC: Oleg Nesterov <oleg@redhat.com> CC: Mike Rapoport <rppt@linux.vnet.ibm.com> CC: Michal Hocko <mhocko@suse.com> CC: Andy Lutomirski <luto@kernel.org> CC: "Eric W. Biederman" <ebiederm@xmission.com> CC: Andrei Vagin <avagin@openvz.org> CC: Cyrill Gorcunov <gorcunov@openvz.org> CC: Serge Hallyn <serge@hallyn.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Eric W. Biederman	9fbdfb7af4	pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes commit `b9a985db98` upstream. The code can potentially sleep for an indefinite amount of time in zap_pid_ns_processes triggering the hung task timeout, and increasing the system average. This is undesirable. Sleep with a task state of TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these undesirable side effects. Apparently under heavy load this has been allowing Chrome to trigger the hung time task timeout error and cause ChromeOS to reboot. Reported-by: Vovo Yang <vovoy@google.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Fixes: `6347e90091` ("pidns: guarantee that the pidns init will be the last pidns process reaped") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Michael J. Ruhl	4cbd5018c4	IB/hfi1: Fix a subcontext memory leak commit `224d71f910` upstream. The only context that frees user_exp_rcv data structures is the last context closed (from a sub-context set). This leaks the allocations from the other sub-contexts. Separate the common frees from the specific frees and call them at the appropriate time. Using KEDR to check for memory leaks we get: Before test: [leak_check] Possible leaks: 25 After test: [leak_check] Possible leaks: 31 (6 leaked data structures) After patch applied (before and after test have the same value) [leak_check] Possible leaks: 25 Each leak is 192 + 13440 + 6720 = 20352 bytes per sub-context. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Michael J. Ruhl	d971ab21c9	IB/hfi1: Return an error on memory allocation failure commit `94679061dc` upstream. If the eager buffer allocation fails, it is necessary to return an error code. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Fabrice Gasnier	c07927502b	iio: stm32 trigger: fix sampling_frequency read commit `77a9febfd8` upstream. When prescaler (PSC) is 0, it means div factor is 1: counter clock frequency is equal to input clk / (PSC + 1). When reload value is 8 for example, counter counts 9 cycles, from 0 to 8. This is handled in frequency write routine, by writing respectively: - prescaler - 1 to PSC - reload value - 1 to ARR This fix does the opposite when reading the frequency from PSC and ARR: - prescaler is PSC + 1 - reload value is ARR + 1 Thus, PSC may be 0, depending on requested sampling frequency (div 1). In this case, reading freq wrongly reports 0, instead of computing and reporting correct value. Remove test on !psc and !arr. Small test on stm32f4 (example on tim1_trgo), before this fix: $ cd /sys/bus/iio/devices/triggerX $ echo 10000 > sampling_frequency $ cat sampling_frequency 0 After this fix: $ echo 10000 > sampling_frequency $ cat sampling_frequency 10000 Signed-off-by: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Andreas Klinger	705fa4038f	IIO: bmp280-core.c: fix error in humidity calculation commit `ed3730c435` upstream. While calculating the compensation of the humidity there are negative values interpreted as unsigned because of unsigned variables used. These values as well as the constants need to be cast to signed as indicated by the documentation of the sensor. Signed-off-by: Andreas Klinger <ak@it-klinger.de> Acked-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Matt Ranostay <matt.ranostay@konsulko.com> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:17 +02:00
Pavel Roskin	327f5da04e	iio: dac: ad7303: fix channel description commit `ce420fd425` upstream. realbits, storagebits and shift should be numbers, not ASCII characters. Signed-off-by: Pavel Roskin <plroskin@gmail.com> Reviewed-by: Lars-Peter Clausen <lars@metafoo.de> Signed-off-by: Jonathan Cameron <jic23@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
James Smart	6567967aad	scsi: lpfc: Fix panic on BFS configuration commit `4492b739c9` upstream. To select the appropriate shost template, the driver is issuing a mailbox command to retrieve the wwn. Turns out the sending of the command precedes the reset of the function. On SLI-4 adapters, this is inconsequential as the mailbox command location is specified by dma via the BMBX register. However, on SLI-3 adapters, the location of the mailbox command submission area changes. When the function is first powered on or reset, the cmd is submitted via PCI bar memory. Later the driver changes the function config to use host memory and DMA. The request to start a mailbox command is the same, a simple doorbell write, regardless of submission area. So.. if there has not been a boot driver run against the adapter, the mailbox command works as defaults are ok. But, if the boot driver has configured the card and, and if no platform pci function/slot reset occurs as the os starts, the mailbox command will fail. The SLI-3 device will use the stale boot driver dma location. This can cause PCI eeh errors. Fix is to reset the sli-3 function before sending the mailbox command, thus synchronizing the function/driver on mailbox location. Note: The fix uses routines that are typically invoked later in the call flow to reset the sli-3 device. The issue in using those routines is that the normal (non-fix) flow does additional initialization, namely the allocation of the pport structure. So, rather than significantly reworking the initialization flow so that the pport is alloc'd first, pointer checks are added to work around it. Checks are limited to the routines invoked by a sli-3 adapter (s3 routines) as this fix/early call is only invoked on a sli3 adapter. Nothing changes post the fix. Subsequent initialization, and another adapter reset, still occur - both on sli-3 and sli-4 adapters. 
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Fixes: `96418b5e2c` ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.") Reviewed-by: Ewan D. Milne <emilne@redhat.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Bryant G. Ly	5d6d5f07f8	ibmvscsis: Do not send aborted task response commit `25e7853126` upstream. The driver is sending a response to the actual scsi op that was aborted by an abort task TM, while LIO is sending a response to the abort task TM. ibmvscsis_tgt does not send the response to the client until release_cmd time. The reason for this was because if we did it at queue_status time, then the client would be free to reuse the tag for that command, but we're still using the tag until the command is released at release_cmd time, so we chose to delay sending the response until then. That then caused this issue, because release_cmd is always called, even if queue_status is not. SCSI spec says that the initiator that sends the abort task TM NEVER gets a response to the aborted op and with the current code it will send a response. Thus this fix will remove that response if the CMD_T_ABORTED && !CMD_T_TAS. Another case with a small timing window is the case where if LIO sends a TMR_DOES_NOT_EXIST, and the release_cmd callback is called for the TMR Abort cmd before the release_cmd for the (attempted) aborted cmd, then we need to ensure that we send the response for the (attempted) abort cmd to the client before we send the response for the TMR Abort cmd. Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com> Signed-off-by: Michael Cyr <mikecyr@linux.vnet.ibm.com> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Johan Hovold	c3f0eff47d	of: fdt: add missing allocation-failure check commit `49e67dd176` upstream. The memory allocator passed to __unflatten_device_tree() (e.g. a wrapped kzalloc) can fail so add the missing sanity check to avoid dereferencing a NULL pointer. Fixes: `fe14042358` ("of/flattree: Refactor unflatten_device_tree and add fdt_unflatten_tree") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Tyrel Datwyler	5bc2cfba89	of: fix "/cpus" reference leak in of_numa_parse_cpu_nodes() commit `b8475cbee5` upstream. The call to of_find_node_by_path("/cpus") returns the cpus device_node with its reference count incremented. There is no matching of_node_put() call in of_numa_parse_cpu_nodes() which results in a leaked reference to the "/cpus" node. This patch adds an of_node_put() to release the reference. fixes: `298535c00a` ("of, numa: Add NUMA of binding implementation.") Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Acked-by: David Daney <david.daney@cavium.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Rob Herring	854d0231ef	of: fix sparse warning in of_pci_range_parser_one commit `eb31003657` upstream. sparse gives the following warning for 'pci_space': ../drivers/of/address.c:266:26: warning: incorrect type in assignment (different base types) ../drivers/of/address.c:266:26: expected unsigned int [unsigned] [usertype] pci_space ../drivers/of/address.c:266:26: got restricted __be32 const [usertype] <noident> It appears that pci_space is only ever accessed on powerpc, so the endian swap is often not needed. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Takashi Iwai	d5331e619a	proc: Fix unbalanced hard link numbers commit `d66bb1607e` upstream. proc_create_mount_point() forgot to increase the parent's nlink, and it resulted in unbalanced hard link numbers, e.g. /proc/fs shows one less than expected. Fixes: `eb6d38d542` ("proc: Allow creating permanently empty directories...") Reported-by: Tristan Ye <tristan.ye@suse.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Vaibhav Jain	525d5ef5ae	cxl: Route eeh events to all drivers in cxl_pci_error_detected() commit `4f58f0bf15` upstream. Fix a boundary condition where in some cases an eeh event that results in card reset isn't passed on to a driver attached to the virtual PCI device associated with a slice. This happens when a slice attached device driver returns a value other than PCI_ERS_RESULT_NEED_RESET from the eeh error_detected() callback. This would result in an early return from cxl_pci_error_detected() and other drivers attached to other AFUs on the card won't be notified. The patch fixes this by making sure that all slice attached device-drivers are notified and the return values from error_detected() callback are aggregated in a scheme where request for 'disconnect' trumps all and 'none' trumps 'need_reset'. Fixes: `9e8df8a219` ("cxl: EEH support") Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
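The aggregation scheme described in the cxl_pci_error_detected() entry above can be illustrated with a minimal user-space sketch. The enum and function names below are hypothetical and are not taken from the driver; only the precedence rule ('disconnect' trumps all, 'none' trumps 'need_reset') comes from the commit message, and starting from 'need_reset' for an empty list is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical result codes (not the kernel's PCI_ERS_RESULT_* values),
 * ordered so that a larger value has higher precedence.
 */
enum sketch_result {
	SKETCH_NEED_RESET = 0,
	SKETCH_NONE = 1,
	SKETCH_DISCONNECT = 2,
};

/*
 * Visit every per-driver result (no early return) and keep the one with
 * the highest precedence. Assumption: with no drivers attached, the
 * aggregate stays at SKETCH_NEED_RESET.
 */
static enum sketch_result sketch_aggregate(const enum sketch_result *results,
					   size_t n)
{
	enum sketch_result agg = SKETCH_NEED_RESET;
	size_t i;

	assert(results != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (results[i] > agg)
			agg = results[i];
	}
	return agg;
}
```

The point of the loop is that every entry is inspected before a verdict is returned, which mirrors the commit's goal of notifying all slice-attached drivers rather than stopping at the first one that does not ask for a reset.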
Vaibhav Jain	4a0e8757ad	cxl: Force context lock during EEH flow commit `ea9a26d117` upstream. During an eeh event when the cxl card is fenced and card sysfs attr perst_reloads_same_image is set, the following warning message is seen in the kernel logs: Adapter context unlocked with 0 active contexts ------------[ cut here ]------------ WARNING: CPU: 12 PID: 627 at ../drivers/misc/cxl/main.c:325 cxl_adapter_context_unlock+0x60/0x80 [cxl] Even though this warning is harmless, it clutters the kernel log during an eeh event. This warning is triggered as the EEH callback cxl_pci_error_detected doesn't obtain a context-lock before forcibly detaching all active context and when context-lock is released during call to cxl_configure_adapter from cxl_pci_slot_reset, a warning in cxl_adapter_context_unlock is triggered. To fix this warning, we acquire the adapter context-lock via cxl_adapter_context_lock() in the eeh callback cxl_pci_error_detected() once all the virtual AFU PHBs are notified and their contexts detached. The context-lock is released in cxl_pci_slot_reset() after the adapter is successfully reconfigured and before we call the slot_reset callback on slice attached device-drivers. Fixes: `70b565bbdb` ("cxl: Prevent adapter reset if an active context exists") Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Tested-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:16 +02:00
Gerd Hoffmann	6c58e144d5	ohci-pci: add qemu quirk commit `21a60f6e65` upstream. On a loaded virtualization host (dozen guests booting at the same time) it may happen that the ohci controller emulation doesn't manage to do timely frame processing, with the result that the io watchdog fires and considers the controller being dead, even though it's only the emulation being unusually slow due to the load peak. So, add a quirk for qemu and don't use the watchdog in case we figure we are running on emulated ohci. The virtual ohci controller masquerades as apple ohci controller, but we can identify it by subsystem id. Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Tobias Herzog	a83f3099d0	cdc-acm: fix possible invalid access when processing notification commit `1bb9914e17` upstream. Notifications may only be 8 bytes long. Accessing the 9th and 10th byte of unimplemented/unknown notifications may be insecure. Also check the length of known notifications before accessing anything behind the 8th byte. Signed-off-by: Tobias Herzog <t-herzog@gmx.de> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
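The length check described in the cdc-acm entry above can be sketched in plain C. This is an illustration under stated assumptions, not the driver's code: the function name and the `SKETCH_*` constants are hypothetical, the 8-byte header and the 2-byte payload at offsets 8 and 9 follow the commit message, and the value 0x20 for the serial-state notification type is assumed from the USB CDC specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HDR_LEN			8	/* fixed notification header */
#define SKETCH_NOTIFY_SERIAL_STATE	0x20	/* assumed CDC notification code */

/*
 * Read the 16-bit little-endian serial state that follows the header.
 * Returns 0 on success, -1 if the buffer is too short or carries a
 * notification type this sketch does not know.
 */
static int sketch_read_serial_state(const uint8_t *buf, size_t len,
				    uint16_t *state)
{
	assert(buf != NULL && state != NULL);

	/* Never look at the header of a truncated notification. */
	if (len < SKETCH_HDR_LEN)
		return -1;

	/* Unknown types: do not touch anything behind the 8th byte. */
	if (buf[1] != SKETCH_NOTIFY_SERIAL_STATE)
		return -1;

	/* Known type: still verify the payload is actually present. */
	if (len < SKETCH_HDR_LEN + 2)
		return -1;

	*state = (uint16_t)(buf[8] | (buf[9] << 8));
	return 0;
}
```

Both checks matter: the type check keeps unknown notifications from being read past the header, and the second length check covers known notifications that arrive shorter than expected.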
David Rivshin	9541621a12	gpio: omap: return error if requested debounce time is not possible commit `8397744393` upstream. omap_gpio_debounce() does not validate that the requested debounce is within a range it can handle. Instead it lets the register value wrap silently, and always returns success. This can lead to all sorts of unexpected behavior, such as gpio_keys asking for a too-long debounce, but getting a very short debounce in practice. Fix this by returning -EINVAL if the requested value does not fit into the register field. If there is no debounce clock available at all, return -ENOTSUPP. Fixes: `e85ec6c304` ("gpio: omap: fix omap2_set_gpio_debounce") Signed-off-by: David Rivshin <drivshin@allworx.com> Acked-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
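The range validation described in the gpio-omap entry above can be sketched as follows. The names, the 8-bit field mask and the 31 µs step are assumptions made for illustration and are not taken from the driver source; `SKETCH_ENOTSUPP` stands in for the kernel-internal ENOTSUPP, which user-space `errno.h` does not define.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_ENOTSUPP		524	/* kernel-internal errno value (assumed) */
#define SKETCH_FIELD_MASK	0xffu	/* assumed width of the register field */
#define SKETCH_STEP_US		31u	/* assumed debounce granularity */

/*
 * Convert a requested debounce time to a register value, rejecting
 * requests the hardware cannot honour instead of letting them wrap.
 */
static int sketch_debounce_to_reg(unsigned int debounce_us, bool have_dbclk,
				  unsigned int *reg)
{
	unsigned int val;

	assert(reg != NULL);

	/* No debounce clock at all: the feature is unsupported. */
	if (!have_dbclk)
		return -SKETCH_ENOTSUPP;

	/* Assumption of this sketch: a zero request is treated as invalid. */
	if (debounce_us == 0)
		return -EINVAL;

	/* Round up to whole steps; the field stores (steps - 1). */
	val = (debounce_us + SKETCH_STEP_US - 1) / SKETCH_STEP_US - 1;

	/* Refuse values that do not fit rather than silently truncating. */
	if ((val & SKETCH_FIELD_MASK) != val)
		return -EINVAL;

	*reg = val;
	return 0;
}
```

With these assumed constants the largest accepted request is 31 × 256 = 7936 µs; anything longer returns -EINVAL, so a caller such as gpio_keys can fall back to software debouncing instead of receiving a much shorter hardware debounce than it asked for.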
Ben Skeggs	33f379db47	drm/nouveau/tmr: handle races with hw when updating the next alarm time commit `1b0f84380b` upstream. If the time to the next alarm is short enough, we could race with HW and end up with an ~4 second delay until it triggers. Fix this by checking again after we update HW. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	53b8e32038	drm/nouveau/tmr: avoid processing completed alarms when adding a new one commit `330bdf62fe` upstream. The idea here was to avoid having to "manually" program the HW if there's a new earliest alarm. This was lazy and bad, as it leads to loads of fun races between inter-related callers (ie. therm). Turns out, it's not so difficult after all. Go figure ;) Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	952030b868	drm/nouveau/tmr: fix corruption of the pending list when rescheduling an alarm commit `9fc64667ee` upstream. At least therm/fantog "attempts" to work around this issue, which could lead to corruption of the pending alarm list. Fix it properly by not updating the timestamp without the lock held, or trying to add an already pending alarm to the pending alarm list.... Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	4c168033dd	drm/nouveau/tmr: ack interrupt before processing alarms commit `3733bd8b40` upstream. Fixes a race where we can miss an alarm that triggers while we're already processing previous alarms. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	a5e41c9e5a	drm/nouveau/kms/nv50: skip core channel cursor update on position-only changes commit `e6db95799b` upstream. The DRM core used to only call prepare_fb/cleanup_fb() when a plane's framebuffer changed, which achieved the desired effect. It's apparently now up to the driver to decide on its own. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	4aecf3a73b	drm/nouveau/kms/nv50: fix source-rect-only plane updates commit `36601c2b36` upstream. This "optimisation" (which was originally meant to skip updating cursor settings in the core channel on position-only updates) turned out to be pointless in the final design of the code before it was merged. Remove it completely, as it breaks other cases. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:15 +02:00
Ben Skeggs	2138faf017	drm/nouveau/therm: remove ineffective workarounds for alarm bugs commit `e4311ee51d` upstream. These were ineffective due to touching the list without the alarm lock, but should no longer be required. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Mario Kleiner	21d3947bf7	drm/amdgpu: Add missing lb_vblank_lead_lines setup to DCE-6 path. commit `effaf848b9` upstream. This apparently got lost when implementing the new DCE-6 support and would cause failures in pageflip scheduling and timestamping. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Mario Kleiner	3f0db81c7d	drm/amdgpu: Avoid overflows/divide-by-zero in latency_watermark calculations. commit `e190ed1ea7` upstream. At dot clocks > approx. 250 Mhz, some of these calcs will overflow and cause miscalculation of latency watermarks, and for some overflows also divide-by-zero driver crash ("divide error: 0000 [#1] PREEMPT SMP" in "dce_v10_0_latency_watermark+0x12d/0x190"). This zero-divide happened, e.g., on AMD Tonga Pro under DCE-10, on a Displayport panel when trying to set a video mode of 2560x1440 at 165 Hz vrefresh with a dot clock of 635.540 Mhz. Refine calculations to avoid the overflows. Tested for DCE-10 with R9 380 Tonga + ASUS ROG PG279 panel. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Mario Kleiner	de2fa45aa9	drm/amdgpu: Make display watermark calculations more accurate commit `d63c277dc6` upstream. Avoid big roundoff errors in scanline/hactive durations for high pixel clocks, especially for >= 500 Mhz, and thereby program more accurate display fifo watermarks. Implemented here for DCE 6,8,10,11. Successfully tested on DCE 10 with AMD R9 380 Tonga. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Johan Hovold	fab4402325	ath9k_htc: fix NULL-deref at probe commit `ebeb36670e` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer or accessing memory beyond the endpoint array should a malicious device lack the expected endpoints. Fixes: `36bcce4306` ("ath9k_htc: Handle storage devices") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Dmitry Tunin	43068b4585	ath9k_htc: Add support of AirTies 1eda:2315 AR9271 device commit `16ff1fb0e3` upstream. T: Bus=01 Lev=02 Prnt=02 Port=02 Cnt=01 Dev#= 7 Spd=480 MxCh= 0 D: Ver= 2.00 Cls=ff(vend.) Sub=ff Prot=ff MxPS=64 #Cfgs= 1 P: Vendor=1eda ProdID=2315 Rev=01.08 S: Manufacturer=ATHEROS S: Product=USB2.0 WLAN S: SerialNumber=12345 C: #Ifs= 1 Cfg#= 1 Atr=80 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 6 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com> Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Martin Schwidefsky	82e31de018	s390/cputime: fix incorrect system time commit `07a63cbe8b` upstream. git commit `c5328901aa` "[S390] entry[64].S improvements" removed the update of the exit_timer lowcore field from the critical section cleanup of the .Lsysc_restore/.Lsysc_done and .Lio_restore/.Lio_done blocks. If the PSW is updated by the critical section cleanup to point to user space again, the interrupt entry code will do a vtime calculation after the cleanup completed with an exit_timer value which has not been updated. Due to this, incorrect system time deltas are calculated. If an interrupt occurred with an old PSW between .Lsysc_restore/.Lsysc_done or .Lio_restore/.Lio_done update __LC_EXIT_TIMER with the system entry time of the interrupt. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Michael Holzheu	5d1adbaf36	s390/kdump: Add final note commit `dcc00b79fc` upstream. Since linux v3.14 with commit `38dfac843c` ("vmcore: prevent PT_NOTE p_memsz overflow during header update") on s390 we get the following message in the kdump kernel: Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b, n_descsz=0x6b6b6b6b The reason for this is that we don't create a final zero note in the ELF header which the proc/vmcore code uses to find out the end of the notes section (see also kernel/kexec_core.c:final_note()). It still worked on s390 by chance because we (most of the time?) have the byte pattern 0x6b6b6b6b after the notes section which also makes the notes parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is interpreted as note size: if ((real_sz + sz) > max_sz) { pr_warn("Warning: Exceeded p_memsz, dropping P ...); break; } So fix this and add the missing final note to the ELF header. We don't have to adjust the memory size for ELF header ("alloc_size") because the new ELF note still fits into the 0x1000 base memory. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Richard Cochran	d2cc3a8cb9	regulator: tps65023: Fix inverted core enable logic. commit `c90722b54a` upstream. Commit `43530b69d7` ("regulator: Use regmap_read/write(), regmap_update_bits functions directly") intended to replace working inline helper functions with standard regmap calls. However, it also inverted the set/clear logic of the "CORE ADJ Allowed" bit. That patch was clearly never tested, since without that bit cleared, the core VDCDC1 voltage output does not react to I2C configuration changes. This patch fixes the issue by clearing the bit as in the original, correct implementation. Note for stable back porting that, due to subsequent driver churn, this patch will not apply on every kernel version. Fixes: `43530b69d7` ("regulator: Use regmap_read/write(), regmap_update_bits functions directly") Signed-off-by: Richard Cochran <rcochran@linutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:14 +02:00
Wadim Egorov	fde4f79074	regulator: rk808: Fix RK818 LDO2 commit `75f8811539` upstream. Set the correct voltage select register for LDO2. Signed-off-by: Wadim Egorov <w.egorov@phytec.de> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Linus Torvalds	efb209c87a	x86: fix 32-bit case of __get_user_asm_u64() commit `33c9e97290` upstream. The code to fetch a 64-bit value from user space was entirely buggered, and has been since the code was merged in early 2016 in commit `b2f680380d` ("x86/mm/32: Add support for 64-bit __get_user() on 32-bit kernels"). Happily the buggered routine is almost certainly entirely unused, since the normal way to access user space memory is just with the non-inlined "get_user()", and the inlined version didn't even historically exist. The normal "get_user()" case is handled by external hand-written asm in arch/x86/lib/getuser.S that doesn't have either of these issues. There were two independent bugs in __get_user_asm_u64(): - it still did the STAC/CLAC user space access marking, even though that is now done by the wrapper macros, see commit `11f1a4b975` ("x86: reorganize SMAP handling in user space accesses"). This didn't result in a semantic error, it just means that the inlined optimized version was hugely less efficient than the allegedly slower standard version, since the CLAC/STAC overhead is quite high on modern Intel CPU's. - the double register %eax/%edx was marked as an output, but the %eax part of it was touched early in the asm, and could thus clobber other inputs to the asm that gcc didn't expect it to touch. In particular, that meant that the generated code could look like this: mov (%eax),%eax mov 0x4(%eax),%edx where the load of %edx obviously was _supposed_ to be from the 32-bit word that followed the source of %eax, but because %eax was overwritten by the first instruction, the source of %edx was basically random garbage. The fixes are trivial: remove the extraneous STAC/CLAC entries, and mark the 64-bit output as early-clobber to let gcc know that no inputs should alias with the output register. 
Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Wanpeng Li	5bbaf88d79	KVM: X86: Fix read out-of-bounds vulnerability in kvm pio emulation commit `cbfc6c9184` upstream. Huawei folks reported a read out-of-bounds vulnerability in kvm pio emulation. - "inb" instruction to access PIT Mod/Command register (ioport 0x43, write only, a read should be ignored) in guest can get a random number. - "rep insb" instruction to access PIT register port 0x43 can control memcpy() in emulator_pio_in_emulated() to copy max 0x400 bytes but only read 1 bytes, which will disclose the unimportant kernel memory in host but no crash. The similar test program below can reproduce the read out-of-bounds vulnerability: void hexdump(void *mem, unsigned int len) { unsigned int i, j; for(i = 0; i < len + ((len % HEXDUMP_COLS) ? (HEXDUMP_COLS - len % HEXDUMP_COLS) : 0); i++) { /* print offset */ if(i % HEXDUMP_COLS == 0) { printf("0x%06x: ", i); } /* print hex data */ if(i < len) { printf("%02x ", 0xFF & ((char*)mem)[i]); } else /* end of block, just aligning for ASCII dump */ { printf(" "); } /* print ASCII dump */ if(i % HEXDUMP_COLS == (HEXDUMP_COLS - 1)) { for(j = i - (HEXDUMP_COLS - 1); j <= i; j++) { if(j >= len) /* end of block, not really printing */ { putchar(' '); } else if(isprint(((char*)mem)[j])) /* printable char */ { putchar(0xFF & ((char*)mem)[j]); } else /* other char */ { putchar('.'); } } putchar('\n'); } } } int main(void) { int i; if (iopl(3)) { err(1, "set iopl unsuccessfully\n"); return -1; } static char buf[0x40]; /* test ioport 0x40,0x41,0x42,0x43,0x44,0x45 */ memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x40, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x41, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x42, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x43, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov 
$0x44, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("mov $0x45, %rdx;"); asm volatile ("in %dx,%al;"); asm volatile ("stosb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); /* ins port 0x40 */ memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x20, %rcx;"); asm volatile ("mov $0x40, %rdx;"); asm volatile ("rep insb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); /* ins port 0x43 */ memset(buf, 0xab, sizeof(buf)); asm volatile("push %rdi;"); asm volatile("mov %0, %%rdi;"::"q"(buf)); asm volatile ("mov $0x20, %rcx;"); asm volatile ("mov $0x43, %rdx;"); asm volatile ("rep insb;"); asm volatile ("pop %rdi;"); hexdump(buf, 0x40); printf("\n"); return 0; } The vcpu->arch.pio_data buffer is used by both in/out instruction emulation w/o being cleared after use, which results in some random data being left over in the buffer. Guest reads of port 0x43 will be ignored since it is write only; however, the function kernel_pio() can't distinguish this ignored read from a successful read of data from the device's ioport. No new data fills the buffer from port 0x43; however, emulator_pio_in_emulated() will copy the stale data in the buffer to the guest unconditionally. This patch fixes it by clearing the buffer before in instruction emulation to avoid granting the guest the stale data in the buffer. In addition, string I/O is not supported for in kernel device. So there is no iteration to read ioport %RCX times for string I/O. The function kernel_pio() just reads one round, and then copies the io size %RCX to the guest unconditionally; actually it copies the one round of ioport data w/ other random data which is left over in the vcpu->arch.pio_data buffer to the guest. This patch fixes it by introducing the string I/O support for in kernel device in order to grant the right ioport data to the guest. Before the patch: 0x000000: fe 38 93 93 ff ff ab ab .8...... 
0x000008: ab ab ab ab ab ab ab ab ........ 0x000010: ab ab ab ab ab ab ab ab ........ 0x000018: ab ab ab ab ab ab ab ab ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: f6 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 4d 51 30 30 ....MQ00 0x000018: 30 30 20 33 20 20 20 20 00 3 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: f6 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 4d 51 30 30 ....MQ00 0x000018: 30 30 20 33 20 20 20 20 00 3 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ After the patch: 0x000000: 1e 02 f8 00 ff ff ab ab ........ 0x000008: ab ab ab ab ab ab ab ab ........ 0x000010: ab ab ab ab ab ab ab ab ........ 0x000018: ab ab ab ab ab ab ab ab ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: d2 e2 d2 df d2 db d2 d7 ........ 0x000008: d2 d3 d2 cf d2 cb d2 c7 ........ 0x000010: d2 c4 d2 c0 d2 bc d2 b8 ........ 0x000018: d2 b4 d2 b0 d2 ac d2 a8 ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 0x000038: ab ab ab ab ab ab ab ab ........ 0x000000: 00 00 00 00 00 00 00 00 ........ 0x000008: 00 00 00 00 00 00 00 00 ........ 0x000010: 00 00 00 00 00 00 00 00 ........ 0x000018: 00 00 00 00 00 00 00 00 ........ 0x000020: ab ab ab ab ab ab ab ab ........ 0x000028: ab ab ab ab ab ab ab ab ........ 0x000030: ab ab ab ab ab ab ab ab ........ 
0x000038: ab ab ab ab ab ab ab ab ........ Reported-by: Moguofang <moguofang@huawei.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Moguofang <moguofang@huawei.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Wanpeng Li	b67dcf8cf8	KVM: x86: Fix potential preemption when get the current kvmclock timestamp commit `e2c2206a18` upstream. BUG: using __this_cpu_read() in preemptible [00000000] code: qemu-system-x86/2809 caller is __this_cpu_preempt_check+0x13/0x20 CPU: 2 PID: 2809 Comm: qemu-system-x86 Not tainted 4.11.0+ #13 Call Trace: dump_stack+0x99/0xce check_preemption_disabled+0xf5/0x100 __this_cpu_preempt_check+0x13/0x20 get_kvmclock_ns+0x6f/0x110 [kvm] get_time_ref_counter+0x5d/0x80 [kvm] kvm_hv_process_stimers+0x2a1/0x8a0 [kvm] ? kvm_hv_process_stimers+0x2a1/0x8a0 [kvm] ? kvm_arch_vcpu_ioctl_run+0xac9/0x1ce0 [kvm] kvm_arch_vcpu_ioctl_run+0x5bf/0x1ce0 [kvm] kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? kvm_vcpu_ioctl+0x384/0x7b0 [kvm] ? __fget+0xf3/0x210 do_vfs_ioctl+0xa4/0x700 ? __fget+0x114/0x210 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 RIP: 0033:0x7f9d164ed357 ? __this_cpu_preempt_check+0x13/0x20 This can be reproduced by run kvm-unit-tests/hyperv_stimer.flat w/ CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT enabled. Safe access to per-CPU data requires a couple of constraints, though: the thread working with the data cannot be preempted and it cannot be migrated while it manipulates per-CPU variables. If the thread is preempted, the thread that replaces it could try to work with the same variables; migration to another CPU could also cause confusion. However there is no preemption disable when reads host per-CPU tsc rate to calculate the current kvmclock timestamp. This patch fixes it by utilizing get_cpu/put_cpu pair to guarantee both __this_cpu_read() and rdtsc() are not preempted. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Wanpeng Li	8223e8f994	KVM: x86: Fix load damaged SSEx MXCSR register commit `a575813bfe` upstream. Reported by syzkaller: BUG: unable to handle kernel paging request at ffffffffc07f6a2e IP: report_bug+0x94/0x120 PGD 348e12067 P4D 348e12067 PUD 348e14067 PMD 3cbd84067 PTE 80000003f7e87161 Oops: 0003 [#1] SMP CPU: 2 PID: 7091 Comm: kvm_load_guest_ Tainted: G OE 4.11.0+ #8 task: ffff92fdfb525400 task.stack: ffffbda6c3d04000 RIP: 0010:report_bug+0x94/0x120 RSP: 0018:ffffbda6c3d07b20 EFLAGS: 00010202 do_trap+0x156/0x170 do_error_trap+0xa3/0x170 ? kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm] ? mark_held_locks+0x79/0xa0 ? retint_kernel+0x10/0x10 ? trace_hardirqs_off_thunk+0x1a/0x1c do_invalid_op+0x20/0x30 invalid_op+0x1e/0x30 RIP: 0010:kvm_load_guest_fpu.part.175+0x12a/0x170 [kvm] ? kvm_load_guest_fpu.part.175+0x1c/0x170 [kvm] kvm_arch_vcpu_ioctl_run+0xed6/0x1b70 [kvm] kvm_vcpu_ioctl+0x384/0x780 [kvm] ? kvm_vcpu_ioctl+0x384/0x780 [kvm] ? sched_clock+0x13/0x20 ? __do_page_fault+0x2a0/0x550 do_vfs_ioctl+0xa4/0x700 ? up_read+0x1f/0x40 ? __do_page_fault+0x2a0/0x550 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x23/0xc2 SDM mentioned that "The MXCSR has several reserved bits, and attempting to write a 1 to any of these bits will cause a general-protection exception(#GP) to be generated". The syzkaller forks' testcase overrides xsave area w/ random values and steps on the reserved bits of MXCSR register. The damaged MXCSR register values of guest will be restored to SSEx MXCSR register before vmentry. This patch fixes it by catching userspace override MXCSR register reserved bits w/ random values and bails out immediately. 
Reported-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Daniel Glöckner	efa8cd1e2f	ima: accept previously set IMA_NEW_FILE commit `1ac202e978` upstream. Modifying the attributes of a file makes ima_inode_post_setattr reset the IMA cache flags. So if the file, which has just been created, is opened a second time before the first file descriptor is closed, verification fails since the security.ima xattr has not been written yet. We therefore have to look at the IMA_NEW_FILE even if the file already existed. With this patch there should no longer be an error when cat tries to open testfile: $ rm -f testfile $ ( echo test >&3 ; touch testfile ; cat testfile ) 3>testfile A file being new is no reason to accept that it is missing a digital signature demanded by the policy. Signed-off-by: Daniel Glöckner <dg@emlix.com> Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Brian Norris	d5811285b5	mwifiex: pcie: fix cmd_buf use-after-free in remove/reset commit `3c8cb9ad03` upstream. Command buffers (skb's) are allocated by the main driver, and freed upon the last use. That last use is often in mwifiex_free_cmd_buffer(). In the meantime, if the command buffer gets used by the PCI driver, we map it as DMA-able, and store the mapping information in the 'cb' memory. However, if a command was in-flight when resetting the device (and therefore was still mapped), we don't get a chance to unmap this memory until after the core has cleaned up its command handling. Let's keep a refcount within the PCI driver, so we ensure the memory only gets freed after we've finished unmapping it. Noticed by KASAN when forcing a reset via: echo 1 > /sys/bus/pci/.../reset The same code path can presumably be exercised in remove() and shutdown(). [ 205.390377] mwifiex_pcie 0000:01:00.0: info: shutdown mwifiex... [ 205.400393] ================================================================== [ 205.407719] BUG: KASAN: use-after-free in mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie] at addr ffffffc0ad471b28 [ 205.419040] Read of size 16 by task bash/1913 [ 205.423421] ============================================================================= [ 205.431625] BUG skbuff_head_cache (Tainted: G B ): kasan: bad access detected [ 205.439815] ----------------------------------------------------------------------------- [ 205.439815] [ 205.449534] INFO: Allocated in __build_skb+0x48/0x114 age=1311 cpu=4 pid=1913 [ 205.456709] alloc_debug_processing+0x124/0x178 [ 205.461282] ___slab_alloc.constprop.58+0x528/0x608 [ 205.466196] __slab_alloc.isra.54.constprop.57+0x44/0x54 [ 205.471542] kmem_cache_alloc+0xcc/0x278 [ 205.475497] __build_skb+0x48/0x114 [ 205.479019] __netdev_alloc_skb+0xe0/0x170 [ 205.483244] mwifiex_alloc_cmd_buffer+0x68/0xdc [mwifiex] [ 205.488759] mwifiex_init_fw+0x40/0x6cc [mwifiex] [ 205.493584] _mwifiex_fw_dpc+0x158/0x520 [mwifiex] [ 
205.498491] mwifiex_reinit_sw+0x2c4/0x398 [mwifiex] [ 205.503510] mwifiex_pcie_reset_notify+0x114/0x15c [mwifiex_pcie] [ 205.509643] pci_reset_notify+0x5c/0x6c [ 205.513519] pci_reset_function+0x6c/0x7c [ 205.517567] reset_store+0x68/0x98 [ 205.521003] dev_attr_store+0x54/0x60 [ 205.524705] sysfs_kf_write+0x9c/0xb0 [ 205.528413] INFO: Freed in __kfree_skb+0xb0/0xbc age=131 cpu=4 pid=1913 [ 205.535064] free_debug_processing+0x264/0x370 [ 205.539550] __slab_free+0x84/0x40c [ 205.543075] kmem_cache_free+0x1c8/0x2a0 [ 205.547030] __kfree_skb+0xb0/0xbc [ 205.550465] consume_skb+0x164/0x178 [ 205.554079] __dev_kfree_skb_any+0x58/0x64 [ 205.558304] mwifiex_free_cmd_buffer+0xa0/0x158 [mwifiex] [ 205.563817] mwifiex_shutdown_drv+0x578/0x5c4 [mwifiex] [ 205.569164] mwifiex_shutdown_sw+0x178/0x310 [mwifiex] [ 205.574353] mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie] [ 205.580398] pci_reset_notify+0x5c/0x6c [ 205.584274] pci_dev_save_and_disable+0x24/0x6c [ 205.588837] pci_reset_function+0x30/0x7c [ 205.592885] reset_store+0x68/0x98 [ 205.596324] dev_attr_store+0x54/0x60 [ 205.600017] sysfs_kf_write+0x9c/0xb0 ... 
[ 205.800488] Call trace: [ 205.802980] [<ffffffc00020a69c>] dump_backtrace+0x0/0x190 [ 205.808415] [<ffffffc00020a96c>] show_stack+0x20/0x28 [ 205.813506] [<ffffffc0005d020c>] dump_stack+0xa4/0xcc [ 205.818598] [<ffffffc0003be44c>] print_trailer+0x158/0x168 [ 205.824120] [<ffffffc0003be5f0>] object_err+0x4c/0x5c [ 205.829210] [<ffffffc0003c45bc>] kasan_report+0x334/0x500 [ 205.834641] [<ffffffc0003c3994>] check_memory_region+0x20/0x14c [ 205.840593] [<ffffffc0003c3b14>] __asan_loadN+0x14/0x1c [ 205.845879] [<ffffffbffc46171c>] mwifiex_unmap_pci_memory.isra.14+0x4c/0x100 [mwifiex_pcie] [ 205.854282] [<ffffffbffc461864>] mwifiex_pcie_delete_cmdrsp_buf+0x94/0xa8 [mwifiex_pcie] [ 205.862421] [<ffffffbffc462028>] mwifiex_pcie_free_buffers+0x11c/0x158 [mwifiex_pcie] [ 205.870302] [<ffffffbffc4620d4>] mwifiex_pcie_down_dev+0x70/0x80 [mwifiex_pcie] [ 205.877736] [<ffffffbffc1397a8>] mwifiex_shutdown_sw+0x190/0x310 [mwifiex] [ 205.884658] [<ffffffbffc4606b4>] mwifiex_pcie_reset_notify+0xd4/0x15c [mwifiex_pcie] [ 205.892446] [<ffffffc000635f54>] pci_reset_notify+0x5c/0x6c [ 205.898048] [<ffffffc00063a044>] pci_dev_save_and_disable+0x24/0x6c [ 205.904350] [<ffffffc00063cf0c>] pci_reset_function+0x30/0x7c [ 205.910134] [<ffffffc000641118>] reset_store+0x68/0x98 [ 205.915312] [<ffffffc000771588>] dev_attr_store+0x54/0x60 [ 205.920750] [<ffffffc00046f53c>] sysfs_kf_write+0x9c/0xb0 [ 205.926182] [<ffffffc00046dfb0>] kernfs_fop_write+0x184/0x1f8 [ 205.931963] [<ffffffc0003d64f4>] __vfs_write+0x6c/0x17c [ 205.937221] [<ffffffc0003d7164>] vfs_write+0xf0/0x1c4 [ 205.942310] [<ffffffc0003d7da0>] SyS_write+0x78/0xd8 [ 205.947312] [<ffffffc000204634>] el0_svc_naked+0x24/0x28 ... [ 205.998268] ================================================================== This bug has been around in different forms for a while. 
It was sort of noticed in commit `955ab095c5` ("mwifiex: Do not kfree cmd buf while unregistering PCIe"), but it just fixed the double-free, without acknowledging the potential for use-after-free. Fixes: `fc33146090` ("mwifiex: use pci_alloc/free_consistent APIs for PCIe") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Brian Norris	a32a0c8285	mwifiex: MAC randomization should not be persistent commit `7e2f18f064` upstream. nl80211 provides the NL80211_SCAN_FLAG_RANDOM_ADDR for every scan request that should be randomized; the absence of such a flag means we should not randomize. However, mwifiex was stashing the latest randomization request and always using it for future scans, even those that didn't set the flag. Let's zero out the randomization info whenever we get a scan request without NL80211_SCAN_FLAG_RANDOM_ADDR. I'd prefer to remove priv->random_mac entirely (and plumb the randomization MAC properly through the call sequence), but the spaghetti is a little difficult to unravel here for me. Fixes: `c2a8f0ff9c` ("mwifiex: support random MAC address for scanning") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
Larry Finger	cf6df2bcd4	rtlwifi: rtl8821ae: setup 8812ae RFE according to device type commit `46cfa2148e` upstream. Current channel switch implementation sets 8812ae RFE reg value assuming that device always has type 2. Extend possible RFE types set and write corresponding reg values. Source for new code is http://dlcdnet.asus.com/pub/ASUS/wireless/PCE-AC51/DR_PCE_AC51_20232801152016.zip Signed-off-by: Maxim Samoylov <max7255@gmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: Yan-Hsuan Chuang <yhchuang@realtek.com> Cc: Pkshih <pkshih@realtek.com> Cc: Birming Chiu <birming@realtek.com> Cc: Shaofu <shaofu@realtek.com> Cc: Steven Ting <steventing@realtek.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:13 +02:00
NeilBrown	699ae09634	md: MD_CLOSING needs to be cleared after called md_set_readonly or do_md_stop commit `065e519e71` upstream. If md_set_readonly() has been called and the MD_CLOSING bit set, the mddev cannot be opened any more because the MD_CLOSING bit wasn't cleared. Thus it needs to be cleared in md_ioctl after any call to md_set_readonly() or do_md_stop(). Signed-off-by: NeilBrown <neilb@suse.com> Fixes: `af8d8e6f03` ("md: changes for MD_STILL_CLOSED flag") Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Dennis Yang	1800fde309	md: update slab_cache before releasing new stripes when stripes resizing commit `583da48e38` upstream. When growing raid5 device on machine with small memory, there is chance that mdadm will be killed and the following bug report can be observed. The same bug could also be reproduced in linux-4.10.6. [57600.075774] BUG: unable to handle kernel NULL pointer dereference at (null) [57600.083796] IP: [<ffffffff81a6aa87>] _raw_spin_lock+0x7/0x20 [57600.110378] PGD 421cf067 PUD 4442d067 PMD 0 [57600.114678] Oops: 0002 [#1] SMP [57600.180799] CPU: 1 PID: 25990 Comm: mdadm Tainted: P O 4.2.8 #1 [57600.187849] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS QV05AR66 03/06/2013 [57600.197490] task: ffff880044e47240 ti: ffff880043070000 task.ti: ffff880043070000 [57600.204963] RIP: 0010:[<ffffffff81a6aa87>] [<ffffffff81a6aa87>] _raw_spin_lock+0x7/0x20 [57600.213057] RSP: 0018:ffff880043073810 EFLAGS: 00010046 [57600.218359] RAX: 0000000000000000 RBX: 000000000000000c RCX: ffff88011e296dd0 [57600.225486] RDX: 0000000000000001 RSI: ffffe8ffffcb46c0 RDI: 0000000000000000 [57600.232613] RBP: ffff880043073878 R08: ffff88011e5f8170 R09: 0000000000000282 [57600.239739] R10: 0000000000000005 R11: 28f5c28f5c28f5c3 R12: ffff880043073838 [57600.246872] R13: ffffe8ffffcb46c0 R14: 0000000000000000 R15: ffff8800b9706a00 [57600.253999] FS: 00007f576106c700(0000) GS:ffff88011e280000(0000) knlGS:0000000000000000 [57600.262078] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [57600.267817] CR2: 0000000000000000 CR3: 00000000428fe000 CR4: 00000000001406e0 [57600.274942] Stack: [57600.276949] ffffffff8114ee35 ffff880043073868 0000000000000282 000000000000eb3f [57600.284383] ffffffff81119043 ffff880043073838 ffff880043073838 ffff88003e197b98 [57600.291820] ffffe8ffffcb46c0 ffff88003e197360 0000000000000286 ffff880043073968 [57600.299254] Call Trace: [57600.301698] [<ffffffff8114ee35>] ? 
cache_flusharray+0x35/0xe0 [57600.307523] [<ffffffff81119043>] ? __page_cache_release+0x23/0x110 [57600.313779] [<ffffffff8114eb53>] kmem_cache_free+0x63/0xc0 [57600.319344] [<ffffffff81579942>] drop_one_stripe+0x62/0x90 [57600.324915] [<ffffffff81579b5b>] raid5_cache_scan+0x8b/0xb0 [57600.330563] [<ffffffff8111b98a>] shrink_slab.part.36+0x19a/0x250 [57600.336650] [<ffffffff8111e38c>] shrink_zone+0x23c/0x250 [57600.342039] [<ffffffff8111e4f3>] do_try_to_free_pages+0x153/0x420 [57600.348210] [<ffffffff8111e851>] try_to_free_pages+0x91/0xa0 [57600.353959] [<ffffffff811145b1>] __alloc_pages_nodemask+0x4d1/0x8b0 [57600.360303] [<ffffffff8157a30b>] check_reshape+0x62b/0x770 [57600.365866] [<ffffffff8157a4a5>] raid5_check_reshape+0x55/0xa0 [57600.371778] [<ffffffff81583df7>] update_raid_disks+0xc7/0x110 [57600.377604] [<ffffffff81592b73>] md_ioctl+0xd83/0x1b10 [57600.382827] [<ffffffff81385380>] blkdev_ioctl+0x170/0x690 [57600.388307] [<ffffffff81195238>] block_ioctl+0x38/0x40 [57600.393525] [<ffffffff811731c5>] do_vfs_ioctl+0x2b5/0x480 [57600.399010] [<ffffffff8115e07b>] ? vfs_write+0x14b/0x1f0 [57600.404400] [<ffffffff811733cc>] SyS_ioctl+0x3c/0x70 [57600.409447] [<ffffffff81a6ad97>] entry_SYSCALL_64_fastpath+0x12/0x6a [57600.415875] Code: 00 00 00 00 55 48 89 e5 8b 07 85 c0 74 04 31 c0 5d c3 ba 01 00 00 00 f0 0f b1 17 85 c0 75 ef b0 01 5d c3 90 31 c0 ba 01 00 00 00 <f0> 0f b1 17 85 c0 75 01 c3 55 89 c6 48 89 e5 e8 85 d1 63 ff 5d [57600.435460] RIP [<ffffffff81a6aa87>] _raw_spin_lock+0x7/0x20 [57600.441208] RSP <ffff880043073810> [57600.444690] CR2: 0000000000000000 [57600.448000] ---[ end trace cbc6b5cc4bf9831d ]--- The problem is that resize_stripes() releases new stripe_heads before assigning new slab cache to conf->slab_cache. 
If the shrinker function raid5_cache_scan() gets called after resize_stripes() starts releasing new stripes but right before the new slab cache is assigned, it is possible that these new stripe_heads will be freed with the old slab_cache, which has already been destroyed, and that triggers this bug. Signed-off-by: Dennis Yang <dennisyang@qnap.com> Fixes: `edbe83ab4c` ("md/raid5: allow the stripe_cache to grow and shrink.") Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Joe Thornber	d7252903e2	dm space map disk: fix some book keeping in the disk space map commit `0377a07c7a` upstream. When decrementing the reference count for a block, the free count wasn't being updated if the reference count went to zero. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Joe Thornber	f568a10289	dm thin metadata: call precommit before saving the roots commit `91bcdb92d3` upstream. These calls were the wrong way round in __write_initial_superblock. Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Mikulas Patocka	17cedf3ba1	dm bufio: make the parameter "retain_bytes" unsigned long commit `13840d3801` upstream. Change the type of the parameter "retain_bytes" from unsigned to unsigned long, so that on 64-bit machines the user can set more than 4GiB of data to be retained. Also, change the type of the variable "count" in the function "__evict_old_buffers" to unsigned long. The assignment "count = c->n_buffers[LIST_CLEAN] + c->n_buffers[LIST_DIRTY];" could result in unsigned long to unsigned overflow and that could result in buffers not being freed when they should. While at it, avoid division in get_retain_buffers(). Division is slow, we can change it to shift because we have precalculated the log2 of block size. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Mike Snitzer	2e80638cee	dm cache metadata: fail operations if fail_io mode has been established commit `10add84e27` upstream. Otherwise it is possible to trigger crashes due to the metadata being inaccessible yet these methods don't safely account for that possibility without these checks. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Bart Van Assche	a5b9afd690	dm mpath: delay requeuing while path initialization is in progress commit `c1d7ecf7ca` upstream. Requeuing a request immediately while path initialization is ongoing causes high CPU usage, something that is undesired. Hence delay requeuing while path initialization is in progress. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Bart Van Assche	5087ded2bb	dm mpath: avoid that path removal can trigger an infinite loop commit `7083abbbfc` upstream. If blk_get_request() fails, check whether the failure is due to a path being removed. If that is the case, fail the path by triggering a call to fail_path(). This avoids that the following scenario can be encountered while removing paths: * CPU usage of a kworker thread jumps to 100%. * Removing the DM device becomes impossible. Delay requeueing if blk_get_request() returns -EBUSY or -EWOULDBLOCK, and the queue is not dying, because in these cases immediate requeuing is inappropriate. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:12 +02:00
Bart Van Assche	5a09414335	dm mpath: split and rename activate_path() to prepare for its expanded use commit `89bfce763e` upstream. activate_path() is renamed to activate_path_work() which now calls activate_or_offline_path(). activate_or_offline_path() will be used by the next commit. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Bart Van Assche	4ff00db718	dm mpath: requeue after a small delay if blk_get_request() fails commit `06eb061f48` upstream. If blk_get_request() returns ENODEV then multipath_clone_and_map() causes a request to be requeued immediately. This can cause a kworker thread to spend 100% of the CPU time of a single core in __blk_mq_run_hw_queue() and also can cause device removal to never finish. Avoid this by only requeuing after a delay if blk_get_request() fails. Additionally, reduce the requeue delay. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Mikulas Patocka	7d67cd026d	dm bufio: check new buffer allocation watermark every 30 seconds commit `390020ad2a` upstream. dm-bufio checks a watermark when it allocates a new buffer in __bufio_new(). However, it doesn't check the watermark when the user changes /sys/module/dm_bufio/parameters/max_cache_size_bytes. This may result in a problem - if the watermark is high enough so that all possible buffers are allocated and if the user lowers the value of "max_cache_size_bytes", the watermark will never be checked against the new value because no new buffer would be allocated. To fix this, change __evict_old_buffers() so that it checks the watermark. __evict_old_buffers() is called every 30 seconds, so if the user reduces "max_cache_size_bytes", dm-bufio will react to this change within 30 seconds and decrease memory consumption. Depends-on: `1b0fb5a5b2` ("dm bufio: avoid a possible ABBA deadlock") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Mikulas Patocka	2aac21e001	dm bufio: avoid a possible ABBA deadlock commit `1b0fb5a5b2` upstream. __get_memory_limit() tests if dm_bufio_cache_size changed and calls __cache_size_refresh() if it did. It takes dm_bufio_clients_lock while it already holds the client lock. However, lock ordering is violated because in cleanup_old_buffers() dm_bufio_clients_lock is taken before the client lock. This results in a possible deadlock and lockdep engine warning. Fix this deadlock by changing mutex_lock() to mutex_trylock(). If the lock can't be taken, it will be re-checked next time when a new buffer is allocated. Also add "unlikely" to the if condition, so that the optimizer assumes that the condition is false. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Mikulas Patocka	539ee78649	dm raid: select the Kconfig option CONFIG_MD_RAID0 commit `7b81ef8b14` upstream. Since the commit `0cf4503174` ("dm raid: add support for the MD RAID0 personality"), the dm-raid subsystem can activate a RAID-0 array. Therefore, add MD_RAID0 to the dependencies of DM_RAID, so that MD_RAID0 will be selected when DM_RAID is selected. Fixes: `0cf4503174` ("dm raid: add support for the MD RAID0 personality") Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Vinothkumar Raja	11dfdb1a46	dm btree: fix for dm_btree_find_lowest_key() commit `7d1fedb6e9` upstream. dm_btree_find_lowest_key() is giving incorrect results. find_key() traverses the btree correctly for finding the highest key, but there is an error in the way it traverses the btree for retrieving the lowest key. dm_btree_find_lowest_key() fetches the first key of the rightmost block of the btree instead of fetching the first key from the leftmost block. Fix this by conditionally passing the correct parameter to value64() based on the @find_highest flag. Signed-off-by: Erez Zadok <ezk@fsl.cs.sunysb.edu> Signed-off-by: Vinothkumar Raja <vinraja@cs.stonybrook.edu> Signed-off-by: Nidhi Panpalia <npanpalia@cs.stonybrook.edu> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Paolo Abeni	cd21f4af21	infiniband: call ipv6 route lookup via the stub interface commit `eea40b8f62` upstream. The infiniband address handle can be triggered to resolve an ipv6 address in response to MAD packets, regardless of the ipv6 module being disabled via the kernel command line argument. That will cause a call into the ipv6 routing code, which is not initialized, and a consequent oops. This commit addresses the above issue by replacing the direct lookup call with an indirect one via the ipv6 stub, which is properly initialized according to the ipv6 status (e.g. if ipv6 is disabled, the routing lookup fails gracefully). Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Sagi Grimberg	3992c93b74	mlx5: Fix mlx5_ib_map_mr_sg mr length commit `0a49f2c31c` upstream. In case we got an initial sg_offset, we need to account for it in the mr length. Fixes: `ff2ba99365` ("IB/core: Add passing an offset into the SG to ib_map_mr_sg") Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Israel Rukshin <israelr@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:11 +02:00
Alexander Sverdlin	cca9c04ad4	ASoC: cs4271: configure reset GPIO as output commit `49b2e27ab9` upstream. During reset "refactoring" the output configuration was lost. This commit repairs sound on EDB93XX boards. Fixes: `9a397f4` ("ASoC: cs4271: add regulator consumer support") Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Petr Vandrovec	a7caccf886	tpm: fix handling of the TPM 2.0 event logs commit `fd5c78694f` upstream. When TPM2 log has entries with more than 3 digests, or with digests not listed in the log header, log gets misparsed, eventually leading to kernel complaint that code tried to vmalloc 512MB of memory (I have no idea what would happen on bigger system). So code should not parse only first 3 digests: both event header and event itself are already in memory, so we can parse any number of digests, as long as we do not try to parse whole memory when given count of 0xFFFFFFFF. So this change: * Rejects event entry with more digests than log header describes. Digest types should be unique, and all should be described in log header, so there cannot be more digests in the event than in the header. * Reject event entry with digest that is not described in the log header. In theory code could hardcode information about digest IDs already assigned by TCG, but if firmware authors cannot get event log format right, why should anyone believe that they got event log content right. Fixes: `4d23cc323c` ("tpm: add securityfs support for TPM 2.0 firmware event log") Signed-off-by: Petr Vandrovec <petr@vmware.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Hon Ching (Vicky) Lo	455edd3ceb	vTPM: Fix missing NULL check commit `31574d321c` upstream. The current code passes the address of tpm_chip as the argument to dev_get_drvdata() without prior NULL check in tpm_ibmvtpm_get_desired_dma. This resulted in an oops during kernel boot when vTPM is enabled in Power partition configured in active memory sharing mode. The vio_driver's get_desired_dma() is called before the probe(), which for vtpm is tpm_ibmvtpm_probe, and it's this latter function that initializes the driver and sets data. Attempting to get data before the probe() caused the problem. This patch adds a NULL check to the tpm_ibmvtpm_get_desired_dma. fixes: `9e0d39d8a6` ("tpm: Remove useless priv field in struct tpm_vendor_specific") Signed-off-by: Hon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkine <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Jerry Snitselaar	b06ad9f0bf	tpm_crb: check for bad response size commit `8569defde8` upstream. Make sure size of response buffer is at least 6 bytes, or we will underflow and pass large size_t to memcpy_fromio(). This was encountered while testing earlier version of locality patchset. Fixes: `30fc8d138e` ("tpm: TPM 2.0 CRB Interface") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Nayna Jain	7cb54bfbd5	tpm: add sleep only for retry in i2c_nuvoton_write_status() commit `0afb7118ae` upstream. Currently, there is an unnecessary 1 msec delay added in i2c_nuvoton_write_status() for the successful case. This function is called multiple times during send() and recv(), which implies adding multiple extra delays for every TPM operation. This patch calls usleep_range() only if retry is to be done. Signed-off-by: Nayna Jain <nayna@linux.vnet.ibm.com> Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Nayna Jain	6705f253bc	tpm: msleep() delays - replace with usleep_range() in i2c nuvoton driver commit `a233a0289c` upstream. Commit `500462a9de` "timers: Switch to a non-cascading wheel" replaced the 'classic' timer wheel, which aimed for near 'exact' expiry of the timers. Their analysis was that the vast majority of timeout timers are used as safeguards, not as real timers, and are cancelled or rearmed before expiration. The only exception noted to this were networking timers with a small expiry time. Not included in the analysis was the TPM polling timer, which resulted in a longer normal delay and, every so often, a very long delay. The non-cascading wheel delay is based on CONFIG_HZ. For a description of the different rings and their delays, refer to the comments in kernel/time/timer.c. Below are the delays given for rings 0 - 2, which explains the longer "normal" delays and the very, long delays as seen on systems with CONFIG_HZ 250. * HZ 1000 steps * Level Offset Granularity Range * 0 0 1 ms 0 ms - 63 ms * 1 64 8 ms 64 ms - 511 ms * 2 128 64 ms 512 ms - 4095 ms (512ms - ~4s) * HZ 250 * Level Offset Granularity Range * 0 0 4 ms 0 ms - 255 ms * 1 64 32 ms 256 ms - 2047 ms (256ms - ~2s) * 2 128 256 ms 2048 ms - 16383 ms (~2s - ~16s) Below is a comparison of extending the TPM with 1000 measurements, using msleep() vs. usleep_delay() when configured for 1000 hz vs. 250 hz, before and after commit `500462a9de`. linux-4.7 \| msleep() usleep_range() 1000 hz: 0m44.628s \| 1m34.497s 29.243s 250 hz: 1m28.510s \| 4m49.269s 32.386s linux-4.7 \| min-max (msleep) min-max (usleep_range) 1000 hz: 0:017 - 2:760s \| 0:015 - 3:967s 0:014 - 0:418s 250 hz: 0:028 - 1:954s \| 0:040 - 4:096s 0:016 - 0:816s This patch replaces the msleep() with usleep_range() calls in the i2c nuvoton driver with a consistent max range value. 
Signed-of-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Nayna Jain <nayna@linux.vnet.ibm.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Peter Huewe	84eef50e82	tpm_tis_spi: Add small delay after last transfer commit `5cc0101d1f` upstream. Testing the implementation with a Raspberry Pi 2 showed that under some circumstances its SPI master erroneously releases the CS line before the transfer is complete, i.e. before the end of the last clock. In this case the TPM ignores the transfer and misses for example the GO command. The driver is unable to detect this communication problem and will wait for a command response that is never going to arrive, timing out eventually. As a workaround, the small delay ensures that the CS line is held long enough, even with a faulty SPI master. Other SPI masters are not affected, except for a negligible performance penalty. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Peter Huewe	32a6b61947	tpm_tis_spi: Remove limitation of transfers to MAX_SPI_FRAMESIZE bytes commit `591e48c26c` upstream. Limiting transfers to MAX_SPI_FRAMESIZE was not expected by the upper layers, as tpm_tis has no such limitation. Add a loop to hide that limitation. v2: Moved scope of spi_message to the top as requested by Jarkko Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:10 +02:00
Peter Huewe	e2fbfd47e3	tpm_tis_spi: Check correct byte for wait state indicator commit `e110cc69dc` upstream. Wait states are signaled in the last byte received from the TPM in response to the header, not the first byte. Check rx_buf[3] instead of rx_buf[0]. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Peter Huewe	6b2a246ad0	tpm_tis_spi: Abort transfer when too many wait states are signaled commit `975094ddc3` upstream. Abort the transfer with ETIMEDOUT when the TPM signals more than TPM_RETRY wait states. Continuing with the transfer in this state will only lead to arbitrary failures in other parts of the code. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Peter Huewe	b0e2a93c28	tpm_tis_spi: Use single function to transfer data commit `f848f2143a` upstream. The algorithm for sending data to the TPM is mostly identical to the algorithm for receiving data from the TPM, so a single function is sufficient to handle both cases. This is a prerequisite for all the other fixes, so we don't have to fix everything twice (send/receive). v2: u16 instead of u8 for the length. Fixes: `0edbfea537` ("tpm/tpm_tis_spi: Add support for spi phy") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Benoit Houyere <benoit.houyere@st.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Amir Goldstein	580c656a95	fanotify: don't expose EOPENSTALE to userspace commit `4ff33aafd3` upstream. When delivering an event to userspace for a file on an NFS share, if the file is deleted on server side before user reads the event, user will not get the event. If the event queue contained several events, the stale event is quietly dropped and read() returns to user with events read so far in the buffer. If the event queue contains a single stale event or if the stale event is a permission event, read() returns to user with the kernel internal error code 518 (EOPENSTALE), which is not a POSIX error code. Check the internal return value -EOPENSTALE in fanotify_read(), just the same as it is checked in path_openat() and drop the event in the cases that it is not already dropped. This is a reproducer from Marko Rauhamaa: Just take the example program listed under "man fanotify" ("fantest") and follow these steps: ============================================================== NFS Server NFS Client(1) NFS Client(2) ============================================================== # echo foo >/nfsshare/bar.txt # cat /nfsshare/bar.txt foo # ./fantest /nfsshare Press enter key to terminate. Listening for events. # rm -f /nfsshare/bar.txt # cat /nfsshare/bar.txt read: Unknown error 518 cat: /nfsshare/bar.txt: Operation not permitted ============================================================== where NFS Client (1) and (2) are two terminal sessions on a single NFS Client machine. Reported-by: Marko Rauhamaa <marko.rauhamaa@f-secure.com> Tested-by: Marko Rauhamaa <marko.rauhamaa@f-secure.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Jeeja KP	826ca313fd	ALSA: hda: Fix cpu lockup when stopping the cmd dmas commit `960013762d` upstream. Using jiffies in hdac_wait_for_cmd_dmas() to determine when to time out when interrupts are off (snd_hdac_bus_stop_cmd_io()/spin_lock_irq()) causes hard lockup so unlock while waiting using jiffies. ---<-snip->--- <0>[ 1211.603046] NMI watchdog: Watchdog detected hard LOCKUP on cpu 3 <4>[ 1211.603047] Modules linked in: snd_hda_intel i915 vgem <4>[ 1211.603053] irq event stamp: 13366 <4>[ 1211.603053] hardirqs last enabled at (13365): ... <4>[ 1211.603059] Call Trace: <4>[ 1211.603059] ? delay_tsc+0x3d/0xc0 <4>[ 1211.603059] __delay+0xa/0x10 <4>[ 1211.603060] __const_udelay+0x31/0x40 <4>[ 1211.603060] snd_hdac_bus_stop_cmd_io+0x96/0xe0 [snd_hda_core] <4>[ 1211.603060] ? azx_dev_disconnect+0x20/0x20 [snd_hda_intel] <4>[ 1211.603061] snd_hdac_bus_stop_chip+0xb1/0x100 [snd_hda_core] <4>[ 1211.603061] azx_stop_chip+0x9/0x10 [snd_hda_codec] <4>[ 1211.603061] azx_suspend+0x72/0x220 [snd_hda_intel] <4>[ 1211.603061] pci_pm_suspend+0x71/0x140 <4>[ 1211.603062] dpm_run_callback+0x6f/0x330 <4>[ 1211.603062] ? pci_pm_freeze+0xe0/0xe0 <4>[ 1211.603062] __device_suspend+0xf9/0x370 <4>[ 1211.603062] ? dpm_watchdog_set+0x60/0x60 <4>[ 1211.603063] async_suspend+0x1a/0x90 <4>[ 1211.603063] async_run_entry_fn+0x34/0x160 <4>[ 1211.603063] process_one_work+0x1f4/0x6d0 <4>[ 1211.603063] ? process_one_work+0x16e/0x6d0 <4>[ 1211.603064] worker_thread+0x49/0x4a0 <4>[ 1211.603064] kthread+0x107/0x140 <4>[ 1211.603064] ? process_one_work+0x6d0/0x6d0 <4>[ 1211.603065] ? 
kthread_create_on_node+0x40/0x40 <4>[ 1211.603065] ret_from_fork+0x2e/0x40 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100419 Fixes: `38b19ed7f8` ("ALSA: hda: fix to wait for RIRB & CORB DMA to set") Reported-by: Marta Lofstedt <marta.lofstedt@intel.com> Suggested-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Jeeja KP <jeeja.kp@intel.com> Acked-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Alexander Steffen	d2b8a861a6	tpm_tis_core: Choose appropriate timeout for reading burstcount commit `302a6ad7fc` upstream. TIS v1.3 for TPM 1.2 and PTP for TPM 2.0 disagree about which timeout value applies to reading a valid burstcount. It is TIMEOUT_D according to TIS, but TIMEOUT_A according to PTP, so choose the appropriate value depending on whether we deal with a TPM 1.2 or a TPM 2.0. This is important since according to the PTP TIMEOUT_D is much smaller than TIMEOUT_A. So the previous implementation could run into timeouts with a TPM 2.0, even though the TPM was behaving perfectly fine. During tpm2_probe TIMEOUT_D will be used even with a TPM 2.0, because TPM_CHIP_FLAG_TPM2 is not yet set. This is fine, since the timeout values will only be changed afterwards by tpm_get_timeouts. Until then TIS_TIMEOUT_D_MAX applies, which is large enough. Fixes: `aec04cbdf7` ("tpm: TPM 2.0 FIFO Interface") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Signed-off-by: Peter Huewe <peter.huewe@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
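The timeout selection described in the commit above can be sketched in plain C. This is a userspace illustration only, not the driver code: `struct chip_sketch`, its fields, and `burstcount_timeout()` are hypothetical names, and the timeout values are arbitrary numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the chip state; not the kernel's struct tpm_chip. */
struct chip_sketch {
    bool is_tpm2;            /* plays the role of TPM_CHIP_FLAG_TPM2 */
    unsigned long timeout_a; /* TIMEOUT_A, arbitrary units */
    unsigned long timeout_d; /* TIMEOUT_D, arbitrary units */
};

/*
 * Pick the timeout for reading a valid burstcount:
 * TIMEOUT_A for TPM 2.0 (per PTP), TIMEOUT_D for TPM 1.2 (per TIS).
 */
static unsigned long burstcount_timeout(const struct chip_sketch *chip)
{
    assert(chip != NULL);
    return chip->is_tpm2 ? chip->timeout_a : chip->timeout_d;
}
```

As the commit message notes, the TPM 2.0 flag is not yet set during probing, so a selector like this would fall back to the TIMEOUT_D branch until the flag is known.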
Vamsi Krishna Samavedam	b3bf0bbc1e	USB: core: replace %p with %pK commit `2f964780c0` upstream. Format specifier %p can leak kernel addresses because it does not honor the kptr_restrict system setting. When kptr_restrict is set to 1, kernel pointers printed using the %pK format specifier are replaced with zeros. Debugging note: %pK prints only zeros as the address. If you need actual address information, write 0 to kptr_restrict. echo 0 > /proc/sys/kernel/kptr_restrict [Found by poking around in a random vendor kernel tree, it would be nice if someone would actually send these types of patches upstream - gkh] Signed-off-by: Vamsi Krishna Samavedam <vskrishn@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Willy Tarreau	28c7411cdb	char: lp: fix possible integer overflow in lp_setup() commit `3e21f4af17` upstream. The lp_setup() code doesn't apply any bounds checking when passing "lp=none", and only in this case, resulting in an overflow of the parport_nr[] array. All versions in Git history are affected. Reported-By: Roee Hay <roee.hay@hcl.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
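A minimal sketch of the kind of bounds check the lp_setup() commit above describes, assuming a fixed-size table filled one entry per "lp=none" argument. The names `SLOT_COUNT`, `SLOT_NONE`, `slots`, and `add_none_entry()` are hypothetical stand-ins and do not come from the driver.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 8     /* hypothetical table capacity */
#define SLOT_NONE  (-2)  /* hypothetical marker for a "none" entry */

static int slots[SLOT_COUNT];
static size_t next_slot;

/* Record one "none" entry; returns 0 on success, -1 when the table is full. */
static int add_none_entry(void)
{
    if (next_slot >= SLOT_COUNT)
        return -1;       /* refuse instead of writing past the array */
    slots[next_slot++] = SLOT_NONE;
    return 0;
}
```

Without the check, a ninth call would write past the end of `slots`, which is the class of overflow the commit fixes for parport_nr[].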
Johan Hovold	37f84f8b99	watchdog: pcwd_usb: fix NULL-deref at probe commit `46c319b848` upstream. Make sure to check the number of endpoints to avoid dereferencing a NULL-pointer should a malicious device lack endpoints. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:09 +02:00
Alan Stern	a646b2974a	USB: ene_usb6250: fix DMA to the stack commit `628c2893d4` upstream. The ene_usb6250 sub-driver in usb-storage does USB I/O to buffers on the stack, which doesn't work with vmapped stacks. This patch fixes the problem by allocating a separate 512-byte buffer at probe time and using it for all of the offending I/O operations. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Andreas Hartmann <andihartmann@01019freenet.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:08 +02:00
Maksim Salau	dff0ad3b9c	usb: misc: legousbtower: Fix memory leak commit `0bd193d62b` upstream. get_version_reply is not freed if function returns with success. Fixes: `942a48730f` ("usb: misc: legousbtower: Fix buffers on stack") Reported-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Maksim Salau <maksim.salau@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:08 +02:00
Maksim Salau	cfd3c22c36	usb: misc: legousbtower: Fix buffers on stack commit `942a48730f` upstream. Allocate buffers on HEAP instead of STACK for local structures that are to be received using usb_control_msg(). Signed-off-by: Maksim Salau <maksim.salau@gmail.com> Tested-by: Alfredo Rafael Vicente Boix <alviboi@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-25 15:46:08 +02:00
Greg Kroah-Hartman	02d8683735	Linux 4.11.2	2017-05-20 14:50:04 +02:00
Kees Cook	bbc105f387	pstore: Shut down worker when unregistering commit `6330d55347` upstream. When built as a module and running with update_ms >= 0, pstore will Oops during module unload since the work timer is still running. This makes sure the worker is stopped before unloading. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Kees Cook	ed8834ea4f	pstore: Use dynamic spinlock initializer commit `e9a330c428` upstream. The per-prz spinlock should be using the dynamic initializer so that lockdep can correctly track it. Without this, under lockdep, we get a warning at boot that the lock is in non-static memory. Fixes: `109704492e` ("pstore: Make spinlock per zone instead of global") Fixes: `76d5692a58` ("pstore: Correctly initialize spinlock and flags") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Ankit Kumar	f25c78c879	pstore: Fix flags to enable dumps on powerpc commit `041939c1ec` upstream. After commit `c950fd6f20`, the kernel registers pstore write based on the flags set. Pstore write for powerpc is broken because the PSTORE_FLAGS_DMESG flag is not set for the powerpc architecture. On panic, the kernel doesn't write the message to /fs/pstore/dmesg* (the entry doesn't get created at all). This patch enables pstore write for the powerpc architecture by setting the PSTORE_FLAGS_DMESG flag. Fixes: `c950fd6f20` ("pstore: Split pstore fragile flags") Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Dan Williams	af0eb80e8e	libnvdimm, pfn: fix 'npfns' vs section alignment commit `d5483feda8` upstream. Fix failures to create namespaces due to the vmem_altmap not advertising enough free space to store the memmap. WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0 [..] Call Trace: dump_stack+0x63/0x83 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xde/0xf0 devm_memremap_pages+0x244/0x440 pmem_attach_disk+0x37e/0x490 [nd_pmem] nd_pmem_probe+0x7e/0xa0 [nd_pmem] nvdimm_bus_probe+0x71/0x120 [libnvdimm] driver_probe_device+0x2bb/0x460 bind_store+0x114/0x160 drv_attr_store+0x25/0x30 In commit `658922e57b` "libnvdimm, pfn: fix memmap reservation sizing" we arranged for the capacity to be allocated, but failed to also update the 'npfns' parameter. This leads to cases where there is enough capacity reserved to hold all the allocated sections, but vmemmap_populate_hugepages() still encounters -ENOMEM from altmap_alloc_block_buf(). This fix is a stop-gap until we can teach the core memory hotplug implementation to permit sub-section hotplug. Fixes: `658922e57b` ("libnvdimm, pfn: fix memmap reservation sizing") Reported-by: Anisha Allada <anisha.allada@intel.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Dan Williams	a3ff3ebdf3	libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering commit `452bae0aed` upstream. A debug patch to turn the standard device_lock() into something that lockdep can analyze yielded the following: ====================================================== [ INFO: possible circular locking dependency detected ] 4.11.0-rc4+ #106 Tainted: G O ------------------------------------------------------- lt-libndctl/1898 is trying to acquire lock: (&dev->nvdimm_mutex/3){+.+.+.}, at: [<ffffffffc023c948>] nd_attach_ndns+0x178/0x1b0 [libnvdimm] but task is already holding lock: (&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffc022e0b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&nvdimm_bus->reconfig_mutex){+.+.+.}: lock_acquire+0xf6/0x1f0 __mutex_lock+0x88/0x980 mutex_lock_nested+0x1b/0x20 nvdimm_bus_lock+0x21/0x30 [libnvdimm] nvdimm_namespace_capacity+0x1b/0x40 [libnvdimm] nvdimm_namespace_common_probe+0x230/0x510 [libnvdimm] nd_pmem_probe+0x14/0x180 [nd_pmem] nvdimm_bus_probe+0xa9/0x260 [libnvdimm] -> #0 (&dev->nvdimm_mutex/3){+.+.+.}: __lock_acquire+0x1107/0x1280 lock_acquire+0xf6/0x1f0 __mutex_lock+0x88/0x980 mutex_lock_nested+0x1b/0x20 nd_attach_ndns+0x178/0x1b0 [libnvdimm] nd_namespace_store+0x308/0x3c0 [libnvdimm] namespace_store+0x87/0x220 [libnvdimm] In this case '&dev->nvdimm_mutex/3' mirrors '&dev->mutex'. Fix this by replacing the use of device_lock() with nvdimm_bus_lock() to protect nd_{attach,detach}_ndns() operations. Fixes: `8c2f7e8658` ("libnvdimm: infrastructure for btt devices") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Toshi Kani	de21b800a6	libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify commit `b2518c78ce` upstream. The following BUG was observed when nd_pmem_notify() was called for a BTT device. The use of a pmem_device pointer is not valid with BTT. BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 IP: nd_pmem_notify+0x30/0xf0 [nd_pmem] Call Trace: nd_device_notify+0x40/0x50 child_notify+0x10/0x20 device_for_each_child+0x50/0x90 nd_region_notify+0x20/0x30 nd_device_notify+0x40/0x50 nvdimm_region_notify+0x27/0x30 acpi_nfit_scrub+0x341/0x590 [nfit] process_one_work+0x197/0x450 worker_thread+0x4e/0x4a0 kthread+0x109/0x140 Fix nd_pmem_notify() by setting nd_region and badblocks pointers properly for BTT. Cc: Vishal Verma <vishal.l.verma@intel.com> Fixes: `719994660c` ("libnvdimm: async notification support") Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Dan Williams	d2572f5b01	libnvdimm, region: fix flush hint detection crash commit `bc042fdfbb` upstream. In the case where a dimm does not have any associated flush hints the ndrd->flush_wpq array may be uninitialized leading to crashes with the following signature: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: region_visible+0x10f/0x160 [libnvdimm] Call Trace: internal_create_group+0xbe/0x2f0 sysfs_create_groups+0x40/0x80 device_add+0x2d8/0x650 nd_async_device_register+0x12/0x40 [libnvdimm] async_run_entry_fn+0x39/0x170 process_one_work+0x212/0x6c0 ? process_one_work+0x197/0x6c0 worker_thread+0x4e/0x4a0 kthread+0x10c/0x140 ? process_one_work+0x6c0/0x6c0 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x31/0x40 Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Fixes: `f284a4f237` ("libnvdimm: introduce nvdimm_flush() and nvdimm_has_flush()") Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Joeseph Chang	d517be5133	ipmi: Fix kernel panic at ipmi_ssif_thread() commit `6de65fcfdb` upstream. msg_written_handler() may set ssif_info->multi_data to NULL when using ipmitool to write fru. Before setting ssif_info->multi_data to NULL, add new local pointer "data_to_send" and store correct i2c data pointer to it to fix NULL pointer kernel panic and incorrect ssif_info->multi_pos. Signed-off-by: Joeseph Chang <joechang@codeaurora.org> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:51 +02:00
Christoph Hellwig	8f0fde5bc9	libata: reject passthrough WRITE SAME requests commit `c6ade20f5e` upstream. The WRITE SAME to TRIM translation rewrites the DATA OUT buffer. While the SCSI code accommodates this by passing a read-writable buffer, userspace applications don't cater for this behavior. In fact it can be used to rewrite e.g. a readonly file through mmap and should be considered as a security fix. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Tejun Heo	a5938c093a	cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc() commit `a590b90d47` upstream. cgroup_get() expected to be called only on live cgroups and triggers warning on a dead cgroup; however, cgroup_sk_alloc() may be called while cloning a socket which is left in an empty and removed cgroup and thus may legitimately duplicate its reference on a dead cgroup. This currently triggers the following warning spuriously. WARNING: CPU: 14 PID: 0 at kernel/cgroup.c:490 cgroup_get+0x55/0x60 ... [<ffffffff8107e123>] __warn+0xd3/0xf0 [<ffffffff8107e20e>] warn_slowpath_null+0x1e/0x20 [<ffffffff810ff465>] cgroup_get+0x55/0x60 [<ffffffff81106061>] cgroup_sk_alloc+0x51/0xe0 [<ffffffff81761beb>] sk_clone_lock+0x2db/0x390 [<ffffffff817cce06>] inet_csk_clone_lock+0x16/0xc0 [<ffffffff817e8173>] tcp_create_openreq_child+0x23/0x4b0 [<ffffffff818601a1>] tcp_v6_syn_recv_sock+0x91/0x670 [<ffffffff817e8b16>] tcp_check_req+0x3a6/0x4e0 [<ffffffff81861ba3>] tcp_v6_rcv+0x693/0xa00 [<ffffffff81837429>] ip6_input_finish+0x59/0x3e0 [<ffffffff81837cb2>] ip6_input+0x32/0xb0 [<ffffffff81837387>] ip6_rcv_finish+0x57/0xa0 [<ffffffff81837ac8>] ipv6_rcv+0x318/0x4d0 [<ffffffff817778c7>] __netif_receive_skb_core+0x2d7/0x9a0 [<ffffffff81777fa6>] __netif_receive_skb+0x16/0x70 [<ffffffff81778023>] netif_receive_skb_internal+0x23/0x80 [<ffffffff817787d8>] napi_gro_frags+0x208/0x270 [<ffffffff8168a9ec>] mlx4_en_process_rx_cq+0x74c/0xf40 [<ffffffff8168b270>] mlx4_en_poll_rx_cq+0x30/0x90 [<ffffffff81778b30>] net_rx_action+0x210/0x350 [<ffffffff8188c426>] __do_softirq+0x106/0x2c7 [<ffffffff81082bad>] irq_exit+0x9d/0xa0 [<ffffffff8188c0e4>] do_IRQ+0x54/0xd0 [<ffffffff8188a63f>] common_interrupt+0x7f/0x7f <EOI> [<ffffffff8173d7e7>] cpuidle_enter+0x17/0x20 [<ffffffff810bdfd9>] cpu_startup_entry+0x2a9/0x2f0 [<ffffffff8103edd1>] start_secondary+0xf1/0x100 This patch renames the existing cgroup_get() with the dead cgroup warning to cgroup_get_live() after cgroup_kn_lock_live() and 
introduces the new cgroup_get() which doesn't check whether the cgroup is live or dead. All existing cgroup_get() users except for cgroup_sk_alloc() are converted to use cgroup_get_live(). Fixes: `d979a39d72` ("cgroup: duplicate cgroup reference when cloning sockets") Cc: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Chris Mason <clm@fb.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	740f485dae	Bluetooth: hci_intel: add missing tty-device sanity check commit `dcb9cfaa5e` upstream. Make sure to check the tty-device pointer before looking up the sibling platform device to avoid dereferencing a NULL-pointer when the tty is one end of a Unix98 pty. Fixes: `74cdad37cd` ("Bluetooth: hci_intel: Add runtime PM support") Fixes: `1ab1f239bf` ("Bluetooth: hci_intel: Add support for platform driver") Cc: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	a86af7983d	Bluetooth: hci_bcm: add missing tty-device sanity check commit `95065a61e9` upstream. Make sure to check the tty-device pointer before looking up the sibling platform device to avoid dereferencing a NULL-pointer when the tty is one end of a Unix98 pty. Fixes: `0395ffc1ee` ("Bluetooth: hci_bcm: Add PM for BCM devices") Cc: Frederic Danis <frederic.danis@linux.intel.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Szymon Janc	ded39dd9f0	Bluetooth: Fix user channel for 32bit userspace on 64bit kernel commit `ab89f0bdd6` upstream. Running 32bit userspace on 64bit kernel results in MSG_CMSG_COMPAT being defined as 0x80000000. This results in sendmsg failure if used from 32bit userspace running on 64bit kernel. Fix this by accounting for MSG_CMSG_COMPAT in flags check in hci_sock_sendmsg. Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl> Signed-off-by: Marko Kiiskila <marko@runtime.io> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
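The flags check described in the Bluetooth commit above can be illustrated with a small userspace sketch. It assumes a sendmsg-style validation that rejects any flag outside an allowed mask. `SKETCH_MSG_CMSG_COMPAT` is a local stand-in using the value quoted in the commit text (the kernel-internal flag is not assumed to be available from userspace headers), and both the allowed set and `check_send_flags()` are illustrative rather than the exact code in hci_sock_sendmsg.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Assumption: value quoted in the commit message for MSG_CMSG_COMPAT. */
#define SKETCH_MSG_CMSG_COMPAT 0x80000000u

/* Return 0 if only allowed flags are set, -EINVAL otherwise. */
static int check_send_flags(unsigned int flags)
{
    /* Illustrative allowed set; the compat bit is accepted explicitly. */
    const unsigned int allowed =
        MSG_DONTWAIT | MSG_NOSIGNAL | SKETCH_MSG_CMSG_COMPAT;

    if (flags & ~allowed)
        return -EINVAL;
    return 0;
}
```

If the compat bit were left out of `allowed`, every call arriving with that bit set would be rejected, which matches the sendmsg failure the commit reports for 32-bit userspace on a 64-bit kernel.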
Timur Tabi	01f9853326	tty: pl011: use "qdf2400_e44" as the earlycon name for QDF2400 E44 commit `5a0722b898` upstream. Define a new early console name for Qualcomm Datacenter Technologies QDF2400 SOCs affected by erratum 44, instead of piggy-backing on "pl011". Previously, to enable traditional (non-SPCR) earlycon, the documentation said to specify "earlycon=pl011,<address>,qdf2400_e44", but the code was broken and this didn't actually work. So instead, the method for specifying the E44 work-around with traditional earlycon is "earlycon=qdf2400_e44,<address>". Both methods of earlycon are now enabled with the same function. Fixes: `e53e597fd4` ("tty: pl011: fix earlycon work-around for QDF2400 erratum 44") Signed-off-by: Timur Tabi <timur@codeaurora.org> Tested-by: Shanker Donthineni <shankerd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Wang YanQing	34e01f9207	tty: pty: Fix ldisc flush after userspace become aware of the data already commit `77dae61344` upstream. While using emacs, cat or other commands in konsole with recent kernels, I have found many times that CTRL-C freezes konsole. After konsole freezes I can't type anything, so I have to open a new one, which is very annoying. See bug report: https://bugs.kde.org/show_bug.cgi?id=175283 The platform in that bug report is Solaris, but now the pty in linux has the same problem or the same behavior as Solaris :) The problem is very likely to be triggered by following the steps below: Note: In my test, BigFile is a text file whose size is bigger than 1G 1:open konsole 1:cat BigFile 2:CTRL-C After some digging, I found out the reason is that commit `1d1d14da12` ("pty: Fix buffer flush deadlock") changes the behavior of pty_flush_buffer. Thread A Thread B -------- -------- 1:n_tty_poll return POLLIN 2:CTRL-C trigger pty_flush_buffer tty_buffer_flush n_tty_flush_buffer 3:attempt to check count of chars: ioctl(fd, TIOCINQ, &available) available is equal to 0 4:read(fd, buffer, available) return 0 5:konsole close fd Yes, I know we could use the same patch included in the BUG report as a workaround for the linux platform too. But I think the data in ldisc belongs to the application on the other side, so we shouldn't clear it when we want to flush the write buffer of this side in pty_flush_buffer. So I think it is better to disable ldisc flush in pty_flush_buffer, because its new behavior brings no benefit except that it messes up the behavior between POLLIN, and TIOCINQ or FIONREAD. Also I find no flush_buffer function in other tty drivers that has the same behavior as the current pty_flush_buffer. Fixes: `1d1d14da12` ("pty: Fix buffer flush deadlock") Signed-off-by: Wang YanQing <udknight@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
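The reader-side sequence in the pty commit above (poll reports POLLIN, the byte count is queried, then read is called with that count) can be sketched as follows. This is a hedged illustration of the general pattern, not konsole's code: `read_announced()` is a hypothetical helper, and it uses FIONREAD, which the commit mentions alongside TIOCINQ.

```c
#include <assert.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Wait for readable data, ask how many bytes are queued, then read them.
 * Returns the number of bytes read, 0 if nothing is readable or the
 * queued count is zero, and -1 on error.
 */
static ssize_t read_announced(int fd, char *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int available = 0;
    int ready = poll(&pfd, 1, timeout_ms);

    if (ready < 0)
        return -1;
    if (ready == 0 || !(pfd.revents & POLLIN))
        return 0;
    if (ioctl(fd, FIONREAD, &available) < 0)
        return -1;
    if (available <= 0)
        return 0;   /* POLLIN was reported, but the data is already gone */
    if ((size_t)available > cap)
        available = (int)cap;
    return read(fd, buf, (size_t)available);
}
```

A caller that treats a zero-length read after POLLIN as end-of-file would close the descriptor at this point, which corresponds to step 5 of the sequence in the commit message.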
Johan Hovold	aad32798dc	serial: omap: suspend device on probe errors commit `77e6fe7fd2` upstream. Make sure to actually suspend the device before returning after a failed (or deferred) probe. Note that autosuspend must be disabled before runtime pm is disabled in order to balance the usage count due to a negative autosuspend delay as well as to make the final put suspend the device synchronously. Fixes: `388bc26226` ("omap-serial: Fix the error handling in the omap_serial probe") Cc: Shubhrajyoti D <shubhrajyoti@ti.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Johan Hovold	9e85b5c73a	serial: omap: fix runtime-pm handling on unbind commit `099bd73dc1` upstream. An unbalanced and misplaced synchronous put was used to suspend the device on driver unbind, something which with a likewise misplaced pm_runtime_disable leads to external aborts when an open port is being removed. Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa024010 ... [<c046e760>] (serial_omap_set_mctrl) from [<c046a064>] (uart_update_mctrl+0x50/0x60) [<c046a064>] (uart_update_mctrl) from [<c046a400>] (uart_shutdown+0xbc/0x138) [<c046a400>] (uart_shutdown) from [<c046bd2c>] (uart_hangup+0x94/0x190) [<c046bd2c>] (uart_hangup) from [<c045b760>] (__tty_hangup+0x404/0x41c) [<c045b760>] (__tty_hangup) from [<c045b794>] (tty_vhangup+0x1c/0x20) [<c045b794>] (tty_vhangup) from [<c046ccc8>] (uart_remove_one_port+0xec/0x260) [<c046ccc8>] (uart_remove_one_port) from [<c046ef4c>] (serial_omap_remove+0x40/0x60) [<c046ef4c>] (serial_omap_remove) from [<c04845e8>] (platform_drv_remove+0x34/0x4c) Fix this up by resuming the device before deregistering the port and by suspending and disabling runtime pm only after the port has been removed. Also make sure to disable autosuspend before disabling runtime pm so that the usage count is balanced and device actually suspended before returning. Note that due to a negative autosuspend delay being set in probe, the unbalanced put would actually suspend the device on first driver unbind, while rebinding and again unbinding would result in a negative power.usage_count. Fixes: `7e9c8e7dbf` ("serial: omap: make sure to suspend device before remove") Cc: Felipe Balbi <balbi@kernel.org> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Johan Hovold <johan@kernel.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:50 +02:00
Marek Szyprowski	cd823aa0bc	serial: samsung: Add missing checks for dma_map_single failure commit `500fcc08a3` upstream. This patch adds missing checks for dma_map_single() failure and proper error reporting. Although this issue was harmless on ARM architecture, it is always good to use the DMA mapping API in a proper way. This patch fixes the following DMA API debug warning: WARNING: CPU: 1 PID: 3785 at lib/dma-debug.c:1171 check_unmap+0x8a0/0xf28 dma-pl330 121a0000.pdma: DMA-API: device driver failed to check map error[device address=0x000000006e0f9000] [size=4096 bytes] [mapped as single] Modules linked in: CPU: 1 PID: 3785 Comm: (agetty) Tainted: G W 4.11.0-rc1-00137-g07ca963-dirty #59 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24) [<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0) [<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180) [<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50) [<c01395a4>] (warn_slowpath_fmt) from [<c072a114>] (check_unmap+0x8a0/0xf28) [<c072a114>] (check_unmap) from [<c072a834>] (debug_dma_unmap_page+0x98/0xc8) [<c072a834>] (debug_dma_unmap_page) from [<c0803874>] (s3c24xx_serial_shutdown+0x314/0x52c) [<c0803874>] (s3c24xx_serial_shutdown) from [<c07f5124>] (uart_port_shutdown+0x54/0x88) [<c07f5124>] (uart_port_shutdown) from [<c07f522c>] (uart_shutdown+0xd4/0x110) [<c07f522c>] (uart_shutdown) from [<c07f6a8c>] (uart_hangup+0x9c/0x208) [<c07f6a8c>] (uart_hangup) from [<c07c426c>] (__tty_hangup+0x49c/0x634) [<c07c426c>] (__tty_hangup) from [<c07c78ac>] (tty_ioctl+0xc88/0x16e4) [<c07c78ac>] (tty_ioctl) from [<c03b5f2c>] (do_vfs_ioctl+0xc4/0xd10) [<c03b5f2c>] (do_vfs_ioctl) from [<c03b6bf4>] (SyS_ioctl+0x7c/0x8c) [<c03b6bf4>] (SyS_ioctl) from [<c010b4a0>] (ret_fast_syscall+0x0/0x3c) Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Fixes: `62c37eedb7` ("serial: samsung: add dma reqest/release functions") 
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Marek Szyprowski	43a1c1ff12	serial: samsung: Use right device for DMA-mapping calls commit `768d64f491` upstream. Driver should provide its own struct device for all DMA-mapping calls instead of extracting device pointer from DMA engine channel. Although this is harmless from the driver operation perspective on ARM architecture, it is always good to use the DMA mapping API in a proper way. This patch fixes following DMA API debug warning: WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:1241 check_sync+0x520/0x9f4 samsung-uart 12c20000.serial: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000006df0f580] [size=64 bytes] Modules linked in: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-00137-g07ca963 #51 Hardware name: SAMSUNG EXYNOS (Flattened Device Tree) [<c011aaa4>] (unwind_backtrace) from [<c01127c0>] (show_stack+0x20/0x24) [<c01127c0>] (show_stack) from [<c06ba5d8>] (dump_stack+0x84/0xa0) [<c06ba5d8>] (dump_stack) from [<c0139528>] (__warn+0x14c/0x180) [<c0139528>] (__warn) from [<c01395a4>] (warn_slowpath_fmt+0x48/0x50) [<c01395a4>] (warn_slowpath_fmt) from [<c0729058>] (check_sync+0x520/0x9f4) [<c0729058>] (check_sync) from [<c072967c>] (debug_dma_sync_single_for_device+0x88/0xc8) [<c072967c>] (debug_dma_sync_single_for_device) from [<c0803c10>] (s3c24xx_serial_start_tx_dma+0x100/0x2f8) [<c0803c10>] (s3c24xx_serial_start_tx_dma) from [<c0804338>] (s3c24xx_serial_tx_chars+0x198/0x33c) Reported-by: Seung-Woo Kim <sw0312.kim@samsung.com> Fixes: `62c37eedb7` ("serial: samsung: add dma reqest/release functions") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Shuah Khan <shuahkh@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Eric Biggers	ab62f4118b	fscrypt: avoid collisions when presenting long encrypted filenames commit `6b06cdee81` upstream. When accessing an encrypted directory without the key, userspace must operate on filenames derived from the ciphertext names, which contain arbitrary bytes. Since we must support filenames as long as NAME_MAX, we can't always just base64-encode the ciphertext, since that may make it too long. Currently, this is solved by presenting long names in an abbreviated form containing any needed filesystem-specific hashes (e.g. to identify a directory block), then the last 16 bytes of ciphertext. This needs to be sufficient to identify the actual name on lookup. However, there is a bug. It seems to have been assumed that due to the use of a CBC (ciphertext block chaining)-based encryption mode, the last 16 bytes (i.e. the AES block size) of ciphertext would depend on the full plaintext, preventing collisions. However, we actually use CBC with ciphertext stealing (CTS), which handles the last two blocks specially, causing them to appear "flipped". Thus, it's actually the second-to-last block which depends on the full plaintext. This caused long filenames that differ only near the end of their plaintexts to, when observed without the key, point to the wrong inode and be undeletable. For example, with ext4: # echo pass \| e4crypt add_key -p 16 edir/ # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 \| xargs touch # find edir/ -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 100000 # sync # echo 3 > /proc/sys/vm/drop_caches # keyctl new_session # find edir/ -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 2004 # rm -rf edir/ rm: cannot remove 'edir/_A7nNFi3rhkEQlJ6P,hdzluhODKOeWx5V': Structure needs cleaning ... To fix this, when presenting long encrypted filenames, encode the second-to-last block of ciphertext rather than the last 16 bytes. 
Although it would be nice to solve this without depending on a specific encryption mode, that would mean doing a cryptographic hash like SHA-256 which would be much less efficient. This way is sufficient for now, and it's still compatible with encryption modes like HEH which are strong pseudorandom permutations. Also, changing the presented names is still allowed at any time because they are only provided to allow applications to do things like delete encrypted directories. They're not designed to be used to persistently identify files --- which would be hard to do anyway, given that they're encrypted after all. For ease of backports, this patch only makes the minimal fix to both ext4 and f2fs. It leaves ubifs as-is, since ubifs doesn't compare the ciphertext block yet. Follow-on patches will clean things up properly and make the filesystems use a shared helper function. Fixes: `5de0b4d0cd` ("ext4 crypto: simplify and speed up filename encryption") Reported-by: Gwendal Grignou <gwendal@chromium.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
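To make the "second-to-last block" choice in the fscrypt commit above concrete, here is a small sketch that computes where that block starts for a given ciphertext length. It assumes a 16-byte block size (the AES block size named in the commit) and a ciphertext longer than one block; `CTS_BLOCK_SIZE` and `digest_block_offset()` are hypothetical names, and the arithmetic is one way to express the idea rather than the patch's own code.

```c
#include <assert.h>
#include <stddef.h>

#define CTS_BLOCK_SIZE 16  /* AES block size, as stated in the commit text */

/*
 * Byte offset of the second-to-last ciphertext block: the last
 * block-aligned full block that is followed by at least one more byte
 * (the final block may be partial under ciphertext stealing).
 */
static size_t digest_block_offset(size_t len)
{
    assert(len > CTS_BLOCK_SIZE);
    return ((len - CTS_BLOCK_SIZE - 1) / CTS_BLOCK_SIZE) * CTS_BLOCK_SIZE;
}
```

For a 33-byte ciphertext the blocks are [0,16), [16,32) and a 1-byte tail, so the function yields 16; for exactly 32 bytes it yields 0.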
Eric Biggers	eb86b4c68b	fscrypt: fix context consistency check when key(s) unavailable commit `272f98f684` upstream. To mitigate some types of offline attacks, filesystem encryption is designed to enforce that all files in an encrypted directory tree use the same encryption policy (i.e. the same encryption context excluding the nonce). However, the fscrypt_has_permitted_context() function which enforces this relies on comparing struct fscrypt_info's, which are only available when we have the encryption keys. This can cause two incorrect behaviors: 1. If we have the parent directory's key but not the child's key, or vice versa, then fscrypt_has_permitted_context() returned false, causing applications to see EPERM or ENOKEY. This is incorrect if the encryption contexts are in fact consistent. Although we'd normally have either both keys or neither key in that case since the master_key_descriptors would be the same, this is not guaranteed because keys can be added or removed from keyrings at any time. 2. If we have neither the parent's key nor the child's key, then fscrypt_has_permitted_context() returned true, causing applications to see no error (or else an error for some other reason). This is incorrect if the encryption contexts are in fact inconsistent, since in that case we should deny access. To fix this, retrieve and compare the fscrypt_contexts if we are unable to set up both fscrypt_infos. While this slightly hurts performance when accessing an encrypted directory tree without the key, this isn't a case we really need to be optimizing for; access with the key is much more important. Furthermore, the performance hit is barely noticeable given that we are already retrieving the fscrypt_context and doing two keyring searches in fscrypt_get_encryption_info(). If we ever actually wanted to optimize this case we might start by caching the fscrypt_contexts. 
Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
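The fallback described in the entry above (compare the stored contexts when the keyed infos are unavailable) can be sketched in user space. This is a minimal illustration, not the kernel code: `struct demo_fscrypt_ctx` and `demo_contexts_consistent()` are hypothetical names, and the field layout is an assumption chosen only to show that every policy field is compared while the per-file nonce is ignored.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for an on-disk encryption context. */
struct demo_fscrypt_ctx {
	unsigned char contents_mode;
	unsigned char filenames_mode;
	unsigned char flags;
	unsigned char master_key_descriptor[8];
	unsigned char nonce[16];	/* per-file, not part of the policy */
};

/*
 * Two contexts describe the same policy when all fields except the
 * nonce agree. No key is needed to make this comparison.
 */
bool demo_contexts_consistent(const struct demo_fscrypt_ctx *a,
			      const struct demo_fscrypt_ctx *b)
{
	return a->contents_mode == b->contents_mode &&
	       a->filenames_mode == b->filenames_mode &&
	       a->flags == b->flags &&
	       memcmp(a->master_key_descriptor, b->master_key_descriptor,
		      sizeof(a->master_key_descriptor)) == 0;
}
```

Because the check reads only stored metadata, it gives the same answer whether or not either key is present in a keyring, which is the property both incorrect behaviors above were missing.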
Linus Torvalds	602f8911c7	initramfs: avoid "label at end of compound statement" error commit `394e4f5d58` upstream. Commit `17a9be3174` ("initramfs: Always do fput() and load modules after rootfs populate") introduced an error for the CONFIG_BLK_DEV_RAM=y case, because even though the code looks fine, the compiler really wants a statement after a label, or you'll get complaints: init/initramfs.c: In function 'populate_rootfs': init/initramfs.c:644:2: error: label at end of compound statement That commit moved the subsequent statements to outside the compound statement, leaving the label without any associated statements. Reported-by: Jörg Otte <jrg.otte@gmail.com> Fixes: `17a9be3174` ("initramfs: Always do fput() and load modules after rootfs populate") Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
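The compiler complaint quoted in the entry above is easy to reproduce outside the kernel. The sketch below is illustrative only: `populate()` and its parameters are hypothetical and do not mirror `populate_rootfs()`. It shows the usual remedy under older C standards, which is to give the label an empty statement when nothing else follows it inside the block.

```c
#include <stdbool.h>

/* Hypothetical example; returns the number of "steps" performed. */
int populate(bool have_initrd, bool unpack_failed)
{
	int steps = 0;

	if (have_initrd) {
		steps++;		/* pretend to unpack the initrd */
		if (unpack_failed)
			goto done;
		steps++;		/* pretend to free the initrd */
	done:
		;			/* a label must be followed by a statement */
	}

	steps++;			/* work that was moved outside the block */
	return steps;
}
```

Removing the lone `;` after `done:` leaves the label as the last item in the compound statement, which is what triggers "label at end of compound statement" on compilers that follow the older rules.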
Stafford Horne	e54af78e1a	initramfs: Always do fput() and load modules after rootfs populate commit `17a9be3174` upstream. In OpenRISC we do not have a bootloader passed initrd, but the built in initramfs does contain the /init and other binaries, including modules. The previous commit `0886551480` ("initramfs: finish fput() before accessing any binary from initramfs") made a change to only call fput() if the bootloader initrd was available, this caused intermittent crashes for OpenRISC. This patch changes the fput() to happen unconditionally if any rootfs is loaded. Also, I added some comments to make it a bit more clear why we call unpack_to_rootfs() multiple times. Fixes: `0886551480` ("initramfs: finish fput() before accessing any binary from initramfs") Cc: Lokesh Vutla <lokeshvutla@ti.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Stafford Horne <shorne@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jan Kara	9b3be66cb4	f2fs: Make flush bios explicitely sync commit `3adc5fcb7e` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_{FUA\|PREFLUSH\|...} definitions. generic_make_request_checks() however strips REQ_FUA and REQ_PREFLUSH flags from a bio when the storage doesn't report volatile write cache and thus write effectively becomes asynchronous which can lead to performance regressions. Fix the problem by making sure all bios which are synchronous are properly marked with REQ_SYNC. Fixes: `b685d3d65a` CC: Jaegeuk Kim <jaegeuk@kernel.org> CC: linux-f2fs-devel@lists.sourceforge.net Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jaegeuk Kim	bd3dfe5049	f2fs: check entire encrypted bigname when finding a dentry commit `6332cd32c8` upstream. If the user has no key for an encrypted directory, fscrypt presents digested dentries. Previously, when looking up a dentry, f2fs only checked its hash value against the first 4 bytes of the digested dentry, which did not fully handle hash collisions. This patch checks the entire dentry bytes, as ext4 does. Eric reported that the issue can be reproduced with: # seq -f "edir/abcdefghijklmnopqrstuvwxyz012345%.0f" 100000 \| xargs touch # find edir -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 100000 # sync # echo 3 > /proc/sys/vm/drop_caches # keyctl new_session # find edir -type f \| xargs stat -c %i \| sort \| uniq \| wc -l 99999 Reported-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> (fixed f2fs_dentry_hash() to work even when the hash is 0) Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
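Why a short comparison loses files, as in the 100000 versus 99999 inode count above, can be shown with a simplified model. The names here (`struct demo_dentry`, `match_prefix4()`, `match_full()`) are hypothetical, and the model compares raw name bytes rather than f2fs's real hash and digest layout, so treat it as a sketch of the idea only.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a directory entry holding encrypted name bytes. */
struct demo_dentry {
	const unsigned char *name;
	size_t len;
};

/* Weak check: only the first 4 bytes take part in the comparison. */
bool match_prefix4(const struct demo_dentry *de,
		   const unsigned char *want, size_t want_len)
{
	if (de->len < 4 || want_len < 4)
		return false;
	return memcmp(de->name, want, 4) == 0;
}

/* Strong check: the lengths and every byte must agree. */
bool match_full(const struct demo_dentry *de,
		const unsigned char *want, size_t want_len)
{
	return de->len == want_len && memcmp(de->name, want, want_len) == 0;
}
```

With the weak check, two distinct entries that share their first 4 bytes resolve to whichever one is found first, so one of them becomes unreachable by name.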
Sheng Yong	3a625468bd	f2fs: fix multiple f2fs_add_link() having same name for inline dentry commit `d3bb910c15` upstream. Commit `88c5c13a50` (f2fs: fix multiple f2fs_add_link() calls having same name) does not cover the scenario where inline dentry is enabled. In that case, F2FS_I(dir)->task will be NULL, and __f2fs_add_link will look up dentries one more time. This patch fixes it by moving the assignment of the current task to an upper level to cover both normal and inline dentry. Fixes: `88c5c13a50` (f2fs: fix multiple f2fs_add_link() calls having same name) Signed-off-by: Sheng Yong <shengyong1@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jaegeuk Kim	e71f099677	f2fs: fix fs corruption due to zero inode page commit `9bb02c3627` upstream. This patch fixes the following scenario. - f2fs_create/f2fs_mkdir - write_checkpoint - f2fs_mark_inode_dirty_sync - block_operations - f2fs_lock_all - f2fs_sync_inode_meta - f2fs_unlock_all - sync_inode_metadata - f2fs_lock_op - f2fs_write_inode - update_inode_page - get_node_page return -ENOENT - new_inode_page - fill_node_footer - f2fs_mark_inode_dirty_sync - ... - f2fs_unlock_op - f2fs_inode_synced - f2fs_lock_all - do_checkpoint In this checkpoint, we can get an inode page which contains zeros having valid node footer only. Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:49 +02:00
Jaegeuk Kim	ed4d26a1e4	Revert "f2fs: put allocate_segment after refresh_sit_entry" commit `c6f82fe90d` upstream. This reverts commit `3436c4bdb3`. That commit introduces a leak in registering dirty segments. I reproduced the issue with a modified postmark that injects a lot of file create/delete/update operations and finally triggers a huge number of SSR allocations. [Jaegeuk Kim: Change missing incorrect comment] Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jaegeuk Kim	33cbcc2556	f2fs: fix wrong max cost initialization commit `c541a51b8c` upstream. This patch fixes the missing increase of the max cost, which should have accompanied a patch that increased the cost of data segments in the greedy algorithm. Fixes: `b9cd20619` "f2fs: node segment is prior to data segment selected victim" Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Ross Zwisler	03606c8ecc	dax: fix PMD data corruption when fault races with write commit `876f29460c` upstream. This is based on a patch from Jan Kara that fixed the equivalent race in the DAX PTE fault path. Currently DAX PMD read fault can race with write(2) in the following way: CPU1 - write(2) CPU2 - read fault dax_iomap_pmd_fault() ->iomap_begin() - sees hole dax_iomap_rw() iomap_apply() ->iomap_begin - allocates blocks dax_iomap_actor() invalidate_inode_pages2_range() - there's nothing to invalidate grab_mapping_entry() - we add huge zero page to the radix tree and map it to page tables The result is that hole page is mapped into page tables (and thus zeros are seen in mmap) while file has data written in that place. Fix the problem by locking exception entry before mapping blocks for the fault. That way we are sure invalidate_inode_pages2_range() call for racing write will either block on entry lock waiting for the fault to finish (and unmap stale page tables after that) or read fault will see already allocated blocks by write(2). Fixes: `9f141d6ef6` ("dax: Call ->iomap_begin without entry lock during dax fault") Link: http://lkml.kernel.org/r/20170510172700.18991-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jan Kara	5a3651b4a9	ext4: return to starting transaction in ext4_dax_huge_fault() commit `fb26a1cbed` upstream. DAX will return to locking exceptional entry before mapping blocks for a page fault to fix possible races with concurrent writes. To avoid lock inversion between exceptional entry lock and transaction start, start the transaction already in ext4_dax_huge_fault(). Fixes: `9f141d6ef6` Link: http://lkml.kernel.org/r/20170510085419.27601-4-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jan Kara	85f353dbd0	mm: fix data corruption due to stale mmap reads commit `cd656375f9` upstream. Currently, we didn't invalidate page tables during invalidate_inode_pages2() for DAX. That could result in e.g. 2MiB zero page being mapped into page tables while there were already underlying blocks allocated and thus data seen through mmap were different from data seen by read(2). The following sequence reproduces the problem: - open an mmap over a 2MiB hole - read from a 2MiB hole, faulting in a 2MiB zero page - write to the hole with write(3p). The write succeeds but we incorrectly leave the 2MiB zero page mapping intact. - via the mmap, read the data that was just written. Since the zero page mapping is still intact we read back zeroes instead of the new data. Fix the problem by unconditionally calling invalidate_inode_pages2_range() in dax_iomap_actor() for new block allocations and by properly invalidating page tables in invalidate_inode_pages2_range() for DAX mappings. Fixes: `c6dcf52c23` Link: http://lkml.kernel.org/r/20170510085419.27601-3-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Ross Zwisler	06305f51ef	dax: prevent invalidation of mapped DAX entries commit `4636e70bb0` upstream. Patch series "mm,dax: Fix data corruption due to mmap inconsistency", v4. This series fixes data corruption that can happen for DAX mounts when page faults race with write(2) and as a result page tables get out of sync with block mappings in the filesystem and thus data seen through mmap is different from data seen through read(2). The series passes testing with t_mmap_stale test program from Ross and also other mmap related tests on DAX filesystem. This patch (of 4): dax_invalidate_mapping_entry() currently removes DAX exceptional entries only if they are clean and unlocked. This is done via: invalidate_mapping_pages() invalidate_exceptional_entry() dax_invalidate_mapping_entry() However, for page cache pages removed in invalidate_mapping_pages() there is an additional criterion, which is that the page must not be mapped. This is noted in the comments above invalidate_mapping_pages() and is checked in invalidate_inode_page(). For DAX entries this means that we can end up in a situation where a DAX exceptional entry, either a huge zero page or a regular DAX entry, could end up mapped but without an associated radix tree entry. This is inconsistent with the rest of the DAX code and with what happens in the page cache case. We aren't able to unmap the DAX exceptional entry because according to its comments invalidate_mapping_pages() isn't allowed to block, and unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem. Since we essentially never have unmapped DAX entries to evict from the radix tree, just remove dax_invalidate_mapping_entry().
Fixes: `c6dcf52c23` ("mm: Invalidate DAX radix tree entries only if appropriate") Link: http://lkml.kernel.org/r/20170510085419.27601-2-jack@suse.cz Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Reported-by: Jan Kara <jack@suse.cz> Cc: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Dan Williams	55dbfd7dcd	device-dax: fix sysfs attribute deadlock commit `565851c972` upstream. Usage of device_lock() for dax_region attributes is unnecessary and deadlock prone. It's unnecessary because the order of registration / un-registration guarantees that drvdata is always valid. It's deadlock prone because it sets up this situation: ndctl D 0 2170 2082 0x00000000 Call Trace: __schedule+0x31f/0x980 schedule+0x3d/0x90 schedule_preempt_disabled+0x15/0x20 __mutex_lock+0x402/0x980 ? __mutex_lock+0x158/0x980 ? align_show+0x2b/0x80 [dax] ? kernfs_seq_start+0x2f/0x90 mutex_lock_nested+0x1b/0x20 align_show+0x2b/0x80 [dax] dev_attr_show+0x20/0x50 ndctl D 0 2186 2079 0x00000000 Call Trace: __schedule+0x31f/0x980 schedule+0x3d/0x90 __kernfs_remove+0x1f6/0x340 ? kernfs_remove_by_name_ns+0x45/0xa0 ? remove_wait_queue+0x70/0x70 kernfs_remove_by_name_ns+0x45/0xa0 remove_files.isra.1+0x35/0x70 sysfs_remove_group+0x44/0x90 sysfs_remove_groups+0x2e/0x50 dax_region_unregister+0x25/0x40 [dax] devm_action_release+0xf/0x20 release_nodes+0x16d/0x2b0 devres_release_all+0x3c/0x60 device_release_driver_internal+0x17d/0x220 device_release_driver+0x12/0x20 unbind_store+0x112/0x160 ndctl/2170 is trying to acquire the device_lock() to read an attribute, and ndctl/2186 is holding the device_lock() while trying to drain all active attribute readers. Thanks to Yi Zhang for the reproduction script. Fixes: `d7fe1a67f6` ("dax: add region 'id', 'size', and 'align' attributes") Reported-by: Yi Zhang <yizhan@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Dan Williams	8931477790	device-dax: fix cdev leak commit `ed01e50acd` upstream. If device_add() fails, cleanup the cdev. Otherwise, we leak a kobj_map() with a stale device number. As Jason points out, there is a small possibility that userspace has opened and mapped the device in the time between cdev_add() and the device_add() failure. We need a new kill_dax_dev() helper to invalidate any established mappings. Fixes: `ba09c01d2f` ("dax: convert to the cdev api") Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
NeilBrown	9bcb9cc96d	md/raid1: avoid reusing a resync bio after error handling. commit `0c9d5b127f` upstream. fix_sync_read_error() modifies a bio on a newly faulty device by setting bi_end_io to end_sync_write. This ensures that put_buf() will still call rdev_dec_pending() as required, but makes sure that subsequent code in fix_sync_read_error() doesn't try to read from the device. Unfortunately this interacts badly with sync_request_write() which assumes that any bio with bi_end_io set to non-NULL other than end_sync_read is safe to write to. As the device is now faulty it doesn't make sense to write. As the bio was recently used for a read, it is "dirty" and not suitable for immediate submission. In particular, ->bi_next might be non-NULL, which will cause generic_make_request() to complain. Break this interaction by refusing to write to devices which are marked as Faulty. Reported-and-tested-by: Michael Wang <yun.wang@profitbricks.com> Fixes: `2e52d449bc` ("md/raid1: add failfast handling for reads.") Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:48 +02:00
Jason A. Donenfeld	7d70808547	padata: free correct variable commit `07a77929ba` upstream. The author meant to free the variable that was just allocated, instead of the one that failed to be allocated, but made a simple typo. This patch rectifies that. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
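The class of typo described in the entry above, freeing the pointer that failed to allocate instead of the one that succeeded, can be illustrated with a small user-space sketch. `struct demo_pair` and `demo_pair_init()` are hypothetical and are not the padata code; the `fail_second` parameter exists only to simulate an allocation failure.

```c
#include <stdlib.h>

/* Hypothetical object that needs two separate allocations. */
struct demo_pair {
	int *first;
	int *second;
};

/* Returns 0 on success, -1 on failure. */
int demo_pair_init(struct demo_pair *p, int fail_second)
{
	p->first = malloc(sizeof(*p->first));
	if (!p->first)
		return -1;

	p->second = fail_second ? NULL : malloc(sizeof(*p->second));
	if (!p->second) {
		/*
		 * Free the allocation that succeeded. Calling
		 * free(p->second) here would be a no-op on NULL and
		 * would leak p->first.
		 */
		free(p->first);
		p->first = NULL;
		return -1;
	}
	return 0;
}
```

The wrong variant usually compiles and runs without complaint, because `free(NULL)` is valid, which is why this kind of leak tends to be found by review rather than by testing.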
Amir Goldstein	f33f17b28d	ovl: do not set overlay.opaque on non-dir create commit `4a99f3c83d` upstream. The optimization for opaque dir create was wrongly being applied also to non-dir create. Fixes: `97c684cc91` ("ovl: create directories inside merged parent opaque") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Björn Jacke	6a1baa3a16	CIFS: add misssing SFM mapping for doublequote commit `85435d7a15` upstream. SFM maps doublequote to 0xF020. Without this patch, creating files whose names contain a doublequote fails against Windows/Mac. Signed-off-by: Bjoern Jacke <bjacke@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
David Disseldorp	c4f35cc8dc	cifs: fix CIFS_IOC_GET_MNT_INFO oops commit `d8a6e505d6` upstream. An open directory may have a NULL private_data pointer prior to readdir. Fixes: `0de1f4c6f6` ("Add way to query server fs info for smb3") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Rabin Vincent	33fa011644	CIFS: fix oplock break deadlocks commit `3998e6b87d` upstream. When the final cifsFileInfo_put() is called from cifsiod and an oplock break work is queued, lockdep complains loudly: ============================================= [ INFO: possible recursive locking detected ] 4.11.0+ #21 Not tainted --------------------------------------------- kworker/0:2/78 is trying to acquire lock: ("cifsiod"){++++.+}, at: flush_work+0x215/0x350 but task is already holding lock: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock("cifsiod"); lock("cifsiod"); * DEADLOCK * May be due to missing lock nesting notation 2 locks held by kworker/0:2/78: #0: ("cifsiod"){++++.+}, at: process_one_work+0x255/0x8e0 #1: ((&wdata->work)){+.+...}, at: process_one_work+0x255/0x8e0 stack backtrace: CPU: 0 PID: 78 Comm: kworker/0:2 Not tainted 4.11.0+ #21 Workqueue: cifsiod cifs_writev_complete Call Trace: dump_stack+0x85/0xc2 __lock_acquire+0x17dd/0x2260 ? match_held_lock+0x20/0x2b0 ? trace_hardirqs_off_caller+0x86/0x130 ? mark_lock+0xa6/0x920 lock_acquire+0xcc/0x260 ? lock_acquire+0xcc/0x260 ? flush_work+0x215/0x350 flush_work+0x236/0x350 ? flush_work+0x215/0x350 ? destroy_worker+0x170/0x170 __cancel_work_timer+0x17d/0x210 ? ___preempt_schedule+0x16/0x18 cancel_work_sync+0x10/0x20 cifsFileInfo_put+0x338/0x7f0 cifs_writedata_release+0x2a/0x40 ? cifs_writedata_release+0x2a/0x40 cifs_writev_complete+0x29d/0x850 ? preempt_count_sub+0x18/0xd0 process_one_work+0x304/0x8e0 worker_thread+0x9b/0x6a0 kthread+0x1b2/0x200 ? process_one_work+0x8e0/0x8e0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x31/0x40 This is a real warning. Since the oplock is queued on the same workqueue this can deadlock if there is only one worker thread active for the workqueue (which will be the case during memory pressure when the rescuer thread is handling it). 
Furthermore, there is at least one other kind of hang possible due to the oplock break handling if there is only one worker. (This can be reproduced without introducing memory pressure by passing 1 for the max_active parameter of cifsiod.) cifs_oplock_break() can wait indefinitely in filemap_fdatawait() while the cifs_writev_complete() work is blocked: sysrq: SysRq : Show Blocked State task PC stack pid father kworker/0:1 D 0 16 2 0x00000000 Workqueue: cifsiod cifs_oplock_break Call Trace: __schedule+0x562/0xf40 ? mark_held_locks+0x4a/0xb0 schedule+0x57/0xe0 io_schedule+0x21/0x50 wait_on_page_bit+0x143/0x190 ? add_to_page_cache_lru+0x150/0x150 __filemap_fdatawait_range+0x134/0x190 ? do_writepages+0x51/0x70 filemap_fdatawait_range+0x14/0x30 filemap_fdatawait+0x3b/0x40 cifs_oplock_break+0x651/0x710 ? preempt_count_sub+0x18/0xd0 process_one_work+0x304/0x8e0 worker_thread+0x9b/0x6a0 kthread+0x1b2/0x200 ? process_one_work+0x8e0/0x8e0 ? kthread_create_on_node+0x40/0x40 ret_from_fork+0x31/0x40 dd D 0 683 171 0x00000000 Call Trace: __schedule+0x562/0xf40 ? mark_held_locks+0x29/0xb0 schedule+0x57/0xe0 io_schedule+0x21/0x50 wait_on_page_bit+0x143/0x190 ? add_to_page_cache_lru+0x150/0x150 __filemap_fdatawait_range+0x134/0x190 ?
do_writepages+0x51/0x70 filemap_fdatawait_range+0x14/0x30 filemap_fdatawait+0x3b/0x40 filemap_write_and_wait+0x4e/0x70 cifs_flush+0x6a/0xb0 filp_close+0x52/0xa0 __close_fd+0xdc/0x150 SyS_close+0x33/0x60 entry_SYSCALL_64_fastpath+0x1f/0xbe Showing all locks held in the system: 2 locks held by kworker/0:1/16: #0: ("cifsiod"){.+.+.+}, at: process_one_work+0x255/0x8e0 #1: ((&cfile->oplock_break)){+.+.+.}, at: process_one_work+0x255/0x8e0 Showing busy workqueues and worker pools: workqueue cifsiod: flags=0xc pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/1 in-flight: 16:cifs_oplock_break delayed: cifs_writev_complete, cifs_echo_request pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=0s workers=3 idle: 750 3 Fix these problems by creating a new workqueue (with a rescuer) for the oplock break work. Signed-off-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
David Disseldorp	076c404c12	cifs: fix CIFS_ENUMERATE_SNAPSHOTS oops commit `6026685de3` upstream. As with `618763958b`, an open directory may have a NULL private_data pointer prior to readdir. CIFS_ENUMERATE_SNAPSHOTS must check for this before dereference. Fixes: `834170c859` ("Enable previous version support") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
David Disseldorp	63d86cd970	cifs: fix leak in FSCTL_ENUM_SNAPS response handling commit `0e5c795592` upstream. The server may respond with success, and an output buffer less than sizeof(struct smb_snapshot_array) in length. Do not leak the output buffer in this case. Fixes: `834170c859` ("Enable previous version support") Signed-off-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Björn Jacke	160239dfbf	CIFS: fix mapping of SFM_SPACE and SFM_PERIOD commit `b704e70b7c` upstream. - trailing space maps to 0xF028 - trailing period maps to 0xF029 This fix corrects the mapping of file names which have a trailing character that would otherwise be illegal (period or space) but is allowed by POSIX. Signed-off-by: Bjoern Jacke <bjacke@samba.org> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
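The SFM remapping values quoted in this log (0xF028 for a trailing space and 0xF029 for a trailing period in the entry above, 0xF020 for a doublequote in the earlier SFM entry) can be collected into one small lookup. This is a sketch under assumptions: `demo_sfm_map()` and the `DEMO_SFM_*` macros are hypothetical names, only these three characters are handled, and the real CIFS conversion code covers more cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Code points taken from the commit messages in this log. */
#define DEMO_SFM_DOUBLEQUOTE	0xF020
#define DEMO_SFM_SPACE		0xF028
#define DEMO_SFM_PERIOD		0xF029

/*
 * Map one file-name character to the 16-bit value sent on the wire.
 * is_last says whether c is the final character of the name, because
 * space and period are only remapped in trailing position.
 */
uint16_t demo_sfm_map(char c, bool is_last)
{
	if (c == '"')
		return DEMO_SFM_DOUBLEQUOTE;
	if (is_last && c == ' ')
		return DEMO_SFM_SPACE;
	if (is_last && c == '.')
		return DEMO_SFM_PERIOD;
	return (uint16_t)(unsigned char)c;	/* everything else passes through */
}
```

For a name such as `report.` only the final period is remapped, while the period in `a.b` is left untouched.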
Steve French	6f95f1925a	SMB3: Work around mount failure when using SMB3 dialect to Macs commit `7db0a6efdc` upstream. Macs send the maximum buffer size in response on ioctl to validate negotiate security information, which causes us to fail the mount as the response buffer is larger than the expected response. Changed ioctl response processing to allow for padding of validate negotiate ioctl response and limit the maximum response size to maximum buffer size. Signed-off-by: Steve French <steve.french@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Steve French	49f70d15b5	Set unicode flag on cifs echo request to avoid Mac error commit `26c9cb668c` upstream. Mac requires the unicode flag to be set for cifs, even for the smb echo request (which doesn't have strings). Without this, Mac rejects the periodic echo requests (when mounting with cifs) that we use to check if the server is down. Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:47 +02:00
Sachin Prabhu	fdf150ee55	Do not return number of bytes written for ioctl CIFS_IOC_COPYCHUNK_FILE commit `7d0c234fd2` upstream. commit `620d8745b3` ("Introduce cifs_copy_file_range()") changes the behaviour of the cifs ioctl call CIFS_IOC_COPYCHUNK_FILE. In case of successful writes, it now returns the number of bytes written. This return value is treated as an error by the xfstest cifs/001. Depending on the errno set at that time, this may or may not result in the test failing. The patch fixes this by setting the return value to 0 in case of successful writes. Fixes: commit `620d8745b3` ("Introduce cifs_copy_file_range()") Reported-by: Eryu Guan <eguan@redhat.com> Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Sachin Prabhu	ecf9bb0fd7	Fix match_prepath() commit `cd8c42968e` upstream. Incorrect return value for shares not using the prefix path means that we will never match superblocks for these shares. Fixes: commit `c1d8b24d18` ("Compare prepaths when comparing superblocks") Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Vlastimil Babka	d322fd67ca	mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC commit `62be1511b1` upstream. Patch series "more robust PF_MEMALLOC handling" This series aims to unify the setting and clearing of PF_MEMALLOC, which prevents recursive reclaim. There are some places that clear the flag unconditionally from current->flags, which may result in clearing a pre-existing flag. This already resulted in a bug report that Patch 1 fixes (without the new helpers, to make backporting easier). Patch 2 introduces the new helpers, modelled after existing memalloc_noio_* and memalloc_nofs_* helpers, and converts mm core to use them. Patches 3 and 4 convert non-mm code. This patch (of 4): __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent deadlock during page migration by lock_page() (see the comment in __unmap_and_move()). Then it unconditionally clears the flag, which can clear a pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a problem until commit `a8161d1ed6` ("mm, page_alloc: restructure direct compaction handling in slowpath"), because direct compaction was called only after direct reclaim, which was skipped when PF_MEMALLOC flag was set. Even now it's only a theoretical issue, as the new callsite of __alloc_pages_direct_compact() is reached only for costly orders and when gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in gfp_flags or in_interrupt() is true. There is no such known context, but let's play it safe and make __alloc_pages_direct_compact() robust for cases where PF_MEMALLOC is already set.
Fixes: `a8161d1ed6` ("mm, page_alloc: restructure direct compaction handling in slowpath") Link: http://lkml.kernel.org/r/20170405074700.29871-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
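The flag-handling problem described in the entry above comes down to "set, work, clear" versus "save, set, work, restore". The sketch below models it on a plain integer: `DEMO_PF_MEMALLOC` is an arbitrary bit chosen for the demo, and `compact_buggy()` and `compact_fixed()` are hypothetical helpers, not the kernel functions.

```c
/* Arbitrary bit standing in for the real task flag. */
#define DEMO_PF_MEMALLOC 0x0800u

/* Unconditional clear: loses a flag that the caller had already set. */
unsigned int compact_buggy(unsigned int flags)
{
	flags |= DEMO_PF_MEMALLOC;
	/* ... work that must not recurse into reclaim ... */
	flags &= ~DEMO_PF_MEMALLOC;
	return flags;
}

/* Save and restore: the flag ends up exactly as the caller left it. */
unsigned int compact_fixed(unsigned int flags)
{
	unsigned int had = flags & DEMO_PF_MEMALLOC;

	flags |= DEMO_PF_MEMALLOC;
	/* ... work that must not recurse into reclaim ... */
	flags = (flags & ~DEMO_PF_MEMALLOC) | had;
	return flags;
}
```

A caller that enters with the bit set leaves `compact_buggy()` with it cleared, which in the kernel would re-enable reclaim in a context that had deliberately disabled it.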
Johannes Weiner	41a68ab851	mm: vmscan: fix IO/refault regression in cache workingset transition commit `2a2e48854d` upstream. Since commit `59dc76b0d4` ("mm: vmscan: reduce size of inactive file list") we noticed bigger IO spikes during changes in cache access patterns. The patch in question shrunk the inactive list size to leave more room for the current workingset in the presence of streaming IO. However, workingset transitions that previously happened on the inactive list are now pushed out of memory and incur more refaults to complete. This patch disables active list protection when refaults are being observed. This accelerates workingset transitions, and allows more of the new set to establish itself from memory, without eating into the ability to protect the established workingset during stable periods. The workloads that were measurably affected for us were hit pretty bad by it, with refault/majfault rates doubling and tripling during cache transitions, and the machines sustaining half-hour periods of 100% IO utilization, where they'd previously have sub-minute peaks at 60-90%. Stateful services that handle user data tend to be more conservative with kernel upgrades. As a result we hit most page cache issues with some delay, as was the case here. The severity seemed to warrant a stable tag. Fixes: `59dc76b0d4` ("mm: vmscan: reduce size of inactive file list") Link: http://lkml.kernel.org/r/20170404220052.27593-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Andrey Ryabinin	f7bccf3f16	fs/block_dev: always invalidate cleancache in invalidate_bdev() commit `a5f6a6a9c7` upstream. invalidate_bdev() calls cleancache_invalidate_inode() iff ->nrpages != 0 which doesn't make any sense. Make sure that invalidate_bdev() always calls cleancache_invalidate_inode() regardless of mapping->nrpages value. Fixes: `c515e1fd36` ("mm/fs: add hooks to support cleancache") Link: http://lkml.kernel.org/r/20170424164135.22350-3-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Andrey Ryabinin	601462864d	fs: fix data invalidation in the cleancache during direct IO commit `55635ba76e` upstream. Patch series "Properly invalidate data in the cleancache", v2. We've noticed that after direct IO write, buffered read sometimes gets stale data which is coming from the cleancache. The reason for this is that some direct write hooks call invalidate_inode_pages2[_range]() conditionally iff mapping->nrpages is not zero, so we may not invalidate data in the cleancache. Another odd thing is that we check only for ->nrpages and don't check for ->nrexceptional, but invalidate_inode_pages2[_range] also invalidates exceptional entries as well. So we invalidate exceptional entries only if ->nrpages != 0? This doesn't feel right. - Patch 1 fixes direct IO writes by removing ->nrpages check. - Patch 2 fixes similar case in invalidate_bdev(). Note: I only fixed conditional cleancache_invalidate_inode() here. Do we also need to add ->nrexceptional check into invalidate_bdev()? - Patches 3-4: some optimizations. This patch (of 4): Some direct IO write fs hooks call invalidate_inode_pages2[_range]() conditionally iff mapping->nrpages is not zero. This can't be right, because invalidate_inode_pages2[_range]() also invalidates data in the cleancache via cleancache_invalidate_inode() call. So if page cache is empty but there is some data in the cleancache, buffered read after direct IO write would get stale data from the cleancache. Also it doesn't feel right to check only for ->nrpages because invalidate_inode_pages2[_range] invalidates exceptional entries as well. Fix this by calling invalidate_inode_pages2[_range]() regardless of nrpages state. Note: nfs,cifs,9p don't need a similar fix because they never call cleancache_get_page() (neither directly nor via mpage_readpage[s]()), so they are not affected by this bug.
Fixes: `c515e1fd36` ("mm/fs: add hooks to support cleancache") Link: http://lkml.kernel.org/r/20170424164135.22350-2-aryabinin@virtuozzo.com Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alexey Kuznetsov <kuznet@virtuozzo.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nikolay Borisov <n.borisov.lkml@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Luis Henriques	a4c34c20fb	ceph: fix memory leak in __ceph_setxattr() commit `eeca958dce` upstream. The ceph_inode_xattr needs to be released when removing an xattr. Easily reproducible running the 'generic/020' test from xfstests or simply by doing: attr -s attr0 -V 0 /mnt/test && attr -r attr0 /mnt/test While there, also fix the error path. Here's the kmemleak splat: unreferenced object 0xffff88001f86fbc0 (size 64): comm "attr", pid 244, jiffies 4294904246 (age 98.464s) hex dump (first 32 bytes): 40 fa 86 1f 00 88 ff ff 80 32 38 1f 00 88 ff ff @........28..... 00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................ backtrace: [<ffffffff81560199>] kmemleak_alloc+0x49/0xa0 [<ffffffff810f3e5b>] kmem_cache_alloc+0x9b/0xf0 [<ffffffff812b157e>] __ceph_setxattr+0x17e/0x820 [<ffffffff812b1c57>] ceph_set_xattr_handler+0x37/0x40 [<ffffffff8111fb4b>] __vfs_removexattr+0x4b/0x60 [<ffffffff8111fd37>] vfs_removexattr+0x77/0xd0 [<ffffffff8111fdd1>] removexattr+0x41/0x60 [<ffffffff8111fe65>] path_removexattr+0x75/0xa0 [<ffffffff81120aeb>] SyS_lremovexattr+0xb/0x10 [<ffffffff81564b20>] entry_SYSCALL_64_fastpath+0x13/0x94 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Michal Hocko	2cfc744069	fs/xattr.c: zero out memory copied to userspace in getxattr commit `81be3dee96` upstream. getxattr uses vmalloc to allocate memory if kzalloc fails. This is filled by vfs_getxattr and then copied to userspace. vmalloc, however, doesn't zero out the memory so if the specific implementation of the xattr handler is sloppy we can theoretically expose kernel memory. There is no real sign this is really the case but let's make sure this will not happen and use vzalloc instead. Fixes: `779302e678` ("fs/xattr.c:getxattr(): improve handling of allocation failures") Link: http://lkml.kernel.org/r/20170306103327.2766-1-mhocko@kernel.org Acked-by: Kees Cook <keescook@chromium.org> Reported-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Martin Brandenburg	8bd8322308	orangefs: do not check possibly stale size on truncate commit `53950ef541` upstream. Let the server figure this out because our size might be out of date or not present. The bug was that xfs_io -f -t -c "pread -v 0 100" /mnt/foo echo "Test" > /mnt/foo xfs_io -f -t -c "pread -v 0 100" /mnt/foo fails because the second truncate did not happen if nothing had requested the size after the write in echo. Thus i_size was zero (not present) and orangefs_setattr thought i_size was zero and there was nothing to do. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:46 +02:00
Martin Brandenburg	66f1885bea	orangefs: do not set getattr_time on orangefs_lookup commit `17930b252c` upstream. Since orangefs_lookup calls orangefs_iget which calls orangefs_inode_getattr, getattr_time will get set. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Martin Brandenburg	a6d3c33255	orangefs: clean up oversize xattr validation commit `e675c5ec51` upstream. Also don't check flags as this has been validated by the VFS already. Fix an off-by-one error in the max size checking. Stop logging just because userspace wants to write attributes which do not fit. This and the previous commit fix xfstests generic/020. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Martin Brandenburg	7adb15036a	orangefs: fix bounds check for listxattr commit `a956af337b` upstream. Signed-off-by: Martin Brandenburg <martin@omnibond.com> Signed-off-by: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Eric Biggers	1567c17063	ext4: evict inline data when writing to memory map commit `7b4cc9787f` upstream. Currently the case of writing via mmap to a file with inline data is not handled. This is maybe a rare case since it requires a writable memory map of a very small file, but it is trivial to trigger on an inline_data filesystem, and it causes the 'BUG_ON(ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA));' in ext4_writepages() to be hit: mkfs.ext4 -O inline_data /dev/vdb mount /dev/vdb /mnt xfs_io -f /mnt/file \ -c 'pwrite 0 1' \ -c 'mmap -w 0 1m' \ -c 'mwrite 0 1' \ -c 'fsync' kernel BUG at fs/ext4/inode.c:2723! invalid opcode: 0000 [#1] SMP CPU: 1 PID: 2532 Comm: xfs_io Not tainted 4.11.0-rc1-xfstests-00301-g071d9acf3d1f #633 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014 task: ffff88003d3a8040 task.stack: ffffc90000300000 RIP: 0010:ext4_writepages+0xc89/0xf8a RSP: 0018:ffffc90000303ca0 EFLAGS: 00010283 RAX: 0000028410000000 RBX: ffff8800383fa3b0 RCX: ffffffff812afcdc RDX: 00000a9d00000246 RSI: ffffffff81e660e0 RDI: 0000000000000246 RBP: ffffc90000303dc0 R08: 0000000000000002 R09: 869618e8f99b4fa5 R10: 00000000852287a2 R11: 00000000a03b49f4 R12: ffff88003808e698 R13: 0000000000000000 R14: 7fffffffffffffff R15: 7fffffffffffffff FS: 00007fd3e53094c0(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd3e4c51000 CR3: 000000003d554000 CR4: 00000000003406e0 Call Trace: ? _raw_spin_unlock+0x27/0x2a ? kvm_clock_read+0x1e/0x20 do_writepages+0x23/0x2c ? do_writepages+0x23/0x2c __filemap_fdatawrite_range+0x80/0x87 filemap_write_and_wait_range+0x67/0x8c ext4_sync_file+0x20e/0x472 vfs_fsync_range+0x8e/0x9f ? 
syscall_trace_enter+0x25b/0x2d0 vfs_fsync+0x1c/0x1e do_fsync+0x31/0x4a SyS_fsync+0x10/0x14 do_syscall_64+0x69/0x131 entry_SYSCALL64_slow_path+0x25/0x25 We could try to be smart and keep the inline data in this case, or at least support delayed allocation when allocating the block, but these solutions would be more complicated and don't seem worthwhile given how rare this case seems to be. So just fix the bug by calling ext4_convert_inline_data() when we're asked to make a page writable, so that any inline data gets evicted, with the block allocated immediately. Reported-by: Nick Alcock <nick.alcock@oracle.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Jan Kara	7621cb7966	jbd2: fix dbench4 performance regression for 'nobarrier' mounts commit `5052b069ac` upstream. Commit `b685d3d65a` "block: treat REQ_FUA and REQ_PREFLUSH as synchronous" removed REQ_SYNC flag from WRITE_FUA implementation. Since JBD2 strips REQ_FUA and REQ_FLUSH flags from submitted IO when the filesystem is mounted with nobarrier mount option, journal superblock writes ended up being async writes after this patch and that caused heavy performance regression for dbench4 benchmark with high number of processes. In my test setup with HP RAID array with non-volatile write cache and 32 GB ram, dbench4 runs with 8 processes regressed by ~25%. Fix the problem by making sure journal superblock writes are always treated as synchronous since they generally block progress of the journalling machinery and thus the whole filesystem. Fixes: `b685d3d65a` Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Christian Borntraeger	7a17cbb4b8	perf annotate s390: Implement jump types for perf annotate commit `d9f8dfa9ba` upstream. Implement simple detection for all kinds of jumps and branches. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-s390 <linux-s390@vger.kernel.org> Link: http://lkml.kernel.org/r/1491465112-45819-3-git-send-email-borntraeger@de.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Christian Borntraeger	b09eace76e	perf annotate s390: Fix perf annotate error -95 (4.10 regression) commit `e77852b32d` upstream. Since 4.10, perf annotate exits on s390 with an "unknown error -95". Turns out that commit `786c1b5184` ("perf annotate: Start supporting cross arch annotation") added a hard requirement for architecture support when objdump is used but only provided x86 and arm support. Meanwhile power was added, so let's add s390 as well. While at it, make sure to implement the branch and jump types. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Andreas Krebbel <krebbel@linux.vnet.ibm.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-s390 <linux-s390@vger.kernel.org> Fixes: `786c1b5184` "perf annotate: Start supporting cross arch annotation" Link: http://lkml.kernel.org/r/1491465112-45819-2-git-send-email-borntraeger@de.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Adrian Hunter	08b0b5bd54	perf auxtrace: Fix no_size logic in addr_filter__resolve_kernel_syms() commit `c3a0bbc7ad` upstream. Address filtering with kernel symbols incorrectly resulted in the error "Cannot determine size of symbol" because the no_size logic was the wrong way around. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/1490357752-27942-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Mike Marciniszyn	2d58452241	IB/hfi1: Prevent kernel QP post send hard lockups commit `b6eac931b9` upstream. The driver progress routines can call cond_resched() when a timeslice is exhausted and irqs are enabled. If the ULP had been holding a spin lock without disabling irqs and the post send directly called the progress routine, the cond_resched() could yield allowing another thread from the same ULP to deadlock on that same lock. Correct by replacing the current hfi1_do_send() calldown with a unique one for post send and adding an argument to hfi1_do_send() to indicate that the send engine is running in a thread. If the routine is not running in a thread, avoid calling cond_resched(). Fixes: Commit `831464ce4b` ("IB/hfi1: Don't call cond_resched in atomic mode when sending packets") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:45 +02:00
Jack Morgenstein	7a48c89a3c	IB/mlx4: Reduce SRIOV multicast cleanup warning message to debug level commit `fb7a91746a` upstream. A warning message during SRIOV multicast cleanup should have actually been a debug level message. The condition generating the warning does no harm and can fill the message log. In some cases, during testing, some tests were so intense as to swamp the message log with these warning messages, causing a stall in the console message log output task. This stall caused an NMI to be sent to all CPUs (so that they all dumped their stacks into the message log). Aside from the message flood causing an NMI, the tests all passed. Once the message flood which caused the NMI is removed (by reducing the warning message to debug level), the NMI no longer occurs. Sample message log (console log) output illustrating the flood and resultant NMI (snippets with comments and modified with ... instead of hex digits, to satisfy checkpatch.pl): <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!... * About 4000 almost identical lines in less than one second * <mlx4_ib> _mlx4_ib_mcg_port_cleanup: ... WARNING: group refcount 1!!!... INFO: rcu_sched detected stalls on CPUs/tasks: { 17} (...) * { 17} above indicates that CPU 17 was the one that stalled * sending NMI to all CPUs: ... NMI backtrace for cpu 17 CPU: 17 PID: 45909 Comm: kworker/17:2 Hardware name: HP ProLiant DL360p Gen8, BIOS P71 09/08/2013 Workqueue: events fb_flashcursor task: ffff880478...... ti: ffff88064e...... task.ti: ffff88064e...... RIP: 0010:[ffffffff81......] [ffffffff81......] io_serial_in+0x15/0x20 RSP: 0018:ffff88064e257cb0 EFLAGS: 00000002 RAX: 0000000000...... RBX: ffffffff81...... RCX: 0000000000...... RDX: 0000000000...... RSI: 0000000000...... RDI: ffffffff81...... RBP: ffff88064e...... R08: ffffffff81...... R09: 0000000000...... R10: 0000000000...... R11: ffff88064e...... R12: 0000000000...... R13: 0000000000...... R14: ffffffff81...... R15: 0000000000...... 
FS: 0000000000......(0000) GS:ffff8804af......(0000) knlGS:000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080...... CR2: 00007f2a2f...... CR3: 0000000001...... CR4: 0000000000...... DR0: 0000000000...... DR1: 0000000000...... DR2: 0000000000...... DR3: 0000000000...... DR6: 00000000ff...... DR7: 0000000000...... Stack: ffff88064e...... ffffffff81...... ffffffff81...... 0000000000...... ffffffff81...... ffff88064e...... ffffffff81...... ffffffff81...... ffffffff81...... ffff88064e...... ffffffff81...... 0000000000...... Call Trace: [<ffffffff813d099b>] wait_for_xmitr+0x3b/0xa0 [<ffffffff813d0b5c>] serial8250_console_putchar+0x1c/0x30 [<ffffffff813d0b40>] ? serial8250_console_write+0x140/0x140 [<ffffffff813cb5fa>] uart_console_write+0x3a/0x80 [<ffffffff813d0aae>] serial8250_console_write+0xae/0x140 [<ffffffff8107c4d1>] call_console_drivers.constprop.15+0x91/0xf0 [<ffffffff8107d6cf>] console_unlock+0x3bf/0x400 [<ffffffff813503cd>] fb_flashcursor+0x5d/0x140 [<ffffffff81355c30>] ? bit_clear+0x120/0x120 [<ffffffff8109d5fb>] process_one_work+0x17b/0x470 [<ffffffff8109e3cb>] worker_thread+0x11b/0x400 [<ffffffff8109e2b0>] ? rescuer_thread+0x400/0x400 [<ffffffff810a5aef>] kthread+0xcf/0xe0 [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81645858>] ret_from_fork+0x58/0x90 [<ffffffff810a5a20>] ? kthread_create_on_node+0x140/0x140 Code: 48 89 e5 d3 e6 48 63 f6 48 03 77 10 8b 06 5d c3 66 0f 1f 44 00 00 66 66 66 6 As indicated in the stack trace above, the console output task got swamped. Fixes: `b9c5d6a643` ("IB/mlx4: Add multicast group (MCG) paravirtualization for SR-IOV") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Jack Morgenstein	295a7aab35	IB/mlx4: Fix ib device initialization error flow commit `99e68909d5` upstream. In mlx4_ib_add, procedure mlx4_ib_alloc_eqs is called to allocate EQs. However, in the mlx4_ib_add error flow, procedure mlx4_ib_free_eqs is not called to free the allocated EQs. Fixes: `e605b743f3` ("IB/mlx4: Increase the number of vectors (EQs) available for ULPs") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Shamir Rabinovitch	f98af463f8	IB/IPoIB: ibX: failed to create mcg debug file commit `771a525840` upstream. When udev renames the netdev devices, ipoib debugfs entries do not get renamed. As a result, if a subsequent probe of an ipoib device reuses the name, then creating a debugfs entry for the new device would fail. Also, moved ipoib_create_debug_files and ipoib_delete_debug_files as part of ipoib event handling in order to avoid any race condition between these. Fixes: `1732b0ef3b` ([IPoIB] add path record information in debugfs) Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Michael J. Ruhl	88ded1f18b	IB/core: For multicast functions, verify that LIDs are multicast LIDs commit `8561eae60f` upstream. The Infiniband spec defines "A multicast address is defined by a MGID and a MLID" (section 10.5). Currently the MLID value is not validated. Add check to verify that the MLID value is in the correct address range. Fixes: `0c33aeedb2` ("[IB] Add checks to multicast attach and detach") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Parav Pandit	82f0feccee	IB/core: Fix kernel crash during fail to initialize device commit `4be3a4fa51` upstream. This patch fixes the kernel crash that occurs during ib_dealloc_device() called when a provider driver fails with an error after ib_alloc_device() and before it can register using ib_register_device(). This crash was seen in the lab, as below, and can occur with any IB device which fails to perform its device initialization before invoking ib_register_device(). This patch avoids touching cache and port immutable structures if device is not yet initialized. It also releases related memory when cache and port immutable data structure initialization fails during register_device() state. [81416.561946] BUG: unable to handle kernel NULL pointer dereference at (null) [81416.570340] IP: ib_cache_release_one+0x29/0x80 [ib_core] [81416.576222] PGD 78da66067 [81416.576223] PUD 7f2d7c067 [81416.579484] PMD 0 [81416.582720] [81416.587242] Oops: 0000 [#1] SMP [81416.722395] task: ffff8807887515c0 task.stack: ffffc900062c0000 [81416.729148] RIP: 0010:ib_cache_release_one+0x29/0x80 [ib_core] [81416.735793] RSP: 0018:ffffc900062c3a90 EFLAGS: 00010202 [81416.741823] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 [81416.749785] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff880859fec000 [81416.757757] RBP: ffffc900062c3aa0 R08: ffff8808536e5ac0 R09: ffff880859fec5b0 [81416.765708] R10: 00000000536e5c01 R11: ffff8808536e5ac0 R12: ffff880859fec000 [81416.773672] R13: 0000000000000000 R14: ffff8808536e5ac0 R15: ffff88084ebc0060 [81416.781621] FS: 00007fd879fab740(0000) GS:ffff88085fac0000(0000) knlGS:0000000000000000 [81416.790522] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [81416.797094] CR2: 0000000000000000 CR3: 00000007eb215000 CR4: 00000000003406e0 [81416.805051] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [81416.812997] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [81416.820950] Call Trace: 
[81416.824226] ib_device_release+0x1e/0x40 [ib_core] [81416.829858] device_release+0x32/0xa0 [81416.834370] kobject_cleanup+0x63/0x170 [81416.839058] kobject_put+0x25/0x50 [81416.843319] ib_dealloc_device+0x25/0x40 [ib_core] [81416.848986] mlx5_ib_add+0x163/0x1990 [mlx5_ib] [81416.854414] mlx5_add_device+0x5a/0x160 [mlx5_core] [81416.860191] mlx5_register_interface+0x8d/0xc0 [mlx5_core] [81416.866587] ? 0xffffffffa09e9000 [81416.870816] mlx5_ib_init+0x15/0x17 [mlx5_ib] [81416.876094] do_one_initcall+0x51/0x1b0 [81416.880861] ? __vunmap+0x85/0xd0 [81416.885113] ? kmem_cache_alloc_trace+0x14b/0x1b0 [81416.890768] ? vfree+0x2e/0x70 [81416.894762] do_init_module+0x60/0x1fa [81416.899441] load_module+0x15f6/0x1af0 [81416.904114] ? __symbol_put+0x60/0x60 [81416.908709] ? ima_post_read_file+0x3d/0x80 [81416.913828] ? security_kernel_post_read_file+0x6b/0x80 [81416.920006] SYSC_finit_module+0xa6/0xf0 [81416.924888] SyS_finit_module+0xe/0x10 [81416.929568] entry_SYSCALL_64_fastpath+0x1a/0xa9 [81416.935089] RIP: 0033:0x7fd879494949 [81416.939543] RSP: 002b:00007ffdbc1b4e58 EFLAGS: 00000202 ORIG_RAX: 0000000000000139 [81416.947982] RAX: ffffffffffffffda RBX: 0000000001b66f00 RCX: 00007fd879494949 [81416.955965] RDX: 0000000000000000 RSI: 000000000041a13c RDI: 0000000000000003 [81416.963926] RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000001b652a0 [81416.971861] R10: 0000000000000003 R11: 0000000000000202 R12: 00007ffdbc1b3e70 [81416.979763] R13: 00007ffdbc1b3e50 R14: 0000000000000005 R15: 0000000000000000 [81417.008005] RIP: ib_cache_release_one+0x29/0x80 [ib_core] RSP: ffffc900062c3a90 [81417.016045] CR2: 0000000000000000 Fixes: `55aeed0654` ("IB/core: Make ib_alloc_device init the kobject") Fixes: `7738613e7c` ("IB/core: Add per port immutable struct to ib_device") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> 
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Jack Morgenstein	5c7f1dfa7b	IB/core: Fix sysfs registration error flow commit `b312be3d87` upstream. The kernel commit cited below restructured ib device management so that the device kobject is initialized in ib_alloc_device. As part of the restructuring, the kobject is now initialized in procedure ib_alloc_device, and is later added to the device hierarchy in the ib_register_device call stack, in procedure ib_device_register_sysfs (which calls device_add). However, in the ib_device_register_sysfs error flow, if an error occurs following the call to device_add, the cleanup procedure device_unregister is called. This call results in the device object being deleted -- which results in various use-after-free crashes. The correct cleanup call is device_del -- which undoes device_add without deleting the device object. The device object will then (correctly) be deleted in the ib_register_device caller's error cleanup flow, when the caller invokes ib_dealloc_device. Fixes: `55aeed0654` ("IB/core: Make ib_alloc_device init the kobject") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Ding Tianhong	09944c2660	iov_iter: don't revert iov buffer if csum error commit `a6a5993243` upstream. The patch `3278682123` (make skb_copy_datagram_msg() et.al. preserve ->msg_iter on error) reverts the iov buffer if the copy to the iter failed, but no datagram is copied when skb_checksum_complete reports an error, so there is no need to revert any data at this point. v2: Sabrina noticed that returning -EFAULT on a checksum error is not correct here, as it would confuse the caller about the return value, so fix it. Fixes: `3278682123` ("make skb_copy_datagram_msg() et.al. preserve->msg_iter on error") Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Alex Williamson	1e0b0a9ef0	vfio/type1: Remove locked page accounting workqueue commit `0cfef2b741` upstream. If the mmap_sem is contended then the vfio type1 IOMMU backend will defer locked page accounting updates to a workqueue task. This has a few problems and depending on which side the user tries to play, they might be over-penalized for unmaps that haven't yet been accounted or race the workqueue to enter more mappings than they're allowed. The original intent of this workqueue mechanism seems to be focused on reducing latency through the ioctl, but we cannot do so at the cost of correctness. Remove this workqueue mechanism and update the callers to allow for failure. We can also now recheck the limit under write lock to make sure we don't exceed it. vfio_pin_pages_remote() also now necessarily includes an unwind path which we can jump to directly if the consecutive page pinning finds that we're exceeding the user's memory limits. This avoids the current lazy approach which does accounting and mapping up to the fault, only to return an error on the next iteration to unwind the entire vfio_dma. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Dennis Yang	34224e0e1c	dm thin: fix a memory leak when passing discard bio down commit `948f581a53` upstream. dm-thin does not free the discard_parent bio after all chained sub bios finished. The following kmemleak report could be observed after a pool with the discard_passdown option processes discard bios in linux v4.11-rc7. To fix this, we drop the discard_parent bio reference when its endio (passdown_endio) is called. unreferenced object 0xffff8803d6b29700 (size 256): comm "kworker/u8:0", pid 30349, jiffies 4379504020 (age 143002.776s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 01 00 00 00 00 00 00 f0 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81a5efd9>] kmemleak_alloc+0x49/0xa0 [<ffffffff8114ec34>] kmem_cache_alloc+0xb4/0x100 [<ffffffff8110eec0>] mempool_alloc_slab+0x10/0x20 [<ffffffff8110efa5>] mempool_alloc+0x55/0x150 [<ffffffff81374939>] bio_alloc_bioset+0xb9/0x260 [<ffffffffa018fd20>] process_prepared_discard_passdown_pt1+0x40/0x1c0 [dm_thin_pool] [<ffffffffa018b409>] break_up_discard_bio+0x1a9/0x200 [dm_thin_pool] [<ffffffffa018b484>] process_discard_cell_passdown+0x24/0x40 [dm_thin_pool] [<ffffffffa018b24d>] process_discard_bio+0xdd/0xf0 [dm_thin_pool] [<ffffffffa018ecf6>] do_worker+0xa76/0xd50 [dm_thin_pool] [<ffffffff81086239>] process_one_work+0x139/0x370 [<ffffffff810867b1>] worker_thread+0x61/0x450 [<ffffffff8108b316>] kthread+0xd6/0xf0 [<ffffffff81a6cd1f>] ret_from_fork+0x3f/0x70 [<ffffffffffffffff>] 0xffffffffffffffff Signed-off-by: Dennis Yang <dennisyang@qnap.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:44 +02:00
Bart Van Assche	899750e7d9	dm rq: check blk_mq_register_dev() return value in dm_mq_init_request_queue() commit `23a6012489` upstream. Otherwise the request-based DM blk-mq request_queue will be put into service without being properly exported via sysfs. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Somasundaram Krishnasamy	ee68aa113f	dm era: save spacemap metadata root after the pre-commit commit `117aceb030` upstream. When committing era metadata to disk, it doesn't always save the latest spacemap metadata root in superblock. Due to this, metadata is getting corrupted sometimes when reopening the device. The correct order of update should be, pre-commit (shadows spacemap root), save the spacemap root (newly shadowed block) to in-core superblock and then the final commit. Signed-off-by: Somasundaram Krishnasamy <somasundaram.krishnasamy@oracle.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Ondrej Kozina	f38faf569e	dm crypt: rewrite (wipe) key in crypto layer using random data commit `c82feeec9a` upstream. The message "key wipe" used to wipe real key stored in crypto layer by rewriting it with zeroes. Since commit `28856a9` ("crypto: xts - consolidate sanity check for keys") this no longer works in FIPS mode for XTS. While running in FIPS mode the crypto key part has to differ from the tweak key. Fixes: `28856a9` ("crypto: xts - consolidate sanity check for keys") Signed-off-by: Ondrej Kozina <okozina@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	6ab60d2b26	crypto: ccp - Change ISR handler method for a v5 CCP commit `6263b51eb3` upstream. The CCP has the ability to perform several operations simultaneously, but only one interrupt. When implemented as a PCI device and using MSI-X/MSI interrupts, use a tasklet model to service interrupts. By disabling and enabling interrupts from the CCP, coupled with the queuing that tasklets provide, we can ensure that all events (occurring on the device) are recognized and serviced. This change fixes a problem wherein 2 or more busy queues can cause notification bits to change state while a (CCP) interrupt is being serviced, but after the queue state has been evaluated. This results in the event being 'lost' and the queue hanging, waiting to be serviced. Since the status bits are never fully de-asserted, the CCP never generates another interrupt (all bits zero -> one or more bits one), and no further CCP operations will be executed. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	747cbf4753	crypto: ccp - Change ISR handler method for a v3 CCP commit `7b537b24e7` upstream. The CCP has the ability to perform several operations simultaneously, but only one interrupt. When implemented as a PCI device and using MSI-X/MSI interrupts, use a tasklet model to service interrupts. By disabling and enabling interrupts from the CCP, coupled with the queuing that tasklets provide, we can ensure that all events (occurring on the device) are recognized and serviced. This change fixes a problem wherein 2 or more busy queues can cause notification bits to change state while a (CCP) interrupt is being serviced, but after the queue state has been evaluated. This results in the event being 'lost' and the queue hanging, waiting to be serviced. Since the status bits are never fully de-asserted, the CCP never generates another interrupt (all bits zero -> one or more bits one), and no further CCP operations will be executed. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	511b79fb94	crypto: ccp - Disable interrupts early on unload commit `116591fe3e` upstream. Ensure that we disable interrupts first when shutting down the driver. Signed-off-by: Gary R Hook <ghook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Gary R Hook	c133daf731	crypto: ccp - Use only the relevant interrupt bits commit `56467cb11c` upstream. Each CCP queue can produce interrupts for 4 conditions: operation complete, queue empty, error, and queue stopped. This driver only works with completion and error events. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Stephan Mueller	b576fed7c8	crypto: algif_aead - Require setkey before accept(2) commit `2a2a251f11` upstream. Some cipher implementations will crash if you try to use them without calling setkey first. This patch adds a check so that the accept(2) call will fail with -ENOKEY if setkey hasn't been done on the socket yet. Fixes: `400c40cf78` ("crypto: algif - add AEAD support") Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Krzysztof Kozlowski	72a03cf8e3	crypto: s5p-sss - Close possible race for completed requests commit `42d5c176b7` upstream. Driver is capable of handling only one request at a time and it stores it in its state container struct s5p_aes_dev. This stored request must be protected between concurrent invocations (e.g. completing current request and scheduling new one). Combination of lock and "busy" field is used for that purpose. When "busy" field is true, the driver will not accept new request thus it will not overwrite currently handled data. However commit `28b62b1458` ("crypto: s5p-sss - Fix spinlock recursion on LRW(AES)") moved some of the write to "busy" field out of a lock protected critical section. This might lead to potential race between completing current request and scheduling a new one. Effectively the request completion might try to operate on new crypto request. Fixes: `28b62b1458` ("crypto: s5p-sss - Fix spinlock recursion on LRW(AES)") Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Mike Snitzer	d9ae27b661	block: fix blk_integrity_register to use template's interval_exp if not 0 commit `2859323e35` upstream. When registering an integrity profile: if the template's interval_exp is not 0 use it, otherwise use the ilog2() of logical block size of the provided gendisk. This fixes a long-standing DM linear target bug where it cannot pass integrity data to the underlying device if its logical block size conflicts with the underlying device's logical block size. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:43 +02:00
Marc Zyngier	d978303275	arm64: KVM: Fix decoding of Rt/Rt2 when trapping AArch32 CP accesses commit `c667186f1c` upstream. Our 32bit CP14/15 handling inherited some of the ARMv7 code for handling the trapped system registers, completely missing the fact that the fields for Rt and Rt2 are now 5 bit wide, and not 4... Let's fix it, and provide an accessor for the most common Rt case. Reviewed-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Andrew Jones	95121cc98c	KVM: arm/arm64: fix races in kvm_psci_vcpu_on commit `6c7a5dce22` upstream. Fix potential races in kvm_psci_vcpu_on() by taking the kvm->lock mutex. In general, it's a bad idea to allow more than one PSCI_CPU_ON to process the same target VCPU at the same time. One such problem that may arise is that one PSCI_CPU_ON could be resetting the target vcpu, which fills the entire sys_regs array with a temporary value including the MPIDR register, while another looks up the VCPU based on the MPIDR value, resulting in no target VCPU found. Resolves both races found with the kvm-unit-tests/arm/psci unit test. Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Christoffer Dall <cdall@linaro.org> Reported-by: Levente Kurusa <lkurusa@redhat.com> Suggested-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Paolo Bonzini	d1daa545bb	Revert "KVM: Support vCPU-based gfn->hva cache" commit `4e335d9e7d` upstream. This reverts commit `bbd6411513`. I've been sitting on this revert for too long and it unfortunately missed 4.11. It's also the reason why I haven't merged ring-based dirty tracking for 4.12. Using kvm_vcpu_memslots in kvm_gfn_to_hva_cache_init and kvm_vcpu_write_guest_offset_cached means that the MSR value can now be used to access SMRAM, simply by making it point to an SMRAM physical address. This is problematic because it lets the guest OS overwrite memory that it shouldn't be able to touch. Fixes: `bbd6411513` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
David Hildenbrand	2d271c9508	KVM: x86: fix user triggerable warning in kvm_apic_accept_events() commit `28bf288879` upstream. If we already entered/are about to enter SMM, don't allow switching to INIT/SIPI_RECEIVED, otherwise the next call to kvm_apic_accept_events() will report a warning. Same applies if we are already in MP state INIT_RECEIVED and SMM is requested to be turned on. Refuse to set the VCPU events in this case. Fixes: `cd7764fe9f` ("KVM: x86: latch INITs while in system management mode") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Vince Weaver	d85d4c871a	perf/x86: Fix Broadwell-EP DRAM RAPL events commit `33b88e708e` upstream. It appears as though the Broadwell-EP DRAM units share the special units quirk with Haswell-EP/KNL. Without this patch, you get really high results (a single DRAM using 20W of power). The powercap driver in drivers/powercap/intel_rapl.c already has this change. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Richard Weinberger	9e74819678	um: Fix PTRACE_POKEUSER on x86_64 commit `9abc74a22d` upstream. This has always been broken, but sadly nobody noticed. Recent versions of GDB set DR_CONTROL unconditionally and UML dies due to a heap corruption. It turns out that the PTRACE_POKEUSER code was copied and pasted from i386 and assumes that addresses are 4 bytes long. Fix that by using 8 as the address size in the calculation. Reported-by: jie cao <cj3054@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Ben Hutchings	f25c69bd5e	x86, pmem: Fix cache flushing for iovec write < 8 bytes commit `8376efd31d` upstream. Commit `11e63f6d92` added cache flushing for unaligned writes from an iovec, covering the first and last cache line of a >= 8 byte write and the first cache line of a < 8 byte write. But an unaligned write of 2-7 bytes can still cover two cache lines, so make sure we flush both in that case. Fixes: `11e63f6d92` ("x86, pmem: fix broken __copy_user_nocache ...") Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Andy Lutomirski	20fd61dbb1	selftests/x86/ldt_gdt_32: Work around a glibc sigaction() bug commit `65973dd3fd` upstream. i386 glibc is buggy and calls the sigaction syscall incorrectly. This is asymptomatic for normal programs, but it blows up on programs that do evil things with segmentation. The ldt_gdt self-test is an example of such an evil program. This doesn't appear to be a regression -- I think I just got lucky with the uninitialized memory that glibc threw at the kernel when I wrote the test. This hackish fix manually issues sigaction(2) syscalls to undo the damage. Without the fix, ldt_gdt_32 segfaults; with the fix, it passes for me. See: https://sourceware.org/bugzilla/show_bug.cgi?id=21269 Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Garnier <thgarnie@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aaab0f9f93c9af25396f01232608c163a760a668.1490218061.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Ashish Kalra	0dc0e26fee	x86/boot: Fix BSS corruption/overwrite bug in early x86 kernel startup commit `d594aa0277` upstream. The minimum size for a new stack (512 bytes) setup for arch/x86/boot components when the bootloader does not setup/provide a stack for the early boot components is not "enough". The setup code executing as part of early kernel startup code, uses the stack beyond 512 bytes and accidentally overwrites and corrupts part of the BSS section. This is exposed mostly in the early video setup code, where it was corrupting BSS variables like force_x, force_y, which in-turn affected kernel parameters such as screen_info (screen_info.orig_video_cols) and later caused an exception/panic in console_init(). Most recent boot loaders setup the stack for early boot components, so this stack overwriting into BSS section issue has not been exposed. Signed-off-by: Ashish Kalra <ashish@bluestacks.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170419152015.10011-1-ashishkalra@Ashishs-MacBook-Pro.local Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:42 +02:00
Guenter Roeck	12001f3a45	usb: hub: Do not attempt to autosuspend disconnected devices commit `f5cccf4942` upstream. While running a bind/unbind stress test with the dwc3 usb driver on rk3399, the following crash was observed. Unable to handle kernel NULL pointer dereference at virtual address 00000218 pgd = ffffffc00165f000 [00000218] pgd=000000000174f003, pud=000000000174f003, pmd=0000000001750003, pte=00e8000001751713 Internal error: Oops: 96000005 [#1] PREEMPT SMP Modules linked in: uinput uvcvideo videobuf2_vmalloc cmac ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat rfcomm xt_mark fuse bridge stp llc zram btusb btrtl btbcm btintel bluetooth ip6table_filter mwifiex_pcie mwifiex cfg80211 cdc_ether usbnet r8152 mii joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device ppp_async ppp_generic slhc tun CPU: 1 PID: 29814 Comm: kworker/1:1 Not tainted 4.4.52 #507 Hardware name: Google Kevin (DT) Workqueue: pm pm_runtime_work task: ffffffc0ac540000 ti: ffffffc0af4d4000 task.ti: ffffffc0af4d4000 PC is at autosuspend_check+0x74/0x174 LR is at autosuspend_check+0x70/0x174 ... Call trace: [<ffffffc00080dcc0>] autosuspend_check+0x74/0x174 [<ffffffc000810500>] usb_runtime_idle+0x20/0x40 [<ffffffc000785ae0>] __rpm_callback+0x48/0x7c [<ffffffc000786af0>] rpm_idle+0x1e8/0x498 [<ffffffc000787cdc>] pm_runtime_work+0x88/0xcc [<ffffffc000249bb8>] process_one_work+0x390/0x6b8 [<ffffffc00024abcc>] worker_thread+0x480/0x610 [<ffffffc000251a80>] kthread+0x164/0x178 [<ffffffc0002045d0>] ret_from_fork+0x10/0x40 Source: (gdb) l *0xffffffc00080dcc0 0xffffffc00080dcc0 is in autosuspend_check (drivers/usb/core/driver.c:1778). 1773 /* We don't need to check interfaces that are 1774 * disabled for runtime PM. Either they are unbound 1775 * or else their drivers don't support autosuspend 1776 * and so they are permanently active.
1777 */ 1778 if (intf->dev.power.disable_depth) 1779 continue; 1780 if (atomic_read(&intf->dev.power.usage_count) > 0) 1781 return -EBUSY; 1782 w |= intf->needs_remote_wakeup; Code analysis shows that intf is set to NULL in usb_disable_device() prior to setting actconfig to NULL. At the same time, usb_runtime_idle() does not lock the usb device, and neither does any of the functions in the traceback. This means that there is no protection against a race condition where usb_disable_device() is removing dev->actconfig->interface[] pointers while those are being accessed from autosuspend_check(). To solve the problem, synchronize and validate device state between autosuspend_check() and usb_disconnect(). Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Guenter Roeck	90d4e2a659	usb: hub: Fix error loop seen after hub communication errors commit `245b2eecee` upstream. While stress testing a usb controller using a bind/unbind loop, the following error loop was observed. usb 7-1.2: new low-speed USB device number 3 using xhci-hcd usb 7-1.2: hub failed to enable device, error -108 usb 7-1-port2: cannot disable (err = -22) usb 7-1-port2: couldn't allocate usb_device usb 7-1-port2: cannot disable (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: activate --> -22 hub 7-1:1.0: hub_ext_port_status failed (err = -22) hub 7-1:1.0: hub_ext_port_status failed (err = -22) 57 printk messages dropped hub 7-1:1.0: activate --> -22 82 printk messages dropped hub 7-1:1.0: hub_ext_port_status failed (err = -22) This continues forever. After adding tracebacks into the code, the call sequence leading to this is found to be as follows.
[<ffffffc0007fc8e0>] hub_activate+0x368/0x7b8 [<ffffffc0007fceb4>] hub_resume+0x2c/0x3c [<ffffffc00080b3b8>] usb_resume_interface.isra.6+0x128/0x158 [<ffffffc00080b5d0>] usb_suspend_both+0x1e8/0x288 [<ffffffc00080c9c4>] usb_runtime_suspend+0x3c/0x98 [<ffffffc0007820a0>] __rpm_callback+0x48/0x7c [<ffffffc00078217c>] rpm_callback+0xa8/0xd4 [<ffffffc000786234>] rpm_suspend+0x84/0x758 [<ffffffc000786ca4>] rpm_idle+0x2c8/0x498 [<ffffffc000786ed4>] __pm_runtime_idle+0x60/0xac [<ffffffc00080eba8>] usb_autopm_put_interface+0x6c/0x7c [<ffffffc000803798>] hub_event+0x10ac/0x12ac [<ffffffc000249bb8>] process_one_work+0x390/0x6b8 [<ffffffc00024abcc>] worker_thread+0x480/0x610 [<ffffffc000251a80>] kthread+0x164/0x178 [<ffffffc0002045d0>] ret_from_fork+0x10/0x40 kick_hub_wq() is called from hub_activate() even after failures to communicate with the hub. This results in an endless sequence of hub event -> hub activate -> wq trigger -> hub event -> ... Provide two solutions for the problem. - Only trigger the hub event queue if communication with the hub is successful. - After a suspend failure, only resume already suspended interfaces if the communication with the device is still possible. Each of the changes fixes the observed problem. Use both to improve robustness. Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Alexey Brodkin	8e3b0f66f4	usb: Make sure usb/phy/of gets built-in commit `3d6159640d` upstream. DWC3 driver uses of_usb_get_phy_mode() which is implemented in drivers/usb/phy/of.c and in bare minimal configuration it might not be pulled in kernel binary. In case of ARC or ARM this could be easily reproduced with "allnodefconfig" +CONFIG_USB=m +CONFIG_USB_DWC3=m. On building all ends-up with: ---------------------->8------------------ Kernel: arch/arm/boot/Image is ready Kernel: arch/arm/boot/zImage is ready Building modules, stage 2. MODPOST 5 modules ERROR: "of_usb_get_phy_mode" [drivers/usb/dwc3/dwc3.ko] undefined! make[1]: *** [__modpost] Error 1 make: *** [modules] Error 2 ---------------------->8------------------ Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Nicolas Pitre <nicolas.pitre@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Felipe Balbi <balbi@kernel.org> Cc: Felix Fietkau <nbd@nbd.name> Cc: Jeremy Kerr <jk@ozlabs.org> Cc: linux-snps-arc@lists.infradead.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Romain Izard	6a9fc06acf	usb: gadget: legacy gadgets are optional commit `6e253d0fbc` upstream. With commit `bc49d1d17d` ("usb: gadget: don't couple configfs to legacy gadgets"), it is possible to build a modular kernel with both built-in configfs support and modular legacy gadget drivers. But when building a kernel without modules, it is also necessary to be able to build with configfs but without any legacy gadget driver. This was a possible configuration when USB_CONFIGFS was part of the choice options, but not anymore. Marking the choice for legacy gadget drivers as optional restores this. Fixes: `bc49d1d17d` ("usb: gadget: don't couple configfs to legacy gadgets") Signed-off-by: Romain Izard <romain.izard.pro@gmail.com> Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Gustavo A. R. Silva	84759debb0	usb: misc: add missing continue in switch commit `2c930e3d0a` upstream. Add missing continue in switch. Addresses-Coverity-ID: 1248733 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Ian Abbott	69f8945b4f	staging: comedi: jr3_pci: cope with jiffies wraparound commit `8ec04a4918` upstream. The timer expiry routine `jr3_pci_poll_dev()` checks for expiry by checking whether the absolute value of `jiffies` (stored in local variable `now`) is greater than the expected expiry time in jiffy units. This will fail when `jiffies` wraps around. Also, it seems to make sense to handle the expiry one jiffy earlier than the current test. Use `time_after_eq()` to check for expiry. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Ian Abbott	6e4c973ac6	staging: comedi: jr3_pci: fix possible null pointer dereference commit `45292be0b3` upstream. For some reason, the driver does not consider allocation of the subdevice private data to be a fatal error when attaching the COMEDI device. It tests the subdevice private data pointer for validity at certain points, but omits some crucial tests. In particular, `jr3_pci_auto_attach()` calls `jr3_pci_alloc_spriv()` to allocate and initialize the subdevice private data, but the same function subsequently dereferences the pointer to access the `next_time_min` and `next_time_max` members without checking it first. The other missing test is in the timer expiry routine `jr3_pci_poll_dev()`, but it will crash before it gets that far. Fix the bug by returning `-ENOMEM` from `jr3_pci_auto_attach()` as soon as one of the calls to `jr3_pci_alloc_spriv()` returns `NULL`. The COMEDI core will subsequently call `jr3_pci_detach()` to clean up. Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Sean Young	9918523578	staging: sir: fill in missing fields and fix probe commit `cf9ed9aa5b` upstream. Some fields are left blank. Signed-off-by: Sean Young <sean@mess.org> Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Aditya Shankar	89cb8fccea	staging: wilc1000: Fix problem with wrong vif index commit `0e490657c7` upstream. The vif->idx value is always 0 for two interfaces. wl->vif_num = 0; loop { ... vif->idx = wl->vif_num; ... wl->vif_num = i; .... i++; ... } At present, vif->idx is assigned the value of wl->vif_num at the beginning of this block and device is initialized based on this index value. In the next iteration, wl->vif_num is still 0 as it is only updated later but gets assigned to vif->idx in the beginning. This causes problems later when we try to reference a particular interface and also while configuring the firmware. This patch moves the assignment to vif->idx from the beginning of the block to after wl->vif_num is updated with latest value of i. Fixes: commit `735bb39ca3` ("staging: wilc1000: simplify vif[i]->ndev accesses") Signed-off-by: Aditya Shankar <aditya.shankar@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:41 +02:00
Johan Hovold	22d8767df9	staging: gdm724x: gdm_mux: fix use-after-free on module unload commit `b58f45c8fc` upstream. Make sure to deregister the USB driver before releasing the tty driver to avoid use-after-free in the USB disconnect callback where the tty devices are deregistered. Fixes: `61e1210476` ("staging: gdm7240: adding LTE USB driver") Cc: Won Kang <wkang77@gmail.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Malcolm Priestley	af685eefa2	staging: vt6656: use off stack for out buffer USB transfers. commit `12ecd24ef9` upstream. Since 4.9 mandated USB buffers be heap allocated this causes the driver to fail. Since there is a wide range of buffer sizes use kmemdup to create allocated buffer. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Malcolm Priestley	5ba4fd334f	staging: vt6656: use off stack for in buffer USB transfers. commit `05c0cf88be` upstream. Since 4.9 mandated USB buffers to be heap allocated. This causes the driver to fail. Create buffer for USB transfers. Signed-off-by: Malcolm Priestley <tvboxspy@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Bjørn Mork	f49d36b7b3	USB: Revert "cdc-wdm: fix "out-of-sync" due to missing notifications" commit `1944581699` upstream. This reverts commit `833415a3e7` ("cdc-wdm: fix "out-of-sync" due to missing notifications") There have been several reports of wdm_read returning unexpected EIO errors with QMI devices using the qmi_wwan driver. The reporters confirm that reverting prevents these errors. I have been unable to reproduce the bug myself, and have no explanation to offer either. But reverting is the safe choice here, given that the commit was an attempt to work around a firmware problem. Living with a firmware problem is still better than adding driver bugs. Reported-by: Kasper Holtze <kasper@holtze.dk> Reported-by: Aleksander Morgado <aleksander@aleksander.es> Reported-by: Daniele Palmas <dnlplm@gmail.com> Fixes: `833415a3e7` ("cdc-wdm: fix "out-of-sync" due to missing notifications") Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Ajay Kaher	2c33341c96	USB: Proper handling of Race Condition when two USB class drivers try to call init_usb_class simultaneously commit `2f86a96be0` upstream. There is a race condition when two USB class drivers try to call init_usb_class at the same time, which leads to a crash. code path: probe->usb_register_dev->init_usb_class To solve this, mutex locking has been added in init_usb_class() and destroy_usb_class(). As pointed out by Alan, removed the "if (usb_class)" test from destroy_usb_class() because usb_class can never be NULL there. Signed-off-by: Ajay Kaher <ajay.kaher@samsung.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Marek Vasut	2436236ee3	USB: serial: ftdi_sio: add device ID for Microsemi/Arrow SF2PLUS Dev Kit commit `31c5d1922b` upstream. This development kit has an FT4232 on it with a custom USB VID/PID. The FT4232 provides four UARTs, but only two are used. The UART 0 is used by the FlashPro5 programmer and UART 2 is connected to the SmartFusion2 CortexM3 SoC UART port. Note that the USB VID is registered to Actel according to Linux USB VID database, but that was acquired by Microsemi. Signed-off-by: Marek Vasut <marex@denx.de> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Peter Chen	f34b103cd5	usb: host: xhci: print correct command ring address commit `6fc091fb04` upstream. Print correct command ring address using 'val_64'. Signed-off-by: Peter Chen <peter.chen@nxp.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Roger Quadros	c5379cf67f	usb: xhci: bInterval quirk for TI TUSB73x0 commit `69307ccb9a` upstream. As per [1] issue #4, "The periodic EP scheduler always tries to schedule the EPs that have large intervals (interval equal to or greater than 128 microframes) into different microframes. So it maintains an internal counter and increments for each large interval EP added. When the counter is greater than 128, the scheduler rejects the new EP. So when the hub re-enumerated 128 times, it triggers this condition." This results in Bandwidth error when devices with periodic endpoints (ISO/INT) having bInterval > 7 are plugged and unplugged several times on a TUSB73x0 XHCI host. Workaround this issue by limiting the bInterval to 7 (i.e. interval to 6) for High-speed or faster periodic endpoints. [1] - http://www.ti.com/lit/er/sllz076/sllz076.pdf Signed-off-by: Roger Quadros <rogerq@ti.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Nicholas Bellinger	f9a45058a3	iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatement commit `197b806ae5` upstream. While testing modification of per se_node_acl queue_depth forcing session reinstatement via lio_target_nacl_cmdsn_depth_store() -> core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered when changing cmdsn_depth invoked session reinstatement while an iscsi login was already waiting for session reinstatement to complete. This can happen when an outstanding se_cmd descriptor is taking a long time to complete, and session reinstatement from iscsi login or cmdsn_depth change occurs concurrently. To address this bug, explicitly set session_fall_back_to_erl0 = 1 when forcing session reinstatement, so session reinstatement is not attempted if an active session is already being shutdown. This patch has been tested with two scenarios. The first when iscsi login is blocked waiting for iscsi session reinstatement to complete followed by queue_depth change via configfs, and second when queue_depth change via configfs is blocked followed by an iscsi login driven session reinstatement. Note this patch depends on commit `d36ad77f70` to handle multiple sessions per se_node_acl when changing cmdsn_depth, and for pre v4.5 kernels will need to be included for stable as well. Reported-by: Gary Guo <ghg@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:40 +02:00
Bart Van Assche	14a890020f	target/fileio: Fix zero-length READ and WRITE handling commit `59ac9c0781` upstream. This patch fixes zero-length READ and WRITE handling in target/FILEIO, which was broken a long time back by: Since: commit `d81cb44726` Author: Paolo Bonzini <pbonzini@redhat.com> Date: Mon Sep 17 16:36:11 2012 -0700 target: go through normal processing for all zero-length commands which moved zero-length READ and WRITE completion out of target-core, to doing submission into backend driver code. To address this, go ahead and invoke target_complete_cmd() for any non negative return value in fd_do_rw(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Andy Grover <agrover@redhat.com> Cc: David Disseldorp <ddiss@suse.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:39 +02:00
Nicholas Bellinger	25ed85889d	target: Fix compare_and_write_callback handling for non GOOD status commit `a71a5dc7f8` upstream. Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE status during COMMIT phase in commit `9b2792c3da`, the same bug exists for the READ phase as well. This would manifest first as a lost SCSI response, and eventual hung task during fabric driver logout or re-login, as existing shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref to reach zero. To address this bug, compare_and_write_callback() has been changed to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE as necessary to signal failure status. Reported-by: Bill Borsari <wgb@datera.io> Cc: Bill Borsari <wgb@datera.io> Tested-by: Gary Guo <ghg@datera.io> Cc: Gary Guo <ghg@datera.io> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:39 +02:00
Juergen Gross	1ec6f0815a	xen: adjust early dom0 p2m handling to xen hypervisor behavior commit `69861e0a52` upstream. When booted as a pv-guest the p2m list presented by Xen is already mapped to virtual addresses. In the dom0 case the hypervisor might make use of 2M- or 1G-pages for this mapping. Unfortunately while being properly aligned in virtual and machine address space, those pages might not be aligned properly in guest physical address space. So when trying to obtain the guest physical address of such a page pud_pfn() and pmd_pfn() must be avoided as those will mask away guest physical address bits not being zero in this special case. Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-20 14:49:39 +02:00
Greg Kroah-Hartman	4c71e91a04	Linux 4.11.1	2017-05-14 14:06:16 +02:00
Ilya Dryomov	0873ab0224	block: get rid of blk_integrity_revalidate() commit `19b7ccf865` upstream. Commit `25520d55cd` ("block: Inline blk_integrity in struct gendisk") introduced blk_integrity_revalidate(), which seems to assume ownership of the stable pages flag and unilaterally clears it if no blk_integrity profile is registered: if (bi->profile) disk->queue->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES; else disk->queue->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES; It's called from revalidate_disk() and rescan_partitions(), making it impossible to enable stable pages for drivers that support partitions and don't use blk_integrity: while the call in revalidate_disk() can be trivially worked around (see zram, which doesn't support partitions and hence gets away with zram_revalidate_disk()), rescan_partitions() can be triggered from userspace at any time. This breaks rbd, where the ceph messenger is responsible for generating/verifying CRCs. Since blk_integrity_{un,}register() "must" be used for (un)registering the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES setting there. This way drivers that call blk_integrity_register() and use integrity infrastructure won't interfere with drivers that don't but still want stable pages. Fixes: `25520d55cd` ("block: Inline blk_integrity in struct gendisk") Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Mike Snitzer <snitzer@redhat.com> Tested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:03 +02:00
Boris Ostrovsky	3ed024e274	xen: Revert commits `da72ff5bfc` and `72a9b18629` commit `84d582d236` upstream. Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741) established that commit `72a9b18629` ("xen: Remove event channel notification through Xen PCI platform device") (and thus commit `da72ff5bfc` ("partially revert "xen: Remove event channel notification through Xen PCI platform device"")) are unnecessary and, in fact, prevent HVM guests from booting on Xen releases prior to 4.0. Therefore we revert both of those commits. The summary of that discussion is below: Here is the brief summary of the current situation: Before the offending commit (`72a9b18629`): 1) INTx does not work because of the reset_watches path. 2) The reset_watches path is only taken if you have Xen > 4.0 3) The Linux Kernel by default will use vector inject if the hypervisor supports it. So even if INTx does not work nobody running the kernel with Xen > 4.0 would notice. Unless he explicitly disabled this feature either in the kernel or in Xen (and this can only be disabled by modifying the code, not user-supported way to do it). After the offending commit (+ partial revert): 1) INTx is no longer supported for HVM (only for PV guests). 2) Any HVM guest The kernel will not boot on Xen < 4.0 which does not have vector injection support. Since the only other mode supported is INTx which. So based on this summary, I think before commit (`72a9b18629`) we were in much better position from a user point of view. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Julien Grall <julien.grall@arm.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Ross Lagerwall <ross.lagerwall@citrix.com> Cc: xen-devel@lists.xenproject.org Cc: linux-kernel@vger.kernel.org Cc: linux-pci@vger.kernel.org Cc: Anthony Liguori <aliguori@amazon.com> Cc: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:03 +02:00
Stefano Stabellini	5e25479ea9	xen/arm,arm64: fix xen_dma_ops after `815dd18` "Consolidate get_dma_ops..." commit `e058632670` upstream. The following commit: commit `815dd18788` Author: Bart Van Assche <bart.vanassche@sandisk.com> Date: Fri Jan 20 13:04:04 2017 -0800 treewide: Consolidate get_dma_ops() implementations rearranges get_dma_ops in a way that xen_dma_ops are not returned when running on Xen anymore, dev->dma_ops is returned instead (see arch/arm/include/asm/dma-mapping.h:get_arch_dma_ops and include/linux/dma-mapping.h:get_dma_ops). Fix the problem by storing dev->dma_ops in dev_archdata, and setting dev->dma_ops to xen_dma_ops. This way, xen_dma_ops is returned naturally by get_dma_ops. The Xen code can retrieve the original dev->dma_ops from dev_archdata when needed. It also allows us to remove __generic_dma_ops from common headers. Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Tested-by: Julien Grall <julien.grall@arm.com> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> CC: linux@armlinux.org.uk CC: catalin.marinas@arm.com CC: will.deacon@arm.com CC: boris.ostrovsky@oracle.com CC: jgross@suse.com CC: Julien Grall <julien.grall@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Jin Qian	c7f765b5d6	f2fs: sanity check segment count commit `b9dd46188e` upstream. F2FS uses 4 bytes to represent a block address. With 4 KB blocks, the maximum supported disk size is therefore 16 TB, which equals 16 * 1024 * 1024 / 2 segments of 2 MB each. Signed-off-by: Jin Qian <jinqian@google.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Jon Mason	275e2edbb8	net: mdio-mux: bcm-iproc: call mdiobus_free() in error path [ Upstream commit `922c60e89d` ] If an error is encountered in mdio_mux_init(), the error path will call mdiobus_free(). Since mdiobus_register() has been called prior to mdio_mux_init(), the bus->state will not be MDIOBUS_UNREGISTERED. This causes a BUG_ON() in mdiobus_free(). To correct this issue, add an error path for mdio_mux_init() which calls mdiobus_unregister() prior to mdiobus_free(). Signed-off-by: Jon Mason <jon.mason@broadcom.com> Fixes: `98bc865a1e` ("net: mdio-mux: Add MDIO mux driver for iProc SoCs") Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Daniel Borkmann	ced12308e5	bpf: don't let ldimm64 leak map addresses on unprivileged [ Upstream commit `0d0e57697f` ] The patch fixes two things at once: 1) It checks the env->allow_ptr_leaks and only prints the map address to the log if we have the privileges to do so, otherwise it just dumps 0 as we would when kptr_restrict is enabled on %pK. Given the latter is off by default and not every distro sets it, I don't want to rely on this, hence the 0 by default for unprivileged. 2) Printing of ldimm64 in the verifier log is currently broken in that we don't print the full immediate, but only the 32 bit part of the first insn part for ldimm64. Thus, fix this up as well; it's okay to access, since we verified all ldimm64 earlier already (including just constants) through replace_map_fd_with_map_ptr(). Fixes: `1be7f75d16` ("bpf: enable non-root eBPF programs") Fixes: `cbd3570086` ("bpf: verifier (add ability to receive verification log)") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Dan Carpenter	7a0a483ec5	bnxt_en: allocate enough space for ->ntp_fltr_bmap [ Upstream commit `ac45bd93a5` ] We have the number of longs, but we need to calculate the number of bytes required. Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Eric Dumazet	f8e3892f9f	tcp: randomize timestamps on syncookies [ Upstream commit `84b114b984` ] The whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing the server jiffies value. Also, the TSval sent by the server to a particular remote address varies depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Let's implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian's feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: `95a22caee3` ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Florian Westphal <fw@strlen.de> Tested-by: Florian Westphal <fw@strlen.de> Cc: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
WANG Cong	3960afa2e0	ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf [ Upstream commit `242d3a49a2` ] For each netns (except init_net), we initialize its null entry in 3 places: 1) The template itself, as we use kmemdup() 2) Code around dst_init_metrics() in ip6_route_net_init() 3) ip6_route_dev_notify(), which is supposed to initialize it after loopback registers Unfortunately the last one still happens in a wrong order because we expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to net->loopback_dev's idev, thus we have to do that after we add idev to loopback. However, this notifier has priority == 0 same as ipv6_dev_notf, and ipv6_dev_notf is registered after ip6_route_dev_notifier so it is called actually after ip6_route_dev_notifier. This is similar to commit `2f460933f5` ("ipv6: initialize route null entry in addrconf_init()") which fixes init_net. Fix it by picking a smaller priority for ip6_route_dev_notifier. Also, we have to release the refcnt accordingly when unregistering loopback_dev because device exit functions are called before subsys exit functions. Acked-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
WANG Cong	5ee83127ac	ipv6: initialize route null entry in addrconf_init() [ Upstream commit `2f460933f5` ] Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev since it is always NULL. This is clearly wrong, we have code to initialize it to loopback_dev, unfortunately the order is still not correct. loopback_dev is registered very early during boot, we lose a chance to re-initialize it in notifier. addrconf_init() is called after ip6_route_init(), which means we have no chance to correct it. Fix it by moving this initialization explicitly after ipv6_add_dev(init_net.loopback_dev) in addrconf_init(). Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Michal Schmidt	0913e57331	rtnetlink: NUL-terminate IFLA_PHYS_PORT_NAME string [ Upstream commit `77ef033b68` ] IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0. Otherwise libnl3 fails to validate netlink messages with this attribute. "ip -detail a" assumes too that the attribute is NUL-terminated when printing it. It often was, due to padding. I noticed this as libvirtd failing to start on a system with sfc driver after upgrading it to Linux 4.11, i.e. when sfc added support for phys_port_name. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Alexander Potapenko	165a007e51	ipv4, ipv6: ensure raw socket message is big enough to hold an IP header [ Upstream commit `86f4c90a1c` ] raw_send_hdrinc() and rawv6_send_hdrinc() expect that the buffer copied from the userspace contains the IPv4/IPv6 header, so if too few bytes are copied, parts of the header may remain uninitialized. This bug has been detected with KMSAN. For the record, the KMSAN report: ================================================================== BUG: KMSAN: use of unitialized memory in nf_ct_frag6_gather+0xf5a/0x44a0 inter: 0 CPU: 0 PID: 1036 Comm: probe Not tainted 4.11.0-rc5+ #2455 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x143/0x1b0 lib/dump_stack.c:52 kmsan_report+0x16b/0x1e0 mm/kmsan/kmsan.c:1078 __kmsan_warning_32+0x5c/0xa0 mm/kmsan/kmsan_instr.c:510 nf_ct_frag6_gather+0xf5a/0x44a0 net/ipv6/netfilter/nf_conntrack_reasm.c:577 ipv6_defrag+0x1d9/0x280 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn ./include/linux/netfilter.h:102 nf_hook_slow+0x13f/0x3c0 net/netfilter/core.c:310 nf_hook ./include/linux/netfilter.h:212 NF_HOOK ./include/linux/netfilter.h:255 rawv6_send_hdrinc net/ipv6/raw.c:673 rawv6_sendmsg+0x2fcb/0x41a0 net/ipv6/raw.c:919 inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696 SyS_sendto+0xbc/0xe0 net/socket.c:1664 do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285 entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246 RIP: 0033:0x436e03 RSP: 002b:00007ffce48baf38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000436e03 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007ffce48baf90 R08: 00007ffce48baf50 R09: 000000000000001c R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000401790 R14: 0000000000401820 R15: 0000000000000000 origin: 00000000d9400053 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:362 kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:257 kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:270 slab_alloc_node mm/slub.c:2735 __kmalloc_node_track_caller+0x1f4/0x390 mm/slub.c:4341 __kmalloc_reserve net/core/skbuff.c:138 __alloc_skb+0x2cd/0x740 net/core/skbuff.c:231 alloc_skb ./include/linux/skbuff.h:933 alloc_skb_with_frags+0x209/0xbc0 net/core/skbuff.c:4678 sock_alloc_send_pskb+0x9ff/0xe00 net/core/sock.c:1903 sock_alloc_send_skb+0xe4/0x100 net/core/sock.c:1920 rawv6_send_hdrinc net/ipv6/raw.c:638 rawv6_sendmsg+0x2918/0x41a0 net/ipv6/raw.c:919 inet_sendmsg+0x3f8/0x6d0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 sock_sendmsg net/socket.c:643 SYSC_sendto+0x6a5/0x7c0 net/socket.c:1696 SyS_sendto+0xbc/0xe0 net/socket.c:1664 do_syscall_64+0x72/0xa0 arch/x86/entry/common.c:285 return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246 ================================================================== , triggered by the following syscalls: socket(PF_INET6, SOCK_RAW, IPPROTO_RAW) = 3 sendto(3, NULL, 0, 0, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "ff00::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EPERM A similar report is triggered in net/ipv4/raw.c if we use a PF_INET socket instead of a PF_INET6 one. Signed-off-by: Alexander Potapenko <glider@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:02 +02:00
Eric Dumazet	86a9a0884d	tcp: do not inherit fastopen_req from parent [ Upstream commit `8b485ce698` ] Under fuzzer stress, it is possible that a child gets a non NULL fastopen_req pointer from its parent at accept() time, when/if parent morphs from listener to active session. We need to make sure this can not happen, by clearing the field after socket cloning. BUG: Double free or freeing an invalid pointer Unexpected shadow byte: 0xFB CPU: 3 PID: 20933 Comm: syz-executor3 Not tainted 4.11.0+ #306 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x292/0x395 lib/dump_stack.c:52 kasan_object_err+0x1c/0x70 mm/kasan/report.c:164 kasan_report_double_free+0x5c/0x70 mm/kasan/report.c:185 kasan_slab_free+0x9d/0xc0 mm/kasan/kasan.c:580 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2961 [inline] kfree+0xe8/0x2b0 mm/slub.c:3882 tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline] tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328 inet_child_forget+0xb8/0x600 net/ipv4/inet_connection_sock.c:898 inet_csk_reqsk_queue_add+0x1e7/0x250 net/ipv4/inet_connection_sock.c:928 tcp_get_cookie_sock+0x21a/0x510 net/ipv4/syncookies.c:217 cookie_v4_check+0x1a19/0x28b0 net/ipv4/syncookies.c:384 tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1384 [inline] tcp_v4_do_rcv+0x731/0x940 net/ipv4/tcp_ipv4.c:1421 tcp_v4_rcv+0x2dc0/0x31c0 net/ipv4/tcp_ipv4.c:1715 ip_local_deliver_finish+0x4cc/0xc20 net/ipv4/ip_input.c:216 NF_HOOK include/linux/netfilter.h:257 [inline] ip_local_deliver+0x1ce/0x700 net/ipv4/ip_input.c:257 dst_input include/net/dst.h:492 [inline] ip_rcv_finish+0xb1d/0x20b0 net/ipv4/ip_input.c:396 NF_HOOK include/linux/netfilter.h:257 [inline] ip_rcv+0xd8c/0x19c0 net/ipv4/ip_input.c:487 __netif_receive_skb_core+0x1ad1/0x3400 net/core/dev.c:4210 __netif_receive_skb+0x2a/0x1a0 net/core/dev.c:4248 process_backlog+0xe5/0x6c0 net/core/dev.c:4868 napi_poll net/core/dev.c:5270 [inline] net_rx_action+0xe70/0x18e0 net/core/dev.c:5335 __do_softirq+0x2fb/0xb99 kernel/softirq.c:284 do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:899 </IRQ> do_softirq.part.17+0x1e8/0x230 kernel/softirq.c:328 do_softirq kernel/softirq.c:176 [inline] __local_bh_enable_ip+0x1cf/0x1e0 kernel/softirq.c:181 local_bh_enable include/linux/bottom_half.h:31 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:931 [inline] ip_finish_output2+0x9ab/0x15e0 net/ipv4/ip_output.c:230 ip_finish_output+0xa35/0xdf0 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f6/0x7b0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 ip_queue_xmit+0x9a8/0x1a10 net/ipv4/ip_output.c:503 tcp_transmit_skb+0x1ade/0x3470 net/ipv4/tcp_output.c:1057 tcp_write_xmit+0x79e/0x55b0 net/ipv4/tcp_output.c:2265 __tcp_push_pending_frames+0xfa/0x3a0 net/ipv4/tcp_output.c:2450 tcp_push+0x4ee/0x780 net/ipv4/tcp.c:683 tcp_sendmsg+0x128d/0x39b0 net/ipv4/tcp.c:1342 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x446059 RSP: 002b:00007faa6761fb58 EFLAGS: 00000282 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000000000000017 RCX: 0000000000446059 RDX: 0000000000000001 RSI: 0000000020ba3fcd RDI: 0000000000000017 RBP: 00000000006e40a0 R08: 0000000020ba4ff0 R09: 0000000000000010 R10: 0000000020000000 R11: 0000000000000282 R12: 0000000000708150 R13: 0000000000000000 R14: 00007faa676209c0 R15: 00007faa67620700 Object at ffff88003b5bbcb8, in cache kmalloc-64 size: 64 Allocated: PID = 20909 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:616 kmem_cache_alloc_trace+0x82/0x270 mm/slub.c:2745 kmalloc include/linux/slab.h:490 [inline] kzalloc include/linux/slab.h:663 [inline] tcp_sendmsg_fastopen net/ipv4/tcp.c:1094 [inline] tcp_sendmsg+0x221a/0x39b0 net/ipv4/tcp.c:1139 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe Freed: PID = 20909 save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 save_stack+0x43/0xd0 mm/kasan/kasan.c:513 set_track mm/kasan/kasan.c:525 [inline] kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:589 slab_free_hook mm/slub.c:1357 [inline] slab_free_freelist_hook mm/slub.c:1379 [inline] slab_free mm/slub.c:2961 [inline] kfree+0xe8/0x2b0 mm/slub.c:3882 tcp_free_fastopen_req net/ipv4/tcp.c:1077 [inline] tcp_disconnect+0xc15/0x13e0 net/ipv4/tcp.c:2328 __inet_stream_connect+0x20c/0xf90 net/ipv4/af_inet.c:593 tcp_sendmsg_fastopen net/ipv4/tcp.c:1111 [inline] tcp_sendmsg+0x23a8/0x39b0 net/ipv4/tcp.c:1139 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:762 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 SYSC_sendto+0x660/0x810 net/socket.c:1696 SyS_sendto+0x40/0x50 net/socket.c:1664 entry_SYSCALL_64_fastpath+0x1f/0xbe Fixes: `e994b2f0fb` ("tcp: do not lock listener to process SYN packets") Fixes: `7db92362d2` ("tcp: fix potential double free issue for fastopen_req") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Daniele Palmas	0b1d889cb0	net: usb: qmi_wwan: add Telit ME910 support [ Upstream commit `4c54dc0277` ] This patch adds support for Telit ME910 PID 0x1100. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
David Ahern	23e761f37b	net: ipv6: Do not duplicate DAD on link up [ Upstream commit `6d717134a1` ] Andrey reported a warning triggered by the rcu code: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 5911 at lib/debugobjects.c:289 debug_print_object+0x175/0x210 ODEBUG: activate active (active state 1) object type: rcu_head hint: (null) Modules linked in: CPU: 1 PID: 5911 Comm: a.out Not tainted 4.11.0-rc8+ #271 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:16 dump_stack+0x192/0x22d lib/dump_stack.c:52 __warn+0x19f/0x1e0 kernel/panic.c:549 warn_slowpath_fmt+0xe0/0x120 kernel/panic.c:564 debug_print_object+0x175/0x210 lib/debugobjects.c:286 debug_object_activate+0x574/0x7e0 lib/debugobjects.c:442 debug_rcu_head_queue kernel/rcu/rcu.h:75 __call_rcu.constprop.76+0xff/0x9c0 kernel/rcu/tree.c:3229 call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288 rt6_rcu_free net/ipv6/ip6_fib.c:158 rt6_release+0x1ea/0x290 net/ipv6/ip6_fib.c:188 fib6_del_route net/ipv6/ip6_fib.c:1461 fib6_del+0xa42/0xdc0 net/ipv6/ip6_fib.c:1500 __ip6_del_rt+0x100/0x160 net/ipv6/route.c:2174 ip6_del_rt+0x140/0x1b0 net/ipv6/route.c:2187 __ipv6_ifa_notify+0x269/0x780 net/ipv6/addrconf.c:5520 addrconf_ifdown+0xe60/0x1a20 net/ipv6/addrconf.c:3672 ... Andrey's reproducer program runs in a very tight loop, calling 'unshare -n' and then spawning 2 sets of 14 threads running random ioctl calls. The relevant networking sequence: 1. New network namespace created via unshare -n - ip6tnl0 device is created in down state 2. address added to ip6tnl0 - equivalent to ip -6 addr add dev ip6tnl0 fd00::bb/1 - DAD is started on the address and when it completes the host route is inserted into the FIB 3. ip6tnl0 is brought up - the new fixup_permanent_addr function restarts DAD on the address 4. exit namespace - teardown / cleanup sequence starts - once in a blue moon, lo teardown appears to happen BEFORE teardown of ip6tnl0 + down on 'lo' removes the host route from the FIB since the dst->dev for the route is loopback + host route added to rcu callback list * rcu callback has not run yet, so rt is NOT on the gc list so it has NOT been marked obsolete 5. in parallel to 4. worker_thread runs addrconf_dad_completed - DAD on the address on ip6tnl0 completes - calls ipv6_ifa_notify which inserts the host route All of that happens very quickly. The result is that a host route that has been deleted from the IPv6 FIB and added to the RCU list is re-inserted into the FIB. The exit namespace eventually gets to cleaning up ip6tnl0 which removes the host route from the FIB again, calls the rcu function for cleanup -- and triggers the double rcu trace. The root cause is duplicate DAD on the address -- steps 2 and 3. Arguably, DAD should not be started in step 2. The interface is in the down state, so it can not really send out requests for the address which makes starting DAD pointless. Since the second DAD was introduced by a recent change, seems appropriate to use it for the Fixes tag and have the fixup function only start DAD for addresses in the PREDAD state which occurs in addrconf_ifdown if the address is retained. Big thanks to Andrey for isolating a reliable reproducer for this problem. Fixes: `f1705ec197` ("net: ipv6: Make address flushing on ifdown optional") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David Ahern <dsahern@gmail.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Eric Dumazet	a1baca92af	tcp: fix wraparound issue in tcp_lp [ Upstream commit `a9f11f963a` ] Be careful when comparing tcp_time_stamp to some u32 quantity, otherwise result can be surprising. Fixes: `7c106d7e78` ("[TCP]: TCP Low Priority congestion control") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Daniel Borkmann	f94cc32ccb	bpf, arm64: fix jit branch offset related to ldimm64 [ Upstream commit `ddc665a4bb` ] When the instruction right before the branch destination is a 64 bit load immediate, we currently calculate the wrong jump offset in the ctx->offset[] array as we only account one instruction slot for the 64 bit load immediate although it uses two BPF instructions. Fix it up by setting the offset into the right slot after we incremented the index. Before (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 54ffff82 b.cs 0x00000020 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] After (ldimm64 test 1): [...] 00000020: 52800007 mov w7, #0x0 // #0 00000024: d2800060 mov x0, #0x3 // #3 00000028: d2800041 mov x1, #0x2 // #2 0000002c: eb01001f cmp x0, x1 00000030: 540000a2 b.cs 0x00000044 00000034: d29fffe7 mov x7, #0xffff // #65535 00000038: f2bfffe7 movk x7, #0xffff, lsl #16 0000003c: f2dfffe7 movk x7, #0xffff, lsl #32 00000040: f2ffffe7 movk x7, #0xffff, lsl #48 00000044: d29dddc7 mov x7, #0xeeee // #61166 00000048: f2bdddc7 movk x7, #0xeeee, lsl #16 0000004c: f2ddddc7 movk x7, #0xeeee, lsl #32 00000050: f2fdddc7 movk x7, #0xeeee, lsl #48 [...] Also, add a couple of test cases to make sure JITs pass this test. Tested on Cavium ThunderX ARMv8. The added test cases all pass after the fix. Fixes: `8eee539dde` ("arm64: bpf: fix out-of-bounds read in bpf2a64_offset()") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Yonghong Song	34acfbf08c	bpf: enhance verifier to understand stack pointer arithmetic [ Upstream commit `332270fdc8` ] llvm 4.0 and above generates the code like below: .... 440: (b7) r1 = 15 441: (05) goto pc+73 515: (79) r6 = *(u64 *)(r10 -152) 516: (bf) r7 = r10 517: (07) r7 += -112 518: (bf) r2 = r7 519: (0f) r2 += r1 520: (71) r1 = *(u8 *)(r8 +0) 521: (73) *(u8 *)(r2 +45) = r1 .... and the verifier complains "R2 invalid mem access 'inv'" for insn #521. This is because verifier marks register r2 as unknown value after #519 where r2 is a stack pointer and r1 holds a constant value. Teach verifier to recognize "stack_ptr + imm" and "stack_ptr + reg with const val" as valid stack_ptr with new offset. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Girish Moodalbail	5a5b34164f	geneve: fix incorrect setting of UDP checksum flag [ Upstream commit `5e0740c445` ] Creating a geneve link with 'udpcsum' set results in a creation of link for which UDP checksum will NOT be computed on outbound packets, as can be seen below. 11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0 geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum Similarly, creating a link with 'noudpcsum' set results in a creation of link for which UDP checksum will be computed on outbound packets. Fixes: `9b4437a5b8` ("geneve: Unify LWT and netdev handling.") Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Davide Caratti	011fbfb296	tcp: fix access to sk->sk_state in tcp_poll() [ Upstream commit `d68be71ea1` ] avoid direct access to sk->sk_state when tcp_poll() is called on a socket using active TCP fastopen with deferred connect. Use local variable 'state', which stores the result of sk_state_load(), like it was done in commit `00fd38d938` ("tcp: ensure proper barriers in lockless contexts"). Fixes: `19f6d3f3c8` ("net/tcp-fastopen: Add new API support") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Alexandre Belloni	c0e5b847c7	net: macb: fix phy interrupt parsing [ Upstream commit `ae3696c167` ] Since `83a77e9ec4`, the phydev irq is explicitly set to PHY_POLL when there is no pdata. It doesn't work on DT enabled platforms because the phydev irq is already set by libphy before. Fixes: `83a77e9ec4` ("net: macb: Added PCI wrapper for Platform Driver.") Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
Greg Kroah-Hartman	acc7c954b0	refcount: change EXPORT_SYMBOL markings commit `d557d1b58b` upstream. Now that kref is using the refcount apis, the _GPL markings are getting exported to places that it previously wasn't. Now kref.h is GPLv2 licensed, so any non-GPL code using it better be talking to some lawyers, but changing api markings isn't considered "nice", so let's fix this up. Cc: Philip Müller <philm@manjaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-05-14 14:06:01 +02:00
Dave Aldridge	e2b8851f1f	sparc64: fix fault handling in NGbzero.S and GENbzero.S commit `3c7f622120` upstream. When any of the functions contained in NGbzero.S and GENbzero.S vector through *bzero_from_clear_user, we may end up taking a fault when executing one of the store alternate address space instructions. If this happens, the exception handler does not restore the %asi register. This commit fixes the issue by introducing a new exception handler that ensures the %asi register is restored when a fault is handled. Orabug: 25577560 Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:01 +02:00
James Hughes	c98f47c1f6	brcmfmac: Make skb header writable before use commit `9cc4b7cb86` upstream. The driver was making changes to the skb_header without ensuring it was writable (i.e. uncloned). This patch also removes some boiler plate header size checking/adjustment code as that is also handled by the skb_cow_header function used to make header writable. Signed-off-by: James Hughes <james.hughes@raspberrypi.org> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
James Hughes	7f073b7e4d	brcmfmac: Ensure pointer correctly set if skb data location changes commit `455a1eb465` upstream. The incoming skb header may be resized if header space is insufficient, which might change the data address in the skb. Ensure that a cached pointer to that data is correctly set by moving assignment to after any possible changes. Signed-off-by: James Hughes <james.hughes@raspberrypi.org> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
Giedrius Statkevičius	b998524b6c	power: supply: lp8788: prevent out of bounds array access commit `bdd9968d35` upstream. val might become 7 in which case stime[7] (array of length 7) would be accessed during the scnprintf call later and that will cause issues. Obviously, string concatenation is not intended here so just a comma needs to be added to fix the issue. Fixes: `98a2766493` ("power_supply: Add new lp8788 charger driver") Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@gmail.com> Acked-by: Milo Kim <milo.kim@ti.com> Signed-off-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
Vincent Abriou	648ada88cb	drm/sti: fix GDP size to support up to UHD resolution commit `2f410f88c0` upstream. On stih407-410 chip family the GDP layers are able to support up to UHD resolution (3840 x 2160). Signed-off-by: Vincent Abriou <vincent.abriou@st.com> Acked-by: Lee Jones <lee.jones@linaro.org> Tested-by: Lee Jones <lee.jones@linaro.org> Link: http://patchwork.freedesktop.org/patch/msgid/1490280292-30466-1-git-send-email-vincent.abriou@st.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
Adrian Salido	40a1937317	dm ioctl: prevent stack leak in dm ioctl call commit `4617f564c0` upstream. When calling a dm ioctl that doesn't process any data (IOCTL_FLAGS_NO_PARAMS), the contents of the data field in struct dm_ioctl are left uninitialized. Current code is incorrectly extending the size of data copied back to user, causing the contents of kernel stack to be leaked to user. Fix by only copying contents before data and allow the functions processing the ioctl to override. Signed-off-by: Adrian Salido <salidoa@google.com> Reviewed-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-05-14 14:06:00 +02:00
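
The lp8788 entry in the log above comes down to a missing comma between two adjacent string literals, which the compiler silently joins into one element. The driver is C; Python applies the same adjacent-literal rule, so the slip can be sketched there. The labels below are hypothetical and are not taken from the driver's actual table.

```python
# Hypothetical timeout labels; the real driver table is C and may differ.
# The missing comma after "30min" fuses the last two literals into one.
STIME_BUGGY = [
    "400ms", "5min", "10min", "15min",
    "20min", "25min", "30min" "No timeout",
]

# Same table with the comma restored: eight separate entries.
STIME_FIXED = [
    "400ms", "5min", "10min", "15min",
    "20min", "25min", "30min", "No timeout",
]


def label(table, val):
    """Return the label for index val, or None when val is out of range."""
    return table[val] if 0 <= val < len(table) else None


print(len(STIME_BUGGY), len(STIME_FIXED))
print(label(STIME_BUGGY, 7), label(STIME_FIXED, 7))
```

In Python an unchecked `STIME_BUGGY[7]` would raise `IndexError`; in C the equivalent read is simply out of bounds, which is why the commit describes it as an array access problem rather than a crash.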

34177 changed files with 859105 additions and 2802142 deletions

48

.gitignore

@@ -7,40 +7,37 @@
 # command after changing this file, to see if there are
 # any tracked files which get ignored after the change.
 #
 # Normal rules (sorted alphabetically)
 # Normal rules
 #
 .*
 *.a
 *.bin
 *.bz2
 *.c.[012]*.*
 *.dtb
 *.dtb.S
 *.dwo
 *.elf
 *.gcno
 *.gz
 *.i
 *.ko
 *.ll
 *.lst
 *.lz4
 *.lzma
 *.lzo
 *.mod.c
 *.o
 *.o.*
 *.order
 *.patch
 *.a
 *.s
 *.ko
 *.so
 *.so.dbg
 *.su
 *.mod.c
 *.i
 *.lst
 *.symtypes
 *.order
 *.elf
 *.bin
 *.tar
 *.gz
 *.bz2
 *.lzma
 *.xz
 Module.symvers
 *.lz4
 *.lzo
 *.patch
 *.gcno
 modules.builtin
 Module.symvers
 *.dwo
 *.su
 *.c.[012]*.*
 #
 # Top-level generic files
@@ -55,11 +52,6 @@ modules.builtin
 /System.map
 /Module.markers
 #
 # RPM spec file (make rpm-pkg)
 #
 /*.spec
 #
 # Debian directory (make deb-pkg)
 #

12

.mailmap

@@ -15,7 +15,6 @@ Adriana Reus <adi.reus@gmail.com> <adriana.reus@intel.com>
 Alan Cox <alan@lxorguk.ukuu.org.uk>
 Alan Cox <root@hraefn.swansea.linux.org.uk>
 Aleksey Gorelov <aleksey_gorelov@phoenix.com>
 Aleksandar Markovic <aleksandar.markovic@mips.com> <aleksandar.markovic@imgtec.com>
 Al Viro <viro@ftp.linux.org.uk>
 Al Viro <viro@zenIV.linux.org.uk>
 Andreas Herrmann <aherrman@de.ibm.com>
@@ -44,7 +43,6 @@ Corey Minyard <minyard@acm.org>
 Damian Hobson-Garcia <dhobsong@igel.co.jp>
 David Brownell <david-b@pacbell.net>
 David Woodhouse <dwmw2@shinybook.infradead.org>
 Deng-Cheng Zhu <dengcheng.zhu@mips.com> <dengcheng.zhu@imgtec.com>
 Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
 Domen Puncer <domen@coderock.org>
 Douglas Gilbert <dougg@torque.net>
@@ -70,8 +68,6 @@ Jacob Shin <Jacob.Shin@amd.com>
 James Bottomley <jejb@mulgrave.(none)>
 James Bottomley <jejb@titanic.il.steeleye.com>
 James E Wilson <wilson@specifix.com>
 James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
 James Hogan <jhogan@kernel.org> <james@albanarts.com>
 James Ketrenos <jketreno@io.(none)>
 Javi Merino <javi.merino@kernel.org> <javi.merino@arm.com>
 <javier@osg.samsung.com> <javier.martinez@collabora.co.uk>
@@ -102,8 +98,6 @@ Leonid I Ananiev <leonid.i.ananiev@intel.com>
 Linas Vepstas <linas@austin.ibm.com>
 Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@web.de>
 Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@ascom.ch>
 Maciej W. Rozycki <macro@mips.com> <macro@imgtec.com>
 Marcin Nowakowski <marcin.nowakowski@mips.com> <marcin.nowakowski@imgtec.com>
 Mark Brown <broonie@sirena.org.uk>
 Martin Kepplinger <martink@posteo.de> <martin.kepplinger@theobroma-systems.com>
 Martin Kepplinger <martink@posteo.de> <martin.kepplinger@ginzinger.com>
@@ -117,12 +111,9 @@ Mauro Carvalho Chehab <mchehab@kernel.org> <mchehab@osg.samsung.com>
 Mauro Carvalho Chehab <mchehab@kernel.org> <mchehab@s-opensource.com>
 Matt Ranostay <mranostay@gmail.com> Matthew Ranostay <mranostay@embeddedalley.com>
 Matt Ranostay <mranostay@gmail.com> <matt.ranostay@intel.com>
 Matt Ranostay <matt.ranostay@konsulko.com> <matt@ranostay.consulting>
 Matt Redfearn <matt.redfearn@mips.com> <matt.redfearn@imgtec.com>
 Mayuresh Janorkar <mayur@ti.com>
 Michael Buesch <m@bues.ch>
 Michel Dänzer <michel@tungstengraphics.com>
 Miodrag Dinic <miodrag.dinic@mips.com> <miodrag.dinic@imgtec.com>
 Mitesh shah <mshah@teja.com>
 Mohit Kumar <mohit.kumar@st.com> <mohit.kumar.dhaka@gmail.com>
 Morten Welinder <terra@gnome.org>
@@ -133,7 +124,6 @@ Mythri P K <mythripk@ti.com>
 Nguyen Anh Quynh <aquynh@gmail.com>
 Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
 Patrick Mochel <mochel@digitalimplant.org>
 Paul Burton <paul.burton@mips.com> <paul.burton@imgtec.com>
 Peter A Jonsson <pj@ludd.ltu.se>
 Peter Oruba <peter@oruba.de>
 Peter Oruba <peter.oruba@amd.com>
@@ -155,8 +145,6 @@ Santosh Shilimkar <ssantosh@kernel.org>
 Santosh Shilimkar <santosh.shilimkar@oracle.org>
 Sascha Hauer <s.hauer@pengutronix.de>
 S.Çağlar Onur <caglar@pardus.org.tr>
 Sebastian Reichel <sre@kernel.org> <sre@debian.org>
 Sebastian Reichel <sre@kernel.org> <sebastian.reichel@collabora.co.uk>
 Shiraz Hashim <shiraz.linux.kernel@gmail.com> <shiraz.hashim@st.com>
 Shuah Khan <shuah@kernel.org> <shuahkhan@gmail.com>
 Shuah Khan <shuah@kernel.org> <shuah.khan@hp.com>

27

CREDITS

@@ -1034,10 +1034,6 @@ S: 2037 Walnut #6
 S: Boulder, Colorado 80302
 S: USA
 N: Hans-Christian Noren Egtvedt
 E: egtvedt@samfundet.no
 D: AVR32 architecture maintainer.
 N: Heiko Eißfeldt
 E: heiko@colossus.escape.de heiko@unifix.de
 D: verify_area stuff, generic SCSI fixes
@@ -2090,7 +2086,7 @@ S: Kuala Lumpur, Malaysia
 N: Mohit Kumar
 D: ST Microelectronics SPEAr13xx PCI host bridge driver
 D: Synopsys DesignWare PCI host bridge driver
 D: Synopsys Designware PCI host bridge driver
 N: Gabor Kuti
 E: seasons@falcon.sch.bme.hu
@@ -2113,10 +2109,6 @@ S: J. Obrechtstr 23
 S: NL-5216 GP 's-Hertogenbosch
 S: The Netherlands
 N: Ashley Lai
 E: ashleydlai@gmail.com
 D: IBM VTPM driver
 N: Savio Lam
 E: lam836@cs.cuhk.hk
 D: Author of the dialog utility, foundation
@@ -2610,9 +2602,11 @@ E: tmolina@cablespeed.com
 D: bug fixes, documentation, minor hackery
 N: Paul Moore
 E: paul@paul-moore.com
 W: http://www.paul-moore.com
 D: NetLabel, SELinux, audit
 E: paul.moore@hp.com
 D: NetLabel author
 S: Hewlett-Packard
 S: 110 Spit Brook Road
 S: Nashua, NH 03062
 N: James Morris
 E: jmorris@namei.org
@@ -3337,10 +3331,6 @@ S: Braunschweiger Strasse 79
 S: 31134 Hildesheim
 S: Germany
 N: Marcel Selhorst
 E: tpmdd@selhorst.net
 D: TPM driver
 N: Darren Senn
 E: sinster@darkwater.com
 D: Whatever I notice needs doing (so far: itimers, /proc)
@@ -3408,10 +3398,6 @@ S: Suite 101
 S: Markham, Ontario L3R 2Z6
 S: Canada
 N: Haavard Skinnemoen
 M: Haavard Skinnemoen <hskinnemoen@gmail.com>
 D: AVR32 architecture port to Linux and maintainer.
 N: Rick Sladkey
 E: jrs@world.std.com
 D: utility hacker: Emacs, NFS server, mount, kmem-ps, UPS debugger, strace, GDB
@@ -4136,6 +4122,7 @@ D: MD driver
 D: EISA/sysfs subsystem
 S: France
 # Don't add your name here, unless you really _are_ after Marc
 # alphabetically. Leonard used to be very proud of being the
 # last entry, and he'll get positively pissed if he can't even

10

Documentation/00-INDEX

@@ -24,6 +24,8 @@ DMA-ISA-LPC.txt
 	- How to do DMA with ISA (and LPC) devices.
 DMA-attributes.txt
 	- listing of the various possible attributes a DMA region can have
 DocBook/
 	- directory with DocBook templates etc. for kernel documentation.
 EDID/
 	- directory with info on customizing EDID for broken gfx/displays.
 IPMI.txt
@@ -38,6 +40,8 @@ Intel-IOMMU.txt
 	- basic info on the Intel IOMMU virtualization support.
 Makefile
 	- It's not of interest for those who aren't touching the build system.
 Makefile.sphinx
 	- It's not of interest for those who aren't touching the build system.
 PCI/
 	- info related to PCI drivers.
 RCU/
@@ -242,6 +246,8 @@ kprobes.txt
 	- documents the kernel probes debugging feature.
 kref.txt
 	- docs on adding reference counters (krefs) to kernel objects.
 kselftest.txt
 	- small unittests for (some) individual codepaths in the kernel.
 laptops/
 	- directory with laptop related info and laptop driver documentation.
 ldm.txt
@@ -258,8 +264,6 @@ logo.gif
 	- full colour GIF image of Linux logo (penguin - Tux).
 logo.txt
 	- info on creator of above logo & site to get additional images from.
 lsm.txt
 	- Linux Security Modules: General Security Hooks for Linux
 lzo.txt
 	- kernel LZO decompressor input formats
 m68k/
@@ -408,8 +412,6 @@ sysctl/
 	- directory with info on the /proc/sys/* files.
 target/
 	- directory with info on generating TCM v4 fabric .ko modules
 tee.txt
 	- info on the TEE subsystem and drivers
 this_cpu_ops.txt
 	- List rationale behind and the way to use this_cpu operations.
 thermal/

8

Documentation/ABI/obsolete/sysfs-firmware-acpi

@@ -1,8 +0,0 @@
 What:		/sys/firmware/acpi/hotplug/force_remove
 Date:		Mar 2017
 Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 Description:
 		Since the force_remove is inherently broken and dangerous to
 		use for some hotplugable resources like memory (because ignoring
 		the offline failure might lead to memory corruption and crashes)
 		enabling this knob is not safe and thus unsupported.

19

Documentation/ABI/stable/sysfs-bus-nvmem

@@ -1,19 +0,0 @@
 What:		/sys/bus/nvmem/devices/.../nvmem
 Date:		July 2015
 KernelVersion:  4.2
 Contact:	Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
 Description:
 		This file allows user to read/write the raw NVMEM contents.
 		Permissions for write to this file depends on the nvmem
 		provider configuration.
 		ex:
 		hexdump /sys/bus/nvmem/devices/qfprom0/nvmem
 		0000000 0000 0000 0000 0000 0000 0000 0000 0000
 		*
 		00000a0 db10 2240 0000 e000 0c00 0c00 0000 0c00
 		0000000 0000 0000 0000 0000 0000 0000 0000 0000
 		...
 		*
 		0001000

2

Documentation/ABI/stable/sysfs-bus-usb

@@ -9,7 +9,7 @@ Description:
 		hubs this facility is always enabled and their device
 		directories will not contain this file.
 		For more information, see Documentation/driver-api/usb/persist.rst.
 		For more information, see Documentation/usb/persist.txt.
 What:		/sys/bus/usb/devices/.../power/autosuspend
 Date:		March 2007

16

Documentation/ABI/stable/sysfs-class-udc

@@ -55,6 +55,14 @@ Description:
 		Indicates the maximum USB speed supported by this port.
 Users:
 What:		/sys/class/udc/<udc>/maximum_speed
 Date:		June 2011
 KernelVersion:	3.1
 Contact:	Felipe Balbi <balbi@kernel.org>
 Description:
 		Indicates the maximum USB speed supported by this port.
 Users:
 What:		/sys/class/udc/<udc>/soft_connect
 Date:		June 2011
 KernelVersion:	3.1
@@ -83,11 +91,3 @@ Description:
 		'configured', and 'suspended'; however not all USB Device
 		Controllers support reporting all states.
 Users:
 What:		/sys/class/udc/<udc>/function
 Date:		June 2017
 KernelVersion:	4.13
 Contact:	Felipe Balbi <balbi@kernel.org>
 Description:
 		Prints out name of currently running USB Gadget Driver.
 Users:

15

Documentation/ABI/stable/sysfs-driver-aspeed-vuart

@@ -1,15 +0,0 @@
 What:		/sys/bus/platform/drivers/aspeed-vuart/*/lpc_address
 Date:		April 2017
 Contact:	Jeremy Kerr <jk@ozlabs.org>
 Description:	Configures which IO port the host side of the UART
 		will appear on the host <-> BMC LPC bus.
 Users:		OpenBMC.  Proposed changes should be mailed to
 		openbmc@lists.ozlabs.org
 What:		/sys/bus/platform/drivers/aspeed-vuart*/sirq
 Date:		April 2017
 Contact:	Jeremy Kerr <jk@ozlabs.org>
 Description:	Configures which interrupt number the host side of
 		the UART will appear on the host <-> BMC LPC bus.
 Users:		OpenBMC.  Proposed changes should be mailed to
 		openbmc@lists.ozlabs.org

30

Documentation/ABI/stable/sysfs-driver-dma-ioatdma

@@ -1,30 +0,0 @@
 What:           sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/cap
 Date:           December 3, 2009
 KernelVersion:  2.6.32
 Contact:        dmaengine@vger.kernel.org
 Description:	Capabilities the DMA supports.Currently there are DMA_PQ, DMA_PQ_VAL,
 		DMA_XOR,DMA_XOR_VAL,DMA_INTERRUPT.
 What:           sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/ring_active
 Date:           December 3, 2009
 KernelVersion:  2.6.32
 Contact:        dmaengine@vger.kernel.org
 Description:	The number of descriptors active in the ring.
 What:           sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/ring_size
 Date:           December 3, 2009
 KernelVersion:  2.6.32
 Contact:        dmaengine@vger.kernel.org
 Description:	Descriptor ring size, total number of descriptors available.
 What:           sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/version
 Date:           December 3, 2009
 KernelVersion:  2.6.32
 Contact:        dmaengine@vger.kernel.org
 Description:	Version of ioatdma device.
 What:           sys/devices/pciXXXX:XX/0000:XX:XX.X/dma/dma<n>chan<n>/quickdata/intr_coalesce
 Date:           August 8, 2017
 KernelVersion:  4.14
 Contact:        dmaengine@vger.kernel.org
 Description:	Tune-able interrupt delay value per channel basis.

119

Documentation/ABI/stable/sysfs-hypervisor-xen

@@ -1,119 +0,0 @@
 What:		/sys/hypervisor/compilation/compile_date
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Contains the build time stamp of the Xen hypervisor
 		Might return "<denied>" in case of special security settings
 		in the hypervisor.
 What:		/sys/hypervisor/compilation/compiled_by
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Contains information who built the Xen hypervisor
 		Might return "<denied>" in case of special security settings
 		in the hypervisor.
 What:		/sys/hypervisor/compilation/compiler
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Compiler which was used to build the Xen hypervisor
 		Might return "<denied>" in case of special security settings
 		in the hypervisor.
 What:		/sys/hypervisor/properties/capabilities
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Space separated list of supported guest system types. Each type
 		is in the format: <class>-<major>.<minor>-<arch>
 		With:
 			<class>: "xen" -- x86: paravirtualized, arm: standard
 				 "hvm" -- x86 only: fully virtualized
 			<major>: major guest interface version
 			<minor>: minor guest interface version
 			<arch>:  architecture, e.g.:
 				 "x86_32": 32 bit x86 guest without PAE
 				 "x86_32p": 32 bit x86 guest with PAE
 				 "x86_64": 64 bit x86 guest
 				 "armv7l": 32 bit arm guest
 				 "aarch64": 64 bit arm guest
 What:		/sys/hypervisor/properties/changeset
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Changeset of the hypervisor (git commit)
 		Might return "<denied>" in case of special security settings
 		in the hypervisor.
 What:		/sys/hypervisor/properties/features
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Features the Xen hypervisor supports for the guest as defined
 		in include/xen/interface/features.h printed as a hex value.
 What:		/sys/hypervisor/properties/pagesize
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Default page size of the hypervisor printed as a hex value.
 		Might return "0" in case of special security settings
 		in the hypervisor.
 What:		/sys/hypervisor/properties/virtual_start
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Virtual address of the hypervisor as a hex value.
 What:		/sys/hypervisor/type
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Type of hypervisor:
 		"xen": Xen hypervisor
 What:		/sys/hypervisor/uuid
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		UUID of the guest as known to the Xen hypervisor.
 What:		/sys/hypervisor/version/extra
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		The Xen version is in the format <major>.<minor><extra>
 		This is the <extra> part of it.
 		Might return "<denied>" in case of special security settings
 		in the hypervisor.
 What:		/sys/hypervisor/version/major
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		The Xen version is in the format <major>.<minor><extra>
 		This is the <major> part of it.
 What:		/sys/hypervisor/version/minor
 Date:		March 2009
 KernelVersion:	2.6.30
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		The Xen version is in the format <major>.<minor><extra>
 		This is the <minor> part of it.
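
The `<class>-<major>.<minor>-<arch>` layout described for `/sys/hypervisor/properties/capabilities` can be split with a short user-space parser. This is a minimal sketch, assuming entries are separated by whitespace as the description states; the `GuestType` tuple, the function name, and the sample string are illustrative and not taken from the kernel tree.

```python
import re
from typing import NamedTuple


class GuestType(NamedTuple):
    klass: str   # "xen" or "hvm"
    major: int   # major guest interface version
    minor: int   # minor guest interface version
    arch: str    # e.g. "x86_64", "x86_32p", "aarch64"


# One entry: <class>-<major>.<minor>-<arch>
_ENTRY = re.compile(
    r"^(?P<klass>[a-z]+)-(?P<major>\d+)\.(?P<minor>\d+)-(?P<arch>\w+)$"
)


def parse_capabilities(text):
    """Split a space separated capabilities string into GuestType tuples.

    Entries that do not match the documented layout are skipped.
    """
    result = []
    for entry in text.split():
        m = _ENTRY.match(entry)
        if m is None:
            continue
        result.append(
            GuestType(m["klass"], int(m["major"]), int(m["minor"]), m["arch"])
        )
    return result


# Hypothetical sample; a real host reports its own list.
SAMPLE = "xen-3.0-x86_64 hvm-3.0-x86_32p hvm-3.0-x86_64"

for guest in parse_capabilities(SAMPLE):
    print(guest)
```

On a Xen guest the input would come from reading the sysfs file itself, for example `parse_capabilities(open("/sys/hypervisor/properties/capabilities").read())`.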

3

Documentation/ABI/stable/vdso

@@ -16,8 +16,7 @@ The vDSO uses symbol versioning; whenever you request a symbol from the
 vDSO, specify the version you are expecting.
 Programs that dynamically link to glibc will use the vDSO automatically.
 Otherwise, you can use the reference parser in
 tools/testing/selftests/vDSO/parse_vdso.c.
 Otherwise, you can use the reference parser in Documentation/vDSO/parse_vdso.c.
 Unless otherwise noted, the set of symbols with any given version and the
 ABI of those symbols is considered stable.  It may vary across architectures,

3

Documentation/ABI/testing/configfs-usb-gadget-rndis

@@ -12,6 +12,3 @@ Description:
 				Ethernet over USB link
 		dev_addr	- MAC address of device's end of this
 				Ethernet over USB link
 		class		- USB interface class, default is 02 (hex)
 		subclass	- USB interface subclass, default is 06 (hex)
 		protocol	- USB interface protocol, default is 00 (hex)

18

Documentation/ABI/testing/configfs-usb-gadget-uac1

@@ -1,14 +1,12 @@
 What:		/config/usb-gadget/gadget/functions/uac1.name
 Date:		June 2017
 KernelVersion:	4.14
 Date:		Sep 2014
 KernelVersion:	3.18
 Description:
 		The attributes:
 		c_chmask - capture channel mask
 		c_srate - capture sampling rate
 		c_ssize - capture sample size (bytes)
 		p_chmask - playback channel mask
 		p_srate - playback sampling rate
 		p_ssize - playback sample size (bytes)
 		req_number - the number of pre-allocated request
 			for both capture and playback
 		audio_buf_size - audio buffer size
 		fn_cap - capture pcm device file name
 		fn_cntl - control device file name
 		fn_play - playback pcm device file name
 		req_buf_size - ISO OUT endpoint request buffer size
 		req_count - ISO OUT endpoint request count

12

Documentation/ABI/testing/configfs-usb-gadget-uac1_legacy

@@ -1,12 +0,0 @@
 What:		/config/usb-gadget/gadget/functions/uac1_legacy.name
 Date:		Sep 2014
 KernelVersion:	3.18
 Description:
 		The attributes:
 		audio_buf_size - audio buffer size
 		fn_cap - capture pcm device file name
 		fn_cntl - control device file name
 		fn_play - playback pcm device file name
 		req_buf_size - ISO OUT endpoint request buffer size
 		req_count - ISO OUT endpoint request count

8

Documentation/ABI/testing/ima_policy

@@ -34,10 +34,9 @@ Description:
 			fsuuid:= file system UUID (e.g 8bcbe394-4f13-4144-be8e-5aa9ea2ce2f6)
 			uid:= decimal value
 			euid:= decimal value
 			fowner:= decimal value
 			fowner:=decimal value
 		lsm:  	are LSM specific
 		option:	appraise_type:= [imasig]
 			pcr:= decimal value
 		default policy:
 			# PROC_SUPER_MAGIC
@@ -97,8 +96,3 @@ Description:
 		Smack:
 			measure subj_user=_ func=FILE_CHECK mask=MAY_READ
 		Example of measure rules using alternate PCRs:
 			measure func=KEXEC_KERNEL_CHECK pcr=4
 			measure func=KEXEC_INITRAMFS_CHECK pcr=5

45

Documentation/ABI/testing/ppc-memtrace

@@ -1,45 +0,0 @@
 What:		/sys/kernel/debug/powerpc/memtrace
 Date:		Aug 2017
 KernelVersion:	4.14
 Contact:	linuxppc-dev@lists.ozlabs.org
 Description:	This folder contains the relevant debugfs files for the
 		hardware trace macro to use. CONFIG_PPC64_HARDWARE_TRACING
 		must be set.
 What:		/sys/kernel/debug/powerpc/memtrace/enable
 Date:		Aug 2017
 KernelVersion:	4.14
 Contact:	linuxppc-dev@lists.ozlabs.org
 Description:	Write an integer containing the size in bytes of the memory
 		you want removed from each NUMA node to this file - it must be
 		aligned to the memblock size. This amount of RAM will be removed
 		from the kernel mappings and the following debugfs files will be
 		created. This can only be successfully done once per boot. Once
 		memory is successfully removed from each node, the following
 		files are created.
 What:		/sys/kernel/debug/powerpc/memtrace/<node-id>
 Date:		Aug 2017
 KernelVersion:	4.14
 Contact:	linuxppc-dev@lists.ozlabs.org
 Description:	This directory contains information about the removed memory
 		from the specific NUMA node.
 What:		/sys/kernel/debug/powerpc/memtrace/<node-id>/size
 Date:		Aug 2017
 KernelVersion:	4.14
 Contact:	linuxppc-dev@lists.ozlabs.org
 Description:	This contains the size of the memory removed from the node.
 What:		/sys/kernel/debug/powerpc/memtrace/<node-id>/start
 Date:		Aug 2017
 KernelVersion:	4.14
 Contact:	linuxppc-dev@lists.ozlabs.org
 Description:	This contains the start address of the removed memory.
 What:		/sys/kernel/debug/powerpc/memtrace/<node-id>/trace
 Date:		Aug 2017
 KernelVersion:	4.14
 Contact:	linuxppc-dev@lists.ozlabs.org
 Description:	This is where the hardware trace macro will output the trace
 		it generates.

31

Documentation/ABI/testing/procfs-smaps_rollup

@@ -1,31 +0,0 @@
 What:		/proc/pid/smaps_rollup
 Date:		August 2017
 Contact:	Daniel Colascione <dancol@google.com>
 Description:
 		This file provides pre-summed memory information for a
 		process.  The format is identical to /proc/pid/smaps,
 		except instead of an entry for each VMA in a process,
 		smaps_rollup has a single entry (tagged "[rollup]")
 		for which each field is the sum of the corresponding
 		fields from all the maps in /proc/pid/smaps.
 		For more details, see the procfs man page.
 		Typical output looks like this:
 		00100000-ff709000 ---p 00000000 00:00 0		 [rollup]
 		Rss:		     884 kB
 		Pss:		     385 kB
 		Shared_Clean:	     696 kB
 		Shared_Dirty:	       0 kB
 		Private_Clean:	     120 kB
 		Private_Dirty:	      68 kB
 		Referenced:	     884 kB
 		Anonymous:	      68 kB
 		LazyFree:	       0 kB
 		AnonHugePages:	       0 kB
 		ShmemPmdMapped:	       0 kB
 		Shared_Hugetlb:	       0 kB
 		Private_Hugetlb:       0 kB
 		Swap:		       0 kB
 		SwapPss:	       0 kB
 		Locked:		     385 kB
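
Because `smaps_rollup` keeps the `Name: value kB` field layout of `/proc/pid/smaps`, the sample output above can be turned into a dictionary with a few lines of user-space code. This is a sketch under that assumption; the function name is illustrative, and the embedded sample reuses a subset of the values shown in the description.

```python
import re

# A field line looks like "Rss:   884 kB"; the header line does not match.
_FIELD = re.compile(r"^\s*(\w+):\s+(\d+) kB\s*$")


def parse_rollup(text):
    """Return {field name: size in kB} for every 'Name: N kB' line."""
    fields = {}
    for line in text.splitlines():
        m = _FIELD.match(line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


# Subset of the typical output quoted in the ABI description.
SAMPLE = """\
00100000-ff709000 ---p 00000000 00:00 0          [rollup]
Rss:                 884 kB
Pss:                 385 kB
Shared_Clean:        696 kB
Shared_Dirty:          0 kB
Private_Clean:       120 kB
Private_Dirty:        68 kB
"""

fields = parse_rollup(SAMPLE)
print(fields["Rss"], fields["Pss"])
```

For a live process on a kernel that provides the file, the same function can be fed `open(f"/proc/{pid}/smaps_rollup").read()`.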

10

Documentation/ABI/testing/sysfs-block

@@ -213,8 +213,14 @@ What:		/sys/block/<disk>/queue/discard_zeroes_data
 Date:		May 2011
 Contact:	Martin K. Petersen <martin.petersen@oracle.com>
 Description:
 		Will always return 0.  Don't rely on any specific behavior
 		for discards, and don't read this file.
 		Devices that support discard functionality may return
 		stale or random data when a previously discarded block
 		is read back. This can cause problems if the filesystem
 		expects discarded blocks to be explicitly cleared. If a
 		device reports that it deterministically returns zeroes
 		when a discarded area is read the discard_zeroes_data
 		parameter will be set to one. Otherwise it will be 0 and
 		the result of reading a discarded area is undefined.
 What:		/sys/block/<disk>/queue/write_same_max_bytes
 Date:		January 2012

8

Documentation/ABI/testing/sysfs-block-zram

@@ -90,11 +90,3 @@ Description:
 		device's debugging info useful for kernel developers. Its
 		format is not documented intentionally and may change
 		anytime without any notice.
 What:		/sys/block/zram<id>/backing_dev
 Date:		June 2017
 Contact:	Minchan Kim <minchan@kernel.org>
 Description:
 		The backing_dev file is read-write and set up backing
 		device for zram to write incompressible pages.
 		For using, user should enable CONFIG_ZRAM_WRITEBACK.

38

Documentation/ABI/testing/sysfs-bus-fsi

@@ -1,38 +0,0 @@
 What:           /sys/bus/platform/devices/fsi-master/rescan
 Date:		May 2017
 KernelVersion:  4.12
 Contact:        cbostic@linux.vnet.ibm.com
 Description:
                 Initiates a FSI master scan for all connected slave devices
 		on its links.
 What:           /sys/bus/platform/devices/fsi-master/break
 Date:		May 2017
 KernelVersion:  4.12
 Contact:        cbostic@linux.vnet.ibm.com
 Description:
 		Sends an FSI BREAK command on a master's communication
 		link to any connnected slaves.  A BREAK resets connected
 		device's logic and preps it to receive further commands
 		from the master.
 What:           /sys/bus/platform/devices/fsi-master/slave@00:00/term
 Date:		May 2017
 KernelVersion:  4.12
 Contact:        cbostic@linux.vnet.ibm.com
 Description:
 		Sends an FSI terminate command from the master to its
 		connected slave. A terminate resets the slave's state machines
 		that control access to the internally connected engines.  In
 		addition the slave freezes its internal error register for
 		debugging purposes.  This command is also needed to abort any
 		ongoing operation in case of an expired 'Master Time Out'
 		timer.
 What:           /sys/bus/platform/devices/fsi-master/slave@00:00/raw
 Date:		May 2017
 KernelVersion:  4.12
 Contact:        cbostic@linux.vnet.ibm.com
 Description:
 		Provides a means of reading/writing a 32 bit value from/to a
 		specified FSI bus address.

41

Documentation/ABI/testing/sysfs-bus-iio

@@ -32,7 +32,7 @@ Description:
 		Description of the physical chip / device for device X.
 		Typically a part number.
 What:		/sys/bus/iio/devices/iio:deviceX/current_timestamp_clock
 What:		/sys/bus/iio/devices/iio:deviceX/timestamp_clock
 KernelVersion:	4.5
 Contact:	linux-iio@vger.kernel.org
 Description:
@@ -55,7 +55,6 @@ Description:
 		then it is to be found in the base device directory.
 What:		/sys/bus/iio/devices/iio:deviceX/sampling_frequency_available
 What:		/sys/bus/iio/devices/iio:deviceX/in_proximity_sampling_frequency_available
 What:		/sys/.../iio:deviceX/buffer/sampling_frequency_available
 What:		/sys/bus/iio/devices/triggerX/sampling_frequency_available
 KernelVersion:	2.6.35
@@ -119,15 +118,6 @@ Description:
 		unique to allow association with event codes. Units after
 		application of scale and offset are milliamps.
 What:		/sys/bus/iio/devices/iio:deviceX/in_powerY_raw
 KernelVersion:	4.5
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Raw (unscaled no bias removal etc.) power measurement from
 		channel Y. The number must always be specified and
 		unique to allow association with event codes. Units after
 		application of scale and offset are milliwatts.
 What:		/sys/bus/iio/devices/iio:deviceX/in_capacitanceY_raw
 KernelVersion:	3.2
 Contact:	linux-iio@vger.kernel.org
@@ -1434,17 +1424,6 @@ Description:
 		guarantees that the hardware fifo is flushed to the device
 		buffer.
 What:		/sys/bus/iio/devices/iio:device*/buffer/hwfifo_timeout
 KernelVersion:	4.12
 Contact:	linux-iio@vger.kernel.org
 Description:
 		A read/write property to provide capability to delay reporting of
 		samples till a timeout is reached. This allows host processors to
 		sleep, while the sensor is storing samples in its internal fifo.
 		The maximum timeout in seconds can be specified by setting
 		hwfifo_timeout.The current delay can be read by reading
 		hwfifo_timeout. A value of 0 means that there is no timeout.
 What:		/sys/bus/iio/devices/iio:deviceX/buffer/hwfifo_watermark
 KernelVersion: 4.2
 Contact:	linux-iio@vger.kernel.org
@@ -1614,7 +1593,7 @@ Description:
 		can be processed to siemens per meter.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_raw
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Raw counter device counts from channel Y. For quadrature
@@ -1622,24 +1601,10 @@ Description:
 		the counts of a single quadrature signal phase from channel Y.
 What:		/sys/bus/iio/devices/iio:deviceX/in_indexY_raw
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Raw counter device index value from channel Y. This attribute
 		provides an absolute positional reference (e.g. a pulse once per
 		revolution) which may be used to home positional systems as
 		required.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_count_direction_available
 KernelVersion:	4.12
 Contact:	linux-iio@vger.kernel.org
 Description:
 		A list of possible counting directions which are:
 		- "up"	: counter device is increasing.
 		- "down": counter device is decreasing.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_count_direction
 KernelVersion:	4.12
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Raw counter device counters direction for channel Y.

17

Documentation/ABI/testing/sysfs-bus-iio-adc-max9611

@@ -1,17 +0,0 @@
 What:		/sys/bus/iio/devices/iio:deviceX/in_power_shunt_resistor
 Date:		March 2017
 KernelVersion:	4.12
 Contact:	linux-iio@vger.kernel.org
 Description: 	The value of the shunt resistor used to compute power drain on
                 common input voltage pin (RS+). In Ohms.
 What:		/sys/bus/iio/devices/iio:deviceX/in_current_shunt_resistor
 Date:		March 2017
 KernelVersion:	4.12
 Contact:	linux-iio@vger.kernel.org
 Description: 	The value of the shunt resistor used to compute current flowing
                 between RS+ and RS- voltage sense inputs. In Ohms.
 These attributes describe a single physical component, exposed as two distinct
 attributes as it is used to calculate two different values: power load and
 current flowing between RS+ and RS- inputs.

24

Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8

@@ -1,16 +1,24 @@
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_count_direction_available
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_count_mode_available
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_noise_error_available
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
 What:		/sys/bus/iio/devices/iio:deviceX/in_index_index_polarity_available
 What:		/sys/bus/iio/devices/iio:deviceX/in_index_synchronous_mode_available
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Discrete set of available values for the respective counter
 		configuration are listed in this file.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_count_direction
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Read-only attribute that indicates whether the counter for
 		channel Y is counting up or down.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_count_mode
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Count mode for channel Y. Four count modes are available:
@@ -44,7 +52,7 @@ Description:
 			continuously throughout.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_noise_error
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Read-only attribute that indicates whether excessive noise is
@@ -52,14 +60,14 @@ Description:
 		irrelevant in non-quadrature clock mode.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_preset
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		If the counter device supports preset registers, the preset
 		count for channel Y is provided by this attribute.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_quadrature_mode
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Configure channel Y counter for non-quadrature or quadrature
@@ -80,7 +88,7 @@ Description:
 			decoded for UP/DN clock.
 What:		/sys/bus/iio/devices/iio:deviceX/in_countY_set_to_preset_on_index
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Whether to set channel Y counter with channel Y preset value
@@ -88,14 +96,14 @@ Description:
 		Valid attribute values are boolean.
 What:		/sys/bus/iio/devices/iio:deviceX/in_indexY_index_polarity
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Active level of channel Y index input; irrelevant in
 		non-synchronous load mode.
 What:		/sys/bus/iio/devices/iio:deviceX/in_indexY_synchronous_mode
 KernelVersion:	4.10
 KernelVersion:	4.9
 Contact:	linux-iio@vger.kernel.org
 Description:
 		Configure channel Y counter for non-synchronous or synchronous

57

Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32

@@ -1,57 +0,0 @@
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_preset
 KernelVersion:	4.13
 Contact:	fabrice.gasnier@st.com
 Description:
 		Reading returns the current preset value. Writing sets the
 		preset value. Encoder counts continuously from 0 to preset
 		value, depending on direction (up/down).
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
 KernelVersion:	4.13
 Contact:	fabrice.gasnier@st.com
 Description:
 		Reading returns the list possible quadrature modes.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
 KernelVersion:	4.13
 Contact:	fabrice.gasnier@st.com
 Description:
 		Configure the device counter quadrature modes:
 		- non-quadrature:
 			Encoder IN1 input servers as the count input (up
 			direction).
 		- quadrature:
 			Encoder IN1 and IN2 inputs are mixed to get direction
 			and count.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_polarity_available
 KernelVersion:	4.13
 Contact:	fabrice.gasnier@st.com
 Description:
 		Reading returns the list possible active edges.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_polarity
 KernelVersion:	4.13
 Contact:	fabrice.gasnier@st.com
 Description:
 		Configure the device encoder/counter active edge:
 		- rising-edge
 		- falling-edge
 		- both-edges
 		In non-quadrature mode, device counts up on active edge.
 		In quadrature mode, encoder counting scenarios are as follows:
 		----------------------------------------------------------------
 		| Active  | Level on |      IN1 signal    |     IN2 signal     |
 		| edge    | opposite |------------------------------------------
 		|         | signal   |  Rising  | Falling |  Rising  | Falling |
 		----------------------------------------------------------------
 		| Rising  | High ->  |   Down   |    -    |    Up    |    -    |
 		| edge    | Low  ->  |    Up    |    -    |   Down   |    -    |
 		----------------------------------------------------------------
 		| Falling | High ->  |    -     |    Up   |    -     |   Down  |
 		| edge    | Low  ->  |    -     |   Down  |    -     |    Up   |
 		----------------------------------------------------------------
 		| Both    | High ->  |   Down   |    Up   |    Up    |   Down  |
 		| edges   | Low  ->  |    Up    |   Down  |   Down   |    Up   |
 		----------------------------------------------------------------
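The quadrature table above can be read as a lookup from (active edge, level on the opposite signal, signal, transition) to a counting direction. The following is a minimal sketch of that lookup, not driver code; the function name `count_direction` and the lowercase string encodings for levels and transitions are assumptions made for illustration.

```python
# Illustrative encoding of the quadrature counting table above.
# Keys are (level on the opposite signal, signal that toggled).
_ON_RISING = {
    ("high", "IN1"): "down",
    ("low", "IN1"): "up",
    ("high", "IN2"): "up",
    ("low", "IN2"): "down",
}
_ON_FALLING = {
    ("high", "IN1"): "up",
    ("low", "IN1"): "down",
    ("high", "IN2"): "down",
    ("low", "IN2"): "up",
}


def count_direction(polarity, opposite_level, signal, transition):
    """Return "up", "down", or None when the edge is ignored.

    polarity is one of the in_count0_polarity values listed above;
    the other argument encodings are hypothetical.
    """
    if polarity not in ("rising-edge", "falling-edge", "both-edges"):
        raise ValueError("unknown polarity: %r" % (polarity,))
    if transition == "rising" and polarity != "falling-edge":
        return _ON_RISING[(opposite_level, signal)]
    if transition == "falling" and polarity != "rising-edge":
        return _ON_FALLING[(opposite_level, signal)]
    return None  # a "-" cell in the table
```

The "both-edges" rows are the union of the rising and falling rows, which is why two small tables are enough.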

1

Documentation/ABI/testing/sysfs-bus-iio-meas-spec

@@ -5,3 +5,4 @@ Description:
                 Reading returns either '1' or '0'. '1' means that the
                 battery level supplied to sensor is below 2.25V.
                 This ABI is available for tsys02d, htu21, ms8607
 		This ABI is available for htu21, ms8607

8

Documentation/ABI/testing/sysfs-bus-iio-proximity-as3935

@@ -14,11 +14,3 @@ Description:
 		Show or set the gain boost of the amp, from 0-31 range.
 		18 = indoors (default)
 		14 = outdoors
 What		/sys/bus/iio/devices/iio:deviceX/noise_level_tripped
 Date:		May 2017
 KernelVersion:	4.13
 Contact:	Matt Ranostay <matt.ranostay@konsulko.com>
 Description:
 		When 1 the noise level is over the trip level and not reporting
 		valid data

134

Documentation/ABI/testing/sysfs-bus-iio-timer-stm32

@@ -3,67 +3,15 @@ KernelVersion:	4.11
 Contact:	benjamin.gaignard@st.com
 Description:
 		Reading returns the list possible master modes which are:
 		- "reset"     :	The UG bit from the TIMx_EGR register is
 				used as trigger output (TRGO).
 		- "enable"    : The Counter Enable signal CNT_EN is used
 				as trigger output.
 		- "reset"     :	The UG bit from the TIMx_EGR register is used as trigger output (TRGO).
 		- "enable"    : The Counter Enable signal CNT_EN is used as trigger output.
 		- "update"    : The update event is selected as trigger output.
 				For instance a master timer can then be used
 				as a prescaler for a slave timer.
 		- "compare_pulse" : The trigger output send a positive pulse
 				    when the CC1IF flag is to be set.
 				For instance a master timer can then be used as a prescaler for a slave timer.
 		- "compare_pulse" : The trigger output send a positive pulse when the CC1IF flag is to be set.
 		- "OC1REF"    : OC1REF signal is used as trigger output.
 		- "OC2REF"    : OC2REF signal is used as trigger output.
 		- "OC3REF"    : OC3REF signal is used as trigger output.
 		- "OC4REF"    : OC4REF signal is used as trigger output.
 		Additional modes (on TRGO2 only):
 		- "OC5REF"    : OC5REF signal is used as trigger output.
 		- "OC6REF"    : OC6REF signal is used as trigger output.
 		- "compare_pulse_OC4REF":
 		  OC4REF rising or falling edges generate pulses.
 		- "compare_pulse_OC6REF":
 		  OC6REF rising or falling edges generate pulses.
 		- "compare_pulse_OC4REF_r_or_OC6REF_r":
 		  OC4REF or OC6REF rising edges generate pulses.
 		- "compare_pulse_OC4REF_r_or_OC6REF_f":
 		  OC4REF rising or OC6REF falling edges generate pulses.
 		- "compare_pulse_OC5REF_r_or_OC6REF_r":
 		  OC5REF or OC6REF rising edges generate pulses.
 		- "compare_pulse_OC5REF_r_or_OC6REF_f":
 		  OC5REF rising or OC6REF falling edges generate pulses.
 		+-----------+   +-------------+            +---------+
 		| Prescaler +-> | Counter     |        +-> | Master  | TRGO(2)
 		+-----------+   +--+--------+-+        |-> | Control +-->
 		                   |        |          ||  +---------+
 		                +--v--------+-+ OCxREF ||  +---------+
 		                | Chx compare +----------> | Output  | ChX
 		                +-----------+-+         |  | Control +-->
 		                      .     |           |  +---------+
 		                      .     |           |    .
 		                +-----------v-+ OC6REF  |    .
 		                | Ch6 compare +---------+>
 		                +-------------+
 		Example with: "compare_pulse_OC4REF_r_or_OC6REF_r":
 		                X
 		              X   X
 		            X .   . X
 		          X   .   .   X
 		        X     .   .     X
 		count X .     .   .     . X
 		        .     .   .     .
 		        .     .   .     .
 		        +---------------+
 		OC4REF  |     .   .     |
 		      +-+     .   .     +-+
 		        .     +---+     .
 		OC6REF  .     |   |     .
 		      +-------+   +-------+
 		        +-+   +-+
 		TRGO2   | |   | |
 		      +-+ +---+ +---------+
 What:		/sys/bus/iio/devices/triggerX/master_mode
 KernelVersion:	4.11
@@ -79,77 +27,3 @@ Description:
 		Reading returns the current sampling frequency.
 		Writing an value different of 0 set and start sampling.
 		Writing 0 stop sampling.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_preset
 KernelVersion:	4.12
 Contact:	benjamin.gaignard@st.com
 Description:
 		Reading returns the current preset value.
 		Writing sets the preset value.
 		When counting up the counter starts from 0 and fires an
 		event when reach preset value.
 		When counting down the counter start from preset value
 		and fire event when reach 0.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_quadrature_mode_available
 KernelVersion:	4.12
 Contact:	benjamin.gaignard@st.com
 Description:
 		Reading returns the list possible quadrature modes.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_quadrature_mode
 KernelVersion:	4.12
 Contact:	benjamin.gaignard@st.com
 Description:
 		Configure the device counter quadrature modes:
 		channel_A:
 			Encoder A input servers as the count input and B as
 			the UP/DOWN direction control input.
 		channel_B:
 			Encoder B input serves as the count input and A as
 			the UP/DOWN direction control input.
 		quadrature:
 			Encoder A and B inputs are mixed to get direction
 			and count with a scale of 0.25.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_enable_mode_available
 KernelVersion:	4.12
 Contact:	benjamin.gaignard@st.com
 Description:
 		Reading returns the list possible enable modes.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_enable_mode
 KernelVersion:	4.12
 Contact:	benjamin.gaignard@st.com
 Description:
 		Configure the device counter enable modes, in all case
 		counting direction is set by in_count0_count_direction
 		attribute and the counter is clocked by the internal clock.
 		always:
 			Counter is always ON.
 		gated:
 			Counting is enabled when connected trigger signal
 			level is high else counting is disabled.
 		triggered:
 			Counting is enabled on rising edge of the connected
 			trigger, and remains enabled for the duration of this
 			selected mode.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count_trigger_mode_available
 KernelVersion:	4.13
 Contact:	benjamin.gaignard@st.com
 Description:
 		Reading returns the list possible trigger modes.
 What:		/sys/bus/iio/devices/iio:deviceX/in_count0_trigger_mode
 KernelVersion:	4.13
 Contact:	benjamin.gaignard@st.com
 Description:
 		Configure the device counter trigger mode
 		counting direction is set by in_count0_count_direction
 		attribute and the counter is clocked by the connected trigger
 		rising edges.

24

Documentation/ABI/testing/sysfs-bus-pci

@@ -299,27 +299,5 @@ What:		/sys/bus/pci/devices/.../revision
 Date:		November 2016
 Contact:	Emil Velikov <emil.l.velikov@gmail.com>
 Description:
 		This file contains the revision field of the PCI device.
 		This file contains the revision field of the the PCI device.
 		The value comes from device config space. The file is read only.
 What:		/sys/bus/pci/devices/.../sriov_drivers_autoprobe
 Date:		April 2017
 Contact:	Bodong Wang<bodong@mellanox.com>
 Description:
 		This file is associated with the PF of a device that
 		supports SR-IOV.  It determines whether newly-enabled VFs
 		are immediately bound to a driver.  It initially contains
 		1, which means the kernel automatically binds VFs to a
 		compatible driver immediately after they are enabled.  If
 		an application writes 0 to the file before enabling VFs,
 		the kernel will not bind VFs to a driver.
 		A typical use case is to write 0 to this file, then enable
 		VFs, then assign the newly-created VFs to virtual machines.
 		Note that changing this file does not affect already-
 		enabled VFs.  In this scenario, the user must first disable
 		the VFs, write 0 to sriov_drivers_autoprobe, then re-enable
 		the VFs.
 		This is similar to /sys/bus/pci/drivers_autoprobe, but
 		affects only the VFs associated with a specific PF.
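The sequence described for sriov_drivers_autoprobe (disable VFs, write 0, re-enable VFs) can be sketched as below. This is a hedged illustration, not a tool shipped with the kernel: the helper name `enable_vfs_unbound` is hypothetical, and `sriov_numvfs` is assumed to be the attribute used to enable VFs since it is not named in this hunk. The PF directory is passed in so the same code can be pointed at a scratch directory for testing.

```python
from pathlib import Path


def enable_vfs_unbound(pf_dir, num_vfs):
    """Enable num_vfs VFs on a PF without binding them to a driver.

    pf_dir is the PF's sysfs directory, e.g. a path under
    /sys/bus/pci/devices/ (the exact device is left to the caller).
    """
    pf = Path(pf_dir)
    # Changing autoprobe does not affect already-enabled VFs,
    # so disable any existing VFs first.
    (pf / "sriov_numvfs").write_text("0\n")
    # 0 = do not bind newly-enabled VFs to a driver.
    (pf / "sriov_drivers_autoprobe").write_text("0\n")
    # Re-enable the VFs; they stay unbound and can be assigned to VMs.
    (pf / "sriov_numvfs").write_text("%d\n" % num_vfs)
```

Writing to the real attributes requires root privileges and a PF that supports SR-IOV.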

112

Documentation/ABI/testing/sysfs-bus-thunderbolt

@@ -1,112 +0,0 @@
 What: /sys/bus/thunderbolt/devices/.../domainX/security
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute holds current Thunderbolt security level
 		set by the system BIOS. Possible values are:
 		none: All devices are automatically authorized
 		user: Devices are only authorized based on writing
 		      appropriate value to the authorized attribute
 		secure: Require devices that support secure connect at
 			minimum. User needs to authorize each device.
 		dponly: Automatically tunnel Display port (and USB). No
 			PCIe tunnels are created.
 What: /sys/bus/thunderbolt/devices/.../authorized
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute is used to authorize Thunderbolt devices
 		after they have been connected. If the device is not
 		authorized, no devices such as PCIe and Display port are
 		available to the system.
 		Contents of this attribute will be 0 when the device is not
 		yet authorized.
 		Possible values are supported:
 		1: The device will be authorized and connected
 		When key attribute contains 32 byte hex string the possible
 		values are:
 		1: The 32 byte hex string is added to the device NVM and
 		   the device is authorized.
 		2: Send a challenge based on the 32 byte hex string. If the
 		   challenge response from device is valid, the device is
 		   authorized. In case of failure errno will be ENOKEY if
 		   the device did not contain a key at all, and
 		   EKEYREJECTED if the challenge response did not match.
 What: /sys/bus/thunderbolt/devices/.../key
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	When a devices supports Thunderbolt secure connect it will
 		have this attribute. Writing 32 byte hex string changes
 		authorization to use the secure connection method instead.
 		Writing an empty string clears the key and regular connection
 		method can be used again.
 What:		/sys/bus/thunderbolt/devices/.../device
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute contains id of this device extracted from
 		the device DROM.
 What:		/sys/bus/thunderbolt/devices/.../device_name
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute contains name of this device extracted from
 		the device DROM.
 What:		/sys/bus/thunderbolt/devices/.../vendor
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute contains vendor id of this device extracted
 		from the device DROM.
 What:		/sys/bus/thunderbolt/devices/.../vendor_name
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute contains vendor name of this device extracted
 		from the device DROM.
 What:		/sys/bus/thunderbolt/devices/.../unique_id
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	This attribute contains unique_id string of this device.
 		This is either read from hardware registers (UUID on
 		newer hardware) or based on UID from the device DROM.
 		Can be used to uniquely identify particular device.
 What:		/sys/bus/thunderbolt/devices/.../nvm_version
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	If the device has upgradeable firmware the version
 		number is available here. Format: %x.%x, major.minor.
 		If the device is in safe mode reading the file returns
 		-ENODATA instead as the NVM version is not available.
 What:		/sys/bus/thunderbolt/devices/.../nvm_authenticate
 Date:		Sep 2017
 KernelVersion:	4.13
 Contact:	thunderbolt-software@lists.01.org
 Description:	When new NVM image is written to the non-active NVM
 		area (through non_activeX NVMem device), the
 		authentication procedure is started by writing 1 to
 		this file. If everything goes well, the device is
 		restarted with the new NVM firmware. If the image
 		verification fails an error code is returned instead.
 		When read holds status of the last authentication
 		operation if an error occurred during the process. This
 		is directly the status value from the DMA configuration
 		based mailbox before the device is power cycled. Writing
 		0 here clears the status.
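The interaction between the key and authorized attributes described above can be sketched as follows. This is an illustration under stated assumptions rather than a reference client: the helper name `authorize` is hypothetical, and the 32 byte key is assumed to be written as 64 hexadecimal characters. The device directory is a parameter so the sketch can be exercised against a scratch directory.

```python
import string
from pathlib import Path

KEY_HEX_CHARS = 64  # assumption: a 32 byte key written as 64 hex characters


def authorize(device_dir, key_hex=None, challenge=False):
    """Authorize a device and return the value written to 'authorized'.

    Without a key, 1 is written (plain authorization).
    With a key, 1 stores the key and authorizes, 2 sends a challenge.
    """
    dev = Path(device_dir)
    if key_hex is None:
        if challenge:
            raise ValueError("a challenge requires a key")
        value = "1"
    else:
        valid = set(key_hex) <= set(string.hexdigits)
        if len(key_hex) != KEY_HEX_CHARS or not valid:
            raise ValueError("key must be 64 hexadecimal characters")
        (dev / "key").write_text(key_hex)
        value = "2" if challenge else "1"
    (dev / "authorized").write_text(value)
    return value
```

On real hardware a failed challenge surfaces as an error on the write to authorized (ENOKEY or EKEYREJECTED, as described above), which the caller would see as an `OSError`.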

13

Documentation/ABI/testing/sysfs-bus-usb-lvstest

@@ -45,16 +45,3 @@ Contact:	Pratyush Anand <pratyush.anand@gmail.com>
 Description:
 		Write to this node to issue "U3 exit" for Link Layer
 		Validation device. It is needed for TD.7.36.
 What:		/sys/bus/usb/devices/.../enable_compliance
 Date:		July 2017
 Description:
 		Write to this node to set the port to compliance mode to test
 		with Link Layer Validation device. It is needed for TD.7.34.
 What:		/sys/bus/usb/devices/.../warm_reset
 Date:		July 2017
 Description:
 		Write to this node to issue "Warm Reset" for Link Layer Validation
 		device. It may be needed to properly reset an xHCI 1.1 host port if
 		compliance mode needed to be explicitly enabled.

4

Documentation/ABI/testing/sysfs-class-cxl

@@ -69,9 +69,7 @@ Date:           September 2014
 Contact:        linuxppc-dev@lists.ozlabs.org
 Description:    read/write
                 Set the mode for prefaulting in segments into the segment table
                 when performing the START_WORK ioctl. Only applicable when
                 running under hashed page table mmu.
                 Possible values:
                 when performing the START_WORK ioctl. Possible values:
                         none: No prefaulting (default)
                         work_element_descriptor: Treat the work element
                                  descriptor as an effective address and

6

Documentation/ABI/testing/sysfs-class-mtd

@@ -229,6 +229,6 @@ KernelVersion:	4.1
 Contact:	linux-mtd@lists.infradead.org
 Description:
 		For a partition, the offset of that partition from the start
 		of the parent (another partition or a flash device) in bytes.
 		This attribute is absent on flash devices, so it can be used
 		to distinguish them from partitions.
 		of the master device in bytes. This attribute is absent on
 		main devices, so it can be used to distinguish between
 		partitions and devices that aren't partitions.

16

Documentation/ABI/testing/sysfs-class-mux

@@ -1,16 +0,0 @@
 What:		/sys/class/mux/
 Date:		April 2017
 KernelVersion:	4.13
 Contact:	Peter Rosin <peda@axentia.se>
 Description:
 		The mux/ class sub-directory belongs to the Generic MUX
 		Framework and provides a sysfs interface for using MUX
 		controllers.
 What:		/sys/class/mux/muxchipN/
 Date:		April 2017
 KernelVersion:	4.13
 Contact:	Peter Rosin <peda@axentia.se>
 Description:
 		A /sys/class/mux/muxchipN directory is created for each
 		probed MUX chip where N is a simple enumeration.

8

Documentation/ABI/testing/sysfs-class-net

@@ -251,11 +251,3 @@ Contact:	netdev@vger.kernel.org
 Description:
 		Indicates the unique physical switch identifier of a switch this
 		port belongs to, as a string.
 What:		/sys/class/net/<iface>/phydev
 Date:		May 2017
 KernelVersion:	4.13
 Contact:	netdev@vger.kernel.org
 Description:
 		Symbolic link to the PHY device this network device is attached
 		to.

36

Documentation/ABI/testing/sysfs-class-net-phydev

@@ -1,36 +0,0 @@
 What:		/sys/class/mdio_bus/<bus>/<device>/attached_dev
 Date:		May 2017
 KernelVersion:	4.13
 Contact:	netdev@vger.kernel.org
 Description:
 		Symbolic link to the network device this PHY device is
 		attached to.
 What:		/sys/class/mdio_bus/<bus>/<device>/phy_has_fixups
 Date:		February 2014
 KernelVersion:	3.15
 Contact:	netdev@vger.kernel.org
 Description:
 		Boolean value indicating whether the PHY device has
 		any fixups registered against it (phy_register_fixup)
 What:		/sys/class/mdio_bus/<bus>/<device>/phy_id
 Date:		November 2012
 KernelVersion:	3.8
 Contact:	netdev@vger.kernel.org
 Description:
 		32-bit hexadecimal value corresponding to the PHY device's OUI,
 		model and revision number.
 What:		/sys/class/mdio_bus/<bus>/<device>/phy_interface
 Date:		February 2014
 KernelVersion:	3.15
 Contact:	netdev@vger.kernel.org
 Description:
 		String value indicating the PHY interface, possible
 		values are:.
 		<empty> (not available), mii, gmii, sgmii, tbi, rev-mii,
 		rmii, rgmii, rgmii-id, rgmii-rxid, rgmii-txid, rtbi, smii
 		xgmii, moca, qsgmii, trgmii, 1000base-x, 2500base-x, rxaui,
 		xaui, 10gbase-kr, unknown

27

Documentation/ABI/testing/sysfs-class-net-qmi

@@ -21,30 +21,3 @@ Description:
 		is responsible for coordination of driver and firmware
 		link framing mode, changing this setting to 'Y' if the
 		firmware is configured for 'raw-ip' mode.
 What:		/sys/class/net/<iface>/qmi/add_mux
 Date:		March 2017
 KernelVersion:	4.11
 Contact:	Bjørn Mork <bjorn@mork.no>
 Description:
 		Unsigned integer.
 		Write a number ranging from 1 to 127 to add a qmap mux
 		based network device, supported by recent Qualcomm based
 		modems.
 		The network device will be called qmimux.
 		Userspace is in charge of managing the qmux network device
 		activation and data stream setup on the modem side by
 		using the proper QMI protocol requests.
 What:		/sys/class/net/<iface>/qmi/del_mux
 Date:		March 2017
 KernelVersion:	4.11
 Contact:	Bjørn Mork <bjorn@mork.no>
 Description:
 		Unsigned integer.
 		Write a number ranging from 1 to 127 to delete a previously
 		created qmap mux based network device.

17

Documentation/ABI/testing/sysfs-class-power-twl4030

@@ -1,3 +1,20 @@
 What: /sys/class/power_supply/twl4030_ac/max_current
       /sys/class/power_supply/twl4030_usb/max_current
 Description:
 	Read/Write limit on current which may
 	be drawn from the ac (Accessory Charger) or
 	USB port.
 	Value is in micro-Amps.
 	Value is set automatically to an appropriate
 	value when a cable is plugged or unplugged.
 	Value can the set by writing to the attribute.
 	The change will only persist until the next
 	plug event.  These event are reported via udev.
 What: /sys/class/power_supply/twl4030_usb/mode
 Description:
 	Changing mode for USB port.

4

Documentation/ABI/testing/sysfs-class-remoteproc

@@ -1,6 +1,6 @@
 What:		/sys/class/remoteproc/.../firmware
 Date:		October 2016
 Contact:	Matt Redfearn <matt.redfearn@mips.com>
 Contact:	Matt Redfearn <matt.redfearn@imgtec.com>
 Description:	Remote processor firmware
 		Reports the name of the firmware currently loaded to the
@@ -11,7 +11,7 @@ Description:	Remote processor firmware
 What:		/sys/class/remoteproc/.../state
 Date:		October 2016
 Contact:	Matt Redfearn <matt.redfearn@mips.com>
 Contact:	Matt Redfearn <matt.redfearn@imgtec.com>
 Description:	Remote processor state
 		Reports the state of the remote processor, which will be one of:

96

Documentation/ABI/testing/sysfs-class-switchtec

@@ -1,96 +0,0 @@
 switchtec - Microsemi Switchtec PCI Switch Management Endpoint
 For details on this subsystem look at Documentation/switchtec.txt.
 What: 		/sys/class/switchtec
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	The switchtec class subsystem folder.
 		Each registered switchtec driver is represented by a switchtecX
 		subfolder (X being an integer >= 0).
 What:		/sys/class/switchtec/switchtec[0-9]+/component_id
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Component identifier as stored in the hardware (eg. PM8543)
 		(read only)
 Values: 	arbitrary string.
 What:		/sys/class/switchtec/switchtec[0-9]+/component_revision
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Component revision stored in the hardware (read only)
 Values: 	integer.
 What:		/sys/class/switchtec/switchtec[0-9]+/component_vendor
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Component vendor as stored in the hardware (eg. MICROSEM)
 		(read only)
 Values: 	arbitrary string.
 What:		/sys/class/switchtec/switchtec[0-9]+/device_version
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Device version as stored in the hardware (read only)
 Values: 	integer.
 What:		/sys/class/switchtec/switchtec[0-9]+/fw_version
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Currently running firmware version (read only)
 Values: 	integer (in hexadecimal).
 What:		/sys/class/switchtec/switchtec[0-9]+/partition
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Partition number for this device in the switch (read only)
 Values: 	integer.
 What:		/sys/class/switchtec/switchtec[0-9]+/partition_count
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Total number of partitions in the switch (read only)
 Values: 	integer.
 What:		/sys/class/switchtec/switchtec[0-9]+/product_id
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Product identifier as stored in the hardware (eg. PSX 48XG3)
 		(read only)
 Values: 	arbitrary string.
 What:		/sys/class/switchtec/switchtec[0-9]+/product_revision
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Product revision stored in the hardware (eg. RevB)
 		(read only)
 Values: 	arbitrary string.
 What:		/sys/class/switchtec/switchtec[0-9]+/product_vendor
 Date:		05-Jan-2017
 KernelVersion:	v4.11
 Contact:	Logan Gunthorpe <logang@deltatee.com>
 Description:	Product vendor as stored in the hardware (eg. MICROSEM)
 		(read only)
 Values: 	arbitrary string.

291

Documentation/ABI/testing/sysfs-class-typec

@@ -1,291 +0,0 @@
 USB Type-C port devices (eg. /sys/class/typec/port0/)
 What:		/sys/class/typec/<port>/data_role
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		The supported USB data roles. This attribute can be used for
 		requesting data role swapping on the port. Swapping is supported
 		as synchronous operation, so write(2) to the attribute will not
 		return until the operation has finished. The attribute is
 		notified about role changes so that poll(2) on the attribute
 		wakes up. Change on the role will also generate uevent
 		KOBJ_CHANGE on the port. The current role is show in brackets,
 		for example "[host] device" when DRP port is in host mode.
 		Valid values: host, device
 What:		/sys/class/typec/<port>/power_role
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		The supported power roles. This attribute can be used to request
 		power role swap on the port when the port supports USB Power
 		Delivery. Swapping is supported as synchronous operation, so
 		write(2) to the attribute will not return until the operation
 		has finished. The attribute is notified about role changes so
 		that poll(2) on the attribute wakes up. Change on the role will
 		also generate uevent KOBJ_CHANGE. The current role is show in
 		brackets, for example "[source] sink" when in source mode.
 		Valid values: source, sink
 What:           /sys/class/typec/<port>/port_type
 Date:           May 2017
 Contact:	Badhri Jagan Sridharan <Badhri@google.com>
 Description:
 		Indicates the type of the port. This attribute can be used for
 		requesting a change in the port type. Port type change is
 		supported as a synchronous operation, so write(2) to the
 		attribute will not return until the operation has finished.
 		Valid values:
 		- source (The port will behave as source only DFP port)
 		- sink (The port will behave as sink only UFP port)
 		- dual (The port will behave as dual-role-data and
 			dual-role-power port)
 What:		/sys/class/typec/<port>/vconn_source
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows is the port VCONN Source. This attribute can be used to
 		request VCONN swap to change the VCONN Source during connection
 		when both the port and the partner support USB Power Delivery.
 		Swapping is supported as synchronous operation, so write(2) to
 		the attribute will not return until the operation has finished.
 		The attribute is notified about VCONN source changes so that
 		poll(2) on the attribute wakes up. Change on VCONN source also
 		generates uevent KOBJ_CHANGE.
 		Valid values:
 		- "no" when the port is not the VCONN Source
 		- "yes" when the port is the VCONN Source
 What:		/sys/class/typec/<port>/power_operation_mode
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows the current power operational mode the port is in. The
 		power operation mode means current level for VBUS. In case USB
 		Power Delivery communication is used for negotiating the levels,
 		power operation mode should show "usb_power_delivery".
 		Valid values:
 		- default
 		- 1.5A
 		- 3.0A
 		- usb_power_delivery
 What:		/sys/class/typec/<port>/preferred_role
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		The user space can notify the driver about the preferred role.
 		It should be handled as enabling of Try.SRC or Try.SNK, as
 		defined in USB Type-C specification, in the port drivers. By
 		default the preferred role should come from the platform.
 		Valid values: source, sink, none (to remove preference)
 What:		/sys/class/typec/<port>/supported_accessory_modes
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Space separated list of accessory modes, defined in the USB
 		Type-C specification, the port supports.
 What:		/sys/class/typec/<port>/usb_power_delivery_revision
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Revision number of the supported USB Power Delivery
 		specification, or 0 when USB Power Delivery is not supported.
 What:		/sys/class/typec/<port>/usb_typec_revision
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Revision number of the supported USB Type-C specification.
 USB Type-C partner devices (eg. /sys/class/typec/port0-partner/)
 What:		/sys/class/typec/<port>-partner/accessory_mode
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows the Accessory Mode name when the partner is an Accessory.
 		The Accessory Modes are defined in USB Type-C Specification.
 What:		/sys/class/typec/<port>-partner/supports_usb_power_delivery
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows if the partner supports USB Power Delivery communication:
 		Valid values: yes, no
 What:		/sys/class/typec/<port>-partner/identity/
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		This directory appears only if the port device driver is capable
 		of showing the result of Discover Identity USB power delivery
 		command. That will not always be possible even when USB power
 		delivery is supported, for example when USB power delivery
 		communication for the port is mostly handled in firmware. If the
 		directory exists, it will have an attribute file for every VDO
 		in Discover Identity command result.
 What:		/sys/class/typec/<port>-partner/identity/id_header
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		ID Header VDO part of Discover Identity command result. The
 		value will show 0 until Discover Identity command result becomes
 		available. The value can be polled.
 What:		/sys/class/typec/<port>-partner/identity/cert_stat
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Cert Stat VDO part of Discover Identity command result. The
 		value will show 0 until Discover Identity command result becomes
 		available. The value can be polled.
 What:		/sys/class/typec/<port>-partner/identity/product
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Product VDO part of Discover Identity command result. The value
 		will show 0 until Discover Identity command result becomes
 		available. The value can be polled.
 USB Type-C cable devices (eg. /sys/class/typec/port0-cable/)
 Note: Electronically Marked Cables will have a device also for one cable plug
 (eg. /sys/class/typec/port0-plug0). If the cable is active and has also SOP
 Double Prime controller (USB Power Delivery specification ch. 2.4) it will have
 second device also for the other plug. Both plugs may have alternate modes as
 described in USB Type-C and USB Power Delivery specifications.
 What:		/sys/class/typec/<port>-cable/type
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows if the cable is active.
 		Valid values: active, passive
 What:		/sys/class/typec/<port>-cable/plug_type
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows type of the plug on the cable:
 		- type-a - Standard A
 		- type-b - Standard B
 		- type-c
 		- captive
 What:		/sys/class/typec/<port>-cable/identity/
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		This directory appears only if the port device driver is capable
 		of showing the result of Discover Identity USB power delivery
 		command. That will not always be possible even when USB power
 		delivery is supported. If the directory exists, it will have an
 		attribute for every VDO returned by Discover Identity command.
 What:		/sys/class/typec/<port>-cable/identity/id_header
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		ID Header VDO part of Discover Identity command result. The
 		value will show 0 until Discover Identity command result becomes
 		available. The value can be polled.
 What:		/sys/class/typec/<port>-cable/identity/cert_stat
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Cert Stat VDO part of Discover Identity command result. The
 		value will show 0 until Discover Identity command result becomes
 		available. The value can be polled.
 What:		/sys/class/typec/<port>-cable/identity/product
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Product VDO part of Discover Identity command result. The value
 		will show 0 until Discover Identity command result becomes
 		available. The value can be polled.
 Alternate Mode devices.
 The alternate modes will have Standard or Vendor ID (SVID) assigned by USB-IF.
 The ports, partners and cable plugs can have alternate modes. A supported SVID
 will consist of a set of modes. Every SVID a port/partner/plug supports will
 have a device created for it, and every supported mode for a supported SVID will
 have its own directory under that device. Below <dev> refers to the device for
 the alternate mode.
 What:		/sys/class/typec/<port|partner|cable>/<dev>/svid
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		The SVID (Standard or Vendor ID) assigned by USB-IF for this
 		alternate mode.
 What:		/sys/class/typec/<port|partner|cable>/<dev>/mode<index>/
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Every supported mode will have its own directory. The name of
 		a mode will be "mode<index>" (for example mode1), where <index>
 		is the actual index to the mode VDO returned by Discover Modes
 		USB power delivery command.
 What:		/sys/class/typec/<port|partner|cable>/<dev>/mode<index>/description
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows description of the mode. The description is optional for
 		the drivers, just like with the Billboard Devices.
 What:		/sys/class/typec/<port|partner|cable>/<dev>/mode<index>/vdo
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows the VDO in hexadecimal returned by Discover Modes command
 		for this mode.
 What:		/sys/class/typec/<port|partner|cable>/<dev>/mode<index>/active
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Shows if the mode is active or not. The attribute can be used
 		for entering/exiting the mode with partners and cable plugs, and
 		with the port alternate modes it can be used for disabling
 		support for specific alternate modes. Entering/exiting modes is
 		supported as synchronous operation so write(2) to the attribute
 		does not return until the enter/exit mode operation has
 		finished. The attribute is notified when the mode is
 		entered/exited so poll(2) on the attribute wakes up.
 		Entering/exiting a mode will also generate uevent KOBJ_CHANGE.
 		Valid values: yes, no
 What:		/sys/class/typec/<port>/<dev>/mode<index>/supported_roles
 Date:		April 2017
 Contact:	Heikki Krogerus <heikki.krogerus@linux.intel.com>
 Description:
 		Space separated list of the supported roles.
 		This attribute is available for the devices describing the
 		alternate modes a port supports, and it will not be exposed with
 		the devices presenting the alternate modes the partners or cable
 		plugs support.
 		Valid values: source, sink
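
The role attributes above mark the current value with brackets, for example "[host] device". A minimal userspace sketch of extracting that value from a string already read from the attribute follows. The helper name typec_current_role() is hypothetical and is not part of any kernel or library API.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed token of a role attribute value, e.g. "host" from
 * "[host] device", into out. Returns 0 on success, or -1 if no complete
 * "[...]" token is found or out is too small to hold it.
 */
static int typec_current_role(const char *value, char *out, size_t outlen)
{
	const char *start = strchr(value, '[');
	const char *end = start ? strchr(start, ']') : NULL;
	size_t len;

	if (!start || !end)
		return -1;

	len = (size_t)(end - start - 1);	/* characters between '[' and ']' */
	if (len + 1 > outlen)			/* need room for the terminating NUL */
		return -1;

	memcpy(out, start + 1, len);
	out[len] = '\0';
	return 0;
}
```

A caller would read the attribute (for example data_role or power_role) into a buffer and pass that buffer as value. The trailing newline that sysfs normally appends does not affect the result.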

24

Documentation/ABI/testing/sysfs-devices-system-cpu

@@ -366,27 +366,3 @@ Contact:	Linux ARM Kernel Mailing list <linux-arm-kernel@lists.infradead.org>
 Description:	AArch64 CPU registers
 		'identification' directory exposes the CPU ID registers for
 		 identifying model and revision of the CPU.
 What:		/sys/devices/system/cpu/cpu#/cpu_capacity
 Date:		December 2016
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU heterogeneity.
 		cpu_capacity: capacity of cpu#.
 What:		/sys/devices/system/cpu/vulnerabilities
 		/sys/devices/system/cpu/vulnerabilities/meltdown
 		/sys/devices/system/cpu/vulnerabilities/spectre_v1
 		/sys/devices/system/cpu/vulnerabilities/spectre_v2
 		/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
 Date:		January 2018
 Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:	Information about CPU vulnerabilities
 		The files are named after the code names of CPU
 		vulnerabilities. The output of those files reflects the
 		state of the CPUs in the system. Possible output values:
 		"Not affected"	  CPU is not affected by the vulnerability
 		"Vulnerable"	  CPU is affected and no mitigation in effect
 		"Mitigation: $M"  CPU is affected and mitigation $M is in effect
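
The three output forms listed above can be told apart by prefix. The sketch below classifies one line read from a file under /sys/devices/system/cpu/vulnerabilities/. The enum and the function name classify_vulnerability() are hypothetical names chosen for illustration. Prefix matching is an assumption made so that a trailing newline or extra detail does not break the check.

```c
#include <string.h>

enum vuln_state {
	VULN_NOT_AFFECTED,	/* "Not affected" */
	VULN_VULNERABLE,	/* "Vulnerable" */
	VULN_MITIGATED,		/* "Mitigation: $M" */
	VULN_UNKNOWN		/* anything else */
};

/* Classify one line of output by its documented prefix. */
static enum vuln_state classify_vulnerability(const char *line)
{
	static const char not_affected[] = "Not affected";
	static const char vulnerable[] = "Vulnerable";
	static const char mitigation[] = "Mitigation: ";

	if (strncmp(line, not_affected, sizeof(not_affected) - 1) == 0)
		return VULN_NOT_AFFECTED;
	if (strncmp(line, mitigation, sizeof(mitigation) - 1) == 0)
		return VULN_MITIGATED;
	if (strncmp(line, vulnerable, sizeof(vulnerable) - 1) == 0)
		return VULN_VULNERABLE;
	return VULN_UNKNOWN;
}
```

For VULN_MITIGATED, the name of the mitigation is the remainder of the line after the "Mitigation: " prefix.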

8

Documentation/ABI/testing/sysfs-driver-altera-cvp

@@ -1,8 +0,0 @@
 What:		/sys/bus/pci/drivers/altera-cvp/chkcfg
 Date:		May 2017
 Kernel Version:	4.13
 Contact:	Anatolij Gustschin <agust@denx.de>
 Description:
 		Contains either 1 or 0 and controls if configuration
 		error checking in altera-cvp driver is turned on or
 		off.

10

Documentation/ABI/testing/sysfs-firmware-acpi

@@ -44,6 +44,16 @@ Description:
 		or 0 (unset).  Attempts to write any other values to it will
 		cause -EINVAL to be returned.
 What:		/sys/firmware/acpi/hotplug/force_remove
 Date:		May 2013
 Contact:	Rafael J. Wysocki <rafael.j.wysocki@intel.com>
 Description:
 		The number in this file (0 or 1) determines whether (1) or not
 		(0) the ACPI subsystem will allow devices to be hot-removed even
 		if they cannot be put offline gracefully (from the kernel's
 		viewpoint).  That number can be changed by writing a boolean
 		value to this file.
 What:		/sys/firmware/acpi/interrupts/
 Date:		February 2008
 Contact:	Len Brown <lenb@kernel.org>

26

Documentation/ABI/testing/sysfs-firmware-ofw

@@ -1,6 +1,6 @@
 What:		/sys/firmware/devicetree/*
 Date:		November 2013
 Contact:	Grant Likely <grant.likely@arm.com>, devicetree@vger.kernel.org
 Contact:	Grant Likely <grant.likely@linaro.org>
 Description:
 		When using OpenFirmware or a Flattened Device Tree to enumerate
 		hardware, the device tree structure will be exposed in this
@@ -26,27 +26,3 @@ Description:
 		name plus address). Properties are represented as files
 		in the directory. The contents of each file is the exact
 		binary data from the device tree.
 What:		/sys/firmware/fdt
 Date:		February 2015
 KernelVersion:	3.19
 Contact:	Frank Rowand <frowand.list@gmail.com>, devicetree@vger.kernel.org
 Description:
 		Exports the FDT blob that was passed to the kernel by
 		the bootloader. This allows userland applications such
 		as kexec to access the raw binary. This blob is also
 		useful when debugging since it contains any changes
 		made to the blob by the bootloader.
 		The fact that this node does not reside under
 		/sys/firmware/devicetree is deliberate: FDT is also used
 		on arm64 UEFI/ACPI systems to communicate just the UEFI
 		and ACPI entry points, but the FDT is never unflattened
 		and used to configure the system.
 		A CRC32 checksum is calculated over the entire FDT
 		blob, and verified at late_initcall time. The sysfs
 		entry is instantiated only if the checksum is valid,
 		i.e., if the FDT blob has not been modified in the mean
 		time. Otherwise, a warning is printed.
 Users:		kexec, debugging

31

Documentation/ABI/testing/sysfs-firmware-opal-powercap

@@ -1,31 +0,0 @@
 What:		/sys/firmware/opal/powercap
 Date:		August 2017
 Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
 Description:	Powercap directory for Powernv (P8, P9) servers
 		Each folder in this directory contains a
 		power-cappable component.
 What:		/sys/firmware/opal/powercap/system-powercap
 		/sys/firmware/opal/powercap/system-powercap/powercap-min
 		/sys/firmware/opal/powercap/system-powercap/powercap-max
 		/sys/firmware/opal/powercap/system-powercap/powercap-current
 Date:		August 2017
 Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
 Description:	System powercap directory and attributes applicable for
 		Powernv (P8, P9) servers
 		This directory provides powercap information. It
 		contains below sysfs attributes:
 		- powercap-min : This file provides the minimum
 		  possible powercap in Watt units
 		- powercap-max : This file provides the maximum
 		  possible powercap in Watt units
 		- powercap-current : This file provides the current
 		  powercap set on the system. Writing to this file
 		  creates a request for setting a new-powercap. The
 		  powercap requested must be between powercap-min
 		  and powercap-max.

18

Documentation/ABI/testing/sysfs-firmware-opal-psr

@@ -1,18 +0,0 @@
 What:		/sys/firmware/opal/psr
 Date:		August 2017
 Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
 Description:	Power-Shift-Ratio directory for Powernv P9 servers
 		Power-Shift-Ratio allows providing hints to the firmware
 		to shift/throttle power between different entities in
 		the system. Each attribute in this directory indicates
 		a settable PSR.
 What:		/sys/firmware/opal/psr/cpu_to_gpu_X
 Date:		August 2017
 Contact:	Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
 Description:	PSR sysfs attributes for Powernv P9 servers
 		Power-Shift-Ratio between CPU and GPU for a given chip
 		with chip-id X. This file gives the ratio (0-100)
 		which is used by OCC for power-capping.

41

Documentation/ABI/testing/sysfs-fs-f2fs

@@ -57,15 +57,6 @@ Contact:	"Jaegeuk Kim" <jaegeuk.kim@samsung.com>
 Description:
 		 Controls the issue rate of small discard commands.
 What:          /sys/fs/f2fs/<disk>/discard_granularity
 Date:          July 2017
 Contact:       "Chao Yu" <yuchao0@huawei.com>
 Description:
 		Controls the discard granularity of the inner discard thread. The
 		inner thread will not issue discards smaller than the granularity.
 		The unit size is one block; only values in the range [1, 512]
 		are currently supported.
 What:		/sys/fs/f2fs/<disk>/max_victim_search
 Date:		January 2014
 Contact:	"Jaegeuk Kim" <jaegeuk.kim@samsung.com>
@@ -84,7 +75,7 @@ Contact:	"Jaegeuk Kim" <jaegeuk.kim@samsung.com>
 Description:
 		 Controls the memory footprint used by f2fs.
 What:		/sys/fs/f2fs/<disk>/batched_trim_sections
 What:		/sys/fs/f2fs/<disk>/trim_sections
 Date:		February 2015
 Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
 Description:
@@ -121,33 +112,3 @@ Date:		January 2016
 Contact:	"Shuoran Liu" <liushuoran@huawei.com>
 Description:
 		 Shows total written kbytes issued to disk.
 What:		/sys/fs/f2fs/<disk>/inject_rate
 Date:		May 2016
 Contact:	"Sheng Yong" <shengyong1@huawei.com>
 Description:
 		 Controls the injection rate.
 What:		/sys/fs/f2fs/<disk>/inject_type
 Date:		May 2016
 Contact:	"Sheng Yong" <shengyong1@huawei.com>
 Description:
 		 Controls the injection type.
 What:		/sys/fs/f2fs/<disk>/reserved_blocks
 Date:		June 2017
 Contact:	"Chao Yu" <yuchao0@huawei.com>
 Description:
 		 Controls current reserved blocks in system.
 What:		/sys/fs/f2fs/<disk>/gc_urgent
 Date:		August 2017
 Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
 Description:
 		 Do background GC aggressively
 What:		/sys/fs/f2fs/<disk>/gc_urgent_sleep_time
 Date:		August 2017
 Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
 Description:
 		 Controls sleep time of GC urgent mode

23

Documentation/ABI/testing/sysfs-hypervisor-pmu Normal file

@@ -0,0 +1,23 @@
 What:		/sys/hypervisor/pmu/pmu_mode
 Date:		August 2015
 KernelVersion:	4.3
 Contact:	Boris Ostrovsky <boris.ostrovsky@oracle.com>
 Description:
 		Describes mode that Xen's performance-monitoring unit (PMU)
 		uses. Accepted values are
 			"off"  -- PMU is disabled
 			"self" -- The guest can profile itself
 			"hv"   -- The guest can profile itself and, if it is
 				  privileged (e.g. dom0), the hypervisor
 			"all" --  The guest can profile itself, the hypervisor
 				  and all other guests. Only available to
 				  privileged guests.
 What:           /sys/hypervisor/pmu/pmu_features
 Date:           August 2015
 KernelVersion:  4.3
 Contact:        Boris Ostrovsky <boris.ostrovsky@oracle.com>
 Description:
 		Describes Xen PMU features (as an integer). A set bit indicates
 		that the corresponding feature is enabled. See
 		include/xen/interface/xenpmu.h for available features

43

Documentation/ABI/testing/sysfs-hypervisor-xen

@@ -1,43 +0,0 @@
 What:		/sys/hypervisor/guest_type
 Date:		June 2017
 KernelVersion:	4.13
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Type of guest:
 		"Xen": standard guest type on arm
 		"HVM": fully virtualized guest (x86)
 		"PV": paravirtualized guest (x86)
 		"PVH": fully virtualized guest without legacy emulation (x86)
 What:		/sys/hypervisor/pmu/pmu_mode
 Date:		August 2015
 KernelVersion:	4.3
 Contact:	Boris Ostrovsky <boris.ostrovsky@oracle.com>
 Description:	If running under Xen:
 		Describes mode that Xen's performance-monitoring unit (PMU)
 		uses. Accepted values are
 			"off"  -- PMU is disabled
 			"self" -- The guest can profile itself
 			"hv"   -- The guest can profile itself and, if it is
 				  privileged (e.g. dom0), the hypervisor
 			"all" --  The guest can profile itself, the hypervisor
 				  and all other guests. Only available to
 				  privileged guests.
 What:           /sys/hypervisor/pmu/pmu_features
 Date:           August 2015
 KernelVersion:  4.3
 Contact:        Boris Ostrovsky <boris.ostrovsky@oracle.com>
 Description:	If running under Xen:
 		Describes Xen PMU features (as an integer). A set bit indicates
 		that the corresponding feature is enabled. See
 		include/xen/interface/xenpmu.h for available features
 What:		/sys/hypervisor/properties/buildid
 Date:		June 2017
 KernelVersion:	4.13
 Contact:	xen-devel@lists.xenproject.org
 Description:	If running under Xen:
 		Build id of the hypervisor, needed for hypervisor live patching.
 		Might return "<denied>" in case of special security settings
 		in the hypervisor.

8

Documentation/ABI/testing/sysfs-kernel-livepatch

@@ -25,14 +25,6 @@ Description:
 		code is currently applied.  Writing 0 will disable the patch
 		while writing 1 will re-enable the patch.
 What:		/sys/kernel/livepatch/<patch>/transition
 Date:		Feb 2017
 KernelVersion:	4.12.0
 Contact:	live-patching@vger.kernel.org
 Description:
 		An attribute which indicates whether the patch is currently in
 		transition.
 What:		/sys/kernel/livepatch/<patch>/<object>
 Date:		Nov 2014
 KernelVersion:	3.19.0

16

Documentation/ABI/testing/sysfs-kernel-mm-swap

@@ -1,16 +0,0 @@
 What:		/sys/kernel/mm/swap/
 Date:		August 2017
 Contact:	Linux memory management mailing list <linux-mm@kvack.org>
 Description:	Interface for swapping
 What:		/sys/kernel/mm/swap/vma_ra_enabled
 Date:		August 2017
 Contact:	Linux memory management mailing list <linux-mm@kvack.org>
 Description:	Enable/disable VMA based swap readahead.
 		If set to true, the VMA based swap readahead algorithm
 		will be used for swappable anonymous pages mapped in a
 		VMA, and the global swap readahead algorithm will be
 		still used for tmpfs etc. other users.  If set to
 		false, the global swap readahead algorithm will be
 		used for all swappable pages.

9

Documentation/ABI/testing/sysfs-platform-chipidea-usb2

@@ -1,9 +0,0 @@
 What:		/sys/bus/platform/devices/ci_hdrc.0/role
 Date:		Mar 2017
 Contact:	Peter Chen <peter.chen@nxp.com>
 Description:
 		Reading returns the string "gadget" or "host", which indicates
 		the current controller role.
 		Writing "gadget" or "host" switches the role.
 		Only a controller in dual-role configuration supports writing.

8

Documentation/ABI/testing/sysfs-platform-ideapad-laptop

@@ -17,11 +17,3 @@ Description:
 			* 2 -> Dust Cleaning
 			* 4 -> Efficient Thermal Dissipation Mode
 What:		/sys/devices/platform/ideapad/touchpad
 Date:		May 2017
 KernelVersion:	4.13
 Contact:	"Ritesh Raj Sarraf <rrs@debian.org>"
 Description:
 		Control touchpad mode.
 			* 1 -> Switched On
 			* 0 -> Switched Off

15

Documentation/ABI/testing/sysfs-platform-renesas_usb3

@@ -1,15 +0,0 @@
 What:		/sys/devices/platform/<renesas_usb3's name>/role
 Date:		March 2017
 KernelVersion:	4.13
 Contact:	Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
 Description:
 		This file can be read and written.
 		It shows or changes the DRD mode of the USB controller.
 		Write one of the following strings to change the mode:
 		 "host" - switch the mode from peripheral to host.
 		 "peripheral" - switch the mode from host to peripheral.
 		Reading the file returns one of the following strings:
 		 "host" - The mode is host now.
 		 "peripheral" - The mode is peripheral now.

14

Documentation/ABI/testing/sysfs-power

@@ -127,7 +127,7 @@ Description:
 What:		/sys/power/pm_trace_dev_match
 Date:		October 2010
 Contact:	James Hogan <jhogan@kernel.org>
 Contact:	James Hogan <james@albanarts.com>
 Description:
 		The /sys/power/pm_trace_dev_match file contains the name of the
 		device associated with the last PM event point saved in the RTC
@@ -273,15 +273,3 @@ Description:
 		This output is useful for system wakeup diagnostics of spurious
 		wakeup interrupts.
 What:		/sys/power/pm_debug_messages
 Date:		July 2017
 Contact:	Rafael J. Wysocki <rjw@rjwysocki.net>
 Description:
 		The /sys/power/pm_debug_messages file controls the printing
 		of debug messages from the system suspend/hibernation
 		infrastructure to the kernel log.
 		Writing a "1" to this file enables the debug messages and
 		writing a "0" (default) to it disables them.  Reads from
 		this file return the current value.

47

Documentation/ABI/testing/sysfs-uevent

@@ -1,47 +0,0 @@
 What:           /sys/.../uevent
 Date:           May 2017
 KernelVersion:  4.13
 Contact:        Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description:
                 Enable passing additional variables for synthetic uevents that
                 are generated by writing /sys/.../uevent file.
                 Recognized extended format is ACTION [UUID [KEY=VALUE ...]].
                 The ACTION is compulsory - it is the name of the uevent action
                 ("add", "change", "remove"). There is no change compared to
                 previous functionality here. The rest of the extended format
                 is optional.
                 You need to pass UUID first before any KEY=VALUE pairs.
                 The UUID must be in "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
                 format where 'x' is a hex digit. The UUID is considered to be
                 a transaction identifier so it's possible to use the same UUID
                 value for one or more synthetic uevents in which case we
                 logically group these uevents together for any userspace
                 listeners. The UUID value appears in uevent as
                 "SYNTH_UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" environment
                 variable.
                 If UUID is not passed in, the generated synthetic uevent gains
                 "SYNTH_UUID=0" environment variable automatically.
                 The KEY=VALUE pairs can contain alphanumeric characters only.
                 It's possible to define zero or more pairs - each pair is then
                 delimited by a space character ' '. Each pair appears in
                 synthetic uevent as "SYNTH_ARG_KEY=VALUE". That means the KEY
                 name gains "SYNTH_ARG_" prefix to avoid possible collisions
                 with existing variables.
                 Example of valid sequence written to the uevent file:
                     add fe4d7c9d-b8c6-4a70-9ef1-3d8a58d18eed A=1 B=abc
                 This generates synthetic uevent including these variables:
                     ACTION=add
                     SYNTH_ARG_A=1
                     SYNTH_ARG_B=abc
                     SYNTH_UUID=fe4d7c9d-b8c6-4a70-9ef1-3d8a58d18eed
 Users:
                 udev, userspace tools generating synthetic uevents
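
The UUID format described above ("xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", where 'x' is a hex digit) can be checked in userspace before writing to a uevent file. The following is a minimal sketch. The function name synth_uuid_is_valid() is hypothetical, and the kernel's own parsing may differ in detail.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if uuid has the 8-4-4-4-12 hex-digit layout, 0 otherwise.
 * The string must be exactly 36 characters, with '-' at offsets
 * 8, 13, 18 and 23 and hex digits everywhere else.
 */
static int synth_uuid_is_valid(const char *uuid)
{
	size_t i;

	if (strlen(uuid) != 36)
		return 0;

	for (i = 0; i < 36; i++) {
		if (i == 8 || i == 13 || i == 18 || i == 23) {
			if (uuid[i] != '-')
				return 0;
		} else if (!isxdigit((unsigned char)uuid[i])) {
			return 0;
		}
	}
	return 1;
}
```

A tool generating synthetic uevents could run this check on the UUID before assembling the "ACTION UUID KEY=VALUE ..." string.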

184

Documentation/DMA-API-HOWTO.txt

@@ -1,24 +1,22 @@
 =========================
 Dynamic DMA mapping Guide
 =========================
 		     Dynamic DMA mapping Guide
 		     =========================
 :Author: David S. Miller <davem@redhat.com>
 :Author: Richard Henderson <rth@cygnus.com>
 :Author: Jakub Jelinek <jakub@redhat.com>
 		 David S. Miller <davem@redhat.com>
 		 Richard Henderson <rth@cygnus.com>
 		  Jakub Jelinek <jakub@redhat.com>
 This is a guide to device driver writers on how to use the DMA API
 with example pseudo-code.  For a concise description of the API, see
 DMA-API.txt.
 CPU and DMA addresses
 =====================
                        CPU and DMA addresses
 There are several kinds of addresses involved in the DMA API, and it's
 important to understand the differences.
 The kernel normally uses virtual addresses.  Any address returned by
 kmalloc(), vmalloc(), and similar interfaces is a virtual address and can
 be stored in a ``void *``.
 be stored in a "void *".
 The virtual memory system (TLB, page tables, etc.) translates virtual
 addresses to CPU physical addresses, which are stored as "phys_addr_t" or
@@ -39,7 +37,7 @@ be restricted to a subset of that space.  For example, even if a system
 supports 64-bit addresses for main memory and PCI BARs, it may use an IOMMU
 so devices only need to use 32-bit DMA addresses.
 Here's a picture and some examples::
 Here's a picture and some examples:
                CPU                  CPU                  Bus
              Virtual              Physical             Address
@@ -100,16 +98,15 @@ microprocessor architecture. You should use the DMA API rather than the
 bus-specific DMA API, i.e., use the dma_map_*() interfaces rather than the
 pci_map_*() interfaces.
 First of all, you should make sure::
 First of all, you should make sure
 	#include <linux/dma-mapping.h>
 #include <linux/dma-mapping.h>
 is in your driver, which provides the definition of dma_addr_t.  This type
 can hold any valid DMA address for the platform and should be used
 everywhere you hold a DMA address returned from the DMA mapping functions.
 What memory is DMA'able?
 ========================
 			 What memory is DMA'able?
 The first piece of information you must know is what kernel memory can
 be used with the DMA mapping facilities.  There has been an unwritten
@@ -146,8 +143,7 @@ What about block I/O and networking buffers?  The block I/O and
 networking subsystems make sure that the buffers they use are valid
 for you to DMA from/to.
 DMA addressing limitations
 ==========================
 			DMA addressing limitations
 Does your device have any DMA addressing limitations?  For example, is
 your device only capable of driving the low order 24-bits of address?
@@ -170,7 +166,7 @@ style to do this even if your device holds the default setting,
 because this shows that you did think about these issues wrt. your
 device.
 The query is performed via a call to dma_set_mask_and_coherent()::
 The query is performed via a call to dma_set_mask_and_coherent():
 	int dma_set_mask_and_coherent(struct device *dev, u64 mask);
@@ -179,12 +175,12 @@ If you have some special requirements, then the following two separate
 queries can be used instead:
 	The query for streaming mappings is performed via a call to
 	dma_set_mask()::
 	dma_set_mask():
 		int dma_set_mask(struct device *dev, u64 mask);
 	The query for consistent allocations is performed via a call
 	to dma_set_coherent_mask()::
 	to dma_set_coherent_mask():
 		int dma_set_coherent_mask(struct device *dev, u64 mask);
@@ -213,7 +209,7 @@ of your driver reports that performance is bad or that the device is not
 even detected, you can ask them for the kernel messages to find out
 exactly why.
 The standard 32-bit addressing device would do something like this::
 The standard 32-bit addressing device would do something like this:
 	if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32))) {
 		dev_warn(dev, "mydev: No suitable DMA available\n");
@@ -229,7 +225,7 @@ than 64-bit addressing.  For example, Sparc64 PCI SAC addressing is
 more efficient than DAC addressing.
 Here is how you would handle a 64-bit capable device which can drive
 all 64-bits when accessing streaming DMA::
 all 64-bits when accessing streaming DMA:
 	int using_dac;
@@ -243,7 +239,7 @@ all 64-bits when accessing streaming DMA::
 	}
 If a card is capable of using 64-bit consistent allocations as well,
 the case would look like this::
 the case would look like this:
 	int using_dac, consistent_using_dac;
@@ -264,7 +260,7 @@ uses consistent allocations, one would have to check the return value from
 dma_set_coherent_mask().
 Finally, if your device can only drive the low 24-bits of
 address you might do something like::
 address you might do something like:
 	if (dma_set_mask(dev, DMA_BIT_MASK(24))) {
 		dev_warn(dev, "mydev: 24-bit DMA addressing not available\n");
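
The mask arguments used in the examples above can be explored outside the kernel. The userspace sketch below reproduces the DMA_BIT_MASK() definition from <linux/dma-mapping.h> and adds a hypothetical helper, dma_addr_fits(), which is not a kernel function and only illustrates what a mask permits. A 24-bit mask covers exactly the first 16 MB of address space.

```c
#include <stdint.h>

/* Same definition as the kernel macro; n == 64 is special-cased to avoid
 * shifting a 64-bit value by its full width. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: nonzero if every set bit of addr lies within mask. */
static int dma_addr_fits(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```

With these definitions, address 0x00FFFFFF fits DMA_BIT_MASK(24) while 0x01000000 does not, which is the situation a driver describes to the kernel when it calls dma_set_mask() with a 24-bit mask.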
@@ -284,7 +280,7 @@ only provide the functionality which the machine can handle.  It
 is important that the last call to dma_set_mask() be for the
 most specific mask.
 Here is pseudo-code showing how this might be done::
 Here is pseudo-code showing how this might be done:
 	#define PLAYBACK_ADDRESS_BITS	DMA_BIT_MASK(32)
 	#define RECORD_ADDRESS_BITS	DMA_BIT_MASK(24)
@@ -312,8 +308,7 @@ A sound card was used as an example here because this genre of PCI
 devices seems to be littered with ISA chips given a PCI front end,
 and thus retaining the 16MB DMA addressing limitations of ISA.
 Types of DMA mappings
 =====================
 			Types of DMA mappings
 There are two types of DMA mappings:
@@ -341,14 +336,12 @@ There are two types of DMA mappings:
   to memory is immediately visible to the device, and vice
   versa.  Consistent mappings guarantee this.
   .. important::
 	     Consistent DMA memory does not preclude the usage of
 	     proper memory barriers.  The CPU may reorder stores to
   IMPORTANT: Consistent DMA memory does not preclude the usage of
              proper memory barriers.  The CPU may reorder stores to
 	     consistent memory just as it may normal memory.  Example:
 	     if it is important for the device to see the first word
 	     of a descriptor updated before the second, you must do
 	     something like::
 	     something like:
 		desc->word0 = address;
 		wmb();
@@ -384,17 +377,16 @@ Also, systems with caches that aren't DMA-coherent will work better
 when the underlying buffers don't share cache lines with other data.
 Using Consistent DMA mappings
 =============================
 		 Using Consistent DMA mappings.
 To allocate and map large (PAGE_SIZE or so) consistent DMA regions,
 you should do::
 you should do:
 	dma_addr_t dma_handle;
 	cpu_addr = dma_alloc_coherent(dev, size, &dma_handle, gfp);
 where device is a ``struct device *``. This may be called in interrupt
 where device is a struct device *. This may be called in interrupt
 context with the GFP_ATOMIC flag.
 Size is the length of the region you want to allocate, in bytes.
@@ -423,7 +415,7 @@ exists (for example) to guarantee that if you allocate a chunk
 which is smaller than or equal to 64 kilobytes, the extent of the
 buffer you receive will not cross a 64K boundary.
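The boundary guarantee above can be checked with a few lines of arithmetic. The following is a minimal userspace sketch, not kernel code: `crosses_boundary()` is a hypothetical helper introduced here for illustration only, and it treats addresses as plain 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the DMA API): report whether the
 * buffer [start, start + size) straddles a multiple of 'boundary'.
 * 'size' and 'boundary' must both be non-zero.
 */
static bool crosses_boundary(uint64_t start, uint64_t size, uint64_t boundary)
{
	uint64_t last;

	assert(size != 0 && boundary != 0);
	last = start + size - 1;	/* address of the last byte */

	/* First and last byte fall in different boundary-sized windows. */
	return (start / boundary) != (last / boundary);
}
```

A 64K buffer that starts on a 64K-aligned address stays inside one window, which is what the alignment guarantee of dma_alloc_coherent() provides; the same buffer starting at an unaligned address such as 0x18000 would cross.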
 To unmap and free such a DMA region, you call::
 To unmap and free such a DMA region, you call:
 	dma_free_coherent(dev, size, cpu_addr, dma_handle);
@@ -438,7 +430,7 @@ a kmem_cache, but it uses dma_alloc_coherent(), not __get_free_pages().
 Also, it understands common hardware constraints for alignment,
 like queue heads needing to be aligned on N byte boundaries.
 Create a dma_pool like this::
 Create a dma_pool like this:
 	struct dma_pool *pool;
@@ -452,7 +444,7 @@ pass 0 for boundary; passing 4096 says memory allocated from this pool
 must not cross 4KByte boundaries (but at that time it may be better to
 use dma_alloc_coherent() directly instead).
 Allocate memory from a DMA pool like this::
 Allocate memory from a DMA pool like this:
 	cpu_addr = dma_pool_alloc(pool, flags, &dma_handle);
@@ -460,7 +452,7 @@ flags are GFP_KERNEL if blocking is permitted (not in_interrupt nor
 holding SMP locks), GFP_ATOMIC otherwise.  Like dma_alloc_coherent(),
 this returns two values, cpu_addr and dma_handle.
 Free memory that was allocated from a dma_pool like this::
 Free memory that was allocated from a dma_pool like this:
 	dma_pool_free(pool, cpu_addr, dma_handle);
@@ -468,7 +460,7 @@ where pool is what you passed to dma_pool_alloc(), and cpu_addr and
 dma_handle are the values dma_pool_alloc() returned. This function
 may be called in interrupt context.
 Destroy a dma_pool by calling::
 Destroy a dma_pool by calling:
 	dma_pool_destroy(pool);
@@ -476,12 +468,11 @@ Make sure you've called dma_pool_free() for all memory allocated
 from a pool before you destroy the pool. This function may not
 be called in interrupt context.
 DMA Direction
 =============
 			DMA Direction
 The interfaces described in subsequent portions of this document
 take a DMA direction argument, which is an integer and takes on
 one of the following values::
 one of the following values:
  DMA_BIDIRECTIONAL
  DMA_TO_DEVICE
@@ -530,15 +521,14 @@ packets, map/unmap them with the DMA_TO_DEVICE direction
 specifier.  For receive packets, just the opposite, map/unmap them
 with the DMA_FROM_DEVICE direction specifier.
 Using Streaming DMA mappings
 ============================
 		  Using Streaming DMA mappings
 The streaming DMA mapping routines can be called from interrupt
 context.  There are two versions of each map/unmap, one which will
 map/unmap a single memory region, and one which will map/unmap a
 scatterlist.
 To map a single region, you do::
 To map a single region, you do:
 	struct device *dev = &my_dev->dev;
 	dma_addr_t dma_handle;
@@ -555,16 +545,37 @@ To map a single region, you do::
 		goto map_error_handling;
 	}
 and to unmap it::
 and to unmap it:
 	dma_unmap_single(dev, dma_handle, size, direction);
 You should call dma_mapping_error() as dma_map_single() could fail and return
 error.  Doing so will ensure that the mapping code will work correctly on all
 DMA implementations without any dependency on the specifics of the underlying
 implementation. Using the returned address without checking for errors could
 result in failures ranging from panics to silent data corruption.  The same
 applies to dma_map_page() as well.
 error. Not all DMA implementations support the dma_mapping_error() interface.
 However, it is good practice to call the dma_mapping_error() interface, which
 will invoke the generic mapping error check interface. Doing so will ensure
 that the mapping code will work correctly on all DMA implementations without
 any dependency on the specifics of the underlying implementation. Using the
 returned address without checking for errors could result in failures ranging
 from panics to silent data corruption. A couple of examples of incorrect ways
 to check for errors that make assumptions about the underlying DMA
 implementation are as follows and these are applicable to dma_map_page() as
 well.
 Incorrect example 1:
 	dma_addr_t dma_handle;
 	dma_handle = dma_map_single(dev, addr, size, direction);
 	if ((dma_handle & 0xffff != 0) || (dma_handle >= 0x1000000)) {
 		goto map_error;
 	}
 Incorrect example 2:
 	dma_addr_t dma_handle;
 	dma_handle = dma_map_single(dev, addr, size, direction);
 	if (dma_handle == DMA_ERROR_CODE) {
 		goto map_error;
 	}
 You should call dma_unmap_single() when the DMA activity is finished, e.g.,
 from the interrupt which told you that the DMA transfer is done.
@@ -573,7 +584,7 @@ Using CPU pointers like this for single mappings has a disadvantage:
 you cannot reference HIGHMEM memory in this way.  Thus, there is a
 map/unmap interface pair akin to dma_{map,unmap}_single().  These
 interfaces deal with page/offset pairs instead of CPU pointers.
 Specifically::
 Specifically:
 	struct device *dev = &my_dev->dev;
 	dma_addr_t dma_handle;
@@ -603,7 +614,7 @@ error as outlined under the dma_map_single() discussion.
 You should call dma_unmap_page() when the DMA activity is finished, e.g.,
 from the interrupt which told you that the DMA transfer is done.
 With scatterlists, you map a region gathered from several regions by::
 With scatterlists, you map a region gathered from several regions by:
 	int i, count = dma_map_sg(dev, sglist, nents, direction);
 	struct scatterlist *sg;
@@ -627,18 +638,16 @@ Then you should loop count times (note: this can be less than nents times)
 and use sg_dma_address() and sg_dma_len() macros where you previously
 accessed sg->address and sg->length as shown above.
 To unmap a scatterlist, just call::
 To unmap a scatterlist, just call:
 	dma_unmap_sg(dev, sglist, nents, direction);
 Again, make sure DMA activity has already finished.
 .. note::
 	The 'nents' argument to the dma_unmap_sg call must be
 	the _same_ one you passed into the dma_map_sg call,
 	it should _NOT_ be the 'count' value _returned_ from the
 	dma_map_sg call.
 PLEASE NOTE:  The 'nents' argument to the dma_unmap_sg call must be
               the _same_ one you passed into the dma_map_sg call,
 	      it should _NOT_ be the 'count' value _returned_ from the
               dma_map_sg call.
 Every dma_map_{single,sg}() call should have its dma_unmap_{single,sg}()
 counterpart, because the DMA address space is a shared resource and
@@ -650,11 +659,11 @@ properly in order for the CPU and device to see the most up-to-date and
 correct copy of the DMA buffer.
 So, firstly, just map it with dma_map_{single,sg}(), and after each DMA
 transfer call either::
 transfer call either:
 	dma_sync_single_for_cpu(dev, dma_handle, size, direction);
 or::
 or:
 	dma_sync_sg_for_cpu(dev, sglist, nents, direction);
@@ -662,19 +671,17 @@ as appropriate.
 Then, if you wish to let the device get at the DMA area again,
 finish accessing the data with the CPU, and then before actually
 giving the buffer to the hardware call either::
 giving the buffer to the hardware call either:
 	dma_sync_single_for_device(dev, dma_handle, size, direction);
 or::
 or:
 	dma_sync_sg_for_device(dev, sglist, nents, direction);
 as appropriate.
 .. note::
 	      The 'nents' argument to dma_sync_sg_for_cpu() and
 PLEASE NOTE:  The 'nents' argument to dma_sync_sg_for_cpu() and
 	      dma_sync_sg_for_device() must be the same passed to
 	      dma_map_sg(). It is _NOT_ the count returned by
 	      dma_map_sg().
@@ -685,7 +692,7 @@ dma_map_*() call till dma_unmap_*(), then you don't have to call the
 dma_sync_*() routines at all.
 Here is pseudo code which shows a situation in which you would need
 to use the dma_sync_*() interfaces::
 to use the dma_sync_*() interfaces.
 	my_card_setup_receive_buffer(struct my_card *cp, char *buffer, int len)
 	{
@@ -761,8 +768,7 @@ is planned to completely remove virt_to_bus() and bus_to_virt() as
 they are entirely deprecated.  Some ports already do not provide these
 as it is impossible to correctly support them.
 Handling Errors
 ===============
 			Handling Errors
 DMA address space is limited on some architectures and an allocation
 failure can be determined by:
@@ -770,7 +776,7 @@ failure can be determined by:
 - checking if dma_alloc_coherent() returns NULL or dma_map_sg returns 0
 - checking the dma_addr_t returned from dma_map_single() and dma_map_page()
   by using dma_mapping_error()::
   by using dma_mapping_error():
 	dma_addr_t dma_handle;
@@ -788,8 +794,7 @@ failure can be determined by:
   of a multiple page mapping attempt. These examples are applicable to
   dma_map_page() as well.
 Example 1::
 Example 1:
 	dma_addr_t dma_handle1;
 	dma_addr_t dma_handle2;
@@ -818,12 +823,8 @@ Example 1::
 		dma_unmap_single(dev, dma_handle1, size, direction);
 	map_error_handling1:
 Example 2::
 	/*
 	 * if buffers are allocated in a loop, unmap all mapped buffers when
 	 * mapping error is detected in the middle
 	 */
 Example 2: (if buffers are allocated in a loop, unmap all mapped buffers when
 	    mapping error is detected in the middle)
 	dma_addr_t dma_addr;
 	dma_addr_t array[DMA_BUFFERS];
@@ -866,8 +867,7 @@ SCSI drivers must return SCSI_MLQUEUE_HOST_BUSY if the DMA mapping
 fails in the queuecommand hook. This means that the SCSI subsystem
 passes the command to the driver again later.
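The unwind pattern of Example 2 (buffers mapped in a loop, everything unmapped again when one mapping fails) can be exercised outside the kernel. The sketch below is a userspace simulation under stated assumptions: every `mock_*` name is a hypothetical stand-in invented for this illustration, not a DMA API call, and the mock's error sentinel is an internal detail that the caller never inspects directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_BUFFERS 4

typedef uint64_t mock_dma_addr_t;		/* stand-in for dma_addr_t */

static int mock_live_mappings;			/* mappings not yet unmapped */
static int mock_calls;				/* mock_map() calls so far */
static int mock_fail_at = -1;			/* call index that fails, -1 = never */

/* Stand-in for dma_map_single(); fails on the configured call. */
static mock_dma_addr_t mock_map(size_t index)
{
	if (mock_calls++ == mock_fail_at)
		return UINT64_MAX;		/* sentinel private to the mock */
	mock_live_mappings++;
	return 0x1000u * (index + 1);
}

/* Stand-in for dma_mapping_error(); callers test failure only via this. */
static bool mock_mapping_error(mock_dma_addr_t addr)
{
	return addr == UINT64_MAX;
}

/* Stand-in for dma_unmap_single(). */
static void mock_unmap(mock_dma_addr_t addr)
{
	(void)addr;
	mock_live_mappings--;
}

/* Map DMA_BUFFERS buffers; on failure, unmap the ones already mapped. */
static int map_all_or_nothing(mock_dma_addr_t array[DMA_BUFFERS])
{
	size_t mapped;

	for (mapped = 0; mapped < DMA_BUFFERS; mapped++) {
		mock_dma_addr_t addr = mock_map(mapped);

		if (mock_mapping_error(addr))
			goto unwind;
		array[mapped] = addr;
	}
	return 0;

unwind:
	while (mapped > 0)
		mock_unmap(array[--mapped]);
	return -1;
}
```

The point of the design is that the failure path leaves no mapping behind: after a failed call, the number of live mappings is back to what it was before the loop started.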
 Optimizing Unmap State Space Consumption
 ========================================
 		Optimizing Unmap State Space Consumption
 On many platforms, dma_unmap_{single,page}() is simply a nop.
 Therefore, keeping track of the mapping address and length is a waste
@@ -879,7 +879,7 @@ Actually, instead of describing the macros one by one, we'll
 transform some example code.
 1) Use DEFINE_DMA_UNMAP_{ADDR,LEN} in state saving structures.
    Example, before::
    Example, before:
 	struct ring_state {
 		struct sk_buff *skb;
@@ -887,7 +887,7 @@ transform some example code.
 		__u32 len;
 	};
    after::
    after:
 	struct ring_state {
 		struct sk_buff *skb;
@@ -896,23 +896,23 @@ transform some example code.
 	};
 2) Use dma_unmap_{addr,len}_set() to set these values.
    Example, before::
    Example, before:
 	ringp->mapping = FOO;
 	ringp->len = BAR;
    after::
    after:
 	dma_unmap_addr_set(ringp, mapping, FOO);
 	dma_unmap_len_set(ringp, len, BAR);
 3) Use dma_unmap_{addr,len}() to access these values.
    Example, before::
    Example, before:
 	dma_unmap_single(dev, ringp->mapping, ringp->len,
 			 DMA_FROM_DEVICE);
    after::
    after:
 	dma_unmap_single(dev,
 			 dma_unmap_addr(ringp, mapping),
@@ -923,8 +923,7 @@ It really should be self-explanatory.  We treat the ADDR and LEN
 separately, because it is possible for an implementation to only
 need the address in order to perform the unmap operation.
 Platform Issues
 ===============
 			Platform Issues
 If you are just writing drivers for Linux and do not maintain
 an architecture port for the kernel, you can safely skip down
@@ -950,13 +949,12 @@ to "Closing".
    alignment constraints (e.g. the alignment constraints about 64-bit
    objects).
 Closing
 =======
 			   Closing
 This document, and the API itself, would not be in its current
 form without the feedback and suggestions from numerous individuals.
 We would like to specifically mention, in no particular order, the
 following people::
 following people:
 	Russell King <rmk@arm.linux.org.uk>
 	Leo Dagum <dagum@barrel.engr.sgi.com>


Documentation/DMA-API.txt


@@ -1,8 +1,7 @@
 ============================================
 Dynamic DMA mapping using the generic device
 ============================================
                Dynamic DMA mapping using the generic device
                ============================================
 :Author: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
         James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
 This document describes the DMA API.  For a more gentle introduction
 of the API (and actual examples), see Documentation/DMA-API-HOWTO.txt.
@@ -13,10 +12,10 @@ machines.  Unless you know that your driver absolutely has to support
 non-consistent platforms (this is usually only legacy platforms) you
 should only use the API described in part I.
 Part I - dma_API
 ----------------
 Part I - dma_ API
 -------------------------------------
 To get the dma_API, you must #include <linux/dma-mapping.h>.  This
 To get the dma_ API, you must #include <linux/dma-mapping.h>.  This
 provides dma_addr_t and the interfaces described below.
 A dma_addr_t can hold any valid DMA address for the platform.  It can be
@@ -27,11 +26,9 @@ address space and the DMA address space.
 Part Ia - Using large DMA-coherent buffers
 ------------------------------------------
 ::
 	void *
 	dma_alloc_coherent(struct device *dev, size_t size,
 			   dma_addr_t *dma_handle, gfp_t flag)
 void *
 dma_alloc_coherent(struct device *dev, size_t size,
 			     dma_addr_t *dma_handle, gfp_t flag)
 Consistent memory is memory for which a write by either the device or
 the processor can immediately be read by the processor or device
@@ -54,24 +51,20 @@ consolidate your requests for consistent memory as much as possible.
 The simplest way to do that is to use the dma_pool calls (see below).
 The flag parameter (dma_alloc_coherent() only) allows the caller to
 specify the ``GFP_`` flags (see kmalloc()) for the allocation (the
 specify the GFP_ flags (see kmalloc()) for the allocation (the
 implementation may choose to ignore flags that affect the location of
 the returned memory, like GFP_DMA).
 ::
 	void *
 	dma_zalloc_coherent(struct device *dev, size_t size,
 			    dma_addr_t *dma_handle, gfp_t flag)
 void *
 dma_zalloc_coherent(struct device *dev, size_t size,
 			     dma_addr_t *dma_handle, gfp_t flag)
 Wraps dma_alloc_coherent() and also zeroes the returned memory if the
 allocation attempt succeeded.
 ::
 	void
 	dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
 			  dma_addr_t dma_handle)
 void
 dma_free_coherent(struct device *dev, size_t size, void *cpu_addr,
 			   dma_addr_t dma_handle)
 Free a region of consistent memory you previously allocated.  dev,
 size and dma_handle must all be the same as those passed into
@@ -85,7 +78,7 @@ may only be called with IRQs enabled.
 Part Ib - Using small DMA-coherent buffers
 ------------------------------------------
 To get this part of the dma_API, you must #include <linux/dmapool.h>
 To get this part of the dma_ API, you must #include <linux/dmapool.h>
 Many drivers need lots of small DMA-coherent memory regions for DMA
 descriptors or I/O buffers.  Rather than allocating in units of a page
@@ -95,8 +88,6 @@ not __get_free_pages().  Also, they understand common hardware constraints
 for alignment, like queue heads needing to be aligned on N-byte boundaries.
 ::
 	struct dma_pool *
 	dma_pool_create(const char *name, struct device *dev,
 			size_t size, size_t align, size_t alloc);
@@ -112,21 +103,16 @@ in bytes, and must be a power of two).  If your device has no boundary
 crossing restrictions, pass 0 for alloc; passing 4096 says memory allocated
 from this pool must not cross 4KByte boundaries.
 ::
 	void *
 	dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
 		        dma_addr_t *handle)
 	void *dma_pool_zalloc(struct dma_pool *pool, gfp_t mem_flags,
 			      dma_addr_t *handle)
 Wraps dma_pool_alloc() and also zeroes the returned memory if the
 allocation attempt succeeded.
 ::
 	void *
 	dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
 		       dma_addr_t *dma_handle);
 	void *dma_pool_alloc(struct dma_pool *pool, gfp_t gfp_flags,
 			dma_addr_t *dma_handle);
 This allocates memory from the pool; the returned memory will meet the
 size and alignment requirements specified at creation time.  Pass
@@ -136,20 +122,16 @@ blocking.  Like dma_alloc_coherent(), this returns two values:  an
 address usable by the CPU, and the DMA address usable by the pool's
 device.
 ::
 	void
 	dma_pool_free(struct dma_pool *pool, void *vaddr,
 		      dma_addr_t addr);
 	void dma_pool_free(struct dma_pool *pool, void *vaddr,
 			dma_addr_t addr);
 This puts memory back into the pool.  The pool is what was passed to
 dma_pool_alloc(); the CPU (vaddr) and DMA addresses are what
 were returned when that routine allocated the memory being freed.
 ::
 	void
 	dma_pool_destroy(struct dma_pool *pool);
 	void dma_pool_destroy(struct dma_pool *pool);
 dma_pool_destroy() frees the resources of the pool.  It must be
 called in a context which can sleep.  Make sure you've freed all allocated
@@ -159,40 +141,32 @@ memory back to the pool before you destroy it.
 Part Ic - DMA addressing limitations
 ------------------------------------
 ::
 	int
 	dma_set_mask_and_coherent(struct device *dev, u64 mask)
 int
 dma_set_mask_and_coherent(struct device *dev, u64 mask)
 Checks to see if the mask is possible and updates the device
 streaming and coherent DMA mask parameters if it is.
 Returns: 0 if successful and a negative error if not.
 ::
 	int
 	dma_set_mask(struct device *dev, u64 mask)
 int
 dma_set_mask(struct device *dev, u64 mask)
 Checks to see if the mask is possible and updates the device
 parameters if it is.
 Returns: 0 if successful and a negative error if not.
 ::
 	int
 	dma_set_coherent_mask(struct device *dev, u64 mask)
 int
 dma_set_coherent_mask(struct device *dev, u64 mask)
 Checks to see if the mask is possible and updates the device
 parameters if it is.
 Returns: 0 if successful and a negative error if not.
 ::
 	u64
 	dma_get_required_mask(struct device *dev)
 u64
 dma_get_required_mask(struct device *dev)
 This API returns the mask that the platform requires to
 operate efficiently.  Usually this means the returned mask
@@ -208,107 +182,94 @@ call to set the mask to the value returned.
 Part Id - Streaming DMA mappings
 --------------------------------
 ::
 	dma_addr_t
 	dma_map_single(struct device *dev, void *cpu_addr, size_t size,
 		       enum dma_data_direction direction)
 dma_addr_t
 dma_map_single(struct device *dev, void *cpu_addr, size_t size,
 		      enum dma_data_direction direction)
 Maps a piece of processor virtual memory so it can be accessed by the
 device and returns the DMA address of the memory.
 The direction for both APIs may be converted freely by casting.
 However the dma_API uses a strongly typed enumerator for its
 However the dma_ API uses a strongly typed enumerator for its
 direction:
 ======================= =============================================
 DMA_NONE		no direction (used for debugging)
 DMA_TO_DEVICE		data is going from the memory to the device
 DMA_FROM_DEVICE		data is coming from the device to the memory
 DMA_BIDIRECTIONAL	direction isn't known
 ======================= =============================================
 .. note::
 Notes:  Not all memory regions in a machine can be mapped by this API.
 Further, contiguous kernel virtual space may not be contiguous as
 physical memory.  Since this API does not provide any scatter/gather
 capability, it will fail if the user tries to map a non-physically
 contiguous piece of memory.  For this reason, memory to be mapped by
 this API should be obtained from sources which guarantee it to be
 physically contiguous (like kmalloc).
 	Not all memory regions in a machine can be mapped by this API.
 	Further, contiguous kernel virtual space may not be contiguous as
 	physical memory.  Since this API does not provide any scatter/gather
 	capability, it will fail if the user tries to map a non-physically
 	contiguous piece of memory.  For this reason, memory to be mapped by
 	this API should be obtained from sources which guarantee it to be
 	physically contiguous (like kmalloc).
 Further, the DMA address of the memory must be within the
 dma_mask of the device (the dma_mask is a bit mask of the
 addressable region for the device, i.e., if the DMA address of
 the memory ANDed with the dma_mask is still equal to the DMA
 address, then the device can perform DMA to the memory).  To
 ensure that the memory allocated by kmalloc is within the dma_mask,
 the driver may specify various platform-dependent flags to restrict
 the DMA address range of the allocation (e.g., on x86, GFP_DMA
 guarantees to be within the first 16MB of available DMA addresses,
 as required by ISA devices).
 	Further, the DMA address of the memory must be within the
 	dma_mask of the device (the dma_mask is a bit mask of the
 	addressable region for the device, i.e., if the DMA address of
 	the memory ANDed with the dma_mask is still equal to the DMA
 	address, then the device can perform DMA to the memory).  To
 	ensure that the memory allocated by kmalloc is within the dma_mask,
 	the driver may specify various platform-dependent flags to restrict
 	the DMA address range of the allocation (e.g., on x86, GFP_DMA
 	guarantees to be within the first 16MB of available DMA addresses,
 	as required by ISA devices).
 Note also that the above constraints on physical contiguity and
 dma_mask may not apply if the platform has an IOMMU (a device which
 maps an I/O DMA address to a physical memory address).  However, to be
 portable, device driver writers may *not* assume that such an IOMMU
 exists.
 	Note also that the above constraints on physical contiguity and
 	dma_mask may not apply if the platform has an IOMMU (a device which
 	maps an I/O DMA address to a physical memory address).  However, to be
 	portable, device driver writers may *not* assume that such an IOMMU
 	exists.
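The dma_mask rule in the notes above (an address is reachable when ANDing it with the mask leaves it unchanged) is small enough to try in userspace. In this sketch, `DMA_BIT_MASK()` is reproduced in the form the kernel headers use so that the example compiles on its own, and `dma_addr_fits_mask()` is a hypothetical helper that is not part of the DMA API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copied in the kernel's form so this compiles outside the kernel tree. */
#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical helper: true when the device can reach dma_addr, i.e.
 * when no address bit is set outside the device's dma_mask.
 */
static bool dma_addr_fits_mask(uint64_t dma_addr, uint64_t dma_mask)
{
	return (dma_addr & dma_mask) == dma_addr;
}
```

With a 24-bit mask, as for ISA-style devices, the highest reachable address is 0x00FFFFFF, the last byte of the first 16MB; address 0x01000000 is rejected by a 24-bit mask but accepted by a 32-bit one.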
 Warnings:  Memory coherency operates at a granularity called the cache
 line width.  In order for memory mapped by this API to operate
 correctly, the mapped region must begin exactly on a cache line
 boundary and end exactly on one (to prevent two separately mapped
 regions from sharing a single cache line).  Since the cache line size
 may not be known at compile time, the API will not enforce this
 requirement.  Therefore, it is recommended that driver writers who
 don't take special care to determine the cache line size at run time
 only map virtual regions that begin and end on page boundaries (which
 are guaranteed also to be cache line boundaries).
 .. warning::
 DMA_TO_DEVICE synchronisation must be done after the last modification
 of the memory region by the software and before it is handed off to
 the device.  Once this primitive is used, memory covered by this
 primitive should be treated as read-only by the device.  If the device
 may write to it at any point, it should be DMA_BIDIRECTIONAL (see
 below).
 	Memory coherency operates at a granularity called the cache
 	line width.  In order for memory mapped by this API to operate
 	correctly, the mapped region must begin exactly on a cache line
 	boundary and end exactly on one (to prevent two separately mapped
 	regions from sharing a single cache line).  Since the cache line size
 	may not be known at compile time, the API will not enforce this
 	requirement.  Therefore, it is recommended that driver writers who
 	don't take special care to determine the cache line size at run time
 	only map virtual regions that begin and end on page boundaries (which
 	are guaranteed also to be cache line boundaries).
 DMA_FROM_DEVICE synchronisation must be done before the driver
 accesses data that may be changed by the device.  This memory should
 be treated as read-only by the driver.  If the driver needs to write
 to it at any point, it should be DMA_BIDIRECTIONAL (see below).
 	DMA_TO_DEVICE synchronisation must be done after the last modification
 	of the memory region by the software and before it is handed off to
 	the device.  Once this primitive is used, memory covered by this
 	primitive should be treated as read-only by the device.  If the device
 	may write to it at any point, it should be DMA_BIDIRECTIONAL (see
 	below).
 DMA_BIDIRECTIONAL requires special handling: it means that the driver
 isn't sure if the memory was modified before being handed off to the
 device and also isn't sure if the device will also modify it.  Thus,
 you must always sync bidirectional memory twice: once before the
 memory is handed off to the device (to make sure all memory changes
 are flushed from the processor) and once before the data may be
 accessed after being used by the device (to make sure any processor
 cache lines are updated with data that the device may have changed).
 	DMA_FROM_DEVICE synchronisation must be done before the driver
 	accesses data that may be changed by the device.  This memory should
 	be treated as read-only by the driver.  If the driver needs to write
 	to it at any point, it should be DMA_BIDIRECTIONAL (see below).
 	DMA_BIDIRECTIONAL requires special handling: it means that the driver
 	isn't sure if the memory was modified before being handed off to the
 	device and also isn't sure if the device will also modify it.  Thus,
 	you must always sync bidirectional memory twice: once before the
 	memory is handed off to the device (to make sure all memory changes
 	are flushed from the processor) and once before the data may be
 	accessed after being used by the device (to make sure any processor
 	cache lines are updated with data that the device may have changed).
 ::
 	void
 	dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
 			 enum dma_data_direction direction)
 void
 dma_unmap_single(struct device *dev, dma_addr_t dma_addr, size_t size,
 		 enum dma_data_direction direction)
 Unmaps the region previously mapped.  All the parameters passed in
 must be identical to those passed in (and returned) by the mapping
 API.
 ::
 	dma_addr_t
 	dma_map_page(struct device *dev, struct page *page,
 		     unsigned long offset, size_t size,
 		     enum dma_data_direction direction)
 	void
 	dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
 		       enum dma_data_direction direction)
 dma_addr_t
 dma_map_page(struct device *dev, struct page *page,
 		    unsigned long offset, size_t size,
 		    enum dma_data_direction direction)
 void
 dma_unmap_page(struct device *dev, dma_addr_t dma_address, size_t size,
 	       enum dma_data_direction direction)
 API for mapping and unmapping for pages.  All the notes and warnings
 for the other mapping APIs apply here.  Also, although the <offset>
@@ -316,24 +277,20 @@ and <size> parameters are provided to do partial page mapping, it is
 recommended that you never use these unless you really know what the
 cache width is.
 ::
 dma_addr_t
 dma_map_resource(struct device *dev, phys_addr_t phys_addr, size_t size,
 		 enum dma_data_direction dir, unsigned long attrs)
 	dma_addr_t
 	dma_map_resource(struct device *dev, phys_addr_t phys_addr, size_t size,
 			 enum dma_data_direction dir, unsigned long attrs)
 	void
 	dma_unmap_resource(struct device *dev, dma_addr_t addr, size_t size,
 			   enum dma_data_direction dir, unsigned long attrs)
 void
 dma_unmap_resource(struct device *dev, dma_addr_t addr, size_t size,
 		   enum dma_data_direction dir, unsigned long attrs)
 API for mapping and unmapping for MMIO resources. All the notes and
 warnings for the other mapping APIs apply here. The API should only be
 used to map device MMIO resources, mapping of RAM is not permitted.
 ::
 	int
 	dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 int
 dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
 In some circumstances dma_map_single(), dma_map_page() and dma_map_resource()
 will fail to create a mapping. A driver can check for these errors by testing
@@ -341,11 +298,9 @@ the returned DMA address with dma_mapping_error(). A non-zero return value
 means the mapping could not be created and the driver should take appropriate
 action (e.g. reduce current DMA mapping usage or delay and try again later).
 ::
 	int
 	dma_map_sg(struct device *dev, struct scatterlist *sg,
 		   int nents, enum dma_data_direction direction)
 		int nents, enum dma_data_direction direction)
 Returns: the number of DMA address segments mapped (this may be shorter
 than <nents> passed in if some elements of the scatter/gather list are
@@ -361,7 +316,7 @@ critical that the driver do something, in the case of a block driver
 aborting the request or even oopsing is better than doing nothing and
 corrupting the filesystem.
 With scatterlists, you use the resulting mapping like this::
 With scatterlists, you use the resulting mapping like this:
 	int i, count = dma_map_sg(dev, sglist, nents, direction);
 	struct scatterlist *sg;
@@ -382,11 +337,9 @@ Then you should loop count times (note: this can be less than nents times)
 and use sg_dma_address() and sg_dma_len() macros where you previously
 accessed sg->address and sg->length as shown above.
 ::
 	void
 	dma_unmap_sg(struct device *dev, struct scatterlist *sg,
 		     int nents, enum dma_data_direction direction)
 		int nents, enum dma_data_direction direction)
 Unmap the previously mapped scatter/gather list.  All the parameters
 must be the same as those passed in to the scatter/gather mapping
@@ -395,27 +348,18 @@ API.
 Note: <nents> must be the number you passed in, *not* the number of
 DMA address entries returned.
 ::
 	void
 	dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle,
 				size_t size,
 				enum dma_data_direction direction)
 	void
 	dma_sync_single_for_device(struct device *dev, dma_addr_t dma_handle,
 				   size_t size,
 				   enum dma_data_direction direction)
 	void
 	dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg,
 			    int nents,
 			    enum dma_data_direction direction)
 	void
 	dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
 			       int nents,
 			       enum dma_data_direction direction)
 void
 dma_sync_single_for_cpu(struct device *dev, dma_addr_t dma_handle, size_t size,
 			enum dma_data_direction direction)
 void
 dma_sync_single_for_device(struct device *dev, dma_addr_t dma_handle, size_t size,
 			   enum dma_data_direction direction)
 void
 dma_sync_sg_for_cpu(struct device *dev, struct scatterlist *sg, int nents,
 		    enum dma_data_direction direction)
 void
 dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg, int nents,
 		       enum dma_data_direction direction)
 Synchronise a single contiguous or scatter/gather mapping for the CPU
 and device. With the sync_sg API, all the parameters must be the same
@@ -423,41 +367,36 @@ as those passed into the single mapping API. With the sync_single API,
 you can use dma_handle and size parameters that aren't identical to
 those passed into the single mapping API to do a partial sync.
 Notes:  You must do this:
 .. note::
    You must do this:
    - Before reading values that have been written by DMA from the device
      (use the DMA_FROM_DEVICE direction)
    - After writing values that will be written to the device using DMA
      (use the DMA_TO_DEVICE direction)
    - before *and* after handing memory to the device if the memory is
      DMA_BIDIRECTIONAL
 - Before reading values that have been written by DMA from the device
   (use the DMA_FROM_DEVICE direction)
 - After writing values that will be written to the device using DMA
   (use the DMA_TO_DEVICE) direction
 - before *and* after handing memory to the device if the memory is
   DMA_BIDIRECTIONAL
 See also dma_map_single().
 ::
 dma_addr_t
 dma_map_single_attrs(struct device *dev, void *cpu_addr, size_t size,
 		     enum dma_data_direction dir,
 		     unsigned long attrs)
 	dma_addr_t
 	dma_map_single_attrs(struct device *dev, void *cpu_addr, size_t size,
 			     enum dma_data_direction dir,
 			     unsigned long attrs)
 void
 dma_unmap_single_attrs(struct device *dev, dma_addr_t dma_addr,
 		       size_t size, enum dma_data_direction dir,
 		       unsigned long attrs)
 	void
 	dma_unmap_single_attrs(struct device *dev, dma_addr_t dma_addr,
 			       size_t size, enum dma_data_direction dir,
 			       unsigned long attrs)
 int
 dma_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
 		 int nents, enum dma_data_direction dir,
 		 unsigned long attrs)
 	int
 	dma_map_sg_attrs(struct device *dev, struct scatterlist *sgl,
 			 int nents, enum dma_data_direction dir,
 			 unsigned long attrs)
 	void
 	dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sgl,
 			   int nents, enum dma_data_direction dir,
 			   unsigned long attrs)
 void
 dma_unmap_sg_attrs(struct device *dev, struct scatterlist *sgl,
 		   int nents, enum dma_data_direction dir,
 		   unsigned long attrs)
 The four functions above are just like the counterpart functions
 without the _attrs suffixes, except that they pass an optional
@@ -471,38 +410,37 @@ is identical to those of the corresponding function
 without the _attrs suffix. As a result dma_map_single_attrs()
 can generally replace dma_map_single(), etc.
 As an example of the use of the ``*_attrs`` functions, here's how
 As an example of the use of the *_attrs functions, here's how
 you could pass an attribute DMA_ATTR_FOO when mapping memory
 for DMA::
 for DMA:
 	#include <linux/dma-mapping.h>
 	/* DMA_ATTR_FOO should be defined in linux/dma-mapping.h and
 	* documented in Documentation/DMA-attributes.txt */
 	...
 #include <linux/dma-mapping.h>
 /* DMA_ATTR_FOO should be defined in linux/dma-mapping.h and
  * documented in Documentation/DMA-attributes.txt */
 ...
 		unsigned long attr;
 		attr |= DMA_ATTR_FOO;
 		....
 		n = dma_map_sg_attrs(dev, sg, nents, DMA_TO_DEVICE, attr);
 		....
 	unsigned long attr;
 	attr |= DMA_ATTR_FOO;
 	....
 	n = dma_map_sg_attrs(dev, sg, nents, DMA_TO_DEVICE, attr);
 	....
 Architectures that care about DMA_ATTR_FOO would check for its
 presence in their implementations of the mapping and unmapping
 routines, e.g.:::
 routines, e.g.:
 	void whizco_dma_map_sg_attrs(struct device *dev, dma_addr_t dma_addr,
 				     size_t size, enum dma_data_direction dir,
 				     unsigned long attrs)
 	{
 		....
 		if (attrs & DMA_ATTR_FOO)
 			/* twizzle the frobnozzle */
 		....
 	}
 void whizco_dma_map_sg_attrs(struct device *dev, dma_addr_t dma_addr,
 			     size_t size, enum dma_data_direction dir,
 			     unsigned long attrs)
 {
 	....
 	if (attrs & DMA_ATTR_FOO)
 		/* twizzle the frobnozzle */
 	....
 Part II - Advanced dma usage
 ----------------------------
 Part II - Advanced dma_ usage
 -----------------------------
 Warning: These pieces of the DMA API should not be used in the
 majority of cases, since they cater for unlikely corner cases that
@@ -512,18 +450,15 @@ If you don't understand how cache line coherency works between a
 processor and an I/O device, you should not be using this part of the
 API at all.
 ::
 void *
 dma_alloc_noncoherent(struct device *dev, size_t size,
 			       dma_addr_t *dma_handle, gfp_t flag)
 	void *
 	dma_alloc_attrs(struct device *dev, size_t size, dma_addr_t *dma_handle,
 			gfp_t flag, unsigned long attrs)
 Identical to dma_alloc_coherent() except that when the
 DMA_ATTR_NON_CONSISTENT flags is passed in the attrs argument, the
 platform will choose to return either consistent or non-consistent memory
 as it sees fit.  By using this API, you are guaranteeing to the platform
 that you have all the correct and necessary sync points for this memory
 in the driver should it choose to return non-consistent memory.
 Identical to dma_alloc_coherent() except that the platform will
 choose to return either consistent or non-consistent memory as it sees
 fit.  By using this API, you are guaranteeing to the platform that you
 have all the correct and necessary sync points for this memory in the
 driver should it choose to return non-consistent memory.
 Note: where the platform can return consistent memory, it will
 guarantee that the sync points become nops.
@@ -533,50 +468,39 @@ only use this API if you positively know your driver will be
 required to work on one of the rare (usually non-PCI) architectures
 that simply cannot make consistent memory.
 ::
 void
 dma_free_noncoherent(struct device *dev, size_t size, void *cpu_addr,
 			      dma_addr_t dma_handle)
 	void
 	dma_free_attrs(struct device *dev, size_t size, void *cpu_addr,
 		       dma_addr_t dma_handle, unsigned long attrs)
 Free memory allocated by the nonconsistent API.  All parameters must
 be identical to those passed in (and returned by
 dma_alloc_noncoherent()).
 Free memory allocated by the dma_alloc_attrs().  All parameters common
 parameters must identical to those otherwise passed to dma_fre_coherent,
 and the attrs argument must be identical to the attrs passed to
 dma_alloc_attrs().
 ::
 	int
 	dma_get_cache_alignment(void)
 int
 dma_get_cache_alignment(void)
 Returns the processor cache alignment.  This is the absolute minimum
 alignment *and* width that you must observe when either mapping
 memory or doing partial flushes.
 .. note::
 Notes: This API may return a number *larger* than the actual cache
 line, but it will guarantee that one or more cache lines fit exactly
 into the width returned by this call.  It will also always be a power
 of two for easy alignment.
 	This API may return a number *larger* than the actual cache
 	line, but it will guarantee that one or more cache lines fit exactly
 	into the width returned by this call.  It will also always be a power
 	of two for easy alignment.
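The alignment rule above can be illustrated with a small user-space sketch. It is not kernel code: the `demo_` helpers are hypothetical names, and `align` stands in for whatever dma_get_cache_alignment() would return in a real driver. It only shows how a partial range might be widened to whole alignment units, relying on the power-of-two guarantee stated in the note.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative user-space helpers, not kernel API. 'align' stands in for
 * the value dma_get_cache_alignment() would return in a driver; the text
 * above guarantees that it is a power of two.
 */
static size_t demo_align_down(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return x & ~(align - 1);
}

static size_t demo_align_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Widen the byte range [offset, offset + len) to whole alignment units. */
static void demo_sync_range(size_t offset, size_t len, size_t align,
			    size_t *start, size_t *size)
{
	*start = demo_align_down(offset, align);
	*size = demo_align_up(offset + len, align) - *start;
}
```

With an assumed alignment of 64, a request covering bytes 100 to 129 widens to a start of 64 and a size of 128, which is the region a partial sync would have to cover.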
 void
 dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 	       enum dma_data_direction direction)
 ::
 	void
 	dma_cache_sync(struct device *dev, void *vaddr, size_t size,
 		       enum dma_data_direction direction)
 Do a partial sync of memory that was allocated by dma_alloc_attrs() with
 the DMA_ATTR_NON_CONSISTENT flag starting at virtual address vaddr and
 Do a partial sync of memory that was allocated by
 dma_alloc_noncoherent(), starting at virtual address vaddr and
 continuing on for size.  Again, you *must* observe the cache line
 boundaries when doing this.
 ::
 	int
 	dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
 				    dma_addr_t device_addr, size_t size, int
 				    flags)
 int
 dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
 			    dma_addr_t device_addr, size_t size, int
 			    flags)
 Declare region of memory to be handed out by dma_alloc_coherent() when
 it's asked for coherent memory for this device.
@@ -592,9 +516,32 @@ size is the size of the area (must be multiples of PAGE_SIZE).
 flags can be ORed together and are:
 - DMA_MEMORY_EXCLUSIVE - only allocate memory from the declared regions.
   Do not allow dma_alloc_coherent() to fall back to system memory when
   it's out of memory in the declared region.
 DMA_MEMORY_MAP - request that the memory returned from
 dma_alloc_coherent() be directly writable.
 DMA_MEMORY_IO - request that the memory returned from
 dma_alloc_coherent() be addressable using read()/write()/memcpy_toio() etc.
 One or both of these flags must be present.
 DMA_MEMORY_INCLUDES_CHILDREN - make the declared memory be allocated by
 dma_alloc_coherent of any child devices of this one (for memory residing
 on a bridge).
 DMA_MEMORY_EXCLUSIVE - only allocate memory from the declared regions.
 Do not allow dma_alloc_coherent() to fall back to system memory when
 it's out of memory in the declared region.
 The return value will be either DMA_MEMORY_MAP or DMA_MEMORY_IO and
 must correspond to a passed in flag (i.e. no returning DMA_MEMORY_IO
 if only DMA_MEMORY_MAP were passed in) for success or zero for
 failure.
 Note, for DMA_MEMORY_IO returns, all subsequent memory returned by
 dma_alloc_coherent() may no longer be accessed directly, but instead
 must be accessed using the correct bus functions.  If your driver
 isn't prepared to handle this contingency, it should not specify
 DMA_MEMORY_IO in the input flags.
 As a simplification for the platforms, only *one* such region of
 memory may be declared per device.
@@ -603,10 +550,8 @@ For reasons of efficiency, most platforms choose to track the declared
 region only at the granularity of a page.  For smaller allocations,
 you should use the dma_pool() API.
 ::
 	void
 	dma_release_declared_memory(struct device *dev)
 void
 dma_release_declared_memory(struct device *dev)
 Remove the memory region previously declared from the system.  This
 API performs *no* in-use checking for this region and will return
@@ -614,11 +559,9 @@ unconditionally having removed all the required structures.  It is the
 driver's job to ensure that no parts of this memory region are
 currently in use.
 ::
 	void *
 	dma_mark_declared_memory_occupied(struct device *dev,
 					  dma_addr_t device_addr, size_t size)
 void *
 dma_mark_declared_memory_occupied(struct device *dev,
 				  dma_addr_t device_addr, size_t size)
 This is used to occupy specific regions of the declared space
 (dma_alloc_coherent() will hand out the first free region it finds).
@@ -649,37 +592,38 @@ option has a performance impact. Do not enable it in production kernels.
 If you boot the resulting kernel will contain code which does some bookkeeping
 about what DMA memory was allocated for which device. If this code detects an
 error it prints a warning message with some details into your kernel log. An
 example warning message may look like this::
 example warning message may look like this:
 	WARNING: at /data2/repos/linux-2.6-iommu/lib/dma-debug.c:448
 		check_unmap+0x203/0x490()
 	Hardware name:
 	forcedeth 0000:00:08.0: DMA-API: device driver frees DMA memory with wrong
 		function [device address=0x00000000640444be] [size=66 bytes] [mapped as
 	single] [unmapped as page]
 	Modules linked in: nfsd exportfs bridge stp llc r8169
 	Pid: 0, comm: swapper Tainted: G        W  2.6.28-dmatest-09289-g8bb99c0 #1
 	Call Trace:
 	<IRQ>  [<ffffffff80240b22>] warn_slowpath+0xf2/0x130
 	[<ffffffff80647b70>] _spin_unlock+0x10/0x30
 	[<ffffffff80537e75>] usb_hcd_link_urb_to_ep+0x75/0xc0
 	[<ffffffff80647c22>] _spin_unlock_irqrestore+0x12/0x40
 	[<ffffffff8055347f>] ohci_urb_enqueue+0x19f/0x7c0
 	[<ffffffff80252f96>] queue_work+0x56/0x60
 	[<ffffffff80237e10>] enqueue_task_fair+0x20/0x50
 	[<ffffffff80539279>] usb_hcd_submit_urb+0x379/0xbc0
 	[<ffffffff803b78c3>] cpumask_next_and+0x23/0x40
 	[<ffffffff80235177>] find_busiest_group+0x207/0x8a0
 	[<ffffffff8064784f>] _spin_lock_irqsave+0x1f/0x50
 	[<ffffffff803c7ea3>] check_unmap+0x203/0x490
 	[<ffffffff803c8259>] debug_dma_unmap_page+0x49/0x50
 	[<ffffffff80485f26>] nv_tx_done_optimized+0xc6/0x2c0
 	[<ffffffff80486c13>] nv_nic_irq_optimized+0x73/0x2b0
 	[<ffffffff8026df84>] handle_IRQ_event+0x34/0x70
 	[<ffffffff8026ffe9>] handle_edge_irq+0xc9/0x150
 	[<ffffffff8020e3ab>] do_IRQ+0xcb/0x1c0
 	[<ffffffff8020c093>] ret_from_intr+0x0/0xa
 	<EOI> <4>---[ end trace f6435a98e2a38c0e ]---
 ------------[ cut here ]------------
 WARNING: at /data2/repos/linux-2.6-iommu/lib/dma-debug.c:448
 	check_unmap+0x203/0x490()
 Hardware name:
 forcedeth 0000:00:08.0: DMA-API: device driver frees DMA memory with wrong
 	function [device address=0x00000000640444be] [size=66 bytes] [mapped as
 single] [unmapped as page]
 Modules linked in: nfsd exportfs bridge stp llc r8169
 Pid: 0, comm: swapper Tainted: G        W  2.6.28-dmatest-09289-g8bb99c0 #1
 Call Trace:
  <IRQ>  [<ffffffff80240b22>] warn_slowpath+0xf2/0x130
  [<ffffffff80647b70>] _spin_unlock+0x10/0x30
  [<ffffffff80537e75>] usb_hcd_link_urb_to_ep+0x75/0xc0
  [<ffffffff80647c22>] _spin_unlock_irqrestore+0x12/0x40
  [<ffffffff8055347f>] ohci_urb_enqueue+0x19f/0x7c0
  [<ffffffff80252f96>] queue_work+0x56/0x60
  [<ffffffff80237e10>] enqueue_task_fair+0x20/0x50
  [<ffffffff80539279>] usb_hcd_submit_urb+0x379/0xbc0
  [<ffffffff803b78c3>] cpumask_next_and+0x23/0x40
  [<ffffffff80235177>] find_busiest_group+0x207/0x8a0
  [<ffffffff8064784f>] _spin_lock_irqsave+0x1f/0x50
  [<ffffffff803c7ea3>] check_unmap+0x203/0x490
  [<ffffffff803c8259>] debug_dma_unmap_page+0x49/0x50
  [<ffffffff80485f26>] nv_tx_done_optimized+0xc6/0x2c0
  [<ffffffff80486c13>] nv_nic_irq_optimized+0x73/0x2b0
  [<ffffffff8026df84>] handle_IRQ_event+0x34/0x70
  [<ffffffff8026ffe9>] handle_edge_irq+0xc9/0x150
  [<ffffffff8020e3ab>] do_IRQ+0xcb/0x1c0
  [<ffffffff8020c093>] ret_from_intr+0x0/0xa
  <EOI> <4>---[ end trace f6435a98e2a38c0e ]---
 The driver developer can find the driver and the device including a stacktrace
 of the DMA-API call which caused this warning.
@@ -693,42 +637,43 @@ details.
 The debugfs directory for the DMA-API debugging code is called dma-api/. In
 this directory the following files can currently be found:
 =============================== ===============================================
 dma-api/all_errors		This file contains a numeric value. If this
 	dma-api/all_errors	This file contains a numeric value. If this
 				value is not equal to zero the debugging code
 				will print a warning for every error it finds
 				into the kernel log. Be careful with this
 				option, as it can easily flood your logs.
 dma-api/disabled		This read-only file contains the character 'Y'
 	dma-api/disabled	This read-only file contains the character 'Y'
 				if the debugging code is disabled. This can
 				happen when it runs out of memory or if it was
 				disabled at boot time
 dma-api/error_count		This file is read-only and shows the total
 	dma-api/error_count	This file is read-only and shows the total
 				numbers of errors found.
 dma-api/num_errors		The number in this file shows how many
 	dma-api/num_errors	The number in this file shows how many
 				warnings will be printed to the kernel log
 				before it stops. This number is initialized to
 				one at system boot and be set by writing into
 				this file
 dma-api/min_free_entries	This read-only file can be read to get the
 	dma-api/min_free_entries
 				This read-only file can be read to get the
 				minimum number of free dma_debug_entries the
 				allocator has ever seen. If this value goes
 				down to zero the code will disable itself
 				because it is not longer reliable.
 dma-api/num_free_entries	The current number of free dma_debug_entries
 	dma-api/num_free_entries
 				The current number of free dma_debug_entries
 				in the allocator.
 dma-api/driver-filter		You can write a name of a driver into this file
 	dma-api/driver-filter
 				You can write a name of a driver into this file
 				to limit the debug output to requests from that
 				particular driver. Write an empty string to
 				that file to disable the filter and see
 				all errors again.
 =============================== ===============================================
 If you have this code compiled into your kernel it will be enabled by default.
 If you want to boot without the bookkeeping anyway you can provide
@@ -747,10 +692,7 @@ of preallocated entries is defined per architecture. If it is too low for you
 boot with 'dma_debug_entries=<your_desired_number>' to overwrite the
 architectural default.
 ::
 	void
 	debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr);
 void debug_dmap_mapping_error(struct device *dev, dma_addr_t dma_addr);
 dma-debug interface debug_dma_mapping_error() to debug drivers that fail
 to check DMA mapping errors on addresses returned by dma_map_single() and
@@ -760,3 +702,4 @@ the driver. When driver does unmap, debug_dma_unmap() checks the flag and if
 this flag is still set, prints warning message that includes call trace that
 leads up to the unmap. This interface can be called from dma_mapping_error()
 routines to enable DMA mapping error check debugging.

Documentation/DMA-ISA-LPC.txt

@@ -1,20 +1,19 @@
 ============================
 DMA with ISA and LPC devices
 ============================
                         DMA with ISA and LPC devices
                         ============================
 :Author: Pierre Ossman <drzeus@drzeus.cx>
                       Pierre Ossman <drzeus@drzeus.cx>
 This document describes how to do DMA transfers using the old ISA DMA
 controller. Even though ISA is more or less dead today the LPC bus
 uses the same DMA system so it will be around for quite some time.
 Headers and dependencies
 ------------------------
 Part I - Headers and dependencies
 ---------------------------------
 To do ISA style DMA you need to include two headers::
 To do ISA style DMA you need to include two headers:
 	#include <linux/dma-mapping.h>
 	#include <asm/dma.h>
 #include <linux/dma-mapping.h>
 #include <asm/dma.h>
 The first is the generic DMA API used to convert virtual addresses to
 bus addresses (see Documentation/DMA-API.txt for details).
@@ -24,8 +23,8 @@ this is not present on all platforms make sure you construct your
 Kconfig to be dependent on ISA_DMA_API (not ISA) so that nobody tries
 to build your driver on unsupported platforms.
 Buffer allocation
 -----------------
 Part II - Buffer allocation
 ---------------------------
 The ISA DMA controller has some very strict requirements on which
 memory it can access so extra care must be taken when allocating
@@ -43,13 +42,13 @@ requirements you pass the flag GFP_DMA to kmalloc.
 Unfortunately the memory available for ISA DMA is scarce so unless you
 allocate the memory during boot-up it's a good idea to also pass
 __GFP_RETRY_MAYFAIL and __GFP_NOWARN to make the allocator try a bit harder.
 __GFP_REPEAT and __GFP_NOWARN to make the allocator try a bit harder.
 (This scarcity also means that you should allocate the buffer as
 early as possible and not release it until the driver is unloaded.)
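As a rough illustration of what "strict requirements" can mean in practice, the following user-space sketch checks a candidate buffer. The limits are assumptions based on the usual ISA DMA behaviour (24-bit addressing, so memory below 16 MiB, and no crossing of a DMA page boundary); they are not shown in this hunk. The function name and constant are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ISA addressing limit: 24 address bits, i.e. 16 MiB. */
#define DEMO_ISA_DMA_LIMIT 0x1000000ULL

/*
 * Return true if [phys, phys + size) lies below the assumed 16 MiB limit
 * and does not cross a DMA page boundary. 'dma_page' is the boundary size
 * and must be a power of two (assumed to be 64 KiB for 8-bit channels and
 * 128 KiB for 16-bit channels).
 */
static bool demo_isa_dma_buffer_ok(uint64_t phys, uint64_t size,
				   uint64_t dma_page)
{
	uint64_t last;

	if (size == 0 || size > dma_page)
		return false;
	if (phys >= DEMO_ISA_DMA_LIMIT || size > DEMO_ISA_DMA_LIMIT - phys)
		return false;

	last = phys + size - 1;
	/* First and last byte must fall inside the same DMA page. */
	return (phys & ~(dma_page - 1)) == (last & ~(dma_page - 1));
}
```

A driver would normally not need such a check when it allocates with GFP_DMA and maps through the DMA API; the sketch only makes the constraints concrete.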
 Address translation
 -------------------
 Part III - Address translation
 ------------------------------
 To translate the virtual address to a bus address, use the normal DMA
 API. Do _not_ use isa_virt_to_phys() even though it does the same
@@ -62,8 +61,8 @@ Note: x86_64 had a broken DMA API when it came to ISA but has since
 been fixed. If your arch has problems then fix the DMA API instead of
 reverting to the ISA functions.
 Channels
 --------
 Part IV - Channels
 ------------------
 A normal ISA DMA controller has 8 channels. The lower four are for
 8-bit transfers and the upper four are for 16-bit transfers.
@@ -81,8 +80,8 @@ The ability to use 16-bit or 8-bit transfers is _not_ up to you as a
 driver author but depends on what the hardware supports. Check your
 specs or test different channels.
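The channel layout described above can be written down as a small lookup. This is a user-space sketch with hypothetical `demo_` names; the remark that channel 4 is unavailable because it cascades the two controllers is an assumption about typical PC hardware and is not stated in this hunk.

```c
#include <stdbool.h>

/*
 * Transfer width in bits for an ISA DMA channel: the lower four channels
 * are 8-bit, the upper four are 16-bit. Returns -1 for an invalid channel.
 */
static int demo_isa_dma_width(unsigned int channel)
{
	if (channel > 7)
		return -1;
	return channel < 4 ? 8 : 16;
}

/*
 * Assumption: on a PC, channel 4 is used to cascade the two controllers
 * and is therefore not available to drivers.
 */
static bool demo_isa_dma_usable(unsigned int channel)
{
	return channel <= 7 && channel != 4;
}
```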
 Transfer data
 -------------
 Part V - Transfer data
 ----------------------
 Now for the good stuff, the actual DMA transfer. :)
@@ -113,37 +112,37 @@ Once the DMA transfer is finished (or timed out) you should disable
 the channel again. You should also check get_dma_residue() to make
 sure that all data has been transferred.
 Example::
 Example:
 	int flags, residue;
 int flags, residue;
 	flags = claim_dma_lock();
 flags = claim_dma_lock();
 	clear_dma_ff();
 clear_dma_ff();
 	set_dma_mode(channel, DMA_MODE_WRITE);
 	set_dma_addr(channel, phys_addr);
 	set_dma_count(channel, num_bytes);
 set_dma_mode(channel, DMA_MODE_WRITE);
 set_dma_addr(channel, phys_addr);
 set_dma_count(channel, num_bytes);
 	dma_enable(channel);
 dma_enable(channel);
 	release_dma_lock(flags);
 release_dma_lock(flags);
 	while (!device_done());
 while (!device_done());
 	flags = claim_dma_lock();
 flags = claim_dma_lock();
 	dma_disable(channel);
 dma_disable(channel);
 	residue = dma_get_residue(channel);
 	if (residue != 0)
 		printk(KERN_ERR "driver: Incomplete DMA transfer!"
 			" %d bytes left!\n", residue);
 residue = dma_get_residue(channel);
 if (residue != 0)
 	printk(KERN_ERR "driver: Incomplete DMA transfer!"
 		" %d bytes left!\n", residue);
 	release_dma_lock(flags);
 release_dma_lock(flags);
 Suspend/resume
 --------------
 Part VI - Suspend/resume
 ------------------------
 It is the driver's responsibility to make sure that the machine isn't
 suspended while a DMA transfer is in progress. Also, all DMA settings

Documentation/DMA-attributes.txt

@@ -1,6 +1,5 @@
 ==============
 DMA attributes
 ==============
 			DMA attributes
 			==============
 This document describes the semantics of the DMA attributes that are
 defined in linux/dma-mapping.h.
@@ -109,7 +108,6 @@ This is a hint to the DMA-mapping subsystem that it's probably not worth
 the time to try to allocate memory to in a way that gives better TLB
 efficiency (AKA it's not worth trying to build the mapping out of larger
 pages).  You might want to specify this if:
 - You know that the accesses to this memory won't thrash the TLB.
   You might know that the accesses are likely to be sequential or
   that they aren't sequential but it's unlikely you'll ping-pong
@@ -123,12 +121,11 @@ pages).  You might want to specify this if:
   the mapping to have a short lifetime then it may be worth it to
   optimize allocation (avoid coming up with large pages) instead of
   getting the slight performance win of larger pages.
 Setting this hint doesn't guarantee that you won't get huge pages, but it
 means that we won't try quite as hard to get them.
 .. note:: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
 	  though ARM64 patches will likely be posted soon.
 NOTE: At the moment DMA_ATTR_ALLOC_SINGLE_PAGES is only implemented on ARM,
 though ARM64 patches will likely be posted soon.
 DMA_ATTR_NO_WARN
 ----------------
@@ -145,10 +142,10 @@ problem at all, depending on the implementation of the retry mechanism.
 So, this provides a way for drivers to avoid those error messages on calls
 where allocation failures are not a problem, and shouldn't bother the logs.
 .. note:: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
 NOTE: At the moment DMA_ATTR_NO_WARN is only implemented on PowerPC.
 DMA_ATTR_PRIVILEGED
 -------------------
 ------------------------------
 Some advanced peripherals such as remote processors and GPUs perform
 accesses to DMA buffers in both privileged "supervisor" and unprivileged

Documentation/DocBook/.gitignore

@@ -0,0 +1,17 @@
 *.xml
 *.ps
 *.pdf
 *.html
 *.9.gz
 *.9
 *.aux
 *.dvi
 *.log
 *.out
 *.png
 *.gif
 *.svg
 *.proc
 *.db
 media-indices.tmpl
 media-entities.tmpl

									
Documentation/DocBook/Makefile
												
@@ -0,0 +1,278 @@
 ###
 # This makefile is used to generate the kernel documentation,
 # primarily based on in-line comments in various source files.
 # See Documentation/kernel-doc-nano-HOWTO.txt for instruction in how
 # to document the SRC - and how to read it.
 # To add a new book the only step required is to add the book to the
 # list of DOCBOOKS.
 DOCBOOKS := z8530book.xml  \
 	    kernel-hacking.xml kernel-locking.xml \
 	    writing_usb_driver.xml networking.xml \
 	    kernel-api.xml filesystems.xml lsm.xml kgdb.xml \
 	    gadget.xml libata.xml mtdnand.xml librs.xml rapidio.xml \
 	    genericirq.xml s390-drivers.xml scsi.xml \
 	    sh.xml w1.xml \
 	    writing_musb_glue_layer.xml
 ifeq ($(DOCBOOKS),)
 # Skip DocBook build if the user explicitly requested no DOCBOOKS.
 .DEFAULT:
 	@echo "  SKIP    DocBook $@ target (DOCBOOKS=\"\" specified)."
 else
 ifneq ($(SPHINXDIRS),)
 # Skip DocBook build if the user explicitly requested a sphinx dir
 .DEFAULT:
 	@echo "  SKIP    DocBook $@ target (SPHINXDIRS specified)."
 else
 ###
 # The build process is as follows (targets):
 #              (xmldocs) [by docproc]
 # file.tmpl --> file.xml +--> file.ps   (psdocs)   [by db2ps or xmlto]
 #                        +--> file.pdf  (pdfdocs)  [by db2pdf or xmlto]
 #                        +--> DIR=file  (htmldocs) [by xmlto]
 #                        +--> man/      (mandocs)  [by xmlto]
 # for PDF and PS output you can choose between xmlto and docbook-utils tools
 PDF_METHOD	= $(prefer-db2x)
 PS_METHOD	= $(prefer-db2x)
 targets += $(DOCBOOKS)
 BOOKS := $(addprefix $(obj)/,$(DOCBOOKS))
 xmldocs: $(BOOKS)
 sgmldocs: xmldocs
 PS := $(patsubst %.xml, %.ps, $(BOOKS))
 psdocs: $(PS)
 PDF := $(patsubst %.xml, %.pdf, $(BOOKS))
 pdfdocs: $(PDF)
 HTML := $(sort $(patsubst %.xml, %.html, $(BOOKS)))
 htmldocs: $(HTML)
 	$(call cmd,build_main_index)
 MAN := $(patsubst %.xml, %.9, $(BOOKS))
 mandocs: $(MAN)
 	find $(obj)/man -name '*.9' | xargs gzip -nf
 installmandocs: mandocs
 	mkdir -p /usr/local/man/man9/
 	find $(obj)/man -name '*.9.gz' -printf '%h %f\n' | \
 		sort -k 2 -k 1 | uniq -f 1 | sed -e 's: :/:' | \
 		xargs install -m 644 -t /usr/local/man/man9/
 # no-op for the DocBook toolchain
 epubdocs:
 latexdocs:
 linkcheckdocs:
 ###
 #External programs used
 KERNELDOCXMLREF = $(srctree)/scripts/kernel-doc-xml-ref
 KERNELDOC       = $(srctree)/scripts/kernel-doc
 DOCPROC         = $(objtree)/scripts/docproc
 CHECK_LC_CTYPE = $(objtree)/scripts/check-lc_ctype
 # Use a fixed encoding - UTF-8 if the C library has support built-in
 # or ASCII if not
 LC_CTYPE := $(call try-run, LC_CTYPE=C.UTF-8 $(CHECK_LC_CTYPE),C.UTF-8,C)
 export LC_CTYPE
 XMLTOFLAGS = -m $(srctree)/$(src)/stylesheet.xsl
 XMLTOFLAGS += --skip-validation
 ###
 # DOCPROC is used for two purposes:
 # 1) To generate a dependency list for a .tmpl file
 # 2) To preprocess a .tmpl file and call kernel-doc with
 #     appropriate parameters.
 # The following rules are used to generate the .xml documentation
 # required to generate the final targets. (ps, pdf, html).
 quiet_cmd_docproc = DOCPROC $@
       cmd_docproc = SRCTREE=$(srctree)/ $(DOCPROC) doc $< >$@
 define rule_docproc
 	set -e;								\
         $(if $($(quiet)cmd_$(1)),echo '  $($(quiet)cmd_$(1))';) 	\
         $(cmd_$(1)); 							\
         ( 								\
           echo 'cmd_$@ := $(cmd_$(1))'; 				\
           echo $@: `SRCTREE=$(srctree) $(DOCPROC) depend $<`; 		\
         ) > $(dir $@).$(notdir $@).cmd
 endef
 %.xml: %.tmpl $(KERNELDOC) $(DOCPROC) $(KERNELDOCXMLREF) FORCE
 	$(call if_changed_rule,docproc)
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
 notfoundtemplate = echo "*** You have to install docbook-utils or xmlto ***"; \
 		   exit 1
 db2xtemplate = db2TYPE -o $(dir $@) $<
 xmltotemplate = xmlto TYPE $(XMLTOFLAGS) -o $(dir $@) $<
 # determine which methods are available
 ifeq ($(shell which db2ps >/dev/null 2>&1 && echo found),found)
 	use-db2x = db2x
 	prefer-db2x = db2x
 else
 	use-db2x = notfound
 	prefer-db2x = $(use-xmlto)
 endif
 ifeq ($(shell which xmlto >/dev/null 2>&1 && echo found),found)
 	use-xmlto = xmlto
 	prefer-xmlto = xmlto
 else
 	use-xmlto = notfound
 	prefer-xmlto = $(use-db2x)
 endif
 # the commands, generated from the chosen template
 quiet_cmd_db2ps = PS      $@
       cmd_db2ps = $(subst TYPE,ps, $($(PS_METHOD)template))
 %.ps : %.xml
 	$(call cmd,db2ps)
 quiet_cmd_db2pdf = PDF     $@
       cmd_db2pdf = $(subst TYPE,pdf, $($(PDF_METHOD)template))
 %.pdf : %.xml
 	$(call cmd,db2pdf)
 index = index.html
 main_idx = $(obj)/$(index)
 quiet_cmd_build_main_index = HTML    $(main_idx)
       cmd_build_main_index = rm -rf $(main_idx); \
 		   echo '<h1>Linux Kernel HTML Documentation</h1>' >> $(main_idx) && \
 		   echo '<h2>Kernel Version: $(KERNELVERSION)</h2>' >> $(main_idx) && \
 		   cat $(HTML) >> $(main_idx)
 quiet_cmd_db2html = HTML    $@
       cmd_db2html = xmlto html $(XMLTOFLAGS) -o $(patsubst %.html,%,$@) $< && \
 		echo '<a HREF="$(patsubst %.html,%,$(notdir $@))/index.html"> \
 		$(patsubst %.html,%,$(notdir $@))</a><p>' > $@
 ###
 # Rules to create an aux XML and .db, and use them to re-process the DocBook XML
 # to fill internal hyperlinks
        gen_aux_xml = :
  quiet_gen_aux_xml = echo '  XMLREF  $@'
 silent_gen_aux_xml = :
 %.aux.xml: %.xml
 	@$($(quiet)gen_aux_xml)
 	@rm -rf $@
 	@(cat $< | egrep "^<refentry id" | egrep -o "\".*\"" | cut -f 2 -d \" > $<.db)
 	@$(KERNELDOCXMLREF) -db $<.db $< > $@
 .PRECIOUS: %.aux.xml
 %.html:	%.aux.xml
 	@(which xmlto > /dev/null 2>&1) || \
 	 (echo "*** You need to install xmlto ***"; \
 	  exit 1)
 	@rm -rf $@ $(patsubst %.html,%,$@)
 	$(call cmd,db2html)
 	@if [ ! -z "$(PNG-$(basename $(notdir $@)))" ]; then \
             cp $(PNG-$(basename $(notdir $@))) $(patsubst %.html,%,$@); fi
 quiet_cmd_db2man = MAN     $@
       cmd_db2man = if grep -q refentry $<; then xmlto man $(XMLTOFLAGS) -o $(obj)/man/$(*F) $< ; fi
 %.9 : %.xml
 	@(which xmlto > /dev/null 2>&1) || \
 	 (echo "*** You need to install xmlto ***"; \
 	  exit 1)
 	$(Q)mkdir -p $(obj)/man/$(*F)
 	$(call cmd,db2man)
 	@touch $@
 ###
 # Rules to generate postscripts and PNG images from .fig format files
 quiet_cmd_fig2eps = FIG2EPS $@
       cmd_fig2eps = fig2dev -Leps $< $@
 %.eps: %.fig
 	@(which fig2dev > /dev/null 2>&1) || \
 	 (echo "*** You need to install transfig ***"; \
 	  exit 1)
 	$(call cmd,fig2eps)
 quiet_cmd_fig2png = FIG2PNG $@
       cmd_fig2png = fig2dev -Lpng $< $@
 %.png: %.fig
 	@(which fig2dev > /dev/null 2>&1) || \
 	 (echo "*** You need to install transfig ***"; \
 	  exit 1)
 	$(call cmd,fig2png)
 ###
 # Rule to convert a .c file to inline XML documentation
        gen_xml = :
  quiet_gen_xml = echo '  GEN     $@'
 silent_gen_xml = :
 %.xml: %.c
 	@$($(quiet)gen_xml)
 	@(                            \
 	   echo "<programlisting>";   \
 	   expand --tabs=8 < $< |     \
 	   sed -e "s/&/\\&amp;/g"     \
 	       -e "s/</\\&lt;/g"      \
 	       -e "s/>/\\&gt;/g";     \
 	   echo "</programlisting>")  > $@
 endif # DOCBOOKS=""
 endif # SPHINDIR=...
 ###
 # Help targets as used by the top-level makefile
 dochelp:
 	@echo  ' Linux kernel internal documentation in different formats (DocBook):'
 	@echo  '  htmldocs        - HTML'
 	@echo  '  pdfdocs         - PDF'
 	@echo  '  psdocs          - Postscript'
 	@echo  '  xmldocs         - XML DocBook'
 	@echo  '  mandocs         - man pages'
 	@echo  '  installmandocs  - install man pages generated by mandocs'
 	@echo  '  cleandocs       - clean all generated DocBook files'
 	@echo
 	@echo  '  make DOCBOOKS="s1.xml s2.xml" [target] Generate only docs s1.xml s2.xml'
 	@echo  '  valid values for DOCBOOKS are: $(DOCBOOKS)'
 	@echo
 	@echo  "  make DOCBOOKS=\"\" [target] Don't generate docs from Docbook"
 	@echo  '     This is useful to generate only the ReST docs (Sphinx)'
 ###
 # Temporary files left by various tools
 clean-files := $(DOCBOOKS) \
 	$(patsubst %.xml, %.dvi,     $(DOCBOOKS)) \
 	$(patsubst %.xml, %.aux,     $(DOCBOOKS)) \
 	$(patsubst %.xml, %.tex,     $(DOCBOOKS)) \
 	$(patsubst %.xml, %.log,     $(DOCBOOKS)) \
 	$(patsubst %.xml, %.out,     $(DOCBOOKS)) \
 	$(patsubst %.xml, %.ps,      $(DOCBOOKS)) \
 	$(patsubst %.xml, %.pdf,     $(DOCBOOKS)) \
 	$(patsubst %.xml, %.html,    $(DOCBOOKS)) \
 	$(patsubst %.xml, %.9,       $(DOCBOOKS)) \
 	$(patsubst %.xml, %.aux.xml, $(DOCBOOKS)) \
 	$(patsubst %.xml, %.xml.db,  $(DOCBOOKS)) \
 	$(patsubst %.xml, %.xml,     $(DOCBOOKS)) \
 	$(patsubst %.xml, .%.xml.cmd, $(DOCBOOKS)) \
 	$(index)
 clean-dirs := $(patsubst %.xml,%,$(DOCBOOKS)) man
 cleandocs:
 	$(Q)rm -f $(call objectify, $(clean-files))
 	$(Q)rm -rf $(call objectify, $(clean-dirs))
 # Declare the contents of the .PHONY variable as phony.  We keep that
 # information in a variable so we can use it in if_changed and friends.
 .PHONY: $(PHONY)

									
										381

Documentation/DocBook/filesystems.tmpl
									
										Normal file
									
												
				@@ -0,0 +1,381 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="Linux-filesystems-API">

				 <bookinfo>

				  <title>Linux Filesystems API</title>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later

				     version.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="vfs">

				     <title>The Linux VFS</title>

				     <sect1 id="the_filesystem_types"><title>The Filesystem types</title>

				!Iinclude/linux/fs.h

				     </sect1>

				     <sect1 id="the_directory_cache"><title>The Directory Cache</title>

				!Efs/dcache.c

				!Iinclude/linux/dcache.h

				     </sect1>

				     <sect1 id="inode_handling"><title>Inode Handling</title>

				!Efs/inode.c

				!Efs/bad_inode.c

				     </sect1>

				     <sect1 id="registration_and_superblocks"><title>Registration and Superblocks</title>

				!Efs/super.c

				     </sect1>

				     <sect1 id="file_locks"><title>File Locks</title>

				!Efs/locks.c

				!Ifs/locks.c

				     </sect1>

				     <sect1 id="other_functions"><title>Other Functions</title>

				!Efs/mpage.c

				!Efs/namei.c

				!Efs/buffer.c

				!Eblock/bio.c

				!Efs/seq_file.c

				!Efs/filesystems.c

				!Efs/fs-writeback.c

				!Efs/block_dev.c

				     </sect1>

				  </chapter>

				  <chapter id="proc">

				     <title>The proc filesystem</title>

				     <sect1 id="sysctl_interface"><title>sysctl interface</title>

				!Ekernel/sysctl.c

				     </sect1>

				     <sect1 id="proc_filesystem_interface"><title>proc filesystem interface</title>

				!Ifs/proc/base.c

				     </sect1>

				  </chapter>

				  <chapter id="fs_events">

				     <title>Events based on file descriptors</title>

				!Efs/eventfd.c

				  </chapter>

				  <chapter id="sysfs">

				     <title>The Filesystem for Exporting Kernel Objects</title>

				!Efs/sysfs/file.c

				!Efs/sysfs/symlink.c

				  </chapter>

				  <chapter id="debugfs">

				     <title>The debugfs filesystem</title>

				     <sect1 id="debugfs_interface"><title>debugfs interface</title>

				!Efs/debugfs/inode.c

				!Efs/debugfs/file.c

				     </sect1>

				  </chapter>

				  <chapter id="LinuxJDBAPI">

				  <chapterinfo>

				  <title>The Linux Journalling API</title>

				  <authorgroup>

				  <author>

				     <firstname>Roger</firstname>

				     <surname>Gammans</surname>

				     <affiliation>

				     <address>

				      <email>rgammans@computer-surgery.co.uk</email>

				     </address>

				    </affiliation>

				     </author>

				  </authorgroup>

				  <authorgroup>

				   <author>

				    <firstname>Stephen</firstname>

				    <surname>Tweedie</surname>

				    <affiliation>

				     <address>

				      <email>sct@redhat.com</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2002</year>

				   <holder>Roger Gammans</holder>

				  </copyright>

				  </chapterinfo>

				  <title>The Linux Journalling API</title>

				    <sect1 id="journaling_overview">

				     <title>Overview</title>

				    <sect2 id="journaling_details">

				     <title>Details</title>

				<para>

				The journalling layer is easy to use. You need to

				first of all create a journal_t data structure. There are

				two calls to do this dependent on how you decide to allocate the physical

				media on which the journal resides. The jbd2_journal_init_inode() call

				is for journals stored in filesystem inodes, or the jbd2_journal_init_dev()

				call can be used for a journal stored on a raw device (in a contiguous range

				of blocks). Both calls return a pointer to a journal_t structure, so when

				you are finally finished make sure you call jbd2_journal_destroy() on it

				to free up any used kernel memory.

				</para>

				<para>

				Once you have got your journal_t object you need to 'mount' or load the journal

				file. The journalling layer expects the space for the journal was already

				allocated and initialized properly by the userspace tools.  When loading the

				journal you must call jbd2_journal_load() to process journal contents.  If the

				client file system detects that the journal contents do not need to be processed

				(or even need not have valid contents), it may call jbd2_journal_wipe() to

				clear the journal contents before calling jbd2_journal_load().

				</para>

				<para>

				Note that jbd2_journal_wipe(..,0) calls jbd2_journal_skip_recovery() for you if

				it detects any outstanding transactions in the journal and similarly

				jbd2_journal_load() will call jbd2_journal_recover() if necessary.  I would

				advise reading ext4_load_journal() in fs/ext4/super.c for examples on this

				stage.

				</para>
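				<para>
				As a rough illustration of the setup steps described above, and not
				code taken from any in-tree filesystem, loading an inode-backed journal
				might look like the sketch below. The myfs_open_journal() helper is
				hypothetical, and the error checks assume the prototypes declared in
				include/linux/jbd2.h for this kernel version; check the header before
				relying on them.
				</para>
				<programlisting>
				#include &lt;linux/err.h&gt;
				#include &lt;linux/fs.h&gt;
				#include &lt;linux/jbd2.h&gt;

				/* Hypothetical helper: attach and load a journal stored in an inode. */
				static journal_t *myfs_open_journal(struct inode *journal_inode)
				{
					journal_t *journal;
					int err;

					journal = jbd2_journal_init_inode(journal_inode);
					/* Covers both a NULL return and an ERR_PTR() return. */
					if (IS_ERR_OR_NULL(journal))
						return ERR_PTR(-EINVAL);

					/* Runs recovery first if the journal needs it. */
					err = jbd2_journal_load(journal);
					if (err) {
						jbd2_journal_destroy(journal);
						return ERR_PTR(err);
					}
					return journal;
				}
				</programlisting>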

				<para>

				Now you can go ahead and start modifying the underlying

				filesystem. Almost.

				</para>

				<para>

				You still need to actually journal your filesystem changes; this

				is done by wrapping them into transactions. Additionally you

				also need to wrap the modification of each of the buffers

				with calls to the journal layer, so it knows which modifications

				you are actually making. To do this use jbd2_journal_start() which

				returns a transaction handle.

				</para>

				<para>

				jbd2_journal_start()

				and its counterpart jbd2_journal_stop(), which indicates the end of a

				transaction, are nestable calls, so you can re-enter a transaction if necessary,

				but remember you must call jbd2_journal_stop() the same number of times as

				jbd2_journal_start() before the transaction is completed (or more accurately

				leaves the update phase). Ext4/VFS makes use of this feature to simplify

				handling of inode dirtying, quota support, etc.

				</para>

				<para>

				Inside each transaction you need to wrap the modifications to the

				individual buffers (blocks). Before you start to modify a buffer you

				need to call jbd2_journal_get_{create,write,undo}_access() as appropriate;

				this allows the journalling layer to copy the unmodified data if it

				needs to. After all, the buffer may be part of a previously uncommitted

				transaction.

				At this point you are at last ready to modify a buffer, and once

				you have done so you need to call jbd2_journal_dirty_metadata().

				Or if you've asked for access to a buffer you now know is no longer

				required to be pushed back on the device you can call jbd2_journal_forget()

				in much the same way as you might have used bforget() in the past.

				</para>
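				<para>
				The sequence above can be sketched as follows. This is a minimal
				illustration under stated assumptions, not a reference implementation:
				myfs_update_block() is a hypothetical helper, the single-block credit
				reservation is chosen only for the example, and the prototypes are those
				declared in include/linux/jbd2.h for this kernel version.
				</para>
				<programlisting>
				#include &lt;linux/buffer_head.h&gt;
				#include &lt;linux/err.h&gt;
				#include &lt;linux/jbd2.h&gt;

				/* Hypothetical helper: journal a change to one metadata buffer. */
				static int myfs_update_block(journal_t *journal, struct buffer_head *bh)
				{
					handle_t *handle;
					int err, err2;

					/* Reserve credits for the single block touched here. */
					handle = jbd2_journal_start(journal, 1);
					if (IS_ERR(handle))
						return PTR_ERR(handle);

					/* Tell the journal before touching the buffer. */
					err = jbd2_journal_get_write_access(handle, bh);
					if (!err) {
						/* ... modify bh->b_data here ... */
						err = jbd2_journal_dirty_metadata(handle, bh);
					}

					/* Always pair every start with a stop, even on error. */
					err2 = jbd2_journal_stop(handle);
					return err ? err : err2;
				}
				</programlisting>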

				<para>

				A jbd2_journal_flush() may be called at any time to commit and checkpoint

				all your transactions.

				</para>

				<para>

				Then at umount time, in your put_super(), you can call jbd2_journal_destroy()

				to clean up your in-core journal object.

				</para>

				<para>

				Unfortunately there are a couple of ways the journal layer can cause a deadlock.

				The first thing to note is that each task can only have

				a single outstanding transaction at any one time; remember, nothing

				commits until the outermost jbd2_journal_stop(). This means

				you must complete the transaction at the end of each file/inode/address

				etc. operation you perform, so that the journalling system isn't re-entered

				on another journal. This matters because transactions can't be nested/batched

				across differing journals, and a filesystem other than

				yours (say ext4) may be modified in a later syscall.

				</para>

				<para>

				The second case to bear in mind is that jbd2_journal_start() can

				block if there isn't enough space in the journal for your transaction

				(based on the passed nblocks param). When it blocks it merely(!) needs to

				wait for transactions to complete and be committed from other tasks,

				so essentially we are waiting for jbd2_journal_stop(). So to avoid

				deadlocks you must treat jbd2_journal_start/stop() as if they

				were semaphores and include them in your semaphore ordering rules to prevent

				deadlocks. Note that jbd2_journal_extend() has similar blocking behaviour to

				jbd2_journal_start() so you can deadlock here just as easily as on

				jbd2_journal_start().

				</para>

				<para>

				Try to reserve the right number of blocks the first time. ;-). This will

				be the maximum number of blocks you are going to touch in this transaction.

				I advise having a look at least at ext4_jbd.h to see the basis on which

				ext4 makes these decisions.

				</para>

				<para>

				Another wriggle to watch out for is your on-disk block allocation strategy.

				Why? Because, if you do a delete, you need to ensure you haven't reused any

				of the freed blocks until the transaction freeing these blocks commits. If you

				reuse these blocks and a crash happens, there is no way to restore the contents

				of the reallocated blocks at the end of the last fully committed transaction.

				One simple way of doing this is to mark blocks as free in internal in-memory

				block allocation structures only after the transaction freeing them commits.

				Ext4 uses the journal commit callback for this purpose.

				</para>

				<para>

				With journal commit callbacks you can ask the journalling layer to call a

				callback function when the transaction is finally committed to disk, so that

				you can do some of your own management. You ask the journalling layer to

				call the callback by simply setting the journal->j_commit_callback function

				pointer and that function is called after each transaction commit. You can also

				use transaction->t_private_list for attaching entries to a transaction that

				need processing when the transaction commits.

				</para>
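				<para>
				A minimal sketch of installing such a callback is given here for
				orientation only. Both function names are hypothetical, and the callback
				signature is assumed to match the j_commit_callback member declared in
				include/linux/jbd2.h for this kernel version.
				</para>
				<programlisting>
				#include &lt;linux/jbd2.h&gt;

				/* Hypothetical callback, invoked after each transaction commit. */
				static void myfs_commit_callback(journal_t *journal, transaction_t *txn)
				{
					/*
					 * For example: walk txn->t_private_list and only now mark
					 * the blocks freed by this transaction as reusable.
					 */
				}

				/* Hypothetical helper, called once after the journal is loaded. */
				static void myfs_setup_commit_callback(journal_t *journal)
				{
					journal->j_commit_callback = myfs_commit_callback;
				}
				</programlisting>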

				<para>

				JBD2 also provides a way to block all transaction updates via

				jbd2_journal_{un,}lock_updates(). Ext4 uses this when it wants a window with a

				clean and stable fs for a moment.  E.g.

				</para>

				<programlisting>

					jbd2_journal_lock_updates(journal);   /* stop new stuff happening.. */

					jbd2_journal_flush(journal);          /* checkpoint everything. */

					/* ..do stuff on stable fs */

					jbd2_journal_unlock_updates(journal); /* carry on with filesystem use. */

				</programlisting>

				<para>

				The opportunities for abuse and DoS attacks with this should be obvious,

				if you allow unprivileged userspace to trigger codepaths containing these

				calls.

				</para>

				    </sect2>

				    <sect2 id="jbd_summary">

				     <title>Summary</title>

				<para>

				Using the journal is a matter of wrapping the different context changes,

				being each mount, each modification (transaction) and each changed buffer

				to tell the journalling layer about them.

				</para>

				    </sect2>

				    </sect1>

				    <sect1 id="data_types">

				     <title>Data Types</title>

				     <para>

					The journalling layer uses typedefs to 'hide' the concrete definitions

					of the structures used. As a client of the JBD2 layer you can

					just rely on using the pointer as a magic cookie of some sort.

					Obviously the hiding is not enforced as this is 'C'.

				     </para>

					<sect2 id="structures"><title>Structures</title>

				!Iinclude/linux/jbd2.h

					</sect2>

				    </sect1>

				    <sect1 id="functions">

				     <title>Functions</title>

				     <para>

					The functions here are split into two groups: those that

					affect a journal as a whole, and those which are used to

					manage transactions.

				     </para>

					<sect2 id="journal_level"><title>Journal Level</title>

				!Efs/jbd2/journal.c

				!Ifs/jbd2/recovery.c

					</sect2>

					<sect2 id="transaction_level"><title>Transaction Level</title>

				!Efs/jbd2/transaction.c

					</sect2>

				    </sect1>

				    <sect1 id="see_also">

				     <title>See also</title>

					<para>

					  <citation>

					   <ulink url="http://kernel.org/pub/linux/kernel/people/sct/ext3/journal-design.ps.gz">

					   	Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen Tweedie

					   </ulink>

					  </citation>

					</para>

					<para>

					   <citation>

					   <ulink url="http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html">

					   	Ext3 Journalling FileSystem, OLS 2000, Dr. Stephen Tweedie

					   </ulink>

					   </citation>

					</para>

				    </sect1>

				  </chapter>

				  <chapter id="splice">

				      <title>splice API</title>

				  <para>

					splice is a method for moving blocks of data around inside the

					kernel, without continually transferring them between the kernel

					and user space.

				  </para>

				!Ffs/splice.c

				  </chapter>

				  <chapter id="pipes">

				      <title>pipes API</title>

				  <para>

					Pipe interfaces are all for in-kernel (builtin image) use.

					They are not exported for use by modules.

				  </para>

				!Iinclude/linux/pipe_fs_i.h

				!Ffs/pipe.c

				  </chapter>

				</book>

									
										793

Documentation/DocBook/gadget.tmpl
									
										Normal file
									
												
				@@ -0,0 +1,793 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="USB-Gadget-API">

				  <bookinfo>

				    <title>USB Gadget API for Linux</title>

				    <date>20 August 2004</date>

				    <edition>20 August 2004</edition>

				    <legalnotice>

				       <para>

					 This documentation is free software; you can redistribute

					 it and/or modify it under the terms of the GNU General Public

					 License as published by the Free Software Foundation; either

					 version 2 of the License, or (at your option) any later

					 version.

				       </para>

				       <para>

					 This program is distributed in the hope that it will be

					 useful, but WITHOUT ANY WARRANTY; without even the implied

					 warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

					 See the GNU General Public License for more details.

				       </para>

				       <para>

					 You should have received a copy of the GNU General Public

					 License along with this program; if not, write to the Free

					 Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

					 MA 02111-1307 USA

				       </para>

				       <para>

					 For more details see the file COPYING in the source

					 distribution of Linux.

				       </para>

				    </legalnotice>

				    <copyright>

				      <year>2003-2004</year>

				      <holder>David Brownell</holder>

				    </copyright>

				    <author>

				      <firstname>David</firstname> 

				      <surname>Brownell</surname>

				      <affiliation>

				        <address><email>dbrownell@users.sourceforge.net</email></address>

				      </affiliation>

				    </author>

				  </bookinfo>

				<toc></toc>

				<chapter id="intro"><title>Introduction</title>

				<para>This document presents a Linux-USB "Gadget"

				kernel mode

				API, for use within peripherals and other USB devices

				that embed Linux.

				It provides an overview of the API structure,

				and shows how that fits into a system development project.

				This is the first such API released on Linux to address

				a number of important problems, including: </para>

				<itemizedlist>

				    <listitem><para>Supports USB 2.0, for high speed devices which

					can stream data at several dozen megabytes per second.

					</para></listitem>

				    <listitem><para>Handles devices with dozens of endpoints just as

					well as ones with just two fixed-function ones.  Gadget drivers

					can be written so they're easy to port to new hardware.

					</para></listitem>

				    <listitem><para>Flexible enough to expose more complex USB device

					capabilities such as multiple configurations, multiple interfaces,

					composite devices,

					and alternate interface settings.

					</para></listitem>

				    <listitem><para>USB "On-The-Go" (OTG) support, in conjunction

					with updates to the Linux-USB host side.

					</para></listitem>

				    <listitem><para>Sharing data structures and API models with the

					Linux-USB host side API.  This helps the OTG support, and

					looks forward to more-symmetric frameworks (where the same

					I/O model is used by both host and device side drivers).

					</para></listitem>

				    <listitem><para>Minimalist, so it's easier to support new device

					controller hardware.  I/O processing doesn't imply large

					demands for memory or CPU resources.

					</para></listitem>

				</itemizedlist>

				<para>Most Linux developers will not be able to use this API, since they

				have USB "host" hardware in a PC, workstation, or server.

				Linux users with embedded systems are more likely to

				have USB peripheral hardware.

				To distinguish drivers running inside such hardware from the

				more familiar Linux "USB device drivers",

				which are host side proxies for the real USB devices,

				a different term is used:

				the drivers inside the peripherals are "USB gadget drivers".

				In USB protocol interactions, the device driver is the master

				(or "client driver")

				and the gadget driver is the slave (or "function driver").

				</para>

				<para>The gadget API resembles the host side Linux-USB API in that both

				use queues of request objects to package I/O buffers, and those requests

				may be submitted or canceled.

				They share common definitions for the standard USB

				<emphasis>Chapter 9</emphasis> messages, structures, and constants.

				Also, both APIs bind and unbind drivers to devices.

				The APIs differ in detail, since the host side's current

				URB framework exposes a number of implementation details

				and assumptions that are inappropriate for a gadget API.

				While the model for control transfers and configuration

				management is necessarily different (one side is a hardware-neutral master,

				the other is a hardware-aware slave), the endpoint I/O API used here

				should also be usable for an overhead-reduced host side API.

				</para>

				</chapter>

				<chapter id="structure"><title>Structure of Gadget Drivers</title>

				<para>A system running inside a USB peripheral

				normally has at least three layers inside the kernel to handle

				USB protocol processing, and may have additional layers in

				user space code.

				The "gadget" API is used by the middle layer to interact

				with the lowest level (which directly handles hardware).

				</para>

				<para>In Linux, from the bottom up, these layers are:

				</para>

				<variablelist>

				    <varlistentry>

				        <term><emphasis>USB Controller Driver</emphasis></term>

					<listitem>

					<para>This is the lowest software level.

					It is the only layer that talks to hardware,

					through registers, FIFOs, DMA, IRQs, and the like.

					The <filename>&lt;linux/usb/gadget.h&gt;</filename> API abstracts

					the peripheral controller endpoint hardware.

					That hardware is exposed through endpoint objects, which accept

					streams of IN/OUT buffers, and through callbacks that interact

					with gadget drivers.

					Since normal USB devices only have one upstream

					port, they only have one of these drivers.

					The controller driver can support any number of different

					gadget drivers, but only one of them can be used at a time.

					</para>

					<para>Examples of such controller hardware include

					the PCI-based NetChip 2280 USB 2.0 high speed controller,

					the SA-11x0 or PXA-25x UDC (found within many PDAs),

					and a variety of other products.

					</para>

					</listitem></varlistentry>

				    <varlistentry>

					<term><emphasis>Gadget Driver</emphasis></term>

					<listitem>

					<para>The lower boundary of this driver implements hardware-neutral

					USB functions, using calls to the controller driver.

					Because such hardware varies widely in capabilities and restrictions,

					and is used in embedded environments where space is at a premium,

					the gadget driver is often configured at compile time

					to work with endpoints supported by one particular controller.

					Gadget drivers may be portable to several different controllers,

					using conditional compilation.

					(Recent kernels substantially simplify the work involved in

					supporting new hardware, by <emphasis>autoconfiguring</emphasis>

					endpoints for many bulk-oriented drivers.)

					Gadget driver responsibilities include:

					</para>

					<itemizedlist>

					    <listitem><para>handling setup requests (ep0 protocol responses)

						possibly including class-specific functionality

						</para></listitem>

					    <listitem><para>returning configuration and string descriptors

						</para></listitem>

					    <listitem><para>(re)setting configurations and interface

						altsettings, including enabling and configuring endpoints

						</para></listitem>

					    <listitem><para>handling life cycle events, such as managing

						bindings to hardware,

						USB suspend/resume, remote wakeup,

						and disconnection from the USB host.

						</para></listitem>

					    <listitem><para>managing IN and OUT transfers on all currently

						enabled endpoints

						</para></listitem>

					</itemizedlist>

					<para>

					Such drivers may be modules of proprietary code, although

					that approach is discouraged in the Linux community.

					</para>

					</listitem></varlistentry>

				    <varlistentry>

					<term><emphasis>Upper Level</emphasis></term>

					<listitem>

					<para>Most gadget drivers have an upper boundary that connects

					to some Linux driver or framework.

					Through that boundary flows the data which the gadget driver

					produces and/or consumes through protocol transfers over USB.

					Examples include:

					</para>

					<itemizedlist>

					    <listitem><para>user mode code, using generic (gadgetfs)

					        or application specific files in

						<filename>/dev</filename>

						</para></listitem>

					    <listitem><para>networking subsystem (for network gadgets,

						like the CDC Ethernet Model gadget driver)

						</para></listitem>

					    <listitem><para>data capture drivers, perhaps Video4Linux or

						 a scanner driver; or test and measurement hardware.

						 </para></listitem>

					    <listitem><para>input subsystem (for HID gadgets)

						</para></listitem>

					    <listitem><para>sound subsystem (for audio gadgets)

						</para></listitem>

					    <listitem><para>file system (for PTP gadgets)

						</para></listitem>

					    <listitem><para>block i/o subsystem (for usb-storage gadgets)

						</para></listitem>

					    <listitem><para>... and more </para></listitem>

					</itemizedlist>

					</listitem></varlistentry>

				    <varlistentry>

					<term><emphasis>Additional Layers</emphasis></term>

					<listitem>

					<para>Other layers may exist.

					These could include kernel layers, such as network protocol stacks,

					as well as user mode applications building on standard POSIX

					system call APIs such as

					<emphasis>open()</emphasis>, <emphasis>close()</emphasis>,

					<emphasis>read()</emphasis> and <emphasis>write()</emphasis>.

					On newer systems, POSIX Async I/O calls may be an option.

					Such user mode code will not necessarily be subject to

					the GNU General Public License (GPL).

					</para>

					</listitem></varlistentry>

				</variablelist>

				<para>OTG-capable systems will also need to include a standard Linux-USB

				host side stack,

				with <emphasis>usbcore</emphasis>,

				one or more <emphasis>Host Controller Drivers</emphasis> (HCDs),

				<emphasis>USB Device Drivers</emphasis> to support

				the OTG "Targeted Peripheral List",

				and so forth.

				There will also be an <emphasis>OTG Controller Driver</emphasis>,

				which is visible to gadget and device driver developers only indirectly.

				That helps the host and device side USB controllers implement the

				two new OTG protocols (HNP and SRP).

				Roles switch (host to peripheral, or vice versa) using HNP

				during USB suspend processing, and SRP can be viewed as a

				more battery-friendly kind of device wakeup protocol.

				</para>

				<para>Over time, reusable utilities are evolving to help make some

				gadget driver tasks simpler.

				For example, building configuration descriptors from vectors of

				descriptors for the configuration's interfaces and endpoints is

				now automated, and many drivers now use autoconfiguration to

				choose hardware endpoints and initialize their descriptors.

				A potential example of particular interest

				is code implementing standard USB-IF protocols for

				HID, networking, storage, or audio classes.

				Some developers are interested in KDB or KGDB hooks, to let

				target hardware be remotely debugged.

				Most such USB protocol code doesn't need to be hardware-specific,

				any more than network protocols like X11, HTTP, or NFS are.

				Such gadget-side interface drivers should eventually be combined,

				to implement composite devices.

				</para>

				</chapter>

				<chapter id="api"><title>Kernel Mode Gadget API</title>

				<para>Gadget drivers declare themselves through a

				<emphasis>struct usb_gadget_driver</emphasis>, which is responsible for

				most parts of enumeration for a <emphasis>struct usb_gadget</emphasis>.

				The response to a set_configuration usually involves

				enabling one or more of the <emphasis>struct usb_ep</emphasis> objects

				exposed by the gadget, and submitting one or more

				<emphasis>struct usb_request</emphasis> buffers to transfer data.

				Understand those four data types, and their operations, and

				you will understand how this API works.

				</para> 
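				<para>As a hedged illustration of how these types fit together, a
				skeleton gadget driver might be declared roughly as follows.
				All <emphasis>my_*</emphasis> names are hypothetical, the callbacks are
				empty placeholders, and the callback signatures and the
				<function>usb_gadget_probe_driver</function> registration call are
				assumed to match <filename>&lt;linux/usb/gadget.h&gt;</filename> in
				this kernel version; older kernels registered drivers differently.
				</para>
				<programlisting>
				#include &lt;linux/module.h&gt;
				#include &lt;linux/usb/ch9.h&gt;
				#include &lt;linux/usb/gadget.h&gt;

				static int my_bind(struct usb_gadget *gadget,
						   struct usb_gadget_driver *driver)
				{
					return 0;	/* a real driver claims endpoints here */
				}

				static void my_unbind(struct usb_gadget *gadget)
				{
				}

				static int my_setup(struct usb_gadget *gadget,
						    const struct usb_ctrlrequest *ctrl)
				{
					return -EOPNOTSUPP;	/* stall anything not handled */
				}

				static void my_disconnect(struct usb_gadget *gadget)
				{
				}

				static struct usb_gadget_driver my_driver = {
					.function	= "my gadget",
					.max_speed	= USB_SPEED_HIGH,
					.bind		= my_bind,
					.unbind		= my_unbind,
					.setup		= my_setup,
					.disconnect	= my_disconnect,
					.driver		= {
						.name	= "my_gadget",
						.owner	= THIS_MODULE,
					},
				};

				static int __init my_init(void)
				{
					return usb_gadget_probe_driver(&amp;my_driver);
				}
				module_init(my_init);

				static void __exit my_exit(void)
				{
					usb_gadget_unregister_driver(&amp;my_driver);
				}
				module_exit(my_exit);

				MODULE_LICENSE("GPL");
				</programlisting>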

				<note><title>Incomplete Data Type Descriptions</title>

				<para>This documentation was prepared using the standard Linux

				kernel <filename>docproc</filename> tool, which turns text

				and in-code comments into SGML DocBook and then into usable

				formats such as HTML or PDF.

				Other than the "Chapter 9" data types, most of the significant

				data types and functions are described here.

				</para>

				<para>However, docproc does not understand all the C constructs

				that are used, so some relevant information is likely omitted from

				what you are reading.  

				One example of such information is endpoint autoconfiguration.

				You'll have to read the header file, and use example source

				code (such as that for "Gadget Zero"), to fully understand the API.

				</para>

				<para>The part of the API implementing some basic

				driver capabilities is specific to the version of the

				Linux kernel that's in use.

				The 2.6 kernel includes a <emphasis>driver model</emphasis>

				framework that has no analogue on earlier kernels;

				so those parts of the gadget API are not fully portable.

				(They are implemented on 2.4 kernels, but in a different way.)

				The driver model state is another part of this API that is

				ignored by the kerneldoc tools.

				</para>

				</note>

				<para>The core API does not expose

				every possible hardware feature, only the most widely available ones.

				There are significant hardware features, such as device-to-device DMA

				(without temporary storage in a memory buffer)

				that would be added using hardware-specific APIs.

				</para>

				<para>This API allows drivers to use conditional compilation to handle

				endpoint capabilities of different hardware, but doesn't require that.

				Hardware tends to have arbitrary restrictions, relating to

				transfer types, addressing, packet sizes, buffering, and availability.

				As a rule, such differences only matter for "endpoint zero" logic

				that handles device configuration and management.

				The API supports limited run-time

				detection of capabilities, through naming conventions for endpoints.

				Many drivers will be able to at least partially autoconfigure

				themselves.

				In particular, driver init sections will often have endpoint

				autoconfiguration logic that scans the hardware's list of endpoints

				to find ones matching the driver requirements

				(relying on those conventions), to eliminate some of the most

				common reasons for conditional compilation.

				</para>
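				<para>The autoconfiguration step mentioned above can be sketched as
				follows. This is an illustration under assumptions, not the method of
				any particular in-tree driver: the descriptor and the
				<emphasis>my_pick_bulk_in</emphasis> helper are hypothetical, and
				<function>usb_ep_autoconfig</function> is assumed to have the prototype
				declared in <filename>&lt;linux/usb/gadget.h&gt;</filename> for this
				kernel version.
				</para>
				<programlisting>
				#include &lt;linux/usb/ch9.h&gt;
				#include &lt;linux/usb/gadget.h&gt;

				/*
				 * Hypothetical bulk IN endpoint descriptor. Only the direction is
				 * given; autoconfiguration fills in the endpoint number.
				 */
				static struct usb_endpoint_descriptor my_bulk_in_desc = {
					.bLength		= USB_DT_ENDPOINT_SIZE,
					.bDescriptorType	= USB_DT_ENDPOINT,
					.bEndpointAddress	= USB_DIR_IN,
					.bmAttributes		= USB_ENDPOINT_XFER_BULK,
				};

				/* Typically called from the driver's bind() callback. */
				static struct usb_ep *my_pick_bulk_in(struct usb_gadget *gadget)
				{
					/* NULL means no hardware endpoint matches the descriptor. */
					return usb_ep_autoconfig(gadget, &amp;my_bulk_in_desc);
				}
				</programlisting>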

				<para>Like the Linux-USB host side API, this API exposes

				the "chunky" nature of USB messages:  I/O requests are in terms

				of one or more "packets", and packet boundaries are visible to drivers.

				Compared to RS-232 serial protocols, USB resembles

				synchronous protocols like HDLC

				(N bytes per frame, multipoint addressing, host as the primary

				station and devices as secondary stations)

				more than asynchronous ones

				(tty style:  8 data bits per frame, no parity, one stop bit).

				So for example the controller drivers won't buffer

				two single byte writes into a single two-byte USB IN packet,

				although gadget drivers may do so when they implement

				protocols where packet boundaries (and "short packets")

				are not significant.

				</para>

				<sect1 id="lifecycle"><title>Driver Life Cycle</title>

				<para>Gadget drivers make endpoint I/O requests to hardware without

				needing to know many details of the hardware, but driver

				setup/configuration code needs to handle some differences.

				Use the API like this:

				</para>

				<orderedlist numeration='arabic'>

				<listitem><para>Register a driver for the particular device side

				usb controller hardware,

				such as the net2280 on PCI (USB 2.0),

				sa11x0 or pxa25x as found in Linux PDAs,

				and so on.

				At this point the device is logically in the USB ch9 initial state

				("attached"), drawing no power and not usable

				(since it does not yet support enumeration).

				No host should see the device, since it has not

				activated the data line pullup used by the host to

				detect a device, even if VBUS power is available.

				</para></listitem>

				<listitem><para>Register a gadget driver that implements some higher level

				device function.  That will then bind() to a usb_gadget, which

				activates the data line pullup sometime after detecting VBUS.

				</para></listitem>

				<listitem><para>The hardware driver can now start enumerating.

				The steps it handles are to accept USB power and set_address requests.

				Other steps are handled by the gadget driver.

				If the gadget driver module is unloaded before the host starts to

				enumerate, steps before step 7 are skipped.

				</para></listitem>

				<listitem><para>The gadget driver's setup() call returns usb descriptors,

				based both on what the bus interface hardware provides and on the

				functionality being implemented.

				That can involve alternate settings or configurations,

				unless the hardware prevents such operation.

				For OTG devices, each configuration descriptor includes

				an OTG descriptor.

				</para></listitem>

				<listitem><para>The gadget driver handles the last step of enumeration,

				when the USB host issues a set_configuration call.

				It enables all endpoints used in that configuration,

				with all interfaces in their default settings.

				That involves using a list of the hardware's endpoints, enabling each

				endpoint according to its descriptor.

				It may also involve using <function>usb_gadget_vbus_draw</function>

				to let more power be drawn from VBUS, as allowed by that configuration.

				For OTG devices, setting a configuration may also involve reporting

				HNP capabilities through a user interface.

				</para></listitem>

				<listitem><para>Do real work and perform data transfers, possibly involving

				changes to interface settings or switching to new configurations, until the

				device is disconnect()ed from the host.

				Queue any number of transfer requests to each endpoint.

				It may be suspended and resumed several times before being disconnected.

				On disconnect, the drivers go back to step 3 (above).

				</para></listitem>

				<listitem><para>When the gadget driver module is being unloaded,

				the driver unbind() callback is issued.  That lets the controller

				driver be unloaded.

				</para></listitem>

				</orderedlist>
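				<para>The steps above can be read as a small state machine.
				The following user-space sketch is only a model of that reading:
				the state and event names are hypothetical, suspend/resume is left out,
				and unbind is assumed to return the device to the step 1 state.
				</para>

```c
#include <assert.h>
#include <stddef.h>

enum gadget_state {
	ST_ATTACHED,    /* step 1: only the controller driver is registered */
	ST_BOUND,       /* step 2: gadget driver bound, waiting for a host */
	ST_ADDRESSED,   /* steps 3-4: the host has issued set_address */
	ST_CONFIGURED   /* steps 5-6: a configuration is active */
};

enum gadget_event {
	EV_BIND, EV_SET_ADDRESS, EV_SET_CONFIGURATION, EV_DISCONNECT, EV_UNBIND
};

/* Return the state that follows s when ev happens; ignore events that do not apply. */
static enum gadget_state gadget_next(enum gadget_state s, enum gadget_event ev)
{
	switch (ev) {
	case EV_BIND:
		return s == ST_ATTACHED ? ST_BOUND : s;
	case EV_SET_ADDRESS:
		return s == ST_BOUND ? ST_ADDRESSED : s;
	case EV_SET_CONFIGURATION:
		return s >= ST_ADDRESSED ? ST_CONFIGURED : s;
	case EV_DISCONNECT:
		return s >= ST_ADDRESSED ? ST_BOUND : s;   /* back to step 3 */
	case EV_UNBIND:
		return ST_ATTACHED;                        /* step 7 */
	}
	return s;
}

/* Apply a sequence of events, starting from state s. */
static enum gadget_state gadget_run(enum gadget_state s,
				    const enum gadget_event *evs, size_t n)
{
	size_t i;

	assert(evs || n == 0);
	for (i = 0; i < n; i++)
		s = gadget_next(s, evs[i]);
	return s;
}
```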

				<para>Drivers will normally be arranged so that just loading the

				gadget driver module (or statically linking it into a Linux kernel)

				allows the peripheral device to be enumerated, but some drivers

				will defer enumeration until some higher level component (like

				a user mode daemon) enables it.

				Note that at this lowest level there are no policies about how

				ep0 configuration logic is implemented,

				except that it should obey USB specifications.

				Such issues are in the domain of gadget drivers,

				including knowing about implementation constraints

				imposed by some USB controllers

				or understanding that composite devices might happen to

				be built by integrating reusable components.

				</para>

				<para>Note that the lifecycle above can be slightly different

				for OTG devices.

				Other than providing an additional OTG descriptor in each

				configuration, only the HNP-related differences are particularly

				visible to driver code.

				They involve reporting requirements during the SET_CONFIGURATION

				request, and the option to invoke HNP during some suspend callbacks.

				Also, SRP changes the semantics of

				<function>usb_gadget_wakeup</function>

				slightly.

				</para>

				</sect1>

				<sect1 id="ch9"><title>USB 2.0 Chapter 9 Types and Constants</title>

				<para>Gadget drivers

				rely on common USB structures and constants

				defined in the

				<filename>&lt;linux/usb/ch9.h&gt;</filename>

				header file, which is standard in Linux 2.6 kernels.

				These are the same types and constants used by host

				side drivers (and usbcore).

				</para>

				!Iinclude/linux/usb/ch9.h

				</sect1>

				<sect1 id="core"><title>Core Objects and Methods</title>

				<para>These are declared in

				<filename>&lt;linux/usb/gadget.h&gt;</filename>,

				and are used by gadget drivers to interact with

				USB peripheral controller drivers.

				</para>

					<!-- yeech, this is ugly in nsgmls PDF output.

					     the PDF bookmark and refentry output nesting is wrong,

					     and the member/argument documentation indents ugly.

					     plus something (docproc?) adds whitespace before the

					     descriptive paragraph text, so it can't line up right

					     unless the explanations are trivial.

					  -->

				!Iinclude/linux/usb/gadget.h

				</sect1>

				<sect1 id="utils"><title>Optional Utilities</title>

				<para>The core API is sufficient for writing a USB Gadget Driver,

				but some optional utilities are provided to simplify common tasks.

				These utilities include endpoint autoconfiguration.

				</para>

				!Edrivers/usb/gadget/usbstring.c

				!Edrivers/usb/gadget/config.c

				<!-- !Edrivers/usb/gadget/epautoconf.c -->

				</sect1>

				<sect1 id="composite"><title>Composite Device Framework</title>

				<para>The core API is sufficient for writing drivers for composite

				USB devices (with more than one function in a given configuration),

				and also multi-configuration devices (also more than one function,

				but not necessarily sharing a given configuration).

				There is however an optional framework which makes it easier to

				reuse and combine functions.

				</para>

				<para>Devices using this framework provide a <emphasis>struct

				usb_composite_driver</emphasis>, which in turn provides one or

				more <emphasis>struct usb_configuration</emphasis> instances.

				Each such configuration includes at least one

				<emphasis>struct usb_function</emphasis>, which packages a user

				visible role such as "network link" or "mass storage device".

				Management functions may also exist, such as "Device Firmware

				Upgrade".

				</para>

				!Iinclude/linux/usb/composite.h

				!Edrivers/usb/gadget/composite.c

				</sect1>

				<sect1 id="functions"><title>Composite Device Functions</title>

				<para>At this writing, a few of the current gadget drivers have

				been converted to this framework.

				Near-term plans include converting all of them, except for "gadgetfs".

				</para>

				!Edrivers/usb/gadget/function/f_acm.c

				!Edrivers/usb/gadget/function/f_ecm.c

				!Edrivers/usb/gadget/function/f_subset.c

				!Edrivers/usb/gadget/function/f_obex.c

				!Edrivers/usb/gadget/function/f_serial.c

				</sect1>

				</chapter>

				<chapter id="controllers"><title>Peripheral Controller Drivers</title>

				<para>The first hardware supporting this API was the NetChip 2280

				controller, which supports USB 2.0 high speed and is based on PCI.

				This is the <filename>net2280</filename> driver module.

				The driver supports Linux kernel versions 2.4 and 2.6;

				contact NetChip Technologies for development boards and product

				information.

				</para> 

				<para>Other hardware working in the "gadget" framework includes:

				Intel's PXA 25x and IXP42x series processors

				(<filename>pxa2xx_udc</filename>),

				Toshiba TC86c001 "Goku-S" (<filename>goku_udc</filename>),

				Renesas SH7705/7727 (<filename>sh_udc</filename>),

				MediaQ 11xx (<filename>mq11xx_udc</filename>),

				Hynix HMS30C7202 (<filename>h7202_udc</filename>),

				National 9303/4 (<filename>n9604_udc</filename>),

				Texas Instruments OMAP (<filename>omap_udc</filename>),

				Sharp LH7A40x (<filename>lh7a40x_udc</filename>),

				and more.

				Most of those are full speed controllers.

				</para>

				<para>At this writing, there are people at work on drivers in

				this framework for several other USB device controllers,

				with plans to make many of them widely available.

				</para>

				<!-- !Edrivers/usb/gadget/net2280.c -->

				<para>A partial USB simulator,

				the <filename>dummy_hcd</filename> driver, is available.

				It can act like a net2280, a pxa25x, or an sa11x0 in terms

				of available endpoints and device speeds; and it simulates

				control, bulk, and to some extent interrupt transfers.

				That lets you develop some parts of a gadget driver on a normal PC,

				without any special hardware, and perhaps with the assistance

				of tools such as GDB running with User Mode Linux.

				At least one person has expressed interest in adapting that

				approach, hooking it up to a simulator for a microcontroller.

				Such simulators can help debug subsystems where the runtime hardware

				is unfriendly to software development, or is not yet available.

				</para>

				<para>Support for other controllers is expected to be developed

				and contributed

				over time, as this driver framework evolves.

				</para>

				</chapter>

				<chapter id="gadget"><title>Gadget Drivers</title>

				<para>In addition to <emphasis>Gadget Zero</emphasis>

				(used primarily for testing and development with drivers

				for usb controller hardware), other gadget drivers exist.

				</para>

				<para>There's an <emphasis>ethernet</emphasis> gadget

				driver, which implements one of the most useful

				<emphasis>Communications Device Class</emphasis> (CDC) models.  

				One of the standards for cable modem interoperability even

				specifies the use of this ethernet model as one of two

				mandatory options.

				Gadgets using this code look to a USB host as if they're

				an Ethernet adapter.

				It provides access to a network where the gadget's CPU is one host,

				which could easily be bridging, routing, or firewalling

				access to other networks.

				Since some hardware can't fully implement the CDC Ethernet

				requirements, this driver also implements a "good parts only"

				subset of CDC Ethernet.

				(That subset doesn't advertise itself as CDC Ethernet,

				to avoid creating problems.)

				</para>

				<para>Support for Microsoft's <emphasis>RNDIS</emphasis>

				protocol has been contributed by Pengutronix and Auerswald GmbH.

				This is like CDC Ethernet, but it runs on slightly more USB hardware

				(though on less than the CDC subset does).

				However, its main claim to fame is being able to connect directly to

				recent versions of Windows, using drivers that Microsoft bundles

				and supports, making it much simpler to network with Windows.

				</para>

				<para>There is also support for user mode gadget drivers,

				using <emphasis>gadgetfs</emphasis>.

				This provides a <emphasis>User Mode API</emphasis> that presents

				each endpoint as a single file descriptor.  I/O is done using

				normal <emphasis>read()</emphasis> and <emphasis>write()</emphasis> calls.

				Familiar tools like GDB and pthreads can be used to

				develop and debug user mode drivers, so that once a robust

				controller driver is available many applications for it

				won't require new kernel mode software.

				Linux 2.6 <emphasis>Async I/O (AIO)</emphasis>

				support is available, so that user mode software

				can stream data with only slightly more overhead

				than a kernel driver.

				</para>
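				<para>Because each endpoint appears as a file descriptor, a user mode driver's
				data path can be ordinary POSIX I/O.
				The sketch below relays one chunk from an OUT endpoint descriptor to an
				IN endpoint descriptor.
				It is an assumption-laden illustration: the helper names are hypothetical,
				how the descriptors are opened and any gadgetfs-specific setup are omitted,
				and the code works on any pair of descriptors (pipes, for instance).
				</para>

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all len bytes, retrying after partial writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/* Read one chunk from out_ep_fd and echo it to in_ep_fd.
 * Returns the number of bytes relayed, 0 on end of file, -1 on error. */
static ssize_t relay_once(int out_ep_fd, int in_ep_fd, char *buf, size_t cap)
{
	ssize_t n;

	assert(buf && cap > 0);
	do {
		n = read(out_ep_fd, buf, cap);
	} while (n < 0 && errno == EINTR);
	if (n <= 0)
		return n;
	return write_all(in_ep_fd, buf, (size_t)n) == 0 ? n : -1;
}
```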

				<para>There's a USB Mass Storage class driver, which provides

				a different solution for interoperability with systems such

				as MS-Windows and MacOS.

				That <emphasis>Mass Storage</emphasis> driver uses a

				file or block device as backing store for a drive,

				like the <filename>loop</filename> driver.

				The USB host uses the BBB, CB, or CBI versions of the mass

				storage class specification, using transparent SCSI commands

				to access the data from the backing store.

				</para>

				<para>There's a "serial line" driver, useful for TTY style

				operation over USB.

				The latest version of that driver supports CDC ACM style

				operation, like a USB modem, and so on most hardware it can

				interoperate easily with MS-Windows.

				One interesting use of that driver is in boot firmware (like a BIOS),

				which can sometimes use that model with very small systems without

				real serial lines.

				</para>

				<para>Support for other kinds of gadget is expected to

				be developed and contributed

				over time, as this driver framework evolves.

				</para>

				</chapter>

				<chapter id="otg"><title>USB On-The-Go (OTG)</title>

				<para>USB OTG support on Linux 2.6 was initially developed

				by Texas Instruments for

				<ulink url="http://www.omap.com">OMAP</ulink> 16xx and 17xx

				series processors.

				Other OTG systems should work in similar ways, but the

				hardware level details could be very different.

				</para> 

				<para>Systems need specialized hardware support to implement OTG,

				notably including a special <emphasis>Mini-AB</emphasis> jack

				and associated transceiver to support <emphasis>Dual-Role</emphasis>

				operation:

				they can act either as a host, using the standard

				Linux-USB host side driver stack,

				or as a peripheral, using this "gadget" framework.

				To do that, the system software relies on small additions

				to those programming interfaces,

				and on a new internal component (here called an "OTG Controller")

				affecting which driver stack connects to the OTG port.

				In each role, the system can re-use the existing pool of

				hardware-neutral drivers, layered on top of the controller

				driver interfaces (<emphasis>usb_bus</emphasis> or

				<emphasis>usb_gadget</emphasis>).

				Such drivers need at most minor changes, and most of the calls

				added to support OTG can also benefit non-OTG products.

				</para>

				<itemizedlist>

				    <listitem><para>Gadget drivers test the <emphasis>is_otg</emphasis>

					flag, and use it to determine whether or not to include

					an OTG descriptor in each of their configurations.

					</para></listitem>

				    <listitem><para>Gadget drivers may need changes to support the

					two new OTG protocols, exposed in new gadget attributes

					such as the <emphasis>b_hnp_enable</emphasis> flag.

					HNP support should be reported through a user interface

					(two LEDs could suffice), and is triggered in some cases

					when the host suspends the peripheral.

					SRP support can be user-initiated just like remote wakeup,

					probably by pressing the same button.

					</para></listitem>

				    <listitem><para>On the host side, USB device drivers need

					to be taught to trigger HNP at appropriate moments, using

					<function>usb_suspend_device()</function>.

					That also conserves battery power, which is useful even

					for non-OTG configurations.

					</para></listitem>

				    <listitem><para>Also on the host side, a driver must support the

					OTG "Targeted Peripheral List".  That's just a whitelist,

					used to reject peripherals not supported with a given

					Linux OTG host.

					<emphasis>This whitelist is product-specific;

					each product must modify <filename>otg_whitelist.h</filename>

					to match its interoperability specification.

					</emphasis>

					</para>

					<para>Non-OTG Linux hosts, like PCs and workstations,

					normally have some solution for adding drivers, so that

					peripherals that aren't recognized can eventually be supported.

					That approach is unreasonable for consumer products that may

					never have their firmware upgraded, and where it's usually

					unrealistic to expect traditional PC/workstation/server kinds

					of support model to work.

					For example, it's often impractical to change device firmware

					once the product has been distributed, so driver bugs can't

					normally be fixed if they're found after shipment.

					</para></listitem>

				</itemizedlist>
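				<para>The whitelist check for a Targeted Peripheral List amounts to a table lookup.
				The sketch below shows the idea only: <emphasis>struct tpl_entry</emphasis> and
				<emphasis>tpl_allows</emphasis> are hypothetical names, matching is assumed to be
				on vendor and product ID alone, and the actual table format in
				<filename>otg_whitelist.h</filename> may differ.
				</para>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of a hypothetical Targeted Peripheral List. */
struct tpl_entry {
	uint16_t vendor_id;
	uint16_t product_id;
};

/* Return 1 if the vendor/product pair appears in the list, else 0. */
static int tpl_allows(const struct tpl_entry *list, size_t n,
		      uint16_t vendor_id, uint16_t product_id)
{
	size_t i;

	assert(list || n == 0);
	for (i = 0; i < n; i++)
		if (list[i].vendor_id == vendor_id &&
		    list[i].product_id == product_id)
			return 1;
	return 0;
}
```

				<para>A product would fill the table from its interoperability specification
				and reject any enumerated peripheral for which the lookup fails.
				</para>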

				<para>

				Additional changes are needed below those hardware-neutral

				<emphasis>usb_bus</emphasis> and <emphasis>usb_gadget</emphasis>

				driver interfaces; those aren't discussed here in any detail.

				Those affect the hardware-specific code for each USB Host or Peripheral

				controller, and how the HCD initializes (since OTG can be active only

				on a single port).

				They also involve what may be called an <emphasis>OTG Controller

				Driver</emphasis>, managing the OTG transceiver and the OTG state

				machine logic as well as much of the root hub behavior for the

				OTG port.

				The OTG controller driver needs to activate and deactivate USB

				controllers depending on the relevant device role.

				Some related changes were needed inside usbcore, so that it

				can identify OTG-capable devices and respond appropriately

				to HNP or SRP protocols.

				</para> 

				</chapter>

				</book>

				<!--

					vim:syntax=sgml:sw=4

				-->

									

Documentation/DocBook/genericirq.tmpl
									
												
				@@ -0,0 +1,520 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="Generic-IRQ-Guide">

				 <bookinfo>

				  <title>Linux generic IRQ handling</title>

				  <authorgroup>

				   <author>

				    <firstname>Thomas</firstname>

				    <surname>Gleixner</surname>

				    <affiliation>

				     <address>

				      <email>tglx@linutronix.de</email>

				     </address>

				    </affiliation>

				   </author>

				   <author>

				    <firstname>Ingo</firstname>

				    <surname>Molnar</surname>

				    <affiliation>

				     <address>

				      <email>mingo@elte.hu</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2005-2010</year>

				   <holder>Thomas Gleixner</holder>

				  </copyright>

				  <copyright>

				   <year>2005-2006</year>

				   <holder>Ingo Molnar</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License version 2 as published by the Free Software Foundation.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="intro">

				    <title>Introduction</title>

				    <para>

					The generic interrupt handling layer is designed to provide a

					complete abstraction of interrupt handling for device drivers.

					It is able to handle all the different types of interrupt controller

					hardware. Device drivers use generic API functions to request, enable,

					disable and free interrupts. The drivers do not have to know anything

					about interrupt hardware details, so they can be used on different

					platforms without code changes.

				    </para>

				    <para>

				  	This documentation is provided to developers who want to implement

					an interrupt subsystem for their architecture, based on

					the generic IRQ handling layer.

				    </para>

				  </chapter>

				  <chapter id="rationale">

				    <title>Rationale</title>

					<para>

					The original implementation of interrupt handling in Linux uses

					the __do_IRQ() super-handler, which is able to deal with every

					type of interrupt logic.

					</para>

					<para>

					Originally, Russell King identified different types of handlers to

					build a quite universal set for the ARM interrupt handler

					implementation in Linux 2.5/2.6. He distinguished between:

					<itemizedlist>

					  <listitem><para>Level type</para></listitem>

					  <listitem><para>Edge type</para></listitem>

					  <listitem><para>Simple type</para></listitem>

					</itemizedlist>

					During the implementation we identified another type:

					<itemizedlist>

					  <listitem><para>Fast EOI type</para></listitem>

					</itemizedlist>

					In the SMP world of the __do_IRQ() super-handler another type

					was identified:

					<itemizedlist>

					  <listitem><para>Per CPU type</para></listitem>

					</itemizedlist>

					</para>

					<para>

					This split implementation of high-level IRQ handlers allows us to

					optimize the flow of the interrupt handling for each specific

					interrupt type. This reduces complexity in that particular code path

					and allows the optimized handling of a given type.

					</para>

					<para>

					The original general IRQ implementation used hw_interrupt_type

					structures and their ->ack(), ->end() [etc.] callbacks to

					differentiate the flow control in the super-handler. This leads to

					a mix of flow logic and low-level hardware logic, and it also leads

					to unnecessary code duplication: for example in i386, there is an

					ioapic_level_irq and an ioapic_edge_irq IRQ-type which share many

					of the low-level details but have different flow handling.

					</para>

					<para>

					A more natural abstraction is the clean separation of the

					'irq flow' and the 'chip details'.

					</para>

					<para>

					Analysing the IRQ subsystem implementations of a couple of architectures

					reveals that most of them can use a generic set of 'irq flow'

					methods and only need to add the chip-level specific code.

					The separation is also valuable for (sub)architectures

					which need specific quirks in the IRQ flow itself but not in the

					chip details - and thus provides a more transparent IRQ subsystem

					design.

					</para>

					<para>

					Each interrupt descriptor is assigned its own high-level flow

					handler, which is normally one of the generic

					implementations. (This high-level flow handler implementation also

					makes it simple to provide demultiplexing handlers which can be

					found in embedded platforms on various architectures.)

					</para>

					<para>

					The separation makes the generic interrupt handling layer more

					flexible and extensible. For example, a (sub)architecture can

					use a generic IRQ-flow implementation for 'level type' interrupts

					and add a (sub)architecture specific 'edge type' implementation.

					</para>

					<para>

					To make the transition to the new model easier and prevent the

					breakage of existing implementations, the __do_IRQ() super-handler

					is still available. This leads to a kind of duality for the time

					being. Over time the new model should be used in more and more

					architectures, as it enables smaller and cleaner IRQ subsystems.

					The __do_IRQ() super-handler has been deprecated for three years and is about to be removed.

					</para>

				  </chapter>

				  <chapter id="bugs">

				    <title>Known Bugs And Assumptions</title>

				    <para>

					None (knock on wood).

				    </para>

				  </chapter>

				  <chapter id="Abstraction">

				    <title>Abstraction layers</title>

				    <para>

					There are three main levels of abstraction in the interrupt code:

					<orderedlist>

					  <listitem><para>High-level driver API</para></listitem>

					  <listitem><para>High-level IRQ flow handlers</para></listitem>

					  <listitem><para>Chip-level hardware encapsulation</para></listitem>

					</orderedlist>

				    </para>

				    <sect1 id="Interrupt_control_flow">

					<title>Interrupt control flow</title>

					<para>

					Each interrupt is described by an interrupt descriptor structure

					irq_desc. The interrupt is referenced by an 'unsigned int' numeric

					value which selects the corresponding interrupt description structure

					in the descriptor structures array.

					The descriptor structure contains status information and pointers

					to the interrupt flow method and the interrupt chip structure

					which are assigned to this interrupt.

					</para>

					<para>

					Whenever an interrupt triggers, the low-level architecture code calls

					into the generic interrupt code by calling desc->handle_irq().

					This high-level IRQ handling function only uses desc->irq_data.chip

					primitives referenced by the assigned chip descriptor structure.

					</para>
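					<para>
					The relationship between the descriptor array, the flow method and the
					chip structure can be modelled in user space. The sketch below is not the
					kernel's data layout: every name ending in _model is hypothetical, only
					mask and unmask are modelled, and the callbacks take the descriptor itself
					to keep the example short.
					</para>

```c
#include <assert.h>

#define NR_IRQS_MODEL 4         /* hypothetical table size for this model */

struct irq_desc_model;
typedef void (*flow_handler_t)(struct irq_desc_model *desc);

/* Chip-level primitives; only mask and unmask are modelled. */
struct irq_chip_model {
	void (*irq_mask)(struct irq_desc_model *desc);
	void (*irq_unmask)(struct irq_desc_model *desc);
};

struct irq_desc_model {
	flow_handler_t handle_irq;              /* the flow method */
	const struct irq_chip_model *chip;      /* the chip structure */
	int masked;                             /* status information */
	unsigned int count;                     /* times the action ran */
};

static struct irq_desc_model irq_table[NR_IRQS_MODEL];

static void model_mask(struct irq_desc_model *d)   { d->masked = 1; }
static void model_unmask(struct irq_desc_model *d) { d->masked = 0; }
static const struct irq_chip_model model_chip = { model_mask, model_unmask };

/* Flow handler: touches "hardware" only through the chip primitives. */
static void model_level_flow(struct irq_desc_model *d)
{
	d->chip->irq_mask(d);
	d->count++;             /* stands in for running the device handler */
	d->chip->irq_unmask(d);
}

/* What low-level architecture code does: index the table, call the flow method.
 * Returns 0 on success, -1 if no flow handler is assigned. */
static int model_dispatch(unsigned int irq)
{
	struct irq_desc_model *d;

	assert(irq < NR_IRQS_MODEL);
	d = &irq_table[irq];
	if (!d->handle_irq)
		return -1;
	d->handle_irq(d);
	return 0;
}
```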

				    </sect1>

				    <sect1 id="Highlevel_Driver_API">

					<title>High-level Driver API</title>

					<para>

					  The high-level Driver API consists of the following functions:

					  <itemizedlist>

					  <listitem><para>request_irq()</para></listitem>

					  <listitem><para>free_irq()</para></listitem>

					  <listitem><para>disable_irq()</para></listitem>

					  <listitem><para>enable_irq()</para></listitem>

					  <listitem><para>disable_irq_nosync() (SMP only)</para></listitem>

					  <listitem><para>synchronize_irq() (SMP only)</para></listitem>

					  <listitem><para>irq_set_irq_type()</para></listitem>

					  <listitem><para>irq_set_irq_wake()</para></listitem>

					  <listitem><para>irq_set_handler_data()</para></listitem>

					  <listitem><para>irq_set_chip()</para></listitem>

					  <listitem><para>irq_set_chip_data()</para></listitem>

				          </itemizedlist>

					  See the autogenerated function documentation for details.

					</para>

				    </sect1>

				    <sect1 id="Highlevel_IRQ_flow_handlers">

					<title>High-level IRQ flow handlers</title>

					<para>

					  The generic layer provides a set of pre-defined irq-flow methods:

					  <itemizedlist>

					  <listitem><para>handle_level_irq</para></listitem>

					  <listitem><para>handle_edge_irq</para></listitem>

					  <listitem><para>handle_fasteoi_irq</para></listitem>

					  <listitem><para>handle_simple_irq</para></listitem>

					  <listitem><para>handle_percpu_irq</para></listitem>

					  <listitem><para>handle_edge_eoi_irq</para></listitem>

					  <listitem><para>handle_bad_irq</para></listitem>

					  </itemizedlist>

					  The interrupt flow handlers (either pre-defined or architecture

					  specific) are assigned to specific interrupts by the architecture

					  either during bootup or during device initialization.

					</para>

					<sect2 id="Default_flow_implementations">

					<title>Default flow implementations</title>

					    <sect3 id="Helper_functions">

					 	<title>Helper functions</title>

						<para>

						The helper functions call the chip primitives and

						are used by the default flow implementations.

						The following helper functions are implemented (simplified excerpt):

						<programlisting>

				default_enable(struct irq_data *data)

				{

					desc->irq_data.chip->irq_unmask(data);

				}

				default_disable(struct irq_data *data)

				{

					if (!delay_disable(data))

						desc->irq_data.chip->irq_mask(data);

				}

				default_ack(struct irq_data *data)

				{

					chip->irq_ack(data);

				}

				default_mask_ack(struct irq_data *data)

				{

					if (chip->irq_mask_ack) {

						chip->irq_mask_ack(data);

					} else {

						chip->irq_mask(data);

						chip->irq_ack(data);

					}

				}

				noop(struct irq_data *data)

				{

				}

						</programlisting>

					        </para>

					    </sect3>

					</sect2>

					<sect2 id="Default_flow_handler_implementations">

					<title>Default flow handler implementations</title>

					    <sect3 id="Default_Level_IRQ_flow_handler">

					 	<title>Default Level IRQ flow handler</title>

						<para>

						handle_level_irq provides a generic implementation

						for level-triggered interrupts.

						</para>

						<para>

						The following control flow is implemented (simplified excerpt):

						<programlisting>

				desc->irq_data.chip->irq_mask_ack();

				handle_irq_event(desc->action);

				desc->irq_data.chip->irq_unmask();

						</programlisting>

						</para>

					    </sect3>

					    <sect3 id="Default_FASTEOI_IRQ_flow_handler">

						<title>Default Fast EOI IRQ flow handler</title>

						<para>

						handle_fasteoi_irq provides a generic implementation

						for interrupts that only need an EOI at the end of

						the handler.

						</para>

						<para>

						The following control flow is implemented (simplified excerpt):

						<programlisting>

				handle_irq_event(desc->action);

				desc->irq_data.chip->irq_eoi();

						</programlisting>

						</para>

					    </sect3>

					    <sect3 id="Default_Edge_IRQ_flow_handler">

					 	<title>Default Edge IRQ flow handler</title>

						<para>

						handle_edge_irq provides a generic implementation

						for edge-triggered interrupts.

						</para>

						<para>

						The following control flow is implemented (simplified excerpt):

						<programlisting>

				if (desc->status &amp; running) {

					desc->irq_data.chip->irq_mask_ack();

					desc->status |= pending | masked;

					return;

				}

				desc->irq_data.chip->irq_ack();

				desc->status |= running;

				do {

					if (desc->status &amp; masked)

						desc->irq_data.chip->irq_unmask();

					desc->status &amp;= ~pending;

					handle_irq_event(desc->action);

				} while (desc->status &amp; pending);

				desc->status &amp;= ~running;

						</programlisting>

						</para>
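						<para>
						The interesting case in this flow is an edge that arrives while the
						handler is still running. The user-space model below replays that case.
						It is a sketch, not the kernel code: the structure, flag and function
						names are hypothetical, the chip calls are reduced to a hw_masked field,
						and the model clears its masked flag when it unmasks.
						</para>

```c
#include <assert.h>

enum { ST_RUNNING = 1, ST_PENDING = 2, ST_MASKED = 4 };

struct edge_irq {
	unsigned int status;
	unsigned int handled;   /* how many times the action ran */
	unsigned int hw_masked; /* model of the chip's mask bit */
	unsigned int inject;    /* edges to inject while the action runs */
};

static void edge_flow(struct edge_irq *d);

/* Stand-in for handle_irq_event(): counts the call and optionally
 * simulates another edge arriving in the meantime. */
static void run_action(struct edge_irq *d)
{
	d->handled++;
	if (d->inject > 0) {
		d->inject--;
		edge_flow(d);   /* a new edge fires while we are "running" */
	}
}

static void edge_flow(struct edge_irq *d)
{
	assert(d);
	if (d->status & ST_RUNNING) {
		d->hw_masked = 1;               /* irq_mask_ack() */
		d->status |= ST_PENDING | ST_MASKED;
		return;
	}
	/* irq_ack() would go here */
	d->status |= ST_RUNNING;
	do {
		if (d->status & ST_MASKED) {
			d->hw_masked = 0;       /* irq_unmask() */
			d->status &= ~(unsigned int)ST_MASKED;
		}
		d->status &= ~(unsigned int)ST_PENDING;
		run_action(d);
	} while (d->status & ST_PENDING);
	d->status &= ~(unsigned int)ST_RUNNING;
}
```

						<para>
						With one injected edge the action runs twice and the line ends up
						unmasked, which is the point of the pending/masked bookkeeping: the
						second edge is neither lost nor handled re-entrantly.
						</para>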

				   	    </sect3>

					    <sect3 id="Default_simple_IRQ_flow_handler">

					 	<title>Default simple IRQ flow handler</title>

						<para>

						handle_simple_irq provides a generic implementation

						for simple interrupts.

						</para>

						<para>

						Note: The simple flow handler does not call any

						handler/chip primitives.

						</para>

						<para>

						The following control flow is implemented (simplified excerpt):

						<programlisting>

				handle_irq_event(desc->action);

						</programlisting>

						</para>

				   	    </sect3>

					    <sect3 id="Default_per_CPU_flow_handler">

					 	<title>Default per CPU flow handler</title>

						<para>

						handle_percpu_irq provides a generic implementation

						for per CPU interrupts.

						</para>

						<para>

						Per CPU interrupts are only available on SMP and

						the handler provides a simplified version without

						locking.

						</para>

						<para>

						The following control flow is implemented (simplified excerpt):

						<programlisting>

				if (desc->irq_data.chip->irq_ack)

					desc->irq_data.chip->irq_ack();

				handle_irq_event(desc->action);

				if (desc->irq_data.chip->irq_eoi)

				        desc->irq_data.chip->irq_eoi();

						</programlisting>

						</para>

				   	    </sect3>

					    <sect3 id="EOI_Edge_IRQ_flow_handler">

					 	<title>EOI Edge IRQ flow handler</title>

						<para>

						handle_edge_eoi_irq provides an abomination of the edge

						handler which is solely used to tame a badly wrecked

						irq controller on powerpc/cell.

						</para>

				   	    </sect3>

					    <sect3 id="BAD_IRQ_flow_handler">

					 	<title>Bad IRQ flow handler</title>

						<para>

						handle_bad_irq is used for spurious interrupts which

						have no real handler assigned.

						</para>

				   	    </sect3>

					</sect2>

					<sect2 id="Quirks_and_optimizations">

					<title>Quirks and optimizations</title>

					<para>

					The generic functions are intended for 'clean' architectures and chips,

					which have no platform-specific IRQ handling quirks. If an architecture

					needs to implement quirks on the 'flow' level then it can do so by

					overriding the high-level irq-flow handler.

					</para>

					</sect2>

					<sect2 id="Delayed_interrupt_disable">

					<title>Delayed interrupt disable</title>

					<para>

					This per interrupt selectable feature, which was introduced by Russell

					King in the ARM interrupt implementation, does not mask an interrupt

					at the hardware level when disable_irq() is called. The interrupt is

					kept enabled and is masked in the flow handler when an interrupt event

					happens. This prevents losing edge interrupts on hardware which does

					not store an edge interrupt event while the interrupt is disabled at

					the hardware level. When an interrupt arrives while the IRQ_DISABLED

					flag is set, then the interrupt is masked at the hardware level and

					the IRQ_PENDING bit is set. When the interrupt is re-enabled by

					enable_irq() the pending bit is checked and if it is set, the

					interrupt is resent either via hardware or by a software resend

					mechanism. (It's necessary to enable CONFIG_HARDIRQS_SW_RESEND when

					you want to use the delayed interrupt disable feature and your

					hardware is not capable of retriggering	an interrupt.)

					The delayed interrupt disable is not configurable.

					</para>
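					<para>
					As a rough illustration, the sequence described above can be
					modelled in plain C outside the kernel.  This is only a sketch:
					struct irq_model and the model_*() functions are hypothetical
					names invented for this example, the "hardware" is a flag, and
					the resend is always done in software.
					<programlisting>
/* Userspace model only; none of these names are kernel API. */
struct irq_model {
	int disabled;	/* models the IRQ_DISABLED flag */
	int pending;	/* models the IRQ_PENDING bit */
	int hw_masked;	/* 1 while the line is masked in the simulated hardware */
	int handled;	/* how often the device handler has run */
};

/* disable_irq(): only mark the descriptor, the hardware stays unmasked. */
static void model_disable_irq(struct irq_model *d)
{
	d->disabled = 1;
}

/* Flow handler: runs whenever the simulated hardware raises the interrupt. */
static void model_flow_handler(struct irq_model *d)
{
	if (d->hw_masked)
		return;			/* a masked line delivers nothing */
	if (d->disabled) {
		d->hw_masked = 1;	/* mask lazily, on the first event */
		d->pending = 1;		/* remember the event for later */
		return;
	}
	d->handled++;			/* normal case: run the device handler */
}

/* enable_irq(): unmask and replay an event that arrived while disabled. */
static void model_enable_irq(struct irq_model *d)
{
	d->disabled = 0;
	d->hw_masked = 0;
	if (d->pending) {
		d->pending = 0;
		model_flow_handler(d);	/* software resend */
	}
}
					</programlisting>
					In this model an interrupt that fires while the line is disabled
					is not lost: the flow handler masks the line and marks it
					pending, and model_enable_irq() replays it.
					</para>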

					</sect2>

				    </sect1>

				    <sect1 id="Chiplevel_hardware_encapsulation">

					<title>Chip-level hardware encapsulation</title>

					<para>

					The chip-level hardware descriptor structure irq_chip

					contains all the direct chip relevant functions, which

					can be utilized by the irq flow implementations.

					  <itemizedlist>

					  <listitem><para>irq_ack()</para></listitem>

					  <listitem><para>irq_mask_ack() - Optional, recommended for performance</para></listitem>

					  <listitem><para>irq_mask()</para></listitem>

					  <listitem><para>irq_unmask()</para></listitem>

					  <listitem><para>irq_eoi() - Optional, required for EOI flow handlers</para></listitem>

					  <listitem><para>irq_retrigger() - Optional</para></listitem>

					  <listitem><para>irq_set_type() - Optional</para></listitem>

					  <listitem><para>irq_set_wake() - Optional</para></listitem>

					  </itemizedlist>

					These primitives are strictly intended to mean what they say: ack means

					ACK, masking means masking of an IRQ line, etc. It is up to the flow

					handler(s) to use these basic units of low-level functionality.

					</para>
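					<para>
					The way a flow handler composes these primitives can be sketched
					in a small stand-alone C model.  The names chip_model,
					model_level_flow(), record() and the demo_*() callbacks are
					assumptions made for this example; only the callback names
					mirror the list above.  The sketch follows a level-type flow and
					falls back to irq_mask() plus irq_ack() when the optional
					irq_mask_ack() is not provided.
					<programlisting>
/* Userspace model only; struct chip_model is not the kernel's irq_chip. */
enum irq_op { OP_ACK = 1, OP_MASK, OP_UNMASK, OP_HANDLER };

static int trace[8];	/* order in which the primitives were called */
static int trace_len;

static void record(int op)
{
	if (trace_len >= 8)
		return;
	trace[trace_len++] = op;
}

struct chip_model {
	void (*irq_ack)(void);
	void (*irq_mask_ack)(void);	/* optional combined primitive */
	void (*irq_mask)(void);
	void (*irq_unmask)(void);
};

static void demo_ack(void)    { record(OP_ACK); }
static void demo_mask(void)   { record(OP_MASK); }
static void demo_unmask(void) { record(OP_UNMASK); }

/* Level-type flow: mask and ack the line, run the handler, unmask. */
static void model_level_flow(const struct chip_model *chip)
{
	if (chip->irq_mask_ack) {
		chip->irq_mask_ack();
	} else {
		chip->irq_mask();
		if (chip->irq_ack)
			chip->irq_ack();
	}
	record(OP_HANDLER);		/* stands in for the device handler */
	chip->irq_unmask();
}

/* A chip that does not implement the optional irq_mask_ack(). */
static const struct chip_model demo_chip = {
	.irq_ack    = demo_ack,
	.irq_mask   = demo_mask,
	.irq_unmask = demo_unmask,
};
					</programlisting>
					The chip callbacks stay trivial and do exactly what their names
					say; the policy of when to call them lives in the flow handler.
					</para>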

				    </sect1>

				  </chapter>

				  <chapter id="doirq">

				     <title>__do_IRQ entry point</title>

				     <para>

					The original implementation __do_IRQ() was an alternative entry

					point for all types of interrupts. It no longer exists.

				     </para>

				     <para>

					This handler turned out not to be suitable for all

					interrupt hardware and was therefore reimplemented with split

					functionality for edge/level/simple/percpu interrupts. This is not

					only a functional optimization. It also shortens code paths for

					interrupts.

				      </para>

				  </chapter>

				  <chapter id="locking">

				     <title>Locking on SMP</title>

				     <para>

					The locking of chip registers is up to the architecture that

					defines the chip primitives. The per-irq structure is

					protected via desc->lock, by the generic layer.

				     </para>

				  </chapter>

				  <chapter id="genericchip">

				     <title>Generic interrupt chip</title>

				     <para>

				       To avoid copies of identical implementations of IRQ chips the

				       core provides a configurable generic interrupt chip

				       implementation. Developers should check carefully whether the

				       generic chip fits their needs before implementing the same

				       functionality slightly differently themselves.

				     </para>

				!Ekernel/irq/generic-chip.c

				  </chapter>

				  <chapter id="structs">

				     <title>Structures</title>

				     <para>

				     This chapter contains the autogenerated documentation of the structures which are

				     used in the generic IRQ layer.

				     </para>

				!Iinclude/linux/irq.h

				!Iinclude/linux/interrupt.h

				  </chapter>

				  <chapter id="pubfunctions">

				     <title>Public Functions Provided</title>

				     <para>

				     This chapter contains the autogenerated documentation of the kernel API functions

				      which are exported.

				     </para>

				!Ekernel/irq/manage.c

				!Ekernel/irq/chip.c

				  </chapter>

				  <chapter id="intfunctions">

				     <title>Internal Functions Provided</title>

				     <para>

				     This chapter contains the autogenerated documentation of the internal functions.

				     </para>

				!Ikernel/irq/irqdesc.c

				!Ikernel/irq/handle.c

				!Ikernel/irq/chip.c

				  </chapter>

				  <chapter id="credits">

				     <title>Credits</title>

					<para>

						The following people have contributed to this document:

						<orderedlist>

							<listitem><para>Thomas Gleixner<email>tglx@linutronix.de</email></para></listitem>

							<listitem><para>Ingo Molnar<email>mingo@elte.hu</email></para></listitem>

						</orderedlist>

					</para>

				  </chapter>

				</book>

									
										331

Documentation/DocBook/kernel-api.tmpl
									
										Normal file
									
				@@ -0,0 +1,331 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="LinuxKernelAPI">

				 <bookinfo>

				  <title>The Linux Kernel API</title>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later

				     version.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="adt">

				     <title>Data Types</title>

				     <sect1><title>Doubly Linked Lists</title>

				!Iinclude/linux/list.h

				     </sect1>

				  </chapter>

				  <chapter id="libc">

				     <title>Basic C Library Functions</title>

				     <para>

				       When writing drivers, you cannot in general use routines which are

				       from the C Library.  Some of the functions have been found generally

				       useful and they are listed below.  The behaviour of these functions

				       may vary slightly from those defined by ANSI, and these deviations

				       are noted in the text.

				     </para>

				     <sect1><title>String Conversions</title>

				!Elib/vsprintf.c

				!Finclude/linux/kernel.h kstrtol

				!Finclude/linux/kernel.h kstrtoul

				!Elib/kstrtox.c

				     </sect1>

				     <sect1><title>String Manipulation</title>

				<!-- All functions are exported at now

				X!Ilib/string.c

				 -->

				!Elib/string.c

				     </sect1>

				     <sect1><title>Bit Operations</title>

				!Iarch/x86/include/asm/bitops.h

				     </sect1>

				  </chapter>

				  <chapter id="kernel-lib">

				     <title>Basic Kernel Library Functions</title>

				     <para>

				       The Linux kernel provides more basic utility functions.

				     </para>

				     <sect1><title>Bitmap Operations</title>

				!Elib/bitmap.c

				!Ilib/bitmap.c

				     </sect1>

				     <sect1><title>Command-line Parsing</title>

				!Elib/cmdline.c

				     </sect1>

				     <sect1 id="crc"><title>CRC Functions</title>

				!Elib/crc7.c

				!Elib/crc16.c

				!Elib/crc-itu-t.c

				!Elib/crc32.c

				!Elib/crc-ccitt.c

				     </sect1>

				     <sect1 id="idr"><title>idr/ida Functions</title>

				!Pinclude/linux/idr.h idr sync

				!Plib/idr.c IDA description

				!Elib/idr.c

				     </sect1>

				  </chapter>

				  <chapter id="mm">

				     <title>Memory Management in Linux</title>

				     <sect1><title>The Slab Cache</title>

				!Iinclude/linux/slab.h

				!Emm/slab.c

				!Emm/util.c

				     </sect1>

				     <sect1><title>User Space Memory Access</title>

				!Iarch/x86/include/asm/uaccess_32.h

				!Earch/x86/lib/usercopy_32.c

				     </sect1>

				     <sect1><title>More Memory Management Functions</title>

				!Emm/readahead.c

				!Emm/filemap.c

				!Emm/memory.c

				!Emm/vmalloc.c

				!Imm/page_alloc.c

				!Emm/mempool.c

				!Emm/dmapool.c

				!Emm/page-writeback.c

				!Emm/truncate.c

				     </sect1>

				  </chapter>

				  <chapter id="ipc">

				     <title>Kernel IPC facilities</title>

				     <sect1><title>IPC utilities</title>

				!Iipc/util.c

				     </sect1>

				  </chapter>

				  <chapter id="kfifo">

				     <title>FIFO Buffer</title>

				     <sect1><title>kfifo interface</title>

				!Iinclude/linux/kfifo.h

				     </sect1>

				  </chapter>

				  <chapter id="relayfs">

				     <title>relay interface support</title>

				     <para>

					Relay interface support

					is designed to provide an efficient mechanism for tools and

					facilities to relay large amounts of data from kernel space to

					user space.

				     </para>

				     <sect1><title>relay interface</title>

				!Ekernel/relay.c

				!Ikernel/relay.c

				     </sect1>

				  </chapter>

				  <chapter id="modload">

				     <title>Module Support</title>

				     <sect1><title>Module Loading</title>

				!Ekernel/kmod.c

				     </sect1>

				     <sect1><title>Inter Module support</title>

				        <para>

				           Refer to the file kernel/module.c for more information.

				        </para>

				<!-- FIXME: Removed for now since no structured comments in source

				X!Ekernel/module.c

				-->

				     </sect1>

				  </chapter>

				  <chapter id="hardware">

				     <title>Hardware Interfaces</title>

				     <sect1><title>Interrupt Handling</title>

				!Ekernel/irq/manage.c

				     </sect1>

				     <sect1><title>DMA Channels</title>

				!Ekernel/dma.c

				     </sect1>

				     <sect1><title>Resources Management</title>

				!Ikernel/resource.c

				!Ekernel/resource.c

				     </sect1>

				     <sect1><title>MTRR Handling</title>

				!Earch/x86/kernel/cpu/mtrr/main.c

				     </sect1>

				     <sect1><title>PCI Support Library</title>

				!Edrivers/pci/pci.c

				!Edrivers/pci/pci-driver.c

				!Edrivers/pci/remove.c

				!Edrivers/pci/search.c

				!Edrivers/pci/msi.c

				!Edrivers/pci/bus.c

				!Edrivers/pci/access.c

				!Edrivers/pci/irq.c

				!Edrivers/pci/htirq.c

				<!-- FIXME: Removed for now since no structured comments in source

				X!Edrivers/pci/hotplug.c

				-->

				!Edrivers/pci/probe.c

				!Edrivers/pci/slot.c

				!Edrivers/pci/rom.c

				!Edrivers/pci/iov.c

				!Idrivers/pci/pci-sysfs.c

				     </sect1>

				     <sect1><title>PCI Hotplug Support Library</title>

				!Edrivers/pci/hotplug/pci_hotplug_core.c

				     </sect1>

				  </chapter>

				  <chapter id="firmware">

				     <title>Firmware Interfaces</title>

				     <sect1><title>DMI Interfaces</title>

				!Edrivers/firmware/dmi_scan.c

				     </sect1>

				     <sect1><title>EDD Interfaces</title>

				!Idrivers/firmware/edd.c

				     </sect1>

				  </chapter>

				  <chapter id="security">

				     <title>Security Framework</title>

				!Isecurity/security.c

				!Esecurity/inode.c

				  </chapter>

				  <chapter id="audit">

				     <title>Audit Interfaces</title>

				!Ekernel/audit.c

				!Ikernel/auditsc.c

				!Ikernel/auditfilter.c

				  </chapter>

				  <chapter id="accounting">

				     <title>Accounting Framework</title>

				!Ikernel/acct.c

				  </chapter>

				  <chapter id="blkdev">

				     <title>Block Devices</title>

				!Eblock/blk-core.c

				!Iblock/blk-core.c

				!Eblock/blk-map.c

				!Iblock/blk-sysfs.c

				!Eblock/blk-settings.c

				!Eblock/blk-exec.c

				!Eblock/blk-flush.c

				!Eblock/blk-lib.c

				!Eblock/blk-tag.c

				!Iblock/blk-tag.c

				!Eblock/blk-integrity.c

				!Ikernel/trace/blktrace.c

				!Iblock/genhd.c

				!Eblock/genhd.c

				  </chapter>

				  <chapter id="chrdev">

					<title>Char devices</title>

				!Efs/char_dev.c

				  </chapter>

				  <chapter id="miscdev">

				     <title>Miscellaneous Devices</title>

				!Edrivers/char/misc.c

				  </chapter>

				  <chapter id="clk">

				     <title>Clock Framework</title>

				     <para>

					The clock framework defines programming interfaces to support

					software management of the system clock tree.

					This framework is widely used with System-On-Chip (SOC) platforms

					to support power management and various devices which may need

					custom clock rates.

					Note that these "clocks" don't relate to timekeeping or real

					time clocks (RTCs), each of which has a separate framework.

					These <structname>struct clk</structname> instances may be used

					to manage for example a 96 MHz signal that is used to shift bits

					into and out of peripherals or busses, or otherwise trigger

					synchronous state machine transitions in system hardware.

				     </para>

				     <para>

					Power management is supported by explicit software clock gating:

					unused clocks are disabled, so the system doesn't waste power

					changing the state of transistors that aren't in active use.

					On some systems this may be backed by hardware clock gating,

					where clocks are gated without being disabled in software.

					Sections of chips that are powered but not clocked may be able

					to retain their last state.

					This low power state is often called a <emphasis>retention

					mode</emphasis>.

					This mode still incurs leakage currents, especially with finer

					circuit geometries, but for CMOS circuits power is mostly used

					by clocked state changes.

				     </para>

				     <para>

					Power-aware drivers only enable their clocks when the device

					they manage is in active use.  Also, system sleep states often

					differ according to which clock domains are active:  while a

					"standby" state may allow wakeup from several active domains, a

					"mem" (suspend-to-RAM) state may require a more wholesale shutdown

					of clocks derived from higher speed PLLs and oscillators, limiting

					the number of possible wakeup event sources.  A driver's suspend

					method may need to be aware of system-specific clock constraints

					on the target sleep state.

				     </para>

				     <para>

				        Some platforms support programmable clock generators.  These

					can be used by external chips of various kinds, such as other

					CPUs, multimedia codecs, and devices with strict requirements

					for interface clocking.

				     </para>

				!Iinclude/linux/clk.h

				  </chapter>

				</book>

1312

Documentation/DocBook/kernel-hacking.tmpl Normal file

File diff suppressed because it is too large

2151

Documentation/DocBook/kernel-locking.tmpl Normal file

File diff suppressed because it is too large

									
										918

Documentation/DocBook/kgdb.tmpl
									
										Normal file
									
				@@ -0,0 +1,918 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="kgdbOnLinux">

				 <bookinfo>

				  <title>Using kgdb, kdb and the kernel debugger internals</title>

				  <authorgroup>

				   <author>

				    <firstname>Jason</firstname>

				    <surname>Wessel</surname>

				    <affiliation>

				     <address>

				      <email>jason.wessel@windriver.com</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2008,2010</year>

				   <holder>Wind River Systems, Inc.</holder>

				  </copyright>

				  <copyright>

				   <year>2004-2005</year>

				   <holder>MontaVista Software, Inc.</holder>

				  </copyright>

				  <copyright>

				   <year>2004</year>

				   <holder>Amit S. Kale</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				   This file is licensed under the terms of the GNU General Public License

				   version 2. This program is licensed "as is" without any warranty of any

				   kind, whether express or implied.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="Introduction">

				    <title>Introduction</title>

				    <para>

				    The kernel has two different debugger front ends (kdb and kgdb)

				    which interface to the debug core.  It is possible to use either

				    of the debugger front ends and dynamically transition between them

				    if you configure the kernel properly at compile and runtime.

				    </para>

				    <para>

				    Kdb is a simplistic shell-style interface which you can use on a

				    system console with a keyboard or serial console.  You can use it

				    to inspect memory, registers, process lists, dmesg, and even set

				    breakpoints to stop in a certain location.  Kdb is not a source

				    level debugger, although you can set breakpoints and execute some

				    basic kernel run control.  Kdb is mainly aimed at doing some

				    analysis to aid in development or diagnosing kernel problems.  You

				    can access some symbols by name in kernel built-ins or in kernel

				    modules if the code was built

				    with <symbol>CONFIG_KALLSYMS</symbol>.

				    </para>

				    <para>

				    Kgdb is intended to be used as a source level debugger for the

				    Linux kernel. It is used along with gdb to debug a Linux kernel.

				    The expectation is that gdb can be used to "break in" to the

				    kernel to inspect memory, variables and look through call stack

				    information similar to the way an application developer would use

				    gdb to debug an application.  It is possible to place breakpoints

				    in kernel code and perform some limited execution stepping.

				    </para>

				    <para>

				    Two machines are required for using kgdb. One of these machines is

				    a development machine and the other is the target machine.  The

				    kernel to be debugged runs on the target machine. The development

				    machine runs an instance of gdb against the vmlinux file which

				    contains the symbols (not a boot image such as bzImage, zImage,

				    uImage...).  In gdb the developer specifies the connection

				    parameters and connects to kgdb.  The type of connection a

				    developer makes with gdb depends on the availability of kgdb I/O

				    modules compiled as built-ins or loadable kernel modules in the test

				    machine's kernel.

				    </para>

				  </chapter>

				  <chapter id="CompilingAKernel">

				  <title>Compiling a kernel</title>

				  <para>

				  <itemizedlist>

				  <listitem><para>In order to enable compilation of kdb, you must first enable kgdb.</para></listitem>

				  <listitem><para>The kgdb test compile options are described in the kgdb test suite chapter.</para></listitem>

				  </itemizedlist>

				  </para>

				  <sect1 id="CompileKGDB">

				    <title>Kernel config options for kgdb</title>

				    <para>

				    To enable <symbol>CONFIG_KGDB</symbol> you should look under

				    "Kernel hacking" / "Kernel debugging" and select "KGDB: kernel debugger".

				    </para>

				    <para>

				    While it is not a hard requirement that you have symbols in your

				    vmlinux file, gdb tends not to be very useful without the symbolic

				    data, so you will want to turn

				    on <symbol>CONFIG_DEBUG_INFO</symbol> which is called "Compile the

				    kernel with debug info" in the config menu.

				    </para>

				    <para>

				    It is advised, but not required, that you turn on the

				    <symbol>CONFIG_FRAME_POINTER</symbol> kernel option which is called "Compile the

				    kernel with frame pointers" in the config menu.  This option

				    inserts code into the compiled executable which saves the frame

				    information in registers or on the stack at different points which

				    allows a debugger such as gdb to more accurately construct

				    stack back traces while debugging the kernel.

				    </para>

				    <para>

				    If the architecture that you are using supports the kernel option

				    CONFIG_STRICT_KERNEL_RWX, you should consider turning it off.  This

				    option will prevent the use of software breakpoints because it

				    marks certain regions of the kernel's memory space as read-only.

				    If kgdb supports it for the architecture you are using, you can

				    use hardware breakpoints if you desire to run with the

				    CONFIG_STRICT_KERNEL_RWX option turned on, else you need to turn off

				    this option.

				    </para>

				    <para>

				    Next you should choose one or more I/O drivers to interconnect

				    debugging host and debugged target.  Early boot debugging requires

				    a KGDB I/O driver that supports early debugging and the driver

				    must be built into the kernel directly. Kgdb I/O driver

				    configuration takes place via kernel or module parameters which

				    you can learn more about in the section that describes the

				    parameter "kgdboc".

				    </para>

				    <para>Here is an example set of .config symbols to enable or

				    disable for kgdb:

				    <itemizedlist>

				    <listitem><para># CONFIG_STRICT_KERNEL_RWX is not set</para></listitem>

				    <listitem><para>CONFIG_FRAME_POINTER=y</para></listitem>

				    <listitem><para>CONFIG_KGDB=y</para></listitem>

				    <listitem><para>CONFIG_KGDB_SERIAL_CONSOLE=y</para></listitem>

				    </itemizedlist>

				    </para>

				  </sect1>

				  <sect1 id="CompileKDB">

				    <title>Kernel config options for kdb</title>

				    <para>Kdb is quite a bit more complex than the simple gdbstub

				    sitting on top of the kernel's debug core.  Kdb must implement a

				    shell, and also adds some helper functions in other parts of the

				    kernel, responsible for printing out interesting data such as what

				    you would see if you ran "lsmod", or "ps".  In order to build kdb

				    into the kernel you follow the same steps as you would for kgdb.

				    </para>

				    <para>The main config option for kdb

				    is <symbol>CONFIG_KGDB_KDB</symbol> which is called "KGDB_KDB:

				    include kdb frontend for kgdb" in the config menu.  In theory you

				    would have already also selected an I/O driver such as the

				    CONFIG_KGDB_SERIAL_CONSOLE interface if you plan on using kdb on a

				    serial port, when you were configuring kgdb.

				    </para>

				    <para>If you want to use a PS/2-style keyboard with kdb, you would

				    select CONFIG_KDB_KEYBOARD which is called "KGDB_KDB: keyboard as

				    input device" in the config menu.  The CONFIG_KDB_KEYBOARD option

				    is not used for anything in the gdb interface to kgdb.  The

				    CONFIG_KDB_KEYBOARD option only works with kdb.

				    </para>

				    <para>Here is an example set of .config symbols to enable/disable kdb:

				    <itemizedlist>

				    <listitem><para># CONFIG_STRICT_KERNEL_RWX is not set</para></listitem>

				    <listitem><para>CONFIG_FRAME_POINTER=y</para></listitem>

				    <listitem><para>CONFIG_KGDB=y</para></listitem>

				    <listitem><para>CONFIG_KGDB_SERIAL_CONSOLE=y</para></listitem>

				    <listitem><para>CONFIG_KGDB_KDB=y</para></listitem>

				    <listitem><para>CONFIG_KDB_KEYBOARD=y</para></listitem>

				    </itemizedlist>

				    </para>

				  </sect1>

				  </chapter>

				  <chapter id="kgdbKernelArgs">

				  <title>Kernel Debugger Boot Arguments</title>

				  <para>This section describes the various runtime kernel

				  parameters that affect the configuration of the kernel debugger.

				  The following chapter covers using kdb and kgdb as well as

				  providing some examples of the configuration parameters.</para>

				   <sect1 id="kgdboc">

				   <title>Kernel parameter: kgdboc</title>

				   <para>The kgdboc driver was originally an abbreviation meant to

				   stand for "kgdb over console".  Today it is the primary mechanism

				   to configure how to communicate from gdb to kgdb as well as the

				   devices you want to use to interact with the kdb shell.

				   </para>

				   <para>For kgdb/gdb, kgdboc is designed to work with a single serial

				   port. It is intended to cover the circumstance where you want to

				   use a serial console as your primary console as well as using it to

				   perform kernel debugging.  It is also possible to use kgdb on a

				   serial port which is not designated as a system console.  Kgdboc

				   may be configured as a kernel built-in or a kernel loadable module.

				   You can only make use of <constant>kgdbwait</constant> and early

				   debugging if you build kgdboc into the kernel as a built-in.

				   </para>

				   <para>Optionally you can elect to activate kms (Kernel Mode

				   Setting) integration.  When you use kms with kgdboc and you have a

				   video driver that has atomic mode setting hooks, it is possible to

				   enter the debugger on the graphics console.  When the kernel

				   execution is resumed, the previous graphics mode will be restored.

				   This integration can serve as a useful tool to aid in diagnosing

				   crashes or doing analysis of memory with kdb while allowing the

				   full graphics console applications to run.

				   </para>

				   <sect2 id="kgdbocArgs">

				   <title>kgdboc arguments</title>

				   <para>Usage: <constant>kgdboc=[kms][[,]kbd][[,]serial_device][,baud]</constant></para>

				   <para>The order listed above must be observed if you use any of the

				   optional configurations together.

				   </para>

				   <para>Abbreviations:

				   <itemizedlist>

				   <listitem><para>kms = Kernel Mode Setting</para></listitem>

				   <listitem><para>kbd = Keyboard</para></listitem>

				   </itemizedlist>

				   </para>
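				   <para>The ordering rule can be illustrated with a small
				   stand-alone C sketch.  This is not the kernel's parser:
				   struct kgdboc_opts, tok_is(), at_end() and parse_kgdboc() are
				   hypothetical helpers written for this example, and the sketch
				   assumes that an all-digit field is a baud rate and that any
				   other unrecognized field is a serial device name.
				   <programlisting>
/* Illustrative only; these names do not exist in the kernel. */
struct kgdboc_opts {
	int ok;		/* 1 if the fields appear in the documented order */
	int kms;
	int kbd;
	char tty[16];	/* serial device name, empty if none was given */
	long baud;	/* 0 if no baud rate was given */
};

/* A field ends at a comma or at the end of the string. */
static int at_end(char c)
{
	return c == '\0' || c == ',';
}

/* Compare the n-character field at s with the NUL-terminated word w. */
static int tok_is(const char *s, int n, const char *w)
{
	int i;

	for (i = 0; i != n; i++)
		if (w[i] == '\0' || s[i] != w[i])
			return 0;
	return w[i] == '\0';
}

static struct kgdboc_opts parse_kgdboc(const char *s)
{
	struct kgdboc_opts o = { 1, 0, 0, "", 0 };
	int stage = 0;	/* 1: kms seen, 2: kbd seen, 3: device seen, 4: baud seen */

	while (*s != '\0') {
		int n = 0, digits = 1, i;

		/* Measure the field and note whether it is purely numeric. */
		while (!at_end(s[n])) {
			if ('0' > s[n] || s[n] > '9')
				digits = 0;
			n++;
		}
		if (n == 0)
			goto bad;		/* empty field */

		if (tok_is(s, n, "kms")) {
			if (stage > 0)
				goto bad;	/* kms must come first */
			o.kms = 1;
			stage = 1;
		} else if (tok_is(s, n, "kbd")) {
			if (stage > 1)
				goto bad;	/* kbd must precede the device */
			o.kbd = 1;
			stage = 2;
		} else if (digits) {
			if (stage != 3 || n > 9)
				goto bad;	/* baud must directly follow a device */
			for (i = 0; i != n; i++)
				o.baud = o.baud * 10 + (s[i] - '0');
			stage = 4;
		} else {
			if (stage > 2 || n >= (int)sizeof(o.tty))
				goto bad;	/* at most one device, before baud */
			for (i = 0; i != n; i++)
				o.tty[i] = s[i];
			o.tty[n] = '\0';
			stage = 3;
		}

		s += n;
		if (*s == ',') {
			s++;
			if (*s == '\0')
				goto bad;	/* trailing comma */
		}
	}
	return o;
bad:
	o.ok = 0;
	return o;
}
				   </programlisting>
				   With this sketch, the string kms,kbd,ttyS0,115200 is accepted,
				   while ttyS0,kbd is rejected because kbd appears after the
				   serial device.
				   </para>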

				   <para>You can configure kgdboc to use the keyboard, a serial

				   device, or both, depending on whether you are using kdb, kgdb,

				   or both.  Using kms with only gdb is generally not a useful

				   combination.</para>

				   <sect3 id="kgdbocArgs1">

				   <title>Using loadable module or built-in</title>

				   <para>

				   <orderedlist>

				   <listitem><para>As a kernel built-in:</para>

				   <para>Use the kernel boot argument: <constant>kgdboc=&lt;tty-device&gt;[,baud]</constant></para></listitem>

				   <listitem>

				   <para>As a kernel loadable module:</para>

				   <para>Use the command: <constant>modprobe kgdboc kgdboc=&lt;tty-device&gt;[,baud]</constant></para>

				   <para>Here are two examples of how you might format the kgdboc

				   string. The first is for an x86 target using the first serial port.

				   The second example is for the ARM Versatile AB using the second

				   serial port.

				   <orderedlist>

				   <listitem><para><constant>kgdboc=ttyS0,115200</constant></para></listitem>

				   <listitem><para><constant>kgdboc=ttyAMA1,115200</constant></para></listitem>

				   </orderedlist>

				   </para>

				   </listitem>

				   </orderedlist></para>

				   </sect3>

				   <sect3 id="kgdbocArgs2">

				   <title>Configure kgdboc at runtime with sysfs</title>

				   <para>At run time you can enable or disable kgdboc by echoing a

				   parameter into sysfs.  Here are two examples:</para>

				   <orderedlist>

				   <listitem><para>Enable kgdboc on ttyS0</para>

				   <para><constant>echo ttyS0 &gt; /sys/module/kgdboc/parameters/kgdboc</constant></para></listitem>

				   <listitem><para>Disable kgdboc</para>

				   <para><constant>echo "" &gt; /sys/module/kgdboc/parameters/kgdboc</constant></para></listitem>

				   </orderedlist>

				   <para>NOTE: You do not need to specify the baud if you are

				   configuring the console on a tty which is already configured or

				   open.</para>

				   </sect3>

				   <sect3 id="kgdbocArgs3">

				   <title>More examples</title>

				   <para>You can configure kgdboc to use the keyboard, a serial device,

				   or both, depending on whether you are using kdb, kgdb, or both,

				   in one of the following scenarios.

				   <orderedlist>

				   <listitem><para>kdb and kgdb over only a serial port</para>

				   <para><constant>kgdboc=&lt;serial_device&gt;[,baud]</constant></para>

				   <para>Example: <constant>kgdboc=ttyS0,115200</constant></para>

				   </listitem>

				   <listitem><para>kdb and kgdb with keyboard and a serial port</para>

				   <para><constant>kgdboc=kbd,&lt;serial_device&gt;[,baud]</constant></para>

				   <para>Example: <constant>kgdboc=kbd,ttyS0,115200</constant></para>

				   </listitem>

				   <listitem><para>kdb with a keyboard</para>

				   <para><constant>kgdboc=kbd</constant></para>

				   </listitem>

				   <listitem><para>kdb with kernel mode setting</para>

				   <para><constant>kgdboc=kms,kbd</constant></para>

				   </listitem>

				   <listitem><para>kdb with kernel mode setting and kgdb over a serial port</para>

				   <para><constant>kgdboc=kms,kbd,ttyS0,115200</constant></para>

				   </listitem>

				   </orderedlist>

				   </para>

				   <para>NOTE: Kgdboc does not support interrupting the target via the

				   gdb remote protocol.  You must manually send a sysrq-g unless you

				   have a proxy that splits console output to a terminal program.

				   A console proxy has a separate TCP port for the debugger and a separate

				   TCP port for the "human" console.  The proxy can take care of sending

				   the sysrq-g for you.

				   </para>

				   <para>When using kgdboc with no debugger proxy, you can end up

				    connecting the debugger at one of two entry points.  If an

				    exception occurs after you have loaded kgdboc, a message should

				    print on the console stating it is waiting for the debugger.  In

				    this case you disconnect your terminal program and then connect the

				    debugger in its place.  If you want to interrupt the target system

				    and forcibly enter a debug session you have to issue a Sysrq

				    sequence and then type the letter <constant>g</constant>.  Then

				    you disconnect the terminal session and connect gdb.  If you do

				    not want to do this by hand, you can either modify gdb so that

				    it sends the sysrq-g for you, including on the initial connect,

				    or use a debugger proxy that allows an unmodified gdb to do the

				    debugging.

				   </para>

				   </sect3>

				   </sect2>

				   </sect1>

				   <sect1 id="kgdbwait">

				   <title>Kernel parameter: kgdbwait</title>

				   <para>

				   The kernel command line option <constant>kgdbwait</constant> makes

				   kgdb wait for a debugger connection during booting of a kernel.  You

				   can only use this option if you compiled a kgdb I/O driver into the

				   kernel and you specified the I/O driver configuration as a kernel

				   command line option.  The kgdbwait parameter should always follow the

				   configuration parameter for the kgdb I/O driver in the kernel

				   command line; otherwise the I/O driver will not be configured prior to

				   asking the kernel to use it to wait.

				   </para>
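				   <para>
				   As a minimal sketch of that ordering, assuming the kgdboc I/O driver
				   is built into the kernel and the debugger is attached to ttyS0 at
				   115200 baud (the same illustrative port used elsewhere in this
				   document), the kernel command line would contain:
				   </para>
				   <programlisting>
				   kgdboc=ttyS0,115200 kgdbwait
				   </programlisting>
				   <para>
				   Placing <constant>kgdbwait</constant> before <constant>kgdboc=</constant>
				   would ask the kernel to wait on an I/O driver that has not been
				   configured yet.
				   </para>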

				   <para>

				   The kernel will stop and wait as early as the I/O driver and

				   architecture allow when you use this option.  If you build the

				   kgdb I/O driver as a loadable kernel module, kgdbwait will not do

				   anything.

				   </para>

				   </sect1>

				   <sect1 id="kgdbcon">

				   <title>Kernel parameter: kgdbcon</title>

				   <para> The kgdbcon feature allows you to see printk() messages

				   inside gdb while gdb is connected to the kernel.  Kdb does not make

				    use of the kgdbcon feature.

				   </para>

				   <para>Kgdb supports using the gdb serial protocol to send console

				   messages to the debugger when the debugger is connected and running.

				   There are two ways to activate this feature.

				   <orderedlist>

				   <listitem><para>Activate with the kernel command line option:</para>

				   <para><constant>kgdbcon</constant></para>

				   </listitem>

				   <listitem><para>Use sysfs before configuring an I/O driver</para>

				   <para>

				   <constant>echo 1 &gt; /sys/module/debug_core/parameters/kgdb_use_con</constant>

				   </para>

				   <para>

				   NOTE: If you do this after you configure the kgdb I/O driver, the

				   setting will not take effect until the next time the I/O driver is

				   reconfigured.

				   </para>

				   </listitem>

				   </orderedlist>

				  </para>

				   <para>IMPORTANT NOTE: You cannot use kgdboc + kgdbcon on a tty that is an

				   active system console.  An example of incorrect usage is <constant>console=ttyS0,115200 kgdboc=ttyS0 kgdbcon</constant>.

				   </para>

				   <para>It is possible to use this option with kgdboc on a tty that is not a system console.

				   </para>
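				   <para>
				   For contrast with the incorrect usage above, here is a sketch of a
				   command line that keeps the system console off the debug tty.  It
				   assumes the console lives on the first virtual terminal (tty0) and
				   that ttyS0 is reserved for kgdb; both device names are illustrative
				   and may differ on your hardware.
				   </para>
				   <programlisting>
				   console=tty0 kgdboc=ttyS0,115200 kgdbcon
				   </programlisting>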

				  </sect1>

				   <sect1 id="kgdbreboot">

				   <title>Run time parameter: kgdbreboot</title>

				   <para> The kgdbreboot feature allows you to change how the debugger

				   deals with the reboot notification.  You have 3 choices for the

				   behavior.  The default behavior is always set to 0.</para>

				   <orderedlist>

				   <listitem><para><constant>echo -1 &gt; /sys/module/debug_core/parameters/kgdbreboot</constant></para>

				   <para>Ignore the reboot notification entirely.</para>

				   </listitem>

				   <listitem><para><constant>echo 0 &gt; /sys/module/debug_core/parameters/kgdbreboot</constant></para>

				   <para>Send the detach message to any attached debugger client.</para>

				   </listitem>

				   <listitem><para><constant>echo 1 &gt; /sys/module/debug_core/parameters/kgdbreboot</constant></para>

				   <para>Enter the debugger on reboot notify.</para>

				   </listitem>

				   </orderedlist>
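				   <para>
				   Because kgdbreboot is exposed as a module parameter, the current
				   setting can presumably be read back from the same sysfs file before
				   it is changed.  A short sketch, assuming sysfs is mounted at /sys and
				   the commands are run from a root shell:
				   </para>
				   <programlisting>
				   cat /sys/module/debug_core/parameters/kgdbreboot
				   echo 1 &gt; /sys/module/debug_core/parameters/kgdbreboot
				   </programlisting>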

				  </sect1>

				  </chapter>

				  <chapter id="usingKDB">

				  <title>Using kdb</title>

				  <para>

				  </para>

				  <sect1 id="quickKDBserial">

				  <title>Quick start for kdb on a serial port</title>

				  <para>This is a quick example of how to use kdb.</para>

				  <para><orderedlist>

				  <listitem><para>Configure kgdboc at boot using kernel parameters:

				  <itemizedlist>

				  <listitem><para><constant>console=ttyS0,115200 kgdboc=ttyS0,115200</constant></para></listitem>

				  </itemizedlist></para>

				  <para>OR</para>

				  <para>Configure kgdboc after the kernel has booted; assuming you are using a serial port console:

				  <itemizedlist>

				  <listitem><para><constant>echo ttyS0 &gt; /sys/module/kgdboc/parameters/kgdboc</constant></para></listitem>

				  </itemizedlist>

				  </para>

				  </listitem>

				  <listitem><para>Enter the kernel debugger manually or by waiting for an oops or fault.  There are several ways you can enter the kernel debugger manually; all involve using the sysrq-g, which means you must have enabled CONFIG_MAGIC_SYSRQ=y in your kernel config.</para>

				  <itemizedlist>

				  <listitem><para>When logged in as root or with a super user session you can run:</para>

				   <para><constant>echo g &gt; /proc/sysrq-trigger</constant></para></listitem>

				  <listitem><para>Example using minicom 2.2</para>

				  <para>Press: <constant>Control-a</constant></para>

				  <para>Press: <constant>f</constant></para>

				  <para>Press: <constant>g</constant></para>

				  </listitem>

				  <listitem><para>When you have telneted to a terminal server that supports sending a remote break</para>

				  <para>Press: <constant>Control-]</constant></para>

				  <para>Type in: <constant>send break</constant></para>

				  <para>Press: <constant>Enter</constant></para>

				  <para>Press: <constant>g</constant></para>

				  </listitem>

				  </itemizedlist>

				  </listitem>

				  <listitem><para>From the kdb prompt you can run the "help" command to see a complete list of the commands that are available.</para>

				  <para>Some useful commands in kdb include:

				  <itemizedlist>

				  <listitem><para>lsmod  -- Shows where kernel modules are loaded</para></listitem>

				  <listitem><para>ps -- Displays only the active processes</para></listitem>

				  <listitem><para>ps A -- Shows all the processes</para></listitem>

				  <listitem><para>summary -- Shows kernel version info and memory usage</para></listitem>

				  <listitem><para>bt -- Get a backtrace of the current process using dump_stack()</para></listitem>

				  <listitem><para>dmesg -- View the kernel syslog buffer</para></listitem>

				  <listitem><para>go -- Continue the system</para></listitem>

				  </itemizedlist>

				  </para>

				  </listitem>

				  <listitem>

				  <para>When you are done using kdb you need to consider rebooting the

				  system or using the "go" command to resume normal kernel

				  execution.  If you have paused the kernel for a lengthy period of

				  time, applications that rely on timely networking or anything to do

				  with real wall clock time could be adversely affected, so you

				  should take this into consideration when using the kernel

				  debugger.</para>

				  </listitem>

				  </orderedlist></para>

				  </sect1>

				  <sect1 id="quickKDBkeyboard">

				  <title>Quick start for kdb using a keyboard connected console</title>

				  <para>This is a quick example of how to use kdb with a keyboard.</para>

				  <para><orderedlist>

				  <listitem><para>Configure kgdboc at boot using kernel parameters:

				  <itemizedlist>

				  <listitem><para><constant>kgdboc=kbd</constant></para></listitem>

				  </itemizedlist></para>

				  <para>OR</para>

				  <para>Configure kgdboc after the kernel has booted:

				  <itemizedlist>

				  <listitem><para><constant>echo kbd &gt; /sys/module/kgdboc/parameters/kgdboc</constant></para></listitem>

				  </itemizedlist>

				  </para>

				  </listitem>

				  <listitem><para>Enter the kernel debugger manually or by waiting for an oops or fault.  There are several ways you can enter the kernel debugger manually; all involve using the sysrq-g, which means you must have enabled CONFIG_MAGIC_SYSRQ=y in your kernel config.</para>

				  <itemizedlist>

				  <listitem><para>When logged in as root or with a super user session you can run:</para>

				   <para><constant>echo g &gt; /proc/sysrq-trigger</constant></para></listitem>

				  <listitem><para>Example using a laptop keyboard</para>

				  <para>Press and hold down: <constant>Alt</constant></para>

				  <para>Press and hold down: <constant>Fn</constant></para>

				  <para>Press and release the key with the label: <constant>SysRq</constant></para>

				  <para>Release: <constant>Fn</constant></para>

				  <para>Press and release: <constant>g</constant></para>

				  <para>Release: <constant>Alt</constant></para>

				  </listitem>

				  <listitem><para>Example using a PS/2 101-key keyboard</para>

				  <para>Press and hold down: <constant>Alt</constant></para>

				  <para>Press and release the key with the label: <constant>SysRq</constant></para>

				  <para>Press and release: <constant>g</constant></para>

				  <para>Release: <constant>Alt</constant></para>

				  </listitem>

				  </itemizedlist>

				  </listitem>

				  <listitem>

				  <para>Now type in a kdb command such as "help", "dmesg" or "bt", or type "go" to continue kernel execution.</para>

				  </listitem>

				  </orderedlist></para>

				  </sect1>

				  </chapter>

				  <chapter id="EnableKGDB">

				   <title>Using kgdb / gdb</title>

				   <para>In order to use kgdb you must activate it by passing

				   configuration information to one of the kgdb I/O drivers.  If you

				   do not pass any configuration information kgdb will not do anything

				   at all.  Kgdb will only actively hook up to the kernel trap hooks

				   if a kgdb I/O driver is loaded and configured.  If you unconfigure

				   a kgdb I/O driver, kgdb will unregister all the kernel hook points.

				   </para>

				   <para> All kgdb I/O drivers can be reconfigured at run time, if

				   <symbol>CONFIG_SYSFS</symbol> and <symbol>CONFIG_MODULES</symbol>

				   are enabled, by echo'ing a new config string to

				   <constant>/sys/module/&lt;driver&gt;/parameters/&lt;option&gt;</constant>.

				   The driver can be unconfigured by passing an empty string.  You cannot

				   change the configuration while the debugger is attached.  Make sure

				   to detach the debugger with the <constant>detach</constant> command

				   prior to trying to unconfigure a kgdb I/O driver.

				   </para>
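				   <para>
				   As a sketch of run-time reconfiguration using the kgdboc driver,
				   assuming a root shell, a serial port named ttyS0 and no debugger
				   currently attached:
				   </para>
				   <programlisting>
				   # configure kgdboc on ttyS0
				   echo ttyS0 &gt; /sys/module/kgdboc/parameters/kgdboc
				   # unconfigure kgdboc again by writing an empty string
				   echo "" &gt; /sys/module/kgdboc/parameters/kgdboc
				   </programlisting>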

				  <sect1 id="ConnectingGDB">

				  <title>Connecting with gdb to a serial port</title>

				  <orderedlist>

				  <listitem><para>Configure kgdboc</para>

				   <para>Configure kgdboc at boot using kernel parameters:

				   <itemizedlist>

				    <listitem><para><constant>kgdboc=ttyS0,115200</constant></para></listitem>

				   </itemizedlist></para>

				   <para>OR</para>

				   <para>Configure kgdboc after the kernel has booted:

				   <itemizedlist>

				    <listitem><para><constant>echo ttyS0 &gt; /sys/module/kgdboc/parameters/kgdboc</constant></para></listitem>

				   </itemizedlist></para>

				  </listitem>

				  <listitem>

				  <para>Stop kernel execution (break into the debugger)</para>

				  <para>In order to connect to gdb via kgdboc, the kernel must

				  first be stopped.  There are several ways to stop the kernel which

				  include using kgdbwait as a boot argument, via a sysrq-g, or running

				  the kernel until it takes an exception where it waits for the

				  debugger to attach.

				  <itemizedlist>

				  <listitem><para>When logged in as root or with a super user session you can run:</para>

				   <para><constant>echo g &gt; /proc/sysrq-trigger</constant></para></listitem>

				  <listitem><para>Example using minicom 2.2</para>

				  <para>Press: <constant>Control-a</constant></para>

				  <para>Press: <constant>f</constant></para>

				  <para>Press: <constant>g</constant></para>

				  </listitem>

				  <listitem><para>When you have telneted to a terminal server that supports sending a remote break</para>

				  <para>Press: <constant>Control-]</constant></para>

				  <para>Type in: <constant>send break</constant></para>

				  <para>Press: <constant>Enter</constant></para>

				  <para>Press: <constant>g</constant></para>

				  </listitem>

				  </itemizedlist>

				  </para>

				  </listitem>

				  <listitem>

				    <para>Connect from gdb</para>

				    <para>

				    Example (using a directly connected port; newer gdb releases spell the baud setting <constant>set serial baud</constant>):

				    </para>

				    <programlisting>

				    % gdb ./vmlinux

				    (gdb) set remotebaud 115200

				    (gdb) target remote /dev/ttyS0

				    </programlisting>

				    <para>

				    Example (kgdb to a terminal server on TCP port 2012):

				    </para>

				    <programlisting>

				    % gdb ./vmlinux

				    (gdb) target remote 192.168.2.2:2012

				    </programlisting>

				    <para>

				    Once connected, you can debug a kernel the way you would debug an

				    application program.

				    </para>

				    <para>

				    If you are having problems connecting or something is going

				    seriously wrong while debugging, it will most often be the case

				    that you want to enable gdb to be verbose about its target

				    communications.  You do this prior to issuing the <constant>target

				    remote</constant> command by typing in: <constant>set debug remote 1</constant>.

				    </para>

				  </listitem>

				  </orderedlist>

				  <para>Remember if you continue in gdb, and need to "break in" again,

				  you need to issue another sysrq-g.  It is easy to create a simple

				  entry point by putting a breakpoint at <constant>sys_sync</constant>

				  and then you can run "sync" from a shell or script to break into the

				  debugger.</para>
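				  <para>
				  The <constant>sys_sync</constant> entry point described above could
				  be set up roughly as follows.  This is only a sketch; it assumes gdb
				  is already connected to the stopped kernel and that the kernel in use
				  still has a symbol named <constant>sys_sync</constant>.
				  </para>
				  <programlisting>
				  (gdb) break sys_sync
				  (gdb) continue
				  </programlisting>
				  <para>
				  Later, from a shell on the target system:
				  </para>
				  <programlisting>
				  % sync
				  </programlisting>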

				  </sect1>

				  </chapter>

				  <chapter id="switchKdbKgdb">

				  <title>kgdb and kdb interoperability</title>

				  <para>It is possible to transition between kdb and kgdb dynamically.

				  The debug core will remember which you used the last time and

				  automatically start in the same mode.</para>

				  <sect1>

				  <title>Switching between kdb and kgdb</title>

				  <sect2>

				  <title>Switching from kgdb to kdb</title>

				  <para>

				  There are two ways to switch from kgdb to kdb: you can use gdb to

				  issue a maintenance packet, or you can blindly type the command $3#33.

				  Whenever the kernel debugger stops in kgdb mode it will print the

				  message <constant>KGDB or $3#33 for KDB</constant>.  It is important

				  to note that you have to type the sequence correctly in one pass.

				  You cannot type a backspace or delete because kgdb will interpret

				  that as part of the debug stream.

				  <orderedlist>

				  <listitem><para>Change from kgdb to kdb by blindly typing:</para>

				  <para><constant>$3#33</constant></para></listitem>

				  <listitem><para>Change from kgdb to kdb with gdb</para>

				  <para><constant>maintenance packet 3</constant></para>

				  <para>NOTE: Now you must kill gdb. Typically you press control-z and

				  issue the command: kill -9 %</para></listitem>

				  </orderedlist>

				  </para>

				  </sect2>

				  <sect2>

				  <title>Switching from kdb to kgdb</title>

				  <para>There are two ways you can change from kdb to kgdb.  You can

				  manually enter kgdb mode by issuing the kgdb command from the kdb

				  shell prompt, or you can connect gdb while the kdb shell prompt is

				  active.  The kdb shell looks for the typical first commands that gdb

				  would issue with the gdb remote protocol and if it sees one of those

				  commands it automatically changes into kgdb mode.</para>

				  <orderedlist>

				  <listitem><para>From kdb issue the command:</para>

				  <para><constant>kgdb</constant></para>

				  <para>Now disconnect your terminal program and connect gdb in its place.</para></listitem>

				  <listitem><para>At the kdb prompt, disconnect the terminal program and connect gdb in its place.</para></listitem>

				  </orderedlist>

				  </sect2>

				  </sect1>

				  <sect1>

				  <title>Running kdb commands from gdb</title>

				  <para>It is possible to run a limited set of kdb commands from gdb,

				  using the gdb monitor command.  You don't want to execute any of the

				  run control or breakpoint operations, because it can disrupt the

				  state of the kernel debugger.  You should be using gdb for

				  breakpoints and run control operations if you have gdb connected.

				  The more useful commands to run are things like lsmod, dmesg, ps or

				  possibly some of the memory information commands.  To see all the kdb

				  commands you can run <constant>monitor help</constant>.</para>

				  <para>Example:

				  <informalexample><programlisting>

				(gdb) monitor ps

				1 idle process (state I) and

				27 sleeping system daemon (state M) processes suppressed,

				use 'ps A' to see all.

				Task Addr       Pid   Parent [*] cpu State Thread     Command

				0xc78291d0        1        0  0    0   S  0xc7829404  init

				0xc7954150      942        1  0    0   S  0xc7954384  dropbear

				0xc78789c0      944        1  0    0   S  0xc7878bf4  sh

				(gdb)

				  </programlisting></informalexample>

				  </para>

				  </sect1>

				  </chapter>

				  <chapter id="KGDBTestSuite">

				    <title>kgdb Test Suite</title>

				    <para>

				    When kgdb is enabled in the kernel config you can also elect to

				    enable the config parameter KGDB_TESTS.  Turning this on will

				    enable a special kgdb I/O module which is designed to test the

				    kgdb internal functions.

				    </para>

				    <para>

				    The kgdb tests are mainly intended for developers to test the kgdb

				    internals as well as a tool for developing a new kgdb architecture

				    specific implementation.  These tests are not really for end users

				    of the Linux kernel.  The primary source of documentation would be

				    to look in the drivers/misc/kgdbts.c file.

				    </para>

				    <para>

				    The kgdb test suite can also be configured at compile time to run

				    the core set of tests by setting the kernel config parameter

				    KGDB_TESTS_ON_BOOT.  This particular option is aimed at automated

				    regression testing and does not require modifying the kernel boot

				    config arguments.  If this is turned on, the kgdb test suite can

				    be disabled by specifying "kgdbts=" as a kernel boot argument.

				    </para>

				  </chapter>

				  <chapter id="CommonBackEndReq">

				  <title>Kernel Debugger Internals</title>

				  <sect1 id="kgdbArchitecture">

				    <title>Architecture Specifics</title>

				      <para>

				      The kernel debugger is organized into a number of components:

				      <orderedlist>

				      <listitem><para>The debug core</para>

				      <para>

				      The debug core is found in kernel/debug/debug_core.c.  It contains:

				      <itemizedlist>

				      <listitem><para>A generic OS exception handler which includes

				      sync'ing the processors into a stopped state on a multi-CPU

				      system.</para></listitem>

				      <listitem><para>The API to talk to the kgdb I/O drivers</para></listitem>

				      <listitem><para>The API to make calls to the arch-specific kgdb implementation</para></listitem>

				      <listitem><para>The logic to perform safe memory reads and writes while using the debugger</para></listitem>

				      <listitem><para>A full implementation for software breakpoints unless overridden by the arch</para></listitem>

				      <listitem><para>The API to invoke either the kdb or kgdb frontend to the debug core.</para></listitem>

				      <listitem><para>The structures and callback API for atomic kernel mode setting.</para>

				      <para>NOTE: kgdboc is where the kms callbacks are invoked.</para></listitem>

				      </itemizedlist>

				      </para>

				      </listitem>

				      <listitem><para>kgdb arch-specific implementation</para>

				      <para>

				      This implementation is generally found in arch/*/kernel/kgdb.c.

				      As an example, arch/x86/kernel/kgdb.c contains the specifics to

				      implement HW breakpoints as well as the initialization to

				      dynamically register and unregister for the trap handlers on

				      this architecture.  The arch-specific portion implements:

				      <itemizedlist>

				      <listitem><para>An arch-specific trap catcher which

				      invokes kgdb_handle_exception() to start kgdb doing its

				      work</para></listitem>

				      <listitem><para>Translation between the gdb-specific packet format and pt_regs</para></listitem>

				      <listitem><para>Registration and unregistration of architecture specific trap hooks</para></listitem>

				      <listitem><para>Any special exception handling and cleanup</para></listitem>

				      <listitem><para>NMI exception handling and cleanup</para></listitem>

				      <listitem><para>(optional) HW breakpoints</para></listitem>

				      </itemizedlist>

				      </para>

				      </listitem>

				      <listitem><para>gdbstub frontend (aka kgdb)</para>

				      <para>The gdbstub is located in kernel/debug/gdbstub.c. It contains:</para>

				      <itemizedlist>

				        <listitem><para>All the logic to implement the gdb serial protocol</para></listitem>

				      </itemizedlist>

				      </listitem>

				      <listitem><para>kdb frontend</para>

				      <para>The kdb debugger shell is broken down into a number of

				      components.  The kdb core is located in kernel/debug/kdb.  There

				      are a number of helper functions in some of the other kernel

				      components to make it possible for kdb to examine and report

				      information about the kernel without taking locks that could

				      cause a kernel deadlock.  The kdb core implements the following functionality:</para>

				      <itemizedlist>

				        <listitem><para>A simple shell</para></listitem>

				        <listitem><para>The kdb core command set</para></listitem>

				        <listitem><para>A registration API to register additional kdb shell commands.</para>

					<itemizedlist>

				        <listitem><para>A good example of a self-contained kdb module

				        is the "ftdump" command for dumping the ftrace buffer.  See:

				        kernel/trace/trace_kdb.c</para></listitem>

				        <listitem><para>For an example of how to dynamically register

				        a new kdb command you can build the kdb_hello.ko kernel module

				        from samples/kdb/kdb_hello.c.  To build this example you can

				        set CONFIG_SAMPLES=y and CONFIG_SAMPLE_KDB=m in your kernel

				        config.  Later run "modprobe kdb_hello" and the next time you

				        enter the kdb shell, you can run the "hello"

				        command.</para></listitem>

					</itemizedlist></listitem>

				        <listitem><para>The implementation for kdb_printf() which

				        emits messages directly to I/O drivers, bypassing the kernel

				        log.</para></listitem>

				        <listitem><para>SW / HW breakpoint management for the kdb shell</para></listitem>

				      </itemizedlist>

				      </listitem>

				      <listitem><para>kgdb I/O driver</para>

				      <para>

				      Each kgdb I/O driver has to provide an implementation for the following:

				      <itemizedlist>

				      <listitem><para>configuration via built-in or module</para></listitem>

				      <listitem><para>dynamic configuration and kgdb hook registration calls</para></listitem>

				      <listitem><para>read and write character interface</para></listitem>

				      <listitem><para>A cleanup handler for unconfiguring from the kgdb core</para></listitem>

				      <listitem><para>(optional) Early debug methodology</para></listitem>

				      </itemizedlist>

				      Any given kgdb I/O driver has to operate very closely with the

				      hardware and must do it in such a way that does not enable

				      interrupts or change other parts of the system context without

				      completely restoring them. The kgdb core will repeatedly "poll"

				      a kgdb I/O driver for characters when it needs input.  The I/O

				      driver is expected to return immediately if there is no data

				      available.  Doing so allows for the future possibility to touch

				      watchdog hardware in such a way as to have a target system not

				      reset when these are enabled.

				      </para>

				      </listitem>

				      </orderedlist>

				      </para>

				      <para>

				      If you are intent on adding kgdb architecture specific support

				      for a new architecture, the architecture should define

				      <constant>HAVE_ARCH_KGDB</constant> in the architecture specific

				      Kconfig file.  This will enable kgdb for the architecture, and

				      at that point you must create an architecture specific kgdb

				      implementation.

				      </para>

				      <para>

				      There are a few flags which must be set on every architecture in

				      their &lt;asm/kgdb.h&gt; file.  These are:

				      <itemizedlist>

				        <listitem>

				          <para>

				          NUMREGBYTES: The size in bytes of all of the registers, so

				          that we can ensure they will all fit into a packet.

				          </para>

				        </listitem>

				        <listitem>

				          <para>

				          BUFMAX: The size in bytes of the buffer GDB will read into.

				          This must be larger than NUMREGBYTES.

				          </para>

				        </listitem>

				        <listitem>

				          <para>

				          CACHE_FLUSH_IS_SAFE: Set to 1 if it is always safe to call

				          flush_cache_range or flush_icache_range.  On some architectures,

				          these functions may not be safe to call on SMP since we keep other

				          CPUs in a holding pattern.

				          </para>

				        </listitem>

				      </itemizedlist>

				      </para>

				      <para>

				      There are also the following functions for the common backend,

				      found in kernel/debug/debug_core.c, that must be supplied by the

				      architecture-specific backend unless marked as (optional), in

				      which case a default function may be used if the architecture

				      does not need to provide a specific implementation.

				      </para>

				!Iinclude/linux/kgdb.h

				  </sect1>

				  <sect1 id="kgdbocDesign">

				  <title>kgdboc internals</title>

				  <sect2>

				  <title>kgdboc and uarts</title>

				  <para>

				  The kgdboc driver is actually a very thin driver that relies on the

				  underlying low-level hardware driver having "polling hooks"

				  to which the tty driver is attached.  In the initial

				  implementation of kgdboc the serial_core was changed to expose a

				  low level UART hook for doing polled mode reading and writing of a

				  single character while in an atomic context.  When kgdb makes an I/O

				  request to the debugger, kgdboc invokes a callback in the serial

				  core which in turn uses the callback in the UART driver.</para>

				  <para>

				  When using kgdboc with a UART, the UART driver must implement two callbacks in the <constant>struct uart_ops</constant>. Example from drivers/8250.c:<programlisting>

				#ifdef CONFIG_CONSOLE_POLL

					.poll_get_char = serial8250_get_poll_char,

					.poll_put_char = serial8250_put_poll_char,

				#endif

				  </programlisting>

				  Any implementation specifics around creating a polling driver use the

				  <constant>#ifdef CONFIG_CONSOLE_POLL</constant>, as shown above.

				  Keep in mind that polling hooks have to be implemented in such a way

				  that they can be called from an atomic context and have to restore

				  the state of the UART chip on return such that the system can return

				  to normal when the debugger detaches.  You need to be very careful

				  with any kind of lock you consider, because failing here is most likely

				  going to mean pressing the reset button.

				  </para>

				  </sect2>

				  <sect2 id="kgdbocKbd">

				  <title>kgdboc and keyboards</title>

				  <para>The kgdboc driver contains logic to configure communications

				  with an attached keyboard.  The keyboard infrastructure is only

				  compiled into the kernel when CONFIG_KDB_KEYBOARD=y is set in the

				  kernel configuration.</para>

				  <para>The core polled keyboard driver for PS/2 type keyboards

				  is in kernel/debug/kdb/kdb_keyboard.c.  This driver is hooked into the

				  debug core when kgdboc populates the callback in the array

				  called <constant>kdb_poll_funcs[]</constant>.  The

				  kdb_get_kbd_char() is the top-level function which polls hardware

				  for single character input.

				  </para>

				  </sect2>

				  <sect2 id="kgdbocKms">

				  <title>kgdboc and kms</title>

				  <para>The kgdboc driver contains logic to request the graphics

				  display to switch to a text context when you are using

				  "kgdboc=kms,kbd", provided that you have a video driver which has a

				  frame buffer console and atomic kernel mode setting support.</para>

				  <para>

				  Every time the kernel

				  debugger is entered it calls kgdboc_pre_exp_handler() which in turn

				  calls con_debug_enter() in the virtual console layer.  On resuming kernel

				  execution, the kernel debugger calls kgdboc_post_exp_handler() which

				  in turn calls con_debug_leave().</para>

				  <para>Any video driver that wants to be compatible with the kernel

				  debugger and the atomic kms callbacks must implement the

				  mode_set_base_atomic, fb_debug_enter and fb_debug_leave operations.

				  For the fb_debug_enter and fb_debug_leave the option exists to use

				  the generic drm fb helper functions or implement something custom for

				  the hardware.  The following example shows the initialization of the

				  .mode_set_base_atomic operation in

				  drivers/gpu/drm/i915/intel_display.c:

				  <informalexample>

				  <programlisting>

				static const struct drm_crtc_helper_funcs intel_helper_funcs = {

				[...]

				        .mode_set_base_atomic = intel_pipe_set_base_atomic,

				[...]

				};

				  </programlisting>

				  </informalexample>

				  </para>

				  <para>Here is an example of how the i915 driver initializes the fb_debug_enter and fb_debug_leave functions to use the generic drm helpers in

				  drivers/gpu/drm/i915/intel_fb.c:

				  <informalexample>

				  <programlisting>

				static struct fb_ops intelfb_ops = {

				[...]

				       .fb_debug_enter = drm_fb_helper_debug_enter,

				       .fb_debug_leave = drm_fb_helper_debug_leave,

				[...]

				};

				  </programlisting>

				  </informalexample>

				  </para>

				  </sect2>

				  </sect1>

				  </chapter>

				  <chapter id="credits">

				     <title>Credits</title>

					<para>

						The following people have contributed to this document:

						<orderedlist>

							<listitem><para>Amit Kale<email>amitkale@linsyssoft.com</email></para></listitem>

							<listitem><para>Tom Rini<email>trini@kernel.crashing.org</email></para></listitem>

						</orderedlist>

				                In March 2008 this document was completely rewritten by:

						<itemizedlist>

						<listitem><para>Jason Wessel<email>jason.wessel@windriver.com</email></para></listitem>

						</itemizedlist>

				                In Jan 2010 this document was updated to include kdb.

						<itemizedlist>

						<listitem><para>Jason Wessel<email>jason.wessel@windriver.com</email></para></listitem>

						</itemizedlist>

					</para>

				  </chapter>

				</book>

Documentation/DocBook/libata.tmpl (new file, 1625 lines added; file diff suppressed because it is too large)

Documentation/DocBook/librs.tmpl (new file, 289 lines added)

				@@ -0,0 +1,289 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="Reed-Solomon-Library-Guide">

				 <bookinfo>

				  <title>Reed-Solomon Library Programming Interface</title>

				  <authorgroup>

				   <author>

				    <firstname>Thomas</firstname>

				    <surname>Gleixner</surname>

				    <affiliation>

				     <address>

				      <email>tglx@linutronix.de</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2004</year>

				   <holder>Thomas Gleixner</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License version 2 as published by the Free Software Foundation.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="intro">

				      <title>Introduction</title>

				  <para>

				  	The generic Reed-Solomon Library provides encoding, decoding

					and error correction functions.

				  </para>

				  <para>

				  	Reed-Solomon codes are used in communication and storage

					applications to ensure data integrity. 

				  </para>

				  <para>

				  	This documentation is provided for developers who want to utilize

					the functions provided by the library.

				  </para>

				  </chapter>

				  <chapter id="bugs">

				     <title>Known Bugs And Assumptions</title>

				  <para>

					None.	

				  </para>

				  </chapter>

				  <chapter id="usage">

				     	<title>Usage</title>

					<para>

						This chapter provides examples of how to use the library.

					</para>

					<sect1>

						<title>Initializing</title>

						<para>

							The init function init_rs returns a pointer to an

							rs decoder structure, which holds the necessary

							information for encoding, decoding and error correction

							with the given polynomial. It either uses an existing

							matching decoder or creates a new one. On creation all

							the lookup tables for fast en/decoding are created.

							The function may take a while, so make sure not to 

							call it in critical code paths.

						</para>

						<programlisting>

				/* the Reed Solomon control structure */

				static struct rs_control *rs_decoder;

				/* Symbolsize is 10 (bits)

				 * Primitive polynomial is x^10+x^3+1

				 * first consecutive root is 0

				 * primitive element to generate roots = 1

				 * generator polynomial degree (number of roots) = 6

				 */

				rs_decoder = init_rs (10, 0x409, 0, 1, 6);

						</programlisting>
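						<para>
							To make the parameters above more concrete, the following
							user-space sketch (not the kernel implementation) builds the
							power and logarithm tables of the Galois field selected by a
							symbol size and a field polynomial. The names gf_build_tables,
							alpha_to and index_of are chosen for this illustration only.
							The value 0x409 has bits 10, 3 and 0 set, which encodes
							x^10+x^3+1. The function returns the multiplicative order of
							the primitive element, which is 2^10 - 1 = 1023 when the
							polynomial is primitive.
						</para>

```c
#include <stdint.h>

#define GF_MAX_SYMSIZE 16

/*
 * Fill alpha_to[i] = alpha^i and index_of[alpha^i] = i for the field
 * GF(2^symsize) defined by gfpoly (bit k set means x^k is present).
 * Both arrays need room for 2^symsize entries.
 *
 * Returns the multiplicative order of alpha. The order equals
 * 2^symsize - 1 only when gfpoly is primitive. Returns 0 if symsize is
 * out of range or alpha never returns to 1.
 */
static unsigned int gf_build_tables(int symsize, unsigned int gfpoly,
				    uint16_t *alpha_to, uint16_t *index_of)
{
	unsigned int nn, sr = 1, i, order = 0;

	if (symsize < 1 || symsize > GF_MAX_SYMSIZE)
		return 0;
	nn = (1u << symsize) - 1;

	index_of[0] = (uint16_t)nn;	/* log(0) is undefined; nn marks it */
	alpha_to[nn] = 0;
	for (i = 0; i < nn; i++) {
		index_of[sr] = (uint16_t)i;
		alpha_to[i] = (uint16_t)sr;
		sr <<= 1;			/* multiply by alpha, i.e. by x */
		if (sr & (1u << symsize))
			sr ^= gfpoly;		/* reduce modulo the field polynomial */
		sr &= nn;
		if (sr == 1 && order == 0)
			order = i + 1;		/* first return to alpha^0 */
	}
	return order;
}
```

						<para>
							A caller would provide two arrays of 1024 entries for a symbol
							size of 10 and could reject a polynomial whose reported order
							differs from 1023.
						</para>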

					</sect1>

					<sect1>

						<title>Encoding</title>

						<para>

							The encoder calculates the Reed-Solomon code over

							the given data length and stores the result in 

							the parity buffer. Note that the parity buffer must

							be initialized before calling the encoder.

						</para>

						<para>

							The expanded data can be inverted on the fly by

							providing a non-zero inversion mask; the expanded data is

							XORed with the mask. This is used, for example, for FLASH

							ECC, where erased data reads as all 0xFF and is inverted

							to all 0x00. The Reed-Solomon code for all 0x00 is all 0x00.

							The code is inverted before it is stored to FLASH, so it is

							all 0xFF too. As a result, reading from an erased FLASH

							does not produce ECC errors.

						</para>

						<para>

							The data bytes are expanded to the given symbol size

							on the fly. Encoding continuous bitstreams with a symbol

							size other than 8 is not supported at the moment. If

							such functionality is needed, it should be straightforward

							to implement.

						</para>

						<programlisting>

				/* Parity buffer. Size = number of roots */

				uint16_t par[6];

				/* Initialize the parity buffer */

				memset(par, 0, sizeof(par));

				/* Encode 512 bytes in data8. Store parity in buffer par */

				encode_rs8 (rs_decoder, data8, 512, par, 0);

						</programlisting>
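						<para>
							The effect of the inversion mask can be illustrated with a
							small user-space sketch. The helper names rs_expand_symbol and
							rs_invert_parity are hypothetical and are not part of the
							library; the sketch only models the XOR steps described above
							and assumes that parity symbols are kept one per uint16_t.
							How a driver packs parity symbols into FLASH bytes is outside
							its scope.
						</para>

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Expand one data byte to a symbol: XOR it with the inversion mask,
 * then keep only the low symsize bits.
 */
static uint16_t rs_expand_symbol(uint8_t byte, uint16_t invmsk, int symsize)
{
	uint16_t msk = (uint16_t)((1u << symsize) - 1);

	return (uint16_t)((byte ^ invmsk) & msk);
}

/*
 * Invert every parity symbol in place, for example before writing the
 * parity to FLASH, so that an all-zero parity is stored as all-ones.
 */
static void rs_invert_parity(uint16_t *par, size_t nroots, int symsize)
{
	uint16_t msk = (uint16_t)((1u << symsize) - 1);
	size_t i;

	for (i = 0; i < nroots; i++)
		par[i] ^= msk;
}
```

						<para>
							With an inversion mask of 0xFF, an erased byte (0xFF) expands
							to the zero symbol, and applying rs_invert_parity twice
							restores the original parity.
						</para>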

					</sect1>

					<sect1>

						<title>Decoding</title>

						<para>

							The decoder calculates the syndrome over

							the given data length and the received parity symbols

							and corrects errors in the data.

						</para>

						<para>

							If a syndrome is available from a hardware decoder

							then the syndrome calculation is skipped.

						</para>

						<para>

							The correction of the data buffer can be suppressed

							by providing a correction pattern buffer and an error

							location buffer to the decoder. The decoder stores the

							calculated error location and the correction bitmask

							in the given buffers. This is useful for hardware

							decoders which use a weird bit ordering scheme.

						</para>

						<para>

							The data bytes are expanded to the given symbol size

							on the fly. Decoding continuous bitstreams with a symbol

							size other than 8 is not supported at the moment. If

							such functionality is needed, it should be straightforward

							to implement.

						</para>

						<sect2>

						<title>

							Decoding with syndrome calculation, direct data correction

						</title>

						<programlisting>

				/* Parity buffer. Size = number of roots */

				uint16_t par[6];

				uint8_t  data[512];

				int numerr;

				/* Receive data */

				.....

				/* Receive parity */

				.....

				/* Decode 512 bytes in data */

				numerr = decode_rs8 (rs_decoder, data, par, 512, NULL, 0, NULL, 0, NULL);

						</programlisting>

						</sect2>

						<sect2>

						<title>

							Decoding with syndrome given by hardware decoder, direct data correction

						</title>

						<programlisting>

				/* Parity buffer. Size = number of roots */

				uint16_t par[6], syn[6];

				uint8_t  data[512];

				int numerr;

				/* Receive data */

				.....

				/* Receive parity */

				.....

				/* Get syndrome from hardware decoder */

				.....

				/* Decode 512 bytes in data, using the syndrome from the hardware decoder */

				numerr = decode_rs8 (rs_decoder, data, par, 512, syn, 0, NULL, 0, NULL);

						</programlisting>

						</sect2>

						<sect2>

						<title>

							Decoding with syndrome given by hardware decoder, no direct data correction.

						</title>

						<para>

							Note: It's not necessary to give data and received parity to the decoder.

						</para>

						<programlisting>

				/* Parity buffer. Size = number of roots */

				uint16_t par[6], syn[6], corr[8];

				uint8_t  data[512];

				int i, numerr, errpos[8];

				/* Receive data */

				.....

				/* Receive parity */

				.....

				/* Get syndrome from hardware decoder */

				.....

				/* Compute error positions and correction patterns for 512 bytes from the syndrome */

				numerr = decode_rs8 (rs_decoder, NULL, NULL, 512, syn, 0, errpos, 0, corr);

				for (i = 0; i &lt; numerr; i++) {

					do_error_correction_in_your_buffer(errpos[i], corr[i]);

				}

						</programlisting>
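						<para>
							The placeholder do_error_correction_in_your_buffer is left to
							the caller. One possible shape for it is sketched below as a
							user-space helper. The name apply_corrections is hypothetical,
							and the sketch rests on assumptions that should be checked
							against the decoder source: that each reported position
							indexes a data byte when it is smaller than the data length,
							that other positions refer to parity symbols, and that a
							negative decoder return value signals an uncorrectable block.
						</para>

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Apply the correction patterns reported by the decoder to a byte buffer.
 * Positions outside the buffer (assumed here to be parity symbols) are
 * skipped. Returns the number of data bytes changed, or the negative
 * decoder result unchanged.
 */
static int apply_corrections(uint8_t *buf, size_t len, const int *errpos,
			     const uint16_t *corr, int numerr)
{
	int i, applied = 0;

	if (numerr < 0)
		return numerr;		/* decoder reported a failure */

	for (i = 0; i < numerr; i++) {
		if (errpos[i] < 0 || (size_t)errpos[i] >= len)
			continue;	/* not a data position; ignored here */
		buf[errpos[i]] ^= (uint8_t)corr[i];
		applied++;
	}
	return applied;
}
```

						<para>
							A driver for hardware with an unusual bit ordering would
							reorder the bits of each correction pattern before the XOR;
							that step depends on the hardware and is not shown.
						</para>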

						</sect2>

					</sect1>

					<sect1>

						<title>Cleanup</title>

						<para>

							The function free_rs frees the allocated resources,

							if the caller is the last user of the decoder.

						</para>

						<programlisting>

				/* Release resources */

				free_rs(rs_decoder);

						</programlisting>

					</sect1>

				  </chapter>

				  <chapter id="structs">

				     <title>Structures</title>

				     <para>

				     This chapter contains the autogenerated documentation of the structures which are

				     used in the Reed-Solomon Library and are relevant for a developer.

				     </para>

				!Iinclude/linux/rslib.h

				  </chapter>

				  <chapter id="pubfunctions">

				     <title>Public Functions Provided</title>

				     <para>

				     This chapter contains the autogenerated documentation of the Reed-Solomon functions

				     which are exported.

				     </para>

				!Elib/reed_solomon/reed_solomon.c

				  </chapter>

				  <chapter id="credits">

				     <title>Credits</title>

					<para>

						The library code for encoding and decoding was written by Phil Karn.

					</para>

					<programlisting>

						Copyright 2002, Phil Karn, KA9Q

				 		May be used under the terms of the GNU General Public License (GPL)

					</programlisting>

					<para>

						The wrapper functions and interfaces were written by Thomas Gleixner.

					</para>

					<para>

						Many users have provided bugfixes, improvements and helping hands for testing.

						Thanks a lot.

					</para>

					<para>

						The following people have contributed to this document:

					</para>

					<para>

						Thomas Gleixner<email>tglx@linutronix.de</email>

					</para>

				  </chapter>

				</book>

									
Documentation/DocBook/lsm.tmpl (normal file, 265 lines changed)

				@@ -0,0 +1,265 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<article class="whitepaper" id="LinuxSecurityModule" lang="en">

				 <articleinfo>

				 <title>Linux Security Modules:  General Security Hooks for Linux</title>

				 <authorgroup>

				 <author>

				 <firstname>Stephen</firstname> 

				 <surname>Smalley</surname>

				 <affiliation>

				 <orgname>NAI Labs</orgname>

				 <address><email>ssmalley@nai.com</email></address>

				 </affiliation>

				 </author>

				 <author>

				 <firstname>Timothy</firstname> 

				 <surname>Fraser</surname>

				 <affiliation>

				 <orgname>NAI Labs</orgname>

				 <address><email>tfraser@nai.com</email></address>

				 </affiliation>

				 </author>

				 <author>

				 <firstname>Chris</firstname> 

				 <surname>Vance</surname>

				 <affiliation>

				 <orgname>NAI Labs</orgname>

				 <address><email>cvance@nai.com</email></address>

				 </affiliation>

				 </author>

				 </authorgroup>

				 </articleinfo>

				<sect1 id="Introduction"><title>Introduction</title>

				<para>

				In March 2001, the National Security Agency (NSA) gave a presentation

				about Security-Enhanced Linux (SELinux) at the 2.5 Linux Kernel

				Summit.  SELinux is an implementation of flexible and fine-grained

				nondiscretionary access controls in the Linux kernel, originally

				implemented as its own particular kernel patch.  Several other

				security projects (e.g. RSBAC, Medusa) have also developed flexible

				access control architectures for the Linux kernel, and various

				projects have developed particular access control models for Linux

				(e.g. LIDS, DTE, SubDomain).  Each project has developed and

				maintained its own kernel patch to support its security needs.

				</para>

				<para>

				In response to the NSA presentation, Linus Torvalds made a set of

				remarks that described a security framework he would be willing to

				consider for inclusion in the mainstream Linux kernel.  He described a

				general framework that would provide a set of security hooks to

				control operations on kernel objects and a set of opaque security

				fields in kernel data structures for maintaining security attributes.

				This framework could then be used by loadable kernel modules to

				implement any desired model of security.  Linus also suggested the

				possibility of migrating the Linux capabilities code into such a

				module.

				</para>

				<para>

				The Linux Security Modules (LSM) project was started by WireX to

				develop such a framework.  LSM is a joint development effort by

				several security projects, including Immunix, SELinux, SGI and Janus,

				and several individuals, including Greg Kroah-Hartman and James

				Morris, to develop a Linux kernel patch that implements this

				framework.  The patch is currently tracking the 2.4 series and is

				targeted for integration into the 2.5 development series.  This

				technical report provides an overview of the framework and the example

				capabilities security module provided by the LSM kernel patch.

				</para>

				</sect1>

				<sect1 id="framework"><title>LSM Framework</title>

				<para>

				The LSM kernel patch provides a general kernel framework to support

				security modules.  In particular, the LSM framework is primarily

				focused on supporting access control modules, although future

				development is likely to address other security needs such as

				auditing.  By itself, the framework does not provide any additional

				security; it merely provides the infrastructure to support security

				modules.  The LSM kernel patch also moves most of the capabilities

				logic into an optional security module, with the system defaulting

				to the traditional superuser logic.  This capabilities module

				is discussed further in <xref linkend="cap"/>.

				</para>

				<para>

				The LSM kernel patch adds security fields to kernel data structures

				and inserts calls to hook functions at critical points in the kernel

				code to manage the security fields and to perform access control.  It

				also adds functions for registering and unregistering security

				modules, and adds a general <function>security</function> system call

				to support new system calls for security-aware applications.

				</para>

				<para>

				The LSM security fields are simply <type>void*</type> pointers.  For

				process and program execution security information, security fields

				were added to <structname>struct task_struct</structname> and 

				<structname>struct linux_binprm</structname>.  For filesystem security

				information, a security field was added to 

				<structname>struct super_block</structname>.  For pipe, file, and socket

				security information, security fields were added to 

				<structname>struct inode</structname> and 

				<structname>struct file</structname>.  For packet and network device security

				information, security fields were added to

				<structname>struct sk_buff</structname> and

				<structname>struct net_device</structname>.  For System V IPC security

				information, security fields were added to

				<structname>struct kern_ipc_perm</structname> and

				<structname>struct msg_msg</structname>; additionally, the definitions

				for <structname>struct msg_msg</structname>, <structname>struct 

				msg_queue</structname>, and <structname>struct 

				shmid_kernel</structname> were moved to header files

				(<filename>include/linux/msg.h</filename> and

				<filename>include/linux/shm.h</filename> as appropriate) to allow

				the security modules to use these definitions.

				</para>

				<para>

				Each LSM hook is a function pointer in a global table,

				security_ops. This table is a

				<structname>security_operations</structname> structure as defined by

				<filename>include/linux/security.h</filename>.  Detailed documentation

				for each hook is included in this header file.  At present, this

				structure consists of a collection of substructures that group related

				hooks based on the kernel object (e.g. task, inode, file, sk_buff,

				etc) as well as some top-level hook function pointers for system

				operations.  This structure is likely to be flattened in the future

				for performance.  The placement of the hook calls in the kernel code

				is described by the "called:" lines in the per-hook documentation in

				the header file.  The hook calls can also be easily found in the

				kernel code by looking for the string "security_ops->".

				</para>

				<para>

				Linus mentioned per-process security hooks in his original remarks as a

				possible alternative to global security hooks.  However, if LSM were

				to start from the perspective of per-process hooks, then the base

				framework would have to deal with how to handle operations that

				involve multiple processes (e.g. kill), since each process might have

				its own hook for controlling the operation.  This would require a

				general mechanism for composing hooks in the base framework.

				Additionally, LSM would still need global hooks for operations that

				have no process context (e.g. network input operations).

				Consequently, LSM provides global security hooks, but a security

				module is free to implement per-process hooks (where that makes sense)

				by storing a security_ops table in each process' security field and

				then invoking these per-process hooks from the global hooks.

				The problem of composition is thus deferred to the module.

				</para>
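				<para>
				The arrangement described above can be modelled in user space as
				follows.  This is only a sketch: the reduced
				<structname>security_operations</structname> structure with a single
				task_kill hook, the <structname>struct task</structname> type and the
				function names are assumptions made for illustration, and the
				composition policy (deny if either process denies) is just one
				possible choice that a module could make.
				</para>

```c
#include <stddef.h>

struct task;

/* Hypothetical, much reduced hook table with a single hook. */
struct security_operations {
	int (*task_kill)(struct task *sender, struct task *target, int sig);
};

struct task {
	int uid;
	void *security;		/* opaque field owned by the security module */
};

/* Example per-process hook: refuse every signal. */
static int deny_all_kill(struct task *sender, struct task *target, int sig)
{
	(void)sender;
	(void)target;
	(void)sig;
	return -1;
}

/*
 * Global hook installed by the module. It consults the per-process
 * table of the sender and of the target, if present, and denies the
 * operation as soon as one of them denies it.
 */
static int module_task_kill(struct task *sender, struct task *target, int sig)
{
	struct task *parties[2] = { sender, target };
	int i, rc;

	for (i = 0; i < 2; i++) {
		struct security_operations *ops = parties[i]->security;

		if (ops && ops->task_kill) {
			rc = ops->task_kill(sender, target, sig);
			if (rc)
				return rc;
		}
	}
	return 0;
}
```

				<para>
				A process without a per-process table imposes no restriction in this
				model, so the operation is allowed unless one of the two processes
				carries a table whose hook refuses it.
				</para>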

				<para>

				The global security_ops table is initialized to a set of hook

				functions provided by a dummy security module that provides

				traditional superuser logic.  A <function>register_security</function>

				function (in <filename>security/security.c</filename>) is provided to

				allow a security module to set security_ops to refer to its own hook

				functions, and an <function>unregister_security</function> function is

				provided to revert security_ops to the dummy module hooks.  This

				mechanism is used to set the primary security module, which is

				responsible for making the final decision for each hook.

				</para>
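				<para>
				The registration mechanism can be pictured with the following
				user-space model.  The names security_ops,
				<function>register_security</function> and
				<function>unregister_security</function> are taken from the text
				above, but the signatures, the return values, the single capable
				hook and the strict example module are assumptions made for this
				sketch and do not reproduce the code in
				<filename>security/security.c</filename>.
				</para>

```c
#include <stddef.h>

/* Hypothetical, much reduced hook table with a single hook. */
struct security_operations {
	int (*capable)(int uid, int cap);
};

/* Dummy module: traditional superuser logic, uid 0 may do anything. */
static int dummy_capable(int uid, int cap)
{
	(void)cap;
	return uid == 0 ? 0 : -1;
}

static struct security_operations dummy_security_ops = { dummy_capable };
static struct security_operations *security_ops = &dummy_security_ops;

/* Example primary module that refuses every request. */
static int strict_capable(int uid, int cap)
{
	(void)uid;
	(void)cap;
	return -1;
}

static struct security_operations strict_ops = { strict_capable };

/* Install ops as the primary module; fails if one is already installed. */
static int register_security(struct security_operations *ops)
{
	if (!ops || !ops->capable)
		return -1;
	if (security_ops != &dummy_security_ops)
		return -1;
	security_ops = ops;
	return 0;
}

/* Revert to the dummy hooks; only the registered module may do so. */
static int unregister_security(struct security_operations *ops)
{
	if (ops == &dummy_security_ops || ops != security_ops)
		return -1;
	security_ops = &dummy_security_ops;
	return 0;
}
```

				<para>
				In this model every hook call goes through security_ops, so
				replacing the pointer changes the decision for all later calls at
				once.
				</para>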

				<para>

				LSM also provides a simple mechanism for stacking additional security

				modules with the primary security module.  It defines

				<function>register_security</function> and

				<function>unregister_security</function> hooks in the

				<structname>security_operations</structname> structure and provides

				<function>mod_reg_security</function> and

				<function>mod_unreg_security</function> functions that invoke these

				hooks after performing some sanity checking.  A security module can

				call these functions in order to stack with other modules.  However,

				the actual details of how this stacking is handled are deferred to the

				module, which can implement these hooks in any way it wishes

				(including always returning an error if it does not wish to support

				stacking).  In this manner, LSM again defers the problem of

				composition to the module.

				</para>

				<para>

				Although the LSM hooks are organized into substructures based on

				kernel object, all of the hooks can be viewed as falling into two

				major categories: hooks that are used to manage the security fields

				and hooks that are used to perform access control.  Examples of the

				first category of hooks include the

				<function>alloc_security</function> and

				<function>free_security</function> hooks defined for each kernel data

				structure that has a security field.  These hooks are used to allocate

				and free security structures for kernel objects.  The first category

				of hooks also includes hooks that set information in the security

				field after allocation, such as the <function>post_lookup</function>

				hook in <structname>struct inode_security_ops</structname>.  This hook

				is used to set security information for inodes after successful lookup

				operations.  An example of the second category of hooks is the

				<function>permission</function> hook in 

				<structname>struct inode_security_ops</structname>.  This hook checks

				permission when accessing an inode.

				</para>
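				<para>
				The two categories of hooks can be illustrated with a small
				user-space model of an inode that carries an opaque security field.
				Everything in it is an assumption made for illustration: the reduced
				<structname>struct inode</structname>, the example_label structure,
				the example_ function names and the level comparison are not taken
				from the LSM patch.
				</para>

```c
#include <stdlib.h>

struct inode {
	unsigned long ino;
	void *security;		/* opaque field managed by the module */
};

/* Hypothetical per-inode security state kept by the module. */
struct example_label {
	int level;
};

/* Category 1: allocate the security structure for a new inode. */
static int example_inode_alloc_security(struct inode *inode)
{
	struct example_label *label = calloc(1, sizeof(*label));

	if (!label)
		return -1;
	inode->security = label;
	return 0;
}

/* Category 1: release the security structure again. */
static void example_inode_free_security(struct inode *inode)
{
	free(inode->security);
	inode->security = NULL;
}

/* Category 1: set security information after a successful lookup. */
static void example_inode_post_lookup(struct inode *inode, int level)
{
	struct example_label *label = inode->security;

	if (label)
		label->level = level;
}

/* Category 2: access control decision based on the stored label. */
static int example_inode_permission(const struct inode *inode, int subject_level)
{
	const struct example_label *label = inode->security;

	if (!label)
		return -1;	/* unlabeled inode: deny in this sketch */
	return subject_level >= label->level ? 0 : -1;
}
```

				<para>
				The management hooks only touch the security field, while the
				permission hook reads it to reach a decision and leaves the inode
				unchanged.
				</para>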

				</sect1>

				<sect1 id="cap"><title>LSM Capabilities Module</title>

				<para>

				The LSM kernel patch moves most of the existing POSIX.1e capabilities

				logic into an optional security module stored in the file

				<filename>security/capability.c</filename>.  This change allows

				users who do not want to use capabilities to omit this code entirely

				from their kernel, instead using the dummy module for traditional

				superuser logic or any other module that they desire.  This change

				also allows the developers of the capabilities logic to maintain and

				enhance their code more freely, without needing to integrate patches

				back into the base kernel.

				</para>

				<para>

				In addition to moving the capabilities logic, the LSM kernel patch

				could move the capability-related fields from the kernel data

				structures into the new security fields managed by the security

				modules.  However, at present, the LSM kernel patch leaves the

				capability fields in the kernel data structures.  In his original

				remarks, Linus suggested that this might be preferable so that other

				security modules can be easily stacked with the capabilities module

				without needing to chain multiple security structures on the security field.

				It also avoids imposing extra overhead on the capabilities module

				to manage the security fields.  However, the LSM framework could

				certainly support such a move if it is determined to be desirable,

				with only a few additional changes described below.

				</para>

				<para>

				At present, the capabilities logic for computing process capabilities

				on <function>execve</function> and <function>set*uid</function>,

				checking capabilities for a particular process, saving and checking

				capabilities for netlink messages, and handling the

				<function>capget</function> and <function>capset</function> system

				calls has been moved into the capabilities module.  There are still a

				few locations in the base kernel where capability-related fields are

				directly examined or modified, but the current version of the LSM

				patch does allow a security module to completely replace the

				assignment and testing of capabilities.  These few locations would

				need to be changed if the capability-related fields were moved into

				the security field.  The following is a list of known locations that

				still perform such direct examination or modification of

				capability-related fields:

				<itemizedlist>

				<listitem><para><filename>fs/open.c</filename>:<function>sys_access</function></para></listitem>

				<listitem><para><filename>fs/lockd/host.c</filename>:<function>nlm_bind_host</function></para></listitem>

				<listitem><para><filename>fs/nfsd/auth.c</filename>:<function>nfsd_setuser</function></para></listitem>

				<listitem><para><filename>fs/proc/array.c</filename>:<function>task_cap</function></para></listitem>

				</itemizedlist>

				</para>

				</sect1>

				</article>

Documentation/DocBook/mtdnand.tmpl (normal file, 1291 lines changed; diff suppressed because it is too large)

									
Documentation/DocBook/networking.tmpl (normal file, 111 lines changed)

				@@ -0,0 +1,111 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="LinuxNetworking">

				 <bookinfo>

				  <title>Linux Networking and Network Devices APIs</title>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later

				     version.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="netcore">

				     <title>Linux Networking</title>

				     <sect1><title>Networking Base Types</title>

				!Iinclude/linux/net.h

				     </sect1>

				     <sect1><title>Socket Buffer Functions</title>

				!Iinclude/linux/skbuff.h

				!Iinclude/net/sock.h

				!Enet/socket.c

				!Enet/core/skbuff.c

				!Enet/core/sock.c

				!Enet/core/datagram.c

				!Enet/core/stream.c

				     </sect1>

				     <sect1><title>Socket Filter</title>

				!Enet/core/filter.c

				     </sect1>

				     <sect1><title>Generic Network Statistics</title>

				!Iinclude/uapi/linux/gen_stats.h

				!Enet/core/gen_stats.c

				!Enet/core/gen_estimator.c

				     </sect1>

				     <sect1><title>SUN RPC subsystem</title>

				<!-- The !D functionality is not perfect, garbage has to be protected by comments

				!Dnet/sunrpc/sunrpc_syms.c

				-->

				!Enet/sunrpc/xdr.c

				!Enet/sunrpc/svc_xprt.c

				!Enet/sunrpc/xprt.c

				!Enet/sunrpc/sched.c

				!Enet/sunrpc/socklib.c

				!Enet/sunrpc/stats.c

				!Enet/sunrpc/rpc_pipe.c

				!Enet/sunrpc/rpcb_clnt.c

				!Enet/sunrpc/clnt.c

				     </sect1>

				     <sect1><title>WiMAX</title>

				!Enet/wimax/op-msg.c

				!Enet/wimax/op-reset.c

				!Enet/wimax/op-rfkill.c

				!Enet/wimax/stack.c

				!Iinclude/net/wimax.h

				!Iinclude/uapi/linux/wimax.h

				     </sect1>

				  </chapter>

				  <chapter id="netdev">

				     <title>Network device support</title>

				     <sect1><title>Driver Support</title>

				!Enet/core/dev.c

				!Enet/ethernet/eth.c

				!Enet/sched/sch_generic.c

				!Iinclude/linux/etherdevice.h

				!Iinclude/linux/netdevice.h

				     </sect1>

				     <sect1><title>PHY Support</title>

				!Edrivers/net/phy/phy.c

				!Idrivers/net/phy/phy.c

				!Edrivers/net/phy/phy_device.c

				!Idrivers/net/phy/phy_device.c

				!Edrivers/net/phy/mdio_bus.c

				!Idrivers/net/phy/mdio_bus.c

				     </sect1>

				<!-- FIXME: Removed for now since no structured comments in source

				     <sect1><title>Wireless</title>

				X!Enet/core/wireless.c

				     </sect1>

				-->

				  </chapter>

				</book>

									
Documentation/DocBook/rapidio.tmpl (normal file, 158 lines changed)

				@@ -0,0 +1,158 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

				        "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [

					<!ENTITY rapidio SYSTEM "rapidio.xml">

					]>

				<book id="RapidIO-Guide">

				 <bookinfo>

				  <title>RapidIO Subsystem Guide</title>

				  <authorgroup>

				   <author>

				    <firstname>Matt</firstname>

				    <surname>Porter</surname>

				    <affiliation>

				     <address>

				      <email>mporter@kernel.crashing.org</email>

				      <email>mporter@mvista.com</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2005</year>

				   <holder>MontaVista Software, Inc.</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License version 2 as published by the Free Software Foundation.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="intro">

				      <title>Introduction</title>

				  <para>

					RapidIO is a high speed switched fabric interconnect with

					features aimed at the embedded market.  RapidIO provides

					support for memory-mapped I/O as well as message-based

					transactions over the switched fabric network. RapidIO has

					a standardized discovery mechanism not unlike the PCI bus

					standard that allows simple detection of devices in a

					network.

				  </para>

				  <para>

				  	This documentation is provided for developers intending

					to support RapidIO on new architectures, write new drivers,

					or to understand the subsystem internals.

				  </para>

				  </chapter>

				  <chapter id="bugs">

				     <title>Known Bugs and Limitations</title>

				     <sect1 id="known_bugs">

				     	<title>Bugs</title>

					  <para>None. ;)</para>

				     </sect1>

				     <sect1 id="Limitations">

				     	<title>Limitations</title>

					  <para>

					    <orderedlist>

					      <listitem><para>Access/management of RapidIO memory regions is not supported</para></listitem>

					      <listitem><para>Multiple host enumeration is not supported</para></listitem>

					    </orderedlist>

					 </para>

				     </sect1>

				  </chapter>

				  <chapter id="drivers">

				     	<title>RapidIO driver interface</title>

					<para>

						Drivers are provided a set of calls in order

						to interface with the subsystem to gather info

						on devices, request/map memory region resources,

						and manage mailboxes/doorbells.

					</para>

					<sect1 id="Functions">

						<title>Functions</title>

				!Iinclude/linux/rio_drv.h

				!Edrivers/rapidio/rio-driver.c

				!Edrivers/rapidio/rio.c

					</sect1>

				  </chapter>

				  <chapter id="internals">

				     <title>Internals</title>

				     <para>

				     This chapter contains the autogenerated documentation of the RapidIO

				     subsystem.

				     </para>

				     <sect1 id="Structures"><title>Structures</title>

				!Iinclude/linux/rio.h

				     </sect1>

				     <sect1 id="Enumeration_and_Discovery"><title>Enumeration and Discovery</title>

				!Idrivers/rapidio/rio-scan.c

				     </sect1>

				     <sect1 id="Driver_functionality"><title>Driver functionality</title>

				!Idrivers/rapidio/rio.c

				!Idrivers/rapidio/rio-access.c

				     </sect1>

				     <sect1 id="Device_model_support"><title>Device model support</title>

				!Idrivers/rapidio/rio-driver.c

				     </sect1>

				     <sect1 id="Sysfs_support"><title>Sysfs support</title>

				!Idrivers/rapidio/rio-sysfs.c

				     </sect1>

				     <sect1 id="PPC32_support"><title>PPC32 support</title>

				!Iarch/powerpc/sysdev/fsl_rio.c

				     </sect1>

				  </chapter>

				  <chapter id="credits">

				     <title>Credits</title>

					<para>

						The following people have contributed to the RapidIO

						subsystem directly or indirectly:

						<orderedlist>

							<listitem><para>Matt Porter<email>mporter@kernel.crashing.org</email></para></listitem>

							<listitem><para>Randy Vinson<email>rvinson@mvista.com</email></para></listitem>

							<listitem><para>Dan Malek<email>dan@embeddedalley.com</email></para></listitem>

						</orderedlist>

					</para>

					<para>

						The following people have contributed to this document:

						<orderedlist>

							<listitem><para>Matt Porter<email>mporter@kernel.crashing.org</email></para></listitem>

						</orderedlist>

					</para>

				  </chapter>

				</book>

									
										161

Documentation/DocBook/s390-drivers.tmpl
									
												
				@@ -0,0 +1,161 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="s390drivers">

				 <bookinfo>

				  <title>Writing s390 channel device drivers</title>

				  <authorgroup>

				   <author>

				    <firstname>Cornelia</firstname>

				    <surname>Huck</surname>

				    <affiliation>

				     <address>

				       <email>cornelia.huck@de.ibm.com</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2007</year>

				   <holder>IBM Corp.</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later

				     version.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="intro">

				   <title>Introduction</title>

				  <para>

				    This document describes the interfaces available for device drivers that

				    drive s390 based channel attached I/O devices. This includes interfaces for

				    interaction with the hardware and interfaces for interacting with the

				    common driver core. Those interfaces are provided by the s390 common I/O

				    layer.

				  </para>

				  <para>

				    The document assumes a familiarity with the technical terms associated

				    with the s390 channel I/O architecture. For a description of this

				    architecture, please refer to the "z/Architecture: Principles of

				    Operation", IBM publication no. SA22-7832.

				  </para>

				  <para>

				    While most I/O devices on a s390 system are typically driven through the

				    channel I/O mechanism described here, there are various other methods

				    (like the diag interface). These are out of the scope of this document.

				  </para>

				  <para>

				    Some additional information can also be found in the kernel source

				    under Documentation/s390/driver-model.txt.

				  </para>

				  </chapter>

				  <chapter id="ccw">

				   <title>The ccw bus</title>

				  <para>

					The ccw bus typically contains the majority of devices available to

					a s390 system. Named after the channel command word (ccw), the basic

					command structure used to address its devices, the ccw bus contains

					so-called channel attached devices. They are addressed via I/O

					subchannels, visible on the css bus. A device driver for

					channel-attached devices, however, will never interact with the

					subchannel directly, but only via the I/O device on the ccw bus,

					the ccw device.

				  </para>

				    <sect1 id="channelIO">

				     <title>I/O functions for channel-attached devices</title>

				    <para>

				      Some hardware structures have been translated into C structures for use

				      by the common I/O layer and device drivers. For more information on

				      the hardware structures represented here, please consult the Principles

				      of Operation.

				    </para>

				!Iarch/s390/include/asm/cio.h

				    </sect1>

				    <sect1 id="ccwdev">

				     <title>ccw devices</title>

				    <para>

				      Devices that want to initiate channel I/O need to attach to the ccw bus.

				      Interaction with the driver core is done via the common I/O layer, which

				      provides the abstractions of ccw devices and ccw device drivers.

				    </para>

				    <para>

				      The functions that initiate or terminate channel I/O all act upon a

				      ccw device structure. Device drivers must not bypass those functions

				      or strange side effects may happen.

				    </para>

				!Iarch/s390/include/asm/ccwdev.h

				!Edrivers/s390/cio/device.c

				!Edrivers/s390/cio/device_ops.c

				    </sect1>

				    <sect1 id="cmf">

				     <title>The channel-measurement facility</title>

				  <para>

					The channel-measurement facility provides a means to collect

					measurement data which is made available by the channel subsystem

					for each channel attached device.

				  </para>

				!Iarch/s390/include/asm/cmb.h

				!Edrivers/s390/cio/cmf.c

				    </sect1>

				  </chapter>

				  <chapter id="ccwgroup">

				   <title>The ccwgroup bus</title>

				  <para>

					The ccwgroup bus only contains artificial devices, created by the user.

					Many networking devices (e.g. qeth) are in fact composed of several

					ccw devices (like read, write and data channel for qeth). The

					ccwgroup bus provides a mechanism to create a meta-device which

					contains those ccw devices as slave devices and can be associated

					with the netdevice.

				  </para>

				   <sect1 id="ccwgroupdevices">

				    <title>ccw group devices</title>

				!Iarch/s390/include/asm/ccwgroup.h

				!Edrivers/s390/cio/ccwgroup.c

				   </sect1>

				  </chapter>

				  <chapter id="genericinterfaces">

				   <title>Generic interfaces</title>

				  <para>

					Some interfaces are available to other drivers that do not necessarily

					have anything to do with the busses described above, but still are

					indirectly using basic infrastructure in the common I/O layer.

					One example is the support for adapter interrupts.

				  </para>

				!Edrivers/s390/cio/airq.c

				  </chapter>

				</book>

									
										409

Documentation/DocBook/scsi.tmpl
									
												
				@@ -0,0 +1,409 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="scsimid">

				  <bookinfo>

				    <title>SCSI Interfaces Guide</title>

				    <authorgroup>

				      <author>

				        <firstname>James</firstname>

				        <surname>Bottomley</surname>

				        <affiliation>

				          <address>

				            <email>James.Bottomley@hansenpartnership.com</email>

				          </address>

				        </affiliation>

				      </author>

				      <author>

				        <firstname>Rob</firstname>

				        <surname>Landley</surname>

				        <affiliation>

				          <address>

				            <email>rob@landley.net</email>

				          </address>

				        </affiliation>

				      </author>

				    </authorgroup>

				    <copyright>

				      <year>2007</year>

				      <holder>Linux Foundation</holder>

				    </copyright>

				    <legalnotice>

				      <para>

				        This documentation is free software; you can redistribute

				        it and/or modify it under the terms of the GNU General Public

				        License version 2.

				      </para>

				      <para>

				        This program is distributed in the hope that it will be

				        useful, but WITHOUT ANY WARRANTY; without even the implied

				        warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				        For more details see the file COPYING in the source

				        distribution of Linux.

				      </para>

				    </legalnotice>

				  </bookinfo>

				  <toc></toc>

				  <chapter id="intro">

				    <title>Introduction</title>

				    <sect1 id="protocol_vs_bus">

				      <title>Protocol vs bus</title>

				      <para>

				        Once upon a time, the Small Computer Systems Interface defined both

				        a parallel I/O bus and a data protocol to connect a wide variety of

				        peripherals (disk drives, tape drives, modems, printers, scanners,

				        optical drives, test equipment, and medical devices) to a host

				        computer.

				      </para>

				      <para>

				        Although the old parallel (fast/wide/ultra) SCSI bus has largely

				        fallen out of use, the SCSI command set is more widely used than ever

				        to communicate with devices over a number of different busses.

				      </para>

				      <para>

				        The <ulink url='http://www.t10.org/scsi-3.htm'>SCSI protocol</ulink>

				        is a big-endian, peer-to-peer, packet-based protocol.  SCSI commands

				        are 6, 10, 12, or 16 bytes long, often followed by an associated data

				        payload.

				      </para>
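
				      <para>
				        As a rough illustration of the big-endian layout, the sketch
				        below fills a 10-byte READ(10) command block in plain C.  The
				        helper names (put_be32, put_be16, build_read10) are invented for
				        this example and are not kernel interfaces; in-kernel code has its
				        own helpers for this, such as put_unaligned_be32().
				      </para>
				      <programlisting><![CDATA[
				#include <assert.h>
				#include <stddef.h>
				#include <stdint.h>
				#include <string.h>

				#define READ_10_OPCODE	0x28	/* READ(10) operation code */
				#define READ_10_CDB_LEN	10	/* one of the 6/10/12/16 byte sizes */

				/* Store a 32-bit value most significant byte first (big-endian). */
				static void put_be32(uint8_t *p, uint32_t v)
				{
					p[0] = (uint8_t)(v >> 24);
					p[1] = (uint8_t)(v >> 16);
					p[2] = (uint8_t)(v >> 8);
					p[3] = (uint8_t)v;
				}

				/* Store a 16-bit value most significant byte first (big-endian). */
				static void put_be16(uint8_t *p, uint16_t v)
				{
					p[0] = (uint8_t)(v >> 8);
					p[1] = (uint8_t)v;
				}

				/* Fill a READ(10) command block (hypothetical helper, illustration only). */
				static void build_read10(uint8_t cdb[READ_10_CDB_LEN],
							 uint32_t lba, uint16_t nblocks)
				{
					assert(cdb != NULL);
					memset(cdb, 0, READ_10_CDB_LEN);
					cdb[0] = READ_10_OPCODE;	/* byte 0: operation code */
					put_be32(&cdb[2], lba);		/* bytes 2-5: logical block address */
					put_be16(&cdb[7], nblocks);	/* bytes 7-8: transfer length */
				}
				]]></programlisting>
				      <para>
				        Calling build_read10(cdb, 0x01020304, 8) leaves bytes 2 to 5
				        holding 01 02 03 04 and bytes 7 to 8 holding 00 08; the data
				        payload itself travels separately from the command block.
				      </para>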

				      <para>

				        SCSI commands can be transported over just about any kind of bus, and

				        are the default protocol for storage devices attached via USB, SATA,

				        SAS, Fibre Channel, FireWire, and ATAPI.  SCSI packets are

				        also commonly exchanged over Infiniband,

				        <ulink url='http://i2o.shadowconnect.com/faq.php'>I2O</ulink>, TCP/IP

				        (<ulink url='https://en.wikipedia.org/wiki/ISCSI'>iSCSI</ulink>), even

				        <ulink url='http://cyberelk.net/tim/parport/parscsi.html'>Parallel

				        ports</ulink>.

				      </para>

				    </sect1>

				    <sect1 id="subsystem_design">

				      <title>Design of the Linux SCSI subsystem</title>

				      <para>

				        The SCSI subsystem uses a three-layer design, with upper, mid, and low

				        layers.  Every operation involving the SCSI subsystem (such as reading

				        a sector from a disk) uses one driver at each of the three levels: one

				        upper layer driver, one lower layer driver, and the SCSI midlayer.

				      </para>

				      <para>

				        The SCSI upper layer provides the interface between userspace and the

				        kernel, in the form of block and char device nodes for I/O and

				        ioctl().  The SCSI lower layer contains drivers for specific hardware

				        devices.

				      </para>

				      <para>

				        In between is the SCSI mid-layer, analogous to a network routing

				        layer such as the IPv4 stack.  The SCSI mid-layer routes a packet

				        based data protocol between the upper layer's /dev nodes and the

				        corresponding devices in the lower layer.  It manages command queues,

				        provides error handling and power management functions, and responds

				        to ioctl() requests.

				      </para>

				    </sect1>

				  </chapter>

				  <chapter id="upper_layer">

				    <title>SCSI upper layer</title>

				    <para>

				      The upper layer supports the user-kernel interface by providing

				      device nodes.

				    </para>

				    <sect1 id="sd">

				      <title>sd (SCSI Disk)</title>

				      <para>sd (sd_mod.o)</para>

				<!-- !Idrivers/scsi/sd.c -->

				    </sect1>

				    <sect1 id="sr">

				      <title>sr (SCSI CD-ROM)</title>

				      <para>sr (sr_mod.o)</para>

				    </sect1>

				    <sect1 id="st">

				      <title>st (SCSI Tape)</title>

				      <para>st (st.o)</para>

				    </sect1>

				    <sect1 id="sg">

				      <title>sg (SCSI Generic)</title>

				      <para>sg (sg.o)</para>

				    </sect1>

				    <sect1 id="ch">

				      <title>ch (SCSI Media Changer)</title>

				      <para>ch (ch.c)</para>

				    </sect1>

				  </chapter>

				  <chapter id="mid_layer">

				    <title>SCSI mid layer</title>

				    <sect1 id="midlayer_implementation">

				      <title>SCSI midlayer implementation</title>

				      <sect2 id="scsi_device.h">

				        <title>include/scsi/scsi_device.h</title>

				        <para>

				        </para>

				!Iinclude/scsi/scsi_device.h

				      </sect2>

				      <sect2 id="scsi.c">

				        <title>drivers/scsi/scsi.c</title>

				        <para>Main file for the SCSI midlayer.</para>

				!Edrivers/scsi/scsi.c

				      </sect2>

				      <sect2 id="scsicam.c">

				        <title>drivers/scsi/scsicam.c</title>

				        <para>

				          <ulink url='http://www.t10.org/ftp/t10/drafts/cam/cam-r12b.pdf'>SCSI

				          Common Access Method</ulink> support functions, for use with

				          HDIO_GETGEO, etc.

				        </para>

				!Edrivers/scsi/scsicam.c

				      </sect2>

				      <sect2 id="scsi_error.c">

				        <title>drivers/scsi/scsi_error.c</title>

				        <para>Common SCSI error/timeout handling routines.</para>

				!Edrivers/scsi/scsi_error.c

				      </sect2>

				      <sect2 id="scsi_devinfo.c">

				        <title>drivers/scsi/scsi_devinfo.c</title>

				        <para>

				          Manage scsi_dev_info_list, which tracks blacklisted and whitelisted

				          devices.

				        </para>

				!Idrivers/scsi/scsi_devinfo.c

				      </sect2>

				      <sect2 id="scsi_ioctl.c">

				        <title>drivers/scsi/scsi_ioctl.c</title>

				        <para>

				          Handle ioctl() calls for SCSI devices.

				        </para>

				!Edrivers/scsi/scsi_ioctl.c

				      </sect2>

				      <sect2 id="scsi_lib.c">

				        <title>drivers/scsi/scsi_lib.c</title>

				        <para>

				          SCSI queuing library.

				        </para>

				!Edrivers/scsi/scsi_lib.c

				      </sect2>

				      <sect2 id="scsi_lib_dma.c">

				        <title>drivers/scsi/scsi_lib_dma.c</title>

				        <para>

				          SCSI library functions depending on DMA

				          (map and unmap scatter-gather lists).

				        </para>

				!Edrivers/scsi/scsi_lib_dma.c

				      </sect2>

				      <sect2 id="scsi_module.c">

				        <title>drivers/scsi/scsi_module.c</title>

				        <para>

				          The file drivers/scsi/scsi_module.c contains legacy support for

				          old-style host templates.  It should never be used by any new driver.

				        </para>

				      </sect2>

				      <sect2 id="scsi_proc.c">

				        <title>drivers/scsi/scsi_proc.c</title>

				        <para>

				          The functions in this file provide an interface between

				          the proc file system and the SCSI device drivers.

				          It is mainly used for debugging, statistics, and to pass

				          information directly to the low-level driver,

				          i.e. plumbing to manage /proc/scsi/*.

				        </para>

				!Idrivers/scsi/scsi_proc.c

				      </sect2>

				      <sect2 id="scsi_netlink.c">

				        <title>drivers/scsi/scsi_netlink.c</title>

				        <para>

				          Infrastructure to provide async events from transports to userspace

				          via netlink, using a single NETLINK_SCSITRANSPORT protocol for all

				          transports.

				          See <ulink url='http://marc.info/?l=linux-scsi&amp;m=115507374832500&amp;w=2'>the

				          original patch submission</ulink> for more details.

				        </para>

				!Idrivers/scsi/scsi_netlink.c

				      </sect2>

				      <sect2 id="scsi_scan.c">

				        <title>drivers/scsi/scsi_scan.c</title>

				        <para>

				          Scan a host to determine which (if any) devices are attached.

				          The general scanning/probing algorithm is as follows; exceptions are

				          made to it depending on device-specific flags, compilation options,

				          and global variable (boot or module load time) settings.

				          A specific LUN is scanned via an INQUIRY command; if the LUN has a

				          device attached, a scsi_device is allocated and set up for it.

				          For every id of every channel on the given host, start by scanning

				          LUN 0.  Skip hosts that don't respond at all to a scan of LUN 0.

				          Otherwise, if LUN 0 has a device attached, allocate and set up a

				          scsi_device for it.  If the target is SCSI-3 or later, issue a

				          REPORT LUNS command and scan all of the LUNs it returns; otherwise,

				          sequentially scan LUNs until some maximum is reached or a LUN is

				          seen that cannot have a device attached to it.

				        </para>
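
				        <para>
				          The scanning order described above can be sketched as a small
				          user-space simulation.  Everything in it (struct sim_target,
				          SIM_MAX_LUNS, scan_target) is hypothetical and only models the
				          control flow for a single target; the real code in scsi_scan.c
				          issues SCSI commands and honours the flags and settings
				          mentioned above.
				        </para>
				        <programlisting><![CDATA[
				#include <assert.h>
				#include <stdbool.h>
				#include <stddef.h>

				#define SIM_MAX_LUNS 8	/* hypothetical upper bound for this sketch */

				/* Simulated target; stands in for the answers to real SCSI commands. */
				struct sim_target {
					bool responds;			/* answers the LUN 0 scan at all */
					int scsi_level;			/* 2 = SCSI-2, 3 = SCSI-3, ... */
					bool lun_present[SIM_MAX_LUNS];	/* device attached at this LUN */
				};

				/* Record the LUNs that would get a device; returns how many were found. */
				static size_t scan_target(const struct sim_target *t, int *found, size_t max)
				{
					size_t n = 0;
					int lun;

					assert(t != NULL && found != NULL);

					if (!t->responds)
						return 0;	/* no response to the LUN 0 scan: skip */
					if (t->lun_present[0] && n < max)
						found[n++] = 0;

					if (t->scsi_level >= 3) {
						/* Stand-in for the reported LUN list: visit every listed LUN. */
						for (lun = 1; lun < SIM_MAX_LUNS; lun++)
							if (t->lun_present[lun] && n < max)
								found[n++] = lun;
					} else {
						/* Sequential scan: stop at the first LUN without a device. */
						for (lun = 1; lun < SIM_MAX_LUNS && t->lun_present[lun]; lun++)
							if (n < max)
								found[n++] = lun;
					}
					return n;
				}
				]]></programlisting>
				        <para>
				          With devices at LUNs 0, 1 and 3, the sequential path in this
				          sketch stops at the gap and finds only LUNs 0 and 1, while the
				          reported-list path finds all three.
				        </para>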

				!Idrivers/scsi/scsi_scan.c

				      </sect2>

				      <sect2 id="scsi_sysctl.c">

				        <title>drivers/scsi/scsi_sysctl.c</title>

				        <para>

				          Set up the sysctl entry: "/dev/scsi/logging_level"

				          (DEV_SCSI_LOGGING_LEVEL) which sets/returns scsi_logging_level.

				        </para>

				      </sect2>

				      <sect2 id="scsi_sysfs.c">

				        <title>drivers/scsi/scsi_sysfs.c</title>

				        <para>

				          SCSI sysfs interface routines.

				        </para>

				!Edrivers/scsi/scsi_sysfs.c

				      </sect2>

				      <sect2 id="hosts.c">

				        <title>drivers/scsi/hosts.c</title>

				        <para>

				          mid to lowlevel SCSI driver interface

				        </para>

				!Edrivers/scsi/hosts.c

				      </sect2>

				      <sect2 id="constants.c">

				        <title>drivers/scsi/constants.c</title>

				        <para>

				          ASCII descriptions of symbolic SCSI constants, plus printing functions for them

				        </para>

				!Edrivers/scsi/constants.c

				      </sect2>

				    </sect1>

				    <sect1 id="Transport_classes">

				      <title>Transport classes</title>

				      <para>

				        Transport classes are service libraries for drivers in the SCSI

				        lower layer, which expose transport attributes in sysfs.

				      </para>

				      <sect2 id="Fibre_Channel_transport">

				        <title>Fibre Channel transport</title>

				        <para>

				          The file drivers/scsi/scsi_transport_fc.c defines transport attributes

				          for Fibre Channel.

				        </para>

				!Edrivers/scsi/scsi_transport_fc.c

				      </sect2>

				      <sect2 id="iSCSI_transport">

				        <title>iSCSI transport class</title>

				        <para>

				          The file drivers/scsi/scsi_transport_iscsi.c defines transport

				          attributes for the iSCSI class, which sends SCSI packets over TCP/IP

				          connections.

				        </para>

				!Edrivers/scsi/scsi_transport_iscsi.c

				      </sect2>

				      <sect2 id="SAS_transport">

				        <title>Serial Attached SCSI (SAS) transport class</title>

				        <para>

				          The file drivers/scsi/scsi_transport_sas.c defines transport

				          attributes for Serial Attached SCSI, a variant of SATA aimed at

				          large high-end systems.

				        </para>

				        <para>

				          The SAS transport class contains common code to deal with SAS HBAs,

				          an approximate representation of SAS topologies in the driver model,

				          and various sysfs attributes to expose these topologies and management

				          interfaces to userspace.

				        </para>

				        <para>

				          In addition to the basic SCSI core objects this transport class

				          introduces two additional intermediate objects:  The SAS PHY

				          as represented by struct sas_phy defines an "outgoing" PHY on

				          a SAS HBA or Expander, and the SAS remote PHY represented by

				          struct sas_rphy defines an "incoming" PHY on a SAS Expander or

				          end device.  Note that this is purely a software concept; the

				          underlying hardware for a PHY and a remote PHY is exactly

				          the same.

				        </para>

				        <para>

				          There is no concept of a SAS port in this code; users can see

				          which PHYs form a wide port based on the port_identifier attribute,

				          which is the same for all PHYs in a port.

				        </para>

				!Edrivers/scsi/scsi_transport_sas.c

				      </sect2>

				      <sect2 id="SATA_transport">

				        <title>SATA transport class</title>

				        <para>

				          The SATA transport is handled by libata, which has its own book of

				          documentation in this directory.

				        </para>

				      </sect2>

				      <sect2 id="SPI_transport">

				        <title>Parallel SCSI (SPI) transport class</title>

				        <para>

				          The file drivers/scsi/scsi_transport_spi.c defines transport

				          attributes for traditional (fast/wide/ultra) SCSI busses.

				        </para>

				!Edrivers/scsi/scsi_transport_spi.c

				      </sect2>

				      <sect2 id="SRP_transport">

				        <title>SCSI RDMA (SRP) transport class</title>

				        <para>

				          The file drivers/scsi/scsi_transport_srp.c defines transport

				          attributes for SCSI over Remote Direct Memory Access.

				        </para>

				!Edrivers/scsi/scsi_transport_srp.c

				      </sect2>

				    </sect1>

				  </chapter>

				  <chapter id="lower_layer">

				    <title>SCSI lower layer</title>

				    <sect1 id="hba_drivers">

				      <title>Host Bus Adapter transport types</title>

				      <para>

				        Many modern device controllers use the SCSI command set as a protocol to

				        communicate with their devices through many different types of physical

				        connections.

				      </para>

				      <para>

				        In SCSI language a bus capable of carrying SCSI commands is

				        called a "transport", and a controller connecting to such a bus is

				        called a "host bus adapter" (HBA).

				      </para>

				      <sect2 id="scsi_debug.c">

				        <title>Debug transport</title>

				        <para>

				          The file drivers/scsi/scsi_debug.c simulates a host adapter with a

				          variable number of disks (or disk like devices) attached, sharing a

				          common amount of RAM.  It does a lot of checking to make sure that we are

				          not getting blocks mixed up, and panics the kernel if anything out of

				          the ordinary is seen.

				        </para>

				        <para>

				          To be more realistic, the simulated devices have the transport

				          attributes of SAS disks.

				        </para>

				        <para>

				          For documentation see

				          <ulink url='http://sg.danny.cz/sg/sdebug26.html'>http://sg.danny.cz/sg/sdebug26.html</ulink>

				        </para>

				<!-- !Edrivers/scsi/scsi_debug.c -->

				      </sect2>

				      <sect2 id="todo">

				        <title>todo</title>

				        <para>Parallel (fast/wide/ultra) SCSI, USB, SATA,

				        SAS, Fibre Channel, FireWire, ATAPI devices, Infiniband,

				        I2O, iSCSI, Parallel ports, netlink...

				        </para>

				      </sect2>

				    </sect1>

				  </chapter>

				</book>

									
										105

Documentation/DocBook/sh.tmpl
									
												
				@@ -0,0 +1,105 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="sh-drivers">

				 <bookinfo>

				  <title>SuperH Interfaces Guide</title>

				  <authorgroup>

				   <author>

				    <firstname>Paul</firstname>

				    <surname>Mundt</surname>

				    <affiliation>

				     <address>

				      <email>lethal@linux-sh.org</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2008-2010</year>

				   <holder>Paul Mundt</holder>

				  </copyright>

				  <copyright>

				   <year>2008-2010</year>

				   <holder>Renesas Technology Corp.</holder>

				  </copyright>

				  <copyright>

				   <year>2010</year>

				   <holder>Renesas Electronics Corp.</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License version 2 as published by the Free Software Foundation.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="mm">

				    <title>Memory Management</title>

				    <sect1 id="sh4">

				    <title>SH-4</title>

				      <sect2 id="sq">

				        <title>Store Queue API</title>

				!Earch/sh/kernel/cpu/sh4/sq.c

				      </sect2>

				    </sect1>

				    <sect1 id="sh5">

				      <title>SH-5</title>

				      <sect2 id="tlb">

					<title>TLB Interfaces</title>

				!Iarch/sh/mm/tlb-sh5.c

				!Iarch/sh/include/asm/tlb_64.h

				      </sect2>

				    </sect1>

				  </chapter>

				  <chapter id="mach">

				    <title>Machine Specific Interfaces</title>

				    <sect1 id="dreamcast">

				      <title>mach-dreamcast</title>

				!Iarch/sh/boards/mach-dreamcast/rtc.c

				    </sect1>

				    <sect1 id="x3proto">

				      <title>mach-x3proto</title>

				!Earch/sh/boards/mach-x3proto/ilsel.c

				    </sect1>

				  </chapter>

				  <chapter id="busses">

				    <title>Busses</title>

				    <sect1 id="superhyway">

				      <title>SuperHyway</title>

				!Edrivers/sh/superhyway/superhyway.c

				    </sect1>

				    <sect1 id="maple">

				      <title>Maple</title>

				!Edrivers/sh/maple/maple.c

				    </sect1>

				  </chapter>

				</book>

									
										11

Documentation/DocBook/stylesheet.xsl
									
												
				@@ -0,0 +1,11 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<stylesheet xmlns="http://www.w3.org/1999/XSL/Transform" version="1.0">

				<param name="chunk.quietly">1</param>

				<param name="funcsynopsis.style">ansi</param>

				<param name="funcsynopsis.tabular.threshold">80</param>

				<param name="callout.graphics">0</param>

				<!-- <param name="paper.type">A4</param> -->

				<param name="generate.consistent.ids">1</param>

				<param name="generate.section.toc.level">2</param>

				<param name="use.id.as.filename">1</param>

				</stylesheet>

									
										101

Documentation/DocBook/w1.tmpl
									
												
				@@ -0,0 +1,101 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="w1id">

				  <bookinfo>

				    <title>W1: Dallas' 1-wire bus</title>

				    <authorgroup>

				      <author>

				        <firstname>David</firstname>

				        <surname>Fries</surname>

				        <affiliation>

				          <address>

				            <email>David@Fries.net</email>

				          </address>

				        </affiliation>

				      </author>

				    </authorgroup>

				    <copyright>

				      <year>2013</year>

				      <!--

				      <holder></holder>

				      -->

				    </copyright>

				    <legalnotice>

				      <para>

				        This documentation is free software; you can redistribute

				        it and/or modify it under the terms of the GNU General Public

				        License version 2.

				      </para>

				      <para>

				        This program is distributed in the hope that it will be

				        useful, but WITHOUT ANY WARRANTY; without even the implied

				        warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				        For more details see the file COPYING in the source

				        distribution of Linux.

				      </para>

				    </legalnotice>

				  </bookinfo>

				  <toc></toc>

				  <chapter id="w1_internal">

				    <title>W1 API internal to the kernel</title>

				    <sect1 id="w1_internal_api">

				      <title>W1 API internal to the kernel</title>

				      <sect2 id="w1.h">

				        <title>drivers/w1/w1.h</title>

				        <para>W1 core functions.</para>

				!Idrivers/w1/w1.h

				      </sect2>

				      <sect2 id="w1.c">

				        <title>drivers/w1/w1.c</title>

				        <para>W1 core functions.</para>

				!Idrivers/w1/w1.c

				      </sect2>

				      <sect2 id="w1_family.h">

				        <title>drivers/w1/w1_family.h</title>

				        <para>Allows registering device family operations.</para>

				!Idrivers/w1/w1_family.h

				      </sect2>

				      <sect2 id="w1_family.c">

				        <title>drivers/w1/w1_family.c</title>

				        <para>Allows registering device family operations.</para>

				!Edrivers/w1/w1_family.c

				      </sect2>

				      <sect2 id="w1_int.c">

				        <title>drivers/w1/w1_int.c</title>

				        <para>W1 internal initialization for master devices.</para>

				!Edrivers/w1/w1_int.c

				      </sect2>

				      <sect2 id="w1_netlink.h">

				        <title>drivers/w1/w1_netlink.h</title>

				        <para>W1 external netlink API structures and commands.</para>

				!Idrivers/w1/w1_netlink.h

				      </sect2>

				      <sect2 id="w1_io.c">

				        <title>drivers/w1/w1_io.c</title>

				        <para>W1 input/output.</para>

				!Edrivers/w1/w1_io.c

				!Idrivers/w1/w1_io.c

				      </sect2>

				    </sect1>

				  </chapter>

				</book>

									
										873

Documentation/DocBook/writing_musb_glue_layer.tmpl
									
												
				@@ -0,0 +1,873 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="Writing-MUSB-Glue-Layer">

				 <bookinfo>

				  <title>Writing an MUSB Glue Layer</title>

				  <authorgroup>

				   <author>

				    <firstname>Apelete</firstname>

				    <surname>Seketeli</surname>

				    <affiliation>

				     <address>

				      <email>apelete at seketeli.net</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2014</year>

				   <holder>Apelete Seketeli</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute it

				     and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later version.

				   </para>

				   <para>

				     This documentation is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public License

				     along with this documentation; if not, write to the Free Software

				     Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA

				     02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the Linux kernel source

				     tree.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="introduction">

				    <title>Introduction</title>

				    <para>

				      The Linux MUSB subsystem is part of the larger Linux USB

				      subsystem. It provides support for embedded USB Device Controllers

				      (UDC) that do not use Universal Host Controller Interface (UHCI)

				      or Open Host Controller Interface (OHCI).

				    </para>

				    <para>

				      Instead, these embedded UDCs rely on the USB On-the-Go (OTG)

				      specification which they implement at least partially. The silicon

				      reference design used in most cases is the Multipoint USB

				      Highspeed Dual-Role Controller (MUSB HDRC) found in the Mentor

				      Graphics Inventra™ design.

				    </para>

				    <para>

				      As a self-taught exercise I have written an MUSB glue layer for

				      the Ingenic JZ4740 SoC, modelled after the many MUSB glue layers

				      in the kernel source tree. This layer can be found at

				      drivers/usb/musb/jz4740.c. In this documentation I will walk

				      through the basics of the jz4740.c glue layer, explaining the

				      different pieces and what needs to be done in order to write your

				      own device glue layer.

				    </para>

				  </chapter>

				  <chapter id="linux-musb-basics">

				    <title>Linux MUSB Basics</title>

				    <para>

				      To get started on the topic, please read USB On-the-Go Basics (see

				      Resources), which provides an introduction to USB OTG operation at

				      the hardware level. A couple of wiki pages by Texas Instruments

				      and Analog Devices also provide an overview of the Linux kernel

				      MUSB configuration, albeit focused on some specific devices

				      provided by these companies. Finally, getting acquainted with the

				      USB specification at the USB home page may come in handy, with

				      practical examples provided by the Writing USB Device Drivers

				      documentation (again, see Resources).

				    </para>

				    <para>

				      The Linux USB stack is a layered architecture in which the MUSB

				      controller hardware sits at the lowest layer. The MUSB controller

				      driver abstracts the MUSB controller hardware for the Linux USB stack.

				    </para>

				    <programlisting>

				      ------------------------

				      |                      | &lt;------- drivers/usb/gadget

				      | Linux USB Core Stack | &lt;------- drivers/usb/host

				      |                      | &lt;------- drivers/usb/core

				      ------------------------

				                 ⬍

				     --------------------------

				     |                        | &lt;------ drivers/usb/musb/musb_gadget.c

				     | MUSB Controller driver | &lt;------ drivers/usb/musb/musb_host.c

				     |                        | &lt;------ drivers/usb/musb/musb_core.c

				     --------------------------

				                 ⬍

				  ---------------------------------

				  | MUSB Platform Specific Driver |

				  |                               | &lt;-- drivers/usb/musb/jz4740.c

				  |       aka &quot;Glue Layer&quot;        |

				  ---------------------------------

				                 ⬍

				  ---------------------------------

				  |   MUSB Controller Hardware    |

				  ---------------------------------

				    </programlisting>

				    <para>

				      As outlined above, the glue layer is actually the platform

				      specific code sitting in between the controller driver and the

				      controller hardware.

				    </para>

				    <para>

				      Just like a Linux USB driver needs to register itself with the

				      Linux USB subsystem, the MUSB glue layer needs first to register

				      itself with the MUSB controller driver. This will allow the

				      controller driver to know which device the glue layer

				      supports and which functions to call when a supported device is

				      detected or released; remember we are talking about an embedded

				      controller chip here, so no insertion or removal at run-time.

				    </para>

				    <para>

				      All of this information is passed to the MUSB controller driver

				      through a platform_driver structure defined in the glue layer as:

				    </para>

				    <programlisting linenumbering="numbered">

				static struct platform_driver jz4740_driver = {

					.probe		= jz4740_probe,

					.remove		= jz4740_remove,

					.driver		= {

						.name	= "musb-jz4740",

					},

				};

				    </programlisting>

				    <para>

				      The probe and remove function pointers are called when a matching

				      device is detected and, respectively, released. The name string

				      describes the device supported by this glue layer. In the current

				      case it matches a platform_device structure declared in

				      arch/mips/jz4740/platform.c. Note that we are not using device

				      tree bindings here.

				    </para>
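				    <para>
				      The name-based match described above can be pictured with a
				      small user-space model. This is only a sketch: struct
				      fake_device, struct fake_driver, fake_match() and fake_bind()
				      are hypothetical stand-ins and not kernel APIs, and the real
				      platform bus has its own matching code.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins for platform_device / platform_driver. */
struct fake_device {
	const char *name;
};

struct fake_driver {
	const char *name;
	int (*probe)(struct fake_device *dev);
};

/* Return 1 when the driver name equals the device name, 0 otherwise. */
static int fake_match(const struct fake_device *dev,
		      const struct fake_driver *drv)
{
	return strcmp(dev->name, drv->name) == 0;
}

/* Call probe only on a name match; -1 means "no driver bound". */
static int fake_bind(struct fake_device *dev, const struct fake_driver *drv)
{
	if (!fake_match(dev, drv) || !drv->probe)
		return -1;
	return drv->probe(dev);
}

/* Trivial probe that always succeeds. */
static int demo_probe(struct fake_device *dev)
{
	(void)dev;
	return 0;
}
```
				]]></programlisting>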

				    <para>

				      In order to register itself to the controller driver, the glue

				      layer goes through a few steps, basically allocating the

				      controller hardware resources and initialising a couple of

				      circuits. To do so, it needs to keep track of the information used

				      throughout these steps. This is done by defining a private

				      jz4740_glue structure:

				    </para>

				    <programlisting linenumbering="numbered">

				struct jz4740_glue {

					struct device           *dev;

					struct platform_device  *musb;

					struct clk		*clk;

				};

				    </programlisting>

				    <para>

				      The dev and musb members are both device structure variables. The

				      first one holds generic information about the device, since it's

				      the basic device structure, and the latter holds information more

				      closely related to the subsystem the device is registered to. The

				      clk variable keeps information related to the device clock

				      operation.

				    </para>

				    <para>

				      Let's go through the steps of the probe function that lead the

				      glue layer to register itself to the controller driver.

				    </para>

				    <para>

				      N.B.: For the sake of readability each function will be split in

				      logical parts, each part being shown as if it were independent from

				      the others.

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_probe(struct platform_device *pdev)

				{

					struct platform_device		*musb;

					struct jz4740_glue		*glue;

					struct clk                      *clk;

					int				ret;

					glue = devm_kzalloc(&amp;pdev->dev, sizeof(*glue), GFP_KERNEL);

					if (!glue)

						return -ENOMEM;

					musb = platform_device_alloc("musb-hdrc", PLATFORM_DEVID_AUTO);

					if (!musb) {

						dev_err(&amp;pdev->dev, "failed to allocate musb device\n");

						return -ENOMEM;

					}

					clk = devm_clk_get(&amp;pdev->dev, "udc");

					if (IS_ERR(clk)) {

						dev_err(&amp;pdev->dev, "failed to get clock\n");

						ret = PTR_ERR(clk);

						goto err_platform_device_put;

					}

					ret = clk_prepare_enable(clk);

					if (ret) {

						dev_err(&amp;pdev->dev, "failed to enable clock\n");

						goto err_platform_device_put;

					}

					musb->dev.parent		= &amp;pdev->dev;

					glue->dev			= &amp;pdev->dev;

					glue->musb			= musb;

					glue->clk			= clk;

					return 0;

				err_platform_device_put:

					platform_device_put(musb);

					return ret;

				}

				    </programlisting>

				    <para>

				      The first few lines of the probe function allocate and assign the

				      glue, musb and clk variables. The GFP_KERNEL flag (line 8) allows

				      the allocation process to sleep and wait for memory, thus being

				      usable in a blocking situation. The PLATFORM_DEVID_AUTO flag (line

				      12) allows automatic allocation and management of device IDs in

				      order to avoid device namespace collisions with explicit IDs. With

				      devm_clk_get() (line 18) the glue layer allocates the clock -- the

				      <literal>devm_</literal> prefix indicates that clk_get() is

				      managed: it automatically frees the allocated clock resource data

				      when the device is released -- and enables it.

				    </para>
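				    <para>
				      To picture what a managed (devm_) allocation does, here is a
				      minimal user-space sketch. It is not how the kernel implements
				      device resources; struct demo_device, demo_managed_zalloc() and
				      demo_device_release() are hypothetical names that only model
				      the idea of tying the lifetime of an allocation to a device.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stdlib.h>

/* Hypothetical model of a device that owns a list of managed allocations. */
struct res_node {
	struct res_node *next;
	void *data;
};

struct demo_device {
	struct res_node *resources;
	int released;	/* number of resources freed on release */
};

/* Allocate zeroed memory and tie its lifetime to the device. */
static void *demo_managed_zalloc(struct demo_device *dev, size_t size)
{
	struct res_node *node = malloc(sizeof(*node));

	if (!node)
		return NULL;
	node->data = calloc(1, size);
	if (!node->data) {
		free(node);
		return NULL;
	}
	node->next = dev->resources;
	dev->resources = node;
	return node->data;
}

/* Free everything the device owns, most recent allocation first. */
static void demo_device_release(struct demo_device *dev)
{
	while (dev->resources) {
		struct res_node *node = dev->resources;

		dev->resources = node->next;
		free(node->data);
		free(node);
		dev->released++;
	}
}
```
				]]></programlisting>
				    <para>
				      The caller never frees the returned pointers itself, which is
				      why the probe function shown above has no explicit cleanup for
				      the glue structure or the clock handle.
				    </para>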

				    <para>

				      Then come the registration steps:

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_probe(struct platform_device *pdev)

				{

					struct musb_hdrc_platform_data	*pdata = &amp;jz4740_musb_platform_data;

					pdata->platform_ops		= &amp;jz4740_musb_ops;

					platform_set_drvdata(pdev, glue);

					ret = platform_device_add_resources(musb, pdev->resource,

									    pdev->num_resources);

					if (ret) {

						dev_err(&amp;pdev->dev, "failed to add resources\n");

						goto err_clk_disable;

					}

					ret = platform_device_add_data(musb, pdata, sizeof(*pdata));

					if (ret) {

						dev_err(&amp;pdev->dev, "failed to add platform_data\n");

						goto err_clk_disable;

					}

					return 0;

				err_clk_disable:

					clk_disable_unprepare(clk);

				err_platform_device_put:

					platform_device_put(musb);

					return ret;

				}

				    </programlisting>

				    <para>

				      The first step is to pass the device data privately held by the

				      glue layer on to the controller driver through

				      platform_set_drvdata() (line 7). Next is passing on the device

				      resources information, also privately held at that point, through

				      platform_device_add_resources() (line 9).

				    </para>
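				    <para>
				      The effect of platform_set_drvdata() and its counterpart
				      platform_get_drvdata(), used later in the remove function, can
				      be modelled as storing and reading back one opaque pointer. In
				      this sketch struct demo_pdev, struct demo_glue and the demo_
				      helpers are hypothetical stand-ins, not the kernel definitions.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical stand-in for a platform device with a driver-data slot. */
struct demo_pdev {
	void *driver_data;
};

/* Hypothetical private state, playing the role of the glue structure. */
struct demo_glue {
	int clk_enabled;
};

/* Store the private pointer in the device. */
static void demo_set_drvdata(struct demo_pdev *pdev, void *data)
{
	pdev->driver_data = data;
}

/* Read the private pointer back, for example from a remove function. */
static void *demo_get_drvdata(const struct demo_pdev *pdev)
{
	return pdev->driver_data;
}
```
				]]></programlisting>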

				    <para>

				      Finally comes passing on the platform specific data to the

				      controller driver (line 16). Platform data will be discussed in

				      <link linkend="device-platform-data">Chapter 4</link>, but here

				      we are looking at the platform_ops function pointer (line 5) in

				      musb_hdrc_platform_data structure (line 3).  This function

				      pointer allows the MUSB controller driver to know which function

				      to call for device operation:

				    </para>

				    <programlisting linenumbering="numbered">

				static const struct musb_platform_ops jz4740_musb_ops = {

					.init		= jz4740_musb_init,

					.exit		= jz4740_musb_exit,

				};

				    </programlisting>

				    <para>

				      Here we have the minimal case where only init and exit functions

				      are called by the controller driver when needed. The JZ4740 MUSB

				      controller is a basic controller that lacks some features found

				      in other controllers; otherwise we might also have

				      pointers to a few other functions like a power management function

				      or a function to switch between OTG and non-OTG modes, for

				      instance.

				    </para>
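				    <para>
				      A table of operations with optional entries can be sketched in
				      user space as follows. The names struct demo_ops, struct
				      demo_ctrl and core_set_mode() are hypothetical; the point is
				      only that the caller has to cope with callbacks that a minimal
				      glue layer leaves unset.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical controller state. */
struct demo_ctrl {
	int initialized;
	int mode;
};

/* Hypothetical ops table: init and exit are mandatory, set_mode is optional. */
struct demo_ops {
	int (*init)(struct demo_ctrl *c);
	int (*exit)(struct demo_ctrl *c);
	int (*set_mode)(struct demo_ctrl *c, int mode);
};

static int demo_init(struct demo_ctrl *c)
{
	c->initialized = 1;
	return 0;
}

static int demo_exit(struct demo_ctrl *c)
{
	c->initialized = 0;
	return 0;
}

/* Minimal table: members not listed are implicitly NULL. */
static const struct demo_ops minimal_ops = {
	.init = demo_init,
	.exit = demo_exit,
};

/* Core-side helper: treat a missing optional callback as "nothing to do". */
static int core_set_mode(const struct demo_ops *ops, struct demo_ctrl *c,
			 int mode)
{
	if (!ops->set_mode)
		return 0;
	return ops->set_mode(c, mode);
}
```
				]]></programlisting>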

				    <para>

				      At that point of the registration process, the controller driver

				      actually calls the init function:

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_musb_init(struct musb *musb)

				{

					musb->xceiv = usb_get_phy(USB_PHY_TYPE_USB2);

					if (!musb->xceiv) {

						pr_err("HS UDC: no transceiver configured\n");

						return -ENODEV;

					}

					/* Silicon does not implement ConfigData register.

					 * Set dyn_fifo to avoid reading EP config from hardware.

					 */

					musb->dyn_fifo = true;

					musb->isr = jz4740_musb_interrupt;

					return 0;

				}

				    </programlisting>

				    <para>

				      The goal of jz4740_musb_init() is to get hold of the transceiver

				      driver data of the MUSB controller hardware and pass it on to the

				      MUSB controller driver, as usual. The transceiver is the circuitry

				      inside the controller hardware responsible for sending/receiving

				      the USB data. Since it is an implementation of the physical layer

				      of the OSI model, the transceiver is also referred to as PHY.

				    </para>

				    <para>

				      Getting hold of the MUSB PHY driver data is done with

				      usb_get_phy() which returns a pointer to the structure

				      containing the driver instance data. The next couple of

				      instructions (lines 12 and 14) are used as a quirk and to set up

				      IRQ handling respectively. Quirks and IRQ handling will be

				      discussed later in <link linkend="device-quirks">Chapter

				      5</link> and <link linkend="handling-irqs">Chapter 3</link>.

				    </para>
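				    <para>
				      Kernel functions that hand out a resource often report failure
				      through an encoded error pointer rather than NULL, as the
				      IS_ERR(clk) check in the probe function shows. Which convention
				      usb_get_phy() follows in your kernel version is worth checking
				      in its header. The user-space sketch below models the encoding
				      under the assumption that the top 4095 address values are
				      reserved for error codes; the demo_ helpers are hypothetical
				      names, not the kernel macros.
				    </para>
				    <programlisting><![CDATA[
```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the largest errno value that can be encoded in a pointer. */
#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno value from an error pointer. */
static long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer falls in the range reserved for error codes. */
static int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}
```
				]]></programlisting>
				    <para>
				      Note that in this model a NULL pointer is not an error pointer,
				      so a NULL test and an error-pointer test are not interchangeable.
				    </para>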

				    <programlisting linenumbering="numbered">

				static int jz4740_musb_exit(struct musb *musb)

				{

					usb_put_phy(musb->xceiv);

					return 0;

				}

				    </programlisting>

				    <para>

				      Acting as the counterpart of init, the exit function releases the

				      MUSB PHY driver when the controller hardware itself is about to be

				      released.

				    </para>

				    <para>

				      Again, note that init and exit are fairly simple in this case due

				      to the basic set of features of the JZ4740 controller hardware.

				      When writing an MUSB glue layer for more complex controller

				      hardware, you might need to take care of more processing in those

				      two functions.

				    </para>

				    <para>

				      Returning from the init function, the MUSB controller driver jumps

				      back into the probe function:

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_probe(struct platform_device *pdev)

				{

					ret = platform_device_add(musb);

					if (ret) {

						dev_err(&amp;pdev->dev, "failed to register musb device\n");

						goto err_clk_disable;

					}

					return 0;

				err_clk_disable:

					clk_disable_unprepare(clk);

				err_platform_device_put:

					platform_device_put(musb);

					return ret;

				}

				    </programlisting>

				    <para>

				      This is the last part of the device registration process where the

				      glue layer adds the controller hardware device to the Linux kernel

				      device hierarchy: at this stage, all known information about the

				      device is passed on to the Linux USB core stack.

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_remove(struct platform_device *pdev)

				{

					struct jz4740_glue	*glue = platform_get_drvdata(pdev);

					platform_device_unregister(glue->musb);

					clk_disable_unprepare(glue->clk);

					return 0;

				}

				    </programlisting>

				    <para>

				      Acting as the counterpart of probe, the remove function unregisters

				      the MUSB controller hardware (line 5) and disables the clock (line

				      6), allowing it to be gated.

				    </para>
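				    <para>
				      Both the error labels in the probe function and the remove
				      function release resources in the reverse order of their
				      acquisition. The following user-space sketch models that
				      pattern; demo_alloc(), demo_clk_enable() and the fail_
				      parameters are hypothetical names invented for the
				      illustration, not kernel APIs.
				    </para>
				    <programlisting><![CDATA[
```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical bookkeeping so that leaks can be observed from outside. */
static int live_allocs;
static int clocks_on;

static void *demo_alloc(int fail)
{
	void *p = fail ? NULL : malloc(1);

	if (p)
		live_allocs++;
	return p;
}

static void demo_free(void *p)
{
	free(p);
	live_allocs--;
}

static int demo_clk_enable(int fail)
{
	if (fail)
		return -EIO;
	clocks_on++;
	return 0;
}

static void demo_clk_disable(void)
{
	clocks_on--;
}

/* Acquire in order; on failure jump to the label that undoes earlier steps. */
static int demo_probe(void **out, int fail_alloc, int fail_clk, int fail_add)
{
	void *musb;
	int ret;

	musb = demo_alloc(fail_alloc);
	if (!musb)
		return -ENOMEM;

	ret = demo_clk_enable(fail_clk);
	if (ret)
		goto err_put;

	if (fail_add) {
		ret = -ENODEV;
		goto err_clk_disable;
	}

	*out = musb;
	return 0;

err_clk_disable:
	demo_clk_disable();
err_put:
	demo_free(musb);
	return ret;
}

/* Undo a successful probe, last acquired first. */
static void demo_remove(void *musb)
{
	demo_clk_disable();
	demo_free(musb);
}
```
				]]></programlisting>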

				  </chapter>

				  <chapter id="handling-irqs">

				    <title>Handling IRQs</title>

				    <para>

				      In addition to the MUSB controller hardware basic setup and

				      registration, the glue layer is also responsible for handling the

				      IRQs:

				    </para>

				    <programlisting linenumbering="numbered">

				static irqreturn_t jz4740_musb_interrupt(int irq, void *__hci)

				{

					unsigned long   flags;

					irqreturn_t     retval = IRQ_NONE;

					struct musb     *musb = __hci;

					spin_lock_irqsave(&amp;musb->lock, flags);

					musb->int_usb = musb_readb(musb->mregs, MUSB_INTRUSB);

					musb->int_tx = musb_readw(musb->mregs, MUSB_INTRTX);

					musb->int_rx = musb_readw(musb->mregs, MUSB_INTRRX);

					/*

					 * The controller is gadget only, the state of the host mode IRQ bits is

					 * undefined. Mask them to make sure that the musb driver core will

					 * never see them set

					 */

					musb->int_usb &amp;= MUSB_INTR_SUSPEND | MUSB_INTR_RESUME |

					    MUSB_INTR_RESET | MUSB_INTR_SOF;

					if (musb->int_usb || musb->int_tx || musb->int_rx)

						retval = musb_interrupt(musb);

					spin_unlock_irqrestore(&amp;musb->lock, flags);

					return retval;

				}

				    </programlisting>

				    <para>

				      Here the glue layer mostly has to read the relevant hardware

				      registers and pass their values on to the controller driver which

				      will handle the actual event that triggered the IRQ.

				    </para>

				    <para>

				      The interrupt handler critical section is protected by the

				      spin_lock_irqsave() and counterpart spin_unlock_irqrestore()

				      functions (lines 7 and 24 respectively), which prevent the

				      interrupt handler code from being run by two different threads at the

				      same time.

				    </para>

				    <para>

				      Then the relevant interrupt registers are read (lines 9 to 11):

				    </para>

				    <itemizedlist>

				      <listitem>

				        <para>

				          MUSB_INTRUSB: indicates which USB interrupts are currently

				          active,

				        </para>

				      </listitem>

				      <listitem>

				        <para>

				          MUSB_INTRTX: indicates which of the interrupts for TX

				          endpoints are currently active,

				        </para>

				      </listitem>

				      <listitem>

				        <para>

				          MUSB_INTRRX: indicates which of the interrupts for RX

				          endpoints are currently active.

				        </para>

				      </listitem>

				    </itemizedlist>

				    <para>

				      Note that musb_readb() is used to read 8-bit registers at most,

				      while musb_readw() allows us to read at most 16-bit registers.

				      There are other functions that can be used depending on the size

				      of your device registers. See musb_io.h for more information.

				    </para>
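				    <para>
				      The difference between the 8-bit and 16-bit accessors can be
				      sketched in user space with a byte array standing in for the
				      register block. The real accessors go through the kernel I/O
				      helpers; demo_readb(), demo_readw() and the DEMO_ offsets below
				      are hypothetical and only illustrate the access widths (see
				      musb_regs.h for the actual register layout).
				    </para>
				    <programlisting><![CDATA[
```c
#include <stdint.h>

/* Illustrative offsets only; check musb_regs.h for the real values. */
#define DEMO_INTRTX	0x02
#define DEMO_INTRRX	0x04
#define DEMO_INTRUSB	0x0A

/* Read one 8-bit register. */
static uint8_t demo_readb(const uint8_t *base, unsigned int offset)
{
	return base[offset];
}

/* Read one 16-bit register, assembled from two bytes in little-endian order. */
static uint16_t demo_readw(const uint8_t *base, unsigned int offset)
{
	return (uint16_t)(base[offset] | (base[offset + 1] << 8));
}
```
				]]></programlisting>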

				    <para>

				      The instruction on line 18 is another quirk specific to the JZ4740

				      USB device controller, which will be discussed later in <link

				      linkend="device-quirks">Chapter 5</link>.

				    </para>

				    <para>

				      The glue layer still needs to register the IRQ handler though.

				      Remember the instruction on line 14 of the init function:

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_musb_init(struct musb *musb)

				{

					musb->isr = jz4740_musb_interrupt;

					return 0;

				}

				    </programlisting>

				    <para>

				      This instruction sets a pointer to the glue layer IRQ handler

				      function, in order for the controller driver to call the handler

				      back when an IRQ comes from the controller hardware. The interrupt

				      handler is now implemented and registered.

				    </para>
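				    <para>
				      Registering a handler through a function pointer, and having
				      the core call it back, can be sketched as follows. This is a
				      user-space model with hypothetical names (struct demo_musb,
				      demo_glue_init(), demo_core_dispatch()); the real controller
				      driver does more work around the call.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stddef.h>

/* Hypothetical controller state holding the handler installed by the glue. */
struct demo_musb {
	int (*isr)(int irq, void *ctx);
	int handled;
};

/* Glue-side handler: count the interrupt and report it as handled (1). */
static int demo_glue_interrupt(int irq, void *ctx)
{
	struct demo_musb *musb = ctx;

	(void)irq;
	musb->handled++;
	return 1;
}

/* Glue-side init: install the handler, as the init function does. */
static void demo_glue_init(struct demo_musb *musb)
{
	musb->isr = demo_glue_interrupt;
}

/* Core-side dispatch: call the handler if one was installed. */
static int demo_core_dispatch(struct demo_musb *musb, int irq)
{
	if (!musb->isr)
		return 0;
	return musb->isr(irq, musb);
}
```
				]]></programlisting>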

				  </chapter>

				  <chapter id="device-platform-data">

				    <title>Device Platform Data</title>

				    <para>

				      In order to write an MUSB glue layer, you need to have some data

				      describing the hardware capabilities of your controller hardware,

				      which is called the platform data.

				    </para>

				    <para>

				      Platform data is specific to your hardware, though it may cover a

				      broad range of devices, and is generally found somewhere in the

				      arch/ directory, depending on your device architecture.

				    </para>

				    <para>

				      For instance, platform data for the JZ4740 SoC is found in

				      arch/mips/jz4740/platform.c. In the platform.c file each device of

				      the JZ4740 SoC is described through a set of structures.

				    </para>

				    <para>

				      Here is the part of arch/mips/jz4740/platform.c that covers the

				      USB Device Controller (UDC):

				    </para>

				    <programlisting linenumbering="numbered">

				/* USB Device Controller */

				struct platform_device jz4740_udc_xceiv_device = {

					.name = "usb_phy_gen_xceiv",

					.id   = 0,

				};

				static struct resource jz4740_udc_resources[] = {

					[0] = {

						.start = JZ4740_UDC_BASE_ADDR,

						.end   = JZ4740_UDC_BASE_ADDR + 0x10000 - 1,

						.flags = IORESOURCE_MEM,

					},

					[1] = {

						.start = JZ4740_IRQ_UDC,

						.end   = JZ4740_IRQ_UDC,

						.flags = IORESOURCE_IRQ,

						.name  = "mc",

					},

				};

				struct platform_device jz4740_udc_device = {

					.name = "musb-jz4740",

					.id   = -1,

					.dev  = {

						.dma_mask          = &amp;jz4740_udc_device.dev.coherent_dma_mask,

						.coherent_dma_mask = DMA_BIT_MASK(32),

					},

					.num_resources = ARRAY_SIZE(jz4740_udc_resources),

					.resource      = jz4740_udc_resources,

				};

				    </programlisting>

				    <para>

				      The jz4740_udc_xceiv_device platform device structure (line 2)

				      describes the UDC transceiver with a name and id number.

				    </para>

				    <para>

				      At the time of this writing, note that

				      &quot;usb_phy_gen_xceiv&quot; is the specific name to be used for

				      all transceivers that are either built-in with reference USB IP or

				      autonomous and don't require any PHY programming. You will need

				      to set CONFIG_NOP_USB_XCEIV=y in the kernel configuration to make

				      use of the corresponding transceiver driver. The id field could be

				      set to -1 (equivalent to PLATFORM_DEVID_NONE), -2 (equivalent to

				      PLATFORM_DEVID_AUTO) or start with 0 for the first device of this

				      kind if we want a specific id number.

				    </para>

				    <para>

				      The jz4740_udc_resources resource structure (line 7) defines the

				      UDC registers base addresses.

				    </para>

				    <para>

				      The first array entry (lines 9 to 11) defines the UDC register

				      memory region: start is the address of the first register, end

				      is the address of the last byte of the region (the range is

				      inclusive), and the flags member defines the type of resource

				      we are dealing with, so IORESOURCE_MEM marks a range of

				      register memory addresses. The second array entry (lines 14 to

				      17) defines the UDC IRQ. Since the JZ4740 UDC has a single IRQ

				      line, start and end hold the same IRQ number. The

				      IORESOURCE_IRQ flag indicates that we are dealing with an IRQ

				      resource, and the name &quot;mc&quot; is in fact hard-coded in

				      the MUSB core so that the controller driver can retrieve this

				      IRQ resource by name.

				    </para>
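				    <para>
				      Because the start and end members of a resource are both
				      inclusive, the length of a range is end - start + 1. A minimal
				      user-space sketch, in which struct demo_resource,
				      demo_resource_size() and the base address used in the usage
				      note are hypothetical:
				    </para>
				    <programlisting><![CDATA[
```c
#include <stdint.h>

/* Hypothetical stand-in for a resource with inclusive bounds. */
struct demo_resource {
	uint32_t start;
	uint32_t end;
};

/* Both bounds are inclusive, so the length is end - start + 1. */
static uint32_t demo_resource_size(const struct demo_resource *res)
{
	return res->end - res->start + 1;
}
```
				]]></programlisting>
				    <para>
				      With a made-up base of 0x10000000 and end set to base + 0x10000
				      - 1, this yields 0x10000 bytes; an IRQ entry whose start and
				      end are equal yields a length of 1.
				    </para>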

				    <para>

				      Finally, the jz4740_udc_device platform device structure (line 21)

				      describes the UDC itself.

				    </para>

				    <para>

				      The &quot;musb-jz4740&quot; name (line 22) defines the MUSB

				      driver that is used for this device; remember this is in fact

				      the name that we used in the jz4740_driver platform driver

				      structure in <link linkend="linux-musb-basics">Chapter

				      2</link>. The id field (line 23) is set to -1 (equivalent to

				      PLATFORM_DEVID_NONE) since we do not need an id for the device:

				      the MUSB controller driver was already set to allocate an

				      automatic id in <link linkend="linux-musb-basics">Chapter

				      2</link>. In the dev field we care for DMA related information

				      here. The dma_mask field (line 25) defines the width of the DMA

				      mask that is going to be used, and coherent_dma_mask (line 26)

				      has the same purpose but for the alloc_coherent DMA mappings: in

				      both cases we are using a 32-bit mask. Then the resource field

				      (line 29) is simply a pointer to the resource array defined

				      before, while the num_resources field (line 28) keeps track of

				      the number of entries in that array (in this

				      case there were two resource entries defined before).

				    </para>
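				    <para>
				      A 32-bit DMA mask is simply a value with the low 32 bits set.
				      The sketch below shows one way to build such a mask and to test
				      whether an address fits in it; DEMO_DMA_BIT_MASK() and
				      demo_addr_fits() are hypothetical user-space names written for
				      this illustration.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stdint.h>

/* Build a mask with the low n bits set; n == 64 is handled separately. */
#define DEMO_DMA_BIT_MASK(n) \
	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* An address is reachable when it has no bits set outside the mask. */
static int demo_addr_fits(uint64_t addr, uint64_t mask)
{
	return (addr & ~mask) == 0;
}
```
				]]></programlisting>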

				    <para>

				      With this quick overview of the UDC platform data at the arch/

				      level now done, let's get back to the MUSB glue layer specific

				      platform data in drivers/usb/musb/jz4740.c:

				    </para>

				    <programlisting linenumbering="numbered">

				static struct musb_hdrc_config jz4740_musb_config = {

					/* Silicon does not implement USB OTG. */

					.multipoint = 0,

					/* Max EPs scanned, driver will decide which EP can be used. */

					.num_eps    = 4,

					/* RAMbits needed to configure EPs from table */

					.ram_bits   = 9,

					.fifo_cfg = jz4740_musb_fifo_cfg,

					.fifo_cfg_size = ARRAY_SIZE(jz4740_musb_fifo_cfg),

				};

				static struct musb_hdrc_platform_data jz4740_musb_platform_data = {

					.mode   = MUSB_PERIPHERAL,

					.config = &amp;jz4740_musb_config,

				};

				    </programlisting>

				    <para>

				      First the glue layer configures some aspects of the controller

				      driver operation related to the controller hardware specifics.

				      This is done through the jz4740_musb_config musb_hdrc_config

				      structure.

				    </para>

				    <para>

				      Defining the OTG capability of the controller hardware, the

				      multipoint member (line 3) is set to 0 (equivalent to false)

				      since the JZ4740 UDC is not OTG compatible. Then num_eps (line

				      5) defines the number of USB endpoints of the controller

				      hardware, including endpoint 0: here we have 3 endpoints +

				      endpoint 0. Next is ram_bits (line 7) which is the width of the

				      RAM address bus for the MUSB controller hardware. This

				      information is needed when the controller driver cannot

				      automatically configure endpoints by reading the relevant

				      controller hardware registers. This issue will be discussed when

				      we get to device quirks in <link linkend="device-quirks">Chapter

				      5</link>. The last two fields (lines 8 and 9) are also about device

				      quirks: fifo_cfg points to the USB endpoints configuration table

				      and fifo_cfg_size keeps track of the number of

				      entries in that configuration table. More on that later in <link

				      linkend="device-quirks">Chapter 5</link>.

				    </para>

				    <para>

				      Then this configuration is embedded inside

				      jz4740_musb_platform_data musb_hdrc_platform_data structure (line

				      11): config is a pointer to the configuration structure itself,

				      and mode tells the controller driver if the controller hardware

				      may be used as MUSB_HOST only, MUSB_PERIPHERAL only or MUSB_OTG

				      which is a dual mode.

				    </para>

				    <para>

				      Remember that jz4740_musb_platform_data is then used to convey

				      platform data information as we have seen in the probe function

				      in <link linkend="linux-musb-basics">Chapter 2</link>.

				    </para>

				  </chapter>

				  <chapter id="device-quirks">

				    <title>Device Quirks</title>

				    <para>

				      Completing the platform data specific to your device, you may also

				      need to write some code in the glue layer to work around some

				      device specific limitations. These quirks may be due to some

				      hardware bugs, or simply be the result of an incomplete

				      implementation of the USB On-the-Go specification.

				    </para>

				    <para>

				      The JZ4740 UDC exhibits such quirks, some of which we will discuss

				      here for the sake of insight even though these might not be found

				      in the controller hardware you are working on.

				    </para>

				    <para>

				      Let's get back to the init function first:

				    </para>

				    <programlisting linenumbering="numbered">

				static int jz4740_musb_init(struct musb *musb)

				{

					musb->xceiv = usb_get_phy(USB_PHY_TYPE_USB2);

					if (!musb->xceiv) {

						pr_err("HS UDC: no transceiver configured\n");

						return -ENODEV;

					}

					/* Silicon does not implement ConfigData register.

					 * Set dyn_fifo to avoid reading EP config from hardware.

					 */

					musb->dyn_fifo = true;

					musb->isr = jz4740_musb_interrupt;

					return 0;

				}

				    </programlisting>

				    <para>

				      The instruction on line 12 helps the MUSB controller driver to work

				      around the fact that the controller hardware is missing registers

				      that are used for USB endpoints configuration.

				    </para>

				    <para>

				      Without these registers, the controller driver is unable to read

				      the endpoints configuration from the hardware, so we use the

				      instruction on line 12 to bypass reading the configuration from silicon, and

				      rely on a hard-coded table that describes the endpoints

				      configuration instead:

				    </para>

				    <programlisting linenumbering="numbered">

				static struct musb_fifo_cfg jz4740_musb_fifo_cfg[] = {

				{ .hw_ep_num = 1, .style = FIFO_TX, .maxpacket = 512, },

				{ .hw_ep_num = 1, .style = FIFO_RX, .maxpacket = 512, },

				{ .hw_ep_num = 2, .style = FIFO_TX, .maxpacket = 64, },

				};

				    </programlisting>

				    <para>

				      Looking at the configuration table above, we see that each

				      endpoint is described by three fields: hw_ep_num is the endpoint

				      number, style is its direction (either FIFO_TX for the controller

				      driver to send packets to the controller hardware, or FIFO_RX to

				      receive packets from hardware), and maxpacket defines the maximum

				      size of each data packet that can be transmitted over that

				      endpoint. Reading from the table, the controller driver knows that

				      endpoint 1 can be used to send and receive USB data packets of 512

				      bytes at once (this is in fact a bulk in/out endpoint), and

				      endpoint 2 can be used to send data packets of 64 bytes at once

				      (this is in fact an interrupt endpoint).

				    </para>

				    <para>

				      Note that there is no information about endpoint 0 here: that one

				      is implemented by default in every silicon design, with a

				      predefined configuration according to the USB specification. For

				      more examples of endpoint configuration tables, see musb_core.c.

				    </para>
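				    <para>
				      As a rough sanity check on such a table, one can add up the
				      FIFO space it asks for and compare it with the FIFO RAM implied
				      by ram_bits. The sketch below does this in user space under two
				      assumptions that should be verified against musb_core.c:
				      endpoint 0 reserves 64 bytes, and the FIFO RAM holds 1 shifted
				      left by (ram_bits + 2) bytes. All demo_ and DEMO_ names are
				      hypothetical.
				    </para>
				    <programlisting><![CDATA[
```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

enum demo_fifo_style { DEMO_FIFO_TX, DEMO_FIFO_RX };

/* Hypothetical mirror of one endpoint configuration entry. */
struct demo_fifo_cfg {
	unsigned int hw_ep_num;
	enum demo_fifo_style style;
	unsigned int maxpacket;
};

/* Same values as the table shown above. */
static const struct demo_fifo_cfg demo_cfg[] = {
	{ .hw_ep_num = 1, .style = DEMO_FIFO_TX, .maxpacket = 512, },
	{ .hw_ep_num = 1, .style = DEMO_FIFO_RX, .maxpacket = 512, },
	{ .hw_ep_num = 2, .style = DEMO_FIFO_TX, .maxpacket = 64, },
};

/* Assumption: endpoint 0 reserves 64 bytes ahead of the table entries. */
#define DEMO_EP0_FIFO_BYTES 64u

/* Sum the FIFO space requested by endpoint 0 plus every table entry. */
static unsigned int demo_fifo_bytes_needed(const struct demo_fifo_cfg *cfg,
					   size_t n)
{
	unsigned int total = DEMO_EP0_FIFO_BYTES;
	size_t i;

	for (i = 0; i < n; i++)
		total += cfg[i].maxpacket;
	return total;
}

/* Assumption: the FIFO RAM holds 1 << (ram_bits + 2) bytes. */
static unsigned int demo_fifo_ram_bytes(unsigned int ram_bits)
{
	return 1u << (ram_bits + 2);
}
```
				]]></programlisting>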

				    <para>

				      Let's now get back to the interrupt handler function:

				    </para>

				    <programlisting linenumbering="numbered">

				static irqreturn_t jz4740_musb_interrupt(int irq, void *__hci)

				{

					unsigned long   flags;

					irqreturn_t     retval = IRQ_NONE;

					struct musb     *musb = __hci;

					spin_lock_irqsave(&amp;musb->lock, flags);

					musb->int_usb = musb_readb(musb->mregs, MUSB_INTRUSB);

					musb->int_tx = musb_readw(musb->mregs, MUSB_INTRTX);

					musb->int_rx = musb_readw(musb->mregs, MUSB_INTRRX);

					/*

					 * The controller is gadget only, the state of the host mode IRQ bits is

					 * undefined. Mask them to make sure that the musb driver core will

					 * never see them set

					 */

					musb->int_usb &amp;= MUSB_INTR_SUSPEND | MUSB_INTR_RESUME |

					    MUSB_INTR_RESET | MUSB_INTR_SOF;

					if (musb->int_usb || musb->int_tx || musb->int_rx)

						retval = musb_interrupt(musb);

					spin_unlock_irqrestore(&amp;musb->lock, flags);

					return retval;

				}

				    </programlisting>

				    <para>

				      The instruction on line 18 above is a way for the controller driver to

				      work around the fact that some interrupt bits used for USB host

				      mode operation are missing in the MUSB_INTRUSB register, thus left

				      in an undefined hardware state, since this MUSB controller

				      hardware is used in peripheral mode only. As a consequence, the

				      glue layer masks these missing bits out to avoid parasite

				      interrupts by doing a logical AND operation between the value read

				      from MUSB_INTRUSB and the bits that are actually implemented in

				      the register.

				    </para>
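				    <para>
				      The masking step can be tried out in isolation. The following is a
				      minimal user-space sketch, not code from the glue layer: the bit
				      values are illustrative stand-ins for the MUSB_INTR_* definitions,
				      and mask_peripheral_irqs is a hypothetical helper name.
				    </para>

				    <programlisting><![CDATA[
```c
#include <stdint.h>

/*
 * Illustrative bit positions for the USB interrupt status byte.
 * These are assumptions made for this sketch; real code should use
 * the MUSB_INTR_* definitions from the driver headers.
 */
#define INTR_SUSPEND    0x01
#define INTR_RESUME     0x02
#define INTR_RESET      0x04
#define INTR_SOF        0x08
#define INTR_UNDEFINED  0xf0    /* bits treated as undefined in this sketch */

/* Bits that the peripheral-only controller is assumed to implement. */
#define PERIPHERAL_INTR_MASK \
	(INTR_SUSPEND | INTR_RESUME | INTR_RESET | INTR_SOF)

/* Hypothetical helper: keep only the implemented bits of a raw read. */
static uint8_t mask_peripheral_irqs(uint8_t raw)
{
	return raw & PERIPHERAL_INTR_MASK;
}
```
				    ]]></programlisting>

				    <para>
				      With these assumed values, a raw read where every undefined bit
				      happens to be set alongside a reset event is reduced to the reset
				      bit alone, so the core never sees the undefined bits.
				    </para>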

				    <para>

				      These are only a couple of the quirks found in the JZ4740 USB
				      device controller. Some others were addressed directly in the MUSB
				      core, since the fixes were generic enough to improve the handling
				      of the same issues on other controller hardware as well.

				    </para>

				  </chapter>

				  <chapter id="conclusion">

				    <title>Conclusion</title>

				    <para>

				      Writing a Linux MUSB glue layer should be a more accessible task,

				      as this documentation tries to show the ins and outs of this

				      exercise.

				    </para>

				    <para>

				      The JZ4740 USB device controller being fairly simple, I hope its

				      glue layer serves as a good example for the curious mind. Used

				      with the current MUSB glue layers, this documentation should

				      provide enough guidance to get started; should anything get out

				      of hand, the linux-usb mailing list archive is another helpful

				      resource to browse through.

				    </para>

				  </chapter>

				  <chapter id="acknowledgements">

				    <title>Acknowledgements</title>

				    <para>

				      Many thanks to Lars-Peter Clausen and Maarten ter Huurne for

				      answering my questions while I was writing the JZ4740 glue layer

				      and for helping me get the code into good shape.

				    </para>

				    <para>

				      I would also like to thank the Qi-Hardware community at large for

				      its cheerful guidance and support.

				    </para>

				  </chapter>

				  <chapter id="resources">

				    <title>Resources</title>

				    <para>

				      USB Home Page:

				      <ulink url="http://www.usb.org">http://www.usb.org</ulink>

				    </para>

				    <para>

				      linux-usb Mailing List Archives:

				      <ulink url="http://marc.info/?l=linux-usb">http://marc.info/?l=linux-usb</ulink>

				    </para>

				    <para>

				      USB On-the-Go Basics:

				      <ulink url="http://www.maximintegrated.com/app-notes/index.mvp/id/1822">http://www.maximintegrated.com/app-notes/index.mvp/id/1822</ulink>

				    </para>

				    <para>

				      Writing USB Device Drivers:

				      <ulink url="https://www.kernel.org/doc/htmldocs/writing_usb_driver/index.html">https://www.kernel.org/doc/htmldocs/writing_usb_driver/index.html</ulink>

				    </para>

				    <para>

				      Texas Instruments USB Configuration Wiki Page:

				      <ulink url="http://processors.wiki.ti.com/index.php/Usbgeneralpage">http://processors.wiki.ti.com/index.php/Usbgeneralpage</ulink>

				    </para>

				    <para>

				      Analog Devices Blackfin MUSB Configuration:

				      <ulink url="http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:drivers:musb">http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:drivers:musb</ulink>

				    </para>

				  </chapter>

				</book>

									

Documentation/DocBook/writing_usb_driver.tmpl
									
												
				@@ -0,0 +1,412 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="USBDeviceDriver">

				 <bookinfo>

				  <title>Writing USB Device Drivers</title>

				  <authorgroup>

				   <author>

				    <firstname>Greg</firstname>

				    <surname>Kroah-Hartman</surname>

				    <affiliation>

				     <address>

				      <email>greg@kroah.com</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2001-2002</year>

				   <holder>Greg Kroah-Hartman</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later

				     version.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				   <para>

				     This documentation is based on an article published in 

				     Linux Journal Magazine, October 2001, Issue 90.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="intro">

				      <title>Introduction</title>

				  <para>

				      The Linux USB subsystem has grown from supporting only two different

				      types of devices in the 2.2.7 kernel (mice and keyboards), to over 20

				      different types of devices in the 2.4 kernel. Linux currently supports

				      almost all USB class devices (standard types of devices like keyboards,

				      mice, modems, printers and speakers) and an ever-growing number of

				      vendor-specific devices (such as USB to serial converters, digital

				      cameras, Ethernet devices and MP3 players). For a full list of the

				      different USB devices currently supported, see Resources.

				  </para>

				  <para>

				      The remaining kinds of USB devices that do not have support on Linux are

				      almost all vendor-specific devices. Each vendor decides to implement a

				      custom protocol to talk to their device, so a custom driver usually needs

				      to be created. Some vendors are open with their USB protocols and help

				      with the creation of Linux drivers, while others do not publish them, and

				      developers are forced to reverse-engineer. See Resources for some links

				      to handy reverse-engineering tools.

				  </para>

				  <para>

				      Because each different protocol causes a new driver to be created, I have

				      written a generic USB driver skeleton, modelled after the pci-skeleton.c

				      file in the kernel source tree upon which many PCI network drivers have

				      been based. This USB skeleton can be found at drivers/usb/usb-skeleton.c

				      in the kernel source tree. In this article I will walk through the basics

				      of the skeleton driver, explaining the different pieces and what needs to

				      be done to customize it to your specific device.

				  </para>

				  </chapter>

				  <chapter id="basics">

				      <title>Linux USB Basics</title>

				  <para>

				      If you are going to write a Linux USB driver, please become familiar with

				      the USB protocol specification. It can be found, along with many other

				      useful documents, at the USB home page (see Resources). An excellent

				      introduction to the Linux USB subsystem can be found at the USB Working

				      Devices List (see Resources). It explains how the Linux USB subsystem is

				      structured and introduces the reader to the concept of USB urbs

				      (USB Request Blocks), which are essential to USB drivers.

				  </para>

				  <para>

				      The first thing a Linux USB driver needs to do is register itself with

				      the Linux USB subsystem, giving it some information about which devices

				      the driver supports and which functions to call when a device supported

				      by the driver is inserted or removed from the system. All of this

				      information is passed to the USB subsystem in the usb_driver structure.

				      The skeleton driver declares a usb_driver as:

				  </para>

				  <programlisting>

				static struct usb_driver skel_driver = {

				        .name        = "skeleton",

				        .probe       = skel_probe,

				        .disconnect  = skel_disconnect,

				        .fops        = &amp;skel_fops,

				        .minor       = USB_SKEL_MINOR_BASE,

				        .id_table    = skel_table,

				};

				  </programlisting>

				  <para>

				      The variable name is a string that describes the driver. It is used in

				      informational messages printed to the system log. The probe and

				      disconnect function pointers are called when a device that matches the

				      information provided in the id_table variable is either seen or removed.

				  </para>

				  <para>

				      The fops and minor variables are optional. Most USB drivers hook into

				      another kernel subsystem, such as the SCSI, network or TTY subsystem.

				      These types of drivers register themselves with the other kernel

				      subsystem, and any user-space interactions are provided through that

				      interface. But for drivers that do not have a matching kernel subsystem,

				      such as MP3 players or scanners, a method of interacting with user space

				      is needed. The USB subsystem provides a way to register a minor device

				      number and a set of file_operations function pointers that enable this

				      user-space interaction. The skeleton driver needs this kind of interface,

				      so it provides a minor starting number and a pointer to its

				      file_operations functions.

				  </para>

				  <para>

				      The USB driver is then registered with a call to usb_register, usually in

				      the driver's init function, as shown here:

				  </para>

				  <programlisting>

				static int __init usb_skel_init(void)
				{
				        int result;

				        /* register this driver with the USB subsystem */
				        result = usb_register(&amp;skel_driver);
				        if (result &lt; 0) {
				                pr_err(&quot;usb_register failed for the &quot; __FILE__ &quot; driver. &quot;
				                       &quot;Error number %d\n&quot;, result);
				                /* pass the real error code back to the module loader */
				                return result;
				        }

				        return 0;
				}
				module_init(usb_skel_init);

				  </programlisting>

				  <para>

				      When the driver is unloaded from the system, it needs to deregister

				      itself with the USB subsystem. This is done with the usb_deregister

				      function:

				  </para>

				  <programlisting>

				static void __exit usb_skel_exit(void)

				{

				        /* deregister this driver with the USB subsystem */

				        usb_deregister(&amp;skel_driver);

				}

				module_exit(usb_skel_exit);

				  </programlisting>

				  <para>

				     To enable the linux-hotplug system to load the driver automatically when

				     the device is plugged in, you need to create a MODULE_DEVICE_TABLE. The

				     following code tells the hotplug scripts that this module supports a

				     single device with a specific vendor and product ID:

				  </para>

				  <programlisting>

				/* table of devices that work with this driver */

				static struct usb_device_id skel_table [] = {

				        { USB_DEVICE(USB_SKEL_VENDOR_ID, USB_SKEL_PRODUCT_ID) },

				        { }                      /* Terminating entry */

				};

				MODULE_DEVICE_TABLE (usb, skel_table);

				  </programlisting>

				  <para>

				     There are other macros that can be used in describing a usb_device_id for

				     drivers that support a whole class of USB drivers. See usb.h for more

				     information on this.

				  </para>
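				  <para>
				     Whichever macro builds the entries, the table ends with an empty
				     terminating entry. The following user-space sketch shows roughly why
				     that terminator matters when a table is scanned. It is a simplified
				     model, not the USB core's matching code: struct demo_device_id,
				     demo_match_id and the IDs are hypothetical.
				  </para>

				  <programlisting><![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct usb_device_id: vendor and product only. */
struct demo_device_id {
	uint16_t vendor;
	uint16_t product;
};

/* Hypothetical IDs, for illustration only. */
static const struct demo_device_id demo_id_table[] = {
	{ 0xfff0, 0xfff0 },
	{ 0x1234, 0x5678 },
	{ 0, 0 }                /* terminating entry */
};

/*
 * Walk the table until the all-zero terminator. Return the matching
 * entry, or NULL when the device is not listed.
 */
static const struct demo_device_id *
demo_match_id(const struct demo_device_id *table, uint16_t vendor,
	      uint16_t product)
{
	for (; table->vendor || table->product; table++) {
		if (table->vendor == vendor && table->product == product)
			return table;
	}
	return NULL;
}
```
				  ]]></programlisting>

				  <para>
				     Without the empty entry the loop would have no way to know where the
				     table stops, which is why it must not be left out.
				  </para>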

				  </chapter>

				  <chapter id="device">

				      <title>Device operation</title>

				  <para>

				     When a device is plugged into the USB bus that matches the device ID

				     pattern that your driver registered with the USB core, the probe function

				     is called. A pointer to the usb_interface structure and the matching
				     entry from the device ID table are passed to the function:

				  </para>

				  <programlisting>

				static int skel_probe(struct usb_interface *interface,

				    const struct usb_device_id *id)

				  </programlisting>

				  <para>

				     The driver now needs to verify that this device is actually one that it

				     can accept. If so, it returns 0.

				     If not, or if any error occurs during initialization, an error code

				     (such as <literal>-ENOMEM</literal> or <literal>-ENODEV</literal>)

				     is returned from the probe function.

				  </para>

				  <para>

				     In the skeleton driver, we determine which endpoints are marked as bulk-in

				     and bulk-out. We create buffers to hold the data that will be sent and

				     received from the device, and a USB urb to write data to the device is

				     initialized.

				  </para>

				  <para>

				     Conversely, when the device is removed from the USB bus, the disconnect
				     function is called with the interface pointer. The driver needs to clean
				     up any private data that has been allocated at this time and to shut
				     down any pending urbs that are in the USB system.

				  </para>

				  <para>

				     Now that the device is plugged into the system and the driver is bound to

				     the device, any of the functions in the file_operations structure that

				     were passed to the USB subsystem will be called from a user program trying

				     to talk to the device. The first function called will be open, as the

				     program tries to open the device for I/O. We increment our private usage

				     count and save a pointer to our internal structure in the file

				     structure. This is done so that future calls to file operations will

				     enable the driver to determine which device the user is addressing.  All

				     of this is done with the following code:

				  </para>

				  <programlisting>

				/* increment our usage count for the device */
				++skel->open_count;

				/* save our object in the file's private structure */
				file->private_data = skel;

				  </programlisting>

				  <para>

				     After the open function is called, the read and write functions are called

				     to receive and send data to the device. In the skel_write function, we

				     receive a pointer to some data that the user wants to send to the device

				     and the size of the data. The function determines how much data it can

				     send to the device based on the size of the write urb it has created (this

				     size depends on the size of the bulk-out endpoint that the device has).

				     Then it copies the data from user space to kernel space, points the urb to

				     the data and submits the urb to the USB subsystem.  This can be seen in

				     the following code:

				  </para>

				  <programlisting>

				/* we can only write as much as 1 urb will hold */
				bytes_written = (count > skel->bulk_out_size) ? skel->bulk_out_size : count;

				/* copy the data from user space into our urb */
				if (copy_from_user(skel->write_urb->transfer_buffer, buffer, bytes_written))
				        return -EFAULT;

				/* set up our urb */
				usb_fill_bulk_urb(skel->write_urb,
				                  skel->dev,
				                  usb_sndbulkpipe(skel->dev, skel->bulk_out_endpointAddr),
				                  skel->write_urb->transfer_buffer,
				                  bytes_written,
				                  skel_write_bulk_callback,
				                  skel);

				/* send the data out the bulk port */
				result = usb_submit_urb(skel->write_urb, GFP_KERNEL);
				if (result) {
				        pr_err(&quot;Failed submitting write urb, error %d\n&quot;, result);
				}

				  </programlisting>
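				  <para>
				     Because a single call sends at most one urb's worth of data, a user
				     program should be prepared for a write to accept fewer bytes than it
				     asked for. The following is a minimal user-space sketch of a caller
				     coping with that, assuming a POSIX environment and assuming the driver
				     reports the clamped byte count back to the caller; write_all is a
				     hypothetical helper, not part of the skeleton driver.
				  </para>

				  <programlisting><![CDATA[
```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: keep calling write() until all 'len' bytes have
 * been accepted or an error occurs. Returns 0 on success and -1 on
 * error, with errno describing the failure.
 */
static int write_all(int fd, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;       /* interrupted, try again */
			return -1;
		}
		if (n == 0) {
			errno = EIO;            /* no progress, avoid spinning */
			return -1;
		}
		p += n;                         /* skip what was accepted */
		len -= (size_t)n;
	}
	return 0;
}
```
				  ]]></programlisting>

				  <para>
				     A caller would open the device node, pass the descriptor to
				     write_all together with the buffer, and treat a return value of -1
				     as a failed transfer.
				  </para>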

				  <para>

				     When the write urb is filled up with the proper information using the

				     usb_fill_bulk_urb function, we point the urb's completion callback to call our

				     own skel_write_bulk_callback function. This function is called when the

				     urb is finished by the USB subsystem. The callback function is called in

				     interrupt context, so caution must be taken not to do very much processing

				     at that time. Our implementation of skel_write_bulk_callback merely

				     reports if the urb was completed successfully or not and then returns.

				  </para>

				  <para>

				     The read function works a bit differently from the write function in that

				     we do not use an urb to transfer data from the device to the driver.

				     Instead we call the usb_bulk_msg function, which can be used to send or

				     receive data from a device without having to create urbs and handle

				     urb completion callback functions. We call the usb_bulk_msg function,

				     giving it a buffer into which to place any data received from the device

				     and a timeout value. If the timeout period expires without receiving any

				     data from the device, the function will fail and return an error code.

				     This can be shown with the following code:

				  </para>

				  <programlisting>

				/* do an immediate bulk read to get data from the device */

				retval = usb_bulk_msg (skel->dev,
				                       usb_rcvbulkpipe (skel->dev,
				                                        skel->bulk_in_endpointAddr),
				                       skel->bulk_in_buffer,
				                       skel->bulk_in_size,
				                       &amp;count,
				                       10000);  /* timeout in milliseconds */

				/* if the read was successful, copy the data to user space */

				if (!retval) {

				        if (copy_to_user (buffer, skel->bulk_in_buffer, count))

				                retval = -EFAULT;

				        else

				                retval = count;

				}

				  </programlisting>

				  <para>

				     The usb_bulk_msg function can be very useful for doing single reads or

				     writes to a device; however, if you need to read or write constantly to a

				     device, it is recommended to set up your own urbs and submit them to the

				     USB subsystem.

				  </para>

				  <para>

				     When the user program releases the file handle that it has been using to

				     talk to the device, the release function in the driver is called. In this

				     function we decrement our private usage count and wait for possible

				     pending writes:

				  </para>

				  <programlisting>

				/* decrement our usage count for the device */

				--skel->open_count;

				  </programlisting>

				  <para>

				     One of the more difficult problems that USB drivers must be able to handle

				     smoothly is the fact that the USB device may be removed from the system at

				     any point in time, even if a program is currently talking to it. The driver needs

				     to be able to shut down any current reads and writes and notify the

				     user-space programs that the device is no longer there. The following

				     code (function <function>skel_delete</function>)

				     is an example of how to do this: </para>

				  <programlisting>

				static inline void skel_delete (struct usb_skel *dev)

				{

				    kfree (dev->bulk_in_buffer);

				    if (dev->bulk_out_buffer != NULL)

				        usb_free_coherent (dev->udev, dev->bulk_out_size,

				            dev->bulk_out_buffer,

				            dev->write_urb->transfer_dma);

				    usb_free_urb (dev->write_urb);

				    kfree (dev);

				}

				  </programlisting>

				  <para>

				     If a program currently has an open handle to the device, we reset the flag
				     <literal>device_present</literal>. For
				     every read, write, release and other function that expects a device to be
				     present, the driver first checks this flag to see if the device is
				     still present. If not, it realizes that the device has disappeared, and a
				     -ENODEV error is returned to the user-space program. When the release
				     function is eventually called, it determines whether the device is gone
				     and, if so, it does the cleanup that the skel_disconnect
				     function normally does if there are no open files on the device (see
				     Listing 5).

				  </para>
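				  <para>
				     The interplay between the flag, the usage count and the deferred
				     cleanup can be modelled in a few lines. The following is a simplified
				     user-space sketch with hypothetical names (struct demo_skel, demo_io,
				     demo_disconnect, demo_release). It leaves out the locking that a real
				     driver needs around these fields.
				  </para>

				  <programlisting><![CDATA[
```c
#include <errno.h>
#include <stdbool.h>

/* Simplified stand-in for the driver's private structure. */
struct demo_skel {
	bool device_present;    /* cleared by the disconnect path */
	int  open_count;        /* number of open file handles */
};

/* Hypothetical I/O entry point: refuse to touch hardware that is gone. */
static int demo_io(struct demo_skel *skel)
{
	if (!skel->device_present)
		return -ENODEV;
	/* ... talk to the device here ... */
	return 0;
}

/*
 * Disconnect: mark the device as gone. Returns true when nobody has the
 * device open, so the caller may free the structure right away, and
 * false when the cleanup has to wait for the last release.
 */
static bool demo_disconnect(struct demo_skel *skel)
{
	skel->device_present = false;
	return skel->open_count == 0;
}

/*
 * Release: drop one handle. Returns true when this was the last handle
 * on a vanished device, meaning the deferred cleanup should run now.
 */
static bool demo_release(struct demo_skel *skel)
{
	--skel->open_count;
	return !skel->device_present && skel->open_count == 0;
}
```
				  ]]></programlisting>

				  <para>
				     In this model exactly one of the two paths ends up responsible for
				     freeing the structure: disconnect when no handle is open, release
				     otherwise.
				  </para>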

				  </chapter>

				  <chapter id="iso">

				      <title>Isochronous Data</title>

				  <para>

				     This usb-skeleton driver does not have any examples of interrupt or

				     isochronous data being sent to or from the device. Interrupt data is sent

				     almost exactly as bulk data is, with a few minor exceptions.  Isochronous

				     data works differently with continuous streams of data being sent to or

				     from the device. The audio and video camera drivers are very good examples

				     of drivers that handle isochronous data and will be useful if you also

				     need to do this.

				  </para>

				  </chapter>

				  <chapter id="Conclusion">

				      <title>Conclusion</title>

				  <para>

				     Writing Linux USB device drivers is not a difficult task as the

				     usb-skeleton driver shows. This driver, combined with the other current

				     USB drivers, should provide enough examples to help a beginning author

				     create a working driver in a minimal amount of time. The linux-usb-devel

				     mailing list archives also contain a lot of helpful information.

				  </para>

				  </chapter>

				  <chapter id="resources">

				      <title>Resources</title>

				  <para>

				     The Linux USB Project: <ulink url="http://www.linux-usb.org">http://www.linux-usb.org/</ulink>

				  </para>

				  <para>

				     Linux Hotplug Project: <ulink url="http://linux-hotplug.sourceforge.net">http://linux-hotplug.sourceforge.net/</ulink>

				  </para>

				  <para>

				     Linux USB Working Devices List: <ulink url="http://www.qbik.ch/usb/devices">http://www.qbik.ch/usb/devices/</ulink>

				  </para>

				  <para>

				     linux-usb-devel Mailing List Archives: <ulink url="http://marc.theaimsgroup.com/?l=linux-usb-devel">http://marc.theaimsgroup.com/?l=linux-usb-devel</ulink>

				  </para>

				  <para>

				     Programming Guide for Linux USB Device Drivers: <ulink url="http://usb.cs.tum.edu/usbdoc">http://usb.cs.tum.edu/usbdoc</ulink>

				  </para>

				  <para>

				     USB Home Page: <ulink url="http://www.usb.org">http://www.usb.org</ulink>

				  </para>

				  </chapter>

				</book>

									

Documentation/DocBook/z8530book.tmpl
									
												
				@@ -0,0 +1,371 @@

				<?xml version="1.0" encoding="UTF-8"?>

				<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"

					"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>

				<book id="Z85230Guide">

				 <bookinfo>

				  <title>Z8530 Programming Guide</title>

				  <authorgroup>

				   <author>

				    <firstname>Alan</firstname>

				    <surname>Cox</surname>

				    <affiliation>

				     <address>

				      <email>alan@lxorguk.ukuu.org.uk</email>

				     </address>

				    </affiliation>

				   </author>

				  </authorgroup>

				  <copyright>

				   <year>2000</year>

				   <holder>Alan Cox</holder>

				  </copyright>

				  <legalnotice>

				   <para>

				     This documentation is free software; you can redistribute

				     it and/or modify it under the terms of the GNU General Public

				     License as published by the Free Software Foundation; either

				     version 2 of the License, or (at your option) any later

				     version.

				   </para>

				   <para>

				     This program is distributed in the hope that it will be

				     useful, but WITHOUT ANY WARRANTY; without even the implied

				     warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

				     See the GNU General Public License for more details.

				   </para>

				   <para>

				     You should have received a copy of the GNU General Public

				     License along with this program; if not, write to the Free

				     Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,

				     MA 02111-1307 USA

				   </para>

				   <para>

				     For more details see the file COPYING in the source

				     distribution of Linux.

				   </para>

				  </legalnotice>

				 </bookinfo>

				<toc></toc>

				  <chapter id="intro">

				      <title>Introduction</title>

				  <para>

					The Z85x30 family synchronous/asynchronous controller chips are

					used on a large number of cheap network interface cards. The

					kernel provides a core interface layer that is designed to make

					it easy to provide WAN services using this chip.

				  </para>

				  <para>

					The current driver only supports synchronous operation. Merging the
					asynchronous driver support into this code to allow any Z85x30
					device to be used as both a tty interface and as a synchronous
					controller is a project for Linux post the 2.4 release.

				  </para>

				  </chapter>

				  <chapter id="Driver_Modes">

				 	<title>Driver Modes</title>

				  <para>

					The Z85230 driver layer can drive Z8530, Z85C30 and Z85230 devices

					in three different modes. Each mode can be applied to an individual

					channel on the chip (each chip has two channels).

				  </para>

				  <para>

					The PIO synchronous mode supports the most common Z8530 wiring. Here
					the chip is interfaced to the I/O and interrupt facilities of the
					host machine but not to the DMA subsystem. When running PIO the
					Z8530 has extremely tight timing requirements. Running at high speeds,
					even with a Z85230, will be tricky. Typically you should expect to
					achieve at best 9600 baud with a Z85C30 and 64Kbits with a Z85230.

				  </para>

				  <para>

					The DMA mode supports the chip when it is configured to use dual DMA

					channels on an ISA bus. The better cards tend to support this mode

					of operation for a single channel. With DMA running the Z85230 tops

					out when it starts to hit ISA DMA constraints at about 512Kbits. It

					is worth noting here that many PC machines hang or crash when the

					chip is driven fast enough to hold the ISA bus solid.

				  </para>

				  <para>

					Transmit DMA mode uses a single DMA channel. The DMA channel is used

					for transmission as the transmit FIFO is smaller than the receive

					FIFO. It gives better performance than pure PIO mode but is nowhere

					near as ideal as pure DMA mode. 

				  </para>

				  </chapter>

				  <chapter id="Using_the_Z85230_driver">

				 	<title>Using the Z85230 driver</title>

				  <para>

					The Z85230 driver provides the back end interface to your board. To

					configure a Z8530 interface you need to detect the board and to 

					identify its ports and interrupt resources. It is also your problem

					to verify the resources are available.

				  </para>

				  <para>

					Having identified the chip you need to fill in a struct z8530_dev,

					which describes each chip. This object must exist until you finally

					shut down the board. First, zero the active field. This ensures

					nothing goes off without you intending it. The irq field should

					be set to the interrupt number of the chip. (Each chip has a single

					interrupt source rather than each channel). You are responsible

					for allocating the interrupt line. The interrupt handler should be

					set to <function>z8530_interrupt</function>. The device id should

					be set to the z8530_dev structure pointer. Whether the interrupt can

					be shared or not is board dependent, and up to you to initialise.

				  </para>

				  <para>

					The structure holds two channel structures. 

					Initialise chanA.ctrlio and chanA.dataio with the address of the

					control and data ports. You can OR this with Z8530_PORT_SLEEP to
					indicate your interface needs the 5 microsecond delay for chip
					settling done in software. The PORT_SLEEP option is architecture
					specific. Other flags may become available on future platforms,
					e.g. for MMIO.

					Initialise the chanA.irqs to &amp;z8530_nop to start the chip up

					as disabled and discarding interrupt events. This ensures that

					stray interrupts will be mopped up and not hang the bus. Set

					chanA.dev to point to the device structure itself. The

					private and name field you may use as you wish. The private field

					is unused by the Z85230 layer. The name is used for error reporting

					and it may thus make sense to make it match the network name.

				  </para>

				  <para>

					Repeat the same operation with the B channel if your chip has

					both channels wired to something useful. This isn't always the

					case. If it is not wired then the I/O values do not matter, but

					you must initialise chanB.dev.

				  </para>

				  <para>

					If your board has DMA facilities then initialise the txdma and

					rxdma fields for the relevant channels. You must also allocate the

					ISA DMA channels and do any necessary board level initialisation

					to configure them. The low level driver will do the Z8530 and

					DMA controller programming but not board specific magic.

				  </para>

				  <para>

					Having initialised the device you can then call

					<function>z8530_init</function>. This will probe the chip and 

					reset it into a known state. An identification sequence is then

					run to identify the chip type. If the checks fail to pass the

					function returns a non-zero error code. Typically this indicates

					that the port given is not valid. After this call the

					type field of the z8530_dev structure is initialised to either

					Z8530, Z85C30 or Z85230 according to the chip found.

				  </para>

				  <para>

					Once you have called z8530_init you can also make use of the utility

					function <function>z8530_describe</function>. This provides a 

					consistent reporting format for the Z8530 devices, and allows all

					the drivers to provide consistent reporting.

				  </para>

				  </chapter>

				  <chapter id="Attaching_Network_Interfaces">

				 	<title>Attaching Network Interfaces</title>

				  <para>

					If you wish to use the network interface facilities of the driver,

					then you need to attach a network device to each channel that is

					present and in use. In addition, to use the generic HDLC layer
					you need to follow some additional plumbing rules. They may seem

					complex but a look at the example hostess_sv11 driver should

					reassure you.

				  </para>

				  <para>

					The network device used for each channel should be pointed to by

					the netdevice field of each channel. The hdlc-&gt;priv field of the

					network device points to your private data - you will need to be

					able to find your private data from this.

				  </para>

				  <para>

					The way most drivers approach this particular problem is to

					create a structure holding the Z8530 device definition and

					put that into the private field of the network device. The

					network device fields of the channels then point back to the

					network devices.

				  </para>

				  <para>

					If you wish to use the generic HDLC then you need to register

					the HDLC device.

				  </para>

				  <para>

					Before you register your network device you will also need to

					provide suitable handlers for most of the network device callbacks. 

					See the network device documentation for more details on this.

				  </para>

				  </chapter>

				  <chapter id="Configuring_And_Activating_The_Port">

				 	<title>Configuring And Activating The Port</title>

				  <para>

					The Z85230 driver provides helper functions and tables to load the

					port registers on the Z8530 chips. When programming the register

					settings for a channel be aware that the documentation recommends

					initialisation orders. Strange things happen when these are not

					followed. 

				  </para>

				  <para>

					<function>z8530_channel_load</function> takes an array of u8

					values holding pairs of initialisation values. The first

					value is the Z8530 register number. Add 16 to indicate the alternate

					register bank on the later chips. The array is terminated by a 255.

				  </para>

				  <para>

					The driver provides a pair of public tables. The

					z8530_hdlc_kilostream table is for the UK 'Kilostream' service and

					also happens to cover most other end host configurations. The

					z8530_hdlc_kilostream_85230 table is the same configuration using

					the enhancements of the 85230 chip. The configuration loaded is

					standard NRZ encoded synchronous data with HDLC bitstuffing. All

					of the timing is taken from the other end of the link.

				  </para>

				  <para>

					When writing your own tables be aware that the driver internally

					tracks register values. It may need to reload values. You should

					therefore be sure to set registers 1-7, 9-11, 14 and 15 in all

					configurations. Where the register settings depend on DMA selection

					the driver will update the bits itself when you open or close.

					Loading a new table with the interface open is not recommended.

				  </para>

				  <para>

					There are three standard configurations supported by the core

					code. In PIO mode the interface is programmed up to use

					interrupt driven PIO. This places high demands on the host processor

					to avoid latency. The driver is written to take account of latency

					issues but it cannot avoid latencies caused by other drivers,

					notably IDE in PIO mode. Because the drivers allocate buffers you

					must also prevent MTU changes while the port is open.

				  </para>

				  <para>

					Once the port is open it will call the rx_function of each channel

					whenever a completed packet arrives. This is invoked from

					interrupt context and passes you the channel and a network	

					buffer (struct sk_buff) holding the data. The data includes

					the CRC bytes so most users will want to trim the last two

					bytes before processing the data. This function is very timing

					critical. When you wish to simply discard data the support

					code provides the function <function>z8530_null_rx</function>

					to discard the data.

				  </para>

				  <para>

					To activate PIO mode sending and receiving, the <function>

					z8530_sync_open</function> is called. This expects to be passed

					the network device and the channel. Typically this is called from

					your network device open callback. On a failure a non zero error

					status is returned. The <function>z8530_sync_close</function> 

					function shuts down a PIO channel. This must be done before the 

					channel is opened again	and before the driver shuts down 

					and unloads.

				  </para>

				  <para>

					The ideal mode of operation is dual channel DMA mode. Here the

					kernel driver will configure the board for DMA in both directions.

					The driver also handles ISA DMA issues such as controller

					programming and the memory range limit for you. This mode is

					activated by calling the <function>z8530_sync_dma_open</function>

					function. On failure a non zero error value is returned.

					Once this mode is activated it can be shut down by calling the

					<function>z8530_sync_dma_close</function>. You must call the close

					function matching the open mode you used.

				  </para>

				  <para>

					The final supported mode uses a single DMA channel to drive the

					transmit side. As the Z85C30 has a larger FIFO on the receive

					channel	this tends to increase the maximum speed a little. 

					This is activated by calling the <function>z8530_sync_txdma_open

					</function>. This returns a non zero error code on failure. The

					<function>z8530_sync_txdma_close</function> function closes down

					the Z8530 interface from this mode.

				  </para>

				  </chapter>

				  <chapter id="Network_Layer_Functions">

				 	<title>Network Layer Functions</title>

				  <para>

					The Z8530 layer provides functions to queue packets for

					transmission. The driver internally buffers the frame currently

					being transmitted and one further frame (in order to keep back

					to back transmission running). Any further buffering is up to

					the caller.

				  </para>

				  <para>

					The function <function>z8530_queue_xmit</function> takes a network

					buffer in sk_buff format and queues it for transmission. The

					caller must provide the entire packet with the exception of the

					bitstuffing and CRC. This is normally done by the caller via

					the generic HDLC interface layer. It returns 0 if the buffer has been

					queued and non zero values for queue full. If the function accepts

					the buffer it becomes the property of the Z8530 layer and the caller

					should not free it.

				  </para>

				  <para>

					The function <function>z8530_get_stats</function> returns a pointer

					to an internally maintained per interface statistics block. This

					provides most of the interface code needed to implement the network

					layer get_stats callback.

				  </para>

				  </chapter>

				  <chapter id="Porting_The_Z8530_Driver">

				     <title>Porting The Z8530 Driver</title>

				  <para>

					The Z8530 driver is written to be portable. In DMA mode it makes

					assumptions about the use of ISA DMA. These are probably warranted

					in most cases as the Z85230 in particular was designed to glue to PC

					type machines. The PIO mode makes no real assumptions.

				  </para>

				  <para>

					Should you need to retarget the Z8530 driver to another architecture

					the only code that should need changing is the port I/O functions.

					At the moment these assume PC I/O port accesses. This may not be

					appropriate for all platforms. Replacing 

					<function>z8530_read_port</function> and <function>z8530_write_port

					</function> is intended to be all that is required to port this

					driver layer.

				  </para>

				  </chapter>

				  <chapter id="bugs">

				     <title>Known Bugs And Assumptions</title>

				  <para>

				  <variablelist>

				    <varlistentry><term>Interrupt Locking</term>

				    <listitem>

				    <para>

					The locking in the driver is done via the global cli/sti lock. This

					makes for relatively poor SMP performance. Switching this to use a

					per device spin lock would probably materially improve performance.

				    </para>

				    </listitem></varlistentry>

				    <varlistentry><term>Occasional Failures</term>

				    <listitem>

				    <para>

					We have reports of occasional failures when run for very long

					periods of time and the driver starts to receive junk frames. At

					the moment the cause of this is not clear.

				    </para>

				    </listitem></varlistentry>

				  </variablelist>

				  </para>

				  </chapter>

				  <chapter id="pubfunctions">

				     <title>Public Functions Provided</title>

				!Edrivers/net/wan/z85230.c

				  </chapter>

				  <chapter id="intfunctions">

				     <title>Internal Functions</title>

				!Idrivers/net/wan/z85230.c

				  </chapter>

				</book>

									
Documentation/EDID/edid.S
												
				@@ -59,9 +59,9 @@

				/* Fixed header pattern */

				header:		.byte	0x00,0xff,0xff,0xff,0xff,0xff,0xff,0x00

				mfg_id:		.hword	swap16(mfgname2id(MFG_LNX1, MFG_LNX2, MFG_LNX3))

				mfg_id:		.word	swap16(mfgname2id(MFG_LNX1, MFG_LNX2, MFG_LNX3))

				prod_code:	.hword	0

				prod_code:	.word	0

				/* Serial number. 32 bits, little endian. */

				serial_number:	.long	SERIAL

				@@ -177,7 +177,7 @@ std_vres:	.byte	(XY_RATIO<<6)+VFREQ-60

				descriptor1:

				/* Pixel clock in 10 kHz units. (0.-655.35 MHz, little-endian) */

				clock:		.hword	CLOCK/10

				clock:		.word	CLOCK/10

				/* Horizontal active pixels 8 lsbits (0-4095) */

				x_act_lsb:	.byte	XPIX&0xff

Documentation/IPMI.txt

@@ -1,8 +1,9 @@
 =====================
 The Linux IPMI Driver
 =====================
 :Author: Corey Minyard <minyard@mvista.com> / <minyard@acm.org>
                           The Linux IPMI Driver
 			  ---------------------
 			      Corey Minyard
 			  <minyard@mvista.com>
 			    <minyard@acm.org>
 The Intelligent Platform Management Interface, or IPMI, is a
 standard for controlling intelligent devices that monitor a system.
@@ -140,7 +141,7 @@ Addressing
 ----------
 The IPMI addressing works much like IP addresses, you have an overlay
 to handle the different address types.  The overlay is::
 to handle the different address types.  The overlay is:
   struct ipmi_addr
   {
@@ -152,7 +153,7 @@ to handle the different address types.  The overlay is::
 The addr_type determines what the address really is.  The driver
 currently understands two different types of addresses.
 "System Interface" addresses are defined as::
 "System Interface" addresses are defined as:
   struct ipmi_system_interface_addr
   {
@@ -165,7 +166,7 @@ straight to the BMC on the current card.  The channel must be
 IPMI_BMC_CHANNEL.
 Messages that are destined to go out on the IPMB bus use the
 IPMI_IPMB_ADDR_TYPE address type.  The format is::
 IPMI_IPMB_ADDR_TYPE address type.  The format is
   struct ipmi_ipmb_addr
   {
@@ -183,16 +184,16 @@ spec.
 Messages
 --------
 Messages are defined as::
 Messages are defined as:
   struct ipmi_msg
   {
 struct ipmi_msg
 {
 	unsigned char netfn;
 	unsigned char lun;
 	unsigned char cmd;
 	unsigned char *data;
 	int           data_len;
   };
 };
 The driver takes care of adding/stripping the header information.  The
 data portion is just the data to be sent (do NOT put addressing info
@@ -207,7 +208,7 @@ block of data, even when receiving messages.  Otherwise the driver
 will have no place to put the message.
 Messages coming up from the message handler in kernelland will come in
 as::
 as:
   struct ipmi_recv_msg
   {
@@ -245,7 +246,6 @@ and the user should not have to care what type of SMI is below them.
 Watching For Interfaces
 ^^^^^^^^^^^^^^^^^^^^^^^
 When your code comes up, the IPMI driver may or may not have detected
 if IPMI devices exist.  So you might have to defer your setup until
@@ -256,7 +256,6 @@ and tell you when they come and go.
 Creating the User
 ^^^^^^^^^^^^^^^^^
 To use the message handler, you must first create a user using
 ipmi_create_user.  The interface number specifies which SMI you want
@@ -273,7 +272,6 @@ closing the device automatically destroys the user.
 Messaging
 ^^^^^^^^^
 To send a message from kernel-land, the ipmi_request_settime() call does
 pretty much all message handling.  Most of the parameters are
@@ -323,7 +321,6 @@ though, since it is tricky to manage your own buffers.
 Events and Incoming Commands
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 The driver takes care of polling for IPMI events and receiving
 commands (commands are messages that are not responses, they are
@@ -370,7 +367,7 @@ in the system.  It discovers interfaces through a host of different
 methods, depending on the system.
 You can specify up to four interfaces on the module load line and
 control some module parameters::
 control some module parameters:
   modprobe ipmi_si.o type=<type1>,<type2>....
        ports=<port1>,<port2>... addrs=<addr1>,<addr2>...
@@ -440,7 +437,7 @@ default is one.  Setting to 0 is useful with the hotmod, but is
 obviously only useful for modules.
 When compiled into the kernel, the parameters can be specified on the
 kernel command line as::
 kernel command line as:
   ipmi_si.type=<type1>,<type2>...
        ipmi_si.ports=<port1>,<port2>... ipmi_si.addrs=<addr1>,<addr2>...
@@ -477,22 +474,16 @@ The driver supports a hot add and remove of interfaces.  This way,
 interfaces can be added or removed after the kernel is up and running.
 This is done using /sys/module/ipmi_si/parameters/hotmod, which is a
 write-only parameter.  You write a string to this interface.  The string
 has the format::
 has the format:
    <op1>[:op2[:op3...]]
 The "op"s are::
 The "op"s are:
    add|remove,kcs|bt|smic,mem|i/o,<address>[,<opt1>[,<opt2>[,...]]]
 You can specify more than one interface on the line.  The "opt"s are::
 You can specify more than one interface on the line.  The "opt"s are:
    rsp=<regspacing>
    rsi=<regsize>
    rsh=<regshift>
    irq=<irq>
    ipmb=<ipmb slave addr>
 and these have the same meanings as discussed above.  Note that you
 can also use this on the kernel command line for a more compact format
 for specifying an interface.  Note that when removing an interface,
@@ -505,7 +496,7 @@ The SMBus Driver (SSIF)
 The SMBus driver allows up to 4 SMBus devices to be configured in the
 system.  By default, the driver will only register with something it
 finds in DMI or ACPI tables.  You can change this
 at module load time (for a module) with::
 at module load time (for a module) with:
   modprobe ipmi_ssif.o
 	addr=<i2caddr1>[,<i2caddr2>[,...]]
@@ -544,7 +535,7 @@ the smb_addr parameter unless you have DMI or ACPI data to tell the
 driver what to use.
 When compiled into the kernel, the addresses can be specified on the
 kernel command line as::
 kernel command line as:
   ipmi_ssif.addr=<i2caddr1>[,<i2caddr2>[...]]
 	ipmi_ssif.adapter=<adapter1>[,<adapter2>[...]]
@@ -574,9 +565,9 @@ Some users need more detailed information about a device, like where
 the address came from or the raw base device for the IPMI interface.
 You can use the IPMI smi_watcher to catch the IPMI interfaces as they
 come or go, and to grab the information, you can use the function
 ipmi_get_smi_info(), which returns the following structure::
 ipmi_get_smi_info(), which returns the following structure:
   struct ipmi_smi_info {
 struct ipmi_smi_info {
 	enum ipmi_addr_src addr_src;
 	struct device *dev;
 	union {
@@ -584,7 +575,7 @@ ipmi_get_smi_info(), which returns the following structure::
 			void *acpi_handle;
 		} acpi_info;
 	} addr_info;
   };
 };
 Currently special info is returned only for SI_ACPI address sources.
 Others may be added as necessary.
@@ -599,7 +590,7 @@ Watchdog
 A watchdog timer is provided that implements the Linux-standard
 watchdog timer interface.  It has three module parameters that can be
 used to control it::
 used to control it:
   modprobe ipmi_watchdog timeout=<t> pretimeout=<t> action=<action type>
       preaction=<preaction type> preop=<preop type> start_now=x
@@ -644,7 +635,7 @@ watchdog device is closed.  The default value of nowayout is true
 if the CONFIG_WATCHDOG_NOWAYOUT option is enabled, or false if not.
 When compiled into the kernel, the kernel command line is available
 for configuring the watchdog::
 for configuring the watchdog:
   ipmi_watchdog.timeout=<t> ipmi_watchdog.pretimeout=<t>
 	ipmi_watchdog.action=<action type>
@@ -684,7 +675,6 @@ also get a bunch of OEM events holding the panic string.
 The field settings of the events are:
 * Generator ID: 0x21 (kernel)
 * EvM Rev: 0x03 (this event is formatted in IPMI 1.0 format)
 * Sensor Type: 0x20 (OS critical stop sensor)
@@ -693,20 +683,18 @@ The field settings of the events are:
 * Event Data 1: 0xa1 (Runtime stop in OEM bytes 2 and 3)
 * Event data 2: second byte of panic string
 * Event data 3: third byte of panic string
 See the IPMI spec for the details of the event layout.  This event is
 always sent to the local management controller.  It will handle routing
 the message to the right place.
 Other OEM events have the following format:
 * Record ID (bytes 0-1): Set by the SEL.
 * Record type (byte 2): 0xf0 (OEM non-timestamped)
 * byte 3: The slave address of the card saving the panic
 * byte 4: A sequence number (starting at zero)
   The rest of the bytes (11 bytes) are the panic string.  If the panic string
   is longer than 11 bytes, multiple messages will be sent with increasing
   sequence numbers.
 Record ID (bytes 0-1): Set by the SEL.
 Record type (byte 2): 0xf0 (OEM non-timestamped)
 byte 3: The slave address of the card saving the panic
 byte 4: A sequence number (starting at zero)
 The rest of the bytes (11 bytes) are the panic string.  If the panic string
 is longer than 11 bytes, multiple messages will be sent with increasing
 sequence numbers.
 Because you cannot send OEM events using the standard interface, this
 function will attempt to find an SEL and add the events there.  It

Documentation/IRQ-affinity.txt

@@ -1,11 +1,8 @@
 ================
 SMP IRQ affinity
 ================
 ChangeLog:
 	- Started by Ingo Molnar <mingo@redhat.com>
 	- Update by Max Krasnyansky <maxk@qualcomm.com>
 	Started by Ingo Molnar <mingo@redhat.com>
 	Update by Max Krasnyansky <maxk@qualcomm.com>
 SMP IRQ affinity
 /proc/irq/IRQ#/smp_affinity and /proc/irq/IRQ#/smp_affinity_list specify
 which target CPUs are permitted for a given IRQ source.  It's a bitmask
@@ -19,52 +16,50 @@ will be set to the default mask. It can then be changed as described above.
 Default mask is 0xffffffff.
 Here is an example of restricting IRQ44 (eth1) to CPU0-3 then restricting
 it to CPU4-7 (this is an 8-CPU SMP box)::
 it to CPU4-7 (this is an 8-CPU SMP box):
 	[root@moon 44]# cd /proc/irq/44
 	[root@moon 44]# cat smp_affinity
 	ffffffff
 [root@moon 44]# cd /proc/irq/44
 [root@moon 44]# cat smp_affinity
 ffffffff
 	[root@moon 44]# echo 0f > smp_affinity
 	[root@moon 44]# cat smp_affinity
 	0000000f
 	[root@moon 44]# ping -f h
 	PING hell (195.4.7.3): 56 data bytes
 	...
 	--- hell ping statistics ---
 packets transmitted, 6027 packets received, 0% packet loss
 	round-trip min/avg/max = 0.1/0.1/0.4 ms
 	[root@moon 44]# cat /proc/interrupts | grep 'CPU\|44:'
 		CPU0       CPU1       CPU2       CPU3      CPU4       CPU5        CPU6       CPU7
 	44:       1068       1785       1785       1783         0          0           0         0    IO-APIC-level  eth1
 [root@moon 44]# echo 0f > smp_affinity
 [root@moon 44]# cat smp_affinity
 0000000f
 [root@moon 44]# ping -f h
 PING hell (195.4.7.3): 56 data bytes
 ...
 --- hell ping statistics ---
 packets transmitted, 6027 packets received, 0% packet loss
 round-trip min/avg/max = 0.1/0.1/0.4 ms
 [root@moon 44]# cat /proc/interrupts | grep 'CPU\|44:'
            CPU0       CPU1       CPU2       CPU3      CPU4       CPU5        CPU6       CPU7
 44:       1068       1785       1785       1783         0          0           0         0    IO-APIC-level  eth1
 As can be seen from the line above IRQ44 was delivered only to the first four
 processors (0-3).
 Now let's restrict that IRQ to CPU(4-7).
 ::
 	[root@moon 44]# echo f0 > smp_affinity
 	[root@moon 44]# cat smp_affinity
 f0
 	[root@moon 44]# ping -f h
 	PING hell (195.4.7.3): 56 data bytes
 	..
 	--- hell ping statistics ---
 packets transmitted, 2777 packets received, 0% packet loss
 	round-trip min/avg/max = 0.1/0.5/585.4 ms
 	[root@moon 44]# cat /proc/interrupts | grep 'CPU\|44:'
 		CPU0       CPU1       CPU2       CPU3      CPU4       CPU5        CPU6       CPU7
 	44:       1068       1785       1785       1783      1784       1069        1070       1069   IO-APIC-level  eth1
 [root@moon 44]# echo f0 > smp_affinity
 [root@moon 44]# cat smp_affinity
 f0
 [root@moon 44]# ping -f h
 PING hell (195.4.7.3): 56 data bytes
 ..
 --- hell ping statistics ---
 packets transmitted, 2777 packets received, 0% packet loss
 round-trip min/avg/max = 0.1/0.5/585.4 ms
 [root@moon 44]# cat /proc/interrupts | grep 'CPU\|44:'
            CPU0       CPU1       CPU2       CPU3      CPU4       CPU5        CPU6       CPU7
 44:       1068       1785       1785       1783      1784       1069        1070       1069   IO-APIC-level  eth1
 This time around IRQ44 was delivered only to the last four processors.
 i.e. the counters for CPU0-3 did not change.
 Here is an example of limiting that same irq (44) to cpus 1024 to 1031::
 Here is an example of limiting that same irq (44) to cpus 1024 to 1031:
 	[root@moon 44]# echo 1024-1031 > smp_affinity_list
 	[root@moon 44]# cat smp_affinity_list
 	1024-1031
 [root@moon 44]# echo 1024-1031 > smp_affinity_list
 [root@moon 44]# cat smp_affinity_list
 1024-1031
 Note that to do this with a bitmask would require 32 bitmasks of zero
 to follow the pertinent one.

Documentation/IRQ-domain.txt

@@ -1,6 +1,4 @@
 ===============================================
 The irq_domain interrupt number mapping library
 ===============================================
 irq_domain interrupt number mapping library
 The current design of the Linux kernel uses a single large number
 space where each separate IRQ source is assigned a different number.
@@ -38,9 +36,7 @@ irq_domain also implements translation from an abstract irq_fwspec
 structure to hwirq numbers (Device Tree and ACPI GSI so far), and can
 be easily extended to support other IRQ topology data sources.
 irq_domain usage
 ================
 === irq_domain usage ===
 An interrupt controller driver creates and registers an irq_domain by
 calling one of the irq_domain_add_*() functions (each mapping method
 has a different allocator function, more on that later).  The function
@@ -66,21 +62,15 @@ If the driver has the Linux IRQ number or the irq_data pointer, and
 needs to know the associated hwirq number (such as in the irq_chip
 callbacks) then it can be directly obtained from irq_data->hwirq.
 Types of irq_domain mappings
 ============================
 === Types of irq_domain mappings ===
 There are several mechanisms available for reverse mapping from hwirq
 to Linux irq, and each mechanism uses a different allocation function.
 Which reverse map type should be used depends on the use case.  Each
 of the reverse map types are described below:
 Linear
 ------
 ::
 	irq_domain_add_linear()
 	irq_domain_create_linear()
 ==== Linear ====
 irq_domain_add_linear()
 irq_domain_create_linear()
 The linear reverse map maintains a fixed size table indexed by the
 hwirq number.  When a hwirq is mapped, an irq_desc is allocated for
@@ -99,13 +89,9 @@ accepts a more general abstraction 'struct fwnode_handle'.
 The majority of drivers should use the linear map.
 Tree
 ----
 ::
 	irq_domain_add_tree()
 	irq_domain_create_tree()
 ==== Tree ====
 irq_domain_add_tree()
 irq_domain_create_tree()
 The irq_domain maintains a radix tree map from hwirq numbers to Linux
 IRQs.  When an hwirq is mapped, an irq_desc is allocated and the
@@ -123,12 +109,8 @@ accepts a more general abstraction 'struct fwnode_handle'.
 Very few drivers should need this mapping.
 No Map
 ------
 ::
 	irq_domain_add_nomap()
 ==== No Map ====
 irq_domain_add_nomap()
 The No Map mapping is to be used when the hwirq number is
 programmable in the hardware.  In this case it is best to program the
@@ -139,14 +121,10 @@ Linux IRQ number into the hardware.
 Most drivers cannot use this mapping.
 Legacy
 ------
 ::
 	irq_domain_add_simple()
 	irq_domain_add_legacy()
 	irq_domain_add_legacy_isa()
 ==== Legacy ====
 irq_domain_add_simple()
 irq_domain_add_legacy()
 irq_domain_add_legacy_isa()
 The Legacy mapping is a special case for drivers that already have a
 range of irq_descs allocated for the hwirqs.  It is used when the
@@ -185,17 +163,14 @@ that the driver using the simple domain call irq_create_mapping()
 before any irq_find_mapping() since the latter will actually work
 for the static IRQ assignment case.
 Hierarchy IRQ domain
 --------------------
 ==== Hierarchy IRQ domain ====
 On some architectures, there may be multiple interrupt controllers
 involved in delivering an interrupt from the device to the target CPU.
 Let's look at a typical interrupt delivering path on x86 platforms::
 Let's look at a typical interrupt delivering path on x86 platforms:
   Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
 Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
 There are three interrupt controllers involved:
 1) IOAPIC controller
 2) Interrupt remapping controller
 3) Local APIC controller
@@ -205,8 +180,7 @@ hardware architecture, an irq_domain data structure is built for each
 interrupt controller and those irq_domains are organized into hierarchy.
 When building irq_domain hierarchy, the irq_domain near to the device is
 child and the irq_domain near to CPU is parent. So a hierarchy structure
 as below will be built for the example above::
 as below will be built for the example above.
 	CPU Vector irq_domain (root irq_domain to manage CPU vectors)
 		^
 		|
@@ -216,7 +190,6 @@ as below will be built for the example above::
 	IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
 There are four major interfaces to use hierarchy irq_domain:
 ) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
    controller related resources to deliver these interrupts.
 ) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller
@@ -226,8 +199,7 @@ There are four major interfaces to use hierarchy irq_domain:
 ) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
    to stop delivering the interrupt.
 Following changes are needed to support hierarchy irq_domain:
 Following changes are needed to support hierarchy irq_domain.
 ) a new field 'parent' is added to struct irq_domain; it's used to
    maintain irq_domain hierarchy information.
 ) a new field 'parent_data' is added to struct irq_data; it's used to
@@ -251,7 +223,6 @@ software architecture.
 For an interrupt controller driver to support hierarchy irq_domain, it
 needs to:
 ) Implement irq_domain_ops.alloc and irq_domain_ops.free
 ) Optionally implement irq_domain_ops.activate and
    irq_domain_ops.deactivate.
@@ -260,42 +231,5 @@ needs to:
 ) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
    they are unused with hierarchy irq_domain.
 Hierarchy irq_domain is in no way x86 specific, and is heavily used to
 support other architectures, such as ARM, ARM64 etc.
 === Debugging ===
 If you switch on CONFIG_IRQ_DOMAIN_DEBUG (which depends on
 CONFIG_IRQ_DOMAIN and CONFIG_DEBUG_FS), you will find a new file in
 your debugfs mount point, called irq_domain_mapping. This file
 contains a live snapshot of all the IRQ domains in the system:
  name              mapped  linear-max  direct-max  devtree-node
  pl061                  8           8           0  /smb/gpio@e0080000
  pl061                  8           8           0  /smb/gpio@e1050000
  pMSI                   0           0           0  /interrupt-controller@e1101000/v2m@e0080000
  MSI                   37           0           0  /interrupt-controller@e1101000/v2m@e0080000
  GICv2m                37           0           0  /interrupt-controller@e1101000/v2m@e0080000
  GICv2                448         448           0  /interrupt-controller@e1101000
 it also iterates over the interrupts to display their mapping in the
 domains, and makes the domain stacking visible:
 irq    hwirq    chip name        chip data           active  type            domain
 0x00019  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
 0x0001d  GICv2            0xffff00000916bfd8          LINEAR          GICv2
 0x0001e  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
 0x0001b  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
 0x0001a  GICv2            0xffff00000916bfd8          LINEAR          GICv2
 [...]
 0x81808  MSI              0x          (null)           RADIX          MSI
 + 0x00063  GICv2m           0xffff8003ee116980           RADIX          GICv2m
 + 0x00063  GICv2            0xffff00000916bfd8          LINEAR          GICv2
 0x08800  MSI              0x          (null)     *     RADIX          MSI
 + 0x00064  GICv2m           0xffff8003ee116980     *     RADIX          GICv2m
 + 0x00064  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
 Here, interrupts 1-5 are only using a single domain, while 96 and 97
 are built out of a stack of three domains, each level performing a
 particular function.
 Hierarchy irq_domain may also be used to support other architectures,
 such as ARM, ARM64 etc.

Documentation/IRQ.txt

@@ -1,6 +1,4 @@
 ===============
 What is an IRQ?
 ===============
 An IRQ is an interrupt request from a device.
 Currently they can come in over a pin, or over a packet.

Documentation/Intel-IOMMU.txt

@@ -1,4 +1,3 @@
 ===================
 Linux IOMMU Support
 ===================
@@ -10,11 +9,11 @@ This guide gives a quick cheat sheet for some basic understanding.
 Some Keywords
 - DMAR - DMA remapping
 - DRHD - DMA Remapping Hardware Unit Definition
 - RMRR - Reserved memory Region Reporting Structure
 - ZLR  - Zero length reads from PCI devices
 - IOVA - IO Virtual address.
 DMAR - DMA remapping
 DRHD - DMA Remapping Hardware Unit Definition
 RMRR - Reserved memory Region Reporting Structure
 ZLR  - Zero length reads from PCI devices
 IOVA - IO Virtual address.
 Basic stuff
 -----------
@@ -34,7 +33,7 @@ devices that need to access these regions. OS is expected to setup
 unity mappings for these regions for these devices to access these regions.
 How is IOVA generated?
 ----------------------
 ---------------------
 Well behaved drivers call pci_map_*() calls before sending command to device
 that needs to perform DMA. Once DMA is completed and mapping is no longer
@@ -83,14 +82,14 @@ in ACPI.
 ACPI: DMAR (v001 A M I  OEMDMAR  0x00000001 MSFT 0x00000097) @ 0x000000007f5b5ef0
 When DMAR is being processed and initialized by ACPI, prints DMAR locations
 and any RMRR's processed::
 and any RMRR's processed.
 	ACPI DMAR:Host address width 36
 	ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
 	ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
 	ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
 	ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
 	ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff
 ACPI DMAR:Host address width 36
 ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed90000
 ACPI DMAR:DRHD (flags: 0x00000000)base: 0x00000000fed91000
 ACPI DMAR:DRHD (flags: 0x00000001)base: 0x00000000fed93000
 ACPI DMAR:RMRR base: 0x00000000000ed000 end: 0x00000000000effff
 ACPI DMAR:RMRR base: 0x000000007f600000 end: 0x000000007fffffff
 When DMAR is enabled for use, you will notice..
@@ -99,12 +98,10 @@ PCI-DMA: Using DMAR IOMMU
 Fault reporting
 ---------------
 ::
 	DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
 	DMAR:[fault reason 05] PTE Write access is not set
 	DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
 	DMAR:[fault reason 05] PTE Write access is not set
 DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
 DMAR:[fault reason 05] PTE Write access is not set
 DMAR:[DMA Write] Request device [00:02.0] fault addr 6df084000
 DMAR:[fault reason 05] PTE Write access is not set
 TBD
 ----

									
Documentation/Makefile
												
@@ -1,118 +1 @@
# -*- makefile -*-
# Makefile for Sphinx documentation
#
subdir-y :=
# You can set these variables from the command line.
SPHINXBUILD   = sphinx-build
SPHINXOPTS    =
SPHINXDIRS    = .
_SPHINXDIRS   = $(patsubst $(srctree)/Documentation/%/conf.py,%,$(wildcard $(srctree)/Documentation/*/conf.py))
SPHINX_CONF   = conf.py
PAPER         =
BUILDDIR      = $(obj)/output
PDFLATEX      = xelatex
LATEXOPTS     = -interaction=batchmode
# User-friendly check for sphinx-build
HAVE_SPHINX := $(shell if which $(SPHINXBUILD) >/dev/null 2>&1; then echo 1; else echo 0; fi)
ifeq ($(HAVE_SPHINX),0)
.DEFAULT:
	$(warning The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed and in PATH, or set the SPHINXBUILD make variable to point to the full path of the '$(SPHINXBUILD)' executable.)
	@echo
	@./scripts/sphinx-pre-install
	@echo "  SKIP    Sphinx $@ target."
else # HAVE_SPHINX
# User-friendly check for pdflatex
HAVE_PDFLATEX := $(shell if which $(PDFLATEX) >/dev/null 2>&1; then echo 1; else echo 0; fi)
# Internal variables.
PAPEROPT_a4     = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
KERNELDOC       = $(srctree)/scripts/kernel-doc
KERNELDOC_CONF  = -D kerneldoc_srctree=$(srctree) -D kerneldoc_bin=$(KERNELDOC)
ALLSPHINXOPTS   =  $(KERNELDOC_CONF) $(PAPEROPT_$(PAPER)) $(SPHINXOPTS)
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS  = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
# commands; the 'cmd' from scripts/Kbuild.include is not *loopable*
loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit;
# $2 sphinx builder e.g. "html"
# $3 name of the build subfolder / e.g. "media", used as:
#    * dest folder relative to $(BUILDDIR) and
#    * cache folder relative to $(BUILDDIR)/.doctrees
# $4 dest subfolder e.g. "man" for man pages at media/man
# $5 reST source folder relative to $(srctree)/$(src),
#    e.g. "media" for the linux-tv book-set at ./Documentation/media
quiet_cmd_sphinx = SPHINX  $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
      cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media $2 && \
	PYTHONDONTWRITEBYTECODE=1 \
	BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(srctree)/$(src)/$5/$(SPHINX_CONF)) \
	$(SPHINXBUILD) \
	-b $2 \
	-c $(abspath $(srctree)/$(src)) \
	-d $(abspath $(BUILDDIR)/.doctrees/$3) \
	-D version=$(KERNELVERSION) -D release=$(KERNELRELEASE) \
	$(ALLSPHINXOPTS) \
	$(abspath $(srctree)/$(src)/$5) \
	$(abspath $(BUILDDIR)/$3/$4)
htmldocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
linkcheckdocs:
	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,linkcheck,$(var),,$(var)))
latexdocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
ifeq ($(HAVE_PDFLATEX),0)
pdfdocs:
	$(warning The '$(PDFLATEX)' command was not found. Make sure you have it installed and in PATH to produce PDF output.)
	@echo "  SKIP    Sphinx $@ target."
else # HAVE_PDFLATEX
pdfdocs: latexdocs
	$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX=$(PDFLATEX) LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex || exit;)
endif # HAVE_PDFLATEX
epubdocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,epub,$(var),epub,$(var)))
xmldocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,xml,$(var),xml,$(var)))
endif # HAVE_SPHINX
# The following targets are independent of HAVE_SPHINX, and the rules should
# work or silently pass without Sphinx.
cleandocs:
	$(Q)rm -rf $(BUILDDIR)
	$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media clean
dochelp:
	@echo  ' Linux kernel internal documentation in different formats from ReST:'
	@echo  '  htmldocs        - HTML'
	@echo  '  latexdocs       - LaTeX'
	@echo  '  pdfdocs         - PDF'
	@echo  '  epubdocs        - EPUB'
	@echo  '  xmldocs         - XML'
	@echo  '  linkcheckdocs   - check for broken external links (will connect to external hosts)'
	@echo  '  cleandocs       - clean all generated files'
	@echo
	@echo  '  make SPHINXDIRS="s1 s2" [target] Generate only docs of folder s1, s2'
	@echo  '  valid values for SPHINXDIRS are: $(_SPHINXDIRS)'
	@echo
	@echo  '  make SPHINX_CONF={conf-file} [target] use *additional* sphinx-build'
	@echo  '  configuration. This is e.g. useful to build with nit-picking config.'

									
130 Documentation/Makefile.sphinx
												
@@ -0,0 +1,130 @@
# -*- makefile -*-
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXBUILD   = sphinx-build
SPHINXOPTS    =
SPHINXDIRS    = .
_SPHINXDIRS   = $(patsubst $(srctree)/Documentation/%/conf.py,%,$(wildcard $(srctree)/Documentation/*/conf.py))
SPHINX_CONF   = conf.py
PAPER         =
BUILDDIR      = $(obj)/output
PDFLATEX      = xelatex
LATEXOPTS     = -interaction=batchmode
# User-friendly check for sphinx-build
HAVE_SPHINX := $(shell if which $(SPHINXBUILD) >/dev/null 2>&1; then echo 1; else echo 0; fi)
ifeq ($(HAVE_SPHINX),0)
.DEFAULT:
	$(warning The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed and in PATH, or set the SPHINXBUILD make variable to point to the full path of the '$(SPHINXBUILD)' executable.)
	@echo "  SKIP    Sphinx $@ target."
else ifneq ($(DOCBOOKS),)
# Skip Sphinx build if the user explicitly requested DOCBOOKS.
.DEFAULT:
	@echo "  SKIP    Sphinx $@ target (DOCBOOKS specified)."
else # HAVE_SPHINX
# User-friendly check for pdflatex
HAVE_PDFLATEX := $(shell if which $(PDFLATEX) >/dev/null 2>&1; then echo 1; else echo 0; fi)
# Internal variables.
PAPEROPT_a4     = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
KERNELDOC       = $(srctree)/scripts/kernel-doc
KERNELDOC_CONF  = -D kerneldoc_srctree=$(srctree) -D kerneldoc_bin=$(KERNELDOC)
ALLSPHINXOPTS   =  $(KERNELDOC_CONF) $(PAPEROPT_$(PAPER)) $(SPHINXOPTS)
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS  = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
# commands; the 'cmd' from scripts/Kbuild.include is not *loopable*
loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit;
# $2 sphinx builder e.g. "html"
# $3 name of the build subfolder / e.g. "media", used as:
#    * dest folder relative to $(BUILDDIR) and
#    * cache folder relative to $(BUILDDIR)/.doctrees
# $4 dest subfolder e.g. "man" for man pages at media/man
# $5 reST source folder relative to $(srctree)/$(src),
#    e.g. "media" for the linux-tv book-set at ./Documentation/media
quiet_cmd_sphinx = SPHINX  $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
      cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media $2 && \
	PYTHONDONTWRITEBYTECODE=1 \
	BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(srctree)/$(src)/$5/$(SPHINX_CONF)) \
	$(SPHINXBUILD) \
	-b $2 \
	-c $(abspath $(srctree)/$(src)) \
	-d $(abspath $(BUILDDIR)/.doctrees/$3) \
	-D version=$(KERNELVERSION) -D release=$(KERNELRELEASE) \
	$(ALLSPHINXOPTS) \
	$(abspath $(srctree)/$(src)/$5) \
	$(abspath $(BUILDDIR)/$3/$4)
htmldocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
linkcheckdocs:
	@$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,linkcheck,$(var),,$(var)))
latexdocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,latex,$(var),latex,$(var)))
ifeq ($(HAVE_PDFLATEX),0)
pdfdocs:
	$(warning The '$(PDFLATEX)' command was not found. Make sure you have it installed and in PATH to produce PDF output.)
	@echo "  SKIP    Sphinx $@ target."
else # HAVE_PDFLATEX
pdfdocs: latexdocs
	$(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX=$(PDFLATEX) LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex || exit;)
endif # HAVE_PDFLATEX
epubdocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,epub,$(var),epub,$(var)))
xmldocs:
	@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,xml,$(var),xml,$(var)))
endif # HAVE_SPHINX
# The following targets are independent of HAVE_SPHINX, and the rules should
# work or silently pass without Sphinx.
# no-ops for the Sphinx toolchain
sgmldocs:
	@:
psdocs:
	@:
mandocs:
	@:
installmandocs:
	@:
cleandocs:
	$(Q)rm -rf $(BUILDDIR)
	$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media clean
dochelp:
	@echo  ' Linux kernel internal documentation in different formats (Sphinx):'
	@echo  '  htmldocs        - HTML'
	@echo  '  latexdocs       - LaTeX'
	@echo  '  pdfdocs         - PDF'
	@echo  '  epubdocs        - EPUB'
	@echo  '  xmldocs         - XML'
	@echo  '  linkcheckdocs   - check for broken external links (will connect to external hosts)'
	@echo  '  cleandocs       - clean all generated files'
	@echo
	@echo  '  make SPHINXDIRS="s1 s2" [target] Generate only docs of folder s1, s2'
	@echo  '  valid values for SPHINXDIRS are: $(_SPHINXDIRS)'
	@echo
	@echo  '  make SPHINX_CONF={conf-file} [target] use *additional* sphinx-build'
	@echo  '  configuration. This is e.g. useful to build with nit-picking config.'

10 Documentation/PCI/00-INDEX

@@ -12,13 +12,3 @@ pci.txt
 	- info on the PCI subsystem for device driver authors
 pcieaer-howto.txt
 	- the PCI Express Advanced Error Reporting Driver Guide HOWTO
 endpoint/pci-endpoint.txt
 	- guide to add endpoint controller driver and endpoint function driver.
 endpoint/pci-endpoint-cfs.txt
 	- guide to use configfs to configure the PCI endpoint function.
 endpoint/pci-test-function.txt
 	- specification of *PCI test* function device.
 endpoint/pci-test-howto.txt
 	- userguide for PCI endpoint test function.
 endpoint/function/binding/
 	- binding documentation for PCI endpoint function

2 Documentation/PCI/MSI-HOWTO.txt

@@ -186,7 +186,7 @@ must disable interrupts while the lock is held.  If the device sends
 a different interrupt, the driver will deadlock trying to recursively
 acquire the spinlock.  Such deadlocks can be avoided by using
 spin_lock_irqsave() or spin_lock_irq() which disable local interrupts
 and acquire the lock (see Documentation/kernel-hacking/locking.rst).
 and acquire the lock (see Documentation/DocBook/kernel-locking).
 .5 How to tell whether MSI/MSI-X is enabled on a device

17 Documentation/PCI/endpoint/function/binding/pci-test.txt

@@ -1,17 +0,0 @@
 PCI TEST ENDPOINT FUNCTION
 name: Should be "pci_epf_test" to bind to the pci_epf_test driver.
 Configurable Fields:
 vendorid	 : should be 0x104c
 deviceid	 : should be 0xb500 for DRA74x and 0xb501 for DRA72x
 revid		 : don't care
 progif_code	 : don't care
 subclass_code	 : don't care
 baseclass_code	 : should be 0xff
 cache_line_size	 : don't care
 subsys_vendor_id : don't care
 subsys_id	 : don't care
 interrupt_pin	 : Should be 1 - INTA, 2 - INTB, 3 - INTC, 4 -INTD
 msi_interrupts	 : Should be 1 to 32 depending on the number of MSI interrupts
 		   to test
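The two range-constrained fields of this binding (interrupt_pin and msi_interrupts) can be checked before they are written to configfs. The following is a minimal sketch in userspace C; the struct and function names are hypothetical and are not part of the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two range-checked fields of the binding. */
struct epf_test_cfg {
	uint8_t interrupt_pin;   /* 1 = INTA, 2 = INTB, 3 = INTC, 4 = INTD */
	uint8_t msi_interrupts;  /* number of MSI interrupts to test, 1 to 32 */
};

/* True when both fields are inside the ranges given by the binding. */
static inline bool epf_test_cfg_valid(const struct epf_test_cfg *cfg)
{
	assert(cfg);
	return cfg->interrupt_pin >= 1 && cfg->interrupt_pin <= 4 &&
	       cfg->msi_interrupts >= 1 && cfg->msi_interrupts <= 32;
}

/* Map an interrupt_pin value to its legacy interrupt name. */
static inline const char *interrupt_pin_name(uint8_t pin)
{
	static const char *const names[] = { "INTA", "INTB", "INTC", "INTD" };

	return (pin >= 1 && pin <= 4) ? names[pin - 1] : "invalid";
}
```

A caller would fill in the struct from user input and refuse to write the configfs entries when epf_test_cfg_valid() returns false.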

105 Documentation/PCI/endpoint/pci-endpoint-cfs.txt

@@ -1,105 +0,0 @@
                    CONFIGURING PCI ENDPOINT USING CONFIGFS
                     Kishon Vijay Abraham I <kishon@ti.com>
 The PCI Endpoint Core exposes configfs entry (pci_ep) to configure the
 PCI endpoint function and to bind the endpoint function
 with the endpoint controller. (For introducing other mechanisms to
 configure the PCI Endpoint Function refer to [1]).
 *) Mounting configfs
 The PCI Endpoint Core layer creates pci_ep directory in the mounted configfs
 directory. configfs can be mounted using the following command.
 	mount -t configfs none /sys/kernel/config
 *) Directory Structure
 The pci_ep configfs has two directories at its root: controllers and
 functions. Every EPC device present in the system will have an entry in
 the *controllers* directory and every EPF driver present in the system
 will have an entry in the *functions* directory.
 /sys/kernel/config/pci_ep/
 	.. controllers/
 	.. functions/
 *) Creating EPF Device
 Every registered EPF driver will be listed in the functions directory. The
 entries corresponding to EPF driver will be created by the EPF core.
 /sys/kernel/config/pci_ep/functions/
 	.. <EPF Driver1>/
 		... <EPF Device 11>/
 		... <EPF Device 21>/
 	.. <EPF Driver2>/
 		... <EPF Device 12>/
 		... <EPF Device 22>/
 In order to create a <EPF device> of the type probed by <EPF Driver>, the
 user has to create a directory inside <EPF DriverN>.
 Every <EPF device> directory consists of the following entries that can be
 used to configure the standard configuration header of the endpoint function.
 (These entries are created by the framework when any new <EPF Device> is
 created)
 	.. <EPF Driver1>/
 		... <EPF Device 11>/
 			... vendorid
 			... deviceid
 			... revid
 			... progif_code
 			... subclass_code
 			... baseclass_code
 			... cache_line_size
 			... subsys_vendor_id
 			... subsys_id
 			... interrupt_pin
 *) EPC Device
 Every registered EPC device will be listed in controllers directory. The
 entries corresponding to EPC device will be created by the EPC core.
 /sys/kernel/config/pci_ep/controllers/
 	.. <EPC Device1>/
 		... <Symlink EPF Device11>/
 		... <Symlink EPF Device12>/
 		... start
 	.. <EPC Device2>/
 		... <Symlink EPF Device21>/
 		... <Symlink EPF Device22>/
 		... start
 The <EPC Device> directory will have a list of symbolic links to
 <EPF Device>. These symbolic links should be created by the user to
 represent the functions present in the endpoint device.
 The <EPC Device> directory will also have a *start* field. Once
 "1" is written to this field, the endpoint device will be ready to
 establish the link with the host. This is usually done after
 all the EPF devices are created and linked with the EPC device.
 			 | controllers/
 				| <Directory: EPC name>/
 					| <Symbolic Link: Function>
 					| start
 			 | functions/
 				| <Directory: EPF driver>/
 					| <Directory: EPF device>/
 						| vendorid
 						| deviceid
 						| revid
 						| progif_code
 						| subclass_code
 						| baseclass_code
 						| cache_line_size
 						| subsys_vendor_id
 						| subsys_id
 						| interrupt_pin
 						| function
 [1] -> Documentation/PCI/endpoint/pci-endpoint.txt

215 Documentation/PCI/endpoint/pci-endpoint.txt

@@ -1,215 +0,0 @@
 			    PCI ENDPOINT FRAMEWORK
 		    Kishon Vijay Abraham I <kishon@ti.com>
 This document is a guide to use the PCI Endpoint Framework in order to create
 endpoint controller driver, endpoint function driver, and using configfs
 interface to bind the function driver to the controller driver.
 1. Introduction
 Linux has a comprehensive PCI subsystem to support PCI controllers that
 operate in Root Complex mode. The subsystem has capability to scan PCI bus,
 assign memory resources and IRQ resources, load PCI driver (based on
 vendor ID, device ID), support other services like hot-plug, power management,
 advanced error reporting and virtual channels.
 However the PCI controller IP integrated in some SoCs is capable of operating
 either in Root Complex mode or Endpoint mode. PCI Endpoint Framework will
 add endpoint mode support in Linux. This will help to run Linux in an
 EP system which can have a wide variety of use cases from testing or
 validation, co-processor accelerator, etc.
 2. PCI Endpoint Core
 The PCI Endpoint Core layer comprises 3 components: the Endpoint Controller
 library, the Endpoint Function library, and the configfs layer to bind the
 endpoint function with the endpoint controller.
 2.1 PCI Endpoint Controller(EPC) Library
 The EPC library provides APIs to be used by the controller that can operate
 in endpoint mode. It also provides APIs to be used by function driver/library
 in order to implement a particular endpoint function.
 2.1.1 APIs for the PCI controller Driver
 This section lists the APIs that the PCI Endpoint core provides to be used
 by the PCI controller driver.
 *) devm_pci_epc_create()/pci_epc_create()
    The PCI controller driver should implement the following ops:
 	 * write_header: ops to populate configuration space header
 	 * set_bar: ops to configure the BAR
 	 * clear_bar: ops to reset the BAR
 	 * alloc_addr_space: ops to allocate in PCI controller address space
 	 * free_addr_space: ops to free the allocated address space
 	 * raise_irq: ops to raise a legacy or MSI interrupt
 	 * start: ops to start the PCI link
 	 * stop: ops to stop the PCI link
    The PCI controller driver can then create a new EPC device by invoking
    devm_pci_epc_create()/pci_epc_create().
 *) devm_pci_epc_destroy()/pci_epc_destroy()
    The PCI controller driver can destroy the EPC device created by either
    devm_pci_epc_create() or pci_epc_create() using devm_pci_epc_destroy() or
    pci_epc_destroy().
 *) pci_epc_linkup()
    In order to notify all the function devices that the EPC device to which
    they are linked has established a link with the host, the PCI controller
    driver should invoke pci_epc_linkup().
 *) pci_epc_mem_init()
    Initialize the pci_epc_mem structure used for allocating EPC addr space.
 *) pci_epc_mem_exit()
    Cleanup the pci_epc_mem structure allocated during pci_epc_mem_init().
 2.1.2 APIs for the PCI Endpoint Function Driver
 This section lists the APIs that the PCI Endpoint core provides to be used
 by the PCI endpoint function driver.
 *) pci_epc_write_header()
    The PCI endpoint function driver should use pci_epc_write_header() to
    write the standard configuration header to the endpoint controller.
 *) pci_epc_set_bar()
    The PCI endpoint function driver should use pci_epc_set_bar() to configure
    the Base Address Register in order for the host to assign PCI addr space.
    Register space of the function driver is usually configured
    using this API.
 *) pci_epc_clear_bar()
    The PCI endpoint function driver should use pci_epc_clear_bar() to reset
    the BAR.
 *) pci_epc_raise_irq()
    The PCI endpoint function driver should use pci_epc_raise_irq() to raise
    Legacy Interrupt or MSI Interrupt.
 *) pci_epc_mem_alloc_addr()
    The PCI endpoint function driver should use pci_epc_mem_alloc_addr() to
    allocate a memory address from the EPC addr space, which is required to
    access the RC's buffer.
 *) pci_epc_mem_free_addr()
    The PCI endpoint function driver should use pci_epc_mem_free_addr() to
    free the memory space allocated using pci_epc_mem_alloc_addr().
 2.1.3 Other APIs
 There are other APIs provided by the EPC library. These are used for binding
 the EPF device with EPC device. pci-ep-cfs.c can be used as reference for
 using these APIs.
 *) pci_epc_get()
    Get a reference to the PCI endpoint controller based on the device name of
    the controller.
 *) pci_epc_put()
    Release the reference to the PCI endpoint controller obtained using
    pci_epc_get()
 *) pci_epc_add_epf()
    Add a PCI endpoint function to a PCI endpoint controller. A PCIe device
    can have up to 8 functions according to the specification.
 *) pci_epc_remove_epf()
    Remove the PCI endpoint function from PCI endpoint controller.
 *) pci_epc_start()
    The PCI endpoint function driver should invoke pci_epc_start() once it
    has configured the endpoint function and wants to start the PCI link.
 *) pci_epc_stop()
    The PCI endpoint function driver should invoke pci_epc_stop() to stop
    the PCI LINK.
 2.2 PCI Endpoint Function(EPF) Library
 The EPF library provides APIs to be used by the function driver and the EPC
 library to provide endpoint mode functionality.
 2.2.1 APIs for the PCI Endpoint Function Driver
 This section lists the APIs that the PCI Endpoint core provides to be used
 by the PCI endpoint function driver.
 *) pci_epf_register_driver()
    The PCI Endpoint Function driver should implement the following ops:
 	 * bind: ops to perform when an EPC device has been bound to an EPF device
 	 * unbind: ops to perform when a binding has been lost between an EPC
 	   device and an EPF device
 	 * linkup: ops to perform when the EPC device has established a
 	   connection with a host system
   The PCI Function driver can then register the PCI EPF driver by using
   pci_epf_register_driver().
 *) pci_epf_unregister_driver()
   The PCI Function driver can unregister the PCI EPF driver by using
   pci_epf_unregister_driver().
 *) pci_epf_alloc_space()
   The PCI Function driver can allocate space for a particular BAR using
   pci_epf_alloc_space().
 *) pci_epf_free_space()
   The PCI Function driver can free the allocated space
   (using pci_epf_alloc_space) by invoking pci_epf_free_space().
 2.2.2 APIs for the PCI Endpoint Controller Library
 This section lists the APIs that the PCI Endpoint core provides to be used
 by the PCI endpoint controller library.
 *) pci_epf_linkup()
    The PCI endpoint controller library invokes pci_epf_linkup() when the
    EPC device has established the connection to the host.
 2.2.3 Other APIs
 There are other APIs provided by the EPF library. These are used to notify
 the function driver when the EPF device is bound to the EPC device.
 pci-ep-cfs.c can be used as reference for using these APIs.
 *) pci_epf_create()
    Create a new PCI EPF device by passing the name of the PCI EPF device.
    This name will be used to bind the EPF device to an EPF driver.
 *) pci_epf_destroy()
    Destroy the created PCI EPF device.
 *) pci_epf_bind()
    pci_epf_bind() should be invoked when the EPF device has been bound to
    a EPC device.
 *) pci_epf_unbind()
    pci_epf_unbind() should be invoked when the binding between EPC device
    and EPF device is lost.

66 Documentation/PCI/endpoint/pci-test-function.txt

@@ -1,66 +0,0 @@
 				PCI TEST
 		    Kishon Vijay Abraham I <kishon@ti.com>
 Traditionally PCI RC has always been validated by using standard
 PCI cards like ethernet PCI cards or USB PCI cards or SATA PCI cards.
 However with the addition of EP-core in linux kernel, it is possible
 to configure a PCI controller that can operate in EP mode to work as
 a test device.
 The PCI endpoint test device is a virtual device (defined in software)
 used to test the endpoint functionality and serve as a sample driver
 for other PCI endpoint devices (to use the EP framework).
 The PCI endpoint test device has the following registers:
 1) PCI_ENDPOINT_TEST_MAGIC
 2) PCI_ENDPOINT_TEST_COMMAND
 3) PCI_ENDPOINT_TEST_STATUS
 4) PCI_ENDPOINT_TEST_SRC_ADDR
 5) PCI_ENDPOINT_TEST_DST_ADDR
 6) PCI_ENDPOINT_TEST_SIZE
 7) PCI_ENDPOINT_TEST_CHECKSUM
 *) PCI_ENDPOINT_TEST_MAGIC
 This register will be used to test BAR0. A known pattern will be written
 and read back from MAGIC register to verify BAR0.
 *) PCI_ENDPOINT_TEST_COMMAND:
 This register will be used by the host driver to indicate the function
 that the endpoint device must perform.
 Bitfield Description:
   Bit 0		: raise legacy IRQ
   Bit 1		: raise MSI IRQ
   Bit 2 - 7	: MSI interrupt number
   Bit 8		: read command (read data from RC buffer)
   Bit 9		: write command (write data to RC buffer)
   Bit 10	: copy command (copy data from one RC buffer to another
 		  RC buffer)
 *) PCI_ENDPOINT_TEST_STATUS
 This register reflects the status of the PCI endpoint device.
 Bitfield Description:
   Bit 0		: read success
   Bit 1		: read fail
   Bit 2		: write success
   Bit 3		: write fail
   Bit 4		: copy success
   Bit 5		: copy fail
   Bit 6		: IRQ raised
   Bit 7		: source address is invalid
   Bit 8		: destination address is invalid
 *) PCI_ENDPOINT_TEST_SRC_ADDR
 This register contains the source address (RC buffer address) for the
 COPY/READ command.
 *) PCI_ENDPOINT_TEST_DST_ADDR
 This register contains the destination address (RC buffer address) for
 the COPY/WRITE command.
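The COMMAND and STATUS bit layouts described above can be written down as masks. The following is a minimal sketch in userspace C that follows the bit positions listed in this document; the macro and function names are illustrative assumptions and are not taken from the driver sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PCI_ENDPOINT_TEST_COMMAND bits, as listed in the bitfield description. */
#define CMD_RAISE_LEGACY_IRQ  (1u << 0)
#define CMD_RAISE_MSI_IRQ     (1u << 1)
#define CMD_MSI_NUM_SHIFT     2
#define CMD_MSI_NUM_MASK      (0x3fu << CMD_MSI_NUM_SHIFT)  /* bits 2 - 7 */
#define CMD_READ              (1u << 8)
#define CMD_WRITE             (1u << 9)
#define CMD_COPY              (1u << 10)

/* PCI_ENDPOINT_TEST_STATUS bits, as listed in the bitfield description. */
#define STS_READ_SUCCESS      (1u << 0)
#define STS_READ_FAIL         (1u << 1)
#define STS_WRITE_SUCCESS     (1u << 2)
#define STS_WRITE_FAIL        (1u << 3)
#define STS_COPY_SUCCESS      (1u << 4)
#define STS_COPY_FAIL         (1u << 5)
#define STS_IRQ_RAISED        (1u << 6)
#define STS_SRC_ADDR_INVALID  (1u << 7)
#define STS_DST_ADDR_INVALID  (1u << 8)

/* Build a COMMAND value asking the endpoint to raise MSI number msi_num. */
static inline uint32_t cmd_raise_msi(unsigned int msi_num)
{
	assert(msi_num <= 0x3f);	/* the field is six bits wide */
	return CMD_RAISE_MSI_IRQ | ((uint32_t)msi_num << CMD_MSI_NUM_SHIFT);
}

/* Extract the MSI interrupt number field from a COMMAND value. */
static inline unsigned int cmd_msi_number(uint32_t cmd)
{
	return (cmd & CMD_MSI_NUM_MASK) >> CMD_MSI_NUM_SHIFT;
}

/* True when STATUS reports any failure or invalid-address bit. */
static inline bool status_has_error(uint32_t status)
{
	return (status & (STS_READ_FAIL | STS_WRITE_FAIL | STS_COPY_FAIL |
			  STS_SRC_ADDR_INVALID | STS_DST_ADDR_INVALID)) != 0;
}
```

A host-side test would write such a COMMAND value to the register, wait for the endpoint to act, and then inspect STATUS with a check like status_has_error().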

179 Documentation/PCI/endpoint/pci-test-howto.txt

@@ -1,179 +0,0 @@
 			    PCI TEST USERGUIDE
 		    Kishon Vijay Abraham I <kishon@ti.com>
 This document is a guide to help users use pci-epf-test function driver
 and pci_endpoint_test host driver for testing PCI. The list of steps to
 be followed in the host side and EP side is given below.
 1. Endpoint Device
 1.1 Endpoint Controller Devices
 To find the list of endpoint controller devices in the system:
 	# ls /sys/class/pci_epc/
 	  51000000.pcie_ep
 If PCI_ENDPOINT_CONFIGFS is enabled
 	# ls /sys/kernel/config/pci_ep/controllers
 	  51000000.pcie_ep
 1.2 Endpoint Function Drivers
 To find the list of endpoint function drivers in the system:
 	# ls /sys/bus/pci-epf/drivers
 	  pci_epf_test
 If PCI_ENDPOINT_CONFIGFS is enabled
 	# ls /sys/kernel/config/pci_ep/functions
 	  pci_epf_test
 1.3 Creating pci-epf-test Device
 PCI endpoint function device can be created using the configfs. To create
 pci-epf-test device, the following commands can be used
 	# mount -t configfs none /sys/kernel/config
 	# cd /sys/kernel/config/pci_ep/
 	# mkdir functions/pci_epf_test/func1
 The "mkdir func1" above creates the pci-epf-test function device that will
 be probed by pci_epf_test driver.
 The PCI endpoint framework populates the directory with the following
 configurable fields.
 	# ls functions/pci_epf_test/func1
 	  baseclass_code	interrupt_pin	revid		subsys_vendor_id
 	  cache_line_size	msi_interrupts	subclass_code	vendorid
 	  deviceid          	progif_code	subsys_id
 The PCI endpoint function driver populates these entries with default values
 when the device is bound to the driver. The pci-epf-test driver populates
 vendorid with 0xffff and interrupt_pin with 0x0001
 	# cat functions/pci_epf_test/func1/vendorid
 	0xffff
 	# cat functions/pci_epf_test/func1/interrupt_pin
 	0x0001
 1.4 Configuring pci-epf-test Device
 The user can configure the pci-epf-test device using configfs entry. In order
 to change the vendorid and the number of MSI interrupts used by the function
 device, the following commands can be used.
 	# echo 0x104c > functions/pci_epf_test/func1/vendorid
 	# echo 0xb500 > functions/pci_epf_test/func1/deviceid
 	# echo 16 > functions/pci_epf_test/func1/msi_interrupts
 1.5 Binding pci-epf-test Device to EP Controller
 In order for the endpoint function device to be useful, it has to be bound to
 a PCI endpoint controller driver. Use the configfs to bind the function
 device to one of the controller drivers present in the system.
 	# ln -s functions/pci_epf_test/func1 controllers/51000000.pcie_ep/
 Once the above step is completed, the PCI endpoint is ready to establish a link
 with the host.
 1.6 Start the Link
 In order for the endpoint device to establish a link with the host, the _start_
 field should be populated with '1'.
 	# echo 1 > controllers/51000000.pcie_ep/start
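Steps 1.3 to 1.6 above can also be driven from a program instead of a shell. The following is a minimal sketch in userspace C using only POSIX calls; the helper names are hypothetical, the values are the ones used in the commands above, and root is assumed to be an absolute path to the pci_ep configfs directory (for example /sys/kernel/config/pci_ep).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Join two path components into out; returns 0, or -1 if out is too small. */
static inline int path_join(char *out, size_t n, const char *a, const char *b)
{
	int len = snprintf(out, n, "%s/%s", a, b);

	return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Write val to the attribute file <dir>/<attr>; returns 0 on success. */
static inline int write_attr(const char *dir, const char *attr, const char *val)
{
	char path[512];
	FILE *f;
	int ok;

	if (path_join(path, sizeof(path), dir, attr) != 0)
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fputs(val, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper replaying steps 1.3 to 1.6 for a function device
 * named func1. epc is the controller name, e.g. "51000000.pcie_ep".
 */
static inline int epf_test_setup(const char *root, const char *epc)
{
	char func[512], ctrls[512], ctrl[512], link[512];

	assert(root && epc);
	if (path_join(func, sizeof(func), root, "functions/pci_epf_test/func1") ||
	    path_join(ctrls, sizeof(ctrls), root, "controllers") ||
	    path_join(ctrl, sizeof(ctrl), ctrls, epc) ||
	    path_join(link, sizeof(link), ctrl, "func1"))
		return -1;

	/* 1.3: create the pci-epf-test function device */
	if (mkdir(func, 0755) != 0 && errno != EEXIST)
		return -1;
	/* 1.4: configure vendorid, deviceid and the number of MSI interrupts */
	if (write_attr(func, "vendorid", "0x104c") ||
	    write_attr(func, "deviceid", "0xb500") ||
	    write_attr(func, "msi_interrupts", "16"))
		return -1;
	/* 1.5: bind the function device to the controller with a symlink */
	if (symlink(func, link) != 0 && errno != EEXIST)
		return -1;
	/* 1.6: start the link */
	return write_attr(ctrl, "start", "1");
}
```

The helper stops at the first failing step and returns -1, so a caller can tell that the endpoint was not fully configured before the link was started.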
 2. RootComplex Device
 2.1 lspci Output
 Note that the devices listed here correspond to the value populated in 1.4 above
 :00.0 PCI bridge: Texas Instruments Device 8888 (rev 01)
 :00.0 Unassigned class [ff00]: Texas Instruments Device b500
 2.2 Using Endpoint Test function Device
 pcitest.sh added in tools/pci/ can be used to run all the default PCI endpoint
 tests. Before pcitest.sh can be used pcitest.c should be compiled using the
 following commands.
 	cd <kernel-dir>
 	make headers_install ARCH=arm
 	arm-linux-gnueabihf-gcc -Iusr/include tools/pci/pcitest.c -o pcitest
 	cp pcitest  <rootfs>/usr/sbin/
 	cp tools/pci/pcitest.sh <rootfs>
 2.2.1 pcitest.sh Output
 	# ./pcitest.sh
 	BAR tests
 	BAR0:           OKAY
 	BAR1:           OKAY
 	BAR2:           OKAY
 	BAR3:           OKAY
 	BAR4:           NOT OKAY
 	BAR5:           NOT OKAY
 	Interrupt tests
 	LEGACY IRQ:     NOT OKAY
 	MSI1:           OKAY
 	MSI2:           OKAY
 	MSI3:           OKAY
 	MSI4:           OKAY
 	MSI5:           OKAY
 	MSI6:           OKAY
 	MSI7:           OKAY
 	MSI8:           OKAY
 	MSI9:           OKAY
 	MSI10:          OKAY
 	MSI11:          OKAY
 	MSI12:          OKAY
 	MSI13:          OKAY
 	MSI14:          OKAY
 	MSI15:          OKAY
 	MSI16:          OKAY
 	MSI17:          NOT OKAY
 	MSI18:          NOT OKAY
 	MSI19:          NOT OKAY
 	MSI20:          NOT OKAY
 	MSI21:          NOT OKAY
 	MSI22:          NOT OKAY
 	MSI23:          NOT OKAY
 	MSI24:          NOT OKAY
 	MSI25:          NOT OKAY
 	MSI26:          NOT OKAY
 	MSI27:          NOT OKAY
 	MSI28:          NOT OKAY
 	MSI29:          NOT OKAY
 	MSI30:          NOT OKAY
 	MSI31:          NOT OKAY
 	MSI32:          NOT OKAY
 	Read Tests
 	READ (      1 bytes):           OKAY
 	READ (   1024 bytes):           OKAY
 	READ (   1025 bytes):           OKAY
 	READ (1024000 bytes):           OKAY
 	READ (1024001 bytes):           OKAY
 	Write Tests
 	WRITE (      1 bytes):          OKAY
 	WRITE (   1024 bytes):          OKAY
 	WRITE (   1025 bytes):          OKAY
 	WRITE (1024000 bytes):          OKAY
 	WRITE (1024001 bytes):          OKAY
 	Copy Tests
 	COPY (      1 bytes):           OKAY
 	COPY (   1024 bytes):           OKAY
 	COPY (   1025 bytes):           OKAY
 	COPY (1024000 bytes):           OKAY
 	COPY (1024001 bytes):           OKAY

12 Documentation/PCI/pci-error-recovery.txt

@@ -11,7 +11,7 @@
 Many PCI bus controllers are able to detect a variety of hardware
 PCI errors on the bus, such as parity errors on the data and address
 buses, as well as SERR and PERR errors.  Some of the more advanced
 busses, as well as SERR and PERR errors.  Some of the more advanced
 chipsets are able to deal with these errors; these include PCI-E chipsets,
 and the PCI-host bridges found on IBM Power4, Power5 and Power6-based
 pSeries boxes. A typical action taken is to disconnect the affected device,
@@ -173,7 +173,7 @@ is STEP 6 (Permanent Failure).
 >>> a value of 0xff on read, and writes will be dropped. If more than
 >>> EEH_MAX_FAILS I/O's are attempted to a frozen adapter, EEH
 >>> assumes that the device driver has gone into an infinite loop
 >>> and prints an error to syslog.  A reboot is then required to
 >>> and prints an error to syslog.  A reboot is then required to
 >>> get the device working again.
 STEP 2: MMIO Enabled
@@ -231,14 +231,14 @@ proceeds to STEP 4 (Slot Reset)
 STEP 3: Link Reset
 ------------------
 The platform resets the link.  This is a PCI-Express specific step
 and is done whenever a fatal error has been detected that can be
 and is done whenever a non-fatal error has been detected that can be
 "solved" by resetting the link.
 STEP 4: Slot Reset
 ------------------
 In response to a return value of PCI_ERS_RESULT_NEED_RESET, the
 the platform will perform a slot reset on the requesting PCI device(s).
 the platform will perform a slot reset on the requesting PCI device(s).
 The actual steps taken by a platform to perform a slot reset
 will be platform-dependent. Upon completion of slot reset, the
 platform will call the device slot_reset() callback.
@@ -258,7 +258,7 @@ configuration registers to initialize to their default conditions.
 For most PCI devices, a soft reset will be sufficient for recovery.
 Optional fundamental reset is provided to support a limited number
 of PCI Express devices for which a soft reset is not sufficient
 of PCI Express PCI devices  for which a soft reset is not sufficient
 for recovery.
 If the platform supports PCI hotplug, then the reset might be
@@ -303,7 +303,7 @@ driver performs device init only from PCI function 0:
 		Same as above.
 Drivers for PCI Express cards that require a fundamental reset must
 set the needs_freset bit in the pci_dev structure in their probe function.
 set the needs_freset bit in the pci_dev structure in their probe function.
 For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
 PCI card types:

Compare commits

1194 Commits pull/2638/ ... rpi-4.11.y

48 .gitignore vendored
12 .mailmap
27 CREDITS
10 Documentation/00-INDEX
8 Documentation/ABI/obsolete/sysfs-firmware-acpi
19 Documentation/ABI/stable/sysfs-bus-nvmem
2 Documentation/ABI/stable/sysfs-bus-usb
16 Documentation/ABI/stable/sysfs-class-udc
15 Documentation/ABI/stable/sysfs-driver-aspeed-vuart
30 Documentation/ABI/stable/sysfs-driver-dma-ioatdma
119 Documentation/ABI/stable/sysfs-hypervisor-xen
3 Documentation/ABI/stable/vdso
3 Documentation/ABI/testing/configfs-usb-gadget-rndis
18 Documentation/ABI/testing/configfs-usb-gadget-uac1
12 Documentation/ABI/testing/configfs-usb-gadget-uac1_legacy
8 Documentation/ABI/testing/ima_policy
45 Documentation/ABI/testing/ppc-memtrace
31 Documentation/ABI/testing/procfs-smaps_rollup
10 Documentation/ABI/testing/sysfs-block
8 Documentation/ABI/testing/sysfs-block-zram
38 Documentation/ABI/testing/sysfs-bus-fsi
41 Documentation/ABI/testing/sysfs-bus-iio
17 Documentation/ABI/testing/sysfs-bus-iio-adc-max9611
24 Documentation/ABI/testing/sysfs-bus-iio-counter-104-quad-8
57 Documentation/ABI/testing/sysfs-bus-iio-lptimer-stm32
1 Documentation/ABI/testing/sysfs-bus-iio-meas-spec
8 Documentation/ABI/testing/sysfs-bus-iio-proximity-as3935
134 Documentation/ABI/testing/sysfs-bus-iio-timer-stm32
24 Documentation/ABI/testing/sysfs-bus-pci
112 Documentation/ABI/testing/sysfs-bus-thunderbolt
13 Documentation/ABI/testing/sysfs-bus-usb-lvstest
4 Documentation/ABI/testing/sysfs-class-cxl
6 Documentation/ABI/testing/sysfs-class-mtd
16 Documentation/ABI/testing/sysfs-class-mux
8 Documentation/ABI/testing/sysfs-class-net
36 Documentation/ABI/testing/sysfs-class-net-phydev
27 Documentation/ABI/testing/sysfs-class-net-qmi
17 Documentation/ABI/testing/sysfs-class-power-twl4030
4 Documentation/ABI/testing/sysfs-class-remoteproc
96 Documentation/ABI/testing/sysfs-class-switchtec
291 Documentation/ABI/testing/sysfs-class-typec
24 Documentation/ABI/testing/sysfs-devices-system-cpu
8 Documentation/ABI/testing/sysfs-driver-altera-cvp
10 Documentation/ABI/testing/sysfs-firmware-acpi
26 Documentation/ABI/testing/sysfs-firmware-ofw
31 Documentation/ABI/testing/sysfs-firmware-opal-powercap
18 Documentation/ABI/testing/sysfs-firmware-opal-psr
41 Documentation/ABI/testing/sysfs-fs-f2fs
23 Documentation/ABI/testing/sysfs-hypervisor-pmu Normal file
43 Documentation/ABI/testing/sysfs-hypervisor-xen
8 Documentation/ABI/testing/sysfs-kernel-livepatch
16 Documentation/ABI/testing/sysfs-kernel-mm-swap
9 Documentation/ABI/testing/sysfs-platform-chipidea-usb2
8 Documentation/ABI/testing/sysfs-platform-ideapad-laptop
15 Documentation/ABI/testing/sysfs-platform-renesas_usb3
14 Documentation/ABI/testing/sysfs-power
47 Documentation/ABI/testing/sysfs-uevent
184 Documentation/DMA-API-HOWTO.txt
591 Documentation/DMA-API.txt
73 Documentation/DMA-ISA-LPC.txt
15 Documentation/DMA-attributes.txt
17 Documentation/DocBook/.gitignore vendored Normal file
278 Documentation/DocBook/Makefile Normal file
381 Documentation/DocBook/filesystems.tmpl Normal file
793 Documentation/DocBook/gadget.tmpl Normal file
520 Documentation/DocBook/genericirq.tmpl Normal file
331 Documentation/DocBook/kernel-api.tmpl Normal file
1312 Documentation/DocBook/kernel-hacking.tmpl Normal file
2151 Documentation/DocBook/kernel-locking.tmpl Normal file
918 Documentation/DocBook/kgdb.tmpl Normal file
1625 Documentation/DocBook/libata.tmpl Normal file
289 Documentation/DocBook/librs.tmpl Normal file
265 Documentation/DocBook/lsm.tmpl Normal file
1291 Documentation/DocBook/mtdnand.tmpl Normal file
111 Documentation/DocBook/networking.tmpl Normal file
158 Documentation/DocBook/rapidio.tmpl Normal file
161 Documentation/DocBook/s390-drivers.tmpl Normal file
409 Documentation/DocBook/scsi.tmpl Normal file

105 Documentation/DocBook/sh.tmpl Normal file