linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-28 13:02:59 +00:00

Author	SHA1	Message	Date
Jonathan Bell	12a7c6dd7b	drivers: char: add chardev for mmap'ing the RPiVid control registers Based on the gpiomem driver, allow mapping of the decoder register spaces such that userspace can access control/status registers. This driver is intended for use with a custom ffmpeg backend accelerator prior to a v4l2 driver being written. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org> driver: char: rpivid: Destroy the legacy device on remove The legacy name support created a new device that was never destroyed. If the driver was unloaded and reloaded, it failed due to the device already existing. Fixes: "75f1d14 driver: char: rpivid - also support legacy name" Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> driver: char: rpivid: Clean up error handling use of ERR_PTR/IS_ERR The driver used an unnecessary intermediate void* variable so it only called ERR_PTR once to convert to the error value. Switch to converting as the error arises to remove these intermediate variables. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> driver: char: rpivid: Add error handling to the legacy device load The return value from device_create for the legacy device was never checked or handled. Add the required error handling. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> driver: char: rpivid: Fix coding style whitespace issues. Makes checkpatch happier. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> driver: char: rpimem: Add SPDX licence header. Stops checkpatch complaining. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com> driver: char: rpivid: Fix access to freed memory The error path during probe frees the private memory block, and then promptly dereferences it to log an error message. Use the base device instead of the pointer to it in the private structure. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>	2020-11-04 11:07:51 +00:00
Jonathan Bell	e5c1434553	usb: add plumbing for updating interrupt endpoint interval state xHCI caches device and endpoint data after the interface is configured, so an explicit command needs to be issued for any device driver wanting to alter the polling interval of an endpoint. Add usb_fixup_endpoint() to allow drivers to do this. The fixup must be called after calculating endpoint bandwidth requirements but before any URBs are submitted. If polling intervals are shortened, any bandwidth reservations are no longer valid but in practice polling intervals are only ever relaxed. Limit the scope to interrupt transfers for now. Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>	2020-11-04 11:07:37 +00:00
Eric Anholt	b68f3c32bf	soc: bcm: bcm2835-pm: Add support for 2711. Without the actual power management part any more, there's a lot less to set up for V3D. We just need to clear the RSTN field for the power domain, and expose the reset controller for toggling it again. This is definitely incomplete -- the old ISP and H264 is in the old bridge, but since we have no consumers of it I've just done the minimum to get V3D working. Signed-off-by: Eric Anholt <eric@anholt.net>	2020-11-04 11:07:36 +00:00
Dave Stevenson	5c4c033863	staging: vc04_services: Add new vc-sm-cma driver This new driver allows contiguous memory blocks to be imported into the VideoCore VPU memory map, and manages the lifetime of those objects, only releasing the source dmabuf once the VPU has confirmed it has finished with it. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Correct DMA configuration. Now that VCHIQ is setting up the DMA configuration as our parent device, don't try to configure it during probe. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Use a void* pointer as the handle within the kernel The driver was using an unsigned int as the handle to the outside world, and doing a nasty cast to the struct dmabuf when handed it back. This breaks badly with a 64 bit kernel where the pointer doesn't fit in an unsigned int. Switch to using a void* within the kernel. Reality is that it is a struct dma_buf, but advertising it as such to other drivers seems to encourage the use of it as such, and I'm not sure on the implications of that. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Fix up for 64bit builds There were a number of logging lines that were using inappropriate formatting under 64bit kernels. The kernel_id field passed to/from the VPU was being abused for storing the struct vc_sm_buffer . This breaks with 64bit kernels, so change to using an IDR. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc_sm_cma: Remove erroneous misc_deregister Code from the misc /dev node was still present in bcm2835_vc_sm_cma_remove, which caused a NULL deref. Remove it. See #2885. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Remove the debugfs directory on remove Without removing that, reloading the driver fails. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Use devm_ allocs for sm_state. Use managed allocations for sm_state, removing reliance on manual management. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Don't fail if debugfs calls fail. Return codes from debugfs calls should never alter the flow of the main code. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Ensure mutex and idr are destroyed map_lock and kernelid_map are created in probe, but not released in release should the vcsm service not connect (eg running the cutdown firmware). Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Remove obsolete comment and make function static Removes obsolete comment about wanting to pass a function pointer into mmal-vchiq as we now do. As the function is passed as a function pointer, the function itself can be static. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Add in allocation for VPU requests. Module has to change from tristate to bool as all CMA functions are boolean. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Update TODO. The driver is already a platform driver, so that can be deleted from the TODO. There are no known issues that need to be resolved. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Add in userspace allocation API Replacing the functionality from the older vc-sm driver, add in a userspace API that allows allocation of buffers, and importing of dma-bufs. The driver hands out dma-buf fds, therefore much of the handling around lifespan and odd mmaps from the old driver goes away. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vcsm-cma: Add cache control ioctls The old driver allowed for direct cache manipulation and that was used by various clients. Replicate here. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vcsm-cma: Alter dev node permissions to 0666 Until the udev rules are updated, open up access to this node by default. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vcsm-cma: Drop logging level on messages in vc_sm_release_resource They weren't errors but were logged as such. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vcsm-cma: Fixup the alloc code handling of kernel_id The allocation code had been copied in from an old branch prior to having added the IDR for 64bit support. It was therefore pushing a pointer into the kernel_id field instead of an IDR handle, the lookup therefore failed, and we never released the buffer. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vcsm-cma: Remove cache manipulation ioctl from ARM64 The cache flushing ioctls are used by the Pi3 HEVC hw-assisted decoder as it needs finer grained flushing control than dma_ops allow. These cache calls are not present for ARM64, therefore disable them. We are not actively supporting 64bit kernels at present, and the use case of the HEVC decoder is fairly limited. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vcsm-cma: Rework to use dma APIs, not CMA Due to a misunderstanding of the DMA mapping APIs, I made the wrong decision on how to implement this. Rework to use dma_alloc_coherent instead of the CMA API. This also allows it to be built as a module easily. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc-sm-cma: Fix the few remaining coding style issues Fix a few minor checkpatch complaints to make the driver clean Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> staging: vc04_services: fix compiling in separate directory The vc04_services Makefiles do not respect the O=path argument correctly: include paths in CFLAGS are given relatively to object path, not source path. Compiling in a separate directory yields #include errors. Signed-off-by: Marek Behún <marek.behun@nic.cz> vc-sm-cma: Fix compatibility ioctl This code path hasn't been used previously. Fixed up after testing with kodi on 32-bit userland and 64-bit kernel Signed-off-by: popcornmix <popcornmix@gmail.com>	2020-11-04 11:07:26 +00:00
Phil Elwell	34c2953cfc	net: lan78xx: Support auto-downshift to 100Mb/s Ethernet cables with faulty or missing pairs (specifically pairs C and D) allow auto-negotiation to 1000Mbs, but do not support the successful establishment of a link. Add a DT property, "microchip,downshift-after", to configure the number of auto-negotiation failures after which it falls back to 100Mbs. Valid values are 2, 3, 4, 5 and 0, where 0 means never downshift. Signed-off-by: Phil Elwell <phil@raspberrypi.org>	2020-11-04 11:07:18 +00:00
Phil Elwell	d441aca63b	mfd: Add Raspberry Pi Sense HAT core driver mfd: Add rpi_sense_core of compatible string	2020-11-04 11:07:05 +00:00
Phil Elwell	8d65a5ed08	BCM270x_DT: Add pwr_led, and the required "input" trigger The "input" trigger makes the associated GPIO an input. This is to support the Raspberry Pi PWR LED, which is driven by external hardware in normal use. N.B. pwr_led is not available on Model A or B boards. leds-gpio: Implement the brightness_get method The power LED uses some clever logic that means it is driven by a voltage measuring circuit when configured as input, otherwise it is driven by the GPIO output value. This patch wires up the brightness_get method for leds-gpio so that user-space can monitor the LED value via /sys/class/gpio/led1/brightness. Using the input trigger this returns an indication of the system power health, otherwise it is just whatever value the trigger has written most recently. See: https://github.com/raspberrypi/linux/issues/1064	2020-11-04 11:07:04 +00:00
Luke Wren	540f4faf7c	Add SMI driver Signed-off-by: Luke Wren <wren6991@gmail.com> MISC: bcm2835: smi: use clock manager and fix reload issues Use clock manager instead of self-made clockmanager. Also fix some error paths that showd up during development (especially missing release of dma resources on rmmod) Signed-off-by: Martin Sperl <kernel@martin.sperl.org> bcm2835_smi: re-add dereference to fix DMA transfers	2020-11-04 11:07:03 +00:00
Tim Gover	614f12c98a	vcsm: VideoCore shared memory service for BCM2835 Add experimental support for the VideoCore shared memory service. This allows user processes to allocate memory from VideoCore's GPU relocatable heap and mmap the buffers. Additionally, the memory handles can passed to other VideoCore services such as MMAL, OpenMax and DispmanX TODO * This driver was originally released for BCM28155 which has a different cache architecture to BCM2835. Consequently, in this release only uncached mappings are supported. However, there's no fundamental reason which cached mappings cannot be support or BCM2835 * More refactoring is required to remove the typedefs. * Re-enable the some of the commented out debug-fs statistics which were disabled when migrating code from proc-fs. * There's a lot of code to support sharing of VCSM in order to support Android. This could probably done more cleanly or perhaps just removed. Signed-off-by: Tim Gover <timgover@gmail.com> config: Disable VC_SM for now to fix hang with cutdown kernel vcsm: Use boolean as it cannot be built as module On building the bcm_vc_sm as a module we get the following error: v7_dma_flush_range and do_munmap are undefined in vc-sm.ko. Fix by making it not an option to build as module vcsm: Add ioctl for custom cache flushing vc-sm: Move headers out of arch directory Signed-off-by: Noralf Trønnes <noralf@tronnes.org> vcsm: Treat EBUSY as success rather than SIGBUS Currently if two cores access the same page concurrently one will return VM_FAULT_NOPAGE and the other VM_FAULT_SIGBUS crashing the user code. Also report when mapping fails. Signed-off-by: popcornmix <popcornmix@gmail.com> vcsm: Provide new ioctl to clean/invalidate a 2D block vcsm: Convert to loading via device tree. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> VCSM: New option to import a DMABUF for VPU use Takes a dmabuf, and then calls over to the VPU to wrap it into a suitable handle. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> vcsm: fix multi-platform build vcsm: add macros for cache functions vcsm: use dma APIs for cache functions * Will handle multi-platform builds vcsm: Fix up macros to avoid breaking numbers used by existing apps vcsm: Define cache operation constants in user header Without this change, users have to use raw values (1, 2, 3) to specify cache operation. Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Support for finding user/vc handle in memory pool vmcs_sm_{usr,vc}_handle_from_pid_and_address() were failing to find handle if specified user pointer is not exactly the one that the memory locking call returned even if the pointer is in range of map/resource. So fixed the functions to match the range. Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Unify cache manipulating functions Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Fix obscure conditions Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Fix memory leaking on clean_invalid2 ioctl handler Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Describe the use of cache operation constants Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Fix obscure conditions again Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Add no-op cache operation constant Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Revert to do page-table-walk-based cache manipulating on some ioctl calls On FLUSH, INVALID, CLEAN_INVALID ioctl calls, cache operations based on page table walk were used in case that the buffer of the cache is not pinned. So reverted to do page-table-based cache manipulating. Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Define cache operation constants in user header Without this change, users have to use raw values (1, 2, 3) to specify cache operation. Signed-off-by: Sugizaki Yukimasa <i.can.speak.c.and.basic@gmail.com> vcsm: Updates for changed vchiq interface vcsm: Fix an NULL dereference in the import_dmabuf error path resource was dereferenced even though it was NULL. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> vcsm: Use struct service_creation vcsm: Fix makefile include on out-of-tree builds The vc_sm module tries to include the 'fs' directory from the $(srctree). $(srctree) is already provided by the build system, and causes the include path to be duplicated. With -Werror this fails to compile. Remove the unnecessary variable. Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> vcsm: Remove set but unused variable The 'success' variable is set by the call to vchi_service_close() but never checked. Remove it, keeping the call in place. Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> vcsm: Reduce scope of local functions The functions: vc_vchi_sm_send_msg vc_sm_ioctl_alloc vc_sm_ioctl_alloc_share vc_sm_ioctl_import_dmabuf Are declared without a prototype. They are not used outside of this module, thus - convert them to static functions. Signed-off-by: Kieran Bingham <kieran.bingham@ideasonboard.com> vc_sm: Let it support to build in the non-src folder If we build the kernel with "-O=$non-src-folder", this driver will introdcue a building error because of the header's location. Signed-off-by: Hui Wang <hui.wang@canonical.com>	2020-11-04 11:07:02 +00:00
popcornmix	d3a2ccde15	vc_mem: Add vc_mem driver for querying firmware memory addresses Signed-off-by: popcornmix <popcornmix@gmail.com> BCM270x: Move vc_mem Make the vc_mem module available for ARCH_BCM2835 by moving it. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> char: vc_mem: Fix up compat ioctls for 64bit kernel compat_ioctl wasn't defined, so 32bit user/64bit kernel always failed. VC_MEM_IOC_MEM_PHYS_ADDR was defined with parameter size unsigned long, so the ioctl cmd changes between sizes. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org> char: vc_mem: Fix all coding style issues. Cleans up all checkpatch errors in vc_mem.c and vc_mem.h No functional change to the code. Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>	2020-11-04 11:07:02 +00:00
gellert	977c72a0d7	MMC: added alternative MMC driver mmc: Disable CMD23 transfers on all cards Pending wire-level investigation of these types of transfers and associated errors on bcm2835-mmc, disable for now. Fallback of CMD18/CMD25 transfers will be used automatically by the MMC layer. Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org> mmc: bcm2835-mmc: enable DT support for all architectures Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now. Enable Device Tree support for all architectures. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> mmc: bcm2835-mmc: fix probe error handling Probe error handling is broken in several places. Simplify error handling by using device managed functions. Replace pr_{err,info} with dev_{err,info}. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2835-mmc: Add locks when accessing sdhost registers bcm2835-mmc: Add range of debug options for slowing things down bcm2835-mmc: Add option to disable some delays bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23 bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23 bcm2835-mmc: Adding overclocking option Allow a different clock speed to be substitued for a requested 50MHz. This option is exposed using the "overclock_50" DT parameter. Note that the mmc interface is restricted to EVEN integer divisions of 250MHz, and the highest sensible option is 63 (250/4 = 62.5), the next being 125 (250/2) which is much too high. Use at your own risk. bcm2835-mmc: Round up the overclock, so 62 works for 62.5Mhz Also only warn once for each overclock setting. mmc: bcm2835-mmc: Make available on ARCH_BCM2835 Make the bcm2835-mmc driver available for use on ARCH_BCM2835. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270x_DT: add bcm2835-mmc entry Add Device Tree entry for bcm2835-mmc. In non-DT mode, don't add the device in the board file. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2835-mmc: Don't overwrite MMC capabilities from DT bcm2835-mmc: Don't override bus width capabilities from devicetree Take out the force setting of the MMC_CAP_4_BIT_DATA host capability so that the result read from devicetree via mmc_of_parse() is preserved. bcm2835-mmc: Only claim one DMA channel With both MMC controllers enabled there are few DMA channels left. The bcm2835-mmc driver only uses DMA in one direction at a time, so it doesn't need to claim two channels. See: https://github.com/raspberrypi/linux/issues/1327 Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-mmc: New timer API mmc: bcm2835-mmc: Support underclocking Support underclocking of the SD bus using the max-frequency DT property (which currently has no DT parameter). The sd_overclock parameter already provides another way to achieve the same thing which should be equivalent in end result, but it is a bug not to support max-frequency as well. See: https://github.com/raspberrypi/linux/issues/2350 Signed-off-by: Phil Elwell <phil@raspberrypi.org> mmc/bcm2835: Recover from MMC_SEND_EXT_CSD If the user issues an "mmc extcsd read", the SD controller receives what it thinks is a SEND_IF_COND command with an unexpected data block. The resulting operations leave the FSM stuck in READWAIT, a state which persists until the MMC framework resets the controller, by which point the root filesystem is likely to have been unmounted. A less heavyweight solution is to detect the condition and nudge the FSM by asserting the (self-clearing) FORCE_DATA_MODE bit. N.B. This workaround was essentially discovered by accident and without a full understanding the inner workings of the controller, so it is fortunate that the "fix" only modifies error paths. See: https://github.com/raspberrypi/linux/issues/2728 Signed-off-by: Phil Elwell <phil@raspberrypi.org> bcm2835-mmc: Fix DMA channel leak The BCM2835 MMC host driver requests a DMA channel on probe but neglects to release the channel in the probe error path and on driver unbind. I'm seeing this happen on every boot of the Compute Module 3: On first driver probe, DMA channel 2 is allocated and then leaked with a "could not get clk, deferring probe" message. On second driver probe, channel 4 is allocated. Fix it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> bcm2835-mmc: Fix struct mmc_host leak on probe The BCM2835 MMC host driver requests the bus address of the host's register map on probe. If that fails, the driver leaks the struct mmc_host allocated earlier. Fix it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> bcm2835-mmc: Fix duplicate free_irq() on remove The BCM2835 MMC host driver requests its interrupt as a device-managed resource, so the interrupt is automatically freed after the driver is unbound. However on driver unbind, bcm2835_mmc_remove() frees the interrupt explicitly to avoid invocation of the interrupt handler after driver structures have been torn down. The interrupt is thus freed twice, leading to a WARN splat in __free_irq(). Fix by not requesting the interrupt as a device-managed resource. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> bcm2835-mmc: Handle mmc_add_host() errors The BCM2835 MMC host driver calls mmc_add_host() but doesn't check its return value. Errors occurring in that function are therefore not handled. Fix it. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> bcm2835-mmc: Deduplicate reset of driver data on remove The BCM2835 MMC host driver sets the device's driver data pointer to NULL on ->remove() even though the driver core subsequently does the same in __device_release_driver(). Drop the duplicate assignment. Signed-off-by: Lukas Wunner <lukas@wunner.de> Cc: Frank Pavlic <f.pavlic@kunbus.de> bcm2835_mmc: Remove vestigial threaded IRQ With SDIO processing now managed by the MMC framework with a workqueue, the bcm2835_mmc driver no longer needs a threaded IRQ. Signed-off-by: Phil Elwell <phil@raspberrypi.org> Add missing dma_unmap_sg calls to free relevant swiotlb bounce buffers. This prevents DMA leaks. Signed-off-by: Yaroslav Rosomakho <yaroslavros@gmail.com> Limit max_req_size under arm64 (or any other platform that uses swiotlb) to prevent potential buffer overflow due to bouncing. Signed-off-by: Yaroslav Rosomakho <yaroslavros@gmail.com>	2020-11-04 11:07:02 +00:00
Florian Meier	6af4977e40	dmaengine: Add support for BCM2708 Add support for DMA controller of BCM2708 as used in the Raspberry Pi. Currently it only supports cyclic DMA. Signed-off-by: Florian Meier <florian.meier@koalo.de> dmaengine: expand functionality by supporting scatter/gather transfers sdhci-bcm2708 and dma.c: fix for LITE channels DMA: fix cyclic LITE length overflow bug dmaengine: bcm2708: Remove chancnt affectations Mirror bcm2835-dma.c commit `9eba5536a7`: chancnt is already filled by dma_async_device_register, which uses the channel list to know how much channels there is. Since it's already filled, we can safely remove it from the drivers' probe function. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: overwrite dreq only if it is not set dreq is set when the DMA channel is fetched from Device Tree. slave_id is set using dmaengine_slave_config(). Only overwrite dreq with slave_id if it is not set. dreq/slave_id in the cyclic DMA case is not touched, because I don't have hardware to test with. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: do device registration in the board file Don't register the device in the driver. Do it in the board file. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835 Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now. Add Device Tree support to the non ARCH_BCM2835 case. Use the same driver name regardless of architecture. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> BCM270x_DT: add bcm2835-dma entry Add Device Tree entry for bcm2835-dma. The entry doesn't contain any resources since they are handled by the arch/arm/mach-bcm270x/dma.c driver. In non-DT mode, don't add the device in the board file. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2708-dmaengine: Add debug options BCM270x: Add memory and irq resources to dmaengine device and DT Prepare for merging of the legacy DMA API arch driver dma.c with bcm2708-dmaengine by adding memory and irq resources both to platform file device and Device Tree node. Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c Merge the legacy DMA API driver with bcm2708-dmaengine. This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox driver is also needed). Changes to the dma.c code: - Use BIT() macro. - Cutdown some comments to one line. - Add mutex to vc_dmaman and use this, since the dev lock is locked during probing of the engine part. - Add global g_dmaman variable since drvdata is used by the engine part. - Restructure for readability: vc_dmaman_chan_alloc() vc_dmaman_chan_free() bcm_dma_chan_free() - Restructure bcm_dma_chan_alloc() to simplify error handling. - Use device irq resources instead of hardcoded bcm_dma_irqs table. - Remove dev_dmaman_register() and code it directly. - Remove dev_dmaman_deregister() and code it directly. - Simplify bcm_dmaman_probe() using devm_* functions. - Get dmachans from DT if available. - Keep 'dma.dmachans' module argument name for backwards compatibility. Make it available on ARCH_BCM2835 as well. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: set residue_granularity field bcm2708-dmaengine supports residue reporting at burst level but didn't report this via the residue_granularity field. Without this field set properly we get playback issues with I2S cards. dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer bcm2708-dmaengine: Use more DMA channels (but not 12) 1) Only the bcm2708_fb drivers uses the legacy DMA API, and it requires a BULK-capable channel, so all other types (FAST, NORMAL and LITE) can be made available to the regular DMA API. 2) DMA channels 11-14 share an interrupt. The driver can't handle this, so don't use channels 12-14 (12 was used, probably because it appears to have an interrupt, but in reality that interrupt is for activity on ANY channel). This may explain a lockup encountered when running out of DMA channels. The combined effect of this patch is to leave 7 DMA channels available + channel 0 for bcm2708_fb via the legacy API. See: https://github.com/raspberrypi/linux/issues/1110 https://github.com/raspberrypi/linux/issues/1108 dmaengine: bcm2708: Make legacy API available for bcm2835-dma bcm2708_fb uses the legacy DMA API, so in order to start using bcm2835-dma, bcm2835-dma has to support the legacy API. Make this possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove(). Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: Change DT compatible string Both bcm2835-dma and bcm2708-dmaengine have the same compatible string. So change compatible to "brcm,bcm2708-dma". Signed-off-by: Noralf Trønnes <noralf@tronnes.org> dmaengine: bcm2708: Remove driver but keep legacy API Dropping non-DT support means we don't need this driver, but we still need the legacy DMA API. Signed-off-by: Noralf Trønnes <noralf@tronnes.org> bcm2708-dmaengine - Fix arm64 portability/build issues dma-bcm2708: Fix module compilation of CONFIG_DMA_BCM2708 bcm2708-dmaengine.c defines functions like bcm_dma_start which are defined as well in dma-bcm2708.h as inline versions when CONFIG_DMA_BCM2708 is not defined. This works fine when CONFIG_DMA_BCM2708 is built in, but when it is selected as module build fails with redefinition errors because in the build system when CONFIG_DMA_BCM2708 is selected as module, the macro becomes CONFIG_DMA_BCM2708_MODULE. This patch makes the header use CONFIG_DMA_BCM2708_MODULE too when available. Fixes https://github.com/raspberrypi/linux/issues/2056 Signed-off-by: Andrei Gherzan <andrei@gherzan.com>	2020-11-04 11:07:01 +00:00
Grygorii Strashko	f6b94060a1	PM: runtime: Fix timer_expires data type on 32-bit arches commit `6b61d49a55` upstream. Commit `8234f6734c` ("PM-runtime: Switch autosuspend over to using hrtimers") switched PM runtime autosuspend to use hrtimers and all related time accounting in ns, but missed to update the timer_expires data type in struct dev_pm_info to u64. This causes the timer_expires value to be truncated on 32-bit architectures when assignment is done from u64 values: rpm_suspend() \|- dev->power.timer_expires = expires; Fix it by changing the timer_expires type to u64. Fixes: `8234f6734c` ("PM-runtime: Switch autosuspend over to using hrtimers") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Pavel Machek <pavel@ucw.cz> Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: 5.0+ <stable@vger.kernel.org> # 5.0+ [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:45:42 +01:00
Paras Sharma	e3f6c126a3	serial: qcom_geni_serial: To correct QUP Version detection logic commit `c9ca43d42e` upstream. For QUP IP versions 2.5 and above the oversampling rate is halved from 32 to 16. Commit `ce73460054` ("tty: serial: qcom_geni_serial: Update the oversampling rate") is pushed to handle this scenario. But the existing logic is failing to classify QUP Version 3.0 into the correct group ( 2.5 and above). As result Serial Engine clocks are not configured properly for baud rate and garbage data is sampled to FIFOs from the line. So, fix the logic to detect QUP with versions 2.5 and above. Fixes: `ce73460054` ("tty: serial: qcom_geni_serial: Update the oversampling rate") Cc: stable <stable@vger.kernel.org> Signed-off-by: Paras Sharma <parashar@codeaurora.org> Reviewed-by: Akash Asthana <akashast@codeaurora.org> Link: https://lore.kernel.org/r/1601445926-23673-1-git-send-email-parashar@codeaurora.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:45:42 +01:00
Gustavo A. R. Silva	241bd102e3	mtd: lpddr: Fix bad logic in print_drs_error commit `1c9c02bb22` upstream. Update logic for broken test. Use a more common logging style. It appears the logic in this function is broken for the consecutive tests of if (prog_status & 0x3) ... else if (prog_status & 0x2) ... else (prog_status & 0x1) ... Likely the first test should be if ((prog_status & 0x3) == 0x3) Found by inspection of include files using printk. Fixes: `eb3db27507` ("[MTD] LPDDR PFOW definition") Cc: stable@vger.kernel.org Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/3fb0e29f5b601db8be2938a01d974b00c8788501.1588016644.git.gustavo@embeddedor.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:45:42 +01:00
Dan Williams	a092869e03	x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}() commit `ec6347bb43` upstream. In reaction to a proposal to introduce a memcpy_mcsafe_fast() implementation Linus points out that memcpy_mcsafe() is poorly named relative to communicating the scope of the interface. Specifically what addresses are valid to pass as source, destination, and what faults / exceptions are handled. Of particular concern is that even though x86 might be able to handle the semantics of copy_mc_to_user() with its common copy_user_generic() implementation other archs likely need / want an explicit path for this case: On Fri, May 1, 2020 at 11:28 AM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Thu, Apr 30, 2020 at 6:21 PM Dan Williams <dan.j.williams@intel.com> wrote: > > > > However now I see that copy_user_generic() works for the wrong reason. > > It works because the exception on the source address due to poison > > looks no different than a write fault on the user address to the > > caller, it's still just a short copy. So it makes copy_to_user() work > > for the wrong reason relative to the name. > > Right. > > And it won't work that way on other architectures. On x86, we have a > generic function that can take faults on either side, and we use it > for both cases (and for the "in_user" case too), but that's an > artifact of the architecture oddity. > > In fact, it's probably wrong even on x86 - because it can hide bugs - > but writing those things is painful enough that everybody prefers > having just one function. Replace a single top-level memcpy_mcsafe() with either copy_mc_to_user(), or copy_mc_to_kernel(). Introduce an x86 copy_mc_fragile() name as the rename for the low-level x86 implementation formerly named memcpy_mcsafe(). It is used as the slow / careful backend that is supplanted by a fast copy_mc_generic() in a follow-on patch. One side-effect of this reorganization is that separating copy_mc_64.S to its own file means that perf no longer needs to track dependencies for its memcpy_64.S benchmarks. [ bp: Massage a bit. ] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: <stable@vger.kernel.org> Link: http://lore.kernel.org/r/CAHk-=wjSqtXAqfUJxFtWNwmguFASTgB0dz1dT3V-78Quiezqbg@mail.gmail.com Link: https://lkml.kernel.org/r/160195561680.2163339.11574962055305783722.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:45:37 +01:00
Kees Cook	4720b25e4c	fs/kernel_read_file: Remove FIRMWARE_EFI_EMBEDDED enum commit `06e67b849a` upstream. The "FIRMWARE_EFI_EMBEDDED" enum is a "where", not a "what". It should not be distinguished separately from just "FIRMWARE", as this confuses the LSMs about what is being loaded. Additionally, there was no actual validation of the firmware contents happening. Fixes: `e4c2c0ff00` ("firmware: Add new platform fallback mechanism and firmware_request_platform()") Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Scott Branden <scott.branden@broadcom.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20201002173828.2099543-3-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:45:36 +01:00
Jens Axboe	511abceaf0	io_uring: don't rely on weak ->files references commit `0f2122045b` upstream. Grab actual references to the files_struct. To avoid circular references issues due to this, we add a per-task note that keeps track of what io_uring contexts a task has used. When the tasks execs or exits its assigned files, we cancel requests based on this tracking. With that, we can grab proper references to the files table, and no longer need to rely on stashing away ring_fd and ring_file to check if the ring_fd may have been closed. Cc: stable@vger.kernel.org # v5.5+ Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-11-01 12:45:35 +01:00
Serge Semin	e106dc6c4c	dmaengine: dw: Add DMA-channels mask cell support [ Upstream commit `e8ee6c8cb6` ] DW DMA IP-core provides a way to synthesize the DMA controller with channels having different parameters like maximum burst-length, multi-block support, maximum data width, etc. Those parameters both explicitly and implicitly affect the channels performance. Since DMA slave devices might be very demanding to the DMA performance, let's provide a functionality for the slaves to be assigned with DW DMA channels, which performance according to the platform engineer fulfill their requirements. After this patch is applied it can be done by passing the mask of suitable DMA-channels either directly in the dw_dma_slave structure instance or as a fifth cell of the DMA DT-property. If mask is zero or not provided, then there is no limitation on the channels allocation. For instance Baikal-T1 SoC is equipped with a DW DMAC engine, which first two channels are synthesized with max burst length of 16, while the rest of the channels have been created with max-burst-len=4. It would seem that the first two channels must be faster than the others and should be more preferable for the time-critical DMA slave devices. In practice it turned out that the situation is quite the opposite. The channels with max-burst-len=4 demonstrated a better performance than the channels with max-burst-len=16 even when they both had been initialized with the same settings. The performance drop of the first two DMA-channels made them unsuitable for the DW APB SSI slave device. No matter what settings they are configured with, full-duplex SPI transfers occasionally experience the Rx FIFO overflow. It means that the DMA-engine doesn't keep up with incoming data pace even though the SPI-bus is enabled with speed of 25MHz while the DW DMA controller is clocked with 50MHz signal. There is no such problem has been noticed for the channels synthesized with max-burst-len=4. Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Link: https://lore.kernel.org/r/20200731200826.9292-6-Sergey.Semin@baikalelectronics.ru Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:08:29 +01:00
Maciej Fijalkowski	7a40d28144	bpf: Limit caller's stack depth 256 for subprogs with tailcalls [ Upstream commit `7f6e4312e1` ] Protect against potential stack overflow that might happen when bpf2bpf calls get combined with tailcalls. Limit the caller's stack depth for such case down to 256 so that the worst case scenario would result in 8k stack size (32 which is tailcall limit * 256 = 8k). Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:08:24 +01:00
Dennis YC Hsieh	632bf6c3b8	soc: mediatek: cmdq: add clear option in cmdq_pkt_wfe api [ Upstream commit `23c22299cd` ] Add clear parameter to let client decide if event should be clear to 0 after GCE receive it. Signed-off-by: Dennis YC Hsieh <dennis-yc.hsieh@mediatek.com> Acked-by: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/1594136714-11650-9-git-send-email-dennis-yc.hsieh@mediatek.com [mb: fix commit message] Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:08:15 +01:00
Matthew Rosato	7e4f15f7c9	PCI/IOV: Mark VFs as not implementing PCI_COMMAND_MEMORY [ Upstream commit `12856e7acd` ] For VFs, the Memory Space Enable bit in the Command Register is hard-wired to 0. Add a new bit to signify devices where the Command Register Memory Space Enable bit does not control the device's response to MMIO accesses. Fixes: `abafbc551f` ("vfio-pci: Invalidate mmaps and block MMIO access on disabled memory") Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:08:05 +01:00
Matthew Wilcox (Oracle)	546f367094	mm/page_owner: change split_page_owner to take a count [ Upstream commit `8fb156c9ee` ] The implementation of split_page_owner() prefers a count rather than the old order of the page. When we support a variable size THP, we won't have the order at this point, but we will have the number of pages. So change the interface to what the caller and callee would prefer. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: SeongJae Park <sjpark@amazon.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Huang Ying <ying.huang@intel.com> Link: https://lkml.kernel.org/r/20200908195539.25896-4-willy@infradead.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:08:03 +01:00
Leon Romanovsky	ea879d9c81	overflow: Include header file with SIZE_MAX declaration [ Upstream commit `a4947e84f2` ] The various array_size functions use SIZE_MAX define, but missed limits.h causes to failure to compile code that needs overflow.h. In file included from drivers/infiniband/core/uverbs_std_types_device.c:6: ./include/linux/overflow.h: In function 'array_size': ./include/linux/overflow.h:258:10: error: 'SIZE_MAX' undeclared (first use in this function) 258 \| return SIZE_MAX; \| ^~~~~~~~ Fixes: `610b15c50e` ("overflow.h: Add allocation size calculation helpers") Link: https://lore.kernel.org/r/20200913102928.134985-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:08:00 +01:00
Suren Baghdasaryan	7871c282d2	mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary [ Upstream commit `67197a4f28` ] Currently __set_oom_adj loops through all processes in the system to keep oom_score_adj and oom_score_adj_min in sync between processes sharing their mm. This is done for any task with more that one mm_users, which includes processes with multiple threads (sharing mm and signals). However for such processes the loop is unnecessary because their signal structure is shared as well. Android updates oom_score_adj whenever a tasks changes its role (background/foreground/...) or binds to/unbinds from a service, making it more/less important. Such operation can happen frequently. We noticed that updates to oom_score_adj became more expensive and after further investigation found out that the patch mentioned in "Fixes" introduced a regression. Using Pixel 4 with a typical Android workload, write time to oom_score_adj increased from ~3.57us to ~362us. Moreover this regression linearly depends on the number of multi-threaded processes running on the system. Mark the mm with a new MMF_MULTIPROCESS flag bit when task is created with (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK). Change __set_oom_adj to use MMF_MULTIPROCESS instead of mm_users to decide whether oom_score_adj update should be synchronized between multiple processes. To prevent races between clone() and __set_oom_adj(), when oom_score_adj of the process being cloned might be modified from userspace, we use oom_adj_mutex. Its scope is changed to global. The combination of (CLONE_VM && !CLONE_THREAD) is rarely used except for the case of vfork(). To prevent performance regressions of vfork(), we skip taking oom_adj_mutex and setting MMF_MULTIPROCESS when CLONE_VFORK is specified. Clearing the MMF_MULTIPROCESS flag (when the last process sharing the mm exits) is left out of this patch to keep it simple and because it is believed that this threading model is rare. Should there ever be a need for optimizing that case as well, it can be done by hooking into the exit path, likely following the mm_update_next_owner pattern. With the combination of (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK) being quite rare, the regression is gone after the change is applied. [surenb@google.com: v3] Link: https://lkml.kernel.org/r/20200902012558.2335613-1-surenb@google.com Fixes: `44a70adec9` ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj") Reported-by: Tim Murray <timmurray@google.com> Suggested-by: Michal Hocko <mhocko@kernel.org> Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Christian Brauner <christian.brauner@ubuntu.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Eugene Syromiatnikov <esyr@redhat.com> Cc: Christian Kellner <christian@kellner.me> Cc: Adrian Reber <areber@redhat.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alexey Gladkov <gladkov.alexey@gmail.com> Cc: Michel Lespinasse <walken@google.com> Cc: Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Bernd Edlinger <bernd.edlinger@hotmail.de> Cc: John Johansen <john.johansen@canonical.com> Cc: Yafang Shao <laoar.shao@gmail.com> Link: https://lkml.kernel.org/r/20200824153036.3201505-1-surenb@google.com Debugged-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-29 10:07:50 +01:00
Vijay Balakrishna	a4c5f912c9	mm: khugepaged: recalculate min_free_kbytes after memory hotplug as expected by khugepaged commit `4aab2be098` upstream. When memory is hotplug added or removed the min_free_kbytes should be recalculated based on what is expected by khugepaged. Currently after hotplug, min_free_kbytes will be set to a lower default and higher default set when THP enabled is lost. This change restores min_free_kbytes as expected for THP consumers. [vijayb@linux.microsoft.com: v5] Link: https://lkml.kernel.org/r/1601398153-5517-1-git-send-email-vijayb@linux.microsoft.com Fixes: `f000565adb` ("thp: set recommended min free kbytes") Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Pavel Tatashin <pasha.tatashin@soleen.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Allen Pais <apais@microsoft.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Song Liu <songliubraving@fb.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/1600305709-2319-2-git-send-email-vijayb@linux.microsoft.com Link: https://lkml.kernel.org/r/1600204258-13683-1-git-send-email-vijayb@linux.microsoft.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-14 11:55:59 +02:00
Minchan Kim	0d600018dd	mm: validate inode in mapping_set_error() commit `8b7b2eb131` upstream. The swap address_space doesn't have host. Thus, it makes kernel crash once swap write meets error. Fix it. Fixes: `735e4ae5ba` ("vfs: track per-sb writeback errors and report them to syncfs") Signed-off-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: Jan Kara <jack@suse.cz> Cc: Andres Freund <andres@anarazel.de> Cc: Matthew Wilcox <willy@infradead.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: David Howells <dhowells@redhat.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20201010000650.750063-1-minchan@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-14 11:55:59 +02:00
Eran Ben Elisha	da87ea1373	net/mlx5: Avoid possible free of command entry while timeout comp handler [ Upstream commit `50b2412b7e` ] Upon command completion timeout, driver simulates a forced command completion. In a rare case where real interrupt for that command arrives simultaneously, it might release the command entry while the forced handler might still access it. Fix that by adding an entry refcount, to track current amount of allowed handlers. Command entry to be released only when this refcount is decremented to zero. Command refcount is always initialized to one. For callback commands, command completion handler is the symmetric flow to decrement it. For non-callback commands, it is wait_func(). Before ringing the doorbell, increment the refcount for the real completion handler. Once the real completion handler is called, it will decrement it. For callback commands, once the delayed work is scheduled, increment the refcount. Upon callback command completion handler, we will try to cancel the timeout callback. In case of success, we need to decrement the callback refcount as it will never run. In addition, gather the entry index free and the entry free into a one flow for all command types release. Fixes: `e126ba97db` ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-14 11:55:56 +02:00
Qian Cai	04f31610f3	pipe: Fix memory leaks in create_pipe_files() [ Upstream commit `8a018eb55e` ] Calling pipe2() with O_NOTIFICATION_PIPE could results in memory leaks unless watch_queue_init() is successful. In case of watch_queue_init() failure in pipe2() we are left with inode and pipe_inode_info instances that need to be freed. That failure exit has been introduced in commit `c73be61ced` ("pipe: Add general notification queue support") and its handling should've been identical to nearby treatment of alloc_file_pseudo() failures - it is dealing with the same situation. As it is, the mainline kernel leaks in that case. Another problem is that CONFIG_WATCH_QUEUE and !CONFIG_WATCH_QUEUE cases are treated differently (and the former leaks just pipe_inode_info, the latter - both pipe_inode_info and inode). Fixed by providing a dummy wacth_queue_init() in !CONFIG_WATCH_QUEUE case and by having failures of wacth_queue_init() handled the same way we handle alloc_file_pseudo() ones. Fixes: `c73be61ced` ("pipe: Add general notification queue support") Signed-off-by: Qian Cai <cai@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-14 11:55:55 +02:00
Coly Li	854828e10e	net: introduce helper sendpage_ok() in include/linux/net.h commit `c381b07941` upstream. The original problem was from nvme-over-tcp code, who mistakenly uses kernel_sendpage() to send pages allocated by __get_free_pages() without __GFP_COMP flag. Such pages don't have refcount (page_count is 0) on tail pages, sending them by kernel_sendpage() may trigger a kernel panic from a corrupted kernel heap, because these pages are incorrectly freed in network stack as page_count 0 pages. This patch introduces a helper sendpage_ok(), it returns true if the checking page, - is not slab page: PageSlab(page) is false. - has page refcount: page_count(page) is not zero All drivers who want to send page to remote end by kernel_sendpage() may use this helper to check whether the page is OK. If the helper does not return true, the driver should try other non sendpage method (e.g. sock_no_sendpage()) to handle the page. Signed-off-by: Coly Li <colyli@suse.de> Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Jan Kara <jack@suse.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Mikhail Skorzhinskii <mskorzhinskiy@solarflare.com> Cc: Philipp Reisner <philipp.reisner@linbit.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Vlastimil Babka <vbabka@suse.com> Cc: stable@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-14 11:55:49 +02:00
Peilin Ye	34873e40e8	Fonts: Support FONT_EXTRA_WORDS macros for built-in fonts commit `6735b4632d` upstream. syzbot has reported an issue in the framebuffer layer, where a malicious user may overflow our built-in font data buffers. In order to perform a reliable range check, subsystems need to know `FONTDATAMAX` for each built-in font. Unfortunately, our font descriptor, `struct console_font` does not contain `FONTDATAMAX`, and is part of the UAPI, making it infeasible to modify it. For user-provided fonts, the framebuffer layer resolves this issue by reserving four extra words at the beginning of data buffers. Later, whenever a function needs to access them, it simply uses the following macros: Recently we have gathered all the above macros to <linux/font.h>. Let us do the same thing for built-in fonts, prepend four extra words (including `FONTDATAMAX`) to their data buffers, so that subsystems can use these macros for all fonts, no matter built-in or user-provided. This patch depends on patch "fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h". Cc: stable@vger.kernel.org Link: https://syzkaller.appspot.com/bug?id=08b8be45afea11888776f897895aef9ad1c3ecfd Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/ef18af00c35fb3cc826048a5f70924ed6ddce95b.1600953813.git.yepeilin.cs@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-14 11:55:45 +02:00
Peilin Ye	3714c5596a	fbdev, newport_con: Move FONT_EXTRA_WORDS macros into linux/font.h commit `bb0890b4cd` upstream. drivers/video/console/newport_con.c is borrowing FONT_EXTRA_WORDS macros from drivers/video/fbdev/core/fbcon.h. To keep things simple, move all definitions into <linux/font.h>. Since newport_con now uses four extra words, initialize the fourth word in newport_set_font() properly. Cc: stable@vger.kernel.org Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/7fb8bc9b0abc676ada6b7ac0e0bd443499357267.1600953813.git.yepeilin.cs@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-14 11:55:45 +02:00
Damien Le Moal	a12f67b547	scsi: sd: sd_zbc: Fix handling of host-aware ZBC disks commit `27ba3e8ff3` upstream. When CONFIG_BLK_DEV_ZONED is disabled, allow using host-aware ZBC disks as regular disks. In this case, ensure that command completion is correctly executed by changing sd_zbc_complete() to return good_bytes instead of 0 and causing a hang during device probe (endless retries). When CONFIG_BLK_DEV_ZONED is enabled and a host-aware disk is detected to have partitions, it will be used as a regular disk. In this case, make sure to not do anything in sd_zbc_revalidate_zones() as that triggers warnings. Since all these different cases result in subtle settings of the disk queue zoned model, introduce the block layer helper function blk_queue_set_zoned() to generically implement setting up the effective zoned model according to the disk type, the presence of partitions on the disk and CONFIG_BLK_DEV_ZONED configuration. Link: https://lore.kernel.org/r/20200915073347.832424-2-damien.lemoal@wdc.com Fixes: `b72053072c` ("block: allow partitions on host aware zone devices") Cc: <stable@vger.kernel.org> Reported-by: Borislav Petkov <bp@alien8.de> Suggested-by: Christoph Hellwig <hch@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-07 08:02:54 +02:00
Linus Torvalds	d4ff049a34	pipe: remove pipe_wait() and fix wakeup race with splice [ Upstream commit `472e5b056f` ] The pipe splice code still used the old model of waiting for pipe IO by using a non-specific "pipe_wait()" that waited for any pipe event to happen, which depended on all pipe IO being entirely serialized by the pipe lock. So by checking the state you were waiting for, and then adding yourself to the wait queue before dropping the lock, you were guaranteed to see all the wakeups. Strictly speaking, the actual wakeups were not done under the lock, but the pipe_wait() model still worked, because since the waiter held the lock when checking whether it should sleep, it would always see the current state, and the wakeup was always done after updating the state. However, commit `0ddad21d3e` ("pipe: use exclusive waits when reading or writing") split the single wait-queue into two, and in the process also made the "wait for event" code wait for _two_ wait queues, and that then showed a race with the wakers that were not serialized by the pipe lock. It's only splice that used that "pipe_wait()" model, so the problem wasn't obvious, but Josef Bacik reports: "I hit a hang with fstest btrfs/187, which does a btrfs send into /dev/null. This works by creating a pipe, the write side is given to the kernel to write into, and the read side is handed to a thread that splices into a file, in this case /dev/null. The box that was hung had the write side stuck here [pipe_write] and the read side stuck here [splice_from_pipe_next -> pipe_wait]. [ more details about pipe_wait() scenario ] The problem is we're doing the prepare_to_wait, which sets our state each time, however we can be woken up either with reads or writes. In the case above we race with the WRITER waking us up, and re-set our state to INTERRUPTIBLE, and thus never break out of schedule" Josef had a patch that avoided the issue in pipe_wait() by just making it set the state only once, but the deeper problem is that pipe_wait() depends on a level of synchonization by the pipe mutex that it really shouldn't. And the whole "wait for any pipe state change" model really isn't very good to begin with. So rather than trying to work around things in pipe_wait(), remove that legacy model of "wait for arbitrary pipe event" entirely, and actually create functions that wait for the pipe actually being readable or writable, and can do so without depending on the pipe lock serializing everything. Fixes: `0ddad21d3e` ("pipe: use exclusive waits when reading or writing") Link: https://lore.kernel.org/linux-fsdevel/bfa88b5ad6f069b2b679316b9e495a970130416c.1601567868.git.josef@toxicpanda.com/ Reported-by: Josef Bacik <josef@toxicpanda.com> Reviewed-and-tested-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-07 08:02:54 +02:00
Kai-Heng Feng	b64a43b072	memstick: Skip allocating card when removing host commit `62c59a8786` upstream. After commit `6827ca573c` ("memstick: rtsx_usb_ms: Support runtime power management"), removing module rtsx_usb_ms will be stuck. The deadlock is caused by powering on and powering off at the same time, the former one is when memstick_check() is flushed, and the later is called by memstick_remove_host(). Soe let's skip allocating card to prevent this issue. Fixes: `6827ca573c` ("memstick: rtsx_usb_ms: Support runtime power management") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Link: https://lore.kernel.org/r/20200925084952.13220-1-kai.heng.feng@canonical.com Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-07 08:02:46 +02:00
Laurent Dufour	862f8bb32f	mm: don't rely on system state to detect hot-plug operations commit `f85086f95f` upstream. In register_mem_sect_under_node() the system_state's value is checked to detect whether the call is made during boot time or during an hot-plug operation. Unfortunately, that check against SYSTEM_BOOTING is wrong because regular memory is registered at SYSTEM_SCHEDULING state. In addition, memory hot-plug operation can be triggered at this system state by the ACPI [1]. So checking against the system state is not enough. The consequence is that on system with interleaved node's ranges like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] This can be seen on PowerPC LPAR after multiple memory hot-plug and hot-unplug operations are done. At the next reboot the node's memory ranges can be interleaved and since the call to link_mem_sections() is made in topology_init() while the system is in the SYSTEM_SCHEDULING state, the node's id is not checked, and the sections registered to multiple nodes: $ ls -l /sys/devices/system/memory/memory21/node* total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 In that case, the system is able to boot but if later one of theses memory blocks is hot-unplugged and then hot-plugged, the sysfs inconsistency is detected and this is triggering a BUG_ON(): kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This patch addresses the root cause by not relying on the system_state value to detect whether the call is due to a hot-plug operation. An extra parameter is added to link_mem_sections() detailing whether the operation is due to a hot-plug operation. [1] According to Oscar Salvador, using this qemu command line, ACPI memory hotplug operations are raised at SYSTEM_SCHEDULING state: $QEMU -enable-kvm -machine pc -smp 4,sockets=4,cores=1,threads=1 -cpu host -monitor pty \ -m size=$MEM,slots=255,maxmem=4294967296k \ -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,mem=512 \ -object memory-backend-ram,id=memdimm0,size=134217728 -device pc-dimm,node=0,memdev=memdimm0,id=dimm0,slot=0 \ -object memory-backend-ram,id=memdimm1,size=134217728 -device pc-dimm,node=0,memdev=memdimm1,id=dimm1,slot=1 \ -object memory-backend-ram,id=memdimm2,size=134217728 -device pc-dimm,node=0,memdev=memdimm2,id=dimm2,slot=2 \ -object memory-backend-ram,id=memdimm3,size=134217728 -device pc-dimm,node=0,memdev=memdimm3,id=dimm3,slot=3 \ -object memory-backend-ram,id=memdimm4,size=134217728 -device pc-dimm,node=1,memdev=memdimm4,id=dimm4,slot=4 \ -object memory-backend-ram,id=memdimm5,size=134217728 -device pc-dimm,node=1,memdev=memdimm5,id=dimm5,slot=5 \ -object memory-backend-ram,id=memdimm6,size=134217728 -device pc-dimm,node=1,memdev=memdimm6,id=dimm6,slot=6 \ Fixes: `4fbce63391` ("mm/memory_hotplug.c: make register_mem_sect_under_node() a callback of walk_memory_range()") Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-3-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-01 17:36:35 +02:00
Laurent Dufour	b8fdce3178	mm: replace memmap_context by meminit_context commit `c1d0da8335` upstream. Patch series "mm: fix memory to node bad links in sysfs", v3. Sometimes, firmware may expose interleaved memory layout like this: Early memory node ranges node 1: [mem 0x0000000000000000-0x000000011fffffff] node 2: [mem 0x0000000120000000-0x000000014fffffff] node 1: [mem 0x0000000150000000-0x00000001ffffffff] node 0: [mem 0x0000000200000000-0x000000048fffffff] node 2: [mem 0x0000000490000000-0x00000007ffffffff] In that case, we can see memory blocks assigned to multiple nodes in sysfs: $ ls -l /sys/devices/system/memory/memory21 total 0 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node1 -> ../../node/node1 lrwxrwxrwx 1 root root 0 Aug 24 05:27 node2 -> ../../node/node2 -rw-r--r-- 1 root root 65536 Aug 24 05:27 online -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_device -r--r--r-- 1 root root 65536 Aug 24 05:27 phys_index drwxr-xr-x 2 root root 0 Aug 24 05:27 power -r--r--r-- 1 root root 65536 Aug 24 05:27 removable -rw-r--r-- 1 root root 65536 Aug 24 05:27 state lrwxrwxrwx 1 root root 0 Aug 24 05:25 subsystem -> ../../../../bus/memory -rw-r--r-- 1 root root 65536 Aug 24 05:25 uevent -r--r--r-- 1 root root 65536 Aug 24 05:27 valid_zones The same applies in the node's directory with a memory21 link in both the node1 and node2's directory. This is wrong but doesn't prevent the system to run. However when later, one of these memory blocks is hot-unplugged and then hot-plugged, the system is detecting an inconsistency in the sysfs layout and a BUG_ON() is raised: kernel BUG at /Users/laurent/src/linux-ppc/mm/memory_hotplug.c:1084! LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp pseries_rng rng_core vmx_crypto gf128mul binfmt_misc ip_tables x_tables xfs libcrc32c crc32c_vpmsum autofs4 CPU: 8 PID: 10256 Comm: drmgr Not tainted 5.9.0-rc1+ #25 Call Trace: add_memory_resource+0x23c/0x340 (unreliable) __add_memory+0x5c/0xf0 dlpar_add_lmb+0x1b4/0x500 dlpar_memory+0x1f8/0xb80 handle_dlpar_errorlog+0xc0/0x190 dlpar_store+0x198/0x4a0 kobj_attr_store+0x30/0x50 sysfs_kf_write+0x64/0x90 kernfs_fop_write+0x1b0/0x290 vfs_write+0xe8/0x290 ksys_write+0xdc/0x130 system_call_exception+0x160/0x270 system_call_common+0xf0/0x27c This has been seen on PowerPC LPAR. The root cause of this issue is that when node's memory is registered, the range used can overlap another node's range, thus the memory block is registered to multiple nodes in sysfs. There are two issues here: (a) The sysfs memory and node's layouts are broken due to these multiple links (b) The link errors in link_mem_sections() should not lead to a system panic. To address (a) register_mem_sect_under_node should not rely on the system state to detect whether the link operation is triggered by a hot plug operation or not. This is addressed by the patches 1 and 2 of this series. Issue (b) will be addressed separately. This patch (of 2): The memmap_context enum is used to detect whether a memory operation is due to a hot-add operation or happening at boot time. Make it general to the hotplug operation and rename it as meminit_context. There is no functional change introduced by this patch Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J . Wysocki" <rafael@kernel.org> Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Scott Cheloha <cheloha@linux.ibm.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200915094143.79181-1-ldufour@linux.ibm.com Link: https://lkml.kernel.org/r/20200915132624.9723-1-ldufour@linux.ibm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-01 17:36:34 +02:00
Vasily Gorbik	2a4b8662dd	mm/gup: fix gup_fast with dynamic page table folding commit `d3f7b1bb20` upstream. Currently to make sure that every page table entry is read just once gup_fast walks perform READ_ONCE and pass pXd value down to the next gup_pXd_range function by value e.g.: static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page *pages, int nr) ... pudp = pud_offset(&p4d, addr); This function passes a reference on that local value copy to pXd_offset, and might get the very same pointer in return. This happens when the level is folded (on most arches), and that pointer should not be iterated. On s390 due to the fact that each task might have different 5,4 or 3-level address translation and hence different levels folded the logic is more complex and non-iteratable pointer to a local copy leads to severe problems. Here is an example of what happens with gup_fast on s390, for a task with 3-level paging, crossing a 2 GB pud boundary: // addr = 0x1007ffff000, end = 0x10080001000 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end, unsigned int flags, struct page *pages, int nr) { unsigned long next; pud_t pudp; // pud_offset returns &p4d itself (a pointer to a value on stack) pudp = pud_offset(&p4d, addr); do { // on second iteratation reading "random" stack value pud_t pud = READ_ONCE(pudp); // next = 0x10080000000, due to PUD_SIZE/MASK != PGDIR_SIZE/MASK on s390 next = pud_addr_end(addr, end); ... } while (pudp++, addr = next, addr != end); // pudp++ iterating over stack return 1; } This happens since s390 moved to common gup code with commit `d1874a0c28` ("s390/mm: make the pxd_offset functions more robust") and commit `1a42010cdc` ("s390/mm: convert to the generic get_user_pages_fast code"). s390 tried to mimic static level folding by changing pXd_offset primitives to always calculate top level page table offset in pgd_offset and just return the value passed when pXd_offset has to act as folded. What is crucial for gup_fast and what has been overlooked is that PxD_SIZE/MASK and thus pXd_addr_end should also change correspondingly. And the latter is not possible with dynamic folding. To fix the issue in addition to pXd values pass original pXdp pointers down to gup_pXd_range functions. And introduce pXd_offset_lockless helpers, which take an additional pXd entry value parameter. This has already been discussed in https://lkml.kernel.org/r/20190418100218.0a4afd51@mschwideX1 Fixes: `1a42010cdc` ("s390/mm: convert to the generic get_user_pages_fast code") Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Mike Rapoport <rppt@linux.ibm.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Claudio Imbrenda <imbrenda@linux.ibm.com> Cc: <stable@vger.kernel.org> [5.2+] Link: https://lkml.kernel.org/r/patch.git-943f1e5dcff2.your-ad-here.call-01599856292-ext-8676@work.hours Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-01 17:36:34 +02:00
Masami Hiramatsu	913d4c0dcd	kprobes: tracing/kprobes: Fix to kill kprobes on initmem after boot commit `82d083ab60` upstream. Since kprobe_event= cmdline option allows user to put kprobes on the functions in initmem, kprobe has to make such probes gone after boot. Currently the probes on the init functions in modules will be handled by module callback, but the kernel init text isn't handled. Without this, kprobes may access non-exist text area to disable or remove it. Link: https://lkml.kernel.org/r/159972810544.428528.1839307531600646955.stgit@devnote2 Fixes: `970988e19e` ("tracing/kprobe: Add kprobe_event= boot parameter") Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-10-01 17:36:33 +02:00
Dmitry Bogdanov	0f5479c614	net: qed: Disable aRFS for NPAR and 100G [ Upstream commit `2d2fe84337` ] In CMT and NPAR the PF is unknown when the GFS block processes the packet. Therefore cannot use searcher as it has a per PF database, and thus ARFS must be disabled. Fixes: `d51e4af5c2` ("qed: aRFS infrastructure support") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Dmitry Bogdanov <dbogdanov@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-10-01 17:36:27 +02:00
Eric Dumazet	b42148e227	net: add __must_check to skb_put_padto() [ Upstream commit `4a009cb04a` ] skb_put_padto() and __skb_put_padto() callers must check return values or risk use-after-free. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-26 18:05:26 +02:00
Jan Kara	e37b4e9e1f	dax: Fix compilation for CONFIG_DAX && !CONFIG_FS_DAX commit `88b67edd72` upstream. dax_supported() is defined whenever CONFIG_DAX is enabled. So dummy implementation should be defined only in !CONFIG_DAX case, not in !CONFIG_FS_DAX case. Fixes: `e2ec512825` ("dm: Call proper helper to determine dax support") Cc: <stable@vger.kernel.org> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-23 13:00:03 +02:00
Jan Kara	19f02438a8	dm: Call proper helper to determine dax support commit `e2ec512825` upstream. DM was calling generic_fsdax_supported() to determine whether a device referenced in the DM table supports DAX. However this is a helper for "leaf" device drivers so that they don't have to duplicate common generic checks. High level code should call dax_supported() helper which that calls into appropriate helper for the particular device. This problem manifested itself as kernel messages: dm-3: error: dax access failed (-95) when lvm2-testsuite run in cases where a DM device was stacked on top of another DM device. Fixes: `7bf7eac8d6` ("dax: Arrange for dax_supported check to span multiple devices") Cc: <stable@vger.kernel.org> Tested-by: Adrian Huang <ahuang12@lenovo.com> Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: Mike Snitzer <snitzer@redhat.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/160061715195.13131.5503173247632041975.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-23 13:00:03 +02:00
Andrew Jones	be0d3b6e44	arm64: paravirt: Initialize steal time when cpu is online commit `75df529bec` upstream. Steal time initialization requires mapping a memory region which invokes a memory allocation. Doing this at CPU starting time results in the following trace when CONFIG_DEBUG_ATOMIC_SLEEP is enabled: BUG: sleeping function called from invalid context at mm/slab.h:498 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1 CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.0-rc5+ #1 Call trace: dump_backtrace+0x0/0x208 show_stack+0x1c/0x28 dump_stack+0xc4/0x11c ___might_sleep+0xf8/0x130 __might_sleep+0x58/0x90 slab_pre_alloc_hook.constprop.101+0xd0/0x118 kmem_cache_alloc_node_trace+0x84/0x270 __get_vm_area_node+0x88/0x210 get_vm_area_caller+0x38/0x40 __ioremap_caller+0x70/0xf8 ioremap_cache+0x78/0xb0 memremap+0x9c/0x1a8 init_stolen_time_cpu+0x54/0xf0 cpuhp_invoke_callback+0xa8/0x720 notify_cpu_starting+0xc8/0xd8 secondary_start_kernel+0x114/0x180 CPU1: Booted secondary processor 0x0000000001 [0x431f0a11] However we don't need to initialize steal time at CPU starting time. We can simply wait until CPU online time, just sacrificing a bit of accuracy by returning zero for steal time until we know better. While at it, add __init to the functions that are only called by pv_time_init() which is __init. Signed-off-by: Andrew Jones <drjones@redhat.com> Fixes: `e0685fa228` ("arm64: Retrieve stolen time as paravirtualized guest") Cc: stable@vger.kernel.org Reviewed-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20200916154530.40809-1-drjones@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-23 13:00:02 +02:00
Johan Hovold	165b69c8f0	serial: core: fix console port-lock regression commit `e0830dbf71` upstream. Fix the port-lock initialisation regression introduced by commit `a3cb39d258` ("serial: core: Allow detach and attach serial device for console") by making sure that the lock is again initialised during console setup. The console may be registered before the serial controller has been probed in which case the port lock needs to be initialised during console setup by a call to uart_set_options(). The console-detach changes introduced a regression in several drivers by effectively removing that initialisation by not initialising the lock when the port is used as a console (which is always the case during console setup). Add back the early lock initialisation and instead use a new console-reinit flag to handle the case where a console is being re-attached through sysfs. The question whether the console-detach interface should have been added in the first place is left for another discussion. Note that the console-enabled check in uart_set_options() is not redundant because of kgdboc, which can end up reinitialising an already enabled console (see commit `42b6a1baa3` ("serial_core: Don't re-initialize a previously initialized spinlock.")). Fixes: `a3cb39d258` ("serial: core: Allow detach and attach serial device for console") Cc: stable <stable@vger.kernel.org> # 5.7 Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20200909143101.15389-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-23 13:00:01 +02:00
Hou Tao	ce2b5fe9e7	locking/percpu-rwsem: Use this_cpu_{inc,dec}() for read_count [ Upstream commit `e6b1a44ecc` ] The __this_cpu() accessors are (in general) IRQ-unsafe which, given that percpu-rwsem is a blocking primitive, should be just fine. However, file_end_write() is used from IRQ context and will cause load-store issues on architectures where the per-cpu accessors are not natively irq-safe. Fix it by using the IRQ-safe this_cpu_() for operations on read_count. This will generate more expensive code on a number of platforms, which might cause a performance regression for some of the other percpu-rwsem users. If any such is reported, we can consider alternative solutions. Fixes: `70fe2f4815` ("aio: fix freeze protection of aio writes") Signed-off-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com> Link: https://lkml.kernel.org/r/20200915140750.137881-1-houtao1@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-23 12:59:58 +02:00
Evan Nimmo	13773c6dba	i2c: algo: pca: Reapply i2c bus settings after reset [ Upstream commit `0a355aeb24` ] If something goes wrong (such as the SCL being stuck low) then we need to reset the PCA chip. The issue with this is that on reset we lose all config settings and the chip ends up in a disabled state which results in a lock up/high CPU usage. We need to re-apply any configuration that had previously been set and re-enable the chip. Signed-off-by: Evan Nimmo <evan.nimmo@alliedtelesis.co.nz> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-23 12:59:52 +02:00
Kees Cook	f17c2fe6ea	test_firmware: Test platform fw loading on non-EFI systems commit `baaabecfc8` upstream. On non-EFI systems, it wasn't possible to test the platform firmware loader because it will have never set "checked_fw" during __init. Instead, allow the test code to override this check. Additionally split the declarations into a private symbol namespace so there is greater enforcement of the symbol visibility. Fixes: `548193cba2` ("test_firmware: add support for firmware_request_platform") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20200909225354.3118328-1-keescook@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-17 13:55:45 +02:00
Florian Westphal	a0a91436d5	netfilter: conntrack: allow sctp hearbeat after connection re-use [ Upstream commit `cc5453a5b7` ] If an sctp connection gets re-used, heartbeats are flagged as invalid because their vtag doesn't match. Handle this in a similar way as TCP conntrack when it suspects that the endpoints and conntrack are out-of-sync. When a HEARTBEAT request fails its vtag validation, flag this in the conntrack state and accept the packet. When a HEARTBEAT_ACK is received with an invalid vtag in the reverse direction after we allowed such a HEARTBEAT through, assume we are out-of-sync and re-set the vtag info. v2: remove left-over snippet from an older incarnation that moved new_state/old_state assignments, thats not needed so keep that as-is. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-17 13:55:32 +02:00
Tejun Heo	dea6f05d37	libata: implement ATA_HORKAGE_MAX_TRIM_128M and apply to Sandisks commit `3b5455636f` upstream. All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-09-09 19:14:30 +02:00

1 2 3 4 5 ...

72987 Commits