Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.
There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.
On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.
commit 19b7ccf865 upstream.
Commit 25520d55cd ("block: Inline blk_integrity in struct gendisk")
introduced blk_integrity_revalidate(), which seems to assume ownership
of the stable pages flag and unilaterally clears it if no blk_integrity
profile is registered:
    if (bi->profile)
            disk->queue->backing_dev_info->capabilities |=
                    BDI_CAP_STABLE_WRITES;
    else
            disk->queue->backing_dev_info->capabilities &=
                    ~BDI_CAP_STABLE_WRITES;
It's called from revalidate_disk() and rescan_partitions(), making it
impossible to enable stable pages for drivers that support partitions
and don't use blk_integrity: while the call in revalidate_disk() can be
trivially worked around (see zram, which doesn't support partitions and
hence gets away with zram_revalidate_disk()), rescan_partitions() can
be triggered from userspace at any time. This breaks rbd, where the
ceph messenger is responsible for generating/verifying CRCs.
Since blk_integrity_{un,}register() "must" be used for (un)registering
the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES
setting there. This way drivers that call blk_integrity_register() and
use integrity infrastructure won't interfere with drivers that don't
but still want stable pages.
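A hedged sketch of the described change (not the verbatim upstream diff; note
that the < 4.11 backport accesses the bdi embedded in the queue instead of a
pointer):

    void blk_integrity_register(struct gendisk *disk, struct blk_integrity *template)
    {
            /* ... existing integrity profile setup elided ... */

            /* integrity writes need stable pages while checksums are computed */
            disk->queue->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES;
    }

    void blk_integrity_unregister(struct gendisk *disk)
    {
            disk->queue->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES;
            /* ... existing integrity profile teardown elided ... */
    }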
Fixes: 25520d55cd ("block: Inline blk_integrity in struct gendisk")
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mike Snitzer <snitzer@redhat.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
[idryomov@gmail.com: backport to < 4.11: bdi is embedded in queue]
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3089c1df10 upstream.
The vm fault handler relies on the fact that the VMA owns a reference
to the BO. However, once mmap_sem is released, other tasks are free to
destroy the VMA, which can lead to the BO being freed. Fix two code
paths where that can happen, both related to vm fault retries.
Found via a lock debugging warning which flagged &bo->wu_mutex as
locked while being destroyed.
Fixes: cbe12e74ee ("drm/ttm: Allow vm fault retries")
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7ee74b56f upstream.
This event is used by the Firmware to limit the RX BA win size
for a specific link.
The event handler updates the new size in the mac's sta->sta struct.
BA sessions opened for that link will use the new restricted
win_size. This limitation remains until a new update is received or
until the link is closed.
Signed-off-by: Maxim Altshul <maxim.altshul@ti.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42c7372a11 upstream.
When starting a new BA session, we must pass the win_size to the FW.
To do this we take max_rx_aggregation_subframes (the BA RX win size),
which is stored in the ieee80211_sta structure (i.e. per link and not per HW).
We will use the value stored per link when passing the win_size to
firmware through the ACX_BA_SESSION_RX_SETUP command.
Signed-off-by: Maxim Altshul <maxim.altshul@ti.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84d582d236 upstream.
Recent discussion (http://marc.info/?l=xen-devel&m=149192184523741)
established that commit 72a9b18629 ("xen: Remove event channel
notification through Xen PCI platform device") (and thus commit
da72ff5bfc ("partially revert "xen: Remove event channel
notification through Xen PCI platform device"")) are unnecessary and,
in fact, prevent HVM guests from booting on Xen releases prior to 4.0.
Therefore we revert both of those commits.
The summary of that discussion is below:
Here is the brief summary of the current situation:
Before the offending commit (72a9b18629):
1) INTx does not work because of the reset_watches path.
2) The reset_watches path is only taken if you have Xen > 4.0
3) The Linux kernel will by default use vector injection if the hypervisor
supports it. So even though INTx does not work, nobody running the kernel
with Xen > 4.0 would notice, unless they explicitly disabled this feature
either in the kernel or in Xen (and this can only be disabled by
modifying the code; there is no user-supported way to do it).
After the offending commit (+ partial revert):
1) INTx is no longer supported for HVM guests (only for PV guests).
2) The kernel will not boot as an HVM guest on Xen < 4.0, which does
not have vector injection support, since the only other supported
mode is INTx.
So based on this summary, I think before commit (72a9b18629) we were
in much better position from a user point of view.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
Cc: xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9dd46188e upstream.
F2FS uses 4 bytes to represent a block address. As a result, the supported
disk size is 16 TB, which equals 16 * 1024 * 1024 / 2 segments.
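For reference, the arithmetic behind that number (assuming the usual 4 KiB
block and 2 MiB segment sizes; the macro names here are illustrative):

    /* 32-bit block addresses: 2^32 blocks * 4 KiB = 16 TiB max device size */
    #define SKETCH_BLKSIZE          4096ULL         /* bytes per block   */
    #define SKETCH_BLKS_PER_SEG     512ULL          /* 2 MiB per segment */
    /* 16 TiB / 2 MiB = 16 * 1024 * 1024 / 2 = 8M segments */
    #define SKETCH_MAX_SEGMENT \
            ((16ULL << 40) / (SKETCH_BLKSIZE * SKETCH_BLKS_PER_SEG))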
Signed-off-by: Jin Qian <jinqian@google.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 922c60e89d ]
If an error is encountered in mdio_mux_init(), the error path will call
mdiobus_free(). Since mdiobus_register() has been called prior to
mdio_mux_init(), the bus->state will not be MDIOBUS_UNREGISTERED. This
causes a BUG_ON() in mdiobus_free(). To correct this issue, add an
error path for mdio_mux_init() which calls mdiobus_unregister() prior to
mdiobus_free().
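A hedged sketch of the corrected unwind order (setup_mux() stands in for the
mdio_mux_init() call; the labels are illustrative):

    static int setup_mux(struct device *dev, struct mii_bus *bus);  /* stand-in */

    static int iproc_mdiomux_probe_sketch(struct device *dev, struct mii_bus *bus)
    {
            int rc;

            rc = mdiobus_register(bus);
            if (rc)
                    goto out_free;

            rc = setup_mux(dev, bus);
            if (rc)
                    goto out_unregister;            /* was: goto out_free */

            return 0;

    out_unregister:
            mdiobus_unregister(bus);                /* required before freeing */
    out_free:
            mdiobus_free(bus);                      /* BUG_ON()s if still registered */
            return rc;
    }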
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Fixes: 98bc865a1e ("net: mdio-mux: Add MDIO mux driver for iProc SoCs")
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0d0e57697f ]
The patch fixes two things at once:
1) It checks the env->allow_ptr_leaks and only prints the map address to
the log if we have the privileges to do so, otherwise it just dumps 0
as we would when kptr_restrict is enabled on %pK. Given the latter is
off by default and not every distro sets it, I don't want to rely on
this, hence the 0 by default for unprivileged.
2) Printing of ldimm64 in the verifier log is currently broken in that
we don't print the full immediate, but only the 32 bit part of the
first insn part for ldimm64. Thus, fix this up as well; it's okay to
access, since we verified all ldimm64 earlier already (including just
constants) through replace_map_fd_with_map_ptr().
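A hedged sketch combining both points (simplified; verbose() and
env->allow_ptr_leaks are the existing verifier facilities, the rest of the
formatting is illustrative):

    /* ldimm64 spans two insns; the 64-bit immediate is split across them */
    u64 imm = ((u64)insn[1].imm << 32) | (u32)insn[0].imm;

    if (env->allow_ptr_leaks)
            verbose("(%02x) r%d = 0x%llx\n", insn->code, insn->dst_reg, imm);
    else
            verbose("(%02x) r%d = 0x0\n", insn->code, insn->dst_reg);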
Fixes: 1be7f75d16 ("bpf: enable non-root eBPF programs")
Fixes: cbd3570086 ("bpf: verifier (add ability to receive verification log)")
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 242d3a49a2 ]
For each netns (except init_net), we initialize its null entry
in 3 places:
1) The template itself, as we use kmemdup()
2) Code around dst_init_metrics() in ip6_route_net_init()
3) ip6_route_dev_notify(), which is supposed to initialize it after
loopback registers
Unfortunately the last one still happens in the wrong order, because we
expect to initialize net->ipv6.ip6_null_entry->rt6i_idev to
net->loopback_dev's idev, and thus we have to do that after we add the
idev to loopback. However, this notifier has priority == 0, the same as
ipv6_dev_notf, and ipv6_dev_notf is registered after
ip6_route_dev_notifier, so it is actually called after
ip6_route_dev_notifier. This is similar to commit 2f460933f5
("ipv6: initialize route null entry in addrconf_init()") which
fixes init_net.
Fix it by picking a smaller priority for ip6_route_dev_notifier.
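A hedged sketch of that ordering fix (the priority value is illustrative):

    static struct notifier_block ip6_route_dev_notifier = {
            .notifier_call  = ip6_route_dev_notify,
            /* lower than addrconf's 0, so this runs after the loopback idev exists */
            .priority       = -10,
    };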
Also, we have to release the refcnt accordingly when unregistering
loopback_dev because device exit functions are called before subsys
exit functions.
Acked-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2f460933f5 ]
Andrey reported a crash on init_net.ipv6.ip6_null_entry->rt6i_idev
since it is always NULL.
This is clearly wrong, we have code to initialize it to loopback_dev,
unfortunately the order is still not correct.
loopback_dev is registered very early during boot, so we lose the chance
to re-initialize it in the notifier. addrconf_init() is called after
ip6_route_init(), which means we have no chance to correct it.
Fix it by moving this initialization explicitly after
ipv6_add_dev(init_net.loopback_dev) in addrconf_init().
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 77ef033b68 ]
IFLA_PHYS_PORT_NAME is a string attribute, so terminate it with \0.
Otherwise libnl3 fails to validate netlink messages with this attribute.
"ip -detail a" assumes too that the attribute is NUL-terminated when
printing it. It often was, due to padding.
I noticed this as libvirtd failing to start on a system with sfc driver
after upgrading it to Linux 4.11, i.e. when sfc added support for
phys_port_name.
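A hedged sketch of the pattern (not the exact upstream diff):

    static int put_phys_port_name_sketch(struct sk_buff *skb, struct net_device *dev)
    {
            char name[IFNAMSIZ];
            int err;

            err = dev_get_phys_port_name(dev, name, sizeof(name));
            if (err)
                    return err == -EOPNOTSUPP ? 0 : err;

            /* nla_put_string() includes the terminating NUL, unlike nla_put()+strlen() */
            if (nla_put_string(skb, IFLA_PHYS_PORT_NAME, name))
                    return -EMSGSIZE;
            return 0;
    }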
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6d717134a1 ]
Andrey reported a warning triggered by the rcu code:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 5911 at lib/debugobjects.c:289
debug_print_object+0x175/0x210
ODEBUG: activate active (active state 1) object type: rcu_head hint:
(null)
Modules linked in:
CPU: 1 PID: 5911 Comm: a.out Not tainted 4.11.0-rc8+ #271
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:16
dump_stack+0x192/0x22d lib/dump_stack.c:52
__warn+0x19f/0x1e0 kernel/panic.c:549
warn_slowpath_fmt+0xe0/0x120 kernel/panic.c:564
debug_print_object+0x175/0x210 lib/debugobjects.c:286
debug_object_activate+0x574/0x7e0 lib/debugobjects.c:442
debug_rcu_head_queue kernel/rcu/rcu.h:75
__call_rcu.constprop.76+0xff/0x9c0 kernel/rcu/tree.c:3229
call_rcu_sched+0x12/0x20 kernel/rcu/tree.c:3288
rt6_rcu_free net/ipv6/ip6_fib.c:158
rt6_release+0x1ea/0x290 net/ipv6/ip6_fib.c:188
fib6_del_route net/ipv6/ip6_fib.c:1461
fib6_del+0xa42/0xdc0 net/ipv6/ip6_fib.c:1500
__ip6_del_rt+0x100/0x160 net/ipv6/route.c:2174
ip6_del_rt+0x140/0x1b0 net/ipv6/route.c:2187
__ipv6_ifa_notify+0x269/0x780 net/ipv6/addrconf.c:5520
addrconf_ifdown+0xe60/0x1a20 net/ipv6/addrconf.c:3672
...
Andrey's reproducer program runs in a very tight loop, calling
'unshare -n' and then spawning 2 sets of 14 threads running random ioctl
calls. The relevant networking sequence:
1. New network namespace created via unshare -n
- ip6tnl0 device is created in down state
2. address added to ip6tnl0
- equivalent to ip -6 addr add dev ip6tnl0 fd00::bb/1
- DAD is started on the address and when it completes the host
route is inserted into the FIB
3. ip6tnl0 is brought up
- the new fixup_permanent_addr function restarts DAD on the address
4. exit namespace
- teardown / cleanup sequence starts
- once in a blue moon, lo teardown appears to happen BEFORE teardown
of ip6tnl0
+ down on 'lo' removes the host route from the FIB since the dst->dev
for the route is loopback
+ host route added to rcu callback list
* rcu callback has not run yet, so rt is NOT on the gc list so it has
NOT been marked obsolete
5. in parallel to 4. worker_thread runs addrconf_dad_completed
- DAD on the address on ip6tnl0 completes
- calls ipv6_ifa_notify which inserts the host route
All of that happens very quickly. The result is that a host route that
has been deleted from the IPv6 FIB and added to the RCU list is re-inserted
into the FIB.
The exit namespace eventually gets to cleaning up ip6tnl0 which removes the
host route from the FIB again, calls the rcu function for cleanup -- and
triggers the double rcu trace.
The root cause is duplicate DAD on the address -- steps 2 and 3. Arguably,
DAD should not be started in step 2. The interface is in the down state,
so it cannot really send out requests for the address, which makes starting
DAD pointless.
Since the second DAD was introduced by a recent change, it seems appropriate
to use it for the Fixes tag and have the fixup function only start DAD for
addresses in the PREDAD state, which occurs in addrconf_ifdown if the
address is retained.
Big thanks to Andrey for isolating a reliable reproducer for this problem.
Fixes: f1705ec197 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a9f11f963a ]
Be careful when comparing tcp_time_stamp to some u32 quantity,
otherwise the result can be surprising.
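An illustrative wrap-safe comparison (generic sketch, not the driver diff):

    /* tcp_time_stamp is a free-running u32, so plain '<'/'>' can invert
     * near the wrap point; compare via a signed delta instead. */
    static inline bool ts_after(u32 a, u32 b)
    {
            return (s32)(a - b) > 0;        /* true if a is later than b */
    }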
Fixes: 7c106d7e78 ("[TCP]: TCP Low Priority congestion control")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 332270fdc8 ]
llvm 4.0 and above generates the code like below:
....
440: (b7) r1 = 15
441: (05) goto pc+73
515: (79) r6 = *(u64 *)(r10 -152)
516: (bf) r7 = r10
517: (07) r7 += -112
518: (bf) r2 = r7
519: (0f) r2 += r1
520: (71) r1 = *(u8 *)(r8 +0)
521: (73) *(u8 *)(r2 +45) = r1
....
and the verifier complains "R2 invalid mem access 'inv'" for insn #521.
This is because the verifier marks register r2 as an unknown value after #519,
where r2 is a stack pointer and r1 holds a constant value.
Teach the verifier to recognize "stack_ptr + imm" and
"stack_ptr + reg with const val" as a valid stack_ptr with a new offset.
Signed-off-by: Yonghong Song <yhs@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7162fb242c ]
Andrey found a way to trigger the WARN_ON_ONCE(delta < len) in
skb_try_coalesce() using syzkaller and a filter attached to a TCP
socket over loopback interface.
I believe one issue with looped skbs is that tcp_trim_head() can end up
producing an skb with an underestimated truesize.
It hardly matters for normal conditions, since packets sent over
loopback are never truncated.
Bytes trimmed from skb->head should not change skb truesize, since
skb->head is not reallocated.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5294b83086 ]
We call skb_cow_data, which is good anyway to ensure we can actually
modify the skb as such (another error from prior). Now that we have the
number of fragments required, we can safely allocate exactly that amount
of memory.
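A hedged sketch of the pattern (simplified from the real code path):

    struct sk_buff *trailer;
    struct scatterlist *sg;
    int nfrags;

    /* makes the skb writable and reports how many sg entries are needed */
    nfrags = skb_cow_data(skb, 0, &trailer);
    if (nfrags < 0)
            return ERR_PTR(nfrags);

    sg = kmalloc_array(nfrags, sizeof(*sg), GFP_ATOMIC);
    if (!sg)
            return ERR_PTR(-ENOMEM);

    sg_init_table(sg, nfrags);
    skb_to_sgvec(skb, sg, 0, skb->len);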
Fixes: c09440f7dc ("macsec: introduce IEEE 802.1AE driver")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c7f622120 upstream.
When any of the functions contained in NGbzero.S and GENbzero.S
vector through *bzero_from_clear_user, we may end up taking a
fault when executing one of the store alternate address space
instructions. If this happens, the exception handler does not
restore the %asi register.
This commit fixes the issue by introducing a new exception
handler that ensures the %asi register is restored when
a fault is handled.
Orabug: 25577560
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab949d5196 upstream.
Imre Deak reported a deadlock of HD-audio driver at unbinding while
it's still in probing. Since we probe the codecs asynchronously in a
work, the codec driver probe may still be kicked off while the
controller itself is being unbound. And, azx_remove() tries to
process all pending tasks via cancel_work_sync() for fixing the other
races (see commit [0b8c82190c: ALSA: hda - Cancel probe work instead
of flush at remove]), now we may meet a bizarre deadlock:
Unbind snd_hda_intel via sysfs:
  device_release_driver() ->
    device_lock(snd_hda_intel) ->
      azx_remove() ->
        cancel_work_sync(azx_probe_work)

azx_probe_work():
  codec driver probe() ->
    __driver_attach() ->
      device_lock(snd_hda_intel)
This deadlock is caused by the fact that both device_release_driver()
and driver_probe_device() take both the device and its parent locks at
the same time. The codec device sets the controller device as its
parent, and this lock is taken before the probe() callback is called,
while the controller remove() callback gets called also with the same
lock.
In this patch, as an ugly workaround, we unlock the controller device
temporarily during cancel_work_sync() call. The race against another
bind call should be still suppressed by the parent's device lock.
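A hedged sketch of the workaround (the function name and the probe_work
parameter are illustrative; the driver's private-data layout is omitted):

    static void azx_remove_sketch(struct pci_dev *pci, struct work_struct *probe_work)
    {
            struct snd_card *card = pci_get_drvdata(pci);

            /* the driver core holds device_lock(&pci->dev) across remove() */
            device_unlock(&pci->dev);
            cancel_work_sync(probe_work);   /* codec probe can now take the parent lock */
            device_lock(&pci->dev);

            snd_card_free(card);
    }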
Reported-by: Imre Deak <imre.deak@intel.com>
Fixes: 0b8c82190c ("ALSA: hda - Cancel probe work instead of flush at remove")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f3445067d upstream.
The probe function is not marked __init, but some other functions
are. This leads to a warning on older compilers (e.g. gcc-4.3),
and can cause executing freed memory when built with those
compilers:
WARNING: drivers/staging/emxx_udc/emxx_udc.o(.text+0x2d78): Section mismatch in reference from the function nbu2ss_drv_probe() to the function .init.text:nbu2ss_drv_contest_init()
This removes the annotations.
Fixes: 33aa8d45a4 ("staging: emxx_udc: Add Emma Mobile USB Gadget driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c474b8579 upstream.
The conversion macro le16_to_cpu was removed and that caused a new sparse
warning. Sparse output:
drivers/staging/wlan-ng/p80211netdev.c:241:44: warning: incorrect type in argument 2 (different base types)
drivers/staging/wlan-ng/p80211netdev.c:241:44: expected unsigned short [unsigned] [usertype] fc
drivers/staging/wlan-ng/p80211netdev.c:241:44: got restricted __le16 [usertype] fc
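A hedged sketch of the fix (struct/field names as in the warning, otherwise
illustrative):

    struct p80211_hdr_sketch {
            __le16 fc;                      /* frame control, little-endian on the wire */
    };

    static u16 get_fc_sketch(const struct p80211_hdr_sketch *hdr)
    {
            return le16_to_cpu(hdr->fc);    /* host order; keeps sparse happy */
    }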
Fixes: 7ad8257234 ("staging:wlan-ng:Fix sparse warning")
Signed-off-by: Igor Pylypiv <igor.pylypiv@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c13990e35 upstream.
root_squash control got accidentally moved to sysfs instead of
debugfs, and the write side of it was also broken expecting a
userspace buffer.
It contains both uid and gid values in a single file, so debugfs
is a clear place for it.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Fixes: c948390f10 "fix inconsistencies of root squash feature"
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cc4b7cb86 upstream.
The driver was making changes to the skb header without
ensuring it was writable (i.e. uncloned).
This patch also removes some boilerplate header size
checking/adjustment code, as that is also handled by the
skb_cow_head() function used to make the header writable.
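A hedged sketch of the pattern (the required headroom is illustrative):

    static int tx_prep_sketch(struct sk_buff *skb, unsigned int hdrlen)
    {
            struct ethhdr *eh;

            /* makes the header private and writable, and guarantees headroom */
            if (skb_cow_head(skb, hdrlen))
                    return -ENOMEM;

            eh = (struct ethhdr *)skb->data;        /* now safe to modify in place */
            /* ... protocol header fix-ups go here ... */
            return 0;
    }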
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed10858ead upstream.
When we have turned off RTC support, the smartpqi driver fails to build:
ERROR: "rtc_time64_to_tm" [drivers/scsi/smartpqi/smartpqi.ko] undefined!
This is easily avoided by using the generic 'struct tm' based helper rather
than the RTC specific one. While fixing this, I noticed that even though
the driver uses time64_t for storing seconds, it gets them from the
old 32-bit struct timeval. To address this, we can simplify the code
by calling ktime_get_real_seconds() directly.
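A hedged sketch of the replacement (variable names illustrative):

    #include <linux/time.h>
    #include <linux/timekeeping.h>

    static void fill_wall_time_sketch(u8 *hour, u8 *minute)
    {
            time64_t secs = ktime_get_real_seconds();
            struct tm tm;

            time64_to_tm(secs, 0, &tm);     /* generic helper, no RTC dependency */
            *hour = tm.tm_hour;
            *minute = tm.tm_min;
    }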
Fixes: 6c223761eb ("smartpqi: initial commit of Microsemi smartpqi driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2559a1ef68 upstream.
The mac_scsi driver still gets disabled when SCSI=m. This should have
been fixed back when I enabled the tristate but I didn't see the bug.
Fixes: 6e9ae6d560 ("[PATCH] mac_scsi: Add module option to Kconfig")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f7c2beef8 upstream.
After a Qlogic card breaks when initializing (test case), the system can
crash in qla2xxx_eh_abort if processing anything but a scsi command type
srb.
Fixes: 1535aa75a3 ("scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove")
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-By: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e0f5cc650 upstream.
Otherwise the interconnect related code implementing PM runtime will
produce these errors on a failed probe:
omap_uart 48066000.serial: omap_device: omap_device_enable() called from invalid state 1
omap_uart 48066000.serial: use pm_runtime_put_sync_suspend() in driver?
Note that we now also need to check for priv in omap8250_runtime_suspend()
as it has not yet been registered if probe fails. And we need to use
pm_runtime_put_sync() to properly idle the device like we already do
in omap8250_remove().
Fixes: 61929cf016 ("tty: serial: Add 8250-core based omap driver")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1a09b6a7c1 upstream.
We get the following compile errors if EXTCON is enabled as a
module but this driver is builtin:
drivers/built-in.o: In function `qcom_usb_hs_phy_power_off':
phy-qcom-usb-hs.c:(.text+0x1089): undefined reference to `extcon_unregister_notifier'
drivers/built-in.o: In function `qcom_usb_hs_phy_probe':
phy-qcom-usb-hs.c:(.text+0x11b5): undefined reference to `extcon_get_edev_by_phandle'
drivers/built-in.o: In function `qcom_usb_hs_phy_power_on':
phy-qcom-usb-hs.c:(.text+0x128e): undefined reference to `extcon_get_state'
phy-qcom-usb-hs.c:(.text+0x12a9): undefined reference to `extcon_register_notifier'
so let's mark this as needing to follow the modular status of
the extcon framework.
Fixes: 9994a33865e2427b09ba ("phy: Add support for Qualcomm's USB HS phy")
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b1b23f03a upstream.
The mux_pll_src_apll_dpll_gpll_usb480m_p parent list was missing a ","
between the 3rd and 4th parent names, making them run together and thus
causing lookups to fail. Fix that.
Fixes: 5190c08b29 ("clk: rockchip: add clock controller for rk3036")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c0e25d883 upstream.
Make sure to detect short control-message transfers and log an error
when reading incomplete manufacturer and boot descriptors.
Note that the default all-zero descriptors will now be used after a
short transfer is detected instead of partially initialised ones.
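A hedged, generic sketch of the check this series of patches adds (the
driver-specific wrappers and request types vary):

    static int read_descriptor_sketch(struct usb_device *udev, u8 request,
                                      u16 value, u16 index, void *buf, u16 size)
    {
            int ret;

            ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0), request,
                                  USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_DEVICE,
                                  value, index, buf, size, USB_CTRL_GET_TIMEOUT);
            if (ret < (int)size) {
                    dev_err(&udev->dev, "short read for request %#x: %d\n",
                            request, ret);
                    return ret < 0 ? ret : -EIO;    /* don't use a partial buffer */
            }
            return 0;
    }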
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36356a669e upstream.
Make sure to detect short control-message transfers so that errors are
logged when reading the modem status at open.
Note that while this also avoids initialising the modem status using
uninitialised heap data, these bits could not leak to user space as they
are currently not used.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c34cb8ddf upstream.
Make sure to detect short control-message transfers when fetching
modem and line state in open and when retrieving registers.
This specifically makes sure that an errno is returned to user space on
errors in TIOCMGET instead of a zero bitmask.
Also drop the unused getdevice function which also lacked appropriate
error handling.
Fixes: f7a33e608d ("USB: serial: add quatech2 usb to serial driver")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3e574ad85 upstream.
Make sure to detect short responses when reading the latency timer to
avoid using stale buffer data.
Note that no heap data would currently leak through sysfs as
ASYNC_LOW_LATENCY is set by default.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b631433b17 upstream.
Fix open error handling which failed to detect errors when reading the
MSR and LSR registers, something which could lead to the shadow
registers being initialised from errnos.
Note that calling the generic close implementation is sufficient in the
error paths as the interrupt urb has not yet been submitted and the
register updates have not been made.
Fixes: f4c1e8d597 ("USB: ark3116: Make existing functions 16450-aware
and add close and release functions.")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 39712e8bfa upstream.
Make sure to detect and return an error on zero-length control-message
transfers when reading from the device.
This addresses a potential failure to detect an empty transmit buffer
during close.
Also remove a redundant check for short transfer when sending a command.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4457d9798 upstream.
Use a dedicated buffer for the DMA transfer and make sure to detect
short transfers to avoid parsing a corrupt descriptor.
Fixes: 6e8cf7751f ("USB: add EPIC support to the io_edgeport driver")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1eac5c244f upstream.
Make sure to detect short control-message transfers rather than continue
with zero-initialised data when retrieving modem status and during
device initialisation.
Fixes: 52af954599 ("USB: add USB serial ssu100 driver")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b0aed2b16 upstream.
Make sure the received data has the required headers before parsing it.
Also drop the redundant urb-status check, which has already been handled
by the caller.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a89b94b533 upstream.
We're currently emulating the vbus and id interrupts in the OTGSC
read API, but we also need to make sure that if we're handling
the events with extcon that we don't enable the interrupts for
those events in the hardware. Therefore, properly emulate this
register if we're using extcon, but don't enable the interrupts.
This allows me to get my cable connect/disconnect working
properly without getting spurious interrupts on my device that
uses an extcon for these two events.
Acked-by: Peter Chen <peter.chen@nxp.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Ivan T. Ivanov" <iivanov.xz@gmail.com>
Fixes: 3ecb3e09b0 ("usb: chipidea: Use extcon framework for VBUS and ID detect")
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f60f8ccd54 upstream.
With the id and vbus detection done via extcon we need to make
sure we poll the status of OTGSC properly by considering what the
extcon is saying, and not just what the register is saying. Let's
move this hw_wait_reg() function to the only place it's used and
simplify it for polling the OTGSC register. Then we can make
certain we only use the hw_read_otgsc() API to read OTGSC, which
will make sure we properly handle extcon events.
Acked-by: Peter Chen <peter.chen@nxp.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Ivan T. Ivanov" <iivanov.xz@gmail.com>
Fixes: 3ecb3e09b0 ("usb: chipidea: Use extcon framework for VBUS and ID detect")
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68bd6fc3cf upstream.
Returning from a for_each_available_child_of_node() loop requires cleaning
up the child node refcount. The error paths lacked this, so for example in
the case of deferred probe, the refcount of the phy node was left increased.
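A hedged sketch of the required cleanup (grab_phy() stands in for the
driver's PHY lookup):

    static int get_child_phys_sketch(struct device *dev,
                                     int (*grab_phy)(struct device *, struct device_node *))
    {
            struct device_node *child;
            int err;

            for_each_available_child_of_node(dev->of_node, child) {
                    err = grab_phy(dev, child);
                    if (err) {
                            of_node_put(child);     /* was missing on the error path */
                            return err;
                    }
            }
            return 0;
    }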
Fixes: 6d40500ac9 ("usb: ehci/ohci-exynos: Fix of_node_put() for child when getting PHYs")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f6026b1dc upstream.
Returning from a for_each_available_child_of_node() loop requires cleaning
up the child node refcount. The error paths lacked this, so for example in
the case of deferred probe, the refcount of the phy node was left increased.
Fixes: 6d40500ac9 ("usb: ehci/ohci-exynos: Fix of_node_put() for child when getting PHYs")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3fe81d2cc upstream.
usleep_range() uses hrtimers and provides no advantage over msleep()
for larger delays. Fix up the 100ms delays here by passing the adjusted "min"
value to msleep(). This helps reduce the load on the hrtimer subsystem.
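A minimal sketch of the substitution (the delay values are illustrative):

    #include <linux/delay.h>

    static void force_mode_delay_sketch(void)
    {
            /* before: usleep_range(100000, 150000);  -- hrtimer-backed */
            msleep(100);    /* coarse, jiffies-based sleep is fine at this scale */
    }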
Link: http://lkml.org/lkml/2017/1/11/377
Fixes: commit 2938fc63e0 ("usb: dwc2: Properly account for the force mode delays")
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab007cc94f upstream.
The PML feature is not exposed to guests so we should not be forwarding
the vmexit either.
This commit fixes BSOD 0x20001 (HYPERVISOR_ERROR) when running Hyper-V
enabled Windows Server 2016 in L1 on hardware that supports PML.
Fixes: 843e433057 ("KVM: VMX: Add PML support in VMX")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1fb883bb82 upstream.
L2 was running with uninitialized PML fields which led to incomplete
dirty bitmap logging. This manifested as all kinds of subtle erratic
behavior of the nested guest.
Fixes: 843e433057 ("KVM: VMX: Add PML support in VMX")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75013fb16f upstream.
Fix the exception table entry check by using the probed address
instead of the address of the copied instruction.
This bug may cause an unexpected kernel panic if a user probes an address
where an exception can happen which should be fixed up by __ex_table
(e.g. copy_from_user).
Unless a user puts a kprobe on such an address, this doesn't
cause any problem.
This bug has been introduced years ago, by commit:
464846888d ("x86/kprobes: Fix a bug which can modify kernel code permanently").
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 464846888d ("x86/kprobes: Fix a bug which can modify kernel code permanently")
Link: http://lkml.kernel.org/r/148829899399.28855.12581062400757221722.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 251fe09f13 upstream.
This is a static analysis fix. The warning is:
drivers/net/wireless/intel/iwlwifi/mvm/fw-dbg.c:912 iwl_mvm_fw_dbg_collect()
warn: integer overflows 'sizeof(*desc) + len'
I guess this code is supposed to take a NUL character, but if we write
zero bytes then it tries to write -1 characters and crashes.
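A hedged sketch of the kind of guard needed (struct/field names illustrative,
not the exact upstream diff):

    if (!len || len > SIZE_MAX - sizeof(*desc))
            return -EINVAL;                 /* reject empty and wrapping lengths */

    desc = kzalloc(sizeof(*desc) + len, GFP_ATOMIC);
    if (!desc)
            return -ENOMEM;
    memcpy(desc->data, str, len);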
Fixes: c91b865cb1 ("iwlwifi: mvm: support description for user triggered fw dbg collection")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b70f07686 upstream.
When a driver needs to access the contents of a streaming DMA buffer
without unmapping it, it should call dma_sync_single_for_cpu().
Once the call has been made, the CPU "owns" the DMA buffer and can
work with it as needed.
Before the device accesses the buffer, however, ownership should be
transferred back to it with dma_sync_single_for_device().
Neither call was performed by the driver, resulting in odd paging
errors on some platforms. Fix it.
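A hedged, generic sketch of the ownership hand-off (direction and length
are illustrative):

    static void cpu_patch_dma_buf_sketch(struct device *dev, dma_addr_t addr,
                                         void *cpu_addr, const void *src, size_t len)
    {
            /* hand the buffer to the CPU before touching it ... */
            dma_sync_single_for_cpu(dev, addr, len, DMA_TO_DEVICE);
            memcpy(cpu_addr, src, len);
            /* ... and back to the device before it may read it again */
            dma_sync_single_for_device(dev, addr, len, DMA_TO_DEVICE);
    }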
Fixes: a6c4fb4441 ("iwlwifi: mvm: Add FW paging mechanism for the UMAC on PCI")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c56108b58a upstream.
In DQA mode, first_agg_queue is initialized to
IWL_MVM_DQA_MIN_DATA_QUEUE. This causes two bugs in the tx response
flow:
1. When TX fails, we set IEEE80211_TX_STAT_AMPDU_NO_BACK regardless
of whether we actually have aggregation open on the queue. This causes
mac80211 to send a BAR frame even though there is no aggregation
open.
Fix that by simply checking the AMPDU flag that is set by
mac80211 for AMPDU packets.
2. When reclaiming frames in aggregation mode, we reclaim based on
scheduler ssn and not the SN.
The reason is that scheduler ssn may be ahead of SN due to a hole
in the BA window that was filled.
However, if we have aggregations open on IWL_MVM_DQA_BSS_CLIENT_QUEUE
the reclaim flow will still go to the code of non-aggregation
instead of the aggregation code since IWL_MVM_DQA_BSS_CLIENT_QUEUE
is smaller than IWL_MVM_DQA_MIN_DATA_QUEUE, although it is a valid
aggregation queue.
Fix that by always using the aggregation reclaim code by default in
DQA mode (currently it is implicitly used by default for all queues
except the reserved BSS queue).
Fixes: cf961e1662 ("iwlwifi: mvm: support dqa-mode agg on non-shared queue")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 94c3e614df upstream.
In DQA mode the check whether to decrement the pending frames
counter relies on the tid status and not on the txq id.
This may result in an inconsistent state of the pending frames
counter in case a frame is queued on a non-aggregation queue but
with this TID, and will be followed by a failure to remove the
station and later on a SYSASSERT 0x3421 when trying to remove the
MAC.
Such frames are, for example, BAR frames and QoS NDPs.
Fix it by aligning the condition of incrementing the counter
with the condition of decrementing it - rely on TID state for
DQA mode.
Also, avoid an internal error like this affecting station removal
in DQA mode, since we can know for sure it is an internal
error.
Fixes: cf961e1662 ("iwlwifi: mvm: support dqa-mode agg on non-shared queue")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05e5a7e58d upstream.
Instead of setting the tx_cmd length in the mvm code, which is
complicated by the fact that DQA may want to temporarily store
the SKB on the side, adjust the length in the PCIe code which
also knows about this since it's responsible for duplicating
all those headers that are accounted for in this code.
As the PCIe code already relies on the tx_cmd->len field, this
doesn't really introduce any new dependencies.
To make this possible we need to move the memcpy() of the TX
command until after it was updated.
This does even simplify the code though, since the PCIe code
already does a lot of manipulations to build A-MSDUs correctly
and changing the length becomes a simple operation to see how
much was added/removed, rather than predicting it.
Fixes: 24afba7690 ("iwlwifi: mvm: support bss dynamic alloc/dealloc of queues")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6574dc943f upstream.
Since offchannel activity doesn't always require a BSS, e.g. ANQP
sessions, offchannel frames should not use the BSS queue, because it
might not be initialized.
Use the auxiliary queue instead.
Fixes: e3118ad74d ("iwlwifi: mvm: support tdls in dqa mode")
Signed-off-by: Beni Lev <beni.lev@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5351f9ab25 upstream.
When NSSN is behind the reorder buffer due to timeout
the reorder timer isn't getting re-armed until NSSN
catches up. Fix it.
Fixes: 0690405fef ("iwlwifi: mvm: add reorder timeout per frame")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c6262b754 upstream.
Our 9000 device supports 64-bit DMA addresses for RX only, and
not for TX.
Setting the DMA mask to 64 bits for the whole device is erroneous - we
can do that only for a000 devices, where the device is capable of
both RX & TX DMA with a 64-bit address space.
Fixes: 96a6497bc3 ("iwlwifi: pcie: add 9000 series multi queue rx DMA support")
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ce4a03852 upstream.
shift_param is defined and set in iwl_pcie_load_cpu_sections but not
used. Fix this to avoid -Wunused-but-set-variable warning.
The code using it turned into dead code with commit dcab8ecd56
("iwlwifi: mvm: support ucode load for family_8000 B0 only") which
added a separate function iwl_pcie_load_given_ucode_8000 (then 8000b)
for IWL_DEVICE_FAMILY_8000. Commit 76f8c0e17e ("iwlwifi: pcie:
remove dead code") removed the dead code but left shift_param as is.
iwlwifi/pcie/trans.c: In function ‘iwl_pcie_load_cpu_sections’:
iwlwifi/pcie/trans.c:871:6: warning: variable ‘shift_param’ set but not used [-Wunused-but-set-variable]
Fixes: dcab8ecd56 ("iwlwifi: mvm: support ucode load for family_8000 B0 only")
Fixes: 76f8c0e17e ("iwlwifi: pcie: remove dead code")
Signed-off-by: Kirtika Ruchandani <kirtika@google.com>
Cc: Sara Sharon <sara.sharon@intel.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Liad Kaufman <liad.kaufman@intel.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
[removed some unnecessary braces]
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd05a5bd6b upstream.
We don't really need to clear the skb's status area nor store the
dev_cmd into it until we really commit to the frame by handing
it to the transport - defer those operations until just before
we do that.
This doesn't entirely fix the bug with frames not getting sent
out after having been deferred due to DQA, because it doesn't
restore the info->driver_data[0] place that was already set to
zero (or another value) by the A-MSDU logic.
Fixes: 24afba7690 ("iwlwifi: mvm: support bss dynamic alloc/dealloc of queues")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bac453ab37 upstream.
For unified images, we shouldn't restart the HW if suspend fails. The
only reason for restarting the HW with non-unified images is to go
back to the D0 image.
Fixes: 23ae61282b ("iwlwifi: mvm: Do not switch to D3 image on suspend")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8320d75b5 upstream.
IWL6000G2B_UCODE_API_MAX is not defined. ucode_api_max of
IWL_DEVICE_6030 uses IWL6000G2_UCODE_API_MAX. Use this also for
MODULE_FIRMWARE.
Fixes: 9d9b21d1b6 ("iwlwifi: remove IWL_*_UCODE_API_OK")
Signed-off-by: Jürg Billeter <j@bitron.ch>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5b60de697 upstream.
This patch fixes an issue specific to AP mode. The AP is started with WEP
security and an external station is connected to it. The data path works
in this case. Now, if the AP is restarted with WPA/WPA2 security,
the station is able to connect but ping fails.
The driver skips the deletion of WEP keys if the interface type is AP.
Removing that redundant check resolves the issue.
Fixes: e57f1734d8 ("mwifiex: add key material v2 support")
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6183468a23 upstream.
Similar to commit fcd2042e8d ("mwifiex: printk() overflow with 32-byte
SSIDs"), we failed to account for the existence of 32-char SSIDs in our
debugfs code. Unlike in that case though, we zeroed out the containing
struct first, and I'm pretty sure we're guaranteed to have some padding
after the 'ssid.ssid' and 'ssid.ssid_len' fields (the struct is 33 bytes
long).
So, this is the difference between:
# cat /sys/kernel/debug/mwifiex/mlan0/info
...
essid="0123456789abcdef0123456789abcdef "
...
and the correct output:
# cat /sys/kernel/debug/mwifiex/mlan0/info
...
essid="0123456789abcdef0123456789abcdef"
...
Fixes: 5e6e3a92b9 ("wireless: mwifiex: initial commit for Marvell mwifiex driver")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cdefd5b54 upstream.
The CPU port of the BCM53125 is configured with RGMII (no delays) but
this should actually be RGMII with transmit delay (rgmii-txid) because
STMMAC takes care of inserting the transmitter delay. This fixes
occasional packet loss encountered.
Fixes: d7b9eaff5f ("ARM: dts: sun7i: Add BCM53125 switch nodes to the lamobo-r1 board")
Reported-by: Hartmut Knaack <knaack.h@gmx.de>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit acfa28b364 upstream.
The libgpio code pre-sets the GPIO values for the gpio-reset in the
device tree. This results in the device being reset during bringup.
To prevent this pre-setting, use the "open-source" flag in the device
tree.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Fixes: b1aaf88 ("ARM: dts: NSP: Add GPIO reboot method to bcm958625hr DTS file")
Fixes: 10baed1 ("ARM: dts: NSP: Add GPIO reboot method to bcm958625xmc DTS file")
Fixes: 088e3148 ("ARM: dts: NSP: Add new DT file for bcm958522er")
Fixes: e3227c1 ("ARM: dts: NSP: Add new DT file for bcm958525er")
Fixes: 2f8bc00 ("ARM: dts: NSP: Add new DT file for bcm958622hr")
Fixes: d454c37 ("ARM: dts: NSP: Add new DT file for bcm958623hr")
Fixes: f27eacf ("ARM: dts: NSP: Add new DT file for bcm988312hr")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cbe99c538d upstream.
gcc gets confused about the control flow in ktd2692_parse_dt(), causing
it to warn about what seems like a potential bug:
drivers/leds/leds-ktd2692.c: In function 'ktd2692_probe':
drivers/leds/leds-ktd2692.c:244:15: error: '*((void *)&led_cfg+8)' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/leds/leds-ktd2692.c:225:7: error: 'led_cfg.flash_max_microamp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/leds/leds-ktd2692.c:232:3: error: 'led_cfg.movie_max_microamp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
The code is fine, and slightly reworking it in an equivalent way lets
gcc figure that out too, which gets rid of the warning.
Fixes: 77e7915b15 ("leds: ktd2692: Add missing of_node_put")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ec663d967b upstream.
Commit cab15ce604 ("arm64: Introduce execute-only page access
permissions") allowed a valid user PTE to have the PTE_USER bit clear.
As a consequence, the pte_valid_not_user() macro in set_pte() was
replaced with pte_valid_global() under the assumption that only user
pages have the nG bit set. EFI mappings, however, also have the nG bit
set and set_pte() wrongly ignores issuing the DSB+ISB.
This patch reinstates the pte_valid_not_user() macro and adds the
PTE_UXN bit check since all kernel mappings have this bit set. For
clarity, pte_exec() is renamed to pte_user_exec() as it only checks for
the absence of PTE_UXN. Consequently, the user executable check in
set_pte_at() drops the pte_ng() test since pte_user_exec() is
sufficient.
Fixes: cab15ce604 ("arm64: Introduce execute-only page access permissions")
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06dbf468a2 upstream.
The ipq board has these rates as 25MHz, and not 19.2 and 27. I
copy/pasted from other boards that have those rates but forgot
to fix the rates here.
Fixes: 30fc4212d5 ("arm: dts: qcom: Add more board clocks")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba52e75718 upstream.
Reading both fault and status registers and logging any fault should
take priority over handling status register update.
Fix by moving the status handling to later in interrupt routine.
Fixes: d7bf353fd0 ("bq24190_charger: Add support for TI BQ24190 Battery Charger")
Signed-off-by: Liam Breck <kernel@networkimprov.net>
Acked-by: Mark Greer <mgreer@animalcreek.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68abfb8015 upstream.
Caching the fault register after a single I2C read may not keep an accurate
value.
Fix by doing two reads in irq_handle_thread() and using the cached value
elsewhere. If a safety timer fault later clears itself, we apparently don't get
an interrupt (INT), however other interrupts would refresh the register cache.
From the data sheet: "When a fault occurs, the charger device sends out INT
and keeps the fault state in REG09 until the host reads the fault register.
Before the host reads REG09 and all the faults are cleared, the charger
device would not send any INT upon new faults. In order to read the
current fault status, the host has to read REG09 two times consecutively.
The 1st reads fault register status from the last read [1] and the 2nd reads
the current fault register status."
[1] presumably a typo; should be "last fault"
Fixes: d7bf353fd0 ("bq24190_charger: Add support for TI BQ24190 Battery Charger")
Signed-off-by: Liam Breck <kernel@networkimprov.net>
Acked-by: Mark Greer <mgreer@animalcreek.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d9fee6a42 upstream.
We wrongly get uevents for bq24190-charger and bq24190-battery on every
register change.
Fix by checking the association with charger and battery before
emitting uevent(s).
Fixes: d7bf353fd0 ("bq24190_charger: Add support for TI BQ24190 Battery Charger")
Signed-off-by: Liam Breck <kernel@networkimprov.net>
Acked-by: Mark Greer <mgreer@animalcreek.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d62acc5ef0 upstream.
The device-specific data is not fully initialized at the time of
request_threaded_irq(). This may cause a crash when the IRQ handler
tries to reference it.
Fix the issue by installing IRQ handler at the end of the probe.
Fixes: d7bf353fd0 ("bq24190_charger: Add support for TI BQ24190 Battery Charger")
Signed-off-by: Liam Breck <kernel@networkimprov.net>
Acked-by: Mark Greer <mgreer@animalcreek.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 767eee362f upstream.
The interrupt signal is TRIGGER_FALLING. This is specified in the
data sheet PIN FUNCTIONS: "The INT pin sends active low, 256us
pulse to host to report charger device status and fault."
Also the direction can be seen in the data sheet Figure 37 "BQ24190
with D+/D- Detection and USB On-The-Go (OTG)" which shows a 10k
pull-up resistor installed for the sample configurations.
Fixes: d7bf353fd0 ("bq24190_charger: Add support for TI BQ24190 Battery Charger")
Signed-off-by: Liam Breck <kernel@networkimprov.net>
Acked-by: Mark Greer <mgreer@animalcreek.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eac6f8b0c7 upstream.
Commit 38addce8b6 ("gcc-plugins: Add latent_entropy plugin") excludes
certain powerpc early boot code from the latent entropy plugin by adding
appropriate CFLAGS. It looks like this was supposed to cover
prom_init.o, but ended up saying init.o (which doesn't exist) instead.
Fix the typo.
Fixes: 38addce8b6 ("gcc-plugins: Add latent_entropy plugin")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 496e9cb5b2 upstream.
The final paragraph of the help text is reversed. We want to enable
this option by default, and disable it if the toolchain has a working
-mprofile-kernel.
Fixes: 8c50b72a3b ("powerpc/ftrace: Add Kconfig & Make glue for mprofile-kernel")
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a7e0fb6c20 upstream.
Currently the opal_exit tracepoint usually shows the opcode as 0:
<idle>-0 [047] d.h. 635.654292: opal_entry: opcode=63
<idle>-0 [047] d.h. 635.654296: opal_exit: opcode=0 retval=0
kopald-1209 [019] d... 636.420943: opal_entry: opcode=10
kopald-1209 [019] d... 636.420959: opal_exit: opcode=0 retval=0
This is because we incorrectly load the opcode into r0 before calling
__trace_opal_exit(), whereas it expects the opcode in r3 (first function
parameter). In fact we are leaving the retval in r3, so opcode and
retval will always show the same value.
Instead load the opcode into r3, resulting in:
<idle>-0 [040] d.h. 636.618625: opal_entry: opcode=63
<idle>-0 [040] d.h. 636.618627: opal_exit: opcode=63 retval=0
Fixes: c49f63530b ("powernv: Add OPAL tracepoints")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ab2537c42 upstream.
In commit a4b349540a ("powerpc/mm: Cleanup LPCR defines") we updated
LPCR_VRMASD wrongly as below.
-#define LPCR_VRMASD (0x1ful << (63-16))
+#define LPCR_VRMASD_SH 47
+#define LPCR_VRMASD (ASM_CONST(1) << LPCR_VRMASD_SH)
We initialize the VRMA bits in LPCR to 0x00 in kvm. Hence using a
different mask value as above while updating lpcr should not have any
impact.
This patch updates it to the correct value.
Fixes: a4b349540a ("powerpc/mm: Cleanup LPCR defines")
Reported-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Jia He <hejianet@gmail.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bdd9968d35 upstream.
val might become 7 in which case stime[7] (array of length 7) would be
accessed during the scnprintf call later and that will cause issues.
Obviously, string concatenation is not intended here so just a comma needs
to be added to fix the issue.
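An illustrative reduction of the bug class (values made up, not the driver's
actual table):

    static const char * const stime[] = {
            "400ms", "600ms", "800ms", "1s",
            "2s", "3s", "4s" "5s",  /* bug: missing comma concatenates to "4s5s",
                                     * leaving 7 entries, so stime[7] overflows */
    };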
Fixes: 98a2766493 ("power_supply: Add new lp8788 charger driver")
Signed-off-by: Giedrius Statkevičius <giedrius.statkevicius@gmail.com>
Acked-by: Milo Kim <milo.kim@ti.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 87ec02e740 upstream.
In case ctx_dma dma mapping fails, ahash_unmap_ctx() tries to
dma unmap an invalid address:
map_seq_out_ptr_ctx() / ctx_map_to_sec4_sg() -> goto unmap_ctx ->
-> ahash_unmap_ctx() -> dma unmap ctx_dma
It is also possible to reach ahash_unmap_ctx() with ctx_dma
uninitialized, or to try to unmap the same address twice.
Fix these by setting ctx_dma = 0 where needed (see the sketch below):
- initialize ctx_dma in ahash_init()
- clear ctx_dma in case of mapping error (instead of holding
the error code returned by the dma map function)
- clear ctx_dma after each unmapping
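A hedged sketch of the resulting pattern (field and function names follow
the caam ahash code, but the details are illustrative):
state->ctx_dma = dma_map_single(jrdev, state->caam_ctx, ctx_len, dir);
if (dma_mapping_error(jrdev, state->ctx_dma)) {
	state->ctx_dma = 0;	/* don't keep the error cookie around */
	return -ENOMEM;
}
/* ... later, on the unmap paths ... */
if (state->ctx_dma) {
	dma_unmap_single(jrdev, state->ctx_dma, ctx_len, dir);
	state->ctx_dma = 0;	/* guard against a second unmap */
}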
Fixes: 32686d34f8 ("crypto: caam - ensure that we clean up after an error")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d761119a9 upstream.
The error code handling is broken as any error code that has the same
bits set as TPM_RC_HASH passes. Implemented tpm2_rc_value() helper to
parse the error value from FMT0 and FMT1 error codes so that these types
of mistakes are prevented in the future.
Fixes: 5ca4c20cfd ("keys, trusted: select hash algorithm for TPM2 chips")
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d66777caa5 upstream.
pwm4 is enabled if bit 2 of GPIO control register 4 is disabled,
not when it is enabled. Since the check is for the skip condition,
it is reversed. This applies to both IT8620 and IT8628.
Fixes: 36c4d98a78 ("hwmon: (it87) Add support for all pwm channels ...")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The mmc, sdhost, sdio and sdio-1bit overlays had a few
anachronisms and oddities which were overdue for fixing.
The new versions should be functionally equivalent.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
of zero must be changed to 0xffff; however, this feature is not
supported by the smsc95xx family, hence enable csum offload only for
IPv4 TCP/UDP packets.
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
Reported-by: popcorn mix <popcornmix@gmail.com>
In order to reduce power consumption and bus traffic, it is sensible
for secondary cores to enter a low-power idle state when waiting to
be started. The wfe instruction causes a core to wait until an event
or interrupt arrives before continuing to the next instruction.
The sev instruction sends a wakeup event to the other cores, so call
it from bcm2836_smp_boot_secondary, the function that wakes up the
waiting cores during booting.
It is harmless to use this patch without the corresponding change
adding wfe to the ARMv7/ARMv8-32 stubs, but if the stubs are updated
and this patch is not applied then the other cores will sleep forever.
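A hedged sketch of the wake-up side; the mailbox register name and base
pointer follow the existing irq-bcm2836 code and are assumptions here:
/* in bcm2836_smp_boot_secondary(): post the start address, then wake
 * any core parked in wfe */
unsigned long secondary_startup_phys =
	(unsigned long)virt_to_phys((void *)secondary_startup);

writel(secondary_startup_phys,
       intc.base + LOCAL_MAILBOX3_SET0 + 16 * cpu);
dsb(sy);	/* make the mailbox write visible to the other core */
sev();		/* send the wakeup event */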
See: https://github.com/raspberrypi/linux/issues/1989
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi startup stub files for multi-core BCM27XX processors
make the secondary CPUs spin until the corresponding mailbox is
written. These stubs are loaded at physical address 0x00000xxx (as seen
by the ARMs), but this page will be reused by the kernel unless it is
explicitly reserved, causing the waiting cores to execute random code.
Use the /memreserve/ Device Tree directive to mark the first page as
off-limits to the kernel.
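For reference, such a reservation is a one-line Device Tree directive of the
form below (the length shown is illustrative):
/memreserve/ 0x00000000 0x00001000;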
See: https://github.com/raspberrypi/linux/issues/1989
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
commit 4617f564c0 upstream.
When calling a dm ioctl that doesn't process any data
(IOCTL_FLAGS_NO_PARAMS), the contents of the data field in struct
dm_ioctl are left uninitialized. The current code incorrectly extends
the size of data copied back to user space, causing the contents of the
kernel stack to be leaked to the user. Fix by only copying the contents
before the data field and allowing the functions processing the ioctl
to override this (see the sketch below).
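A hedged sketch of the idea (not the exact dm-ioctl code):
/* default: copy only the fixed header, i.e. everything before 'data' */
size_t copy_size = offsetof(struct dm_ioctl, data);

/* handlers that really produced output raise data_size themselves */
if (param->data_size > copy_size)
	copy_size = param->data_size;

if (copy_to_user(user, param, copy_size))
	return -EFAULT;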
Signed-off-by: Adrian Salido <salidoa@google.com>
Reviewed-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc434e056f upstream.
The setup/remove_state/instance() functions in the hotplug core code are
serialized against concurrent CPU hotplug, but unfortunately not serialized
against themselves.
As a consequence, a concurrent invocation of these functions results in
corruption of the callback machinery because two instances try to invoke
callbacks on remote cpus at the same time. This results in missing callback
invocations and initiator threads waiting forever on the completion.
The obvious solution of replacing get_online_cpus() with cpu_hotplug_begin()
is not possible because at least one callsite calls into these functions
from a get_online_cpus() locked region.
Extend the protection scope of the cpuhp_state_mutex from solely protecting
the state arrays to cover the callback invocation machinery as well.
Fixes: 5b7aa87e04 ("cpu/hotplug: Implement setup/removal interface")
Reported-and-tested-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: hpa@zytor.com
Cc: mingo@kernel.org
Cc: akpm@linux-foundation.org
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20170314150645.g4tdyoszlcbajmna@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b1ac852eb upstream.
For readahead/fadvise cases, the caller of ceph_readpages does not
hold the buffer capability. Pages can be added to the page cache while
there is no buffer capability. This can cause data integrity issues.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c130b666a9 upstream.
Commit f209fa03fc ("serial: 8250_pci: Detach low-level driver during
PCI error recovery") introduces a potential use-after-free in case the
pciserial_init_ports call in serial8250_io_resume fails, which may
happen if a memory allocation fails or if the .init quirk failed for
whatever reason. If this happens, a later pci_get_drvdata will return a
pointer to freed memory.
This patch reworks the PCI recovery resume hook to restore the old priv
structure in this case, which should be ok since the ports were already
detached. Such an error during recovery causes us to give up on the
recovery.
Fixes: f209fa03fc ("serial: 8250_pci: Detach low-level driver during PCI error recovery")
Reported-by: Michal Suchanek <msuchanek@suse.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f1c635b439 upstream.
Hyper-V host emulation of SCSI for virtual DVD device reports SCSI
version 0 (UNKNOWN) but is still capable of supporting REPORTLUN.
Without this patch, a GEN2 Linux guest on Hyper-V will not boot 4.11
successfully with a virtual DVD-ROM device. What happens is that the SCSI
scan process falls back to doing sequential probing by INQUIRY. But the
storvsc driver has a previous workaround that masks/blocks all error
reports from INQUIRY (or MODE_SENSE) commands. This workaround causes
the scan to then populate a full set of bogus LUNs on the target and
then sends the kernel spinning off into a death spiral doing block reads
on the non-existent LUNs.
By setting the correct blacklist flags, the target with the DVD device
is scanned with REPORTLUN and that works correctly.
This patch needs to go into current 4.11; it is safe but not necessary in
older kernels.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d70fe9d9c upstream.
Since commit 1107d065fd ("tpm_tis: Introduce intermediate layer for
TPM access") Atmel 3203 TPM on ThinkPad X61S (TPM firmware version 13.9)
no longer works. The initialization proceeds fine until we get and
start using chip-reported timeouts - and the chip reports C and D
timeouts of zero.
It turns out that until commit 8e54caf407 ("tpm: Provide a generic
means to override the chip returned timeouts") we had actually let
default timeout values remain in this case, so let's bring back this
behavior to make chips like Atmel 3203 work again.
Use the common code that was introduced by that commit so a warning is
printed in this case and /sys/class/tpm/tpm*/timeouts correctly says the
timeouts aren't chip-original.
This is a backport for 4.9 kernel version of the original commit, with
renaming of "TPM_TIS_ITPM_POSSIBLE" flag removed since it was only a
cosmetic change and not a part of the real bug fix.
Fixes: 1107d065fd ("tpm_tis: Introduce intermediate layer for TPM access")
Cc: stable@vger.kernel.org
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 38bd49064a upstream.
A signal can interrupt a SendReceive call, which results in incoming
responses to the call being ignored. This is a problem for calls such as
open, where ignoring the successful response leaves an open file resource
dangling on the server.
The patch looks into responses which were cancelled after being sent and,
in the case of a successful open, closes the open fids.
For this patch, the check is only done in SendReceive2().
RH-bz: 1403319
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Acked-by: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the firmware has locked up, it is useful to be able to get the vcdbg log
out without requiring a firmware mailbox response.
Issue the mailbox call at probe time instead.
Signed-off-by: popcornmix <popcornmix@gmail.com>
Currently, if two cores access the same page concurrently, one will return
VM_FAULT_NOPAGE and the other VM_FAULT_SIGBUS, crashing the user code.
Also report when mapping fails.
Signed-off-by: popcornmix <popcornmix@gmail.com>
* fiq_fsm: Use correct states when starting isoc OUT transfers
In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.
For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.
Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.
Set the FSM state up properly if we select a single-packet transfer.
Fixes https://github.com/raspberrypi/linux/issues/1842
commit 34a477e529 upstream.
On x86-32, with CONFIG_FIRMWARE and multiple CPUs, if you enable function
graph tracing and then suspend to RAM, it will triple fault and reboot when
it resumes.
The first fault happens when booting a secondary CPU:
startup_32_smp()
load_ucode_ap()
prepare_ftrace_return()
ftrace_graph_is_dead()
(accesses 'kill_ftrace_graph')
The early head_32.S code calls into load_ucode_ap(), which has an
ftrace hook, so it calls prepare_ftrace_return(), which calls
ftrace_graph_is_dead(), which tries to access the global
'kill_ftrace_graph' variable with a virtual address, causing a fault
because the CPU is still in real mode.
The fix is to add a check in prepare_ftrace_return() to make sure it's
running in protected mode before continuing. The check makes sure the
stack pointer is a virtual kernel address. It's a bit of a hack, but
it's not very intrusive and it works well enough.
For reference, here are a few other (more difficult) ways this could
have potentially been fixed:
- Move startup_32_smp()'s call to load_ucode_ap() down to *after* paging
is enabled. (No idea what that would break.)
- Track down load_ucode_ap()'s entire callee tree and mark all the
functions 'notrace'. (Probably not realistic.)
- Pause graph tracing in ftrace_suspend_notifier_call() or bringup_cpu()
or __cpu_up(), and ensure that the pause facility can be queried from
real mode.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: linux-acpi@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/5c1272269a580660703ed2eccf44308e790c7a98.1492123841.git.jpoimboe@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecd43afdbe upstream.
This is not exposed to userspace debuggers yet, which can be done
independently as a separate patch!
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b05c73bd1e upstream.
Allocate buffers on HEAP instead of STACK for local structures
that are to be sent using usb_control_msg().
Signed-off-by: Maksim Salau <maksim.salau@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d6fa57b4d upstream.
While this may appear as a humdrum one line change, it's actually quite
important. An sk_buff stores data in three places:
1. A linear chunk of allocated memory in skb->data. This is the easiest
one to work with, but it precludes using scatterdata since the memory
must be linear.
2. The array skb_shinfo(skb)->frags, which is of maximum length
MAX_SKB_FRAGS. This is nice for scattergather, since these fragments
can point to different pages.
3. skb_shinfo(skb)->frag_list, which is a pointer to another sk_buff,
which in turn can have data in either (1) or (2).
The first two are rather easy to deal with, since they're of a fixed
maximum length, while the third one is not, since there can be
potentially limitless chains of fragments. Fortunately dealing with
frag_list is opt-in for drivers, so drivers don't actually have to deal
with this mess. For whatever reason, macsec decided it wanted pain, and
so it explicitly specified NETIF_F_FRAGLIST.
Because dealing with (1), (2), and (3) is insane, most users of sk_buff
doing any sort of crypto or paging operation call a convenient function
called skb_to_sgvec (which happens to be recursive if (3) is in use!).
This takes an sk_buff as input, and writes into its output pointer an
array of scattergather list items. Sometimes people like to declare a
fixed-size scattergather list on the stack; other times people like to
allocate a fixed-size scattergather list on the heap. However, if you're
doing it in a fixed-size fashion, you really shouldn't be using
NETIF_F_FRAGLIST too (unless you're also ensuring the sk_buff and its
frag_list children aren't shared and then you check the number of
fragments in total required).
Macsec specifically does this:
size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
tmp = kmalloc(size, GFP_ATOMIC);
*sg = (struct scatterlist *)(tmp + sg_offset);
...
sg_init_table(sg, MAX_SKB_FRAGS + 1);
skb_to_sgvec(skb, sg, 0, skb->len);
Specifying MAX_SKB_FRAGS + 1 is the right answer usually, but not if you're
using NETIF_F_FRAGLIST, in which case the call to skb_to_sgvec will
overflow the heap, and disaster ensues.
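For context, a hedged sketch of the one-line change being described (the
macro name follows drivers/net/macsec.c; the exact remaining flag set is an
assumption here):
/* dropping NETIF_F_FRAGLIST means skb_to_sgvec() can never need more
 * than the MAX_SKB_FRAGS + 1 scatterlist entries sized above */
#define MACSEC_FEATURES \
	(NETIF_F_SG | NETIF_F_HIGHDMA)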
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8179a101eb upstream.
ceph_set_acl() calls __ceph_setattr() if the setacl operation needs
to modify inode's i_mode. __ceph_setattr() updates inode's i_mode,
then calls posix_acl_chmod().
The problem is that __ceph_setattr() calls posix_acl_chmod() before
sending the setattr request. The get_acl() call in posix_acl_chmod()
can trigger a getxattr request. The reply of the getxattr request
can restore inode's i_mode to its old value. The set_acl() call in
posix_acl_chmod() sees old value of inode's i_mode, so it calls
__ceph_setattr() again.
Link: http://tracker.ceph.com/issues/19688
Reported-by: Jerry Lee <leisurelysw24@gmail.com>
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Luis Henriques <lhenriques@suse.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13bf9fbff0 upstream.
The NFSv2/v3 code does not systematically check whether we decode past
the end of the buffer. This generally appears to be harmless, but there
are a few places where we do arithmetic on the pointers involved and
don't account for the possibility that a length could be negative. Add
checks to catch these.
Reported-by: Tuomas Haanpää <thaan@synopsys.com>
Reported-by: Ari Kauppi <ari@synopsys.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6838a29ec upstream.
A client can append random data to the end of an NFSv2 or NFSv3 RPC call
without our complaining; we'll just stop parsing at the end of the
expected data and ignore the rest.
Encoded arguments and replies are stored together in an array of pages,
and if a call is too large it could leave inadequate space for the
reply. This is normally OK because NFS RPC's typically have either
short arguments and long replies (like READ) or long arguments and short
replies (like WRITE). But a client that sends an incorrectly long call
can violate those assumptions. This was observed to cause crashes.
Also, several operations increment rq_next_page in the decode routine
before checking the argument size, which can leave rq_next_page pointing
well past the end of the page array, causing trouble later in
svc_free_pages.
So, following a suggestion from Neil Brown, add a central check to
enforce our expectation that no NFSv2/v3 call has both a large call and
a large reply.
As followup we may also want to rewrite the encoding routines to check
more carefully that they aren't running off the end of the page array.
We may also consider rejecting calls that have any extra garbage
appended. That would be safer, and within our rights by spec, but given
the age of our server and the NFS protocol, and the fact that we've
never enforced this before, we may need to balance that against the
possibility of breaking some oddball client.
Reported-by: Tuomas Haanpää <thaan@synopsys.com>
Reported-by: Ari Kauppi <ari@synopsys.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e4cac23c5 upstream.
The FE setups of Intel SST bytcr_rt5640 and bytcr_rt5651 drivers carry
the ignore_suspend flag, and this prevents the suspend/resume working
properly while the stream is running, since SST core code has the
check of the running streams and returns -EBUSY. Drop these
superfluous flags for fixing the behavior.
Also, the bytcr_rt5640 driver lacks the nonatomic flag in some FE
definitions, which leads to a kernel Oops at suspend/resume like:
BUG: scheduling while atomic: systemd-sleep/3144/0x00000003
Call Trace:
dump_stack+0x5c/0x7a
__schedule_bug+0x55/0x70
__schedule+0x63c/0x8c0
schedule+0x3d/0x90
schedule_timeout+0x16b/0x320
? del_timer_sync+0x50/0x50
? sst_wait_timeout+0xa9/0x170 [snd_intel_sst_core]
? sst_wait_timeout+0xa9/0x170 [snd_intel_sst_core]
? remove_wait_queue+0x60/0x60
? sst_prepare_and_post_msg+0x275/0x960 [snd_intel_sst_core]
? sst_pause_stream+0x9b/0x110 [snd_intel_sst_core]
....
This patch addresses these appropriately, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c46f59e902 upstream.
arch_check_elf contains a usage of current_cpu_data that will call
smp_processor_id() with preemption enabled and therefore triggers a
"BUG: using smp_processor_id() in preemptible" warning when an fpxx
executable is loaded.
As a follow-up to commit b244614a60 ("MIPS: Avoid a BUG warning during
prctl(PR_SET_FP_MODE, ...)"), apply the same fix to arch_check_elf by
using raw_current_cpu_data instead. The rationale quoted from the previous
commit:
"It is assumed throughout the kernel that if any CPU has an FPU, then
all CPUs would have an FPU as well, so it is safe to perform the check
with preemption enabled - change the code to use raw_ variant of the
check to avoid the warning."
Fixes: 46490b5725 ("MIPS: kernel: elf: Improve the overall ABI and FPU mode checks")
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d7f29cdb4 upstream.
calculate_min_delta() may incorrectly access a 4th element of buf2[]
which only has 3 elements. This may trigger undefined behaviour and has
been reported to cause strange crashes in start_kernel() sometime after
timer initialization when built with GCC 5.3, possibly due to
register/stack corruption:
sched_clock: 32 bits at 200MHz, resolution 5ns, wraps every 10737418237ns
CPU 0 Unable to handle kernel paging request at virtual address ffffb0aa, epc == 8067daa8, ra == 8067da84
Oops[#1]:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #51
task: 8065e3e0 task.stack: 80644000
$ 0 : 00000000 00000001 00000000 00000000
$ 4 : 8065b4d0 00000000 805d0000 00000010
$ 8 : 00000010 80321400 fffff000 812de408
$12 : 00000000 00000000 00000000 ffffffff
$16 : 00000002 ffffffff 80660000 806a666c
$20 : 806c0000 00000000 00000000 00000000
$24 : 00000000 00000010
$28 : 80644000 80645ed0 00000000 8067da84
Hi : 00000000
Lo : 00000000
epc : 8067daa8 start_kernel+0x33c/0x500
ra : 8067da84 start_kernel+0x318/0x500
Status: 11000402 KERNEL EXL
Cause : 4080040c (ExcCode 03)
BadVA : ffffb0aa
PrId : 0501992c (MIPS 1004Kc)
Modules linked in:
Process swapper/0 (pid: 0, threadinfo=80644000, task=8065e3e0, tls=00000000)
Call Trace:
[<8067daa8>] start_kernel+0x33c/0x500
Code: 24050240 0c0131f9 24849c64 <a200b0a8> 41606020 000000c0 0c1a45e6 00000000 0c1a5f44
UBSAN also detects the same issue:
================================================================
UBSAN: Undefined behaviour in arch/mips/kernel/cevt-r4k.c:85:41
load of address 80647e4c with insufficient space
for an object of type 'unsigned int'
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.18 #47
Call Trace:
[<80028f70>] show_stack+0x88/0xa4
[<80312654>] dump_stack+0x84/0xc0
[<8034163c>] ubsan_epilogue+0x14/0x50
[<803417d8>] __ubsan_handle_type_mismatch+0x160/0x168
[<8002dab0>] r4k_clockevent_init+0x544/0x764
[<80684d34>] time_init+0x18/0x90
[<8067fa5c>] start_kernel+0x2f0/0x500
=================================================================
buf2[] is intentionally only 3 elements so that the last element is the
median once 5 samples have been inserted. Therefore, explicitly prevent the
possibility of comparing against the 4th element rather than extending
the array.
Fixes: 1fa405552e ("MIPS: cevt-r4k: Dynamically calculate min_delta_ns")
Reported-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Tested-by: Rabin Vincent <rabinv@axis.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15892/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 162b270c66 upstream.
KGDB is a kernel debug stub and it can't be used to debug userland as it
can only safely access kernel memory.
On MIPS, however, KGDB has always taken the register state of sleeping
processes from the userland register context at the beginning of the
kernel stack. This is meaningless for kernel threads (which never enter
userland), and for user threads it prevents the user from seeing what the
thread is doing while in the kernel:
(gdb) info threads
Id Target Id Frame
...
3 Thread 2 (kthreadd) 0x0000000000000000 in ?? ()
2 Thread 1 (init) 0x000000007705c4b4 in ?? ()
1 Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201
Get the register state instead from the (partial) kernel register
context stored in the task's thread_struct for resume() to restore. All
threads now correctly appear to be in context_switch():
(gdb) info threads
Id Target Id Frame
...
3 Thread 2 (kthreadd) context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
2 Thread 1 (init) context_switch (rq=<optimized out>, cookie=..., next=<optimized out>, prev=0x0) at kernel/sched/core.c:2903
1 Thread -2 (shadowCPU0) 0xffffffff8012524c in arch_kgdb_breakpoint () at arch/mips/kernel/kgdb.c:201
Call clobbered registers which aren't saved and exception registers
(BadVAddr & Cause) which can't be easily determined without stack
unwinding are reported as 0. The PC is taken from the return address,
such that the state presented matches that found immediately after
returning from resume().
Fixes: 8854700115 ("[MIPS] kgdb: add arch support for the kernel's kgdb core")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15829/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e7655fd4f upstream.
The snd_use_lock_sync() (thus its implementation
snd_use_lock_sync_helper()) has the 5 seconds timeout to break out of
the sync loop. It was introduced from the beginning, just to be
"safer", in terms of avoiding the stupid bugs.
However, as Ben Hutchings suggested, this timeout rather introduces a
potential leak or use-after-free that was apparently fixed by the
commit 2d7d54002e ("ALSA: seq: Fix race during FIFO resize"):
for example, snd_seq_fifo_event_in() -> snd_seq_event_dup() ->
copy_from_user() could block for a long time, and snd_use_lock_sync()
times out and still leaves the cell while the pool is being released.
To fix this problem, we remove the break on timeout while
still keeping the warning.
Suggested-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dfb00a5693 upstream.
An abstraction of asynchronous transaction for transmission of MIDI
messages was introduced in Linux v4.4. Each driver can utilize this
abstraction to transfer MIDI messages via fixed-length payload of
transaction to a certain unit address. Filling payload of the transaction
is done by a callback. In this callback, each driver can return a negative
error code; however, the current implementation assigns the return value to
an unsigned variable.
This commit changes the type of the variable to fix the bug.
Reported-by: Julia Lawall <Julia.Lawall@lip6.fr>
Fixes: 585d7cba5e ("ALSA: firewire-lib: add helper functions for asynchronous transactions to transfer MIDI messages")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d016d57fd upstream.
Since commit 6c29230e2a ("ALSA: oxfw: delayed registration of sound
card"), the ALSA oxfw driver fails to handle SCS.1m/1d due to -EBUSY at the
call of snd_card_register(). The cause is that the driver manages to register
two rawmidi instances with the same device number 0. This is a regression
introduced in kernel 4.7.
This commit fixes the regression by fixing up the device property after
discovering stream formats.
Fixes: 6c29230e2a ("ALSA: oxfw: delayed registration of sound card")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 105f5528b9 ]
In situations where an skb is paged, the transport header pointer and
tail pointer can be the same because the skb contents are in frags.
This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a
length of 0 when the length to receive is actually greater than zero.
skb->len is already correctly set in ip6_input_finish() with
pskb_pull(), so use skb->len as it always returns the correct result
for both linear and paged data.
Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c120144407 ]
Always zero out ca_priv data in tcp_assign_congestion_control() so that
ca_priv data is cleared out during socket creation.
Also always zero out ca_priv data in tcp_reinit_congestion_control() so
that when the cc algorithm is changed, ca_priv data is cleared out as well.
We should still zero out ca_priv data even in the TCP_CLOSE state, because
the user could call connect() on AF_UNSPEC to disconnect the socket and
leave it in TCP_CLOSE state, and later call setsockopt() to switch the cc
algorithm on this socket (see the sketch below).
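A hedged sketch of the zeroing; icsk_ca_priv is the per-socket scratch area
used by cc modules, and the helper name and placement are illustrative:
static void tcp_clear_ca_priv(struct sock *sk)
{
	struct inet_connection_sock *icsk = inet_csk(sk);

	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
}
/* calling this from both tcp_assign_congestion_control() and
 * tcp_reinit_congestion_control() covers socket creation as well as a
 * later setsockopt() switch of the algorithm */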
Fixes: 2b0a8c9ee ("tcp: add CDG congestion control")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 199ab00f3c ]
Andrey reported a out-of-bound access in ip6_tnl_xmit(), this
is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4
neigh key as an IPv6 address:
neigh = dst_neigh_lookup(skb_dst(skb),
&ipv6_hdr(skb)->daddr);
if (!neigh)
goto tx_err_link_failure;
addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE
addr_type = ipv6_addr_type(addr6);
if (addr_type == IPV6_ADDR_ANY)
addr6 = &ipv6_hdr(skb)->daddr;
memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
Also, the network header of the skb at this point should still be IPv4
for 4in6 tunnels; we should not just use it as an IPv6 header.
This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it
is, we are safe to do the nexthop lookup using skb_dst() and
ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which
dest address we can pick here, so we have to rely on callers to fill it
in from the tunnel config, and just fall back to ip6_route_output() to
make the decision.
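A hedged sketch of the logic after the fix (mirroring the snippet quoted
above; error labels and surrounding code are elided):
if (skb->protocol == htons(ETH_P_IPV6)) {
	/* safe: the dst and the header really are IPv6 */
	neigh = dst_neigh_lookup(skb_dst(skb), &ipv6_hdr(skb)->daddr);
	if (!neigh)
		goto tx_err_link_failure;
	addr6 = (struct in6_addr *)&neigh->primary_key;
	addr_type = ipv6_addr_type(addr6);
	if (addr_type == IPV6_ADDR_ANY)
		addr6 = &ipv6_hdr(skb)->daddr;
	memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
}
/* else: IPv4 payload (4in6) -- leave fl6->daddr to the tunnel config
 * and let ip6_route_output() decide */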
Fixes: ea3dc9601b ("ip6_tunnel: Add support for wildcard tunnel endpoints.")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f555f34fdc ]
The Ethernet link on an interrupt-driven PHY was not coming up if the Ethernet
cable was plugged in before the Ethernet interface was brought up.
The patch triggers the PHY state machine to update the link state if the PHY
was requested to do auto-negotiation and the auto-negotiation complete flag is
already set.
During the power-up cycle the PHY does auto-negotiation, generates an interrupt
and sets the auto-negotiation complete flag. The interrupt is handled by the
PHY state machine but doesn't update the link state because the PHY is in the
PHY_READY state. Some time later the MAC is brought up, starts and requests the
PHY to do auto-negotiation. If there are no new settings to advertise,
genphy_config_aneg() doesn't start PHY auto-negotiation. The PHY continues to
stay in the auto-negotiation complete state and doesn't fire an interrupt. At
the same time the PHY state machine expects that the PHY started
auto-negotiation and waits for an interrupt from the PHY, which it won't get.
Fixes: 321beec504 ("net: phy: Use interrupts when available in NOLINK state")
Signed-off-by: Alexander Kochetkov <al.kochet@gmail.com>
Cc: stable <stable@vger.kernel.org> # v4.9+
Tested-by: Roger Quadros <rogerq@ti.com>
Tested-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8048ced9be ]
Taking down the loopback device wreaks havoc on IPv6 routing. By
extension, taking down a VRF device wreaks havoc on its table.
Dmitry and Andrey both reported heap out-of-bounds reports in the IPv6
FIB code while running syzkaller fuzzer. The root cause is a dead dst
that is on the garbage list gets reinserted into the IPv6 FIB. While on
the gc (or perhaps when it gets added to the gc list) the dst->next is
set to an IPv4 dst. A subsequent walk of the ipv6 tables causes the
out-of-bounds access.
Andrey's reproducer was the key to getting to the bottom of this.
With IPv6, host routes for an address have the dst->dev set to the
loopback device. When the 'lo' device is taken down, rt6_ifdown initiates
a walk of the fib evicting routes with the 'lo' device which means all
host routes are removed. That process moves the dst which is attached to
an inet6_ifaddr to the gc list and marks it as dead.
The recent change to keep global IPv6 addresses added a new function,
fixup_permanent_addr, that is called on admin up. That function restarts
dad for an inet6_ifaddr and when it completes the host route attached
to it is inserted into the fib. Since the route was marked dead and
moved to the gc list, re-inserting the route causes the reported
out-of-bounds accesses. If the device with the address is taken down
or the address is removed, the WARN_ON in fib6_del is triggered.
All of those faults are fixed by regenerating the host route if the
existing one has been moved to the gc list, something that can be
determined by checking if the rt6i_ref counter is 0.
Fixes: f1705ec197 ("net: ipv6: Make address flushing on ifdown optional")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f6478218e6 ]
When a parent macvlan device is destroyed we end up purging its
broadcast queue without dropping the device reference count on
the packet source device. This causes the source device to linger.
This patch drops that reference count.
Fixes: 260916dfb4 ("macvlan: Fix potential use-after free for...")
Reported-by: Joe Ghalam <Joe.Ghalam@dell.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e82c9e4ed ]
Handler for ETHTOOL_GRXCLSRLALL must set info->data to the size
of the table, regardless of the amount of entries in it.
Existing code does not do that, and this breaks all usage of ethtool -N
or -n without explicit location, with this error:
rmgr: Invalid RX class rules table size: Success
Set info->data to the table size.
Tested:
ethtool -n ens8
ethtool -N ens8 flow-type ip4 src-ip 1.1.1.1 dst-ip 2.2.2.2 action 1
ethtool -N ens8 flow-type ip4 src-ip 1.1.1.1 dst-ip 2.2.2.2 action 1 loc 55
ethtool -n ens8
ethtool -N ens8 delete 1023
ethtool -N ens8 delete 55
Fixes: f913a72aa0 ("net/mlx5e: Add support to get ethtool flow rules")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cbad8cddb6 ]
RX packet headers are meant to be contained in the SKB linear part,
and we chose a threshold of 128 bytes.
It turns out this is not enough, e.g. for an IPv6 packet over VxLAN.
In this case, UDP/IPv4 needs 42 bytes, the GENEVE header is 8 bytes,
and TCP/IPv6 needs 86 bytes. In total 136 bytes, which is more than the
current 128 bytes. In this case the expand-header flow is reached.
The warning in skb_try_coalesce() caused by a wrong truesize
was already fixed here:
commit 158f323b98 ("net: adjust skb->truesize in pskb_expand_head()").
Still, we prefer to totally avoid the expand header flow for performance reasons.
Tested regular TCP_STREAM with iperf for 1 and 8 streams, no degradation was found.
Fixes: 461017cb00 ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 55378a238e ]
If the FW is stuck in the initializing state we will skip the driver load,
but the current error handling flow doesn't clean up previously allocated
command interface resources.
Fixes: e3297246c2 ('net/mlx5_core: Wait for FW readiness on startup')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 557c44be91 ]
Andrey reported a fault in the IPv6 route code:
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880069809600 task.stack: ffff880062dc8000
RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975
RSP: 0018:ffff880062dced30 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006
RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018
RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000
FS: 00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0
Call Trace:
ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128
ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212
...
Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit
set. Flags passed to the kernel are blindly copied to the allocated
rt6_info by ip6_route_info_create making a newly inserted route appear
as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set
and expects rt->dst.from to be set - which it is not since it is not
really a per-cpu copy. The subsequent call to __ip6_dst_alloc then
generates the fault.
Fix by checking for the flag and failing with EINVAL.
Fixes: d52d3997f8 ("ipv6: Create percpu rt6_info")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 43170c4e0b ]
Commit 07b26c9454 ("gso: Support partial splitting at the frag_list
pointer") assumes that all SKBs in a frag_list (except maybe the last
one) contain the same amount of GSO payload.
This assumption is not always correct, resulting in the following
warning message in the log:
skb_segment: too many frags
For example, mlx5 driver in Striding RQ mode creates some RX SKBs with
one frag, and some with 2 frags.
After GRO, the frag_list SKBs end up having different amounts of payload.
If this frag_list SKB is then forwarded, the aforementioned assumption
is violated.
Validate the assumption, and fall back to software GSO if it is not true.
Change-Id: Ia03983f4a47b6534dd987d7a2aad96d54d46d212
Fixes: 07b26c9454 ("gso: Support partial splitting at the frag_list pointer")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d386cd9a7 ]
This patch is prompted by a static checker warning about a potential
use after free. The concern is that netif_rx_ni() can free "skb" and we
call it twice.
When I look at the commit that added this, it looks like some stray
lines were added accidentally. It doesn't make sense to me that we
would receive the same data two times. I asked the author but never
received a response.
I can't test this code, but I'm pretty sure my patch is correct.
Fixes: 4b063258ab ("dp83640: Delay scheduled work.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1debdc8f9e ]
The DMA API debugging (when enabled) causes:
WARNING: CPU: 0 PID: 1445 at lib/dma-debug.c:519 add_dma_entry+0xe0/0x12c
DMA-API: exceeded 7 overlapping mappings of cacheline 0x01b2974d
to be printed after repeated initialization of the Ether device, e.g.
suspend/resume or 'ifconfig' up/down. This is because DMA buffers mapped
using dma_map_single() in sh_eth_ring_format() and sh_eth_start_xmit() are
never unmapped. Resolve this problem by unmapping the buffers when freeing
the descriptor rings; in order to do it right, we'd have to add an extra
parameter to sh_eth_txfree() (we rename this function to sh_eth_tx_free(),
while at it).
Based on the commit a47b70ea86 ("ravb: unmap descriptors when freeing
rings").
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1862d6208d ]
Syzkaller reported a use-after-free in ip_recv_error at line
info->ipi_ifindex = skb->dev->ifindex;
This function is called on dequeue from the error queue, at which
point the device pointer may no longer be valid.
Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
pointer is valid or NULL. Store it in temporary storage skb->cb.
It is safe to reference skb->dev here, as called from device drivers
or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
in that case it is NULL and ifindex is set to 0 (invalid).
Do not return a pktinfo cmsg if ifindex is 0. This maintains the
current behavior of not returning a cmsg if skb->dev was NULL.
On dequeue, the ipv4 path will cast from sock_exterr_skb to
in_pktinfo. Both have ifindex as their first element, so no explicit
conversion is needed. This is by design, introduced in commit
0b922b7a82 ("net: original ingress device index in PKTINFO"). For
ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.
Fixes: 829ae9d611 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a2d6cbb067 ]
addrconf_ifdown() removes elements from the idev->addr_list without
holding the idev->lock.
If this happens while the loop in __ipv6_dev_get_saddr() is handling the
same element, that function ends up in an infinite loop:
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [test:1719]
Call Trace:
ipv6_get_saddr_eval+0x13c/0x3a0
__ipv6_dev_get_saddr+0xe4/0x1f0
ipv6_dev_get_saddr+0x1b4/0x204
ip6_dst_lookup_tail+0xcc/0x27c
ip6_dst_lookup_flow+0x38/0x80
udpv6_sendmsg+0x708/0xba8
sock_sendmsg+0x18/0x30
SyS_sendto+0xb8/0xf8
syscall_common+0x34/0x58
Fixes: 6a923934c3 (Revert "ipv6: Revert optional address flusing on ifdown.")
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 34b2789f1d ]
Now sctp doesn't check sock's state before listening on it. It could
even cause changing a sock with any state to become a listening sock
when doing sctp_listen.
This patch is to fix it by checking sock's state in sctp_listen, so
that it will listen on the sock with right state.
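A hedged sketch of such a check in sctp_inet_listen(); the exact placement
and error path are illustrative:
/* only CLOSED or already-LISTENING sockets may (re)enter listen */
if (!sctp_sstate(sk, LISTENING) && !sctp_sstate(sk, CLOSED))
	return -EINVAL;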
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a8801799c6 ]
inet_rtm_getroute synthesizes a skeletal ICMP skb, which is passed to
ip_route_input when iif is given. If a multipath route is present for
the designated destination, ip_multipath_icmp_hash ends up being called,
which uses the source/destination addresses within the skb to calculate
a hash. However, those are not set in the synthetic skb, causing it to
return an arbitrary and incorrect result.
Instead, use UDP, which gets no such special treatment.
Signed-off-by: Florian Larysch <fl@n621.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e08293a4cc ]
Take a reference on the sessions returned by l2tp_session_find_nth()
(and rename it l2tp_session_get_nth() to reflect this change), so that
caller is assured that the session isn't going to disappear while
processing it.
For procfs and debugfs handlers, the session is held in the .start()
callback and dropped in .show(). Given that pppol2tp_seq_session_show()
dereferences the associated PPPoL2TP socket and that
l2tp_dfs_seq_session_show() might call pppol2tp_show(), we also need to
call the session's .ref() callback to prevent the socket from going
away from under us.
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Fixes: 0ad6614048 ("l2tp: Add debugfs files for dumping l2tp debug info")
Fixes: 309795f4be ("l2tp: Add netlink control API for L2TP")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8f8d28e4d6 ]
When calculating rb->frames_per_block * req->tp_block_nr the result
can overflow.
Add a check that tp_block_size * tp_block_nr <= UINT_MAX.
Since frames_per_block <= tp_block_size, the expression then can
never overflow (see the sketch below).
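A hedged sketch of the added bound check in packet_set_ring(); the error
label is illustrative:
if (unlikely((u64)req->tp_block_size * req->tp_block_nr > UINT_MAX))
	goto out;
/* with the product bounded, frames_per_block * tp_block_nr (where
 * frames_per_block <= tp_block_size) cannot overflow either */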
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e91793bb61 ]
The Rx path may grab the socket right before pppol2tp_release(), but
nothing guarantees that it will enqueue packets before
skb_queue_purge(). Therefore, the socket can be destroyed without its
queues fully purged.
Fix this by purging queues in pppol2tp_session_destruct() where we're
guaranteed nothing is still referencing the socket.
Fixes: 9e9cb6221a ("l2tp: fix userspace reception on plain L2TP sockets")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 94d7ee0baa ]
The code following l2tp_tunnel_find() expects that a new reference is
held on sk. Either sk_receive_skb() or the discard_put error path will
drop a reference from the tunnel's socket.
This issue exists in both l2tp_ip and l2tp_ip6.
Fixes: a3c18422a4 ("l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b1977682a3 ]
llvm can optimize the 'if (ptr > data_end)' checks to be in the order
slightly different than the original C code which will confuse verifier.
Like:
if (ptr + 16 > data_end)
return TC_ACT_SHOT;
// may be followed by
if (ptr + 14 > data_end)
return TC_ACT_SHOT;
while llvm can see that 'ptr' is valid for all 16 bytes,
the verifier could not.
Fix verifier logic to account for such case and add a test.
Reported-by: Huapeng Zhou <hzhou@fb.com>
Fixes: 969bf05eb3 ("bpf: direct packet access")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 49d52e8108 ]
If the PHY is halted on stop, then do not set the state to PHY_UP. This
ensures the phy will be restarted later in phy_start when the machine is
started again.
Fixes: 00db8189d9 ("This patch adds a PHY Abstraction Layer to the Linux Kernel, enabling ethernet drivers to remain as ignorant as is reasonable of the connected PHY's design and operation details.")
Signed-off-by: Nathan Sullivan <nathan.sullivan@ni.com>
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Acked-by: Xander Huff <xander.huff@ni.com>
Acked-by: Kyle Roeschley <kyle.roeschley@ni.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 48481c8fa1 ]
Dmitry posted a nice reproducer of a bug triggering in neigh_probe()
when dereferencing a NULL neigh->ops->solicit method.
This can happen for arp_direct_ops/ndisc_direct_ops and similar,
which can be used for NUD_NOARP neighbours (created when dev->header_ops
is NULL). An admin can then force changing nud_state to some other state
that would fire the neigh timer.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit adfae8a5d8 ]
I encountered this bug when using /proc/kcore to examine the kernel. Plus a
coworker inquired about debugging tools. We computed pa but did
not use it during the maximum physical address bits test. Instead we used
the identity mapped virtual address which will always fail this test.
I believe the defect came in here:
[bpicco@zareason linus.git]$ git describe --contains bb4e6e85da
v3.18-rc1~87^2~4
.
Signed-off-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
See https://github.com/raspberrypi/linux/issues/1709
Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()
commit 956a4cd2c9 upstream.
The following warning triggers with a new unit test that stresses the
device-dax interface.
===============================
[ ERR: suspicious RCU usage. ]
4.11.0-rc4+ #1049 Tainted: G O
-------------------------------
./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0
2 locks held by fio/9070:
#0: (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0
#1: (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]
Call Trace:
dump_stack+0x86/0xc3
lockdep_rcu_suspicious+0xd7/0x110
___might_sleep+0xac/0x250
__might_sleep+0x4a/0x80
__alloc_pages_nodemask+0x23a/0x360
alloc_pages_current+0xa1/0x1f0
pte_alloc_one+0x17/0x80
__pte_alloc+0x1e/0x120
__get_locked_pte+0x1bf/0x1d0
insert_pfn.isra.70+0x3a/0x100
? lookup_memtype+0xa6/0xd0
vm_insert_mixed+0x64/0x90
dax_dev_huge_fault+0x520/0x620 [dax]
? dax_dev_huge_fault+0x32/0x620 [dax]
dax_dev_fault+0x10/0x20 [dax]
__do_fault+0x1e/0x140
__handle_mm_fault+0x9af/0x10d0
handle_mm_fault+0x16d/0x370
? handle_mm_fault+0x47/0x370
__do_page_fault+0x28c/0x4f0
trace_do_page_fault+0x58/0x2a0
do_async_page_fault+0x1a/0xa0
async_page_fault+0x28/0x30
Inserting a page table entry may trigger an allocation while we are
holding a read lock to keep the device instance alive for the duration
of the fault. Use srcu for this keep-alive protection.
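A hedged sketch of the srcu keep-alive pattern; the srcu domain name and the
__dax_dev_fault helper are illustrative:
DEFINE_STATIC_SRCU(dax_srcu);

static int dax_dev_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
	int id, rc;

	id = srcu_read_lock(&dax_srcu);	/* readers may sleep, unlike RCU */
	rc = __dax_dev_fault(vma, vmf);	/* may allocate page tables */
	srcu_read_unlock(&dax_srcu, id);
	return rc;
}
/* the teardown path then waits for readers with synchronize_srcu() */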
Fixes: dee4107924 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0dc9c639e6 upstream.
The NFIT MCE handler callback (for handling media errors on NVDIMMs)
takes a mutex to add the location of a memory error to a list. But since
the notifier call chain for machine checks (x86_mce_decoder_chain) is
atomic, we get a lockdep splat like:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
[..]
Call Trace:
dump_stack
___might_sleep
__might_sleep
mutex_lock_nested
? __lock_acquire
nfit_handle_mce
notifier_call_chain
atomic_notifier_call_chain
? atomic_notifier_call_chain
mce_gen_pool_process
Convert the notifier to a blocking one which gets to run only in process
context.
Boris: remove the notifier call in atomic context in print_mce(). For
now, let's print the MCE on the atomic path so that we can make sure
they go out and get logged at least.
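A hedged sketch of the conversion described above (the chain and the
registration helper exist in the mce code; surrounding details are elided):
/* was: static ATOMIC_NOTIFIER_HEAD(x86_mce_decoder_chain); */
static BLOCKING_NOTIFIER_HEAD(x86_mce_decoder_chain);

void mce_register_decode_chain(struct notifier_block *nb)
{
	blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
}
/* ...and the chain is now invoked via blocking_notifier_call_chain()
 * from process context (the gen_pool work), not from the MCE handler */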
Fixes: 6839a6d96f ("nfit: do an ARS scrub on hitting a latent media error")
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29f72ce3e4 upstream.
MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.
Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.
Tested on AMD Fam15h and Fam17h systems.
WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
kobject: (ffff88085bb256c0): attempted to be registered with empty name!
...
Call Trace:
kobject_add_internal
kobject_add
kobject_create_and_add
threshold_create_device
threshold_init_device
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e1ba4f27f upstream.
If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
OOPS:
Bad kernel stack pointer cd93c840 at c000000000009868
Oops: Bad kernel stack pointer, sig: 6 [#1]
...
GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
...
NIP [c000000000009868] resume_kernel+0x2c/0x58
LR [c000000000006208] program_check_common+0x108/0x180
On a 64-bit system, when the user probes on a 'stdu' instruction, the kernel
does not emulate the actual store in emulate_step() because it may corrupt
the exception frame. So the kernel does the actual store operation in the
exception return code, i.e. resume_kernel().
resume_kernel() loads the saved stack pointer from memory using lwz, which
only loads the low 32 bits of the address, causing the kernel crash.
Fix this by loading the 64-bit value instead.
Fixes: be96f63375 ("powerpc: Split out instruction analysis part of emulate_step()")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
[mpe: Change log massage, add stable tag]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cd9a21ce0 upstream.
In commit 6afaf8a484 ("UBI: flush wl before clearing update marker") I
managed to trigger and fix a similar bug. Here is another variant, one I
assumed wouldn't matter back then, but it turns out UBI has a check for it
and will error out like this:
|ubi0 warning: validate_vid_hdr: inconsistent used_ebs
|ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592
All you need to trigger this is "ubiupdatevol /dev/ubi0_0 file" plus a
powercut in the middle of the operation.
ubi_start_update() sets the update-marker and puts all EBs on the erase
list. After that, userland can proceed to write new data while the old EBs
aren't completely erased yet. A powercut at this point is usually not that
much of a tragedy. UBI won't give read access to the static volume
because it has the update marker. It will most likely set the corrupted
flag because it misses some EBs.
So we are all good, unless the size of the image that has been written
differs from the old image by at least one EB. In that case UBI will find
two different values for `used_ebs' and refuse to attach the image with
the error message mentioned above.
So in order not to get into this situation, the patch ensures that we
wait until everything is removed before trying to write any data.
The alternative would be to detect such a case and remove all EBs at
attach time, after we have processed the volume table and seen the
update-marker set. That patch would be bigger and I doubt it is worth it,
since write() usually has to wait for a new EB from time to time anyway
because there are usually not that many spare EBs available.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e478066ea upstream.
There are two bugs in the follow-MAC code:
* it treats the radiotap header as the 802.11 header
(therefore it can't possibly work)
* it doesn't verify that the skb data it accesses is actually
present in the header, which is mitigated by the first point
Fix this by moving all of this out into a separate function.
This function copies the data it needs using skb_copy_bits()
to make sure it can be accessed if it's paged, and offsets
that by the possibly present vendor radiotap header.
This also makes all those conditions more readable.
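A minimal sketch of that approach, with illustrative names rather than the actual mac80211 code: the bytes are pulled out with skb_copy_bits() so paged skb data is handled, and the copy starts after any vendor radiotap header.
	#include <linux/ieee80211.h>
	#include <linux/skbuff.h>

	/* Illustrative only: copy the 802.11 header out of a possibly paged skb,
	 * skipping rtap_vendor_space bytes of vendor radiotap data first. */
	static bool copy_80211_hdr_example(const struct sk_buff *skb,
					   int rtap_vendor_space,
					   struct ieee80211_hdr *hdr)
	{
		/* returns < 0 if the skb is too short, i.e. the header bytes
		 * are simply not present */
		return skb_copy_bits(skb, rtap_vendor_space, hdr, sizeof(*hdr)) == 0;
	}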
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3018e947d7 upstream.
AP/AP_VLAN modes don't accept any real 802.11 multicast data
frames, but since they do need to accept broadcast management
frames the same is currently permitted for data frames. This
opens a security problem because such frames would be decrypted
with the GTK, and could even contain unicast L3 frames.
Since the spec says that ToDS frames must always have the BSSID
as the RA (addr1), reject any other data frames.
The problem was originally reported in "Predicting, Decrypting,
and Abusing WPA2/802.11 Group Keys" at usenix
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef
and brought to my attention by Jouni.
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32fe905c17 upstream.
It is perfectly fine to link a tmpfile back using linkat().
Since tmpfiles are created with a link count of 0, they appear
on the orphan list; upon re-linking, the inode has to be removed
from the orphan list again.
Ralph faced a filesystem corruption in combination with overlayfs
due to this bug.
Cc: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Cc: Amir Goldstein <amir73il@gmail.com>
Reported-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
Reported-by: Amir Goldstein <amir73il@gmail.com>
Fixes: 474b93704f ("ubifs: Implement O_TMPFILE")
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3d9fda688 upstream.
Remove faulty leftover check in do_rename(), apparently introduced in a
merge that combined whiteout support changes with commit f03b8ad8d3
("fs: support RENAME_NOREPLACE for local filesystems")
Fixes: f03b8ad8d3 ("fs: support RENAME_NOREPLACE for local filesystems")
Fixes: 9e0a1fff8d ("ubifs: Implement RENAME_WHITEOUT")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f32784535 upstream.
Currently DDR50 cards need tuning by default. We see tuning failures
and data CRC errors when a DDR50 SD card is in use. This is because the
default pad I/O drive strength cannot keep a DDR50 card working stably.
So increase the pad I/O drive strength for DDR50 cards, and use
pins_100mhz.
This fixes DDR50 card support for IMX since DDR50 tuning was enabled from
commit 9faac7b95e ("mmc: sdhci: enable tuning for DDR50")
Tested-and-reported-by: Tim Harvey <tharvey@gateworks.com>
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe8c470ab8 upstream.
gcc -O2 cannot always prove that the loop in acpi_power_get_inferred_state()
is entered at least once, so it assumes that cur_state might not get
initialized:
drivers/acpi/power.c: In function 'acpi_power_get_inferred_state':
drivers/acpi/power.c:222:9: error: 'cur_state' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This sets the variable to zero at the start of the loop, to ensure that
there is well-defined behavior even for an empty list. This gets rid of
the warning.
The warning first showed up when the -Os flag got removed in a bug fix
patch in linux-4.11-rc5.
I would suggest merging this addon patch on top of that bug fix to avoid
introducing a new warning in the stable kernels.
Fixes: 61b79e16c6 (ACPI: Fix incompatibility with mcount-based function graph tracing)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 704de489e0 upstream.
Temporarily got a Lifebook E547 into my hands and noticed the touchpad
only works after running:
echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled
Add it to the list of machines that need this workaround.
Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
Reviewed-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8f60d1fad upstream.
On heavy paging with KSM I see guest data corruption. Turns out that
KSM will add pages to its tree for which the mapping returns true for
pte_unused (or might become such later). KSM will unmap such pages and
reinstantiate them with different attributes (e.g. write protected or
special, e.g. in replace_page or write_protect_page). This uncovered
a bug in our pagetable handling: we must remove the unused flag as
soon as an entry becomes present again.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0918f1ce6 upstream.
STATUS_BAD_NETWORK_NAME can be received during node failover,
causing the flag to be set and making the reconnect thread
always unsuccessful, thereafter.
Once the only place where it is set is removed, the remaining
bits are rendered moot.
Removing it does not prevent "mount" from failing when a nonexistent
share is passed.
What happens when the share really ceases to exist while the
share is mounted is undefined now as much as it was before.
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 62a6cfddcc upstream.
commit 4fcd1813e6 ("Fix reconnect to not defer smb3 session reconnect
long after socket reconnect") added support for Negotiate requests to
be initiated by echo calls.
To avoid delays in calling echo after a reconnect, I added the patch
introduced by the commit b8c600120f ("Call echo service immediately
after socket reconnect").
This has however caused a regression with cifs shares which do not have
support for echo calls to trigger Negotiate requests. On connections
which need to call Negotiation, the echo calls trigger an error which
triggers a reconnect which in turn triggers another echo call. This
results in a loop which is only broken when an operation is performed on
the cifs share. For an idle share, it can DOS a server.
The patch uses the smb_operation can_echo() for cifs so that it is
called only if the connection has already been set up.
kernel bz: 194531
Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
Tested-by: Jonathan Liu <net147@gmail.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc280fe871 upstream.
Commit 6afcf8ef0c ("mm, compaction: fix NR_ISOLATED_* stats for pfn
based migration") moved the dec_node_page_state() call (along with the
page_is_file_cache() call) to after putback_lru_page().
But page_is_file_cache() can change after putback_lru_page() is called,
so it should be called before putback_lru_page(), as it was before that
patch, to prevent NR_ISOLATE_* stats from going negative.
Without this fix, non-CONFIG_SMP kernels end up hanging in the
while(too_many_isolated()) { congestion_wait() } loop in
shrink_active_list() due to the negative stats.
Mem-Info:
active_anon:32567 inactive_anon:121 isolated_anon:1
active_file:6066 inactive_file:6639 isolated_file:4294967295
^^^^^^^^^^
unevictable:0 dirty:115 writeback:0 unstable:0
slab_reclaimable:2086 slab_unreclaimable:3167
mapped:3398 shmem:18366 pagetables:1145 bounce:0
free:1798 free_pcp:13 free_cma:0
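As a rough illustration of the reordering described above (a sketch, not the actual migrate.c hunk; the function name is made up), the per-page accounting is sampled and decremented while the isolated page is still held, and only then is the page put back:
	#include <linux/mm.h>
	#include <linux/mm_inline.h>
	#include <linux/swap.h>
	#include <linux/vmstat.h>

	/* Sample page_is_file_cache() and drop the NR_ISOLATED_* count before
	 * putback_lru_page(), so the file/anon classification cannot change
	 * underneath us. */
	static void putback_isolated_page_example(struct page *page)
	{
		dec_node_page_state(page, NR_ISOLATED_ANON +
				    page_is_file_cache(page));
		putback_lru_page(page);
	}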
Fixes: 6afcf8ef0c ("mm, compaction: fix NR_ISOLATED_* stats for pfn based migration")
Link: http://lkml.kernel.org/r/1492683865-27549-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Ming Ling <ming.ling@spreadtrum.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78f7a45dac upstream.
I noticed that reading the snapshot file when it is empty no longer gives a
status. It is supposed to show the status of the snapshot buffer as well as how
to allocate and use it. For example:
># cat snapshot
# tracer: nop
#
#
# * Snapshot is allocated *
#
# Snapshot commands:
# echo 0 > snapshot : Clears and frees snapshot buffer
# echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
# Takes a snapshot of the main buffer.
# echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
# (Doesn't have to be '2' works with any number that
# is not a '0' or '1')
But instead it just showed an empty buffer:
># cat snapshot
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
What happened was that it was using the ring_buffer_iter_empty() function to
see if it was empty, and if it was, it showed the status. But that function
was returning false when it was empty. The reason was that the iter header
page was on the reader page, and the reader page was empty, but so was the
buffer itself. The check only tested to see if the iter was on the commit
page, but the commit page was no longer pointing to the reader page; since
all pages were empty, the buffer was empty as well.
Fixes: 651e22f270 ("ring-buffer: Always reset iterator to reader page")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df62db5be2 upstream.
Currently the snapshot trigger enables the probe and then allocates the
snapshot. If the probe triggers before the allocation, it could cause the
snapshot to fail and turn tracing off. It's best to allocate the snapshot
buffer first, and then enable the trigger. If something goes wrong in the
enabling of the trigger, the snapshot buffer is still allocated, but it can
also be freed by the user by writing zero into the snapshot buffer file.
Also add a check of the return status of alloc_snapshot().
Fixes: 77fd5c15e3 ("tracing: Add snapshot trigger to function probes")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9f838d104 upstream.
This fixes CVE-2017-7472.
Running the following program as an unprivileged user exhausts kernel
memory by leaking thread keyrings:
#include <keyutils.h>
int main()
{
for (;;)
keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
}
Fix it by only creating a new thread keyring if there wasn't one before.
To make things more consistent, make install_thread_keyring_to_cred()
and install_process_keyring_to_cred() both return 0 if the corresponding
keyring is already present.
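As a rough user-space check of that behaviour (a sketch assuming libkeyutils is installed; build with -lkeyutils, and note the helpers used are keyutils library calls, not part of this patch): after the fix the thread keyring ID should stay the same across repeated calls instead of a fresh keyring being instantiated (and leaked) each time.
	#include <keyutils.h>
	#include <stdio.h>

	int main(void)
	{
		key_serial_t first, again;

		keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
		first = keyctl_get_keyring_ID(KEY_SPEC_THREAD_KEYRING, 0);

		keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
		again = keyctl_get_keyring_ID(KEY_SPEC_THREAD_KEYRING, 0);

		printf("thread keyring id: %d then %d (%s)\n", first, again,
		       first == again ? "reused" : "replaced/leaked");
		return 0;
	}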
Fixes: d84f4f992c ("CRED: Inaugurate COW credentials")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c1644fe041 upstream.
This fixes CVE-2017-6951.
Userspace should not be able to do things with the "dead" key type as it
doesn't have some of the helper functions set upon it that the kernel
needs. Attempting to use it may cause the kernel to crash.
Fix this by changing the name of the type to ".dead" so that it's rejected
up front on userspace syscalls by key_get_type_from_user().
Though this doesn't seem to affect recent kernels, it does affect older
ones, certainly those prior to:
commit c06cfb08b8
Author: David Howells <dhowells@redhat.com>
Date: Tue Sep 16 17:36:06 2014 +0100
KEYS: Remove key_type::match in favour of overriding default by match_preparse
which went in before 3.18-rc1.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee8f844e3c upstream.
This fixes CVE-2016-9604.
Keyrings whose names begin with a '.' are special internal keyrings, so
userspace isn't allowed to create keyrings with such names, to prevent
shadowing. However, the patch that added the guard didn't fix
KEYCTL_JOIN_SESSION_KEYRING. Not only can that create dot-named keyrings,
it can also subscribe to them as a session keyring if they grant SEARCH
permission to the user.
This, for example, allows a root process to set .builtin_trusted_keys as
its session keyring, at which point it has full access because now the
possessor permissions are added. This permits root to add extra public
keys, thereby bypassing module verification.
This also affects kexec and IMA.
This can be tested by (as root):
keyctl session .builtin_trusted_keys
keyctl add user a a @s
keyctl list @s
which on my test box gives me:
2 keys in keyring:
180010936: ---lswrv 0 0 asymmetric: Build time autogenerated kernel key: ae3d4a31b82daa8e1a75b49dc2bba949fd992a05
801382539: --alswrv 0 0 user: a
Fix this by rejecting names beginning with a '.' in the keyctl.
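A minimal sketch of such a guard, assuming the existing strndup_user() and join_session_keyring() helpers; this is illustrative, not the exact upstream hunk:
	#include <linux/err.h>
	#include <linux/key.h>
	#include <linux/slab.h>
	#include <linux/string.h>

	long keyctl_join_session_keyring_example(const char __user *_name)
	{
		char *name = NULL;
		long ret;

		if (_name) {
			name = strndup_user(_name, KEY_MAX_DESC_SIZE);
			if (IS_ERR(name))
				return PTR_ERR(name);

			/* '.'-prefixed names are reserved for internal keyrings */
			if (name[0] == '.') {
				ret = -EPERM;
				goto out;
			}
		}

		ret = join_session_keyring(name);	/* existing internal helper */
	out:
		kfree(name);
		return ret;
	}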
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
cc: linux-ima-devel@lists.sourceforge.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.
Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.
Fixes https://github.com/raspberrypi/linux/issues/1709
commit 956a4cd2c9 upstream.
The following warning triggers with a new unit test that stresses the
device-dax interface.
===============================
[ ERR: suspicious RCU usage. ]
4.11.0-rc4+ #1049 Tainted: G O
-------------------------------
./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0
2 locks held by fio/9070:
#0: (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0
#1: (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]
Call Trace:
dump_stack+0x86/0xc3
lockdep_rcu_suspicious+0xd7/0x110
___might_sleep+0xac/0x250
__might_sleep+0x4a/0x80
__alloc_pages_nodemask+0x23a/0x360
alloc_pages_current+0xa1/0x1f0
pte_alloc_one+0x17/0x80
__pte_alloc+0x1e/0x120
__get_locked_pte+0x1bf/0x1d0
insert_pfn.isra.70+0x3a/0x100
? lookup_memtype+0xa6/0xd0
vm_insert_mixed+0x64/0x90
dax_dev_huge_fault+0x520/0x620 [dax]
? dax_dev_huge_fault+0x32/0x620 [dax]
dax_dev_fault+0x10/0x20 [dax]
__do_fault+0x1e/0x140
__handle_mm_fault+0x9af/0x10d0
handle_mm_fault+0x16d/0x370
? handle_mm_fault+0x47/0x370
__do_page_fault+0x28c/0x4f0
trace_do_page_fault+0x58/0x2a0
do_async_page_fault+0x1a/0xa0
async_page_fault+0x28/0x30
Inserting a page table entry may trigger an allocation while we are
holding a read lock to keep the device instance alive for the duration
of the fault. Use srcu for this keep-alive protection.
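A minimal sketch of the SRCU keep-alive pattern being described, with illustrative names rather than the actual device-dax code: SRCU read-side sections, unlike plain RCU, may sleep (e.g. to allocate page tables), and the teardown path waits for in-flight faults before freeing the instance.
	#include <linux/srcu.h>

	DEFINE_STATIC_SRCU(example_dax_srcu);

	static int example_fault(void)
	{
		int id, rc;

		id = srcu_read_lock(&example_dax_srcu);
		rc = 0;	/* ... handle the fault; sleeping allocations are fine here ... */
		srcu_read_unlock(&example_dax_srcu, id);
		return rc;
	}

	static void example_unregister(void)
	{
		synchronize_srcu(&example_dax_srcu);	/* wait for in-flight faults */
		/* ... now it is safe to free the device instance ... */
	}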
Fixes: dee4107924 ("/dev/dax, core: file operations and dax-mmap")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is unwise to use sources other than the oscillator and PLLD_PER for
the PCM peripheral (and perhaps others - TBD) because their rate can
change and they may even be switched off, so explicitly restrict the
choice using dummy entries in the list of potential parents (item index
is significant).
See: https://github.com/raspberrypi/linux/issues/1949
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The driver was making changes to the skb header without
ensuring it was writable (i.e. uncloned).
This patch also removes some boilerplate header size
checking/adjustment code, as that is also handled by the
skb_cow_head() function used to make the header writable.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The incoming skb header may be resized if header space is
insufficient, which might change the data address in the skb.
Ensure that a cached pointer to that data is correctly set by
moving the assignment to after any possible changes.
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The driver was failing to check that the SKB wasn't cloned
before adding checksum data.
Replace the existing handling that extends/copies the header buffer
with skb_cow_head().
Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to ensure there is enough headroom to push extra header,
but we also need to check if we are allowed to change headers.
skb_cow_head() is the proper helper to deal with this.
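For illustration, the pattern these fixes converge on looks roughly like the sketch below; the helper name and header length are placeholders, not any specific driver's code. skb_cow_head() reallocates the header if the skb is cloned or lacks headroom, after which pushing and writing a device-specific header is safe.
	#include <linux/errno.h>
	#include <linux/skbuff.h>
	#include <linux/string.h>

	static int tx_push_header_example(struct sk_buff *skb, unsigned int hdr_len)
	{
		if (skb_cow_head(skb, hdr_len))		/* uncloned + enough headroom */
			return -ENOMEM;

		memset(skb_push(skb, hdr_len), 0, hdr_len);	/* now safe to write */
		return 0;
	}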
Fixes: d0cad87170 ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit dfcb9f4f99 upstream.
commit 2dcab59848 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
attempted to avoid a BUG_ON call when the association being used for a
sendmsg() is blocked waiting for more sndbuf and another thread did a
peeloff operation on such asoc, moving it to another socket.
As Ben Hutchings noticed, then in such case it would return without
locking back the socket and would cause two unlocks in a row.
Further analysis also revealed that it could allow a double free if the
application managed to peeloff the asoc that is created during the
sendmsg call, because then sctp_sendmsg() would try to free the asoc
that was created only for that call.
This patch takes another approach. It will deny the peeloff operation
if there is a thread sleeping on the asoc, so this situation doesn't
exist anymore. This avoids the issues described above and also honors
the syscalls that are already being handled (it can be multiple sendmsg
calls).
Joint work with Xin Long.
Fixes: 2dcab59848 ("sctp: avoid BUG_ON on sctp_wait_for_sndbuf")
Cc: Alexander Popov <alex.popov@linux.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2ed1880fd upstream.
The protocol field is checked when deleting IPv4 routes, but ignored for
IPv6, which causes problems with routing daemons accidentally deleting
externally set routes (observed by multiple bird6 users).
This can be verified using `ip -6 route del <prefix> proto something`.
Signed-off-by: Mantas Mikulėnas <grawity@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4baad5029 upstream.
put_chars() stuffs the buffer it gets into an sg, but that buffer may be
on the stack. This breaks with CONFIG_VMAP_STACK=y (for me, it
manifested as printks getting turned into NUL bytes).
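A rough sketch of the underlying idea, bouncing a possibly-on-stack buffer through the heap before building the scatterlist; the function name and surroundings are illustrative, not the exact virtio_console change:
	#include <linux/scatterlist.h>
	#include <linux/slab.h>
	#include <linux/string.h>

	/* Scatterlist entries must reference linearly-mapped memory, so with
	 * CONFIG_VMAP_STACK a caller-supplied buffer is first copied to the heap. */
	static void *bounce_for_sg_example(struct scatterlist *sg,
					   const char *buf, int count)
	{
		void *data = kmemdup(buf, count, GFP_ATOMIC);

		if (!data)
			return NULL;

		sg_init_one(sg, data, count);	/* heap buffer is safe to map */
		return data;			/* caller frees once the vq is done */
	}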
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f190e3aec upstream.
Commit 17ce039b4e ("[media] cxusb: don't do DMA on stack")
added a kmalloc'ed bounce buffer for writes, but missed to do the same
for reads. As the read only happens after the write is finished, we can
reuse the same buffer.
As dvb_usb_generic_rw handles a read length of 0 by itself, avoid calling
it using the dvb_usb_generic_read wrapper function.
Signed-off-by: Stefan Brüns <stefan.bruens@rwth-aachen.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a4866aa812 upstream.
Under CONFIG_STRICT_DEVMEM, reading System RAM through /dev/mem is
disallowed. However, on x86, the first 1MB was always allowed for BIOS
and similar things, regardless of it actually being System RAM. It was
possible for heap to end up getting allocated in low 1MB RAM, and then
read by things like x86info or dd, which would trip hardened usercopy:
usercopy: kernel memory exposure attempt detected from ffff880000090000 (dma-kmalloc-256) (4096 bytes)
This changes the x86 exception for the low 1MB by reading back zeros for
System RAM areas instead of blindly allowing them. More work is needed to
extend this to mmap, but currently mmap doesn't go through usercopy, so
hardened usercopy won't Oops the kernel.
Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Tested-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fa4086987 upstream.
Accessing the registers of the RTC block on Tegra requires the module
clock to be enabled. This only works because the RTC module clock will
be enabled by default during early boot. However, because the clock is
unused, the CCF will disable it at late_init time. This causes the RTC
to become unusable afterwards. This can easily be reproduced by trying
to use the RTC:
$ hwclock --rtc /dev/rtc1
This will hang the system. I ran into this by following up on a report
by Martin Michlmayr that reboot wasn't working on Tegra210 systems. It
turns out that the rtc-tegra driver's ->shutdown() implementation will
hang the CPU, because of the disabled clock, before the system can be
rebooted.
What confused me for a while is that the same driver is used on prior
Tegra generations where the hang can not be observed. However, as Peter
De Schrijver pointed out, this is because on 32-bit Tegra chips the RTC
clock is enabled by the tegra20_timer.c clocksource driver, which uses
the RTC to provide a persistent clock. This code is never enabled on
64-bit Tegra because the persistent clock infrastructure does not exist
on 64-bit ARM.
The proper fix for this is to add proper clock handling to the RTC
driver in order to ensure that the clock is enabled when the driver
requires it. All device trees contain the clock already, therefore
no additional changes are required.
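A minimal sketch of the kind of clock handling being added, with illustrative names (the real driver naturally does more): take the module clock from the device tree and keep it enabled for as long as the driver touches the RTC registers.
	#include <linux/clk.h>
	#include <linux/err.h>
	#include <linux/platform_device.h>

	static int example_rtc_probe(struct platform_device *pdev)
	{
		struct clk *clk;
		int ret;

		clk = devm_clk_get(&pdev->dev, NULL);	/* module clock from DT */
		if (IS_ERR(clk))
			return PTR_ERR(clk);

		ret = clk_prepare_enable(clk);		/* keep it on while in use */
		if (ret)
			return ret;

		/* ... register the RTC device as before ... */
		return 0;
	}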
Reported-by: Martin Michlmayr <tbm@cyrius.com>
Acked-by: Peter De Schrijver <pdeschrijver@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc272163ea upstream.
This patch fixes the following warning message seen when booting the
kernel as Dom0 with Xen on Intel machines.
[0.003000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: 0 APIC: 1]
The code generating the warning in validate_apic_and_package_id() matches
cpu_data(cpu).apicid (initialized in init_intel()->
detect_extended_topology() using cpuid) against the apicid returned from
xen_apic_read(). Now, xen_apic_read() makes a hypercall to retrieve apicid
for the boot cpu but returns 0 otherwise. Hence the warning gets thrown
for all but the boot cpu.
The idea behind xen_apic_read() returning 0 for apicid is that the
guests (even Dom0) should not need to know what physical processor their
vcpus are running on. This is because we currently do not have topology
information in Xen and also because xen allows more vcpus than physical
processors. However, boot cpu's apicid is required for loading
xen-acpi-processor driver on AMD machines. Look at following patch for
details:
commit 558daa289a ("xen/apic: Return the APIC ID (and version) for CPU
0.")
So to get rid of the warning, this patch modifies
xen_cpu_present_to_apicid() to return cpu_data(cpu).apicid instead of
calling xen_apic_read().
The warning is not seen on AMD machines because init_amd() populates
cpu_data(cpu).apicid by calling hard_smp_processor_id()->xen_apic_read()
as opposed to using apicid from cpuid as is done on Intel machines.
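A minimal sketch of that change (illustrative, not the exact hunk): return the apicid already stored in cpu_data() instead of going through the hypercall-backed xen_apic_read(), which only knows the boot CPU's apicid.
	#include <linux/cpumask.h>
	#include <asm/apic.h>
	#include <asm/processor.h>

	static int xen_cpu_present_to_apicid_example(int cpu)
	{
		if (cpu_present(cpu))
			return cpu_data(cpu).apicid;	/* discovered via cpuid */

		return BAD_APICID;
	}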
Signed-off-by: Mohit Gambhir <mohit.gambhir@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98d610c373 upstream.
The accelerometer event relies on the ACERWMID_EVENT_GUID notify.
So, this patch changes the code to set up the accelerometer input device
only when ACERWMID_EVENT_GUID is detected. This avoids creating the accel
input device on every Acer machine.
In addition, the patch adds clearer parsing logic for the accelerometer HID
to the acer_wmi_get_handle_cb callback function. It positively matches
the "SENR" name with the "BST0001" device to avoid unsupported hardware.
Reported-by: Bjørn Mork <bjorn@mork.no>
Cc: Darren Hart <dvhart@infradead.org>
Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
[andy: slightly massage commit message]
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebf79091bf upstream.
Select DW_DMAC_CORE like the rest of glue drivers do, e.g.
drivers/dma/dw/Kconfig.
While here group selectors under SND_SOC_INTEL_HASWELL and
SND_SOC_INTEL_BAYTRAIL.
Make platforms that use a common SST firmware driver depend on
DMADEVICES.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Liam Girdwood <liam.r.girdwood@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e88f72cb9f upstream.
We have this:
ERROR: "__aeabi_ldivmod" [drivers/block/nbd.ko] undefined!
ERROR: "__divdi3" [drivers/block/nbd.ko] undefined!
nbd.c:(.text+0x247c72): undefined reference to `__divdi3'
due to a recent commit, that did 64-bit division. Use the proper
divider function so that 32-bit compiles don't break.
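For illustration, the portable way to spell such a division is div_u64() from <linux/math64.h>; the function below is a hypothetical example, not the nbd code itself:
	#include <linux/math64.h>
	#include <linux/types.h>

	static loff_t nbd_nr_blocks_example(loff_t bytesize, u32 blksize)
	{
		/* a plain 'bytesize / blksize' would emit a __divdi3 call on 32-bit */
		return div_u64(bytesize, blksize);
	}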
Fixes: ef77b51524 ("nbd: use loff_t for blocksize and nbd_set_size args")
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef77b51524 upstream.
If we have large devices (say like the 40t drive I was trying to test with) we
will end up overflowing the int arguments to nbd_set_size and not get the right
size for our device. Fix this by using loff_t everywhere so I don't have to
think about this again. Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13583c3d32 upstream.
Creating a lot of cgroups at the same time might stall all worker
threads with kmem cache creation works, because kmem cache creation is
done with the slab_mutex held. The problem was amplified by commits
801faf0db8 ("mm/slab: lockless decision to grow cache") in case of
SLAB and 81ae6d0395 ("mm/slub.c: replace kick_all_cpus_sync() with
synchronize_sched() in kmem_cache_shrink()") in case of SLUB, which
increased the maximal time the slab_mutex can be held.
To prevent that from happening, let's use a special ordered single
threaded workqueue for kmem cache creation. This shouldn't introduce
any functional changes regarding how kmem caches are created, as the
work function holds the global slab_mutex during its whole runtime
anyway, making it impossible to run more than one work at a time. By
using a single threaded workqueue, we just avoid creating a thread per
each work. Ordering is required to avoid a situation when a cgroup's
work is put off indefinitely because there are other cgroups to serve,
in other words to guarantee fairness.
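A minimal sketch of such a workqueue setup (the identifier is illustrative): an ordered workqueue executes at most one work at a time, in queueing order, so cache-creation works are serialised without tying up one kworker per pending cgroup.
	#include <linux/init.h>
	#include <linux/workqueue.h>

	static struct workqueue_struct *kmem_cache_create_wq;

	static int __init kmem_cache_wq_init_example(void)
	{
		kmem_cache_create_wq = alloc_ordered_workqueue("memcg_kmem_cache_create", 0);
		return kmem_cache_create_wq ? 0 : -ENOMEM;
	}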
Link: https://bugzilla.kernel.org/show_bug.cgi?id=172981
Link: http://lkml.kernel.org/r/20161004131417.GC1862@esperanza
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Reported-by: Doug Smythies <dsmythies@telus.net>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05ac5aa18a upstream.
We've fixed the race condition problem in calculating ext4 checksum
value in commit b47820edd1 ("ext4: avoid modifying checksum fields
directly during checksum verification"). However, with this change,
when calculating the checksum value of an inode whose i_extra_isize is
less than 4, we could no longer calculate the checksum value properly.
This problem was found and reported by Nix, Thank you.
Reported-by: Nix <nix@esperi.org.uk>
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Youngjin Gil <youngjin.gil@samsung.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 005145378c upstream.
I ran into a stack frame size warning because of the on-stack copy of
the USB device structure:
drivers/media/usb/dvb-usb-v2/dvb_usb_core.c: In function 'dvb_usbv2_disconnect':
drivers/media/usb/dvb-usb-v2/dvb_usb_core.c:1029:1: error: the frame size of 1104 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
Copying a device structure like this is wrong for a number of other reasons
too aside from the possible stack overflow. One of them is that the
dev_info() call will print the name of the device later, but AFAICT
we have only copied a pointer to the name earlier and the actual name
has been freed by the time it gets printed.
This removes the on-stack copy of the device and instead copies the
device name using kstrdup(). I'm ignoring the possible failure here
as both printk() and kfree() are able to deal with NULL pointers.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f795cef0e upstream.
This fixes a bug in which the upper 32 bits of a 64-bit value
read by get_user() were lost on a 32-bit kernel.
While touching this code, split out pre-loading of %sr2 space register
and clean up code indent.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef0579b64e upstream.
The ahash API modifies the request's callback function in order
to clean up after itself in some corner cases (unaligned final
and missing finup).
When the request is complete ahash will restore the original
callback and everything is fine. However, when the request gets
an EBUSY on a full queue, an EINPROGRESS callback is made while
the request is still ongoing.
In this case the ahash API will incorrectly call its own callback.
This patch fixes the problem by creating a temporary request
object on the stack which is used to relay EINPROGRESS back to
the original completion function.
This patch also adds code to preserve the original flags value.
Fixes: ab6bf4e5e5 ("crypto: hash - Fix the pointer voodoo in...")
Reported-by: Sabrina Dubroca <sd@queasysnail.net>
Tested-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6534aebb2 upstream.
The algif_aead completion function tries to deduce the aead_request
from the crypto_async_request argument. This is broken because
the API does not guarantee that the same request will be passed to
the completion function. Only the value of req->data can be used
in the completion function.
This patch fixes it by storing a pointer to sk in areq and using
that instead of passing in sk through req->data.
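A minimal sketch of a completion callback that follows the rule stated above (that only req->data may be relied upon); the context type and names are illustrative:
	#include <linux/completion.h>
	#include <linux/crypto.h>

	struct example_wait {
		struct completion done;
		int err;
	};

	static void example_crypto_done(struct crypto_async_request *req, int err)
	{
		struct example_wait *wait = req->data;	/* the only stable field */

		if (err == -EINPROGRESS)
			return;		/* queue transition after -EBUSY, keep waiting */

		wait->err = err;
		complete(&wait->done);
	}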
Fixes: 83094e5e9e ("crypto: af_alg - add async support to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d879d0b8c1 upstream.
When function tracer has a pid filter, it adds a probe to sched_switch
to track if current task can be ignored. The probe checks the
ftrace_ignore_pid from current tr to filter tasks. But it misses to
delete the probe when removing an instance so that it can cause a crash
due to the invalid tr pointer (use-after-free).
This is easily reproducible with the following:
# cd /sys/kernel/debug/tracing
# mkdir instances/buggy
# echo $$ > instances/buggy/set_ftrace_pid
# rmdir instances/buggy
============================================================================
BUG: KASAN: use-after-free in ftrace_filter_pid_sched_switch_probe+0x3d/0x90
Read of size 8 by task kworker/0:1/17
CPU: 0 PID: 17 Comm: kworker/0:1 Tainted: G B 4.11.0-rc3 #198
Call Trace:
dump_stack+0x68/0x9f
kasan_object_err+0x21/0x70
kasan_report.part.1+0x22b/0x500
? ftrace_filter_pid_sched_switch_probe+0x3d/0x90
kasan_report+0x25/0x30
__asan_load8+0x5e/0x70
ftrace_filter_pid_sched_switch_probe+0x3d/0x90
? fpid_start+0x130/0x130
__schedule+0x571/0xce0
...
To fix it, use ftrace_clear_pids() to unregister the probe. As
instance_rmdir() already updated the ftrace code, it can just free the
filter safely.
Link: http://lkml.kernel.org/r/20170417024430.21194-2-namhyung@kernel.org
Fixes: 0c8916c342 ("tracing: Add rmdir to remove multibuffer instances")
Cc: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d72e9a7a93 upstream.
copy_page() is an optimized memcpy for page-aligned addresses. If it is
used with a non-page-aligned address, it can corrupt memory, which means
system corruption. With zram, it can happen with
1. 64K architecture
2. partial IO
3. slub debug
Partial IO needs to allocate a page and zram allocates it via kmalloc.
With slub debug, kmalloc(PAGE_SIZE) doesn't return page-size aligned
address. And finally, copy_page(mem, cmem) corrupts memory.
So, this patch changes it to memcpy.
Actually, we don't need to change the zram_bvec_write part because zsmalloc
returns page-aligned address in case of PAGE_SIZE class but it's not
good to rely on the internal of zsmalloc.
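The gist of the change, as a sketch (the variable names follow the text above; this is not the full hunk):

	/* cmem comes from kmalloc(PAGE_SIZE) in the partial-IO path and is
	 * not guaranteed to be page aligned under slub debug, so a plain
	 * memcpy must be used instead of copy_page(): */
	memcpy(mem, cmem, PAGE_SIZE);	/* was: copy_page(mem, cmem); */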
Note:
When this patch is merged to stable, clear_page should be fixed, too.
Unfortunately, recent zram removes it by "same page merge" feature so
it's hard to backport this patch to -stable tree.
I will handle it when I receive the mail from the stable tree maintainer
asking to backport this patch.
Fixes: 42e99bd ("zram: optimize memory operations with clear_page()/copy_page()")
Link: http://lkml.kernel.org/r/1492042622-12074-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06ce521af9 upstream.
handle_vmon gets a reference on VMXON region page,
but does not release it. Release the reference.
Found by syzkaller; based on a patch by Dmitry.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[bwh: Backported to 3.16: use skip_emulated_instruction()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2cfa58b13 upstream.
Without a bool string present, using "# CONFIG_DEVPORT is not set" in
defconfig files would not actually unset devport. This ensured that
/dev/port was always on, but there are reasons a user may wish to
disable it (smaller kernel, attack surface reduction) if it's not being
used. Adding a prompt here in order to make this option user-visible.
Signed-off-by: Max Bires <jbires@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4a3fa261b upstream.
There is a report that after commit 27622b061e ("cpufreq: Convert
to hotplug state machine"), the normal CPU offline/online cycle
fails on some platforms.
According to the ftrace result, this problem was triggered on
platforms using acpi-cpufreq as the default cpufreq driver,
and due to the lack of some ACPI freq method (eg. _PCT),
cpufreq_online() failed and returned a negative value, so the CPU
hotplug state machine rolled back the CPU online process. Actually,
from the user's perspective, the failure of cpufreq_online() should
not prevent that CPU from being brought up, although cpufreq might
not work on that CPU.
BTW, during system startup cpufreq_online() is not invoked via CPU
online but by the cpufreq device creation process, so the APs can be
brought up even though cpufreq_online() fails in that stage.
This patch ignores the return value of cpufreq_online/offline() and
lets the cpufreq framework deal with the failure. cpufreq_online()
itself will do a proper rollback in that case and if _PCT is missing,
the ACPI cpufreq driver will print a warning if the corresponding
debug options have been enabled.
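A minimal sketch of the idea, assuming thin wrapper callbacks around cpufreq_online()/cpufreq_offline(); the wrapper names are illustrative:

static int cpuhp_cpufreq_online(unsigned int cpu)
{
	cpufreq_online(cpu);	/* failure handled inside cpufreq itself */
	return 0;		/* never roll back the CPU online */
}

static int cpuhp_cpufreq_offline(unsigned int cpu)
{
	cpufreq_offline(cpu);
	return 0;		/* never block the CPU offline */
}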
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=194581
Fixes: 27622b061e ("cpufreq: Convert to hotplug state machine")
Reported-and-tested-by: Tomasz Maciej Nowak <tmn505@gmail.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a900152b5c upstream.
If the PWM was not enabled by the U-Boot loader, the PWM cannot work
because its clock is always disabled by the PWM driver: the PWM clock is
enabled at the beginning of pwm_apply(), but disabled again at the end of
pwm_apply().
If the PWM was enabled by the U-Boot loader, the PWM clock is always
enabled unless it is shut off by the ATF. The pwm-backlight might turn off
the power at early suspend, so the PWM clock should be disabled then to
save power.
It is therefore important to give the PWM driver the opportunity to enable
and disable the clock: the PWM consumer should ensure the correct order of
PWM enable and disable calls, and the PWM driver keeps the state of the
PWM clock synchronized with the PWM enabled state.
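A minimal sketch of that clock handling; the struct and helper names are made up, while clk_enable()/clk_disable() and pwm_is_enabled() are the real kernel APIs:

#include <linux/clk.h>
#include <linux/kernel.h>
#include <linux/pwm.h>

struct my_pwm {
	struct pwm_chip chip;
	struct clk *clk;
};

static void my_pwm_program(struct my_pwm *pc, const struct pwm_state *state);

static int my_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
			struct pwm_state *state)
{
	struct my_pwm *pc = container_of(chip, struct my_pwm, chip);
	bool was_enabled = pwm_is_enabled(pwm);
	int ret;

	ret = clk_enable(pc->clk);	/* clock on for register access */
	if (ret)
		return ret;

	my_pwm_program(pc, state);	/* write period, duty, enable bit */

	/* hold one extra clock reference only while the output is enabled */
	if (state->enabled && !was_enabled)
		ret = clk_enable(pc->clk);
	else if (!state->enabled && was_enabled)
		clk_disable(pc->clk);

	clk_disable(pc->clk);		/* balance the register-access enable */
	return ret;
}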
Fixes: 2bf1c98aa5 ("pwm: rockchip: Add support for atomic update")
Signed-off-by: David Wu <david.wu@rock-chips.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0beb2012a1 upstream.
Holding the reconfig_mutex over a potential userspace fault sets up a
lockdep dependency chain between filesystem-DAX and the libnvdimm ioctl
path. Move the user access outside of the lock.
[ INFO: possible circular locking dependency detected ]
4.11.0-rc3+ #13 Tainted: G W O
-------------------------------------------------------
fallocate/16656 is trying to acquire lock:
(&nvdimm_bus->reconfig_mutex){+.+.+.}, at: [<ffffffffa00080b1>] nvdimm_bus_lock+0x21/0x30 [libnvdimm]
but task is already holding lock:
(jbd2_handle){++++..}, at: [<ffffffff813b4944>] start_this_handle+0x104/0x460
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 (jbd2_handle){++++..}:
lock_acquire+0xbd/0x200
start_this_handle+0x16a/0x460
jbd2__journal_start+0xe9/0x2d0
__ext4_journal_start_sb+0x89/0x1c0
ext4_dirty_inode+0x32/0x70
__mark_inode_dirty+0x235/0x670
generic_update_time+0x87/0xd0
touch_atime+0xa9/0xd0
ext4_file_mmap+0x90/0xb0
mmap_region+0x370/0x5b0
do_mmap+0x415/0x4f0
vm_mmap_pgoff+0xd7/0x120
SyS_mmap_pgoff+0x1c5/0x290
SyS_mmap+0x22/0x30
entry_SYSCALL_64_fastpath+0x1f/0xc2
-> #1 (&mm->mmap_sem){++++++}:
lock_acquire+0xbd/0x200
__might_fault+0x70/0xa0
__nd_ioctl+0x683/0x720 [libnvdimm]
nvdimm_ioctl+0x8b/0xe0 [libnvdimm]
do_vfs_ioctl+0xa8/0x740
SyS_ioctl+0x79/0x90
do_syscall_64+0x6c/0x200
return_from_SYSCALL_64+0x0/0x7a
-> #0 (&nvdimm_bus->reconfig_mutex){+.+.+.}:
__lock_acquire+0x16b6/0x1730
lock_acquire+0xbd/0x200
__mutex_lock+0x88/0x9b0
mutex_lock_nested+0x1b/0x20
nvdimm_bus_lock+0x21/0x30 [libnvdimm]
nvdimm_forget_poison+0x25/0x50 [libnvdimm]
nvdimm_clear_poison+0x106/0x140 [libnvdimm]
pmem_do_bvec+0x1c2/0x2b0 [nd_pmem]
pmem_make_request+0xf9/0x270 [nd_pmem]
generic_make_request+0x118/0x3b0
submit_bio+0x75/0x150
Fixes: 62232e45f4 ("libnvdimm: control (ioctl) messages for nvdimm_bus and nvdimm devices")
Cc: Dave Jiang <dave.jiang@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe514739d8 upstream.
Commit a1f3e4d6a0 "libnvdimm, region: update nd_region_available_dpa()
for multi-pmem support" reworked blk dpa (DIMM Physical Address)
accounting to comprehend multiple pmem namespace allocations aliasing
with a given blk-dpa range.
The following call trace is a result of failing to account for allocated
blk capacity.
WARNING: CPU: 1 PID: 2433 at tools/testing/nvdimm/../../../drivers/nvdimm/names
4 size_store+0x6f3/0x930 [libnvdimm]
nd_region region5: allocation underrun: 0x0 of 0x1000000 bytes
[..]
Call Trace:
dump_stack+0x86/0xc3
__warn+0xcb/0xf0
warn_slowpath_fmt+0x5f/0x80
size_store+0x6f3/0x930 [libnvdimm]
dev_attr_store+0x18/0x30
If a given blk-dpa allocation does not alias with any pmem ranges then
the full allocation should be accounted as busy space, not the size of
the current pmem contribution to the region.
The thinko that led to this confusion was not realizing that the struct
resource management already guarantees no collisions between pmem
allocations and blk allocations on the same dimm. Also, we do not try to
support blk allocations in aliased pmem holes.
This patch also fixes a case where the available blk goes negative.
Fixes: a1f3e4d6a0 ("libnvdimm, region: update nd_region_available_dpa() for multi-pmem support").
Reported-by: Dariusz Dokupil <dariusz.dokupil@intel.com>
Reported-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3278682123 upstream.
Fixes the mess observed in e.g. rsync over a noisy link we'd been
seeing since last Summer. What happens is that we copy part of
a datagram before noticing a checksum mismatch. Datagram will be
resent, all right, but we want the next try to go into the same place,
not after it...
All this family of primitives (copy/checksum and copy a datagram
into destination) is "all or nothing" sort of interface - either
we get 0 (meaning that copy had been successful) or we get an
error (and no way to tell how much had been copied before we ran
into whatever error it had been). Make all of them leave iterator
unadvanced in case of errors - all callers must be able to cope
with that (an error might've been caught before the iterator had
been advanced), it costs very little to arrange, it's safer for
callers and actually fixes at least one bug in said callers.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 27c0e3748e upstream.
This adds the opposite of iov_iter_advance(); the caller is responsible for never
using it to move back past the initial position.
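A hedged usage sketch; the helper below is hypothetical, while copy_to_iter() and iov_iter_revert() are the real primitives:

#include <linux/uio.h>

static ssize_t copy_record(struct iov_iter *to, const void *buf, size_t len)
{
	size_t done = copy_to_iter(buf, len, to);

	if (done != len) {
		/* let the caller retry from the same place: wind the
		 * iterator back, but never past its initial position */
		iov_iter_revert(to, done);
		return -EFAULT;
	}
	return done;
}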
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9121b15b56 upstream.
Connecting to the backend isn't working reliably in xen-fbfront: in
case the backend's XenbusStateInitWait has been missed, the backend
transition to XenbusStateConnected will only trigger the connected state
without doing the actions required when the backend has
connected.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49cb77e297 upstream.
This patch closes a race between se_lun deletion during configfs
unlink in target_fabric_port_unlink() -> core_dev_del_lun()
-> core_tpg_remove_lun(), when transport_clear_lun_ref() blocks
waiting for percpu_ref RCU grace period to finish, but a new
NodeACL mappedlun is added before the RCU grace period has
completed.
This can happen in target_fabric_mappedlun_link() because it
only checks for se_lun->lun_se_dev, which is not cleared until
after transport_clear_lun_ref() percpu_ref RCU grace period
finishes.
This bug originally manifested as NULL pointer dereference
OOPsen in target_stat_scsi_att_intr_port_show_attr_dev() on
v4.1.y code, because it dereferences lun->lun_se_dev without
an explicit NULL pointer check.
In post v4.1 code with target-core RCU conversion, the code
in target_stat_scsi_att_intr_port_show_attr_dev() no longer
uses se_lun->lun_se_dev, but the same race still exists.
To address the bug, go ahead and set se_lun->lun_shutdown as
early as possible in core_tpg_remove_lun(), and ensure new
NodeACL mappedlun creation in target_fabric_mappedlun_link()
fails during se_lun shutdown.
Reported-by: James Shen <jcs@datera.io>
Cc: James Shen <jcs@datera.io>
Tested-by: James Shen <jcs@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c856152cb upstream.
We previously made sure that the reported disk capacity was less than
0xffffffff blocks when the kernel was not compiled with large sector_t
support (CONFIG_LBDAF). However, this check assumed that the capacity
was reported in units of 512 bytes.
Add a sanity check function to ensure that we only enable disks if the
entire reported capacity can be expressed in terms of sector_t.
Reported-by: Steve Magnani <steve.magnani@digidescorp.com>
Cc: Bart Van Assche <Bart.VanAssche@sandisk.com>
Reviewed-by: Bart Van Assche <Bart.VanAssche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf6061b17a upstream.
Add a fix to read the correct register value for ISP82xx during the check
for register disconnect. ISP82xx has a different base register.
Fixes: a465537ad1 ("qla2xxx: Disable the adapter and skip error recovery in case of register disconnect")
Signed-off-by: Sawan Chandak <sawan.chandak@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6780414519 upstream.
If the device reports a small max_xfer_blocks and a zero opt_xfer_blocks,
we end up using BLK_DEF_MAX_SECTORS, which is wrong, and reads/writes of
that size may fail.
[mkp: tweaked to avoid setting rw_max twice and added typecast]
Fixes: ca369d51b3 ("block/sd: Fix device-imposed transfer length limits")
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a00a786251 upstream.
Kefeng Wang discovered that old versions of the QEMU CD driver would
return mangled mode data causing us to walk off the end of the buffer in
an attempt to parse it. Sanity check the returned mode sense data.
Reported-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Tested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c99de981f upstream.
Once upon a time back in 2009, a work-around was added to support
the GlobalSAN iSCSI initiator v3.3 for MacOSX, which during login
did not propose nor respond to MaxBurstLength, FirstBurstLength,
DefaultTime2Wait and DefaultTime2Retain keys.
The work-around in iscsi_check_proposer_for_optional_reply()
allowed the missing keys to be proposed, but did not require
waiting for a response before moving to full feature phase
operation. This allowed GlobalSAN v3.3 to work out-of-the-box,
and for many years we didn't run into login interop
issues with any other initiators.
Until recently, when Martin tried a QLogic 57840S iSCSI Offload
HBA on Windows 2016 which completed login, but subsequently
failed with:
Got unknown iSCSI OpCode: 0x43
The issue was QLogic MSFT side did not propose DefaultTime2Wait +
DefaultTime2Retain, so LIO proposes them itself, and immediately
transitions to full feature phase because of the GlobalSAN hack.
However, the QLogic MSFT side still attempts to respond to
DefaultTime2Retain + DefaultTime2Wait, even though LIO has set
ISCSI_FLAG_LOGIN_NEXT_STAGE3 + ISCSI_FLAG_LOGIN_TRANSIT
in last login response.
So while the QLogic MSFT side should have been proposing these
two keys to start, it was doing the correct thing per RFC-3720
attempting to respond to proposed keys before transitioning to
full feature phase.
All that said, recent versions of GlobalSAN iSCSI (v5.3.0.541)
does correctly propose the four keys during login, making the
original work-around moot.
So in order to allow QLogic MSFT to run unmodified as-is, go
ahead and drop this long standing work-around.
Reported-by: Martin Svec <martin.svec@zoner.cz>
Cc: Martin Svec <martin.svec@zoner.cz>
Cc: Himanshu Madhani <Himanshu.Madhani@cavium.com>
Cc: Arun Easi <arun.easi@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit efb2ea770b upstream.
This patch fixes a iscsi-target specific TMR reference leak
during session shutdown, that could occur when a TMR was
quiesced before the hand-off back to iscsi-target code
via transport_cmd_check_stop_to_fabric().
The reference leak happens because iscsit_free_cmd() was
incorrectly skipping the final target_put_sess_cmd() for
TMRs when transport_generic_free_cmd() returned zero because
the se_cmd->cmd_kref did not reach zero, due to the missing
se_cmd assignment in original code.
The result was that the iscsi_cmd and its associated se_cmd memory
would be freed once se_sess->sess_cmd_map was released,
but the associated se_tmr_req was leaked and remained part
of se_device->dev_tmr_list.
This bug would manifest itself as kernel paging request
OOPsen in core_tmr_lun_reset(), when a left-over se_tmr_req
attempted to dereference its se_cmd pointer that had
already been released during normal session shutdown.
To address this bug, go ahead and treat ISCSI_OP_SCSI_CMD
and ISCSI_OP_SCSI_TMFUNC the same when there is an extra
se_cmd->cmd_kref to drop in iscsit_free_cmd(), and use
op_scsi to signal __iscsit_free_cmd() when the former
needs to clear any further iscsi related I/O state.
Reported-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Reported-by: Chu Yuan Lin <cyl@datera.io>
Cc: Chu Yuan Lin <cyl@datera.io>
Tested-by: Chu Yuan Lin <cyl@datera.io>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55d728a40d upstream.
On UEFI systems, the PCI subsystem is enumerated by the firmware,
and if a graphical framebuffer is exposed via a PCI device, its base
address and size are exposed to the OS via the Graphics Output
Protocol (GOP).
On arm64 PCI systems, the entire PCI hierarchy is reconfigured from
scratch at boot. This may result in the GOP framebuffer address
becoming stale if the BAR covering the framebuffer is modified. This
will cause the framebuffer to become unresponsive, and may in some
cases result in unpredictable behavior if the range is reassigned to
another device.
So add a non-x86 quirk to the EFI fb driver to find the BAR associated
with the GOP base address, and claim the BAR resource so that the PCI
core will not move it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: leif.lindholm@linaro.org
Cc: linux-efi@vger.kernel.org
Cc: lorenzo.pieralisi@arm.com
Fixes: 9822504c1f ("efifb: Enable the efi-framebuffer platform driver ...")
Link: http://lkml.kernel.org/r/20170404152744.26687-3-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 540f4c0e89 upstream.
The UEFI Specification permits Graphics Output Protocol (GOP) instances
without direct framebuffer access. This is indicated in the Mode structure
with a PixelFormat enumeration value of PIXEL_BLT_ONLY. Given that the
kernel does not know how to drive a Blt() only framebuffer (which is only
permitted before ExitBootServices() anyway), we should disregard such
framebuffers when looking for a GOP instance that is suitable for use as
the boot console.
So modify the EFI GOP initialization to not use a PIXEL_BLT_ONLY instance,
preventing attempts later in boot to use an invalid screen_info.lfb_base
address.
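The shape of the check added to the GOP scan loop, as a fragment; the field and enum names follow the kernel's EFI GOP definitions, and the surrounding loop and pointer fix-ups are elided and simplified:

		info = gop->mode->info;
		if (info->pixel_format == PIXEL_BLT_ONLY)
			continue;	/* Blt()-only: no framebuffer address */
		/* otherwise this GOP may back screen_info.lfb_base */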
Signed-off-by: Eugene Cohen <eugene@hp.com>
[ Moved the Blt() only check into the loop and clarified that Blt() only GOPs are unusable by the kernel. ]
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: leif.lindholm@linaro.org
Cc: linux-efi@vger.kernel.org
Cc: lorenzo.pieralisi@arm.com
Fixes: 9822504c1f ("efifb: Enable the efi-framebuffer platform driver ...")
Link: http://lkml.kernel.org/r/20170404152744.26687-2-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 409c1b250e upstream.
The patch 554bfeceb8 ("parisc: Fix access
fault handling in pa_memcpy()") reimplements the pa_memcpy function.
Unfortunately, it makes the kernel unbootable. The crash happens in the
function ide_complete_cmd where memcpy is called with the same source
and destination address.
This patch fixes a few bugs in pa_memcpy:
* When jumping to .Lcopy_loop_16 for the first time, don't skip the
instruction "ldi 31,t0" (this bug made the kernel unbootable)
* Use the COND macro when comparing length, so that the comparison is
64-bit (a theoretical issue, in case the length is greater than
0xffffffff)
* Don't use the COND macro after the "extru" instruction (the PA-RISC
specification says that the upper 32-bits of extru result are undefined,
although they are set to zero in practice)
* Fix exception addresses in .Lcopy16_fault and .Lcopy8_fault
* Rename .Lcopy_loop_4 to .Lcopy_loop_8 (so that it is consistent with
.Lcopy8_fault)
Fixes: 554bfeceb8 ("parisc: Fix access fault handling in pa_memcpy()")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b03b99a329 upstream.
While reviewing the -stable patch for commit 86ef58a4e3 "nfit,
libnvdimm: fix interleave set cookie calculation" Ben noted:
"This is returning an int, thus it's effectively doing a 32-bit
comparison and not the 64-bit comparison you say is needed."
Update the compare operation to be immune to this integer demotion problem.
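Not the nfit code itself, but the class of bug in miniature, assuming the kernel u64/bool types:

#include <linux/types.h>

static int cmp_broken(u64 a, u64 b)
{
	return a - b;		/* truncated to 32 bits: values that differ
				 * only in the upper 32 bits compare "equal" */
}

static bool cmp_fixed(u64 a, u64 b)
{
	return a == b;		/* full 64-bit comparison */
}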
Cc: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Fixes: 86ef58a4e3 ("nfit, libnvdimm: fix interleave set cookie calculation")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6fdc6dd902 upstream.
The vsyscall32 sysctl can race against a concurrent fork when it switches
from disabled to enabled:
arch_setup_additional_pages()
if (vdso32_enabled)
--> No mapping
sysctl.vsysscall32()
--> vdso32_enabled = true
create_elf_tables()
ARCH_DLINFO_IA32
if (vdso32_enabled) {
--> Add VDSO entry with NULL pointer
Make ARCH_DLINFO_IA32 check whether the VDSO mapping has been set up for
the newly forked process or not.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mathias Krause <minipli@googlemail.com>
Link: http://lkml.kernel.org/r/20170410151723.602367196@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11e63f6d92 upstream.
Before we rework the "pmem api" to stop abusing __copy_user_nocache()
for memcpy_to_pmem() we need to fix cases where we may strand dirty data
in the cpu cache. The problem occurs when copy_from_iter_pmem() is used
for arbitrary data transfers from userspace. There is no guarantee that
these transfers, performed by dax_iomap_actor(), will have aligned
destinations or aligned transfer lengths. Backstop the usage of
__copy_user_nocache() with explicit cache management in these unaligned
cases.
Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
that is saved for a later patch that moves the entirety of the "pmem
api" into the pmem driver directly.
Fixes: 5de490daec ("pmem: add copy_from_iter_pmem() and clear_pmem()")
Cc: <x86@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f6266a561 upstream.
Reserving a runtime region results in splitting the EFI memory
descriptors for the runtime region. This results in runtime region
descriptors with bogus memory mappings, leading to interesting crashes
like the following during a kexec:
general protection fault: 0000 [#1] SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1 #53
Hardware name: Wiwynn Leopard-Orv2/Leopard-DDR BW, BIOS LBM05 09/30/2016
RIP: 0010:virt_efi_set_variable()
...
Call Trace:
efi_delete_dummy_variable()
efi_enter_virtual_mode()
start_kernel()
? set_init_arg()
x86_64_start_reservations()
x86_64_start_kernel()
start_cpu()
...
Kernel panic - not syncing: Fatal exception
Runtime regions will not be freed and do not need to be reserved, so
skip the memmap modification in this case.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 8e80632fb2 ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
Link: http://lkml.kernel.org/r/20170412152719.9779-2-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1fa839b498 upstream.
This fixes Continuous Availability when errors during
file reopen are encountered.
cifs_user_readv and cifs_user_writev would wait forever if the results
of cifs_reopen_file are not stored for later inspection.
In fact, the results are checked and, in case of errors, a chain
of function calls that leads to reads and writes being scheduled in
a separate thread is skipped.
These threads will wake up the corresponding waiters once reads
and writes are done.
However, given the return value is not stored, when rc is checked
for errors a previous one (always zero) is inspected instead.
This leads to pending reads/writes added to the list, making
cifs_user_readv and cifs_user_writev wait forever.
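A sketch of the pattern, not the exact hunk; the surrounding retry loop is elided:

		rc = cifs_reopen_file(open_file, true);
		if (rc == -EAGAIN)
			continue;	/* transient failure, retry reopen */
		else if (rc)
			break;		/* fail instead of waiting forever */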
Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f94773b9f5 upstream.
The NV4A (aka NV44A) is an oddity in the family. It only comes in AGP
and PCI varieties, rather than a core PCIE chip with a bridge for
AGP/PCI as necessary. As a result, it appears that the MMU is also
non-functional. For AGP cards, the vast majority of the NV4A lineup,
this worked out since we force AGP cards to use the nv04 mmu. However
for PCI variants, this did not work.
Switching to the NV04 MMU makes it work like a charm. Thanks to mwk for
the suggestion. This should be a no-op for NV4A AGP boards, as they were
using it already.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70388
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ec1688c53 upstream.
Otherwise lockdep says:
[ 1337.483798] ================================================
[ 1337.483999] [ BUG: lock held when returning to user space! ]
[ 1337.484252] 4.11.0-rc6 #19 Not tainted
[ 1337.484423] ------------------------------------------------
[ 1337.484626] mount/14766 is leaving the kernel with locks still held!
[ 1337.484841] 1 lock held by mount/14766:
[ 1337.485017] #0: (&type->s_umount_key#33/1){+.+.+.}, at: [<ffffffff8124171f>] sget_userns+0x2af/0x520
Caught by xfstests generic/413 which tried to mount with the unsupported
mount option dax. Then xfstests generic/422 ran sync which deadlocks.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5d68ba858 upstream.
For the bidirectional case, the Data-Out buffer blocks will always be at
the head of the tcmu_cmd's bitmap, and before gathering the Data-In
buffer we first have to skip the Data-Out ones, or devices supporting
BIDI commands won't work.
Fixed: 26418649ee ("target/user: Introduce data_bitmap, replace
data_length/data_head/data_tail")
Reported-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abe342a5b4 upstream.
The t_data_nents and t_bidi_data_nents are the numbers of the
segments, but there is no guarantee that the block size equals the
size of a segment.
In the worst case, all the blocks are discontiguous and the same
number of iovecs is needed, that is: blocks == iovs.
So just set the number of iovs to the block count needed by the tcmu
cmd.
Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ab22d2604c upstream.
If there is BIDI data, its first iov[] will overwrite the last
iov[] for se_cmd->t_data_sg.
To fix this, we can just advance the iov pointer, but this may
introduce a new memory leak: if se_cmd->data_length
and se_cmd->t_bidi_data_sg->length are not both aligned up to
DATA_BLOCK_SIZE, the actual length needed may be larger than just
the sum of them.
So this is avoided by rounding all the data lengths up
to DATA_BLOCK_SIZE.
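A minimal sketch of the sizing rule; the helper is made up, while round_up() and the se_cmd fields are the kernel ones and DATA_BLOCK_SIZE is the tcmu-internal constant:

#include <linux/kernel.h>
#include <target/target_core_base.h>

static size_t tcmu_data_area_size(struct se_cmd *se_cmd)
{
	size_t len = round_up(se_cmd->data_length, DATA_BLOCK_SIZE);

	if (se_cmd->t_bidi_data_sg)
		len += round_up(se_cmd->t_bidi_data_sg->length,
				DATA_BLOCK_SIZE);
	return len;
}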
Reviewed-by: Mike Christie <mchristi@redhat.com>
Tested-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77f88796ce upstream.
Creation of a kthread goes through a couple of interlocked stages between
the kthread itself and its creator. Once the new kthread starts
running, it initializes itself and wakes up the creator. The creator
then can further configure the kthread and then let it start doing its
job by waking it up.
In this configuration-by-creator stage, the creator is the only one
that can wake it up but the kthread is visible to userland. When
altering the kthread's attributes from userland is allowed, this is
fine; however, for cases where CPU affinity is critical,
kthread_bind() is used to first disable affinity changes from userland
and then set the affinity. This also prevents the kthread from being
migrated into non-root cgroups as that can affect the CPU affinity and
many other things.
Unfortunately, the cgroup side of protection is racy. While the
PF_NO_SETAFFINITY flag prevents further migrations, userland can win
the race before the creator sets the flag with kthread_bind() and put
the kthread in a non-root cgroup, which can lead to all sorts of
problems including incorrect CPU affinity and starvation.
This bug got triggered by userland which periodically tries to migrate
all processes in the root cpuset cgroup to a non-root one. Per-cpu
workqueue workers got caught while being created and ended up with
incorrect CPU affinity, breaking concurrency management and sometimes
stalling workqueue execution.
This patch adds task->no_cgroup_migration which disallows the task to
be migrated by userland. kthreadd starts with the flag set making
every child kthread start in the root cgroup with migration
disallowed. The flag is cleared after the kthread finishes
initialization by which time PF_NO_SETAFFINITY is set if the kthread
should stay in the root cgroup.
It'd be better to wait for the initialization instead of failing but I
couldn't think of a way of implementing that without adding either a
new PF flag, or sleeping and retrying from waiting side. Even if
userland depends on changing cgroup membership of a kthread, it either
has to be synchronized with kthread_create() or periodically repeat,
so it's unlikely that this would break anything.
v2: Switch to a simpler implementation using a new task_struct bit
field suggested by Oleg.
Signed-off-by: Tejun Heo <tj@kernel.org>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-and-debugged-by: Chris Mason <clm@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the i2c driver hadn't probed before the panel driver probes, then
the client would be NULL but we were looking for an ERR_PTR in the
error case.
Signed-off-by: Eric Anholt <eric@anholt.net>
The HDMI encoder IP embeds all needed blocks to output audio, with a
custom DAI called MAI moving audio between the two parts of the HDMI
core. This driver now exposes a sound card to let users stream audio
to their display.
Using the hdmi-codec driver has been considered here, but MAI meant
having to significantly rework hdmi-codec, and it would have left
little shared code with the I2S mode anyway.
The encoder requires that the audio be SPDIF-formatted frames only,
which alsalib will format-convert for us.
This patch is the combined work of Eric Anholt (initial register setup
with a separate dmaengine driver and using simple-audio-card) and
Boris Brezillon (moving it all into HDMI, massive debug to get it
actually working), and which Eric has the permission to release.
v2: Drop "-audio" from sound card name, since that's already implied
(suggestion by Boris)
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170227202803.12855-2-eric@anholt.net
Until now, we've had to limit Raspberry Pi to 256MB of CMA memory to
keep from triggering the hardware addressing bug between the tile
binner and the tile alloc memory (where the top 4 bits come from the
tile state data array's address).
To work around that and allow more memory to be reserved for graphics,
allocate a single BO to store tile state data arrays and tile
alloc/overflow memory while the GPU is active, and make sure that that
one BO doesn't happen to cross a 256MB boundary. With that in place,
we can allocate textures and shaders anywhere in system memory (still
contiguous, of course).
Signed-off-by: Eric Anholt <eric@anholt.net>
commit 7c3945bc20 upstream.
Save the qp context flags byte containing the flag disabling vlan stripping
in the RESET to INIT qp transition, rather than in the INIT to RTR
transition. Per the firmware spec, the flags in this byte are active
in the RESET to INIT transition.
As a result of saving the flags in the incorrect qp transition, when
switching dynamically from VGT to VST and back to VGT, the vlan
remained stripped (as is required for VST) and did not return to
not-stripped (as is required for VGT).
Fixes: f0f829bf42 ("net/mlx4_core: Add immediate activate for VGT->VST->VGT")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 291c566a28 upstream.
In function mlx4_cq_completion() and mlx4_cq_event(), the
radix_tree_lookup requires a rcu_read_lock.
This is mandatory: if another core frees the CQ, it could
run the radix_tree_node_rcu_free() call_rcu() callback while
it's being used by the radix tree lookup function.
Additionally, in function mlx4_cq_event(), since we are adding
the rcu lock around the radix-tree lookup, we no longer need to take
the spinlock. Also, the synchronize_irq() call for the async event
eliminates the need for incrementing the cq reference count in
mlx4_cq_event().
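The shape of the lookup after the change, as a sketch; the fields follow the mlx4 core and the error handling is simplified:

	rcu_read_lock();
	cq = radix_tree_lookup(&cq_table->tree,
			       cqn & (dev->caps.num_cqs - 1));
	rcu_read_unlock();

	if (!cq) {
		mlx4_dbg(dev, "Completion event for bogus CQ %08x\n", cqn);
		return;
	}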
Other changes:
1. In function mlx4_cq_free(), replace spin_lock_irq with spin_lock:
we no longer take this spinlock in the interrupt context.
The spinlock here, therefore, simply protects against different
threads simultaneously invoking mlx4_cq_free() for different cq's.
2. In function mlx4_cq_free(), we move the radix tree delete to before
the synchronize_irq() calls. This guarantees that we will not
access this cq during any subsequent interrupts, and therefore can
safely free the CQ after the synchronize_irq calls. The rcu_read_lock
in the interrupt handlers only needs to protect against corrupting the
radix tree; the interrupt handlers may access the cq outside the
rcu_read_lock due to the synchronize_irq calls which protect against
premature freeing of the cq.
3. In function mlx4_cq_event(), we change the mlx_warn message to mlx4_dbg.
4. We leave the cq reference count mechanism in place, because it is
still needed for the cq completion tasklet mechanism.
Fixes: 6d90aa5cf1 ("net/mlx4_core: Make sure there are no pending async events when freeing CQ")
Fixes: 225c7b1fee ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6496bbf0ec upstream.
Single send WQE in RX buffer should be stamped with software
ownership in order to prevent the flow of QP in error in FW
once UPDATE_QP is called.
Fixes: 9f519f68cf ('mlx4_en: Not using Shared Receive Queues')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 22547c4cc4 upstream.
On a system with a defective USB device connected to an USB hub,
an endless sequence of port connect events was observed. The sequence
of events as observed is as follows:
- Port reports connected event (port status=USB_PORT_STAT_CONNECTION).
- Event handler debounces port and resets it by calling hub_port_reset().
- hub_port_reset() calls hub_port_wait_reset() to wait for the reset
to complete.
- The reset completes, but USB_PORT_STAT_CONNECTION is not immediately
set in the port status register.
- hub_port_wait_reset() returns -ENOTCONN.
- Port initialization sequence is aborted.
- A few milliseconds later, the port again reports a connected event,
and the sequence repeats.
This continues either forever or, randomly, stops if the connection
is already re-established when the port status is read. It results in
a high rate of udev events. This in turn destabilizes userspace since
the above sequence holds the device mutex pretty much continuously
and prevents userspace from actually reading the device status.
To prevent the problem from happening, let's wait for the connection
to be re-established after a port reset. If the device was actually
disconnected, the code will still return an error, but it will do so
only after the long reset timeout.
Cc: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36e1f3d107 upstream.
While stressing memory and IO at the same time we changed SMT settings,
we were able to consistently trigger deadlocks in the mm system, which
froze the entire machine.
I think that under memory stress conditions, the large allocations
performed by blk_mq_init_rq_map may trigger a reclaim, which stalls
waiting on the block layer remapping completion, thus deadlocking the
system. The trace below was collected after the machine stalled,
waiting for the hotplug event completion.
The simplest fix for this is to make allocations in this path
non-reclaimable, with GFP_NOIO. With this patch, We couldn't hit the
issue anymore.
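The gist of the change as a sketch, showing one of the allocations in blk_mq_init_rq_map(); the exact call sites and flag combination in the patch may differ:

	tags->rqs = kzalloc_node(set->queue_depth * sizeof(struct request *),
				 GFP_NOIO, set->numa_node);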
This should apply on top of Jens's for-next branch cleanly.
Changes since v1:
- Use GFP_NOIO instead of GFP_NOWAIT.
Call Trace:
[c000000f0160aaf0] [c000000f0160ab50] 0xc000000f0160ab50 (unreliable)
[c000000f0160acc0] [c000000000016624] __switch_to+0x2e4/0x430
[c000000f0160ad20] [c000000000b1a880] __schedule+0x310/0x9b0
[c000000f0160ae00] [c000000000b1af68] schedule+0x48/0xc0
[c000000f0160ae30] [c000000000b1b4b0] schedule_preempt_disabled+0x20/0x30
[c000000f0160ae50] [c000000000b1d4fc] __mutex_lock_slowpath+0xec/0x1f0
[c000000f0160aed0] [c000000000b1d678] mutex_lock+0x78/0xa0
[c000000f0160af00] [d000000019413cac] xfs_reclaim_inodes_ag+0x33c/0x380 [xfs]
[c000000f0160b0b0] [d000000019415164] xfs_reclaim_inodes_nr+0x54/0x70 [xfs]
[c000000f0160b0f0] [d0000000194297f8] xfs_fs_free_cached_objects+0x38/0x60 [xfs]
[c000000f0160b120] [c0000000003172c8] super_cache_scan+0x1f8/0x210
[c000000f0160b190] [c00000000026301c] shrink_slab.part.13+0x21c/0x4c0
[c000000f0160b2d0] [c000000000268088] shrink_zone+0x2d8/0x3c0
[c000000f0160b380] [c00000000026834c] do_try_to_free_pages+0x1dc/0x520
[c000000f0160b450] [c00000000026876c] try_to_free_pages+0xdc/0x250
[c000000f0160b4e0] [c000000000251978] __alloc_pages_nodemask+0x868/0x10d0
[c000000f0160b6f0] [c000000000567030] blk_mq_init_rq_map+0x160/0x380
[c000000f0160b7a0] [c00000000056758c] blk_mq_map_swqueue+0x33c/0x360
[c000000f0160b820] [c000000000567904] blk_mq_queue_reinit+0x64/0xb0
[c000000f0160b850] [c00000000056a16c] blk_mq_queue_reinit_notify+0x19c/0x250
[c000000f0160b8a0] [c0000000000f5d38] notifier_call_chain+0x98/0x100
[c000000f0160b8f0] [c0000000000c5fb0] __cpu_notify+0x70/0xe0
[c000000f0160b930] [c0000000000c63c4] notify_prepare+0x44/0xb0
[c000000f0160b9b0] [c0000000000c52f4] cpuhp_invoke_callback+0x84/0x250
[c000000f0160ba10] [c0000000000c570c] cpuhp_up_callbacks+0x5c/0x120
[c000000f0160ba60] [c0000000000c7cb8] _cpu_up+0xf8/0x1d0
[c000000f0160bac0] [c0000000000c7eb0] do_cpu_up+0x120/0x150
[c000000f0160bb40] [c0000000006fe024] cpu_subsys_online+0x64/0xe0
[c000000f0160bb90] [c0000000006f5124] device_online+0xb4/0x120
[c000000f0160bbd0] [c0000000006f5244] online_store+0xb4/0xc0
[c000000f0160bc20] [c0000000006f0a68] dev_attr_store+0x68/0xa0
[c000000f0160bc60] [c0000000003ccc30] sysfs_kf_write+0x80/0xb0
[c000000f0160bca0] [c0000000003cbabc] kernfs_fop_write+0x17c/0x250
[c000000f0160bcf0] [c00000000030fe6c] __vfs_write+0x6c/0x1e0
[c000000f0160bd90] [c000000000311490] vfs_write+0xd0/0x270
[c000000f0160bde0] [c0000000003131fc] SyS_write+0x6c/0x110
[c000000f0160be30] [c000000000009204] system_call+0x38/0xec
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: Douglas Miller <dougmill@linux.vnet.ibm.com>
Cc: linux-block@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b6867c2ce upstream.
Subtracting tp_sizeof_priv from tp_block_size and casting to int
to check whether one is less than the other doesn't always work
(both of them are unsigned ints).
Compare them as is instead.
Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as
it can overflow inside BLK_PLUS_PRIV otherwise.
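A sketch of the corrected check, with macro and field names as used in af_packet.c:

		if (req->tp_block_size <=
		    BLK_PLUS_PRIV((u64)req_u->req3.tp_sizeof_priv))
			goto out;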
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33fa46d7b3 upstream.
In case caam_jr_alloc() fails, ctx->dev carries the error code,
thus accessing it with dev_err() is incorrect.
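A sketch of the corrected error path; caam_jr_alloc(), IS_ERR() and PTR_ERR() are the real APIs, the message text is illustrative:

	ctx->dev = caam_jr_alloc();
	if (IS_ERR(ctx->dev)) {
		pr_err("Job Ring Device allocation for transform failed\n");
		return PTR_ERR(ctx->dev);
	}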
Fixes: 8c419778ab ("crypto: caam - add support for RSA algorithm")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40c98cb57c upstream.
RNG instantiation was previously fixed by
commit 62743a4145 ("crypto: caam - fix RNG init descriptor ret. code checking")
while deinstantiation was not addressed.
Since the descriptors used are similar, in the sense that they both end
with a JUMP HALT command, checking for errors should be similar too,
i.e. status code 7000_0000h should be considered successful.
Fixes: 1005bccd7a ("crypto: caam - enable instantiation of all RNG4 state handles")
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c25f8064c1 upstream.
Commit dda45f701c ("MIPS: Switch to the irq_stack in interrupts")
changed both the normal and vectored interrupt handlers. Unfortunately
the vectored version, "except_vec_vi_handler", was incorrectly modified
to unconditionally jal to plat_irq_dispatch, rather than doing a jalr to
the vectored handler that has been set up. This is ok for many platforms
which set the vectored handler to plat_irq_dispatch anyway, but will
cause problems with platforms that use other handlers.
Fixes: dda45f701c ("MIPS: Switch to the irq_stack in interrupts")
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15110/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dda45f701c upstream.
When entering interrupt context via handle_int or except_vec_vi, switch
to the irq_stack of the current CPU if it is not already in use.
The current stack pointer is masked with the thread size and compared to
the base or the irq stack. If it does not match then the stack pointer
is set to the top of that stack, otherwise this is a nested irq being
handled on the irq stack so the stack pointer should be left as it was.
The in-use stack pointer is placed in the callee saved register s1. It
will be saved to the stack when plat_irq_dispatch is invoked and can be
restored once control returns here.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14743/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 510d86362a upstream.
The SAVE_SOME macro is used to save the execution context on all
exceptions.
If an exception occurs while executing user code, the stack is switched
to the kernel's stack for the current task, and register $28 is switched
to point to the current_thread_info, which is at the bottom of the stack
region.
If the exception occurs while executing kernel code, the stack is left as it is,
and this change ensures that register $28 is not updated. This is the
correct behaviour when the kernel can be executing on the separate irq
stack, because the thread_info will not be at the base of it.
With this change, register $28 is only switched to its conventional
kernel usage of the current thread info pointer at the point at
which execution enters kernel space. Doing it on every exception was
redundant but harmless without an IRQ stack, however it will be
erroneous once that is introduced.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Acked-by: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14742/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd5d213101 upstream.
After parsing TRX we should skip to the first block placed behind it.
Our code was working only with TRX with length not aligned to the
blocksize. In other cases (length aligned) it was missing the block
places right after TRX.
This fixes calculation and simplifies the comment.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a083c8fd27 upstream.
In the device removal routine, the use of "#ifdef CONFIG_RT2X00_LIB_USB"
will not cover the case when the driver is configured as a module. This
will omit the entire if-block which does cleanup of URBs and cancellation
of pending work. Change the #ifdef to #if IS_ENABLED() to fix it.
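The pattern being fixed, as a sketch; the cleanup body is elided, and the point is that IS_ENABLED() is true for both =y and =m while a plain #ifdef is true only for =y:

#if IS_ENABLED(CONFIG_RT2X00_LIB_USB)
	if (rt2x00_is_usb(rt2x00dev)) {
		usb_kill_anchored_urbs(rt2x00dev->anchor);
		/* ... cancel pending rx/tx work here ... */
	}
#endif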
Signed-off-by: Vishal Thanki <vishalthanki@gmail.com>
Acked-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93c7018ec1 upstream.
We might kill a TX or RX urb during rt2x00usb_flush_entry(), which can
cause anchor list corruption like shown below:
[ 2074.035633] WARNING: CPU: 2 PID: 14480 at lib/list_debug.c:33 __list_add+0xac/0xc0
[ 2074.035634] list_add corruption. prev->next should be next (ffff88020f362c28), but was dead000000000100. (prev=ffff8801d161bb70).
<snip>
[ 2074.035670] Call Trace:
[ 2074.035672] [<ffffffff813bde47>] dump_stack+0x63/0x8c
[ 2074.035674] [<ffffffff810a2231>] __warn+0xd1/0xf0
[ 2074.035676] [<ffffffff810a22af>] warn_slowpath_fmt+0x5f/0x80
[ 2074.035678] [<ffffffffa073855d>] ? rt2x00usb_register_write_lock+0x3d/0x60 [rt2800usb]
[ 2074.035679] [<ffffffff813dbe4c>] __list_add+0xac/0xc0
[ 2074.035681] [<ffffffff81591c6c>] usb_anchor_urb+0x4c/0xa0
[ 2074.035683] [<ffffffffa07322af>] rt2x00usb_kick_rx_entry+0xaf/0x100 [rt2x00usb]
[ 2074.035684] [<ffffffffa0732322>] rt2x00usb_clear_entry+0x22/0x30 [rt2x00usb]
To fix this, do not anchor TX and RX urbs; it is not needed, as during
shutdown we kill those urbs in rt2x00usb_free_entries().
Cc: Vishal Thanki <vishalthanki@gmail.com>
Fixes: 8b4c000931 ("rt2x00usb: Use usb anchor to manage URB")
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e247454103 upstream.
Writing messages larger than the FIFO size results in a hang, rendering
the machine unusable. This is because the RXD status flag is set on the
first interrupt which results in bcm2835_drain_rxfifo() stealing bytes
from the buffer. The controller continues to trigger interrupts waiting
for the missing bytes, but bcm2835_fill_txfifo() has none to give.
In this situation wait_for_completion_timeout() apparently is unable to
stop the madness.
The BCM2835 ARM Peripherals datasheet has this to say about the flags:
TXD: is set when the FIFO has space for at least one byte of data.
RXD: is set when the FIFO contains at least one byte of data.
TXW: is set during a write transfer and the FIFO is less than ¼ full.
RXR: is set during a read transfer and the FIFO is ¾ or more full.
Implementing the logic from the downstream i2c-bcm2708 driver solved
the hang problem.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb68d0324d upstream.
The daemon through which the kernel module communicates with the userspace
part of Orangefs, the "client-core", sends initialization data to the
kernel module with ioctl. The initialization data was built by the
client-core in a 2k buffer and copy_from_user'd into a 1k buffer
in the kernel module. When more than 1k of initialization data needed
to be sent, some was lost, reducing the usability of the control by which
debug levels are set. This patch sets the kernel side buffer to 2K to
match the userspace side...
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05973c2efb upstream.
This patch is similar to one Dan Carpenter sent me; it cleans
up some return codes and whitespace errors. There was one
place where he thought inserting an error message into
the ring buffer might be too chatty, I hope I convinced him
otherwise. As a consolation <g> I changed a truly chatty
error message in another location into a debug message,
system-admins had already yelled at me about that one...
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4defb5f912 upstream.
The allocated string 'new' is not freed on the exit path when
cdm_element_count <= 0. Fix this by kfree'ing it.
Fixes CoverityScan CID#1375923 "Resource Leak"
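A sketch of the fix, with the surrounding code elided:

	if (cdm_element_count <= 0) {
		kfree(new);		/* was leaked on this early-exit path */
		goto out;
	}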
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f5418e564 upstream.
This patch makes the I915_PARAM_HAS_EXEC_CONSTANTS getparam return 0
(indicating the optional feature is not supported), and makes execbuf
always return -EINVAL if the flags are used.
Apparently, no userspace ever shipped which used this optional feature:
I checked the git history of Mesa, xf86-video-intel, libva, and Beignet,
and there were zero commits showing a use of these flags. Kernel commit
72bfa19c8d apparently introduced the feature prematurely. According
to Chris, the intention was to use this in cairo-drm, but "the use was
broken for gen6", so I don't think it ever happened.
'relative_constants_mode' has always been tracked per-device, but this
has actually been wrong ever since hardware contexts were introduced, as
the INSTPM register is saved (and automatically restored) as part of the
render ring context. The software per-device value could therefore get
out of sync with the hardware per-context value. This meant that using
them is actually unsafe: a client which tried to use them could damage
the state of other clients, causing the GPU to interpret their BO
offsets as absolute pointers, leading to bogus memory reads.
These flags were also never ported to execlist mode, making them no-ops
on Gen9+ (which requires execlists), and Gen8 in the default mode.
On Gen8+, userspace can write these registers directly, achieving the
same effect. On Gen6-7.5, it likely makes sense to extend the command
parser to support them. I don't think anyone wants this on Gen4-5.
Based on a patch by Dave Gordon.
v3: Return -ENODEV for the getparam, as this is what we do for other
obsolete features. Suggested by Chris Wilson.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92448
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170215093446.21291-1-kenneth@whitecape.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170313170433.26843-1-chris@chris-wilson.co.uk
(cherry picked from commit ef0f411f51)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 34dc8993ee upstream.
Certain Baytrails, namely the 4 cpu core variants, have been
plagued by spurious system hangs, mostly occurring with light loads.
Multiple bisects by various people point to a commit which changes the
reclocking strategy for Baytrail to follow its bigger brethren:
commit 8fb55197e6 ("drm/i915: Agressive downclocking on Baytrail")
There is also a review comment attached to this commit from Deepak S
on avoiding punit access on Cherryview, and thus it was excluded from the
common reclocking path. By taking the same approach and omitting the
punit access (not tweaking the thresholds when the hardware has been
asked to move to a different frequency), considerable gains in stability
have been observed.
With J1900 box, light render/video load would end up in system hang
in usually less than 12 hours. With this patch applied, the cumulative
uptime has now been 34 days without issues. To provoke system hang,
light loads on both render and bsd engines in parallel have been used:
glxgears >/dev/null 2>/dev/null &
mpv --vo=vaapi --hwdec=vaapi --loop=inf vid.mp4
So far, the author has not witnessed a system hang with the above load
and this patch applied. Reports from the tenacious people at
kernel bugzilla are also promising.
Considering that the punit access frequency with this patch is
considerably lower, there is a possibility that this will push
the still unknown root cause past the triggering point on most loads.
But as we now can reliably reproduce the hang independently,
we can reduce the pain that users are having and use
static thresholds until a root cause is found.
v3: don't break debugfs and simplification (Chris Wilson)
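A minimal sketch of the idea, assuming a hypothetical helper around the
threshold programming (only the IS_VALLEYVIEW/IS_CHERRYVIEW macros are the
driver's own names):
static void set_rps_thresholds_sketch(struct drm_i915_private *dev_priv, u8 val)
{
	/* Leave the static boot-time thresholds alone on these parts to
	 * avoid the punit accesses implicated in the hangs. */
	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
		return;
	/* ... reprogram up/down thresholds for 'val' on other platforms ... */
}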
References: https://bugzilla.kernel.org/show_bug.cgi?id=109051
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: fritsch@xbmc.org
Cc: miku@iki.fi
Cc: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
CC: Michal Feix <michal@feix.cz>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1487166779-26945-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 6067a27d1f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This is example code for a Raspberry Pi TV HAT device tree overlay.
Although this is not part of our release code, it has been used to verify
the CXD2880 device driver with the TV HAT.
Add the following line to /boot/config.txt to enable the TV HAT:
dtoverlay=rpi-tv
Reboot the Raspberry Pi and check for the existence of /proc/device-tree/soc/spi@7e204000/cxd2880@0.
If it exists, the installation is successful and you should be able to find the following three files.
/dev/dvb/adapter0/frontend0
/dev/dvb/adapter0/demux0
/dev/dvb/adapter0/dvr0
Signed-off-by: Yasunari Takiguchi <Yasunari.Takiguchi@sony.com>
Signed-off-by: Masayuki Yamamoto <Masayuki.Yamamoto@sony.com>
Signed-off-by: Hideki Nozawa <Hideki.Nozawa@sony.com>
Signed-off-by: Kota Yonezawa <Kota.Yonezawa@sony.com>
Signed-off-by: Toshihiko Matsumoto <Toshihiko.Matsumoto@sony.com>
Signed-off-by: Satoshi Watanabe <Satoshi.C.Watanabe@sony.com>
[ Upstream commit d595259fbb ]
This USB-SATA bridge chip is used in a StarTech enclosure for
optical drives.
Without the quirk MakeMKV fails during the key exchange with an
installed BluRay drive:
> Error 'Scsi error - ILLEGAL REQUEST:COPY PROTECTION KEY EXCHANGE FAILURE - KEY NOT ESTABLISHED'
> occurred while issuing SCSI command AD010..080002400 to device 'SG:dev_11:2'
Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71050ae7bf ]
Some Asus laptops that have an airplane-mode indicator LED, also have
the WMI WLAN user bit set, and the following bits in their DSDT:
Scope (_SB)
{
(...)
Device (ATKD)
{
(...)
Method (WMNB, 3, Serialized)
{
(...)
If (LEqual (IIA0, 0x00010002))
{
OWGD (IIA1)
Return (One)
}
}
}
}
So when asus-wmi uses ASUS_WMI_DEVID_WLAN_LED (0x00010002) to store the
wlan state, it drives the airplane-mode indicator LED (through the call
to OWGD) in an inverted fashion: the LED is ON when airplane mode is OFF
(since wlan is ON), and vice-versa.
This commit skips registering RFKill switches at all for these laptops,
to allow the asus-wireless driver to drive the airplane mode LED
correctly through the ASHS ACPI device. Relying on the presence of ASHS
and ASUS_WMI_DSTS_USER_BIT avoids adding DMI-based quirks for at least
21 different laptops.
Signed-off-by: João Paulo Rechi Vita <jprvita@endlessm.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0b445549ea ]
In soft (no-reboot) mode, the driver self-pings the watchdog upon expiration,
in its interrupt handler. However, the interrupt itself was not cleared, so on
the first hit the system enters an infinite interrupt handling loop.
On Odroid U3 (Exynos4412), when booted with s3c2410_wdt.soft_noboot=1
argument the console is flooded:
# killall -9 watchdog
[ 60.523760] s3c2410-wdt 10060000.watchdog: watchdog timer expired (irq)
[ 60.536744] s3c2410-wdt 10060000.watchdog: watchdog timer expired (irq)
Fix this by writing something to the WTCLRINT register to clear the
interrupt. The WTCLRINT register, however, only appeared with the S3C6410,
so a new watchdog quirk and flavor are needed.
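A sketch of the resulting interrupt handler; the register offset and the
quirk/struct field names are assumptions for illustration, not necessarily
the driver's exact ones:
#define S3C2410_WTCLRINT	0x0c	/* interrupt clear register, offset assumed */
static irqreturn_t s3c2410wdt_irq_sketch(int irqno, void *param)
{
	struct s3c2410_wdt *wdt = param;

	dev_info(wdt->dev, "watchdog timer expired (irq)\n");
	s3c2410wdt_keepalive(&wdt->wdt_device);

	/* S3C6410 and later: ack the interrupt, otherwise it fires forever */
	if (wdt->drv_data->quirks & QUIRK_HAS_WTCLRINT_REG)
		writel(0x1, wdt->reg_base + S3C2410_WTCLRINT);

	return IRQ_HANDLED;
}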
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 33be632b84 ]
The Qualcomm QDF2xxx root ports don't advertise an ACS capability, but they
do provide ACS-like features to disable peer transactions and validate bus
numbers in requests.
To be specific:
* Hardware supports source validation but it will report the issue as
Completer Abort instead of ACS Violation.
* Hardware doesn't support peer-to-peer and each root port is a root
complex with unique segment numbers.
* It is not possible for one root port to pass traffic to the other root
port. All PCIe transactions are terminated inside the root port.
Add an ACS quirk for the QDF2400 and QDF2432 products.
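The quirk amounts to telling the PCI core that the equivalent isolation is
already in effect; a sketch of such a callback (the per-device table entries
are omitted):
static int qcom_rp_acs_sketch(struct pci_dev *dev, u16 acs_flags)
{
	/* The root port terminates peer-to-peer traffic and validates
	 * requester IDs, so treat SV/RR/CR/UF as provided. */
	u16 provided = PCI_ACS_SV | PCI_ACS_RR | PCI_ACS_CR | PCI_ACS_UF;

	return (acs_flags & ~provided) ? 0 : 1;
}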
[bhelgaas: changelog]
Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e9acc77dd0 ]
Initially all QorIQ platforms were PowerPC architecture and, with a few
exceptions, they didn't support card detection. The driver added the quirk
SDHCI_QUIRK_BROKEN_CARD_DETECTION by default, which made the broken-cd
property in the dts node have no effect. Now the QorIQ platform is turning to
the ARM architecture and most of those parts can support card detection.
However, a large number of dts trees would need to be fixed with broken-cd if
we removed the default SDHCI_QUIRK_BROKEN_CARD_DETECTION in the driver, and
users don't want to see this. So this patch removes this default quirk just
for ARM and keeps it for PowerPC. (Note: QorIQ PowerPC platforms only have
big-endian eSDHC, while QorIQ ARM platforms have big-endian or little-endian
eSDHC.) This makes the broken-cd property work again for ARM.
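In effect the quirk setup becomes conditional on the architecture; a sketch
(the helper name is illustrative, SDHCI_QUIRK_BROKEN_CARD_DETECTION is the
real flag):
static void esdhc_card_detect_quirk_sketch(struct sdhci_host *host)
{
#ifdef CONFIG_PPC
	/* QorIQ PowerPC eSDHC: keep assuming card detect is broken */
	host->quirks |= SDHCI_QUIRK_BROKEN_CARD_DETECTION;
#endif
	/* on ARM: nothing, the broken-cd dts property decides, as intended */
}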
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72f2ff0deb ]
The PCIe Root Port in Hip06/Hip07 SoCs advertises an MSI capability, but it
cannot generate MSIs. It can transfer MSI/MSI-X from downstream devices,
but does not support MSI/MSI-X itself.
Add a quirk to prevent use of MSI/MSI-X by the Root Port.
[bhelgaas: changelog, sort vendor ID #define, drop device ID #define]
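One common shape for such a quirk (a sketch; the device ID below is a
placeholder, not taken from the patch, and 0x19e5 is the Huawei/HiSilicon
vendor ID):
static void quirk_hisi_rp_no_msi_sketch(struct pci_dev *dev)
{
	dev_info(&dev->dev, "Root Port cannot generate MSI/MSI-X, disabling\n");
	dev->no_msi = 1;
}
DECLARE_PCI_FIXUP_FINAL(0x19e5, 0x0000 /* placeholder */, quirk_hisi_rp_no_msi_sketch);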
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gabriele Paoloni <gabriele.paoloni@huawei.com>
Reviewed-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ce709f8650 ]
The Broadcom Northstar2 SoC has a number of quirks for the PAXC
(internal/fake) PCI bus. Specifically, the PCI config space is shared
between the root port and the first PF (i.e., PF0), and a number of fields
are tied to zero (thus preventing them from being set). These cannot be
"fixed" in device firmware, so we must fix them with a quirk.
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3046ec674d ]
Commit 680a0873e1 ("arm: kernel: Add SMC structure parameter") added
a new "quirk" parameter to the SMC and HVC SMCCC backends, but only
updated the comment for the SMC version. This patch adds the new
parameter to the comment describing the HVC version too.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a5725ab049 ]
We get 2 warnings when building kernel with W=1:
drivers/gpu/drm/msm/adreno/a3xx_gpu.c:535:17: warning: no previous prototype for 'a3xx_gpu_init' [-Wmissing-prototypes]
drivers/gpu/drm/msm/adreno/a4xx_gpu.c:624:17: warning: no previous prototype for 'a4xx_gpu_init' [-Wmissing-prototypes]
In fact, both functions are declared in
drivers/gpu/drm/msm/adreno/adreno_device.c, but should be declared
in a header file. So this patch moves both function declarations to
drivers/gpu/drm/msm/adreno/adreno_gpu.h.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1477127865-9381-1-git-send-email-baoyou.xie@linaro.org
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 82bcd08702 ]
This patch adds a Qualcomm specific quirk to the arm_smccc_smc call.
On Qualcomm ARM64 platforms, the SMC call can return before it has
completed. If this occurs, the call can be restarted, but it requires
using the returned session ID value from the interrupted SMC call.
The quirk stores off the session ID from the interrupted call in the
quirk structure so that it can be used by the caller.
This patch folds in a fix given by Sricharan R:
https://lkml.org/lkml/2016/9/28/272
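Conceptually the quirk argument is how the session ID comes back to the
caller; a sketch of the structure involved (field names as recalled from the
arm-smccc header, treat them as illustrative):
struct arm_smccc_quirk {
	int	id;			/* e.g. ARM_SMCCC_QUIRK_QCOM_A6 */
	union {
		unsigned long	a6;	/* session ID of the interrupted SMC */
	} state;
};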
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 680a0873e1 ]
This patch adds a quirk parameter to the arm_smccc_(smc/hvc) calls.
The quirk structure allows for specialized SMC operations due to SoC
specific requirements. The current arm_smccc_(smc/hvc) is renamed and
macros are used instead to specify the standard arm_smccc_(smc/hvc) or
the arm_smccc_(smc/hvc)_quirk function.
This patch and partial implementation were suggested by Will Deacon.
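The rename plus macros works out to thin wrappers; a sketch of the pattern
(close to, but not guaranteed identical to, the header):
/* Standard callers pass a NULL quirk; quirk-aware callers use _quirk. */
#define arm_smccc_smc(...)		__arm_smccc_smc(__VA_ARGS__, NULL)
#define arm_smccc_smc_quirk(...)	__arm_smccc_smc(__VA_ARGS__)
#define arm_smccc_hvc(...)		__arm_smccc_hvc(__VA_ARGS__, NULL)
#define arm_smccc_hvc_quirk(...)	__arm_smccc_hvc(__VA_ARGS__)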
Signed-off-by: Andy Gross <andy.gross@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e7deb1570a ]
Non-generic devices have numbered_buttons set for both pen and
touch interfaces by default. The actual number of buttons on the
interface is normally decided manually later, which is different
from how HID generic devices are processed, where the number
of buttons is retrieved directly from the HID descriptors.
This patch adds the missing HID_GENERIC check and moves the statement
to wacom_setup_pad_input_capabilities since it's not a quirk anymore.
Signed-off-by: Ping Cheng <ping.cheng@wacom.com>
Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2ad6f30de7 ]
Some SoCs have a reset line that must be asserted/deasserted.
This patch adds a quirk to handle the new compatible
"allwinner,sun6i-a31-i2s", which deasserts the reset
line in the probe function and asserts it in the remove one.
This new compatible is useful in the case of the A33 codec driver, for example.
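A sketch of what the quirk handling amounts to in probe/remove (the quirk
flag and private-struct fields are illustrative):
/* probe path */
if (quirks->has_reset) {
	i2s->rst = devm_reset_control_get(&pdev->dev, NULL);
	if (IS_ERR(i2s->rst))
		return PTR_ERR(i2s->rst);
	reset_control_deassert(i2s->rst);
}
/* remove path */
if (quirks->has_reset)
	reset_control_assert(i2s->rst);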
Signed-off-by: Mylène Josserand <mylene.josserand@free-electrons.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cbc00c1310 ]
In commit 821d6f0359 (ACPI / sleep: Do not save NVS for new machines to
accelerate S3), to optimize S3 suspend/resume speed, code was introduced
to ignore NVS memory saving during S3 for all platforms later than 2012.
But Lenovo G50-45, a platform released in 2015, still needs NVS memory
saving during S3. A quirk is introduced for this platform.
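The usual shape of such a quirk is a DMI match that turns NVS saving back on;
a sketch (callback name, flag and DMI match strings are illustrative):
static bool nvs_nosave_s3;	/* existing sleep-code flag, shown for the sketch */
static int __init init_nvs_save_s3(const struct dmi_system_id *d)
{
	nvs_nosave_s3 = false;	/* keep saving NVS across S3 on this machine */
	return 0;
}
static const struct dmi_system_id acpisleep_dmi_sketch[] __initconst = {
	{
		.callback = init_nvs_save_s3,
		.ident = "Lenovo G50-45",
		.matches = {
			DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
			DMI_MATCH(DMI_PRODUCT_NAME, "80E3"),
		},
	},
	{ }
};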
Link: https://bugzilla.kernel.org/show_bug.cgi?id=189431
Tested-by: Przemek <soprwa@gmail.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[ rjw: Drop unnecessary code ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a50477e55f ]
The existing code assumes a 19.2 MHz MCLK as the default
hardware configuration. This is valid for CherryTrail but
not for Baytrail.
Add explicit MCLK configuration to turn the 19.2 MHz clock on/off
depending on DAPM events.
This is a prerequisite step to enable devices with Baytrail
and RT5645, such as the Asus X205TA.
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 77e9a4aa9d ]
More and more platforms need the button.lid_init_state=open quirk. This
patch makes it the default behavior.
If a platform doesn't send a lid open event, or the lid open event is lost
due to underlying system problems, then we can compare various combinations:
1. systemd/acpid is used to suspend system or not, systemd has a special
logic forcing open event after resuming;
2. _LID returns a cached value or not.
The result is as follows:
1. lid_init_state=method
1. cached
1. resumed by lid:
(x) event=close
(x) systemd=suspends again
(x) acpid=suspends again
(x) state=close
2. resumed by other:
(o) event=close
(x) systemd=suspends again
(x) acpid=suspends again
(o) state=close
2. non-cached
1. resumed by lid:
(o) event=open
(o) systemd=resumes
(o) acpid=resumes
(o) state=open
2. resumed by other:
(o) event=close
(x) systemd=suspends again
(x) acpid=suspends again
(o) state=close
2. lid_init_state=open
1. cached
1. resumed by lid:
(o) event=open
(o) systemd=resumes
(o) acpid=resumes
(x) state=close
2. resumed by other:
(x) event=open
(o) systemd=resumes
(o) acpid=resumes
(o) state=close
2. non-cached
1. resumed by lid:
(o) event=open
(o) systemd=resumes
(o) acpid=resumes
(o) state=open
2. resumed by other:
(x) event=open
(o) systemd=resumes
(o) acpid=resumes
(o) state=close
3. lid_init_state=ignore
1. cached
1. resumed by lid:
(o) event=none
(x) systemd=suspends again
(o) acpid=resumes
(x) state=close
2. resumed by other:
(o) event=none
(x) systemd=suspends again
(o) acpid=resumes
(o) state=close
2. non-cached
1. resumed by lid:
(o) event=none
(x) systemd=suspends again
(o) acpid=resumes
(o) state=open
2. resumed by other:
(o) event=none
(x) systemd=suspends again
(o) acpid=resumes
(o) state=close
As a conclusion:
1. With systemd changed, lid_init_state=ignore has only one problem, and the
problem comes from an underlying issue, not from userspace or kernel lid
handling.
2. Without systemd changed, lid_init_state=open can be the default
behavior as the pass ratio is not much worse than lid_init_state=ignore.
3. lid_init_state=method is buggy; we can have a separate patch to make it
deprecated.
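The change itself is essentially a default flip in drivers/acpi/button.c;
a sketch (constant names as recalled, treat as illustrative):
static int lid_init_state = ACPI_BUTTON_LID_INIT_OPEN;	/* previously ..._METHOD */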
Link: https://bugzilla.kernel.org/show_bug.cgi?id=187271
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f4d435f326 ]
There's an issue with the da850 SATA controller: if port multiplier
support is compiled in, but we're connecting the drive directly to
the SATA port on the board, the drive can't be detected.
To make SATA work on the da850-lcdk board: first try to softreset
with pmp - if the operation fails with -EBUSY, retry without pmp.
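A sketch of the retry, built on the exported libahci helpers (the wrapper
name is made up):
static int da850_softreset_sketch(struct ata_link *link, unsigned int *class,
				  unsigned long deadline)
{
	int pmp = sata_srst_pmp(link);
	int ret;

	/* try with port multiplier support first ... */
	ret = ahci_do_softreset(link, class, pmp, deadline, ahci_check_ready);
	/* ... and fall back to a plain softreset for directly attached drives */
	if (pmp && ret == -EBUSY)
		ret = ahci_do_softreset(link, class, 0, deadline,
					ahci_check_ready);
	return ret;
}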
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7184f5b451 ]
Intel 200-series chipsets have the same errata as 100-series: the ACS
capability doesn't follow the PCIe spec, the capability and control
registers are dwords rather than words. Add PCIe root port device IDs to
existing quirk.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8413299cb3 ]
Since v4.10-rc1, the following logs appear in a loop:
[ 801.953836] usb usb6-port1: Cannot enable. Maybe the USB cable is bad?
[ 801.960455] xhci-hcd xhci-hcd.0.auto: Cannot set link state.
[ 801.966611] usb usb6-port1: cannot disable (err = -32)
[ 806.083772] usb usb6-port1: Cannot enable. Maybe the USB cable is bad?
[ 806.090370] xhci-hcd xhci-hcd.0.auto: Cannot set link state.
[ 806.096494] usb usb6-port1: cannot disable (err = -32)
After analysis, xhci tries to set the link to U3 and returns an error.
Using snps,dis_u3_susphy_quirk fixes this issue.
Signed-off-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e42a5dbb8a ]
dwc3 revisions <= 3.00a have a limitation where the Port Disable command
doesn't work. Set the quirk-broken-port-ped property for such
controllers so the xHCI core can apply the necessary workaround.
[rogerq@ti.com] Updated code from platform data to device property.
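On the dwc3 side this reduces to adding the property when the core is old
enough; a fragment sketch (array plumbing simplified, revision constant as
recalled from the driver):
if (dwc->revision <= DWC3_REVISION_300A)
	props[prop_idx++] = PROPERTY_ENTRY_BOOL("quirk-broken-port-ped");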
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 41135de1e7 ]
Some devices from Texas Instruments [1] suffer from
a silicon bug where the Port Enabled/Disabled bit
should not be used to silence an erroneous device.
The bug is such that if a port is disabled with the PED
bit, an IRQ for device removal (or attachment)
will never fire.
Just for the sake of completeness, the actual
problem lies with the SNPS USB IP, and it affects
all known versions up to 3.00a. A separate
patch will be added to dwc3 to enable this
quirk flag if the version is <= 3.00a.
[1] - AM572x Silicon Errata http://www.ti.com/lit/er/sprz429j/sprz429j.pdf
Section i896— USB xHCI Port Disable Feature Does Not Work
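The host-side check then looks roughly like this, in the
ClearPortFeature(PORT_ENABLE) path:
if (xhci->quirks & XHCI_BROKEN_PORT_PED) {
	xhci_dbg(xhci,
		 "broken Port Enabled/Disabled, ignoring port disable request\n");
	break;	/* keep the port enabled so detach/attach IRQs still fire */
}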
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5feeca3c1e ]
GPIO descriptors are the preferred way over legacy GPIO numbers
nowadays. Convert the driver to use GPIO descriptors internally but
still allow passing legacy GPIO numbers from platform data to support
existing platforms.
Based on commits 633a21d80b ("input: gpio_keys_polled: Add support
for GPIO descriptors") and 1ae5ddb6f8 ("Input: gpio_keys_polled -
request GPIO pin as input.").
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b6ffcf2108 ]
The UART uses EDMA as the DMA engine on the AM437x SoC and therefore requires
the OMAP_DMA_TX_KICK quirk, just like AM33xx. So, enable the OMAP_DMA_TX_KICK
quirk for the AM437x platform as well. While at that, drop the use of
of_machine_is_compatible() and instead pass quirks via device data.
Signed-off-by: Vignesh R <vigneshr@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit dd3749099c ]
The core framework already handles setting this parameter with a
platform quirk. Add the appropriate flag so that we always set
AHBBURST to 0. Technically DT should be doing this, but we always
do it for msm chipidea devices so setting the flag in the driver
works just as well. If the burst needs to be anything besides 0,
we expect the 'ahb-burst-config' dts property to be present.
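For the MSM glue this boils down to setting the override flag in the platform
data; a sketch (flag and field names as recalled from
include/linux/usb/chipidea.h, treat them as illustrative):
static struct ci_hdrc_platform_data ci_hdrc_msm_platdata_sketch = {
	.name			= "ci_hdrc_msm",
	.flags			= CI_HDRC_OVERRIDE_AHB_BURST,
	.ahb_burst_config	= 0,	/* overridden by 'ahb-burst-config' in DT if present */
};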
Acked-by: Peter Chen <peter.chen@nxp.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7caf489b99 ]
If we issue the link startup to the device while its UniPro state is
LinkDown (and the device state is sleep/power-down), then link startup
will not move the device state to Active. The device will only move to
the active state if the link startup is issued when its UniPro state is
LinkUp. So in this case, we have to issue the link startup twice
to make sure that the device moves to the active state.
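In code this is just a bounded retry of the DME link startup; a sketch
(the retry count and wrapper name are illustrative):
static int ufshcd_link_startup_sketch(struct ufs_hba *hba)
{
	int retries = 2;	/* the second attempt covers the LinkDown case */
	int ret;

	do {
		ret = ufshcd_dme_link_startup(hba);
	} while (ret && --retries);

	return ret;
}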
Reviewed-by: Gilad Broner <gbroner@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 98b2f01c8d ]
Back in 2014, commit fb7023e0e2 ("drm/i915: BDW: Adding Reserved PCI
IDs.") added the reserved PCI IDs in order to try to make sure we had
working drivers in case we ever released products using these IDs
(since we had instances of this type of problem in the past). The
problem is that the patch only touched the macros used by
early-quirks.c and by the user space components that rely on
i915_pciids.h, it didn't touch the macros used by i915_pci.c. So we
correctly handled the stolen memory for these theoretical IDs, but we
didn't actually drive the devices from i915.ko.
So this patch fixes the original commit by actually making i915.ko
drive these IDs, which was the goal. There's no information on what
would be the GT count on these IDs, so we just go with the safer
intel_broadwell_info, at the risk of ignoring a possibly nonexistent
BSD2_RING.
I did some checking, and it seems that these IDs are driven by
intel-gpu-tools, xf86-video-intel and libdrm (since they contain old
copies of i915_pciids.h), but they are not checked by mesa.
The alternative to this patch would be to just assume we're actually
never going to use these IDs, and then remove them from our ID lists
and make sure our user space components sync the latest i915_pciids.h
copy. I'm fine with either approach, as long as we make sure that
every component tries to drive the same list of PCI IDs.
Fixes: fb7023e0e2 ("drm/i915: BDW: Adding Reserved PCI IDs.")
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1483473860-17644-3-git-send-email-paulo.r.zanoni@intel.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f83f90cf7b ]
The Futaba TOSD-5711BB VFD crashes when the initial HID report is requested.
Register the display in hid-ids and tell hid-quirks not to do the init.
Signed-off-by: Alex Wood <thetewood@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c4aa1eecb ]
Sometimes, users may require a quirk to be provided from the ACPI subsystem
core to prevent a GPE from flooding.
Normally, if a GPE cannot be dispatched, the ACPICA core automatically
prevents the GPE from firing. But there are cases where the GPE is dispatched
by _Lxx/_Exx provided via the AML table, and OSPM lacks the knowledge to get
_Lxx/_Exx executed correctly to handle the GPE, so GPE flooding may
still occur.
The existing quirk mechanism can be enabled/disabled using the following
commands to prevent such GPE flooding during runtime:
# echo mask > /sys/firmware/acpi/interrupts/gpe00
# echo unmask > /sys/firmware/acpi/interrupts/gpe00
To avoid GPE flooding during boot, we need a boot-stage mechanism.
This patch provides such a boot-stage quirk mechanism to stop this kind of
GPE flooding. This patch doesn't fix any feature gap, but since new
feature gaps could be found endlessly in the future, and can disappear once
the feature gaps are filled, providing a boot parameter rather than a DMI
table should suffice.
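With this in place, a problematic GPE can be masked from boot by appending
the new parameter to the kernel command line, for example (the GPE number
here is illustrative):
acpi_mask_gpe=0x6e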
Link: https://bugzilla.kernel.org/show_bug.cgi?id=53071
Link: https://bugzilla.kernel.org/show_bug.cgi?id=117481
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/887793
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e6282aef7b ]
Some OEMs believe they own the Identify Controller vendor specific
region and will repurpose it with their own values. While not common,
we can't rely on the PCI VID:DID to tell us how to decode the field
we reserved for this as the stripe size, so we need to do something else
for the list of devices using this quirk.
The field was supposed to allow flexibility on the device's back-end
striping, but it turned out that never materialized; the chunk is always
the same as MDTS in the products subscribing to this quirk, so this
patch removes the stripe_size field and sets the chunk to the max hw
transfer size for the devices using this quirk.
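The per-namespace setup then becomes a one-liner using the block layer's
chunk_sectors hook (a sketch; NVME_QUIRK_STRIPE_SIZE is the real quirk flag):
if (ctrl->quirks & NVME_QUIRK_STRIPE_SIZE)
	blk_queue_chunk_sectors(ns->queue, ctrl->max_hw_sectors);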
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5241b1938a ]
The AMW0_GUID1 WMI is not only found on the Acer family but also on other
machines like Lenovo, Fujitsu and Medion. In the past, acer-wmi handled
those non-Acer machines via a quirk list.
But actually the acer-wmi driver was loaded on any machine that had
AMW0_GUID1. This behavior is strange because those machines should be
supported by the appropriate WMI drivers, e.g. fujitsu-laptop,
ideapad-laptop.
This patch adds logic to check that a machine with AMW0_GUID1
is in the Acer/Packard Bell/Gateway whitelist. But it still keeps
the quirk list of those supported non-Acer machines for backward
compatibility.
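A sketch of the whitelist test (DMI strings and helper name are illustrative):
static bool dmi_is_acer_family_sketch(void)
{
	const char *vendor = dmi_get_system_info(DMI_SYS_VENDOR);

	if (!vendor)
		return false;
	return strstr(vendor, "Acer") || strstr(vendor, "Packard Bell") ||
	       strstr(vendor, "Gateway");
}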
Tested-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7f38ca047b ]
This patch adds native DSD support for the following devices.
- TEAC NT-503
- TEAC UD-503
- TEAC UD-501
(1) Add quirks for native DSD support for TEAC devices.
(2) A specific vendor command is needed to switch between PCM/DOP and
DSD mode, same as Denon/Marantz devices.
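The quirk side typically ends up as an extra case in the DSD-format lookup
keyed on TEAC's USB vendor ID (0x0644); the alt-setting check below is an
assumption:
/* inside the interface DSD-format quirk lookup, sketched */
switch (USB_ID_VENDOR(chip->usb_id)) {
case 0x0644:	/* TEAC Corp. */
	if (fp->altsetting == 3)	/* raw DSD alt setting, assumed */
		return SNDRV_PCM_FMTBIT_DSD_U32_BE;
	break;
}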
Signed-off-by: Nobutaka Okabe <nob77413@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 044bc425bb ]
It's not very enlightening to see
pci 0000:07:00.0: [Firmware Bug]: VPD access disabled
in the dmesg log because there's no clue about what the firmware bug is.
Expand the message to explain why we're disabling VPD.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 17f08b0d9a ]
The Axe-Fx II implicit feedback end point and the data sync endpoint
are in different interface descriptors. Add quirk to ensure a sync
endpoint is properly configured.
Signed-off-by: Alberto Aguirre <albaguirre@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 56d4a1866d ]
The maximum value of PA_SaveConfigTime is 250 (10 us), but this is not enough
for some vendors. A gear switch from PWM to HS may fail even with this
maximum PA_SaveConfigTime. A gear switch can be issued by the host controller
as error recovery, and any software delay will not help in this case, so
we need to increase PA_SaveConfigTime to >32 us as per the vendor's
recommendation. This change adds a quirk to increase the
PA_SaveConfigTime parameter.
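Applying the quirk is a single DME attribute write during link setup; a
sketch (the attribute constant and the value are illustrative):
/* raise PA_SaveConfigTime beyond the nominal maximum, per vendor advice */
ufshcd_dme_set(hba, UIC_ARG_MIB(PA_SAVECONFIGTIME), 0x13);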
Reviewed-by: Venkat Gopalakrishnan <venkatg@codeaurora.org>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0d414268fb ]
Pull the register resource lookup out of thunder_pem_init() so we can
easily add a corresponding lookup using ACPI. No functional change
intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 093d24a204 ]
Currently we use one shared global acpi_pci_root_ops structure to keep
controller-specific ops. We pass its pointer to acpi_pci_root_create() and
associate it with a host bridge instance for good. Such a design implies a
serious drawback: any potential manipulation of the single system-wide
acpi_pci_root_ops leads to a kernel crash. The structure content does not
really change even across the creation of multiple host bridges, so it has
not been an issue so far.
In preparation for adding an ECAM quirks mechanism (where controller-specific
PCI ops may be different for each host bridge), allocate a new
acpi_pci_root_ops and fill it in with data for each bridge. Now it is safe to
have different controller-specific info. As a consequence, free
acpi_pci_root_ops when the host bridge is released.
No functional changes in this patch.
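The per-bridge allocation pattern looks roughly like this (error handling
trimmed; the ops/info variables come from the arch glue and are only
sketched):
struct acpi_pci_root_ops *root_ops;
struct pci_bus *bus;

root_ops = kzalloc(sizeof(*root_ops), GFP_KERNEL);
if (!root_ops)
	return NULL;
root_ops->pci_ops = per_bridge_pci_ops;		/* may now differ per bridge */
bus = acpi_pci_root_create(root, root_ops, info, sysdata);
/* ... and kfree(root_ops) from the host bridge release path ... */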
Signed-off-by: Tomasz Nowicki <tn@semihalf.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5e7ec268fd ]
Add the CPU ID for Atom Z34xx processors. Datasheets indicate support for
this; detailed information about potential quirks or limitations is missing,
though. So we just reuse the definition from the official BSP code.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4d712ef1db ]
S5.3.3.1 of RFC 2203 requires that an incoming GSS-wrapped message
whose sequence number lies outside the current window is dropped.
The rationale is:
The reason for discarding requests silently is that the server
is unable to determine if the duplicate or out of range request
was due to a sequencing problem in the client, network, or the
operating system, or due to some quirk in routing, or a replay
attack by an intruder. Discarding the request allows the client
to recover after timing out, if indeed the duplication was
unintentional or well intended.
However, clients may rely on the server dropping the connection to
indicate that a retransmit is needed. Without a connection reset, a
client can wait forever without retransmitting, and the workload
just stops dead. I've reproduced this behavior by running xfstests
generic/323 on an NFSv4.0 mount with proto=rdma and sec=krb5i.
To address this issue, have the server close the connection when it
silently discards an incoming message due to a GSS sequence number
problem.
There are a few other places where the server will never reply.
Change those spots in a similar fashion.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b897f6db3a ]
We already have a quirk in place for Windows 8 devices, but it looks
like the Surface Covers do not conform to it.
Given that we are only interested in 3 feature reports (the ones that
the Windows driver retrieves), we should be safe to unconditionally apply
the quirk to everybody.
In case there is an issue with a controller, we can always mark it as such
in the transport driver, and hid-multitouch won't try to retrieve the
feature report.
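In practice that is a one-liner in the probe path (sketch):
/* fetch only the few feature reports we need, like the Windows driver */
hdev->quirks |= HID_QUIRK_NO_INIT_REPORTS;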
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8fe89ef076 ]
There is no reason to filter out keyboard and consumer control collections
in hid-multitouch.
With the previous hid-input fix, there is now full support for the Type
Cover and we can remove all the specific bits from hid-core and hid-microsoft.
hid-multitouch will automatically set HID_QUIRK_NO_INIT_REPORTS, so we can
also remove it from the list of usbhid quirks.
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d8ec7595a0 ]
The ARM architecture specifies that the system counter "must be implemented in an
always-on power domain," and so we try to use the counter as a source of
timekeeping across suspend/resume. Unfortunately, some SoCs (e.g.,
Rockchip's RK3399) do not keep the counter ticking properly when
switched from their high-power clock to the lower-power clock used in
system suspend. Support this quirk by adding a new device tree property.
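On the driver side this reduces to reading the new property and only marking
the counter as suspend-nonstop when it is absent; a sketch (variable name
illustrative, property string per the new binding as recalled):
static bool arch_counter_suspend_stop;
/* DT init path */
arch_counter_suspend_stop = of_property_read_bool(np, "arm,no-tick-in-suspend");
/* clocksource registration */
if (!arch_counter_suspend_stop)
	clocksource_counter.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;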
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c19e4b9037 ]
We added a bunch of new Mellanox device ID definitions because they'll be
used by INTx quirks. Use them in the mlx4 ID table also so grep can find
both places. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de288e36fe upstream.
In the case of bounced ep0 requests, we must delay DMA operation until
after ->complete() otherwise we might overwrite contents of req->buf.
This caused problems with RNDIS gadget.
Signed-off-by: Janusz Dziedzic <januszx.dziedzic@intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71af01a8c8 ]
Certain devices produced by Weida Tech need to have a wakeup command sent to
them before powering on. The call itself will come back with an error, but
the device can be powered on afterwards.
[jkosina@suse.cz: rewrite changelog]
[jkosina@suse.cz: remove unused device ID addition]
Signed-off-by: HungNien Chen <hn.chen@weidahitech.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f84d42a9cf ]
In the common clock framework, CLK_DIVIDER_ONE_BASED or'ed with
CLK_DIVIDER_ALLOW_ZERO indicates that
1) a divider clock may be set to a zero value,
2) the divider's zero value is interpreted as a non-divided clock.
On the LPC32xx platform the clock dividers of the PWM and memory card clocks
comply with the first condition, but a zero value means a gated clock;
thus it may happen that the divider value is not updated when
the clock is enabled and the clock remains gated.
The change adds one-shot quirks which check for a zero divider value
on initialization and set it to a non-zero value, so that at runtime
a gated clock will work as expected.
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Reviewed-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93a5ec14da upstream.
The A31 TCON has mux controls for how TCON outputs are routed to the
HDMI and MIPI DSI blocks.
Since the A31s does not have MIPI DSI, it only has a mux for the HDMI
controller input.
This patch only adds support for the compatible strings. Actual support
for the mux controls should be added with HDMI and MIPI DSI support.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49c440e87c upstream.
The A31's display pipeline has 2 frontends, 2 backends, and 2 TCONs. It
also has new display enhancement blocks, such as the DRC (Dynamic Range
Controller), the DEU (Display Enhancement Unit), and the CMU (Color
Management Unit). It supports HDMI, MIPI DSI, and 2 LCD/LVDS channels.
The A31s display pipeline is almost the same, just without MIPI DSI.
Only the TCON seems to be different, due to the missing mux for MIPI
DSI.
Add compatible strings for both of them.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91ea2f29cb upstream.
We already have some differences between the 2 supported SoCs.
More will be added as we support other SoCs. To avoid bloating
the probe function with even more conditionals, move the quirks
to a separate data structure that's tied to the compatible string.
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5b98461cb upstream.
Now that our crng uses chacha20, we can rely on its speedy
characteristics for replacing MD5, while simultaneously achieving a
higher security guarantee. Before the idea was to use these functions if
you wanted random integers that aren't stupidly insecure but aren't
necessarily secure either, a vague gray zone, that hopefully was "good
enough" for its users. With chacha20, we can strengthen this claim,
since either we're using an rdrand-like instruction, or we're using the
same crng as /dev/urandom. And it's faster than what was before.
We could have chosen to replace this with a SipHash-derived function,
which might be slightly faster, but at the cost of having yet another
RNG construction in the kernel. By moving to chacha20, we have a single
RNG to analyze and verify, and we also already get good performance
improvements on all platforms.
Implementation-wise, rather than use a generic buffer for both
get_random_int/long and memcpy based on the size needs, we use a
specific buffer for 32-bit reads and for 64-bit reads. This way, we're
guaranteed to always have aligned accesses on all platforms. While
slightly more verbose in C, the assembly this generates is a lot
simpler than otherwise.
Finally, on 32-bit platforms where longs and ints are the same size,
we simply alias get_random_int to get_random_long.
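A minimal sketch of the batching idea (names and layout are illustrative,
not the exact ones in drivers/char/random.c; extract_crng is the internal
crng extraction helper named here for illustration):
struct batched_long {
	unsigned long buf[CHACHA20_BLOCK_SIZE / sizeof(unsigned long)];
	unsigned int position;
};
static DEFINE_PER_CPU(struct batched_long, batched_long_pool);

unsigned long get_random_long_sketch(void)
{
	struct batched_long *batch = &get_cpu_var(batched_long_pool);
	unsigned long ret;

	if (batch->position == 0) {
		extract_crng((u8 *)batch->buf);	/* refill one chacha20 block */
		batch->position = ARRAY_SIZE(batch->buf);
	}
	ret = batch->buf[--batch->position];
	put_cpu_var(batched_long_pool);
	return ret;
}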
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Suggested-by: Theodore Ts'o <tytso@mit.edu>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf01fb9985 upstream.
In the case that compat_get_bitmap fails we do not want to copy the
bitmap to the user as it will contain uninitialized stack data and leak
sensitive data.
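The fix is to bail out before mirroring the bitmap into the user-space
temporary; roughly:
if (compat_get_bitmap(bm, nmask, nr_bits))
	return -EFAULT;
nm = compat_alloc_user_space(alloc_size);
/* only a successfully read bitmap is handed on to the native syscall */
if (copy_to_user(nm, bm, alloc_size))
	return -EFAULT;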
Signed-off-by: Chris Salls <salls@cs.ucsb.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf903e9d3a upstream.
A patch documenting how to specify which kernels a particular fix should
be backported to (seemingly) inadvertently added a minus sign after the
kernel version. This particular stable-tag format had never been used
prior to this patch, and was neither present when the patch in question
was first submitted (it was added in v2 without any comment).
Drop the minus sign to avoid any confusion.
Fixes: fdc81b7910 ("stable_kernel_rules: Add clause about specification of kernel versions to patch.")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0115f6cbf2 upstream.
On VTLB+FTLB platforms (such as Loongson-3A R2), the FTLB's pagesize is
usually configured the same as PAGE_SIZE. In such a case, a huge page
entry is not suitable to be written into the FTLB.
Unfortunately, when a huge page is created, its page table entries
haven't been created yet. The TLB refill handler will then fetch an
invalid page table entry which has no "HUGE" bit, and this entry may be
written to the FTLB. Since it is invalid, the TLB load/store handler will
then use tlbwi to write the valid entry at the same place. However, the
valid entry is a huge page entry, which isn't suitable for the FTLB.
Our solution is to modify build_huge_handler_tail: flush the invalid
old entry (whether it is in the FTLB or VTLB; this is in order to reduce
branches) and use tlbwr to write the valid new entry.
Signed-off-by: Rui Wang <wangr@lemote.com>
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J . Hill <Steven.Hill@caviumnetworks.com>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15754/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a34133167 upstream.
Loongson-3's micro TLB (ITLB) is not strictly a subset of the JTLB. That
means that when a JTLB entry is replaced by hardware, an old
valid entry may still exist in the ITLB. So, a TLB miss exception may occur
while handle_ri_rdhwr() is running because it tries to access EPC's content.
However, handle_ri_rdhwr() doesn't clear EXL, which makes a TLB Refill
exception be treated as a TLB Invalid exception, and tlbp may fail. In
this case, if the FTLB (which is usually set-associative instead of fully
associative) is enabled, a tlbp failure will cause an invalid tlbwi,
which will hang the whole system.
This patch renames handle_ri_rdhwr_vivt to handle_ri_rdhwr_tlbp and uses
it for Loongson-3. It tries to solve the same problem described below,
but more straightforwardly.
https://patchwork.linux-mips.org/patch/12591/
I think Loongson-2 has the same problem, but it has no FTLB, so we just
keep it as is.
Signed-off-by: Huacai Chen <chenhc@lemote.com>
Cc: Rui Wang <wangr@lemote.com>
Cc: John Crispin <john@phrozen.org>
Cc: Steven J . Hill <Steven.Hill@caviumnetworks.com>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Zhangjin Wu <wuzhangjin@gmail.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15753/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b5347a24a upstream.
When building for microMIPS we need to ensure that the assembler always
knows that there is code at the target of a branch or jump. Recent
toolchains will fail to link a microMIPS kernel when this isn't the case
due to what it thinks is a branch to non-microMIPS code.
mips-mti-linux-gnu-ld kernel/built-in.o: .spinlock.text+0x2fc: Unsupported branch between ISA modes.
mips-mti-linux-gnu-ld final link failed: Bad value
This is due to inline assembly labels in spinlock.h not being followed
by an instruction mnemonic, either due to a .subsection pseudo-op or the
end of the inline asm block.
Fix this with a .insn directive after such labels.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15325/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e6c774773 upstream.
When a 32-bit kernel is configured to support MIPS64r6 (CPU_MIPS64_R6),
MIPS_O32_FP64_SUPPORT won't be selected as it should be because
MIPS32_O32 is disabled (o32 is already the default ABI available on
32-bit kernels).
This results in userland FP breakage as CP0_Status.FR is read-only 1
since r6 (when an FPU is present) so __enable_fpu() will fail to clear
FR. This causes the FPU emulator to get used which will incorrectly
emulate 32-bit FPU registers.
Force o32 fp64 support in this case by also selecting
MIPS_O32_FP64_SUPPORT from CPU_MIPS64_R6 if 32BIT.
Fixes: 4e9d324d42 ("MIPS: Require O32 FP64 support for MIPS64 with O32 compat")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Paul Burton <paul.burton@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15310/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d09c5373e8 upstream.
Commit fd2d2b191f ("s390: get_user() should zero on failure")
intended to fix s390's get_user() implementation which did not zero
the target operand if the read from user space faulted. Unfortunately
the patch has no effect: the corresponding inline assembly specifies
that the operand is only written to ("=") and the previous value is
discarded.
Therefore the compiler is free to and actually does omit the zero
initialization.
To fix this, simply change the constraint modifier to "+", so the
compiler cannot omit the initialization anymore.
Fixes: c9ca78415a ("s390/uaccess: provide inline variants of get_user/put_user")
Fixes: fd2d2b191f ("s390: get_user() should zero on failure")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d82c0d12c9 upstream.
Reorder the operations in decompress_kernel() to ensure initrd is moved
to a safe location before the bss section is zeroed.
During decompression bss can overlap with the initrd and this can
corrupt the initrd contents depending on the size of the compressed
kernel (which affects where the initrd is placed by the bootloader) and
the size of the bss section of the decompressor.
Also use the correct initrd size when checking for overlaps with
parmblock.
Fixes: 06c0dd72ae ([S390] fix boot failures with compressed kernels)
Reviewed-by: Joy Latten <joy.latten@canonical.com>
Reviewed-by: Vineetha HariPai <vineetha.hari.pai@canonical.com>
Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b83878dd7 upstream.
When __pa is applied to virtual address in uncached KSEG region the
result is incorrect. Fix it by checking if the original address is in
the uncached KSEG and adjusting the result. It looks better than masking
off bits because pfn_valid would correctly work with new __pa results
and it may be made working in noMMU case, once we get definition for
uncached memory view.
This is required for the dma_common_mmap and DMA debug code to work
correctly: they both indirectly use __pa with coherent DMA addresses.
In case of DMA debug the visible effect is false reports that an address
mapped for DMA is accessed by CPU.
Tested-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 921d701e6f upstream.
Make sure to reserve the boot memory for the flattened device tree.
Otherwise it might get overwritten, e.g. when initial_boot_params is
copied, leading to a corrupted FDT and a boot hang/crash:
bootconsole [early0] enabled
Early console on uart16650 initialized at 0xf8001600
OF: fdt: Error -11 processing FDT
Kernel panic - not syncing: setup_cpuinfo: No CPU found in devicetree!
---[ end Kernel panic - not syncing: setup_cpuinfo: No CPU found in devicetree!
Guenter Roeck says:
> I think I found the problem. In unflatten_and_copy_device_tree(), with added
> debug information:
>
> OF: fdt: initial_boot_params=c861e400, dt=c861f000 size=28874 (0x70ca)
>
> ... and then initial_boot_params is copied to dt, which results in corrupted
> fdt since the memory overlaps. Looks like the initial_boot_params memory
> is not reserved and (re-)allocated by early_init_dt_alloc_memory_arch().
Reported-by: Guenter Roeck <linux@roeck-us.net>
Reference: http://lkml.kernel.org/r/20170226210338.GA19476@roeck-us.net
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4749228f02 upstream.
In crc32c_vpmsum() we call enable_kernel_altivec() without first
disabling preemption, which is not allowed:
WARNING: CPU: 9 PID: 2949 at ../arch/powerpc/kernel/process.c:277 enable_kernel_altivec+0x100/0x120
Modules linked in: dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio libcrc32c vmx_crypto ...
CPU: 9 PID: 2949 Comm: docker Not tainted 4.11.0-rc5-compiler_gcc-6.3.1-00033-g308ac7563944 #381
...
NIP [c00000000001e320] enable_kernel_altivec+0x100/0x120
LR [d000000003df0910] crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
Call Trace:
0xc138fd09 (unreliable)
crc32c_vpmsum+0x108/0x150 [crc32c_vpmsum]
crc32c_vpmsum_update+0x3c/0x60 [crc32c_vpmsum]
crypto_shash_update+0x88/0x1c0
crc32c+0x64/0x90 [libcrc32c]
dm_bm_checksum+0x48/0x80 [dm_persistent_data]
sb_check+0x84/0x120 [dm_thin_pool]
dm_bm_validate_buffer.isra.0+0xc0/0x1b0 [dm_persistent_data]
dm_bm_read_lock+0x80/0xf0 [dm_persistent_data]
__create_persistent_data_objects+0x16c/0x810 [dm_thin_pool]
dm_pool_metadata_open+0xb0/0x1a0 [dm_thin_pool]
pool_ctr+0x4cc/0xb60 [dm_thin_pool]
dm_table_add_target+0x16c/0x3c0
table_load+0x184/0x400
ctl_ioctl+0x2f0/0x560
dm_ctl_ioctl+0x38/0x50
do_vfs_ioctl+0xd8/0x920
SyS_ioctl+0x68/0xc0
system_call+0x38/0xfc
It used to be sufficient just to call pagefault_disable(), because that
also disabled preemption. But the two were decoupled in commit 8222dbe21e
("sched/preempt, mm/fault: Decouple preemption from the page fault
logic") in mid 2015.
So add the missing preempt_disable/enable(). We should also call
disable_kernel_fp(), although it does nothing by default, there is a
debug switch to make it active and all enables should be paired with
disables.
Fixes: 6dd7a82cc5 ("crypto: powerpc - Add POWER8 optimised crc32c")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48fe9e9488 upstream.
In the past, there was only one load-with-reservation instruction,
lwarx, and if a program attempted a lwarx on a misaligned address, it
would take an alignment interrupt and the kernel handler would emulate
it as though it was lwzx, which was not really correct, but benign since
it is loading the right amount of data, and the lwarx should be paired
with a stwcx. to the same address, which would also cause an alignment
interrupt which would result in a SIGBUS being delivered to the process.
We now have 5 different sizes of load-with-reservation instruction. Of
those, lharx and ldarx cause an immediate SIGBUS by luck since their
entries in aligninfo[] overlap instructions which were not fixed up, but
lqarx overlaps with lhz and will be emulated as such. lbarx can never
generate an alignment interrupt since it only operates on 1 byte.
To straighten this out and fix the lqarx case, this adds code to detect
the l[hwdq]arx instructions and return without fixing them up, resulting
in a SIGBUS being delivered to the process.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f5f525d5b upstream.
When the kernel is compiled to use 64bit ABIv2 the _GLOBAL() macro does
not include a global entry point. A function's global entry point is
used when the function is called from a different TOC context and in the
kernel this typically means a call from a module into the vmlinux (or
vice-versa).
There are a few exported asm functions declared with _GLOBAL() and
calling them from a module will likely crash the kernel since any TOC
relative load will yield garbage.
flush_icache_range() and flush_dcache_range() are both exported to
modules, and use the TOC, so must use _GLOBAL_TOC().
Fixes: 721aeaa9fd ("powerpc: Build little endian ppc64 kernel with ABIv2")
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88b1bf7268 upstream.
Commit 4c6d9acce1 ("powerpc/mm: Add hooks for cxl") converted local
TLB invalidates to global if the cxl driver is active. This is necessary
because the CAPP snoops invalidations to forward them to the PSL on the
cxl adapter. However one path was forgotten. native_flush_hash_range()
still does local TLB invalidates, as found out the hard way recently.
This patch fixes it by following the same logic as previously: if the
cxl driver is active, the local TLB invalidates are 'upgraded' to
global.
Fixes: 4c6d9acce1 ("powerpc/mm: Add hooks for cxl")
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ed23e1bae upstream.
On Power8 & Power9 the early CPU initialisation in __init_HFSCR()
turns on HFSCR[TM] (Hypervisor Facility Status and Control Register
[Transactional Memory]), but that doesn't take into account that TM
might be disabled by CPU features, or disabled by the kernel being built
with CONFIG_PPC_TRANSACTIONAL_MEM=n.
So later in boot, when we have set up the CPU features, clear HFSCR[TM] if
the TM CPU feature has been disabled. We use CPU_FTR_TM_COMP to account
for the CONFIG_PPC_TRANSACTIONAL_MEM=n case.
Without this a KVM guest might try to use TM, even if told not to, and
cause an oops in the host kernel. Typically the oops is seen in
__kvmppc_vcore_entry() and may or may not be fatal to the host, but is
always bad news.
In practice all shipping CPU revisions do support TM, and all host
kernels we are aware of build with TM support enabled, so no one should
actually be able to hit this in the wild.
Fixes: 2a3563b023 ("powerpc: Setup in HFSCR for POWER8")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[mpe: Rewrite change log with input from Sam, add Fixes/stable]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b884a190af upstream.
The rapf copy loops in the Meta usercopy code are missing some extable
entries for HTP cores with unaligned access checking enabled, where
faults occur on the instruction immediately after the faulting access.
Add the fixup labels and extable entries for these cases so that corner
case user copy failures don't cause kernel crashes.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c0b1df88b upstream.
The fixup code to rewind the source pointer in
__asm_copy_from_user_{32,64}bit_rapf_loop() always rewound the source by
a single unit (4 or 8 bytes), however this is insufficient if the fault
didn't occur on the first load in the loop, as the source pointer will
have been incremented but nothing will have been stored until all 4
register [pairs] are loaded.
Read the LSM_STEP field of TXSTATUS (which is already loaded into a
register), a bit like the copy_to_user versions, to determine how many
iterations of MGET[DL] have taken place, all of which need rewinding.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd40eee129 upstream.
The fixup code for the copy_to_user rapf loops reads TXStatus.LSM_STEP
to decide how far to rewind the source pointer. There is a special case
for the last execution of an MGETL/MGETD, since it leaves LSM_STEP=0
even though the number of MGETLs/MGETDs attempted was 4. This uses ADDZ
which is conditional upon the Z condition flag, but the AND instruction
which masked the TXStatus.LSM_STEP field didn't set the condition flags
based on the result.
Fix that now by using ANDS which does set the flags, and also marking
the condition codes as clobbered by the inline assembly.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 563ddc1076 upstream.
Currently we try to zero the destination for a failed read from userland
in fixup code in the usercopy.c macros. The rest of the destination
buffer is then zeroed from __copy_user_zeroing(), which is used for both
copy_from_user() and __copy_from_user().
Unfortunately we fail to zero in the fixup code as D1Ar1 is set to 0
before the fixup code entry labels, and __copy_from_user() shouldn't even
be zeroing the rest of the buffer.
Move the zeroing out into copy_from_user() and rename
__copy_user_zeroing() to raw_copy_from_user() since it no longer does
any zeroing. This also conveniently matches the name needed for
RAW_COPY_USER support in a later patch.
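A minimal sketch of the resulting split (simplified; the real metag code also
handles access_ok() and the underlying assembly details):
static inline unsigned long
copy_from_user(void *to, const void __user *from, unsigned long n)
{
        unsigned long res = raw_copy_from_user(to, from, n);

        /* zero only the uncopied tail, and only in copy_from_user() */
        if (unlikely(res))
                memset(to + (n - res), 0, res);
        return res;
}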
Fixes: 373cd784d0 ("metag: Memory handling")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb8ea062a8 upstream.
When copying to userland on Meta, if any faults are encountered
immediately abort the copy instead of continuing on and repeatedly
faulting, and worse potentially copying further bytes successfully to
subsequent valid pages.
Fixes: 373cd784d0 ("metag: Memory handling")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2257211942 upstream.
Fix the error checking of the alignment adjustment code in
raw_copy_from_user(), which mistakenly considers it safe to skip the
error check when aligning the source buffer on a 2 or 4 byte boundary.
If the destination buffer was unaligned it may have started to copy
using byte or word accesses, which could well be at the start of a new
(valid) source page. This would result in it appearing to have copied 1
or 2 bytes at the end of the first (invalid) page rather than none at
all.
Fixes: 373cd784d0 ("metag: Memory handling")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: linux-metag@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d77facb884 upstream.
A use-after-free was found using KASAN. In brcmf_p2p_del_if() the virtual
interface is removed using a call to brcmf_remove_interface(). After that
the virtual interface instance has been freed and should not be referenced.
Solve this by storing the nl80211 iftype in a local variable, which is used
in a couple of places anyway.
Reported-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d65f82954 upstream.
When internal mac80211 TXQs are supported, netdev queues must
always start out started, even when driver queues are stopped
while the interface is added. This is necessary because with the
internal TXQ support netdev queues are never stopped and packet
scheduling/dropping is done in mac80211.
Fixes: 80a83cfc43 ("mac80211: skip netdev queue control with software queuing")
Reported-and-tested-by: Sven Eckelmann <sven.eckelmann@openmesh.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3dd09d5a85 upstream.
When punching past EOF on XFS, fallocate(mode=PUNCH_HOLE|KEEP_SIZE) will
round the file size up to the nearest multiple of PAGE_SIZE:
calvinow@vm-disks/generic-xfs-1 ~$ dd if=/dev/urandom of=test bs=2048 count=1
calvinow@vm-disks/generic-xfs-1 ~$ stat test
Size: 2048 Blocks: 8 IO Block: 4096 regular file
calvinow@vm-disks/generic-xfs-1 ~$ fallocate -n -l 2048 -o 2048 -p test
calvinow@vm-disks/generic-xfs-1 ~$ stat test
Size: 4096 Blocks: 8 IO Block: 4096 regular file
Commit 3c2bdc912a ("xfs: kill xfs_zero_remaining_bytes") replaced
xfs_zero_remaining_bytes() with calls to iomap helpers. The new helpers
don't enforce that [pos,offset) lies strictly on [0,i_size) when being
called from xfs_free_file_space(), so by "leaking" these ranges into
xfs_zero_range() we get this buggy behavior.
Fix this by reintroducing the checks xfs_zero_remaining_bytes() did
against i_size at the bottom of xfs_free_file_space().
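Roughly the shape of the reintroduced check (a sketch, not the exact code):
if (offset >= XFS_ISIZE(ip))
        return 0;                       /* nothing to zero beyond EOF */
if (offset + len > XFS_ISIZE(ip))
        len = XFS_ISIZE(ip) - offset;   /* clamp the zeroing to EOF */
error = xfs_zero_range(ip, offset, len, NULL);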
Reported-by: Aaron Gao <gzh@fb.com>
Fixes: 3c2bdc912a ("xfs: kill xfs_zero_remaining_bytes")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cefdc26e86 upstream.
Without this fix (and another to the userspace component itself
described later), the kernel will be unable to process any OrangeFS
requests after the userspace component is restarted (due to a crash or
at the administrator's behest).
The bug here is that inside orangefs_remount, the orangefs_request_mutex
is locked. When the userspace component restarts while the filesystem
is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device,
which causes the kernel to send it a few requests aimed at synchronizing
the state between the two. While this is happening the
orangefs_request_mutex is locked to prevent any other requests going
through.
This is only half of the bugfix. The other half is in the userspace
component which outright ignores(!) requests made before it considers
the filesystem remounted, which is after the ioctl returns. Of course
the ioctl doesn't return until after the userspace component responds to
the request it ignores. The userspace component has been changed to
allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status.
Mike Marshall says:
"I've tested this patch against the fixed userspace part. This patch is
real important, I hope it can make it into 4.11...
Here's what happens when the userspace daemon is restarted, without
the patch:
=============================================
[ INFO: possible recursive locking detected ]
[ 4.10.0-00007-ge98bdb3 #1 Not tainted ]
---------------------------------------------
pvfs2-client-co/29032 is trying to acquire lock:
(orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs]
but task is already holding lock:
(orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs]
CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb3 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
Call Trace:
__lock_acquire+0x7eb/0x1290
lock_acquire+0xe8/0x1d0
mutex_lock_killable_nested+0x6f/0x6e0
service_operation+0x3c7/0x7b0 [orangefs]
orangefs_remount+0xea/0x150 [orangefs]
dispatch_ioctl_command+0x227/0x330 [orangefs]
orangefs_devreq_ioctl+0x29/0x70 [orangefs]
do_vfs_ioctl+0xa3/0x6e0
SyS_ioctl+0x79/0x90"
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
Acked-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b334e19ae9 upstream.
In commit a76bcf557e ("Kbuild: enable -Wmaybe-uninitialized warning
for "make W=1""), I reverted another change that happened to fix a problem
with old compilers, and now we get this report again with old compilers
(prior to gcc-4.8) and GCOV enabled:
cc1: warnings being treated as errors
drivers/gpu/drm/i915/intel_ringbuffer.c: In function 'intel_ring_setup_status_page':
drivers/gpu/drm/i915/intel_ringbuffer.c:438: error: 'mmio.reg' may be used uninitialized in this function
At top level:
>> cc1: error: unrecognized command line option "-Wno-maybe-uninitialized"
The problem is that we turn off the warning conditionally in a number
of places as we should, but one of them does it unconditionally.
Instead, change it to call cc-disable-warning as we do elsewhere.
The original patch that caused it was merged into linux-4.7, then
4.8 removed the change and 4.9 brought it back, so we probably want
a backport to 4.9 once this is merged.
Use a ':=' assignment instead of '=' to force the cc-disable-warning
call to only be evaluated once instead of every time.
Fixes: a76bcf557e ("Kbuild: enable -Wmaybe-uninitialized warning for "make W=1"")
Fixes: e72e2dfe7c ("gcov: disable -Wmaybe-uninitialized warning")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f1a880a93b upstream.
If the hash tree itself is sufficiently corrupt in addition to data blocks,
it's possible for error correction to end up in a deep recursive loop,
which eventually causes a kernel panic. This change limits the
recursion to a reasonable level during a single I/O operation.
Fixes: a739ff3f54 ("dm verity: add support for forward error correction")
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5402e97af6 upstream.
In PT_SEIZED + LISTEN mode STOP/CONT signals cause a wakeup against
__TASK_TRACED. If this races with the ptrace_unfreeze_traced at the end
of a PTRACE_LISTEN, this can wake the task /after/ the check against
__TASK_TRACED, but before the reset of state to TASK_TRACED. This
causes it to instead clobber TASK_WAKING, allowing a subsequent wakeup
against TRACED while the task is still on the rq wake_list, corrupting
it.
Oleg said:
"The kernel can crash or this can lead to other hard-to-debug problems.
In short, "task->state = TASK_TRACED" in ptrace_unfreeze_traced()
assumes that nobody else can wake it up, but PTRACE_LISTEN breaks the
contract. Obviously it is very wrong to manipulate task->state if this
task is already running, or WAKING, or it sleeps again"
[akpm@linux-foundation.org: coding-style fixes]
Fixes: 9899d11f ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL")
Link: http://lkml.kernel.org/r/xm26y3vfhmkp.fsf_-_@bsegall-linux.mtv.corp.google.com
Signed-off-by: Ben Segall <bsegall@google.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3ef5520c1 upstream.
We got the following use-after-free KASAN report:
BUG: KASAN: use-after-free in wiphy_resume+0x591/0x5a0 [cfg80211]
at addr ffff8803fc244090
Read of size 8 by task kworker/u16:24/2587
CPU: 6 PID: 2587 Comm: kworker/u16:24 Tainted: G B 4.9.13-debug+
Hardware name: Dell Inc. XPS 15 9550/0N7TVV, BIOS 1.2.19 12/22/2016
Workqueue: events_unbound async_run_entry_fn
ffff880425d4f9d8 ffffffffaeedb541 ffff88042b80ef00 ffff8803fc244088
ffff880425d4fa00 ffffffffae84d7a1 ffff880425d4fa98 ffff8803fc244080
ffff88042b80ef00 ffff880425d4fa88 ffffffffae84da3a ffffffffc141f7d9
Call Trace:
[<ffffffffaeedb541>] dump_stack+0x85/0xc4
[<ffffffffae84d7a1>] kasan_object_err+0x21/0x70
[<ffffffffae84da3a>] kasan_report_error+0x1fa/0x500
[<ffffffffc141f7d9>] ? cfg80211_bss_age+0x39/0xc0 [cfg80211]
[<ffffffffc141f83a>] ? cfg80211_bss_age+0x9a/0xc0 [cfg80211]
[<ffffffffae48d46d>] ? trace_hardirqs_on+0xd/0x10
[<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211]
[<ffffffffae84def1>] __asan_report_load8_noabort+0x61/0x70
[<ffffffffc13fb100>] ? wiphy_suspend+0xbb0/0xc70 [cfg80211]
[<ffffffffc13fb751>] ? wiphy_resume+0x591/0x5a0 [cfg80211]
[<ffffffffc13fb751>] wiphy_resume+0x591/0x5a0 [cfg80211]
[<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211]
[<ffffffffaf3b206e>] dpm_run_callback+0x6e/0x4f0
[<ffffffffaf3b31b2>] device_resume+0x1c2/0x670
[<ffffffffaf3b367d>] async_resume+0x1d/0x50
[<ffffffffae3ee84e>] async_run_entry_fn+0xfe/0x610
[<ffffffffae3d0666>] process_one_work+0x716/0x1a50
[<ffffffffae3d05c9>] ? process_one_work+0x679/0x1a50
[<ffffffffafdd7b6d>] ? _raw_spin_unlock_irq+0x3d/0x60
[<ffffffffae3cff50>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0
[<ffffffffae3d1a80>] worker_thread+0xe0/0x1460
[<ffffffffae3d19a0>] ? process_one_work+0x1a50/0x1a50
[<ffffffffae3e54c2>] kthread+0x222/0x2e0
[<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
[<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
[<ffffffffae3e52a0>] ? kthread_park+0x80/0x80
[<ffffffffafdd86aa>] ret_from_fork+0x2a/0x40
Object at ffff8803fc244088, in cache kmalloc-1024 size: 1024
Allocated:
PID = 71
save_stack_trace+0x1b/0x20
save_stack+0x46/0xd0
kasan_kmalloc+0xad/0xe0
kasan_slab_alloc+0x12/0x20
__kmalloc_track_caller+0x134/0x360
kmemdup+0x20/0x50
brcmf_cfg80211_attach+0x10b/0x3a90 [brcmfmac]
brcmf_bus_start+0x19a/0x9a0 [brcmfmac]
brcmf_pcie_setup+0x1f1a/0x3680 [brcmfmac]
brcmf_fw_request_nvram_done+0x44c/0x11b0 [brcmfmac]
request_firmware_work_func+0x135/0x280
process_one_work+0x716/0x1a50
worker_thread+0xe0/0x1460
kthread+0x222/0x2e0
ret_from_fork+0x2a/0x40
Freed:
PID = 2568
save_stack_trace+0x1b/0x20
save_stack+0x46/0xd0
kasan_slab_free+0x71/0xb0
kfree+0xe8/0x2e0
brcmf_cfg80211_detach+0x62/0xf0 [brcmfmac]
brcmf_detach+0x14a/0x2b0 [brcmfmac]
brcmf_pcie_remove+0x140/0x5d0 [brcmfmac]
brcmf_pcie_pm_leave_D3+0x198/0x2e0 [brcmfmac]
pci_pm_resume+0x186/0x220
dpm_run_callback+0x6e/0x4f0
device_resume+0x1c2/0x670
async_resume+0x1d/0x50
async_run_entry_fn+0xfe/0x610
process_one_work+0x716/0x1a50
worker_thread+0xe0/0x1460
kthread+0x222/0x2e0
ret_from_fork+0x2a/0x40
Memory state around the buggy address:
ffff8803fc243f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8803fc244000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8803fc244080: fc fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8803fc244100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8803fc244180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
What is happening is that brcmf_pcie_resume() detects a device that
is no longer responsive and it decides to unbind resulting in a
wiphy_unregister() and wiphy_free() call. Now the wiphy instance
remains allocated, because PM needs to call wiphy_resume() for it.
However, brcmfmac already does a kfree() for the struct
cfg80211_registered_device::ops field. Change the checks in
wiphy_resume() to only access the struct cfg80211_registered_device::ops
if the wiphy instance is still registered at this time.
Reported-by: Daniel J Blueman <daniel@quora.org>
Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com>
Reviewed-by: Franky Lin <franky.lin@broadcom.com>
Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09a6adf53d upstream.
After commit 52d7523 (arm64: mm: allow the kernel to handle alignment
faults on user accesses), user-land accesses that produce unaligned
exceptions, e.g. aarch32 ldm/stm/ldrd/strd instructions operating on
unaligned memory, are reported to user-land as SIGSEGV. This is wrong;
they should be reported as SIGBUS, as they were before commit 52d7523.
Change the do_bad_area function to take the signal and code parameters
from the esr value using the fault_info table, so that in the
do_alignment_fault case user-land receives SIGBUS. Wrap access to the
fault_info table in an esr_to_fault_info function.
Fixes: 52d7523 (arm64: mm: allow the kernel to handle alignment faults on user accesses)
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4bdc902968 upstream.
The gyroscope chip might need to be reset to be used.
Without the chip being reset, the driver stopped at the first
regmap_read (to get the CHIP_ID) and failed to probe.
The datasheet of the gyroscope says that a minimum wait of 30ms after
the reset has to be done.
This patch has been checked on a BMX055 and the datasheet of the BMG160
and the BMI055 give the same reset register and bits.
Signed-off-by: Quentin Schulz <quentin.schulz@free-electrons.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b3405e345 upstream.
In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g, munmap on a memslot) trying to
unmap a range. And since we have to unmap the entire Guest memory range
holding a spinlock, make sure we yield the lock if necessary, after we
unmap each PUD range.
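The yield boils down to something of this shape after each PUD range (a sketch
of the pattern, assuming the usual cond_resched_lock() helper):
if (next != end)
        cond_resched_lock(&kvm->mmu_lock);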
Fixes: commit d5d8184d35 ("KVM: ARM: Memory virtualization setup")
Cc: Paolo Bonzini <pbonzin@redhat.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
[ Avoid vCPU starvation and lockup detector warnings ]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72f310481a upstream.
We don't hold the mmap_sem while searching for VMAs (via find_vma), in
kvm_arch_prepare_memory_region, which can end up in unexpected failures.
Fixes: commit 8eef91239e ("arm/arm64: KVM: map MMIO regions at creation time")
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Eric Auger <eric.auger@rehat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
[ Handle dirty page logging failure case ]
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90f6e150e4 upstream.
We don't hold the mmap_sem while searching for the VMAs when
we try to unmap each memslot for a VM. Fix this properly to
avoid unexpected results.
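The fix amounts to bracketing the VMA walk like this (a simplified sketch):
down_read(&current->mm->mmap_sem);
/* ... walk the memslots, find_vma() and unmap as before ... */
up_read(&current->mm->mmap_sem);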
Fixes: commit 957db105c9 ("arm/arm64: KVM: Introduce stage2_unmap_vm")
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97fbfef6bd upstream.
vfs_llseek() checks whether the file mode has FMODE_LSEEK and, if not,
returns failure. But ashmem files can be lseek'd, so add FMODE_LSEEK to
the ashmem file.
Comment From Greg Hackmann:
ashmem_llseek() passes the llseek() call through to the backing
shmem file. 91360b02ab ("ashmem: use vfs_llseek()") changed
this from directly calling the file's llseek() op into a VFS
layer call. This also adds a check for the FMODE_LSEEK bit, so
without that bit ashmem_llseek() now always fails with -ESPIPE.
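The shape of the fix is a one-liner of this form (a sketch; the variable name
and exactly where the flag is set on the backing file are simplified here):
vmfile->f_mode |= FMODE_LSEEK;  /* backing shmem file; name illustrative */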
Fixes: 91360b02ab ("ashmem: use vfs_llseek()")
Signed-off-by: Shuxiao Zhang <zhangshuxiao@xiaomi.com>
Tested-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8a139d001 upstream.
ops->show() can return a negative error code.
Commit 65da3484d9 ("sysfs: correctly handle short reads on PREALLOC attrs.")
(in v4.4) caused this to be stored in an unsigned 'size_t' variable, so errors
would look like large numbers.
As a result, if an error is returned, sysfs_kf_read() will return the
value of 'count', typically 4096.
Commit 17d0774f80 ("sysfs: correctly handle read offset on PREALLOC attrs")
(in v4.8) extended this error to use the unsigned large 'len' as a size for
memmove().
Consequently, if ->show returns an error, then the first read() on the
sysfs file will return 4096 and could return uninitialized memory to
user-space.
If the application performs a subsequent read, this will trigger a memmove()
with extremely large count, and is likely to crash the machine in bizarre ways.
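A minimal illustration of the sign/width mix-up described above (variable and
call names are illustrative, not the actual sysfs code):
ssize_t ret = ops->show(kobj, attr, buf);  /* may return e.g. -ENODEV */
size_t count = ret;                        /* negative error becomes a huge size_t */
/* ... on a later read at offset 'pos' ... */
memmove(buf, buf + pos, count - pos);      /* enormous length */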
This bug can currently only be triggered by reading from an md
sysfs attribute declared with __ATTR_PREALLOC() during the
brief period between when mddev_put() deletes an mddev from
the ->all_mddevs list, and when mddev_delayed_delete() - which is
scheduled on a workqueue - completes.
Before this window, an error won't be returned by ->show();
after it, ->show() won't be called.
I can reproduce it reliably only by putting a delay like
usleep_range(500000,700000);
early in mddev_delayed_delete(). Then, after creating an md device
md0, run
echo clear > /sys/block/md0/md/array_state; cat /sys/block/md0/md/array_state
The bug can be triggered without the usleep.
Fixes: 65da3484d9 ("sysfs: correctly handle short reads on PREALLOC attrs.")
Fixes: 17d0774f80 ("sysfs: correctly handle read offset on PREALLOC attrs")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7e11f9956 upstream.
In vmw_surface_define_ioctl(), 'num_sizes' is the sum of the
'req->mip_levels' array. This array can be assigned any value from
user space. As both 'num_sizes' and the array entries are uint32_t,
it is easy to make 'num_sizes' overflow. 'mip_levels' is later used
as the loop count, which can lead to an out-of-bounds write. Add a
check of 'req->mip_levels' to avoid this.
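A hedged sketch of the kind of bound check described (treat the exact limit
macros as assumptions borrowed from the vmwgfx uapi conventions):
for (i = 0; i < DRM_VMW_MAX_SURFACE_FACES; ++i)
        if (req->mip_levels[i] > DRM_VMW_MAX_MIP_LEVELS)
                return -EINVAL;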
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 53e16798b0 upstream.
The mesa winsys sometimes uses unimplemented parameter requests to
check for features. Remove the error message to avoid bloating the
kernel log.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe25deb773 upstream.
Previously, when a surface was opened using a legacy (non prime) handle,
it was verified to have been created by a client in the same master realm.
Relax this so that opening is also allowed recursively if the client
already has the surface open.
This works around a regression in svga mesa where opening of a shared
surface is used recursively to obtain surface information.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63774069d9 upstream.
In vmw_get_cap_3d_ioctl(), a user can supply 0 for a size that is
used in vzalloc(). This eventually calls dump_stack() (in warn_alloc()),
which can leak useful addresses to dmesg.
Add a check to avoid a size of 0.
Signed-off-by: Murray McAllister <murray.mcallister@insomniasec.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36274ab8c5 upstream.
Before memory allocations vmw_surface_define_ioctl() checks the
upper-bounds of a user-supplied size, but does not check if the
supplied size is 0.
Add a check to avoid NULL pointer dereferences.
Signed-off-by: Murray McAllister <murray.mcallister@insomniasec.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f7652afa8e upstream.
A malicious caller could otherwise hand over handles to other objects
causing all sorts of interesting problems.
Testing done: Ran a Fedora 25 desktop using both Xorg and
gnome-shell/Wayland.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a69645dde upstream.
Usually every parallel port will have a single pardev registered with
it. But the ppdev driver is an exception. This userspace parallel port
driver allows creating multiple parallel port devices for a single
parallel port. As a result we were getting a big warning like:
"sysfs: cannot create duplicate filename '/devices/parport0/ppdev0.0'".
And with that, many parallel port printers stopped working.
We have been using the minor number as the id field while registering
a parallel port device with a parallel port. But when there are
multiple parallel port devices for one single parallel port, they all
tried to register with the same name, like 'pardev0.0', and everything
started failing.
Use an incremented index as the id instead of the minor number.
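One way to implement the incremented index is with an ida, roughly as follows
(a sketch under that assumption, not necessarily the exact change):
static DEFINE_IDA(ida_index);

index = ida_simple_get(&ida_index, 0, 0, GFP_KERNEL);
/* register the pardevice as "ppdev%d.%d" using 'index' instead of the minor */
/* ... and on release: */
ida_simple_remove(&ida_index, index);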
Fixes: 8b7d3a9d90 ("ppdev: use new parport device model")
Cc: stable <stable@vger.kernel.org> # v4.9+
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1414656
Bugzilla: https://bugs.archlinux.org/task/52322
Tested-by: James Feeney <james@nurealm.net>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd5c472a60 upstream.
After parport starts using the device model, all pardevice drivers
should decide in their match_port callback function if they want to
attach with that particular port. ppdev has been converted to use the
new parport device-model code but pp_attach() tried to attach with all
the ports.
Create a new array of pointers and use that to remember the ports we
have attached. And use that information to skip attaching ports which
we have already attached.
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6db28eda26 upstream.
If the device is not present, the driver should disable the queues
immediately. Prior to this, the driver was relying on the watchdog timer
to kill the queues if requests were outstanding to the device, and that
just delays removal up to one second.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f33447b90e upstream.
If a namespace has already been marked dead, we don't want to kick the
request_queue again since we may have just freed it from another thread.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de5540d088 upstream.
Under extremely heavy uses of padata, crashes occur, and with list
debugging turned on, this happens instead:
[87487.298728] WARNING: CPU: 1 PID: 882 at lib/list_debug.c:33
__list_add+0xae/0x130
[87487.301868] list_add corruption. prev->next should be next
(ffffb17abfc043d0), but was ffff8dba70872c80. (prev=ffff8dba70872b00).
[87487.339011] [<ffffffff9a53d075>] dump_stack+0x68/0xa3
[87487.342198] [<ffffffff99e119a1>] ? console_unlock+0x281/0x6d0
[87487.345364] [<ffffffff99d6b91f>] __warn+0xff/0x140
[87487.348513] [<ffffffff99d6b9aa>] warn_slowpath_fmt+0x4a/0x50
[87487.351659] [<ffffffff9a58b5de>] __list_add+0xae/0x130
[87487.354772] [<ffffffff9add5094>] ? _raw_spin_lock+0x64/0x70
[87487.357915] [<ffffffff99eefd66>] padata_reorder+0x1e6/0x420
[87487.361084] [<ffffffff99ef0055>] padata_do_serial+0xa5/0x120
padata_reorder calls list_add_tail with the list to which it's adding
locked, which seems correct:
spin_lock(&squeue->serial.lock);
list_add_tail(&padata->list, &squeue->serial.list);
spin_unlock(&squeue->serial.lock);
This therefore leaves only one place where such an inconsistency could
occur: if padata->list is added at the same time on two different
threads. This padata pointer comes from the function call to
padata_get_next(pd), which has in it the following block:
next_queue = per_cpu_ptr(pd->pqueue, cpu);
padata = NULL;
reorder = &next_queue->reorder;
if (!list_empty(&reorder->list)) {
        padata = list_entry(reorder->list.next,
                            struct padata_priv, list);
        spin_lock(&reorder->lock);
        list_del_init(&padata->list);
        atomic_dec(&pd->reorder_objects);
        spin_unlock(&reorder->lock);
        pd->processed++;
        goto out;
}
out:
return padata;
I strongly suspect that the problem here is that two threads can race
on the reorder list. Even though the deletion is locked, the call to
list_entry is not locked, which means it's feasible that two threads
pick up the same padata object and subsequently call list_add_tail on
it at the same time. The fix is thus to hoist that lock outside of
that block.
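A sketch of the hoisted lock (simplified from the description above):
spin_lock(&reorder->lock);
if (!list_empty(&reorder->list)) {
        padata = list_entry(reorder->list.next,
                            struct padata_priv, list);
        list_del_init(&padata->list);
        atomic_dec(&pd->reorder_objects);
        pd->processed++;
        spin_unlock(&reorder->lock);
        goto out;
}
spin_unlock(&reorder->lock);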
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f5fe1b5190 upstream.
Commit 79bd99596b ("blk: improve order of bio handling in generic_make_request()")
changed current->bio_list so that it did not contain *all* of the
queued bios, but only those submitted by the currently running
make_request_fn.
There are two places which walk the list and requeue selected bios,
and others that check if the list is empty. These are no longer
correct.
So redefine current->bio_list to point to an array of two lists, which
contain all queued bios, and adjust various code to test or walk both
lists.
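In rough terms the new layout looks like this (a simplified sketch of the idea,
not the exact generic_make_request() code):
struct bio_list bio_list_on_stack[2];

/* [0]: bios submitted by the ->make_request_fn currently running */
/* [1]: bios that were already queued before it was entered */
current->bio_list = bio_list_on_stack;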
Signed-off-by: NeilBrown <neilb@suse.com>
Fixes: 79bd99596b ("blk: improve order of bio handling in generic_make_request()")
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79bd99596b upstream.
To avoid recursion on the kernel stack when stacked block devices
are in use, generic_make_request() will, when called recursively,
queue new requests for later handling. They will be handled when the
make_request_fn for the current bio completes.
If any bios are submitted by a make_request_fn, these will ultimately
be handled sequentially. If the handling of one of those generates
further requests, they will be added to the end of the queue.
This strict first-in-first-out behaviour can lead to deadlocks in
various ways, normally because a request might need to wait for a
previous request to the same device to complete. This can happen when
they share a mempool, and can happen due to interdependencies
particular to the device. Both md and dm have examples where this happens.
These deadlocks can be eradicated by more selective ordering of bios.
Specifically by handling them in depth-first order. That is: when the
handling of one bio generates one or more further bios, they are
handled immediately after the parent, before any siblings of the
parent. That way, when generic_make_request() calls make_request_fn
for some particular device, we can be certain that all previously
submitted requests for that device have been completely handled and are
not waiting for anything in the queue of requests maintained in
generic_make_request().
An easy way to achieve this would be to use a last-in-first-out stack
instead of a queue. However this will change the order of consecutive
bios submitted by a make_request_fn, which could have unexpected consequences.
Instead we take a slightly more complex approach.
A fresh queue is created for each call to a make_request_fn. After it completes,
any bios for a different device are placed on the front of the main queue, followed
by any bios for the same device, followed by all bios that were already on
the queue before the make_request_fn was called.
This provides the depth-first approach without reordering bios on the same level.
This, by itself, is not enough to remove all deadlocks. It just makes
it possible for drivers to take the extra step required themselves.
To avoid deadlocks, drivers must never risk waiting for a request
after submitting one to generic_make_request. This includes never
allocating from a mempool twice in the one call to a make_request_fn.
A common pattern in drivers is to call bio_split() in a loop, handling
the first part and then looping around to possibly split the next part.
Instead, a driver that finds it needs to split a bio should queue
(with generic_make_request) the second part, handle the first part,
and then return. The new code in generic_make_request will ensure the
requests to underlying bios are processed first, then the second bio
that was split off. If it splits again, the same process happens. In
each case one bio will be completely handled before the next one is attempted.
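As a sketch of that driver pattern (illustrative only; 'max_sectors' stands in
for whatever limit the driver enforces):
if (bio_sectors(bio) > max_sectors) {
        struct bio *split = bio_split(bio, max_sectors,
                                      GFP_NOIO, fs_bio_set);

        bio_chain(split, bio);
        generic_make_request(bio);      /* queue the remainder */
        bio = split;                    /* handle only the first part now */
}
/* ... process 'bio' and return ... */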
With this in place, it should be possible to disable the
punt_bios_to_recover() recovery thread for many block devices, and
eventually it may be possible to remove it completely.
Ref: http://www.spinics.net/lists/raid/msg54680.html
Tested-by: Jinpu Wang <jinpu.wang@profitbricks.com>
Inspired-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Cc: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cefabdaf7 upstream.
Commit 0a6b76dd23 ("mm: workingset: make shadow node shrinker memcg
aware") enabled cgroup-awareness in the shadow node shrinker, but forgot
to also enable cgroup-awareness in the list_lru the shadow nodes sit on.
Consequently, all shadow nodes are sitting on a global (per-NUMA node)
list, while the shrinker applies the limits according to the amount of
cache in the cgroup it's shrinking. The result is excessive pressure on
the shadow nodes from cgroups that have very little cache.
Enable memcg-mode on the shadow node LRUs, such that per-cgroup limits
are applied to per-cgroup lists.
Fixes: 0a6b76dd23 ("mm: workingset: make shadow node shrinker memcg aware")
Link: http://lkml.kernel.org/r/20170322005320.8165-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vladimir Davydov <vdavydov@tarantool.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6c356eda22 upstream.
With the IRQ stack changes integrated, the XRX200 devices started
emitting a constant stream of kernel messages like this:
[ 565.415310] Spurious IRQ: CAUSE=0x1100c300
This is caused by IP0 getting handled by plat_irq_dispatch() rather than
its vectored interrupt handler, which is fixed by commit de856416e714
("MIPS: IRQ Stack: Fix erroneous jal to plat_irq_dispatch").
Fix plat_irq_dispatch() to handle non-vectored IPI interrupts correctly
by setting up IP2-6 as proper chained IRQ handlers and calling do_IRQ
for all MIPS CPU interrupts.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: John Crispin <john@phrozen.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15077/
[james.hogan@imgtec.com: tweaked commit message]
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c2bf9f959 upstream.
GIC_PPI flags were misconfigured for the timers, resulting in errors
like:
[ 0.000000] GIC: PPI11 is secure or misconfigured
Changing them to being edge triggered corrects the issue
Suggested-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Jon Mason <jon.mason@broadcom.com>
Fixes: d27509f1 ("ARM: BCM5301X: add dts files for BCM4708 SoC")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09f3510fb7 upstream.
Since early BCM5301X days we had an abort handler that was removed by
commit 937b12306e ("ARM: BCM5301X: remove workaround imprecise abort
fault handler"). It assumed we need to deal only with pending aborts
left by the bootloader. Unfortunately this isn't true for BCM5301X.
When probing PCI config space (device enumeration) it is expected to
have master aborts on the PCI bus. Most bridges don't forward (or they
allow disabling it) these errors onto the AXI/AMBA bus but not the
Northstar (BCM5301X) one.
The iProc PCIe controller on Northstar seems to be an older one, without
a control register for error forwarding. This means we need to work around
this at the platform level. Newer platforms are not affected by this
issue.
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3cd1b064f upstream.
The fence allocation needs to be protected by the GPU mutex, otherwise
the fence seqnos of concurrent submits might not match the insertion order
of the jobs in the kernel ring. This breaks the assumption that jobs
complete with monotonically increasing fence seqnos.
Fixes: d985349017 ("drm/etnaviv: take GPU lock later in the submit process")
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce4b4f228e upstream.
We were accidentally only overriding the first VRAM placement. For BOs
with the RADEON_GEM_NO_CPU_ACCESS flag set,
radeon_ttm_placement_from_domain creates a second VRAM placement with
fpfn == 0. If VRAM is almost full, the first VRAM placement with
fpfn > 0 may not work, but the second one with fpfn == 0 always will
(the BO's current location trivially satisfies it). Because "moving"
the BO to its current location puts it back on the LRU list, this
results in an infinite loop.
Fixes: 2a85aedd11 ("drm/radeon: Try evicting from CPU accessible to
inaccessible VRAM first")
Reported-by: Zachary Michaels <zmichaels@oblong.com>
Reported-and-Tested-by: Julien Isorce <jisorce@oblong.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90db10434b upstream.
No caller currently checks the return value of
kvm_io_bus_unregister_dev(). This is evil, as all callers silently go on
freeing their device. A stale reference will remain in the io_bus,
getting used again, at the latest when the iobus gets torn down on
kvm_destroy_vm() - leading to use-after-free errors.
There is nothing the callers could do, except retrying over and over
again.
So let's simply remove the bus altogether, print an error and make
sure no one can access this broken bus again (returning -ENOMEM on any
attempt to access it).
Fixes: e93f8a0f82 ("KVM: convert io_bus to SRCU")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df630b8c1e upstream.
When releasing the bus, let's clear the bus pointers to mark it out. If
any further device unregister happens on this bus, we know that we're
done if we found the bus being released already.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6040bc610 upstream.
The reference manual for the i.MX28 recommends to calculate the divisor
as
divisor = (UARTCLK * 32) / baud rate, rounded to the nearest integer
, so let's do this. For a typical setup of UARTCLK = 24 MHz and baud
rate = 115200 this changes the divisor from 6666 to 6667 and so the
actual baud rate improves from 115211.521 Bd (error ≅ 0.01 %) to
115194.240 Bd (error ≅ 0.005 %).
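In kernel C the rounded calculation is simply (variable names illustrative):
div = DIV_ROUND_CLOSEST(uartclk * 32, baud);    /* 24000000 * 32 / 115200 -> 6667 */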
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1633682053 upstream.
Using KASAN, Dmitry found a bug in the rh_call_control() routine: If
buffer allocation fails, the routine returns immediately without
unlinking its URB from the control endpoint, eventually leading to
linked-list corruption.
This patch fixes the problem by jumping to the end of the routine
(where the URB is unlinked) when an allocation failure occurs.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 497e1e16f4 upstream.
A side effect of 89d8232411 ("tty/serial: atmel_serial: BUG: stop DMA
from transmitting in stop_tx") is that the console can be called with
TX path disabled. Then the system would hang trying to push characters
out in atmel_console_putchar().
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Fixes: 89d8232411 ("tty/serial: atmel_serial: BUG: stop DMA from transmitting in stop_tx")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31ca2c63fd upstream.
If uart_flush_buffer() is called between atmel_tx_dma() and
atmel_complete_tx_dma(), the circular buffer has been cleared, but not
atmel_port->tx_len.
That leads to a circular buffer overflow (dumping (UART_XMIT_SIZE -
atmel_port->tx_len) bytes).
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08f63d9774 upstream.
No platform-device is required for IO(x)APICs, so don't even
create them.
[ rjw: This fixes a problem with leaking platform device objects
after IOAPIC/IOxAPIC hot-removal events.]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61b79e16c6 upstream.
Paul Menzel reported a warning:
WARNING: CPU: 0 PID: 774 at /build/linux-ROBWaj/linux-4.9.13/kernel/trace/trace_functions_graph.c:233 ftrace_return_to_handler+0x1aa/0x1e0
Bad frame pointer: expected f6919d98, received f6919db0
from func acpi_pm_device_sleep_wake return to c43b6f9d
The warning means that function graph tracing is broken for the
acpi_pm_device_sleep_wake() function. That's because the ACPI Makefile
unconditionally sets the '-Os' gcc flag to optimize for size. That's an
issue because mcount-based function graph tracing is incompatible with
'-Os' on x86, thanks to the following gcc bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=42109
I have another patch pending which will ensure that mcount-based
function graph tracing is never used with CONFIG_CC_OPTIMIZE_FOR_SIZE on
x86.
But this patch is needed in addition to that one because the ACPI
Makefile overrides that config option for no apparent reason. It has
had this flag since the beginning of git history, and there's no related
comment, so I don't know why it's there. As far as I can tell, there's
no reason for it to be there. The appropriate behavior is for it to
honor CONFIG_CC_OPTIMIZE_FOR_{SIZE,PERFORMANCE} like the rest of the
kernel.
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 554bfeceb8 upstream.
pa_memcpy() is the major memcpy implementation in the parisc kernel which is
used to do any kind of userspace/kernel memory copies.
Al Viro noticed various bugs in the implementation of pa_memcpy(), most
notably that in case of faults it may report that it copied more bytes
than it actually did.
Fixing those bugs is quite hard in the C-implementation, because the compiler
is messing around with the registers and we are not guaranteed that specific
variables are always in the same processor registers. This makes proper fault
handling complicated.
This patch implements pa_memcpy() in assembler. That way we have correct fault
handling and adding a 64-bit copy routine was quite easy.
Runtime tested with 32- and 64bit kernels.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 476e75a44b upstream.
Commit 73580dac76 ("parisc: Fix system shutdown halt") introduced an endless
loop for systems which don't provide a software power off function. But the
soft lockup detector will detect this and report stalled CPUs after some time.
Avoid those unwanted warnings by disabling the soft lockup detector.
Fixes: 73580dac76 ("parisc: Fix system shutdown halt")
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d19f5e41b3 upstream.
Al Viro noticed that userspace accesses via get_user()/put_user() can be
simplified a lot with regard to usage of the exception handling.
This patch implements a fixup routine for get_user() and put_user() in such
that the exception handler will automatically load -EFAULT into the register
%r8 (the error value) in case on a fault on userspace. Additionally the fixup
routine will zero the target register on fault in case of a get_user() call.
The target register is extracted out of the faulting assembly instruction.
This patch brings a few benefits over the old implementation:
1. Exception handling gets much cleaner, easier and smaller in size.
2. Helper functions like fixup_get_user_skip_1 (all of fixup.S) can be dropped.
3. No need to hardcode %r9 as target register for get_user() any longer. This
helps the compiler register allocator and thus creates less assembler
statements.
4. No dependency on the exception_data contents any longer.
5. Nested faults will be handled cleanly.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e3d3e5df0 upstream.
Commit 63d63cbf5e "NFSv4.1: Don't recheck delegations that
have already been checked" introduced a regression where when a
client received BAD_STATEID error it would not send any TEST_STATEID
and instead go into an infinite loop of resending the IO that caused
the BAD_STATEID.
Fixes: 63d63cbf5e ("NFSv4.1: Don't recheck delegations that have already been checked")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0918764c1 upstream.
The controller has different timings for MMC_TIMING_UHS_DDR50 and
MMC_TIMING_MMC_DDR52. Configuring the controller with SDHCI_CTRL_UHS_DDR50,
when MMC_TIMING_MMC_DDR52 timings are requested, is not correct and can
lead to unexpected behavior.
Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Fixes: bb5f8ea4d5 ("mmc: sdhci-of-at91: introduce driver for the Atmel SDMMC")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 923713b357 upstream.
SDIO cards may need clock to send the card interrupt to the host.
On a cherrytrail tablet with a RTL8723BS wifi chip, without this patch
pinging the tablet results in:
PING 192.168.1.14 (192.168.1.14) 56(84) bytes of data.
64 bytes from 192.168.1.14: icmp_seq=1 ttl=64 time=78.6 ms
64 bytes from 192.168.1.14: icmp_seq=2 ttl=64 time=1760 ms
64 bytes from 192.168.1.14: icmp_seq=3 ttl=64 time=753 ms
64 bytes from 192.168.1.14: icmp_seq=4 ttl=64 time=3.88 ms
64 bytes from 192.168.1.14: icmp_seq=5 ttl=64 time=795 ms
64 bytes from 192.168.1.14: icmp_seq=6 ttl=64 time=1841 ms
64 bytes from 192.168.1.14: icmp_seq=7 ttl=64 time=810 ms
64 bytes from 192.168.1.14: icmp_seq=8 ttl=64 time=1860 ms
64 bytes from 192.168.1.14: icmp_seq=9 ttl=64 time=812 ms
64 bytes from 192.168.1.14: icmp_seq=10 ttl=64 time=48.6 ms
Where as with this patch I get:
PING 192.168.1.14 (192.168.1.14) 56(84) bytes of data.
64 bytes from 192.168.1.14: icmp_seq=1 ttl=64 time=3.96 ms
64 bytes from 192.168.1.14: icmp_seq=2 ttl=64 time=1.97 ms
64 bytes from 192.168.1.14: icmp_seq=3 ttl=64 time=17.2 ms
64 bytes from 192.168.1.14: icmp_seq=4 ttl=64 time=2.46 ms
64 bytes from 192.168.1.14: icmp_seq=5 ttl=64 time=2.83 ms
64 bytes from 192.168.1.14: icmp_seq=6 ttl=64 time=1.40 ms
64 bytes from 192.168.1.14: icmp_seq=7 ttl=64 time=2.10 ms
64 bytes from 192.168.1.14: icmp_seq=8 ttl=64 time=1.40 ms
64 bytes from 192.168.1.14: icmp_seq=9 ttl=64 time=2.04 ms
64 bytes from 192.168.1.14: icmp_seq=10 ttl=64 time=1.40 ms
Cc: Dong Aisheng <b29396@freescale.com>
Cc: Ian W MORRISON <ianwmorrison@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b40735969 upstream.
A previous commit (below) adds a check for already probed interfaces to
Wacom's matching heuristic. Unfortunately this causes the Bamboo Pen
(CTL-460) to match itself to its 'ghost' touch interface. After
subsequent changes to the driver this match to the ghost causes the
kernel to crash. This patch avoids calling wacom_add_shared_data()
for the BAMBOO_PEN's ghost touch interface.
Fixes: 41372d5d40 ("HID: wacom: Augment 'oVid' and 'oPid' with heuristics for HID_GENERIC")
Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1a6fe41d3 upstream.
In 'skl_tplg_set_module_init_data()', a pointer to the 'params' member of
'struct skl_algo_data' is calculated, then cast to (u32 *) and assigned
to a member of the configuration data. The configuration data is passed to
other functions and used to process Intel IPC. In this processing, the
value of the member is used to get the message data; however, this can
cause an invalid memory access in 'skl_set_module_params()' as a result
of the calculation of a pointer to the actual message data.
(sound/soc/intel/skylake/skl-topology.c)
skl_tplg_init_pipe_modules()
->skl_tplg_set_module_init_data() (has this bug)
->skl_tplg_set_module_params()
(sound/soc/intel/skylake/skl-messages.c)
->skl_set_module_params()
((char *)param) + data_offset
This commit fixes the bug.
Fixes: abb740033b ("ASoC: Intel: Skylake: Add support to configure module params")
Signed-off-by: Takashi Sakamoto <takashi.sakamoto@miraclelinux.com>
Acked-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2f726aec19 upstream.
On this Dell AIO machine, the lineout jack does not work.
We found the pin 0x1a is assigned to lineout on this machine. In
the past, we applied ALC298_FIXUP_DELL1_MIC_NO_PRESENCE to fix the
headset mic problem for this machine; this fixup redefines
the pin 0x1a to headphone-mic, and as a result the lineout doesn't
work anymore.
After consulting with Dell, they told us this machine doesn't support
a microphone via the headset jack, so we add a new fixup which only
defines the pin 0x18 as the headset-mic.
[rearranged the fixup insertion position by tiwai in order to make the
merge with other branches easier -- tiwai]
Fixes: 59ec4b57bc ("ALSA: hda - Fix headset mic detection problem for two dell machines")
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d7d54002e upstream.
When a new event is queued while processing to resize the FIFO in
snd_seq_fifo_clear(), it may lead to a use-after-free, as the old pool
that is being queued gets removed. For avoiding this race, we need to
close the pool to be deleted and sync its usage before actually
deleting it.
The issue was spotted by syzkaller.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e347b5e05 upstream.
The host bridge memory window resource is inserted into the iomem_resource
tree and cannot be deallocated until the host bridge itself is removed.
Previously, the window was on the stack, which meant the iomem_resource
entry pointed into the stack and was corrupted as soon as the probe
function returned, which caused memory corruption and errors like this:
pcie_iproc_bcma bcma0:8: resource collision: [mem 0x40000000-0x47ffffff] conflicts with PCIe MEM space [mem 0x40000000-0x47ffffff]
Move the memory window resource from the stack into struct iproc_pcie so
its lifetime matches that of the host bridge.
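The shape of the change, roughly (a simplified sketch):
struct iproc_pcie {
        /* ... existing fields ... */
        struct resource mem;    /* host bridge memory window, formerly on the probe stack */
};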
Fixes: c3245a5664 ("PCI: iproc: Request host bridge window resources")
Reported-and-tested-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7cb689fe42 upstream.
Callers of scsi_dh_activate(), e.g. dm-mpath, assume that this function
either returns an error code or calls the completion function. Make
alua_activate() call the completion function even if scsi_device_get()
fails.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Tang Junhui <tang.junhui@zte.com.cn>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9702c67c60 upstream.
The total ata xfer length may not be calculated properly, in that we do
not use the proper method to get an sg element dma length.
According to the code comment, sg_dma_len() should be used after
dma_map_sg() is called.
This issue was found by turning on the SMMUv3 in front of the hisi_sas
controller in hip07. Multiple sg elements were being combined into a
single element, but the original first element length was being used as
the total xfer length.
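For reference, the generally correct pattern (a simplified sketch, not the
driver's actual code) sums sg_dma_len() over the element count returned by
dma_map_sg():

    #include <linux/dma-mapping.h>
    #include <linux/scatterlist.h>

    /* sum the DMA lengths of the mapped elements; after dma_map_sg() the
     * IOMMU may have merged entries, so neither the original nents nor
     * sg->length of the first element describes the mapping anymore */
    static u32 total_xfer_len(struct device *dev, struct scatterlist *sgl,
                              int nents, enum dma_data_direction dir)
    {
            struct scatterlist *sg;
            int i, n_elem;
            u32 len = 0;

            n_elem = dma_map_sg(dev, sgl, nents, dir);
            if (!n_elem)
                    return 0;

            for_each_sg(sgl, sg, n_elem, i)
                    len += sg_dma_len(sg);

            return len;
    }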
Fixes: ff2aeb1eb6 ("libata: convert to chained sg")
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf33f87dd0 upstream.
The user can control the size of the next command passed along, but the
value passed to the ioctl isn't checked against the usable max command
size.
Signed-off-by: Peter Chang <dpf@google.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fcc319d24 upstream.
When a reflink operation causes the bmap code to allocate a btree block
we currently do a single-AG allocation because ->firstblock is set, and
then try any higher AG due to a little reflink quirk we put in when
adding the reflink code. But given that we do not have a minleft
reservation of any kind in this AG, we may still find no space in the
same or a higher AG even if the file system has enough free space.
To fix this, use an XFS_ALLOCTYPE_FIRST_AG allocation in this fallback
path instead.
[And yes, we need to redo this properly instead of piling hacks over
hacks. I'm working on that, but it's not going to be a small series.
In the meantime this fixes the customer reported issue]
Also add a warning for failing allocations to make it easier to debug.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f65e6fad29 upstream.
Commit fa7f138 ("xfs: clear delalloc and cache on buffered write
failure") fixed one regression in the iomap error handling code and
exposed another. The fundamental problem is that if a buffered write
is a rewrite of preexisting delalloc blocks and the write fails, the
failure handling code can punch out preexisting blocks with valid
file data.
This was reproduced directly by sub-block writes in the LTP
kernel/syscalls/write/write03 test. A first 100 byte write allocates
a single block in a file. A subsequent 100 byte write fails and
punches out the block, including the data successfully written by
the previous write.
To address this problem, update the ->iomap_begin() handler to
distinguish newly allocated delalloc blocks from preexisting
delalloc blocks via the IOMAP_F_NEW flag. Use this flag in the
->iomap_end() handler to decide when a failed or short write should
punch out delalloc blocks.
This introduces the subtle requirement that ->iomap_begin() should
never combine newly allocated delalloc blocks with existing blocks
in the resulting iomap descriptor. This can occur when a new
delalloc reservation merges with a neighboring extent that is part
of the current write, for example. Therefore, drop the
post-allocation extent lookup from xfs_bmapi_reserve_delalloc() and
just return the record inserted into the fork. This ensures only new
blocks are returned and thus that preexisting delalloc blocks are
always handled as "found" blocks and not punched out on a failed
rewrite.
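A minimal sketch of the ->iomap_end() side of this scheme (illustrative
names, not the actual XFS implementation; punch_delalloc_blocks() here is a
hypothetical stand-in for whatever helper does the punch):

    static int example_iomap_end(struct inode *inode, loff_t pos, loff_t length,
                                 ssize_t written, unsigned flags,
                                 struct iomap *iomap)
    {
            /* only a failed/short write over *newly allocated* delalloc
             * blocks gets punched out; preexisting delalloc blocks may
             * cover valid data and are left alone */
            if ((flags & IOMAP_WRITE) &&
                iomap->type == IOMAP_DELALLOC &&
                (iomap->flags & IOMAP_F_NEW) &&
                written < length)
                    punch_delalloc_blocks(inode, pos + written,
                                          length - written); /* hypothetical */
            return 0;
    }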
Reported-by: Xiong Zhou <xzhou@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5825712ee upstream.
When block size is larger than inode cluster size, the call to
XFS_B_TO_FSBT(mp, mp->m_inode_cluster_size) returns 0. Also, mkfs.xfs
would have set xfs_sb->sb_inoalignmt to 0. Hence in
xfs_set_inoalignment(), xfs_mount->m_inoalign_mask gets initialized to
-1 instead of 0. However, xfs_mount->m_sinoalign would get correctly
initialized to 0 because for every positive value of xfs_mount->m_dalign,
the condition "!(mp->m_dalign & mp->m_inoalign_mask)" would evaluate to
false.
Also, xfs_imap() worked fine even with xfs_mount->m_inoalign_mask having
-1 as the value because blks_per_cluster variable would have the value 1
and hence we would never have a need to use xfs_mount->m_inoalign_mask
to compute the inode chunk's agbno and offset within the chunk.
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 787eb48550 upstream.
There are two different cases of buffered I/O errors:
- first, we can have an already shut down fs. In that case we should skip
any on-disk operations and just clean up the append transaction if
present and destroy the ioend
- a real I/O error. In that case we should clean up any lingering COW
blocks. This gets skipped in the current code and is fixed by this
patch.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3802a34532 upstream.
We only want to reclaim preallocations from our periodic work item.
Currently this is achieved by looking for a dirty inode, but that check
is rather fragile. Instead add a flag to xfs_reflink_cancel_cow_* so
that the caller can ask for just cancelling unwritten extents in the COW
fork.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: fix typos in commit message]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 410d17f67e upstream.
In various places we currently assert that xfs_bmap_btalloc allocates
from the same AG as the firstblock value passed in, unless it's either
NULLAGNO or the dop_low flag is set. But the reflink code does not
fully follow this convention, as it passes in firstblock purely as
a hint for the allocator without actually having previous allocations
in the transaction, and without having a minleft check on the current
AG, leading to the assert firing on a very full and heavily used
file system. As even the reflink code only allocates from equal or
higher AGs for now, we can simplify the check to always allow for equal
or higher AGs.
Note that we need to eventually split the two meanings of the firstblock
value. At that point we can also allow the reflink code to allocate
from any AG instead of limiting it in any way.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48af96ab92 upstream.
The block reservation for the transaction allocated in
xfs_shift_file_space() is an artifact of the original collapse range
support. It exists to handle the case where a collapse range occurs,
the initial extent is left shifted into a location that forms a
contiguous boundary with the previous extent and thus the extents
are merged. This code was subsequently refactored and reused for
insert range (right shift) support.
If an insert range occurs under low free space conditions, the
extent at the starting offset is split before the first shift
transaction is allocated. If the block reservation fails, this
leaves separate, but contiguous extents around in the inode. While
not a fatal problem, this is unexpected and will flag a warning on
subsequent insert range operations on the inode. This problem has
been reproduced intermittently by generic/270 running against a
ramdisk device.
Since right shift does not create new extent boundaries in the
inode, a block reservation for extent merge is unnecessary. Update
xfs_shift_file_space() to conditionally reserve fs blocks for left
shift transactions only. This avoids the warning reproduced by
generic/270.
Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75d65361cf upstream.
Certain workloads that punch holes into speculative preallocation can
cause delalloc indirect reservation splits when the delalloc extent is
split in two. If further splits occur, an already short-handed extent
can be split into two in a manner that leaves zero indirect blocks for
one of the two new extents. This occurs because the shortage is large
enough that the xfs_bmap_split_indlen() algorithm completely drains the
requested indlen of one of the extents before it honors the existing
reservation.
This ultimately results in a warning from xfs_bmap_del_extent(). This
has been observed during file copies of large, sparse files using 'cp
--sparse=always.'
To avoid this problem, update xfs_bmap_split_indlen() to explicitly
apply the reservation shortage fairly between both extents. This smooths
out the overall indlen shortage and defers the situation where we end up
with a delalloc extent with zero indlen reservation to extreme
circumstances.
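The idea, as a standalone sketch with simplified names and types (not the
upstream diff), is to let each half absorb roughly half of the shortage
rather than draining one of them completely:

    #include <stdint.h>

    static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }

    /* trim *len1 and *len2 so their sum fits in 'avail', spreading the
     * shortage across both instead of zeroing one of them */
    static void split_indlen_fairly(uint32_t *len1, uint32_t *len2, uint32_t avail)
    {
            uint32_t total = *len1 + *len2;
            uint32_t shortage, take1, take2;

            if (avail >= total)
                    return;

            shortage = total - avail;
            take1 = min_u32(*len1, shortage / 2);       /* len1's fair share */
            take2 = min_u32(*len2, shortage - take1);   /* len2 covers the rest */
            /* if len2 could not cover its share, take the remainder from len1 */
            take1 += min_u32(*len1 - take1, shortage - take1 - take2);

            *len1 -= take1;
            *len2 -= take2;
    }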
Reported-by: Patrick Dung <mpatdung@gmail.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e339ef855 upstream.
When a delalloc extent is created, it can be merged with pre-existing,
contiguous, delalloc extents. When this occurs,
xfs_bmap_add_extent_hole_delay() merges the extents along with the
associated indirect block reservations. The expectation here is that the
combined worst case indlen reservation is always less than or equal to
the indlen reservation for the individual extents.
This is not always the case, however, as existing extents can hold less than
the expected indlen reservation if the extent was previously split due
to a hole punch. If a new extent merges with such an extent, the total
indlen requirement may be larger than the sum of the indlen reservations
held by both extents.
xfs_bmap_add_extent_hole_delay() assumes that the worst case indlen
reservation is always available and assigns it to the merged extent
without consideration for the indlen held by the pre-existing extent. As
a result, the subsequent xfs_mod_fdblocks() call can attempt an
unintentional allocation rather than a free (indicated by an ASSERT()
failure). Further, if the allocation happens to fail in this context,
the failure goes unhandled and creates a filesystem wide block
accounting inconsistency.
Fix xfs_bmap_add_extent_hole_delay() to function as designed. Cap the
indlen reservation assigned to the merged extent to the sum of the
indlen reservations held by each of the individual extents.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e30c23d13 upstream.
We don't just need the structure to track busy extents which can be
avoided with a synchronous transaction, but also to keep track of
pending discards.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54a4ef8af4 upstream.
We currently fall back from direct to buffered writes if we detect a
remaining shared extent in the iomap_begin callback. But by the time
iomap_begin is called for the potentially unaligned end block we might
have already written most of the data to disk, which we'd now write
again using buffered I/O. To avoid this reject all writes to reflinked
files before starting I/O so that we are guaranteed to only write the
data once.
The alternative would be to unshare the unaligned start and/or end block
before doing the I/O. I think that's doable, and will actually be
required to support reflinks on DAX file systems. But it will take a
little more time and I'd rather get rid of the double write ASAP.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
[slight changes in context due to the new direct I/O code in 4.10+]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c5ecb42342 upstream.
We're changing both metadata and data, so we need to update the
timestamps for clone operations. Dedupe on the other hand does
not change file data, and only changes invisible metadata so the
timestamps should not be updated.
This follows existing btrfs behavior.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
[darrick: remove redundant is_dedupe test]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dd2eb6335 upstream.
After successful IO or permanent error, b_first_retry_time also
needs to be cleared, else the invalid first retry time will be
used by the next retry check.
Signed-off-by: Hou Tao <houtao1@huawei.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5eda430000 upstream.
Christoph Hellwig pointed out that there's a potentially nasty race when
performing simultaneous nearby directio cow writes:
"Thread 1 writes a range from B to c
" B --------- C
p
"a little later thread 2 writes from A to B
" A --------- B
p
[editor's note: the 'p' denote cowextsize boundaries, which I added to
make this more clear]
"but the code preallocates beyond B into the range where thread
"1 has just written, but ->end_io hasn't been called yet.
"But once ->end_io is called thread 2 has already allocated
"up to the extent size hint into the write range of thread 1,
"so the end_io handler will splice the unintialized blocks from
"that preallocation back into the file right after B."
We can avoid this race by ensuring that thread 1 cannot accidentally
remap the blocks that thread 2 allocated (as part of speculative
preallocation) as part of t2's write preparation in t1's end_io handler.
The way we make this happen is by taking advantage of the unwritten
extent flag as an intermediate step.
Recall that when we begin the process of writing data to shared blocks,
we create a delayed allocation extent in the CoW fork:
D: --RRRRRRSSSRRRRRRRR---
C: ------DDDDDDD---------
When a thread prepares to CoW some dirty data out to disk, it will now
convert the delalloc reservation into an /unwritten/ allocated extent in
the cow fork. The da conversion code tries to opportunistically
allocate as much of a (speculatively prealloc'd) extent as possible, so
we may end up allocating a larger extent than we're actually writing
out:
D: --RRRRRRSSSRRRRRRRR---
U: ------UUUUUUU---------
Next, we convert only the part of the extent that we're actively
planning to write to normal (i.e. not unwritten) status:
D: --RRRRRRSSSRRRRRRRR---
U: ------UURRUUU---------
If the write succeeds, the end_cow function will now scan the relevant
range of the CoW fork for real extents and remap only the real extents
into the data fork:
D: --RRRRRRRRSRRRRRRRR---
U: ------UU--UUU---------
This ensures that we never obliterate valid data fork extents with
unwritten blocks from the CoW fork.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05a630d76b upstream.
In the data fork, we only allow extents to perform the following state
transitions:
delay -> real <-> unwritten
There's no way to move directly from a delalloc reservation to an
/unwritten/ allocated extent. However, for the CoW fork we want to be
able to do the following to each extent:
delalloc -> unwritten -> written -> remapped to data fork
This will help us to avoid a race in the speculative CoW preallocation
code between a first thread that is allocating a CoW extent and a second
thread that is remapping part of a file after a write. In order to do
this, however, we need two things: first, we have to be able to
transition from da to unwritten, and second the function that converts
between real and unwritten has to be made aware of the cow fork. Do
both of those things.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de14c5f541 upstream.
Perform basic sanity checking of the directory free block header
fields so that we avoid hanging the system on invalid data.
(Granted that just means that now we shutdown on directory write,
but that seems better than hanging...)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3bf607d58 upstream.
We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's
no such thing as a zero-level bmbt (for that we have extents format),
so if we see this, send back an error code.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5a91baeb6 upstream.
Don't let anybody load an obviously bad btree pointer. Since the values
come from disk, we must return an error, not just ASSERT.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a652bbe36 upstream.
When we open a directory, we try to readahead block 0 of the directory
on the assumption that we're going to need it soon. If the bmbt is
corrupt, the directory will never be usable and the readahead fails
immediately, so we might as well prevent the directory from being opened
at all. This prevents a subsequent read or modify operation from
hitting it and taking the fs offline.
NOTE: We're only checking for early failures in the block mapping, not
the readahead directory block itself.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b5bd5bf3f upstream.
We use di_format and if_flags to decide whether we're grabbing the ilock
in btree mode (btree extents not loaded) or shared mode (anything else),
but the state of those fields can be changed by other threads that are
also trying to load the btree extents -- IFEXTENTS gets set before the
_bmap_read_extents call and cleared if it fails.
We don't actually need to have IFEXTENTS set until after the bmbt
records are successfully loaded and validated, which will fix the race
between multiple threads trying to read the same directory. The next
patch strengthens directory bmbt validation by refusing to open the
directory if reading the bmbt to start directory readahead fails.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4229d6b0b upstream.
It's possible for post-eof blocks to end up being used for direct I/O
writes. dio write performs an upfront unwritten extent allocation, sends
the dio and then updates the inode size (if necessary) on write
completion. If a file release occurs while a file extending dio write is
in flight, it is possible to mistake the post-eof blocks for speculative
preallocation and incorrectly truncate them from the inode. This means
that the resulting dio write completion can discover a hole and allocate
new blocks rather than perform unwritten extent conversion.
This requires a strange mix of I/O and is thus not likely to reproduce
in real world workloads. It is intermittently reproduced by generic/299.
The error manifests as an assert failure due to transaction overrun
because the aforementioned write completion transaction has only
reserved enough blocks for btree operations:
XFS: Assertion failed: tp->t_blk_res_used <= tp->t_blk_res, \
file: fs/xfs//xfs_trans.c, line: 309
The root cause is that xfs_free_eofblocks() uses i_size to truncate
post-eof blocks from the inode, but async, file extending direct writes
do not update i_size until write completion, long after inode locks are
dropped. Therefore, xfs_free_eofblocks() effectively truncates the inode
to the incorrect size.
Update xfs_free_eofblocks() to serialize against dio similar to how
extending writes are serialized against i_size updates before post-eof
block zeroing. Specifically, wait on dio while under the iolock. This
ensures that dio write completions have updated i_size before post-eof
blocks are processed.
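A minimal sketch of the serialization described above (simplified;
xfs_free_eofblocks() is shown with the single-argument form used later in
this series):

    static int example_trim_eofblocks(struct xfs_inode *ip)
    {
            int error;

            xfs_ilock(ip, XFS_IOLOCK_EXCL);
            /* wait for in-flight direct I/O so i_size is up to date and
             * dio-allocated post-eof blocks aren't mistaken for preallocation */
            inode_dio_wait(VFS_I(ip));
            error = xfs_free_eofblocks(ip);
            xfs_iunlock(ip, XFS_IOLOCK_EXCL);
            return error;
    }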
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c3155097ad upstream.
The xfs_eofblocks.eof_scan_owner field is an internal field to
facilitate invoking eofb scans from the kernel while under the iolock.
This is necessary because the eofb scan acquires the iolock of each
inode. Synchronous scans are invoked on certain buffered write failures
while under iolock. In such cases, the scan owner indicates that the
context for the scan already owns the particular iolock and prevents a
double lock deadlock.
eofblocks scans while under iolock are still livelock prone in the event
of multiple parallel scans, however. If multiple buffered writes to
different inodes fail and invoke eofblocks scans at the same time, each
scan avoids a deadlock with its own inode by virtue of the
eof_scan_owner field, but will never be able to acquire the iolock of
the inode from the parallel scan. Because the low free space scans are
invoked with SYNC_WAIT, the scan will not return until it has processed
every tagged inode and thus both scans will spin indefinitely on the
iolock being held across the opposite scan. This problem can be
reproduced reliably by generic/224 on systems with higher cpu counts
(x16).
To avoid this problem, simplify the semantics of eofblocks scans to
never invoke a scan while under iolock. This means that the buffered
write context must drop the iolock before the scan. It must reacquire
the lock before the write retry and also repeat the initial write
checks, as the original state might no longer be valid once the iolock
was dropped.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a36b926180 upstream.
xfs_free_eofblocks() requires the IOLOCK_EXCL lock, but is called from
different contexts where the lock may or may not be held. The
need_iolock parameter exists for this reason, to indicate whether
xfs_free_eofblocks() must acquire the iolock itself before it can
proceed.
This is ugly and confusing. Simplify the semantics of
xfs_free_eofblocks() to require the caller to acquire the iolock
appropriately and kill the need_iolock parameter. While here, the mp
param can be removed as well, since the xfs_mount is accessible from the
xfs_inode structure. This patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76d771b4cb upstream.
Currently we try to rely on the global reserved block pool for block
allocations for the free inode btree, but I have customer reports
(fairly complex workload, need to find an easier reproducer) where that
is not enough as the AG where we free an inode that requires a new
finobt block is entirely full. This causes us to cancel a dirty
transaction and thus a file system shutdown.
I think the right way to guard against this is to treat the finobt the same
way as the refcount btree and have a per-AG reservation for the possible
worst case size of it, and the patch below implements that.
Note that this could increase mount times with large finobt trees. In
an ideal world we would have added a field for the number of finobt
blocks to the AGI, similar to what we did for the refcount blocks.
We should add it next time we rev the AGI or AGF format by adding
new fields.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dfa2b8411 upstream.
Try to reserve the blocks first and only then update the fields in
or hanging off the mount structure. This way we can call __xfs_ag_resv_init
again after a previous failure.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ecec8503a upstream.
When relocating the p2m, take special care not to relocate it so
that it overlaps with the current location of the p2m/initrd. This is
needed since the full extent of the current location is not marked as a
reserved region in the e820.
This was seen to happen to a dom0 with a large initial p2m and a small
reserved region in the middle of the initial p2m.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 619bd4a718 upstream.
Since the change in commit:
fd7a4bed18 ("sched, rt: Convert switched_{from, to}_rt() / prio_changed_rt() to balance callbacks")
... we don't reschedule a task under certain circumstances:
Let's say task-A, SCHED_OTHER, is running on CPU0 (and it may run only on
CPU0) and holds a PI lock. This task is removed from the CPU because it
used up its time slice and another SCHED_OTHER task is running. Task-B on
CPU1 runs at RT priority and asks for the lock owned by task-A. This
results in a priority boost for task-A. Task-B goes to sleep until the
lock has been made available. Task-A is already runnable (but not active),
so it receives no wake up.
The reality now is that task-A gets on the CPU once the scheduler decides
to remove the current task despite the fact that a high priority task is
enqueued and waiting. This may take a long time.
The desired behaviour is that CPU0 immediately reschedules after the
priority boost which made task-A the task with the lowest priority.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: fd7a4bed18 ("sched, rt: Convert switched_{from, to}_rt() prio_changed_rt() to balance callbacks")
Link: http://lkml.kernel.org/r/20170124144006.29821-1-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b53cf9815 upstream.
Filesystem encryption ostensibly supported revoking a keyring key that
had been used to "unlock" encrypted files, causing those files to become
"locked" again. This was, however, buggy for several reasons, the most
severe of which was that when key revocation happened to be detected for
an inode, its fscrypt_info was immediately freed, even while other
threads could be using it for encryption or decryption concurrently.
This could be exploited to crash the kernel or worse.
This patch fixes the use-after-free by removing the code which detects
the keyring key having been revoked, invalidated, or expired. Instead,
an encrypted inode that is "unlocked" now simply remains unlocked until
it is evicted from memory. Note that this is no worse than the case for
block device-level encryption, e.g. dm-crypt, and it still remains
possible for a privileged user to evict unused pages, inodes, and
dentries by running 'sync; echo 3 > /proc/sys/vm/drop_caches', or by
simply unmounting the filesystem. In fact, one of those actions was
already needed anyway for key revocation to work even somewhat sanely.
This change is not expected to break any applications.
In the future I'd like to implement a real API for fscrypt key
revocation that interacts sanely with ongoing filesystem operations ---
waiting for existing operations to complete and blocking new operations,
and invalidating and sanitizing key material and plaintext from the VFS
caches. But this is a hard problem, and for now this bug must be fixed.
This bug affected almost all versions of ext4, f2fs, and ubifs
encryption, and it was potentially reachable in any kernel configured
with encryption support (CONFIG_EXT4_ENCRYPTION=y,
CONFIG_EXT4_FS_ENCRYPTION=y, CONFIG_F2FS_FS_ENCRYPTION=y, or
CONFIG_UBIFS_FS_ENCRYPTION=y). Note that older kernels did not use the
shared fs/crypto/ code, but due to the potential security implications
of this bug, it may still be worthwhile to backport this fix to them.
Fixes: b7236e21d5 ("ext4 crypto: reorganize how we store keys in the inode")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Acked-by: Michael Halcrow <mhalcrow@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7195ee3120 upstream.
It's not clear what behaviour is sensible when doing partial write of
NT_METAG_RPIPE, so just don't bother.
This patch assumes that userspace will never rely on a partial SETREGSET
in this case, since it's not clear what should happen anyway.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d614fd58a2 upstream.
Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the registers, the thread's old registers are preserved.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 502585c755 upstream.
regs_set() and regs_get() are vulnerable to an off-by-1 buffer overrun
if CONFIG_CPU_H8S is set, since this adds an extra entry to
register_offset[] but not to user_regs_struct.
So, iterate over user_regs_struct based on its actual size, not based on
the length of register_offset[].
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb411b837b upstream.
gpr_set won't work correctly and can never have been tested, and the
correct behaviour is not clear due to the endianness-dependent task
layout.
So, just remove it. The core code will now return -EOPNOTSUPP when
trying to set NT_PRSTATUS on this architecture until/unless a correct
implementation is supplied.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc8653228c upstream.
When init_vqs runs, virtio_balloon.stats is either uninitialized or
contains stale values. The host updates its state with garbage data
because it has no way of knowing that this is just a marker buffer
used for signaling.
This patch updates the stats before pushing the initial buffer.
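Roughly, the init-time sequence then becomes (sketch, not the exact diff):

    if (virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ)) {
            struct scatterlist sg;

            /* fill in real values before the buffer is ever exposed to
             * the host, so it never reads uninitialized stats */
            update_balloon_stats(vb);

            sg_init_one(&sg, vb->stats, sizeof(vb->stats));
            virtqueue_add_outbuf(vb->stats_vq, &sg, 1, vb, GFP_KERNEL);
            virtqueue_kick(vb->stats_vq);
    }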
Alternative fixes:
* Push an empty buffer in init_vqs. Not easily done with the current
virtio implementation and violates the spec "Driver MUST supply the
same subset of statistics in all buffers submitted to the statsq".
* Push a buffer with invalid tags in init_vqs. Violates the same
spec clause, plus "invalid tag" is not really defined.
Note: the spec says:
When using the legacy interface, the device SHOULD ignore all values in
the first buffer in the statsq supplied by the driver after device
initialization. Note: Historically, drivers supplied an uninitialized
buffer in the first buffer.
Unfortunately QEMU does not seem to implement the recommendation
even for the legacy interface.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f843ee6dd0 upstream.
Kees Cook has pointed out that xfrm_replay_state_esn_len() is subject to
wrapping issues. To ensure that the two ESN structures really are the
same size, compare both the overall size as reported by
xfrm_replay_state_esn_len() and the internal length.
CVE-2017-7184
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 677e806da4 upstream.
When a new xfrm state is created during an XFRM_MSG_NEWSA call we
validate the user supplied replay_esn to ensure that the size is valid
and to ensure that the replay_window size is within the allocated
buffer. However later it is possible to update this replay_esn via a
XFRM_MSG_NEWAE call. There we again validate the size of the supplied
buffer matches the existing state and if so inject the contents. We do
not at this point check that the replay_window is within the allocated
memory. This leads to out-of-bounds reads and writes triggered by
netlink packets. This leads to memory corruption and the potential for
privilege escalation.
We already attempt to validate the incoming replay information in
xfrm_new_ae() via xfrm_replay_verify_len(). This confirms that the user
is not trying to change the size of the replay state buffer which
includes the replay_esn. It however does not check the replay_window
remains within that buffer. Add validation of the contained
replay_window.
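The added validation boils down to a bound check along these lines (sketch;
'up' is the user-supplied ESN state and bmp_len counts 32-bit bitmap words):

    /* the advertised replay window must fit inside the bitmap that was
     * actually allocated for this state */
    if (up->replay_window > up->bmp_len * sizeof(__u32) * 8)
            return -EINVAL;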
CVE-2017-7184
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c282222a45 upstream.
Dmitry reports the following splat:
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 13059 Comm: syz-executor1 Not tainted 4.10.0-rc7-next-20170207 #1
[..]
spin_lock_bh include/linux/spinlock.h:304 [inline]
xfrm_policy_flush+0x32/0x470 net/xfrm/xfrm_policy.c:963
xfrm_policy_fini+0xbf/0x560 net/xfrm/xfrm_policy.c:3041
xfrm_net_init+0x79f/0x9e0 net/xfrm/xfrm_policy.c:3091
ops_init+0x10a/0x530 net/core/net_namespace.c:115
setup_net+0x2ed/0x690 net/core/net_namespace.c:291
copy_net_ns+0x26c/0x530 net/core/net_namespace.c:396
create_new_namespaces+0x409/0x860 kernel/nsproxy.c:106
unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:205
SYSC_unshare kernel/fork.c:2281 [inline]
The problem is that when we get an error during xfrm_net_init we will
call xfrm_policy_fini, which will acquire xfrm_policy_lock before it has
been initialized. Just move things around so the locks get set up first.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 283bc9f35b ("xfrm: Namespacify xfrm state/policy locks")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e9f44eaae upstream.
Add Quectel UC15, UC20, EC21, and EC25. The EC20 is handled by
qcserial due to a USB VID/PID conflict with an existing Acer
device.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
This patch adds new sample rates to the Audioinjector Octo sound card. The
new supported rates are (in kHz):
96, 48, 32, 24, 16, 8, 88.2, 44.1, 29.4, 22.05, 14.7
This patch also replaces the regulators in the device tree overlay with
the ones declared in bcm270x.dtsi include file.
This patch also adds an extra codec reset and delay on probe.
commit 6207119444 upstream.
With this reproducer:
#include <sys/socket.h>
#include <linux/if_alg.h>

int main(void)
{
    struct sockaddr_alg alg = {
        .salg_family = 0x26,
        .salg_type = "hash",
        .salg_feat = 0xf,
        .salg_mask = 0x5,
        .salg_name = "digest_null",
    };
    int sock, sock2;

    sock = socket(AF_ALG, SOCK_SEQPACKET, 0);
    bind(sock, (struct sockaddr *)&alg, sizeof(alg));
    sock2 = accept(sock, NULL, NULL);
    setsockopt(sock, SOL_ALG, ALG_SET_KEY, "\x9b\xca", 2);
    accept(sock2, NULL, NULL);
    return 0;
}
==== 8< ======== 8< ======== 8< ======== 8< ====
one can immediately see a UBSAN warning:
UBSAN: Undefined behaviour in crypto/algif_hash.c:187:7
variable length array bound value 0 <= 0
CPU: 0 PID: 15949 Comm: syz-executor Tainted: G E 4.4.30-0-default #1
...
Call Trace:
...
[<ffffffff81d598fd>] ? __ubsan_handle_vla_bound_not_positive+0x13d/0x188
[<ffffffff81d597c0>] ? __ubsan_handle_out_of_bounds+0x1bc/0x1bc
[<ffffffffa0e2204d>] ? hash_accept+0x5bd/0x7d0 [algif_hash]
[<ffffffffa0e2293f>] ? hash_accept_nokey+0x3f/0x51 [algif_hash]
[<ffffffffa0e206b0>] ? hash_accept_parent_nokey+0x4a0/0x4a0 [algif_hash]
[<ffffffff8235c42b>] ? SyS_accept+0x2b/0x40
It is a correct warning, as hash state is propagated to accept as zero,
but creating a zero-length variable array is not allowed in C.
Fix this as proposed by Herbert -- do "?: 1" on that site. No sizeof or
similar happens in the code there, so we just allocate one byte even
though we do not use the array.
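With that, the on-stack state declaration in hash_accept() ends up looking
roughly like this (one byte is allocated when the state size is zero):

    char state[crypto_ahash_statesize(crypto_ahash_reqtfm(req)) ? : 1];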
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net> (maintainer:CRYPTO API)
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8aac7f3436 upstream.
fbcon can deal with vc_hi_font_mask (the upper 256 chars) and adjust
the vc attrs dynamically when vc_hi_font_mask is changed at
fbcon_init(). When the vc_hi_font_mask is set, it remaps the attrs in
the existing console buffer with one bit shift up (for 9 bits), while
it remaps with one bit shift down (for 8 bits) when the value is
cleared. It works fine as long as the font gets updated after fbcon
was initialized.
However, we hit a bizarre problem when the console is switched to
another fb driver (typically from vesafb or efifb to drmfb). At
switching to the new fb driver, we temporarily rebind the console to
the dummy console, then rebind to the new driver. During the
switching, we leave the modified attrs as is. Thus, the new fbcon
takes over the old buffer as if it were to contain 8-bit chars
(although the attrs are still shifted for 9 bits), and effectively
this results in yellow-colored text instead of the original white
color, as found in the bugzilla entry below.
An easy fix for this is to re-adjust the attrs before leaving the
fbcon at con_deinit callback. Since the code to adjust the attrs is
already present in the current fbcon code, in this patch, we simply
factor out the relevant code, and call it from fbcon_deinit().
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1000619
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24835e442f upstream.
When writing the generic nonblocking commit code I assumed that
through clever lifetime management I can assure that the completion
(stored in drm_crtc_commit) only gets freed after it is completed. And
that worked.
I also wanted to make nonblocking helpers resilient against driver
bugs, by having timeouts everywhere. And that worked too.
Unfortunately taking both things together results in oopses :( Well,
at least sometimes: What seems to happen is that the drm event hangs
around forever stuck in limbo land. The nonblocking helpers eventually
time out, move on and release it. Now the bug I tested all this
against is drivers that just entirely fail to deliver the vblank
events like they should, and in those cases the event is simply
leaked. But what seems to happen, at least sometimes, on i915 is that
the event is set up correctly, but somehow the vblank fails to fire in
time. Which means the event isn't leaked, it's still there waiting for
eventually a vblank to fire. That tends to happen when re-enabling the
pipe, and then the trap springs and the kernel oopses.
The correct fix here is simply to refcount the crtc commit to make
sure that the event sticks around even for drivers which only
sometimes fail to deliver vblanks for some arbitrary reasons. Since
crtc commits are already refcounted that's easy to do.
References: https://bugs.freedesktop.org/show_bug.cgi?id=96781
Cc: Jim Rees <rees@umich.edu>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161221102331.31033-1-daniel.vetter@ffwll.ch
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea90e0dc8c upstream.
Sowmini pointed out Dmitry's RTNL deadlock report to me, and it turns out
to be perfectly accurate - there are various error paths that miss unlock
of the RTNL.
To fix those, change the locking a bit to not be conditional in all those
nl80211_prepare_*_dump() functions, but make those require the RTNL to
start with, and fix the buggy error paths. This also let me use sparse
(by appropriately overriding the rtnl_lock/rtnl_unlock functions) to
validate the changes.
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0a8b49c03 upstream.
Analogix_dp_bind() can be called from the component framework, which doesn't
guarantee a proper runtime PM state of the device during the bind operation,
so ensure that the device is runtime active before doing any register access.
This ensures that the power domain, to which the DP module belongs, is turned
on. While at it, also fix the unbalanced call to phy_power_on() in the
analogix_dp_bind() function.
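The pattern, roughly (a sketch of the runtime PM part only, not the full
bind callback):

    pm_runtime_enable(dev);
    ret = pm_runtime_get_sync(dev);     /* powers up the DP power domain */
    if (ret < 0) {
            pm_runtime_put(dev);
            pm_runtime_disable(dev);
            return ret;
    }

    /* ... safe to touch DP registers from here on ... */

    pm_runtime_put(dev);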
This patch solves the following kernel oops on Samsung Exynos5250 Snow
board:
Unhandled fault: imprecise external abort (0x406) at 0x00000000
pgd = c0004000
[00000000] *pgd=00000000
Internal error: : 406 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 75 Comm: kworker/0:2 Not tainted 4.9.0 #1046
Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
Workqueue: events deferred_probe_work_func
task: ee272300 task.stack: ee312000
PC is at analogix_dp_enable_sw_function+0x18/0x2c
LR is at analogix_dp_init_dp+0x2c/0x50
...
[<c03fcb38>] (analogix_dp_enable_sw_function) from [<c03fa9c4>] (analogix_dp_init_dp+0x2c/0x50)
[<c03fa9c4>] (analogix_dp_init_dp) from [<c03fab6c>] (analogix_dp_bind+0x184/0x42c)
[<c03fab6c>] (analogix_dp_bind) from [<c03fdb84>] (component_bind_all+0xf0/0x218)
[<c03fdb84>] (component_bind_all) from [<c03ed64c>] (exynos_drm_load+0x134/0x200)
[<c03ed64c>] (exynos_drm_load) from [<c03d5058>] (drm_dev_register+0xa0/0xd0)
[<c03d5058>] (drm_dev_register) from [<c03d66b8>] (drm_platform_init+0x58/0xb0)
[<c03d66b8>] (drm_platform_init) from [<c03fe0c4>] (try_to_bring_up_master+0x14c/0x188)
[<c03fe0c4>] (try_to_bring_up_master) from [<c03fe188>] (component_add+0x88/0x138)
[<c03fe188>] (component_add) from [<c0403a38>] (platform_drv_probe+0x50/0xb0)
[<c0403a38>] (platform_drv_probe) from [<c0402470>] (driver_probe_device+0x1f0/0x2a8)
[<c0402470>] (driver_probe_device) from [<c0400a54>] (bus_for_each_drv+0x44/0x8c)
[<c0400a54>] (bus_for_each_drv) from [<c04021f8>] (__device_attach+0x9c/0x100)
[<c04021f8>] (__device_attach) from [<c04018e8>] (bus_probe_device+0x84/0x8c)
[<c04018e8>] (bus_probe_device) from [<c0401d1c>] (deferred_probe_work_func+0x60/0x8c)
[<c0401d1c>] (deferred_probe_work_func) from [<c012fc14>] (process_one_work+0x120/0x318)
[<c012fc14>] (process_one_work) from [<c012fe34>] (process_scheduled_works+0x28/0x38)
[<c012fe34>] (process_scheduled_works) from [<c0130048>] (worker_thread+0x204/0x4ac)
[<c0130048>] (worker_thread) from [<c01352c4>] (kthread+0xd8/0xf4)
[<c01352c4>] (kthread) from [<c0107978>] (ret_from_fork+0x14/0x3c)
Code: e59035f0 e5935018 f57ff04f e3c55001 (f57ff04e)
---[ end trace 3d1d0d87796de344 ]---
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Archit Taneja <architt@codeaurora.org>
Link: http://patchwork.freedesktop.org/patch/msgid/1483091866-1088-1-git-send-email-m.szyprowski@samsung.com
Cc: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0134ed4fb9 upstream.
Jeff Moyer reports:
With a device dax alignment of 4KB or 2MB, I get sigbus when running
the attached fio job file for the current kernel (4.11.0-rc1+). If
I specify an alignment of 1GB, it works.
I turned on debug output, and saw that it was failing in the huge
fault code.
dax dax1.0: dax_open
dax dax1.0: dax_mmap
dax dax1.0: dax_dev_huge_fault: fio: write (0x7f08f0a00000 -
dax dax1.0: __dax_dev_pud_fault: phys_to_pgoff(0xffffffffcf60
dax dax1.0: dax_release
fio config for reproduce:
[global]
ioengine=dev-dax
direct=0
filename=/dev/dax0.0
bs=2m
[write]
rw=write
[read]
stonewall
rw=read
The driver fails to fall back when taking a fault that is larger than
the device alignment, or when handling a larger fault when a smaller
mapping is already established. While we could support larger
mappings for a device with a smaller alignment, that change is
too large for the immediate fix. The simplest change is to force
fallback until the fault size matches the alignment.
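In effect the fault handlers gain a check along these lines (names
illustrative, not the exact diff):

    /* a fault larger than the device alignment is punted back to the
     * core so it can be retried at a smaller size */
    if (fault_size > dax_region->align)
            return VM_FAULT_FALLBACK;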
Fixes: dee4107924 ("/dev/dax, core: file operations and dax-mmap")
Cc: <stable@vger.kernel.org>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b581a5854e upstream.
Since ceph.git commit 4e28f9e63644 ("osd/OSDMap: clear osd_info,
osd_xinfo on osd deletion"), weight is set to IN when OSD is deleted.
This changes the result of applying an incremental for clients, not
just OSDs. Because CRUSH computations are obviously affected,
pre-4e28f9e63644 servers disagree with post-4e28f9e63644 clients on
object placement, resulting in misdirected requests.
Mirrors ceph.git commit a6009d1039a55e2c77f431662b3d6cc5a8e8e63f.
Fixes: 930c532869 ("libceph: apply new_state before new_up_client on incrementals")
Link: http://tracker.ceph.com/issues/19122
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e030d5ce9 upstream.
When we close a channel that has been rescinded, we will leak memory since
vmbus_teardown_gpadl() returns an error. Fix this so that we can properly
clean up the memory allocated to the ring buffers.
Fixes: ccb61f8a99 ("Drivers: hv: vmbus: Fix a rescind handling bug")
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a5476020a upstream.
If we cannot allocate memory for the channel, free the relid
associated with the channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e609ccef52 upstream.
Output 'activation' may fail for reasons internal to the output driver,
for example, if the msc's buffer is not allocated. We forget, however,
to drop the module reference in this case, so each failed activation
attempt leaks a reference, preventing the module from ever unloading.
This patch adds the missing module_put() in the activation error
path.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd9cb405e0 upstream.
In journal_init_common(), if we failed to allocate the j_wbuf array, or
if we failed to create the buffer_head for the journal superblock, we
leaked the memory allocated for the revocation tables. Fix this.
Fixes: f0c9fd5458
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abda288bb2 upstream.
The OF device table must be terminated, otherwise we'll be walking past it
and into areas unknown.
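The fix is the usual empty sentinel entry at the end of the table; for
illustration (compatible string and config name shown only as an example):

    static const struct of_device_id img_ascii_lcd_matches[] = {
            { .compatible = "img,boston-lcd", .data = &boston_config },
            { /* sentinel -- the OF match code stops here */ }
    };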
Fixes: 0cad855fbd ("auxdisplay: img-ascii-lcd: driver for simple ASCII...")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a05d4fd917 upstream.
The net_cls controller controls the classid field of each socket which
is associated with the cgroup. Because the classid is per-socket
attribute, when a task migrates to another cgroup or the configured
classid of the cgroup changes, the controller needs to walk all
sockets and update the classid value, which was implemented by
3b13758f51 ("cgroups: Allow dynamically changing net_classid").
While the approach is not scalable, migrating tasks which have a lot
of fds attached to them is rare and the cost is borne by the ones
initiating the operations. However, for simplicity, both the
migration and classid config change paths call update_classid() which
scans all fds of all tasks in the target css. This is an overkill for
the migration path which only needs to cover a much smaller subset of
tasks which are actually getting migrated in.
On cgroup v1, this can lead to unexpected scalability issues when one
tries to migrate a task or process into a net_cls cgroup which already
contains a lot of fds. Even if the migration traget doesn't have many
to get scanned, update_classid() ends up scanning all fds in the
target cgroup which can be extremely numerous.
Unfortunately, on cgroup v2 which doesn't use net_cls, the problem is
even worse. Before bfc2cf6f61 ("cgroup: call subsys->*attach() only
for subsystems which are actually affected by migration"), cgroup core
would call the ->css_attach callback even for controllers which don't
see actual migration to a different css.
As net_cls is always disabled but still mounted on cgroup v2, whenever
a process is migrated on the cgroup v2 hierarchy, net_cls sees
identity migration from root to root and cgroup core used to call
->css_attach callback for those. The net_cls ->css_attach ends up
calling update_classid() on the root net_cls css to which all
processes on the system belong to as the controller isn't used. This
makes any cgroup v2 migration O(total_number_of_fds_on_the_system)
which is horrible and easily leads to noticeable stalls triggering RCU
stall warnings and so on.
The worst symptom is already fixed in upstream by bfc2cf6f61
("cgroup: call subsys->*attach() only for subsystems which are
actually affected by migration"); however, backporting that commit is
too invasive and we want to avoid other cases too.
This patch updates net_cls's cgrp_attach() to iterate fds of only the
processes which are actually getting migrated. This removes the
surprising migration cost which is dependent on the total number of
fds in the target cgroup. As this leaves write_classid() the only
user of update_classid(), open-code the helper into write_classid().
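The attach path then walks only the migrating tasks, roughly like this
(sketch based on the existing netclassid_cgroup.c helpers):

    static void cgrp_attach(struct cgroup_taskset *tset)
    {
            struct cgroup_subsys_state *css;
            struct task_struct *p;

            /* only the tasks actually being migrated, not every task
             * already attached to the destination cgroup */
            cgroup_taskset_for_each(p, css, tset) {
                    task_lock(p);
                    iterate_fd(p->files, 0, update_classid_sock,
                               (void *)(unsigned long)css_cls_state(css)->classid);
                    task_unlock(p);
            }
    }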
Reported-by: David Goode <dgoode@fb.com>
Fixes: 3b13758f51 ("cgroups: Allow dynamically changing net_classid")
Cc: Nina Schiff <ninasc@fb.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff010472fb upstream.
On CPU online the cpufreq core restores the previous governor (or
the previous "policy" setting for ->setpolicy drivers), but it does
not restore the min/max limits at the same time, which is confusing,
inconsistent and real pain for users who set the limits and then
suspend/resume the system (using full suspend), in which case the
limits are reset on all CPUs except for the boot one.
Fix this by making cpufreq_online() restore the limits when an inactive
policy is brought online.
The commit log and patch are inspired from Rafael's earlier work.
Reported-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit afd0e5a876 upstream.
If the kernel image extends across an alignment boundary, the existing
code increases the KASLR offset by the size of the kernel image. The
offset is masked after resizing. There are cases where, after
masking, we may still have the kernel image extending across the
boundary. This eventually results in only a 2MB block getting
mapped while creating the page tables, which causes data aborts
while accessing unmapped regions during the second relocation (with
the KASLR offset) in __primary_switch. To fix this problem, round up the
kernel image size by the swapper block size before adding it for
correction.
For example, consider the case below, where the kernel image still crosses
the 1GB alignment boundary after masking the offset, which is fixed
by rounding up the kernel image size.
SWAPPER_TABLE_SHIFT = 30
Swapper using section maps with section size 2MB.
CONFIG_PGTABLE_LEVELS = 3
VA_BITS = 39
_text : 0xffffff8008080000
_end : 0xffffff800aa1b000
offset : 0x1f35600000
mask = ((1UL << (VA_BITS - 2)) - 1) & ~(SZ_2M - 1)
(_text + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7c
(_end + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7d
offset after existing correction (before mask) = 0x1f37f9b000
(_text + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7d
(_end + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7d
offset (after mask) = 0x1f37e00000
(_text + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7c
(_end + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7d
new offset w/ rounding up = 0x1f38000000
(_text + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7d
(_end + offset) >> SWAPPER_TABLE_SHIFT = 0x3fffffe7d
Fixes: f80fb3a3d5 ("arm64: add support for kernel ASLR")
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Neeraj Upadhyay <neeraju@codeaurora.org>
Signed-off-by: Srinivas Ramana <sramana@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60b89f1928 upstream.
On some DDR controllers, compatible with the sama5d3 one,
the sequence to enter/exit/re-enter the self-refresh mode adds
more constraints than what is currently written in the at91_idle
driver. An actual access to the DDR chip is needed between exit
and re-entry of this mode, which is somewhat difficult to implement.
This sequence can completely hang the SoC. It has been seen
particularly on parts which embed an L2 cache, if the code run
between IDLE calls fits in it...
Moreover, as the intention is to enter and exit pretty rapidly
from IDLE, the power-down mode is a good candidate.
So now we use power-down instead of self-refresh. As we can
simplify the code for sama5d3 compatible DDR controllers,
we instantiate a new sama5d3_ddr_standby() function.
Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Fixes: 017b5522d5 ("ARM: at91: Add new binding for sama5d3-ddramc")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e10889a31 upstream.
This reverts commit cab4328268 ("ARM: at91/dt: sama5d2: Use new
compatible for ohci node")
It depends on commit 7150bc9b4d ("usb: ohci-at91: Forcibly suspend
ports while USB suspend"), which was reverted and implemented
differently. With the new implementation, the compatible string must
remain the same.
The compatible string introduced by this commit has been used in the
default SAMA5D2 dtsi starting from Linux 4.8. As it has never been
working correctly in an official release, removing it should not be
breaking the stability rules.
Fixes: cab4328268 ("ARM: at91/dt: sama5d2: Use new compatible for ohci node")
Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1914f0cd20 upstream.
This was broken in commit cd979883b9 ("xen/acpi-processor:
fix enabling interrupts on syscore_resume"). do_suspend (from
xen/manage.c) and thus xen_resume_notifier never get called on
the initial domain at resume (it is called if running as a guest).
The rationale for the breaking change was that upload_pm_data()
potentially does blocking work in syscore_resume(). This patch
addresses the original issue by scheduling upload_pm_data() to
execute in workqueue context.
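A rough sketch of that approach follows; the function and work item names
here are illustrative, not taken from the actual patch:
/* defer the blocking upload out of the resume notifier itself */
static void xen_upload_pm_worker(struct work_struct *work)
{
	/* walk the ACPI namespace and re-upload the PM data here */
}
static DECLARE_WORK(xen_upload_pm_work, xen_upload_pm_worker);

static int xen_resume_notifier_fn(struct notifier_block *nb,
				  unsigned long action, void *data)
{
	schedule_work(&xen_upload_pm_work);	/* runs later in process context */
	return NOTIFY_OK;
}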
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Based-on-patch-by: Konrad Wilk <konrad.wilk@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c468447f4 upstream.
The CCP driver generally uses a round-robin approach when
assigning operations to available CCPs. For the DMA engine,
however, the DMA mappings of the SGs are associated with a
specific CCP. When an IOMMU is enabled, the IOMMU is
programmed based on this specific device.
If the DMA operations are not performed by that specific
CCP then addressing errors and I/O page faults will occur.
Update the CCP driver to allow a specific CCP device to be
requested for an operation and use this in the DMA engine
support.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e841d3eb9 upstream.
When PCIe FLR support was added, much of the remove/release code for
PCIe was migrated to ->down_dev(), but ->down_dev() is never called for
device removal. Let's refactor the cleanup to be done in both cases.
Also, drop the comments above mwifiex_cleanup_pcie(), because they were
clearly wrong, and it's better to have clear and obvious code than to
detail the code steps in comments anyway.
Fixes: 4c5dae59d2 ("mwifiex: add PCIe function level reset support")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac8616e4c8 upstream.
The MP style clocks support a mux with pre-dividers. While the driver
correctly accounted for them in the .determine_rate callback, it did
not in the .recalc_rate and .set_rate callbacks.
This means when calculating the factors in the .set_rate callback, they
would be off by a factor of the active pre-divider. Same goes for
reading back the clock rate after it is set.
Fixes: 2ab836db50 ("clk: sunxi-ng: Add M-P factor clock support")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c75704ebc upstream.
After commit e9afc74629 ("hwrng: geode - Use linux/io.h instead of
asm/io.h") the geode-rng driver uses devres with pci_dev->dev to keep
track of resources, but does not actually register a PCI driver. This
results in the following issues:
1. The driver leaks memory because the driver does not attach to a
device. The driver only uses the PCI device as a reference. devm_*()
functions will release resources on driver detach, which the geode-rng
driver will never do. As a result,
2. The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.
Revert the changes made by e9afc74629 ("hwrng: geode - Use linux/io.h
instead of asm/io.h").
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Fixes: 6e9b5e7688 ("hwrng: geode - Migrate to managed API")
Cc: Matt Mackall <mpm@selenic.com>
Cc: Corentin LABBE <clabbe.montjoie@gmail.com>
Cc: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-geode@lists.infradead.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69db700931 upstream.
After commit 31b2a73c9c ("hwrng: amd - Migrate to managed API"), the
amd-rng driver uses devres with pci_dev->dev to keep track of resources,
but does not actually register a PCI driver. This results in the
following issues:
1. The message
WARNING: CPU: 2 PID: 621 at drivers/base/dd.c:349 driver_probe_device+0x38c
is output when the i2c_amd756 driver loads and attempts to register a PCI
driver. The PCI & device subsystems assume that no resources have been
registered for the device, and the WARN_ON() triggers since amd-rng has
already done so.
2. The driver leaks memory because the driver does not attach to a
device. The driver only uses the PCI device as a reference. devm_*()
functions will release resources on driver detach, which the amd-rng
driver will never do. As a result,
3. The driver cannot be reloaded because there is always a use of the
ioport and region after the first load of the driver.
Revert the changes made by 31b2a73c9c ("hwrng: amd - Migrate to managed
API").
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Fixes: 31b2a73c9c ("hwrng: amd - Migrate to managed API").
Cc: Matt Mackall <mpm@selenic.com>
Cc: Corentin LABBE <clabbe.montjoie@gmail.com>
Cc: PrasannaKumar Muralidharan <prasannatsmkumar@gmail.com>
Cc: Wei Yongjun <weiyongjun1@huawei.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-geode@lists.infradead.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 027fb89e61 upstream.
Disabling interrupts for even a millisecond can cause problems for some
devices. That can happen when Intel host controllers wait for the present
state to propagate.
The spin lock is not necessary here. Anything that is racing with changes
to the I/O state is already broken. The mmc core already provides
synchronization via "claiming" the host.
Although the spin lock probably should be removed from the code paths that
lead to this point, such a patch would touch too much code to be suitable
for stable trees. Consequently, for this patch, just drop the spin lock
while waiting.
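Both this and the following sdhci change take the same shape; as a rough
illustrative sketch (not the literal patch):
	spin_unlock_irq(&host->lock);
	/* ... poll or sleep until the controller state has settled ... */
	spin_lock_irq(&host->lock);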
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2ebfb2142 upstream.
Disabling interrupts for even a millisecond can cause problems for some
devices. That can happen when sdhci changes clock frequency because it
waits for the clock to become stable under a spin lock.
The spin lock is not necessary here. Anything that is racing with changes
to the I/O state is already broken. The mmc core already provides
synchronization via "claiming" the host.
Although the spin lock probably should be removed from the code paths that
lead to this point, such a patch would touch too much code to be suitable
for stable trees. Consequently, for this patch, just drop the spin lock
while waiting.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16681037e7 upstream.
sdhci_arasan_get_timeout_clock() divides the frequency it has with (1 <<
(13 + divisor)).
However, the divisor is not some Arasan-specific value, but instead is
just the Data Timeout Counter Value from the SDHCI Timeout Control
Register.
Applying it here like this is wrong as the sdhci driver already takes
that value into account when calculating timeouts, and in fact it *sets*
that register value based on how long a timeout is wanted.
Additionally, sdhci core interprets the .get_timeout_clock callback
return value as if it were read from hardware registers, i.e. the unit
should be kHz or MHz depending on SDHCI_TIMEOUT_CLK_UNIT capability bit.
This bit is set at least on the tested Zynq-7000 SoC.
With the tested hardware (SDHCI_TIMEOUT_CLK_UNIT set) this results in
too high a timeout clock rate being reported, causing the core to use
longer-than-needed timeouts. Additionally, on a partitioned MMC
(therefore having erase_group_def bit set) mmc_calc_max_discard()
disables discard support as it looks like controller does not support
the long timeouts needed for that.
Do not apply the extra divisor and return the timeout clock in the
expected unit.
Tested with a Zynq-7000 SoC and a partitioned Toshiba THGBMAG5A1JBAWR
eMMC card.
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: e3ec3a3d11 ("mmc: arasan: Add driver for Arasan SDHCI")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ce0c7b655 upstream.
The SDHCI controller in the SAMA5D2 chip requires a valid voltage set
in the power control register, otherwise commands will fail with a
timeout error.
When using the regulator framework to specify the regulator used by the
mmc device, the voltage is not configured, and it is not possible to use
the connected device.
Implement a custom 'set_power' function for this specific hardware, that
configures the voltage in the register in all cases.
Signed-off-by: Romain Izard <romain.izard.pro@gmail.com>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d98ce0be5 upstream.
We concluded there may be a window where the idle wakeup code could get
to pnv_wakeup_tb_loss() (which clobbers non-volatile GPRs), but the
hardware may set SRR1[46:47] to 01b (no state loss) which would result
in the wakeup code failing to restore non-volatile GPRs.
I was not able to trigger this condition with trivial tests on real
hardware or simulator, but the ISA (at least 2.07) seems to allow for
it, and Gautham says that it can happen if there is an exception pending
when the sleep/winkle instruction is executed.
Fixes: 1706567117 ("powerpc/kvm: make hypervisor state restore a function")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9cf625d6e upstream.
If ext4_convert_inline_data() was called on a directory with inline
data, the filesystem was left in an inconsistent state (as considered by
e2fsck) because the file size was not increased to cover the new block.
This happened because the inode was not marked dirty after i_disksize
was updated. Fix this by marking the inode dirty at the end of
ext4_finish_convert_inline_dir().
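As a minimal sketch of what that amounts to (placement and error handling
are assumptions, not the exact patch):
	/* after i_disksize/i_size have been bumped for the new block */
	err = ext4_mark_inode_dirty(handle, inode);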
This bug was probably not noticed before because most users mark the
inode dirty afterwards for other reasons. But if userspace executed
FS_IOC_SET_ENCRYPTION_POLICY with invalid parameters, as exercised by
'kvm-xfstests -c adv generic/396', then the inode was never marked dirty
after updating i_disksize.
Fixes: 3c47d54170
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03270c6ac6 upstream.
Usually every parallel port will have a single pardev registered with
it, but the ppdev driver is an exception. This userspace parallel port
driver allows creating multiple parallel port devices for a single
parallel port. As a result we were getting a warning like:
"sysctl table check failed:
/dev/parport/parport0/devices/ppdev0/timeslice Sysctl already exists"
Use the same logic as used in parport_register_device() and register
the proc files only once for each parallel port.
Fixes: 6fa45a2268 ("parport: add device-model to parport subsystem")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1414656
Bugzilla: https://bugs.archlinux.org/task/52322
Tested-by: James Feeney <james@nurealm.net>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3ff861f59f upstream.
Even if bus is not hot-pluggable, devices can be unbound from the
driver via sysfs, so we should not be using __exit annotations on
remove() methods. The only exception is drivers registered with
platform_driver_probe() which specifically disables sysfs bind/unbind
attributes.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3bec247474 upstream.
In function _hid_sensor_power_state(), when hid_sensor_read_poll_value()
is called, all of the sensor's properties are updated with values from
the sensor hardware/firmware.
In some implementations, the sensor hardware/firmware performs a power
cycle during S3. In this case, after resume, once
hid_sensor_read_poll_value() is called, all of the sensor's properties
that the driver preserved across S3 are reset to their default values.
If a set feature function is called first instead, the sensor
hardware/firmware is restored to its last state. So reorder the calls so
that hid_sensor_read_poll_value() runs after the set feature function
(sensor_hub_set_feature()), to avoid losing the sensor properties.
Signed-off-by: Song Hongyan <hongyan.song@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c42f821861 upstream.
Use the IS_ENABLED() helper macro to ensure that the configfs group is
initialized both when configfs is built-in and when configfs is built as a
module. Otherwise software device creation will result in undefined
behaviour when configfs is built as a module, since the configfs group for
the device is not properly initialized.
Similar to commit b2f0c09664 ("iio: sw-trigger: Fix config group
initialization").
Fixes: 0f3a8c3f34 ("iio: Add support for creating IIO devices via configfs")
Reported-by: Miguel Robles <miguel.robles@farole.net>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Daniel Baluta <daniel.baluta@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e83bb3e6f3 upstream.
The tiadc_irq_h(int irq, void *private) function handles FIFO
overruns by clearing flags and disabling and re-enabling the ADC to
recover.
If the ADC is running in continuous mode, a FIFO overrun happens
regularly. If the disabling of the ADC happens concurrently with
a new conversion, the hardware might ignore the re-enabling of the
ADC. This stops the ADC permanently; no more interrupts are triggered.
According to the AM335x Reference Manual (SPRUH73H October 2011 -
Revised April 2013 - Chapter 12.4 and 12.5) it is necessary to
check the ADC FSM bits in REG_ADCFSM before enabling the ADC
again, because the disabling of the ADC is done right after the
current conversion has finished.
To trigger this bug it is necessary to run the ADC in continuous
mode. The ADC values of all channels need to be read in an endless
loop. The bug appears within the first 6 hours (~5.4 million
handled FIFO overruns). The user space application will hang on
reading new values from the character device.
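A rough sketch of the recovery path described above; the idle test, the
timeout handling and the constant names are assumptions rather than the
exact driver code:
	/* wait for the sequencer FSM to go idle before re-enabling the ADC */
	do {
		adc_fsm = tiadc_readl(adc_dev, REG_ADCFSM);
	} while (adc_fsm != ADC_FSM_IDLE && timeout--);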
Fixes: ca9a563805 ("iio: ti_am335x_adc: Add continuous sampling support")
Signed-off-by: Michael Engl <michael.engl@wjw-solutions.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 181302dc72 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
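This pattern recurs in several of the endpoint-check fixes below; its
typical shape is a sketch along these lines (the required endpoint count
depends on the driver):
	struct usb_host_interface *alt = intf->cur_altsetting;

	if (alt->desc.bNumEndpoints < 1)
		return -ENODEV;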
Fixes: 53f3a9e26e ("mmc: USB SD Host Controller (USHC) driver")
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit daf229b159 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Note that the dereference happens in the start callback which is called
during probe.
Fixes: de520b8bd5 ("uwb: add HWA radio controller driver")
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ce362711d upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Note that the dereference happens in the cmd and wait_init_done
callbacks which are called during probe.
Fixes: 1ba47da527 ("uwb: add the i1480 DFU driver")
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e47c53503 upstream.
Make sure to initialise the return value to avoid having allocation
failures going unnoticed when allocating interrupt-endpoint resources.
This prevents use-after-free or worse when the device is later unbound.
Fixes: dbf3e7f654 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
Cc: Dave Penkler <dpenkler@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 687e0687f7 upstream.
USBTMC devices are required to have a bulk-in and a bulk-out endpoint,
but the driver failed to verify this, something which could lead to the
endpoint addresses being taken from uninitialised memory.
Make sure to zero all private data as part of allocation, and add the
missing endpoint sanity check.
Note that this also addresses a more recently introduced issue, where
the interrupt-in-presence flag would also be uninitialised whenever the
optional interrupt-in endpoint is not present. This in turn could lead
to an interrupt urb being allocated, initialised and submitted based on
uninitialised values.
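The zeroing part of the fix is simply a matter of using a zeroed
allocation, roughly (struct name assumed):
	data = kzalloc(sizeof(*data), GFP_KERNEL);
	if (!data)
		return -ENOMEM;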
Fixes: dbf3e7f654 ("Implement an ioctl to support the USMTMC-USB488 READ_STATUS_BYTE operation.")
Fixes: 5b775f672c ("USB: add USB test and measurement class driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b2db29fbb upstream.
If usb_get_bos_descriptor() returns an error, usb->bos will be NULL.
Nevertheless, it is dereferenced unconditionally in
hub_set_initial_usb2_lpm_policy() if usb2_hw_lpm_capable is set.
This results in a crash.
usb 5-1: unable to get BOS descriptor
...
Unable to handle kernel NULL pointer dereference at virtual address 00000008
pgd = ffffffc00165f000
[00000008] *pgd=000000000174f003, *pud=000000000174f003,
*pmd=0000000001750003, *pte=00e8000001751713
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in: uinput uvcvideo videobuf2_vmalloc cmac [ ... ]
CPU: 5 PID: 3353 Comm: kworker/5:3 Tainted: G B 4.4.52 #480
Hardware name: Google Kevin (DT)
Workqueue: events driver_set_config_work
task: ffffffc0c3690000 ti: ffffffc0ae9a8000 task.ti: ffffffc0ae9a8000
PC is at hub_port_init+0xc3c/0xd10
LR is at hub_port_init+0xc3c/0xd10
...
Call trace:
[<ffffffc0007fbbfc>] hub_port_init+0xc3c/0xd10
[<ffffffc0007fbe2c>] usb_reset_and_verify_device+0x15c/0x82c
[<ffffffc0007fc5e0>] usb_reset_device+0xe4/0x298
[<ffffffbffc0e3fcc>] rtl8152_probe+0x84/0x9b0 [r8152]
[<ffffffc00080ca8c>] usb_probe_interface+0x244/0x2f8
[<ffffffc000774a24>] driver_probe_device+0x180/0x3b4
[<ffffffc000774e48>] __device_attach_driver+0xb4/0xe0
[<ffffffc000772168>] bus_for_each_drv+0xb4/0xe4
[<ffffffc0007747ec>] __device_attach+0xd0/0x158
[<ffffffc000775080>] device_initial_probe+0x24/0x30
[<ffffffc0007739d4>] bus_probe_device+0x50/0xe4
[<ffffffc000770bd0>] device_add+0x414/0x738
[<ffffffc000809fe8>] usb_set_configuration+0x89c/0x914
[<ffffffc00080a120>] driver_set_config_work+0xc0/0xf0
[<ffffffc000249bb8>] process_one_work+0x390/0x6b8
[<ffffffc00024abcc>] worker_thread+0x480/0x610
[<ffffffc000251a80>] kthread+0x164/0x178
[<ffffffc0002045d0>] ret_from_fork+0x10/0x40
Since we don't know anything about LPM capabilities without the BOS
descriptor, don't attempt to enable LPM if it is not available.
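As a minimal sketch of the early return described above (inside
hub_set_initial_usb2_lpm_policy()):
	if (!udev->usb2_hw_lpm_capable || !udev->bos)
		return;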
Fixes: 890dae8867 ("xhci: Enable LPM support only for hardwired ...")
Cc: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0090114d33 upstream.
The CPPI 4.1 driver polls a register to work around the premature TX
interrupt issue, but this causes audio playback underruns when triggered in
Isoch transfers.
Isoch doesn't do back-to-back transfers; the TX should be done by the
time the next transfer is scheduled. So skip this polling workaround for
Isoch transfers.
Fixes: a655f481d8 ("usb: musb: musb_cppi41: handle pre-mature TX complete interrupt")
Reported-by: Alexandre Bailon <abailon@baylibre.com>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Tested-by: Alexandre Bailon <abailon@baylibre.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03ace948a4 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.
This specifically fixes the NULL-pointer dereference when probing HWA HC
devices.
Fixes: df3654236e ("wusb: add the Wire Adapter (WA) core")
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Cc: David Vrabel <david.vrabel@csr.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b0addd3fa6 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1dc56c52d2 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should the probed device lack endpoints.
Note that this driver does not bind to any devices by default.
Fixes: ce21bfe603 ("USB: Add LVS Test device driver")
Cc: Pratyush Anand <pratyush.anand@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f259ca3eed upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory beyond the endpoint array should a
malicious device lack the expected endpoints.
Note that the endpoint access that causes the NULL-deref is currently
only used for debugging purposes during probe so the oops only happens
when dynamic debugging is enabled. This means the driver could be
rewritten to continue to accept devices with only two endpoints, should
such devices exist.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3243367b20 upstream.
Some USB 2.0 devices erroneously report millisecond values in
bInterval. The generic config code manages to catch most of them,
but in some cases that's not enough.
The case at stake here is a USB 2.0 braille device, which wants to
announce 10ms and thus sets bInterval to 10, but with the USB 2.0
computation that yields 64ms. It happens that one can type fast
enough to reach this interval and overflow the device buffers,
leading to problematic latencies. The generic config code does not
catch this case because 64ms is considered a sane enough value.
This change thus adds a USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL quirk
to mark devices which actually report milliseconds in bInterval,
and marks Vario Ultra devices as needing it.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 09424c50b7 upstream.
The streaming_maxburst module parameter is zero-based (0..15),
so we must add 1 when using it in the wBytesPerInterval
calculation for the SuperSpeed companion descriptor.
Without this, the host uvcvideo driver will always see the wrong
wBytesPerInterval for a SuperSpeed UVC gadget and may not find
a suitable video interface endpoint.
E.g. for the streaming_maxburst = 0 case it will always
fail, as wBytesPerInterval evaluates to 0.
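As a rough sketch of the calculation (descriptor and option field names are
assumptions):
	uvc_ss_streaming_comp.wBytesPerInterval =
		cpu_to_le16(max_packet_size * max_packet_mult *
			    (opts->streaming_maxburst + 1));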
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cdd7928df0 upstream.
The gadget code exports the bitfield for serial status changes
over the wire in its internal endianness. The fix is to convert
to little endian before sending it over the wire.
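A minimal sketch of that conversion, with the field name assumed: build the
wire-format value explicitly and pass that buffer to the notification
request instead of the CPU-endian field.
	__le16 serial_state = cpu_to_le16(acm->serial_state);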
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Tested-by: 家瑋 <momo1208@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e9f44eaae upstream.
Add Quectel UC15, UC20, EC21, and EC25. The EC20 is handled by
qcserial due to a USB VID/PID conflict with an existing Acer
device.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f307834e6 upstream.
A new Dell laptop needs to apply ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to
fix the headset problem, and the pin definition of this machine is not
in the pin quirk table yet, now adding it to the table.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f363a06642 upstream.
In the commit [15c75b09f8: ALSA: ctxfi: Fallback DMA mask to 32bit],
I forgot to put "!" at the dma_set_mask() call check in cthw20k1.c (while
cthw20k2.c is OK). This patch fixes that obvious bug.
(As a side note: although the original commit was completely wrong,
it's still working for most of machines, as it sets to 32bit DMA mask
in the end. So the bug severity is low.)
Fixes: 15c75b09f8 ("ALSA: ctxfi: Fallback DMA mask to 32bit")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c520ff3d03 upstream.
When snd_seq_pool_done() is called, it marks the closing flag to
refuse the further cell insertions. But snd_seq_pool_done() itself
doesn't clear the cells but just waits until all cells are cleared by
the caller side. That is, it's racy, and this leads to the endless
stall as syzkaller spotted.
This patch addresses the race by splitting the setup of the pool->closing
flag out of snd_seq_pool_done(), and calling it properly before
snd_seq_pool_done().
BugLink: http://lkml.kernel.org/r/CACT4Y+aqqy8bZA1fFieifNxR2fAfFQQABcBHj801+u5ePV0URw@mail.gmail.com
Reported-and-tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92461f5d72 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory that lies beyond the end of the endpoint
array should a malicious device lack the expected endpoints.
Fixes: bdb5c57f20 ("Input: add sur40 driver for Samsung SUR40... ")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb1b494663 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac2ee9ba95 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Fixes: c04148f915 ("Input: add driver for USB VoIP phones with CM109...")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5cc4a1a9f5 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Fixes: aca951a22a ("[PATCH] input-driver-yealink-P1K-usb-phone")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba340d7b83 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Fixes: bba5394ad3 ("Input: add support for Hanwang tablets")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1916d31927 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack control-interface endpoints.
Fixes: 628329d524 ("Input: add IMS Passenger Control Unit driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 59cf8bed44 upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer or accessing memory that lies beyond the end of the endpoint
array should a malicious device lack the expected endpoints.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 92ef6f97a6 upstream.
EeeBook X205TA is yet another ASUS device with a special touchpad
firmware that needs to be accounted for during initialization, or
else the touchpad will go into an invalid state upon suspend/resume.
Adding the appropriate ic_type and product_id check fixes the problem.
Signed-off-by: Matjaz Hegedic <matjaz.hegedic@gmail.com>
Acked-by: KT Liao <kt.liao@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 15bb7745e9 ]
icsk_ack.lrcvtime has a 0 value at socket creation time.
tcpi_last_data_recv can have bogus value if no payload is ever received.
This patch initializes icsk_ack.lrcvtime for active sessions
in tcp_finish_connect(), and for passive sessions in
tcp_create_openreq_child()
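The shape of the fix is a one-line initialization in both paths, roughly
(the timestamp source assumed for this kernel generation):
	icsk->icsk_ack.lrcvtime = tcp_time_stamp;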
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a97e50cc4c ]
In sk_clone_lock(), we create a new socket and inherit most of the
parent's members via sock_copy() which memcpy()'s various sections.
Now, in case the parent socket had a BPF socket filter attached,
then newsk->sk_filter points to the same instance as the original
sk->sk_filter.
sk_filter_charge() is then called on the newsk->sk_filter to take a
reference and should that fail due to hitting max optmem, we bail
out and release the newsk instance.
The issue is that commit 278571baca ("net: filter: simplify socket
charging") wrongly combined the dismantle path with the failure path
of xfrm_sk_clone_policy(). This means, even when charging failed, we
call sk_free_unlock_clone() on the newsk, which then still points to
the same sk_filter as the original sk.
Thus, sk_free_unlock_clone() calls into __sk_destruct() eventually
where it tests for present sk_filter and calls sk_filter_uncharge()
on it, which potentially lets sk_omem_alloc wrap around and releases
the eBPF prog and sk_filter structure from the (still intact) parent.
Fix it by making sure that when sk_filter_charge() failed, we reset
newsk->sk_filter back to NULL before passing to sk_free_unlock_clone(),
so that we don't mess with the parents sk_filter.
Only if xfrm_sk_clone_policy() fails, we did reach the point where
either the parent's filter was NULL and as a result newsk's as well
or where we previously had a successful sk_filter_charge(), thus for
that case, we do need sk_filter_uncharge() to release the prior taken
reference on sk_filter.
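An illustrative sketch of the error path in sk_clone_lock() after the
change (not the literal patch):
	if (unlikely(!is_charged || xfrm_sk_clone_policy(newsk, sk))) {
		/* don't uncharge (and free) the parent's filter */
		if (!is_charged)
			RCU_INIT_POINTER(newsk->sk_filter, NULL);
		sk_free_unlock_clone(newsk);
		newsk = NULL;
		goto out;
	}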
Fixes: 278571baca ("net: filter: simplify socket charging")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c64c0b3cac ]
Alexander reported a KMSAN splat caused by reads of uninitialized
field (tb_id_in) from user provided struct fib_result_nl
It turns out that the nl_fib_input() sanity tests on user input are a bit
wrong:
a user can pretend nlh->nlmsg_len is big enough, but provide
a buffer that is too small at sendmsg() time.
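The tightened check amounts to something like the following sketch, which
rejects messages whose buffer is smaller than the advertised length:
	if (skb->len < nlmsg_total_size(sizeof(*frn)) ||
	    skb->len < nlh->nlmsg_len ||
	    nlh->nlmsg_len < NLMSG_HDRLEN + sizeof(*frn))
		return;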
Reported-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 31739eae73 ]
Commit 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset")
removed the bcmgenet_mii_reset() function from bcmgenet_power_up() and
bcmgenet_internal_phy_setup() functions. In so doing it broke the reset
of the internal PHY devices used by the GENETv1-GENETv3 which required
this reset before the UniMAC was enabled. It also broke the internal
GPHY devices used by the GENETv4 because the config_init that installed
the AFE workaround was no longer occurring after the reset of the GPHY
performed by bcmgenet_phy_power_set() in bcmgenet_internal_phy_setup().
In addition the code in bcmgenet_internal_phy_setup() related to the
"enable APD" comment goes with the bcmgenet_mii_reset() so it should
have also been removed.
Commit bd4060a610 ("net: bcmgenet: Power on integrated GPHY in
bcmgenet_power_up()") moved the bcmgenet_phy_power_set() call to the
bcmgenet_power_up() function, but failed to remove it from the
bcmgenet_internal_phy_setup() function. Had it done so, the
bcmgenet_internal_phy_setup() function would have been empty and could
have been removed at that time.
Commit 5dbebbb44a ("net: bcmgenet: Software reset EPHY after power on")
was submitted to correct the functional problems introduced by
commit 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset"). It
was included in v4.4 and made available on 4.3-stable. Unfortunately,
it didn't fully revert the commit because this bcmgenet_mii_reset()
doesn't apply the soft reset to the internal GPHY used by GENETv4 like
the previous one did. This prevents the restoration of the AFE work-
arounds for internal GPHY devices after the bcmgenet_phy_power_set() in
bcmgenet_internal_phy_setup().
This commit takes the alternate approach of removing the unnecessary
bcmgenet_internal_phy_setup() function, which shouldn't have remained in
v4.3, so that when bcmgenet_mii_reset() was restored it only went
into bcmgenet_power_up(). This avoids the problems while also
removing the redundancy (and hopefully some of the confusion).
Fixes: 6ac3ce8295 ("net: bcmgenet: Remove excessive PHY reset")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d515684d78 ]
In the case udp_sk(sk)->pending is AF_INET6, udpv6_sendmsg() would
jump to do_append_data, skipping the initialization of sockc.tsflags.
Fix the problem by moving sockc.tsflags initialization earlier.
The bug was detected with KMSAN.
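The fix is essentially a matter of performing the initialization before
any branch to do_append_data, along the lines of:
	sockc.tsflags = sk->sk_tsflags;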
Fixes: c14ac9451c ("sock: enable timestamping using control messages")
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8ab7e2ae15 ]
RX packets statistics ('rx_packets' counter) used to count LRO packets
as one, even though it contains multiple segments.
This patch will increment the counter by the number of segments, and
align the driver with the behavior of other drivers in the stack.
Note that no information is lost in this patch due to 'rx_lro_packets'
counter existence.
Before, ethtool showed:
$ ethtool -S ens6 | egrep "rx_packets|rx_lro_packets"
rx_packets: 435277
rx_lro_packets: 35847
rx_packets_phy: 1935066
Now, we will see the more logical statistics:
$ ethtool -S ens6 | egrep "rx_packets|rx_lro_packets"
rx_packets: 1935066
rx_lro_packets: 35847
rx_packets_phy: 1935066
Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d3a4e4da54 ]
TX packets statistics ('tx_packets' counter) used to count GSO packets
as one, even though it contains multiple segments.
This patch will increment the counter by the number of segments, and
align the driver with the behavior of other drivers in the stack.
Note that no information is lost in this patch due to 'tx_tso_packets'
counter existence.
Before, ethtool showed:
$ ethtool -S ens6 | egrep "tx_packets|tx_tso_packets"
tx_packets: 61340
tx_tso_packets: 60954
tx_packets_phy: 2451115
Now, we will see the more logical statistics:
$ ethtool -S ens6 | egrep "tx_packets|tx_tso_packets"
tx_packets: 2451115
tx_tso_packets: 60954
tx_packets_phy: 2451115
Fixes: e586b3b0ba ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5f40b4ed97 ]
With ConnectX-4 sharing SRQs from the same space as QPs, we hit a
limit preventing some applications to allocate needed QPs amount.
Double the size to 256K.
Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1f30a86c58 ]
The switch cases for the rate limit set and query commands were
missing, which could get us wrong under fw error or driver reset
flow, fix that.
Fixes: 1466cc5b23 ('net/mlx5: Rate limit tables support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3dc857f0e8 ]
The VRF driver takes a reference to the inet6_dev on the VRF device for
its rt6_local dst when handling local traffic through the VRF device as
a loopback. When the device is deleted the driver does a put on the idev
but does not reset rt6i_idev in the rt6_info struct. When the dst is
destroyed, dst_destroy calls ip6_dst_destroy which does a second put for
what is essentially the same reference causing it to be prematurely freed.
Reset rt6i_idev after the put in the vrf driver.
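Roughly (variable names assumed):
	in6_dev_put(rt6_local->rt6i_idev);
	rt6_local->rt6i_idev = NULL;	/* prevent a second put in ip6_dst_destroy() */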
Fixes: b4869aa2f8 ("net: vrf: ipv6 support for local traffic to
local addresses")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6bd845d1cf ]
This is a Dell branded Sierra Wireless EM7455. It is operating in
MBIM mode by default, but can be configured to provide two QMI/RMNET
functions.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7df9c24625 ]
Dmitry has reported that a BUG_ON() condition in unix_notinflight()
may be triggered by a simple code that forwards unix socket in an
SCM_RIGHTS message.
That is caused by incorrect unix socket GC implementation in unix_gc().
The GC first collects list of candidates, then (a) decrements their
"children's" inflight counter, (b) checks which inflight counters are
now 0, and then (c) increments all inflight counters back.
(a) and (c) are done by calling scan_children() with inc_inflight or
dec_inflight as the second argument.
Commit 6209344f5a ("net: unix: fix inflight counting bug in garbage
collector") changed scan_children() such that it no longer considers
sockets that do not have the UNIX_GC_CANDIDATE flag. It also added a block
of code that unsets this flag _before_ invoking
scan_children(, dec_inflight, ). This may lead to incorrect inflight
counters for some sockets.
This change fixes this bug by changing order of operations:
UNIX_GC_CANDIDATE is now unset only after all inflight counters are
restored to the original state.
kernel BUG at net/unix/garbage.c:149!
RIP: 0010:[<ffffffff8717ebf4>] [<ffffffff8717ebf4>]
unix_notinflight+0x3b4/0x490 net/unix/garbage.c:149
Call Trace:
[<ffffffff8716cfbf>] unix_detach_fds.isra.19+0xff/0x170 net/unix/af_unix.c:1487
[<ffffffff8716f6a9>] unix_destruct_scm+0xf9/0x210 net/unix/af_unix.c:1496
[<ffffffff86a90a01>] skb_release_head_state+0x101/0x200 net/core/skbuff.c:655
[<ffffffff86a9808a>] skb_release_all+0x1a/0x60 net/core/skbuff.c:668
[<ffffffff86a980ea>] __kfree_skb+0x1a/0x30 net/core/skbuff.c:684
[<ffffffff86a98284>] kfree_skb+0x184/0x570 net/core/skbuff.c:705
[<ffffffff871789d5>] unix_release_sock+0x5b5/0xbd0 net/unix/af_unix.c:559
[<ffffffff87179039>] unix_release+0x49/0x90 net/unix/af_unix.c:836
[<ffffffff86a694b2>] sock_release+0x92/0x1f0 net/socket.c:570
[<ffffffff86a6962b>] sock_close+0x1b/0x20 net/socket.c:1017
[<ffffffff81a76b8e>] __fput+0x34e/0x910 fs/file_table.c:208
[<ffffffff81a771da>] ____fput+0x1a/0x20 fs/file_table.c:244
[<ffffffff81483ab0>] task_work_run+0x1a0/0x280 kernel/task_work.c:116
[< inline >] exit_task_work include/linux/task_work.h:21
[<ffffffff8141287a>] do_exit+0x183a/0x2640 kernel/exit.c:828
[<ffffffff8141383e>] do_group_exit+0x14e/0x420 kernel/exit.c:931
[<ffffffff814429d3>] get_signal+0x663/0x1880 kernel/signal.c:2307
[<ffffffff81239b45>] do_signal+0xc5/0x2190 arch/x86/kernel/signal.c:807
[<ffffffff8100666a>] exit_to_usermode_loop+0x1ea/0x2d0
arch/x86/entry/common.c:156
[< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190
[<ffffffff81009693>] syscall_return_slowpath+0x4d3/0x570
arch/x86/entry/common.c:259
[<ffffffff881478e6>] entry_SYSCALL_64_fastpath+0xc4/0xc6
Link: https://lkml.org/lkml/2017/3/6/252
Signed-off-by: Andrey Ulanov <andreyu@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 6209344 ("net: unix: fix inflight counting bug in garbage collector")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8f3dbfd79e ]
Added a case for OVS_TUNNEL_KEY_ATTR_PAD to the switch statement
in ip_tun_from_nlattr in order to prevent the default case
returning an error.
Fixes: b46f6ded90 ("libnl: nla_put_be64(): align on a 64-bit area")
Signed-off-by: Kris Murphy <kriskend@linux.vnet.ibm.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 622c36f143 ]
Newer hardware does not provide a cumulative payload length when multiple
descriptors are needed to handle the data. Once the MTU increases beyond
the size that can be handled by a single descriptor, the SKB does not get
built properly by the driver.
The driver will now calculate the size of the data buffers used by the
hardware. The first buffer of the first descriptor is for packet headers
or packet headers and data when the headers can't be split. Subsequent
descriptors in a multi-descriptor chain will not use the first buffer. The
second buffer is used by all the descriptors in the chain for payload data.
Based on whether the driver is processing the first, intermediate, or last
descriptor it can calculate the buffer usage and build the SKB properly.
Tested and verified on both old and new hardware.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 22a0e18eac ]
I mistakenly added the code to release sk->sk_frag in
sk_common_release() instead of sk_destruct()
TCP sockets using sk->sk_allocation == GFP_ATOMIC do not call
sk_common_release() at close time, thus leaking one (order-3) page.
iSCSI is using such sockets.
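Moving the release into the destructor path is a short sketch:
	/* in __sk_destruct(), so every socket path frees the per-socket frag */
	if (sk->sk_frag.page) {
		put_page(sk->sk_frag.page);
		sk->sk_frag.page = NULL;
	}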
Fixes: 5640f76858 ("net: use a per task frag allocator")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5371bbf4b2 ]
Suspending the PHY would be putting it in a low power state where it
may no longer allow us to do Wake-on-LAN.
Fixes: cc013fb488 ("net: bcmgenet: correctly suspend and resume PHY device")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3d20f1f7bd ]
When dealing with ipv6 source tunnel key address attribute
(OVS_TUNNEL_KEY_ATTR_IPV6_SRC) we are wrongly setting the tunnel
dst ip, fix that.
Fixes: 6b26ba3a7d ('openvswitch: netlink attributes for IPv6 tunneling')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reported-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Joe Stringer <joe@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The brightness_set method is intended for use cases that must not
block, and can only be used if the GPIO provider can never sleep.
Remove an accidental initialisation (a copy-and-paste error) that
sets it regardless, which has been seen to cause crashes with the
gpio expander driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Upcoming firmware will modify the address portion of node names when
their "reg" property is written by a dtparam. Modify the w1-gpio
overlays to write the gpiopin parameter value to "reg" properties, so
that multiple instances can be loaded simultaneously.
Note: The value of the "address" is unimportant - the w1 subsystem
assigns instance numbers to buses sequentially from 1, and it is
not necessary to know which bus a device is on in order to find it.
This patch consolidates the sample rates which the audioinjector.net octo
sound card can use. The consolidation of sample rates are due to the
capabilities of the cs42448 codec.
This codec also requires a hard reset using the GPIO pin 5 upon probe.
commit 1d18c2747f upstream.
pids_can_fork() is special in that the css association is guaranteed
to be stable throughout the function and thus doesn't need RCU
protection around task_css access. When determining the css to charge
the pid, task_css_check() is used to override the RCU sanity check.
While adding a warning message on fork rejection from pids limit,
135b8b37bd ("cgroup: Add pids controller event when fork fails
because of pid limit") incorrectly added a task_css access which is
neither RCU protected or explicitly annotated. This triggers the
following suspicious RCU usage warning when RCU debugging is enabled.
cgroup: fork rejected by pids controller in
===============================
[ ERR: suspicious RCU usage. ]
4.10.0-work+ #1 Not tainted
-------------------------------
./include/linux/cgroup.h:435 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 0
1 lock held by bash/1748:
#0: (&cgroup_threadgroup_rwsem){+++++.}, at: [<ffffffff81052c96>] _do_fork+0xe6/0x6e0
stack backtrace:
CPU: 3 PID: 1748 Comm: bash Not tainted 4.10.0-work+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014
Call Trace:
dump_stack+0x68/0x93
lockdep_rcu_suspicious+0xd7/0x110
pids_can_fork+0x1c7/0x1d0
cgroup_can_fork+0x67/0xc0
copy_process.part.58+0x1709/0x1e90
_do_fork+0xe6/0x6e0
SyS_clone+0x19/0x20
do_syscall_64+0x5c/0x140
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f7853fab93a
RSP: 002b:00007ffc12d05c90 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7853fab93a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
RBP: 00007ffc12d05cc0 R08: 0000000000000000 R09: 00007f78548db700
R10: 00007f78548db9d0 R11: 0000000000000246 R12: 00000000000006d4
R13: 0000000000000001 R14: 0000000000000000 R15: 000055e3ebe2c04d
/asdf
There's no reason to dereference task_css again here when the
associated css is already available. Fix it by replacing the
task_cgroup() call with css->cgroup.
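The shape of the fix, as an illustrative sketch (the printing helper is an
assumption):
	/* css is pinned for the whole of pids_can_fork(), so use it directly */
	pr_info_ratelimited("cgroup: fork rejected by pids controller in ");
	pr_cont_cgroup_path(css->cgroup);
	pr_cont("\n");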
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Galbraith <efault@gmx.de>
Fixes: 135b8b37bd ("cgroup: Add pids controller event when fork fails because of pid limit")
Cc: Kenny Yu <kennyyu@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 320661b08d upstream.
Update to pcpu_nr_empty_pop_pages in pcpu_alloc() is currently done
without holding pcpu_lock. This can lead to bad updates to the variable.
Add missing lock calls.
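Roughly, the update becomes (the local variable name is an assumption):
	spin_lock_irqsave(&pcpu_lock, flags);
	pcpu_nr_empty_pop_pages -= occ_pages;
	spin_unlock_irqrestore(&pcpu_lock, flags);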
Fixes: b539b87fed ("percpu: implmeent pcpu_nr_empty_pop_pages and chunk->nr_populated")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 28ea06c46f upstream.
Commit 88ffbf3e03 switches to using rhashtables for glocks, hashing over
the entire struct lm_lockname instead of its individual fields. On some
architectures, struct lm_lockname contains a hole of uninitialized
memory due to alignment rules, which now leads to incorrect hash values.
Get rid of that hole.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68c32f9c2a upstream.
Make sure to check the number of endpoints to avoid dereferencing a
NULL-pointer should a malicious device lack endpoints.
Fixes: cf7776dc05 ("[PATCH] isdn4linux: Siemens Gigaset drivers - direct USB connection")
Cc: Hansjoerg Lipp <hjlipp@web.de>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13603685c1 upstream.
As reported by Max, the Windows 2008 R2 chkdsk utility expects
VERIFY_16 to be supported, and does not handle the returned
CHECK_CONDITION properly, resulting in an infinite loop.
The kernel will log huge amounts of this error:
kernel: TARGET_CORE[iSCSI]: Unsupported SCSI Opcode 0x8f, sending
CHECK_CONDITION.
Signed-off-by: Max Lohrmann <post@wickenrode.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f8830f5bb upstream.
There's a rather long-standing regression from the commit "libiscsi:
Reduce locking contention in fast path"
Depending on iSCSI target behavior, it's possible to hit the case in
iscsi_complete_task where the task is still on a pending list
(!list_empty(&task->running)). When that happens the task is removed
from the list while holding the session back_lock, but other task list
modification occur under the frwd_lock. That leads to linked list
corruption and eventually a panicked system.
Rather than back out the session lock split entirely, in order to try
and keep some of the performance gains this patch adds another lock to
maintain the task lists integrity.
Major enterprise supported kernels have been backing out the lock split
for a while now, thanks to the efforts at IBM where a lab setup has the
most reliable reproducer I've seen on this issue. This patch has been
tested there successfully.
Signed-off-by: Chris Leech <cleech@redhat.com>
Fixes: 659743b02c ("[SCSI] libiscsi: Reduce locking contention in fast path")
Reported-by: Prashantha Subbarao <psubbara@us.ibm.com>
Reviewed-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a04e54f2c3 upstream.
The following fixes a divide-by-zero OOPS with TYPE_TAPE
due to pscsi_tape_read_blocksize() failing, causing a zero
sd->sector_size being propagated up via dev_attrib.hw_block_size.
It also fixes another long-standing bug where TYPE_TAPE and
TYPE_MEDIUM_CHANGER were using pscsi_create_type_other(),
which does not call scsi_device_get() to take the device
reference. Instead, rename pscsi_create_type_rom() to
pscsi_create_type_nondisk() and use it for all cases.
Finally, also drop a dump_stack() in pscsi_get_blocks() for
non TYPE_DISK, which in modern target-core can get invoked
via target_sense_desc_format() during CHECK_CONDITION.
Reported-by: Malcolm Haak <insanemal@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61eb2b43b9 upstream.
Neil Brown pointed out a potential deadlock in raid 10 code with
bio_split/chain. The raid1 code could have the same issue, but recent
barrier rework makes it less likely to happen. The deadlock happens in
below sequence:
1. generic_make_request(bio), this will set current->bio_list
2. raid10_make_request will split bio to bio1 and bio2
3. __make_request(bio1), wait_barrier, add underlying disk bio to
current->bio_list
4. __make_request(bio2), wait_barrier
If raise_barrier happens between 3 & 4, since wait_barrier runs at 3,
raise_barrier waits for IO completion from 3. And since raise_barrier
sets barrier, 4 waits for raise_barrier. But IO from 3 can't be
dispatched because raid10_make_request() hasn't finished yet.
The solution is to adjust the IO ordering. Quotes from Neil:
"
It is much safer to:
if (need to split) {
split = bio_split(bio, ...)
bio_chain(...)
make_request_fn(split);
generic_make_request(bio);
} else
make_request_fn(mddev, bio);
This way we first process the initial section of the bio (in 'split')
which will queue some requests to the underlying devices. These
requests will be queued in generic_make_request.
Then we queue the remainder of the bio, which will be added to the end
of the generic_make_request queue.
Then we return.
generic_make_request() will pop the lower-level device requests off the
queue and handle them first. Then it will process the remainder
of the original bio once the first section has been fully processed.
"
Note, this only happens in the read path. In the write path, the bio is
flushed to the underlying disks either by the blk flush (from schedule) or
offloaded to raid1/10d. It's queued in current->bio_list.
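For the read path, the adjusted ordering following Neil's outline looks
roughly like this in C (a sketch; helper names are taken from the
description above and details are simplified):
        if (sectors < bio_sectors(bio)) {
                struct bio *split = bio_split(bio, sectors,
                                              GFP_NOIO, fs_bio_set);
                bio_chain(split, bio);
                /* handle the first section now... */
                __make_request(mddev, split);
                /* ...and queue the remainder behind the lower-level bios
                 * that __make_request() just added to current->bio_list */
                generic_make_request(bio);
        } else
                __make_request(mddev, bio);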
Cc: Coly Li <colyli@suse.de>
Suggested-by: NeilBrown <neilb@suse.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97ee351b50 upstream.
Recent toolchains force the TOC to be 256 byte aligned. We need to
enforce this alignment in the zImage linker script, otherwise pointers
to our TOC variables (__toc_start) could be incorrect. If the actual
start of the TOC and __toc_start don't have the same value we crash
early in the zImage wrapper.
Suggested-by: Alan Modra <amodra@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63513232f8 upstream.
Since rpc_task is async, the release function should be called which
will free the impl_id, scope, and owner.
Trond pointed at 2 more problems:
-- use of client pointer after free in the nfs4_exchangeid_release() function
-- cl_count mismatch if rpc_run_task() isn't run
Fixes: 8d89bd70bc ("NFS setup async exchange_id")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eed50879d6 upstream.
New complaint from kbuild for 4.9.y:
net/sunrpc/xprtrdma/verbs.c:489:19: sparse: incompatible types in
comparison expression (different type sizes)
verbs.c:
489 max_sge = min(ia->ri_device->attrs.max_sge, RPCRDMA_MAX_SEND_SGES);
I can't reproduce this running sparse here. Likewise, "make W=1
net/sunrpc/xprtrdma/verbs.o" never indicated any issue.
A little poking suggests that because the range of its values is
small, gcc can make the actual width of RPCRDMA_MAX_SEND_SGES
smaller than the width of an unsigned integer.
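The usual way to silence this class of warning (a sketch, not necessarily
the exact patch) is to make the comparison type explicit with min_t():
        max_sge = min_t(unsigned int, ia->ri_device->attrs.max_sge,
                        RPCRDMA_MAX_SEND_SGES);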
Fixes: 16f906d66c ("xprtrdma: Reduce required number of send SGEs")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73580dac76 upstream.
On those parisc machines which don't provide a software power off
function, the system currently kills the init process at the end of a
shutdown and unexpectedly restarts instead of halting.
Fix it by adding a loop which will not return.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 316ec0624f upstream.
The previously submitted patch did not resolve the random segmentation
faults observed on the phantom buildd system. There are still
unresolved problems with the Debian 4.8 and 4.9 kernels on C8000.
The attached patch removes the flush of the offset map pages and does a
whole data cache flush for large ranges. No other arch flushes the
offset map in these routines as far as I can tell.
I have not observed any random segmentation faults on rp3440 in two
weeks of testing with 4.10.0 and 4.10.1.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b666809e1 upstream.
When FW notify driver or driver detects low FW resource,
driver tries to send out Busy SCSI Status to tell Initiator
side to back off. During the send process, the lock was not held.
Signed-off-by: Quinn Tran <quinn.tran@qlogic.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 474c90156c upstream.
gcc-7 has an "optimization" pass that completely screws up, and
generates the code expansion for the (impossible) case of calling
ilog2() with a zero constant, even when the code gcc compiles does not
actually have a zero constant.
And we try to generate a compile-time error for anybody doing ilog2() on
a constant where that doesn't make sense (be it zero or negative). So
now gcc7 will fail the build due to our sanity checking, because it
created that constant-zero case that didn't actually exist in the source
code.
There's a whole long discussion on the kernel mailing list about how to work
around this gcc bug. The gcc people themselves have discussed their
"feature" in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72785
but it's all water under the bridge, because while it looked at one
point like it would be solved by the time gcc7 was released, that was
not to be.
So now we have to deal with this compiler braindamage.
And the only simple approach seems to be to just delete the code that
tries to warn about bad uses of ilog2().
So now "ilog2()" will just return 0 not just for the value 1, but for
any non-positive value too.
It's not like I can recall anybody having ever actually tried to use
this function on any invalid value, but maybe the sanity check just
meant that such code never made it out in public.
Reported-by: Laura Abbott <labbott@redhat.com>
Cc: John Stultz <john.stultz@linaro.org>,
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a62234680 upstream.
The pm_runtime_put() we were using immediately released power on the
device, which meant that we were generally turning the device off and
on once per frame. In many profiles I've looked at, that added up to
about 1% of CPU time, but this could get worse in the case of frequent
rendering and readback (as may happen in X rendering). By keeping the
device on until we've been idle for a couple of frames, we drop the
overhead of runtime PM down to sub-.1%.
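The standard runtime-PM pattern for this is autosuspend; a sketch (the
delay value is illustrative, not necessarily what the driver uses):
        /* at init: keep V3D powered while jobs keep arriving */
        pm_runtime_set_autosuspend_delay(dev, 40);      /* ms, ~2 frames */
        pm_runtime_use_autosuspend(dev);

        /* on job completion, instead of pm_runtime_put(dev): */
        pm_runtime_mark_last_busy(dev);
        pm_runtime_put_autosuspend(dev);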
Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 457e67a728 upstream.
The loop is scanning until the original max_ip (size of the BO), but
we want to not examine any code after the PROG_END's delay slots.
There was a block trying to do that, except that we had some early
continue statements if the signal wasn't a PROG_END or a BRANCH.
The failure mode would be that a valid shader is rejected because some
undefined memory after the PROG_END slots is parsed as a branch and
the rest of its setup is illegal. I haven't seen this in the wild,
but valgrind was complaining about this in the userland simulator
mode.
Signed-off-by: Eric Anholt <eric@anholt.net>
Cc: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa2be9b3d6 upstream.
Turning on crypto self-tests on a POWER8 shows:
alg: hash: Test 1 failed for crc32c-vpmsum
00000000: ff ff ff ff
Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.
This probably wasn't caught because btrfs does its own weird
open-coded initialisation.
Initialise our internal context to ~0 on init.
This makes the self-tests pass, and btrfs continues to work.
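The change amounts to seeding the transform context with all ones; a
sketch of the init hook (context layout simplified):
        static int crc32c_vpmsum_cra_init(struct crypto_tfm *tfm)
        {
                u32 *key = crypto_tfm_ctx(tfm);

                /* CRC32c starts from an all-ones value, matching the
                 * Intel implementation this code is based on */
                *key = ~0;
                return 0;
        }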
Fixes: 6dd7a82cc5 ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44fee88cea upstream.
Subhransu reported that convert_art_to_tsc() isn't working for him.
The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefore the ART to TSC relation
is never established.
Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.
The problem has been there since the ART support was introduced, but only
detected now because Subhransu tested for the first time on hardware which has
TSC frequency enumerated via CPUID 15.
Note for stable: The conditional has changed from TSC_RELIABLE to
TSC_KNOWN_FREQUENCY.
[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]
Fixes: f9677e0f83 ("x86/tsc: Always Running Timer (ART) correlated clocksource")
Reported-by: "Prusty, Subhransu S" <subhransu.s.prusty@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: christopher.s.hall@intel.com
Cc: kevin.b.stanton@intel.com
Cc: john.stultz@linaro.org
Cc: akataria@vmware.com
Link: http://lkml.kernel.org/r/20170313145712.GI3312@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90922a2d03 upstream.
On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware
implementation uses 16 bytes for Interrupt Translation Entry (ITE),
but reports an incorrect value of 8 bytes in GITS_TYPER.ITTE_size.
It might cause kernel memory corruption depending on the number
of MSI(x) that are configured and the amount of memory that has
been allocated for ITEs in its_create_device().
This patch fixes the potential memory corruption by setting the
correct ITE size to 16 bytes.
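Such a quirk boils down to overriding the ITE size that was read from
GITS_TYPER; a sketch (the structure field name is an assumption based on
the description above):
        static void its_enable_quirk_qdf2400_e0065(void *data)
        {
                struct its_node *its = data;

                /* GITS_TYPER.ITTE_size reports 8 bytes, but the hardware
                 * actually uses 16-byte entries */
                its->ite_size = 16;
        }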
Cc: stable@vger.kernel.org
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6892517629 upstream.
When invalidating guest TLBs, special care must be taken to
actually shoot the guest TLBs and not the host ones if we're
running on a VHE system. This is controlled by the HCR_EL2.TGE
bit, which we forget to clear before invalidating TLBs.
Address the issue by introducing two wrappers (__tlb_switch_to_guest
and __tlb_switch_to_host) that take care of both the VTTBR_EL2
and HCR_EL2.TGE switching.
Reported-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Tested-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a05ef161cd ]
Currently the build breaks if CMA=n and SPAPR_TCE_IOMMU=y:
arch/powerpc/mm/mmu_context_iommu.c: In function ‘mm_iommu_get’:
arch/powerpc/mm/mmu_context_iommu.c:193:42: error: ‘MIGRATE_CMA’ undeclared (first use in this function)
if (get_pageblock_migratetype(page) == MIGRATE_CMA) {
^~~~~~~~~~~
Fix it by using the existing is_migrate_cma_page(), which evaluates to
false when CMA=n.
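The fix is essentially a one-line substitution; sketched:
        /* before: does not build with CMA=n, MIGRATE_CMA is undeclared */
        if (get_pageblock_migratetype(page) == MIGRATE_CMA) {
                /* ... migrate the pinned page out of CMA ... */
        }

        /* after: is_migrate_cma_page() evaluates to false when CMA=n */
        if (is_migrate_cma_page(page)) {
                /* ... migrate the pinned page out of CMA ... */
        }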
Fixes: 2e5bbb5461 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f209fa03fc ]
During a PCI error recovery, like the ones provoked by EEH in the ppc64
platform, all IO to the device must be blocked while the recovery is
completed. Current 8250_pci implementation only suspends the port
instead of detaching it, which doesn't prevent incoming accesses like
TIOCMGET and TIOCMSET calls from reaching the device. Those end up
racing with the EEH recovery, crashing it. Similar races were also
observed when opening the device and when shutting it down during
recovery.
This patch implements a more robust IO blockage for the 8250_pci
recovery by unregistering the port at the beginning of the procedure and
re-adding it afterwards. Since the port is detached from the uart
layer, we can be sure that no request will make through to the device
during recovery. This is similar to the solution used by the JSM serial
driver.
I thank Peter Hurley <peter@hurleysoftware.com> for valuable input on
this one over one year ago.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 708f5dcc21 ]
The Dell Latitude 3350's ethernet card attempts to use a reserved
IRQ (18), resulting in ACPI being unable to enable the ethernet.
Adding it to acpi_rev_dmi_table[] helps to work around this problem.
Signed-off-by: Michael Pobega <mpobega@neverware.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9523b9bf6d ]
Precision 5520 and 3520 either hang at login and during suspend or reboot.
It turns out that adding them to acpi_rev_dmi_table[] helps to work
around those issues.
Signed-off-by: Alex Hung <alex.hung@canonical.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e950267ab8 ]
Some devices have invalid baSourceID references, causing uvc_scan_chain()
to fail, but if we just take the entities we can find and put them
together in the most sensible chain we can think of, turns out they do
work anyway. Note: This heuristic assumes there is a single chain.
At the time of writing, devices known to have such a broken chain are
- Acer Integrated Camera (5986:055a)
- Realtek rtl157a7 (0bda:57a7)
Signed-off-by: Henrik Ingo <henrik.ingo@avoinelama.fi>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 25cdb64510 ]
The WRITE_SAME commands are not present in the blk_default_cmd_filter
write_ok list, and thus are failed with -EPERM when the SG_IO ioctl()
is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users).
[ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ]
The problem can be reproduced with the sg_write_same command
# sg_write_same --num 1 --xferlen 512 /dev/sda
#
# capsh --drop=cap_sys_rawio -- -c \
'sg_write_same --num 1 --xferlen 512 /dev/sda'
Write same: pass through os error: Operation not permitted
#
For comparison, the WRITE_VERIFY command does not observe this problem,
since it is in that list:
# capsh --drop=cap_sys_rawio -- -c \
'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda'
#
So, this patch adds the WRITE_SAME commands to the list, in order
for the SG_IO ioctl to finish successfully:
# capsh --drop=cap_sys_rawio -- -c \
'sg_write_same --num 1 --xferlen 512 /dev/sda'
#
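The change described above amounts to whitelisting the WRITE SAME opcodes
in the block layer's default command filter; a sketch (placement in
blk_set_cmd_filter_defaults() and the exact opcode set are assumptions):
        /* Write Same */
        __set_bit(WRITE_SAME, filter->write_ok);
        __set_bit(WRITE_SAME_16, filter->write_ok);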
That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]),
which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu).
In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls,
which are translated to write-same calls in the guest kernel, and then into
SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest:
[...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
[...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
[...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00
[...] blk_update_request: I/O error, dev sda, sector 17096824
Links:
[1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
[2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com>
Reported-by: Manjunatha H R <manjuhr1@in.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d9c728949d ]
We are going to allow the userspace to configure container in
one memory context and pass container fd to another so
we are postponing memory allocations accounted against
the locked memory limit. One of previous patches took care of
it_userspace.
At the moment we create the default DMA window when the first group is
attached to a container; this is done for userspace which is not
DDW-aware but is familiar with the memory pre-registration part of the
SPAPR TCE IOMMU v2 - such a client expects the default DMA window to exist.
This postpones the default DMA window allocation till one of
the following happens:
1. the first map/unmap request arrives;
2. a new window is requested.
This adds a no-op for the case when the userspace requests removal
of the default window which has not been created yet.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6f01cc692a ]
There is already a helper to create a DMA window which allocates
a table and programs it into the IOMMU group. However,
tce_iommu_take_ownership_ddw() did not use it and did these 2 calls
itself to simplify the error path.
Since we are going to delay the default window creation till
the default window is accessed/removed or a new window is added,
we need a helper to create a default window from all these cases.
This adds tce_iommu_create_default_window(). Since it relies on
a VFIO container to have at least one IOMMU group (for future use),
this changes tce_iommu_attach_group() to add a group to the container
first and then call the new helper.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b6fad7097 ]
At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().
This approach has a problem that the MM of the userspace process
may live longer than the userspace process itself, as kernel threads
use the MM of the userspace process which was running on the CPU where
the kernel thread was scheduled. If this happens, the MM remains
referenced until this exact kernel thread wakes up again
and releases the very last reference to the MM; on an idle system this
can take hours.
This moves preregistered regions tracking from MM to VFIO; instead of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases regions which it has pre-registered.
This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However it should not
have any practical effect as the only userspace tool available now
does register memory region once per container anyway.
As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc82d122ae ]
In some situations the userspace memory context may live longer than
the userspace process itself so if we need to do proper memory context
cleanup, we better have tce_container take a reference to mm_struct and
use it later when the process is gone (@current or @current->mm is NULL).
This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.
This replaces current->mm with container->mm everywhere except debug
prints.
This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.
DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.
This does not reference a task as multiple threads within the same mm
are allowed to ioctl() to vfio and supposedly they will have the same limits
and capabilities, and if they do not, we'll just fail with no harm done.
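A sketch of what the tce_iommu_mm_set() helper described above looks like
(simplified; the exact refcounting primitive and error codes may differ
from the original):
        static long tce_iommu_mm_set(struct tce_container *container)
        {
                if (container->mm) {
                        /* only the mm that first used the container may
                         * keep using it */
                        if (container->mm == current->mm)
                                return 0;
                        return -EPERM;
                }
                if (!current->mm)
                        return -ESRCH;

                container->mm = current->mm;
                atomic_inc(&container->mm->mm_count); /* pin the mm_struct */
                return 0;
        }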
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7baee6901 ]
This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.
This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.
This moves the mm_iommu_adjust_locked_vm() call to the caller as
it receives mm_iommu_table_group_mem_t but it needs mm.
This should cause no behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 88f54a3581 ]
We are going to get rid of @current references in mmu_context_book3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct which does have
the reference counter.
This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).
This should not cause any behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 39701e56f5 ]
The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.
As we are going to allow the userspace to configure a container in one
memory context and pass the container fd to another, we have to postpone
such allocations till a container fd is passed to the destination
user process so we would account the locked memory limit against the actual
container user's constraints.
This postpones the it_userspace array allocation till it is used for the
first time for mapping. The unmapping patch already checks if the array is
allocated.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fa32ff6576 ]
With wrap around mappings in place we can always provide drivers with
direct links to packets on the ring buffer, even when they wrap around.
Do the required updates to get_next_pkt_raw()/put_pkt_raw().
The first version of this commit was reverted (65a532f3d5) to deal with
cross-tree merge issues which are (hopefully) resolved now.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f40ec3c748 ]
Previously we enabled VFs and enabled their memory space before calling
pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.
Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled. The half-updated state may conflict with other
devices in the system.
Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.
[bhelgaas: changelog]
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 63880b230a ]
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.
We already ignore these updates because of 70675e0b6a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45d004f4af ]
The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.
Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.
Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 546ba9f8f2 ]
If we update a VF BAR while it's enabled, there are two potential problems:
1) Any driver that's using the VF has a cached BAR value that is stale
after the update, and
2) We can't update 64-bit BARs atomically, so the intermediate state
(new lower dword with old upper dword) may conflict with another
device, and an access by a driver unrelated to the VF may cause a bus
error.
Warn about attempts to update VF BARs while they are enabled. This is a
programming error, so use dev_WARN() to get a backtrace.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7a6d312b50 ]
Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0b457dde3c ]
pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR. We only update
those when we enable the ROM.
It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled. That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.
Add comments and references to explain why we can't make the code look more
rational.
The code changes are from 755528c860 ("Ignore disabled ROM resources at
setup") and 8085ce084c ("[PATCH] Fix PCI ROM mapping").
Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 286c2378aa ]
pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.
Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6ffa2489c5 ]
Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.
Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).
This patch:
- Renames pci_update_resource() to pci_std_update_resource(),
- Adds pci_iov_update_resource(),
- Makes pci_update_resource() a wrapper that calls the appropriate one
(sketched below).
No functional change intended.
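The resulting wrapper is roughly (sketch):
        void pci_update_resource(struct pci_dev *dev, int resno)
        {
                if (resno <= PCI_ROM_RESOURCE)
                        pci_std_update_resource(dev, resno);
        #ifdef CONFIG_PCI_IOV
                else if (resno >= PCI_IOV_RESOURCES &&
                         resno <= PCI_IOV_RESOURCE_END)
                        pci_iov_update_resource(dev, resno);
        #endif
        }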
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 59107e2f48 ]
There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects NMI to the guest. We may want to crash the guest and do kdump
on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
allow the kdump kernel to re-establish VMBus connection so it will see
VMBus devices (storage, network,..).
To properly unload VMBus making it possible to start over during kdump we
need to do the following:
- Send an 'unload' message to the hypervisor. This can be done on any CPU
so we do this on the crashing CPU.
- Receive the 'unload finished' reply message. WS2012R2 delivers this
message to the CPU which was used to establish VMBus connection during
module load and this CPU may differ from the CPU sending 'unload'.
Receiving a VMBus message means the following:
- There is a per-CPU slot in memory for one message. This slot can in
theory be accessed by any CPU.
- We get an interrupt on the CPU when a message was placed into the slot.
- When we read the message we need to clear the slot and signal the fact
to the hypervisor. In case there are more messages to this CPU pending
the hypervisor will deliver the next message. The signaling is done by
writing to an MSR so this can only be done on the appropriate CPU.
To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
function which checks message slots for all CPUs in a loop waiting for the
'unload finished' messages. However, there is an issue which arises when
these conditions are met:
- We're crashing on a CPU which is different from the one which was used
to initially contact the hypervisor.
- The CPU which was used for the initial contact is blocked with interrupts
disabled and there is a message pending in the message slot.
In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.
The suggested solution is to handle unknown NMIs for Hyper-V guests on the
first CPU which gets them only. This will allow us to rely on VMBus
interrupt handler being able to receive the 'unload finish' message in
case it is delivered to a different CPU.
The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9b3379f60 ]
This patch changes the way the IBM vSCSI server driver manages its
Command/Response Queue (CRQ). We used to register the CRQ with phyp at
probe time. Now we wait until tpg_enable_store. Similarly, when
tpg_enable_store is called to "disable" (i.e. the stored value is 0),
we unregister the queue with phyp.
One consequence of this is that we have no need for the PART_UP_WAIT_ENAB
state, since we can't get an Init Message from the client in our CRQ if
we're waiting to be enabled, since we haven't registered the queue yet.
Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79fac9c9b7 ]
This patch reorders functions in a manner necessary for a follow-on
patch. It also makes some minor styling changes (mostly removing extra
spaces) and fixes some typos.
There are no code changes in this patch, with one exception: due to the
reordering of the functions, I needed to explicitly declare a function
at the top of the file. However, this will be removed in the next patch,
since the code requiring the predeclaration will be removed.
Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4e684f59d7 ]
Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT causing
the probe of an igb i210 NIC to fail. This patch adds an additional zeroing
of this register during igb_get_phy_id to work around this issue.
Thanks to Jochen Henneberg for the idea and original patch.
Signed-off-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c74fd80f2f ]
Revert the main part of commit:
af42b8d12f ("xen: fix MSI setup and teardown for PV on HVM guests")
That commit introduced reading the pci device's msi message data to see
if a pirq was previously configured for the device's msi/msix, and re-use
that pirq. At the time, that was the correct behavior. However, a
later change to Qemu caused it to call into the Xen hypervisor to unmap
all pirqs for a pci device, when the pci device disables its MSI/MSIX
vectors; specifically the Qemu commit:
c976437c7dba9c7444fb41df45468968aaa326ad
("qemu-xen: free all the pirqs for msi/msix when driver unload")
Once Qemu added this pirq unmapping, it was no longer correct for the
kernel to re-use the pirq number cached in the pci device msi message
data. All Qemu releases since 2.1.0 contain the patch that unmaps the
pirqs when the pci device disables its MSI/MSIX vectors.
This bug is causing failures to initialize multiple NVMe controllers
under Xen, because the NVMe driver sets up a single MSIX vector for
each controller (concurrently), and then after using that to talk to
the controller for some configuration data, it disables the single MSIX
vector and re-configures all the MSIX vectors it needs. So the MSIX
setup code tries to re-use the cached pirq from the first vector
for each controller, but the hypervisor has already given away that
pirq to another controller, and its initialization fails.
This is discussed in more detail at:
https://lists.xen.org/archives/html/xen-devel/2017-01/msg00447.html
Fixes: af42b8d12f ("xen: fix MSI setup and teardown for PV on HVM guests")
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21d25f6a42 upstream.
On a kernel with DEBUG_LOCKS, ioat_free_chan_resources triggers an
in_interrupt() warning. With PROVE_LOCKING, it reports detecting a
SOFTIRQ-safe to SOFTIRQ-unsafe lock ordering in the same code path.
This is because dma_generic_alloc_coherent() checks if the GFP flags
permit blocking. It allocates from different subsystems if blocking is
permitted. The free path knows how to return the memory to the correct
allocator. If GFP_KERNEL is specified then the alloc and free end up
going through cma_alloc(), which uses mutexes.
Given that ioat_free_chan_resources() can be called in interrupt
context, ioat_alloc_chan_resources() must specify GFP_NOWAIT so that the
allocations do not block and instead use an allocator that uses
spinlocks.
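Concretely, the descriptor ring allocation switches its GFP flags; a
sketch of the call in ioat_alloc_chan_resources() (simplified):
        /* GFP_NOWAIT keeps dma_alloc_coherent() away from allocators that
         * sleep (e.g. CMA and its mutex), since the matching free can run
         * in interrupt context */
        ring = ioat_alloc_ring(c, order, GFP_NOWAIT);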
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6760bf2ddd ]
Martin reported a verifier issue that hit the BUG_ON() for his
test case in the mark_reg_unknown_value() function:
[ 202.861380] kernel BUG at kernel/bpf/verifier.c:467!
[...]
[ 203.291109] Call Trace:
[ 203.296501] [<ffffffff811364d5>] mark_map_reg+0x45/0x50
[ 203.308225] [<ffffffff81136558>] mark_map_regs+0x78/0x90
[ 203.320140] [<ffffffff8113938d>] do_check+0x226d/0x2c90
[ 203.331865] [<ffffffff8113a6ab>] bpf_check+0x48b/0x780
[ 203.343403] [<ffffffff81134c8e>] bpf_prog_load+0x27e/0x440
[ 203.355705] [<ffffffff8118a38f>] ? handle_mm_fault+0x11af/0x1230
[ 203.369158] [<ffffffff812d8188>] ? security_capable+0x48/0x60
[ 203.382035] [<ffffffff811351a4>] SyS_bpf+0x124/0x960
[ 203.393185] [<ffffffff810515f6>] ? __do_page_fault+0x276/0x490
[ 203.406258] [<ffffffff816db320>] entry_SYSCALL_64_fastpath+0x13/0x94
This issue got uncovered after the fix in a08dd0da53 ("bpf: fix
regression on verifier pruning wrt map lookups"). The reason why it
wasn't noticed before is that, as mentioned in a08dd0da53,
mark_map_regs() was doing the id matching incorrectly based on the
uncached regs[regno].id. So, in the first loop, we walked all regs
and as soon as we found regno == i, this reg's id was cleared when
calling mark_reg_unknown_value(), so that every subsequent register
was probed against an id of 0 (which, in combination with the
PTR_TO_MAP_VALUE_OR_NULL type is an invalid condition that no other
register state can hold), and therefore wasn't type transitioned such
as in the spilled register case for the second loop.
Now since that got fixed, it turned out that 57a09bf0a4 ("bpf:
Detect identical PTR_TO_MAP_VALUE_OR_NULL registers") used
mark_reg_unknown_value() incorrectly for the spilled regs, and thus
hitting the BUG_ON() in some cases due to regno >= MAX_BPF_REG.
Although spilled regs have the same type as the non-spilled regs
for the verifier state, that is, struct bpf_reg_state, they are
semantically different from the non-spilled regs. In other words,
there can be up to 64 (MAX_BPF_STACK / BPF_REG_SIZE) spilled regs
in the stack, for example, register R<x> could have been spilled by
the program to stack location X, Y, Z, and in mark_map_regs() we
need to scan these stack slots of type STACK_SPILL for potential
registers that we have to transition from PTR_TO_MAP_VALUE_OR_NULL.
Therefore, depending on the location, the spilled_regs regno can
be a lot higher than just MAX_BPF_REG's value since we operate on
stack instead. The reset in mark_reg_unknown_value() itself is
just fine, only that the BUG_ON() was inappropriate for this. Fix
it by making a __mark_reg_unknown_value() version that can be
called from mark_map_reg() generically; we know for the non-spilled
case that the regno is always < MAX_BPF_REG anyway.
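The shape of the fix, sketched (field initialization simplified):
        /* raw reset used for spilled stack slots, no register-number bound */
        static void __mark_reg_unknown_value(struct bpf_reg_state *regs,
                                             u32 regno)
        {
                regs[regno].type = UNKNOWN_VALUE;
                regs[regno].id = 0;
                regs[regno].imm = 0;
        }

        /* wrapper for real registers keeps the sanity check */
        static void mark_reg_unknown_value(struct bpf_reg_state *regs,
                                           u32 regno)
        {
                BUG_ON(regno >= MAX_BPF_REG);
                __mark_reg_unknown_value(regs, regno);
        }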
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a08dd0da53 ]
Commit 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
registers") introduced a regression where existing programs stopped
loading due to reaching the verifier's maximum complexity limit,
whereas prior to this commit they were loading just fine; the affected
program has roughly 2k instructions.
What was found is that state pruning couldn't be performed effectively
anymore due to mismatches of the verifier's register state, in particular
in the id tracking. It doesn't mean that 57a09bf0a4 is incorrect per
se, but rather that verifier needs to perform a lot more work for the
same program with regards to involved map lookups.
Since commit 57a09bf0a4 is only about tracking registers with type
PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
until they are promoted through pattern matching with a NULL check to
either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
id becomes irrelevant for the transitioned types.
For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
transferred further into other types that don't make use of it. Among
others, one example is where UNKNOWN_VALUE is set on function call
return with RET_INTEGER return type.
states_equal() will then fall through the memcmp() on register state;
note that the second memcmp() uses offsetofend(), so the id is part of
that since d2a4dd37f6 ("bpf: fix state equivalence"). But the bisect
pointed already to 57a09bf0a4, where we really reach beyond complexity
limit. What I found was that states_equal() often failed in this
case due to id mismatches in spilled regs with registers in type
PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
a memcmp() on their reg state and don't have any other optimizations
in place, therefore also id was relevant in this case for making a
pruning decision.
We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
For the affected program, it resulted in a ~17 fold reduction of
complexity and let the program load fine again. Selftest suite also
runs fine. The only other place where env->id_gen is used currently is
through direct packet access, but for these cases id is long living, thus
a different scenario.
Also, the current logic in mark_map_regs() is not fully correct when
marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
its id is reset and any subsequent registers that hold the original id
and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
anymore, since mark_map_reg() reuses the uncached regs[regno].id that
was just overridden. Note, we don't need to cache it outside of
mark_map_regs(), since it's called once on this_branch and the other
time on other_branch, which are both two independent verifier states.
A test case for this is added here, too.
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d2a4dd37f6 ]
Commits 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
and 484611357c ("bpf: allow access into map value arrays") by themselves
are correct, but in combination they make state equivalence ignore 'id' field
of the register state which can lead to accepting invalid program.
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Fixes: 484611357c ("bpf: allow access into map value arrays")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57a09bf0a4 ]
A BPF program is required to check the return register of a
map_elem_lookup() call before accessing memory. The verifier keeps
track of this by converting the type of the result register from
PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE after a conditional
jump ensures safety. This check is currently exclusively performed
for the result register 0.
In the event the compiler reorders instructions, BPF_MOV64_REG
instructions may be moved before the conditional jump which causes
them to keep their type PTR_TO_MAP_VALUE_OR_NULL to which the
verifier objects when the register is accessed:
0: (b7) r1 = 10
1: (7b) *(u64 *)(r10 -8) = r1
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0x59c00000
6: (85) call 1
7: (bf) r4 = r0
8: (15) if r0 == 0x0 goto pc+1
R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp
9: (7a) *(u64 *)(r4 +0) = 0
R4 invalid mem access 'map_value_or_null'
This commit extends the verifier to keep track of all identical
PTR_TO_MAP_VALUE_OR_NULL registers after a map_elem_lookup() by
assigning them an ID and then marking them all when the conditional
jump is observed.
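At the C level, the pattern that produces such bytecode looks roughly
like this (a sketch written against current libbpf conventions, not the
original test program):
        #include <linux/bpf.h>
        #include <bpf/bpf_helpers.h>

        struct {
                __uint(type, BPF_MAP_TYPE_HASH);
                __uint(max_entries, 1);
                __type(key, __u64);
                __type(value, __u64);
        } test_map SEC(".maps");

        SEC("socket")
        int copy_before_check(void *ctx)
        {
                __u64 key = 10;
                __u64 *val, *copy;

                val = bpf_map_lookup_elem(&test_map, &key);
                copy = val;       /* the compiler may emit this move before the check */
                if (!val)
                        return 0; /* the NULL check must promote 'copy' as well */
                *copy = 0;        /* rejected unless identical ids are tracked */
                return 0;
        }

        char _license[] SEC("license") = "GPL";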
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72ef9c4125 ]
This patch fixes a memory leak, which happens if the connection request
is not fulfilled between parsing the DCCP options and handling the SYN
(because e.g. the backlog is full), because we forgot to free the
list of ack vectors.
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b20e2d5478 ]
aszlig observed failing ssh tunnels (-w) during initialization since
commit cc9da6cc4f ("ipv6: addrconf: use stable address generator for
ARPHRD_NONE"). We already had reports that the mentioned commit breaks
Juniper VPN connections. I can't clearly say that the Juniper VPN client
has the same problem, but it is worth a try to hint to this patch.
Because of the early generation of link local addresses, the kernel now
can start asking for routers on the local subnet much earlier than usual.
Those router solicitation packets arrive inside the ssh channels and
should be transmitted to the tun fd before the configuration scripts
might have upped the interface and made it ready for transmission.
ssh polls on the interface and receives back a POLL_OUT. It tries to send
the early router solicitation packet to the tun interface. Unfortunately
the interface hasn't been brought up yet by the config scripts, so the write
fails with -EIO. ssh doesn't retry and considers the tun interface broken
forever.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=121131
Fixes: cc9da6cc4f ("ipv6: addrconf: use stable address generator for ARPHRD_NONE")
Cc: Bjørn Mork <bjorn@mork.no>
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Reported-by: Jonas Lippuner <jonas@lippuner.ca>
Cc: Jonas Lippuner <jonas@lippuner.ca>
Reported-by: aszlig <aszlig@redmoonstudios.org>
Cc: aszlig <aszlig@redmoonstudios.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45caeaa5ac ]
As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/Ipv6 dccp patches as well.
We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A Common crashing back trace is:
#8 [] page_fault at ffffffff8163e648
[exception RIP: __tcp_ack_snd_check+74]
.
.
#9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8
Of course it may happen with other NIC drivers as well.
The freed dst_entry was found here:
static bool tcp_in_quickack_mode(struct sock *sk)
{
        const struct inet_connection_sock *icsk = inet_csk(sk);
        const struct dst_entry *dst = __sk_dst_get(sk);

        return (dst && dst_metric(dst, RTAX_QUICKACK)) ||
                (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);
}
But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.
All the vmcores showed 2 significant clues:
- Remote hosts behind the default gateway had always been redirected to a
different gateway. A rtable/dst_entry will be added for that host, making
more dst_entries with lower reference counts and making this more probable.
- All vmcores showed a positive LockDroppedIcmps value, e.g.:
LockDroppedIcmps 267
A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_entry can be
decremented twice for the same socket via:
do_redirect()->__sk_dst_check()-> dst_release().
Which leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache and a subsequent crash.
To fix this, skip do_redirect() if userspace has the socket locked. Instead
let the redirect take place later when user space does not have the socket
locked.
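Sketch of the tcp_v4_err() hunk (simplified):
        case ICMP_REDIRECT:
                /* don't touch sk->sk_dst_cache while userspace holds the
                 * socket lock; a later lookup without the lock held will
                 * pick up the new route */
                if (!sock_owned_by_user(sk))
                        do_redirect(icmp_skb, sk);
                goto out;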
The dccp/IPv6 code is very similar in this respect, so fixing it there too.
As Eric Garver pointed out the following commit now invalidates routes. Which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().
Fixes: ceb3320610 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <egarver@redhat.com>
Cc: Hannes Sowa <hsowa@redhat.com>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a13b2082ec ]
Andreas reports kernel oops during rmmod of the br_netfilter module.
Hannes debugged the oops down to a NULL rt6info->rt6i_indev.
Problem is that br_netfilter has the nasty concept of adding a fake
rtable to skb->dst; this happens in a br_netfilter prerouting hook.
A second hook (in bridge LOCAL_IN) is supposed to remove these again
before the skb is handed up the stack.
However, on module unload hooks get unregistered which means an
skb could traverse the prerouting hook that attaches the fake_rtable,
while the 'fake rtable remove' hook gets removed from the hooklist
immediately after.
Fixes: 34666d467c ("netfilter: bridge: move br_netfilter out of the core")
Reported-by: Andreas Karis <akaris@redhat.com>
Debugged-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79e49503ef ]
ip6_fragment, in case skb has a fraglist, checks if the
skb is cloned. If it is, it will move to the 'slow path' and allocates
new skbs for each fragment.
However, right before entering the slowpath loop, it updates the
nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT,
to account for the fragment header that will be inserted in the new
ipv6-fragment skbs.
In case the original skb is cloned this munges the nexthdr value of another
skb. Avoid this by doing the nexthdr update for each of the new fragment
skbs separately.
This was observed with tcpdump on a bridge device where netfilter ipv6
reassembly is active: tcpdump shows malformed fragment headers as
the l4 header (icmpv6, tcp, etc.) is decoded as a fragment header.
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Andreas Karis <akaris@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 67e194007b ]
Commit 2759647247 ("ipv6: fix ECMP route replacement") introduced a
loop that removes all siblings of an ECMP route that is being
replaced. However, this loop doesn't stop when it has replaced
siblings, and keeps removing other routes with a higher metric.
We also end up triggering the WARN_ON after the loop, because after
this nsiblings < 0.
Instead, stop the loop when we have taken care of all routes with the
same metric as the route being replaced.
Reproducer:
===========
#!/bin/sh
ip netns add ns1
ip netns add ns2
ip -net ns1 link set lo up
for x in 0 1 2 ; do
    ip link add veth$x netns ns2 type veth peer name eth$x netns ns1
    ip -net ns1 link set eth$x up
    ip -net ns2 link set veth$x up
done
ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \
nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2
ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256
ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048
echo "before replace, 3 routes"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
echo
ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \
nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2
echo "after replace, only 2 routes, metric 2048 is gone"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
Fixes: 2759647247 ("ipv6: fix ECMP route replacement")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79099aab38 ]
Multipath routes can be rendered useless when a device in one of the
paths is deleted. For example:
$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2 dev virt12
nexthop as to 300 via inet 172.16.3.2 dev br0
101
nexthop as to 201 via inet6 2000:2::2 dev virt12
nexthop as to 301 via inet6 2000:3::2 dev br0
$ ip li del br0
When br0 is deleted the other hop is not considered in
mpls_select_multipath because of the alive check -- rt_nhn_alive
is 0.
rt_nhn_alive is decremented once in mpls_ifdown when the device is taken
down (NETDEV_DOWN) and again when it is deleted (NETDEV_UNREGISTER). For
a 2 hop route, deleting one device drops the alive count to 0. Since
devices are taken down before unregistering, the decrement on
NETDEV_UNREGISTER is redundant.
Fixes: c89359a42e ("mpls: support for dead routes")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e37791ec1a ]
When the mpls_router module is unloaded, mpls routes are deleted but
notifications are not sent to userspace leaving userspace caches
out of sync. Add the call to mpls_notify_route in mpls_net_exit as
routes are freed.
Fixes: 0189197f44 ("mpls: Basic routing support")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 745cb7f8a5 ]
Replace MAX_ADDR_LEN with its numeric value to fix the following
linux/packet_diag.h userspace compilation error:
/usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function)
__u8 pdmc_addr[MAX_ADDR_LEN];
This is not the first case in the UAPI where the numeric value
of MAX_ADDR_LEN is used instead of the symbolic one; uapi/linux/if_link.h
already does the same:
$ grep MAX_ADDR_LEN include/uapi/linux/if_link.h
__u8 mac[32]; /* MAX_ADDR_LEN */
There are no UAPI headers besides these two that use MAX_ADDR_LEN.
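The fix itself is a one-line change in the UAPI header, mirroring the if_link.h convention quoted above:
	__u8	pdmc_addr[32];	/* MAX_ADDR_LEN */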
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 294acf1c01 ]
The gso code of several tunnel types (gre and udp tunnels)
takes for granted that the skb->inner_protocol is properly
initialized and drops the packet elsewhere.
On the forwarding path no one initializes this field,
so gro encapsulated packets are dropped when forwarded.
Since commit 3872035241 ("gre: Use inner_proto to obtain
inner header protocol"), this can be reproduced when the
encapsulated packets use gre as the tunneling protocol.
The issue happens also with vxlan and geneve tunnels since
commit 8bce6d7d0d ("udp: Generalize skb_udp_segment"), if the
forwarding host's ingress nic has h/w offload for such tunnel
and a vxlan/geneve device is configured on top of it, regardless
of the configured peer address and vni.
To address the issue, this change initializes the inner_protocol
field for encapsulated packets in both ipv4 and ipv6 gro complete
callbacks.
Fixes: 3872035241 ("gre: Use inner_proto to obtain inner header protocol")
Fixes: 8bce6d7d0d ("udp: Generalize skb_udp_segment")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ac25fc063 ]
TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt
By the time TX completion happens, sk_refcnt might be already 0.
sock_hold()/sock_put() would then corrupt critical state, like
sk_wmem_alloc and lead to leaks or use after free.
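A minimal sketch of the safe pattern (the delivery helper name is illustrative, not the driver's API): take a reference only if the refcount is still non-zero, otherwise just drop the clone.
	if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
		deliver_tx_tstamp(sk, skb);	/* hypothetical delivery step */
		sock_put(sk);
	} else {
		kfree_skb(skb);
	}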
Fixes: 62bccb8cdb ("net-timestamp: Make the clone operation stand-alone from phy timestamping")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 02b2faaf0a ]
Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting
tcp_disconnect() path that was never really considered and/or used
before syzkaller ;)
I was not able to reproduce the bug, but it seems the issues here are the
three possible actions that assumed they would never trigger on a
listener:
1) tcp_write_timer_handler
2) tcp_delack_timer_handler
3) MTU reduction
Only IPv6 MTU reduction was properly testing TCP_CLOSE and TCP_LISTEN
states from tcp_v6_mtu_reduced()
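A sketch of the guard the unprotected handlers need (reconstructed from the description above, not the exact hunk):
	/* listeners and closed sockets have no retransmit/delack state to act on */
	if ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))
		goto out;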
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d5afb6f9b6 ]
The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.
This problem stayed there for a long while, till b0691c8ee7 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_locked()) were not audited and the
one in dccp_create_openreq_child() remained.
Now in the age of the syzkaller fuzzer, this was finally uncovered, as
reported by Dmitry:
---- 8< ----
I've got the following report while running syzkaller fuzzer on
86292b33d4 ("Merge branch 'akpm' (patches from Andrew)")
[ BUG: held lock freed! ]
4.10.0+ #234 Not tainted
-------------------------
syz-executor6/6898 is freeing memory
ffff88006286cac0-ffff88006286d3b7, with a lock still held there!
(slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
include/linux/spinlock.h:299 [inline]
(slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
5 locks held by syz-executor6/6898:
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] lock_sock
include/net/sock.h:1460 [inline]
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>]
inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
#1: (rcu_read_lock){......}, at: [<ffffffff83bc1c2a>]
inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_unlink
include/linux/skbuff.h:1767 [inline]
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_dequeue
include/linux/skbuff.h:1783 [inline]
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>]
process_backlog+0x264/0x730 net/core/dev.c:4835
#3: (rcu_read_lock){......}, at: [<ffffffff83aeb5c0>]
ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
#4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
include/linux/spinlock.h:299 [inline]
#4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
Fix it just like was done by b0691c8ee7 ("net: Unlock sock before calling
sk_free()").
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170301153510.GE15145@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 13baa00ad0 ]
It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.
Meaning net_enable_timestamp() can be called from BH context
while current state of the static key is not enabled.
Let's play safe and allow all contexts.
The work queue is scheduled only under the problematic cases,
which are the static key enable/disable transition, to not slow down
critical paths.
This extends and improves what we did in commit 5fa8bbda38 ("net: use
a work queue to defer net_disable_timestamp() work")
Fixes: b90e5794c5 ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8953de2f02 ]
Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.
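A sketch of the relaxed per-port check in the flooding path (condition reconstructed from the changelog; skb->dev == br->dev marks traffic originated by the bridge itself):
	case BR_PKT_MULTICAST:
		/* honour BR_MCAST_FLOOD only for forwarded traffic */
		if (!(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev)
			continue;
		break;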
Fixes: b6cb5ac833 ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Mike Manning <mmanning@brocade.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f7df4923fa ]
When the structure of the LPM tree changes (f.e., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.
Instead, overwrite the old binding with the new one.
Fixes: 6b75c4807d ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eab127717a ]
phy_error() is called in the PHY state machine workqueue context, and
calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
the workqueue we execute from, causing a deadlock situation.
Augment phy_trigger_machine() machine with a sync boolean indicating
whether we should use cancel_*_sync() or just cancel_*_work().
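Roughly (sketch; queue and field names as used by the PHY state machine):
	static void phy_trigger_machine(struct phy_device *phydev, bool sync)
	{
		if (sync)
			cancel_delayed_work_sync(&phydev->state_queue);
		else
			cancel_delayed_work(&phydev->state_queue);

		queue_delayed_work(system_power_efficient_wq, &phydev->state_queue, 0);
	}
Callers running inside the state machine, such as phy_error(), then pass sync = false.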
Fixes: 3c293f4e08 ("net: phy: Trigger state machine on state change and not polling.")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 51fb60eb16 ]
l2tp_ip_backlog_recv must not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.
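The drop path therefore has to consume the skb and report success; a sketch:
	discard:
		kfree_skb(skb);
		return 0;	/* not -1: a negative value would be re-used as a protocol number */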
Signed-off-by: Paul Hüber <phueber@kernsp.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit edb9d1bff4 ]
When tc actions are loaded as a module and no actions have been installed,
flushing them would result in actions removed from the memory, but modules
reference count not being decremented, so that the modules would not be
unloaded.
Following is example with GACT action:
% sudo modprobe act_gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions ls action gact
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 1
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 2
% sudo rmmod act_gact
rmmod: ERROR: Module act_gact is in use
....
After the fix:
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions add action pass index 1
% sudo tc actions add action pass index 2
% sudo tc actions add action pass index 3
% lsmod
Module Size Used by
act_gact 16384 3
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
% sudo rmmod act_gact
% lsmod
Module Size Used by
%
Fixes: f97017cdef ("net-sched: Fix actions flushing")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1158632b5a ]
When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.
IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.
Avoid this by copying the configured address to an existing stack variable
and pass a pointer to that instead.
Fixes: 272d96a5ab ("net: vxlan: lwt: Use source ip address during route lookup.")
Signed-off-by: Brian Russell <brussell@brocade.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7dcdf941cd ]
Align vti6 with vti by returning GRE_KEY flag. This enables iproute2
to display tunnel keys on "ip -6 tunnel show"
Signed-off-by: David Forster <dforster@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36154be40a ]
In cqe compression with striding RQ, the decompression of the CQE field
wqe_counter was done with a wrong wraparound value.
This caused CQEs to be handled with a wrong pointer to the WQE (rx descriptor)
and SKBs to be created with wrong data, pointing to wrong (and already consumed)
strides/pages.
In striding RQ, the CQE field wqe_counter holds the stride index instead
of the WQE index. Hence, when decompressing a CQE, wqe_counter should have
wrapped around the number of strides in a single multi-packet WQE.
We drop this wrap-around mask entirely in CQE decompression of striding
RQ. It is not needed, as in such cases the CQE compression session would
break because of a different value of the wqe_id field, starting a new
compression session.
Tested:
ethtool -K ethxx lro off/on
ethtool --set-priv-flags ethxx rx_cqe_compress on
super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
verified no csum errors and no page refcount issues.
Fixes: 7219ab34f1 ("net/mlx5e: CQE compression")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4078e637c1 ]
When rq_type is Striding RQ, no room for SKB_RESERVE is needed,
as SKB allocation is not done via build_skb.
Fixes: e4b8550807 ("net/mlx5e: Slightly reduce hardware LRO size")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6f08a22c5f ]
Currently vport representors are added only on driver load and removed on
driver unload. Apparently we forgot to handle them when we added the
seamless reset flow feature. This left the representors'
netdevs alive and active with open HW resources on pci shutdown and on
error reset flows.
To overcome this we move their handling to interface attach/detach, so
they would be cleaned up on shutdown and recreated on reset flows.
Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without a prompt string, a config setting can't be included in a
defconfig. Give CONFIG_SND_SOC_ICS43432 a prompt so that Pi soundcards
can use the driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
commit aa2be9b3d6 upstream.
Turning on crypto self-tests on a POWER8 shows:
alg: hash: Test 1 failed for crc32c-vpmsum
00000000: ff ff ff ff
Comparing the code with the Intel CRC32c implementation on which
ours is based shows that we are doing an init with 0, not ~0
as CRC32c requires.
This probably wasn't caught because btrfs does its own weird
open-coded initialisation.
Initialise our internal context to ~0 on init.
This makes the self-tests pass, and btrfs continues to work.
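Roughly, the tfm init hook becomes (sketch):
	static int crc32c_vpmsum_cra_init(struct crypto_tfm *tfm)
	{
		u32 *key = crypto_tfm_ctx(tfm);

		*key = ~0;	/* CRC32c seeds with all-ones, not zero */
		return 0;
	}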
Fixes: 6dd7a82cc5 ("crypto: powerpc - Add POWER8 optimised crc32c")
Cc: Anton Blanchard <anton@samba.org>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 44fee88cea upstream.
Subhransu reported that convert_art_to_tsc() isn't working for him.
The ART to TSC relation is only set up for systems which use the refined
TSC calibration. Systems with known TSC frequency (available via CPUID 15)
are not using the refined calibration and therefor the ART to TSC relation
is never established.
Add the setup to the known frequency init path which skips ART
calibration. The init code needs to be duplicated as for systems which use
refined calibration the ART setup must be delayed until calibration has
been done.
The problem has been there since the ART support was introduced, but was only
detected now because Subhransu tested for the first time on hardware which has
TSC frequency enumerated via CPUID 15.
Note for stable: The conditional has changed from TSC_RELIABLE to
TSC_KNOWN_FREQUENCY.
[ tglx: Rewrote changelog and identified the proper 'Fixes' commit ]
Fixes: f9677e0f83 ("x86/tsc: Always Running Timer (ART) correlated clocksource")
Reported-by: "Prusty, Subhransu S" <subhransu.s.prusty@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org
Cc: christopher.s.hall@intel.com
Cc: kevin.b.stanton@intel.com
Cc: john.stultz@linaro.org
Cc: akataria@vmware.com
Link: http://lkml.kernel.org/r/20170313145712.GI3312@twins.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90922a2d03 upstream.
On Qualcomm Datacenter Technologies QDF2400 SoCs, the ITS hardware
implementation uses 16Bytes for Interrupt Translation Entry (ITE),
but reports an incorrect value of 8Bytes in GITS_TYPER.ITTE_size.
It might cause kernel memory corruption depending on the number
of MSI(x) that are configured and the amount of memory that has
been allocated for ITEs in its_create_device().
This patch fixes the potential memory corruption by setting the
correct ITE size to 16Bytes.
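A sketch of the quirk, assuming the driver keeps the per-ITS entry size in its node structure:
	static void its_enable_quirk_qdf2400_e0065(void *data)
	{
		struct its_node *its = data;

		/* erratum: GITS_TYPER.ITTE_size reports 8 bytes, hardware uses 16 */
		its->ite_size = 16;
	}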
Cc: stable@vger.kernel.org
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6892517629 upstream.
When invalidating guest TLBs, special care must be taken to
actually shoot the guest TLBs and not the host ones if we're
running on a VHE system. This is controlled by the HCR_EL2.TGE
bit, which we forget to clear before invalidating TLBs.
Address the issue by introducing two wrappers (__tlb_switch_to_guest
and __tlb_switch_to_host) that take care of both the VTTBR_EL2
and HCR_EL2.TGE switching.
Reported-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Tested-by: Tomasz Nowicki <tnowicki@caviumnetworks.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a05ef161cd ]
Currently the build breaks if CMA=n and SPAPR_TCE_IOMMU=y:
arch/powerpc/mm/mmu_context_iommu.c: In function ‘mm_iommu_get’:
arch/powerpc/mm/mmu_context_iommu.c:193:42: error: ‘MIGRATE_CMA’ undeclared (first use in this function)
if (get_pageblock_migratetype(page) == MIGRATE_CMA) {
^~~~~~~~~~~
Fix it by using the existing is_migrate_cma_page(), which evaluates to
false when CMA=n.
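In other words (before/after, surrounding context and variable name illustrative):
	/* before: MIGRATE_CMA is only defined when CONFIG_CMA=y */
	if (get_pageblock_migratetype(page) == MIGRATE_CMA)
		migrate = true;

	/* after: is_migrate_cma_page() exists for CMA=y and CMA=n alike */
	if (is_migrate_cma_page(page))
		migrate = true;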
Fixes: 2e5bbb5461 ("KVM: PPC: Book3S HV: Migrate pinned pages out of CMA")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f209fa03fc ]
During a PCI error recovery, like the ones provoked by EEH in the ppc64
platform, all IO to the device must be blocked while the recovery is
completed. Current 8250_pci implementation only suspends the port
instead of detaching it, which doesn't prevent incoming accesses like
TIOCMGET and TIOCMSET calls from reaching the device. Those end up
racing with the EEH recovery, crashing it. Similar races were also
observed when opening the device and when shutting it down during
recovery.
This patch implements a more robust IO blockage for the 8250_pci
recovery by unregistering the port at the beginning of the procedure and
re-adding it afterwards. Since the port is detached from the uart
layer, we can be sure that no request will make through to the device
during recovery. This is similar to the solution used by the JSM serial
driver.
I thank Peter Hurley <peter@hurleysoftware.com> for valuable input on
this one over one year ago.
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 708f5dcc21 ]
The Dell Latitude 3350's ethernet card attempts to use a reserved
IRQ (18), resulting in ACPI being unable to enable the ethernet.
Adding it to acpi_rev_dmi_table[] helps to work around this problem.
Signed-off-by: Michael Pobega <mpobega@neverware.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9523b9bf6d ]
Precision 5520 and 3520 either hang at login and during suspend or reboot.
It turns out that adding them to acpi_rev_dmi_table[] helps to work
around those issues.
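An acpi_rev_dmi_table[] entry for one of these boxes would look roughly like this (callback name assumed from the existing table):
	{
		.callback = dmi_enable_rev_override,
		.ident = "DELL Precision 5520",
		.matches = {
			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
			DMI_MATCH(DMI_PRODUCT_NAME, "Precision 5520"),
		},
	},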
Signed-off-by: Alex Hung <alex.hung@canonical.com>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e950267ab8 ]
Some devices have invalid baSourceID references, causing uvc_scan_chain()
to fail, but if we just take the entities we can find and put them
together in the most sensible chain we can think of, it turns out they do
work anyway. Note: This heuristic assumes there is a single chain.
At the time of writing, devices known to have such a broken chain are
- Acer Integrated Camera (5986:055a)
- Realtek rtl157a7 (0bda:57a7)
Signed-off-by: Henrik Ingo <henrik.ingo@avoinelama.fi>
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 25cdb64510 ]
The WRITE_SAME commands are not present in the blk_default_cmd_filter
write_ok list, and thus are failed with -EPERM when the SG_IO ioctl()
is executed without CAP_SYS_RAWIO capability (e.g., unprivileged users).
[ sg_io() -> blk_fill_sghdr_rq() > blk_verify_command() -> -EPERM ]
The problem can be reproduced with the sg_write_same command
# sg_write_same --num 1 --xferlen 512 /dev/sda
#
# capsh --drop=cap_sys_rawio -- -c \
'sg_write_same --num 1 --xferlen 512 /dev/sda'
Write same: pass through os error: Operation not permitted
#
For comparison, the WRITE_VERIFY command does not observe this problem,
since it is in that list:
# capsh --drop=cap_sys_rawio -- -c \
'sg_write_verify --num 1 --ilen 512 --lba 0 /dev/sda'
#
So, this patch adds the WRITE_SAME commands to the list, in order
for the SG_IO ioctl to finish successfully:
# capsh --drop=cap_sys_rawio -- -c \
'sg_write_same --num 1 --xferlen 512 /dev/sda'
#
That case happens to be exercised by QEMU KVM guests with 'scsi-block' devices
(qemu "-device scsi-block" [1], libvirt "<disk type='block' device='lun'>" [2]),
which employs the SG_IO ioctl() and runs as an unprivileged user (libvirt-qemu).
In that scenario, when a filesystem (e.g., ext4) performs its zero-out calls,
which are translated to write-same calls in the guest kernel, and then into
SG_IO ioctls to the host kernel, SCSI I/O errors may be observed in the guest:
[...] sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[...] sd 0:0:0:0: [sda] tag#0 Sense Key : Aborted Command [current]
[...] sd 0:0:0:0: [sda] tag#0 Add. Sense: I/O process terminated
[...] sd 0:0:0:0: [sda] tag#0 CDB: Write Same(10) 41 00 01 04 e0 78 00 00 08 00
[...] blk_update_request: I/O error, dev sda, sector 17096824
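For reference, the change boils down to whitelisting the three WRITE SAME opcodes next to the existing write commands; a sketch, assuming the __set_bit() pattern used by blk_set_cmd_filter_defaults():
	__set_bit(WRITE_SAME, filter->write_ok);
	__set_bit(WRITE_SAME_16, filter->write_ok);
	__set_bit(WRITE_SAME_32, filter->write_ok);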
Links:
[1] http://git.qemu.org/?p=qemu.git;a=commit;h=336a6915bc7089fb20fea4ba99972ad9a97c5f52
[2] https://libvirt.org/formatdomain.html#elementsDisks (see 'disk' -> 'device')
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Brahadambal Srinivasan <latha@linux.vnet.ibm.com>
Reported-by: Manjunatha H R <manjuhr1@in.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d9c728949d ]
We are going to allow the userspace to configure a container in
one memory context and pass the container fd to another, so
we are postponing memory allocations accounted against
the locked memory limit. One of the previous patches took care of
it_userspace.
At the moment we create the default DMA window when the first group is
attached to a container; this is done for the userspace which is not
DDW-aware but familiar with the SPAPR TCE IOMMU v2 in the part of memory
pre-registration - such client expects the default DMA window to exist.
This postpones the default DMA window allocation till one of
the following happens:
1. first map/unmap request arrives;
2. new window is requested;
This adds a no-op for the case when the userspace requests removal
of the default window which has not been created yet.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6f01cc692a ]
There is already a helper to create a DMA window which does allocate
a table and programs it to the IOMMU group. However
tce_iommu_take_ownership_ddw() did not use it and did these 2 calls
itself to simplify error path.
Since we are going to delay the default window creation till
the default window is accessed/removed or a new window is added,
we need a helper to create a default window from all these cases.
This adds tce_iommu_create_default_window(). Since it relies on
a VFIO container to have at least one IOMMU group (for future use),
this changes tce_iommu_attach_group() to add a group to the container
first and then call the new helper.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4b6fad7097 ]
At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().
This approach has a problem: an MM of the userspace process
may live longer than the userspace process itself, as kernel threads
use the MM of the userspace process that was running on the CPU where
the kernel thread was scheduled. If this happens, the MM remains
referenced until this exact kernel thread wakes up again
and releases the very last reference to the MM; on an idle system this
can take hours.
This moves preregistered regions tracking from MM to VFIO; instead of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases regions which it has pre-registered.
This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However it should not
have any practical effect as the only userspace tool available now
does register memory region once per container anyway.
As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bc82d122ae ]
In some situations the userspace memory context may live longer than
the userspace process itself so if we need to do proper memory context
cleanup, we better have tce_container take a reference to mm_struct and
use it later when the process is gone (@current or @current->mm is NULL).
This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.
This replaces current->mm with container->mm everywhere except debug
prints.
This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.
DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.
This does not reference a task, as multiple threads within the same mm
are allowed to ioctl() to vfio and supposedly they will have the same limits
and capabilities; if they do not, we'll just fail with no harm done.
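A sketch of the helper as described above (pre-4.11 mm_count API; treat the details as approximate):
	static long tce_iommu_mm_set(struct tce_container *container)
	{
		if (container->mm) {
			/* all further ioctls must come from the same mm */
			return container->mm == current->mm ? 0 : -EPERM;
		}

		BUG_ON(!current->mm);
		container->mm = current->mm;
		atomic_inc(&container->mm->mm_count);	/* dropped on container release */

		return 0;
	}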
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d7baee6901 ]
This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.
This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.
This moves the mm_iommu_adjust_locked_vm() call to the caller as
it receives mm_iommu_table_group_mem_t but it needs mm.
This should cause no behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 88f54a3581 ]
We are going to get rid of @current references in mmu_context_boos3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct which does have
the reference counter.
This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).
This should not cause any behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 39701e56f5 ]
The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.
As we are going to allow the userspace to configure a container in one
memory context and pass the container fd to another, we have to postpone
such allocations till the container fd is passed to the destination
user process, so that we account the locked memory limit against the actual
container user's constraints.
This postpones the it_userspace array allocation till it is first used
for mapping. The unmapping patch already checks if the array is
allocated.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fa32ff6576 ]
With wrap around mappings in place we can always provide drivers with
direct links to packets on the ring buffer, even when they wrap around.
Do the required updates to get_next_pkt_raw()/put_pkt_raw()
The first version of this commit was reverted (65a532f3d5) to deal with
cross-tree merge issues which are (hopefully) resolved now.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f40ec3c748 ]
Previously we enabled VFs and enable their memory space before calling
pcibios_sriov_enable(). But pcibios_sriov_enable() may update the VF BARs:
for example, on PPC PowerNV we may change them to manage the association of
VFs to PEs.
Because 64-bit BARs cannot be updated atomically, it's unsafe to update
them while they're enabled. The half-updated state may conflict with other
devices in the system.
Call pcibios_sriov_enable() before enabling the VFs so any BAR updates
happen while the VF BARs are disabled.
[bhelgaas: changelog]
Tested-by: Carol Soto <clsoto@us.ibm.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 63880b230a ]
VF BARs are read-only zero, so updating VF BARs will not have any effect.
See the SR-IOV spec r1.1, sec 3.4.1.11.
We already ignore these updates because of 70675e0b6a ("PCI: Don't try to
restore VF BARs"); this merely restructures it slightly to make it easier
to split updates for standard and SR-IOV BARs.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45d004f4af ]
The BAR property bits (0-3 for memory BARs, 0-1 for I/O BARs) are supposed
to be read-only, but we do save them in res->flags and include them when
updating the BAR.
Mask the I/O property bits with ~PCI_BASE_ADDRESS_IO_MASK (0x3) instead of
PCI_REGION_FLAG_MASK (0xf) to make it obvious that we can't corrupt bits
2-3 of I/O addresses.
Use PCI_ROM_ADDRESS_MASK for ROM BARs. This means we'll only check the top
21 bits (instead of the 28 bits we used to check) of a ROM BAR to see if
the update was successful.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 546ba9f8f2 ]
If we update a VF BAR while it's enabled, there are two potential problems:
1) Any driver that's using the VF has a cached BAR value that is stale
after the update, and
2) We can't update 64-bit BARs atomically, so the intermediate state
(new lower dword with old upper dword) may conflict with another
device, and an access by a driver unrelated to the VF may cause a bus
error.
Warn about attempts to update VF BARs while they are enabled. This is a
programming error, so use dev_WARN() to get a backtrace.
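A sketch of the check (SR-IOV control register bits per the PCI spec; exact placement approximate):
	u16 cmd;

	pci_read_config_word(dev, dev->sriov->pos + PCI_SRIOV_CTRL, &cmd);
	if ((cmd & PCI_SRIOV_CTRL_VFE) && (cmd & PCI_SRIOV_CTRL_MSE)) {
		dev_WARN(&dev->dev, "can't update enabled VF BAR%d %pR\n",
			 resno, res);
		return;
	}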
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7a6d312b50 ]
Remove the assumption that IORESOURCE_ROM_ENABLE == PCI_ROM_ADDRESS_ENABLE.
PCI_ROM_ADDRESS_ENABLE is the ROM enable bit defined by the PCI spec, so if
we're reading or writing a BAR register value, that's what we should use.
IORESOURCE_ROM_ENABLE is a corresponding bit in struct resource flags.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0b457dde3c ]
pci_update_resource() updates a hardware BAR so its address matches the
kernel's struct resource UNLESS it's a disabled ROM BAR. We only update
those when we enable the ROM.
It's not obvious from the code why ROM BARs should be handled specially.
Apparently there are Matrox devices with defective ROM BARs that read as
zero when disabled. That means that if pci_enable_rom() reads the disabled
BAR, sets PCI_ROM_ADDRESS_ENABLE (without re-inserting the address), and
writes it back, it would enable the ROM at address zero.
Add comments and references to explain why we can't make the code look more
rational.
The code changes are from 755528c860 ("Ignore disabled ROM resources at
setup") and 8085ce084c ("[PATCH] Fix PCI ROM mapping").
Link: https://lkml.org/lkml/2005/8/30/138
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 286c2378aa ]
pci_std_update_resource() only deals with standard BARs, so we don't have
to worry about the complications of VF BARs in an SR-IOV capability.
Compute the BAR address inline and remove pci_resource_bar(). That makes
pci_iov_resource_bar() unused, so remove that as well.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6ffa2489c5 ]
Previously pci_update_resource() used the same code path for updating
standard BARs and VF BARs in SR-IOV capabilities.
Split the VF BAR update into a new pci_iov_update_resource() internal
interface, which makes it simpler to compute the BAR address (we can get
rid of pci_resource_bar() and pci_iov_resource_bar()).
This patch:
- Renames pci_update_resource() to pci_std_update_resource(),
- Adds pci_iov_update_resource(),
- Makes pci_update_resource() a wrapper that calls the appropriate one,
No functional change intended.
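The resulting wrapper is small; roughly:
	void pci_update_resource(struct pci_dev *dev, int resno)
	{
		if (resno <= PCI_ROM_RESOURCE)
			pci_std_update_resource(dev, resno);
	#ifdef CONFIG_PCI_IOV
		else if (resno >= PCI_IOV_RESOURCES && resno <= PCI_IOV_RESOURCE_END)
			pci_iov_update_resource(dev, resno);
	#endif
	}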
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 59107e2f48 ]
There is a feature in Hyper-V ('Debug-VM --InjectNonMaskableInterrupt')
which injects NMI to the guest. We may want to crash the guest and do kdump
on this NMI by enabling unknown_nmi_panic. To make kdump succeed we need to
allow the kdump kernel to re-establish VMBus connection so it will see
VMBus devices (storage, network,..).
To properly unload VMBus making it possible to start over during kdump we
need to do the following:
- Send an 'unload' message to the hypervisor. This can be done on any CPU
so we do this on the crashing CPU.
- Receive the 'unload finished' reply message. WS2012R2 delivers this
message to the CPU which was used to establish VMBus connection during
module load and this CPU may differ from the CPU sending 'unload'.
Receiving a VMBus message means the following:
- There is a per-CPU slot in memory for one message. This slot can in
theory be accessed by any CPU.
- We get an interrupt on the CPU when a message was placed into the slot.
- When we read the message we need to clear the slot and signal the fact
to the hypervisor. In case there are more messages to this CPU pending
the hypervisor will deliver the next message. The signaling is done by
writing to an MSR so this can only be done on the appropriate CPU.
To avoid doing cross-CPU work on crash we have vmbus_wait_for_unload()
function which checks message slots for all CPUs in a loop waiting for the
'unload finished' messages. However, there is an issue which arises when
these conditions are met:
- We're crashing on a CPU which is different from the one which was used
to initially contact the hypervisor.
- The CPU which was used for the initial contact is blocked with interrupts
disabled and there is a message pending in the message slot.
In this case we won't be able to read the 'unload finished' message on the
crashing CPU. This is reproducible when we receive unknown NMIs on all CPUs
simultaneously: the first CPU entering panic() will proceed to crash and
all other CPUs will stop themselves with interrupts disabled.
The suggested solution is to handle unknown NMIs for Hyper-V guests on the
first CPU which gets them only. This will allow us to rely on VMBus
interrupt handler being able to receive the 'unload finish' message in
case it is delivered to a different CPU.
The issue is not reproducible on WS2016 as Debug-VM delivers NMI to the
boot CPU only, WS2012R2 and earlier Hyper-V versions are affected.
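A sketch of such an NMI handler, registered for unknown NMIs; the atomic ensures only the first CPU keeps the NMI while all others return it as handled:
	static atomic_t nmi_cpu = ATOMIC_INIT(-1);

	static int hv_nmi_unknown(unsigned int val, struct pt_regs *regs)
	{
		if (!unknown_nmi_panic)
			return NMI_DONE;

		/* first CPU to get here keeps the NMI; everyone else backs off */
		if (atomic_cmpxchg(&nmi_cpu, -1, raw_smp_processor_id()) != -1)
			return NMI_HANDLED;

		return NMI_DONE;
	}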
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: devel@linuxdriverproject.org
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Link: http://lkml.kernel.org/r/20161202100720.28121-1-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c9b3379f60 ]
This patch changes the way the IBM vSCSI server driver manages its
Command/Response Queue (CRQ). We used to register the CRQ with phyp at
probe time. Now we wait until tpg_enable_store. Similarly, when
tpg_enable_store is called to "disable" (i.e. the stored value is 0),
we unregister the queue with phyp.
One consequence of this is that we have no need for the PART_UP_WAIT_ENAB
state: we can't get an Init Message from the client in our CRQ while
we're waiting to be enabled, since we haven't registered the queue yet.
Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79fac9c9b7 ]
This patch reorders functions in a manner necessary for a follow-on
patch. It also makes some minor styling changes (mostly removing extra
spaces) and fixes some typos.
There are no code changes in this patch, with one exception: due to the
reordering of the functions, I needed to explicitly declare a function
at the top of the file. However, this will be removed in the next patch,
since the code requiring the predeclaration will be removed.
Signed-off-by: Michael Cyr <mikecyr@us.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4e684f59d7 ]
Sometimes firmware may not properly initialize I347AT4_PAGE_SELECT causing
the probe of an igb i210 NIC to fail. This patch adds an additional zeroing
of this register during igb_get_phy_id to workaround this issue.
Thanks to Jochen Henneberg for the idea and original patch.
Signed-off-by: Chris J Arges <christopherarges@gmail.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c74fd80f2f ]
Revert the main part of commit:
af42b8d12f ("xen: fix MSI setup and teardown for PV on HVM guests")
That commit introduced reading the pci device's msi message data to see
if a pirq was previously configured for the device's msi/msix, and re-use
that pirq. At the time, that was the correct behavior. However, a
later change to Qemu caused it to call into the Xen hypervisor to unmap
all pirqs for a pci device, when the pci device disables its MSI/MSIX
vectors; specifically the Qemu commit:
c976437c7dba9c7444fb41df45468968aaa326ad
("qemu-xen: free all the pirqs for msi/msix when driver unload")
Once Qemu added this pirq unmapping, it was no longer correct for the
kernel to re-use the pirq number cached in the pci device msi message
data. All Qemu releases since 2.1.0 contain the patch that unmaps the
pirqs when the pci device disables its MSI/MSIX vectors.
This bug is causing failures to initialize multiple NVMe controllers
under Xen, because the NVMe driver sets up a single MSIX vector for
each controller (concurrently), and then after using that to talk to
the controller for some configuration data, it disables the single MSIX
vector and re-configures all the MSIX vectors it needs. So the MSIX
setup code tries to re-use the cached pirq from the first vector
for each controller, but the hypervisor has already given away that
pirq to another controller, and its initialization fails.
This is discussed in more detail at:
https://lists.xen.org/archives/html/xen-devel/2017-01/msg00447.html
Fixes: af42b8d12f ("xen: fix MSI setup and teardown for PV on HVM guests")
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21d25f6a42 upstream.
On a kernel with DEBUG_LOCKS, ioat_free_chan_resources triggers an
in_interrupt() warning. With PROVE_LOCKING, it reports detecting a
SOFTIRQ-safe to SOFTIRQ-unsafe lock ordering in the same code path.
This is because dma_generic_alloc_coherent() checks if the GFP flags
permit blocking. It allocates from different subsystems if blocking is
permitted. The free path knows how to return the memory to the correct
allocator. If GFP_KERNEL is specified then the alloc and free end up
going through cma_alloc(), which uses mutexes.
Given that ioat_free_chan_resources() can be called in interrupt
context, ioat_alloc_chan_resources() must specify GFP_NOWAIT so that the
allocations do not block and instead use an allocator that uses
spinlocks.
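In terms of the DMA API this means the descriptor allocations move from GFP_KERNEL to GFP_NOWAIT; a sketch:
	/* GFP_NOWAIT steers dma_alloc_coherent() away from blocking allocators
	 * (e.g. CMA with its mutex), so the matching free can safely be done
	 * from ioat_free_chan_resources() in interrupt context
	 */
	buf = dma_alloc_coherent(dev, size, &dma_handle, GFP_NOWAIT);
	if (!buf)
		return -ENOMEM;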
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6760bf2ddd ]
Martin reported a verifier issue that hit the BUG_ON() for his
test case in the mark_reg_unknown_value() function:
[ 202.861380] kernel BUG at kernel/bpf/verifier.c:467!
[...]
[ 203.291109] Call Trace:
[ 203.296501] [<ffffffff811364d5>] mark_map_reg+0x45/0x50
[ 203.308225] [<ffffffff81136558>] mark_map_regs+0x78/0x90
[ 203.320140] [<ffffffff8113938d>] do_check+0x226d/0x2c90
[ 203.331865] [<ffffffff8113a6ab>] bpf_check+0x48b/0x780
[ 203.343403] [<ffffffff81134c8e>] bpf_prog_load+0x27e/0x440
[ 203.355705] [<ffffffff8118a38f>] ? handle_mm_fault+0x11af/0x1230
[ 203.369158] [<ffffffff812d8188>] ? security_capable+0x48/0x60
[ 203.382035] [<ffffffff811351a4>] SyS_bpf+0x124/0x960
[ 203.393185] [<ffffffff810515f6>] ? __do_page_fault+0x276/0x490
[ 203.406258] [<ffffffff816db320>] entry_SYSCALL_64_fastpath+0x13/0x94
This issue got uncovered after the fix in a08dd0da53 ("bpf: fix
regression on verifier pruning wrt map lookups"). The reason why it
wasn't noticed before is that, as mentioned in a08dd0da53,
mark_map_regs() was doing the id matching incorrectly based on the
uncached regs[regno].id. So, in the first loop, we walked all regs
and as soon as we found regno == i, this reg's id was cleared
when calling mark_reg_unknown_value(), so that every subsequent
register was probed against an id of 0 (which, in combination with the
PTR_TO_MAP_VALUE_OR_NULL type is an invalid condition that no other
register state can hold), and therefore wasn't type transitioned such
as in the spilled register case for the second loop.
Now since that got fixed, it turned out that 57a09bf0a4 ("bpf:
Detect identical PTR_TO_MAP_VALUE_OR_NULL registers") used
mark_reg_unknown_value() incorrectly for the spilled regs, and thus
hitting the BUG_ON() in some cases due to regno >= MAX_BPF_REG.
Although spilled regs have the same type as the non-spilled regs
for the verifier state, that is, struct bpf_reg_state, they are
semantically different from the non-spilled regs. In other words,
there can be up to 64 (MAX_BPF_STACK / BPF_REG_SIZE) spilled regs
in the stack, for example, register R<x> could have been spilled by
the program to stack location X, Y, Z, and in mark_map_regs() we
need to scan these stack slots of type STACK_SPILL for potential
registers that we have to transition from PTR_TO_MAP_VALUE_OR_NULL.
Therefore, depending on the location, the spilled_regs regno can
be a lot higher than just MAX_BPF_REG's value since we operate on
stack instead. The reset in mark_reg_unknown_value() itself is
just fine, only that the BUG_ON() was inappropriate for this. Fix
it by making a __mark_reg_unknown_value() version that can be
called from mark_map_reg() generically; we know for the non-spilled
case that the regno is always < MAX_BPF_REG anyway.
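A sketch of the split described above:
	static void __mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
	{
		/* no bounds check: spilled stack slots legitimately index past MAX_BPF_REG */
		regs[regno].type = UNKNOWN_VALUE;
		regs[regno].id = 0;
		regs[regno].imm = 0;
	}

	static void mark_reg_unknown_value(struct bpf_reg_state *regs, u32 regno)
	{
		BUG_ON(regno >= MAX_BPF_REG);	/* still valid for real registers */
		__mark_reg_unknown_value(regs, regno);
	}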
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Reported-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a08dd0da53 ]
Commit 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL
registers") introduced a regression where existing programs stopped
loading due to reaching the verifier's maximum complexity limit,
whereas prior to this commit they were loading just fine; the affected
program has roughly 2k instructions.
What was found is that state pruning couldn't be performed effectively
anymore due to mismatches of the verifier's register state, in particular
in the id tracking. It doesn't mean that 57a09bf0a4 is incorrect per
se, but rather that verifier needs to perform a lot more work for the
same program with regards to involved map lookups.
Since commit 57a09bf0a4 is only about tracking registers with type
PTR_TO_MAP_VALUE_OR_NULL, the id is only needed to follow registers
until they are promoted through pattern matching with a NULL check to
either PTR_TO_MAP_VALUE or UNKNOWN_VALUE type. After that point, the
id becomes irrelevant for the transitioned types.
For UNKNOWN_VALUE, id is already reset to 0 via mark_reg_unknown_value(),
but not so for PTR_TO_MAP_VALUE where id is becoming stale. It's even
transferred further into other types that don't make use of it. Among
others, one example is where UNKNOWN_VALUE is set on function call
return with RET_INTEGER return type.
states_equal() will then fall through the memcmp() on register state;
note that the second memcmp() uses offsetofend(), so the id is part of
that since d2a4dd37f6 ("bpf: fix state equivalence"). But the bisect
pointed already to 57a09bf0a4, where we really reach beyond complexity
limit. What I found was that states_equal() often failed in this
case due to id mismatches in spilled regs with registers in type
PTR_TO_MAP_VALUE. Unlike non-spilled regs, spilled regs just perform
a memcmp() on their reg state and don't have any other optimizations
in place, therefore also id was relevant in this case for making a
pruning decision.
We can safely reset id to 0 as well when converting to PTR_TO_MAP_VALUE.
For the affected program, it resulted in a ~17 fold reduction of
complexity and let the program load fine again. Selftest suite also
runs fine. The only other place where env->id_gen is used currently is
through direct packet access, but for these cases id is long living, thus
a different scenario.
Also, the current logic in mark_map_regs() is not fully correct when
marking NULL branch with UNKNOWN_VALUE. We need to cache the destination
reg's id in any case. Otherwise, once we marked that reg as UNKNOWN_VALUE,
its id is reset and any subsequent registers that hold the original id
and are of type PTR_TO_MAP_VALUE_OR_NULL won't be marked UNKNOWN_VALUE
anymore, since mark_map_reg() reuses the uncached regs[regno].id that
was just overridden. Note, we don't need to cache it outside of
mark_map_regs(), since it's called once on this_branch and the other
time on other_branch, which are both two independent verifier states.
A test case for this is added here, too.
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d2a4dd37f6 ]
Commits 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
and 484611357c ("bpf: allow access into map value arrays") by themselves
are correct, but in combination they make state equivalence ignore 'id' field
of the register state which can lead to accepting invalid program.
Fixes: 57a09bf0a4 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers")
Fixes: 484611357c ("bpf: allow access into map value arrays")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57a09bf0a4 ]
A BPF program is required to check the return register of a
map_elem_lookup() call before accessing memory. The verifier keeps
track of this by converting the type of the result register from
PTR_TO_MAP_VALUE_OR_NULL to PTR_TO_MAP_VALUE after a conditional
jump ensures safety. This check is currently exclusively performed
for the result register 0.
In the event the compiler reorders instructions, BPF_MOV64_REG
instructions may be moved before the conditional jump which causes
them to keep their type PTR_TO_MAP_VALUE_OR_NULL to which the
verifier objects when the register is accessed:
0: (b7) r1 = 10
1: (7b) *(u64 *)(r10 -8) = r1
2: (bf) r2 = r10
3: (07) r2 += -8
4: (18) r1 = 0x59c00000
6: (85) call 1
7: (bf) r4 = r0
8: (15) if r0 == 0x0 goto pc+1
R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp
9: (7a) *(u64 *)(r4 +0) = 0
R4 invalid mem access 'map_value_or_null'
This commit extends the verifier to keep track of all identical
PTR_TO_MAP_VALUE_OR_NULL registers after a map_elem_lookup() by
assigning them an ID and then marking them all when the conditional
jump is observed.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72ef9c4125 ]
This patch fixes a memory leak, which happens if the connection request
is not fulfilled between parsing the DCCP options and handling the SYN
(because e.g. the backlog is full), because we forgot to free the
list of ack vectors.
Reported-by: Jianwen Ji <jiji@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b20e2d5478 ]
aszlig observed failing ssh tunnels (-w) during initialization since
commit cc9da6cc4f ("ipv6: addrconf: use stable address generator for
ARPHRD_NONE"). We already had reports that the mentioned commit breaks
Juniper VPN connections. I can't clearly say that the Juniper VPN client
has the same problem, but it is worth a try to hint to this patch.
Because of the early generation of link local addresses, the kernel now
can start asking for routers on the local subnet much earlier than usual.
Those router solicitation packets arrive inside the ssh channels and
should be transmitted to the tun fd before the configuration scripts
might have upped the interface and made it ready for transmission.
ssh polls on the interface and receives back a POLL_OUT. It tries to send
the early router solicitation packet to the tun interface. Unfortunately
it hasn't been brought up yet by the config scripts, thus failing with -EIO. ssh
doesn't retry again and considers the tun interface broken forever.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=121131
Fixes: cc9da6cc4f ("ipv6: addrconf: use stable address generator for ARPHRD_NONE")
Cc: Bjørn Mork <bjorn@mork.no>
Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Reported-by: Jonas Lippuner <jonas@lippuner.ca>
Cc: Jonas Lippuner <jonas@lippuner.ca>
Reported-by: aszlig <aszlig@redmoonstudios.org>
Cc: aszlig <aszlig@redmoonstudios.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 45caeaa5ac ]
As Eric Dumazet pointed out this also needs to be fixed in IPv6.
v2: Contains the IPv6 tcp/IPv6 dccp patches as well.
We have seen a few incidents lately where a dst_enty has been freed
with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that
dst_entry. If the conditions/timings are right a crash then ensues when the
freed dst_entry is referenced later on. A common crashing backtrace is:
#8 [] page_fault at ffffffff8163e648
[exception RIP: __tcp_ack_snd_check+74]
.
.
#9 [] tcp_rcv_established at ffffffff81580b64
#10 [] tcp_v4_do_rcv at ffffffff8158b54a
#11 [] tcp_v4_rcv at ffffffff8158cd02
#12 [] ip_local_deliver_finish at ffffffff815668f4
#13 [] ip_local_deliver at ffffffff81566bd9
#14 [] ip_rcv_finish at ffffffff8156656d
#15 [] ip_rcv at ffffffff81566f06
#16 [] __netif_receive_skb_core at ffffffff8152b3a2
#17 [] __netif_receive_skb at ffffffff8152b608
#18 [] netif_receive_skb at ffffffff8152b690
#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3]
#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3]
#21 [] net_rx_action at ffffffff8152bac2
#22 [] __do_softirq at ffffffff81084b4f
#23 [] call_softirq at ffffffff8164845c
#24 [] do_softirq at ffffffff81016fc5
#25 [] irq_exit at ffffffff81084ee5
#26 [] do_IRQ at ffffffff81648ff8
Of course it may happen with other NIC drivers as well.
The freed dst_entry was found here:
224 static bool tcp_in_quickack_mode(struct sock *sk)↩
225 {↩
226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩
227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩
228 ↩
229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩
230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩
231 }↩
But there are other backtraces attributed to the same freed dst_entry in
netfilter code as well.
All the vmcores showed 2 significant clues:
- Remote hosts behind the default gateway had always been redirected to a
different gateway. An rtable/dst_entry will be added for each such host,
creating more dst_entries with lower reference counts and making this more
probable.
- All vmcores showed a positive LockDroppedIcmps value, e.g:
LockDroppedIcmps 267
A closer look at the tcp_v4_err() handler revealed that do_redirect() will run
regardless of whether user space has the socket locked. This can result in a
race condition where the same dst_entry cached in sk->sk_dst_cache can be
decremented twice for the same socket via:
do_redirect()->__sk_dst_check()->dst_release().
This leads to the dst_entry being prematurely freed with another socket
pointing to it via sk->sk_dst_cache, and a subsequent crash.
To fix this, skip do_redirect() if userspace has the socket locked. Instead let
the redirect take place later when user space does not have the socket
locked.
The dccp/IPv6 code is very similar in this respect, so fixing it there too.
As Eric Garver pointed out, the following commit now invalidates routes, which
can set the dst->obsolete flag so that ipv4_dst_check() returns null and
triggers the dst_release().
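The effect of the double decrement can be pictured with a toy refcount
model (illustrative only; this is not the kernel's dst code):
#include <stdio.h>
#include <stdlib.h>

struct dst { int refcnt; };             /* stand-in for a dst_entry */

static void dst_release(struct dst *d)
{
        if (--d->refcnt == 0) {
                printf("dst freed\n");
                free(d);
        }
}

int main(void)
{
        struct dst *d = malloc(sizeof(*d));

        if (!d)
                return 1;
        d->refcnt = 2;                  /* cached by two sockets */

        /* Socket A ends up calling dst_release() twice on the same cached
         * entry (the race described above)... */
        dst_release(d);
        dst_release(d);                 /* freed here */

        /* ...while socket B still holds its now-dangling cached pointer;
         * any later dereference of it is a use-after-free. */
        return 0;
}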
Fixes: ceb3320610 ("ipv4: Kill routes during PMTU/redirect updates.")
Cc: Eric Garver <egarver@redhat.com>
Cc: Hannes Sowa <hsowa@redhat.com>
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a13b2082ec ]
Andreas reports kernel oops during rmmod of the br_netfilter module.
Hannes debugged the oops down to a NULL rt6info->rt6i_indev.
Problem is that br_netfilter has the nasty concept of adding a fake
rtable to skb->dst; this happens in a br_netfilter prerouting hook.
A second hook (in bridge LOCAL_IN) is supposed to remove these again
before the skb is handed up the stack.
However, on module unload hooks get unregistered which means an
skb could traverse the prerouting hook that attaches the fake_rtable,
while the 'fake rtable remove' hook gets removed from the hooklist
immediately after.
Fixes: 34666d467c ("netfilter: bridge: move br_netfilter out of the core")
Reported-by: Andreas Karis <akaris@redhat.com>
Debugged-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79e49503ef ]
ip6_fragment, in case skb has a fraglist, checks if the
skb is cloned. If it is, it will move to the 'slow path' and allocates
new skbs for each fragment.
However, right before entering the slowpath loop, it updates the
nexthdr value of the last ipv6 extension header to NEXTHDR_FRAGMENT,
to account for the fragment header that will be inserted in the new
ipv6-fragment skbs.
In case original skb is cloned this munges nexthdr value of another
skb. Avoid this by doing the nexthdr update for each of the new fragment
skbs separately.
This was observed with tcpdump on a bridge device where netfilter ipv6
reassembly is active: tcpdump shows malformed fragment headers as
the l4 header (icmpv6, tcp, etc.) is decoded as a fragment header.
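A tiny user-space model of why this matters (illustrative only; the point
is that skb_clone() shares the packet data rather than copying it):
#include <stdio.h>
#include <stdlib.h>

struct pkt {                            /* toy skb: header struct + shared data */
        unsigned char *data;
        unsigned int len;
};

static struct pkt *pkt_clone(const struct pkt *p)
{
        struct pkt *c = malloc(sizeof(*c));

        if (c)
                *c = *p;                /* clone: same data pointer, no copy */
        return c;
}

int main(void)
{
        unsigned char data[1] = { 58 }; /* pretend this byte is the nexthdr field */
        struct pkt orig = { data, sizeof(data) };
        struct pkt *clone = pkt_clone(&orig);

        if (!clone)
                return 1;
        orig.data[0] = 44;              /* write NEXTHDR_FRAGMENT via the original */

        /* The clone sees the modification too: its headers were munged. */
        printf("clone nexthdr: %u\n", clone->data[0]);
        free(clone);
        return 0;
}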
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Andreas Karis <akaris@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 67e194007b ]
Commit 2759647247 ("ipv6: fix ECMP route replacement") introduced a
loop that removes all siblings of an ECMP route that is being
replaced. However, this loop doesn't stop when it has replaced
siblings, and keeps removing other routes with a higher metric.
We also end up triggering the WARN_ON after the loop, because after
this nsiblings < 0.
Instead, stop the loop when we have taken care of all routes with the
same metric as the route being replaced.
Reproducer:
===========
#!/bin/sh
ip netns add ns1
ip netns add ns2
ip -net ns1 link set lo up
for x in 0 1 2 ; do
ip link add veth$x netns ns2 type veth peer name eth$x netns ns1
ip -net ns1 link set eth$x up
ip -net ns2 link set veth$x up
done
ip -net ns1 -6 r a 2000::/64 nexthop via fe80::0 dev eth0 \
nexthop via fe80::1 dev eth1 nexthop via fe80::2 dev eth2
ip -net ns1 -6 r a 2000::/64 via fe80::42 dev eth0 metric 256
ip -net ns1 -6 r a 2000::/64 via fe80::43 dev eth0 metric 2048
echo "before replace, 3 routes"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
echo
ip -net ns1 -6 r c 2000::/64 nexthop via fe80::4 dev eth0 \
nexthop via fe80::5 dev eth1 nexthop via fe80::6 dev eth2
echo "after replace, only 2 routes, metric 2048 is gone"
ip -net ns1 -6 r | grep -v '^fe80\|^ff00'
Fixes: 2759647247 ("ipv6: fix ECMP route replacement")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 79099aab38 ]
Multipath routes can be rendered useless when a device in one of the
paths is deleted. For example:
$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2 dev virt12
nexthop as to 300 via inet 172.16.3.2 dev br0
101
nexthop as to 201 via inet6 2000:2::2 dev virt12
nexthop as to 301 via inet6 2000:3::2 dev br0
$ ip li del br0
When br0 is deleted the other hop is not considered in
mpls_select_multipath because of the alive check -- rt_nhn_alive
is 0.
rt_nhn_alive is decremented once in mpls_ifdown when the device is taken
down (NETDEV_DOWN) and again when it is deleted (NETDEV_UNREGISTER). For
a 2 hop route, deleting one device drops the alive count to 0. Since
devices are taken down before unregistering, the decrement on
NETDEV_UNREGISTER is redundant.
Fixes: c89359a42e ("mpls: support for dead routes")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e37791ec1a ]
When the mpls_router module is unloaded, mpls routes are deleted but
notifications are not sent to userspace leaving userspace caches
out of sync. Add the call to mpls_notify_route in mpls_net_exit as
routes are freed.
Fixes: 0189197f44 ("mpls: Basic routing support")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 745cb7f8a5 ]
Replace MAX_ADDR_LEN with its numeric value to fix the following
linux/packet_diag.h userspace compilation error:
/usr/include/linux/packet_diag.h:67:17: error: 'MAX_ADDR_LEN' undeclared here (not in a function)
__u8 pdmc_addr[MAX_ADDR_LEN];
This is not the first case in the UAPI where the numeric value
of MAX_ADDR_LEN is used instead of symbolic one, uapi/linux/if_link.h
already does the same:
$ grep MAX_ADDR_LEN include/uapi/linux/if_link.h
__u8 mac[32]; /* MAX_ADDR_LEN */
There are no UAPI headers besides these two that use MAX_ADDR_LEN.
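The change itself is a one-liner; following the if_link.h precedent quoted
above, the member declaration ends up reading roughly as follows (shown in
isolation, not as a full quote of the header):
        __u8    pdmc_addr[32];  /* MAX_ADDR_LEN */
Since MAX_ADDR_LEN is 32, the layout and ABI are unchanged; only userspace
compilability improves.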
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 294acf1c01 ]
The gso code of several tunnels type (gre and udp tunnels)
takes for granted that the skb->inner_protocol is properly
initialized and drops the packet elsewhere.
On the forwarding path no one is initializing such field,
so gro encapsulated packets are dropped on forward.
Since commit 3872035241 ("gre: Use inner_proto to obtain
inner header protocol"), this can be reproduced when the
encapsulated packets use gre as the tunneling protocol.
The issue happens also with vxlan and geneve tunnels since
commit 8bce6d7d0d ("udp: Generalize skb_udp_segment"), if the
forwarding host's ingress nic has h/w offload for such tunnel
and a vxlan/geneve device is configured on top of it, regardless
of the configured peer address and vni.
To address the issue, this change initializes the inner_protocol
field for encapsulated packets in both ipv4 and ipv6 gro complete
callbacks.
Fixes: 3872035241 ("gre: Use inner_proto to obtain inner header protocol")
Fixes: 8bce6d7d0d ("udp: Generalize skb_udp_segment")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9ac25fc063 ]
TX skbs do not necessarily hold a reference on skb->sk->sk_refcnt
By the time TX completion happens, sk_refcnt might already be 0.
sock_hold()/sock_put() would then corrupt critical state, like
sk_wmem_alloc and lead to leaks or use after free.
Fixes: 62bccb8cdb ("net-timestamp: Make the clone operation stand-alone from phy timestamping")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 02b2faaf0a ]
Dmitry Vyukov reported a divide by 0 triggered by syzkaller, exploiting
tcp_disconnect() path that was never really considered and/or used
before syzkaller ;)
I was not able to reproduce the bug, but it seems the issues here are the
three possible actions that assumed they would never trigger on a
listener:
1) tcp_write_timer_handler
2) tcp_delack_timer_handler
3) MTU reduction
Only IPv6 MTU reduction was properly testing TCP_CLOSE and TCP_LISTEN
states from tcp_v6_mtu_reduced()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d5afb6f9b6 ]
The code where sk_clone() came from created a new socket and locked it,
but then, on the error path didn't unlock it.
This problem stayed there for a long while, till b0691c8ee7 ("net:
Unlock sock before calling sk_free()") fixed it, but unfortunately the
callers of sk_clone() (now sk_clone_locked()) were not audited and the
one in dccp_create_openreq_child() remained.
Now in the age of the syzkaller fuzzer, this was finally uncovered, as
reported by Dmitry:
---- 8< ----
I've got the following report while running syzkaller fuzzer on
86292b33d4 ("Merge branch 'akpm' (patches from Andrew)")
[ BUG: held lock freed! ]
4.10.0+ #234 Not tainted
-------------------------
syz-executor6/6898 is freeing memory
ffff88006286cac0-ffff88006286d3b7, with a lock still held there!
(slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
include/linux/spinlock.h:299 [inline]
(slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
5 locks held by syz-executor6/6898:
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>] lock_sock
include/net/sock.h:1460 [inline]
#0: (sk_lock-AF_INET6){+.+.+.}, at: [<ffffffff839a34b4>]
inet_stream_connect+0x44/0xa0 net/ipv4/af_inet.c:681
#1: (rcu_read_lock){......}, at: [<ffffffff83bc1c2a>]
inet6_csk_xmit+0x12a/0x5d0 net/ipv6/inet6_connection_sock.c:126
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_unlink
include/linux/skbuff.h:1767 [inline]
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>] __skb_dequeue
include/linux/skbuff.h:1783 [inline]
#2: (rcu_read_lock){......}, at: [<ffffffff8369b424>]
process_backlog+0x264/0x730 net/core/dev.c:4835
#3: (rcu_read_lock){......}, at: [<ffffffff83aeb5c0>]
ip6_input_finish+0x0/0x1700 net/ipv6/ip6_input.c:59
#4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>] spin_lock
include/linux/spinlock.h:299 [inline]
#4: (slock-AF_INET6){+.-...}, at: [<ffffffff8362c2c9>]
sk_clone_lock+0x3d9/0x12c0 net/core/sock.c:1504
Fix it just like was done by b0691c8ee7 ("net: Unlock sock before calling
sk_free()").
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170301153510.GE15145@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 13baa00ad0 ]
It is now very clear that silly TCP listeners might play with
enabling/disabling timestamping while new children are added
to their accept queue.
Meaning net_enable_timestamp() can be called from BH context
while current state of the static key is not enabled.
Lets play safe and allow all contexts.
The work queue is scheduled only under the problematic cases,
which are the static key enable/disable transition, to not slow down
critical paths.
This extends and improves what we did in commit 5fa8bbda38 ("net: use
a work queue to defer net_disable_timestamp() work")
Fixes: b90e5794c5 ("net: dont call jump_label_dec from irq context")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8953de2f02 ]
Even with multicast flooding turned off, IPv6 ND should still work so
that IPv6 connectivity is provided. Allow this by continuing to flood
multicast traffic originated by us.
Fixes: b6cb5ac833 ("net: bridge: add per-port multicast flood flag")
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Mike Manning <mmanning@brocade.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f7df4923fa ]
When the structure of the LPM tree changes (f.e., due to the addition of
a new prefix), we unbind the old tree and then bind the new one. This
may result in temporary packet loss.
Instead, overwrite the old binding with the new one.
Fixes: 6b75c4807d ("mlxsw: spectrum_router: Add virtual router management")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eab127717a ]
phy_error() is called in the PHY state machine workqueue context, and
calls phy_trigger_machine() which does a cancel_delayed_work_sync() of
the workqueue we execute from, causing a deadlock situation.
Augment phy_trigger_machine() with a sync boolean indicating
whether we should use cancel_*_sync() or just cancel_*_work().
Fixes: 3c293f4e08 ("net: phy: Trigger state machine on state change and not polling.")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 51fb60eb16 ]
l2tp_ip_backlog_recv must not return -1 if the packet gets dropped.
The return value is passed up to ip_local_deliver_finish, which treats
negative values as an IP protocol number for resubmission.
Signed-off-by: Paul Hüber <phueber@kernsp.in>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit edb9d1bff4 ]
When tc actions are loaded as a module and no actions have been installed,
flushing them would result in actions being removed from memory, but the
module reference count not being decremented, so that the module would not
be unloaded.
Following is example with GACT action:
% sudo modprobe act_gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions ls action gact
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 1
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 2
% sudo rmmod act_gact
rmmod: ERROR: Module act_gact is in use
....
After the fix:
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions add action pass index 1
% sudo tc actions add action pass index 2
% sudo tc actions add action pass index 3
% lsmod
Module Size Used by
act_gact 16384 3
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
%
% sudo tc actions flush action gact
% lsmod
Module Size Used by
act_gact 16384 0
% sudo rmmod act_gact
% lsmod
Module Size Used by
%
Fixes: f97017cdef ("net-sched: Fix actions flushing")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1158632b5a ]
When using IPv6 transport and a default dst, a pointer to the configured
source address is passed into the route lookup. If no source address is
configured, then the value is overwritten.
IPv6 route lookup ignores egress ifindex match if the source address is set,
so if egress ifindex match is desired, the source address must be passed
as any. The overwrite breaks this for subsequent lookups.
Avoid this by copying the configured address to an existing stack variable
and passing a pointer to that instead.
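A small user-space sketch of the pattern (illustrative only; the real code
deals with union vxlan_addr and the IPv6 route lookup):
#include <stdio.h>
#include <string.h>

struct addr { unsigned char b[16]; };

/* Stand-in for the route lookup: when the source is "any" it writes the
 * address it picked back through the pointer it was given. */
static void route_lookup(struct addr *saddr)
{
        if (saddr->b[0] == 0)
                memset(saddr->b, 0xfe, sizeof(saddr->b));
}

int main(void)
{
        struct addr configured = { { 0 } };     /* device configured with no source */
        struct addr saddr;

        /* Buggy pattern: route_lookup(&configured) overwrites the
         * configuration, so later lookups no longer pass "any" and the
         * egress ifindex match behaves differently.  Fixed pattern:
         * copy to a stack variable and let the lookup scribble on that. */
        saddr = configured;
        route_lookup(&saddr);

        printf("configured source still unset: %s\n",
               configured.b[0] == 0 ? "yes" : "no");
        return 0;
}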
Fixes: 272d96a5ab ("net: vxlan: lwt: Use source ip address during route lookup.")
Signed-off-by: Brian Russell <brussell@brocade.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7dcdf941cd ]
Align vti6 with vti by returning GRE_KEY flag. This enables iproute2
to display tunnel keys on "ip -6 tunnel show"
Signed-off-by: David Forster <dforster@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36154be40a ]
In cqe compression with striding RQ, the decompression of the CQE field
wqe_counter was done with a wrong wraparound value.
This caused handling cqes with a wrong pointer to wqe (rx descriptor)
and creating SKBs with wrong data, pointing to wrong (and already consumed)
strides/pages.
The meaning of the CQE field wqe_counter in striding RQ holds the
stride index instead of the WQE index. Hence, when decompressing
a CQE, wqe_counter should have wrapped-around the number of strides
in a single multi-packet WQE.
We drop this wrap-around mask entirely in CQE decompression of striding
RQ. It is not needed, as in such cases the CQE compression session would
break because of a different value of the wqe_id field, starting a new
compression session.
Tested:
ethtool -K ethxx lro off/on
ethtool --set-priv-flags ethxx rx_cqe_compress on
super_netperf 16 {ipv4,ipv6} -t TCP_STREAM -m 50 -D
verified no csum errors and no page refcount issues.
Fixes: 7219ab34f1 ("net/mlx5e: CQE compression")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Tom Herbert <tom@herbertland.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4078e637c1 ]
When rq_type is Striding RQ, no room for SKB_RESERVE is needed
as SKB allocation is not done via build_skb.
Fixes: e4b8550807 ("net/mlx5e: Slightly reduce hardware LRO size")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6f08a22c5f ]
Currently vport representors are added only on driver load and removed on
driver unload. Apparently we forgot to handle them when we added the
seamless reset flow feature. This left the representor
netdevs alive and active with open HW resources on PCI shutdown and on
error reset flows.
To overcome this we move their handling to interface attach/detach, so
they would be cleaned up on shutdown and recreated on reset flows.
Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45bded2c21 upstream.
Make sure that the Q counters are supported by the FW before trying
to allocate/deallocate them; this will avoid driver load failure when
they aren't supported by the FW.
Fixes: 0837e86a7a ('IB/mlx5: Add per port counters')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d06863f90 upstream.
Fix a BUG when the kernel tries to mount a file system constructed as
follows:
echo foo > foo.txt
mke2fs -Fq -t ext4 -O encrypt foo.img 100
debugfs -w foo.img << EOF
write foo.txt a
set_inode_field a i_flags 0x80800
set_super_value s_last_orphan 12
quit
EOF
root@kvm-xfstests:~# mount -o loop foo.img /mnt
[ 160.238770] ------------[ cut here ]------------
[ 160.240106] kernel BUG at /usr/projects/linux/ext4/fs/ext4/inode.c:3874!
[ 160.240106] invalid opcode: 0000 [#1] SMP
[ 160.240106] Modules linked in:
[ 160.240106] CPU: 0 PID: 2547 Comm: mount Tainted: G W 4.10.0-rc3-00034-gcdd33b941b67 #227
[ 160.240106] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.1-1 04/01/2014
[ 160.240106] task: f4518000 task.stack: f47b6000
[ 160.240106] EIP: ext4_block_zero_page_range+0x1a7/0x2b4
[ 160.240106] EFLAGS: 00010246 CPU: 0
[ 160.240106] EAX: 00000001 EBX: f7be4b50 ECX: f47b7dc0 EDX: 00000007
[ 160.240106] ESI: f43b05a8 EDI: f43babec EBP: f47b7dd0 ESP: f47b7dac
[ 160.240106] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 160.240106] CR0: 80050033 CR2: bfd85b08 CR3: 34a00680 CR4: 000006f0
[ 160.240106] Call Trace:
[ 160.240106] ext4_truncate+0x1e9/0x3e5
[ 160.240106] ext4_fill_super+0x286f/0x2b1e
[ 160.240106] ? set_blocksize+0x2e/0x7e
[ 160.240106] mount_bdev+0x114/0x15f
[ 160.240106] ext4_mount+0x15/0x17
[ 160.240106] ? ext4_calculate_overhead+0x39d/0x39d
[ 160.240106] mount_fs+0x58/0x115
[ 160.240106] vfs_kern_mount+0x4b/0xae
[ 160.240106] do_mount+0x671/0x8c3
[ 160.240106] ? _copy_from_user+0x70/0x83
[ 160.240106] ? strndup_user+0x31/0x46
[ 160.240106] SyS_mount+0x57/0x7b
[ 160.240106] do_int80_syscall_32+0x4f/0x61
[ 160.240106] entry_INT80_32+0x2f/0x2f
[ 160.240106] EIP: 0xb76b919e
[ 160.240106] EFLAGS: 00000246 CPU: 0
[ 160.240106] EAX: ffffffda EBX: 08053838 ECX: 08052188 EDX: 080537e8
[ 160.240106] ESI: c0ed0000 EDI: 00000000 EBP: 080537e8 ESP: bfa13660
[ 160.240106] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 160.240106] Code: 59 8b 00 a8 01 0f 84 09 01 00 00 8b 07 66 25 00 f0 66 3d 00 80 75 61 89 f8 e8 3e e2 ff ff 84 c0 74 56 83 bf 48 02 00 00 00 75 02 <0f> 0b 81 7d e8 00 10 00 00 74 02 0f 0b 8b 43 04 8b 53 08 31 c9
[ 160.240106] EIP: ext4_block_zero_page_range+0x1a7/0x2b4 SS:ESP: 0068:f47b7dac
[ 160.317241] ---[ end trace d6a773a375c810a5 ]---
The problem is that when the kernel tries to truncate an inode in
ext4_truncate(), it tries to clear any on-disk data beyond i_size.
Without the encryption key, it can't do that, and so it triggers a
BUG.
E2fsck does *not* provide this service, and in practice most file
systems have their orphan list processed by e2fsck, so to avoid
crashing, this patch skips this step if we don't have access to the
encryption key (which is the case when processing the orphan list; in
all other cases, we will have the encryption key, or the kernel
wouldn't have allowed the file to be opened).
An open question is whether the fact that e2fsck isn't clearing the
bytes beyond i_size is causing problems --- and if we've lived with it
not doing so for so long, can we drop this from the kernel replay of
the orphan list in all cases (not just when we don't have the key for
encrypted inodes)?
Addresses-Google-Bug: #35209576
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 413808685d upstream.
When the protocol is set via the sysfs protocols attribute, the
decoder is loaded. However, it is not loaded when a device is first
plugged in or registered.
Fixes: acc1c3c ("[media] media: rc: load decoder modules on-demand")
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d67a5f4b59 upstream.
Commit df2cb6daa4 ("block: Avoid deadlocks with bio allocation by
stacking drivers") created a workqueue for every bio set and code
in bio_alloc_bioset() that tries to resolve some low-memory deadlocks
by redirecting bios queued on current->bio_list to the workqueue if the
system is low on memory. However other deadlocks (see below **) may
happen, without any low memory condition, because generic_make_request
is queuing bios to current->bio_list (rather than submitting them).
** the related dm-snapshot deadlock is detailed here:
https://www.redhat.com/archives/dm-devel/2016-July/msg00065.html
Fix this deadlock by redirecting any bios on current->bio_list to the
bio_set's rescue workqueue on every schedule() call. Consequently,
when the process blocks on a mutex, the bios queued on
current->bio_list are dispatched to independent workqueues and they can
complete without waiting for the mutex to be available.
The structure blk_plug contains an entry cb_list and this list can contain
arbitrary callback functions that are called when the process blocks.
To implement this fix DM (ab)uses the onstack plug's cb_list interface
to get its flush_current_bio_list() called at schedule() time.
This fixes the snapshot deadlock - if the map method blocks,
flush_current_bio_list() will be called and it redirects bios waiting
on current->bio_list to appropriate workqueues.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1267650
Depends-on: df2cb6daa4 ("block: Avoid deadlocks with bio allocation by stacking drivers")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 370a0ec181 upstream.
Currently, if a vcpu thread tries to change the active state of an
interrupt which is already on the same vcpu's AP list, it will loop
forever. Since the VGIC mmio handler is called after a vcpu has
already synced back the LR state to the struct vgic_irq, we can just
let it proceed safely.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Jintack Lim <jintack@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e4d88009f upstream.
While we can technically not run huge page guests right now, we can
setup a guest with huge pages. Trying to migrate it will trigger a
VM_BUG_ON and, if the kernel is not configured to panic on a BUG, it
will happily try to work on non-existing page table entries.
With this patch, we always return "dirty" if we encounter a large page
when migrating. This at least fixes the immediate problem until we
have proper handling for both kind of pages.
Fixes: 15f36eb ("KVM: s390: Add proper dirty bitmap support to S390 kvm.")
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f98c7bce57 upstream.
If DMA is not available (even when configured in DeviceTree), the driver
will fail the startup procedure, thus making the serial console
unavailable.
For example this causes boot failure on QEMU ARMv7 (Exynos4210, SMDKC210):
[ 1.302575] OF: amba_device_add() failed (-19) for /amba/pdma@12680000
...
[ 11.435732] samsung-uart 13800000.serial: DMA request failed
[ 72.963893] samsung-uart 13800000.serial: DMA request failed
[ 73.143361] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
DMA is not necessary for serial to work, so continue with UART startup
after emitting a warning.
Fixes: 62c37eedb7 ("serial: samsung: add dma reqest/release functions")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 654b404f2a upstream.
Add missing sanity check to the bulk-in completion handler to avoid an
integer underflow that can be triggered by a malicious device.
This avoids leaking 128 kB of memory content from after the URB transfer
buffer to user space.
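The class of bug is easy to demonstrate in isolation (a toy example of the
unsigned arithmetic, not the driver code):
#include <stdio.h>

int main(void)
{
        unsigned int actual_length = 1; /* malicious device sent a 1-byte transfer */
        unsigned int header = 2;        /* bytes the handler expects to strip */

        /* Without the sanity check the subtraction wraps around... */
        unsigned int len = actual_length - header;
        printf("unchecked length: %u bytes\n", len);    /* 4294967295 */

        /* ...so the handler must bail out on short transfers first. */
        if (actual_length < header)
                printf("short transfer, dropping\n");
        return 0;
}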
Fixes: 8c209e6782 ("USB: make actual_length in struct urb field u32")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b1d250afb upstream.
Fix a NULL-pointer dereference in the interrupt callback should a
malicious device send data containing a bad port number by adding the
missing sanity check.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit de46e56653 upstream.
Make sure to verify that we have the required interrupt-out endpoint for
IOWarrior56 devices to avoid dereferencing a NULL-pointer in write
should a malicious device lack such an endpoint.
Fixes: 946b960d13 ("USB: add driver for iowarrior devices.")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7321e81fc upstream.
Make sure to check for the required interrupt-in endpoint to avoid
dereferencing a NULL-pointer should a malicious device lack such an
endpoint.
Note that a fairly recent change purported to fix this issue, but added
an insufficient test on the number of endpoints only, a test which can
now be removed.
Fixes: 4ec0ef3a82 ("USB: iowarrior: fix oops with malicious USB descriptors")
Fixes: 946b960d13 ("USB: add driver for iowarrior devices.")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30572418b4 upstream.
This driver needlessly took another reference to the tty on open, a
reference which was then never released on close. This led to not just
a leak of the tty, but also a driver reference leak that prevented the
driver from being unloaded after a port had once been opened.
Fixes: 4a90f09b20 ("tty: usb-serial krefs")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c76d7cd52 upstream.
Add missing sanity check to the bulk-in completion handler to avoid an
integer underflow that could be triggered by a malicious device.
This avoids leaking up to 56 bytes from after the URB transfer buffer to
user space.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dcc7620cad upstream.
Upstream commit 98d74f9cea ("xhci: fix 10 second timeout on removal of
PCI hotpluggable xhci controllers") fixes a problem with hot pluggable PCI
xhci controllers which can result in excessive timeouts, to the point where
the system reports a deadlock.
The same problem is seen with hot pluggable xhci controllers using the
xhci-plat driver, such as the driver used for Type-C ports on rk3399.
Similar to hot-pluggable PCI controllers, the driver for this chip
removes the xhci controller from the system when the Type-C cable is
disconnected.
The solution for PCI devices works just as well for non-PCI devices
and avoids the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f95e60a7db upstream.
According to the xHCI spec, HCIVERSION contains a BCD encoding
of the xHCI specification revision number; 0100h corresponds
to xHCI version 1.0. Change "100" to "0x100".
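A two-line example of why the literal matters (illustrative only, not the
xhci code):
#include <stdio.h>

int main(void)
{
        unsigned int hciversion = 0x0096;       /* BCD encoding of xHCI 0.96 */

        /* 0x0096 is 150 decimal, so a decimal comparison wrongly passes
         * for a pre-1.0 controller; the BCD value must be compared
         * against 0x100. */
        printf("version >= 100:   %d\n", hciversion >= 100);    /* 1 (wrong) */
        printf("version >= 0x100: %d\n", hciversion >= 0x100);  /* 0 (right) */
        return 0;
}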
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Fixes: 04abb6de28 ("xhci: Read and parse new xhci 1.1 capability register")
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb38d913c2 upstream.
This reverts commit 4fbac5206a.
This commit breaks g_webcam when used with uvc-gadget [1].
The user space application (e.g. uvc-gadget) is responsible for
sending response to UVC class specific requests on control endpoint
in uvc_send_response() in uvc_v4l2.c.
The bad commit was causing a duplicate response to be sent with
incorrect response data thus causing UVC probe to fail at the host
and broken control transfer endpoint at the gadget.
[1] - git://git.ideasonboard.org/uvc-gadget.git
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2bfa0719ac upstream.
If we're dealing with SuperSpeed endpoints, we need
to make sure to pass along the companion descriptor
and initialize fields needed by the Gadget
API. Eventually, f_fs.c should be converted to use
config_ep_by_speed() like all other functions,
though.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85550f9148 upstream.
In patch 2e2aa1bc7eff90ecm, USB suspend and wakeup control requests are
passed to the SFR_OHCIICR register. If a processor does not have such a
register, these hub control requests will be dropped.
If no such SFR register is available, all USB suspend control requests
will now be processed using ohci_hub_control()
(like before patch 2e2aa1bc7eff90ecm).
Tested on an Atmel AT91SAM9G20 with an on-board TI TUSB2046B hub chip.
If the last USB device is unplugged from the USB hub, the hub goes to
sleep and will not wake up when a USB device is inserted.
Fixes: 2e2aa1bc7e ("usb: ohci-at91: Forcibly suspend ports while USB suspend")
Signed-off-by: Jelle Martijn Kok <jmkok@youcom.nl>
Tested-by: Wenyou Yang <wenyou.yang@atmel.com>
Cc: Wenyou Yang <wenyou.yang@atmel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7369090a9f upstream.
Some gadget drivers are bad, bad boys. We notice
that ADB was passing a bad Burst Size, which caused the top
bits of param0 to be overwritten and confused DWC3
when running this command.
In order to avoid future issues, we're going to make
sure values passed by macros are always safe for the
controller. Note that ADB still needs a fix to *not*
pass bad values.
Reported-by: Mohamed Abbas <mohamed.abbas@intel.com>
Suggested-by: Adam Andruszak <adam.andruszak@intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a69e2fb703 upstream.
The CPPR (Current Processor Priority Register) of a XICS interrupt
presentation controller contains a value N, such that only interrupts
with a priority "more favoured" than N will be received by the CPU,
where "more favoured" means "less than". So if the CPPR has the value 5
then only interrupts with a priority of 0-4 inclusive will be received.
In theory the CPPR can support a value of 0 to 255 inclusive.
In practice Linux only uses values of 0, 4, 5 and 0xff. Setting the CPPR
to 0 rejects all interrupts, setting it to 0xff allows all interrupts.
The values 4 and 5 are used to differentiate IPIs from external
interrupts. Setting the CPPR to 5 allows IPIs to be received but not
external interrupts.
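The delivery rule and the values Linux uses can be summarized in a few
lines of C (a sketch of the semantics described here, not the ICP code; in
this model IPIs use priority 4 and external interrupts priority 5):
#include <stdbool.h>
#include <stdio.h>

/* An interrupt is delivered only if its priority is "more favoured"
 * (numerically smaller) than the CPPR. */
static bool delivered(unsigned char prio, unsigned char cppr)
{
        return prio < cppr;
}

int main(void)
{
        printf("CPPR 0x00: external %d, IPI %d\n", delivered(5, 0x00), delivered(4, 0x00));
        printf("CPPR 0x05: external %d, IPI %d\n", delivered(5, 0x05), delivered(4, 0x05));
        printf("CPPR 0xff: external %d, IPI %d\n", delivered(5, 0xff), delivered(4, 0xff));
        return 0;
}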
The CPPR emulation in the OPAL XICS implementation only directly
supports priorities 0 and 0xff. All other priorities are considered
equivalent, and mapped to a single priority value internally. This means
when using icp-opal we can not allow IPIs but not externals.
This breaks Linux's use of priority values when a CPU is hot unplugged.
After migrating IRQs away from the CPU that is being offlined, we set
the priority to 5, meaning we still want the offline CPU to receive
IPIs. But the effect of the OPAL XICS emulation's use of a single
priority value is that all interrupts are rejected by the CPU. With the
CPU offline, and not receiving IPIs, we may not be able to wake it up to
bring it back online.
The first part of the fix is in icp_opal_set_cpu_priority(). CPPR values
of 0 to 4 inclusive will correctly cause all interrupts to be rejected,
so we pass those CPPR values through to OPAL. However if we are called
with a CPPR of 5 or greater, the caller is expecting to be able to allow
IPIs but not external interrupts. We know this doesn't work, so instead
of rejecting all interrupts we choose the opposite which is to allow all
interrupts. This is still not correct behaviour, but we know for the
only existing caller (xics_migrate_irqs_away()), that it is the better
option.
The other part of the fix is in xics_migrate_irqs_away(). Instead of
setting priority (CPPR) to 0, and then back to 5 before migrating IRQs,
we migrate the IRQs before setting the priority back to 5. This should
have no effect on an ICP backend with a working set_priority(), and on
icp-opal it means we will keep all interrupts blocked until after we've
finished doing the IRQ migration. Additionally we wait for 5ms after
doing the migration to make sure there are no IRQs in flight.
Fixes: d74361881f ("powerpc/xics: Add ICP OPAL backend")
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Reported-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Rewrote comments and change log, change delay to 5ms]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e148bd17f4 upstream.
emulate_step() uses a number of underlying kernel functions that were
initially not enabled for LE. This has been rectified since. So, fix
emulate_step() for LE for the corresponding instructions.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 606142af57 upstream.
On Kernel 4.9, WARNINGs about doing DMA on the stack are hit in
the dw2102 driver: one in su3000_power_ctrl() and the other in tt_s2_4600_frontend_attach().
Both were due to the use of buffers on the stack as parameters to
dvb_usb_generic_rw() and the resulting attempt to do DMA with them.
The device was non-functional as a result.
So, switch this driver over to use a buffer within the device state
structure, as has been done with other DVB-USB drivers.
Tested with TechnoTrend TT-connect S2-4600.
[mchehab@osg.samsung.com: fixed a warning at su3000_i2c_transfer() where the
state var was dereferenced before checking 'd']
Signed-off-by: Jonathan McDowell <noodles@earth.li>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1eb98143c upstream.
On ARM and arm64, we use a dedicated mm_struct to map the UEFI
Runtime Services regions, which allows us to map those regions
on demand, and in a way that is guaranteed to be compatible
with incoming kernels across kexec.
As it turns out, we don't fully initialize the mm_struct in the
same way as process mm_structs are initialized on fork(), which
results in the following crash on ARM if CONFIG_CPUMASK_OFFSTACK=y
is enabled:
...
EFI Variables Facility v0.08 2004-May-17
Unable to handle kernel NULL pointer dereference at virtual address 00000000
[...]
Process swapper/0 (pid: 1)
...
__memzero()
check_and_switch_context()
virt_efi_get_next_variable()
efivar_init()
efivars_sysfs_init()
do_one_initcall()
...
This is due to a missing call to mm_init_cpumask(), so add it.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1488395154-29786-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 040757f738 upstream.
Always increment/decrement ucount->count under the ucounts_lock. The
increments are there already and moving the decrements there means the
locking logic of the code is simpler. This simplification in the
locking logic fixes a race between put_ucounts and get_ucounts that
could result in a use-after-free because the count could go zero then
be found by get_ucounts and then be freed by put_ucounts.
A bug, presumably this one, was found by a combination of syzkaller and
KASAN. JongWhan Kim reported the syzkaller failure and Dmitry Vyukov
spotted the race in the code.
Fixes: f6b2db1a3e ("userns: Make the count of user namespaces per user")
Reported-by: JongHwan Kim <zzoru007@gmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf7165cfa2 upstream.
There are several trace include files that define TRACE_INCLUDE_FILE.
Include several of them in the same .c file (as I currently have in
some code I am working on), and the compile will blow up with a
"warning: "TRACE_INCLUDE_FILE" redefined #define TRACE_INCLUDE_FILE syscalls"
Every other include file in include/trace/events/ avoids that issue
by having a #undef TRACE_INCLUDE_FILE before the #define; syscalls.h
should have one, too.
Link: http://lkml.kernel.org/r/20160928225554.13bd7ac6@annuminas.surriel.com
Fixes: b8007ef742 ("tracing: Separate raw syscall from syscall tracer")
Signed-off-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d43e6fb4ac upstream.
The #warning was present 10 years ago when the driver first got merged.
As the platform is rather obsolete by now, it seems very unlikely that
the warning will cause anyone to fix the code properly.
kernelci.org reports the warning for every build in the meantime, so
I think it's better to just turn it into a code comment to reduce
noise.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df384d435a upstream.
gcc-7 and probably earlier versions get confused by this function
and print a harmless warning:
drivers/net/ethernet/broadcom/bcm63xx_enet.c: In function 'bcm_enet_open':
drivers/net/ethernet/broadcom/bcm63xx_enet.c:1130:3: error: 'phydev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This adds an initialization for the 'phydev' variable when it is unused
and changes the check to test for that NULL pointer to make it clear
that we always pass a valid pointer here.
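The pattern boils down to a few lines (a generic sketch, not the
bcm63xx_enet code itself):
#include <stdio.h>

struct phy_device { int dummy; };

static struct phy_device phy;           /* stands in for the connected PHY */

static void open_dev(int has_phy)
{
        struct phy_device *phydev = NULL;       /* initialized for the no-PHY case */

        if (has_phy)
                phydev = &phy;

        /* Testing the pointer itself (instead of a separate condition)
         * makes it clear to both readers and the compiler that phydev is
         * never used uninitialized. */
        if (phydev)
                printf("starting PHY\n");
        else
                printf("no PHY attached\n");
}

int main(void)
{
        open_dev(0);
        open_dev(1);
        return 0;
}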
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 906b268477 upstream.
kernelci.org reports a warning for this driver, as it copies a local
variable into a 'const char *' string:
drivers/mtd/maps/pmcmsp-flash.c:149:30: warning: passing argument 1 of 'strncpy' discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
Using kstrndup() simplifies the code and avoids the warning.
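In user space the equivalent transformation looks like this (kstrndup() is
the in-kernel counterpart of strndup(); the buffer contents here are made
up):
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
        char flash_name[16];

        snprintf(flash_name, sizeof(flash_name), "msp_flash_%d", 1);

        /* Rather than strncpy()ing the local buffer into storage reached
         * through a 'const char *' (which discards the qualifier), just
         * duplicate the string. */
        const char *name = strndup(flash_name, sizeof(flash_name));

        if (!name)
                return 1;
        printf("%s\n", name);
        free((void *)name);
        return 0;
}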
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23ca9b5223 upstream.
kernelci reports a failure of the ip28_defconfig build after upgrading its
gcc version:
arch/mips/sgi-ip22/Platform:29: *** gcc doesn't support needed option -mr10k-cache-barrier=store. Stop.
The problem apparently is that the -mr10k-cache-barrier=store option is now
rejected for CPUs other than r10k. Explicitly including the CPU in the
check fixes this and is safe because both options were introduced in
gcc-4.4.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15049/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b617649468 upstream.
One of the last remaining failures in kernelci.org is for a gcc bug:
drivers/net/ethernet/qlogic/qlge/qlge_main.c:4819:1: error: insn does not satisfy its constraints:
drivers/net/ethernet/qlogic/qlge/qlge_main.c:4819:1: internal compiler error: in extract_constrain_insn, at recog.c:2190
This is apparently broken in gcc-6 but fixed in gcc-7, and I cannot
reproduce the problem here. However, it is clear that ip27_defconfig
does not actually need this driver as the platform has only PCI-X but
not PCIe, and the qlge adapter in turn is PCIe-only.
The driver was originally enabled in 2010 along with lots of other
drivers.
Fixes: 59d302b342 ("MIPS: IP27: Make defconfig useful again.")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15197/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1742ac2650 upstream.
vdso.h includes <spaces.h> implicitly after defining CONFIG_32BITS.
This defeats the override in mach-ip27/spaces.h, leading to
a build error that shows up in kernelci.org:
In file included from arch/mips/include/asm/mach-ip27/spaces.h:29:0,
from arch/mips/include/asm/page.h:12,
from arch/mips/vdso/vdso.h:26,
from arch/mips/vdso/gettimeofday.c:11:
arch/mips/include/asm/mach-generic/spaces.h:28:0: error: "CAC_BASE" redefined [-Werror]
#define CAC_BASE _AC(0x80000000, UL)
An earlier patch tried to make the second definition conditional,
but that patch had the #ifdef in the wrong place, and would lead
to another warning:
arch/mips/include/asm/io.h: In function 'phys_to_virt':
arch/mips/include/asm/io.h:138:9: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
For all I can tell, there is no other reason than vdso32 to ever
include this file with CONFIG_32BITS set, and the vdso itself should
never refer to the base addresses as it is running in user space,
so adding an #ifdef here is safe.
Link: https://patchwork.kernel.org/patch/9418187/
Fixes: 3ffc17d876 ("MIPS: Adjust MIPS64 CAC_BASE to reflect Config.K0")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/15039/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9ddc16ad8e upstream.
In linux-4.10-rc, NF_CT_PROTO_UDPLITE and NF_CT_PROTO_DCCP are bool
symbols instead of tristate, and kernelci.org reports a bunch of
warnings for this, like:
arch/mips/configs/malta_kvm_guest_defconfig:63:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE
arch/mips/configs/malta_defconfig:62:warning: symbol value 'm' invalid for NF_CT_PROTO_DCCP
arch/mips/configs/malta_defconfig:63:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE
arch/mips/configs/ip22_defconfig:70:warning: symbol value 'm' invalid for NF_CT_PROTO_DCCP
arch/mips/configs/ip22_defconfig:71:warning: symbol value 'm' invalid for NF_CT_PROTO_UDPLITE
This changes all the MIPS defconfigs with these symbols to have them
built-in.
Fixes: 9b91c96c5d ("netfilter: conntrack: built-in support for UDPlite")
Fixes: c51d39010a ("netfilter: conntrack: built-in support for DCCP")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/14999/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d6e910502 upstream.
An ancient gcc bug (first reported in 2003) has apparently resurfaced
on MIPS, where kernelci.org reports an overly large stack frame in the
whirlpool hash algorithm:
crypto/wp512.c:987:1: warning: the frame size of 1112 bytes is larger than 1024 bytes [-Wframe-larger-than=]
With some testing in different configurations, I'm seeing large
variations in stack frames size up to 1500 bytes for what should have
around 300 bytes at most. I also checked the reference implementation,
which is essentially the same code but also comes with some test and
benchmarking infrastructure.
It seems that recent compiler versions on at least arm, arm64 and powerpc
have a partial fix for this problem by enabling "-fsched-pressure", but
even with that fix they suffer from the issue to a certain degree. Some
testing on arm64 shows that the time needed to hash a given amount of
data is roughly proportional to the stack frame size here, which makes
sense given that the wp512 implementation is doing lots of loads for
table lookups, and the problem with the overly large stack is a result
of doing a lot more loads and stores for spilled registers (as seen from
inspecting the object code).
Disabling -fschedule-insns consistently fixes the problem for wp512.
In my collection of cross-compilers, the results are consistently better
or identical when comparing the stack sizes in this function, though
some architectures (notably x86) have schedule-insns disabled by
default.
The four columns are:
default: -O2
press: -O2 -fsched-pressure
nopress: -O2 -fschedule-insns -fno-sched-pressure
nosched: -O2 -fno-schedule-insns (disables sched-pressure)
default press nopress nosched
alpha-linux-gcc-4.9.3 1136 848 1136 176
am33_2.0-linux-gcc-4.9.3 2100 2076 2100 2104
arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352
cris-linux-gcc-4.9.3 272 272 272 272
frv-linux-gcc-4.9.3 1128 1000 1128 280
hppa64-linux-gcc-4.9.3 1128 336 1128 184
hppa-linux-gcc-4.9.3 644 308 644 276
i386-linux-gcc-4.9.3 352 352 352 352
m32r-linux-gcc-4.9.3 720 656 720 268
microblaze-linux-gcc-4.9.3 1108 604 1108 256
mips64-linux-gcc-4.9.3 1328 592 1328 208
mips-linux-gcc-4.9.3 1096 624 1096 240
powerpc64-linux-gcc-4.9.3 1088 432 1088 160
powerpc-linux-gcc-4.9.3 1080 584 1080 224
s390-linux-gcc-4.9.3 456 456 624 360
sh3-linux-gcc-4.9.3 292 292 292 292
sparc64-linux-gcc-4.9.3 992 240 992 208
sparc-linux-gcc-4.9.3 680 592 680 312
x86_64-linux-gcc-4.9.3 224 240 272 224
xtensa-linux-gcc-4.9.3 1152 704 1152 304
aarch64-linux-gcc-7.0.0 224 224 1104 208
arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352
mips-linux-gcc-7.0.0 1120 648 1120 272
x86_64-linux-gcc-7.0.1 240 240 304 240
arm-linux-gnueabi-gcc-4.4.7 840 392
arm-linux-gnueabi-gcc-4.5.4 784 728 784 320
arm-linux-gnueabi-gcc-4.6.4 736 728 736 304
arm-linux-gnueabi-gcc-4.7.4 944 784 944 352
arm-linux-gnueabi-gcc-4.8.5 464 464 760 352
arm-linux-gnueabi-gcc-4.9.3 848 848 1048 352
arm-linux-gnueabi-gcc-5.3.1 824 824 1064 336
arm-linux-gnueabi-gcc-6.1.1 808 808 1056 344
arm-linux-gnueabi-gcc-7.0.1 824 824 1048 352
Trying the same test for serpent-generic, the picture is a bit different,
and while -fno-schedule-insns is generally better here than the default,
-fsched-pressure wins overall, so I picked that instead.
default press nopress nosched
alpha-linux-gcc-4.9.3 1392 864 1392 960
am33_2.0-linux-gcc-4.9.3 536 524 536 528
arm-linux-gnueabi-gcc-4.9.3 552 552 776 536
cris-linux-gcc-4.9.3 528 528 528 528
frv-linux-gcc-4.9.3 536 400 536 504
hppa64-linux-gcc-4.9.3 524 208 524 480
hppa-linux-gcc-4.9.3 768 472 768 508
i386-linux-gcc-4.9.3 564 564 564 564
m32r-linux-gcc-4.9.3 712 576 712 532
microblaze-linux-gcc-4.9.3 724 392 724 512
mips64-linux-gcc-4.9.3 720 384 720 496
mips-linux-gcc-4.9.3 728 384 728 496
powerpc64-linux-gcc-4.9.3 704 304 704 480
powerpc-linux-gcc-4.9.3 704 296 704 480
s390-linux-gcc-4.9.3 560 560 592 536
sh3-linux-gcc-4.9.3 540 540 540 540
sparc64-linux-gcc-4.9.3 544 352 544 496
sparc-linux-gcc-4.9.3 544 344 544 496
x86_64-linux-gcc-4.9.3 528 536 576 528
xtensa-linux-gcc-4.9.3 752 544 752 544
aarch64-linux-gcc-7.0.0 432 432 656 480
arm-linux-gnueabi-gcc-7.0.1 616 616 808 536
mips-linux-gcc-7.0.0 720 464 720 488
x86_64-linux-gcc-7.0.1 536 528 600 536
arm-linux-gnueabi-gcc-4.4.7 592 440
arm-linux-gnueabi-gcc-4.5.4 776 448 776 544
arm-linux-gnueabi-gcc-4.6.4 776 448 776 544
arm-linux-gnueabi-gcc-4.7.4 768 448 768 544
arm-linux-gnueabi-gcc-4.8.5 488 488 776 544
arm-linux-gnueabi-gcc-4.9.3 552 552 776 536
arm-linux-gnueabi-gcc-5.3.1 552 552 776 536
arm-linux-gnueabi-gcc-6.1.1 560 560 776 536
arm-linux-gnueabi-gcc-7.0.1 616 616 808 536
I did not do any runtime tests with serpent, so it is possible that the
stack frame size does not directly correlate with runtime performance
here and the change actually makes things worse; but it's more likely to
help, and the reduced stack frame size is probably reason enough to apply
the patch, especially given that the crypto code is often used in deep
call chains.
Link: https://kernelci.org/build/id/58797d7559b5149efdf6c3a9/logs/
Link: http://www.larc.usp.br/~pbarreto/WhirlpoolPage.html
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11488
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e46565cf6 upstream.
A recent change claimed to fix an off-by-one error in the OOB-port
completion handler, but instead introduced such an error. This could
specifically lead to modem-status changes going unnoticed, effectively
breaking TIOCMGET.
Note that the offending commit fixes a loop-condition underflow and is
marked for stable, but should not be backported without this fix.
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 2d38088921 ("USB: serial: digi_acceleport: fix OOB data sanity check")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d38088921 upstream.
Make sure to check for short transfers to avoid underflow in a loop
condition when parsing the receive buffer.
Also fix an off-by-one error in the incomplete sanity check which could
lead to invalid data being parsed.
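As a rough, illustrative sketch (not the actual driver code), a guard of
that kind over a buffer parsed two bytes at a time looks like the
following; the concrete bound and the process_opcode() helper are
assumptions:

    /* Illustrative only: reject short transfers before a
     * 2-bytes-per-entry parse loop, so `len - 1` cannot underflow. */
    int len = urb->actual_length;
    unsigned char *buf = urb->transfer_buffer;
    int i;

    if (len < 2)
            return;                 /* short transfer: nothing to parse */

    for (i = 0; i < len - 1; i += 2)
            process_opcode(buf[i], buf[i + 1]);     /* hypothetical helper */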
Fixes: 8c209e6782 ("USB: make actual_length in struct urb field u32")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The API for port_parameter_get() requires that the
filled length is returned, or, if there is insufficient space,
that the required space is returned.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
commit 40e952f9d6 upstream.
mem_cgroup_free() indirectly calls wb_domain_exit() which is not
prepared to deal with a struct wb_domain object that hasn't executed
wb_domain_init(). For instance, the following warning message is
printed by lockdep if alloc_percpu() fails in mem_cgroup_alloc():
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 1 PID: 1950 Comm: mkdir Not tainted 4.10.0+ #151
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x67/0x99
register_lock_class+0x36d/0x540
__lock_acquire+0x7f/0x1a30
lock_acquire+0xcc/0x200
del_timer_sync+0x3c/0xc0
wb_domain_exit+0x14/0x20
mem_cgroup_free+0x14/0x40
mem_cgroup_css_alloc+0x3f9/0x620
cgroup_apply_control_enable+0x190/0x390
cgroup_mkdir+0x290/0x3d0
kernfs_iop_mkdir+0x58/0x80
vfs_mkdir+0x10e/0x1a0
SyS_mkdirat+0xa8/0xd0
SyS_mkdir+0x14/0x20
entry_SYSCALL_64_fastpath+0x18/0xad
Add __mem_cgroup_free() which skips wb_domain_exit(). This is used by
both mem_cgroup_free() and the mem_cgroup_alloc() cleanup path.
Fixes: 0b8f73e104 ("mm: memcontrol: clean up alloc, online, offline, free functions")
Link: http://lkml.kernel.org/r/20170306192122.24262-1-tahsin@google.com
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ebb4a1b84 upstream.
The following test case triggers BUG() in munlock_vma_pages_range():
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
        int fd;

        system("mount -t tmpfs -o huge=always none /mnt");
        fd = open("/mnt/test", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, 4UL << 20);
        /* mlocked mapping of the whole 4M file, backed by a huge page */
        mmap(NULL, 4UL << 20, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED | MAP_LOCKED, fd, 0);
        /* second, mlocked PTE-sized mapping of the same file */
        mmap(NULL, 4096, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_LOCKED, fd, 0);
        munlockall();
        return 0;
}
The second mmap() creates a PTE mapping of the first huge page in the
file. That makes the kernel munlock the page, as we never keep a
PTE-mapped page mlocked.
On munlockall(), when we handle the vma created by the first mmap(),
munlock_vma_page() returns page_mask == 0, as the page is not mlocked
anymore. On the next iteration follow_page_mask() returns a tail page,
but page_mask is HPAGE_NR_PAGES - 1. That makes us skip to the first
tail page of the next huge page and step on
VM_BUG_ON_PAGE(PageMlocked(page)).
The fix is to not use the page_mask from follow_page_mask() at all. It
has no use for us.
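A minimal sketch of the resulting idea (illustrative, not the exact
patch): walk the mlocked range without trusting any mask hint from the
lookup. follow_page() and the FOLL_* flags are real interfaces; the loop
body is a placeholder.

    /* Sketch: never use a page_mask reported for a page that may have
     * already been munlocked; advance conservatively instead. */
    while (start < end) {
            struct page *page;

            page = follow_page(vma, start, FOLL_GET | FOLL_DUMP);
            if (!IS_ERR_OR_NULL(page)) {
                    /* munlock/putback handling of @page goes here */
                    put_page(page);
            }
            start += PAGE_SIZE;
    }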
Link: http://lkml.kernel.org/r/20170302150252.34120-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c4ea6e28d upstream.
Fengguang reported random corruptions from various locations on x86-32
after commits d2852a2240 ("arch: add ARCH_HAS_SET_MEMORY config") and
9d876e79df ("bpf: fix unlocking of jited image when module ronx not set")
that uses the former. While x86-32 doesn't have a JIT like x86_64, the
bpf_prog_lock_ro() and bpf_prog_unlock_ro() got enabled due to
ARCH_HAS_SET_MEMORY, whereas Fengguang's test kernel doesn't have module
support built in and therefore never had the DEBUG_SET_MODULE_RONX setting
enabled.
After investigating the crashes further, it turned out that using
set_memory_ro() and set_memory_rw() didn't have the desired effect, for
example, setting the pages as read-only on x86-32 would still let
probe_kernel_write() succeed without error. This behavior would manifest
itself in situations where the vmalloc'ed buffer was accessed prior to
set_memory_*() such as in case of bpf_prog_alloc(). In cases where it
wasn't, the page attribute changes seemed to have taken effect, leading to
the conclusion that a TLB invalidate didn't happen. Moreover, it turned out
that this issue reproduced with qemu in "-cpu kvm64" mode, but not for
"-cpu host". When the issue occurs, change_page_attr_set_clr() did trigger
a TLB flush as expected via __flush_tlb_all() through cpa_flush_range(),
though.
There are 3 variants for issuing a TLB flush: invpcid_flush_all() (depends
on CPU feature bits X86_FEATURE_INVPCID, X86_FEATURE_PGE), cr4 based flush
(depends on X86_FEATURE_PGE), and cr3 based flush. For "-cpu host" case in
my setup, the flush used invpcid_flush_all() variant, whereas for "-cpu
kvm64", the flush was cr4 based. Switching the kvm64 case to cr3 manually
worked fine, and further investigating the cr4 one turned out that
X86_CR4_PGE bit was not set in cr4 register, meaning the
__native_flush_tlb_global_irq_disabled() wrote cr4 twice with the same
value instead of clearing X86_CR4_PGE in the first write to trigger the
flush.
It turned out that X86_CR4_PGE was cleared from cr4 during init from
lguest_arch_host_init() via adjust_pge(). The X86_FEATURE_PGE bit is also
cleared from there due to concerns of using PGE in guest kernel that can
lead to hard to trace bugs (see bff672e630 ("lguest: documentation V:
Host") in init()). The CPU feature bits are cleared in dynamic
boot_cpu_data, but they never propagated to __flush_tlb_all() as it uses
static_cpu_has() instead of boot_cpu_has() for testing which variant of TLB
flushing to use, meaning they still used the old setting of the host
kernel.
Clearing via setup_clear_cpu_cap(X86_FEATURE_PGE) so this would propagate
to static_cpu_has() checks is too late at this point as sections have been
patched already, so for now, it seems reasonable to switch back to
boot_cpu_has(X86_FEATURE_PGE) as it was prior to commit c109bf9599
("x86/cpufeature: Remove cpu_has_pge"). This lets the TLB flush trigger via
cr3 as originally intended, properly makes the new page attributes visible
and thus fixes the crashes seen by Fengguang.
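The dispatch in question is, roughly, the following; the behavioral
change is only static_cpu_has() -> boot_cpu_has() (a sketch, not the
verbatim kernel source):

    static inline void __flush_tlb_all(void)
    {
            /* boot_cpu_has() honours feature bits cleared at runtime
             * (e.g. by lguest), unlike the patched-in static_cpu_has(). */
            if (boot_cpu_has(X86_FEATURE_PGE))
                    __flush_tlb_global();
            else
                    __flush_tlb();
    }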
Fixes: c109bf9599 ("x86/cpufeature: Remove cpu_has_pge")
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: bp@suse.de
Cc: Kees Cook <keescook@chromium.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: lkp@01.org
Cc: Laura Abbott <labbott@redhat.com>
Link: http://lkml.kernel.org/r/20170301125426.l4nf65rx4wahohyl@wfg-t540p.sh.intel.com
Link: http://lkml.kernel.org/r/25c41ad9eca164be4db9ad84f768965b7eb19d9e.1489191673.git.daniel@iogearbox.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d24cdcd3e4 upstream.
I ran into this compile warning, which is the result of BUG_ON(1)
not always leading to the compiler treating the code path as
unreachable:
include/linux/ceph/osdmap.h: In function 'ceph_can_shift_osds':
include/linux/ceph/osdmap.h:62:1: error: control reaches end of non-void function [-Werror=return-type]
Using BUG() here avoids the warning.
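For illustration, the pattern in question looks roughly like this
(sketch based on the warning above, not necessarily the exact upstream
function body):

    static inline bool ceph_can_shift_osds(struct ceph_pg_pool_info *pool)
    {
            switch (pool->type) {
            case CEPH_POOL_TYPE_REP:
                    return true;
            case CEPH_POOL_TYPE_EC:
                    return false;
            default:
                    BUG();  /* noreturn: silences -Wreturn-type */
            }
    }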
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9ad86e42d upstream.
Having "ret" be a bool type works for everything except
ret = funcs->atomic_check(). The other functions all return zero on
error but ->atomic_check() returns negative error codes. We want to
propagate the error code but instead we return 1.
I found this bug with static analysis and I don't know if it affects
run time.
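A small self-contained illustration of the bug class (plain C,
independent of the DRM code): a negative errno stored in a bool
collapses to true, so the caller sees 1 instead of the error code.

    #include <stdbool.h>
    #include <stdio.h>

    static int atomic_check_stub(void)
    {
            return -22;     /* stands in for -EINVAL from ->atomic_check() */
    }

    int main(void)
    {
            bool bad = atomic_check_stub();  /* collapses to true (1) */
            int good = atomic_check_stub();  /* keeps -22 */

            printf("returned via bool: %d, via int: %d\n", bad, good);
            return 0;
    }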
Fixes: 4cd4df8080 ("drm/atomic: Add ->atomic_check() to encoder helpers")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207234601.GA23981@mwanda
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc12bccda8 upstream.
Commit deb65870b5 ("drm/imx: imx-tve: check the value returned by
regulator_set_voltage()") exposes the following probe issue:
63ff0000.tve supply dac not found, using dummy regulator
imx-drm display-subsystem: failed to bind 63ff0000.tve (ops imx_tve_ops): -22
When the 'dac-supply' is not passed in the device tree a dummy regulator is
used and setting its voltage is not allowed.
To fix this issue, do not set the dac-supply voltage inside the driver
and let its voltage be specified in the device tree.
Print a warning if the 'dac-supply' voltage has a value different
from 2.75V.
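As a hedged sketch of that approach (the regulator API calls are real;
the surrounding names such as tve->dac_reg are assumptions here):

    /* Warn, rather than fail, when the supply reports something other
     * than the expected 2.75V; a dummy regulator simply isn't touched. */
    ret = regulator_get_voltage(tve->dac_reg);
    if (ret >= 0 && ret != 2750000)
            dev_warn(dev, "dac-supply is %d uV, expected 2750000 uV\n", ret);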
Fixes: deb65870b5 ("drm/imx: imx-tve: check the value returned by regulator_set_voltage()")
Suggested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85f57752b3 upstream.
The residue calculation assumed that the dma transaction status would
always be retrieved from the dma callback that signals transfer
completion. However, this is not the case for all subsystems that use
dma: some subsystems use a timer to check the dma status periodically.
Therefore the residue calculation was updated: a) it now takes the
last used buffer index into account via the *buf_ptail* variable, and
b) chn_real_count (the number of bytes transferred) is initialized to
zero when the dma channel is created, to avoid using an uninitialized
value in the residue calculation when the dma status is checked without
waiting for the dma-complete event.
Signed-off-by: Nandor Han <nandor.han@ge.com>
Acked-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Peter Senna Tschudin <peter.senna@collabora.com>
Tested-by: Marek Vasut <marex@denx.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Cc: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31788ca803 upstream.
vmware tools has a daemon that gets layout information from the GUI and
forwards it to DRM so that the modesetting code can set preferred connector
locations and modes. This daemon was using control nodes but since control
nodes were just removed, make it possible for the daemon to use render- or
primary nodes instead. This is a bit ugly but will allow drm to proceed with
removal of the mostly unused control-node code and allow vmware to proceed
with fixing up automatic layout settings for gnome-shell/wayland.
We bump minor to inform user-space about the api change.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170221104227.2854-1-thellstrom@vmware.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 239ac65fa5 upstream.
The current caching state may not be tt_cached, even though the
placement contains TTM_PL_FLAG_CACHED, because placement can contain
multiple caching flags. Trying to swap out such a BO would trip up the
BUG_ON(ttm->caching_state != tt_cached);
in ttm_tt_swapout.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19d19e9605 upstream.
When I originally introduced using the driver-indicated station as an
optimisation to avoid the hashtable lookup/iteration, of course it
wasn't intended to really functionally change anything.
I neglected, however, to take into account VLAN interfaces, which have
the property that management and data frames are handled differently:
data frames go directly to the station and the VLAN while management
frames continue to be processed over the underlying/associated AP-type
interface. As a consequence, when a driver used this optimisation for
management frames and the user enabled VLANs, my change broke things
since any management frames, particularly disassoc/deauth, were missed
by hostapd.
Fix this by restoring the original code path for non-data frames; they
aren't critical for performance to begin with.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=194713.
Big thanks goes to Jarek who bisected the issue and provided a very
detailed bug report, including the crucial information that he was
using VLANs in his configuration.
Fixes: 771e846bea9e ("mac80211: allow passing transmitter station on RX")
Reported-and-tested-by: Jarek Kamiński <jarek@freeside.be>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 890030d3c4 upstream.
When running a BA session, the driver (or the hardware) already takes
care of retransmitting failed frames, since it has to keep the receiver
reorder window in sync.
Adding another layer of retransmit around that does not improve
anything. In fact, it can only lead to some strong reordering with huge
latency.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b7540d8f25 upstream.
When RX aggregation starts, the transmitter may continue to send frames
with SN smaller than the SSN until the AddBA response is received.
However, the reorder buffer is already initialized at this point,
which will cause such frames to be dropped as duplicates, since the
head SN of the reorder buffer is set to the SSN, which is bigger.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9e9200d86 upstream.
The issue was found when entering suspend and resume.
It triggers a warning in:
mac80211/key.c: ieee80211_enable_keys()
...
WARN_ON_ONCE(sdata->crypto_tx_tailroom_needed_cnt ||
sdata->crypto_tx_tailroom_pending_dec);
...
It points out that sdata->crypto_tx_tailroom_pending_dec isn't cleaned up
successfully by its delayed_work during suspend. Add a flush_delayed_work
to fix it.
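The shape of the fix described here is, roughly, the following sketch
(the field name is taken from mac80211 and should be treated as an
assumption):

    /* Make sure the pending tailroom decrement has actually run before
     * the keys are re-enabled across suspend/resume. */
    flush_delayed_work(&sdata->dec_tailroom_needed_wk);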
Signed-off-by: Matt Chen <matt.chen@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 86ef58a4e3 upstream.
The interleave-set cookie is a sum that sanity checks the composition of
an interleave set has not changed from when the namespace was initially
created. The checksum is calculated by sorting the DIMMs by their
location in the interleave-set. The comparison for the sort must be
64-bit wide, not byte-by-byte as performed by memcmp() in the broken
case.
Fix the implementation to accept correct cookie values in addition to
the Linux "memcmp" order cookies, but only allow correct cookies to be
generated going forward. It does mean that namespaces created by
third-party-tooling, or created by newer kernels with this fix, will not
validate on older kernels. However, there are a couple of mitigating
conditions:
1/ platforms with namespace-label capable NVDIMMs are not widely
available.
2/ interleave-sets with a single-dimm are by definition not affected
(nothing to sort). This covers the QEMU-KVM NVDIMM emulation case.
The cookie stored in the namespace label will be fixed by any write to
the namespace label; the most straightforward way to achieve this is to
write to the "alt_name" attribute of a namespace in sysfs.
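For illustration, a 64-bit-wide sort comparator of the kind described
(a generic sketch, not the libnvdimm code; the key name is made up):

    /* Compare two 64-bit keys by value rather than byte-by-byte, which
     * is what a raw memcmp() on little-endian data gets wrong. */
    static int cmp_map_key(const void *a, const void *b)
    {
            u64 x = *(const u64 *)a;
            u64 y = *(const u64 *)b;

            if (x < y)
                    return -1;
            if (x > y)
                    return 1;
            return 0;
    }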
Fixes: eaf961536e ("libnvdimm, nfit: add interleave-set state-tracking infrastructure")
Reported-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4ab18701c6 upstream.
FDT tag parsing is not related to whether BLK_DEV_INITRD is configured
or not, move it out of the corresponding #ifdef/#endif block.
This fixes passing external FDT to the kernel configured w/o
BLK_DEV_INITRD support.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d254a340e upstream.
When first implementing support for changing the output frequency, an
optimization was added to continue the PWM after changing the prescaler
without having to reprogram the ON and OFF registers for the duty cycle,
in case the duty cycle stayed the same. This was flawed, because we
compared the absolute value of the duty cycle in nanoseconds instead of
the ratio to the period.
Fix the problem by removing the shortcut.
Fixes: 01ec847200 ("pwm-pca9685: Support changing the output frequency")
Signed-off-by: Clemens Gruber <clemens.gruber@pqgruber.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0c424971f upstream.
In PowerNV PCI hotplug driver, the initial PCI slot's state is set
to PNV_PHP_STATE_POPULATED if no PCI devices are connected to the
slot. The PCI devices that are hot added to the slot won't be probed
and populated because of the check in pnv_php_enable():
/* Check if the slot has been configured */
if (php_slot->state != PNV_PHP_STATE_REGISTERED)
return 0;
This fixes the issue by leaving the slot in PNV_PHP_STATE_REGISTERED
state initially if nothing is connected to the slot.
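As a hedged sketch of the registration-time decision (the state names
come from the description above; the surrounding code and the 'present'
variable are assumptions):

    /* Only advertise the slot as populated when a device is present;
     * otherwise leave it REGISTERED so a later hot-add gets probed. */
    php_slot->state = present ? PNV_PHP_STATE_POPULATED
                              : PNV_PHP_STATE_REGISTERED;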
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Reported-by: Hank Chang <hankmax0000@gmail.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Willie Liauw <williel@supermicro.com.tw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7d55536c6 upstream.
The surprise hotplug is driven by interrupt in PowerNV PCI hotplug
driver. In the interrupt handler, pnv_php_interrupt(), we wrongly bail
when pnv_pci_get_presence_state() returns zero. This causes presence
change events to always be ignored.
This fixes the issue by bailing on error (non-zero value) returned
from pnv_pci_get_presence_state().
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Reported-by: Hank Chang <hankmax0000@gmail.com>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Tested-by: Willie Liauw <williel@supermicro.com.tw>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bd4e2d2907 upstream.
When transport_clear_lun_ref() is shutting down a se_lun via
configfs with new I/O in-flight, it's possible to trigger a
NULL pointer dereference in transport_lookup_cmd_lun() due
to the fact percpu_ref_get() doesn't do any __PERCPU_REF_DEAD
checking before incrementing lun->lun_ref.count after
lun->lun_ref has switched to atomic_t mode.
This results in a NULL pointer dereference as LUN shutdown
code in core_tpg_remove_lun() continues running after the
existing ->release() -> core_tpg_lun_ref_release() callback
completes, and clears the RCU protected se_lun->lun_se_dev
pointer.
During the OOPs, the state of lun->lun_ref in the process
which triggered the NULL pointer dereference looks like
the following on v4.1.y stable code:
struct se_lun {
lun_link_magic = 4294932337,
lun_status = TRANSPORT_LUN_STATUS_FREE,
.....
lun_se_dev = 0x0,
lun_sep = 0x0,
.....
lun_ref = {
count = {
counter = 1
},
percpu_count_ptr = 3,
release = 0xffffffffa02fa1e0 <core_tpg_lun_ref_release>,
confirm_switch = 0x0,
force_atomic = false,
rcu = {
next = 0xffff88154fa1a5d0,
func = 0xffffffff8137c4c0 <percpu_ref_switch_to_atomic_rcu>
}
}
}
To address this bug, use percpu_ref_tryget_live() to ensure
once __PERCPU_REF_DEAD is visible on all CPUs and ->lun_ref
has switched to atomic_t, all new I/Os will fail to obtain
a new lun->lun_ref reference.
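On the lookup side, the change amounts to something like the following
sketch (percpu_ref_tryget_live() is the real API; the error handling is
illustrative):

    /* Fails once the ref has been marked __PERCPU_REF_DEAD, instead of
     * blindly bumping the atomic count of a dying LUN. */
    if (!percpu_ref_tryget_live(&se_lun->lun_ref))
            return -ENODEV;         /* treat as a non-existent LUN */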
Also use an explicit percpu_ref_kill_and_confirm() callback
to block on ->lun_ref_comp to allow the first stage and
associated RCU grace period to complete, and then block on
->lun_ref_shutdown waiting for the final percpu_ref_put()
to drop the last reference via transport_lun_remove_cmd()
before continuing with core_tpg_remove_lun() shutdown.
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Tested-by: Vaibhav Tandon <vst@datera.io>
Cc: Vaibhav Tandon <vst@datera.io>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 303529d6ef upstream.
The root port or PCIe switch downstream port might have been associated
with a driver other than pnv-php. MSI or MSI-X might also have been
enabled by that driver (e.g. pcieport_drv). Attempting to enable MSI
again incurs the backtrace below:
PowerPC PowerNV PCI Hotplug Driver version: 0.1
------------[ cut here ]------------
WARNING: CPU: 19 PID: 1004 at drivers/pci/msi.c:1071 \
__pci_enable_msi_range+0x84/0x4e0
NIP [c000000000665c34] __pci_enable_msi_range+0x84/0x4e0
LR [c000000000665c24] __pci_enable_msi_range+0x74/0x4e0
Call Trace:
[c000000384d67600] [c000000000665c24] __pci_enable_msi_range+0x74/0x4e0
[c000000384d676e0] [d00000000aa31b04] pnv_php_register+0x564/0x5a0 [pnv_php]
[c000000384d677c0] [d00000000aa31658] pnv_php_register+0xb8/0x5a0 [pnv_php]
[c000000384d678a0] [d00000000aa31658] pnv_php_register+0xb8/0x5a0 [pnv_php]
[c000000384d67980] [d00000000aa31dfc] pnv_php_init+0x60/0x98 [pnv_php]
[c000000384d679f0] [c00000000000cfdc] do_one_initcall+0x6c/0x1d0
[c000000384d67ab0] [c000000000b92354] do_init_module+0x94/0x254
[c000000384d67b40] [c00000000019719c] load_module+0x258c/0x2c60
[c000000384d67d30] [c000000000197bb0] SyS_finit_module+0xf0/0x170
[c000000384d67e30] [c00000000000b184] system_call+0x38/0xe0
This fixes the issue by not enabling the surprise hotplug capability
if MSI or MSI-X on the PCI slot's upstream port has already been
enabled by another driver.
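A hedged sketch of that check (msi_enabled/msix_enabled are real
struct pci_dev fields; the surrounding flow is assumed):

    /* Another driver (e.g. pcieport) already owns MSI/MSI-X on the
     * upstream port: skip the surprise-hotplug interrupt setup. */
    if (pdev->msi_enabled || pdev->msix_enabled)
            return;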
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32677207dc upstream.
The child_exit errno needs to be shifted by 8 bits to compare against the
return values for the bisect variables.
Fixes: c5dacb88f0 ("ktest: Allow overriding bisect test results")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee19428950 upstream.
at91sam9_ebi_get_config() is incorrectly converting timings in clock
cycles into timings in nanoseconds by multiplying the cycle values by
the clk rate instead of the clk period.
at91sam9_ebi_xslate_config() has the same problem for the
tdf_ns -> tdf_cycles conversion.
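The arithmetic being corrected is, schematically, the following
(a sketch; clk_get_rate(), NSEC_PER_SEC and DIV_ROUND_UP_ULL() are
standard kernel interfaces, the rest is illustrative):

    /* cycles -> nanoseconds must scale by the clock period, i.e. divide
     * by the rate, not multiply by it. */
    unsigned long rate = clk_get_rate(ebi->clk);
    u64 ns = DIV_ROUND_UP_ULL((u64)cycles * NSEC_PER_SEC, rate);

    /* and the reverse, ns -> cycles, for the tdf conversion: */
    u32 tdf_cycles = DIV_ROUND_UP_ULL((u64)tdf_ns * rate, NSEC_PER_SEC);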
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Chris Leahy <leahycm@gmail.com>
Fixes: 6a4ec4cd08 ("memory: add Atmel EBI (External Bus Interface) driver")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93faccbbfa upstream.
To support unprivileged users mounting filesystems, two permission
checks have to be performed: a test to see if the user is allowed to
create a mount in the mount namespace, and a test to see if
the user is allowed to access the specified filesystem.
The automount case is special in that mounting the original filesystem
grants permission to mount the sub-filesystems to any user who
happens to stumble across their mountpoint and satisfies the
ordinary filesystem permission checks.
Attempting to handle the automount case by using override_creds
almost works. It preserves the idea that permission to mount
the original filesystem is permission to mount the sub-filesystem.
Unfortunately using override_creds messes up the filesystems
ordinary permission checks.
Solve this by being explicit that a mount is a submount by introducing
vfs_submount, and using it where appropriate.
vfs_submount uses a new internal mount flag, MS_SUBMOUNT, to let
sget and friends know that a mount is a submount so they can take
appropriate action.
sget and sget_userns are modified to not perform any permission checks
on submounts.
follow_automount is modified to stop using override_creds as that
has proven problematic.
do_mount is modified to always remove the new MS_SUBMOUNT flag so
that we know userspace will never be able to specify it.
autofs4 is modified to stop using current_real_cred that was put in
there to handle the previous version of submount permission checking.
cifs is modified to pass the mountpoint all of the way down to vfs_submount.
debugfs is modified to pass the mountpoint all of the way down to
trace_automount by adding a new parameter. To make this change easier
a new typedef debugfs_automount_t is introduced to capture the type of
the debugfs automount function.
Fixes: 069d5ac9ae ("autofs: Fix automounts by using current_real_cred()->uid")
Fixes: aeaa4a79ff ("fs: Call d_automount with the filesystems creds")
Reviewed-by: Trond Myklebust <trond.myklebust@primarydata.com>
Reviewed-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a6fdbdeb1 upstream.
Avoid that srp_process_rsp() overwrites the status information
in ch if the SRP target response timed out and processing of
another task management function has already started. Avoid that
issuing multiple task management functions concurrently triggers
list corruption. This patch prevents the following stack trace from
appearing in the system log:
WARNING: CPU: 8 PID: 9269 at lib/list_debug.c:52 __list_del_entry_valid+0xbc/0xc0
list_del corruption. prev->next should be ffffc90004bb7b00, but was ffff8804052ecc68
CPU: 8 PID: 9269 Comm: sg_reset Tainted: G W 4.10.0-rc7-dbg+ #3
Call Trace:
dump_stack+0x68/0x93
__warn+0xc6/0xe0
warn_slowpath_fmt+0x4a/0x50
__list_del_entry_valid+0xbc/0xc0
wait_for_completion_timeout+0x12e/0x170
srp_send_tsk_mgmt+0x1ef/0x2d0 [ib_srp]
srp_reset_device+0x5b/0x110 [ib_srp]
scsi_ioctl_reset+0x1c7/0x290
scsi_ioctl+0x12a/0x420
sd_ioctl+0x9d/0x100
blkdev_ioctl+0x51e/0x9f0
block_ioctl+0x38/0x40
do_vfs_ioctl+0x8f/0x700
SyS_ioctl+0x3c/0x70
entry_SYSCALL_64_fastpath+0x18/0xad
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Israel Rukshin <israelr@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Steve Feeley <Steve.Feeley@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6cb72bc1b4 upstream.
After srp_process_rsp() returns there is a short time during which
the scsi_host_find_tag() call will return a pointer to the SCSI
command that is being completed. If a duplicate response is received
during that time, avoid triggering the following call stack:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: srp_recv_done+0x450/0x6b0 [ib_srp]
Oops: 0000 [#1] SMP
CPU: 10 PID: 0 Comm: swapper/10 Not tainted 4.10.0-rc7-dbg+ #1
Call Trace:
<IRQ>
__ib_process_cq+0x4b/0xd0 [ib_core]
ib_poll_handler+0x1d/0x70 [ib_core]
irq_poll_softirq+0xba/0x120
__do_softirq+0xba/0x4c0
irq_exit+0xbe/0xd0
smp_apic_timer_interrupt+0x38/0x50
apic_timer_interrupt+0x90/0xa0
</IRQ>
RIP: srp_recv_done+0x450/0x6b0 [ib_srp] RSP: ffff88046f483e20
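A hedged sketch of the defensive check (scsi_host_find_tag() is the real
API; everything else here is illustrative):

    scmnd = scsi_host_find_tag(target->scsi_host, rsp->tag);
    if (!scmnd) {
            /* duplicate or stale response: nothing left to complete */
            return;
    }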
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Israel Rukshin <israelr@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Steve Feeley <Steve.Feeley@sandisk.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fd27a88c2 upstream.
When we initialized the buffer to create an SRQ in the kernel, the
number of pages was less than the number actually used by the
subsequent mlx5_fill_page_array().
Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b0841766a upstream.
When sending a packet to a destination that has not been resolved yet
via path query, the driver keeps the skb and tries to re-send it
when the path is resolved.
But when re-sending via dev_queue_xmit the kernel doesn't call
dev_hard_header, so IPoIB needs to keep 20 bytes in the skb
and put the destination address inside them.
That way dev_start_xmit will have the correct destination, and the
driver won't take the destination from skb->data while nothing exists
there, which caused the packet to be dropped.
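A rough sketch of stashing the 20-byte link-layer destination back into
the skb before requeueing (INFINIBAND_ALEN and skb_push() are real; the
exact layout used by the driver is an assumption here):

    /* Re-create the 20-byte pseudo link-layer header so that a later
     * dev_queue_xmit() finds the destination address in skb->data. */
    memcpy(skb_push(skb, INFINIBAND_ALEN), daddr, INFINIBAND_ALEN);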
The test flow is:
1. Run the SM on a remote node.
2. Restart the driver.
3. Ping some destination.
4. Observe that the first ICMP request is dropped.
Fixes: fc791b6335 ("IB/ipoib: move back IB LL address into the hard header")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1064f874ab upstream.
Ever since mount propagation was introduced, in cases where a mount is
propagated to a parent mount and mountpoint pair that is already in use,
the code has placed the new mount behind the old mount in the mount
hash table.
This implementation detail is problematic as it allows creating
arbitrary length mount hash chains.
Furthermore it invalidates the constraint maintained elsewhere in the
mount code that a parent mount and mountpoint pair will have exactly
one mount upon them, making it hard to deal with and to talk about
this special case in the mount code.
Modify mount propagation to notice when there is already a mount at
the parent mount and mountpoint where a new mount is propagating to
and place that preexisting mount on top of the new mount.
Modify unmount propagation to notice when a mount that is being
unmounted has another mount on top of it (and no other children), and
to replace the unmounted mount with the mount on top of it.
Move the MNT_UMOUNT test from __lookup_mnt_last into
__propagate_umount as that is the only call of __lookup_mnt_last where
MNT_UMOUNT may be set on any mount visible in the mount hash table.
These modifications allow:
- __lookup_mnt_last to be removed.
- attach_shadows to be renamed __attach_mnt and its shadow
handling to be removed.
- commit_tree to be simplified
- copy_tree to be simplified
The result is an easier to understand tree of mounts that does not
allow creation of arbitrary length hash chains in the mount hash table.
The result is also a very slight userspace visible difference in semantics.
The following two cases now behave identically, where before order
mattered:
case 1: (explicit user action)
    B is a slave of A
    mount something on A/a, it will propagate to B/a
    and then mount something on B/a
case 2: (tucked mount)
    B is a slave of A
    mount something on B/a
    and then mount something on A/a
Historically umount A/a would fail in case 1 and succeed in case 2.
Now umount A/a succeeds in both configurations.
This very small change in semantics appears, if anything, to be a bug
fix to me, and my survey of userspace leads me to believe that no
programs will notice or care about this subtle semantic change.
v2: Updated to mnt_change_mountpoint to not call dput or mntput
and instead to decrement the counts directly. It is guaranteed
that there will be other references when mnt_change_mountpoint is
called so this is safe.
v3: Moved put_mountpoint under mount_lock in attach_recursive_mnt
As the locking in fs/namespace.c changed between v2 and v3.
v4: Reworked the logic in propagate_mount_busy and __propagate_umount
that detects when a mount completely covers another mount.
v5: Removed unnecessary tests whose result is always true in
find_topper and attach_recursive_mnt.
v6: Document the user space visible semantic difference.
Fixes: b90fa9ae8f ("[PATCH] shared mount handling: bind and rbind")
Tested-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e290cecdd upstream.
brcmf_sdio_fromevntchan() was being called on the data frame
rather than the software header, causing some frames to be
mischaracterized as on the event channel rather than the data channel.
This fixes a major performance regression (due to dropped packets). With
this patch the download speed jumped from 1Mbit/s back up to 40MBit/s due
to the sheer amount of packets being incorrectly processed.
Fixes: c56caa9db8 ("brcmfmac: screening firmware event packet")
Signed-off-by: Gavin Li <git@thegavinli.com>
Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>
[kvalo@codeaurora.org: improve commit logs based on email discussion]
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 171ed0fcd8 upstream.
Commit 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU
not configured") introduced a rwsem to fix an invalid memory access that
occurred when someone attempts to access the config space of an AFU on a
vPHB whilst the AFU is deconfigured, such as during EEH recovery.
It turns out that it's possible to run into a nested locking issue when EEH
recovery fails and a full device hotplug is required.
cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
configured_rwsem. When EEH recovery fails, the EEH code calls
pci_hp_remove_devices() to remove the device, which in turn calls
cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
to grab the writer lock that's already held.
Standard rwsem semantics don't express what we really want to do here and
don't allow for nested locking. Fix this by replacing the rwsem with an
atomic_t which we can control more finely. Allow the AFU to be locked
multiple times so long as there are no readers.
Fixes: 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU not configured")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14a3ae34bf upstream.
During EEH recovery, we deconfigure all AFUs whilst leaving the
corresponding vPHB and virtual PCI device in place.
If something attempts to interact with the AFU's PCI config space (e.g.
running lspci) after the AFU has been deconfigured and before it's
reconfigured, cxl_pcie_{read,write}_config() will read invalid values from
the deconfigured struct cxl_afu and proceed to Oops when they try to
dereference pointers that have been set to NULL during deconfiguration.
Add a rwsem to struct cxl_afu so we can prevent interaction with config
space while the AFU is deconfigured.
Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Suggested-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 239a3b6636 upstream.
When TX descriptors are filled in, the buffer DMA address is split
between the tx_desc->buf_phys_addr field (high-order bits) and
tx_desc->packet_offset field (5 low-order bits).
However, when we re-calculate the DMA address from the TX descriptor in
mvpp2_txq_inc_put(), we do not take tx_desc->packet_offset into
account. This means that when the DMA address is not aligned on a 32
bytes boundary, we end up calling dma_unmap_single() with a DMA address
that was not the one returned by dma_map_single().
This inconsistency is detected by the kernel when DMA_API_DEBUG is
enabled. We fix this problem by properly calculating the DMA address in
mvpp2_txq_inc_put().
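Schematically, the corrected unmap looks like this (a sketch; the
descriptor field names follow the description above, the rest, including
data_size, is assumed):

    /* Reconstruct the exact address handed to dma_map_single(): the
     * 5 low-order bits live in packet_offset, not in buf_phys_addr. */
    dma_addr_t buf_dma = tx_desc->buf_phys_addr + tx_desc->packet_offset;

    dma_unmap_single(dev, buf_dma, tx_desc->data_size, DMA_TO_DEVICE);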
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4920e3cf77 upstream.
The current implementation of setup_randomness uses the stack address
and therefore the pointer to the SYSIB 3.2.2 block as input data
address. Furthermore the length of the input data is the number of
virtual-machine description blocks which is typically one.
This means that typically a single zero byte is fed to
add_device_randomness.
Fix both of these and use the address of the first virtual machine
description block as input data address and also use the correct
length.
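In other words, roughly (a sketch; stsi() and add_device_randomness()
are real interfaces, the struct layout is as described above):

    /* Feed the VM description blocks themselves, sized by how many of
     * them the machine actually reported, not a single stray byte. */
    if (stsi(vmms, 3, 2, 2) == 0 && vmms->count)
            add_device_randomness(&vmms->vm,
                                  sizeof(vmms->vm[0]) * vmms->count);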
Fixes: bcfcbb6bae ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da8fd820f3 upstream.
Commit bcfcbb6bae ("s390: add system information as device
randomness") intended to add some virtual machine specific information
to the randomness pool.
Unfortunately it uses the page allocator before it is ready to use. As
a result the page allocator always returns NULL and the setup_randomness
function never adds anything to the randomness pool.
To fix this use memblock_alloc and memblock_free instead.
Fixes: bcfcbb6bae ("s390: add system information as device randomness")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb94a687d9 upstream.
Return a sensible value for TASK_SIZE if called from a kernel thread.
This gets us around an issue with copy_mount_options, which does a magic
size calculation "TASK_SIZE - (unsigned long)data" while in a kernel
thread with data pointing to kernel space.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a4a81d8eeb upstream.
In binutils/libbfd (bfd/elf.c) it is enforced that all s390 specific ELF
notes like e.g. NT_S390_PREFIX or NT_S390_CTRS have "LINUX" specified
as note name. Otherwise the notes are ignored.
For /proc/vmcore we currently use "CORE" for these notes.
Up to now this has not been a real problem because the dump analysis tool
"crash" does not check the note name. But it will break all programs that
use libbfd for processing ELF notes.
So fix this and use "LINUX" for all s390 specific notes to comply with
libbfd.
Reported-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a63f53e34d upstream.
Since commit dd22f551 "block: Change direct_access calling convention",
the device size calculation in dcssblk_direct_access() is off-by-one.
This results in bdev_direct_access() always returning -ENXIO because the
returned value is not page aligned.
Fix this by adding 1 to the dev_sz calculation.
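That is, schematically (a sketch based on the description; the dev_info
field names are assumptions):

    /* 'end' is the address of the last byte, so the size is end-start+1;
     * without the +1 the result is one byte short and not page aligned. */
    dev_sz = dev_info->end - dev_info->start + 1;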
Fixes: dd22f551 ("block: Change direct_access calling convention")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e4a382fdc upstream.
For devices with multiple input queues, tiqdio_call_inq_handlers()
iterates over all input queues and clears the device's DSCI
during each iteration. If the DSCI is re-armed during one
of the later iterations, we therefore do not scan the previous
queues again.
The re-arming also raises a new adapter interrupt. But its
handler does not trigger a rescan for the device, as the DSCI
has already been erroneously cleared.
This can result in queue stalls on devices with multiple
input queues.
Fix it by clearing the DSCI just once, prior to scanning the queues.
As the code is moved in front of the loop, we also need to access
the DSCI directly (ie irq->dsci) instead of going via each queue's
parent pointer to the same irq. This is not a functional change,
and a follow-up patch will clean up the other users.
In practice, this bug only affects CQ-enabled HiperSockets devices,
i.e. devices with the sysfs attribute "hsuid" set. Setting an hsuid is
needed for AF_IUCV socket applications that use HiperSockets
communication.
Fixes: 104ea556ee ("qdio: support asynchronous delivery of storage blocks")
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96794e4ed4 upstream.
The guest segment selector is a 16 bit field and the guest segment base
is a natural-width field. Fix two incorrect invocations accordingly.
Without this patch, build fails when aggressive inlining is used with ICC.
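The distinction is between the 16-bit and natural-width VMCS accessors,
e.g. (an illustrative field pair, not the exact lines the patch touches):

    vmcs_write16(GUEST_CS_SELECTOR, selector);  /* 16-bit VMCS field   */
    vmcs_writel(GUEST_CS_BASE, base);           /* natural-width field */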
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1e8a9624f upstream.
User controlled KVM guests do not support the dirty log, as they have
no single gmap that we can check for changes.
As they have no single gmap, kvm->arch.gmap is NULL and all further
references to it for dirty checking will result in a NULL
dereference.
Let's return -EINVAL if a caller tries to sync dirty logs for a
UCONTROL guest.
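A hedged sketch of the guard at the top of the dirty-log handler
(kvm_is_ucontrol() is the existing s390 helper; placement is assumed):

    /* User controlled guests have no single gmap to extract a dirty
     * bitmap from, so refuse the operation outright. */
    if (kvm_is_ucontrol(kvm))
            return -EINVAL;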
Fixes: 15f36eb ("KVM: s390: Add proper dirty bitmap support to S390 kvm.")
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c9c858e2f upstream.
The MKS Instruments SCOM-0800 and SCOM-0801 cards (originally by Tenta
Technologies) are 3U CompactPCI serial cards with 4 and 8 serial ports,
respectively. The first 4 ports are implemented by an OX16PCI954 chip,
and the second 4 ports are implemented by an OX16C954 chip on a local
bus, bridged by the second PCI function of the OX16PCI954. The ports
are jumper-selectable as RS-232 and RS-422/485, and the UARTs use a
non-standard oscillator frequency of 20 MHz (base_baud = 1250000).
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82f2341c94 upstream.
Currently N_HDLC line discipline uses a self-made singly linked list for
data buffers and has n_hdlc.tbuf pointer for buffer retransmitting after
an error.
The commit be10eb7589
("tty: n_hdlc add buffer flushing") introduced racy access to n_hdlc.tbuf.
After tx error concurrent flush_tx_queue() and n_hdlc_send_frames() can put
one data buffer to tx_free_buf_list twice. That causes double free in
n_hdlc_release().
Let's use a standard kernel linked list and get rid of n_hdlc.tbuf:
in case of tx error, put the current data buffer after the head of tx_buf_list.
Signed-off-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
brcmexp_gpio_set was adding gpio->gc.base to the offset
twice, so passing an invalid number to the mailbox service.
The firmware treated it modulo-8 anyway, but was logging an
assert every time.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Firmware expgpio driver reworked due to complaint over
hotplug detect.
Requires power LED to change sense as firmware is no longer
inverting the read value.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
commit e5072053b0 upstream.
This further refines the changes made to conntrack gc_worker in
commit e0df8cae6c ("netfilter: conntrack: refine gc worker heuristics").
The main idea of that change was to reduce the scan interval when evictions
take place.
However, on the reporters' setup, there are 1-2 million conntrack entries
in total and roughly 8k new (and closing) connections per second.
In this case we'll always evict at least one entry per gc cycle and scan
interval is always at 1 jiffy because of this test:
} else if (expired_count) {
gc_work->next_gc_run /= 2U;
next_run = msecs_to_jiffies(1);
being true almost all the time.
Given we scan ~10k entries per run, it's clearly wrong to reduce the
interval based on a nonzero eviction count; it will only waste cpu
cycles since the vast majority of conntracks are not timed out.
Thus only look at the ratio (scanned entries vs. evicted entries) to make
a decision on whether to reduce or not.
Because the evictor is supposed to only kick in when the system turns
idle after a busy period, pick a high ratio -- this makes it 50%. We
thus keep the idea of increasing the scan rate when it's likely that
the table contains many expired entries.
In order to not let timed-out entries hang around for too long
(important when using event logging, in which case we want to timely
destroy events), we now scan the full table within at most
GC_MAX_SCAN_JIFFIES (16 seconds) even in the worst-case scenario where
all timed-out entries sit in the same slot.
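A hedged sketch of the heuristic described above (variable names and the
interval handling are illustrative, not the exact upstream code):

    /* Only shrink the scan interval when a large share of the scanned
     * entries were actually evicted (threshold 50%, per the text). */
    ratio = scanned ? expired_count * 100 / scanned : 0;
    if (ratio > 50)
            gc_work->next_gc_run = min_interval;    /* illustrative */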
I tested this with a vm under synflood (with
sysctl net.netfilter.nf_conntrack_tcp_timeout_syn_recv=3).
While flood is ongoing, interval now stays at its max rate
(GC_MAX_SCAN_JIFFIES / GC_MAX_BUCKETS_DIV -> 125ms).
With feedback from Nicolas Dichtel.
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Fixes: b87a2f9199 ("netfilter: conntrack: add gc worker to remove timed-out entries")
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d641df819d upstream.
add_to_page_cache_lru() can fail, so the actual number of pages to read
can be smaller than the initial size of the osd request. We need to
update the osd request size in that case.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae2f5e5ed0 upstream.
Fix the following build error with binutils 2.25.
CC arch/mips/mm/sc-ip22.o
{standard input}: Assembler messages:
{standard input}:132: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:159: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:200: Error: number (0x9000000080000000) larger than 32 bits
scripts/Makefile.build:293: recipe for target 'arch/mips/mm/sc-ip22.o' failed
make[1]: *** [arch/mips/mm/sc-ip22.o] Error 1
MIPS has used .set mips3 to temporarily switch the assembler to 64 bit
mode in 64 bit kernels virtually forever. Binutils 2.25 broke this
behaviour partially by happily accepting 64 bit instructions in .set mips3
mode but puking on 64 bit constants when generating 32 bit ELF. Binutils
2.26 restored the old behaviour again.
Fix build with binutils 2.25 by open coding the offending
dli $1, 0x9000000080000000
as
li $1, 0x9000
dsll $1, $1, 48
which is ugly but the only thing that will build on all binutils vintages.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fda2d27db6 upstream.
We will set LPCR with the correct value for radix during init. This makes
sure we start with a sanitized value of LPCR. In case of kexec, cpus can
have an LPCR value based on the previous translation mode we were running in.
Fixes: fe036a0605 ("powerpc/64/kexec: Fix MMU cleanup on radix")
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c21a493a2b upstream.
Currently the xmon data-breakpoint feature is broken.
Whenever a watchpoint match occurs, hw_breakpoint_handler will
be called by do_break via the notifier chain mechanism. If the watchpoint is
registered by xmon, hw_breakpoint_handler won't find any associated
perf_event and returns immediately with NOTIFY_STOP. Similarly, do_break
also returns without notifying to xmon.
Solve this by returning NOTIFY_DONE when hw_breakpoint_handler does not
find any perf_event associated with matched watchpoint, rather than
NOTIFY_STOP, which tells the core code to continue calling the other
breakpoint handlers including the xmon one.
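Schematically, the change in hw_breakpoint_handler() is the following
sketch (the surrounding variables are assumptions):

    /* No perf_event owns this watchpoint (e.g. it was set by xmon):
     * let the rest of the notifier chain, including xmon, see it. */
    if (!bp) {
            rc = NOTIFY_DONE;       /* was: NOTIFY_STOP */
            goto out;
    }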
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16f906d66c upstream.
The MAX_SEND_SGES check introduced in commit 655fec6987
("xprtrdma: Use gathered Send for large inline messages") fails
for devices that have a small max_sge.
Instead of checking for a large fixed maximum number of SGEs,
check for a minimum small number. RPC-over-RDMA will switch to
using a Read chunk if an xdr_buf has more pages than can fit in
the device's max_sge limit. This is considerably better than
failing altogether to mount the server.
This fix supports devices that have as few as three send SGEs
available.
Reported-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reported-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reported-by: Honggang Li <honli@redhat.com>
Reported-by: Ram Amrani <Ram.Amrani@cavium.com>
Fixes: 655fec6987 ("xprtrdma: Use gathered Send for large ...")
Tested-by: Honggang Li <honli@redhat.com>
Tested-by: Ram Amrani <Ram.Amrani@cavium.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c95a3c6b88 upstream.
Commit d5440e27d3 ("xprtrdma: Enable pad optimization") made the
Linux client omit XDR round-up padding in normal Read and Write
chunks so that the client doesn't have to register and invalidate
3-byte memory regions that contain no real data.
Unfortunately, my cheery 2014 assessment that this optimization "is
supported now by both Linux and Solaris servers" was premature.
We've found bugs in Solaris in this area since commit d5440e27d3
("xprtrdma: Enable pad optimization") was merged (SYMLINK is the
main offender).
So for maximum interoperability, I'm disabling this optimization
again. If a CM private message is exchanged when connecting, the
client recognizes that the server is Linux, and enables the
optimization for that connection.
Until now the Solaris server bugs did not impact common operations,
and were thus largely benign. Soon, less capable devices on Linux
NFS/RDMA clients will make use of Read chunks more often, and these
Solaris bugs will prevent interoperation in more cases.
Fixes: 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5f0afbea4 upstream.
Pad optimization is changed by echoing into
/proc/sys/sunrpc/rdma_pad_optimize. This is a global setting,
affecting all RPC-over-RDMA connections to all servers.
The marshaling code picks up that value and uses it for decisions
about how to construct each RPC-over-RDMA frame. Having it change
suddenly in mid-operation can result in unexpected failures. And
some servers a client mounts might need chunk round-up, while
others don't.
So instead, copy the pad_optimize setting into each connection's
rpcrdma_ia when the transport is created, and use the copy, which
can't change during the life of the connection, instead.
This also removes a hack: rpcrdma_convert_iovs was using
the remote-invalidation-expected flag to predict when it could leave
out Write chunk padding. This is because the Linux server handles
implicit XDR padding on Write chunks correctly, and only Linux
servers can set the connection's remote-invalidation-expected flag.
It's more sensible to use the pad optimization setting instead.
Fixes: 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24abdf1be1 upstream.
When pad optimization is disabled, rpcrdma_convert_iovs still
does not add explicit XDR round-up padding to a Read chunk.
Commit 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
incorrectly short-circuited the test for whether round-up padding
is needed that appears later in rpcrdma_convert_iovs.
However, if this is indeed a regular Read chunk (and not a
Position-Zero Read chunk), the tail iovec _always_ contains the
chunk's padding, and never anything else.
So, it's easy to just skip the tail when padding optimization is
enabled, and add the tail in a subsequent Read chunk segment, if
disabled.
Fixes: 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adee40b265 upstream.
Commit 3d8cc00073 ("dmaengine: ipu: Consolidate duplicated irq handlers")
consolidated the two interrupt routines into one, but the remaining
interrupt routine only checks the status of the error interrupts, not the
normal interrupts.
This patch fixes that problem (tested on i.MX31 PDK board).
Fixes: 3d8cc00073 ("dmaengine: ipu: Consolidate duplicated irq handlers")
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 656441478e upstream.
The commit 7a65417216 ("mtd/ifc: Add support for IFC controller
version 2.0") added support for version 2.0 of the IFC controller.
The version 2.0 controller has the ECC status registers at a different
location to the previous versions.
Correct the fsl_ifc_nand structure so that the ECC status can be read
from the correct location for both version 1.0 and 2.0 of the controller.
Fixes: 7a65417216 ("mtd/ifc: Add support for IFC controller version 2.0")
Signed-off-by: Mark Marshall <mark.marshall@omicronenergy.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03a9e24ef2 upstream.
Recently I received a bug report that on a Linux v3.0 based kernel, hot-adding
a disk to an md linear device causes a kernel crash in linear_congested(). From
the crash image analysis, I find that in linear_congested() mddev->raid_disks
contains the value N, but conf->disks[] only has N-1 pointers available. Then
a NULL pointer dereference crashes the kernel.
There is a race between linear_add() and linear_congested(); the RCU code
used in these two functions cannot avoid the race. Since Linux v4.0 the
RCU code was replaced by introducing mddev_suspend(). After checking the
upstream code, it seems linear_congested() is not called in the
generic_make_request() code path, so mddev_suspend() cannot prevent it
from being called. The possible race still exists.
Here is how the race still exists in the current code. On a machine with
many CPUs, linear_add() is called on one CPU to add a hard disk to an
md linear device; at the same time, on another CPU, linear_congested() is
called to detect whether this md linear device is congested before issuing
an I/O request to it.
The following possible execution time sequence shows how the race can
happen:
seq  linear_add()              linear_congested()
 0                             conf=mddev->private
 1   oldconf=mddev->private
 2   mddev->raid_disks++
 3                             for (i=0; i<mddev->raid_disks;i++)
 4                             bdev_get_queue(conf->disks[i].rdev->bdev)
 5   mddev->private=newconf
In linear_add(), mddev->raid_disks is increased at time seq 2, and on
another CPU, in linear_congested(), the for-loop iterates over conf->disks[i]
using the increased mddev->raid_disks at time seq 3,4. But the conf that has
one more element (a pointer to struct dev_info) in conf->disks[] has not been
published yet, so accessing that element's structure members at time seq 4
causes a NULL pointer dereference fault.
To fix this race, the patch makes two modifications:
1) Add 'int raid_disks' to struct linear_conf, as a copy of
mddev->raid_disks. It is initialized in linear_conf() and always stays
consistent with the number of pointers in 'struct dev_info disks[]'. When
iterating conf->disks[] in linear_congested(), use conf->raid_disks instead
of mddev->raid_disks in the for-loop, so the NULL pointer dereference
cannot happen again.
2) The RCU protection is brought back, and kfree_rcu() is used in
linear_add() to free the oldconf memory. Because oldconf may still be
referenced as mddev->private in linear_congested(), kfree_rcu() makes sure
that its memory is not released until no one uses it any more.
Also some code comments are added in this patch, to make this modification
easier to understand.
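The core of modification 1) can be sketched in standalone C as keeping the
array length inside the same structure as the array it describes, so a reader
that follows one pointer always sees a consistent pair (simplified names, not
the md code, and the RCU half of the fix is omitted):

#include <stdlib.h>

struct dev_info {
	int dummy;
};

struct linear_conf_sketch {
	int raid_disks;			/* always matches the size of disks[] below */
	struct dev_info disks[];	/* allocated together with the header */
};

static struct linear_conf_sketch *alloc_conf(int n)
{
	struct linear_conf_sketch *conf =
		malloc(sizeof(*conf) + n * sizeof(conf->disks[0]));

	if (!conf)
		return NULL;
	conf->raid_disks = n;		/* the copy the reader will trust */
	for (int i = 0; i < n; i++)
		conf->disks[i].dummy = i;
	return conf;
}

static void congested(const struct linear_conf_sketch *conf)
{
	/* Bound the loop by conf->raid_disks, never by a separately
	 * updated counter such as mddev->raid_disks. */
	for (int i = 0; i < conf->raid_disks; i++)
		(void)conf->disks[i].dummy;
}

int main(void)
{
	struct linear_conf_sketch *conf = alloc_conf(2);

	if (!conf)
		return 1;
	congested(conf);
	free(conf);
	return 0;
}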
This patch can be applied to kernels since v4.0, after commit
3be260cc18 ("md/linear: remove rcu protections in favour of
suspend/resume"). But this bug is reported on a Linux v3.0 based kernel, so
people who maintain kernels before Linux v4.0 need to do some backporting
of this patch.
Changelog:
- V3: add 'int raid_disks' in struct linear_conf, and use kfree_rcu() to
replace call_rcu() in linear_add().
- v2: add RCU protection as suggested by Shaohua and Neil.
- v1: initial effort.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb61bb82cb upstream.
The RTC is clocked from either an internal, imprecise, oscillator or an
external one, which is usually much more accurate.
The difference perceived between the time elapsed and the time reported by
the RTC is in a 10% scale, which prevents the RTC from being useful at all.
Fortunately, the external oscillator is reported to be mandatory in the
Allwinner datasheet, so we can just switch to it.
Fixes: 9765d2d943 ("rtc: sun6i: Add sun6i RTC driver")
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b107f5b97 upstream.
If segs_per_sec is over 1, as under SMR, f2fs previously issued discard
commands redundantly on the same section, since we didn't move the end
position forward for the previous discard command.
E.g., with prefree_bitmap = [01111100111100]:
- at first, 'start' and 'end' mark the prefree blocks selected for discard
  in this section ('start' before 'end');
- after issuing the discard for this section, 'end' has not been moved
  forward, so it now lies before 'start';
- selecting again by searching from (end + 1) therefore picks the same
  section once more.
Fixes: 36abef4e79 ("f2fs: introduce mode=lfs mount option")
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e93b986525 upstream.
For foreground GC, the greedy algorithm should be adopted, which makes
this formula work well:
(2 * (100 / config.overprovision + 1) + 6)
But currently, foreground GC gives priority to segments that were selected as
background GC victims; those victims are chosen by the cost-benefit algorithm,
so we can't guarantee such segments have few valid blocks. This may break the
f2fs rule and, in the worst case, consume all the free segments.
This patch fixes this by adding a filter in check_bg_victims(): if a segment's
number of valid blocks is over the overprovision ratio, skip it.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88c5c13a50 upstream.
It turns out a stackable filesystem like sdcardfs in AOSP can trigger multiple
vfs_create() calls to the lower filesystem. In that case, f2fs will add multiple
dentries having the same name, which breaks filesystem consistency.
Until the upper layer is fixed, let's work around this in f2fs, which actually
shows not much performance regression.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed92d8c137 upstream.
We're not taking into account the space needed for the (variable
length) attr bitmap, with the result that we'd sometimes get a spurious
ERANGE when the ACL data got close to the end of a page.
Just add in an extra page to make sure.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6682c14bbe upstream.
Bitmap and attrlen follow immediately after the op reply header. This
was an oversight from commit bf118a342f.
The consequences of this are just a minor loss of efficiency (extra calls
to xdr_shrink_bufhead).
Fixes: bf118a342f "NFSv4: include bitmap in nfsv4 get acl data"
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df3ab232e4 upstream.
If we see that our pNFS READ/WRITE/COMMIT operation failed, but we
also see that our layout segment is no longer valid, then we need to
get a new layout segment before retrying.
Fixes: 90816d1dda ("NFSv4.1/flexfiles: Don't mark the entire deviceid...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d8cacbf56 upstream.
Copy offload code needs to be hooked into the code for handling
NFS4ERR_BAD_STATEID by ensuring that we set the "stateid" field
in struct nfs4_exception.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 2e72448b07 ("NFS: Add COPY nfs operation")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a974deee47 upstream.
If we exit because the file access check failed, we currently
leak the struct nfs4_state. We need to attach it to the
open context before returning.
Fixes: 3efb972247 ("NFSv4: Refactor _nfs4_open_and_get_state..")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 783112f740 upstream.
Both the NFS protocols and the Linux VFS use a setattr operation that takes a
bitmap of attributes to set, covering various file attributes including the
file size and the uid/gid.
The Linux syscalls never mix size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates don't update random other attributes, and many other file
systems handle the case but do not update the other attributes in the
same transaction. NFSD on the other hand passes the attributes it gets
on the wire more or less directly through to the VFS, leading to updates
the file systems don't expect. XFS at least has an assert on the
allowed attributes, which caught an unusual NFS client setting the size
and group at the same time.
To handle this issue properly, this patch splits the notify_change call in
nfsd_setattr into two separate ones.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9797484ba8 upstream.
Commit 050c3d52cc ("vme: make core
vme support explicitly non-modular") dropped the remove function
because it appeared as if it was for removal of the bus, which is
not supported.
However, vme_bus_remove() is called when a VME device is removed
from the bus, not when the bus itself is removed, as it calls the VME
device driver's cleanup function. Without this function, the
remove() in the VME device driver is never called and VME device
drivers cannot be reloaded again.
Here we restore the remove function that was deleted in that
commit, and the reference to the function in the bus structure.
Fixes: 050c3d52cc ("vme: make core vme support explicitly non-modular")
Cc: Manohar Vanga <manohar.vanga@gmail.com>
Acked-by: Martyn Welch <martyn@welchs.me.uk>
Cc: devel@driverdev.osuosl.org
Signed-off-by: Stefano Babic <sbabic@denx.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6773386f97 upstream.
Kernels built with CONFIG_KASAN=y report the following BUG for rtl8192cu
and rtl8192c-common:
==================================================================
BUG: KASAN: slab-out-of-bounds in rtl92c_dm_bt_coexist+0x858/0x1e40
[rtl8192c_common] at addr ffff8801c90edb08
Read of size 1 by task kworker/0:1/38
page:ffffea0007243800 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
flags: 0x8000000000004000(head)
page dumped because: kasan: bad access detected
CPU: 0 PID: 38 Comm: kworker/0:1 Not tainted 4.9.7-gentoo #3
Hardware name: Gigabyte Technology Co., Ltd. To be filled by
O.E.M./Z77-DS3H, BIOS F11a 11/13/2013
Workqueue: rtl92c_usb rtl_watchdog_wq_callback [rtlwifi]
0000000000000000 ffffffff829eea33 ffff8801d7f0fa30 ffff8801c90edb08
ffffffff824c0f09 ffff8801d4abee80 0000000000000004 0000000000000297
ffffffffc070b57c ffff8801c7aa7c48 ffff880100000004 ffffffff000003e8
Call Trace:
[<ffffffff829eea33>] ? dump_stack+0x5c/0x79
[<ffffffff824c0f09>] ? kasan_report_error+0x4b9/0x4e0
[<ffffffffc070b57c>] ? _usb_read_sync+0x15c/0x280 [rtl_usb]
[<ffffffff824c0f75>] ? __asan_report_load1_noabort+0x45/0x50
[<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common]
[<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common]
[<ffffffffc06d0cbe>] ? rtl92c_dm_rf_saving+0x96e/0x1330 [rtl8192c_common]
...
The problem is due to rtl8192ce and rtl8192cu sharing routines, and having
different layouts of struct rtl_pci_priv, which is used by rtl8192ce, and
struct rtl_usb_priv, which is used by rtl8192cu. The problem was resolved
by placing the struct bt_coexist_info at the head of each of those private
areas.
Reported-and-tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40b368af4b upstream.
The addresses of WLAN NIC registers are naturally aligned, but some
drivers have bugs. These become evident on platforms that need natural
alignment to access registers. This change contains the following:
1. Function _rtl8821ae_dbi_read() is used to read one byte from DBI,
thus it should use rtl_read_byte().
2. Register 0x4C7 of 8192ee is single byte.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e8b571a9a upstream.
The "fw" firmware object is passed from the remoteproc core and should
not be overwritten, as that results in leaked buffers and a double free
of the last firmware object.
Fixes: 051fb70fd4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f38e5fb95a upstream.
We must hold the rcu read lock across looking up glocks and trying to
bump their refcount to prevent the glocks from being freed in between.
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55efcfcd77 upstream.
The RDMA core uses ib_pack() to convert from unpacked CPU structs
to on-the-wire bitpacked structs.
This process requires that 1-bit fields are declared as u8 in the
unpacked struct, otherwise the packing process does not read the
value properly and the packed result is wired to 0. Several
places wrongly used int.
Crucially, this means the kernel has never set reversible
correctly in the path record request. It has always asked for
irreversible paths even if the ULP requests otherwise.
When the kernel is used with a SM that supports this feature, it
completely breaks communication management if reversible paths are
not properly requested.
The only reason this ever worked is because opensm ignores the
reversible bit.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d77044d142 upstream.
VSS may use a char device to support the communication between
the user level daemon and the driver. When the VSS channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new VSS offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20951c7535 upstream.
Fcopy may use a char device to support the communication between
the user level daemon and the driver. When the Fcopy channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new Fcopy offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a66fecbf6 upstream.
KVP may use a char device to support the communication between
the user level daemon and the driver. When the KVP channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new KVP offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccb61f8a99 upstream.
The host can rescind a channel that has been offered to the
guest and once the channel is rescinded, the host does not
respond to any requests on that channel. Deal with the case where
the guest may be blocked waiting for a response from the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7e97dd8b7 upstream.
After the channel is rescinded, the host does not read from the rescinded channel,
so fail writes to a channel that has already been rescinded. If we permit writes
on a rescinded channel, the host will not respond and we can end up unable to
unload vmbus drivers, which must not have any outstanding requests to the host
at the point they are unloaded.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56ef6718a1 upstream.
It may happen that secondary CPUs are still alive, and resetting
hv_context.tsc_page would cause a subsequent crash in read_hv_clock_tsc(),
as we don't check it for being NULL there. Leaving the page in place is safe
as we're not freeing it anyway.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c7630d350 upstream.
Initializing hv_context.percpu_list in hv_synic_alloc() helps to prevent a
crash in percpu_channel_enq() when not all CPUs were online during
initialization and it naturally belongs there.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 421b8f20d3 upstream.
It may happen that not all CPUs are online when we do hv_synic_alloc() and
in case more CPUs come online later we may try accessing these allocated
structures.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33e4c1a998 upstream.
As the IN request has to be allocated in set_alt() and released in
disable(), we cannot use a mutex to protect it as we cannot sleep
in those functions. Let's replace this mutex with a spinlock.
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa65d11aa0 upstream.
When we unlock our spinlock to copy data to user space, we may get
disabled by the USB host and free the whole list of completed OUT
requests, including the one from which we are copying the data
to user memory.
To prevent this, let's remove our working element from
the list and place it back only if there is something left when we
finish with it.
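That pattern can be sketched in userspace C with pthreads (names invented,
not the f_hid code): detach the element under the lock, copy it out unlocked,
and re-insert it only if data remains, so a concurrent teardown that frees the
whole list cannot free the element we are still using.

#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct req {
	struct req *next;
	char data[16];
	size_t len;
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct req *completed;

static size_t read_report(char *dst, size_t want)
{
	struct req *r;
	size_t n;

	pthread_mutex_lock(&lock);
	r = completed;
	if (!r) {
		pthread_mutex_unlock(&lock);
		return 0;
	}
	completed = r->next;		/* detach while still holding the lock */
	pthread_mutex_unlock(&lock);

	n = want < r->len ? want : r->len;
	memcpy(dst, r->data, n);	/* stand-in for copy_to_user(), done unlocked */
	r->len -= n;
	memmove(r->data, r->data + n, r->len);

	pthread_mutex_lock(&lock);
	if (r->len) {			/* put it back only if something is left */
		r->next = completed;
		completed = r;
	}
	pthread_mutex_unlock(&lock);
	if (!r->len)
		free(r);
	return n;
}

int main(void)
{
	struct req *r = calloc(1, sizeof(*r));
	char buf[8];

	if (!r)
		return 1;
	memcpy(r->data, "report", 6);
	r->len = 6;
	completed = r;
	while (read_report(buf, 4))	/* drains the request in two partial reads */
		;
	return 0;
}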
Fixes: 99c5150058 ("usb: gadget: hidg: register OUT INT endpoint for SET_REPORT")
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20d2ca955b upstream.
Requests for the OUT endpoint are allocated in the bind() function
but never released.
This commit ensures that all pending requests are released
when we disable the OUT endpoint.
Fixes: 99c5150058 ("usb: gadget: hidg: register OUT INT endpoint for SET_REPORT")
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5528954a1a upstream.
Commit 304f7e5e1d ("usb: gadget: Refactor request completion")
removed check if req->req.complete is non-NULL, resulting in a NULL
pointer derefence and a kernel panic.
This patch adds an empty complete function instead of re-introducing
the req->req.complete check.
Fixes: 304f7e5e1d ("usb: gadget: Refactor request completion")
Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8236800da1 upstream.
Since commit 855ed04a37 ("usb: gadget: udc-core: independent registration
of gadgets and gadget drivers"),
if we load a gadget module but there is no free UDC available,
it will be stored on a pending gadgets list:
$ modprobe g_zero.ko
$ modprobe g_ether.ko
[] udc-core: couldn't find an available UDC - added [g_ether] to list
of pending drivers
We scan this list each time a new UDC appears in the system.
But we can also get a free UDC each time a gadget is unbound.
This commit adds scanning of that list directly after unbinding
a gadget from a UDC.
Thanks to this, when we unload the first gadget:
$ rmmod g_zero.ko
the gadget which is pending is automatically
attached to that UDC (if the name matches).
Fixes: 855ed04a37 ("usb: gadget: udc-core: independent registration of gadgets and gadget drivers")
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5de4e1ea9a upstream.
Commit 4ac53087d6 ("usb: xhci: plat: Create both
HCDs before adding them") moved usb_add_hcd() to the end of
probe, which leaves hcc_params uninitialized at that point, because
the xHCI driver sets hcc_params in xhci_gen_setup(), called from
usb_add_hcd().
This patch checks the Maximum Primary Stream Array Size
in the hcc_params register after adding the primary hcd.
Signed-off-by: William wu <william.wu@rock-chips.com>
Acked-by: Roger Quadros <rogerq@ti.com>
Fixes: 4ac53087d6 ("usb: xhci: plat: Create both HCDs before adding them")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffb80fc672 upstream.
At least macOS seems to be sending
ClearFeature(ENDPOINT_HALT) to endpoints which
aren't Halted. This makes DWC3's CLEARSTALL command
time out which causes several issues for the driver.
Instead, let's just return 0 and bail out early.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a994ce2d7e upstream.
DA8xx driver is registering and using the CPPI 3.0 DMA controller but
actually, the DA8xx has a CPPI 4.1 DMA controller.
Remove the CPPI 3.0 quirk and methods.
Fixes: f8e9f34f80 ("usb: musb: Fix up DMA related macros")
Fixes: 7f6283ed6f ("usb: musb: Set up function pointers for DMA")
Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61cd1b4cd1 upstream.
The ds2490 driver was doing USB transfers from/to buffers on the stack.
This is not permitted and made the driver non-working with vmapped stacks.
Since all these transfers are done under the same bus_mutex lock, we can
simply use shared buffers in the device private structure for the two most
common of them.
While we are at it, let's also fix a comparison between int and size_t in
ds9490r_search() which made the driver spin in this function if state
register get requests were failing.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2ce4ea1a0 upstream.
Near the beginning of w1_attach_slave_device() we increment a w1 master
reference count.
Later, when we are going to exit this function without actually attaching
a slave device (due to failure of __w1_attach_slave_device()) we need to
decrement this reference count back.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: 9fcbbac5de ("w1: process w1 netlink commands in w1_process thread")
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c42631376 upstream.
The priv->cmd_msg_buffer is allocated in the probe function, but never
kfree()ed. This patch converts the kzalloc() to resource-managed
kzalloc.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c919a3069c upstream.
Fixes: 05ca527000 ("can: gs_usb: add ethtool set_phys_id callback to locate physical device")
The gs_usb driver is performing USB transfers using buffers allocated on
the stack. This causes the driver to not function with vmapped stacks.
Instead, allocate memory for the transfer buffers.
Signed-off-by: Ethan Zonca <e@ethanzonca.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cf6cdba58 upstream.
Fixes a regression triggered by a change in the layout of
struct iio_chan_spec, but the real bug is in the driver, which assumed
a specific structure layout in the first place. Hint: the two bits were
not OR'ed together as implied by the indentation prior to this patch;
there was a comma between them, which accidentally moved the ..._SCALE
bit to the next structure field. That field was .info_mask_shared_by_type
before the _available attributes were added by commit 5123960007
("iio:core: add a callback to allow drivers to provide _available
attributes") and .info_mask_separate_available afterwards, and the
regression happened.
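The effect of that stray comma can be reproduced with a plain C struct
(a generic sketch; the struct and bit names below are invented, this is not
the iio_chan_spec definition):

#include <stdio.h>

struct chan_spec_sketch {
	unsigned int info_mask_separate;
	unsigned int info_mask_shared_by_type;
};

#define BIT_RAW   (1u << 0)
#define BIT_SCALE (1u << 1)

int main(void)
{
	struct chan_spec_sketch buggy = {
		BIT_RAW,	/* intended: BIT_RAW | BIT_SCALE              */
		BIT_SCALE,	/* the comma makes this land in the next field */
	};
	struct chan_spec_sketch fixed = { BIT_RAW | BIT_SCALE, 0 };

	printf("buggy: separate=%#x shared_by_type=%#x\n",
	       buggy.info_mask_separate, buggy.info_mask_shared_by_type);
	printf("fixed: separate=%#x shared_by_type=%#x\n",
	       fixed.info_mask_separate, fixed.info_mask_shared_by_type);
	return 0;
}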
info_mask_shared_by_type is actually a better choice than the originally
intended info_mask_separate for the ..._SCALE bit since a constant is
returned from mpl3115_read_raw for the scale. Using
info_mask_shared_by_type also preserves the behavior from before the
regression and is therefore less likely to cause other interesting side
effects.
The above mentioned regression causes an unintended sysfs attribute to
show up that is not backed by code, in turn causing the following NULL
pointer dereference to happen on access.
Segmentation fault
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ecc3c000
[00000000] *pgd=87f91831
Internal error: Oops: 80000007 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 1051 Comm: cat Not tainted 4.10.0-rc5-00009-gffd8858-dirty #3
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
task: ed54ec00 task.stack: ee2bc000
PC is at 0x0
LR is at iio_read_channel_info_avail+0x40/0x280
pc : [<00000000>] lr : [<c06fbc1c>] psr: a0070013
sp : ee2bdda8 ip : 00000000 fp : ee2bddf4
r10: c0a53c74 r9 : ed79f000 r8 : ee8d1018
r7 : 00001000 r6 : 00000fff r5 : ee8b9a00 r4 : ed79f000
r3 : ee2bddc4 r2 : ee2bddbc r1 : c0a86dcc r0 : ee8d1000
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 3cc3c04a DAC: 00000051
Process cat (pid: 1051, stack limit = 0xee2bc210)
Stack: (0xee2bdda8 to 0xee2be000)
dda0: ee2bddc0 00000002 c016d720 c016d394 ed54ec00 00000000
ddc0: 60070013 ed413780 00000001 edffd480 ee8b9a00 00000fff 00001000 ee8d1018
dde0: ed79f000 c0a53c74 ee2bde0c ee2bddf8 c0513c58 c06fbbe8 edffd480 edffd540
de00: ee2bde3c ee2bde10 c0293474 c0513c40 c02933e4 ee2bde60 00000001 ed413780
de20: 00000001 ed413780 00000000 edffd480 ee2bde4c ee2bde40 c0291d00 c02933f0
de40: ee2bde9c ee2bde50 c024679c c0291ce0 edffd4b0 b6e37000 00020000 ee2bdf78
de60: 00000000 00000000 ed54ec00 ed013200 00000817 c0a111fc edffd540 ed413780
de80: b6e37000 00020000 00020000 ee2bdf78 ee2bded4 ee2bdea0 c0292890 c0246604
dea0: c0117940 c016ba50 00000025 c0a111fc b6e37000 ed413780 ee2bdf78 00020000
dec0: ee2bc000 b6e37000 ee2bdf44 ee2bded8 c021d158 c0292770 c0117764 b6e36004
dee0: c0f0d7c4 ee2bdfb0 b6f89228 00021008 ee2bdfac ee2bdf00 c0101374 c0117770
df00: 00000000 00000000 ee2bc000 00000000 ee2bdf34 ee2bdf20 c016ba04 c0171080
df20: 00000000 00020000 ed413780 b6e37000 00000000 ee2bdf78 ee2bdf74 ee2bdf48
df40: c021e7a0 c021d130 c023e300 c023e280 ee2bdf74 00000000 00000000 ed413780
df60: ed413780 00020000 ee2bdfa4 ee2bdf78 c021e870 c021e71c 00000000 00000000
df80: 00020000 00020000 b6e37000 00000003 c0108084 00000000 00000000 ee2bdfa8
dfa0: c0107ee0 c021e838 00020000 00020000 00000003 b6e37000 00020000 0001a2b4
dfc0: 00020000 00020000 b6e37000 00000003 7fffe000 00000000 00000000 00020000
dfe0: 00000000 be98eb4c 0000c740 b6f1985c 60070010 00000003 00000000 00000000
Backtrace:
[<c06fbbdc>] (iio_read_channel_info_avail) from [<c0513c58>] (dev_attr_show+0x24/0x50)
r10:c0a53c74 r9:ed79f000 r8:ee8d1018 r7:00001000 r6:00000fff r5:ee8b9a00
r4:edffd480
[<c0513c34>] (dev_attr_show) from [<c0293474>] (sysfs_kf_seq_show+0x90/0x110)
r5:edffd540 r4:edffd480
[<c02933e4>] (sysfs_kf_seq_show) from [<c0291d00>] (kernfs_seq_show+0x2c/0x30)
r10:edffd480 r9:00000000 r8:ed413780 r7:00000001 r6:ed413780 r5:00000001
r4:ee2bde60 r3:c02933e4
[<c0291cd4>] (kernfs_seq_show) from [<c024679c>] (seq_read+0x1a4/0x4e0)
[<c02465f8>] (seq_read) from [<c0292890>] (kernfs_fop_read+0x12c/0x1cc)
r10:ee2bdf78 r9:00020000 r8:00020000 r7:b6e37000 r6:ed413780 r5:edffd540
r4:c0a111fc
[<c0292764>] (kernfs_fop_read) from [<c021d158>] (__vfs_read+0x34/0x118)
r10:b6e37000 r9:ee2bc000 r8:00020000 r7:ee2bdf78 r6:ed413780 r5:b6e37000
r4:c0a111fc
[<c021d124>] (__vfs_read) from [<c021e7a0>] (vfs_read+0x90/0x11c)
r8:ee2bdf78 r7:00000000 r6:b6e37000 r5:ed413780 r4:00020000
[<c021e710>] (vfs_read) from [<c021e870>] (SyS_read+0x44/0x90)
r8:00020000 r7:ed413780 r6:ed413780 r5:00000000 r4:00000000
[<c021e82c>] (SyS_read) from [<c0107ee0>] (ret_fast_syscall+0x0/0x1c)
r10:00000000 r8:c0108084 r7:00000003 r6:b6e37000 r5:00020000 r4:00020000
Code: bad PC value
---[ end trace 9c4938ccd0389004 ]---
Fixes: cc26ad455f ("iio: Add Freescale MPL3115A2 pressure / temperature sensor driver")
Fixes: 5123960007 ("iio:core: add a callback to allow drivers to provide _available attributes")
Reported-by: Ken Lin <ken.lin@advantech.com>
Tested-by: Ken Lin <ken.lin@advantech.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a6e1d56a0 upstream.
Fixes a regression triggered by a change in the layout of
struct iio_chan_spec, but the real bug is in the driver which assumed
a specific structure layout in the first place. Hint: the three bits were
not OR'ed together as implied by the indentation prior to this patch;
there was a comma between the first two, which accidentally moved the
..._SCALE and ..._OFFSET bits to the next structure field. That field
was .info_mask_shared_by_type before the _available attributes were added
by commit 5123960007 ("iio:core: add a callback to allow drivers to
provide _available attributes") and .info_mask_separate_available
afterwards, and the regression happened.
info_mask_shared_by_type is actually a better choice than the originally
intended info_mask_separate for the ..._SCALE and ..._OFFSET bits since
a constant is returned from mpl115_read_raw for the scale/offset. Using
info_mask_shared_by_type also preserves the behavior from before the
regression and is therefore less likely to cause other interesting side
effects.
The above mentioned regression causes unintended sysfs attributes to
show up that are not backed by code, in turn causing a NULL pointer
dereference to happen on access.
Fixes: 3017d90e89 ("iio: Add Freescale MPL115A2 pressure / temperature sensor driver")
Fixes: 5123960007 ("iio:core: add a callback to allow drivers to provide _available attributes")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bdbf3b071 upstream.
The IRQFD framework calls the architecture-dependent function
twice if the corresponding GSI type is edge triggered. For ARM,
the function kvm_set_msi() gets called twice whenever the
IRQFD receives the event signal. The rest of the code path
tries to inject the MSI without any validation checks. There is no
need to call vgic_its_inject_msi() a second time; skipping it avoids
unnecessary overhead in the IRQ queue logic and also avoids the
possibility of the VM seeing the MSI twice.
Simple fix: return -1 if the argument 'level' value is zero.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d0928f18b upstream.
Since it was introduced in commit da8d02d19f ("arm64/capabilities:
Make use of system wide safe value"), __raw_read_system_reg() has
erroneously mapped some sysreg IDs to other registers.
For the fields in ID_ISAR5_EL1, our local feature detection will be
erroneous. We may spuriously detect that a feature is uniformly
supported, or may fail to detect when it actually is, meaning some
compat hwcaps may be erroneous (or not enforced upon hotplug).
This patch corrects the erroneous entries.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: da8d02d19f ("arm64/capabilities: Make use of system wide safe value")
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adbe7e26f4 upstream.
When bypassing SWIOTLB on small-memory systems, we need to avoid calling
into swiotlb_dma_mapping_error() in exactly the same way as we avoid
swiotlb_dma_supported(), because the former also relies on SWIOTLB state
being initialised.
Under the assumptions for which we skip SWIOTLB, dma_map_{single,page}()
will only ever return the DMA-offset-adjusted physical address of the
page passed in, thus we can report success unconditionally.
Fixes: b67a8b29df ("arm64: mm: only initialize swiotlb when necessary")
CC: Jisheng Zhang <jszhang@marvell.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f36ebaf21 upstream.
When we fault in a page, we flush it to the PoC (Point of Coherency)
if the faulting vcpu has its own caches off, so that it can observe
the page we just brought in.
But if the vcpu has its caches on, we skip that step. Bad things
happen when *another* vcpu tries to access that page with its own
caches disabled. At that point, there is no guarantee that the
data has made it to the PoC, and we access stale data.
The obvious fix is to always flush to PoC when a page is faulted
in, no matter what the state of the vcpu is.
Fixes: 2d58b733c8 ("arm64: KVM: force cache clean on page fault when caches are off")
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58ab9a088d upstream.
Kirill reported a warning from UBSAN about undefined behavior when using
protection keys. He is running on hardware that actually has support for
it, which is not widely available.
The warning triggers because of very large shifts of integers when doing a
pkey_free() of a large, invalid value. This happens because we never check
that the pkey "fits" into the mm_pkey_allocation_map().
I do not believe there is any danger here of anything bad happening
other than some aliasing issues where somebody could do:
pkey_free(35);
and the kernel would effectively execute:
pkey_free(8);
While this might be confusing to an app that was doing something stupid, it
has to do something stupid and the effects are limited to the app shooting
itself in the foot.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kselftest@vger.kernel.org
Cc: shuah@kernel.org
Cc: kirill.shutemov@linux.intel.com
Link: http://lkml.kernel.org/r/20170223222603.A022ED65@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e38bea99a upstream.
fuse_file_put() was missing the "force" flag for the RELEASE request when
sending synchronously (fuseblk).
If this flag is not set, then a sync request may be interrupted before it
is dequeued by the userspace filesystem. In this case the OPEN won't be
balanced with a RELEASE.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 5a18ec176c ("fuse: fix hang of single threaded fuseblk filesystem")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c68bb0f62 upstream.
Running with KASAN and crypto tests currently gives
BUG: KASAN: global-out-of-bounds in __test_aead+0x9d9/0x2200 at addr ffffffff8212fca0
Read of size 16 by task cryptomgr_test/1107
Address belongs to variable 0xffffffff8212fca0
CPU: 0 PID: 1107 Comm: cryptomgr_test Not tainted 4.10.0+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
Call Trace:
dump_stack+0x63/0x8a
kasan_report.part.1+0x4a7/0x4e0
? __test_aead+0x9d9/0x2200
? crypto_ccm_init_crypt+0x218/0x3c0 [ccm]
kasan_report+0x20/0x30
check_memory_region+0x13c/0x1a0
memcpy+0x23/0x50
__test_aead+0x9d9/0x2200
? kasan_unpoison_shadow+0x35/0x50
? alg_test_akcipher+0xf0/0xf0
? crypto_skcipher_init_tfm+0x2e3/0x310
? crypto_spawn_tfm2+0x37/0x60
? crypto_ccm_init_tfm+0xa9/0xd0 [ccm]
? crypto_aead_init_tfm+0x7b/0x90
? crypto_alloc_tfm+0xc4/0x190
test_aead+0x28/0xc0
alg_test_aead+0x54/0xd0
alg_test+0x1eb/0x3d0
? alg_find_test+0x90/0x90
? __sched_text_start+0x8/0x8
? __wake_up_common+0x70/0xb0
cryptomgr_test+0x4d/0x60
kthread+0x173/0x1c0
? crypto_acomp_scomp_free_ctx+0x60/0x60
? kthread_create_on_node+0xa0/0xa0
ret_from_fork+0x2c/0x40
Memory state around the buggy address:
ffffffff8212fb80: 00 00 00 00 01 fa fa fa fa fa fa fa 00 00 00 00
ffffffff8212fc00: 00 01 fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
>ffffffff8212fc80: fa fa fa fa 00 05 fa fa fa fa fa fa 00 00 00 00
^
ffffffff8212fd00: 01 fa fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
ffffffff8212fd80: fa fa fa fa 00 00 00 00 00 05 fa fa fa fa fa fa
This always happens on the same IV which is less than 16 bytes.
Per Ard,
"CCM IVs are 16 bytes, but due to the way they are constructed
internally, the final couple of bytes of input IV are dont-cares.
Apparently, we do read all 16 bytes, which triggers the KASAN errors."
Fix this by padding the IV with null bytes to be at least 16 bytes.
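A minimal sketch of that null padding in standalone C (hypothetical helper,
not the testmgr code):

#include <stdio.h>
#include <string.h>

#define MAX_IVLEN 16

/* Copy a short IV into a zeroed 16-byte buffer, so reads of the
 * "don't care" tail bytes see defined null padding. */
static void load_iv(unsigned char dst[MAX_IVLEN],
		    const unsigned char *iv, size_t ivlen)
{
	memset(dst, 0, MAX_IVLEN);
	memcpy(dst, iv, ivlen < MAX_IVLEN ? ivlen : MAX_IVLEN);
}

int main(void)
{
	const unsigned char short_iv[13] = "0123456789ab";	/* 13-byte CCM IV */
	unsigned char iv[MAX_IVLEN];

	load_iv(iv, short_iv, sizeof(short_iv));
	for (size_t i = 0; i < MAX_IVLEN; i++)
		printf("%02x ", iv[i]);
	printf("\n");
	return 0;
}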
Fixes: 0bc5a6c5c7 ("crypto: testmgr - Disable rfc4309 test and convert test vectors")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0bb03924f upstream.
DoS protection conditions were altered in WS2016 and now it's easy to get
-EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
netvsc device in a loop). None of the vmbus_post_msg() callers retry the
operation, and we usually end up with a non-functional device or a crash.
While the host's DoS protection conditions are unknown to me, my tests show
that it can take up to 10 seconds before the message is sent, so doing udelay()
is not an option; we really need to sleep. Almost all vmbus_post_msg()
callers are ready to sleep but there is one special case:
vmbus_initiate_unload() which can be called from interrupt/NMI context and
we can't sleep there. I'm also not sure about the lonely
vmbus_send_tl_connect_request() which has no in-tree users but its external
users are most likely waiting for the host to reply so sleeping there is
also appropriate.
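The sleeping retry loop implied above can be sketched as follows (a standalone
illustration with an invented post_msg_once() stub, not the vmbus code):

#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <time.h>

static int attempts_left = 3;		/* stand-in for host-side throttling */

static int post_msg_once(void)
{
	return --attempts_left > 0 ? -EAGAIN : 0;
}

static int post_msg_with_retry(int max_tries)
{
	struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */

	for (int i = 0; i < max_tries; i++) {
		int ret = post_msg_once();

		if (ret != -EAGAIN)
			return ret;
		nanosleep(&delay, NULL);	/* sleep, don't busy-wait */
		if (delay.tv_nsec < 100000000)
			delay.tv_nsec *= 2;	/* simple backoff */
	}
	return -EAGAIN;
}

int main(void)
{
	printf("post_msg: %d\n", post_msg_with_retry(10));
	return 0;
}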
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a7275a3d8 upstream.
eb5767122f ("PCI: altera: Simplify TLB_CFG_DW0 usage") used
TLP_FMTTYPE_CFGRD* (instead of TLP_FMTTYPE_CFGWR*) for TLP writes, which
causes writing to configuration space to fail. Fix it by using correct
FMTTYPE for write operation.
Fixes: eb5767122f ("PCI: altera: Simplify TLB_CFG_DW0 usage")
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49f4b08e61 upstream.
pnv_php_disable_irq() can be called in two paths: the bailing path in
pnv_php_enable_irq() or when releasing the slot. The MSI (or MSIx) interrupts
are disabled unconditionally in pnv_php_disable_irq(). This is wrong
because they might have been enabled by drivers other than pnv-php.
This patch disables MSI (or MSIx) interrupts and the PCI device only if
they were enabled by pnv-php. In the error path of pnv_php_enable_irq(),
we rely on the newly added parameter @disable_device. In the path
of releasing the slot, @pnv_php->irq is checked.
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60e2e2fbaf upstream.
The devfn of 00:02.0 is 0x10. devfn_to_wslot(0x10) == 0x2, and
wslot_to_devfn(0x2) should be 0x10, while it's 0x2 in the current code.
Due to this, hv_eject_device_work() -> pci_get_domain_bus_and_slot()
returns NULL and pci_stop_and_remove_bus_device() is not called.
Later when the real device driver's .remove() is invoked by
hv_pci_remove() -> pci_stop_root_bus(), some warnings can be noticed
because the VM has lost the access to the underlying device at that
time.
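For reference, Linux encodes devfn as (device << 3) | function, so 00:02.0 has
devfn 0x10 and device number 0x2. A simplified round-trip check in standalone
C (the two helpers below are illustrative assumptions, not the actual hv
wslot encoding, which packs both device and function):

#include <assert.h>
#include <stdio.h>

#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)

/* Simplified: treat the wslot as the device number only. */
static unsigned int devfn_to_wslot(unsigned int devfn)
{
	return PCI_SLOT(devfn);
}

/* The fix's invariant: converting back must rebuild a devfn (0x10 for
 * 00:02.0), not hand the wslot value (0x2) straight back. */
static unsigned int wslot_to_devfn(unsigned int wslot)
{
	return PCI_DEVFN(wslot, 0);
}

int main(void)
{
	unsigned int devfn = PCI_DEVFN(2, 0);	/* 00:02.0 */

	assert(devfn == 0x10);
	assert(devfn_to_wslot(devfn) == 0x2);
	assert(wslot_to_devfn(devfn_to_wslot(devfn)) == 0x10);
	printf("devfn 0x%02x round-trips correctly\n", devfn);
	return 0;
}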
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
CC: K. Y. Srinivasan <kys@microsoft.com>
CC: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9f1e32600 upstream.
This patch fixes the OTP register definitions for the AR934x and AR9550
WMAC SoC.
Previously, the ath9k driver was unable to initialize the integrated
WMAC on an Aerohive AP121:
| ath: phy0: timeout (1000 us) on reg 0x30018: 0xbadc0ffe & 0x00000007 != 0x00000004
| ath: phy0: timeout (1000 us) on reg 0x30018: 0xbadc0ffe & 0x00000007 != 0x00000004
| ath: phy0: Unable to initialize hardware; initialization status: -5
| ath9k ar934x_wmac: failed to initialize device
| ath9k: probe of ar934x_wmac failed with error -5
It turns out that the AR9300_OTP_STATUS and AR9300_OTP_DATA
definitions contain a typo.
Cc: Gabor Juhos <juhosg@openwrt.org>
Fixes: add295a4af ("ath9k: use correct OTP register offsets for AR9550")
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Chris Blake <chrisrblake93@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a5e969bb2 upstream.
The code currently relies on refcounting to disable IRQs from within the
IRQ handler and re-enable them after the tasklet has run.
However, due to race conditions sometimes the IRQ handler might be
called twice, or the tasklet may not run at all (if interrupted in the
middle of a reset).
This can cause nasty imbalances in the irq-disable refcount which will
get the driver permanently stuck until the entire radio has been stopped
and started again (ath_reset will not recover from this).
Instead of using this fragile logic, change the code to ensure that
running the irq handler during tasklet processing is safe, and leave the
refcount untouched.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb4281528b upstream.
Rx filter reset and the dynamic tx switch mode (EXT_RESOURCE_CFG)
configuration are causing the following errors when UTF firmware
is loaded to the target.
Error message 1:
[ 598.015629] ath10k_pci 0001:01:00.0: failed to ping firmware: -110
[ 598.020828] ath10k_pci 0001:01:00.0: failed to reset rx filter: -110
[ 598.141556] ath10k_pci 0001:01:00.0: failed to start core (testmode): -110
Error message 2:
[ 668.615839] ath10k_ahb a000000.wifi: failed to send ext resource cfg command : -95
[ 668.618902] ath10k_ahb a000000.wifi: failed to start core (testmode): -95
Avoiding these configurations while bringing up the target in
testmode solves the problem.
Signed-off-by: Tamizh chelvam <c_traja@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb97fbbcac upstream.
Parallel reads from multiple threads on a file descriptor
are not well defined and racy. It is safer to return to original
behavior and simply fail the additional read.
The solution is to remove request for next read credit.
Fixes: ff1586a7ea ("mei: enqueue consecutive reads")
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4753d8a24d upstream.
If the file system requires journal recovery, and the device is
read-only, return EROFS to the mount system call. This allows xfstests
generic/050 to pass.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97abd7d4b5 upstream.
If the journal is aborted, the needs_recovery feature flag should not
be removed. Otherwise, the journal might not get replayed and
this could lead to more data getting lost.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb5efbcb76 upstream.
The write_end() function must always unlock the page and drop its ref
count, even on an error.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd01b690f8 upstream.
In the case where the child's encryption context was inconsistent with
its parent directory, we were using inode->i_sb and inode->i_ino after
the inode had already been iput(). Fix this by doing the iput() in the
correct places.
Note: only ext4 had this bug, not f2fs and ubifs.
Fixes: d9cdc90331 ("ext4 crypto: enforce context consistency")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b136499e9 upstream.
ext4_journalled_write_end() did not properly handle all the cases when
generic_perform_write() did not copy all the data into the target page
and could mark buffers with uninitialized contents as uptodate and dirty
leading to possible data corruption (which would be quickly fixed by
generic_perform_write() retrying the write but still). Fix the problem
by carefully handling the case when the page that is written to is not
uptodate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd648b8a8f upstream.
If filesystem groups are artificially small (using parameter -g to
mkfs.ext4), ext4_mb_normalize_request() can result in a request that is
larger than a block group. Trim the request size to not confuse
allocation code.
Reported-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03e916fa8b upstream.
Inside the ext4_ext_shift_extents() function, ext4_find_extent() is called
without the EXT4_EX_NOCACHE flag, which should prevent cache population.
This leads to outdated offsets in the extents tree and wrong blocks
afterwards.
The patch fixes the problem by providing the EXT4_EX_NOCACHE flag for each
ext4_find_extent() call inside the ext4_ext_shift_extents() function.
Fixes: 331573febb
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a9b8cba62 upstream.
While doing 'insert range', the start block should also be shifted right.
The bug can be easily reproduced by the following test:
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	void *ptr;
	ssize_t rc;
	int fd, i, block;

	ptr = malloc(4096);
	assert(ptr);
	fd = open("./ext4.file", O_CREAT | O_TRUNC | O_RDWR, 0600);
	assert(fd >= 0);
	rc = fallocate(fd, 0, 0, 8192);
	assert(rc == 0);
	for (i = 0; i < 2048; i++)
		*((unsigned short *)ptr + i) = 0xbeef;
	rc = pwrite(fd, ptr, 4096, 0);
	assert(rc == 4096);
	rc = pwrite(fd, ptr, 4096, 4096);
	assert(rc == 4096);
	for (block = 2; block < 1000; block++) {
		rc = fallocate(fd, FALLOC_FL_INSERT_RANGE, 4096, 4096);
		assert(rc == 0);
		for (i = 0; i < 2048; i++)
			*((unsigned short *)ptr + i) = block;
		rc = pwrite(fd, ptr, 4096, 4096);
		assert(rc == 4096);
	}
	return 0;
}
Because the start block is not included in the range, the hole appears at
the wrong offset (just after the desired offset) and the following
pwrite() overwrites an already existing block, keeping the hole untouched.
A simple way to verify the wrong behaviour is to check for zeroed blocks
after the test:
$ hexdump ./ext4.file | grep '0000 0000'
The root cause of the bug is a wrong range (start, stop], where start
should be inclusive, i.e. [start, stop].
This patch fixes the problem by including start in the range. But, in order
not to break left shift (range collapse), stop points to the beginning
of a block, not to the end.
The other non-obvious change is the iterator validity check in the
main loop. Because the iterator is unsigned, the following corner case
should be considered with care: inserting a block at offset 0, where the
stop variable overflows and never becomes less than start, which is 0.
To handle this special case the iterator is set to NULL to indicate that
the end of the loop is reached.
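That corner case can be seen in a few lines of standalone C (a generic
sketch, not the ext4 code): with an unsigned iterator, a condition such as
'it >= start' can never become false when start is 0, so an explicit NULL
sentinel is used to end the loop instead:

#include <stdio.h>

int main(void)
{
	unsigned int start = 0, it = 3;
	const char *cursor = "keep-going";	/* non-NULL means more work */

	while (cursor) {
		printf("visit block %u\n", it);
		if (it == start)
			cursor = NULL;	/* explicit end-of-loop signal, no wrap-around */
		else
			it--;		/* would wrap to UINT_MAX if decremented past 0 */
	}
	return 0;
}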
Fixes: 331573febb
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e02898b423 upstream.
loop_reread_partitions() needs to do I/O, but we just froze the queue,
so we end up waiting forever. This can easily be reproduced with losetup
-P. Fix it by moving the reread to after we unfreeze the queue.
Fixes: ecdd09597a ("block/loop: fix race between I/O and set_status")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e112666b49 upstream.
If the journal has been aborted, we shouldn't mark the underlying
buffer head as dirty, since that will cause the metadata block to get
modified. And if the journal has been aborted, we shouldn't allow
this since it will almost certainly lead to a corrupted file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 907565337e upstream.
Userspace applications should be allowed to expect the membarrier system
call with MEMBARRIER_CMD_SHARED command to issue memory barriers on
nohz_full CPUs, but synchronize_sched() does not take those into
account.
Given that we do not want unrelated processes to be able to affect
real-time sensitive nohz_full CPUs, simply return ENOSYS when membarrier
is invoked on a kernel with enabled nohz_full CPUs.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b0408745e upstream.
LPDDR memories can only handle up to 400 uncontrolled power-offs. Ensure the
proper power-off sequence is used before shutting down the platform.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 857de6e007 upstream.
The device handler needs to check if a given queue belongs to a scsi
device; only then does it make sense to attach a device handler.
[mkp: dropped flags]
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c421530bf8 upstream.
The driver currently checks SELF_TEST_FAILED first and
KERNEL_PANIC next. Under error conditions (boot code failure) both
SELF_TEST_FAILED and KERNEL_PANIC can be set at the same time.
The driver has the capability to reset the controller on a KERNEL_PANIC,
but not on SELF_TEST_FAILED.
This is fixed by first checking KERNEL_PANIC and then the others.
Fixes: e8b12f0fb8 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40630f4628 upstream.
On I/O errors, the Windows driver doesn't set data_transfer_length
on error conditions other than SRB_STATUS_DATA_OVERRUN.
In these cases we need to set data_transfer_length to 0,
indicating there is no data transferred. On SRB_STATUS_DATA_OVERRUN,
data_transfer_length is set by the Windows driver to the actual data transferred.
Reported-by: Shiva Krishna <Shiva.Krishna@nimblestorage.com>
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bba5dc332e upstream.
When sense message is present on error, we should pass along to the upper
layer to decide how to deal with the error.
This patch fixes connectivity issues with Fiber Channel devices.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d36a19541f upstream.
The lvm2 sequence to manage dm-raid constructor flags that trigger a
rebuild or a reshape is defined as:
1) load table with flags (e.g. rebuild/delta_disks/data_offset)
2) clear out the flags in lvm2 metadata
3) store the lvm2 metadata, reload the table to reset the flags
previously established during the initial load (1) -- in order to
prevent repeatedly requesting a rebuild or a reshape on activation
Currently, loading an inactive table with rebuild/reshape flags
specified will cause dm-raid to rebuild/reshape on resume and thus start
updating the raid metadata (about the progress). When the second table
reload (to reset the flags) occurs, the constructor accesses the volatile
progress state kept in the raid superblocks. Because the active mapping
is still processing the rebuild/reshape, that position will be stale by
the time the device is resumed.
In the reshape case, this causes data corruption by processing already
reshaped stripes again. In the rebuild case, it does _not_ cause data
corruption but instead involves superfluous rebuilds.
Fix this by keeping the raid set frozen during the first resume and then
allowing the rebuild/reshape during the second resume.
Fixes: 9dbd1aa3a ("dm raid: add reshaping support to the target")
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37a098e9d1 upstream.
The sloppy nature of lockless access to percpu pointers
(s->current_path) in rr_select_path(), from multiple threads, is
causing some paths to be used more than others -- which results in lower
observed IO performance.
Revert these upstream commits to restore truly symmetric round-robin
IO submission in DM multipath:
b0b477c dm round robin: use percpu 'repeat_count' and 'current_path'
802934b dm round robin: do not use this_cpu_ptr() without having preemption disabled
There is no benefit to all this complexity if repeat_count = 1 (which is
the recommended default).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca763d0a53 upstream.
A rounding bug, due to a compiler-generated temporary being 32 bits wide, was
found in remap_to_cache(). A localized cast in remap_to_cache() would fix the
corruption, but this preferred fix (changing from uint32_t to sector_t)
eliminates the potential for future rounding errors elsewhere.
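As an illustration of this class of bug (plain userspace C, not the dm-cache
code): when both operands of a sector calculation are 32-bit, the product is
computed in 32 bits and wraps before being widened.

    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t sector_t;

    int main(void)
    {
            uint32_t block = 5000000;          /* cache block index */
            uint32_t sectors_per_block = 1024;

            /* Both operands are 32-bit: the multiply overflows before the
             * result is widened to sector_t. */
            sector_t wrong = block * sectors_per_block;

            /* Promoting one operand (or the field itself) to sector_t
             * keeps the arithmetic 64-bit. */
            sector_t right = (sector_t)block * sectors_per_block;

            printf("truncated: %llu  correct: %llu\n",
                   (unsigned long long)wrong, (unsigned long long)right);
            return 0;
    }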
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30582c25a4 upstream.
Until now, the trans_stat information of a passive devfreq device was not
updated. This patch updates the trans_stat information after setting the
target frequency of the passive devfreq device.
Fixes: 996133119f ("PM / devfreq: Add new passive governor")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcf23c79c4 upstream.
A devfreq device using the passive governor is not able to change its
governor, so the user cannot change it through the 'available_governor' sysfs
entry. Likewise, a devfreq device which doesn't use the passive governor is
not able to switch to the 'passive' governor on the fly.
Fixes: 996133119f ("PM / devfreq: Add new passive governor")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc15ed663e upstream.
On failure to return a pathname from ima_d_path(), a pointer to
dname is returned, which is subsequently used in the IMA measurement
list, the IMA audit records, and other audit logging. Saving the
pointer to dname for later use has the potential to race with rename.
Instead of returning a pointer to dname on failure, this patch returns
a pointer to a copy of the filename.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95e91b831f upstream.
The issue is described here, with a nice testcase:
https://bugzilla.kernel.org/show_bug.cgi?id=192931
The problem is that shmat() calls do_mmap_pgoff() with MAP_FIXED, and
the address rounded down to 0. For the regular mmap case, the
protection mentioned above is that the kernel gets to generate the
address -- arch_get_unmapped_area() will always check for MAP_FIXED and
return that address. So by the time we do security_mmap_addr(0) things
get funky for shmat().
The testcase itself shows that while a regular user crashes, root will
not have a problem attaching a nil-page. There are two possible fixes
to this. The first, which this patch does, is to simply allow root
to crash as well -- this is also regular mmap behavior, i.e. when hacking
up the testcase and adding mmap(... | MAP_FIXED). While this approach
is the safer option, the second alternative is to ignore SHM_RND if the
rounded address is 0, thus only passing the MAP_SHARED flag. This makes the
behavior of shmat() identical to the mmap() case. The downside is
obviously user visible, but it does make sense in that it maintains
semantics after the round-down wrt the 0 address and mmap.
Passes shm related ltp tests.
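A hedged illustration of the attach pattern in question (not the bugzilla/ltp
testcase itself): with SHM_RND, a low attach address rounds down to 0 and is
then subject to the same checks as a MAP_FIXED mmap at address 0.

    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
            int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);

            /* An address below SHMLBA rounds down to 0 with SHM_RND; the
             * patch makes this behave like a MAP_FIXED mmap at address 0
             * instead of special-casing shmat(). */
            void *p = shmat(id, (void *)0x10, SHM_RND);

            if (p == (void *)-1)
                    perror("shmat");
            else
                    printf("attached at %p\n", p);

            shmctl(id, IPC_RMID, NULL);
            return 0;
    }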
Link: http://lkml.kernel.org/r/1486050195-18629-1-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Gareth Evans <gareth.evans@contextis.co.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71ab6cfe88 upstream.
get_scan_count() considers the whole node LRU size when
- doing SCAN_FILE due to many page cache inactive pages
- calculating the number of pages to scan
In both cases this might lead to unexpected behavior especially on 32b
systems where we can expect lowmem memory pressure very often.
A large highmem zone can easily distort the SCAN_FILE heuristic because
there might be only a few file pages from the eligible zones on the node
lru and we would still enforce file lru scanning, which can lead to
thrashing while we could still scan anonymous pages.
The latter use of lruvec_lru_size can be problematic as well, especially
when there are not many pages from the eligible zones. We would have to
skip over many pages to find anything to reclaim but shrink_node_memcg
would only reduce the remaining number to scan by SWAP_CLUSTER_MAX at
maximum. Therefore we can end up going over a large LRU many times
without actually having a chance to reclaim much, if anything at all. The
closer we are to running out of memory on the lowmem zone, the worse the
problem will be.
Fix this by filtering out all the ineligible zones when calculating the
lru size for both paths and consider only sc->reclaim_idx zones.
The patch would need to be tweaked a bit to apply to 4.10 and older but
I will do that as soon as it hits the Linus tree in the next merge
window.
Link: http://lkml.kernel.org/r/20170117103702.28542-3-mhocko@kernel.org
Fixes: b2e18757f2 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: Trevor Cordes <trevor@tecnopolis.ca>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd8416c477 upstream.
With rw_page, page_endio is used for completing IO on a page and it
propagates write errors to the address space if the IO fails. The
problem is that it accesses page->mapping directly, which might be okay for
file-backed pages but not for anonymous pages. Otherwise, it
can corrupt one of the fields of anon_vma under us and the system panics
randomly.
swap_writepage
bdev_writepage
ops->rw_page
I encountered the BUG while developing a new zram feature and it was
really hard to figure out because it caused random crashes: sometimes an
mmap_sem lockdep report, sometimes crashes in places never related to
zram/zsmalloc, and it was not reproducible with some configurations.
Considering how subtle the bug is and that people do fast-swap tests with
brd, I think it's worth adding a stable mark.
Fixes: dd6bd0d9c7 ("swap: use bdev_read_page() / bdev_write_page()")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e02dc017c3 upstream.
When node_reclaim_mode isn't 0, the page allocator tries to reclaim
pages if the amount of free memory in the zones is below the low
watermark. On the Power platform, none of the NUMA nodes is scanned for page
reclaim because no node matches the condition in zone_allows_reclaim().
On Power, RECLAIM_DISTANCE is set to 10, which is the distance
of Node-A to Node-A. So even the preferred node won't be scanned for
page reclaim.
__alloc_pages_nodemask()
get_page_from_freelist()
zone_allows_reclaim()
Anton proposed the test code as below:
# cat alloc.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	void *p;
	unsigned long size;
	unsigned long start, end;

	start = time(NULL);
	size = strtoul(argv[1], NULL, 0);
	printf("To allocate %luGB memory\n", size);
	size <<= 30;
	p = malloc(size);
	assert(p);
	memset(p, 0, size);
	end = time(NULL);
	printf("Used time: %lu seconds\n", end - start);
	sleep(3600);
	return 0;
}
The system I use for testing has two NUMA nodes, both with 128GB of
memory. In the scenario below, the page cache on node#0 should be reclaimed
when the node comes under pressure to accommodate an allocation request.
# echo 2 > /proc/sys/vm/zone_reclaim_mode; \
sync; \
echo 3 > /proc/sys/vm/drop_caches; \
# taskset -c 0 cat file.32G > /dev/null; \
grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33619712 kB
# taskset -c 0 ./alloc 128
# grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33619840 kB
# grep MemFree /sys/devices/system/node/node0/meminfo
Node 0 MemFree: 186816 kB
With the patch applied, the pagecache on node-0 is reclaimed when its
free memory is running out. It's the expected behaviour.
# echo 2 > /proc/sys/vm/zone_reclaim_mode; \
sync; \
echo 3 > /proc/sys/vm/drop_caches
# taskset -c 0 cat file.32G > /dev/null; \
grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33605568 kB
# taskset -c 0 ./alloc 128
# grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 1379520 kB
# grep MemFree /sys/devices/system/node/node0/meminfo
Node 0 MemFree: 317120 kB
Fixes: 5f7a75acdb ("mm: page_alloc: do not cache reclaim distances")
Link: http://lkml.kernel.org/r/1486532455-29613-1-git-send-email-gwshan@linux.vnet.ibm.com
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c25702cee upstream.
Currently we call copy_page_to_iter() for uncached reading into a pipe.
This is wrong because it treats pages as VFS cache pages and copies references
rather than actual data. When we are trying to read from the pipe we end up
calling page_cache_pipe_buf_confirm() which returns -ENODATA. This error
is translated into 0, which is returned to the user.
This issue is reproduced by running the xfstests suite (generic test #249)
against mount points with "cache=none". Fix it by mapping pages manually
and calling copy_to_iter(), which copies data into the pipe.
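A rough sketch of that approach, assuming a simple page array (helper name
and layout are illustrative, not the cifs code):

    #include <linux/kernel.h>
    #include <linux/mm.h>
    #include <linux/highmem.h>
    #include <linux/uio.h>

    static size_t copy_pages_to_iter(struct page **pages, unsigned int npages,
                                     size_t len, struct iov_iter *to)
    {
            size_t copied = 0;
            unsigned int i;

            for (i = 0; i < npages && len; i++) {
                    size_t n = min_t(size_t, len, PAGE_SIZE);
                    void *kaddr = kmap(pages[i]);
                    /* Copy the bytes themselves instead of handing page
                     * references to the pipe via copy_page_to_iter(). */
                    size_t done = copy_to_iter(kaddr, n, to);

                    kunmap(pages[i]);
                    copied += done;
                    len -= n;
                    if (done < n)
                            break;
            }
            return copied;
    }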
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e42a46b6f5 upstream.
It is allowed to call regulator_get with a NULL dev argument
(_regulator_get explicitly checks for it) but this causes an error later
when printing /sys/kernel/debug/regulator_summary.
Fix this by explicitly handling "deviceless" consumers in the debugfs code.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4474f4c40a upstream.
The STM is automatically enabled when an application sets the policy
via the ->link() callback by using coresight_enable(), which keeps the
refcount of the current users of the STM. However, the unlink() callback
issues stm_disable() directly, which leaves the STM turned off without
the coresight layer knowing about it. This prevents any further use
of the STM hardware, as the coresight layer still thinks the STM is
turned on and doesn't enable the hardware when required. Even manually
enabling the STM via sysfs can't really enable the hw.
e.g,
$ echo 1 > $CS_DEVS/$ETR/enable_sink
$ mkdir -p $CONFIG_FS/stp-policy/$source.0/stm_test/
$ echo 32768 65535 > $CONFIG_FS/stp-policy/$source.0/stm_test/channels
$ echo 64 > $CS_DEVS/$source/traceid
$ ./stm_app
Sending 64000 byte blocks of pattern 0 at 0us intervals
Success to map channel(32768~32783) to 0xffffa95fa000
Sending on channel 32768
$ dd if=/dev/$ETR of=~/trace.bin.1
597+1 records in
597+1 records out
305920 bytes (306 kB) copied, 0.399952 s, 765 kB/s
$ ./stm_app
Sending 64000 byte blocks of pattern 0 at 0us intervals
Success to map channel(32768~32783) to 0xffff7e9e2000
Sending on channel 32768
$ dd if=/dev/$ETR of=~/trace.bin.2
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.0232083 s, 0.0 kB/s
Note that we don't get any data from the ETR for the second session.
Also dmesg shows :
[ 77.520458] coresight-tmc 20800000.etr: TMC-ETR enabled
[ 77.537097] coresight-replicator etr_replicator@20890000: REPLICATOR enabled
[ 77.558828] coresight-replicator main_replicator@208a0000: REPLICATOR enabled
[ 77.581068] coresight-funnel 208c0000.main_funnel: FUNNEL inport 0 enabled
[ 77.602217] coresight-tmc 20840000.etf: TMC-ETF enabled
[ 77.618422] coresight-stm 20860000.stm: STM tracing enabled
[ 139.554252] coresight-stm 20860000.stm: STM tracing disabled
# End of first tracing session
[ 146.351135] coresight-tmc 20800000.etr: TMC read start
[ 146.514486] coresight-tmc 20800000.etr: TMC read end
# Note that the STM is not turned on via stm_generic_link()->coresight_enable()
# and hence none of the components are turned on.
[ 152.479080] coresight-tmc 20800000.etr: TMC read start
[ 152.542632] coresight-tmc 20800000.etr: TMC read end
This patch fixes the problem by balancing the unlink operation using
coresight_disable(), keeping the coresight layer in sync with the
hardware state and thus allowing normal usage of the STM component.
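A minimal sketch of the balanced unlink, assuming the usual
stm_drvdata/stm_data embedding in the driver (a sketch of the idea, not
necessarily the exact driver code):

    static void stm_generic_unlink(struct stm_data *stm_data,
                                   unsigned int master, unsigned int channel)
    {
            struct stm_drvdata *drvdata = container_of(stm_data,
                                                       struct stm_drvdata, stm);

            if (!drvdata || !drvdata->csdev)
                    return;

            /* Tear down through the coresight core so its refcount and
             * enable state stay in sync with the hardware. */
            coresight_disable(drvdata->csdev);
    }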
Fixes: commit 237483aa5c ("coresight: stm: adding driver for CoreSight STM component")
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Reported-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e01700602 upstream.
gcc-7 detects that wlanhdr_to_ethhdr() in two drivers calls memcpy() with
a destination argument that an earlier function call may have set to NULL:
staging/rtl8188eu/core/rtw_recv.c: In function 'wlanhdr_to_ethhdr':
staging/rtl8188eu/core/rtw_recv.c:1318:2: warning: argument 1 null where non-null expected [-Wnonnull]
staging/rtl8712/rtl871x_recv.c: In function 'r8712_wlanhdr_to_ethhdr':
staging/rtl8712/rtl871x_recv.c:649:2: warning: argument 1 null where non-null expected [-Wnonnull]
I'm fixing this by adding a NULL pointer check and returning failure
from the function, which is hopefully already handled properly.
This seems to date back to when the drivers were originally added,
so backporting the fix to stable seems appropriate. There are other
related realtek drivers in the kernel, but none of them contain a
function with a similar name or produce this warning.
Fixes: 1cc18a22b9 ("staging: r8188eu: Add files for new driver - part 5")
Fixes: 2865d42c78 ("staging: r8712u: Add the new driver to the mainline kernel")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc7ffefdcc upstream.
This is unbreaking another of those "stealth" janitor
patches that got in and subtly broke some things.
sv_cpt_data is a pointer to a pointer, so we need to
dereference it twice to allocate the correct structure size.
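A plain C illustration of the pitfall (hypothetical names, not the lustre
code): for a pointer to a pointer, sizeof(*p) is only the size of a pointer,
so the pointed-to structure must be sized with sizeof(**p).

    #include <stdlib.h>

    struct cpt_data { long vals[32]; };

    int main(void)
    {
            struct cpt_data **slot = malloc(sizeof(*slot)); /* one pointer slot */

            /* Wrong: sizeof(*slot) is sizeof(struct cpt_data *).      */
            /* *slot = malloc(sizeof(*slot));                          */

            /* Right: dereference twice to size the structure itself.  */
            *slot = malloc(sizeof(**slot));

            free(*slot);
            free(slot);
            return 0;
    }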
Fixes: 9899cb68c6 ("Staging: lustre: rpc: Use sizeof type *pointer instead of sizeof type.")
CC: Sandhya Bankar <bankarsandhya512@gmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33b8807a6f upstream.
The loopback driver allows the user to set a minimum delay of up to one
second to be inserted between test iterations (i.e. request
submissions). The delay is currently specified in microseconds and is
implemented using udelay.
Busy looping for long periods is not just anti-social; udelay must not
be used for delays longer than a few milliseconds due to the risk of
integer overflow.
Replace the broken udelay with a usleep_range with a 100 us range for
short delays (< 20 ms) and otherwise revert to using msleep.
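A hedged sketch of that delay selection (the helper name is illustrative,
not the loopback driver's actual function):

    #include <linux/delay.h>

    static void loopback_sleep(unsigned int us)
    {
            if (!us)
                    return;

            if (us < 20000)
                    /* Short delay: sleep with a 100 us slack window. */
                    usleep_range(us, us + 100);
            else
                    /* 20 ms and longer: msleep is good enough. */
                    msleep(us / 1000);
    }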
Fixes: b36f04fa94 ("greybus: loopback: Convert thread delay to microseconds")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82dbe987b7 upstream.
If sensor attributes were never read, the pwm control data has not been
initialized, which can cause wrong driver behavior. Ensure that cached
data is current before acting on it.
Reported-by: Kevin Folz <kfolz@evertz.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c7b8ca1ae upstream.
In IT8620E, after setting pwm control to manual, it was observed that
pwm values for fan 4..6 have reversed results (writing 0 results in fans
running at full speed, writing 255 results in fans turned off).
With the new PWM control, pwm polarity for pwm control 4..6 is specified
in its pwm control registers. Those registers are overwritten when setting
the pwm mode or the temperature mapping. Do not touch bit 2..6 of pwm
control registers on register writes to fix the problem.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29693efcea upstream.
On this machine, the micmute button is connected to Line2 of the
codec and the micmute led is connected to GPIO2 of the codec.
After applying this quirk, both hotkey and led work well.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3ac9f7376 upstream.
The sequencer FIFO management has a bug that may lead to a corruption
(shortage) of the cell linked list. When a sequencer client faces an
error at the event delivery, it tries to put back the dequeued cell.
When the first cell was put back, the tail pointer tracking was
forgotten, and the linked list gets screwed up.
Although there is no memory corruption, the sequencer client may stall
forever at exit while flushing the pending FIFO cells in
snd_seq_pool_done(), as spotted by syzkaller.
This patch addresses the missing tail pointer tracking at
snd_seq_fifo_cell_putback(). Also the patch makes sure to clear the
cell->next pointer at snd_seq_fifo_event_in() to avoid a similar
mess-up of the FIFO linked list.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15c75b09f8 upstream.
Currently the ctxfi driver tries to set only the 64bit DMA mask on 64bit
architectures, and bails out if it fails. This causes a problem on
some platforms since 64bit DMA isn't always guaranteed. We should
fall back to the default 32bit DMA when 64bit DMA fails.
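A minimal sketch of that fallback, using the generic DMA mask helpers (not
necessarily the exact calls the driver ends up making):

    #include <linux/dma-mapping.h>
    #include <linux/pci.h>

    static int ct_set_dma_mask(struct pci_dev *pci)
    {
            /* Prefer 64-bit DMA, but fall back to 32-bit if the platform
             * cannot guarantee it. */
            if (!dma_set_mask_and_coherent(&pci->dev, DMA_BIT_MASK(64)))
                    return 0;
            return dma_set_mask_and_coherent(&pci->dev, DMA_BIT_MASK(32));
    }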
Fixes: 6d74b86d3c ("ALSA: ctxfi - Allow 64bit DMA")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71321eb3f2 upstream.
When a user sets too small a tick value with a fine-grained timer like
hrtimer, the kernel tries to fire the timer irq too frequently.
This may lead to heavy lock contention and eventually to a kernel spinlock
lockup with warnings.
To avoid such a situation, we define a lower limit for the
resolution, namely 1ms. When the user passes a tick value small enough
to result in less than that, the kernel now returns -EINVAL.
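A hedged sketch of the kind of check this adds; the constant and helper
names are illustrative, not the actual ALSA timer code:

    #include <linux/errno.h>

    /* Reject tick values that would fire more often than once per 1 ms. */
    #define TIMER_MIN_RESOLUTION_NS  1000000UL

    static int check_user_ticks(unsigned long ticks, unsigned long resolution_ns)
    {
            if (!ticks || ticks * resolution_ns < TIMER_MIN_RESOLUTION_NS)
                    return -EINVAL;
            return 0;
    }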
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7480b34ad upstream.
As with Sunrise Point, the total number of Lewisburg's
input and output streams exceeds 15 (GCAP is 0x9701), which will
cause some streams to not work because of the overflow of the
SDxCTL.STRM field if the legacy stream tag allocation method is used.
Fixes: 5cf92c8b3d ("ALSA: hda - Add Intel Lewisburg device IDs Audio")
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f1bc2c4c5 upstream.
The issue is the same as in "dd9aa335c880 ALSA: hda/realtek - Can't adjust
speaker's volume on a Dell AIO": the output needs to be connected to a node
with Amp-out capability.
Applying the same fixup, "ALC298_FIXUP_SPK_VOLUME", fixes the issue.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef8d02d4a2 upstream.
Enable DMA on usart3 to get a more reliable console. This is especially
useful for automation and kernelci where a kernel with PROVE_LOCKING enabled
is quite susceptible to character loss, resulting in test failures.
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 379f831a92 upstream.
Commit a92e7c3d82 ("spi: s3c64xx: consider the case when the CS
line is not connected") introduced an inconsistency between the
binding, where the disconnected CS line was marked as
'no-cs-readback', and the driver.
The driver was erroneously checking for that attribute using the
property name 'broken-cs'.
Check for 'no-cs-readback' in the driver as well.
Fixes: a92e7c3d82 ("spi: s3c64xx: consider the case when the CS line is not connected")
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98d85f3cb9 upstream.
When the functions replaced media entity types, the range which was
allowed for the types was incorrect. This meant that media entity types
for specific devices were not passed correctly to the userspace through
MEDIA_IOC_ENUM_ENTITIES. Fix it.
Fixes: commit b2cd27448b ("[media] media-device: map new functions into old types for legacy API")
Reported-and-tested-by: Antti Laakso <antti.laakso@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ffb94b6cc upstream.
Setting GPIOs during probe causes a null pointer dereference when
GPIOLIB is not selected by Kconfig. Initialize the driver private
field before calling set gpios.
This is a regression since 4.9.
Fixes: 07fdf7d9f1 ("[media] cxd2820r: add I2C driver bindings")
Reported-by: Chris Rankin <rankincj@gmail.com>
Tested-by: Chris Rankin <rankincj@gmail.com>
Tested-by: Håkan Lennestål <hakan.lennestal@gmail.com>
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ebf75774f upstream.
In vpfe_s_fmt(), when the sensor format and the requested format were
the same, bpp was assigned to vpfe->bpp without being initialized first.
Grab the bpp value that is currently used by using __vpfe_get_format()
instead of its wrapper, vpfe_try_fmt().
This use of an uninitialized variable was found by compiling the
kernel with clang.
Fixes: 417d2e507e ("[media] media: platform: add VPFE capture driver support for AM437X")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb9bc4689b upstream.
get_frame_info() calculates the offset of the return address within a
stack frame simply by dividing the bottom 16 bits of the instruction,
treated as a signed integer, by the size of a long. Whilst this works
for MIPS32 & MIPS64 ISAs where the sw or sd instructions are used, it's
incorrect for microMIPS where encodings differ. The result is that we
typically completely fail to unwind the stack on microMIPS.
Fix this by adjusting is_ra_save_ins() to calculate the return address
offset, and take into account the various different encodings there in
the same place as we consider whether an instruction is storing the
ra/$31 register.
With this we are now able to unwind the stack for kernels targeting the
microMIPS ISA; for example we can produce:
Call Trace:
[<80109e1f>] show_stack+0x63/0x7c
[<8011ea17>] __warn+0x9b/0xac
[<8011ea45>] warn_slowpath_fmt+0x1d/0x20
[<8013fe53>] register_console+0x43/0x314
[<8067c58d>] of_setup_earlycon+0x1dd/0x1ec
[<8067f63f>] early_init_dt_scan_chosen_stdout+0xe7/0xf8
[<8066c115>] do_early_param+0x75/0xac
[<801302f9>] parse_args+0x1dd/0x308
[<8066c459>] parse_early_options+0x25/0x28
[<8066c48b>] parse_early_param+0x2f/0x38
[<8066e8cf>] setup_arch+0x113/0x488
[<8066c4f3>] start_kernel+0x57/0x328
---[ end trace 0000000000000000 ]---
Whereas previously we only produced:
Call Trace:
[<80109e1f>] show_stack+0x63/0x7c
---[ end trace 0000000000000000 ]---
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14532/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6c7a324df upstream.
get_frame_info() is meant to iterate over up to the first 128
instructions within a function, but for microMIPS kernels it will not
reach that many instructions unless the function is 512 bytes long since
we calculate the maximum number of instructions to check by dividing the
function length by the 4 byte size of a union mips_instruction. In
microMIPS kernels this won't do since instructions are variable length.
Fix this by instead checking whether the pointer to the current
instruction has reached the end of the function, and use max_insns as a
simple constant to check the number of iterations against.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14530/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3552dace7 upstream.
During stack unwinding we call a number of functions to determine what
type of instruction we're looking at. The union mips_instruction pointer
provided to them may be pointing at a 2 byte, but not 4 byte, aligned
address & we thus cannot directly access the 4 byte wide members of the
union mips_instruction. To avoid this is_ra_save_ins() copies the
required half-words of the microMIPS instruction to a correctly aligned
union mips_instruction on the stack, which it can then access safely.
The is_jump_ins() & is_sp_move_ins() functions do not correctly perform
this temporary copy, and instead attempt to directly dereference 4 byte
fields which may be misaligned and lead to an address exception.
Fix this by copying the instruction halfwords to a temporary union
mips_instruction in get_frame_info() such that we can provide a 4 byte
aligned union mips_instruction to the is_*_ins() functions and they do
not need to deal with misalignment themselves.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14529/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccaf7caf2c upstream.
get_frame_info() can be called in microMIPS kernels with the ISA bit
already clear. For example this happens when unwind_stack_by_address()
is called because we begin with a PC that has the ISA bit set & subtract
the (odd) offset from the preceding symbol (which does not have the ISA
bit set). Since get_frame_info() unconditionally subtracts 1 from the PC
in microMIPS kernels it incorrectly misaligns the address it then
attempts to access code at, leading to an address error exception.
Fix this by using msk_isa16_mode() to clear the ISA bit, which allows
get_frame_info() to function regardless of whether it is provided with a
PC that has the ISA bit set or not.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14528/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 774f0c6419 upstream.
Disabling ethernet during reboot (only to enable it again when the
ethernet driver attaches) can put the chip into a faulty state where it
corrupts the header of all incoming packets.
This happens if packets arrive during the time window where the core is
disabled, and it can be easily reproduced by rebooting while sending a
flood ping to the broadcast address.
Fixes: 95135bfa7e ("MIPS: Lantiq: Deactivate most of the devices by default")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: John Crispin <john@phrozen.org>
Cc: hauke.mehrtens@lantiq.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15078/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 884b426917 upstream.
If copy_from_user is called with a large buffer (>= 128 bytes) and the
userspace buffer refers partially to unreadable memory, then it is
possible for Octeon's copy_from_user to report the wrong number of bytes
have been copied. In the case where the buffer size is an exact multiple
of 128 and the fault occurs in the last 64 bytes, copy_from_user will
report that all the bytes were copied successfully but leave some
garbage in the destination buffer.
The bug is in the main __copy_user_common loop in octeon-memcpy.S where
in the middle of the loop, src and dst are incremented by 128 bytes. The
l_exc_copy fault handler is used after this but that assumes that
"src < THREAD_BUADDR($28)". This is not the case if src has already been
incremented.
Fix by adding an extra fault handler which rewinds the src and dst
pointers 128 bytes before falling through to l_exc_copy.
Thanks to the pwritev test from the strace test suite for originally
highlighting this bug!
Fixes: 5b3b16880f ("MIPS: Add Cavium OCTEON processor support ...")
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14978/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 66fd848cad upstream.
For certain arguments such as saddr = 0xc0a8fd60, daddr = 0xc0a8fda1,
len = 80, proto = 17, sum = 0x7eae049d there will be a carry when
folding the intermediate 64 bit checksum to 32 bit but the code doesn't
add the carry back to the one's complement sum, thus an incorrect result
will be generated.
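A plain C illustration of the fold in question (userspace C rather than the
MIPS assembly in the kernel): the carry produced when the two 32-bit halves
are added must be folded back into the one's complement sum.

    #include <stdint.h>
    #include <stdio.h>

    /* Fold a 64-bit intermediate checksum to 32 bits, keeping the carry. */
    static uint32_t fold64(uint64_t sum)
    {
            sum = (sum & 0xffffffffULL) + (sum >> 32);
            sum = (sum & 0xffffffffULL) + (sum >> 32);  /* fold a new carry */
            return (uint32_t)sum;
    }

    int main(void)
    {
            uint64_t sum = 0xfffffff0fffffff0ULL;   /* halves that carry */

            printf("carry folded:        0x%08x\n", fold64(sum));
            printf("carry dropped (bug): 0x%08x\n",
                   (uint32_t)((sum & 0xffffffffULL) + (sum >> 32)));
            return 0;
    }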
Reported-by: Mark Zhang <bomb.zhang@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5072053b0 upstream.
This further refines the changes made to conntrack gc_worker in
commit e0df8cae6c ("netfilter: conntrack: refine gc worker heuristics").
The main idea of that change was to reduce the scan interval when evictions
take place.
However, on the reporters' setup, there are 1-2 million conntrack entries
in total and roughly 8k new (and closing) connections per second.
In this case we'll always evict at least one entry per gc cycle and scan
interval is always at 1 jiffy because of this test:
} else if (expired_count) {
gc_work->next_gc_run /= 2U;
next_run = msecs_to_jiffies(1);
being true almost all the time.
Given we scan ~10k entries per run it's clearly wrong to reduce the interval
based on a nonzero eviction count; it will only waste cpu cycles since the
vast majority of conntracks are not timed out.
Thus only look at the ratio (scanned entries vs. evicted entries) to make
a decision on whether to reduce or not.
Because the evictor is supposed to kick in only when the system turns idle
after a busy period, pick a high ratio -- this makes it 50%. We thus keep
the idea of increasing the scan rate when it's likely that the table contains
many expired entries.
In order to not let timed-out entries hang around for too long
(important when using event logging, in which case we want to timely
destroy events), we now scan the full table within at most
GC_MAX_SCAN_JIFFIES (16 seconds) even in worst-case scenario where all
timed-out entries sit in same slot.
I tested this with a vm under synflood (with
sysctl net.netfilter.nf_conntrack_tcp_timeout_syn_recv=3).
While flood is ongoing, interval now stays at its max rate
(GC_MAX_SCAN_JIFFIES / GC_MAX_BUCKETS_DIV -> 125ms).
With feedback from Nicolas Dichtel.
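A hedged sketch of the interval decision described above (simplified;
constant and helper names are illustrative, not gc_worker() verbatim):

    #include <linux/jiffies.h>

    #define GC_EVICT_RATIO  50      /* percent */

    static unsigned long pick_next_run(unsigned int scanned, unsigned int expired,
                                       unsigned long next_gc_run, unsigned long max)
    {
            unsigned int ratio = scanned ? expired * 100u / scanned : 0;

            /* Only rescan quickly when at least half of what we looked at
             * was actually expired, i.e. a busy period is winding down. */
            if (ratio > GC_EVICT_RATIO)
                    return msecs_to_jiffies(1);

            /* Otherwise keep the slower pace, bounded so the whole table
             * is still covered within the maximum scan interval. */
            return next_gc_run < max ? next_gc_run : max;
    }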
Reported-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Fixes: b87a2f9199 ("netfilter: conntrack: add gc worker to remove timed-out entries")
Signed-off-by: Florian Westphal <fw@strlen.de>
Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d641df819d upstream.
add_to_page_cache_lru() can fail, so the actual number of pages to read
can be smaller than the initial size of the osd request. We need to
update the osd request size in that case.
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae2f5e5ed0 upstream.
Fix the following build error with binutils 2.25.
CC arch/mips/mm/sc-ip22.o
{standard input}: Assembler messages:
{standard input}:132: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:159: Error: number (0x9000000080000000) larger than 32 bits
{standard input}:200: Error: number (0x9000000080000000) larger than 32 bits
scripts/Makefile.build:293: recipe for target 'arch/mips/mm/sc-ip22.o' failed
make[1]: *** [arch/mips/mm/sc-ip22.o] Error 1
MIPS has used .set mips3 to temporarily switch the assembler to 64 bit
mode in 64 bit kernels virtually forever. Binutils 2.25 broke this
behaviour partially by happily accepting 64 bit instructions in .set mips3
mode but puking on 64 bit constants when generating 32 bit ELF. Binutils
2.26 restored the old behaviour again.
Fix build with binutils 2.25 by open coding the offending
dli $1, 0x9000000080000000
as
li $1, 0x9000
dsll $1, $1, 48
which is ugly but the only thing that will build on all binutils vintages.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fda2d27db6 upstream.
We will set LPCR with the correct value for radix during init. This makes sure
we start with a sanitized value of LPCR. In the case of kexec, cpus can have an
LPCR value based on the previous translation mode we were running.
Fixes: fe036a0605 ("powerpc/64/kexec: Fix MMU cleanup on radix")
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c21a493a2b upstream.
Currently xmon data-breakpoint feature is broken.
Whenever a watchpoint match occurs, hw_breakpoint_handler will
be called by do_break via the notifier chain mechanism. If the watchpoint is
registered by xmon, hw_breakpoint_handler won't find any associated
perf_event and returns immediately with NOTIFY_STOP. Similarly, do_break
also returns without notifying to xmon.
Solve this by returning NOTIFY_DONE when hw_breakpoint_handler does not
find any perf_event associated with matched watchpoint, rather than
NOTIFY_STOP, which tells the core code to continue calling the other
breakpoint handlers including the xmon one.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16f906d66c upstream.
The MAX_SEND_SGES check introduced in commit 655fec6987
("xprtrdma: Use gathered Send for large inline messages") fails
for devices that have a small max_sge.
Instead of checking for a large fixed maximum number of SGEs,
check for a minimum small number. RPC-over-RDMA will switch to
using a Read chunk if an xdr_buf has more pages than can fit in
the device's max_sge limit. This is considerably better than
failing altogether to mount the server.
This fix supports devices that have as few as three send SGEs
available.
Reported-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reported-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reported-by: Honggang Li <honli@redhat.com>
Reported-by: Ram Amrani <Ram.Amrani@cavium.com>
Fixes: 655fec6987 ("xprtrdma: Use gathered Send for large ...")
Tested-by: Honggang Li <honli@redhat.com>
Tested-by: Ram Amrani <Ram.Amrani@cavium.com>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c95a3c6b88 upstream.
Commit d5440e27d3 ("xprtrdma: Enable pad optimization") made the
Linux client omit XDR round-up padding in normal Read and Write
chunks so that the client doesn't have to register and invalidate
3-byte memory regions that contain no real data.
Unfortunately, my cheery 2014 assessment that this optimization "is
supported now by both Linux and Solaris servers" was premature.
We've found bugs in Solaris in this area since commit d5440e27d3
("xprtrdma: Enable pad optimization") was merged (SYMLINK is the
main offender).
So for maximum interoperability, I'm disabling this optimization
again. If a CM private message is exchanged when connecting, the
client recognizes that the server is Linux, and enables the
optimization for that connection.
Until now the Solaris server bugs did not impact common operations,
and were thus largely benign. Soon, less capable devices on Linux
NFS/RDMA clients will make use of Read chunks more often, and these
Solaris bugs will prevent interoperation in more cases.
Fixes: 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5f0afbea4 upstream.
Pad optimization is changed by echoing into
/proc/sys/sunrpc/rdma_pad_optimize. This is a global setting,
affecting all RPC-over-RDMA connections to all servers.
The marshaling code picks up that value and uses it for decisions
about how to construct each RPC-over-RDMA frame. Having it change
suddenly in mid-operation can result in unexpected failures. And
some servers a client mounts might need chunk round-up, while
others don't.
So instead, copy the pad_optimize setting into each connection's
rpcrdma_ia when the transport is created, and use the copy, which
can't change during the life of the connection, instead.
This also removes a hack: rpcrdma_convert_iovs was using
the remote-invalidation-expected flag to predict when it could leave
out Write chunk padding. This is because the Linux server handles
implicit XDR padding on Write chunks correctly, and only Linux
servers can set the connection's remote-invalidation-expected flag.
It's more sensible to use the pad optimization setting instead.
Fixes: 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24abdf1be1 upstream.
When pad optimization is disabled, rpcrdma_convert_iovs still
does not add explicit XDR round-up padding to a Read chunk.
Commit 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
incorrectly short-circuited the test for whether round-up padding
is needed that appears later in rpcrdma_convert_iovs.
However, if this is indeed a regular Read chunk (and not a
Position-Zero Read chunk), the tail iovec _always_ contains the
chunk's padding, and never anything else.
So, it's easy to just skip the tail when padding optimization is
enabled, and add the tail in a subsequent Read chunk segment, if
disabled.
Fixes: 677eb17e94 ("xprtrdma: Fix XDR tail buffer marshalling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adee40b265 upstream.
Commit 3d8cc00073 ("dmaengine: ipu: Consolidate duplicated irq handlers")
consolidated the two interrupt routines into one, but the remaining
interrupt routine only checks the status of the error interrupts, not the
normal interrupts.
This patch fixes that problem (tested on i.MX31 PDK board).
Fixes: 3d8cc00073 ("dmaengine: ipu: Consolidate duplicated irq handlers")
Cc: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 656441478e upstream.
The commit 7a65417216 ("mtd/ifc: Add support for IFC controller
version 2.0") added support for version 2.0 of the IFC controller.
The version 2.0 controller has the ECC status registers at a different
location to the previous versions.
Correct the fsl_ifc_nand structure so that the ECC status can be read
from the correct location for both version 1.0 and 2.0 of the controller.
Fixes: 7a65417216 ("mtd/ifc: Add support for IFC controller version 2.0")
Signed-off-by: Mark Marshall <mark.marshall@omicronenergy.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03a9e24ef2 upstream.
Recently I received a bug report that on a Linux v3.0 based kernel, hot adding
a disk to a md linear device causes a kernel crash in linear_congested(). From
the crash image analysis, I find that in linear_congested(), mddev->raid_disks
contains the value N, but conf->disks[] only has N-1 pointers available. Then
a NULL pointer dereference crashes the kernel.
There is a race between linear_add() and linear_congested(); the RCU stuff
used in these two functions cannot avoid the race. Since Linux v4.0 the
RCU code has been replaced by introducing mddev_suspend(). After checking the
upstream code, it seems linear_congested() is not called in the
generic_make_request() code path, so mddev_suspend() cannot prevent it
from being called. The possible race still exists.
Here is how the race still exists in the current code. On a machine with
many CPUs, on one CPU linear_add() is called to add a hard disk to a
md linear device; at the same time, on another CPU, linear_congested() is
called to detect whether this md linear device is congested before issuing
an I/O request to it.
A possible code execution time sequence demonstrates how the race happens:
seq    linear_add()                   linear_congested()
 0                                    conf = mddev->private
 1     oldconf = mddev->private
 2     mddev->raid_disks++
 3                                    for (i = 0; i < mddev->raid_disks; i++)
 4                                    bdev_get_queue(conf->disks[i].rdev->bdev)
 5     mddev->private = newconf
In linear_add(), mddev->raid_disks is increased at time seq 2, and on
another CPU, in linear_congested(), the for-loop iterates over conf->disks[i]
using the increased mddev->raid_disks at time seq 3,4. But conf has not yet
been updated with one more element (a pointer to struct dev_info) in
conf->disks[], so accessing its structure member at time seq 4 will cause a
NULL pointer dereference fault.
To fix this race, the patch makes two modifications:
1) Add 'int raid_disks' to struct linear_conf, as a copy of
mddev->raid_disks. It is initialized in linear_conf(), always staying
consistent with the number of pointers in 'struct dev_info disks[]'. When
iterating over conf->disks[] in linear_congested(), use conf->raid_disks
instead of mddev->raid_disks in the for-loop; then the NULL pointer
dereference will not happen again.
2) The RCU stuff is back again, and kfree_rcu() is used in linear_add() to
free the oldconf memory. Because oldconf may still be referenced as
mddev->private in linear_congested(), kfree_rcu() makes sure that its memory
will not be released until no one uses it any more.
Some code comments are also added in this patch, to make this modification
easier to understand.
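A hedged sketch of part 1) above (simplified, not the md code verbatim):
bound the loop by the conf's own raid_disks copy so it always matches the
disks[] array that was actually published.

    static int linear_congested(struct mddev *mddev, int bits)
    {
            struct linear_conf *conf;
            int i, ret = 0;

            rcu_read_lock();
            conf = rcu_dereference(mddev->private);

            /* conf->raid_disks is a snapshot taken when this conf was
             * built, so it can never exceed the size of conf->disks[]. */
            for (i = 0; i < conf->raid_disks && !ret; i++) {
                    struct request_queue *q =
                            bdev_get_queue(conf->disks[i].rdev->bdev);

                    ret |= bdi_congested(&q->backing_dev_info, bits);
            }

            rcu_read_unlock();
            return ret;
    }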
This patch can be applied to kernels since v4.0, after commit
3be260cc18 ("md/linear: remove rcu protections in favour of
suspend/resume"). But this bug was reported on a Linux v3.0 based kernel;
people who maintain kernels before Linux v4.0 will need to do some back
porting of this patch.
Changelog:
- V3: add 'int raid_disks' in struct linear_conf, and use kfree_rcu() to
replace rcu_call() in linear_add().
- v2: add RCU stuffs by suggestion from Shaohua and Neil.
- v1: initial effort.
Signed-off-by: Coly Li <colyli@suse.de>
Cc: Shaohua Li <shli@fb.com>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb61bb82cb upstream.
The RTC is clocked from either an internal, imprecise, oscillator or an
external one, which is usually much more accurate.
The difference perceived between the time elapsed and the time reported by
the RTC is on the order of 10%, which prevents the RTC from being useful at all.
Fortunately, the external oscillator is reported to be mandatory in the
Allwinner datasheet, so we can just switch to it.
Fixes: 9765d2d943 ("rtc: sun6i: Add sun6i RTC driver")
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8b107f5b97 upstream.
If segs_per_sec is over 1, as under SMR, f2fs previously issued discard
commands redundantly on the same section, since we didn't move the end
position past the previous discard command.
E.g.,
start end
| |
prefree_bitmap = [01111100111100]
And, after issue discard for this section,
end start
| |
prefree_bitmap = [01111100111100]
Select this section again by searching from (end + 1),
start end
| |
prefree_bitmap = [01111100111100]
Fixes: 36abef4e79 ("f2fs: introduce mode=lfs mount option")
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e93b986525 upstream.
For foreground gc, the greedy algorithm should be adopted, which makes
this formula work well:
(2 * (100 / config.overprovision + 1) + 6)
But currently, fg_gc gives priority to selecting bg_gc victim segments to gc
first. These victims are selected by the cost-benefit algorithm, so we can't
guarantee such segments have few valid blocks, which may break the f2fs rule
and, in the worst case, consume all the free segments.
This patch fixes this by adding a filter in check_bg_victims: if a segment
has a number of valid blocks over the overprovision ratio, skip it.
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 88c5c13a50 upstream.
It turns out a stackable filesystem like sdcardfs in AOSP can trigger
multiple vfs_create() calls to the lower filesystem. In that case, f2fs will
add multiple dentries having the same name, which breaks filesystem
consistency.
Until the upper layer fixes this, let's work around it in f2fs, which
actually shows not much performance regression.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed92d8c137 upstream.
We're not taking into account the space needed for the (variable
length) attr bitmap, with the result that we'd sometimes get a spurious
ERANGE when the ACL data got close to the end of a page.
Just add in an extra page to make sure.
Signed-off-by: Weston Andros Adamson <dros@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6682c14bbe upstream.
Bitmap and attrlen follow immediately after the op reply header. This
was an oversight from commit bf118a342f.
The consequences of this are just a minor efficiency loss (extra calls to
xdr_shrink_bufhead).
Fixes: bf118a342f "NFSv4: include bitmap in nfsv4 get acl data"
Reviewed-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df3ab232e4 upstream.
If we see that our pNFS READ/WRITE/COMMIT operation failed, but we
also see that our layout segment is no longer valid, then we need to
get a new layout segment before retrying.
Fixes: 90816d1dda ("NFSv4.1/flexfiles: Don't mark the entire deviceid...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d8cacbf56 upstream.
Copy offload code needs to be hooked into the code for handling
NFS4ERR_BAD_STATEID by ensuring that we set the "stateid" field
in struct nfs4_exception.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 2e72448b07 ("NFS: Add COPY nfs operation")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a974deee47 upstream.
If we exit because the file access check failed, we currently
leak the struct nfs4_state. We need to attach it to the
open context before returning.
Fixes: 3efb972247 ("NFSv4: Refactor _nfs4_open_and_get_state..")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 783112f740 upstream.
Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributes to set various file attributes, including the
file size and the uid/gid.
The Linux syscalls never mix size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates don't update random other attributes, and many other file
systems handle the case but do not update the other attributes in the
same transaction. NFSD on the other hand passes the attributes it gets
on the wire more or less directly through to the VFS, leading to updates
the file systems don't expect. XFS at least has an assert on the
allowed attributes, which caught an unusual NFS client setting the size
and group at the same time.
To handle this issue properly, this patch splits the notify_change call in
nfsd_setattr into two separate ones.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9797484ba8 upstream.
Commit 050c3d52cc ("vme: make core
vme support explicitly non-modular") dropped the remove function
because it appeared as if it was for removal of the bus, which is
not supported.
However, vme_bus_remove() is called when a VME device is removed
from the bus and not when the bus is removed; as it calls the VME
device driver's cleanup function. Without this function, the
remove() in the VME device driver is never called and VME device
drivers cannot be reloaded again.
Here we restore the remove function that was deleted in that
commit, and the reference to the function in the bus structure.
Fixes: 050c3d52cc ("vme: make core vme support explicitly non-modular")
Cc: Manohar Vanga <manohar.vanga@gmail.com>
Acked-by: Martyn Welch <martyn@welchs.me.uk>
Cc: devel@driverdev.osuosl.org
Signed-off-by: Stefano Babic <sbabic@denx.de>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6773386f97 upstream.
Kernels built with CONFIG_KASAN=y report the following BUG for rtl8192cu
and rtl8192c-common:
==================================================================
BUG: KASAN: slab-out-of-bounds in rtl92c_dm_bt_coexist+0x858/0x1e40
[rtl8192c_common] at addr ffff8801c90edb08
Read of size 1 by task kworker/0:1/38
page:ffffea0007243800 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
flags: 0x8000000000004000(head)
page dumped because: kasan: bad access detected
CPU: 0 PID: 38 Comm: kworker/0:1 Not tainted 4.9.7-gentoo #3
Hardware name: Gigabyte Technology Co., Ltd. To be filled by
O.E.M./Z77-DS3H, BIOS F11a 11/13/2013
Workqueue: rtl92c_usb rtl_watchdog_wq_callback [rtlwifi]
0000000000000000 ffffffff829eea33 ffff8801d7f0fa30 ffff8801c90edb08
ffffffff824c0f09 ffff8801d4abee80 0000000000000004 0000000000000297
ffffffffc070b57c ffff8801c7aa7c48 ffff880100000004 ffffffff000003e8
Call Trace:
[<ffffffff829eea33>] ? dump_stack+0x5c/0x79
[<ffffffff824c0f09>] ? kasan_report_error+0x4b9/0x4e0
[<ffffffffc070b57c>] ? _usb_read_sync+0x15c/0x280 [rtl_usb]
[<ffffffff824c0f75>] ? __asan_report_load1_noabort+0x45/0x50
[<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common]
[<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common]
[<ffffffffc06d0cbe>] ? rtl92c_dm_rf_saving+0x96e/0x1330 [rtl8192c_common]
...
The problem is due to rtl8192ce and rtl8192cu sharing routines, and having
different layouts of struct rtl_pci_priv, which is used by rtl8192ce, and
struct rtl_usb_priv, which is used by rtl8192cu. The problem was resolved
by placing the struct bt_coexist_info at the head of each of those private
areas.
Reported-and-tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40b368af4b upstream.
The addresses of WLAN NIC registers are naturally aligned, but some
drivers have bugs. These are evident on platforms that need natural
alignment to access registers. This change contains the following:
1. Function _rtl8821ae_dbi_read() is used to read one byte from DBI,
thus it should use rtl_read_byte().
2. Register 0x4C7 of 8192ee is single byte.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e8b571a9a upstream.
The "fw" firmware object is passed from the remoteproc core and should
not be overwritten, as that results in leaked buffers and a double free
of the last firmware object.
Fixes: 051fb70fd4 ("remoteproc: qcom: Driver for the self-authenticating Hexagon v5")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f38e5fb95a upstream.
We must hold the rcu read lock across looking up glocks and trying to
bump their refcount to prevent the glocks from being freed in between.
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55efcfcd77 upstream.
The RDMA core uses ib_pack() to convert from unpacked CPU structs
to on-the-wire bitpacked structs.
This process requires that 1 bit fields are declared as u8 in the
unpacked struct, otherwise the packing process does not read the
value properly and the packed result is wired to 0. Several
places wrongly used int.
Crucially, this means the kernel has never set reversible correctly
in the path record request. It has always asked for irreversible
paths even if the ULP requests otherwise.
When the kernel is used with a SM that supports this feature, it
completely breaks communication management if reversible paths are
not properly requested.
The only reason this ever worked is because opensm ignores the
reversible bit.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d77044d142 upstream.
VSS may use a char device to support the communication between
the user level daemon and the driver. When the VSS channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new VSS offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20951c7535 upstream.
Fcopy may use a char device to support the communication between
the user level daemon and the driver. When the Fcopy channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new Fcopy offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a66fecbf6 upstream.
KVP may use a char device to support the communication between
the user level daemon and the driver. When the KVP channel is rescinded
we need to make sure that the char device is fully cleaned up before
we can process a new KVP offer from the host. Implement this logic.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccb61f8a99 upstream.
The host can rescind a channel that has been offered to the
guest and once the channel is rescinded, the host does not
respond to any requests on that channel. Deal with the case where
the guest may be blocked waiting for a response from the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7e97dd8b7 upstream.
After the channel is rescinded, the host does not read from the rescinded channel.
Fail writes to a channel that has already been rescinded. If we permit writes on a
rescinded channel, the host will not respond and we will end up unable to unload
vmbus drivers, which must not have any outstanding requests to the host at the
point they are unloaded.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56ef6718a1 upstream.
It may happen that secondary CPUs are still alive and resetting
hv_context.tsc_page will cause a consequent crash in read_hv_clock_tsc()
as we don't check for it being not NULL there. It is safe as we're not
freeing this page anyways.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c7630d350 upstream.
Initializing hv_context.percpu_list in hv_synic_alloc() helps to prevent a
crash in percpu_channel_enq() when not all CPUs were online during
initialization and it naturally belongs there.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 421b8f20d3 upstream.
It may happen that not all CPUs are online when we do hv_synic_alloc() and
in case more CPUs come online later we may try accessing these allocated
structures.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33e4c1a998 upstream.
As the IN request has to be allocated in set_alt() and released in
disable(), we cannot use a mutex to protect it because we cannot sleep
in those functions. Let's replace the mutex with a spinlock.
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa65d11aa0 upstream.
When we unlock our spinlock to copy data to user space, we may get
disabled by the USB host and free the whole list of completed OUT
requests, including the one from which we are copying the data
to user memory.
To prevent this, let's remove our working element from
the list and place it back only if there is something left when we
finish with it.
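As an illustration of the pattern described above, here is a minimal kernel-context sketch (structure and field names such as read_spinlock, completed_out_req and f_hidg_req_list are assumptions for illustration, not the exact f_hidg code):
    /* hedged sketch: unlink the element before dropping the lock for
     * copy_to_user(), re-insert it only if unread data remains */
    static ssize_t hidg_read_sketch(struct f_hidg *hidg, char __user *buffer,
                                    size_t count)
    {
            struct f_hidg_req_list *list;
            struct usb_request *req;
            unsigned long flags;

            spin_lock_irqsave(&hidg->read_spinlock, flags);
            list = list_first_entry(&hidg->completed_out_req,
                                    struct f_hidg_req_list, list);
            list_del(&list->list);  /* a host-side disable can no longer free it */
            spin_unlock_irqrestore(&hidg->read_spinlock, flags);

            req = list->req;
            count = min_t(size_t, count, req->actual - list->pos);
            if (copy_to_user(buffer, req->buf + list->pos, count))
                    return -EFAULT;
            list->pos += count;

            spin_lock_irqsave(&hidg->read_spinlock, flags);
            if (list->pos == req->actual)
                    kfree(list);    /* fully consumed */
            else                    /* data left: put the element back */
                    list_add(&list->list, &hidg->completed_out_req);
            spin_unlock_irqrestore(&hidg->read_spinlock, flags);

            return count;
    }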
Fixes: 99c5150058 ("usb: gadget: hidg: register OUT INT endpoint for SET_REPORT")
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20d2ca955b upstream.
Requests for out endpoint are allocated in bind() function
but never released.
This commit ensures that all pending requests are released
when we disable out endpoint.
Fixes: 99c5150058 ("usb: gadget: hidg: register OUT INT endpoint for SET_REPORT")
Tested-by: David Lechner <david@lechnology.com>
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5528954a1a upstream.
Commit 304f7e5e1d ("usb: gadget: Refactor request completion")
removed the check whether req->req.complete is non-NULL, resulting in a
NULL pointer dereference and a kernel panic.
This patch adds an empty complete function instead of re-introducing
the req->req.complete check.
Fixes: 304f7e5e1d ("usb: gadget: Refactor request completion")
Signed-off-by: Magnus Lilja <lilja.magnus@gmail.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8236800da1 upstream.
Since:
commit 855ed04a37 ("usb: gadget: udc-core: independent registration
of gadgets and gadget drivers")
if we load a gadget module but there is no free UDC available,
then it will be stored on a pending gadgets list.
$ modprobe g_zero.ko
$ modprobe g_ether.ko
[] udc-core: couldn't find an available UDC - added [g_ether] to list
of pending drivers
We scan this list each time a new UDC appears in the system, but a
UDC can also become free after a gadget unbind.
This commit adds scanning of that list directly after unbinding a
gadget from a UDC.
Thanks to this, when we unload the first gadget:
$ rmmod g_zero.ko
the pending gadget is automatically attached to that UDC
(if the name matches).
Fixes: 855ed04a37 ("usb: gadget: udc-core: independent registration of gadgets and gadget drivers")
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5de4e1ea9a upstream.
Commit 4ac53087d6 ("usb: xhci: plat: Create both HCDs before adding
them") moved adding the hcd to the end of probe. This leaves
hcc_params uninitialized at that point, because the xHCI driver sets
hcc_params in xhci_gen_setup(), which is called from usb_add_hcd().
This patch checks the Maximum Primary Stream Array Size
in the hcc_params register after adding the primary hcd.
Signed-off-by: William wu <william.wu@rock-chips.com>
Acked-by: Roger Quadros <rogerq@ti.com>
Fixes: 4ac53087d6 ("usb: xhci: plat: Create both HCDs before adding them")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffb80fc672 upstream.
At least macOS seems to be sending
ClearFeature(ENDPOINT_HALT) to endpoints which
aren't Halted. This makes DWC3's CLEARSTALL command
time out which causes several issues for the driver.
Instead, let's just return 0 and bail out early.
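A hedged sketch of the early bail-out, assuming DWC3_EP_STALL is the flag the driver uses to track a halted endpoint (the real function carries more context):
    /* nothing to clear if the endpoint was never halted, so skip the
     * CLEARSTALL command that such a ClearFeature request would trigger */
    if (!(dep->flags & DWC3_EP_STALL))
            return 0;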
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a994ce2d7e upstream.
The DA8xx driver registers and uses the CPPI 3.0 DMA controller, but
the DA8xx actually has a CPPI 4.1 DMA controller.
Remove the CPPI 3.0 quirk and methods.
Fixes: f8e9f34f80 ("usb: musb: Fix up DMA related macros")
Fixes: 7f6283ed6f ("usb: musb: Set up function pointers for DMA")
Signed-off-by: Alexandre Bailon <abailon@baylibre.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61cd1b4cd1 upstream.
The ds2490 driver was doing USB transfers from / to buffers on the stack.
This is not permitted and made the driver non-working with vmapped stacks.
Since all these transfers are done under the same bus_mutex lock, we can
simply use shared buffers in a device private structure for the two most
common of them.
While we are at it, let's also fix a comparison between int and size_t in
ds9490r_search() which made the driver spin in this function if state
register get requests were failing.
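A sketch of the general pattern of the fix, with a hypothetical private structure and request value (the real ds2490 code differs in detail): the transfer buffer lives in kmalloc'ed memory guarded by bus_mutex instead of on the stack.
    struct ds_device_sketch {
            struct usb_device *udev;
            struct mutex bus_mutex; /* serializes use of the shared buffer */
            u8 *byte_buf;           /* kmalloc'ed at probe time, DMA-safe */
    };

    static int ds_read_byte_sketch(struct ds_device_sketch *dev, u8 *out)
    {
            int err;

            /* wrong: "u8 buf;" on the stack is not DMA-safe with vmapped stacks */
            mutex_lock(&dev->bus_mutex);
            err = usb_control_msg(dev->udev, usb_rcvctrlpipe(dev->udev, 0),
                                  0x01 /* hypothetical request */,
                                  USB_TYPE_VENDOR | USB_DIR_IN,
                                  0, 0, dev->byte_buf, 1, 1000);
            if (err >= 0)
                    *out = *dev->byte_buf;
            mutex_unlock(&dev->bus_mutex);

            return err < 0 ? err : 0;
    }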
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2ce4ea1a0 upstream.
Near the beginning of w1_attach_slave_device() we increment a w1 master
reference count.
Later, when we are going to exit this function without actually attaching
a slave device (due to failure of __w1_attach_slave_device()) we need to
decrement this reference count back.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes: 9fcbbac5de ("w1: process w1 netlink commands in w1_process thread")
Cc: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c42631376 upstream.
The priv->cmd_msg_buffer is allocated in the probe function, but never
kfree()ed. This patch converts the kzalloc() to resource-managed
kzalloc.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c919a3069c upstream.
Fixes: 05ca527000 ("can: gs_usb: add ethtool set_phys_id callback to locate physical device")
The gs_usb driver is performing USB transfers using buffers allocated on
the stack. This causes the driver to not function with vmapped stacks.
Instead, allocate memory for the transfer buffers.
Signed-off-by: Ethan Zonca <e@ethanzonca.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cf6cdba58 upstream.
Fixes a regression triggered by a change in the layout of
struct iio_chan_spec, but the real bug is in the driver which assumed
a specific structure layout in the first place. Hint: the two bits were
not OR:ed together as implied by the indentation prior to this patch,
there was a comma between them, which accidentally moved the ..._SCALE
bit to the next structure field. That field was .info_mask_shared_by_type
before the _available attributes were added by commit 5123960007
("iio:core: add a callback to allow drivers to provide _available
attributes") and .info_mask_separate_available afterwards, and the
regression happened.
info_mask_shared_by_type is actually a better choice than the originally
intended info_mask_separate for the ..._SCALE bit since a constant is
returned from mpl3115_read_raw for the scale. Using
info_mask_shared_by_type also preserves the behavior from before the
regression and is therefore less likely to cause other interesting side
effects.
The above-mentioned regression causes an unintended sysfs attribute to
show up that is not backed by code, in turn causing the following NULL
pointer dereference to happen on access.
Segmentation fault
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ecc3c000
[00000000] *pgd=87f91831
Internal error: Oops: 80000007 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 1051 Comm: cat Not tainted 4.10.0-rc5-00009-gffd8858-dirty #3
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
task: ed54ec00 task.stack: ee2bc000
PC is at 0x0
LR is at iio_read_channel_info_avail+0x40/0x280
pc : [<00000000>] lr : [<c06fbc1c>] psr: a0070013
sp : ee2bdda8 ip : 00000000 fp : ee2bddf4
r10: c0a53c74 r9 : ed79f000 r8 : ee8d1018
r7 : 00001000 r6 : 00000fff r5 : ee8b9a00 r4 : ed79f000
r3 : ee2bddc4 r2 : ee2bddbc r1 : c0a86dcc r0 : ee8d1000
Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c5387d Table: 3cc3c04a DAC: 00000051
Process cat (pid: 1051, stack limit = 0xee2bc210)
Stack: (0xee2bdda8 to 0xee2be000)
dda0: ee2bddc0 00000002 c016d720 c016d394 ed54ec00 00000000
ddc0: 60070013 ed413780 00000001 edffd480 ee8b9a00 00000fff 00001000 ee8d1018
dde0: ed79f000 c0a53c74 ee2bde0c ee2bddf8 c0513c58 c06fbbe8 edffd480 edffd540
de00: ee2bde3c ee2bde10 c0293474 c0513c40 c02933e4 ee2bde60 00000001 ed413780
de20: 00000001 ed413780 00000000 edffd480 ee2bde4c ee2bde40 c0291d00 c02933f0
de40: ee2bde9c ee2bde50 c024679c c0291ce0 edffd4b0 b6e37000 00020000 ee2bdf78
de60: 00000000 00000000 ed54ec00 ed013200 00000817 c0a111fc edffd540 ed413780
de80: b6e37000 00020000 00020000 ee2bdf78 ee2bded4 ee2bdea0 c0292890 c0246604
dea0: c0117940 c016ba50 00000025 c0a111fc b6e37000 ed413780 ee2bdf78 00020000
dec0: ee2bc000 b6e37000 ee2bdf44 ee2bded8 c021d158 c0292770 c0117764 b6e36004
dee0: c0f0d7c4 ee2bdfb0 b6f89228 00021008 ee2bdfac ee2bdf00 c0101374 c0117770
df00: 00000000 00000000 ee2bc000 00000000 ee2bdf34 ee2bdf20 c016ba04 c0171080
df20: 00000000 00020000 ed413780 b6e37000 00000000 ee2bdf78 ee2bdf74 ee2bdf48
df40: c021e7a0 c021d130 c023e300 c023e280 ee2bdf74 00000000 00000000 ed413780
df60: ed413780 00020000 ee2bdfa4 ee2bdf78 c021e870 c021e71c 00000000 00000000
df80: 00020000 00020000 b6e37000 00000003 c0108084 00000000 00000000 ee2bdfa8
dfa0: c0107ee0 c021e838 00020000 00020000 00000003 b6e37000 00020000 0001a2b4
dfc0: 00020000 00020000 b6e37000 00000003 7fffe000 00000000 00000000 00020000
dfe0: 00000000 be98eb4c 0000c740 b6f1985c 60070010 00000003 00000000 00000000
Backtrace:
[<c06fbbdc>] (iio_read_channel_info_avail) from [<c0513c58>] (dev_attr_show+0x24/0x50)
r10:c0a53c74 r9:ed79f000 r8:ee8d1018 r7:00001000 r6:00000fff r5:ee8b9a00
r4:edffd480
[<c0513c34>] (dev_attr_show) from [<c0293474>] (sysfs_kf_seq_show+0x90/0x110)
r5:edffd540 r4:edffd480
[<c02933e4>] (sysfs_kf_seq_show) from [<c0291d00>] (kernfs_seq_show+0x2c/0x30)
r10:edffd480 r9:00000000 r8:ed413780 r7:00000001 r6:ed413780 r5:00000001
r4:ee2bde60 r3:c02933e4
[<c0291cd4>] (kernfs_seq_show) from [<c024679c>] (seq_read+0x1a4/0x4e0)
[<c02465f8>] (seq_read) from [<c0292890>] (kernfs_fop_read+0x12c/0x1cc)
r10:ee2bdf78 r9:00020000 r8:00020000 r7:b6e37000 r6:ed413780 r5:edffd540
r4:c0a111fc
[<c0292764>] (kernfs_fop_read) from [<c021d158>] (__vfs_read+0x34/0x118)
r10:b6e37000 r9:ee2bc000 r8:00020000 r7:ee2bdf78 r6:ed413780 r5:b6e37000
r4:c0a111fc
[<c021d124>] (__vfs_read) from [<c021e7a0>] (vfs_read+0x90/0x11c)
r8:ee2bdf78 r7:00000000 r6:b6e37000 r5:ed413780 r4:00020000
[<c021e710>] (vfs_read) from [<c021e870>] (SyS_read+0x44/0x90)
r8:00020000 r7:ed413780 r6:ed413780 r5:00000000 r4:00000000
[<c021e82c>] (SyS_read) from [<c0107ee0>] (ret_fast_syscall+0x0/0x1c)
r10:00000000 r8:c0108084 r7:00000003 r6:b6e37000 r5:00020000 r4:00020000
Code: bad PC value
---[ end trace 9c4938ccd0389004 ]---
Fixes: cc26ad455f ("iio: Add Freescale MPL3115A2 pressure / temperature sensor driver")
Fixes: 5123960007 ("iio:core: add a callback to allow drivers to provide _available attributes")
Reported-by: Ken Lin <ken.lin@advantech.com>
Tested-by: Ken Lin <ken.lin@advantech.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a6e1d56a0 upstream.
Fixes a regression triggered by a change in the layout of
struct iio_chan_spec, but the real bug is in the driver which assumed
a specific structure layout in the first place. Hint: the three bits were
not OR:ed together as implied by the indentation prior to this patch,
there was a comma between the first two, which accidentally moved the
..._SCALE and ..._OFFSET bits to the next structure field. That field
was .info_mask_shared_by_type before the _available attributes were added
by commit 5123960007 ("iio:core: add a callback to allow drivers to
provide _available attributes") and .info_mask_separate_available
afterwards, and the regression happened.
info_mask_shared_by_type is actually a better choice than the originally
intended info_mask_separate for the ..._SCALE and ..._OFFSET bits since
a constant is returned from mpl115_read_raw for the scale/offset. Using
info_mask_shared_by_type also preserves the behavior from before the
regression and is therefore less likely to cause other interesting side
effects.
The above-mentioned regression causes unintended sysfs attributes to
show up that are not backed by code, in turn causing a NULL pointer
dereference to happen on access.
Fixes: 3017d90e89 ("iio: Add Freescale MPL115A2 pressure / temperature sensor driver")
Fixes: 5123960007 ("iio:core: add a callback to allow drivers to provide _available attributes")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0bdbf3b071 upstream.
The IRQFD framework calls the architecture dependent function
twice if the corresponding GSI type is edge triggered. For ARM,
the function kvm_set_msi() is getting called twice whenever the
IRQFD receives the event signal. The rest of the code path is
trying to inject the MSI without any validation checks. There is no
need to call vgic_its_inject_msi() a second time; skipping it avoids
unnecessary overhead in the IRQ queue logic and also avoids the
possibility of the VM seeing the MSI twice.
Simple fix, return -1 if the argument 'level' value is zero.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d0928f18b upstream.
Since it was introduced in commit da8d02d19f ("arm64/capabilities:
Make use of system wide safe value"), __raw_read_system_reg() has
erroneously mapped some sysreg IDs to other registers.
For the fields in ID_ISAR5_EL1, our local feature detection will be
erroneous. We may spuriously detect that a feature is uniformly
supported, or may fail to detect when it actually is, meaning some
compat hwcaps may be erroneous (or not enforced upon hotplug).
This patch corrects the erroneous entries.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: da8d02d19f ("arm64/capabilities: Make use of system wide safe value")
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit adbe7e26f4 upstream.
When bypassing SWIOTLB on small-memory systems, we need to avoid calling
into swiotlb_dma_mapping_error() in exactly the same way as we avoid
swiotlb_dma_supported(), because the former also relies on SWIOTLB state
being initialised.
Under the assumptions for which we skip SWIOTLB, dma_map_{single,page}()
will only ever return the DMA-offset-adjusted physical address of the
page passed in, thus we can report success unconditionally.
Fixes: b67a8b29df ("arm64: mm: only initialize swiotlb when necessary")
CC: Jisheng Zhang <jszhang@marvell.com>
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f36ebaf21 upstream.
When we fault in a page, we flush it to the PoC (Point of Coherency)
if the faulting vcpu has its own caches off, so that it can observe
the page we just brought in.
But if the vcpu has its caches on, we skip that step. Bad things
happen when *another* vcpu tries to access that page with its own
caches disabled. At that point, there is no guarantee that the
data has made it to the PoC, and we access stale data.
The obvious fix is to always flush to PoC when a page is faulted
in, no matter what the state of the vcpu is.
Fixes: 2d58b733c8 ("arm64: KVM: force cache clean on page fault when caches are off")
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58ab9a088d upstream.
Kirill reported a warning from UBSAN about undefined behavior when using
protection keys. He is running on hardware that actually has support for
it, which is not widely available.
The warning triggers because of very large shifts of integers when doing a
pkey_free() of a large, invalid value. This happens because we never check
that the pkey "fits" into the mm_pkey_allocation_map().
I do not believe there is any danger here of anything bad happening
other than some aliasing issues where somebody could do:
pkey_free(35);
and the kernel would effectively execute:
pkey_free(8);
While this might be confusing to an app that was doing something stupid, it
has to do something stupid and the effects are limited to the app shooting
itself in the foot.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-kselftest@vger.kernel.org
Cc: shuah@kernel.org
Cc: kirill.shutemov@linux.intel.com
Link: http://lkml.kernel.org/r/20170223222603.A022ED65@viggo.jf.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e38bea99a upstream.
fuse_file_put() was missing the "force" flag for the RELEASE request when
sending synchronously (fuseblk).
If this flag is not set, then a sync request may be interrupted before it
is dequeued by the userspace filesystem. In this case the OPEN won't be
balanced with a RELEASE.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 5a18ec176c ("fuse: fix hang of single threaded fuseblk filesystem")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c68bb0f62 upstream.
Running with KASAN and crypto tests currently gives
BUG: KASAN: global-out-of-bounds in __test_aead+0x9d9/0x2200 at addr ffffffff8212fca0
Read of size 16 by task cryptomgr_test/1107
Address belongs to variable 0xffffffff8212fca0
CPU: 0 PID: 1107 Comm: cryptomgr_test Not tainted 4.10.0+ #45
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.1-1.fc24 04/01/2014
Call Trace:
dump_stack+0x63/0x8a
kasan_report.part.1+0x4a7/0x4e0
? __test_aead+0x9d9/0x2200
? crypto_ccm_init_crypt+0x218/0x3c0 [ccm]
kasan_report+0x20/0x30
check_memory_region+0x13c/0x1a0
memcpy+0x23/0x50
__test_aead+0x9d9/0x2200
? kasan_unpoison_shadow+0x35/0x50
? alg_test_akcipher+0xf0/0xf0
? crypto_skcipher_init_tfm+0x2e3/0x310
? crypto_spawn_tfm2+0x37/0x60
? crypto_ccm_init_tfm+0xa9/0xd0 [ccm]
? crypto_aead_init_tfm+0x7b/0x90
? crypto_alloc_tfm+0xc4/0x190
test_aead+0x28/0xc0
alg_test_aead+0x54/0xd0
alg_test+0x1eb/0x3d0
? alg_find_test+0x90/0x90
? __sched_text_start+0x8/0x8
? __wake_up_common+0x70/0xb0
cryptomgr_test+0x4d/0x60
kthread+0x173/0x1c0
? crypto_acomp_scomp_free_ctx+0x60/0x60
? kthread_create_on_node+0xa0/0xa0
ret_from_fork+0x2c/0x40
Memory state around the buggy address:
ffffffff8212fb80: 00 00 00 00 01 fa fa fa fa fa fa fa 00 00 00 00
ffffffff8212fc00: 00 01 fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
>ffffffff8212fc80: fa fa fa fa 00 05 fa fa fa fa fa fa 00 00 00 00
^
ffffffff8212fd00: 01 fa fa fa fa fa fa fa 00 00 00 00 01 fa fa fa
ffffffff8212fd80: fa fa fa fa 00 00 00 00 00 05 fa fa fa fa fa fa
This always happens on the same IV which is less than 16 bytes.
Per Ard,
"CCM IVs are 16 bytes, but due to the way they are constructed
internally, the final couple of bytes of input IV are dont-cares.
Apparently, we do read all 16 bytes, which triggers the KASAN errors."
Fix this by padding the IV with null bytes to be at least 16 bytes.
Fixes: 0bc5a6c5c7 ("crypto: testmgr - Disable rfc4309 test and convert test vectors")
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0bb03924f upstream.
DoS protection conditions were altered in WS2016 and now it's easy to get
-EAGAIN returned from vmbus_post_msg() (e.g. when we try changing MTU on a
netvsc device in a loop). None of the vmbus_post_msg() callers retry the
operation, and we usually end up with a non-functional device or a crash.
While the host's DoS protection conditions are unknown to me, my tests show
that it can take up to 10 seconds before the message is sent, so doing
udelay() is not an option; we really need to sleep. Almost all vmbus_post_msg()
callers are ready to sleep but there is one special case:
vmbus_initiate_unload() which can be called from interrupt/NMI context and
we can't sleep there. I'm also not sure about the lonely
vmbus_send_tl_connect_request() which has no in-tree users but its external
users are most likely waiting for the host to reply so sleeping there is
also appropriate.
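A rough sketch of the retry idea (the posting helper and retry bound are hypothetical; the real vmbus_post_msg() distinguishes contexts that may sleep from those that may not):
    /* keep re-posting while the hypervisor reports a transient failure,
     * sleeping between attempts when the calling context allows it */
    static int post_with_retry_sketch(void *msg, size_t len, bool can_sleep)
    {
            int ret = -EAGAIN;
            int retries = 100;      /* hypothetical bound */

            while (retries--) {
                    ret = hv_post_message_sketch(msg, len); /* hypothetical helper */
                    if (ret != -EAGAIN)
                            break;
                    if (can_sleep)
                            usleep_range(1000, 2000);
                    else
                            udelay(1000);
            }
            return ret;
    }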
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a7275a3d8 upstream.
eb5767122f ("PCI: altera: Simplify TLB_CFG_DW0 usage") used
TLP_FMTTYPE_CFGRD* (instead of TLP_FMTTYPE_CFGWR*) for TLP writes, which
causes writing to configuration space to fail. Fix it by using correct
FMTTYPE for write operation.
Fixes: eb5767122f ("PCI: altera: Simplify TLB_CFG_DW0 usage")
Signed-off-by: Ley Foon Tan <ley.foon.tan@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49f4b08e61 upstream.
pnv_php_disable_irq() can be called in two paths: the bailing path in
pnv_php_enable_irq() or when releasing the slot. The MSI (or MSI-X)
interrupts are disabled unconditionally in pnv_php_disable_irq(). That is
wrong because they might have been enabled by drivers other than pnv-php.
This change disables the MSI (or MSI-X) interrupts and the PCI device only
if they were enabled by pnv-php. In the error path of pnv_php_enable_irq(),
we rely on the newly added parameter @disable_device. In the path of
releasing the slot, @pnv_php->irq is checked.
Fixes: 360aebd85a ("drivers/pci/hotplug: Support surprise hotplug in powernv driver")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60e2e2fbaf upstream.
The devfn of 00:02.0 is 0x10. devfn_to_wslot(0x10) == 0x2, and
wslot_to_devfn(0x2) should be 0x10, while it's 0x2 in the current code.
Due to this, hv_eject_device_work() -> pci_get_domain_bus_and_slot()
returns NULL and pci_stop_and_remove_bus_device() is not called.
Later when the real device driver's .remove() is invoked by
hv_pci_remove() -> pci_stop_root_bus(), some warnings can be noticed
because the VM has lost the access to the underlying device at that
time.
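For illustration only (this is the standard devfn encoding, not the hv_pcifront code): with the slot in the upper five bits and the function in the lower three, the inverse conversion must rebuild the devfn rather than return the raw slot number.
    #include <stdio.h>

    #define PCI_DEVFN(slot, func)   ((((slot) & 0x1f) << 3) | ((func) & 0x07))
    #define PCI_SLOT(devfn)         (((devfn) >> 3) & 0x1f)

    int main(void)
    {
            unsigned int devfn = 0x10;              /* device 00:02.0 */
            unsigned int wslot = PCI_SLOT(devfn);   /* 0x2 */

            /* buggy conversion returned wslot (0x2) directly; the fixed one
             * rebuilds the devfn from the slot, giving 0x10 again */
            printf("wslot=0x%x devfn=0x%x\n", wslot, PCI_DEVFN(wslot, 0));
            return 0;
    }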
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
CC: K. Y. Srinivasan <kys@microsoft.com>
CC: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9f1e32600 upstream.
This patch fixes the OTP register definitions for the AR934x and AR9550
WMAC SoC.
Previously, the ath9k driver was unable to initialize the integrated
WMAC on an Aerohive AP121:
| ath: phy0: timeout (1000 us) on reg 0x30018: 0xbadc0ffe & 0x00000007 != 0x00000004
| ath: phy0: timeout (1000 us) on reg 0x30018: 0xbadc0ffe & 0x00000007 != 0x00000004
| ath: phy0: Unable to initialize hardware; initialization status: -5
| ath9k ar934x_wmac: failed to initialize device
| ath9k: probe of ar934x_wmac failed with error -5
It turns out that the AR9300_OTP_STATUS and AR9300_OTP_DATA
definitions contain a typo.
Cc: Gabor Juhos <juhosg@openwrt.org>
Fixes: add295a4af ("ath9k: use correct OTP register offsets for AR9550")
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: Chris Blake <chrisrblake93@gmail.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a5e969bb2 upstream.
The code currently relies on refcounting to disable IRQs from within the
IRQ handler and re-enabling them again after the tasklet has run.
However, due to race conditions sometimes the IRQ handler might be
called twice, or the tasklet may not run at all (if interrupted in the
middle of a reset).
This can cause nasty imbalances in the irq-disable refcount which will
get the driver permanently stuck until the entire radio has been stopped
and started again (ath_reset will not recover from this).
Instead of using this fragile logic, change the code to ensure that
running the irq handler during tasklet processing is safe, and leave the
refcount untouched.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb4281528b upstream.
Rx filter reset and the dynamic tx switch mode (EXT_RESOURCE_CFG)
configuration are causing the following errors when UTF firmware
is loaded to the target.
Error message 1:
[ 598.015629] ath10k_pci 0001:01:00.0: failed to ping firmware: -110
[ 598.020828] ath10k_pci 0001:01:00.0: failed to reset rx filter: -110
[ 598.141556] ath10k_pci 0001:01:00.0: failed to start core (testmode): -110
Error message 2:
[ 668.615839] ath10k_ahb a000000.wifi: failed to send ext resource cfg command : -95
[ 668.618902] ath10k_ahb a000000.wifi: failed to start core (testmode): -95
Avoiding these configurations while bringing up the target in
testmode solves the problem.
Signed-off-by: Tamizh chelvam <c_traja@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb97fbbcac upstream.
Parallel reads from multiple threads on a file descriptor
are not well defined and racy. It is safer to return to the original
behavior and simply fail the additional read.
The solution is to remove the request for the next read credit.
Fixes: ff1586a7ea ("mei: enqueue consecutive reads")
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4753d8a24d upstream.
If the file system requires journal recovery, and the device is
read-only, return EROFS to the mount system call. This allows xfstests
generic/050 to pass.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 97abd7d4b5 upstream.
If the journal is aborted, the needs_recovery feature flag should not
be removed. Otherwise, the journal might not get replayed and
this could lead to more data getting lost.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb5efbcb76 upstream.
The write_end() function must always unlock the page and drop its ref
count, even on an error.
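A generic sketch of the invariant, not the exact ext4 code: the error path must still unlock and release the page.
    /* generic ->write_end() error-path sketch; journalling details omitted */
    static int write_end_error_sketch(struct page *page, int err)
    {
            unlock_page(page);      /* must happen on success and on error */
            put_page(page);         /* drop the reference taken by write_begin */
            return err;
    }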
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd01b690f8 upstream.
In the case where the child's encryption context was inconsistent with
its parent directory, we were using inode->i_sb and inode->i_ino after
the inode had already been iput(). Fix this by doing the iput() in the
correct places.
Note: only ext4 had this bug, not f2fs and ubifs.
Fixes: d9cdc90331 ("ext4 crypto: enforce context consistency")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b136499e9 upstream.
ext4_journalled_write_end() did not properly handle all the cases when
generic_perform_write() did not copy all the data into the target page
and could mark buffers with uninitialized contents as uptodate and dirty
leading to possible data corruption (which would be quickly fixed by
generic_perform_write() retrying the write but still). Fix the problem
by carefully handling the case when the page that is written to is not
uptodate.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd648b8a8f upstream.
If filesystem groups are artificially small (using parameter -g to
mkfs.ext4), ext4_mb_normalize_request() can result in a request that is
larger than a block group. Trim the request size to not confuse
allocation code.
Reported-by: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03e916fa8b upstream.
Inside the ext4_ext_shift_extents() function, ext4_find_extent() is
called without the EXT4_EX_NOCACHE flag, which would prevent cache
population. This leads to outdated offsets in the extents tree and
wrong blocks afterwards.
The patch fixes the problem by providing the EXT4_EX_NOCACHE flag for
each ext4_find_extent() call inside ext4_ext_shift_extents().
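The shape of the change, hedged since the argument list is reproduced from memory of the 4.x ext4_find_extent() signature:
    /* before: lookups populate the extent status cache with entries that
     * the shift is about to invalidate */
    path = ext4_find_extent(inode, start, NULL, 0);

    /* after: skip cache population while extents are being shifted */
    path = ext4_find_extent(inode, start, NULL, EXT4_EX_NOCACHE);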
Fixes: 331573febb
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a9b8cba62 upstream.
While doing 'insert range', the start block should also be shifted right.
The bug can be easily reproduced by the following test:
    ptr = malloc(4096);
    assert(ptr);
    fd = open("./ext4.file", O_CREAT | O_TRUNC | O_RDWR, 0600);
    assert(fd >= 0);
    rc = fallocate(fd, 0, 0, 8192);
    assert(rc == 0);
    for (i = 0; i < 2048; i++)
            *((unsigned short *)ptr + i) = 0xbeef;
    rc = pwrite(fd, ptr, 4096, 0);
    assert(rc == 4096);
    rc = pwrite(fd, ptr, 4096, 4096);
    assert(rc == 4096);
    for (block = 2; block < 1000; block++) {
            rc = fallocate(fd, FALLOC_FL_INSERT_RANGE, 4096, 4096);
            assert(rc == 0);
            for (i = 0; i < 2048; i++)
                    *((unsigned short *)ptr + i) = block;
            rc = pwrite(fd, ptr, 4096, 4096);
            assert(rc == 4096);
    }
Because the start block is not included in the range, the hole appears at
the wrong offset (just after the desired offset) and the following
pwrite() overwrites an already existing block, keeping the hole untouched.
A simple way to verify the wrong behaviour is to check for zeroed blocks
after the test:
$ hexdump ./ext4.file | grep '0000 0000'
The root cause of the bug is a wrong range (start, stop], where start
should be inclusive, i.e. [start, stop].
This patch fixes the problem by including start in the range. But, so as
not to break left shift (range collapse), stop points to the beginning
of a block, not to the end.
The other non-obvious change is a validity check on the iterator in the
main loop. Because the iterator is unsigned, the following corner case
should be considered with care: inserting a block at offset 0, where the
stop variable overflows and never becomes less than start, which is 0.
To handle this special case, the iterator is set to NULL to indicate that
the end of the loop has been reached.
Fixes: 331573febb
Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Namjae Jeon <namjae.jeon@samsung.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e02898b423 upstream.
loop_reread_partitions() needs to do I/O, but we just froze the queue,
so we end up waiting forever. This can easily be reproduced with losetup
-P. Fix it by moving the reread to after we unfreeze the queue.
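A sketch of the ordering fix (function names as in drivers/block/loop.c, reproduced from memory and therefore hedged):
    blk_mq_freeze_queue(lo->lo_queue);
    /* ... update loop device state while no I/O is in flight ... */
    blk_mq_unfreeze_queue(lo->lo_queue);

    /* the partition rescan issues I/O of its own, so it may only run once
     * the queue has been unfrozen, otherwise it waits forever */
    if (partscan)
            loop_reread_partitions(lo, bdev);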
Fixes: ecdd09597a ("block/loop: fix race between I/O and set_status")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e112666b49 upstream.
If the journal has been aborted, we shouldn't mark the underlying
buffer head as dirty, since that will cause the metadata block to get
modified. And if the journal has been aborted, we shouldn't allow
this since it will almost certainly lead to a corrupted file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 907565337e upstream.
Userspace applications should be allowed to expect the membarrier system
call with MEMBARRIER_CMD_SHARED command to issue memory barriers on
nohz_full CPUs, but synchronize_sched() does not take those into
account.
Given that we do not want unrelated processes to be able to affect
real-time sensitive nohz_full CPUs, simply return ENOSYS when membarrier
is invoked on a kernel with enabled nohz_full CPUs.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Rik van Riel <riel@redhat.com>
Acked-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b0408745e upstream.
LPDDR memories can only handle up to 400 uncontrolled power-off events. Ensure
the proper power-off sequence is used before shutting down the platform.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 857de6e007 upstream.
The device handler needs to check if a given queue belongs to a scsi
device; only then does it make sense to attach a device handler.
[mkp: dropped flags]
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c421530bf8 upstream.
The driver currently checks SELF_TEST_FAILED first and KERNEL_PANIC next.
Under error conditions (boot code failure) both SELF_TEST_FAILED and
KERNEL_PANIC can be set at the same time.
The driver has the capability to reset the controller on a KERNEL_PANIC,
but not on SELF_TEST_FAILED.
Fixed by first checking KERNEL_PANIC and then the others.
Fixes: e8b12f0fb8 ([SCSI] aacraid: Add new code for PMC-Sierra's SRC base controller family)
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: David Carroll <David.Carroll@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40630f4628 upstream.
On I/O errors, the Windows driver doesn't set data_transfer_length
on error conditions other than SRB_STATUS_DATA_OVERRUN.
In these cases we need to set data_transfer_length to 0,
indicating there is no data transferred. On SRB_STATUS_DATA_OVERRUN,
data_transfer_length is set by the Windows driver to the actual data transferred.
Reported-by: Shiva Krishna <Shiva.Krishna@nimblestorage.com>
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bba5dc332e upstream.
When sense message is present on error, we should pass along to the upper
layer to decide how to deal with the error.
This patch fixes connectivity issues with Fiber Channel devices.
Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d36a19541f upstream.
The lvm2 sequence to manage dm-raid constructor flags that trigger a
rebuild or a reshape is defined as:
1) load table with flags (e.g. rebuild/delta_disks/data_offset)
2) clear out the flags in lvm2 metadata
3) store the lvm2 metadata, reload the table to reset the flags
previously established during the initial load (1) -- in order to
prevent repeatedly requesting a rebuild or a reshape on activation
Currently, loading an inactive table with rebuild/reshape flags
specified will cause dm-raid to rebuild/reshape on resume and thus start
updating the raid metadata (about the progress). When the second table
reload, to reset the flags, occurs, the constructor accesses the volatile
progress state kept in the raid superblocks. Because the active mapping
is still processing the rebuild/reshape, that position will be stale by
the time the device is resumed.
In the reshape case, this causes data corruption by processing already
reshaped stripes again. In the rebuild case, it does _not_ cause data
corruption but instead involves superfluous rebuilds.
Fix by keeping the raid set frozen during the first resume and then
allow the rebuild/reshape during the second resume.
Fixes: 9dbd1aa3a ("dm raid: add reshaping support to the target")
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37a098e9d1 upstream.
The sloppy nature of lockless access to percpu pointers
(s->current_path) in rr_select_path(), from multiple threads, is
causing some paths to be used more than others -- which results in
lower observed IO performance.
Revert these upstream commits to restore truly symmetric round-robin
IO submission in DM multipath:
b0b477c dm round robin: use percpu 'repeat_count' and 'current_path'
802934b dm round robin: do not use this_cpu_ptr() without having preemption disabled
There is no benefit to all this complexity if repeat_count = 1 (which is
the recommended default).
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ca763d0a53 upstream.
A rounding bug, due to a compiler-generated temporary being 32-bit, was found
in remap_to_cache(). A localized cast in remap_to_cache() fixes the
corruption, but this preferred fix (changing from uint32_t to sector_t)
eliminates the potential for future rounding errors elsewhere.
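The class of bug is easy to demonstrate in isolation; a generic sketch (not dm-cache's actual expression) of how a 32-bit temporary truncates a sector calculation:
    #include <stdint.h>
    #include <stdio.h>

    typedef uint64_t sector_t;

    int main(void)
    {
            uint32_t block = 5000000;               /* cache block number */
            uint32_t sectors_per_block = 1024;      /* 512 KiB blocks */

            sector_t bad  = block * sectors_per_block;            /* 32-bit multiply, wraps */
            sector_t good = (sector_t)block * sectors_per_block;  /* widened before multiply */

            printf("bad=%llu good=%llu\n",
                   (unsigned long long)bad, (unsigned long long)good);
            return 0;
    }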
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30582c25a4 upstream.
Until now, the trans_stat information of a passive devfreq device has not
been updated. This patch updates the trans_stat information after setting
the target frequency of the passive devfreq device.
Fixes: 996133119f ("PM / devfreq: Add new passive governor")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcf23c79c4 upstream.
A devfreq device using the passive governor is not able to change its
governor, so the user cannot change the governor through the
'available_governor' sysfs entry. Also, a devfreq device which doesn't use
the passive governor is not able to change to the 'passive' governor on
the fly.
Fixes: 996133119f ("PM / devfreq: Add new passive governor")
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc15ed663e upstream.
On failure to return a pathname from ima_d_path(), a pointer to
dname is returned, which is subsequently used in the IMA measurement
list, the IMA audit records, and other audit logging. Saving the
pointer to dname for later use has the potential to race with rename.
Instead of returning a pointer to dname on failure, this patch returns
a pointer to a copy of the filename.
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 95e91b831f upstream.
The issue is described here, with a nice testcase:
https://bugzilla.kernel.org/show_bug.cgi?id=192931
The problem is that shmat() calls do_mmap_pgoff() with MAP_FIXED, and
the address rounded down to 0. For the regular mmap case, the
protection mentioned above is that the kernel gets to generate the
address -- arch_get_unmapped_area() will always check for MAP_FIXED and
return that address. So by the time we do security_mmap_addr(0) things
get funky for shmat().
The testcase itself shows that while a regular user crashes, root will
not have a problem attaching a nil-page. There are two possible fixes
to this. The first, and which this patch does, is to simply allow root
to crash as well -- this is also regular mmap behavior, ie when hacking
up the testcase and adding mmap(... |MAP_FIXED). While this approach
is the safer option, the second alternative is to ignore SHM_RND if the
rounded address is 0, thus only having MAP_SHARED flags. This makes the
behavior of shmat() identical to the mmap() case. The downside of this
is obviously user visible, but does make sense in that it maintains
semantics after the round-down wrt 0 address and mmap.
Passes shm related ltp tests.
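A tiny userspace illustration of the failure mode (hypothetical values; on most architectures SHMLBA equals the page size, so any non-zero address below it rounds down to 0 with SHM_RND):
    #include <stdio.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main(void)
    {
            int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);

            /* 0x10 rounds down to 0 with SHM_RND, so the kernel ends up doing
             * a MAP_FIXED mapping at address 0 -- which, with this fix, is now
             * refused for root just as it already was for ordinary users */
            void *p = shmat(id, (void *)0x10, SHM_RND);

            printf("shmat: %p\n", p);       /* expected: (void *)-1, errno set */
            shmctl(id, IPC_RMID, NULL);
            return 0;
    }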
Link: http://lkml.kernel.org/r/1486050195-18629-1-git-send-email-dave@stgolabs.net
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Reported-by: Gareth Evans <gareth.evans@contextis.co.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Michael Kerrisk <mtk.manpages@googlemail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71ab6cfe88 upstream.
get_scan_count() considers the whole node LRU size when
- doing SCAN_FILE due to many page cache inactive pages
- calculating the number of pages to scan
In both cases this might lead to unexpected behavior especially on 32b
systems where we can expect lowmem memory pressure very often.
A large highmem zone can easily distort the SCAN_FILE heuristic because
there might be only a few file pages from the eligible zones on the node
lru, and we would still enforce file lru scanning, which can lead to
thrashing while we could still scan anonymous pages.
The later use of lruvec_lru_size can be problematic as well. Especially
when there are not many pages from the eligible zones. We would have to
skip over many pages to find anything to reclaim but shrink_node_memcg
would only reduce the remaining number to scan by SWAP_CLUSTER_MAX at
maximum. Therefore we can end up going over a large LRU many times
without actually having a chance to reclaim much, if anything at all. The
closer we are to running out of memory in the lowmem zone, the worse the
problem will be.
Fix this by filtering out all the ineligible zones when calculating the
lru size for both paths and consider only sc->reclaim_idx zones.
The patch would need to be tweaked a bit to apply to 4.10 and older but
I will do that as soon as it hits the Linus tree in the next merge
window.
Link: http://lkml.kernel.org/r/20170117103702.28542-3-mhocko@kernel.org
Fixes: b2e18757f2 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: Trevor Cordes <trevor@tecnopolis.ca>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd8416c477 upstream.
With rw_page, page_endio is used for completing IO on a page, and it
propagates a write error to the address space if the IO fails. The
problem is that it accesses page->mapping directly, which might be okay
for file-backed pages but not for anonymous pages. Otherwise, it can
corrupt one of the fields of anon_vma under us and the system panics
randomly.
swap_writepage
bdev_writepage
ops->rw_page
I encountered the BUG while developing a new zram feature, and it was
really hard to figure out because it caused random crashes: sometimes an
mmap_sem lockdep splat, sometimes crashes in places never related to
zram/zsmalloc, and it was not reproducible with some configurations.
Considering how subtle the bug is and that people do fast-swap testing
with brd, I think it is worth adding the stable mark.
Fixes: dd6bd0d9c7 ("swap: use bdev_read_page() / bdev_write_page()")
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e02dc017c3 upstream.
When @node_reclaim_mode isn't 0, the page allocator tries to reclaim
pages if the amount of free memory in the zones is below the low
watermark. On the Power platform, none of the NUMA nodes is scanned for
page reclaim because no node matches the condition in
zone_allows_reclaim(). On Power, RECLAIM_DISTANCE is set to 10, which is
the distance of Node-A to Node-A. So not even the preferred node is
scanned for page reclaim.
__alloc_pages_nodemask()
get_page_from_freelist()
zone_allows_reclaim()
Anton proposed the test code as below:
# cat alloc.c
:
int main(int argc, char *argv[])
{
        void *p;
        unsigned long size;
        unsigned long start, end;
        start = time(NULL);
        size = strtoul(argv[1], NULL, 0);
        printf("To allocate %ldGB memory\n", size);
        size <<= 30;
        p = malloc(size);
        assert(p);
        memset(p, 0, size);
        end = time(NULL);
        printf("Used time: %ld seconds\n", end - start);
        sleep(3600);
        return 0;
}
The system I use for testing has two NUMA nodes. Both have 128GB
memory. In the scenario below, the page caches on node#0 should be
reclaimed when it encounters pressure to accommodate an allocation request.
# echo 2 > /proc/sys/vm/zone_reclaim_mode; \
sync; \
echo 3 > /proc/sys/vm/drop_caches; \
# taskset -c 0 cat file.32G > /dev/null; \
grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33619712 kB
# taskset -c 0 ./alloc 128
# grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33619840 kB
# grep MemFree /sys/devices/system/node/node0/meminfo
Node 0 MemFree: 186816 kB
With the patch applied, the pagecache on node-0 is reclaimed when its
free memory is running out. It's the expected behaviour.
# echo 2 > /proc/sys/vm/zone_reclaim_mode; \
sync; \
echo 3 > /proc/sys/vm/drop_caches
# taskset -c 0 cat file.32G > /dev/null; \
grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 33605568 kB
# taskset -c 0 ./alloc 128
# grep FilePages /sys/devices/system/node/node0/meminfo
Node 0 FilePages: 1379520 kB
# grep MemFree /sys/devices/system/node/node0/meminfo
Node 0 MemFree: 317120 kB
Fixes: 5f7a75acdb ("mm: page_alloc: do not cache reclaim distances")
Link: http://lkml.kernel.org/r/1486532455-29613-1-git-send-email-gwshan@linux.vnet.ibm.com
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c25702cee upstream.
Currently we call copy_page_to_iter() for uncached reading into a pipe.
This is wrong because it treats pages as VFS cache pages and copies
references rather than actual data. When we are trying to read from the
pipe we end up calling page_cache_pipe_buf_confirm(), which returns
-ENODATA. This error is translated into 0, which is returned to the user.
This issue is reproduced by running the xfstests suite (generic test #249)
against mount points with "cache=none". Fix it by mapping pages manually
and calling copy_to_iter(), which copies the data into the pipe.
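The gist of the change, sketched with generic names as a kernel-context fragment: map each page and copy its bytes into the iterator instead of handing over page references.
    /* before: copy_page_to_iter(page, offset, len, iter) only transfers a
     * page reference, which the pipe later rejects for these uncached pages */

    /* after (sketch): copy the actual bytes */
    void *addr = kmap(page);
    size_t written = copy_to_iter(addr + offset, len, iter);
    kunmap(page);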
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e42a46b6f5 upstream.
It is allowed to call regulator_get with a NULL dev argument
(_regulator_get explicitly checks for it) but this causes an error later
when printing /sys/kernel/debug/regulator_summary.
Fix this by explicitly handling "deviceless" consumers in the debugfs code.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4474f4c40a upstream.
The stm is automatically enabled when an application sets the policy
via ->link() call back by using coresight_enable(), which keeps the
refcount of the current users of the STM. However, the unlink() callback
issues stm_disable() directly, which leaves the STM turned off, without
the coresight layer knowing about it. This prevents any further uses
of the STM hardware as the coresight layer still thinks the STM is
turned on and doesn't enable the hardware when required. Even manually
enabling the STM via sysfs can't really enable the hw.
e.g,
$ echo 1 > $CS_DEVS/$ETR/enable_sink
$ mkdir -p $CONFIG_FS/stp-policy/$source.0/stm_test/
$ echo 32768 65535 > $CONFIG_FS/stp-policy/$source.0/stm_test/channels
$ echo 64 > $CS_DEVS/$source/traceid
$ ./stm_app
Sending 64000 byte blocks of pattern 0 at 0us intervals
Success to map channel(32768~32783) to 0xffffa95fa000
Sending on channel 32768
$ dd if=/dev/$ETR of=~/trace.bin.1
597+1 records in
597+1 records out
305920 bytes (306 kB) copied, 0.399952 s, 765 kB/s
$ ./stm_app
Sending 64000 byte blocks of pattern 0 at 0us intervals
Success to map channel(32768~32783) to 0xffff7e9e2000
Sending on channel 32768
$ dd if=/dev/$ETR of=~/trace.bin.2
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0.0232083 s, 0.0 kB/s
Note that we don't get any data from the ETR for the second session.
Also dmesg shows :
[ 77.520458] coresight-tmc 20800000.etr: TMC-ETR enabled
[ 77.537097] coresight-replicator etr_replicator@20890000: REPLICATOR enabled
[ 77.558828] coresight-replicator main_replicator@208a0000: REPLICATOR enabled
[ 77.581068] coresight-funnel 208c0000.main_funnel: FUNNEL inport 0 enabled
[ 77.602217] coresight-tmc 20840000.etf: TMC-ETF enabled
[ 77.618422] coresight-stm 20860000.stm: STM tracing enabled
[ 139.554252] coresight-stm 20860000.stm: STM tracing disabled
# End of first tracing session
[ 146.351135] coresight-tmc 20800000.etr: TMC read start
[ 146.514486] coresight-tmc 20800000.etr: TMC read end
# Note that the STM is not turned on via stm_generic_link()->coresight_enable()
# and hence none of the components are turned on.
[ 152.479080] coresight-tmc 20800000.etr: TMC read start
[ 152.542632] coresight-tmc 20800000.etr: TMC read end
This patch fixes the problem by balancing the unlink operation by using
the coresight_disable(), keeping the coresight layer in sync with the
hardware state and thus allowing normal usage of the STM component.
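A minimal sketch of the balanced unlink path; the function and structure names follow the commit text and may not match the driver exactly:

    static void stm_generic_unlink(struct stm_data *stm_data,
                                   unsigned int master, unsigned int channel)
    {
            struct stm_drvdata *drvdata = container_of(stm_data,
                                                       struct stm_drvdata, stm);

            if (!drvdata || !drvdata->csdev)
                    return;

            /* Balance the coresight_enable() done at link time instead of
             * calling stm_disable() behind the coresight layer's back. */
            coresight_disable(drvdata->csdev);
    }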
Fixes: commit 237483aa5c ("coresight: stm: adding driver for CoreSight STM component")
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Chunyan Zhang <zhang.chunyan@linaro.org>
Reported-by: Robert Walker <robert.walker@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e01700602 upstream.
gcc-7 detects that wlanhdr_to_ethhdr() in two drivers calls memcpy() with
a destination argument that an earlier function call may have set to NULL:
staging/rtl8188eu/core/rtw_recv.c: In function 'wlanhdr_to_ethhdr':
staging/rtl8188eu/core/rtw_recv.c:1318:2: warning: argument 1 null where non-null expected [-Wnonnull]
staging/rtl8712/rtl871x_recv.c: In function 'r8712_wlanhdr_to_ethhdr':
staging/rtl8712/rtl871x_recv.c:649:2: warning: argument 1 null where non-null expected [-Wnonnull]
I'm fixing this by adding a NULL pointer check and returning failure
from the function, which is hopefully already handled properly.
This seems to date back to when the drivers were originally added,
so backporting the fix to stable seems appropriate. There are other
related realtek drivers in the kernel, but none of them contain a
function with a similar name or produce this warning.
Fixes: 1cc18a22b9 ("staging: r8188eu: Add files for new driver - part 5")
Fixes: 2865d42c78 ("staging: r8712u: Add the new driver to the mainline kernel")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc7ffefdcc upstream.
This is unbreaking another of those "stealth" janitor
patches that got in and subtly broke some things.
sv_cpt_data is a pointer to a pointer, so we need to
dereference it twice to allocate the correct structure size.
Fixes: 9899cb68c6 ("Staging: lustre: rpc: Use sizeof type *pointer instead of sizeof type.")
CC: Sandhya Bankar <bankarsandhya512@gmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33b8807a6f upstream.
The loopback driver allows the user to set a minimum delay of up to one
second to be inserted between test iterations (i.e. request
submissions). The delay is currently specified in microseconds and is
implemented using udelay.
Busy looping for long periods is not just anti-social; udelay must not
be used for delays longer than a few milliseconds due to the risk of
integer overflow.
Replace the broken udelay with a usleep_range with a 100 us range for
short delays (< 20 ms) and otherwise revert to using msleep.
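A minimal sketch of the delay policy described above (the helper name is illustrative; the 20 ms threshold is taken from the text):

    static void gb_loopback_sleep(unsigned int delay_us)
    {
            if (!delay_us)
                    return;

            if (delay_us < 20000)
                    /* short delays: sleep with a 100 us range */
                    usleep_range(delay_us, delay_us + 100);
            else
                    /* long delays: millisecond resolution is fine */
                    msleep(delay_us / 1000);
    }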
Fixes: b36f04fa94 ("greybus: loopback: Convert thread delay to microseconds")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82dbe987b7 upstream.
If sensor attributes were never read, the pwm control data has not been
initialized, which can cause wrong driver behavior. Ensure that cached
data is current before acting on it.
Reported-by: Kevin Folz <kfolz@evertz.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c7b8ca1ae upstream.
In IT8620E, after setting pwm control to manual, it was observed that
pwm values for fan 4..6 have reversed results (writing 0 results in fans
running at full speed, writing 255 results in fans turned off).
With the new PWM control, pwm polarity for pwm control 4..6 is specified
in its pwm control registers. Those registers are overwritten when setting
the pwm mode or the temperature mapping. Do not touch bits 2..6 of pwm
control registers on register writes to fix the problem.
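A minimal sketch of the read-modify-write pattern, with illustrative register accessor names and bits 2..6 expressed as the 0x7c mask:

    /* Bits 2..6 carry polarity/config for pwm 4..6 on IT8620E; preserve
     * them when updating the pwm mode or temperature mapping. */
    u8 reg = it87_read_value(data, IT87_REG_PWM(nr));

    reg &= 0x7c;                    /* keep bits 2..6 untouched */
    reg |= newval & ~0x7c;          /* update only the other bits */
    it87_write_value(data, IT87_REG_PWM(nr), reg);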
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 29693efcea upstream.
On this machine, the micmute button is connected to Line2 of the
codec and the micmute led is connected to GPIO2 of the codec.
After applying this quirk, both hotkey and led work well.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3ac9f7376 upstream.
The sequencer FIFO management has a bug that may lead to a corruption
(shortage) of the cell linked list. When a sequencer client faces an
error at the event delivery, it tries to put back the dequeued cell.
When the first queue was put back, this forgot the tail pointer
tracking, and the link will be screwed up.
Although there is no memory corruption, the sequencer client may stall
forever at exit while flushing the pending FIFO cells in
snd_seq_pool_done(), as spotted by syzkaller.
This patch addresses the missing tail pointer tracking at
snd_seq_fifo_cell_putback(). Also the patch makes sure to clear the
cell->next pointer at snd_seq_fifo_event_in() for avoiding a similar
mess-up of the FIFO linked list.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 15c75b09f8 upstream.
Currently ctxfi driver tries to set only the 64bit DMA mask on 64bit
architectures, and bails out if it fails. This causes a problem on
some platforms since the 64bit DMA isn't always guaranteed. We should
fall back to the default 32bit DMA when 64bit DMA fails.
Fixes: 6d74b86d3c ("ALSA: ctxfi - Allow 64bit DMA")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 71321eb3f2 upstream.
When a user sets too small a tick value with a fine-grained timer like
hrtimer, the kernel tries to fire the timer irq too frequently.
This may lead to heavily contended locks and eventually to a kernel
spinlock lockup with warnings.
For avoiding such a situation, we define a lower limit of the
resolution, namely 1ms. When the user passes a too small tick value
that results in less than that, the kernel returns -EINVAL now.
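A minimal sketch of the lower-bound check; how the resolution (in nanoseconds) is obtained is omitted here and assumed:

    /* Reject tick counts whose period would be shorter than 1 ms. */
    if (params.ticks < 1 ||
        (u64)params.ticks * resolution < 1000000)   /* 1 ms in ns */
            return -EINVAL;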
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7480b34ad upstream.
Like for Sunrise Point, the total stream number of Lewisburg's
input and output streams exceeds 15 (GCAP is 0x9701), which will
cause some streams to not work because of an overflow in the
SDxCTL.STRM field when using the legacy stream tag allocation method.
Fixes: 5cf92c8b3d ("ALSA: hda - Add Intel Lewisburg device IDs Audio")
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f1bc2c4c5 upstream.
The issue is the same as "dd9aa335c880 ALSA: hda/realtek - Can't adjust
speaker's volume on a Dell AIO": the output needs to be connected to a node
with Amp-out capability.
Applying the same fixup "ALC298_FIXUP_SPK_VOLUME" can fix the issue.
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef8d02d4a2 upstream.
Enable DMA on usart3 to get a more reliable console. This is especially
useful for automation and kernelci, where a kernel with PROVE_LOCKING enabled
is quite susceptible to character loss, resulting in test failures.
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 379f831a92 upstream.
Commit a92e7c3d82 ("spi: s3c64xx: consider the case when the CS
line is not connected") introduced an inconsistency between the
binding, where the disconnected CS line was marked as
'no-cs-readback', and the driver.
The driver is erroneously checking for that attribute with
property name of 'broken-cs'.
Check for 'no-cs-readback' in the driver as well.
Fixes: a92e7c3d82 ("spi: s3c64xx: consider the case when the CS line is not connected")
Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98d85f3cb9 upstream.
When the functions replaced media entity types, the range which was
allowed for the types was incorrect. This meant that media entity types
for specific devices were not passed correctly to the userspace through
MEDIA_IOC_ENUM_ENTITIES. Fix it.
Fixes: commit b2cd27448b ("[media] media-device: map new functions into old types for legacy API")
Reported-and-tested-by: Antti Laakso <antti.laakso@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0ffb94b6cc upstream.
Setting GPIOs during probe causes a null pointer dereference when
GPIOLIB is not selected by Kconfig. Initialize the driver private
field before calling set gpios.
It is a regression introduced in 4.9.
Fixes: 07fdf7d9f1 ("[media] cxd2820r: add I2C driver bindings")
Reported-by: Chris Rankin <rankincj@gmail.com>
Tested-by: Chris Rankin <rankincj@gmail.com>
Tested-by: Håkan Lennestål <hakan.lennestal@gmail.com>
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ebf75774f upstream.
In vpfe_s_fmt(), when the sensor format and the requested format were
the same, bpp was assigned to vpfe->bpp without being initialized first.
Grab the bpp value that is currently used by using __vpfe_get_format()
instead of its wrapper, vpfe_try_fmt().
This use of uninitialized variable has been found by compiling the
kernel with clang.
Fixes: 417d2e507e ("[media] media: platform: add VPFE capture driver support for AM437X")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb9bc4689b upstream.
get_frame_info() calculates the offset of the return address within a
stack frame simply by dividing the bottom 16 bits of the instruction,
treated as a signed integer, by the size of a long. Whilst this works
for MIPS32 & MIPS64 ISAs where the sw or sd instructions are used, it's
incorrect for microMIPS where encodings differ. The result is that we
typically completely fail to unwind the stack on microMIPS.
Fix this by adjusting is_ra_save_ins() to calculate the return address
offset, and take into account the various different encodings there in
the same place as we consider whether an instruction is storing the
ra/$31 register.
With this we are now able to unwind the stack for kernels targeting the
microMIPS ISA, for example we can produce:
Call Trace:
[<80109e1f>] show_stack+0x63/0x7c
[<8011ea17>] __warn+0x9b/0xac
[<8011ea45>] warn_slowpath_fmt+0x1d/0x20
[<8013fe53>] register_console+0x43/0x314
[<8067c58d>] of_setup_earlycon+0x1dd/0x1ec
[<8067f63f>] early_init_dt_scan_chosen_stdout+0xe7/0xf8
[<8066c115>] do_early_param+0x75/0xac
[<801302f9>] parse_args+0x1dd/0x308
[<8066c459>] parse_early_options+0x25/0x28
[<8066c48b>] parse_early_param+0x2f/0x38
[<8066e8cf>] setup_arch+0x113/0x488
[<8066c4f3>] start_kernel+0x57/0x328
---[ end trace 0000000000000000 ]---
Whereas previously we only produced:
Call Trace:
[<80109e1f>] show_stack+0x63/0x7c
---[ end trace 0000000000000000 ]---
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14532/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6c7a324df upstream.
get_frame_info() is meant to iterate over up to the first 128
instructions within a function, but for microMIPS kernels it will not
reach that many instructions unless the function is 512 bytes long since
we calculate the maximum number of instructions to check by dividing the
function length by the 4 byte size of a union mips_instruction. In
microMIPS kernels this won't do since instructions are variable length.
Fix this by instead checking whether the pointer to the current
instruction has reached the end of the function, and use max_insns as a
simple constant to check the number of iterations against.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14530/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3552dace7 upstream.
During stack unwinding we call a number of functions to determine what
type of instruction we're looking at. The union mips_instruction pointer
provided to them may be pointing at a 2 byte, but not 4 byte, aligned
address & we thus cannot directly access the 4 byte wide members of the
union mips_instruction. To avoid this is_ra_save_ins() copies the
required half-words of the microMIPS instruction to a correctly aligned
union mips_instruction on the stack, which it can then access safely.
The is_jump_ins() & is_sp_move_ins() functions do not correctly perform
this temporary copy, and instead attempt to directly dereference 4 byte
fields which may be misaligned and lead to an address exception.
Fix this by copying the instruction halfwords to a temporary union
mips_instruction in get_frame_info() such that we can provide a 4 byte
aligned union mips_instruction to the is_*_ins() functions and they do
not need to deal with misalignment themselves.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14529/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccaf7caf2c upstream.
get_frame_info() can be called in microMIPS kernels with the ISA bit
already clear. For example this happens when unwind_stack_by_address()
is called because we begin with a PC that has the ISA bit set & subtract
the (odd) offset from the preceding symbol (which does not have the ISA
bit set). Since get_frame_info() unconditionally subtracts 1 from the PC
in microMIPS kernels it incorrectly misaligns the address it then
attempts to access code at, leading to an address error exception.
Fix this by using msk_isa16_mode() to clear the ISA bit, which allows
get_frame_info() to function regardless of whether it is provided with a
PC that has the ISA bit set or not.
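A minimal sketch of the change as described above (simplified from get_frame_info()):

    /* Previously the microMIPS path did "ip = (void *)info->func - 1;",
     * which misaligned the pointer when the ISA bit was already clear.
     * Masking the ISA bit works in either case. */
    union mips_instruction *ip;

    ip = (void *)msk_isa16_mode((ulong)info->func);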
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 34c2f668d0 ("MIPS: microMIPS: Add unaligned access support.")
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14528/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 774f0c6419 upstream.
Disabling ethernet during reboot (only to enable it again when the
ethernet driver attaches) can put the chip into a faulty state where it
corrupts the header of all incoming packets.
This happens if packets arrive during the time window where the core is
disabled, and it can be easily reproduced by rebooting while sending a
flood ping to the broadcast address.
Fixes: 95135bfa7e ("MIPS: Lantiq: Deactivate most of the devices by default")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Acked-by: John Crispin <john@phrozen.org>
Cc: hauke.mehrtens@lantiq.com
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/15078/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 884b426917 upstream.
If copy_from_user is called with a large buffer (>= 128 bytes) and the
userspace buffer refers partially to unreadable memory, then it is
possible for Octeon's copy_from_user to report the wrong number of bytes
have been copied. In the case where the buffer size is an exact multiple
of 128 and the fault occurs in the last 64 bytes, copy_from_user will
report that all the bytes were copied successfully but leave some
garbage in the destination buffer.
The bug is in the main __copy_user_common loop in octeon-memcpy.S where
in the middle of the loop, src and dst are incremented by 128 bytes. The
l_exc_copy fault handler is used after this but that assumes that
"src < THREAD_BUADDR($28)". This is not the case if src has already been
incremented.
Fix by adding an extra fault handler which rewinds the src and dst
pointers 128 bytes before falling through to l_exc_copy.
Thanks to the pwritev test from the strace test suite for originally
highlighting this bug!
Fixes: 5b3b16880f ("MIPS: Add Cavium OCTEON processor support ...")
Signed-off-by: James Cowgill <James.Cowgill@imgtec.com>
Acked-by: David Daney <david.daney@cavium.com>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14978/
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 66fd848cad upstream.
For certain arguments such as saddr = 0xc0a8fd60, daddr = 0xc0a8fda1,
len = 80, proto = 17, sum = 0x7eae049d there will be a carry when
folding the intermediate 64 bit checksum to 32 bit but the code doesn't
add the carry back to the one's complement sum, thus an incorrect result
will be generated.
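The actual fix is in MIPS assembly (csum_tcpudp_nofold), but the arithmetic it implements can be sketched in C:

    /* Fold a 64-bit sum to 32 bits, adding any carry back into the
     * one's complement sum instead of dropping it. */
    static inline u32 csum_fold64(u64 sum)
    {
            sum = (sum & 0xffffffffULL) + (sum >> 32);
            sum += sum >> 32;       /* the carry that was previously lost */
            return (u32)sum;
    }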
Reported-by: Mark Zhang <bomb.zhang@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
https://github.com/raspberrypi/linux/issues/1447
port_parameter_get() failed to account for the header
(u32 id and u32 size) in the size before memcpying
the response into the response buffer, so overrunning
the provided buffer by 8 bytes.
Account for those bytes, and also add a belt-and-braces
check to ensure we never copy more than *value_size
bytes into value.
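Illustrative sketch only (field and variable names are hypothetical): subtract the 8-byte header from the reply size and clamp to the caller's buffer before copying:

    /* reply_size includes the u32 id and u32 size header fields */
    size_t payload = reply_size - 2 * sizeof(u32);

    if (payload > *value_size)
            payload = *value_size;          /* belt-and-braces clamp */
    memcpy(value, reply_payload, payload);
    *value_size = payload;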
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
Commit 488f9bc8e3 slightly increased the
reported rate of PLLD, so the clk driver decided that PLLD/3/8 was now
higher than our requested pixel clock rate and rejected it in favor of
PLLD/4/8, which then ran the pixel clock way out of spec.
By bumping the requested clock rate just slightly, we get back to
PLLD/3/8 like we wanted and the panel displays content again.
Signed-off-by: Eric Anholt <eric@anholt.net>
Commonly used desktop environments such as xfce4 and gnome
on debian sid can flood the graphics drivers with cursor
updates. Because the current implementation is waiting
for a vblank between cursor updates, this will cause the
display to hang for a long time since a typical refresh
rate is only 60Hz.
This is unnecessary and unexpected by user mode software,
so simply swap out the cursor frame buffer without waiting.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
The VPU is responsible for managing the core clock, usually under
direction from the bcm2835-cpufreq driver but not via the clk-bcm2835
driver. Since the core frequency can change without warning, it is
safer to report the maximum clock rate to users of the core clock -
I2C, SPI and the mini UART - to err on the safe side when calculating
clock divisors.
If the DT node for the clock driver includes a reference to the
firmware node, use the firmware API to query the maximum core clock
instead of reading the divider registers.
Prior to this patch, a "100KHz" I2C bus was sometimes clocked at about
160KHz. In particular, switching to the 4.9 kernel was likely to break
SenseHAT usage on a Pi3.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add DT support for the Pi Zero W. N.B. It will not be loaded
automatically without a corresponding change to the firmware.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Pi3 and Compute Module 3 have a GPIO expander that the
VPU communicates with.
There is a mailbox service that now allows control of this
expander, so add a kernel driver that can make use of it.
Pwr_led node added to device-tree for Pi3.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
H264 headers come off VC with 0 timestamps, which means they get a
strange timestamp when processed with VC/kernel start times,
particularly if used with the inline header option.
Remember the last frame timestamp and use that if set, or otherwise
use the kernel start time.
https://github.com/raspberrypi/linux/issues/1836
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>
The UART clock is initialised to be as close to the requested
frequency as possible without exceeding it. Now that there is a
clock manager that returns the actual frequencies, an expected
48MHz clock is reported as 47999625. If the requested baudrate
== requested clock/16, there is no headroom and the slight
reduction in actual clock rate results in failure.
Detect cases where it looks like a "round" clock was chosen and
adjust the reported clock to match that "round" value. As the
code comment says:
/*
* If increasing a clock by less than 0.1% changes it
* from ..999.. to ..000.., round up.
*/
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
If a clock has the prediv flag set, both the integer and fractional
parts must be scaled when calculating the resulting frequency.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Fe-Pi Audio Sound Card is based on NXP SGTL5000 codec.
Mechanical specification of the board is the same the Raspberry Pi Zero.
3.5mm jacks for Headphone/Mic, Line In, and Line Out.
Signed-off-by: Henry Kupis <fe-pi@cox.net>
commit fa7f138ac4 upstream.
The buffered write failure handling code in
xfs_file_iomap_end_delalloc() has a couple minor problems. First, if
written == 0, start_fsb is not rounded down and it fails to kill off a
delalloc block if the start offset is block unaligned. This results in a
lingering delalloc block and broken delalloc block accounting detected
at unmount time. Fix this by rounding down start_fsb in the unlikely
event that written == 0.
Second, it is possible for a failed overwrite of a delalloc extent to
leave dirty pagecache around over a hole in the file. This is because it is
possible to hit ->iomap_end() on write failure before the iomap code has
attempted to allocate pagecache, and thus has no need to clean it up. If
the targeted delalloc extent was successfully written by a previous
write, however, then it does still have dirty pages when ->iomap_end()
punches out the underlying blocks. This ultimately results in writeback
over a hole. To fix this problem, unconditionally punch out the
pagecache from XFS before the associated delalloc range.
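A simplified sketch of the two changes (rounding start_fsb down unconditionally and punching the page cache before the delalloc blocks); error handling and surrounding context are omitted:

    struct xfs_mount *mp = ip->i_mount;
    xfs_fileoff_t start_fsb = XFS_B_TO_FSBT(mp, offset);   /* round down */
    xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + count);
    int error = 0;

    if (start_fsb < end_fsb) {
            /* Drop any dirty page cache over the range first, so no
             * dirty pages survive over the punched-out hole. */
            truncate_pagecache_range(VFS_I(ip), XFS_FSB_TO_B(mp, start_fsb),
                                     XFS_FSB_TO_B(mp, end_fsb) - 1);
            error = xfs_bmap_punch_delalloc_range(ip, start_fsb,
                                                  end_fsb - start_fsb);
    }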
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 575ddce050 upstream.
In the function rtl_usb_start we pre-allocate a certain number of urbs
for RX path but they will not be freed when calling rtl_usb_stop. This
results in leaking urbs when doing ifconfig up and down. Eventually,
the system has no available urbs.
Signed-off-by: Michael Schenk <michael.schenk@albis-elcon.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f478e4ea5 upstream.
When !CONFIG_CGROUP_WRITEBACK, bdi has single bdi_writeback_congested
at bdi->wb_congested. cgwb_bdi_init() allocates it with kzalloc() and
doesn't do further initialization. This usually works fine as the
reference count gets bumped to 1 by wb_init() and the put from
wb_exit() releases it.
However, when wb_init() fails, it puts the wb base ref automatically
freeing the wb and the explicit kfree() in cgwb_bdi_init() error path
ends up trying to free the same pointer the second time causing a
double-free.
Fix it by explicitly initializing the refcnt to 1 and putting the base
ref from cgwb_bdi_destroy().
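A minimal sketch of the !CONFIG_CGROUP_WRITEBACK variant described above (simplified; the remaining init/teardown is elided):

    static int cgwb_bdi_init(struct backing_dev_info *bdi)
    {
            bdi->wb_congested = kzalloc(sizeof(*bdi->wb_congested), GFP_KERNEL);
            if (!bdi->wb_congested)
                    return -ENOMEM;
            /* Hold an explicit base reference for the bdi itself... */
            atomic_set(&bdi->wb_congested->refcnt, 1);
            /* ... wb_init() etc. as before ... */
            return 0;
    }

    static void cgwb_bdi_destroy(struct backing_dev_info *bdi)
    {
            /* ...and drop it here instead of an unconditional kfree(),
             * which could double-free after a failed wb_init(). */
            wb_congested_put(bdi->wb_congested);
    }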
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: a13f35e871 ("writeback: don't embed root bdi_writeback_congested in bdi_writeback")
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffab9188e4 upstream.
ACPICA commit b59347d0b8b676cb555fe8da5cad08fcd4eeb0d3
The following commit cleans up compiler specific inclusions:
Commit: 9fa1cebdbf
Subject: ACPICA: OSL: Cleanup the inclusion order of the compiler-specific headers
But breaks one thing due to the following old issue:
Building the Linux kernel with the Intel compiler originally depends on acgcc.h,
not acintel.h.
So after making the Intel compiler build work in ACPICA upstream by
correctly using acintel.h, it became impossible to build the Linux kernel with
the Intel compiler, as there is no acintel.h in the kernel source tree.
This patch releases acintel.h to Linux kernel and fixes its inclusion in
acenv.h.
Fixes: 9fa1cebdbf (ACPICA: OSL: Cleanup the inclusion order of the compiler-specific headers)
Link: https://github.com/acpica/acpica/commit/b59347d0
Tested-by: Stepan M Mishura <stepan.m.mishura@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dfe75ff8ca upstream.
Commit 3bb398d925 ("netfilter: nf_ct_helper: disable automatic helper
assignment") is causing behavior regressions in firewalls, as traffic
handled by conntrack helpers is now by default not passed through even
though it was before due to missing CT targets (which were not necessary
before this commit).
The default had to be switched off due to security reasons [1] [2] and
therefore should stay the way it is, but let's be friendly to firewall
admins and issue a warning the first time we're in a situation where a packet
would likely have been passed through with the old default but is now likely
going to be dropped on the floor.
Rewrite the code a little bit as suggested by Linus, so that we avoid
spaghettiing the code even more -- namely the whole decision making
process regarding helper selection (either automatic or not) is being
separated, so that the whole logic can be simplified and code (condition)
duplication reduced.
[1] https://cansecwest.com/csw12/conntrack-attack.pdf
[2] https://home.regit.org/netfilter-en/secure-use-of-helpers/
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6cf18e6927 upstream.
This interrupt handler is broken in several ways:
- It loops forever when the op code is not decodeable
- It never returns IRQ_HANDLED because the only way to exit the loop
returns IRQ_NONE unconditionally.
The whole concept of this is broken. Creating devices in an interrupt
handler is beyond any point of sanity.
Make it at least behave halfway sane so accidental users do not have to
deal with a hard to debug lockup.
Fixes: e809c22b8f ("goldfish: add the goldfish virtual bus")
Reported-by: Gabriel C <nix.or.die@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 47512cfd0d upstream.
The goldfish platform code registers the platform device unconditionally
which causes havoc in several ways if the goldfish_pdev_bus driver is
enabled:
- Access to the hardcoded physical memory region, which is either not
available or contains stuff which is completely unrelated.
- Prevents the interrupt of the serial port from being requested
- In case of a spurious interrupt it goes into an infinite loop in the
interrupt handler of the pdev_bus driver (which needs to be fixed
separately).
Add a 'goldfish' command line option to make the registration opt-in when
the platform is compiled in.
I'm seriously grumpy about this engineering trainwreck, which has seven
SOBs from Intel developers for 50 lines of code. And none of them figured
out that this is broken. Impressive fail!
Fixes: ddd70cf93d ("goldfish: platform device for x86")
Reported-by: Gabriel C <nix.or.die@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14816b16fa upstream.
Since commit 4a51096937 ("tty: Make tty_files_lock per-tty") a new
tty_struct spin lock is taken in the tty release path, but the
USB-serial-console hack was never updated hence leaving the lock of its
"fake" tty uninitialised. This was eventually detected by lockdep.
Make sure to initialise the new lock also for the fake tty to address
this regression.
Yes, this code is a mess, but cleaning it up is left for another day.
Fixes: 4a51096937 ("tty: Make tty_files_lock per-tty")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9fef37d7cf upstream.
The current implementation failed to detect short transfers, something
which could lead to bits of the uninitialised heap transfer buffer
leaking to user space.
Fixes: 149fc791a4 ("USB: ark3116: Setup some basic infrastructure for new ark3116 driver.")
Fixes: f4c1e8d597 ("USB: ark3116: Make existing functions 16450-aware and add close and release functions.")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2eee05020a upstream.
The opticon driver used a control request at open to trigger a CTS
status notification to be sent over the bulk-in pipe. When the driver
was converted to using the generic read implementation, an inverted test
prevented this request from being sent, something which could lead to
TIOCMGET reporting an incorrect CTS state.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 7a6ee2b027 ("USB: opticon: switch to generic read implementation")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5ed8d41023 upstream.
Make sure to detect short control transfers and return zero on success
when retrieving the modem status.
This fixes the TIOCMGET implementation which since e1ed212d85 ("USB:
spcp8x5: add proper modem-status support") has returned TIOCM_LE on
successful retrieval, and avoids leaking bits from the stack on short
transfers.
This also fixes the carrier-detect implementation which since the above
mentioned commit unconditionally has returned true.
Fixes: e1ed212d85 ("USB: spcp8x5: add proper modem-status support")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6bb1e17a3 upstream.
FTDI devices use a receive latency timer to periodically empty the
receive buffer and report modem and line status (also when the buffer is
empty).
When a break or error condition is detected the corresponding status
flags will be set on a packet with nonzero data payload and the flags
are not updated until the break is over or further characters are
received.
In order to avoid over-reporting break and error conditions, these flags
must therefore only be processed for packets with payload.
This specifically fixes the case where after an overrun, the error
condition is continuously reported and NULL-characters inserted until
further data is received.
Reported-by: Michael Walle <michael@walle.cc>
Fixes: 72fda3ca6f ("USB: serial: ftd_sio: implement sysrq handling on break")
Fixes: 166ceb6907 ("USB: ftdi_sio: clean up line-status handling")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c6dce26266 upstream.
Since commit 557aaa7ffa ("ft232: support the ASYNC_LOW_LATENCY
flag") the FTDI driver has been using a receive latency-timer value of
1 ms instead of the device default of 16 ms.
The latency timer is used to periodically empty a non-full receive
buffer, but a status header is always sent when the timer expires
including when the buffer is empty. This means that a two-byte bulk
message is received every millisecond also for an otherwise idle port as
long as it is open.
Let's restore the pre-2009 behaviour which reduces the rate of the
status messages to 1/16th (e.g. interrupt frequency drops from 1 kHz to
62.5 Hz) by not setting ASYNC_LOW_LATENCY by default.
Anyone willing to pay the price for the minimum-latency behaviour should
set the flag explicitly instead using the TIOCSSERIAL ioctl or a tool
such as setserial (e.g. setserial /dev/ttyUSB0 low_latency).
Note that since commit 0cbd81a9f6 ("USB: ftdi_sio: remove
tty->low_latency") the ASYNC_LOW_LATENCY flag has no other effects but
to set a minimal latency timer.
Reported-by: Antoine Aubert <a.aubert@overkiz.com>
Fixes: 557aaa7ffa ("ft232: support the ASYNC_LOW_LATENCY flag")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 427c3a95e3 upstream.
Make sure to detect short responses when fetching the modem status in
order to avoid parsing uninitialised buffer data and having bits of it
leak to user space.
Note that we still allow for short 1-byte responses.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5182c2cf2a upstream.
Fix another NULL-pointer dereference at open should a malicious device
lack an interrupt-in endpoint.
Note that the driver has a broken check for an interrupt-in endpoint
which means that an interrupt URB has never even been submitted.
Fixes: 3f5429746d ("USB: Moschip 7840 USB-Serial Driver")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abe81f3b8e upstream.
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/tty/serial/msm_serial.ko | grep alias
$
After this patch:
$ modinfo drivers/tty/serial/msm_serial.ko | grep alias
alias: of:N*T*Cqcom,msm-uartdmC*
alias: of:N*T*Cqcom,msm-uartdm
alias: of:N*T*Cqcom,msm-uartC*
alias: of:N*T*Cqcom,msm-uart
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e623a9e9de ]
Commit 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path"),
changed the exit path of recvmmsg to always return the datagrams
variable and modified the error paths to set the variable to the error
code returned by recvmsg if necessary.
However in the case sock_error returned an error, the error code was
then ignored, and recvmmsg returned 0.
Change the error path of recvmmsg to correctly return the error code
of sock_error.
The bug was triggered by using recvmmsg on a CAN interface which was
not up. Linux 4.6 and later return 0 in this case while earlier
releases returned -ENETDOWN.
Fixes: 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path")
Signed-off-by: Maxime Jayat <maxime.jayat@mobile-devices.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ca4ef4574f ]
The skbs processed by ip_cmsg_recv() are not guaranteed to
be linear e.g. when sending UDP packets over loopback with
MSGMORE.
Using csum_partial() on [potentially] the whole skb len
is dangerous; instead be on the safe side and use skb_checksum().
Thanks to syzkaller team to detect the issue and provide the
reproducer.
v1 -> v2:
- move the variable declaration in a tighter scope
Fixes: ad6f939ab1 ("ip: Add offset parameter to ip_cmsg_recv")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e716953071 ]
Resizing currently drops consumer lock. This can cause entries to be
reordered, which isn't good in itself. More importantly, consumer can
detect a false ring empty condition and block forever.
Further, nesting of consumer within producer lock is problematic for
tun, since it produces entries in a BH, which causes a lock order
reversal:
CPU0                                    CPU1
----                                    ----
consume:
lock(&(&r->consumer_lock)->rlock);
                                        resize:
                                        local_irq_disable();
                                        lock(&(&r->producer_lock)->rlock);
                                        lock(&(&r->consumer_lock)->rlock);
<Interrupt>
produce:
lock(&(&r->producer_lock)->rlock);
To fix, nest producer lock within consumer lock during resize,
and keep consumer lock during the whole swap operation.
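A minimal sketch of the resulting lock nesting during resize (simplified from the ptr_ring resize path; the swap helper is assumed to exchange the queue array under both locks):

    unsigned long flags;
    void **old;

    /* Hold the consumer lock across the whole swap, and nest the
     * producer lock inside it, matching the BH producer's ordering. */
    spin_lock_irqsave(&r->consumer_lock, flags);
    spin_lock(&r->producer_lock);
    old = __ptr_ring_swap_queue(r, queue, size, gfp, destroy);
    spin_unlock(&r->producer_lock);
    spin_unlock_irqrestore(&r->consumer_lock, flags);
    kfree(old);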
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4c03b862b1 ]
A nested lock depth was added to the hashbin_delete() code but it
doesn't actually work so well and results in tons of lockdep splats.
Fix the code instead to properly drop the lock around the operation
and just keep peeking the head of the hashbin queue.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 22f0708a71 ]
Since the commit 0c1d70af92 ("net: use dst_cache for vxlan device")
vxlan_fill_metadata_dst() calls vxlan_get_route() passing a NULL
dst_cache pointer, so the latter should explicitly check for
valid dst_cache ptr. Unfortunately the commit d71785ffc7 ("net: add
dst_cache to ovs vxlan lwtunnel") removed said check.
As a result it is possible to trigger a null pointer access calling
vxlan_fill_metadata_dst(), e.g. with:
ovs-vsctl add-br ovs-br0
ovs-vsctl add-port ovs-br0 vxlan0 -- set interface vxlan0 \
type=vxlan options:remote_ip=192.168.1.1 \
options:key=1234 options:dst_port=4789 ofport_request=10
ip address add dev ovs-br0 172.16.1.2/24
ovs-vsctl set Bridge ovs-br0 ipfix=@i -- --id=@i create IPFIX \
targets=\"172.16.1.1:1234\" sampling=1
iperf -c 172.16.1.1 -u -l 1000 -b 10M -t 1 -p 1234
This commit addresses the issue passing to vxlan_get_route() the
dst_cache already available into the lwt info processed by
vxlan_fill_metadata_dst().
Fixes: d71785ffc7 ("net: add dst_cache to ovs vxlan lwtunnel")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5edabca9d4 ]
In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet
is forcibly freed via __kfree_skb in dccp_rcv_state_process if
dccp_v6_conn_request successfully returns.
However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb
is saved to ireq->pktopts and the ref count for skb is incremented in
dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed
in dccp_rcv_state_process.
Fix by calling consume_skb instead of doing goto discard and therefore
calling __kfree_skb.
Similar fixes for TCP:
fb7e2399ec [TCP]: skb is unexpectedly freed.
0aea76d35c tcp: SYN packets are now simply consumed
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7627ae6030 ]
When setting a neigh related sysctl parameter, we always send a
NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
executing
sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000
a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.
This is caused by commit 2a4501ae18 ("neigh: Send a
notification when DELAY_PROBE_TIME changes"). According to the
commit's description, it was intended to generate such an event
when setting the "delay_first_probe_time" sysctl parameter.
In order to fix this, only generate this event when actually
setting the "delay_first_probe_time" sysctl parameter. This fix
should not have any unintended side-effects, because all but one
registered netevent callbacks check for other netevent event
types (the registered callbacks were obtained by grepping for
"register_netevent_notifier"). The only callback that uses the
NETEVENT_DELAY_PROBE_TIME_UPDATE event is
mlxsw_sp_router_netevent_event() (in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
of this event, it only accesses the DELAY_PROBE_TIME of the
passed neigh_parms.
Fixes: 2a4501ae18 ("neigh: Send a notification when DELAY_PROBE_TIME changes")
Signed-off-by: Marcus Huewe <suse-tux@gmx.de>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d199fab63c ]
Multiple threads can call fanout_add() at the same time.
We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po->rollover that was set by another thread.
Do the same in fanout_release(), for peace of mind, and to help us
finding lockdep issues earlier.
Fixes: dc99f60069 ("packet: Add fanout support.")
Fixes: 0648ab70af ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a60ced990e ]
There is a copy-paste error which hides a resume breakage in the
CPSW driver: netdev_priv() was replaced with ndev_to_cpsw(ndev)
in suspend, but was left unchanged in resume.
Fixes: 606f399395 (ti: cpsw: move platform data and slaves info to cpsw_common)
Reported-by: Alexey Starikovskiy <AStarikovskiy@topcon.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8b74d439e1 ]
It seems nobody used LLC since linux-3.12.
Fortunately fuzzers like syzkaller still know how to run this code,
otherwise it would be no fun.
Setting skb->sk without skb->destructor leads to all kinds of
bugs, we now prefer to be very strict about it.
Ideally here we would use skb_set_owner() but this helper does not exist yet,
only CAN seems to have a private helper for that.
Fixes: 376c7311bd ("net: add a temporary sanity check in skb_orphan()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fed06ee89b ]
When called by HW offloading drivers, the TC action (e.g
net/sched/act_mirred.c) code uses this_cpu logic, e.g
_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)
per the kernel documentation, preemption should be disabled, so add that.
Before the fix, when running with CONFIG_PREEMPT set, we get a
BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793
assertion from the TC action (mirred) stats_update callback.
Fixes: aad7e08d39 ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code responsible for splitting periods into chunks that
can be handled by the DMA controller failed to update total_len,
the number of bytes processed in the current period, when there
are more chunks to follow.
Therefore total_len was stuck at 0 and the code didn't work at all.
This resulted in a wrong control block layout and audio issues because
the cyclic DMA callback wasn't executing on period boundaries.
Fix this by adding the missing total_len update.
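An illustrative sketch of the bookkeeping (variable and constant names are hypothetical):

    /* Split one period into controller-sized chunks; total_len tracks how
     * far into the current period we are, so the period-boundary interrupt
     * flag ends up on the right control block. */
    while (len > 0) {
            size_t chunk = min_t(size_t, len, MAX_LITE_TRANSFER);

            /* ... fill in the control block for this chunk ... */
            addr += chunk;
            len -= chunk;
            total_len += chunk;     /* previously missing for non-final chunks */
    }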
Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
commit 35879ee476 upstream.
This reverts 'commit 7e0739cd9c ("[media] videodev2.h: fix
sYCC/AdobeYCC default quantization range").
The problem is that many drivers can convert R'G'B' content (often
from sensors) to Y'CbCr, but they all produce limited range Y'CbCr.
To stay backwards compatible the default quantization range for
sRGB and AdobeRGB Y'CbCr encoding should be limited range, not full
range, even though the corresponding standards specify full range.
Update the V4L2_MAP_QUANTIZATION_DEFAULT define accordingly and
also update the documentation.
Fixes: 7e0739cd9c ("[media] videodev2.h: fix sYCC/AdobeYCC default quantization range")
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9644347c52 upstream.
In the normal I/O execution path, ntb_perf is missing a call to
dmaengine_unmap_put() after submission. That causes us to leak
unmap objects.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 8a7b6a77 ("ntb: ntb perf tool")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd62245e73 upstream.
The call to debugfs_remove_recursive(qp->debugfs_dir) of the sub-level
directory must not be later than
debugfs_remove_recursive(nt_debugfs_dir) of the top-level directory.
Otherwise, the sub-level directory will not exist, and it would be
invalid (panic) to attempt to remove it. This removes the top-level
directory last, after sub-level directories have been cleaned up.
Signed-off-by: Allen Hubbe <Allen.Hubbe@dell.com>
Fixes: e26a5843f ("NTB: Split ntb_hw_intel and ntb_transport drivers")
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc98c3c8c9 upstream.
Use rcuidle console tracepoint because, apparently, it may be issued
from an idle CPU:
hw-breakpoint: Failed to enable monitor mode on CPU 0.
hw-breakpoint: CPU 0 failed to disable vector catch
===============================
[ ERR: suspicious RCU usage. ]
4.10.0-rc8-next-20170215+ #119 Not tainted
-------------------------------
./include/trace/events/printk.h:32 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
RCU used illegally from idle CPU!
rcu_scheduler_active = 2, debug_locks = 0
RCU used illegally from extended quiescent state!
2 locks held by swapper/0/0:
#0: (cpu_pm_notifier_lock){......}, at: [<c0237e2c>] cpu_pm_exit+0x10/0x54
#1: (console_lock){+.+.+.}, at: [<c01ab350>] vprintk_emit+0x264/0x474
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.10.0-rc8-next-20170215+ #119
Hardware name: Generic OMAP4 (Flattened Device Tree)
console_unlock
vprintk_emit
vprintk_default
printk
reset_ctrl_regs
dbg_cpu_pm_notify
notifier_call_chain
cpu_pm_exit
omap_enter_idle_coupled
cpuidle_enter_state
cpuidle_enter_state_coupled
do_idle
cpu_startup_entry
start_kernel
This RCU warning, however, is suppressed by lockdep_off() in printk().
lockdep_off() increments the ->lockdep_recursion counter and thus
disables RCU_LOCKDEP_WARN() and debug_lockdep_rcu_enabled(), which want
lockdep to be enabled ("current->lockdep_recursion == 0").
Link: http://lkml.kernel.org/r/20170217015932.11898-1-sergey.senozhatsky@gmail.com
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Tony Lindgren <tony@atomide.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Russell King <rmk@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12688dc21f upstream.
This reverts commit 63d0f0a695.
It caused a regression on platforms where I2C controller is synthesized
with dynamic TAR update disabled. Detection code is testing is bit
DW_IC_CON_10BITADDR_MASTER in register DW_IC_CON read-only but fails to
restore original value in case bit is read-write.
Instead of fixing this we revert the commit since it was preparation for
the commit 0317e6c0f1 ("i2c: designware: do not disable adapter after
transfer") which was also reverted.
Reported-by: Shah Nehal-Bakulchandra <Nehal-bakulchandra.Shah@amd.com>
Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-By: Lucas De Marchi <lucas.demarchi@intel.com>
Fixes: 63d0f0a695 ("i2c: designware: detect when dynamic tar update is possible")
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e34404818 upstream.
The 64-bit get_user() wasn't clearing the high word due to a typo in the
error handler. The exception handler entry was already correct, though.
Noticed during recent usercopy test additions in lib/test_user_copy.c.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25f71d1c3e upstream.
The UEVENT user mode helper is enabled before the initcalls are executed
and is available when the root filesystem has been mounted.
The user mode helper is triggered by device init calls and the executable
might use the futex syscall.
futex_init() is marked __initcall which maps to device_initcall, but there
is no guarantee that futex_init() is invoked _before_ the first device init
call which triggers the UEVENT user mode helper.
If the user mode helper uses the futex syscall before futex_init() then the
syscall crashes with a NULL pointer dereference because the futex subsystem
has not been initialized yet.
Move futex_init() to core_initcall so futexes are initialized before the
root filesystem is mounted and the usermode helper becomes available.
[ tglx: Rewrote changelog ]
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: jiang.biao2@zte.com.cn
Cc: jiang.zhengxiong@zte.com.cn
Cc: zhong.weidong@zte.com.cn
Cc: deng.huali@zte.com.cn
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 722c5ac708 upstream.
ELAN0605 has been confirmed to be a variant of ELAN0600, which is
blacklisted in the hid-core to be managed by elan_i2c. This device can be
found in Lenovo ideapad 310s (80U4000).
Signed-off-by: Hiroka IHARA <ihara_h@live.jp>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 137d01df51 upstream.
What happens is that a write to /dev/sg is given a request with non-zero
->iovec_count combined with zero ->dxfer_len. Or with ->dxferp pointing
to an array full of empty iovecs.
Having write permission to /dev/sg shouldn't be equivalent to the
ability to trigger BUG_ON() while holding spinlocks...
Found by Dmitry Vyukov and syzkaller.
[ The BUG_ON() got changed to a WARN_ON_ONCE(), but this fixes the
underlying issue. - Linus ]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd3fc0b4d7 upstream.
Don't crash the machine just because of an empty transfer. Use WARN_ON()
combined with returning an error.
Found by Dmitry Vyukov and syzkaller.
[ Changed to "WARN_ON_ONCE()". Al has a patch that should fix the root
cause, but a BUG_ON() is not acceptable in any case, and a WARN_ON()
might still be a cause of excessive log spamming.
NOTE! If this warning ever triggers, we may end up leaking resources,
since this doesn't bother to try to clean the command up. So this
WARN_ON_ONCE() triggering does imply real problems. But BUG_ON() is
much worse.
People really need to stop using BUG_ON() for "this shouldn't ever
happen". It makes pretty much any bug worse. - Linus ]
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: James Bottomley <jejb@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f91a89d42 upstream.
Currently, if the kernel is running on a POWER9 processor under a
hypervisor, it may try to use the radix MMU even though it doesn't have
the necessary code to do so (it doesn't negotiate use of radix, and it
doesn't do the H_REGISTER_PROC_TBL hcall). If the hypervisor supports
both radix and HPT, then it will set up the guest to use HPT (since the
guest doesn't request radix in the CAS call), but if the radix feature
bit is set in the ibm,pa-features property (which is valid, since
ibm,pa-features is defined to represent the capabilities of the
processor) the guest will try to use radix, resulting in a crash when
it turns the MMU on.
This makes the minimal fix for the current code, which is to disable
radix unless we are running in hypervisor mode.
Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d4ef32975 upstream.
Commit 577fb13199 ("mmc: rework selection of bus speed mode")
refactored bus width selection code to mmc_select_bus_width().
However, it also altered the behavior to not call the selection code in
non-high-speed modes anymore.
This causes 1-bit mode to always be used when the high-speed mode is not
enabled, even though 4-bit and 8-bit bus are valid bus widths in the
backwards-compatibility (legacy) mode as well (see e.g. 5.3.2 Bus Speed
Modes in JEDEC 84-B50). This results in a significant regression in
transfer speeds.
Fix the code to allow 4-bit and 8-bit widths even without high-speed
mode, as before.
Tested with a Zynq-7000 PicoZed 7020 board.
Fixes: 577fb13199 ("mmc: rework selection of bus speed mode")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ba4d2722d upstream.
There is a potential race between fuse_dev_do_write()
and request_wait_answer() contexts as shown below:
TASK 1:
__fuse_request_send():
|--spin_lock(&fiq->waitq.lock);
|--queue_request();
|--spin_unlock(&fiq->waitq.lock);
|--request_wait_answer():
|--if (test_bit(FR_SENT, &req->flags))
<gets pre-empted after it is validated true>
TASK 2:
fuse_dev_do_write():
|--clears bit FR_SENT,
|--request_end():
|--sets bit FR_FINISHED
|--spin_lock(&fiq->waitq.lock);
|--list_del_init(&req->intr_entry);
|--spin_unlock(&fiq->waitq.lock);
|--fuse_put_request();
|--queue_interrupt();
<request gets queued to interrupts list>
|--wake_up_locked(&fiq->waitq);
|--wait_event_freezable();
<as FR_FINISHED is set, it returns and then
the caller frees this request>
Now the next fuse_dev_do_read() sees that the interrupts list is not
empty, calls fuse_read_interrupt(), which tries to access the request
that has already been freed, and gets the crash below:
[11432.401266] Unable to handle kernel paging request at virtual address
6b6b6b6b6b6b6b6b
...
[11432.418518] Kernel BUG at ffffff80083720e0
[11432.456168] PC is at __list_del_entry+0x6c/0xc4
[11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
...
[11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
[11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
[11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
[11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
[11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
[11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94
Since the FR_FINISHED bit is set before intr_entry is deleted under the
input queue lock in the request completion path, test this flag and do
the queueing atomically under the same lock in queue_interrupt().
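A sketch of what testing and queueing atomically under the same lock looks
like (based on the description above rather than the exact fuse code):

static void queue_interrupt(struct fuse_iqueue *fiq, struct fuse_req *req)
{
        spin_lock(&fiq->waitq.lock);
        /* request_end() sets FR_FINISHED and unlinks intr_entry under
         * this same lock, so re-checking here closes the race window.
         */
        if (!test_bit(FR_FINISHED, &req->flags) &&
            list_empty(&req->intr_entry)) {
                list_add_tail(&req->intr_entry, &fiq->interrupts);
                wake_up_locked(&fiq->waitq);
        }
        spin_unlock(&fiq->waitq.lock);
}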
Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: fd22d62ed0 ("fuse: no fc->lock for iqueue parts")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f9c85ee671 upstream.
Reported as a Kaffeine bug:
https://bugs.kde.org/show_bug.cgi?id=375811
The USB control messages require DMA to work. We cannot pass
a stack-allocated buffer, as it is not guaranteed that the
stack lies in a DMA-capable area.
On Kernel 4.9, the default is to no longer accept DMA on the stack
on the x86 architecture. On other architectures, this has been a
requirement since Kernel 2.2. So, after this patch, this driver
should likely work fine on all archs.
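The usual pattern for this kind of fix is to bounce the data through a
kmalloc()ed buffer, for example (a sketch with illustrative names; the
driver has several such call sites):

static int read_reg_sketch(struct usb_device *udev, u8 request, u16 value,
                           u16 index, void *dest, u16 len)
{
        u8 *buf = kmalloc(len, GFP_KERNEL);     /* heap memory is DMA-capable */
        int ret;

        if (!buf)
                return -ENOMEM;

        ret = usb_control_msg(udev, usb_rcvctrlpipe(udev, 0),
                              request, USB_DIR_IN | USB_TYPE_VENDOR,
                              value, index, buf, len, 1000 /* ms */);
        if (ret > 0)
                memcpy(dest, buf, ret);         /* copy out for the caller */
        kfree(buf);
        return ret;
}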
Tested with USB ID 2040:5510: Hauppauge Windham
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a81e6a171 upstream.
Flags (PIPE_BUF_FLAG_PACKET, PIPE_BUF_FLAG_GIFT) could remain on the
unused part of the pipe ring buffer. Previously splice_to_pipe() left
the flags value alone, which could result in incorrect behavior.
Uninitialized flags appear to have been possible since the introduction
of the splice syscall.
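The fix amounts to starting each newly filled ring slot from a clean flags
value instead of whatever the slot held last time, e.g. (a sketch, not the
exact hunk):

buf = pipe->bufs + ((pipe->curbuf + pipe->nrbufs) & (pipe->buffers - 1));
buf->page = page;
buf->offset = offset;
buf->len = len;
buf->flags = 0;         /* don't inherit stale PACKET/GIFT bits */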
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The claim-clocks property can be used to prevent PLLs and dividers
from being marked as critical. It contains a vector of clock IDs,
as defined by dt-bindings/clock/bcm2835.h.
Use this mechanism to claim PLLD_DSI0, PLLD_DSI1, PLLH_AUX and
PLLH_PIX for the vc4_kms_v3d driver.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The VPU configures and relies on several PLLs and dividers. Mark all
enabled dividers and their PLLs as CRITICAL to prevent the kernel from
switching them off.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The brcmfmac WiFi driver always complains about the '00' country code
and the firmware version is reported as an error. Modify the driver to
ignore '00' silently and display firmware version at INFO level.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
With an ethernet node in the DT, a suitable firmware can populate the
local-mac-address property, removing the need for a downstream patch
to the driver to read its MAC address from a module parameter.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
If a CMA allocation failed, the partially constructed BO would be
unreferenced through the normal path, and we might choose to put it in
the BO cache. If we then reused it before it expired from the cache,
the kernel would OOPS.
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: c826a6e106 ("drm/vc4: Add a BO cache.")
The from_cache flag was actually "the BO is invisible to userspace",
so we can repurpose it to just zero out a cached BO and return it to
userspace.
Improves wall time for a loop of 5 glsl-algebraic-add-add-1 by
-1.44989% +/- 0.862891% (n=28, 1 outlier removed from each that
appeared to be other system noise)
Note that there's an intel-gpu-tools test to check for the proper
zeroing behavior here, which we continue to pass.
Signed-off-by: Eric Anholt <eric@anholt.net>
In the rewrite of vc4_crtc.c for fkms, I dropped the part of the
CRTC's atomic flush handler that moved the completion event from the
proposed atomic state change to the CRTC's current state. That meant
that when full screen pageflipping happened (glxgears -fullscreen in
X, compton, or weston), the app would end up blocked forever waiting
to draw its next frame.
Signed-off-by: Eric Anholt <eric@anholt.net>
The DSI0 and DSI1 blocks on the 2835 are related hardware blocks.
Some registers move around, and the featureset is slightly different,
as DSI1 (the 4-lane DSI) is a later version of the hardware block.
This driver doesn't yet enable DSI0, since we don't have any hardware
to test against, but it does put a lot of the register definitions and
code in place.
Signed-off-by: Eric Anholt <eric@anholt.net>
We have to set a different pixel format, which tells the hardware to
use the pix_width field that's fed in sideband from the DSI encoder to
divide the "pixel" clock.
Signed-off-by: Eric Anholt <eric@anholt.net>
We want the HVS on, obviously, and we also want DSP3 (PV1's source) to
be muxed from HVS channel 2 like we expect in vc4_crtc.c. The
firmware wasn't setting the DSP3 mux up when both the LCD and HDMI
were disabled.
Signed-off-by: Eric Anholt <eric@anholt.net>
Some generic TV connector properties are exposed in drm_mode_config, but
they are currently handled independently in each DRM encoder driver.
Extend the drm_connector_state to store TV related states, and modify the
drm_atomic_connector_{set,get}_property() helpers to fill the connector
state accordingly.
Each driver is then responsible for checking and applying the new config
in its ->atomic_mode_{check,set}() operations.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 299a16b163)
PV_CONTROL_CLK_SELECT_VEC is actually 2 and not 0. Fix the definition and
rework the vc4_set_crtc_possible_masks() to cover the full range of the
PV_CONTROL_CLK_SELECT field.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit ab8df60e3a)
There was a small window where a userspace program could submit
a pageflip after receiving a pageflip completion event yet still
receive EBUSY.
Signed-off-by: Derek Foreman <derekf@osg.samsung.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Daniel Stone <daniels@collabora.com>
(cherry picked from commit 26fc78f6fe)
FS threading brings performance improvements of 0-20% in glmark2.
The validation code checks for thread switch signals and ensures that
the registers of the other thread are not touched, and that our clamps
are not live across thread switches. It also checks that the
threading and branching instructions do not interfere.
(Original patch by Jonas, changes by anholt for style cleanup,
removing validation the kernel doesn't need to do, and adding the flag
for userspace).
v2: Minor style fixes from checkpatch.
Signed-off-by: Jonas Pfeil <pfeiljonas@gmx.de>
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit c778cc5df9)
The pm_runtime_put() we were using immediately released power on the
device, which meant that we were generally turning the device off and
on once per frame. In many profiles I've looked at, that added up to
about 1% of CPU time, but this could get worse in the case of frequent
rendering and readback (as may happen in X rendering). By keeping the
device on until we've been idle for a couple of frames, we drop the
overhead of runtime PM down to sub-.1%.
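One common way to get that behaviour is runtime-PM autosuspend; a rough
sketch (the 40 ms delay is illustrative, not necessarily the value the
driver uses):

#include <linux/pm_runtime.h>

static void power_init_sketch(struct device *dev)
{
        /* stay powered for a couple of frames of idleness before suspending */
        pm_runtime_set_autosuspend_delay(dev, 40);      /* ms */
        pm_runtime_use_autosuspend(dev);
}

static void job_done_sketch(struct device *dev)
{
        pm_runtime_mark_last_busy(dev);
        pm_runtime_put_autosuspend(dev);        /* was: pm_runtime_put(dev) */
}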
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 3a62234680)
The validation for it ends up being quite simple, but I hadn't got
around to it before merging the driver. For backwards compatibility,
we also need to add a flag so that the userspace GL driver can easily
tell if the kernel will allow ETC1 textures (on an old kernel, it will
continue to convert to RGBA8).
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 7154d76fed)
The loop is scanning until the original max_ip (size of the BO), but
we want to not examine any code after the PROG_END's delay slots.
There was a block trying to do that, except that we had some early
continue statements if the signal wasn't a PROG_END or a BRANCH.
The failure mode would be that a valid shader is rejected because some
undefined memory after the PROG_END slots is parsed as a branch and
the rest of its setup is illegal. I haven't seen this in the wild,
but valgrind was complaining about this in the userland
simulator mode.
Signed-off-by: Eric Anholt <eric@anholt.net>
(cherry picked from commit 457e67a728)
The modules stay disabled by default, and if you want to enable DSI
you'll need an overlay that connects a panel to it.
Signed-off-by: Eric Anholt <eric@anholt.net>
This driver communicates with the Atmel microcontroller for sequencing
the poweron of the TC358762 DSI-DPI bridge and controlling the
backlight PWM.
The following lines are required in config.txt, to keep the firmware
from trying to bash our I2C lines and steal the DSI interrupts:
disable_touchscreen=1
ignore_lcd=2
mask_gpu_interrupt1=0x1000
This means that the firmware won't power on the panel at boot time (no
rainbow) and the touchscreen input won't work. The native input
driver for the touchscreen still needs to be written.
v2: Set the same default orientation as the closed source firmware
used, which is the best for viewing angle.
Signed-off-by: Eric Anholt <eric@anholt.net>
This proved incredibly useful during debugging of the DSI driver, to
see if our clocks were running at the rate we requested. Let's leave it
here for the next person interacting with clocks on the platform (and
so that hopefully we can just hook it up to debugfs some day).
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
(cherry picked from commit 3f9195811d)
The DSI pixel clocks are muxed from clocks generated in the analog phy
by the DSI driver. In order to set them as parents, we need to do the
same name lookup dance on them as we do for our root oscillator.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
(cherry picked from commit 8a39e9fa57)
Our core PLLs are intended to be configured once and left alone. With
the SET_RATE_PARENT, asking to set the PLLD_DSI1 clock rate would
change PLLD just to get closer to the requested DSI clock, thus
changing PLLD_PER, the UART and ethernet PHY clock rates downstream of
it, and breaking ethernet.
We *do* want PLLH to change so that PLLH_AUX can be exactly the value
we want, though. Thus, we need to have a per-divider policy of
whether to pass rate changes up.
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
(cherry picked from commit 55486091bd)
The VEC clock needs to be set to exactly 108MHz. Allow rate
change propagation on PLLH_AUX to match this requirement without
impacting other IPs (PLLH is currently only used by the HDMI encoder,
which cannot be enabled when the VEC encoder is enabled).
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
(cherry picked from commit d86d46af84)
Some peripheral clocks, like the VEC (Video EnCoder) clock need to be set
to a precise rate (in our case 108MHz). With the current implementation,
where peripheral clocks are not allowed to forward rate change requests
to their parents, it is impossible to match this requirement unless the
bootloader has configured things correctly, or a specific rate has been
assigned through the DT (with the assigned-clk-rates property).
Add a new field to struct bcm2835_clock_data to specify which parent
clocks accept rate change propagation, and support set rate propagation
in bcm2835_clock_determine_rate().
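A sketch of the kind of per-clock data this adds (names and types here are
illustrative, not the exact driver definitions):

struct clock_data_sketch {
        const char *name;
        unsigned int num_mux_parents;
        /* bit n set => rate changes may be propagated to parent n */
        unsigned long set_rate_parent;
};

static int parent_accepts_rate_change(const struct clock_data_sketch *d,
                                      unsigned int parent_idx)
{
        return !!(d->set_rate_parent & (1UL << parent_idx));
}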
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
(cherry picked from commit 155e8b3b0e)
The kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes raspberrypi/linux#903
This patch fixes a problem with VFP state save and restore related
to exception handling (panic with message "BUG: unsupported FP
instruction in kernel mode") present on VFP11 floating point units
(as used with ARM1176JZF-S CPUs, e.g. on first generation Raspberry
Pi boards). This patch was developed and discussed on
https://github.com/raspberrypi/linux/issues/859
A precondition to see the crashes is that floating point exception
traps are enabled. In this case, the VFP11 might determine that an FPU
operation needs to trap at a point in time when it is not possible to
signal this to the ARM11 core any more. The VFP11 will then set the
FPEXC.EX bit and store the trapped opcode in FPINST. (In some cases,
a second opcode might have been accepted by the VFP11 before the
exception was detected and could be reported to the ARM11 - in this
case, the VFP11 also sets FPEXC.FP2V and stores the second opcode in
FPINST2.)
If FPEXC.EX is set, the VFP11 will "bounce" the next FPU opcode issued
by the ARM11 CPU, which will be seen by the ARM11 as an undefined opcode
trap. The VFP support code examines the FPEXC.EX and FPEXC.FP2V bits
to decide what actions to take, i.e., whether to emulate the opcodes
found in FPINST and FPINST2, and whether to retry the bounced instruction.
If a user space application has left the VFP11 in this "pending trap"
state, the next FPU opcode issued to the VFP11 might actually be the
VSTMIA operation vfp_save_state() uses to store the FPU registers
to memory (in our test cases, when building the signal stack frame).
In this case, the kernel crashes as described above.
This patch fixes the problem by making sure that vfp_save_state() is
always entered with FPEXC.EX cleared. (The current value of FPEXC has
already been saved, so this does not corrupt the context. Clearing
FPEXC.EX has no effects on FPINST or FPINST2. Also note that many
callers already modify FPEXC by setting FPEXC.EN before invoking
vfp_save_state().)
This patch also addresses a second problem related to FPEXC.EX: After
returning from signal handling, the kernel reloads the VFP context
from the user mode stack. However, the current code explicitly clears
both FPEXC.EX and FPEXC.FP2V during reload. As VFP11 requires these
bits to be preserved, this patch disables clearing them for VFP
implementations belonging to architecture 1. There should be no
negative side effects: the user can set both bits by executing FPU
opcodes anyway, and while user code may now place arbitrary values
into FPINST and FPINST2 (e.g., non-VFP ARM opcodes) the VFP support
code knows which instructions can be emulated, and rejects other
opcodes with "unhandled bounce" messages, so there should be no
security impact from allowing FPEXC.EX and FPEXC.FP2V to be reloaded.
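A sketch of the first part of the fix, with FPEXC.EX cleared around the
register save ("thread" stands for the thread whose state is being saved;
treat the exact call as illustrative):

u32 fpexc = fmrx(FPEXC);

/* The real FPEXC (including EX/FP2V) has already been captured in the
 * saved context; run the VSTMIA with EX clear so the store itself
 * cannot bounce as a pending trap.
 */
fmxr(FPEXC, (fpexc & ~FPEXC_EX) | FPEXC_EN);
vfp_save_state(&thread->vfpstate, fpexc);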
Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
Since driver load deferrals are expected and will already
have resulted in a kernel message, suppress an essentially
duplicate error message from the RPi audio board drivers.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
At present there is no mechanism to specify driver load order,
which can lead to deferrals and repeated retries until successful.
Since this situation is expected, reduce the dmesg level to
INFO and mention that the operation will be retried.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Note: due to problems with deferred probing of regulators
the following softdep should be added to a modprobe.d file
softdep arizona-spi pre: arizona-ldo1
Signed-off-by: Matthias Reichl <hias@horus.com>
Reference 3.3V / 5V system rails instead of instantiating local
regulators.
Add missing power supply properties for codecs where these are
required according to the DT bindings docs.
Signed-off-by: Matthias Reichl <hias@horus.com>
The CM1 dtb contains an empty audio_pins node, but no reference to it.
Adding the usual pinctrl reference from the audio node enables the
audremap overlay (and others) to easily turn on audio.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
If it breaks on anybody, they can use the standard device tree
overlays to switch back to the dwc2 driver.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
IRQ-CPU mapping is round robined on ARM64 to increase
concurrency and allow multiple interrupts to be serviced
at a time. This reduces the need for FIQ.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
On ARM64, the FIQ mechanism used by this driver is not currently
implemented. As a workaround, regular IRQ is used instead
of FIQ.
In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time. This reduces the need for FIQ.
Tests Run:
This mechanism is most likely to break when multiple USB devices
are attached at the same time. So the system was tested under
stress.
Devices:
1. USB speakers playing back FLAC audio through VLC
at 96KHz (higher than typical, but supported on my speakers).
2. sftp transferring large files through the built-in ethernet
connection, which is connected through USB.
3. Keyboard and mouse attached and being used.
Although I do occasionally hear some glitches, the music seems to
play quite well.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
The spi0-cs overlay allows the software chip selects to be modified
using the cs0_pin and cs1_pin parameters.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
These drivers build now, so they can be enabled back
in the build configuration just like they are for
32 bit.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
These drivers use an ASM function from the base
system to compute the ipv6 checksum. These functions
are not available on ARM64, probably because nobody
has bothered to write them. The base system does have
a generic "C" version, so a simple fix is to include
the header to use the generic version on ARM64 only.
A longer term solution would be to submit the necessary
ASM function to the upstream source.
With this change, these drivers now compile without
any errors on ARM64.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
Randomization allows the mapping between virtual addresses and physical
addresses to be different on each boot. This makes it more difficult
to exploit security vulnerabilities that require knowledge of fixed
hardware addresses.
The firmware generates an 8-byte random number during bootup and stores
it in the device tree under chosen/kaslr-seed. This number is used
to randomize the address mapping.
This change enables this feature in the build configuration for ARM64.
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
* Invoke the dtc compiler with the same options used in arm mode.
* ARM64 now uses the bcm2835 platform just like ARM32.
* ARM64: Update bcmrpi3_defconfig
Signed-off-by: Michael Zoran <mzoran@crowfest.net>
* Added a writable sysfs object to enable scripts / user space software
to blink MIDI activity LEDs for variable duration.
* Improved hw_param constraints setting.
* Added compatibility with S16_LE sample format.
* Exposed some simple placeholder volume controls, so the card appears
in volumealsa widget.
Signed-off-by: Giedrius Trainavicius <giedrius@blokas.io>
This adds a debug module parameter to aid in debugging transfer issues
by printing info to the kernel log. When enabled, status values are
collected in the interrupt routine and msg info in
bcm2835_i2c_start_transfer(). This is done in a way that tries to avoid
affecting timing. Having printk in the isr can mask issues.
debug values (additive):
1: Print info on error
2: Print info on all transfers
3: Print messages before transfer is started
The value can be changed at runtime:
/sys/module/i2c_bcm2835/parameters/debug
Example output, debug=3:
[ 747.114448] bcm2835_i2c_xfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.114463] bcm2835_i2c_xfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117809] start_transfer: msg(1/2) write addr=0x54, len=2 flags= [i2c1]
[ 747.117825] isr: remain=2, status=0x30000055 : TA TXW TXD TXE [i2c1]
[ 747.117839] start_transfer: msg(2/2) read addr=0x54, len=32 flags= [i2c1]
[ 747.117849] isr: remain=32, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117861] isr: remain=20, status=0xd0000039 : TA RXR TXD RXD [i2c1]
[ 747.117870] isr: remain=8, status=0x32 : DONE TXD RXD [i2c1]
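A sketch of the approach (names are illustrative): the interrupt routine
only records raw values and the formatting happens after the transfer, so
enabling debug barely affects timing:

static unsigned int debug;
module_param(debug, uint, 0644);        /* /sys/module/.../parameters/debug */

struct i2c_dbg_entry {
        u32 status;
        u32 remain;
};

#define I2C_DBG_MAX 64
static struct i2c_dbg_entry i2c_dbg_buf[I2C_DBG_MAX];
static unsigned int i2c_dbg_num;

static inline void i2c_dbg_record(u32 status, u32 remain)
{
        if (debug && i2c_dbg_num < I2C_DBG_MAX) {       /* no printk in the isr */
                i2c_dbg_buf[i2c_dbg_num].status = status;
                i2c_dbg_buf[i2c_dbg_num].remain = remain;
                i2c_dbg_num++;
        }
}

static void i2c_dbg_print(void)         /* called once the transfer is done */
{
        unsigned int i;

        for (i = 0; i < i2c_dbg_num; i++)
                pr_info("isr: remain=%u, status=0x%x\n",
                        i2c_dbg_buf[i].remain, i2c_dbg_buf[i].status);
        i2c_dbg_num = 0;
}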
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Support a dynamic clock by reading the frequency and setting the
divisor in the transfer function instead of during probe.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Martin Sperl <kernel@martin.sperl.org>
Use i2c_adapter->timeout for the completion timeout value. The core
default is 1 second.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Documentation/i2c/i2c-protocol states that Combined transactions should
separate messages with a Start bit and end the whole transaction with a
Stop bit. This patch adds support for issuing only a Start between
messages instead of a Stop followed by a Start.
This implementation differs from downstream i2c-bcm2708 in 2 respects:
- it uses an interrupt to detect that the transfer is active instead
of using polling. There is no interrupt for Transfer Active, but by
not prefilling the FIFO it's possible to use the TXW interrupt.
- when resetting/disabling the controller between transfers it writes
CLEAR to the control register instead of just zero.
Using just zero gave many errors. This might be the reason why
downstream had to disable this feature and make it available with a
module parameter.
I have run thousands of transfers to a DS1307 (rtc), MMA8451 (accel)
and AT24C32 (eeprom) in parallel without problems.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Acked-by: Eric Anholt <eric@anholt.net>
The controller can't support this flag, so remove it.
Documentation/i2c/i2c-protocol states that all of the message is sent:
I2C_M_IGNORE_NAK:
Normally message is interrupted immediately if there is [NA] from the
client. Setting this flag treats any [NA] as [A], and all of
message is sent.
From the BCM2835 ARM Peripherals datasheet:
The ERR field is set when the slave fails to acknowledge either
its address or a data byte written to it.
So when the controller doesn't receive an ack, it sets ERR and raises
an interrupt. In other words, the whole message is not sent.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Writing to an AT24C32 generates on average 2x i2c transfer errors per
32-byte page write, which amounts to a lot for a 4k write. This is due
to the fact that the chip doesn't respond during its internal write
cycle while the at24 driver tries and retries the next write.
Only a handful drivers use dev_err() on transfer error, so switch to
dev_dbg() instead.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
If an unexpected TXW or RXR interrupt occurs (msg_buf_remaining == 0),
the driver has no way to fill/drain the FIFO to stop the interrupts.
In this case the controller has to be disabled and the transfer
completed to avoid hang.
(CLKT | ERR) and DONE interrupts are completed in their own paths, and
the controller is disabled in the transfer function after completion.
Unite the code paths and do disabling inside the interrupt routine.
Clear interrupt status bits in the united completion path instead of
trying to do it on every interrupt which isn't necessary.
Only CLKT, ERR and DONE can be cleared that way.
Add the status value to the error value in case of TXW/RXR errors to
distinguish them from the other S_LEN error.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Writing messages larger than the FIFO size results in a hang, rendering
the machine unusable. This is because the RXD status flag is set on the
first interrupt which results in bcm2835_drain_rxfifo() stealing bytes
from the buffer. The controller continues to trigger interrupts waiting
for the missing bytes, but bcm2835_fill_txfifo() has none to give.
In this situation wait_for_completion_timeout() apparently is unable to
stop the madness.
The BCM2835 ARM Peripherals datasheet has this to say about the flags:
TXD: is set when the FIFO has space for at least one byte of data.
RXD: is set when the FIFO contains at least one byte of data.
TXW: is set during a write transfer and the FIFO is less than ¼ full.
RXR: is set during a read transfer and the FIFO is ¾ or more full.
Implementing the logic from the downstream i2c-bcm2708 driver solved
the hang problem.
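In outline, the adopted logic keys filling and draining on the transfer
direction plus TXW/RXR/DONE instead of acting on TXD/RXD alone (a sketch
with illustrative helper names):

if (status & (S_CLKT | S_ERR)) {
        complete_xfer(i2c, -EREMOTEIO);
} else if (status & S_DONE) {
        if (i2c->reading)
                drain_rxfifo(i2c);      /* pick up the final bytes */
        complete_xfer(i2c, 0);
} else if (i2c->reading && (status & S_RXR)) {
        drain_rxfifo(i2c);              /* FIFO is ¾ or more full */
} else if (!i2c->reading && (status & S_TXW)) {
        fill_txfifo(i2c);               /* FIFO has room to top up */
}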
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Martin Sperl <kernel@martin.sperl.org>
Claiming the completion_mutex within add_completion did prevent some
messages appearing twice, but provokes a deadlock caused by vcsm using
vchiq within a page fault handler.
Revert the use of completion_mutex, and instead fix the original
problem using more memory barriers.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
An issue was observed when flushing openmax components
which generate a large number of messages returning
buffers to host.
We occasionally found a duplicate message from 16
messages prior, resulting in a buffer returned twice.
While only one thread adds completions, without the
mutex you don't get the protection of the automatic
memory barrier you get with synchronisation objects.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Service callbacks are not allowed to return an error. The internal callback
that delivers events and messages to user tasks does not enqueue them if
the service is closing, but this is not an error and should not be
reported as such.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
While reading through this code looking for another problem (now found in
userland), the use of dequeue_pending outside a lock didn't seem safe.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Disable MMC_BCM2835_SDHOST and MMC_BCM2835 since these drivers are crashing at the moment.
ARM64: Modify default config to get raspbian to boot (#1686)
1. Enable emulation of deprecated instructions.
2. Enable ARM 8.1 and 8.2 features which are not detected at runtime.
3. Switch the default governor to powersave.
4. Include the watchdog timer driver in the kernel image rather than as a module.
Tested with raspbian-jessie 2016-09-23.
brcmfmac: Disable power management
Disable wireless power saving in the brcmfmac WLAN driver. This is a
temporary measure until the connectivity loss resulting from power
saving is resolved.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Use original country code as a fallback
Commit 73345fd212:
brcmfmac: Configure country code using device specific settings
prevents region codes from working on devices that lack a region code
translation table. In the event of an absent table, preserve the old
behaviour of using the provided code as-is.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: Plug memory leak in brcmf_fill_bss_param
See: https://github.com/raspberrypi/linux/issues/1471
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
brcmfmac: do not use internal roaming engine by default
Some evidence of curing disconnects with this disabled, so make it a default.
Can be overridden with module parameter roamoff=0
See: http://projectable.me/optimize-my-pi-wi-fi/
brcmfmac: Change stop_ap sequence
Patch from Broadcom/Cypress to resolve a customer error
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
This is a port of Pantelis Antoniou's v3 port that makes use of the
new upstreamed configfs support for binary attributes.
Original commit message:
Add a runtime interface to using configfs for generic device tree overlay
usage. With it, it's possible to use device tree overlays without having
to use a per-platform overlay manager.
Please see Documentation/devicetree/configfs-overlays.txt for more info.
Changes since v2:
- Removed ifdef CONFIG_OF_OVERLAY (since for now it's required)
- Created a documentation entry
- Slight rewording in Kconfig
Changes since v1:
- of_resolve() -> of_resolve_phandles().
Originally-signed-off-by: Pantelis Antoniou <pantelis.antoniou@konsulko.com>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: Fix build errors on other platforms
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
DT configfs: fix build error
There is an error when compiling rpi-4.6.y branch:
CC drivers/of/configfs.o
drivers/of/configfs.c:291:21: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.default_groups = of_cfs_def_groups,
^
drivers/of/configfs.c:291:21: note: (near initialization for 'of_cfs_subsys.su_group.default_groups.next')
The .default_groups is a linked list since commit
1ae1602de0.
This commit uses configfs_add_default_group to fix this problem.
Signed-off-by: Slawomir Stepien <sst@poczta.fm>
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
suppress spurious messages
Add #if for 3.14 kernel change (#87)
Fixes compiling after changes in f663dd9aaf and 99932d4fc0. Fixes #86
Set dev_type to wlan
Fixes #23
Tentatively added support for more 8188CUS based devices.
Add support for more 8188CUS and 8192CUS devices
Add ProductId for the Netgear N150 - WNA1000M
Fixes CONFIG_CONCURRENT_MODE CONFIG_MULTI_VIR_IFACES
Fixes compatibility with 3.13
Enables warnings in the compiler and fixes some issues; reference: https://github.com/diederikdehaas/rtl8812AU
Starts device in station mode instead of monitor, fixes NetworkManager issues
Enable cfg80211 support
Fix cfg80211 for kernel >= 4.7
Fixes rtl8192cu for kernel >= 4.8
Add non-mainline source for rtl8192cu wireless driver version v4.0.2_9000 as
this is widely used. Disable older rtlwifi driver.
8192cu needs old wireless extensions
The obsolete WIRELESS_EXT configuration is used
by the old Realtek code and is needed for AP support.
8192cu: CONFIG_AP_MODE hardcoded in autoconf.h
rtl8192c_rf6052: PHY_RFShadowRefresh(): fix off-by-one
Signed-off-by: Marc Kleine-Budde <mkl@blackshift.org>
rtl8192cu: Add PID for D-Link DWA 131
The pl011 driver looks for DT aliases of the form "serial<n>",
and if found uses <n> as the device ID. This can cause
/dev/ttyAMA0 to become /dev/ttyAMA1, which is confusing if the
other serial port is provided by the 8250 driver which doesn't
use the same logic.
Add a mailbox-driven backlight controller for the Raspberry Pi DSI
touchscreen display. Requires updated GPU firmware to recognise the
mailbox request.
Signed-off-by: Gordon Hollingworth <gordon@raspberrypi.org>
Pisound dynamic overlay (#1760)
Restructuring pisound-overlay.dts, so it can be loaded and unloaded dynamically using dtoverlay.
Print a logline when the kernel module is removed.
Add initial 2 channel (stereo) support for Allo Piano DAC (2.0/2.1) boards,
using allo-piano-dac-pcm512x-audio overlay and allo-piano-dac ALSA ASoC
machine driver.
NB. The initial support is 2 channel (stereo) ONLY!
(The Piano DAC 2.1 will only support 2 channel (stereo) left/right output,
pending an update to the upstream pcm512x codec driver, which will have
to be submitted via upstream. With the initial downstream support,
provided by this patch, the Piano DAC 2.1 subwoofer outputs will
not function.)
Signed-off-by: Baswaraj K <jaikumar@cem-solutions.net>
Signed-off-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Support IQAudIO Digi board with iqaudio_digi machine driver and
iqaudio-digi-wm8804-audio overlay.
NB. Machine driver is a cut and paste of hifiberry_digi code, with format
and general cleanup to comply with kernel coding standards.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Contains the sound/soc/bcm ALSA machine driver and necessary alterations to the Kconfig and Makefile.
Adds the dts overlay and updates the Makefile and README.
Updates the relevant defconfig files to enable building for the Raspberry Pi.
Thanks to Phil Elwell (pelwell) for the review, simple-card concepts and discussion. Thanks to Clive Messer for overlay naming suggestions.
Added support for headphones, microphone and bclk_ratio settings.
This patch adds headphone and microphone capability to the Audio Injector sound card. The patch also sets the bit clock ratio for use in the bcm2835-i2s driver. The bcm2835-i2s can't handle an 8 kHz sample rate when the bit clock is at 12 MHz because its register is only 10 bits wide which can't represent the ch2 offset of 1508. For that reason, the rate constraint is added.
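(For reference, a 10-bit register field tops out at 2^10 - 1 = 1023, so an offset of 1508 cannot be encoded; the rate constraint avoids ever needing it.)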
This commit adds basic support for the codec usage including: Device tree overlay,
binding I2S bus and setting I2S mode, clock source and frequency setting according
to spec.
Signed-off-by: Andrey Grodzovsky <andrey2805@gmail.com>
justboom-dac: Adjust for ALSA API change
As of 4.4, snd_soc_limit_volume now takes a struct snd_soc_card *
rather than a struct snd_soc_codec *.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Signed-off-by: Jan Grulich <jan@grulich.eu>
config: fix RaspiDAC Rev.3x dependencies
Change depends to SND_BCM2708_SOC_I2S || SND_BCM2835_SOC_I2S
like the other I2S soundcard drivers.
Signed-off-by: Matthias Reichl <hias@horus.com>
The driver contains a low-level hardware driver for the TAS5713 and the
drivers for the Raspberry Pi I2S subsystem.
TAS5713: return error if initialisation fails
The existing TAS5713 driver logs errors during initialisation, but does not
return an error code. Therefore even if initialisation fails, the driver will
still be loaded, but won't work. This patch fixes this: an I2C communication
error will now be reported correctly by a non-zero return code.
HiFiBerry Amp: fix device-tree problems
Some code to load the driver based on device-tree-overlays was missing. This is added by this patch.
hifiberry-amp: Adjust for ALSA object refactoring
See: https://github.com/raspberrypi/linux/issues/1775
The driver is based on the HiFiBerry DAC driver. However HiFiBerry DAC+ uses
a different codec chip (PCM5122), therefore a new driver is necessary.
Add support for the HiFiBerry DAC+ Pro.
The HiFiBerry DAC+ and DAC+ Pro products both use the existing bcm sound driver with the DAC+ Pro having a special clock device driver representing the two high precision oscillators.
An additional bug fix is included for the PCM512x codec, whereby the physical size of the sample frame is used in the calculation of the LRCK divisor, as it was found to be wrong when using 24-bit depth samples contained in a little-endian 4-byte sample frame.
Limit PCM512x "Digital" gain to 0dB by default with HiFiBerry DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Add dt param to force HiFiBerry DAC+ Pro into slave mode
"dtoverlay=hifiberry-dacplus,slave"
Add 'slave' param to use HiFiBerry DAC+ Pro in slave mode,
with Pi as master for bit and frame clock.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Set a limit of 0dB on Digital Volume Control
The main volume control in the PCM512x DAC has a range up to
+24dB. This is dangerously loud and can potentially cause massive
clipping in the output stages. Therefore this sets a sensible
limit of 0dB for this control.
Allow up to 24dB digital gain to be applied when using IQAudIO DAC+
24db_digital_gain DT param can be used to specify that PCM512x
codec "Digital" volume control should not be limited to 0dB gain,
and if specified will allow the full 24dB gain.
Modify IQAudIO DAC+ ASoC driver to set card/dai config from dt
Add the ability to set the card name, dai name and dai stream name, from
dt config.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
IQaudIO: auto-mute for AMP+ and DigiAMP+
IQAudIO amplifier mute via GPIO22. Add dt params for "one-shot" unmute
and auto mute.
Revision 2, auto mute implementing HiassofT suggestion to mute/unmute
using set_bias_level, rather than startup/shutdown....
"By default DAPM waits 5 seconds (pmdown_time) before shutting down
playback streams so a close/stop immediately followed by open/start
doesn't trigger an amp mute+unmute."
Tested on both AMP+ (via DAC+) and DigiAMP+, with both options...
dtoverlay=iqaudio-dacplus,unmute_amp
"one-shot" unmute when kernel module loads.
dtoverlay=iqaudio-dacplus,auto_mute_amp
Unmute amp when ALSA device opened by a client. Mute, with 5 second delay
when ALSA device closed. (Re-opening the device within the 5 second close
window, will cancel mute.)
Revision 4, using gpiod.
Revision 5, clean-up formatting before adding mute code.
- Convert tab plus 4 space formatting to 2x tab
- Remove '// NOT USED' commented code
Revision 6, don't attempt to "one-shot" unmute amp, unless card is
successfully registered.
Signed-off-by: DigitalDreamtime <clive.messer@digitaldreamtime.co.uk>
Signed-off-by: Daniel Matuschek <daniel@matuschek.net>
Add a parameter to turn off SPDIF output if no audio is playing
This patch adds the parameter auto_shutdown_output to the kernel module.
Default behaviour of the module is the same, but when auto_shutdown_output
is set to 1, the SPDIF output will shut down if no stream is playing.
Bugfix for the 32kHz sample rate, which was missing.
HiFiBerry Digi: set SPDIF status bits for sample rate
The HiFiBerry Digi driver did not signal the sample rate in the SPDIF status bits.
While this is optional, some DACs and receivers do not accept this signal. This patch
adds the sample rate bits in the SPDIF status block.
Added HiFiBerry Digi+ Pro driver
Signed-off-by: Daniel Matuschek <daniel@hifiberry.com>
This adds a machine driver for the HifiBerry DAC.
It is a sound card that can
be stacked onto the Raspberry Pi.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
The Raspberry Pi firmware manages the power-down and reboot
process. To do this it installs a pm_power_off handler, causing
the gpio-poweroff module to abort the probe function.
This patch introduces a "force" DT property that overrides that
behaviour, and also adds a DT overlay to enable and control it.
Note that running in an active-low configuration (DT parameter
"active_low") requires a custom dt-blob.bin and probably won't
allow a reboot without switching off, so an external inversion
of the trigger signal may be preferable.
Provide a __copy_from_user that uses memcpy. On BCM2708, use
optimised memcpy/memmove/memcmp/memset implementations.
arch/arm: Add mmiocpy/set aliases for memcpy/set
See: https://github.com/raspberrypi/linux/issues/1082
copy_from_user: CPU_SW_DOMAIN_PAN compatibility
The downstream copy_from_user acceleration must also play nice with
CONFIG_CPU_SW_DOMAIN_PAN.
See: https://github.com/raspberrypi/linux/issues/1381
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Fix driver detection failure: check that the buffer response is non-zero, meaning the touchscreen was detected.
rpi-ft5406: Use firmware API
RPI-FT5406: Enable aarch64 support through explicit iomem interface
Signed-off-by: Gerhard de Clercq <gerharddeclercq@outlook.com>
1-wire: Add support for configuring pin for w1-gpio kernel module
See: https://github.com/raspberrypi/linux/pull/457
Add bitbanging pullups, use them for w1-gpio
Allows parasite power to work, uses module option pullup=1
bcm2708: Ensure 1-wire pullup is disabled by default, and expose as module parameter
Signed-off-by: Alex J Lennon <ajlennon@dynamicdevices.co.uk>
w1-gpio: Add gpiopin module parameter and correctly free up gpio pull-up pin, if set
Signed-off-by: Alex J Lennon <ajlennon@dynamicdevices.co.uk>
w1-gpio: Sort out the pullup/parasitic power tangle
Especially on platforms with a slower CPU but a relatively high
framebuffer fill bandwidth, like current ARM devices, the existing
console monochrome imageblit function used to draw console text is
suboptimal for common pixel depths such as 16bpp and 32bpp. The existing
code is quite general and can deal with several pixel depths. By creating
special case functions for 16bpp and 32bpp, by far the most common pixel
formats used on modern systems, a significant speed-up is attained
which can be readily felt on ARM-based devices like the Raspberry Pi
and the Allwinner platform, but should help any platform using the
fb layer.
The special case functions allow constant folding, eliminating a number
of instructions including divide operations, and allow the use of an
unrolled loop, eliminating instructions with a variable shift size,
reducing source memory access instructions, and eliminating excessive
branching. These unrolled loops also allow much better code optimization
by the C compiler. The code that selects which optimized variant is used
is also simplified, eliminating integer divide instructions.
The speed-up, measured by timing 'cat file.txt' in the console, varies
between 40% and 70%, when testing on the Raspberry Pi and Allwinner
ARM-based platforms, depending on font size and the pixel depth, with
the greater benefit for 32bpp.
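As an illustration of the kind of special-casing involved (not the actual
fbcon code), a 32bpp inner loop can expand one byte of font bitmap into
eight pixels with no per-pixel divides and no variable shifts:

static void blit_font_row_32bpp(u32 *dst, u8 bits, u32 fg, u32 bg)
{
        dst[0] = (bits & 0x80) ? fg : bg;
        dst[1] = (bits & 0x40) ? fg : bg;
        dst[2] = (bits & 0x20) ? fg : bg;
        dst[3] = (bits & 0x10) ? fg : bg;
        dst[4] = (bits & 0x08) ? fg : bg;
        dst[5] = (bits & 0x04) ? fg : bg;
        dst[6] = (bits & 0x02) ? fg : bg;
        dst[7] = (bits & 0x01) ? fg : bg;
}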
Signed-off-by: Harm Hanemaaijer <fgenfb@yahoo.com>
Based on the patch authored by Ali Gholami Rudi at
https://lkml.org/lkml/2009/7/13/153
Provide an ioctl for userspace applications, but only if this operation
is hardware accelerated (otherwise it does not make any sense).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: Add ioctl for reading gpu memory through dma
The "input" trigger makes the associated GPIO an input. This is to support
the Raspberry Pi PWR LED, which is driven by external hardware in normal use.
N.B. pwr_led is not available on Model A or B boards.
leds-gpio: Implement the brightness_get method
The power LED uses some clever logic that means it is driven
by a voltage measuring circuit when configured as input, otherwise
it is driven by the GPIO output value. This patch wires up the
brightness_get method for leds-gpio so that user-space can monitor
the LED value via /sys/class/gpio/led1/brightness. Using the input
trigger this returns an indication of the system power health,
otherwise it is just whatever value the trigger has written most
recently.
See: https://github.com/raspberrypi/linux/issues/1064
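The hook itself is small; conceptually it just reports the live GPIO level,
something like this sketch (the real driver uses its own per-LED structure):

static enum led_brightness gpio_led_get_sketch(struct led_classdev *cdev)
{
        struct gpio_led_data *led =
                container_of(cdev, struct gpio_led_data, cdev);

        return gpiod_get_value_cansleep(led->gpiod) ? LED_FULL : LED_OFF;
}

/* wired up at probe time, e.g.: led->cdev.brightness_get = gpio_led_get_sketch; */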
Add the bare minimum needed to boot BCM2708 from a Device Tree.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
BCM2708: DT: change 'axi' nodename to 'soc'
Change DT node named 'axi' to 'soc' so it matches ARCH_BCM2835.
The VC4 bootloader fills in certain properties in the 'axi' subtree,
but since this is part of an upstreaming effort, the name is changed.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
BCM2708_DT: Correct length of the peripheral space
Use dts-dirs feature for overlays.
The kernel makefiles have a dts-dirs target that is for vendor subdirectories.
Using this fixes the install_dtbs target, which previously did not install the overlays.
BCM270X_DT: configure I2S DMA channels
Signed-off-by: Matthias Reichl <hias@horus.com>
BCM270X_DT: switch to bcm2835-i2s
I2S soundcard drivers with proper devicetree support (i.e. not linking
to the cpu_dai/platform via name but to cpu/platform via of_node)
will work out of the box without any modifications.
When the kernel is compiled without devicetree support the platform
code will instantiate the bcm2708-i2s driver and I2S soundcard drivers
will link to it via name, as before.
Signed-off-by: Matthias Reichl <hias@horus.com>
SDIO-overlay: add poll_once-boolean parameter
Add a parameter to toggle whether sdio-device polling is
done every second or once at boot-time.
Signed-off-by: Patrick Boettcher <patrick.boettcher@posteo.de>
BCM270X_DT: Make mmc overlay compatible with current firmware
The original DT overlay logic followed a merge-then-patch procedure,
i.e. parameters are applied to the loaded overlay before the overlay
is merged into the base DTB. This sequence has been changed to
patch-then-merge, in order to support parameterised node names, and
to protect against bad overlays. As a result, overrides (parameters)
must only target labels in the overlay, but the overlay can obviously target nodes in the base DTB.
mmc-overlay.dts (that switches back to the original mmc sdcard
driver) is the only overlay violating that rule, and this patch
fixes it.
bcm270x_dt: Use the sdhost MMC controller by default
The "mmc" overlay reverts to using the other controller.
squash: Add cprman to dt
BCM270X_DT: Use clk_core for I2C interfaces
BCM270X_DT: Use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi
The mainline Device Tree files are quite close to downstream now.
Let's use bcm283x.dtsi, bcm2835.dtsi and bcm2836.dtsi as base files
for our dts files.
Mainline dts files are based on these files:
bcm2835-rpi.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
Current downstream are based on these:
bcm2708.dtsi bcm2709.dtsi bcm2710.dtsi
bcm2708_common.dtsi
This patch introduces this dependency:
bcm2708.dtsi bcm2709.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm2835.dtsi bcm2836.dtsi
bcm283x.dtsi
And:
bcm2710.dtsi
bcm2708-rpi.dtsi
bcm270x.dtsi
bcm283x.dtsi
bcm270x.dtsi contains the downstream bcm283x.dtsi diff.
bcm2708-rpi.dtsi is the downstream version of bcm2835-rpi.dtsi.
Other changes:
- The led node has moved from /soc/leds to /leds. This is not a problem
since the label is used to reference it.
- The clk_osc reg property changes from 6 to 3.
- The gpu nodes have their interrupt property set in the base file.
- the clocks label does not point to the /clocks node anymore, but
points to the cprman node. This is not a problem since the overlays
that use the clock node refer to it directly: target-path = "/clocks";
- some nodes now have 2 labels since mainline and downstream differs in
this respect: cprman/clocks, spi0/spi, gpu/vc4.
- some nodes don't have an explicit status = "okay" since they're not
disabled in the base file: watchdog and random.
- gpiomem doesn't need an explicit status = "okay".
- bcm2708-rpi-cm.dts got the hpd-gpios property from bcm2708_common.dtsi,
it's now set directly in that file.
- bcm2709-rpi-2-b.dts has the timer node moved from /soc/timer to /timer.
- Removed clock-frequency property on the bcm{2709,2710}.dtsi timer nodes.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Use raspberrypi-power to turn on USB power
Use the raspberrypi-power driver to turn on USB power.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270X_DT: Add a .dtbo target, use for overlays
Change the filenames and extensions to keep the pre-DDT style of
overlay (<name>-overlay.dtb) distinct from new ones that use a
different style of local fixups (<name>.dtbo), and to match other
platforms.
The RPi firmware uses the DDTK trailer atom to choose which type of
overlay to use for each kernel.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Don't generate "linux,phandle" props
The EPAPR standard says to use "phandle" properties to store phandles,
rather than the deprecated "linux,phandle" version. By default, dtc
generates both, but adding "-H epapr" causes it to only generate
"phandle"s, saving some space and clutter.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add overlay for enc28j60 on SPI2
Works on SPI2 for compute module
BCM270X_DT: Add midi-uart0 overlay
MIDI requires 31.25kbaud, a baudrate unsupported by Linux. The
midi-uart0 overlay configures uart0 (ttyAMA0) to use a fake clock
so that requesting 38.4kbaud actually gets 31.25kbaud.
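(Sketch of the arithmetic: with the PL011 divisor relation baud = uartclk / (16 x divisor), advertising a clock 38400/31250 = 1.2288 times the real one makes a request for 38.4 kbaud produce a divisor that yields 31.25 kbaud on the true clock.)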
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: Add i2c-sensor overlay
The i2c-sensor overlay is a container for various pressure and
temperature sensors, currently bmp085 and bmp280. The standalone
bmp085_i2c-sensor overlay is now deprecated.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM270X_DT: overlays/*-overlay.dtb -> overlays/*.dtbo (#1752)
We now create overlays as .dtbo files.
build: support for .dtbo files for dtb overlays
Kernel 4.4.6+ on RaspberryPi support .dtbo files for overlays, instead of .dtb.
Patch the kernel, which has faulty rules to generate .dtbo the way yocto does
Signed-off-by: Herve Jourdain <herve.jourdain@neuf.fr>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Includes the new localfixups format.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
scripts/dtc: Fix UMR causing corrupt dtbo overlay files
struct fixup_entry is allocated from the heap, but its member
local_fixup_generated was never initialized. This led to
corrupted dtbo files.
Fix this by initializing local_fixup_generated to false.
Signed-off-by: Matthias Reichl <hias@horus.com>
scripts/dtc: Only emit local fixups for overlays
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
The Raspberry Pi firmware looks for a trailer on the kernel image to
determine whether it was compiled with Device Tree support enabled.
If the firmware finds a kernel without this trailer, or which has a
trailer indicating that it isn't DT-capable, it disables DT support
and reverts to using ATAGs.
The mkknlimg utility adds that trailer, having first analysed the
image to look for signs of DT support and the kernel version string.
knlinfo displays the contents of the trailer in the given kernel image.
scripts/mkknlimg: Add support for ARCH_BCM2835
Add a new trailer field indicating whether this is an ARCH_BCM2835
build, as opposed to MACH_BCM2708/9. If the loader finds this flag
is set it changes the default base dtb file name from bcm270x...
to bcm283y...
Also update knlinfo to show the status of the field.
scripts/mkknlimg: Improve ARCH_BCM2835 detection
The board support code contains sufficient strings to be able to
distinguish 2708 vs. 2835 builds, so remove the check for
bcm2835-pm-wdt which could exist in either.
Also, since the canned configuration is no longer built in (it's
a module), remove the config string checking.
See: https://github.com/raspberrypi/linux/issues/1157
scripts: Multi-platform support for mkknlimg and knlinfo
The firmware uses tags in the kernel trailer to choose which dtb file
to load. Current firmware loads bcm2835-*.dtb if the '283x' tag is true,
otherwise it loads bcm270*.dtb. This scheme breaks if an image supports
multiple platforms.
This patch adds '270X' and '283X' tags to indicate support for RPi and
upstream platforms, respectively. '283x' (note lower case 'x') is left
for old firmware, and is only set if the image only supports upstream
builds.
scripts/mkknlimg: Append a trailer for all input
Now that the firmware assumes an unsigned kernel is DT-capable, it is
helpful to be able to mark a kernel as being non-DT-capable.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
scripts/knlinfo: Decode DDTK atom
Show the DDTK atom as being a boolean.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mkknlimg: Retain downstream-kernel detection
With the death of ARCH_BCM2708 and ARCH_BCM2709, a new way is needed to
determine if this is a "downstream" build that wants the firmware to
load a bcm27xx .dtb. The vc_cma driver is used downstream but not
upstream, making vc_cma_init a suitable predicate symbol.
- Supports raw YUV capture, preview, JPEG and H264.
- Uses videobuf2 for data transfer, using dma_buf.
- Uses 3.6.10 timestamping
- Camera power based on use
- Uses immutable input mode on video encoder
Signed-off-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Luke Diamand <luked@broadcom.com>
V4L2: Fixes from 6by9
V4L2: Fix EV values. Add manual shutter speed control
V4L2 EV values should be in units of 1/1000. Corrected.
Add support for V4L2_CID_EXPOSURE_ABSOLUTE which should
give manual shutter control. Requires manual exposure mode
to be selected first.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Correct JPEG Q-factor range
Should be 1-100, not 0-100
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Fix issue of driver jamming if STREAMON failed.
Fix issue where the driver was left in a partially enabled
state if STREAMON failed, and would then reject many IOCTLs
as it thought it was streaming.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Fix ISO controls.
Driver was passing the index to the GPU, and not the desired
ISO value.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add flicker avoidance controls
Add support for V4L2_CID_POWER_LINE_FREQUENCY to set flicker
avoidance frequencies.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add support for frame rate control.
Add support for frame rate (or time per frame as V4L2
inverts it) control via s_parm.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Improve G_FBUF handling so we pass conformance
Return some sane numbers for get framebuffer so that
we pass conformance.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Fix information advertised through g_vidfmt
Width and height were being stored based on incorrect
values.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add support for inline H264 headers
Add support for V4L2_CID_MPEG_VIDEO_REPEAT_SEQ_HEADER
to control H264 inline headers.
Requires firmware fix to work correctly, otherwise format
has to be set to H264 before this parameter is set.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Fix JPEG timestamp issue
JPEG images were coming through from the GPU with timestamp
of 0. Detect this and give current system time instead
of some invalid value.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Fix issue when switching down JPEG resolution.
JPEG buffer size calculation is based on input resolution.
Input resolution was being configured after output port
format. Caused failures if switching from one JPEG resolution
to a smaller one.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Enable MJPEG encoding
Requires GPU firmware update to support MJPEG encoder.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Correct flag settings for compressed formats
Set flags field correctly on enum_fmt_vid_cap for compressed
image formats.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: H264 profile & level ctrls, FPS control and auto exp pri
Several control handling updates.
H264 profile and level controls.
Timeperframe/FPS reworked to add V4L2_CID_EXPOSURE_AUTO_PRIORITY to
select whether AE is allowed to override the framerate specified.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Correct BGR24 to RGB24 in format table
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add additional pixel formats. Correct colourspace
Adds the other flavours of YUYV, and NV12.
Corrects the overlay advertised colourspace.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Drop logging msg from info to debug
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Initial pass at scene modes.
Only supports exposure mode and metering modes.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add manual white balance control.
Adds support for V4L2_CID_RED_BALANCE and
V4L2_CID_BLUE_BALANCE. Only has an effect if
V4L2_CID_AUTO_N_PRESET_WHITE_BALANCE has
V4L2_WHITE_BALANCE_MANUAL selected.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
config: Enable V4L / MMAL driver
V4L2: Increase the MMAL timeout to 3sec
MJPEG codec flush is now taking longer and results
in a kernel panic if the driver has stopped waiting for
the result when it finally completes.
Increase the timeout value from 1 to 3secs.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add support for setting H264_I_PERIOD
Adds support for the parameter V4L2_CID_MPEG_VIDEO_H264_I_PERIOD
to set the frequency with which I frames are produced.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Enable GPU function for removing padding from images.
GPU can now support arbitrary strides, although it may require
additional processing to achieve it. Enable this feature
so that the images delivered are the size requested.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add support for V4L2_PIX_FMT_BGR32
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Set the colourspace to avoid odd YUV-RGB conversions
Removes the ambiguity from the conversion routines and stops
them dropping back to the SD vs HD choice of coeffs.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Make video/still threshold a run-time param
Move the define for the resolution at which the driver
switches from a video mode capture to a stills mode
capture into module parameters.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Fix incorrect pool sizing
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Add option to disable enum_framesizes.
Gstreamer's handling of a driver that advertises
V4L2_FRMSIZE_TYPE_STEPWISE to define the supported
resolutions is broken. See bug
https://bugzilla.gnome.org/show_bug.cgi?id=726521
An optional parameter, gst_v4l2src_is_broken, is added.
If non-zero, the driver claims not to support that
ioctl, and gstreamer should be happy again (it
guesses a set of defaults for itself).
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
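As a rough illustration of how such an opt-out can look in a V4L2 ioctl
handler (a minimal sketch with illustrative names, not the driver's actual
code; the parameter would be exposed with module_param() in a real driver):

static int gst_v4l2src_is_broken;

static int vidioc_enum_framesizes(struct file *file, void *fh,
                                  struct v4l2_frmsizeenum *fsize)
{
        if (gst_v4l2src_is_broken)
                return -ENOTTY;   /* claim the ioctl is unsupported */

        /* ... normal V4L2_FRMSIZE_TYPE_STEPWISE reply goes here ... */
        return 0;
}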
V4L2: Add support for more image formats
Adds YVU420 (YV12), YVU420SP (NV21), and BGR888.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
V4L2: Extend range for V4L2_CID_MPEG_VIDEO_H264_I_PERIOD
Request to extend the range from the fairly arbitrary
1000 frames (33 seconds at 30fps). Extend out to the
max range supported (int32 value).
Also allow 0, which the codec handles as sending an I-frame
on the first frame only and never again.
There may be an exception if it detects a significant
scene change, but there's no easy way around that.
Signed-off-by: Dave Stevenson <dsteve@broadcom.com>
bcm2835-camera: stop_streaming now has a void return
BCM2835-V4L2: Fix compliance test failures
VIDIOC_TRY_FMT and VIDIOC_S_FMT tests were failing due
to reporting V4L2_COLORSPACE_JPEG when the colour
format wasn't V4L2_PIX_FMT_JPEG.
Now reports V4L2_COLORSPACE_SMPTE170M for YUV formats.
bcm2835 camera planar/packed stride length
Added a field to the mmal_fmt struct used to compute the bytes per line
when using a particular format. This results in the correct stride being
calculated even when the format is planar.
Signed-off-by: Garrett Wilson <g@floft.net>
bcm2835: camera: check for scene not being found
static analysis by cppcheck detected some potential NULL pointer
dereference issues:
[drivers/media/platform/bcm2835/controls.c:854]: (error) Possible null
pointer dereference: scene
(and lines 858, 859 too)
It is possible that scene is not found because of an invalid ctrl->val
and is therefore NULL, hence causing a null pointer dereference.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
bcm2835: memcpy port data to m rather than rmsg
static analysis by cppcheck detected a memcpy to rmsg which is
not actually initialized at that point. The memcpy should be copying
to variable m instead.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
BCM2835-V4L2: Return buffers to videobuf2 on shutdown
https://github.com/raspberrypi/linux/issues/817
Fixes the kernel warning from videobuf2 as buffers
are now returned as they are being flushed on
stop_streaming.
squash: Fixup bcm2835-camera for changes in kernel 4.4 api
v4l2: Fix up driver to upstream timestamp changes
bcm2835-camera: fix a bug in computation of frame timestamp
Fixes #1318
V4L2 driver updates (#1393)
* BCM2835-V4L2: Correct ISO control and add V4L2_CID_ISO_SENSITIVITY_AUTO
https://github.com/raspberrypi/linux/issues/1251
V4L2_CID_ISO_SENSITIVITY was not advertising ISO*1000 as it should.
V4L2_CID_ISO_SENSITIVITY_AUTO was not implemented, so was taking
V4L2_CID_ISO_SENSITIVITY as 0 for auto mode.
Still accepts 0 for auto, but also abides by the new parameter.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
* BCM2835-V4L2: Add a video_nr parameter.
Adds a kernel parameter "video_nr" to specify the preferred
/dev/videoX device node.
https://www.raspberrypi.org/forums/viewtopic.php?f=38&t=136120&p=905545
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
* BCM2835-V4L2: Add support for multiple cameras
Ask GPU on load how many cameras have been detected, and
enumerate that number of devices.
Only applicable on the Compute Module as no other device
exposes multiple CSI2 interfaces.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
* BCM2835-V4L2: Add control of the overlay location and alpha.
Actually do something useful in vidioc_s_fmt_vid_overlay and
vidioc_try_fmt_vid_overlay, rather than effectively having
read-only fields.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
* BCM2835-V4L2: V4L2-Compliance failure fix
VIDIOC_TRY_FMT was failing due to bytesperline not
being set correctly by default.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
* BCM2835-V4L2: Make all module parameters static
Clean up to correct variable scope
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
V4L2: Request maximum resolution from GPU
Get resolution information about the sensors from the GPU
and advertise it correctly.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
BCM2835-V4L2: Increase minimum resolution to 32x32
https://github.com/raspberrypi/linux/issues/1498 showed
that 16x16 is failing to work on the GPU for some reason.
GPU bug being tracked on
https://github.com/raspberrypi/firmware/issues/607
Workaround here by increasing minimum resolution via V4L2
to 32x32.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
[media]: bcm2835-camera: fix compilation error
There is an error when compiling rpi-4.6.y branch:
CC [M] drivers/media/platform/bcm2835/bcm2835-camera.o
drivers/media/platform/bcm2835/bcm2835-camera.c:639:17: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
.queue_setup = queue_setup,
^
drivers/media/platform/bcm2835/bcm2835-camera.c:639:17: note: (near initialization for 'bm2835_mmal_video_qops.queue_setup')
The const void *parg in the queue_setup callback is not needed since commit:
df9ecb0cad.
This commit removes it.
Signed-off-by: Slawomir Stepien <sst@poczta.fm>
bcm2835-camera: Fix max/min error when looping over cameras/resolutions
See: https://github.com/raspberrypi/linux/issues/1447#issuecomment-221303506
BCM2835-V4L2: Correct handling for BGR24 vs RGB24.
There was a bug in the GPU firmware that had reversed these
two formats.
Detect the old firmware, and reverse the formats if necessary.
Signed-off-by: Dave Stevenson <6by9@users.noreply.github.com>
BCM2835-v4l2: Fix a conformance test failure
Format ioctls:
test VIDIOC_ENUM_FMT/FRAMESIZES/FRAMEINTERVALS: OK
warn: v4l2-test-formats.cpp(1195): S_PARM is supported but
doesn't report V4L2_CAP_TIMEPERFRAME.
fail: v4l2-test-formats.cpp(1118): node->has_frmintervals
&& !cap->capability
Support booting without Device Tree.
Turn on USB power.
Load driver early because of lacking support for deferred probing
in many drivers.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
firmware: bcm2835: Don't turn on USB power
The raspberrypi-power driver is now used to turn on USB power.
This partly reverts commit:
firmware: bcm2835: Support ARCH_BCM270x
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Add module for accessing the mailbox property channel through
/dev/vcio. Was previously in bcm2708-vcio.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
i2c-bcm2708: fixed baudrate
Fixed issue where the wrong CDIV value was set for baudrates below 3815 Hz (for 250MHz bus clock).
In that case the computed CDIV value was more than 0xffff. However the CDIV register width is only 16 bits.
This resulted in an incorrect CDIV setting and a higher baudrate than intended.
Example: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0x1704 -> 42430Hz
After correction: 3500Hz -> CDIV=0x11704 -> CDIV(16bit)=0xffff -> 3815Hz
The correct baudrate is shown in the log after the cdiv > 0xffff correction.
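The arithmetic behind the fix can be reproduced with a small standalone
program (a sketch assuming the 250MHz bus clock mentioned above, not the
driver source):

#include <stdio.h>
#include <stdint.h>

#define BUS_CLK_HZ 250000000u
#define CDIV_MAX   0xffffu          /* the CDIV register is only 16 bits wide */

static uint32_t cdiv_old(uint32_t baud)
{
        /* old behaviour: the divisor silently wrapped to 16 bits */
        return (BUS_CLK_HZ / baud) & 0xffffu;        /* 0x11704 -> 0x1704 */
}

static uint32_t cdiv_fixed(uint32_t baud)
{
        uint32_t cdiv = BUS_CLK_HZ / baud;

        return cdiv > CDIV_MAX ? CDIV_MAX : cdiv;    /* saturate instead of wrapping */
}

int main(void)
{
        uint32_t o = cdiv_old(3500), f = cdiv_fixed(3500);

        printf("old:   CDIV=0x%04x -> %.0f Hz\n", (unsigned)o, (double)BUS_CLK_HZ / o);
        printf("fixed: CDIV=0x%04x -> %.0f Hz\n", (unsigned)f, (double)BUS_CLK_HZ / f);
        return 0;
}

This prints 0x1704 -> 42430 Hz for the old behaviour and 0xffff -> 3815 Hz
after the correction, matching the example above.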
Perform I2C combined transactions when possible
Perform I2C combined transactions whenever possible, within the
restrictions of the Broadcom Serial Controller.
Disable DONE interrupt during TA poll
Prevent interrupt from being triggered if poll is missed and transfer
starts and finishes.
i2c: Make combined transactions optional and disabled by default
i2c: bcm2708: add device tree support
Add DT support to driver and add to .dtsi file.
Setup pins in .dts file.
i2c is disabled by default.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
bcm2708: don't register i2c controllers when using DT
The devices for the i2c controllers are in the Device Tree.
Only register devices when not using DT.
Signed-off-by: Noralf Tronnes <notro@tronnes.org>
I2C: Only register the I2C device for the current board revision
i2c_bcm2708: Fix clock reference counting
Fix grabbing lock from atomic context in i2c driver
Two main changes:
- Check for timeouts in the bcm2708_bsc_setup function, as indicated by this comment:
/* poll for transfer start bit (should only take 1-20 polls) */
This implies that the setup function can now fail, so account for this everywhere it is called.
- Remove the clk_get_rate call from inside the setup function, as it locks a mutex and that is not OK when called from under a spin lock.
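A minimal sketch of the two ideas (illustrative names, not the actual driver
code): bound the polling loop so setup can return an error, and avoid any
mutex-taking call such as clk_get_rate() while the spin lock is held by
caching the rate beforehand.

#define BSC_SETUP_MAX_POLLS 100   /* the comment says 1-20 polls normally suffice */

static int bcm2708_bsc_setup_sketch(void __iomem *status_reg, u32 ta_bit)
{
        int polls;

        /* the bus clock rate must have been read with clk_get_rate() earlier,
         * outside the spin lock, and cached by the caller */
        for (polls = 0; polls < BSC_SETUP_MAX_POLLS; polls++) {
                if (readl(status_reg) & ta_bit)
                        return 0;           /* transfer has started */
                cpu_relax();
        }
        return -ETIMEDOUT;                  /* callers must now handle failure */
}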
i2c-bcm2708: When using DT, leave the GPIO setup to pinctrl
i2c-bcm2708: Increase timeouts to allow larger transfers
Use the timeout value provided by the I2C_TIMEOUT ioctl when waiting
for completion. The default timeout is 1 second.
See: https://github.com/raspberrypi/linux/issues/260
i2c-bcm2708/BCM270X_DT: Add support for I2C2
The third I2C bus (I2C2) is normally reserved for HDMI use. Careless
use of this bus can break an attached display - use with caution.
It is recommended to disable accesses by VideoCore by setting
hdmi_ignore_edid=1 or hdmi_edid_file=1 in config.txt.
The interface is disabled by default - enable using the
i2c2_iknowwhatimdoing DT parameter.
bcm2708-spi: Don't use static pin configuration with DT
Also remove superfluous error checking - the SPI framework ensures the
validity of the chip_select value.
i2c-bcm2708: Remove non-DT support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Set the BSC_CLKT clock stretching timeout to 35ms as per the SMBus specs.
Fixes i2c_bcm2708: Write to FIFO correctly - v2 (#1574)
* i2c: fix i2c_bcm2708: Clear FIFO before sending data
Make sure FIFO gets cleared before trying to send
data in case of a repeated start (COMBINED=Y).
* i2c: fix i2c_bcm2708: Only write to FIFO when not full
Check if FIFO can accept data before writing.
To avoid a peripheral read on the last iteration of a loop,
both bcm2708_bsc_fifo_fill and ~drain are changed as well.
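The write path can be pictured as below (a hedged fragment with register
offsets taken from the BCM2835 peripheral documentation, not copied from the
driver); the length check comes first so the status register is not read
again once the buffer is exhausted.

#define BSC_S      0x04        /* status register */
#define BSC_FIFO   0x10        /* data FIFO */
#define BSC_S_TXD  BIT(4)      /* FIFO can accept data */

static void bsc_fifo_fill_sketch(void __iomem *base, const u8 *buf,
                                 size_t *pos, size_t len)
{
        /* only push bytes while the FIFO reports room for them */
        while (*pos < len && (readl(base + BSC_S) & BSC_S_TXD))
                writel(buf[(*pos)++], base + BSC_FIFO);
}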
BCM270x: Move thermal sensor to Device Tree
Add Device Tree support to bcm2835-thermal driver.
Add thermal sensor device to Device Tree.
Don't add platform device when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
lirc_rpi: Use read_current_timer to determine transmitter delay. Thanks to jjmz and others
See: https://github.com/raspberrypi/linux/issues/525
lirc: Remove restriction on gpio pins that can be used with lirc
Compute Module, for example could use different pins
lirc_rpi: Add parameter to specify input pin pull
Depending on the connected IR circuitry it might be desirable to change the
gpio's internal pull from its default pull-down behaviour. Add a module
parameter to allow the user to set it explicitly.
Signed-off-by: Julian Scheel <julian@jusst.de>
lirc-rpi: Use the higher-level irq control functions
This module used to access the irq_chip methods of the
gpio controller directly, rather than going through the
standard enable_irq/irq_set_irq_type functions. This
caused problems on pinctrl-bcm2835 which only implements
the irq_enable/disable methods and not irq_unmask/mask.
lirc-rpi: Correct the interrupt usage
1) Correct the use of enable_irq (i.e. don't call it so often)
2) Correct the shutdown sequence.
3) Avoid a bcm2708_gpio driver quirk by setting the irq flags earlier
lirc-rpi: use getnstimeofday instead of read_current_timer
read_current_timer isn't guaranteed to return values in
microseconds, and indeed it doesn't on a Pi2.
Issue: linux#827
lirc-rpi: Add device tree support, and a suitable overlay
The overlay supports DT parameters that match the old module
parameters, except that gpio_in_pull should be set using the
strings "up", "down" or "off".
lirc-rpi: Also support pinctrl-bcm2835 in non-DT mode
fix auto-sense in lirc_rpi driver
On a Raspberry Pi 2, the lirc_rpi driver might receive spurious
interrupts and change its low-active / high-active setting.
When this happens, the IR remote control stops working.
This patch disables this auto-detection if the 'sense' parameter
was set in the device tree, making the driver robust to such
spurious interrupts.
Use clock manager instead of self-made clockmanager.
Also fix some error paths that showed up during development
(especially missing release of dma resources on rmmod)
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Add experimental support for the VideoCore shared memory service.
This allows user processes to allocate memory from VideoCore's
GPU relocatable heap and mmap the buffers. Additionally, the memory
handles can be passed to other VideoCore services such as MMAL, OpenMAX
and DispmanX
TODO
* This driver was originally released for BCM28155 which has a different
cache architecture to BCM2835. Consequently, in this release only
uncached mappings are supported. However, there's no fundamental
reason why cached mappings cannot be supported on BCM2835.
* More refactoring is required to remove the typedefs.
* Re-enable some of the commented-out debugfs statistics which were
disabled when migrating code from proc-fs.
* There's a lot of code to support sharing of VCSM in order to support
Android. This could probably be done more cleanly or perhaps just
removed.
Signed-off-by: Tim Gover <timgover@gmail.com>
config: Disable VC_SM for now to fix hang with cutdown kernel
vcsm: Use boolean as it cannot be built as module
On building the bcm_vc_sm as a module we get the following error:
v7_dma_flush_range and do_munmap are undefined in vc-sm.ko.
Fix by making it not an option to build as module
vcsm: Add ioctl for custom cache flushing
vc-sm: Move headers out of arch directory
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
BCM270x: Move vc_mem
Make the vc_mem module available for ARCH_BCM2835 by moving it.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
alsa: add mmap support and some cleanups to bcm2835 ALSA driver
snd-bcm2835: Add support for spdif/hdmi passthrough
This adds a dedicated subdevice which can be used for passthrough of non-audio
formats (i.e. encoded A52) through the HDMI audio link. In addition to this
driver extension an appropriate card config is required to make alsa-lib
support the AES parameters for this device.
snd-bcm2708: Add mutex, improve logging
Fix for ALSA driver crash
Avoids an issue when closing and opening vchiq where a message can arrive before the service handle has been written
alsa: reduce severity of expected warning message
snd-bcm2708: Fix dmesg spam for non-error case
alsa: Ensure mutexes are released through error paths
alsa: Make interrupted close paths quieter
BCM270x: Add onboard sound device to Device Tree
Add Device Tree support to alsa driver.
Add device to Device Tree.
Don't add platform devices when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835: access controls under the audio mutex
I don't think the ALSA framework provides any kind of automatic
synchronization within the control callbacks. We most likely need
to ensure this manually, so add locking around all access to shared
mutable data. In particular, bcm2835_audio_set_ctls() should
probably always be called under our own audio lock.
snd-bcm2835: Don't allow responses from VC to be interrupted by user signals
There should always be a response, and retrying after a signal interruption is not handled, so don't report
that we are interruptible.
See: https://github.com/raspberrypi/linux/issues/1560
snd-bcm2835: Use bcm2835_hw params in preallocate
Signed-off-by: popcornmix <popcornmix@gmail.com>
vc_cma: Make the vc_cma area the default contiguous DMA area
vc_cma: Provide empty functions when module is not built
Providing empty functions saves the users from guarding the
function call with an #if clause.
Move __init markings from prototypes to functions.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Some SD cards have been found that corrupt data when small blocks
are erased. Add a quirk to indicate that ERASE should not be used,
and set it for cards of that type.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: Apply QUIRK_BROKEN_ERASE to other capacities
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: Add card_quirks module parameter, log quirks
Use mmc_block.card_quirks to override the quirks for all SD or MMC
cards. The value is a bitfield using the bit positions defined in
include/linux/mmc/card.h. If the module parameter is placed in the
kernel command line (or bootargs) stored on the card then, assuming the
device only has one SD card interface, the override effectively becomes
card-specific.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
BCM2835 has two SD card interfaces. This driver uses the other one.
bcm2835-sdhost: Error handling fix, and code clarification
bcm2835-sdhost: Adding overclocking option
Allow a different clock speed to be substituted for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the sdhost interface is restricted to integer divisions of
core_freq, and the highest sensible option for a core_freq of 250MHz
is 84 (250/3 = 83.3MHz), the next being 125 (250/2) which is much too
high.
Use at your own risk.
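The coarse steps come straight from the integer-divider restriction; a
standalone sketch (assuming the 250MHz core_freq named above) makes the gap
around 50MHz obvious:

#include <stdio.h>

int main(void)
{
        const double core_mhz = 250.0;    /* assumed core_freq */

        for (int div = 2; div <= 6; div++)
                printf("core %.0f MHz / %d = %.1f MHz\n",
                       core_mhz, div, core_mhz / div);
        return 0;
}

The output (125.0, 83.3, 62.5, 50.0, 41.7 MHz) shows why 84 is the highest
sensible overclock_50 value here and 125 is much too high.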
bcm2835-sdhost: Round up the overclock, so 62 works for 62.5MHz
Also only warn once for each overclock setting.
bcm2835-sdhost: Improve error handling and recovery
1) Expose the hw_reset method to the MMC framework, removing many
internal calls by the driver.
2) Reduce overclock setting on error.
3) Increase timeout to cope with high capacity cards.
4) Add properties and parameters to control pio_limit and debug.
5) Reduce messages at probe time.
bcm2835-sdhost: Further improve overclock back-off
bcm2835-sdhost: Clear HBLC for PIO mode
Also update pio_limit default in overlay README.
bcm2835-sdhost: Add the ERASE capability
See: https://github.com/raspberrypi/linux/issues/1076
bcm2835-sdhost: Ignore CRC7 for MMC CMD1
It seems that the sdhost interface returns CRC7 errors for CMD1,
which is the MMC-specific SEND_OP_COND. Returning these errors to
the MMC layer causes a downward spiral, but ignoring them seems
to be harmless.
bcm2835-mmc/sdhost: Remove ARCH_BCM2835 differences
The bcm2835-mmc driver (and the -sdhost driver that was copied from it)
contains code to handle SDIO interrupts in a threaded interrupt
handler rather than waking the MMC framework thread. The change
follows a patch from Russell King that adds the facility as the
preferred way of working.
However, the new code path is only present in ARCH_BCM2835
builds, which I have taken to be a way of testing the waters
rather than making the change across the board; I can't see
any technical reason why it wouldn't be enabled for MACH_BCM270X
builds. So this patch standardises on the ARCH_BCM2835 code,
removing the old code paths.
bcm2835-sdhost: Don't log timeout errors unless debug=1
The MMC card-discovery process generates timeouts. This is
expected behaviour, so reporting it to the user serves no purpose.
Suppress the reporting of timeout errors unless the debug flag
is on.
bcm2835-sdhost: Add workaround for odd behaviour on some cards
For reasons not understood, the sdhost driver fails when reading
sectors very near the end of some SD cards. The problem could
be related to the similar issue that reading the final sector
of any card as part of a multiple read never completes, and the
workaround is an extension of the mechanism introduced to solve
that problem which ensures those sectors are always read singly.
bcm2835-sdhost: Major revision
This is a significant revision of the bcm2835-sdhost driver. It
improves on the original in a number of ways:
1) Through the use of CMD23 for reads it appears to avoid problems
reading some sectors on certain high speed cards.
2) Better atomicity to prevent crashes.
3) Higher performance.
4) Activity logging included, for easier diagnosis in the event
of a problem.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Restore ATOMIC flag to PIO sg mapping
Allocation problems have been seen in a wireless driver, and
this is the only change which might have been responsible.
SQUASH: bcm2835-sdhost: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-sdhost driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Workaround for "slow" sectors
Some cards have been seen to cause timeouts after certain sectors are
read. This workaround enforces a minimum delay between the stop after
reading one of those sectors and a subsequent data command.
Using CMD23 (SET_BLOCK_COUNT) avoids this problem, so good cards will
not be penalised by this workaround.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Firmware manages the clock divisor
The bcm2835-sdhost driver hands control of the CDIV clock divisor
register to matching firmware, allowing it to adjust to a changing
core clock. This removes the need to use the performance governor or
to enable io_is_busy on the on-demand governor in order to get the
best SD performance.
N.B. As SD clocks must be an integer divisor of the core clock, it is
possible that the SD clock for "turbo" mode can be different (even
lower) than "normal" mode.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Reset the clock in task context
Since reprogramming the clock can now involve a round-trip to the
firmware it must not be done in atomic context, and a tasklet
is not a task.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
bcm2835-sdhost: Don't exit cmd wait loop on error
The FAIL flag can be set in the CMD register before command processing
is complete, leading to spurious "failed to complete" errors. This has
the effect of promoting harmless CRC7 errors during CMD1 processing
into errors that can delay and even prevent booting.
Also:
1) Convert the last KERN_ERROR message in the register dumping to
KERN_INFO.
2) Remove an unnecessary reset call from bcm2835_sdhost_add_host.
See: https://github.com/raspberrypi/linux/pull/1492
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
mmc: Disable CMD23 transfers on all cards
Pending wire-level investigation of these types of transfers
and associated errors on bcm2835-mmc, disable for now. Fallback of
CMD18/CMD25 transfers will be used automatically by the MMC layer.
Reported/Tested-by: Gellert Weisz <gellert@raspberrypi.org>
mmc: bcm2835-mmc: enable DT support for all architectures
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Enable Device Tree support for all architectures.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
mmc: bcm2835-mmc: fix probe error handling
Probe error handling is broken in several places.
Simplify error handling by using device managed functions.
Replace pr_{err,info} with dev_{err,info}.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Add locks when accessing sdhost registers
bcm2835-mmc: Add range of debug options for slowing things down
bcm2835-mmc: Add option to disable some delays
bcm2835-mmc: Add option to disable MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Default to disabling MMC_QUIRK_BLK_NO_CMD23
bcm2835-mmc: Adding overclocking option
Allow a different clock speed to be substituted for a requested 50MHz.
This option is exposed using the "overclock_50" DT parameter.
Note that the mmc interface is restricted to EVEN integer divisions of
250MHz, and the highest sensible option is 63 (250/4 = 62.5), the
next being 125 (250/2) which is much too high.
Use at your own risk.
bcm2835-mmc: Round up the overclock, so 62 works for 62.5MHz
Also only warn once for each overclock setting.
mmc: bcm2835-mmc: Make available on ARCH_BCM2835
Make the bcm2835-mmc driver available for use on ARCH_BCM2835.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-mmc entry
Add Device Tree entry for bcm2835-mmc.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2835-mmc: Don't overwrite MMC capabilities from DT
bcm2835-mmc: Don't override bus width capabilities from devicetree
Take out the force setting of the MMC_CAP_4_BIT_DATA host capability
so that the result read from devicetree via mmc_of_parse() is
preserved.
bcm2835-mmc: Only claim one DMA channel
With both MMC controllers enabled there are few DMA channels left. The
bcm2835-mmc driver only uses DMA in one direction at a time, so it
doesn't need to claim two channels.
See: https://github.com/raspberrypi/linux/issues/1327
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Add support for DMA controller of BCM2708 as used in the Raspberry Pi.
Currently it only supports cyclic DMA.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
dmaengine: expand functionality by supporting scatter/gather transfers
sdhci-bcm2708 and dma.c: fix for LITE channels
DMA: fix cyclic LITE length overflow bug
dmaengine: bcm2708: Remove chancnt affectations
Mirror bcm2835-dma.c commit 9eba5536a7:
chancnt is already filled by dma_async_device_register, which uses the channel
list to know how many channels there are.
Since it's already filled, we can safely remove it from the drivers' probe
function.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: overwrite dreq only if it is not set
dreq is set when the DMA channel is fetched from Device Tree.
slave_id is set using dmaengine_slave_config().
Only overwrite dreq with slave_id if it is not set.
dreq/slave_id in the cyclic DMA case is not touched, because I don't
have hardware to test with.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
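The policy reduces to a single guarded assignment; the sketch below uses an
illustrative struct rather than the driver's real one:

struct sketch_chan {
        unsigned int dreq;       /* set when the channel came from Device Tree */
        unsigned int slave_id;   /* set via dmaengine_slave_config() */
};

static void sketch_apply_slave_id(struct sketch_chan *c)
{
        if (!c->dreq)            /* never clobber a DT-provided DREQ */
                c->dreq = c->slave_id;
}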
dmaengine: bcm2708: do device registration in the board file
Don't register the device in the driver. Do it in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: don't restrict DT support to ARCH_BCM2835
Both ARCH_BCM2835 and ARCH_BCM270x are built with OF now.
Add Device Tree support to the non ARCH_BCM2835 case.
Use the same driver name regardless of architecture.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: add bcm2835-dma entry
Add Device Tree entry for bcm2835-dma.
The entry doesn't contain any resources since they are handled
by the arch/arm/mach-bcm270x/dma.c driver.
In non-DT mode, don't add the device in the board file.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine: Add debug options
BCM270x: Add memory and irq resources to dmaengine device and DT
Prepare for merging of the legacy DMA API arch driver dma.c
with bcm2708-dmaengine by adding memory and irq resources both
to platform file device and Device Tree node.
Don't use BCM_DMAMAN_DRIVER_NAME so we don't have to include mach/dma.h
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Merge with arch dma.c driver and disable dma.c
Merge the legacy DMA API driver with bcm2708-dmaengine.
This is done so we can use bcm2708_fb on ARCH_BCM2835 (mailbox
driver is also needed).
Changes to the dma.c code:
- Use BIT() macro.
- Cutdown some comments to one line.
- Add mutex to vc_dmaman and use this, since the dev lock is locked
during probing of the engine part.
- Add global g_dmaman variable since drvdata is used by the engine part.
- Restructure for readability:
vc_dmaman_chan_alloc()
vc_dmaman_chan_free()
bcm_dma_chan_free()
- Restructure bcm_dma_chan_alloc() to simplify error handling.
- Use device irq resources instead of hardcoded bcm_dma_irqs table.
- Remove dev_dmaman_register() and code it directly.
- Remove dev_dmaman_deregister() and code it directly.
- Simplify bcm_dmaman_probe() using devm_* functions.
- Get dmachans from DT if available.
- Keep 'dma.dmachans' module argument name for backwards compatibility.
Make it available on ARCH_BCM2835 as well.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: set residue_granularity field
bcm2708-dmaengine supports residue reporting at burst level
but didn't report this via the residue_granularity field.
Without this field set properly we get playback issues with I2S cards.
dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer
bcm2708-dmaengine: Use more DMA channels (but not 12)
1) Only the bcm2708_fb drivers uses the legacy DMA API, and
it requires a BULK-capable channel, so all other types
(FAST, NORMAL and LITE) can be made available to the regular
DMA API.
2) DMA channels 11-14 share an interrupt. The driver can't
handle this, so don't use channels 12-14 (12 was used, probably
because it appears to have an interrupt, but in reality that
interrupt is for activity on ANY channel). This may explain
a lockup encountered when running out of DMA channels.
The combined effect of this patch is to leave 7 DMA channels
available + channel 0 for bcm2708_fb via the legacy API.
See: https://github.com/raspberrypi/linux/issues/1110
https://github.com/raspberrypi/linux/issues/1108
dmaengine: bcm2708: Make legacy API available for bcm2835-dma
bcm2708_fb uses the legacy DMA API, so in order to start using
bcm2835-dma, bcm2835-dma has to support the legacy API. Make this
possible by exporting bcm_dmaman_probe() and bcm_dmaman_remove().
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Change DT compatible string
Both bcm2835-dma and bcm2708-dmaengine have the same compatible string.
So change compatible to "brcm,bcm2708-dma".
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dmaengine: bcm2708: Remove driver but keep legacy API
Dropping non-DT support means we don't need this driver,
but we still need the legacy DMA API.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2708-dmaengine - Fix arm64 portability/build issues
Signed-off-by: popcornmix <popcornmix@gmail.com>
bcm2708_fb : Implement blanking support using the mailbox property interface
bcm2708_fb: Add pan and vsync controls
bcm2708_fb: DMA acceleration for fb_copyarea
Based on http://www.raspberrypi.org/phpBB3/viewtopic.php?p=62425#p62425
Also used Simon's dmaer_master module as a reference for tweaking DMA
settings for better performance.
For now busylooping only. IRQ support might be added later.
With non-overclocked Raspberry Pi, the performance is ~360 MB/s
for simple copy or ~260 MB/s for two-pass copy (used when dragging
windows to the right).
In the case of using DMA channel 0, the performance improves
to ~440 MB/s.
For comparison, VFP optimized CPU copy can only do ~114 MB/s in
the same conditions (hindered by reading uncached source buffer).
Signed-off-by: Siarhei Siamashka <siarhei.siamashka@gmail.com>
bcm2708_fb: report number of dma copies
Add a counter (exported via debugfs) reporting the
number of dma copies that the framebuffer driver
has done, in order to help evaluate different
optimization strategies.
Signed-off-by: Luke Diamand <luked@broadcom.com>
bcm2708_fb: use IRQ for DMA copies
The copyarea ioctl() uses DMA to speed things along. This
was busy-waiting for completion. This change supports using
an interrupt instead for larger transfers. For small
transfers, busy-waiting is still likely to be faster.
Signed-off-by: Luke Diamand <luke@diamand.org>
bcm2708: Make ioctl logging quieter
video: fbdev: bcm2708_fb: Don't panic on error
No need to panic the kernel if the video driver fails.
Just print a message and return an error.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
fbdev: bcm2708_fb: Add ARCH_BCM2835 support
Add Device Tree support.
Pass the device to dma_alloc_coherent() in order to get the
correct bus address on ARCH_BCM2835.
Use the new DMA legacy API header file.
Including <mach/platform.h> is not necessary.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
BCM270x_DT: Add bcm2708-fb device
Add bcm2708-fb to Device Tree and don't add the
platform device when booting in DT mode.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
usb: dwc: fix lockdep false positive
Signed-off-by: Kari Suvanto <karis79@gmail.com>
usb: dwc: fix inconsistent lock state
Signed-off-by: Kari Suvanto <karis79@gmail.com>
Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas
Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.
Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh
Make sure we wait for the reset to finish
dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
memory corruption, escalating to OOPS under high USB load.
dwc_otg: Fix unsafe access of QTD during URB enqueue
In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.
This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).
dwc_otg: Fix incorrect URB allocation error handling
If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.
dwc_otg: fix potential use-after-free case in interrupt handler
If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.
dwc_otg: add handling of SPLIT transaction data toggle errors
Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).
dwc_otg: implement tasklet for returning URBs to usbcore hcd layer
The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.
This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.
dwc_otg: fix NAK holdoff and allow on split transactions only
This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.
Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.
dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler
usb_hcd_unlink_urb_from_ep must be called with the HCD lock held. Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4a1a).
This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.
NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer. This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.
USB fix using a FIQ to implement split transactions
This commit adds a FIQ implementation that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux
dwc_otg: fix device attributes and avoid kernel warnings on boot
dwc_otg: avoid logging function that can cause panics
See: https://github.com/raspberrypi/firmware/issues/21
Thanks to cleverca22 for fix
dwc_otg: mask correct interrupts after transaction error recovery
The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.
By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unnecessary FIQ interrupt
from being generated if the FIQ is enabled.
dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ
In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.
Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.
Fix function tracing
dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue
dwc_otg: prevent OOPSes during device disconnects
The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.
Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.
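The shape of the fix is sketched below (hedged: names and structure are
illustrative, not the dwc_otg code). Everything that publishes the URB
happens inside one irq-safe critical section, so a concurrent disconnect can
never observe a half-formed entry:

static int enqueue_sketch(struct usb_hcd *hcd, struct urb *urb,
                          void *dwc_urb, spinlock_t *hcd_lock)
{
        unsigned long flags;
        int ret;

        spin_lock_irqsave(hcd_lock, flags);

        ret = usb_hcd_link_urb_to_ep(hcd, urb);   /* fails if the ep is gone */
        if (ret == 0) {
                urb->hcpriv = dwc_urb;            /* publish driver state ...   */
                /* ... and hand the transfer to the hardware here, still locked */
        }

        spin_unlock_irqrestore(hcd_lock, flags);
        return ret;
}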
dwc_otg: prevent BUG() in TT allocation if hub address is > 16
A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.
This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.
Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug use only, to track which frame an allocation happened in.
dwc_otg: make channel halts with unknown state less damaging
If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.
Add catchall handling to treat as a transaction error and retry.
dwc_otg: fiq_split: use TTs with more granularity
This fixes certain issues with split transaction scheduling.
- Isochronous multi-packet OUT transactions now hog the TT until
they are completed - this prevents hubs aborting transactions
if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
allows simultaneous use of the TT's bulk/control and periodic
transaction buffers
This commit will mainly affect USB audio playback.
dwc_otg: fix potential sleep while atomic during urb enqueue
Fixes a regression introduced with eb1b482a. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GFP_ATOMIC flag set. Force this flag when inside the larger
critical section.
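In sketch form (illustrative helper, not the driver's function), the rule is
simply that any allocation reachable from inside the critical section must
use GFP_ATOMIC, because sleeping with the spinlock held and interrupts
disabled is not allowed:

static void *qtd_alloc_sketch(size_t size, gfp_t mem_flags, bool in_critical)
{
        if (in_critical)
                mem_flags = GFP_ATOMIC;   /* must not sleep under the spinlock */
        return kmalloc(size, mem_flags);
}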
dwc_otg: make fiq_split_enable imply fiq_fix_enable
Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.
dwc_otg: prevent crashes on host port disconnects
Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.
- Clean up queue heads properly in kill_urbs_in_qh_list by
removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
device drivers so they respond in a more correct fashion
and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
interrupts if the associated URB was dequeued (and the
driver state was cleared)
dwc_otg: prevent leaking URBs during enqueue
A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.
dwc_otg: Enable NAK holdoff for control split transactions
Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500 ms. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb238.
In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.
dwc_otg: Fix for occasional lockup on boot when doing a USB reset
dwc_otg: Don't issue traffic to LS devices in FS mode
Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.
Fix ARM architecture issue with local_irq_restore()
If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.
Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.
Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.
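The bit manipulation at the heart of the fix can be sketched in plain C (the
real code operates on the CPSR with inline assembly; 0x40 is the
architectural F/FIQ-disable bit, and the helper name here is hypothetical):

#define PSR_F_BIT 0x40UL   /* FIQ-disable bit in the ARM CPSR */

static unsigned long restore_flags_keep_fiq(unsigned long saved_flags,
                                            unsigned long current_cpsr)
{
        /* take the I bit (and everything else) from the saved flags, but
         * keep the *current* FIQ state so a stale F=1 cannot re-disable it */
        return (saved_flags & ~PSR_F_BIT) | (current_cpsr & PSR_F_BIT);
}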
dwc_otg: fiq_fsm: Base commit for driver rewrite
This commit removes the previous FIQ fixes entirely and adds fiq_fsm.
This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.
All driver options have been removed and replaced with:
- dwc_otg.fiq_enable (bool)
- dwc_otg.fiq_fsm_enable (bool)
- dwc_otg.fiq_fsm_mask (bitmask)
- dwc_otg.nak_holdoff (unsigned int)
Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.
fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used
If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).
Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.
In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.
Push the error count reset into the FIQ handler.
fiq_fsm: Implement timeout mechanism
For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.
Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.
fiq_fsm: fix bounce buffer utilisation for Isochronous OUT
Multi-packet isochronous OUT transactions were subject to a few boundary
bugs. Fix them.
Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.
dwc_otg: Return full-speed frame numbers in HS mode
The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.
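The conversion itself is just a divide-by-eight, since a 1ms full-speed frame
contains eight 125us microframes; a standalone sketch of the idea (not the
driver's exact code):

#include <stdio.h>
#include <stdint.h>

static uint16_t hs_to_fs_frame(uint16_t microframe_counter)
{
        return microframe_counter >> 3;   /* 8 microframes per 1 ms frame */
}

int main(void)
{
        printf("microframe 4000 -> frame %u\n",
               (unsigned)hs_to_fs_frame(4000));   /* prints 500 */
        return 0;
}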
fiq_fsm: save PID on completion of interrupt OUT transfers
Also add edge case handling for interrupt transports.
Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.
fiq_fsm: add missing case for fiq_fsm_tt_in_use()
Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.
fiq_fsm: clear hcintmsk for aborted transactions
Prevents the FIQ from erroneously handling interrupts
on a timed out channel.
fiq_fsm: enable by default
fiq_fsm: fix dequeues for non-periodic split transactions
If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.
fiq_fsm: Disable by default
fiq_fsm: Handle HC babble errors
The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.
dwc_otg: fix interrupt registration for fiq_enable=0
Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.
fiq_fsm: Enable by default
dwc_otg: Fix various issues with root port and transaction errors
Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.
Fix a few thinkos with the transaction error passthrough for fiq_fsm.
fiq_fsm: Implement hack for Split Interrupt transactions
Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.
It goes without saying that this is a fairly egregious USB specification
violation, but it works.
Original idea by Hans Petter Selasky @ FreeBSD.org.
dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.
dwc_otg: introduce fiq_fsm_spin(un|)lock()
SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.
This also makes it possible to run the FIQ and IRQ handlers on different
cores.
fiq_fsm: fix build on bcm2708 and bcm2709 platforms
dwc_otg: put some barriers back where they should be for UP
bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active
dwc_otg: fixup read-modify-write in critical paths
Be more careful about read-modify-write on registers that the FIQ
also touches.
Guard fiq_fsm_spin_lock with fiq_enable check
fiq_fsm: Falling out of the state machine isn't fatal
This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.
Also get rid of the useless return value.
squash: dwc_otg: Allow to build without SMP
usb: core: make overcurrent messages more prominent
Hub overcurrent messages are more serious than "debug". Increase loglevel.
usb: dwc_otg: Don't use dma_to_virt()
Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.
Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.
transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
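The guarded fallback described above boils down to the following (hedged
fragment with an illustrative helper name; on ARM, __bus_to_virt() yields an
address that is cast back to a pointer):

static void *xfer_buffer_sketch(void *transfer_buffer, u32 length,
                                dma_addr_t dma_handle)
{
        if (transfer_buffer)
                return transfer_buffer;     /* normal case: use what usbcore gave us */
        if (!length)
                return NULL;                /* nothing to transfer, nothing to map */
        /* legacy fallback only when a length is set but no buffer pointer is */
        return (void *)__bus_to_virt((unsigned long)dma_handle);
}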
dwc_otg: Fix crash when fiq_enable=0
dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly
Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.
Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.
dwc_otg: Force host mode to fix incorrect compute module boards
dwc_otg: Add ARCH_BCM2835 support
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Simplify FIQ irq number code
Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
dwc_otg: Remove duplicate gadget probe/unregister function
dwc_otg: Properly set the HFIR
Douglas Anderson reported:
According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1
This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)
and reported lower timing jitter on a USB analyser
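A worked example of the corrected formula, assuming typical PHY clocks
(60MHz for high speed, 48MHz for full/low speed; these values are assumptions
for illustration, not read from the hardware):

#include <stdio.h>

int main(void)
{
        unsigned hs_phy_mhz = 60, fs_phy_mhz = 48;

        /* FRINT = interval_in_us * PHY_clock_in_MHz - 1 */
        printf("HS:    125 * %u - 1  = %u\n", hs_phy_mhz, 125 * hs_phy_mhz - 1);
        printf("FS/LS: 1000 * %u - 1 = %u\n", fs_phy_mhz, 1000 * fs_phy_mhz - 1);
        return 0;
}

That is 7499 and 47999 respectively, one less than the values the older
documentation would have produced.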
dwc_otg: trim xfer length when buffer larger than allocated size is received
dwc_otg: Don't free qh align buffers in atomic context
dwc_otg: Enable the hack for Split Interrupt transactions by default
dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.
See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437
Signed-off-by: popcornmix <popcornmix@gmail.com>
dwc_otg: Use kzalloc when suitable
dwc_otg: Pass struct device to dma_alloc*()
This makes it possible to get the bus address from Device Tree.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: popcornmix <popcornmix@gmail.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm2709: Drop platform smp and timer init code
irq-bcm2836 handles this through these functions:
bcm2835_init_local_timer_frequency()
bcm2836_arm_irqchip_smp_init()
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
bcm270x: Use watchdog for reboot/poweroff
The watchdog driver already has support for reboot/poweroff.
Make use of this and remove the code from the platform files.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
so that special/critical clocks can get enabled early on
in the boot process, avoiding the risk of disabling a clock,
pll_divider or pll when a claiming driver fails to install
properly - maybe it needs to defer.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
The Raspberry Pi firmware looks at the RSTS register to know which
partition to boot from. The reboot syscall command
LINUX_REBOOT_CMD_RESTART2 supports passing in a string argument.
Add support for passing in a partition number 0..63 to boot from.
Partition 63 is a special partition indicating halt.
If the partition doesn't exist, the firmware falls back to partition 0.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
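From userspace the string argument travels through the reboot(2) syscall; a
hedged example follows (requires root and really does reboot the machine;
glibc's reboot() wrapper does not expose the argument, so the raw syscall is
used):

#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/reboot.h>

int main(void)
{
        const char *partition = "2";   /* boot partition 2 next time; "63" means halt */

        sync();                        /* flush filesystems before rebooting */
        if (syscall(SYS_reboot, LINUX_REBOOT_MAGIC1, LINUX_REBOOT_MAGIC2,
                    LINUX_REBOOT_CMD_RESTART2, partition) < 0)
                perror("reboot");
        return 0;
}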
These divide off of PLLD_PER and are used for the ethernet and wifi
PHYs source PLLs. Neither of them is currently represented by a phy
device that would grab the clock for us.
This keeps other drivers from killing the networking PHYs when they
disable their own clocks and trigger PLLD_PER's refcount going to 0.
v2: Skip marking as critical if they aren't on at boot.
Signed-off-by: Eric Anholt <eric@anholt.net>
Load driver early since at least bcm2708_fb doesn't support deferred
probing and even if it did, we don't want the video driver deferred.
Support the legacy DMA API which is needed by bcm2708_fb.
Don't mask out channel 2.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
The VideoCore bootloader passes in Serial number and
Revision number through Device Tree. Make these available to
userspace through /proc/cpuinfo.
Mainline status:
There is a commit in linux-next that standardize passing the serial
number through Device Tree (string: /serial-number):
ARM: 8355/1: arch: Show the serial number from devicetree in cpuinfo
There was an attempt to do the same with the revision number, but it
didn't get in:
[PATCH v2 1/2] arm: devtree: Set system_rev from DT revision
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
The spi-bcm2835 driver automatically uses GPIO chip-selects due to
some unreliability of the native ones. In doing so it chooses the
same pins as the native chip-selects would use, but the existing
code always uses pins 7 and 8, wherever the SPI function is mapped.
Search the pinctrl group assigned to the driver for pins that
correspond to native chip-selects, and use those for GPIO chip-
selects.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
When dynamically unloading overlays, it is important that freed pins are
restored to being inputs to prevent functions from being enabled in
multiple places at once.
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
Although the GPIO controller can generate three interrupts (four counting
the common one), the device tree files currently only specify two. In the
absence of the third, simply don't register that interrupt (as opposed to
registering 0), which has the effect of making it impossible to generate
interrupts for GPIOs 46-53 which, since they share pins with the SD card
interface, is unlikely to be a problem.
Contrary to the documentation, the BCM2835 GPIO controller actually has
four interrupt lines - one each for the three IRQ groups and one common. Rather
confusingly, the GPIO interrupt groups don't correspond directly with the GPIO
control banks. Instead, GPIOs 0-27 generate IRQ GPIO0, 28-45 GPIO1 and
46-53 GPIO2.
Awkwardly, the GPIOs for IRQ GPIO1 straddle two 32-entry GPIO banks, so it is
cleaner to split out a function to process the interrupts for a single GPIO
bank.
This bug has only just been observed because GPIOs above 27 can only be
accessed on an old Raspberry Pi with the optional P5 header fitted, where
the pins are often used for I2S instead.
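The mapping described above is easy to tabulate; the standalone sketch below
just encodes the three ranges and prints where a few sample GPIOs land:

#include <stdio.h>

static int gpio_irq_group(unsigned gpio)
{
        if (gpio <= 27)
                return 0;   /* IRQ GPIO0 */
        if (gpio <= 45)
                return 1;   /* IRQ GPIO1 - straddles both 32-entry register banks */
        if (gpio <= 53)
                return 2;   /* IRQ GPIO2 - pins shared with the SD card interface */
        return -1;
}

int main(void)
{
        unsigned samples[] = { 0, 27, 28, 45, 46, 53 };

        for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
                printf("GPIO %2u -> IRQ group GPIO%d\n",
                       samples[i], gpio_irq_group(samples[i]));
        return 0;
}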
Add a duplicate irq range with an offset on the hwirqs so the
driver can detect that enable_fiq() is used.
Tested with downstream dwc_otg USB controller driver.
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
The old arch-specific IRQ macros included a dsb to ensure the
write to clear the mailbox interrupt completed before returning
from the interrupt. The BCM2836 irqchip driver needs the same
precaution to avoid spurious interrupts.
Spurious interrupts are still possible for other reasons,
though, so trap them early.
See commit dae803e165 -- the warning is
expected sometimes when using CMA. However, that commit still spams
my kernel log with these warnings.
Signed-off-by: Eric Anholt <eric@anholt.net>
smsc95xx is adjusting truesize when it shouldn't, and following a recent patch from Eric this is now triggering warnings.
This patch stops smsc95xx from changing truesize.
Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com>
commit 92e55f412c upstream.
Unlike ipv4, this control socket is shared by all cpus so we cannot use
it as scratchpad area to annotate the mark that we pass to ip6_xmit().
Add a new parameter to ip6_xmit() to indicate the mark. The SCTP socket
family caches the flowi6 structure in the sctp_transport structure, so
we cannot use it to carry the mark unless we later on reset it back, which
I discarded since it looks ugly to me.
Fixes: bf99b4ded5 ("tcp: fix mark propagation with fwmark_reflect enabled")
Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fd758d611 upstream.
When adding a new rule to an fte, we need to hold the fte lock
until we add that rule to the fte and increase the fte ref count.
Fixes: 0c56b97503 ("net/mlx5_core: Introduce flow steering API")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf99b4ded5 upstream.
Otherwise, RST packets generated by the TCP stack for non-existing
sockets always have mark 0.
The mark from the original packet is assigned to the netns_ipv4/6
socket used to send the response so that it can get copied into the
response skb when the socket sends it.
Fixes: e110861f86 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Pau Espin Pedrol <pau.espin@tessares.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9c8bb163ae ]
In function igmpv3/mld_add_delrec() we allocate pmc and put it in
idev->mc_tomb, so we should free it when we don't need it in del_delrec().
But I incorrectly removed kfree(pmc) in the latest two patches. Now fix it.
Fixes: 24803f38a5 ("igmp: do not remove igmp souce list info when ...")
Fixes: 1666d49e1d ("mld: do not remove mld souce list info when ...")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1666d49e1d ]
This is an IPv6 version of commit 24803f38a5 ("igmp: do not remove igmp
souce list..."). In mld_del_delrec(), we will restore back all source filter
info instead of flush them.
Move mld_clear_delrec() from ipv6_mc_down() to ipv6_mc_destroy_dev() since
we should not remove source list info when set link down. Remove
igmp6_group_dropped() in ipv6_mc_destroy_dev() since we have called it in
ipv6_mc_down().
Also clear all source info after igmp6_group_dropped() instead of in it
because ipv6_mc_down() will call igmp6_group_dropped().
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 72fb96e7bd ]
udp_ioctl(), as its name suggests, is used by UDP protocols,
but is also used by L2TP :(
L2TP should use its own handler, because it really does not
look the same.
SIOCINQ for instance should not assume UDP checksum or headers.
Thanks to Andrey and syzkaller team for providing the report
and a nice reproducer.
While crashes only happen on recent kernels (after commit
7c13f97ffd ("udp: do fwd memory scheduling on dequeue")), this
probably needs to be backported to older kernels.
Fixes: 7c13f97ffd ("udp: do fwd memory scheduling on dequeue")
Fixes: 8558467201 ("udp: Fix udp_poll() and ioctl()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 382e1eea2d ]
dsa_slave_create() can fail, and dsa_user_port_unapply() will properly check
for the network device not being NULL before attempting to destroy it. We were
not setting the slave network device as NULL if dsa_slave_create() failed, so
we would later on be calling dsa_slave_destroy() on a now free'd and
uninitialized network device, causing crashes in dsa_slave_destroy().
Fixes: 83c0afaec7 ("net: dsa: Add new binding implementation")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57031eb794 ]
Link layer protocols may unconditionally pull headers, as Ethernet
does in eth_type_trans. Ensure that the entire link layer header
always lies in the skb linear segment. tpacket_snd has such a check.
Extend this to packet_snd.
Variable length link layer headers complicate the computation
somewhat. Here skb->len may be smaller than dev->hard_header_len.
Round up the linear length to be at least as long as the smallest of
the two.
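As a rough sketch of that rule (illustrative only, not the upstream hunk; the
helper name is made up), the send path can pull min(skb->len,
dev->hard_header_len) bytes into the linear area before handing the skb to
the device:
/* Sketch: make sure the link layer header lies in the linear segment.
 * With variable length headers skb->len may be smaller than
 * dev->hard_header_len, so pull the smaller of the two.
 */
static int ll_header_in_linear(struct sk_buff *skb,
			       const struct net_device *dev)
{
	unsigned int pull = min_t(unsigned int, skb->len,
				  dev->hard_header_len);

	return pskb_may_pull(skb, pull) ? 0 : -EINVAL;
}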
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 217e6fa24c ]
The stack must not pass packets to device drivers that are shorter
than the minimum link layer header length.
Previously, packet sockets would drop packets smaller than or equal
to dev->hard_header_len, but this has false positives. Zero length
payload is used over Ethernet. Other link layer protocols support
variable length headers. Support for validation of these protocols
removed the min length check for all protocols.
Introduce an explicit dev->min_header_len parameter and drop all
packets below this value. Initially, set it to non-zero only for
Ethernet and loopback. Other protocols can follow in a patch to
net-next.
Fixes: 9ed988cd59 ("packet: validate variable length ll headers")
Reported-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2bd137de53 ]
An error was reported upgrading to 4.9.8:
root@Typhoon:~# ip route add default table 210 nexthop dev eth0 via 10.68.64.1
weight 1 nexthop dev eth0 via 10.68.64.2 weight 1
RTNETLINK answers: Operation not supported
The problem occurs when CONFIG_LWTUNNEL is not enabled and a multipath
route is submitted.
The point of lwtunnel_valid_encap_type_attr is to catch modules that
need to be loaded before any references are taken with rtnl held. With
CONFIG_LWTUNNEL disabled, there will be no modules to load so the
lwtunnel_valid_encap_type_attr stub should just return 0.
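With CONFIG_LWTUNNEL disabled the validation helper can simply succeed; a
hedged sketch of such a stub (the parameter list here is illustrative and may
not match the upstream header exactly):
#ifndef CONFIG_LWTUNNEL
/* Sketch: no lwtunnel modules exist to autoload, so accepting the
 * attribute unconditionally is the correct behaviour for the stub.
 */
static inline int lwtunnel_valid_encap_type_attr(struct nlattr *attr, int len)
{
	return 0;
}
#endif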
Fixes: 9ed59592e3 ("lwtunnel: fix autoload of lwt modules")
Reported-by: pupilla@libero.it
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2dcab59848 ]
Alexander Popov reported that an application may trigger a BUG_ON in
sctp_wait_for_sndbuf if the socket tx buffer is full, a thread is
waiting on it to queue more data and meanwhile another thread peels off
the association being used by the first thread.
This patch replaces the BUG_ON call with proper error handling. It
will return -EPIPE to the original sendmsg call, similarly to what would
have been done if the association wasn't found in the first place.
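The shape of the fix, reconstructed from the description above (a sketch, not
the verbatim hunk): re-check after sleeping in sctp_wait_for_sndbuf() that the
association still belongs to this socket, and bail out instead of asserting.
/* Sketch: the association may have been peeled off to another socket
 * while we were waiting for sndbuf space, so don't BUG_ON() here.
 */
if (sk != asoc->base.sk)
	goto do_error;	/* the do_error path returns -EPIPE to sendmsg() */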
Acked-by: Alexander Popov <alex.popov@linux.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit bd4ce941c8 ]
mlx4 may schedule napi from a workqueue. Afterwards, softirqs are not run
in a deterministic time frame and the following message may be logged:
NOHZ: local_softirq_pending 08
The problem is the same as what was described in commit ec13ee8014
("virtio_net: invoke softirqs after __napi_schedule") and this patch
applies the same fix to mlx4.
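The referenced fix is the local_bh_disable()/local_bh_enable() pattern around
the deferred napi scheduling, so the raised NET_RX softirq runs as soon as
bottom halves are re-enabled; a hedged sketch (the napi pointer here is
illustrative):
/* Sketch: scheduling napi from process context (e.g. a workqueue). */
local_bh_disable();
napi_schedule(napi);	/* struct napi_struct *napi of the affected ring */
local_bh_enable();	/* runs the pending softirq right away */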
Fixes: 07841f9d94 ("net/mlx4_en: Schedule napi when RX buffers allocation fails")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2d6a0e9de0 ]
Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7926aff5c5 ]
Allocating USB buffers on the stack is not portable, and no longer
works on x86_64 (with VMAP_STACK enabled as per default).
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 837585a537 ]
When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
Data length is verified to be greater than or equal to expected header
length tun->vnet_hdr_sz before copying.
Macvtap functions read the value once, but unless READ_ONCE is used,
the compiler may ignore this and read multiple times. Enforce a single
read and locally cached value to avoid updates between test and use.
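The pattern is simply to snapshot the field once and use the cached copy for
both the check and the copy; a sketch (the struct member path is
illustrative):
/* Sketch: one READ_ONCE(), then only the local copy is used, so the
 * length check and the header copy cannot observe different values.
 */
int vnet_hdr_len = READ_ONCE(q->vnet_hdr_sz);

if (len < vnet_hdr_len)
	return -EINVAL;
/* ... copy vnet_hdr_len bytes of header, never re-read q->vnet_hdr_sz ... */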
Signed-off-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e1edab87fa ]
When IFF_VNET_HDR is enabled, a virtio_net header must precede data.
Data length is verified to be greater than or equal to expected header
length tun->vnet_hdr_sz before copying.
Read this value once and cache locally, as it can be updated between
the test and use (TOCTOU).
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
CC: Eric Dumazet <edumazet@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ccf7abb93a ]
Splicing from TCP socket is vulnerable when a packet with URG flag is
received and stored into receive queue.
__tcp_splice_read() returns 0, and sk_wait_data() immediately
returns since there is the problematic skb in queue.
This is a nice way to burn cpu (aka infinite loop) and trigger
soft lockups.
Again, this gem was found by syzkaller tool.
Fixes: 9c55e01c0c ("[TCP]: Splice receive support.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ebf6c9cb23 ]
Dmitry reported use-after-free in ip6_datagram_recv_specific_ctl()
A similar bug was fixed in commit 8ce48623f0 ("ipv6: tcp: restore
IP6CB for pktoptions skbs"), but I missed another spot.
tcp_v6_syn_recv_sock() can indeed set np->pktoptions from ireq->pktopts
Fixes: 971f10eca1 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7892032cfe ]
Andrey Konovalov reported out of bound accesses in ip6gre_err()
If GRE flags contains GRE_KEY, the following expression
*(((__be32 *)p) + (grehlen / 4) - 1)
accesses data ~40 bytes after the expected point, since
grehlen includes the size of IPv6 headers.
Let's use a "struct gre_base_hdr *greh" pointer to make this
code more readable.
p[1] becomes greh->protocol.
grehlen is the GRE header length.
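A hedged sketch of the safer parsing described above (not the exact upstream
hunk): walk the GRE header that follows the IPv6 header and compute the key
offset from the GRE flags alone, so the IPv6 header length never leaks into
the offset.
/* Sketch: greh points just past the IPv6 header. */
const struct gre_base_hdr *greh = (const struct gre_base_hdr *)(ipv6h + 1);
int grehlen = sizeof(*greh);		/* GRE header bytes only */

if (greh->flags & (GRE_VERSION | GRE_ROUTING))
	return;
if (greh->flags & GRE_CSUM)
	grehlen += 4;			/* checksum + reserved */
if (greh->flags & GRE_KEY)
	grehlen += 4;			/* key is the last word of the header */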
Fixes: c12b395a46 ("gre: Support GRE over IPv6")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 34b2cef20f ]
Andrey Konovalov got crashes in __ip_options_echo() when a NULL skb->dst
is accessed.
ipv4_pktinfo_prepare() should not drop the dst if (evil) IP options
are present.
We could refine the test to the presence of ts_needtime or srr,
but IP options are not often used, so let's be conservative.
Thanks to syzkaller team for finding this bug.
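In code terms the conservative rule reads roughly as below (a sketch of the
idea, not necessarily the literal hunk): only drop the dst when no IP options
were recorded for the skb.
/* Sketch: __ip_options_echo() may need skb->dst when options exist. */
if (unlikely(IPCB(skb)->opt.optlen))
	skb_dst_force(skb);	/* keep (and pin) the dst */
else
	skb_dst_drop(skb);	/* no options: safe to drop as before */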
Fixes: d826eb14ec ("ipv4: PKTINFO doesnt need dst reference")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0a764db103 ]
DW GMAC databook says the following about bits in "Register 15 (Interrupt
Mask Register)":
--------------------------->8-------------------------
When set, this bit __disables_the_assertion_of_the_interrupt_signal__
because of the setting of XXX bit in Register 14 (Interrupt
Status Register).
--------------------------->8-------------------------
In fact, even if we mask one bit in the mask register it doesn't prevent
the corresponding bit from appearing in the status register; it only disables
interrupt generation for the corresponding event.
But currently we expect a bit different behavior: status bits to be in
sync with their masks, i.e. if mask for bit A is set in the mask
register then bit A won't appear in the interrupt status register.
This was proven to be an incorrect assumption, see discussion here [1].
That misunderstanding causes unexpected behaviour of the GMAC, for
example we were happy enough to just see bogus messages about link
state changes.
So from now on we'll be only checking bits that really may trigger an
interrupt.
[1] https://lkml.org/lkml/2016/11/3/413
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Fabrice Gasnier <fabrice.gasnier@st.com>
Cc: Joachim Eastwood <manabian@gmail.com>
Cc: Phil Reid <preid@electromag.com.au>
Cc: David Miller <davem@davemloft.net>
Cc: Alexandre Torgue <alexandre.torgue@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 06425c308b ]
syzkaller fuzzer was able to trigger a divide by zero when
TCP window scaling is not enabled.
SO_RCVBUF can be used not only to increase sk_rcvbuf, but also
to decrease it below the current receive buffer utilization.
If mss is negative or 0, just return a zero TCP window.
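In effect the guard is just an early return before the divisions that derive
the advertised window from the MSS; a sketch of that check:
/* Sketch: __tcp_select_window()-style guard against a bogus MSS. */
if (mss <= 0)
	return 0;	/* advertise a zero window instead of dividing by zero */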
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fbfa743a9d ]
This function suffers from multiple issues.
The first is that pskb_may_pull() may reallocate skb->head,
so the 'raw' pointer needs either to be reloaded or not used at all.
The second is that NEXTHDR_DEST handling does not validate
that the options are present in skb->data, so we might read
garbage or access non-existent memory.
With help from Willem de Bruijn.
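The first issue is the classic pskb_may_pull() rule: any pointer into
skb->data must be recomputed after a successful pull because the head may
have been reallocated. A sketch of the pattern:
/* Sketch: never cache a pointer into skb->data across pskb_may_pull(). */
if (!pskb_may_pull(skb, off + optlen))
	return;
nh = skb_network_header(skb);	/* reload: skb->head may have moved */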
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit fd62d9f5c5 ]
In the current version, the matchall internal state is split into two
structs: cls_matchall_head and cls_matchall_filter. This makes little
sense, as a matchall instance supports only one filter, and there is no
situation where one exists and the other does not. In addition, this led
to races when a filter was deleted while a packet was being processed.
Unify these two structs into one, thus simplifying the process of matchall
creation and deletion. As a result, the new, delete and get callbacks have
a dummy implementation where all the work is done in destroy and change
callbacks, as was done in cls_cgroup.
Fixes: bf3994d2ed ("net/sched: introduce Match-all classifier")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a100ff3eef ]
Modifying the TIR hash should change the selected fields bitmask in addition
to the function and key.
Formerly, only in the ethtool mlx5e_set_rxfh path ("ethtool -X") we would not
set this field, resulting in zeroing of its value, which means no packet
fields are used for the RX RSS hash calculation, thus causing all traffic to
arrive in RQ[0].
On driver load out of the box we don't have this issue, since the TIR
hash is fully created from scratch.
Tested:
ethtool -X ethX hkey <new key>
ethtool -X ethX hfunc <new func>
ethtool -X ethX equal <new indirection table>
All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.
Fixes: bdfc028de1 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f1712c7371 ]
Zhang Yanmin reported crashes [1] and provided a patch adding a
synchronize_rcu() call in can_rx_unregister().
The main problem seems to be that the sockets themselves are not RCU
protected.
If CAN uses RCU for delivery, then sockets should be freed only after
one RCU grace period.
Recent kernels could use sock_set_flag(sk, SOCK_RCU_FREE), but let's
ease stable backports with the following fix instead.
[1]
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff81495e25>] selinux_socket_sock_rcv_skb+0x65/0x2a0
Call Trace:
<IRQ>
[<ffffffff81485d8c>] security_sock_rcv_skb+0x4c/0x60
[<ffffffff81d55771>] sk_filter+0x41/0x210
[<ffffffff81d12913>] sock_queue_rcv_skb+0x53/0x3a0
[<ffffffff81f0a2b3>] raw_rcv+0x2a3/0x3c0
[<ffffffff81f06eab>] can_rcv_filter+0x12b/0x370
[<ffffffff81f07af9>] can_receive+0xd9/0x120
[<ffffffff81f07beb>] can_rcv+0xab/0x100
[<ffffffff81d362ac>] __netif_receive_skb_core+0xd8c/0x11f0
[<ffffffff81d36734>] __netif_receive_skb+0x24/0xb0
[<ffffffff81d37f67>] process_backlog+0x127/0x280
[<ffffffff81d36f7b>] net_rx_action+0x33b/0x4f0
[<ffffffff810c88d4>] __do_softirq+0x184/0x440
[<ffffffff81f9e86c>] do_softirq_own_stack+0x1c/0x30
<EOI>
[<ffffffff810c76fb>] do_softirq.part.18+0x3b/0x40
[<ffffffff810c8bed>] do_softirq+0x1d/0x20
[<ffffffff81d30085>] netif_rx_ni+0xe5/0x110
[<ffffffff8199cc87>] slcan_receive_buf+0x507/0x520
[<ffffffff8167ef7c>] flush_to_ldisc+0x21c/0x230
[<ffffffff810e3baf>] process_one_work+0x24f/0x670
[<ffffffff810e44ed>] worker_thread+0x9d/0x6f0
[<ffffffff810e4450>] ? rescuer_thread+0x480/0x480
[<ffffffff810ebafc>] kthread+0x12c/0x150
[<ffffffff81f9ccef>] ret_from_fork+0x3f/0x70
Reported-by: Zhang Yanmin <yanmin.zhang@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8381cdd0e3 upstream.
The -o/--order option selects the column number used to sort a diff result.
It does the job by adding an hpp field at the beginning of the sort list.
But it should not be added to the output field list, as it has none of the
callbacks required by an output field.
During the setup_sorting(), the perf_hpp__setup_output_field() appends
the given sort keys to the output field if it's not there already.
Originally it was checked by fmt->list being non-empty. But commit
3f931f2c42 ("perf hists: Make hpp setup function generic") changed it
to check the ->equal callback.
Anyway, we don't need to add the pseudo hpp field to the output field
list since it won't be used for output. So just skip fields if they
have no ->color or ->entry callbacks.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 3f931f2c42 ("perf hists: Make hpp setup function generic")
Link: http://lkml.kernel.org/r/20170118051457.30946-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a1c9f97f0b upstream.
Commit 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field
interface") changed list_add() to perf_hpp__register_sort_field().
This resulted in a behavior change since the field was added to the tail
instead of the head. So the -o option is mostly ignored due to its
order in the list.
This patch fixes it by adding perf_hpp__prepend_sort_field().
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 21e6d84286 ("perf diff: Use perf_hpp__register_sort_field interface")
Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bfeda41d06 upstream.
Since KERN_CONT became meaningful again, lockdep stack traces have had
annoying extra newlines, like this:
[ 5.561122] -> #1 (B){+.+...}:
[ 5.561528]
[ 5.561532] [<ffffffff810d8873>] lock_acquire+0xc3/0x210
[ 5.562178]
[ 5.562181] [<ffffffff816f6414>] mutex_lock_nested+0x74/0x6d0
[ 5.562861]
[ 5.562880] [<ffffffffa01aa3c3>] init_btrfs_fs+0x21/0x196 [btrfs]
[ 5.563717]
[ 5.563721] [<ffffffff81000472>] do_one_initcall+0x52/0x1b0
[ 5.564554]
[ 5.564559] [<ffffffff811a3af6>] do_init_module+0x5f/0x209
[ 5.565357]
[ 5.565361] [<ffffffff81122f4d>] load_module+0x218d/0x2b80
[ 5.566020]
[ 5.566021] [<ffffffff81123beb>] SyS_finit_module+0xeb/0x120
[ 5.566694]
[ 5.566696] [<ffffffff816fd241>] entry_SYSCALL_64_fastpath+0x1f/0xc2
That's happening because each printk() call now gets printed on its own
line, and we do a separate call to print the spaces before the symbol.
Fix it by doing the printk() directly instead of using the
print_ip_sym() helper.
Additionally, the symbol address isn't very helpful, so let's get rid of
that, too. The final result looks like this:
[ 5.194518] -> #1 (B){+.+...}:
[ 5.195002] lock_acquire+0xc3/0x210
[ 5.195439] mutex_lock_nested+0x74/0x6d0
[ 5.196491] do_one_initcall+0x52/0x1b0
[ 5.196939] do_init_module+0x5f/0x209
[ 5.197355] load_module+0x218d/0x2b80
[ 5.197792] SyS_finit_module+0xeb/0x120
[ 5.198251] entry_SYSCALL_64_fastpath+0x1f/0xc2
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@fb.com
Fixes: 4bcc595ccd ("printk: reinstate KERN_CONT for printing continuation lines")
Link: http://lkml.kernel.org/r/43b4e114724b2bdb0308fa86cb33aa07d3d67fad.1486510315.git.osandov@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79a8b9aa38 upstream.
Commit:
a33d331761 ("x86/CPU/AMD: Fix Bulldozer topology")
restored the initial approach we had with the Fam15h topology of
enumerating CU (Compute Unit) threads as cores. And this is still
correct - they're beefier than HT threads but still have some
shared functionality.
Our current approach has a problem with the Mad Max Steam game, for
example. Yves Dionne reported a certain "choppiness" while playing on
v4.9.5.
That problem stems most likely from the fact that the CU threads share
resources within one CU and when we schedule to a thread of a different
compute unit, this incurs latency due to migrating the working set to a
different CU through the caches.
When the thread siblings mask mirrors that aspect of the CUs and
threads, the scheduler pays attention to it and tries to schedule within
one CU first. Which takes care of the latency, of course.
Reported-by: Yves Dionne <yves.dionne@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Brice Goglin <Brice.Goglin@inria.fr>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <yazen.ghannam@amd.com>
Link: http://lkml.kernel.org/r/20170205105022.8705-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 146fbb7669 upstream.
CONFIG_KASAN=y needs a lot of virtual memory mapped for its shadow.
In that case ptdump_walk_pgd_level_core() takes a lot of time to
walk across all page tables and doing this without
rescheduling causes soft lockups:
NMI watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [swapper/0:1]
...
Call Trace:
ptdump_walk_pgd_level_core+0x40c/0x550
ptdump_walk_pgd_level_checkwx+0x17/0x20
mark_rodata_ro+0x13b/0x150
kernel_init+0x2f/0x120
ret_from_fork+0x2c/0x40
I guess that this issue might arise even without KASAN on huge machines
with several terabytes of RAM.
Stick cond_resched() in pgd loop to fix this.
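The fix follows the usual pattern for long kernel-space loops: yield once per
outer iteration so the page-table walk no longer monopolizes the CPU. A
sketch (the walker helper name is illustrative, not the real API):
/* Sketch: one cond_resched() per PGD entry while dumping page tables. */
for (i = 0; i < PTRS_PER_PGD; i++) {
	walk_pgd_entry(st, pgd + i);	/* illustrative helper name */
	cond_resched();
}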
Reported-by: Tobias Regnery <tobias.regnery@gmail.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: kasan-dev@googlegroups.com
Cc: Alexander Potapenko <glider@google.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Link: http://lkml.kernel.org/r/20170210095405.31802-1-aryabinin@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f3d83317a6 upstream.
This reverts commit f6a0dd107a.
The commit caused a regression on LINE6 Transport that has no control
caps. Although reverting the commit may bring back a spurious
error message for some devices, it's the simplest regression fix,
hence it's taken as is at first. The further code fix will follow
later.
Fixes: f6a0dd107a ("ALSA: line6: Only determine control port properties if needed")
Reported-by: Igor Zinovev <zinigor@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37a7ea4a9b upstream.
snd_seq_pool_done() syncs with closing of all opened threads, but it
aborts the wait loop with a timeout, and proceeds to release the
resource even if not all threads have been closed. The timeout was 5
seconds, and if you run crazy stuff, it can easily be exceeded, and may
result in access to an invalid memory address -- this is what
syzkaller detected in a bug report.
As a fix, let the code graduate from naiveness, simply remove the loop
timeout.
BugLink: http://lkml.kernel.org/r/CACT4Y+YdhDV2H5LLzDTJDVF-qiYHUHhtRaW4rbb4gUhTCQB81w@mail.gmail.com
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4842e98f26 upstream.
When a sequencer queue is created in snd_seq_queue_alloc(), it adds the
new queue element to the public list before referencing it. Thus the
queue might be deleted before the call of snd_seq_queue_use(), and it
results in the use-after-free error, as spotted by syzkaller.
The fix is to reference the queue object at the right time.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af677166cf upstream.
Without this change, the HDMI/DP codec will be recognised as a
generic codec, and there is no sound when playing through this codec.
As suggested by the NVidia side, after adding the new ID to the driver,
sound playback works well.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7447095485 upstream.
rx_refill_timer should be deleted as soon as we disconnect from the
backend since otherwise it is possible for the timer to go off before
we get to xennet_destroy_queues(). If this happens we may dereference
queue->rx.sring which is set to NULL in xennet_disconnect_backend().
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b25671497 upstream.
The IPIs come in as HVI not EE, so we need to test the appropriate
SRR1 bits. The encoding is such that it won't have false positives
on P7 and P8 so we can just test it like that. We also need to handle
the icp-opal variant of the flush.
Fixes: d74361881f ("powerpc/xics: Add ICP OPAL backend")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90c1e3c2fa upstream.
Three tiny changes to the ERAT flushing logic: First don't make
it depend on DD1. It hasn't been decided yet but we might run
DD2 in a mode that also requires explicit flushes for performance
reasons so make it unconditional. We also add a missing isync, and
finally remove the flush from _tlbiel_va as it is only necessary
for congruence-class invalidations (PID, LPID and full TLB), not
targeted invalidations.
Fixes: 96ed1fe511 ("powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a36224918 upstream.
Commit 4c63c2454e incorrectly assumed that returning -ENOIOCTLCMD would
cause the native ioctl to be called. The ->compat_ioctl callback is
expected to handle all ioctls, not just compat variants. As a result,
when using 32-bit userspace on 64-bit kernels, everything except those
three ioctls would return -ENOTTY.
Fixes: 4c63c2454e ("btrfs: bugfix: handle FS_IOC32_{GETFLAGS,SETFLAGS,GETVERSION} in btrfs_ioctl")
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2dfa6688aa upstream.
Dan Carpenter kindly reported:
<quote>
The patch d27a7cb919: "zfcp: trace on request for open and close of
WKA port" from Aug 10, 2016, leads to the following static checker
warning:
drivers/s390/scsi/zfcp_fsf.c:1615 zfcp_fsf_open_wka_port()
warn: 'req' was already freed.
drivers/s390/scsi/zfcp_fsf.c
1609 zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
1610 retval = zfcp_fsf_req_send(req);
1611 if (retval)
1612 zfcp_fsf_req_free(req);
^^^
Freed.
1613 out:
1614 spin_unlock_irq(&qdio->req_q_lock);
1615 if (req && !IS_ERR(req))
1616 zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id);
^^^^^^^^^^^
Use after free.
1617 return retval;
1618 }
Same thing for zfcp_fsf_close_wka_port() as well.
</quote>
Rather than relying on req being NULL (or an ERR_PTR) for all cases where
we don't want to or should not trace,
simply check retval, which is unconditionally initialized to -EIO != 0
and can only become 0 on a successful retval = zfcp_fsf_req_send(req).
With that we can also remove the now-unnecessary unconditional
initialization of req which was introduced with that earlier commit.
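Reconstructed from that description (a sketch, not a verbatim quote of the
fixed function), the tail of zfcp_fsf_open_wka_port() then looks roughly
like:
	retval = zfcp_fsf_req_send(req);
	if (retval)
		zfcp_fsf_req_free(req);
out:
	spin_unlock_irq(&qdio->req_q_lock);
	if (!retval)	/* only trace when the request was actually sent */
		zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req->req_id);
	return retval;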
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Suggested-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: d27a7cb919 ("zfcp: trace on request for open and close of WKA port")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 433e19cf33 upstream.
Commit a389fcfd2c ("Drivers: hv: vmbus: Fix signaling logic in
hv_need_to_signal_on_read()")
added the proper mb(), but removed the test "prev_write_sz < pending_sz"
when making the signal decision.
As a result, the guest can signal the host unnecessarily,
and then the host can throttle the guest because the host
thinks the guest is buggy or malicious; finally the user
running stress test can perceive intermittent freeze of
the guest.
This patch brings back the test, and properly handles the
in-place consumption APIs used by NetVSC (see get_next_pkt_raw(),
put_pkt_raw() and commit_rd_index()).
Fixes: a389fcfd2c ("Drivers: hv: vmbus: Fix signaling logic in
hv_need_to_signal_on_read()")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reported-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Tested-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3372592a14 upstream.
Signal the host when we determine the host is to be signaled -
on the read path. The current code determines the need to signal in the
ringbuffer code and actually issues the signal elsewhere. This can result
in the host viewing this interrupt as spurious since the host may also
poll the channel. Make the necessary adjustments.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f6ee4e7d8 upstream.
Signal the host when we determine the host is to be signaled.
The current code determines the need to signal in the ringbuffer
code and actually issues the signal elsewhere. This can result
in the host viewing this interrupt as spurious since the host may also
poll the channel. Make the necessary adjustments.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 74198eb4a4 upstream.
One of the factors that can result in the host concluding that a given
guest is mounting a DoS attack is if the guest generates interrupts
to the host when the host is not expecting it. If these "spurious"
interrupts reach a certain rate, the host can throttle the guest to
minimize the impact. The host computation of the "expected number
of interrupts" is strictly based on the ring transitions. Until
the host logic is fixed, base the guest logic to interrupt solely
on the ring state.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d3398facd upstream.
We don't need to modify our TIRs unless the user requested a change in
the hash function/key, for example when changing indirection only.
Tested:
# Modify TIRs hash is needed
ethtool -X ethX hkey <new key>
ethtool -X ethX hfunc <new func>
# Modify TIRs hash is not needed
ethtool -X ethX equal <new indirection table>
All cases are verified with TCP Multi-Stream traffic over IPv4 & IPv6.
Fixes: bdfc028de1 ("net/mlx5e: Fix ethtool RX hash func configuration change")
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da7061c82e upstream.
The function ieee80211_ie_split_vendor doesn't return 0 on errors. Instead
it returns any offset < ielen when WLAN_EID_VENDOR_SPECIFIC is found. The
return value in mesh_add_vendor_ies must therefore be checked against
ifmsh->ie_len and not 0. Otherwise all ifmsh->ie starting with
WLAN_EID_VENDOR_SPECIFIC will be rejected.
Fixes: 082ebb0c25 ("mac80211: fix mesh beacon format")
Signed-off-by: Thorsten Horstmann <thorsten@defutech.de>
Signed-off-by: Mathias Kretschmer <mathias.kretschmer@fit.fraunhofer.de>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
[sven@narfation.org: Add commit message]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd551bac47 upstream.
A previous change to fix checks for NL80211_MESHCONF_HT_OPMODE
missed setting the flag when replacing FILL_IN_MESH_PARAM_IF_SET
with checking codes. This results in dropping the received HT
operation value when called by nl80211_update_mesh_config(). Fix
this by setting the flag properly.
Fixes: 9757235f45 ("nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value")
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
[rewrite commit message to use Fixes: line]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e7eb1783b upstream.
We're using non-canonical addresses in drm_mm, and we're making sure that
userspace is using canonical addressing - both in case of softpin
(verifying incoming offset) and when relocating (converting to canonical
when updating offset returned to userspace).
Unfortunately when considering the need for relocations, we're comparing
offset from userspace (in canonical form) with drm_mm node (in
non-canonical form), and as a result, we end up always relocating if our
offsets are in the "problematic" range.
Let's always convert the offsets to avoid the performance impact of
relocations.
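For reference, "canonical form" here means sign-extending bit 47 of the
48-bit GPU virtual address into bits 63:48; converting the drm_mm offset the
same way before comparing makes both sides use one representation. A
self-contained sketch of that conversion (assuming the 48-bit address space
of gen8+; the helper name is illustrative):
#include <stdint.h>

/* Sketch: sign-extend a 48-bit GPU address into canonical form so it can
 * be compared against offsets handed back from userspace.
 */
static inline uint64_t to_canonical_addr(uint64_t addr)
{
	return (uint64_t)(((int64_t)(addr << 16)) >> 16);
}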
Fixes: a5f0edf63b ("drm/i915: Avoid writing relocs with addresses in non-canonical form")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Reported-by: Michał Pyrzowski <michal.pyrzowski@intel.com>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170207195559.18798-1-michal.winiarski@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 038c95a313)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f59b31911 upstream.
GPIO4_11 is on pin 152(MX6DL_PAD_KEY_ROW2) and not on pin
151(MX6DL_PAD_KEY_ROW1).
I found the error while booting a mainline kernel on APF6S SoM and
noticed the following message:
[ 2.609337] imx6dl-pinctrl 20e0000.iomuxc: pin MX6DL_PAD_KEY_ROW1
already requested by 20a8000.gpio:105; cannot claim for 20a8000.gpio:107
[ 2.621884] imx6dl-pinctrl 20e0000.iomuxc: pin-151 (20a8000.gpio:107)
status -22
[ 2.629303] spi_imx 2008000.ecspi: Can't get CS GPIO 107
With this patch, the message is gone and spi_imx driver probes correctly.
Fixes: bb728d662b ("ARM: dts: add gpio-ranges property to iMX GPIO controllers")
Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9b2792c3da upstream.
This patch addresses a long standing bug where the commit phase
of COMPARE_AND_WRITE would result in a se_cmd->cmd_kref reference
leak if se_cmd->scsi_status returned non SAM_STAT_GOOD.
This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.
To address this bug, compare_and_write_post() has been changed
to drop the incorrect !cmd->scsi_status conditional that was
preventing *post_ret = 1 from being set during non SAM_STAT_GOOD
status.
This patch has been tested with SAM_STAT_CHECK_CONDITION status
from normal target_complete_cmd() callback path, as well as the
incoming __target_execute_cmd() submission failure path when
se_cmd->execute_cmd() returns non zero status.
Reported-by: Donald White <dew@datera.io>
Cc: Donald White <dew@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01d4d67355 upstream.
This patch addresses a long-standing bug with multi-session
(eg: iscsi-target + iser-target) se_node_acl dynamic free
within transport_deregister_session().
This bug is caused when a storage endpoint is configured with
demo-mode (generate_node_acls = 1 + cache_dynamic_acls = 1)
initiators, and initiator login creates a new dynamic node acl
and attaches two sessions to it.
After that, demo-mode for the storage instance is disabled via
configfs (generate_node_acls = 0 + cache_dynamic_acls = 0) and
the existing dynamic acl is never converted to an explicit ACL.
The end result is dynamic acl resources are released twice when
the sessions are shutdown in transport_deregister_session().
If the storage instance is not changed to disable demo-mode,
or the dynamic acl is converted to an explicit ACL, or there
is only a single session associated with the dynamic ACL,
the bug is not triggered.
To address this bug, move the release of dynamic se_node_acl
memory into target_complete_nacl() so it's only freed once
when se_node_acl->acl_kref reaches zero.
(Drop unnecessary list_del_init usage - HCH)
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c54eeffbe9 upstream.
This patch fixes a bug where incoming task management requests
can be explicitly aborted during an active LUN_RESET, but whose
struct work_struct are canceled in-flight before execution.
This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
for target_tmr_work() never getting invoked and the aborted TMR
waiting indefinitely within transport_wait_for_tasks().
To address this case, perform a CMD_T_ABORTED check early in
transport_generic_handle_tmr(), and invoke the normal path via
transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
blocked waiting for CMD_T_STOP in transport_wait_for_tasks().
Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
into transport_generic_handle_tmr() so the existing check in
core_tmr_drain_tmr_list() avoids attempting to abort the incoming
se_tmr_req->task_cmd->work if it has already been queued into
se_device->tmr_wq.
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0583c261e6 upstream.
This patch adds the missing target_complete_cmd() SCSI status
parameter change in target_xcopy_do_work(), that was originally
missing in commit 926317de33.
It properly propagates the correct SCSI status during
EXTENDED_COPY exception cases, instead of always using the
hardcoded SAM_STAT_CHECK_CONDITION from the original code.
This is required for ESX host environments that expect to
hit SAM_STAT_RESERVATION_CONFLICT for certain scenarios,
and SAM_STAT_CHECK_CONDITION results in non-retriable
status for these cases.
Reported-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Tested-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Cc: Nixon Vincent <nixon.vincent@calsoftinc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 391e2a6de9 upstream.
After the v4.2+ RCU conversion to se_node_acl->lun_entry_hlist,
a BUG_ON() was added in core_enable_device_list_for_node() to
detect when the located orig->se_lun_acl contains an existing
se_lun_acl pointer reference.
However, this scenario can happen when a dynamically generated
NodeACL is being converted to an explicit NodeACL, when the
explicit NodeACL contains a different LUN mapping than the
default provided by the WWN endpoint.
So instead of triggering BUG_ON(), go ahead and fail,
following the original pre-RCU conversion logic.
Reported-by: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Cc: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b3f2d07f46 upstream.
The use of ACCESS_ONCE() looks like a micro-optimization to force gcc to use
an indexed load for the register address, but it has an absolutely detrimental
effect on builds with gcc-5 and CONFIG_KASAN=y, leading to a very likely
kernel stack overflow aside from very complex object code:
hisilicon/hns/hns_dsaf_gmac.c: In function 'hns_gmac_update_stats':
hisilicon/hns/hns_dsaf_gmac.c:419:1: error: the frame size of 2912 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_ppe.c: In function 'hns_ppe_reset_common':
hisilicon/hns/hns_dsaf_ppe.c:390:1: error: the frame size of 1184 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_ppe.c: In function 'hns_ppe_get_regs':
hisilicon/hns/hns_dsaf_ppe.c:621:1: error: the frame size of 3632 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_rcb.c: In function 'hns_rcb_get_common_regs':
hisilicon/hns/hns_dsaf_rcb.c:970:1: error: the frame size of 2784 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_gmac.c: In function 'hns_gmac_get_regs':
hisilicon/hns/hns_dsaf_gmac.c:641:1: error: the frame size of 5728 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_rcb.c: In function 'hns_rcb_get_ring_regs':
hisilicon/hns/hns_dsaf_rcb.c:1021:1: error: the frame size of 2208 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_comm_init':
hisilicon/hns/hns_dsaf_main.c:1209:1: error: the frame size of 1904 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_xgmac.c: In function 'hns_xgmac_get_regs':
hisilicon/hns/hns_dsaf_xgmac.c:748:1: error: the frame size of 4704 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_update_stats':
hisilicon/hns/hns_dsaf_main.c:2420:1: error: the frame size of 1088 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
hisilicon/hns/hns_dsaf_main.c: In function 'hns_dsaf_get_regs':
hisilicon/hns/hns_dsaf_main.c:2753:1: error: the frame size of 10768 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
This does not seem to happen any more with gcc-7, but removing the ACCESS_ONCE
seems safe anyway and it avoids a serious issue for some people. I have verified
that with gcc-5.3.1, the object code we get is better in the new version
both with and without CONFIG_KASAN, as we no longer allocate a 1344 byte
stack frame for hns_dsaf_get_regs() but otherwise have practically identical
object code.
With gcc-7.0.0, removing ACCESS_ONCE has no effect; the object code is already
good either way.
This patch is probably not urgent to get into 4.11 as only KASAN=y builds
with certain compilers are affected, but I still think it makes sense to
backport into older kernels.
Fixes: 511e6bc ("net: add Hisilicon Network Subsystem DSAF support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d59b6ccf0 upstream.
Commit 513e3d2d11 ("cpumask: always use nr_cpu_ids in formatting and
parsing functions") converted both cpumask printing and parsing
functions to use nr_cpu_ids instead of nr_cpumask_bits. While this was
okay for the printing functions as it just picked one of the two output
formats that we were alternating between depending on a kernel config,
doing the same for parsing wasn't okay.
nr_cpumask_bits can be either nr_cpu_ids or NR_CPUS. We can always use
nr_cpu_ids but that is a variable while NR_CPUS is a constant, so it can
be more efficient to use NR_CPUS when we can get away with it.
Converting the printing functions to nr_cpu_ids makes sense because it
affects how the masks get presented to userspace and doesn't break
anything; however, using nr_cpu_ids for parsing functions can
incorrectly leave the higher bits uninitialized while reading in these
masks from userland. As all testing and comparison functions use
nr_cpumask_bits which can be larger than nr_cpu_ids, the parsed cpumasks
can erroneously yield false negative results.
This made the taskstats interface incorrectly return -EINVAL even when
the inputs were correct.
Fix it by restoring the parse functions to use nr_cpumask_bits instead
of nr_cpu_ids.
Link: http://lkml.kernel.org/r/20170206182442.GB31078@htj.duckdns.org
Fixes: 513e3d2d11 ("cpumask: always use nr_cpu_ids in formatting and parsing functions")
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Martin Steigerwald <martin.steigerwald@teamix.de>
Debugged-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d966564fcd upstream.
This reverts commit 020eb3daab.
Gabriel C reports that it causes his machine to not boot, and we haven't
tracked down the reason for it yet. Since the bug it fixes has been
around for a longish time, we're better off reverting the fix for now.
Gabriel says:
"It hangs early and freezes with a lot RCU warnings.
I bisected it down to :
> Ruslan Ruslichenko (1):
> x86/ioapic: Restore IO-APIC irq_chip retrigger callback
Reverting this one fixes the problem for me..
The box is a PRIMERGY TX200 S5 , 2 socket , 2 x E5520 CPU(s) installed"
and Ruslan and Thomas are currently stumped.
Reported-and-bisected-by: Gabriel C <nix.or.die@gmail.com>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c461cb727 upstream.
SELinux tries to support setting/clearing of /proc/pid/attr attributes
from the shell by ignoring terminating newlines and treating an
attribute value that begins with a NUL or newline as an attempt to
clear the attribute. However, the test for clearing attributes has
always been wrong; it has an off-by-one error, and this could further
lead to reading past the end of the allocated buffer since commit
bb646cdb12 ("proc_pid_attr_write():
switch to memdup_user()"). Fix the off-by-one error.
Even with this fix, setting and clearing /proc/pid/attr attributes
from the shell is not straightforward since the interface does not
support multiple write() calls (so shells that write the value and
newline separately will set and then immediately clear the attribute,
requiring use of echo -n to set the attribute), whereas trying to use
echo -n "" to clear the attribute causes the shell to skip the
write() call altogether since POSIX says that a zero-length write
causes no side effects. Thus, one must use echo -n to set and echo
without -n to clear, as in the following example:
$ echo -n unconfined_u:object_r:user_home_t:s0 > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
unconfined_u:object_r:user_home_t:s0
$ echo "" > /proc/$$/attr/fscreate
$ cat /proc/$$/attr/fscreate
Note the use of /proc/$$ rather than /proc/self, as otherwise
the cat command will read its own attribute value, not that of the shell.
There are no users of this facility to my knowledge; possibly we
should just get rid of it.
UPDATE: Upon further investigation it appears that a local process
with the process:setfscreate permission can cause a kernel panic as a
result of this bug. This patch fixes CVE-2017-2618.
Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the update about CVE-2017-2618 to the commit description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
commit 601bbbe051 upstream.
If the user tries to initialize a uinput device mixing old and new style
initialization (i.e. using the old UI_SET_ABSBIT instead of UI_ABS_SETUP),
we forget to allocate input->absinfo and will crash when trying to send
absolute events:
ioctl(ui, UI_DEV_SETUP, &us);
ioctl(ui, UI_SET_PHYS, "Test");
ioctl(ui, UI_SET_EVBIT, EV_ABS);
ioctl(ui, UI_SET_ABSBIT, ABS_X);
ioctl(ui, UI_SET_ABSBIT, ABS_Y);
ioctl(ui, UI_DEV_CREATE, 0);
Reported-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=191811
Fixes: fbae10db09 ("Input: uinput - rework ABS validation")
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5da5c5667 upstream.
Eliminate a double-add by creating a new list to manage
command descriptors when created; move the descriptor to
the pending list when the command is submitted.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 500c0106e6 upstream.
An I/O page fault occurs when the IOMMU is enabled on a
system that supports the v5 CCP. DMA operations use a
Request ID value that does not match what is expected by
the IOMMU, resulting in the I/O page fault. Setting the
Request ID value to 0 corrects this issue.
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4087a1fffe upstream.
Fixes a crash in dm_table_find_target() due to a NULL struct dm_table
being passed from dm_old_request_fn() that races with DM device
destruction.
Reported-by: artem@flashgrid.io
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bfb34527a3 upstream.
When vmemmap_populate() allocates space for the memmap it does so in 2MB
sized chunks. The libnvdimm-pfn driver incorrectly accounts for this
when the alignment of the device is set to 4K. When this happens we
trigger memory allocation failures in altmap_alloc_block_buf() and
trigger warnings of the form:
WARNING: CPU: 0 PID: 3376 at arch/x86/mm/init_64.c:656 arch_add_memory+0xe4/0xf0
[..]
Call Trace:
dump_stack+0x86/0xc3
__warn+0xcb/0xf0
warn_slowpath_null+0x1d/0x20
arch_add_memory+0xe4/0xf0
devm_memremap_pages+0x29b/0x4e0
Fixes: 315c562536 ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d032f4201 upstream.
Given that the naming of pmem devices changes from the pmemX form to the
pmemX.Y form when namespace id is greater than 0, arrange for namespaces
with id-0 to be exempt from deletion. Otherwise a simple reconfiguration
of an existing namespace to a new mode results in a name change of the
resulting block device:
# ndctl list --namespace=namespace1.0
{
"dev":"namespace1.0",
"mode":"raw",
"size":2147483648,
"uuid":"3dadf3dc-89b9-4b24-b20e-abc8a4707ce3",
"blockdev":"pmem1"
}
# ndctl create-namespace --reconfig=namespace1.0 --mode=memory --force
{
"dev":"namespace1.1",
"mode":"memory",
"size":2111832064,
"uuid":"7b4a6341-7318-4219-a02c-fb57c0bbf613",
"blockdev":"pmem1.1"
}
This change does require tooling changes to explicitly look for
namespaceX.0 if the seed has already advanced to another namespace.
Fixes: 98a29c39dc ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region")
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e471486c13 upstream.
We queue an on-stack work item to 'nfit_wq' and wait for it to complete
as part of a 'flush_probe' request. However, if the user cancels the
wait we need to make sure the item is flushed from the queue otherwise
we are leaving an out-of-scope stack address on the work list.
BUG: unable to handle kernel paging request at ffffbcb3c72f7cd0
IP: [<ffffffffa9413a7b>] __list_add+0x1b/0xb0
[..]
RIP: 0010:[<ffffffffa9413a7b>] [<ffffffffa9413a7b>] __list_add+0x1b/0xb0
RSP: 0018:ffffbcb3c7ba7c00 EFLAGS: 00010046
[..]
Call Trace:
[<ffffffffa90bb11a>] insert_work+0x3a/0xc0
[<ffffffffa927fdda>] ? seq_open+0x5a/0xa0
[<ffffffffa90bb30a>] __queue_work+0x16a/0x460
[<ffffffffa90bbb08>] queue_work_on+0x38/0x40
[<ffffffffc0cf2685>] acpi_nfit_flush_probe+0x95/0xc0 [nfit]
[<ffffffffc0cf25d0>] ? nfit_visible+0x40/0x40 [nfit]
[<ffffffffa9571495>] wait_probe_show+0x25/0x60
[<ffffffffa9546b30>] dev_attr_show+0x20/0x50
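The underlying pattern is the classic on-stack work item wait; a rough
sketch (hypothetical names, not the nfit code itself) of what has to
happen when the waiter is interrupted:
/* needs <linux/workqueue.h> and <linux/completion.h> */
struct onstack_flush {			/* hypothetical wrapper */
	struct work_struct work;
	struct completion cmp;
};
static void onstack_flush_fn(struct work_struct *work)
{
	struct onstack_flush *f = container_of(work, struct onstack_flush, work);
	/* ... do the actual flushing ... */
	complete(&f->cmp);
}
static int flush_and_wait(struct workqueue_struct *wq)
{
	struct onstack_flush flush;
	INIT_WORK_ONSTACK(&flush.work, onstack_flush_fn);
	init_completion(&flush.cmp);
	queue_work(wq, &flush.work);
	if (wait_for_completion_interruptible(&flush.cmp)) {
		/* interrupted: the work item still points at this stack frame,
		 * so it must be flushed before the function returns */
		flush_work(&flush.work);
		return -EINTR;
	}
	return 0;
}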
Fixes: 7ae0fa439f ("nfit, libnvdimm: async region scrub workqueue")
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6e978b22ef upstream.
Some Kabylake desktop processors may not reach max turbo when running in
HWP mode, even if running under sustained 100% utilization.
This occurs when the HWP.EPP (Energy Performance Preference) is set to
"balance_power" (0x80) -- the default on most systems.
It occurs because the platform BIOS may erroneously enable an
energy-efficiency setting -- MSR_IA32_POWER_CTL BIT-EE, which is not
recommended to be enabled on this SKU.
On the failing systems, this BIOS issue was not discovered when the
desktop motherboard was tested with Windows, because the BIOS also
neglects to provide the ACPI/CPPC table that Windows requires to enable
HWP, and so Windows runs in legacy P-state mode, where this setting has
no effect.
Linux' intel_pstate driver does not require ACPI/CPPC to enable HWP, and
so it runs in HWP mode, exposing this incorrect BIOS configuration.
There are several ways to address this problem.
First, Linux can also run in legacy P-state mode on this system.
As intel_pstate is how Linux enables HWP, booting with
"intel_pstate=disable"
will run in acpi-cpufreq/ondemand legacy p-state mode.
Or second, the "performance" governor can be used with intel_pstate,
which will modify HWP.EPP to 0.
Or third, starting in 4.10, the
/sys/devices/system/cpu/cpufreq/policy*/energy_performance_preference
attribute can be updated from "balance_power" to "performance".
Or fourth, apply this patch, which fixes the erroneous setting of
MSR_IA32_POWER_CTL BIT_EE on this model, allowing the default
configuration to function as designed.
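The patch boils down to clearing that bit on the affected model; a rough
sketch (the bit position 19 for the EE enable and the helper name are
assumptions here, not taken from the patch):
#define POWER_CTL_EE_BIT	19	/* assumed Energy Efficiency enable bit */
static void disable_power_ctl_ee(int cpu)
{
	u64 power_ctl;
	rdmsrl_on_cpu(cpu, MSR_IA32_POWER_CTL, &power_ctl);
	if (power_ctl & BIT_ULL(POWER_CTL_EE_BIT)) {
		power_ctl &= ~BIT_ULL(POWER_CTL_EE_BIT);
		wrmsrl_on_cpu(cpu, MSR_IA32_POWER_CTL, power_ctl);
	}
}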
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bafb2f7d47 upstream.
There is a disparity in the context image saved to disk and our own
bookkeeping - that is we presume the RING_HEAD and RING_TAIL match our
stored ce->ring->tail value. However, as we emit WA_TAIL_DWORDS into the
ring but may not tell the GPU about them, the GPU may be lagging behind
our bookkeeping. Upon hibernation we do not save stolen pages, presuming
that their contents are volatile. This means that although we start
writing into the ring at tail, the GPU starts executing from its HEAD
and there may be some garbage in between and so the GPU promptly hangs
upon resume.
Testcase: igt/gem_exec_suspend/basic-S4
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96526
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20160921135108.29574-3-chris@chris-wilson.co.uk
Cc: Eric Blau <eblau1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1908f5255 upstream.
Tetsuo has noticed that an OOM stress test which performs large write
requests can cause the full memory reserves depletion. He has tracked
this down to the following path
__alloc_pages_nodemask+0x436/0x4d0
alloc_pages_current+0x97/0x1b0
__page_cache_alloc+0x15d/0x1a0 mm/filemap.c:728
pagecache_get_page+0x5a/0x2b0 mm/filemap.c:1331
grab_cache_page_write_begin+0x23/0x40 mm/filemap.c:2773
iomap_write_begin+0x50/0xd0 fs/iomap.c:118
iomap_write_actor+0xb5/0x1a0 fs/iomap.c:190
? iomap_write_end+0x80/0x80 fs/iomap.c:150
iomap_apply+0xb3/0x130 fs/iomap.c:79
iomap_file_buffered_write+0x68/0xa0 fs/iomap.c:243
? iomap_write_end+0x80/0x80
xfs_file_buffered_aio_write+0x132/0x390 [xfs]
? remove_wait_queue+0x59/0x60
xfs_file_write_iter+0x90/0x130 [xfs]
__vfs_write+0xe5/0x140
vfs_write+0xc7/0x1f0
? syscall_trace_enter+0x1d0/0x380
SyS_write+0x58/0xc0
do_syscall_64+0x6c/0x200
entry_SYSCALL64_slow_path+0x25/0x25
The OOM victim has access to all memory reserves so that it can make
forward progress and exit more easily. But iomap_file_buffered_write and other
callers of iomap_apply loop to complete the full request. We need to
check for fatal signals and back off with a short write instead.
As the iomap_apply delegates all the work down to the actor we have to
hook into those. All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there. dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly. Other callers, like iomap_page_mkwrite, work on a
single page, and iomap_fiemap_actor does not allocate memory based on the
given len.
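The hook is a short early-out before the page cache allocation; roughly
(a sketch of the check added in iomap_write_begin(), mirrored for DAX in
dax_iomap_actor()):
	/* An OOM victim with access to the memory reserves must not keep
	 * looping over a huge request; back off with a short write instead. */
	if (fatal_signal_pending(current))
		return -EINTR;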
Fixes: 68a9f5e700 ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b414fa01c3 upstream.
The current QP FetchBurstMax value is 256B, which
is incorrect since a WR can exceed that value. The
result being a partial WR fetched by hardware, and
a fatal "bad WR" error posted by the SGE.
So bump the FetchBurstMax to 512B.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aaaec6fc75 upstream.
The recent commit which prevents double activation of interrupts unearthed
interesting code in x86. The code (ab)uses irq_domain_activate_irq() to
reconfigure an already activated interrupt. That trips over the prevention
code now.
Fix it by deactivating the interrupt before activating the new configuration.
Fixes: 08d85f3ea9 "irqdomain: Avoid activating interrupts more than once"
Reported-and-tested-by: Mike Galbraith <efault@gmx.de>
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701311901580.3457@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08d85f3ea9 upstream.
Since commit f3b0946d62 ("genirq/msi: Make sure PCI MSIs are
activated early"), we can end-up activating a PCI/MSI twice (once
at allocation time, and once at startup time).
This is normally of no consequences, except that there is some
HW out there that may misbehave if activate is used more than once
(the GICv3 ITS, for example, uses the activate callback
to issue the MAPVI command, and the architecture spec says that
"If there is an existing mapping for the EventID-DeviceID
combination, behavior is UNPREDICTABLE").
While this could be worked around in each individual driver, it may
make more sense to tackle the issue at the core level. In order to
avoid getting in that situation, let's have a per-interrupt flag
to remember if we have already activated that interrupt or not.
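Conceptually, the core activation paths end up doing something like this
(sketch):
void irq_domain_activate_irq(struct irq_data *irq_data)
{
	if (!irqd_is_activated(irq_data))
		__irq_domain_activate_irq(irq_data);
	irqd_set_activated(irq_data);	/* a second activation becomes a no-op */
}
void irq_domain_deactivate_irq(struct irq_data *irq_data)
{
	if (irqd_is_activated(irq_data))
		__irq_domain_deactivate_irq(irq_data);
	irqd_clr_activated(irq_data);
}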
Fixes: f3b0946d62 ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-and-tested-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Link: http://lkml.kernel.org/r/1484668848-24361-1-git-send-email-marc.zyngier@arm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 828f84ee8f upstream.
FIFO was being read every sample after the "almost full" state was
reached. This was due to an incorrect placement of the parenthesis
in the while condition check.
Note - the fixes tag is not actually correct, but the fix in this patch
would also be needed for it to function correctly so we'll go with that
one. Backports should pick up both.
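The pitfall is plain C precedence: '>' binds tighter than '=', so the
parenthesis placement decides whether the FIFO count or the comparison
result gets stored. Illustrative only, with made-up helper names:
/* buggy: cnt receives the comparison result (0 or 1), never the count */
while (cnt || (cnt = fifo_count() > 0)) {
	read_one_sample();
	cnt--;
}
/* fixed: cnt receives the count, which is then compared against zero */
while (cnt || (cnt = fifo_count()) > 0) {
	read_one_sample();
	cnt--;
}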
Signed-off-by: Matt Ranostay <matt@ranostay.consulting>
Fixes: b74fccad7 ("iio: health: max30100: correct FIFO check condition")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c113b5e00 upstream.
The DHT22 (AM2302) datasheet specifies that the LOW start pulse should not
exceed 20ms. However, observations with an oscilloscope of an RPi Model 2B
(rev 1.1) communicating with a DHT22 sensor showed that the driver was
consistently sending start pulses longer than 20ms:
Kernel 4.7.10-v7+ (n=132):
Minimum pulse length: 20.20ms
Maximum: 29.84ms
Mean: 24.96ms
StDev: 2.82ms
Sensor response rate: 100%
Read success rate: 76%
On kernel 4.8, the start pulse was so long that the sensor would not even
respond 97% of the time:
Kernel 4.8.16-v7+ (n=100):
Minimum pulse length: 30.4ms
Maximum: 74.4ms
Mean: 39.3ms
StDev: 10.2ms
Sensor response rate: 3%
Read success rate: 3%
The driver would return ETIMEDOUT and write log messages like this:
[ 51.430987] dht11 dht11@0: Only 1 signal edges detected
[ 66.311019] dht11 dht11@0: Only 0 signal edges detected
Replacing msleep(18) with usleep_range(18000, 20000) made the pulse length
sane again and restored responsiveness:
Kernel 4.8.16-v7+ with usleep_range (n=123):
Minimum pulse length: 18.16ms
Maximum: 20.20ms
Mean: 19.85ms
StDev: 0.51ms
Sensor response rate: 100%
Read success rate: 84%
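The change is a one-liner along these lines (sketch of the idea, not the
literal diff):
-	msleep(18);			/* jiffy-based, can oversleep far past 20ms */
+	usleep_range(18000, 20000);	/* hrtimer-based, keeps the pulse within spec */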
Signed-off-by: John Brooks <john@fastquake.com>
Reviewed-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5badd1e97 upstream.
The suspend/resume functions were using dev_to_iio_dev() to get
the iio_dev. That only works on IIO dev's. Replace it with spi
functions to get the correct iio_dev.
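In other words, the suspend/resume pair is changed to something like this
(placeholder function name; error handling trimmed):
static int sensor_suspend(struct device *dev)
{
	/* dev is the SPI device here, so dev_to_iio_dev(dev) is bogus;
	 * fetch the iio_dev from the SPI driver data instead */
	struct iio_dev *indio_dev = spi_get_drvdata(to_spi_device(dev));
	/* ... power down the part using indio_dev ... */
	return 0;
}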
Signed-off-by: Alison Schofield <amsfield22@gmail.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 802ecfc113 upstream.
The suspend/resume functions were using dev_to_iio_dev() to get
the iio_dev. That only works on IIO dev's. Replace it with i2c
functions to get the correct iio_dev.
Signed-off-by: Alison Schofield <amsfield22@gmail.com>
Acked-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1aaf20ee6 upstream.
The suspend/resume functions were using dev_to_iio_dev() to get
the iio_dev. That only works on IIO dev's. Use dev_get_drvdata()
for a platform device to get the correct iio_dev.
Signed-off-by: Alison Schofield <amsfield22@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b17c1bba9c upstream.
When tearing down timesync, and not on an arche platform, the platform
state callback is not initialized. That will trigger the following NULL
pointer dereference.
Call Trace:
? gb_timesync_platform_unlock_bus+0x11/0x20 [greybus]
gb_timesync_teardown+0x85/0xc0 [greybus]
gb_timesync_svc_remove+0xab/0x190 [greybus]
gb_svc_del+0x29/0x110 [greybus]
gb_hd_del+0x14/0x20 [greybus]
ap_disconnect+0x24/0x60 [gb_es2]
usb_unbind_interface+0x7a/0x2c0
__device_release_driver+0x96/0x150
device_release_driver+0x1e/0x30
bus_remove_device+0xe7/0x130
device_del+0x116/0x230
usb_disable_device+0x97/0x1f0
usb_disconnect+0x80/0x260
hub_event+0x5ca/0x10e0
process_one_work+0x126/0x3b0
worker_thread+0x55/0x4c0
? process_one_work+0x3b0/0x3b0
kthread+0xc4/0xe0
? kthread_park+0xb0/0xb0
ret_from_fork+0x22/0x30
So, fix that by adding checks before using the callback.
Fixes: 970dc85bd9 ("greybus: timesync: Add timesync core driver")
Signed-off-by: Rui Miguel Silva <rmfrfs@gmail.com>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83e526f2a2 upstream.
OS descriptor head, when flagged as provided, is accessed without
checking if it fits in provided buffer. Verify length before access.
Also, there are other places where the buffer length is checked
after accessing offsets which are potentially past the end. Check
the buffer length beforehand as well to fail cleanly.
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Acked-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 407788b51d upstream.
Commit 467d5c9807 ("usb: musb: Implement session bit based runtime PM for
musb-core") started implementing musb generic runtime PM support by
introducing devctl register session bit based state control.
This caused a regression where if a USB mass storage device is connected
to a USB hub, we can get:
usb 1-1: reset high-speed USB device number 2 using musb-hdrc
usb 1-1: device descriptor read/64, error -71
usb 1-1.1: new high-speed USB device number 4 using musb-hdrc
This is because before the USB storage device is connected, musb is
in OTG_STATE_A_SUSPEND. And we currently only set need_finish_resume
in musb_stage0_irq() and the related code calling finish_resume_work
in musb_resume() and musb_runtime_resume() never gets called.
To fix the issue, we can call schedule_delayed_work() directly in
musb_stage0_irq() to have finish_resume_work run.
And we should no longer get interrupts when suspended, as we have
changed musb to no longer need pm_runtime_irqsafe().
The need_finish_resume flag was added in commit 9298b4aad3 ("usb:
musb: fix device hotplug behind hub") and no longer applies as far
as I can tell. So let's just remove the earlier code that no longer
is needed.
Fixes: 467d5c9807 ("usb: musb: Implement session bit based runtime PM for musb-core")
Reported-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d9b2997e4a upstream.
Add a quirk for WORLDE easykey.25 MIDI keyboard (idVendor=0218,
idProduct=0401). The device reports that it has config string
descriptor at index 3, but when the system selects the configuration
and tries to get the description, it returns a -EPROTO error,
the communication restarts and this keeps repeating over and over again.
Not requesting the string descriptor makes the device work correctly.
Relevant info from Wireshark:
[...]
CONFIGURATION DESCRIPTOR
bLength: 9
bDescriptorType: 0x02 (CONFIGURATION)
wTotalLength: 101
bNumInterfaces: 2
bConfigurationValue: 1
iConfiguration: 3
Configuration bmAttributes: 0xc0 SELF-POWERED NO REMOTE-WAKEUP
1... .... = Must be 1: Must be 1 for USB 1.1 and higher
.1.. .... = Self-Powered: This device is SELF-POWERED
..0. .... = Remote Wakeup: This device does NOT support remote wakeup
bMaxPower: 50 (100mA)
[...]
45 0.369104 host 2.38.0 USB 64 GET DESCRIPTOR Request STRING
[...]
URB setup
bmRequestType: 0x80
1... .... = Direction: Device-to-host
.00. .... = Type: Standard (0x00)
...0 0000 = Recipient: Device (0x00)
bRequest: GET DESCRIPTOR (6)
Descriptor Index: 0x03
bDescriptorType: 0x03
Language Id: English (United States) (0x0409)
wLength: 255
46 0.369255 2.38.0 host USB 64 GET DESCRIPTOR Response STRING[Malformed Packet]
[...]
Frame 46: 64 bytes on wire (512 bits), 64 bytes captured (512 bits) on interface 0
USB URB
[Source: 2.38.0]
[Destination: host]
URB id: 0xffff88021f62d480
URB type: URB_COMPLETE ('C')
URB transfer type: URB_CONTROL (0x02)
Endpoint: 0x80, Direction: IN
Device: 38
URB bus id: 2
Device setup request: not relevant ('-')
Data: present (0)
URB sec: 1484896277
URB usec: 455031
URB status: Protocol error (-EPROTO) (-71)
URB length [bytes]: 0
Data length [bytes]: 0
[Request in: 45]
[Time from request: 0.000151000 seconds]
Unused Setup Header
Interval: 0
Start frame: 0
Copy of Transfer Flags: 0x00000200
Number of ISO descriptors: 0
[Malformed Packet: USB]
[Expert Info (Error/Malformed): Malformed Packet (Exception occurred)]
[Malformed Packet (Exception occurred)]
[Severity level: Error]
[Group: Malformed]
Signed-off-by: Lukáš Lalinský <lukas@oxygene.sk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d07830db1b upstream.
Seems that ATEN serial-to-usb devices using pl2303 exist with
different device ids. This patch adds a missing device ID so it
is recognised by the driver.
Signed-off-by: Marcel J.E. Mol <marcel@mesa.nl>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24d615a694 upstream.
The Dell DW5570 is a re-branded Sierra Wireless MC8805 which will by
default boot with vid 0x413c and pid 0x81a3. When triggered into QDL
download mode, the device switches to pid 0x81a6 and provides the standard TTY
used for firmware upgrade.
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 00c87e9a70 upstream.
Saving unsupported state prevents migration when the new host does not
support a XSAVE feature of the original host, even if the feature is not
exposed to the guest.
We've masked host features with guest-visible features before, with
4344ee981e ("KVM: x86: only copy XSAVE state for the supported
features") and dropped it when implementing XSAVES. Do it again.
Fixes: df1daba7d1 ("KVM: x86: support XSAVES usage in the host")
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 362f456246 upstream.
Commit fdea2d09b9 ("dmaengine: cppi41: Add basic PM runtime support")
together with recent MUSB changes allowed USB and DMA on BeagleBone to idle
when no cable is connected. But it looks like a few corner case issues
still remain.
It looks like just by re-plugging the USB cable about ten or so times on a
BeagleBone configured in USB peripheral mode we can get warnings and
eventually trigger an oops in cppi41 DMA:
WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:1154 cppi41_runtime_suspend+0x28/0x38 [cppi41]
...
WARNING: CPU: 0 PID: 14 at drivers/dma/cppi41.c:452
push_desc_queue+0x94/0x9c [cppi41]
...
Unable to handle kernel NULL pointer dereference at virtual
address 00000104
pgd = c0004000
[00000104] *pgd=00000000
Internal error: Oops: 805 [#1] SMP ARM
...
[<bf0d92cc>] (cppi41_runtime_resume [cppi41]) from [<c0589838>]
(__rpm_callback+0xc0/0x214)
[<c0589838>] (__rpm_callback) from [<c05899ac>] (rpm_callback+0x20/0x80)
[<c05899ac>] (rpm_callback) from [<c0589460>] (rpm_resume+0x504/0x78c)
[<c0589460>] (rpm_resume) from [<c058a1a0>] (pm_runtime_work+0x60/0xa8)
[<c058a1a0>] (pm_runtime_work) from [<c0156120>] (process_one_work+0x2b4/0x808)
This is because of a race with runtime PM and cppi41_dma_issue_pending()
as reported by Alexandre Bailon <abailon@baylibre.com> in earlier
set of patches. Based on mailing list discussions we however came to the
conclusion that a different fix from Alexandre's fix is needed in order
to guarantee that DMA is really active when we try to use it.
To fix the issue, we need to add a driver specific flag as we otherwise
can have -EINPROGRESS state set by runtime PM and can't rely on
pm_runtime_active() to tell us when we can use the DMA.
And we need to make sure the DMA transfers get triggered in the queued
order. So let's always queue the transfers, then flush the queue
from both cppi41_dma_issue_pending() and cppi41_runtime_resume()
as suggested by Grygorii Strashko <grygorii.strashko@ti.com> in an
earlier example patch.
For reference, this is also documented in Documentation/power/runtime_pm.txt
in the example at the end of the file as pointed out by Grygorii Strashko
<grygorii.strashko@ti.com>.
Based on earlier patches from Alexandre Bailon <abailon@baylibre.com>
and Grygorii Strashko <grygorii.strashko@ti.com> modified based on
testing and what was discussed on the mailing lists.
Fixes: fdea2d09b9 ("dmaengine: cppi41: Add basic PM runtime support")
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Bin Liu <b-liu@ti.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Patrick Titiano <ptitiano@baylibre.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reported-by: Alexandre Bailon <abailon@baylibre.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae4a3e028b upstream.
Commit fdea2d09b9 ("dmaengine: cppi41: Add basic PM runtime support")
added runtime PM support for cppi41, but had corner case issues. Some of
the issues were fixed with commit 098de42ad6 ("dmaengine: cppi41: Fix
unpaired pm runtime when only a USB hub is connected"). That fix however
caused a new regression where we can get error -115 messages with USB on
BeagleBone when connecting a USB mass storage device to a hub.
This is because when connecting a USB mass storage device to a hub, the
initial DMA transfers can take over 200ms to complete and cppi41
autosuspend delay times out.
To fix the issue, we want to implement refcounting for chan_busy array
that contains the active dma transfers. Increasing the autosuspend delay
won't help as the delay could potentially be seconds, and it's best
to let the USB subsystem deal with the timeouts on errors.
The earlier attempt for runtime PM was buggy as the pm_runtime_get/put()
calls could get unpaired easily as they did not follow the state of
the chan_busy array as described in commit 098de42ad6 ("dmaengine:
cppi41: Fix unpaired pm runtime when only a USB hub is connected").
Let's fix the issue by adding pm_runtime_get() to where a new transfer
is added to the chan_busy array, and calls to pm_runtime_put() where
chan_busy array entry is cleared. This prevents any autosuspend timeouts
from happening while dma transfers are active.
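So the runtime PM reference count now mirrors the chan_busy occupancy;
schematically (simplified field names, not the literal diff):
/* when a descriptor is pushed to the hardware queue and becomes active */
pm_runtime_get(cdd->ddev.dev);
cdd->chan_busy[desc_num] = c;
/* when the completion interrupt clears that entry again */
cdd->chan_busy[desc_num] = NULL;
pm_runtime_mark_last_busy(cdd->ddev.dev);
pm_runtime_put_autosuspend(cdd->ddev.dev);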
Fixes: 098de42ad6 ("dmaengine: cppi41: Fix unpaired pm runtime when
only a USB hub is connected")
Fixes: fdea2d09b9 ("dmaengine: cppi41: Add basic PM runtime support")
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Bin Liu <b-liu@ti.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Patrick Titiano <ptitiano@baylibre.com>
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 282e4637bc upstream.
Commit 025bcc1 performed cleanup work on the 'wacom_pl_irq' function, making
it follow the standards used in the rest of the codebase. The change
unintentionally allowed the function to send input events from reports
that are not marked as being in prox. This can cause problems as the
report values for X, Y, etc. are not guaranteed to be correct. In
particular, occasionally the tablet will send a report with these values
set to zero. If such a report is received it can cause an unexpected jump
in the XY position.
This patch surrounds more of the processing code with a proximity check,
preventing these zeroed reports from overwriting the current state. To
be safe, only the tool type and ABS_MISC events should be reported when
the pen is marked as being out of prox.
Fixes: 025bcc1540 ("HID: wacom: Simplify 'wacom_pl_irq'")
Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
Reviewed-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed9ab4287f upstream.
Quirking the following AMI USB device with ALWAYS_POLL stops an AMI
virtual keyboard and mouse from becoming unresponsive and timing out when
it is attached to a ppc64el Power 8 system and we have some rapid
open/closes on the mouse device.
usb 1-3: new high-speed USB device number 2 using xhci_hcd
usb 1-3: New USB device found, idVendor=046b, idProduct=ff01
usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-3: Product: Virtual Hub
usb 1-3: Manufacturer: American Megatrends Inc.
usb 1-3: SerialNumber: serial
usb 1-3.3: new high-speed USB device number 3 using xhci_hcd
usb 1-3.3: New USB device found, idVendor=046b, idProduct=ff31
usb 1-3.3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-3.3: Product: Virtual HardDisk Device
usb 1-3.3: Manufacturer: American Megatrends Inc.
usb 1-3.4: new low-speed USB device number 4 using xhci_hcd
usb 1-3.4: New USB device found, idVendor=046b, idProduct=ff10
usb 1-3.4: New USB device strings: Mfr=1, Product=2, SerialNumber=0
usb 1-3.4: Product: Virtual Keyboard and Mouse
usb 1-3.4: Manufacturer: American Megatrends Inc.
With the quirk I have not been able to trigger the issue with
half an hour of saturation soak testing.
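The quirk itself is just one more entry in the usbhid quirks table;
roughly (the table and ID macro names are assumptions, the numeric IDs
come from the log above):
/* drivers/hid/usbhid/hid-quirks.c, hid_blacklist[] */
{ USB_VENDOR_ID_AMI, USB_DEVICE_ID_AMI_VIRT_KEYBOARD_AND_MOUSE,
  HID_QUIRK_ALWAYS_POLL },	/* 046b:ff10 */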
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03c902bff5 upstream.
When the firmware restarts in a situation in which any station
has no queue reserved anymore because that queue was used, the
code will crash trying to access the queue_info array at the
offset 255, which is far too big. Fix this by checking that a
queue is actually reserved before writing its status.
Fixes: 8d98ae6eb0 ("iwlwifi: mvm: re-assign old queues after hw restart in dqa mode")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7941c59e45 upstream.
Mistakenly, the driver is trying to load the 8000C firmware with an
incorrect name (i.e. with two hyphens where there should be only one)
and that fails. Fix that by removing the hyphen from the format
macro.
Fixes: e1ba684f76 ("iwlwifi: 8000: fix MODULE_FIRMWARE input")
Signed-off-by: Jürg Billeter <j@bitron.ch>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1b89970d81 upstream.
The debounce value is set globally per community. Otherwise the user will
easily get a kernel crash when they start using the feature:
BUG: unable to handle kernel paging request at ffffc900003be000
IP: byt_gpio_dbg_show+0xa9/0x430
Make it clear in byt_gpio_reg().
Note that this fix just prevents the kernel from crashing, it doesn't make
any difference to the existing logic. It means the last caller will win the
trade and the debounce value will be configured accordingly. The actual
logic fix needs more thought and is not as important as the crash fix,
which is why the latter goes in separately and right now.
Fixes: 658b476c74 ("pinctrl: baytrail: Add debounce configuration")
Cc: Cristina Ciocan <cristina.ciocan@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d5415b489 upstream.
This reverts commit c7070619f3.
This has been shown to regress on some ARM systems:
by forcing on DMA API usage for ARM systems, we have inadvertently
kicked open a hornets' nest in terms of cache-coherency. Namely that
unless the virtio device is explicitly described as capable of coherent
DMA by firmware, the DMA APIs on ARM and other DT-based platforms will
assume it is non-coherent. This turns out to cause a big problem for the
likes of QEMU and kvmtool, which generate virtio-mmio devices in their
guest DTs but neglect to add the often-overlooked "dma-coherent"
property; as a result, we end up with the guest making non-cacheable
accesses to the vring, the host doing so cacheably, both talking past
each other and things going horribly wrong.
We are working on a safer work-around.
Fixes: c7070619f3 ("vring: Force use of DMA API for ARM-based systems with legacy devices")
Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7195439d1d upstream.
This reverts commit 4c81acab38 ("bcma: init serial console directly
from ChipCommon code") as it broke IRQ assignment. Getting IRQ with
bcma_core_irq helper on SoC requires MIPS core to be set. It happens
*after* ChipCommon initialization so we can't do this so early.
This fixes a user-reported regression. It wasn't critical as serial was
still somehow working, but the lack of IRQs was making it unreliable.
Fixes: 4c81acab38 ("bcma: init serial console directly from ChipCommon code")
Reported-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 966d2b04e0 upstream.
percpu_ref_tryget() and percpu_ref_tryget_live() should return
"true" IFF they acquire a reference. But the return value from
atomic_long_inc_not_zero() is a long and may have high bits set,
e.g. PERCPU_COUNT_BIAS, while the return value of the tryget routines
is bool. The reference may therefore actually be acquired while the
routines return "false", which results in a reference leak since the
caller assumes it does not need to do a corresponding percpu_ref_put().
This was seen when performing CPU hotplug during I/O, as hangs in
blk_mq_freeze_queue_wait where percpu_ref_kill (blk_mq_freeze_queue_start)
raced with percpu_ref_tryget (blk_mq_timeout_work).
Sample stack trace:
__switch_to+0x2c0/0x450
__schedule+0x2f8/0x970
schedule+0x48/0xc0
blk_mq_freeze_queue_wait+0x94/0x120
blk_mq_queue_reinit_work+0xb8/0x180
blk_mq_queue_reinit_prepare+0x84/0xa0
cpuhp_invoke_callback+0x17c/0x600
cpuhp_up_callbacks+0x58/0x150
_cpu_up+0xf0/0x1c0
do_cpu_up+0x120/0x150
cpu_subsys_online+0x64/0xe0
device_online+0xb4/0x120
online_store+0xb4/0xc0
dev_attr_store+0x68/0xa0
sysfs_kf_write+0x80/0xb0
kernfs_fop_write+0x17c/0x250
__vfs_write+0x6c/0x1e0
vfs_write+0xd0/0x270
SyS_write+0x6c/0x110
system_call+0x38/0xe0
Examination of the queue showed a single reference (no PERCPU_COUNT_BIAS,
and __PERCPU_REF_DEAD, __PERCPU_REF_ATOMIC set) and no requests.
However, conditions at the time of the race are count of PERCPU_COUNT_BIAS + 0
and __PERCPU_REF_DEAD and __PERCPU_REF_ATOMIC set.
The fix is to make the tryget routines use an actual boolean internally instead
of the atomic long result truncated to an int.
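The truncation is easy to demonstrate in isolation (standalone userspace
illustration, assuming a 64-bit long):
#include <stdbool.h>
#include <stdio.h>
int main(void)
{
	long counted = 1L << 62;	/* only high bits set, PERCPU_COUNT_BIAS-style */
	int as_int = counted;		/* low 32 bits are zero: truncates to 0 */
	bool as_bool = counted;		/* any non-zero value normalizes to true */
	printf("as int: %d, as bool: %d\n", as_int, as_bool);
	return 0;
}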
Fixes: e625305b39 ("percpu-refcount: make percpu_ref based on longs instead of ints")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=190751
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Reviewed-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: e625305b39 ("percpu-refcount: make percpu_ref based on longs instead of ints")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0e287a401 upstream.
A typo or copy-paste bug means that the register access intended for
regulator dcdce goes to dcdcb instead. This patch corrects it.
Fixes: 2ca342d391 (regulator: axp20x: Support AXP806 variant)
Signed-off-by: Rask Ingemann Lambertsen <rask@formelder.dk>
Acked-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cda8bba0f9 upstream.
Currently, under certain circumstances vhost_init_is_le does just a part
of the initialization job, and depends on vhost_reset_is_le being called
too. For this reason vhost_vq_init_access used to call vhost_reset_is_le
when vq->private_data is NULL. This is not only counterintuitive, but
also a real problem because it breaks vhost_net. The bug was introduced to
vhost_net with commit 2751c9882b ("vhost: cross-endian support for
legacy devices"). The symptom is corruption of the vq's used.idx field
(virtio) after VHOST_NET_SET_BACKEND was issued as a part of the vhost
shutdown on a vq with pending descriptors.
Let us make sure the outcome of vhost_init_is_le never depends on the state
it is actually supposed to initialize, and fix virtio_net by removing the
reset from vhost_vq_init_access.
With the above, there is no reason for vhost_reset_is_le to do just half
of the job. Let us make vhost_reset_is_le reinitialize is_le.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reported-by: Michael A. Tebolt <miket@us.ibm.com>
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: commit 2751c9882b ("vhost: cross-endian support for legacy devices")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Michael A. Tebolt <miket@us.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 161e6d44a5 upstream.
One of our kernelCI boxes hung at boot because a faulty eSDHC device
was triggering spurious CARD_INT interrupts for SD cards, causing CMD52
reads, which are not allowed for SD devices. This adds a sanity check
to the interrupt path, preventing that illegal command from getting
sent if the CARD_INT interrupt should be disabled.
This quirk allows that particular machine to resume boot despite the
faulty hardware, instead of getting hung dealing with thousands of
mishandled interrupts.
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07cd129455 upstream.
While refactoring cgroup creation, a5bca21520 ("cgroup: factor out
cgroup_create() out of cgroup_mkdir()") incorrectly onlined subsystems
before the new cgroup is associated with its kernfs_node. This is fine
for cgroup proper but cgroup_name/path() depend on the associated
kernfs_node and if a subsystem makes the new cgroup_subsys_state
visible, which they're allowed to after onlining, it can lead to NULL
dereference.
The current code performs cgroup creation and subsystem onlining in
cgroup_create() and cgroup_mkdir() makes the cgroup and subsystems
visible afterwards. There's no reason to online the subsystems early
and we can simply drop cgroup_apply_control_enable() call from
cgroup_create() so that the subsystems are onlined and made visible at
the same time.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: a5bca21520 ("cgroup: factor out cgroup_create() out of cgroup_mkdir()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a06393ed03 upstream.
When removing a bcm tx operation either a hrtimer or a tasklet might run.
As the hrtimer triggers its associated tasklet and vice versa we need to
take care to mutually terminate both handlers.
Reported-by: Michael Josenhans <michael.josenhans@web.de>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Michael Josenhans <michael.josenhans@web.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79c6f448c8 upstream.
The hwlat tracer creates a kernel thread at start of the tracer. It is
pinned to a single CPU and will move to the next CPU after each period of
running. If the user modifies the migration thread's affinity, it will not
change after that happens.
The original code created the thread at the first instance it was called,
but later was changed to destroy the thread after the tracer was finished,
and would not be created until the next instance of the tracer was
established. The code that initialized the affinity was only called on the
initial instantiation of the tracer. After that, it was not initialized, and
the previous affinity did not match the current newly created one, making
it appear that the user modified the thread's affinity when it did not, and
the thread failed to migrate again.
Fixes: 0330f7aa8e ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a96dfddbcc upstream.
Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().
BUG: unable to handle kernel paging request at ffffea017a000000
IP: show_valid_zones+0x6f/0x160
This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB. [1] An example of such
systems is described below. 0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.
BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable
Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range. show_valid_zones() then proceeds with the valid range.
[1] 'Commit bdee237c03 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
commit deb88a2a19 upstream.
Patch series "fix a kernel oops when reading sysfs valid_zones", v2.
A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory. [1] When the start address of a
memory block is not backed by struct page, i.e. a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops. This issue was observed on multiple x86-64 systems with
more than 64GiB of memory. This patch-set fixes this issue.
Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.
Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).
Note for stable kernels: The memory block size change was made by commit
bdee237c03 ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9. However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887f4 ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.
So, I recommend that we backport it up to 4.4.
[1] 'Commit bdee237c03 ("x86: mm: Use 2GB memory block size on
large-memory x86-64 systems")'
This patch (of 2):
test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'. Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.
Fix it by properly setting 'sec_end_pfn' to the next section pfn.
Also make sure that this function returns 1 only when the range belongs
to a zone.
Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.com
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81ddd8c0c5 upstream.
file_info_lock is not initialized in initiate_cifs_search(), leading to the
following splat after a simple "mount.cifs ... dir && ls dir/":
BUG: spinlock bad magic on CPU#0, ls/486
lock: 0xffff880009301110, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 0 PID: 486 Comm: ls Not tainted 4.9.0 #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
ffffc900042f3db0 ffffffff81327533 0000000000000000 ffff880009301110
ffffc900042f3dd0 ffffffff810baf75 ffff880009301110 ffffffff817ae077
ffffc900042f3df0 ffffffff810baff6 ffff880009301110 ffff880008d69900
Call Trace:
[<ffffffff81327533>] dump_stack+0x65/0x92
[<ffffffff810baf75>] spin_dump+0x85/0xe0
[<ffffffff810baff6>] spin_bug+0x26/0x30
[<ffffffff810bb159>] do_raw_spin_lock+0xe9/0x130
[<ffffffff8159ad2f>] _raw_spin_lock+0x1f/0x30
[<ffffffff8127e50d>] cifs_closedir+0x4d/0x100
[<ffffffff81181cfd>] __fput+0x5d/0x160
[<ffffffff81181e3e>] ____fput+0xe/0x10
[<ffffffff8109410e>] task_work_run+0x7e/0xa0
[<ffffffff81002512>] exit_to_usermode_loop+0x92/0xa0
[<ffffffff810026f9>] syscall_return_slowpath+0x49/0x50
[<ffffffff8159b484>] entry_SYSCALL_64_fastpath+0xa7/0xa9
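The fix is the usual one-line initialization right after the allocation in
initiate_cifs_search(); roughly:
	cifsFile = kzalloc(sizeof(struct cifsFileInfo), GFP_KERNEL);
	if (cifsFile == NULL) {
		rc = -ENOMEM;
		goto error_exit;
	}
	spin_lock_init(&cifsFile->file_info_lock);	/* previously left uninitialized */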
Fixes: 3afca265b5 ("Clarify locking of cifs file and tcon structures and make more granular")
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d7b028f56a upstream.
Add zswap_init_failed bool that prevents changing any of the module
params, if init_zswap() fails, and set zswap_enabled to false. Change
'enabled' param to a callback, and check zswap_init_failed before
allowing any change to 'enabled', 'zpool', or 'compressor' params.
Any driver that is built-in to the kernel will not be unloaded if its
init function returns error, and its module params remain accessible for
users to change via sysfs. Since zswap uses param callbacks, which
assume that zswap has been initialized, changing the zswap params after
a failed initialization will result in WARNING due to the param
callbacks expecting a pool to already exist. This prevents that by
immediately exiting any of the param callbacks if initialization failed.
This was reported here:
https://marc.info/?l=linux-mm&m=147004228125528&w=4
And fixes this WARNING:
[ 429.723476] WARNING: CPU: 0 PID: 5140 at mm/zswap.c:503 __zswap_pool_current+0x56/0x60
The warning is just noise, and not serious. However, when init fails,
zswap frees all its percpu dstmem pages and its kmem cache. The kmem
cache might be serious, if kmem_cache_alloc(NULL, gfp) has problems; but
the percpu dstmem pages are definitely a problem, as they're used as
temporary buffer for compressed pages before copying into place in the
zpool.
If the user does get zswap enabled after an init failure, then zswap
will likely Oops on the first page it tries to compress (or worse, start
corrupting memory).
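i.e. each param handler now bails out early; for 'enabled' that looks
roughly like this (sketch):
static int zswap_enabled_param_set(const char *val,
				   const struct kernel_param *kp)
{
	if (zswap_init_failed) {
		pr_err("can't enable, initialization failed\n");
		return -ENODEV;
	}
	return param_set_bool(val, kp);
}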
Fixes: 90b0fc26d5 ("zswap: change zpool/compressor at runtime")
Link: http://lkml.kernel.org/r/20170124200259.16191-2-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reported-by: Marcin Miroslaw <marcin@mejor.pl>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 034dd34ff4 upstream.
Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."
The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize. Fix that.
[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task: ffff8800776f95c0 task.stack: ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946] ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901] gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017] svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257] ? __enqueue_entity+0x6c/0x70
[120408.589101] svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212] ? try_to_wake_up+0x4a/0x360
[120408.591036] ? wake_up_process+0x15/0x20
[120408.592093] ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177] svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168] svc_process_common+0x203/0x710 [sunrpc]
[120408.595220] svc_process+0x105/0x1c0 [sunrpc]
[120408.596278] nfsd+0xe9/0x160 [nfsd]
[120408.597060] kthread+0x101/0x140
[120408.597734] ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626] ? kthread_park+0x90/0x90
[120408.599448] ret_from_fork+0x22/0x30
Fixes: 1d658336b0 "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d19fb70dd6 upstream.
nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().
If nfsd doesn't go through init_lock_stateid() and puts the stateid at the
end, there is a NULL dereference of .sc_free when calling nfs4_put_stid(ns).
This patch moves the nfs4_stid.sc_free assignment to nfs4_alloc_stid().
Fixes: 356a95ece7 "nfsd: clean up races in lock stateid searching..."
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0615a16f7 upstream.
When setting a 2MB pte, radix__map_kernel_page() is using the address
ptep = (pte_t *)pudp;
Fix this conversion to use pmdp instead. Use pmdp_ptep() to do this
instead of casting the pointer.
Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5fa0f7f88 upstream.
Anton says: In commit 4db7327194 ("powerpc: Add option to use jump
label for cpu_has_feature()") and commit c12e6f24d4 ("powerpc: Add
option to use jump label for mmu_has_feature()") we added:
BUILD_BUG_ON(!__builtin_constant_p(feature))
to cpu_has_feature() and mmu_has_feature() in order to catch usage
issues (such as cpu_has_feature(cpu_has_feature(X)), which has happened
once in the past). Unfortunately LLVM isn't smart enough to resolve
this, and it errors out.
I work around it in my clang/LLVM builds of the kernel, but I have just
discovered that it causes a lot of issues for the bcc (eBPF) trace tool
(which uses LLVM).
For now just #ifdef it away for clang builds.
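The workaround is simply compiling the check out for clang; a sketch:
	/* clang cannot resolve __builtin_constant_p(feature) here and errors
	 * out, so only enforce the compile-time check for other compilers */
#ifndef __clang__
	BUILD_BUG_ON(!__builtin_constant_p(feature));
#endif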
Fixes: 4db7327194 ("powerpc: Add option to use jump label for cpu_has_feature()")
Fixes: c12e6f24d4 ("powerpc: Add option to use jump label for mmu_has_feature()")
Reported-by: Anton Blanchard <anton@samba.org>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af2b7fa17e upstream.
prom_init.c calls 'instance-to-package' twice, but the return
is not checked during prom_find_boot_cpu(). The result is then
passed to prom_getprop(), which could be PROM_ERROR. Add a return check
to prevent this.
This was found on a pasemi system, where CFE doesn't have a working
'instance-to-package' prom call.
Before Commit 5c0484e25e ('powerpc: Endian safe trampoline') the area
around addr 0 was mostly 0's and this doesn't cause a problem. Once the
macro 'FIXUP_ENDIAN' has been added to head_64.S, the low memory area
now has non-zero values, which cause the prom_getprop() call
to hang.
mpe: Also confirmed that under SLOF if 'instance-to-package' did fail
with PROM_ERROR we would crash in SLOF. So the bug is not specific to
CFE, it's just that other open firmwares don't trigger it because they
have a working 'instance-to-package'.
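The added check is essentially (sketch):
	cpu_pkg = call_prom("instance-to-package", 1, 1, prom_cpu);
	if (!PHANDLE_VALID(cpu_pkg))
		return;		/* don't feed PROM_ERROR into prom_getprop() */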
Fixes: 5c0484e25e ("powerpc: Endian safe trampoline")
Signed-off-by: Darren Stevens <darren@stevens-zone.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f05fea5b35 upstream.
In __eeh_clear_pe_frozen_state(), we should pass the flag's value
instead of its address to eeh_unfreeze_pe(). The isolated flag is
cleared if no error is returned from __eeh_clear_pe_frozen_state(). We
have never observed an error from the function, so the isolated flag has
always been cleared and no real issue is caused by the misused
@flag.
This fixes the code by passing the value of @flag to eeh_unfreeze_pe().
Fixes: 5cfb20b96f ("powerpc/eeh: Emulate EEH recovery for VFIO devices")
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2dae99558e upstream.
For an ATA device supporting the sense data reporting feature set, a
failed command will trigger the execution of ata_eh_request_sense if
the result task file of the failed command has the ATA_SENSE bit set
(sense data available bit). ata_eh_request_sense executes the REQUEST
SENSE DATA EXT command to retrieve the sense data of the failed
command. On success of REQUEST SENSE DATA EXT, the ATA_SENSE bit will
NOT be set (the command succeeded) but ata_eh_request_sense
nevertheless tests the availability of sense data by testing for that
bit's presence in the result tf of the REQUEST SENSE DATA EXT command. This
leads us to falsely assume that request sense data failed and to the
warning message:
atax.xx: request sense failed stat 50 emask 0
Upon success of REQUEST SENSE DATA EXT, set the ATA_SENSE bit in the
result task file command so that sense data can be returned by
ata_eh_request_sense.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0edc8c546 upstream.
Marko reports that CX1-JB512-HP shows the same timeout issues as
CX1-JB256-HP. Let's apply MAX_SEC_128 to all devices in the series.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Marko Koski-Vähälä <marko@koski-vahala.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 064c3db9c5 upstream.
Here, if devm_ioremap() fails it will return NULL, and hpriv->base then
ends up as NULL - 0x20000. The kernel can run into a NULL-pointer
dereference. This error check avoids the NULL pointer dereference.
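i.e. check the mapping before applying the register-window offset; a
sketch (the offset constant is illustrative):
	hpriv->base = devm_ioremap(dev, res->start, resource_size(res));
	if (!hpriv->base)
		return -ENOMEM;
	hpriv->base -= 0x20000;	/* register window offset used by the driver */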
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11e3b725cf upstream.
Update the ARMv8 Crypto Extensions and the plain NEON AES implementations
in CBC and CTR modes to return the next IV back to the skcipher API client.
This is necessary for chaining to work correctly.
Note that for CTR, this is only done if the request is a round multiple of
the block size, since otherwise, chaining is impossible anyway.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e9faa1546 upstream.
In case of a zero-length report, the gpio direction_input callback would
currently return success instead of an errno.
Fixes: 1ffb3c40ff ("HID: cp2112: make transfer buffers DMA capable")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a7b5df84b upstream.
A recent commit fixing DMA-buffers on stack added a shared transfer
buffer protected by a spinlock. This is broken as the USB HID request
callbacks can sleep. Fix this up by replacing the spinlock with a mutex.
Fixes: 1ffb3c40ff ("HID: cp2112: make transfer buffers DMA capable")
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4b3e6f2ef3 upstream.
Commit bf15f86b34 ("xtensa: initialize MMU before jumping to reset
vector") calls MMU management functions even when CONFIG_MMU is not
selected. That breaks noMMU build on cores with MMU.
Don't manage MMU when CONFIG_MMU is not selected.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8f325a59c upstream.
Some AArch64 UEFI implementations disable the MMU in ExitBootServices(),
after which unaligned accesses to RAM are no longer supported.
Commit:
abfb7b686a ("efi/libstub/arm*: Pass latest memory map to the kernel")
fixed an issue in the memory map handling of the stub FDT code, but
inadvertently created an issue with such firmware, by moving some
of the FDT manipulation to after the invocation of ExitBootServices().
Given that the stub's libfdt implementation uses the ordinary, accelerated
string functions, which rely on hardware handling of unaligned accesses,
manipulating the FDT with the MMU off may result in alignment faults.
So fix the situation by moving the update_fdt_memmap() call into the
callback function invoked by efi_exit_boot_services() right before it
calls the ExitBootServices() UEFI service (which is arguably a better
place for it anyway).
Note that disabling the MMU in ExitBootServices() is not compliant with
the UEFI spec, and carries great risk due to the fact that switching from
cached to uncached memory accesses halfway through compiler generated code
(i.e., involving a stack) can never be done in a way that is architecturally
safe.
Fixes: abfb7b686a ("efi/libstub/arm*: Pass latest memory map to the kernel")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Cc: mark.rutland@arm.com
Cc: linux-efi@vger.kernel.org
Cc: matt@codeblueprint.co.uk
Cc: leif.lindholm@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1485971102-23330-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf29bddf04 upstream.
Commit:
129766708 ("x86/efi: Only map RAM into EFI page tables if in mixed-mode")
stopped creating 1:1 mappings for all RAM, when running in native 64-bit mode.
It turns out though that there are 64-bit EFI implementations in the wild
(this particular problem has been reported on a Lenovo Yoga 710-11IKB),
which still make use of the first physical page for their own private use,
even though they explicitly mark it EFI_CONVENTIONAL_MEMORY in the memory
map.
In case there is no mapping for this particular frame in the EFI pagetables,
as soon as firmware tries to make use of it, a triple fault occurs and the
system reboots (in case of the Yoga 710-11IKB this is very early during bootup).
Fix that by always mapping the first page of physical memory into the EFI
pagetables. We're free to hand this page to the BIOS, as trim_bios_range()
will reserve the first page and isolate it away from memory allocators anyway.
Note that just reverting 129766708 alone is not enough on v4.9-rc1+ to fix the
regression on affected hardware, as this commit:
ab72a27da ("x86/efi: Consolidate region mapping logic")
later made the first physical frame not to be mapped anyway.
Reported-by: Hanka Pavlikova <hanka@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vojtech Pavlik <vojtech@ucw.cz>
Cc: Waiman Long <waiman.long@hpe.com>
Cc: linux-efi@vger.kernel.org
Fixes: 129766708 ("x86/efi: Only map RAM into EFI page tables if in mixed-mode")
Link: http://lkml.kernel.org/r/20170127222552.22336-1-matt@codeblueprint.co.uk
[ Tidied up the changelog and the comment. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3a4b77cd47 upstream.
Ralf Spenneberg reported that he hit a kernel crash when mounting a
modified ext4 image. And it turns out that kernel crashed when
calculating fs overhead (ext4_calculate_overhead()), this is because
the image has very large s_first_meta_bg (debug code shows it's
842150400), and ext4 overruns the memory in count_overhead() when
setting bitmap buffer, which is PAGE_SIZE.
ext4_calculate_overhead():
buf = get_zeroed_page(GFP_NOFS); <=== PAGE_SIZE buffer
blks = count_overhead(sb, i, buf);
count_overhead():
for (j = ext4_bg_num_gdb(sb, grp); j > 0; j--) { <=== j = 842150400
ext4_set_bit(EXT4_B2C(sbi, s++), buf); <=== buffer overrun
count++;
}
This can be reproduced easily with this script:
#!/bin/bash
rm -f fs.img
mkdir -p /mnt/ext4
fallocate -l 16M fs.img
mke2fs -t ext4 -O bigalloc,meta_bg,^resize_inode -F fs.img
debugfs -w -R "ssv first_meta_bg 842150400" fs.img
mount -o loop fs.img /mnt/ext4
Fix it by validating s_first_meta_bg first at mount time, and
refusing to mount if its value exceeds the largest possible meta_bg
number.
Reported-by: Ralf Spenneberg <ralf@os-t.de>
Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
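As a rough sketch, the mount-time validation described above boils down
to a check of this shape (the variable names es and db_count mirror the
ext4 superblock code, but the message text and exact placement are
illustrative):
    if (le32_to_cpu(es->s_first_meta_bg) > db_count) {
        ext4_msg(sb, KERN_WARNING,
                 "first meta block group too large: %u "
                 "(group descriptor block count %u)",
                 le32_to_cpu(es->s_first_meta_bg), db_count);
        goto failed_mount;
    }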
commit 030305d69f upstream.
In a struct pcie_link_state, link->root points to the pcie_link_state of
the root of the PCIe hierarchy. For the topmost link, this points to
itself (link->root = link). For others, we copy the pointer from the
parent (link->root = link->parent->root).
Previously we recognized that Root Ports originated PCIe hierarchies, but
we treated PCI/PCI-X to PCIe Bridges as being in the middle of the
hierarchy, and when we tried to copy the pointer from link->parent->root,
there was no parent, and we dereferenced a NULL pointer:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000090
IP: [<ffffffff9e424350>] pcie_aspm_init_link_state+0x170/0x820
Recognize that PCI/PCI-X to PCIe Bridges originate PCIe hierarchies just
like Root Ports do, so link->root for these devices should also point to
itself.
Fixes: 51ebfc92b7 ("PCI: Enumerate switches below PCI-to-PCIe bridges")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=193411
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1022181
Tested-by: lists@ssl-mail.com
Tested-by: Jayachandran C. <jnair@caviumnetworks.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c364b6d0b6 upstream.
In a bmapx call, bmv_count is the total size of the array, including the
zeroth element that userspace uses to supply the search key. The output
array starts at offset 1 so that we can set up the user for the next
invocation. Since we now can split an extent into multiple bmap records
due to shared/unshared status, we have to be careful that we don't
overflow the output array.
In the original patch f86f403794 ("xfs: teach get_bmapx about shared
extents and the CoW fork") I used cur_ext (the output index) to check
for overflows, albeit with an off-by-one error. Since nexleft no longer
describes the number of unfilled slots in the output, we can rip all
that out and use cur_ext for the overflow check directly.
Failure to do this causes heap corruption in bmapx callers such as
xfs_io and xfs_scrub. xfs/328 can reproduce this problem.
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2aa6ba7b5a upstream.
If we try to allocate memory pages to back an xfs_buf that we're trying
to read, it's possible that we'll be so short on memory that the page
allocation fails. For a blocking read we'll just wait, but for
readahead we simply dump all the pages we've collected so far.
Unfortunately, after dumping the pages we neglect to clear the
_XBF_PAGES state, which means that the subsequent call to xfs_buf_free
thinks that b_pages still points to pages we own. It then double-frees
the b_pages pages.
This results in screaming about negative page refcounts from the memory
manager, which xfs oughtn't be triggering. To reproduce this case,
mount a filesystem where the size of the inodes far outweighs the
available memory (a ~500M inode filesystem on a VM with 300MB memory
did the trick here) and run bulkstat in parallel with other memory
eating processes to put a huge load on the system. The "check summary"
phase of xfs_scrub also works for this purpose.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
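A minimal sketch of the fix described above: once the collected pages
have been handed back on the readahead path, the buffer must stop
claiming ownership of them (the error value and surrounding code are
illustrative):
    if (flags & XBF_READ_AHEAD) {
        /* the pages were already released above; clear the flag so
         * xfs_buf_free() does not try to free them a second time */
        bp->b_flags &= ~_XBF_PAGES;
        return -ENOMEM;
    }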
commit 493611ebd6 upstream.
With COW files they are the hotpath, just like for files with the
extent size hint attribute. We really shouldn't micro-manage anything
but failure cases with unlikely.
Additionally Arnd Bergmann recently reported that one of these two
unlikely annotations causes link failures together with an upcoming
kernel instrumentation patch, so let's get rid of it ASAP.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a93790d4e upstream.
xfs_attr_[get|remove]() have unlocked attribute fork checks to optimize
away a lock cycle in cases where the fork does not exist or is otherwise
empty. This check is not safe, however, because an attribute fork short
form to extent format conversion includes a transient state that causes
the xfs_inode_hasattr() check to fail. Specifically,
xfs_attr_shortform_to_leaf() creates an empty extent format attribute
fork and then adds the existing shortform attributes to it.
This means that lookup of an existing xattr can spuriously return
-ENOATTR when racing against a setxattr that causes the associated
format conversion. This was originally reproduced by an untar on a
particularly configured glusterfs volume, but can also be reproduced on
demand with properly crafted xattr requests.
The format conversion occurs under the exclusive ilock. xfs_attr_get()
and xfs_attr_remove() already have the proper locking and checks further
down in the functions to handle this situation correctly. Drop the
unlocked checks to avoid the spurious failure and rely on the existing
logic.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83d230eb5c upstream.
sb_dirblklog is added to sb_blocklog to compute the directory block size
in bytes. Therefore, we must compare the sum of both those values
against XFS_MAX_BLOCKSIZE_LOG, not just dirblklog.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
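The resulting sanity check is roughly the following; the return value
and its exact spot in the superblock verifier are illustrative:
    /* directory block size = 1 << (sb_blocklog + sb_dirblklog) */
    if (sbp->sb_blocklog + sbp->sb_dirblklog > XFS_MAX_BLOCKSIZE_LOG)
        return -EFSCORRUPTED;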
commit d2b3964a07 upstream.
Due to the way how xfs_iomap_write_allocate tries to convert the whole
found extents from delalloc to real space we can run into a race
condition with multiple threads doing writes to this same extent.
For the non-COW case that is harmless as the only thing that can happen
is that we call xfs_bmapi_write on an extent that has already been
converted to a real allocation. For COW writes where we move the extent
from the COW to the data fork after I/O completion the race is, however,
not quite as harmless. In the worst case we are now calling
xfs_bmapi_write on a region that contains a hole in the COW fork, which
will trip up an assert in debug builds or lead to file system corruption
in non-debug builds. This seems to be reproducible with workloads of
small O_DSYNC writes, although so far I've not managed to come up
with an isolated reproducer.
The fix for the issue is relatively simple: tell xfs_bmapi_write
that we are only asked to convert delayed allocations and skip holes
in that case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd29f7af75 upstream.
A harmless warning just got introduced:
fs/xfs/libxfs/xfs_dir2.h:40:8: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
Removing the 'const' modifier avoids the warning and has no
other effect.
Fixes: 1fc4d33fed ("xfs: replace xfs_mode_to_ftype table with switch statement")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 657bdfb7f5 upstream.
The GETNEXTQOTA ioctl takes whatever ID is sent in,
and looks for the next active quota for a user with an ID
equal to or higher than that ID.
But if we are at the maximum ID and then ask for the "next"
one, we may wrap back to zero. In this case, userspace
may loop forever, because it will start querying again
at zero.
We'll fix this in userspace as well, but for the kernel,
return -ENOENT if we ask for the next quota ID
past UINT_MAX so the caller knows to stop.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
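A minimal sketch of the boundary check described above, written as a
wrap-around test on an unsigned ID (variable names are illustrative):
    next_id = id + 1;
    if (next_id < id)       /* wrapped past UINT_MAX */
        return -ENOENT;     /* tell the caller there is no next quota */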
commit fab8eef86c upstream.
The helper xfs_dentry_to_name() is used by two different
classes of callers: callers that pass zero mode and don't care
about the returned name.type field, and callers that pass
a non-zero mode and do care about the name.type field.
Change xfs_dentry_to_name() to not take the mode argument and
change the call sites of the first class to not pass the mode
argument.
Create a new helper xfs_dentry_mode_to_name() which does pass
the mode argument and returns -EFSCORRUPTED if mode is invalid.
Callers that translate a non-zero mode to an on-disk file type now
check the return value and will export the error to userspace instead
of staging an invalid file type to be written to the directory entry.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1fc4d33fed upstream.
The size of the xfs_mode_to_ftype[] conversion table
was too small to handle an invalid value of mode=S_IFMT.
Instead of fixing the table size, replace the conversion table
with a conversion helper that uses a switch statement.
Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
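The replacement helper is essentially a switch over the S_IFMT bits; a
sketch along the lines of the change (XFS_DIR3_FT_* are the existing
on-disk file type constants, the exact prototype is illustrative):
    static uint8_t xfs_mode_to_ftype(int mode)
    {
        switch (mode & S_IFMT) {
        case S_IFREG:   return XFS_DIR3_FT_REG_FILE;
        case S_IFDIR:   return XFS_DIR3_FT_DIR;
        case S_IFCHR:   return XFS_DIR3_FT_CHRDEV;
        case S_IFBLK:   return XFS_DIR3_FT_BLKDEV;
        case S_IFIFO:   return XFS_DIR3_FT_FIFO;
        case S_IFSOCK:  return XFS_DIR3_FT_SOCK;
        case S_IFLNK:   return XFS_DIR3_FT_SYMLINK;
        default:        return XFS_DIR3_FT_UNKNOWN;  /* e.g. mode == S_IFMT */
        }
    }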
commit b597dd5373 upstream.
xfs_dir2.h dereferences some data types in inline functions
and fails to include those type definitions, e.g.:
xfs_dir2_data_aoff_t, struct xfs_da_geometry.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c6f46eacd upstream.
This change fixes an assertion hit when fuzzing on-disk
i_mode values.
The easy case to fix is when changing an empty file's
i_mode to S_IFDIR. In this case, xfs_dinode_verify()
detects an illegal zero size for a directory and fails
to load the inode structure from disk.
For the case of a non-empty file whose i_mode is changed
to S_IFDIR, the ASSERT() statement in xfs_dir2_isblock()
is replaced with return -EFSCORRUPTED, to avoid interacting
with corrupted junk also when XFS_DEBUG is disabled.
Suggested-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84a4620cfe upstream.
There are only two reasons for xfs_log_force / xfs_log_force_lsn to fail:
one is an I/O error, for which xlog_bdstrat already logs a warning, and
the second is an already shutdown log due to previous I/O errors. In
the latter case we'll already have a previous indication for the actual
error, but the large stream of misleading warnings from xfs_log_force
will probably scroll it out of the message buffer.
Simply removing the warnings thus makes the XFS log reporting significantly
better.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 12ef830198 upstream.
->total is a bit of an odd parameter passed down to the low-level
allocator all the way from the high-level callers. It's supposed to
contain the maximum number of blocks to be allocated for the whole
transaction [1].
But in xfs_iomap_write_allocate we only convert existing delayed
allocations and thus only have a minimal block reservation for the
current transaction, so xfs_alloc_space_available can't use it for
the allocation decisions. Use the maximum of args->total and the
calculated block requirement to make a decision. We probably should
get rid of args->total eventually and instead apply ->minleft more
broadly, but that will require some extensive changes all over.
[1] which creates lots of confusion, as most callers don't decrement it
after doing a first allocation. But that's for a separate series.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54fee133ad upstream.
We must decide in xfs_alloc_fix_freelist whether an allocation from
a given AG is possible or not based on the available
space, and should not fail the allocation past that point on a
healthy file system.
But currently we have two additional places that second-guess
xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the
maxlen parameter to remove the reservation before doing the
allocation (but ignores the various minimum freespace requirements),
and xfs_alloc_fix_minleft tries to fix up the allocated length
after we've found an extent, but ignores the reservations and also
doesn't take the AGFL into account (and thus fails allocations
for not matching minlen in some cases).
Remove all these later fixups and just correct the maxlen argument
inside xfs_alloc_fix_freelist once we have the AGF buffer locked.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 255c516278 upstream.
We can't just set minleft to 0 when we're low on space - that's exactly
what we need minleft for: to protect space in the AG for btree block
allocations when we are low on free space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5149fd327f upstream.
Setting aside 4 blocks globally for bmbt splits isn't all that useful,
as different threads can allocate space in parallel. Bump it to 4
blocks per AG to allow each thread that is currently doing an
allocation to dip into it separately. Without that we may not have
enough reserved blocks if there are enough parallel transactions
in an almost out of space file system that all run into bmap btree
splits.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f154be241d ]
Commit 448b4482c6 ("net: dsa: Add lockdep class to tx queues to avoid
lockdep splat") removed the netif_device_detach() call done in
dsa_slave_suspend() which is necessary, and paired with a corresponding
netif_device_attach(), bring it back.
Fixes: 448b4482c6 ("net: dsa: Add lockdep class to tx queues to avoid lockdep splat")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 85c814016c ]
When attempting to free lwtunnel state after the module for the encap
has been unloaded an oops occurs:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: lwtstate_free+0x18/0x40
[..]
task: ffff88003e372380 task.stack: ffffc900001fc000
RIP: 0010:lwtstate_free+0x18/0x40
RSP: 0018:ffff88003fd83e88 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88002bbb3380 RCX: ffff88000c91a300
[..]
Call Trace:
<IRQ>
free_fib_info_rcu+0x195/0x1a0
? rt_fibinfo_free+0x50/0x50
rcu_process_callbacks+0x2d3/0x850
? rcu_process_callbacks+0x296/0x850
__do_softirq+0xe4/0x4cb
irq_exit+0xb0/0xc0
smp_apic_timer_interrupt+0x3d/0x50
apic_timer_interrupt+0x93/0xa0
[..]
Code: e8 6e c6 fc ff 89 d8 5b 5d c3 bb de ff ff ff eb f4 66 90 66 66 66 66 90 55 48 89 e5 53 0f b7 07 48 89 fb 48 8b 04 c5 00 81 d5 81 <48> 8b 40 08 48 85 c0 74 13 ff d0 48 8d 7b 20 be 20 00 00 00 e8
The problem is that after the module for the encap has been unloaded,
the corresponding ops is removed and is thus NULL here.
Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so grab the module
reference for the ops on creating lwtunnel state and of course release
the reference when freeing the state.
Fixes: 1104d9ba44 ("lwtunnel: Add destroy state operation")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
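The pattern described above is the usual module reference dance;
roughly (the error value and placement are illustrative,
try_module_get()/module_put() are the real helpers):
    /* when building lwtunnel state */
    if (!try_module_get(ops->owner))
        return -EOPNOTSUPP;
    /* when freeing lwtunnel state, after calling ops->destroy_state() */
    module_put(ops->owner);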
[ Upstream commit 88ff7334f2 ]
Modules implementing lwtunnel ops should not be allowed to unload
while there is state alive using those ops, so specify the owning
module for all lwtunnel ops.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5b9f575163 ]
Another rebranded Novatel E371. qmi_wwan should drive this device, while
cdc_ether should ignore it. Even though the USB descriptors are plain
CDC-ETHER that USB interface is a QMI interface. Ref commit 7fdb7846c9
("qmi_wwan/cdc_ether: add device IDs for Dell 5804 (Novatel E371) WWAN
card")
Cc: Dan Williams <dcbw@redhat.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0fb44559ff ]
Dmitry reported a deadlock scenario:
unix_bind() path:
u->bindlock ==> sb_writer
do_splice() path:
sb_writer ==> pipe->mutex ==> u->bindlock
In the unix_bind() code path, unix_mknod() does not have to
be done with u->bindlock held, since it is a pure fs operation,
so we can just move unix_mknod() out.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9f427a0e47 ]
MPLS multipath for LSR is broken -- always selecting the first nexthop
in the one label case. For example:
$ ip -f mpls ro ls
100
nexthop as to 200 via inet 172.16.2.2 dev virt12
nexthop as to 300 via inet 172.16.3.2 dev virt13
101
nexthop as to 201 via inet6 2000:2::2 dev virt12
nexthop as to 301 via inet6 2000:3::2 dev virt13
In this example incoming packets have a single MPLS labels which means
BOS bit is set. The BOS bit is passed from mpls_forward down to
mpls_multipath_hash which never processes the hash loop because BOS is 1.
Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len
tracks the total mpls header length on each pass (on pass N mpls_hdr_len
is N * sizeof(mpls_shim_hdr)). When the label is found with the BOS set
it verifies the skb has sufficient header for ipv4 or ipv6, and finds the
IPv4 or IPv6 header by using the last mpls_hdr pointer and adding 1 to
advance past it.
With these changes I have verified the code correctly sees the label,
BOS, IPv4 and IPv6 addresses in the network header and icmp/tcp/udp
traffic for ipv4 and ipv6 are distributed across the nexthops.
Fixes: 1c78efa831 ("mpls: flow-based multipath selection")
Acked-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b6677449df ]
Any bridge options specified during link creation (e.g. ip link add)
are ignored as br_dev_newlink() does not process them.
Use br_changelink() to do it.
Fixes: 1332351617 ("bridge: implement rtnl_link_ops->changelink")
Signed-off-by: Ivan Vecera <cera@cera.cz>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e048fc50d7 ]
A driver using dev_alloc_page() must not reuse a page allocated from
emergency memory reserve.
Otherwise all packets using this page will be immediately dropped,
unless for very specific sockets having SOCK_MEMALLOC bit set.
This issue might be hard to debug, because only a fraction of received
packets would be dropped.
Fixes: 4415a0319f ("net/mlx5e: Implement RX mapped page cache for page recycle")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
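The check implementing this is small; a sketch of the page-cache
recycle path (placement is illustrative, page_is_pfmemalloc() is the
real helper):
    /* never recycle a page that came from the emergency reserves */
    if (page_is_pfmemalloc(page))
        return false;   /* caller frees the page instead of caching it */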
[ Upstream commit 0dbd7ff3ac ]
Found that if we run LTP netstress test with large MSS (65K),
the first attempt from server to send data comparable to this
MSS on fastopen connection will be delayed by the probe timer.
Here is an example:
< S seq 0:0 win 43690 options [mss 65495 wscale 7 tfo cookie] length 32
> S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0
< . ack 1 win 342 length 0
Inside tcp_sendmsg(), tcp_send_mss() returns max MSS in 'mss_now',
as well as in 'size_goal'. This results in the segment not being queued
for transmission until all the data is copied from the user buffer. Then,
inside __tcp_push_pending_frames(), it breaks on the send window test and
continues with the probe timer check.
Fragmentation occurs in tcp_write_wakeup()...
+0.2 > P. seq 1:43777 ack 1 win 342 length 43776
< . ack 43777, win 1365 length 0
> P. seq 43777:65001 ack 1 win 342 options [...] length 21224
...
This also contradicts the fact that we should be bound to half of the
window if it is large.
Fix this flaw by correctly initializing max_window. Before that, it
could have large values that affect further calculations of 'size_goal'.
Fixes: 168a8f5805 ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 03e4deff49 ]
Just like commit 4acd4945cd ("ipv6: addrconf: Avoid calling
netdevice notifiers with RCU read-side lock"), it is unnecessary
to make addrconf_disable_change() use RCU iteration over the
netdev list, since it already holds the RTNL lock, or we may meet
Illegal context switch in RCU read-side critical section.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
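Schematically, the loop simply walks the netdev list under RTNL instead
of taking the RCU read lock:
    /* RTNL is already held by the caller, so no rcu_read_lock() needed */
    for_each_netdev(net, dev) {
        /* update the per-device disable_ipv6 setting and notify */
    }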
[ Upstream commit 9ed59592e3 ]
Trying to add an mpls encap route when the MPLS modules are not loaded
hangs. For example:
CONFIG_MPLS=y
CONFIG_NET_MPLS_GSO=m
CONFIG_MPLS_ROUTING=m
CONFIG_MPLS_IPTUNNEL=m
$ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
The ip command hangs:
root 880 826 0 21:25 pts/0 00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
$ cat /proc/880/stack
[<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
[<ffffffff81065efc>] __request_module+0x27b/0x30a
[<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
[<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
[<ffffffff814ae451>] fib_table_insert+0x90/0x41f
[<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
...
modprobe is trying to load rtnl-lwt-MPLS:
root 881 5 0 21:25 ? 00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
and it hangs after loading mpls_router:
$ cat /proc/881/stack
[<ffffffff81441537>] rtnl_lock+0x12/0x14
[<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
[<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
[<ffffffff81000471>] do_one_initcall+0x8e/0x13f
[<ffffffff81119961>] do_init_module+0x5a/0x1e5
[<ffffffff810bd070>] load_module+0x13bd/0x17d6
...
The problem is that lwtunnel_build_state is called with rtnl lock
held preventing mpls_init from registering.
Given the potential references held by the time lwtunnel_build_state is
called, it cannot drop the rtnl lock to load the module. So, extract the module
loading code from lwtunnel_build_state into a new function to validate
the encap type. The new function is called while converting the user
request into a fib_config which is well before any table, device or
fib entries are examined.
Fixes: 745041e2aa ("lwtunnel: autoload of lwt modules")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit cd33b3e0da ]
Commit a1cba5613e ("net: phy: Add Broadcom phy library for common
interfaces") make the BCM63xx PHY driver utilize bcm_phy_config_intr()
which would appear to do the right thing, except that it does not write
to the MII_BCM63XX_IR register but to MII_BCM54XX_ECR which is
different.
This could cause invalid link parameters and events to be
generated by the PHY interrupt.
Fixes: a1cba5613e ("net: phy: Add Broadcom phy library for common interfaces")
Signed-off-by: Daniel Gonzalez Cabanelas <dgcbueu@gmail.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7be2c82cfd ]
Ashizuka reported a highmem oddity and sent a patch for freescale
fec driver.
But the problem root cause is that core networking stack
must ensure no skb with highmem fragment is ever sent through
a device that does not assert NETIF_F_HIGHDMA in its features.
We need to call illegal_highdma() from harmonize_features()
regardless of CSUM checks.
Fixes: ec5f061564 ("net: Kill link between CSUM and SG features.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pravin Shelar <pshelar@ovn.org>
Reported-by: "Ashizuka, Yuusuke" <ashiduka@jp.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d5ff72d9af ]
vxlan->cfg.dst_port is in network byte order, so an htons()
is needed here. Also reduced comment length to stay closer
to 80 column width (still slightly over, however).
Fixes: e1e5314de0 ("vxlan: implement GPE")
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
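The one-liner amounts to storing the default GPE destination port in
network byte order; a sketch (4790 is the IANA-assigned VXLAN-GPE UDP
port, the flag and field names are approximate):
    if (vxlan->cfg.flags & VXLAN_F_GPE)
        vxlan->cfg.dst_port = htons(4790);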
[ Upstream commit 501db51139 ]
This patch partially reverts fd2a0437dc and e858fae2b0, which introduced a
subtle change in how the virtio_net flags are derived from the SKBs
ip_summed field.
With the above commits, the flags are set to VIRTIO_NET_HDR_F_DATA_VALID
when ip_summed == CHECKSUM_UNNECESSARY, thus treating it differently to
ip_summed == CHECKSUM_NONE, which should be the same.
Further, the virtio spec 1.0 / CS04 explicitly says that
VIRTIO_NET_HDR_F_DATA_VALID must not be set by the driver.
Fixes: fd2a0437dc ("virtio_net: introduce virtio_net_hdr_{from,to}_skb")
Fixes: e858fae2b0 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0faa9cb5b3 ]
Demonstrating the issue:
.. add a drop action
$sudo $TC actions add action drop index 10
.. retrieve it
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 29 sec used 29 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
... bug 1 above: reference is two.
Reference is actually 1 but we forget to subtract 1.
... do a GET again and we see the same issue
try a few times and nothing changes
~$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 0 installed 31 sec used 31 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
... lets try to bind the action to a filter..
$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10
... and now a few GETs:
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 3 bind 1 installed 204 sec used 204 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 4 bind 1 installed 206 sec used 206 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 5 bind 1 installed 235 sec used 235 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
.... as can be observed the reference count keeps going up.
After the fix
$ sudo $TC actions add action drop index 10
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 4 sec used 4 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 1 bind 0 installed 6 sec used 6 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC qdisc add dev lo ingress
$ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 32 sec used 32 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
$ sudo $TC -s actions get action gact index 10
action order 1: gact action drop
random type none pass val 0
index 10 ref 2 bind 1 installed 33 sec used 33 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
Fixes: aecc5cefc3 ("net sched actions: fix GETing actions")
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8a367e74c0 ]
The ax.25 socket connection timed out and the sock struct has
previously been taken down, i.e. the sock struct is now a NULL pointer.
Checking the sock_flag causes the segfault. Check if the socket struct
pointer is NULL before checking sock_flag. This segfault is seen in
timed-out netrom connections.
Please submit to -stable.
Signed-off-by: Basil Gunn <basil@pacabunga.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 02ca0423fd ]
With ip6gre we have a tunnel header which also makes the tunnel MTU
smaller. We need to reserve room for it. Previously we were using up
space reserved for the Tunnel Encapsulation Limit option
header (RFC 2473).
Also, after commit b05229f442 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") our contract with the caller has
changed. Now we check if the packet length exceeds the tunnel MTU after
the tunnel header has been pushed, unlike before.
This is reflected in the check where we look at the packet length minus
the size of the tunnel header, which is already accounted for in tunnel
MTU.
Fixes: b05229f442 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 003c941057 ]
Fix up a data alignment issue on sparc by swapping the order
of the cookie byte array field with the length field in
struct tcp_fastopen_cookie, and making it a proper union
to clean up the typecasting.
This addresses log complaints like these:
log_unaligned: 113 callbacks suppressed
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Kernel unaligned access at TPC[9764ac] tcp_try_fastopen+0x2ec/0x360
Kernel unaligned access at TPC[9764c8] tcp_try_fastopen+0x308/0x360
Kernel unaligned access at TPC[9764e4] tcp_try_fastopen+0x324/0x360
Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
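The layout change is roughly the following: the byte array moves in
front of the length and becomes part of a union so the 64-bit accesses
need no casts (the exact second union member is illustrative, the other
fields follow the existing structure):
    struct tcp_fastopen_cookie {
        union {
            u8      val[TCP_FASTOPEN_COOKIE_MAX];
            __le64  val64[(TCP_FASTOPEN_COOKIE_MAX + 7) / 8];
        };
        s8      len;
        bool    exp;    /* experimental option format in use */
    };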
[ Upstream commit 148d3d021c ]
The __bcm_sysport_tx_reclaim() function is used to reclaim transmit
resources in different places within the driver. Most of them should
not affect the state of the transmit flow control.
Introduce bcm_sysport_tx_clean() which cleans the ring, but does not
re-enable flow control towards the networking stack, and make
bcm_sysport_tx_reclaim() do the actual transmit queue flow control.
Fixes: 80105befdb ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 8a430ed50b ]
rtm_table is an 8-bit field while table ids are allowed up to u32. Commit
709772e6e0 ("net: Fix routing tables with id > 255 for legacy software")
added the preference to set rtm_table in dumps to RT_TABLE_COMPAT if the
table id is > 255. The table id returned on get route requests should do
the same.
Fixes: c36ba6603a ("net: Allow user to get table id from route lookup")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
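The shape of the fix on the get-route path is roughly as follows
(RT_TABLE_COMPAT and RTA_TABLE are the existing netlink definitions,
the surrounding names are illustrative):
    rtm->rtm_table = table_id < 256 ? table_id : RT_TABLE_COMPAT;
    /* the full 32-bit id is still reported via RTA_TABLE */
    if (nla_put_u32(skb, RTA_TABLE, table_id))
        goto nla_put_failure;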
[ Upstream commit ea7a80858f ]
Handle failure in lwtunnel_fill_encap adding attributes to skb.
Fixes: 571e722676 ("ipv4: support for fib route lwtunnel encap attributes")
Fixes: 19e42e4515 ("ipv6: support for fib route lwtunnel encap attributes")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 28e46a0f2e ]
The event_data starts from address 0x00-0x0C and not from 0x08-0x14. This
leads to duplication with other fields in the Event Queue Element such as
sub-type, cqn and owner.
Fixes: eda6500a98 ("mlxsw: Add PCI bus implementation")
Signed-off-by: Elad Raz <eladr@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 400fc0106d ]
During transmission the skb is checked for headroom in order to
add vendor specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but doesn't release it. Current code assumes that the original skb is
released during reallocation and only releases it at the error path
which causes a memory leak.
Fix this by adding the original skb release to the main path.
Fixes: d003462a50 ("mlxsw: Simplify mlxsw_sx_port_xmit function")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 36bf38d158 ]
During transmission the skb is checked for headroom in order to
add vendor specific header. In case the skb needs to be re-allocated,
skb_realloc_headroom() is called to make a private copy of the original,
but doesn't release it. Current code assumes that the original skb is
released during reallocation and only releases it at the error path
which causes a memory leak.
Fix this by adding the original skb release to the main path.
Fixes: 56ade8fe3f ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
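Both mlxsw entries above describe the same leak; the corrected transmit
path looks roughly like this (variable names are illustrative, the skb
helpers are the real ones):
    skb_orig = skb;
    skb = skb_realloc_headroom(skb, needed_headroom);
    if (!skb) {
        /* the copy failed: count the drop and free the original */
        dev_kfree_skb_any(skb_orig);
        return NETDEV_TX_OK;
    }
    /* the copy succeeded: the original is no longer needed either */
    dev_consume_skb_any(skb_orig);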
[ Upstream commit 0719e72ccb ]
The receive callback (in tasklet context) is using RCU to get reference
to associated VF network device but this is not safe. RCU read lock
needs to be held. Found by running with full lockdep debugging
enabled.
Fixes: f207c10d98 ("hv_netvsc: use RCU to protect vf_netdev")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
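Schematically, the receive callback now brackets the VF netdev lookup
with an RCU read-side critical section (field and variable names are
illustrative):
    rcu_read_lock();
    vf_netdev = rcu_dereference(net_device_ctx->vf_netdev);
    if (vf_netdev) {
        /* deliver the received packet via the VF net device */
    }
    rcu_read_unlock();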
[ Upstream commit 19c0f40d4f ]
Fix that the hw rx checksum is always enabled and the user couldn't
switch it to sw rx checksum.
Note that RTL_VER_01 supports sw rx checksum only. Besides,
the hw rx checksum for RTL_VER_02 is disabled after
commit b9a321b48a ("r8152: Fix broken RX checksums."). Re-enable it.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fc020d864 upstream.
The WaDisableLSQCROPERFforOCL workaround has the side effect of
disabling an L3SQ optimization that has huge performance implications
and is unlikely to be necessary for the correct functioning of usual
graphic workloads. Userspace is free to re-enable the workaround on
demand, and is generally in a better position to determine whether the
workaround is necessary than the DRM is (e.g. only during the
execution of compute kernels that rely on both L3 fences and HDC R/W
requests).
The same workaround seems to apply to BDW (at least to production
stepping G1) and SKL as well (the internal workaround database claims
that it does for all steppings, while the BSpec workaround table only
mentions pre-production steppings), but the DRM doesn't do anything
beyond whitelisting the L3SQCREG4 register so userspace can enable it
when it sees fit. Do the same on KBL platforms.
Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
This is followed by a regression of 35% and 10% respectively for the
same benchmarks and platform caused by my recent patch series
switching userspace to use the dataport constant cache instead of the
sampler to implement uniform pull constant loads, which caused us to
hit more heavily the L3 cache (and on platforms other than KBL had the
opposite effect of improving performance of the same two benchmarks).
The overall effect on KBL of this change combined with the recent
userspace change is respectively 4.6% and 2.6%. SynMark2 OglShMapPcf
was affected by the constant cache changes (though it improved as it
did on other platforms rather than regressing), but is not
significantly affected by this patch (with statistical significance of
5% and sample size 20).
v2: Drop some more code to avoid unused variable warning.
Fixes: 738fa1b312 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: beignet@lists.freedesktop.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
[Removed double Fixes tag]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
(cherry picked from commit 8726f2faa3)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
[ Francisco Jerez: Rebase on v4.9 branch. ]
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 321027c1fe upstream.
Di Shen reported a race between two concurrent sys_perf_event_open()
calls where both try and move the same pre-existing software group
into a hardware context.
The problem is exactly that described in commit:
f63a8daa58 ("perf: Fix event->ctx locking")
... where, while we wait for a ctx->mutex acquisition, the event->ctx
relation can have changed under us.
That very same commit failed to recognise sys_perf_event_context() as an
external access vector to the events and thereby didn't apply the
established locking rules correctly.
So while one sys_perf_event_open() call is stuck waiting on
mutex_lock_double(), the other (which owns said locks) moves the group
about. So by the time the former sys_perf_event_open() acquires the
locks, the context we've acquired is stale (and possibly dead).
Apply the established locking rules as per perf_event_ctx_lock_nested()
to the mutex_lock_double() for the 'move_group' case. This obviously means
we need to validate state after we acquire the locks.
Reported-by: Di Shen (Keen Lab)
Tested-by: John Dias <joaodias@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Min Chong <mchong@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: f63a8daa58 ("perf: Fix event->ctx locking")
Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a00b6c243 upstream.
The commit 1c6c69525b ("genirq: Reject bogus threaded irq requests")
starts refusing misconfigured interrupt handlers. This makes
intel_mid_powerbtn not work anymore.
Add a mandatory flag to a threaded IRQ request in the driver.
Fixes: 1c6c69525b ("genirq: Reject bogus threaded irq requests")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
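With no primary handler, a threaded request must carry IRQF_ONESHOT;
the request ends up looking roughly like this (the thread handler and
dev_id names are placeholders):
    ret = request_threaded_irq(irq, NULL /* no primary handler */,
                               powerbtn_thread_isr, IRQF_ONESHOT,
                               "intel_mid_powerbtn", input);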
commit 63d762b88c upstream.
There is an off-by-one error so we don't unregister priv->pdev_mux[0].
Also it's slightly simpler as a while loop instead of a for loop.
Fixes: 58cbbee239 ("x86/platform/mellanox: Introduce support for Mellanox systems platform")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Vadim Pasternak <vadimp@mellanox.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
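The unwind loop after the fix is the usual reverse-order unregister,
assuming i is the index of the entry that failed (priv->pdev_mux is the
array named above):
    while (--i >= 0)
        platform_device_unregister(priv->pdev_mux[i]);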
commit f7f6634d23 upstream.
Once DMA API usage is enabled, it becomes apparent that virtio-mmio is
inadvertently relying on the default 32-bit DMA mask, which leads to
problems like rapidly exhausting SWIOTLB bounce buffers.
Ensure that we set the appropriate 64-bit DMA mask whenever possible,
with the coherent mask suitably limited for the legacy vring as per
a0be1db430 ("virtio_pci: Limit DMA mask to 44 bits for legacy virtio
devices").
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Fixes: b42111382f ("virtio_mmio: Use the DMA API if enabled")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
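A sketch of the mask setup described above, with error handling
trimmed (32 + PAGE_SHIFT yields the 44-bit coherent limit for 4K pages
mentioned in the referenced commit):
    rc = dma_set_mask(&pdev->dev, DMA_BIT_MASK(64));
    if (rc)
        rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
    else
        dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + PAGE_SHIFT));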
commit 8a1f780e7f upstream.
online_{kernel|movable} is used to change the memory zone to
ZONE_{NORMAL|MOVABLE} and online the memory.
To check that memory zone can be changed, zone_can_shift() is used.
Currently the function returns a negative integer value, a positive
integer value, or 0. When the function returns a negative or positive
integer value, it means that the memory zone can be changed to
ZONE_{NORMAL|MOVABLE}.
But when the function returns 0, there are two meanings.
One of the meanings is that the memory zone does not need to be changed.
For example, when memory is in ZONE_NORMAL and onlined by online_kernel
the memory zone does not need to be changed.
Another meaning is that the memory zone cannot be changed. When memory
is in ZONE_NORMAL and onlined by online_movable, the memory zone may
not be changed to ZONE_MOVABLE due to memory online limitations (see
Documentation/memory-hotplug.txt). In this case, memory must not be
onlined.
The patch changes the return type of zone_can_shift() so that memory
online operation fails when memory zone cannot be changed as follows:
Before applying patch:
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
# echo online_movable > memory4097/state
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 8388608
managed 8388608
The online_movable operation succeeded, but memory is onlined as
ZONE_NORMAL, not ZONE_MOVABLE.
After applying patch:
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
# echo online_movable > memory4097/state
bash: echo: write error: Invalid argument
# grep -A 35 "Node 2" /proc/zoneinfo
Node 2, zone Normal
<snip>
node_scanned 0
spanned 8388608
present 7864320
managed 7864320
The online_movable operation failed because the memory zone could not
be changed from ZONE_NORMAL to ZONE_MOVABLE.
Fixes: df429ac039 ("memory-hotplug: more general validation of zone during online")
Link: http://lkml.kernel.org/r/2f9c3837-33d7-b6e5-59c0-6ca4372b2d84@gmail.com
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04ff5a095d upstream.
The commit 658b476c74 ("pinctrl: baytrail: Add debounce configuration")
implements debounce for Baytrail pin control, but it seems it wasn't tested
properly.
The register which keeps the debounce value is separate from the configuration
one. Writing wrong values to the latter guarantees wrong behaviour of the
driver and might even break something physically.
Besides the above, the case of disabling debounce, which is actually done
through a bit in the configuration register, was missed.
Rectify the implementation here by using the proper register for the debounce
value.
Fixes: 658b476c74 ("pinctrl: baytrail: Add debounce configuration")
Cc: Cristina Ciocan <cristina.ciocan@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c739c0a7c3 upstream.
A rare randconfig build failure shows up in this driver when
the CRC32 helper is not there:
drivers/media/built-in.o: In function `s5k4ecgx_s_power':
s5k4ecgx.c:(.text+0x9eb4): undefined reference to `crc32_le'
This adds the 'select' that all other users of this function have.
Fixes: 8b99312b72 ("[media] Add v4l2 subdev driver for S5K4ECGX sensor")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d4b21e0a2 upstream.
On a UD QP, the completer tasklet is scheduled for each packet sent.
If this is followed by a destroy_qp(), a kernel panic will
happen as the completer tries to operate on a destroyed QP.
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f39f775218 upstream.
The first argument of list_add_tail is the new item and the second
is the head of the list. Fix the code to pass arguments in the
right order, otherwise not all the rxe devices will be removed
during teardown.
Fixes: 8700e3e7c4 ('Soft RoCE driver')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 828f6fa65c upstream.
1. Release pid before entering the ODP flow
2. Release pid when failing to allocate memory
Fixes: 87773dd56d ("IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get")
Fixes: 8ada2c1c0c ("IB/core: Add support for on demand paging regions")
Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
Reviewed-by: Haggai Eran <haggaie@mellanox.com>
Reviewed-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79d6205a3f upstream.
The s_stream() handler incorrectly writes the whole MISC_CTL register to
enable or disable the outputs, overriding the output pinmuxing
configuration. Fix it to only touch the output enable bits.
The CONF_SHARED_PIN register is also written by the same function,
resulting in muxing the INTREQ signal instead of the VBLK/GPCL signal on
the INTREQ/GPCL/VBLK pin. As the driver doesn't support interrupts this
is obviously incorrect, and breaks operation on other devices. Fix it by
removing the write.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4b2de386b upstream.
The FID/GLCO/VLK/HVLK and INTREQ/GPCL/VBLK pins are muxed differently
depending on whether the input is an S-Video or composite signal. The
comment that explains the logic doesn't reflect the code. It appears
that the comment is incorrect, as disabling the output data bus in
composite mode makes no sense. Update the comment to match the code.
While at it define macros for the MISC_CTL register bits, the code is
too confusing with numerical values.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aff808e813 upstream.
The tvp5150 doesn't support format setting through the subdev pad API
and thus implements the set format handler as a get format operation.
The single handler, tvp5150_fill_fmt(), resets the device by calling
tvp5150_reset(). This causes malfunction as the device can be reset at
will, possibly from userspace when the subdev userspace API is enabled.
The reset call was added in commit ec2c4f3f93 ("[media] media:
tvp5150: Add mbus_fmt callbacks"), probably as an attempt to set the
device to a known state before detecting the current TV standard.
However, the get format handler doesn't access the hardware to get the
TV standard since commit 963ddc63e2 ("[media] media: tvp5150: Add
cropping support"). There is thus no need to reset the device when
getting the format.
However, removing the tvp5150_reset() from the get/set format handlers
results in the function not being called at all if the bridge driver
doesn't use the .reset() operation. The operation is nowadays abused and
shouldn't be used, so we shouldn't expect bridge drivers to call it. To
make sure the device is properly initialized, move the reset call from
the format handlers to the probe function.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 48775cb73c upstream.
commit 73d5c5c864 ("[media] pctv452e: don't do DMA on stack") caused
a NULL pointer dereference which occurs when dvb_usb_init()
calls dvb_usb_device_power_ctrl() for the first time, before the
frontend has been attached. It also caused a recursive deadlock because
tt3650_ci_msg_locked() has already locked the mutex.
So, partially revert it, but move the buffer to the heap
(DMA capable), not to the stack (may not be DMA capable).
Instead of sharing one buffer which needs mutex protection,
do a new heap allocation for each call.
Fixes: commit 73d5c5c864 ("[media] pctv452e: don't do DMA on stack")
Signed-off-by: Max Kellermann <max.kellermann@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c12a67fec8 upstream.
Commit ad61a4c7a9 ("iw_cxgb4: don't block in destroy_qp awaiting
the last deref") introduced a bug where the RDMA QP EQ queue memory
(and QIDs) are possibly freed before the underlying connection has been
fully shutdown. The result being a possible DMA read issued by HW after
the queue memory has been unmapped and freed. This results in possible
WR corruption in the worst case, system bus errors if an IOMMU is in use,
and SGE "bad WR" errors reported in the very least. The fix is to defer
unmap/free of queue memory and QID resources until the QP struct has
been fully dereferenced. To do this, the c4iw_ucontext must also be kept
around until the last QP that references it is fully freed. In addition,
since the last QP deref can happen in an IRQ disabled context, we need
a new workqueue thread to do the final unmap/free of the EQ queue memory.
Fixes: ad61a4c7a9 ("iw_cxgb4: don't block in destroy_qp awaiting the last deref")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a430607b2e upstream.
Some nfsv4.0 servers may return a mode for the verifier following an open
with EXCLUSIVE4 createmode, but this does not mean the client should skip
setting the mode in the following SETATTR. It should only do that for
EXCLUSIVE4_1 or UNGUARDED createmode.
Fixes: 5334c5bdac ("NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ac092519a upstream.
We cannot call nfs4_handle_exception() without first ensuring that the
slot has been freed. If not, we end up deadlocking with the process
waiting for recovery to complete, and recovery waiting for the slot
table to drain.
Fixes: 2e80dbe7ac ("NFSv4.1: Close callback races for OPEN, LAYOUTGET...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 059aa73482 upstream.
Xuan Qi reports that the Linux NFSv4 client failed to lock a file
that was migrated. The steps he observed on the wire:
1. The client sent a LOCK request to the source server
2. The source server replied NFS4ERR_MOVED
3. The client switched to the destination server
4. The client sent the same LOCK request to the destination
server with a bumped lock sequence ID
5. The destination server rejected the LOCK request with
NFS4ERR_BAD_SEQID
RFC 3530 section 8.1.5 provides a list of NFS errors which do not
bump a lock sequence ID.
However, RFC 3530 is now obsoleted by RFC 7530. In RFC 7530 section
9.1.7, this list has been updated by the addition of NFS4ERR_MOVED.
Reported-by: Xuan Qi <xuan.qi@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ad5d52d42 upstream.
In swab.h the "#if BITS_PER_LONG > 32" breaks compiling userspace programs if
BITS_PER_LONG is #defined by userspace with the sizeof() compiler builtin.
Solve this problem by using __BITS_PER_LONG instead. Since we now
#include asm/bitsperlong.h, avoid further potential userspace pollution
by moving the #define of SHIFT_PER_LONG to bitops.h which is not
exported to userspace.
This patch unbreaks compiling qemu on hppa/parisc.
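As a rough illustration of the approach (not the actual parisc swab.h
contents), a header exported to userspace can key off __BITS_PER_LONG, which
<asm/bitsperlong.h> provides for both kernel and userspace builds; the macro
defined below is hypothetical:

#include <asm/bitsperlong.h>

#if __BITS_PER_LONG > 32
/* 64-bit-only definitions go here */
#define EXAMPLE_HAS_64BIT_SWAB 1
#endif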
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9aed02feae upstream.
After emulating an unaligned access in the delay slot of a branch, we
pretend the delay slot never happened - so we return to the actual
branch target (or the next PC if the branch was not taken).
Currently we do this by handling STATUS32.DE, but we also need to clear the
BTA.T bit, which is disregarded when returning from the original misaligned
exception, yet could cause weirdness if the interrupt return path is taken
(in case an interrupt was active too).
One ARC700 customer ran into this when enabling unaligned access fixup
for kernel mode accesses as well.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36425cd670 upstream.
commit 3c7c7a2fc8 ("ARC: Don't use "+l" inline asm constraint")
modified the inline assembly to set up the LP_COUNT register manually and NOT
rely on gcc to do it (with the "+l" inline assembler constraint hint, now
being retired in the compiler).
However the fix was flawed as we didn't add LP_COUNT to the asm clobber list,
meaning gcc doesn't know that LP_COUNT or zero-delay-loops are in action
in the inline asm.
This resulted in some fun, as nested ZOL loops were being generated:
| mov lp_count,250000 ;16 # tmp235,
| lp .L__GCC__LP14 # <======= OUTER LOOP (gcc generated)
| .L14:
| ld r2, [r5] # MEM[(volatile u32 *)prephitmp_43], w
| dmb 1
| breq r2, -1, @.L21 #, w,,
| bbit0 r2,1,@.L13 # w,,
| ld r4,[r7] ;25 # loops_per_jiffy, loops_per_jiffy
| mpymu r3,r4,r6 #, loops_per_jiffy, tmp234
|
| mov lp_count, r3 # <====== INNER LOOP (from inline asm)
| lp 1f
| nop
| 1:
| nop_s
| .L__GCC__LP14: ; loop end, start is @.L14 #,
This caused issues with drivers relying on sane behaviour of udelay and
friends.
With LP_COUNT added to the clobber list, gcc doesn't generate the outer
loop in, say, the above case.
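For reference, the resulting inline asm has roughly the following shape (a
simplified sketch of the kernel's ARC delay loop, not a verbatim copy); the
point is the "lp_count" entry in the clobber list:

static inline void example_delay(unsigned long loops)
{
        __asm__ __volatile__(
        "       mov lp_count, %0        \n"
        "       lp  1f                  \n"
        "       nop                     \n"
        "1:                             \n"
        :
        : "r"(loops)
        : "lp_count");  /* tell gcc the ZOL loop register is clobbered */
}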
Addresses STAR 9001146134
Reported-by: Joao Pinto <jpinto@synopsys.com>
Fixes: 3c7c7a2fc8 ("ARC: Don't use "+l" inline asm constraint")
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit befa60113c upstream.
In order to make the driver work with the common clock framework, this
patch converts the clk_enable()/clk_disable() to
clk_prepare_enable()/clk_disable_unprepare().
Also add error checking for clk_prepare_enable().
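The conversion follows the usual pattern, sketched here with generic names
rather than the driver's actual functions:

#include <linux/clk.h>

static int example_enable(struct clk *clk)
{
        int err;

        err = clk_prepare_enable(clk);  /* was: clk_enable(clk) */
        if (err)
                return err;             /* new: propagate the failure */

        return 0;
}

static void example_disable(struct clk *clk)
{
        clk_disable_unprepare(clk);     /* was: clk_disable(clk) */
}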
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c97c52be78 upstream.
The priv->device pointer for c_can_pci is never set, but it is used
without a NULL check in c_can_start(). Setting it in c_can_pci_probe()
like c_can_plat_probe() prevents c_can_pci.ko from crashing, with and
without CONFIG_PM.
This might also cause the pm_runtime_*() functions in c_can.c to
actually be executed for c_can_pci devices - they are the only other
place where priv->device is used, but they all contain a null check.
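The essence of the fix is a one-line assignment in the PCI probe path,
sketched here with a stand-in private structure instead of the driver's
real one:

#include <linux/pci.h>

struct example_priv {                   /* stand-in for the driver's priv */
        struct device *device;
};

static void example_pci_probe_fixup(struct pci_dev *pdev,
                                    struct example_priv *priv)
{
        priv->device = &pdev->dev;      /* was left NULL before the fix */
}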
Signed-off-by: Einar Jón <tolvupostur@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a475ef422 upstream.
After setting the indirect_sg_entries module parameter to a huge value
(e.g. 500,000), srp_alloc_req_data() fails to allocate the indirect
descriptors for the request ring (kmalloc fails). This commit enforces
SG_MAX_SEGMENTS as the maximum value of indirect_sg_entries, as stated
in the module parameter description.
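A minimal sketch of the clamp, assuming it runs once at module init (the
function name is illustrative, not the srp code):

#include <linux/kernel.h>
#include <linux/scatterlist.h>

static unsigned int clamp_sg_entries(unsigned int entries)
{
        if (entries > SG_MAX_SEGMENTS) {
                pr_warn("clamping indirect_sg_entries to %u\n",
                        SG_MAX_SEGMENTS);
                return SG_MAX_SEGMENTS;
        }
        return entries;
}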
Fixes: 65e8617fba (scsi: rename SCSI_MAX_{SG, SG_CHAIN}_SEGMENTS)
Fixes: c07d424d61 (IB/srp: add support for indirect tables that don't fit in SRP_CMD)
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1e5db6c31a upstream.
For devices that can register page list that is bigger than
USHRT_MAX, we actually take the wrong value for sg_tablesize.
E.g. for CX4, max_fast_reg_page_list_len is 65536 (bigger than USHRT_MAX),
so we set sg_tablesize to 0 by mistake. Therefore, each IO that is
bigger than 4k is split into "< 4k" chunks, causing performance degradation.
Remove the wrong sg_tablesize assignment, and use the value that was set in
the address resolution handler, with the needed casting.
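The truncation hazard can be shown in isolation (a sketch, with max_pages
standing in for max_fast_reg_page_list_len): sg_tablesize is an unsigned
short, so 65536 silently wraps to 0 unless it is clamped before the
assignment.

#include <linux/kernel.h>

static unsigned short example_sg_tablesize(u32 max_pages)
{
        /* clamp before narrowing, so 65536 becomes 65535 instead of 0 */
        return (unsigned short)min_t(u32, max_pages, USHRT_MAX);
}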
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9dce990d2c upstream.
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
convert_vx_to_fp() is adapted to handle only a specified number of
registers rather than unconditionally handling all of them: other
callers of this function are adapted appropriately.
Based on an initial patch by Dave Martin.
Reported-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4cfe3971f upstream.
If IPV6 has not been enabled in the underlying kernel, we must avoid
calling IPV6 procedures in rdma_cm.ko.
This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements
surrounding any code which calls external IPV6 procedures.
In the instance fixed here, procedure cma_bind_addr() called
ipv6_addr_type() -- which resulted in calling external procedure
__ipv6_addr_type().
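The guard pattern looks roughly like this (a simplified sketch, not the
exact cma_bind_addr() code); with IS_ENABLED(CONFIG_IPV6) in the condition,
the compiler still type-checks the IPv6 call but eliminates it, so no
undefined symbol is pulled in when IPv6 is disabled:

#include <linux/socket.h>
#include <linux/in6.h>
#include <net/ipv6.h>

static bool example_addr_is_v6_linklocal(const struct sockaddr *sa)
{
        if (IS_ENABLED(CONFIG_IPV6) && sa->sa_family == AF_INET6) {
                const struct sockaddr_in6 *sin6 =
                        (const struct sockaddr_in6 *)sa;

                return ipv6_addr_type(&sin6->sin6_addr) &
                       IPV6_ADDR_LINKLOCAL;
        }
        return false;
}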
Fixes: 6c26a77124 ("RDMA/cma: fix IPv6 address resolution")
Cc: Spencer Baugh <sbaugh@catern.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1fdf41941b upstream.
When you snapshot a subvolume containing a subvolume, you get a
placeholder directory where the subvolume would be. These directory
inodes have ->i_ops set to btrfs_dir_ro_inode_operations. Previously,
these i_ops didn't include the xattr operation callbacks. The conversion
to xattr_handlers missed this case, leading to bogus attempts to set
xattrs on these inodes. This manifested itself as failures when running
delayed inodes.
To fix this, clear IOP_XATTR in ->i_opflags on these inodes.
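Out of context, the core of the fix is just clearing the flag on those
inodes when they are set up (a sketch, not the btrfs code verbatim):

#include <linux/fs.h>

static void example_mark_no_xattr(struct inode *inode)
{
        inode->i_opflags &= ~IOP_XATTR; /* VFS will skip xattr ops entirely */
}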
Fixes: 6c6ef9f26e ("xattr: Stop calling {get,set,remove}xattr inode operations")
Cc: Andreas Gruenbacher <agruenba@redhat.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Tested-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 67ade058ef upstream.
As Jeff explained in c2951f32d3 ("btrfs: remove old tree_root dirent
processing in btrfs_real_readdir()"), supporting this old format is no
longer necessary since the Btrfs magic number has been updated since we
changed to the current format. There are other places where we still
handle this old format, but since this is part of a fix that is going to
stable, I'm only removing this one for now.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 950eabbd6d upstream.
With some gcc versions, we get a warning about the eicon driver,
and that currently shows up as the only remaining warning in one
of the build bots:
In file included from ../drivers/isdn/hardware/eicon/message.c:30:0:
eicon/message.c: In function 'mixer_notify_update':
eicon/platform.h:333:18: warning: array subscript is above array bounds [-Warray-bounds]
The code is easily changed to open-code the unusual PUT_WORD() line
that causes this, avoiding the warning.
Link: http://arm-soc.lixom.net/buildlogs/stable-rc/v4.4.45/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e0d76fa447 upstream.
Quotacheck runs at mount time in situations where quota accounting must
be recalculated. In doing so, it uses bulkstat to visit every inode in
the filesystem. Historically, every inode processed during quotacheck
was released and immediately tagged for reclaim because quotacheck runs
before the superblock is marked active by the VFS. In other words,
the final iput() led to an immediate ->destroy_inode() call, which
allowed the XFS background reclaim worker to start reclaiming inodes.
Commit 17c12bcd3 ("xfs: when replaying bmap operations, don't let
unlinked inodes get reaped") marks the XFS superblock active sooner as
part of the mount process to support caching inodes processed during log
recovery. This occurs before quotacheck and thus means all inodes
processed by quotacheck are inserted to the LRU on release. The
s_umount lock is held until the mount has completed and thus prevents
the shrinkers from operating on the sb. This means that quotacheck can
excessively populate the inode LRU and lead to OOM conditions on systems
without sufficient RAM.
Update the quotacheck bulkstat handler to set XFS_IGET_DONTCACHE on
inodes processed by quotacheck. This causes ->drop_inode() to return 1
and in turn causes iput_final() to evict the inode. This preserves the
original quotacheck behavior and prevents it from overloading the LRU
and running out of memory.
Reported-by: Martin Svec <martin.svec@zoner.cz>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff9f8a7cf9 upstream.
We perform the conversion between kernel jiffies and ms only when
exporting a kernel value to user space.
We need to do the opposite operation when a value is written by the user.
This only matters when HZ != 1000.
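The required symmetry, sketched with generic helpers rather than the sysctl
handler itself: values are stored internally in jiffies and exposed in
milliseconds, so both directions of the boundary need a conversion.

#include <linux/jiffies.h>

static unsigned long example_show_ms(unsigned long val_jiffies)
{
        return jiffies_to_msecs(val_jiffies);   /* kernel -> user (existing) */
}

static unsigned long example_store_ms(unsigned long val_ms)
{
        return msecs_to_jiffies(val_ms);        /* user -> kernel (the missing half) */
}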
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7070619f3 upstream.
Booting Linux on an ARM fastmodel containing an SMMU emulation results
in an unexpected I/O page fault from the legacy virtio-blk PCI device:
[ 1.211721] arm-smmu-v3 2b400000.smmu: event 0x10 received:
[ 1.211800] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010
[ 1.211880] arm-smmu-v3 2b400000.smmu: 0x0000020800000000
[ 1.211959] arm-smmu-v3 2b400000.smmu: 0x00000008fa081002
[ 1.212075] arm-smmu-v3 2b400000.smmu: 0x0000000000000000
[ 1.212155] arm-smmu-v3 2b400000.smmu: event 0x10 received:
[ 1.212234] arm-smmu-v3 2b400000.smmu: 0x00000000fffff010
[ 1.212314] arm-smmu-v3 2b400000.smmu: 0x0000020800000000
[ 1.212394] arm-smmu-v3 2b400000.smmu: 0x00000008fa081000
[ 1.212471] arm-smmu-v3 2b400000.smmu: 0x0000000000000000
<system hangs failing to read partition table>
This is because the legacy virtio-blk device is behind an SMMU, so we
have consequently swizzled its DMA ops and configured the SMMU to
translate accesses. This then requires the vring code to use the DMA API
to establish translations, otherwise all transactions will result in
fatal faults and termination.
Given that ARM-based systems only see an SMMU if one is really present
(the topology is all described by firmware tables such as device-tree or
IORT), we can safely use the DMA API for all legacy virtio devices.
Modern devices can advertise the presence of an IOMMU using the
VIRTIO_F_IOMMU_PLATFORM feature flag.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Fixes: 876945dbf6 ("arm64: Hook up IOMMU dma_ops")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e47483bca2 upstream.
Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
triggers OOM killer in few seconds, despite lots of free memory. The
test attempts to repeatedly fault in memory in one process in a cpuset,
while changing allowed nodes of the cpuset between 0 and 1 in another
process.
The problem comes from insufficient protection against cpuset changes,
which can cause get_page_from_freelist() to consider all zones as
non-eligible due to nodemask and/or current->mems_allowed. This was
masked in the past by sufficient retries, but since commit 682a3385e7
("mm, page_alloc: inline the fast path of the zonelist iterator") we fix
the preferred_zoneref once, and don't iterate over the whole zonelist in
further attempts, thus the only eligible zones might be placed in the
zonelist before our starting point and we always miss them.
A previous patch fixed this problem for current->mems_allowed. However,
cpuset changes also update the task's mempolicy nodemask. The fix has
two parts. We have to repeat the preferred_zoneref search when we
detect cpuset update by way of seqcount, and we have to check the
seqcount before considering OOM.
[akpm@linux-foundation.org: fix typo in comment]
Link: http://lkml.kernel.org/r/20170120103843.24587-5-vbabka@suse.cz
Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 16096c25bf upstream.
Ganapatrao Kulkarni reported that the LTP test cpuset01 in stress mode
triggers OOM killer in few seconds, despite lots of free memory. The
test attempts to repeatedly fault in memory in one process in a cpuset,
while changing allowed nodes of the cpuset between 0 and 1 in another
process.
One possible cause is that in the fast path we find the preferred
zoneref according to current mems_allowed, so that it points to the
middle of the zonelist, skipping e.g. zones of node 1 completely. If
the mems_allowed is updated to contain only node 1, we never reach it in
the zonelist, and trigger OOM before checking the cpuset_mems_cookie.
This patch fixes the particular case by redoing the preferred zoneref
search if we switch back to the original nodemask. The condition is
also slightly changed so that when the last non-root cpuset is removed,
we don't miss it.
Note that this is not a full fix, and more patches will follow.
Link: http://lkml.kernel.org/r/20170120103843.24587-3-vbabka@suse.cz
Fixes: 682a3385e7 ("mm, page_alloc: inline the fast path of the zonelist iterator")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea57485af8 upstream.
Patch series "fix premature OOM regression in 4.7+ due to cpuset races".
This is v2 of my attempt to fix the recent report based on LTP cpuset
stress test [1]. The intention is to go to stable 4.9 LTSS with this,
as triggering repeated OOMs is not nice. That's why the patches try to
be not too intrusive.
Unfortunately, while investigating, I found that modifying the testcase to
use per-VMA policies instead of per-task policies will bring the OOMs
back, but that seems to be a much older and harder-to-fix problem. I have
posted an RFC [2], but I believe that fixing the recent regressions has a
higher priority.
Longer-term we might try to think about how to fix the cpuset mess in a
better and less error-prone way. I was, for example, very surprised to learn
that cpuset updates change not only task->mems_allowed, but also the
nodemask of mempolicies. Until now I expected the parameter to
alloc_pages_nodemask() to be stable. I wonder why we then treat
cpusets specially in get_page_from_freelist() and distinguish HARDWALL
etc., when there's an unconditional intersection between mempolicy and
cpuset. I would expect the nodemask adjustment to be for saving overhead in
g_p_f(), but that clearly doesn't happen in the current form. So we
have both crazy complexity and overhead, AFAICS.
[1] https://lkml.kernel.org/r/CAFpQJXUq-JuEP=QPidy4p_=FN0rkH5Z-kfB4qBvsf6jMS87Edg@mail.gmail.com
[2] https://lkml.kernel.org/r/7c459f26-13a6-a817-e508-b65b903a8378@suse.cz
This patch (of 4):
Since commit c33d6c06f6 ("mm, page_alloc: avoid looking up the first
zone in a zonelist twice") we have a wrong check for NULL preferred_zone,
which can theoretically happen due to concurrent cpuset modification. We
check the zoneref pointer, which is never NULL, when we should check the zone
pointer. Also document this in the first_zones_zonelist() comment per Michal
Hocko.
Fixes: c33d6c06f6 ("mm, page_alloc: avoid looking up the first zone in a zonelist twice")
Link: http://lkml.kernel.org/r/20170120103843.24587-2-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Ganapatrao Kulkarni <gpkulkarni@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d51e9894d2 upstream.
Since commit be97a41b29 ("mm/mempolicy.c: merge alloc_hugepage_vma to
alloc_pages_vma") alloc_pages_vma() can potentially free a mempolicy by
mpol_cond_put() before accessing the embedded nodemask by
__alloc_pages_nodemask(). The commit log says it's so "we can use a
single exit path within the function" but that's clearly wrong. We can
still do that when doing mpol_cond_put() after the allocation attempt.
Make sure the mempolicy is not freed prematurely, otherwise
__alloc_pages_nodemask() can end up using a bogus nodemask, which could
lead e.g. to premature OOM.
Fixes: be97a41b29 ("mm/mempolicy.c: merge alloc_hugepage_vma to alloc_pages_vma")
Link: http://lkml.kernel.org/r/20170118141124.8345-1-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8310d48b12 upstream.
In commit 19be0eaffa ("mm: remove gup_flags FOLL_WRITE games from
__get_user_pages()"), the mm code was changed from unsetting FOLL_WRITE
after a COW was resolved to setting the (newly introduced) FOLL_COW
instead. Simultaneously, the check in gup.c was updated to still allow
writes with FOLL_FORCE set if FOLL_COW had also been set.
However, a similar check in huge_memory.c was forgotten. As a result,
remote memory writes to ro regions of memory backed by transparent huge
pages cause an infinite loop in the kernel (handle_mm_fault sets
FOLL_COW and returns 0 causing a retry, but follow_trans_huge_pmd bails
out immediately because `(flags & FOLL_WRITE) && !pmd_write(*pmd)` is
true).
While in this state the process is still SIGKILLable, but little else
works (e.g. no ptrace attach, no other signals). This is easily
reproduced with the following code (assuming thp are set to always):
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#define TEST_SIZE 5 * 1024 * 1024
int main(void) {
        int status;
        pid_t child;
        int fd = open("/proc/self/mem", O_RDWR);
        void *addr = mmap(NULL, TEST_SIZE, PROT_READ,
                          MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
        assert(addr != MAP_FAILED);
        pid_t parent_pid = getpid();
        if ((child = fork()) == 0) {
                void *addr2 = mmap(NULL, TEST_SIZE, PROT_READ | PROT_WRITE,
                                   MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
                assert(addr2 != MAP_FAILED);
                memset(addr2, 'a', TEST_SIZE);
                pwrite(fd, addr2, TEST_SIZE, (uintptr_t)addr);
                return 0;
        }
        assert(child == waitpid(child, &status, 0));
        assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
        return 0;
}
Fix this by updating follow_trans_huge_pmd in huge_memory.c analogously
to the update in gup.c in the original commit. The same pattern exists
in follow_devmap_pmd. However, we should not be able to reach that
check with FOLL_COW set, so add WARN_ONCE to make sure we notice if we
ever do.
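The check being mirrored has roughly this shape (a sketch close to, but not
verbatim, the helper used for huge pages): a forced write through a
write-protected pmd is only allowed once COW has been broken, which FOLL_COW
together with a dirtied pmd indicates.

#include <linux/mm.h>

static inline bool example_can_follow_write_pmd(pmd_t pmd, unsigned int flags)
{
        return pmd_write(pmd) ||
               ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pmd_dirty(pmd));
}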
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170106015025.GA38411@juliacomputing.com
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[Fixed differently in 4.10]
The fence needs to be cleared out, otherwise the following commit
might wait on a stale fence from the previous commit. This was fixed
as a side effect of 9626014258 (drm/fence: add in-fences support)
in kernel 4.10.
As this commit introduces new functionality and as such can not be
applied to stable, this patch is the minimal fix for the kernel 4.9
stable series.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Tested-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b8ac63847 upstream.
By failing to set the errno, we'd continue on to trying to set up the
RCL, and then oops on trying to dereference the tile_bo that binning
validation should have set up.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f2ff82e11 upstream.
We copy the unvalidated ioctl arguments from the user into kernel
temporary memory to run the validation from, to avoid a race where the
user updates the unvalidated contents in between validating them and
copying them into the validated BO.
However, in setting up the layout of the kernel side, we failed to
check one of the additions (the roundup() for shader_rec_offset)
against integer overflow, allowing a nearly MAX_UINT value of
bin_cl_size to cause us to under-allocate the temporary space that we
then copy_from_user into.
Reported-by: Murray McAllister <murray.mcallister@insomniasec.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7622b25543 upstream.
The underscores variant frees the pointers inside, while the
no-underscores variant calls underscores and then frees the struct.
Signed-off-by: Eric Anholt <eric@anholt.net>
Fixes: d8dbf44f13 ("drm/vc4: Make the CRTCs cooperate on allocating display lists.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdf35a6b22 upstream.
I noticed that the VT switch doesn't work any longer with a Dell
laptop with 1366x768 eDP when the machine is connected with a DP
monitor. It behaves as if VT were switched, but the graphics remain
frozen. Actually the keyboard works, so I could switch back to VT7
again.
I tried to track down the problem, and encountered a long story until
we reach this error:
- The machine is booted with video=1366x768 option (the distro
installer seems to add it as default).
- Recently, drm_helper_probe_single_connector_modes() deals with
cmdline modes, and it tries to create a new mode when no
matching mode is found.
- The drm_mode_create_from_cmdline_mode() creates a mode based on
either CVT or GTF according to the given cmdline mode; in our case,
it's 1366x768.
- Since neither CVT nor GTF can express the width 1366 due to
alignment, the resultant mode becomes 1368x768, slightly larger than
the given size.
- Later on, the atomic commit is performed, and in
drm_atomic_check_only(), the size of each plane is checked.
- The size check of 1366x768 fails due to the above, and eventually
the whole VT switch fails.
Historically, we've had manual fix-ups of 1368x768 in various
places via c09dedb7a5 ("drm/edid: Add a workaround for 1366x768 HD
panel"), but they have all been in drm_edid.c, applied when probing the
modes from EDID. To address the problem above, we need a similar hack
to the mode newly created from cmdline, manually adjusting the width
when the expected size is 1366 while we get 1368 instead.
Fixes: eaf99c749d ("drm: Perform cmdline mode parsing during...")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109145614.29454-1-tiwai@suse.de
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68f458eec7 upstream.
Instead of scheduling the work to handle the initial delayed event, use 1s
delay.
This delay should not be needed, but Optimus/nouveau will fail in a
mysterious way if the delayed event is handled as soon as possible like it
is done in drm_helper_probe_single_connector_modes() in case the poll
was enabled before.
Reverting 339fd36238 would bring back the 10 sec (!) delay to handle the
delayed event. Adding a 1 sec delay to the poll_work is enough to work around
the issue in Optimus setups and gives a shorter response when handling the
initial delayed event.
Fixes: 339fd36238 ("drm: drm_probe_helper: Fix output_poll_work scheduling")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
[danvet: Add FIXME to the comment to make it stick out more.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20170109143158.21917-1-peter.ujfalusi@ti.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd7c99142d upstream.
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21f5eda9b8 upstream.
Since ef1b144d ("tools/virtio/ringtest: fix run-on-all.sh to work
without /dev/cpu") run-on-all.sh uses seq 0 $HOST_AFFINITY as the list
of ids of the CPUs to run the command on (assuming ids of online CPUs
are consecutive and start from 0), where $HOST_AFFINITY is the highest
CPU id in the system previously determined using lscpu. This can fail
on systems with offline CPUs.
Instead let's use lscpu to determine the list of online CPUs.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: ef1b144d ("tools/virtio/ringtest: fix run-on-all.sh to work without
/dev/cpu")
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36b29eb30e upstream.
Fix to return a negative error code from the kthread_run() error
handling case instead of 0, as done elsewhere in this function.
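The corrected error path follows the standard kthread_run() idiom, sketched
here with generic names:

#include <linux/err.h>
#include <linux/kthread.h>

static int example_start_worker(int (*fn)(void *), void *data)
{
        struct task_struct *task;

        task = kthread_run(fn, data, "example_worker");
        if (IS_ERR(task))
                return PTR_ERR(task);   /* previously this path returned 0 */

        return 0;
}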
Fixes: cdd5de500b ("soc: ti: Add wkup_m3_ipc driver")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2dd8af00c upstream.
The commit 7c7289a404 ("spi: pxa2xx: Default thresholds to PXA
configuration") while splitting up CE4100 code obviously missed a break
statement in one chunk. Add it here.
It looks like we have no active user of CE4100, though it is better to fix
this late than never.
Fixes: commit 7c7289a404 ("spi: pxa2xx: Default thresholds to PXA configuration")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c9e6c2b2b upstream.
The PL330 DMA engine driver leaks a runtime PM reference after any terminated
DMA transaction. This patch fixes the issue by tracking the runtime PM state
of the device and making an additional call to pm_runtime_put() in the
terminate_all callback if needed.
Fixes: ae43b32891 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power Management support v12")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0026c7bfb upstream.
Clock control indirectly requires access to the MFC device, so in the
s5p_mfc_release() function call it only if we are sure that the device exists.
s5p_mfc_remove() calls s5p_mfc_final_pm(), which releases all PM-related
resources, including clocks, so any call to clock-related functions
is not valid after s5p_mfc_final_pm().
Fixes: d695c12 ("[media] media: s5p-mfc fix invalid memory access from
s5p_mfc_release()")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eadf081146 upstream.
A bugfix removed the two callers of s5p_cec_runtime_suspend
and s5p_cec_runtime_resume, leading to the return of a harmless
warning that I had previously fixed in commit aee8937089
("[media] s5p_cec: mark suspend/resume as __maybe_unused"):
staging/media/s5p-cec/s5p_cec.c:234:12: error: ‘s5p_cec_runtime_suspend’ defined but not used [-Werror=unused-function]
staging/media/s5p-cec/s5p_cec.c:242:12: error: ‘s5p_cec_runtime_resume’ defined but not used [-Werror=unused-function]
This adds the __maybe_unused annotations to the functions that
were not removed and that are now unused when CONFIG_PM
is disabled.
Fixes: 57b978ada0 ("[media] s5p-cec: fix system and runtime PM integration")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b2bed8912 upstream.
devm_ioremap_resource() returns error pointers, never NULL.
platform_get_resource() returns NULL on error, never error pointers.
The error code needs to be set as well; the current code returns
PTR_ERR(NULL), which is success.
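The two error conventions, and the need to return a real error code, can be
summarized in a small probe-time sketch (generic names, not the st-hva code):

#include <linux/platform_device.h>
#include <linux/ioport.h>
#include <linux/io.h>
#include <linux/err.h>

static int example_map_regs(struct platform_device *pdev, void __iomem **regs)
{
        struct resource *res;

        res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
        if (!res)                       /* NULL on error, never ERR_PTR */
                return -ENODEV;

        *regs = devm_ioremap_resource(&pdev->dev, res);
        if (IS_ERR(*regs))              /* ERR_PTR on error, never NULL */
                return PTR_ERR(*regs);

        return 0;
}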
Fixes: 57b2c0628b ("[media] st-hva: multi-format video encoder V4L2 driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jean-Christophe Trotin <jean-christophe.trotin@st.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7ec03e60ef upstream.
Function ite_set_carrier_params() uses variable use_demodulator after
having initialized it to false in some if branches, but this variable is
never set to true otherwise.
This bug has been found using clang -Wsometimes-uninitialized warning
flag.
Fixes: 620a32bba4 ("[media] rc: New rc-based ite-cir driver for
several ITE CIRs")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit df94121f02 upstream.
It's not necessary to free memory allocated with devm_kzalloc,
and using kfree leads to a double free.
Fixes: 7aae6e2df1 ("[media] Add GS1662 driver, a video serializer")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff681022c6 upstream.
Moving the pxa_camera driver from soc_camera lost the implied
VIDEO_V4L2 Kconfig dependency, and building the driver without
V4L2 results in a kernel that cannot link:
drivers/media/platform/pxa_camera.o: In function `pxa_camera_remove':
pxa_camera.c:(.text.pxa_camera_remove+0x10): undefined reference to `v4l2_clk_unregister'
pxa_camera.c:(.text.pxa_camera_remove+0x18): undefined reference to `v4l2_device_unregister'
drivers/media/platform/pxa_camera.o: In function `pxa_camera_probe':
pxa_camera.c:(.text.pxa_camera_probe+0x458): undefined reference to `v4l2_of_parse_endpoint'
drivers/media/v4l2-core/videobuf2-core.o: In function `__enqueue_in_driver':
drivers/media/v4l2-core/videobuf2-core.o: In function `vb2_core_streamon':
videobuf2-core.c:(.text.vb2_core_streamon+0x1b4): undefined reference to `v4l_vb2q_enable_media_source'
drivers/media/v4l2-core/videobuf2-v4l2.o: In function `vb2_ioctl_reqbufs':
videobuf2-v4l2.c:(.text.vb2_ioctl_reqbufs+0xc): undefined reference to `video_devdata'
This adds back an explicit dependency.
Fixes: 3050b99850 ("[media] media: platform: pxa_camera: move pxa_camera out of soc_camera")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9205e18b4 upstream.
devm_pinctrl_get() can fail so we should check for that.
Fixes: 0a6824bc10 ('[media] v4l2: blackfin: select proper pinctrl state in ppi_set_params if CONFIG_PINCTRL is enabled')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63447646ac upstream.
Since commit 4dffed5b3a ("rpmsg: Name rpmsg devices based on
channel id"), it is no longer possible for a firmware to register a
service twice (on different endpoints). The rpmsg_register_device()
function fails when calling device_add() for the second time, as the
second device has the same name as the first one already registered.
This is because the name is based only on the service name and so is
no longer unique. Previously the name was unique thanks to the use of
rpmsg_dev_index.
This patch adds the destination and source endpoint numbers to the
device name to create a unique identifier.
Fixes: 4dffed5b3a ("rpmsg: Name rpmsg devices based on channel id")
Acked-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Loic Pallardy <loic.pallardy@st.com>
[bjorn: flipped name and address in device name]
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73613b16cb upstream.
This patch fixes a bug in devfreq_add_device(). The devfreq device must
have the default governor. If find_devfreq_governor() returns an error,
devfreq_add_device() fails to add the devfreq instance.
Fixes: 1b5c1be2c8 (PM / devfreq: map devfreq drivers to governor using name)
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 32dd773169 upstream.
This patch fixes the wrong return value. If the devfreq driver requests a
wrong or unavailable governor, it fails. So, this patch returns the error
instead of -EPROBE_DEFER.
Fixes: 403e0689d2 (PM / devfreq: exynos: Add support of bus frequency of sub-blocks using passive governor)
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ffb5845658 upstream.
mpt3sas has a firmware failure where it can only handle one pass-through
ATA command at a time. If another comes in, contrary to the SAT
standard, it will hang until the first one completes (causing long
commands like secure erase to time out). The original fix was to block
the device when an ATA command came in, but this caused a regression
with
commit 669f044170
Author: Bart Van Assche <bart.vanassche@sandisk.com>
Date: Tue Nov 22 16:17:13 2016 -0800
scsi: srp_transport: Move queuecommand() wait code to SCSI core
So fix the original fix of the secure erase timeout by properly
returning SAM_STAT_BUSY like the SAT recommends. The original patch
also had a concurrency problem since scsih_qcmd is lockless at that
point (this is fixed by using atomic bitops to set and test the flag).
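The lockless flagging can be sketched as follows (the bit name and helpers
are hypothetical, not the mpt3sas ones): test_and_set_bit() atomically claims
the single pass-through slot, and anything that fails to claim it is answered
with SAM_STAT_BUSY.

#include <linux/bitops.h>
#include <linux/types.h>

#define EXAMPLE_ATA_BUSY        0       /* hypothetical bit number */

static bool example_claim_ata_slot(unsigned long *flags)
{
        return !test_and_set_bit(EXAMPLE_ATA_BUSY, flags);
}

static void example_release_ata_slot(unsigned long *flags)
{
        clear_bit(EXAMPLE_ATA_BUSY, flags);
}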
[mkp: addressed feedback wrt. test_bit and fixed whitespace]
Fixes: 18f6084a98 (mpt3sas: Fix secure erase premature termination)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Sreekanth Reddy <Sreekanth.Reddy@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fff5d99225 upstream.
On architectures like arm64, swiotlb is tied intimately to the core
architecture DMA support. In addition, ZONE_DMA cannot be disabled.
To aid debugging and catch devices not supporting DMA to memory outside
the 32-bit address space, add a kernel command line option
"swiotlb=noforce", which disables the use of bounce buffers.
If specified, trying to map memory that cannot be used with DMA will
fail, and a rate-limited warning will be printed.
Note that io_tlb_nslabs is set to 1, which is the minimal supported
value.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 524dabe1c6 upstream.
Commit b67a8b29df introduced logic to skip swiotlb allocation when all memory
is DMA accessible anyway.
While this is a great idea, __dma_alloc still calls swiotlb code unconditionally
to allocate memory when there is no CMA memory available. The swiotlb code is
called to ensure that we at least try get_free_pages().
Without initialization, swiotlb allocation code tries to access io_tlb_list
which is NULL. That results in a stack trace like this:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
[...]
[<ffff00000845b908>] swiotlb_tbl_map_single+0xd0/0x2b0
[<ffff00000845be94>] swiotlb_alloc_coherent+0x10c/0x198
[<ffff000008099dc0>] __dma_alloc+0x68/0x1a8
[<ffff000000a1b410>] drm_gem_cma_create+0x98/0x108 [drm]
[<ffff000000abcaac>] drm_fbdev_cma_create_with_funcs+0xbc/0x368 [drm_kms_helper]
[<ffff000000abcd84>] drm_fbdev_cma_create+0x2c/0x40 [drm_kms_helper]
[<ffff000000abc040>] drm_fb_helper_initial_config+0x238/0x410 [drm_kms_helper]
[<ffff000000abce88>] drm_fbdev_cma_init_with_funcs+0x98/0x160 [drm_kms_helper]
[<ffff000000abcf90>] drm_fbdev_cma_init+0x40/0x58 [drm_kms_helper]
[<ffff000000b47980>] vc4_kms_load+0x90/0xf0 [vc4]
[<ffff000000b46a94>] vc4_drm_bind+0xec/0x168 [vc4]
[...]
Thankfully swiotlb code just learned how to not do allocations with the FORCE_NO
option. This patch configures the swiotlb code to use that if we decide not to
initialize the swiotlb framework.
Fixes: b67a8b29df ("arm64: mm: only initialize swiotlb when necessary")
Signed-off-by: Alexander Graf <agraf@suse.de>
CC: Jisheng Zhang <jszhang@marvell.com>
CC: Geert Uytterhoeven <geert+renesas@glider.be>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c8a946bf3 upstream.
The arm64 __page_to_voff() macro takes a parameter called 'page', and
also refers to 'struct page'. Thus, if the value passed in is not
called 'page', we'll refer to the wrong struct name (which might not
exist).
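The pitfall is purely textual macro expansion, illustrated by the
hypothetical macros below: a parameter named 'page' is substituted even
inside 'struct page', so the fix is simply to rename the parameter.

struct page;    /* forward declaration, for illustration only */

/* Broken: EXAMPLE_SIZE_BROKEN(p) expands sizeof(struct page) into sizeof(struct p). */
#define EXAMPLE_SIZE_BROKEN(page)       (sizeof(struct page))

/* Fixed: a parameter name that cannot collide with the struct tag. */
#define EXAMPLE_SIZE_FIXED(kaddr)       (sizeof(struct page))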
Fixes: 3fa72fe9c6 ("arm64: mm: fix __page_to_voff definition")
Acked-by: Mark Rutland <mark.rutland@arm.com>
Suggested-by: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
Signed-off-by: Oleksandr Andrushchenko <Oleksandr_Andrushchenko@epam.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d38de6564 upstream.
Verbs providers may perform house-keeping on the Send Queue during
each signaled send completion. It is necessary therefore for a verbs
consumer (like xprtrdma) to occasionally force a signaled send
completion if it runs unsignaled most of the time.
xprtrdma does not require signaled completions for Send or FastReg
Work Requests, but does signal some LocalInv Work Requests. To
ensure that Send Queue house-keeping can run before the Send Queue
is more than half-consumed, xprtrdma forces a signaled completion
on occasion by counting the number of Send Queue Entries it
consumes. It currently does this by counting each ib_post_send as
one Entry.
Commit c9918ff56d ("xprtrdma: Add ro_unmap_sync method for FRWR")
introduced the ability for frwr_op_unmap_sync to post more than one
Work Request with a single post_send. Thus the underlying assumption
of one Send Queue Entry per ib_post_send is no longer true.
Also, FastReg Work Requests are currently never signaled. They
should be signaled once in a while, just as Send is, to keep the
accounting of consumed SQEs accurate.
While we're here, convert the CQCOUNT macros to the currently
preferred kernel coding style, which is inline functions.
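The style change itself is generic and can be sketched independently of the
xprtrdma internals (the structure, fields, and names below are hypothetical):
a counter macro becomes a type-checked static inline that also re-arms itself.

#include <linux/types.h>

struct example_ep {
        unsigned int send_count;        /* SQEs left before forcing a signal */
        unsigned int send_batch;        /* re-arm value */
};

/* Old style: an expression macro with no type checking. */
#define EXAMPLE_NEEDS_SIGNAL(ep)        (--(ep)->send_count == 0)

/* Preferred style: a static inline function. */
static inline bool example_needs_signal(struct example_ep *ep)
{
        if (--ep->send_count == 0) {
                ep->send_count = ep->send_batch;
                return true;            /* force a signaled completion */
        }
        return false;
}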
Fixes: c9918ff56d ("xprtrdma: Add ro_unmap_sync method for FRWR")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 124f930b8c upstream.
... otherwise the crypto stack will align it for us with a GFP_ATOMIC
allocation and a memcpy() -- see skcipher_walk_first().
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe2ed42517 upstream.
sparse says:
fs/ceph/inode.c:308:36: warning: incorrect type in argument 1 (different base types)
fs/ceph/inode.c:308:36: expected unsigned int [unsigned] [usertype] a
fs/ceph/inode.c:308:36: got restricted __le32 [usertype] frag
fs/ceph/inode.c:308:46: warning: incorrect type in argument 2 (different base types)
fs/ceph/inode.c:308:46: expected unsigned int [unsigned] [usertype] b
fs/ceph/inode.c:308:46: got restricted __le32 [usertype] frag
We need to convert these values to host-endian before calling the
comparator.
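The shape of the fix, reduced to a sketch (the real comparator works on
pointers for use with sort(); this one just shows where the byte swap
belongs):

#include <linux/types.h>
#include <asm/byteorder.h>

static int example_frag_cmp(u32 a, u32 b)
{
        return a < b ? -1 : (a > b ? 1 : 0);
}

static int example_compare_wire_frags(__le32 x, __le32 y)
{
        /* convert to host endianness before comparing */
        return example_frag_cmp(le32_to_cpu(x), le32_to_cpu(y));
}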
Fixes: a407846ef7 ("ceph: don't assume frag tree splits in mds reply are sorted")
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c341ee328 upstream.
try_get_cap_refs can be used as a condition in a wait_event* calls.
This is all fine until it has to call __ceph_do_pending_vmtruncate,
which in turn acquires the i_truncate_mutex. This leads to a situation
in which a task's state is !TASK_RUNNING and at the same time it's
trying to acquire a sleeping primitive. In essence, nested sleeping
primitives are being used. This causes the following warning:
WARNING: CPU: 22 PID: 11064 at kernel/sched/core.c:7631 __might_sleep+0x9f/0xb0()
do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff8109447d>] prepare_to_wait_event+0x5d/0x110
ipmi_msghandler tcp_scalable ib_qib dca ib_mad ib_core ib_addr ipv6
CPU: 22 PID: 11064 Comm: fs_checker.pl Tainted: G O 4.4.20-clouder2 #6
Hardware name: Supermicro X10DRi/X10DRi, BIOS 1.1a 10/16/2015
0000000000000000 ffff8838b416fa88 ffffffff812f4409 ffff8838b416fad0
ffffffff81a034f2 ffff8838b416fac0 ffffffff81052b46 ffffffff81a0432c
0000000000000061 0000000000000000 0000000000000000 ffff88167bda54a0
Call Trace:
[<ffffffff812f4409>] dump_stack+0x67/0x9e
[<ffffffff81052b46>] warn_slowpath_common+0x86/0xc0
[<ffffffff81052bcc>] warn_slowpath_fmt+0x4c/0x50
[<ffffffff8109447d>] ? prepare_to_wait_event+0x5d/0x110
[<ffffffff8109447d>] ? prepare_to_wait_event+0x5d/0x110
[<ffffffff8107767f>] __might_sleep+0x9f/0xb0
[<ffffffff81612d30>] mutex_lock+0x20/0x40
[<ffffffffa04eea14>] __ceph_do_pending_vmtruncate+0x44/0x1a0 [ceph]
[<ffffffffa04fa692>] try_get_cap_refs+0xa2/0x320 [ceph]
[<ffffffffa04fd6f5>] ceph_get_caps+0x255/0x2b0 [ceph]
[<ffffffff81094370>] ? wait_woken+0xb0/0xb0
[<ffffffffa04f2c11>] ceph_write_iter+0x2b1/0xde0 [ceph]
[<ffffffff81613f22>] ? schedule_timeout+0x202/0x260
[<ffffffff8117f01a>] ? kmem_cache_free+0x1ea/0x200
[<ffffffff811b46ce>] ? iput+0x9e/0x230
[<ffffffff81077632>] ? __might_sleep+0x52/0xb0
[<ffffffff81156147>] ? __might_fault+0x37/0x40
[<ffffffff8119e123>] ? cp_new_stat+0x153/0x170
[<ffffffff81198cfa>] __vfs_write+0xaa/0xe0
[<ffffffff81199369>] vfs_write+0xa9/0x190
[<ffffffff811b6d01>] ? set_close_on_exec+0x31/0x70
[<ffffffff8119a056>] SyS_write+0x46/0xa0
This happens since wait_event_interruptible can interfere with the
mutex locking code, since they both fiddle with the task state.
Fix the issue by using the newly-added nested blocking infrastructure
in 61ada528de ("sched/wait: Provide infrastructure to deal with
nested blocking").
Link: https://lwn.net/Articles/628628/
Signed-off-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90f92c631b upstream.
The following patch was sketched by Russell in response to my
crashes on the PB11MPCore after the patch for software-based
privileged no-access support for ARMv8.1. See this thread:
http://marc.info/?l=linux-arm-kernel&m=144051749807214&w=2
I am unsure what is going on; I suspect everyone involved in
the discussion is, too. I just want to repost this to get the
discussion restarted, as I still have to apply this patch
with every kernel iteration to get my PB11MPCore Realview
running.
Testing by Neil Armstrong on the Oxnas NAS has revealed that
this bug also exists on that widely deployed hardware, so
we are probably currently regressing all ARM11MPCore systems.
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: a5e090acbf ("ARM: software-based priviledged-no-access support")
Tested-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0e8faa7a5 upstream.
This function clearly never worked and always returns true,
as pointed out by gcc-7:
arch/arm/mach-ux500/pm.c: In function 'prcmu_is_cpu_in_wfi':
arch/arm/mach-ux500/pm.c:137:212: error: ?:
using integer constants in boolean context, the expression
will always evaluate to 'true' [-Werror=int-in-bool-context]
With the added braces, the condition actually makes sense.
Fixes: 34fe6f107e ("mfd : Check if the other db8500 core is in WFI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1ea6af3216 upstream.
This fixes commit ab8dd3aed0 ("ARM: DTS: Add minimal Support for
Logic PD DM3730 SOM-LV") where the Card Detect and Write Protect
pins were improperly configured.
Fixes: ab8dd3aed0 ("ARM: DTS: Add minimal Support for Logic PD DM3730 SOM-LV")
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ab5c2b662 upstream.
This patch fixes the following error:
sgtl5000 0-000a: Error reading chip id -6
imx-sgtl5000 sound: ASoC: CODEC DAI sgtl5000 not registered
imx-sgtl5000 sound: snd_soc_register_card failed (-517)
The problem was that the pinctrl group was linked to the sound driver
instead of the codec node. Since the codec is probed first, the sys_mclk
was missing and it would therefore fail to initialize.
Fixes: b32e700256 ("ARM: dts: imx: add Boundary Devices Nitrogen6_Max board")
Signed-off-by: Gary Bisson <gary.bisson@boundarydevices.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3d37d41a14 upstream.
Commit d1f3156fc8 ("ARM: dts: omap2: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: d1f3156fc8 ("ARM: dts: omap2: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 23ab4c6183 upstream.
Commit 008a2ebcd6 ("ARM: dts: omap3: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: 008a2ebcd6 ("ARM: dts: omap3: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce95077d0c upstream.
Commit 75813028bb ("ARM: dts: am4372: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: 75813028bb ("ARM: dts: am4372: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9faa84cb9 upstream.
Commit 76a8548ea9 ("ARM: dts: omap5: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: 76a8548ea9 ("ARM: dts: omap5: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6c565d1a63 upstream.
Commit da6269e7e3 ("ARM: dts: omap4: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: da6269e7e3 ("ARM: dts: omap4: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d8d6d3f2f upstream.
Commit f8bf01611c ("ARM: dts: am33xx: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: f8bf01611c ("ARM: dts: am33xx: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9536fd30d4 upstream.
Commit 76155b378c ("ARM: dts: dm814x: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if this isn't present in the DTB
since the decompressor relies on a pre-existing chosen node to be
available to insert the command line and merge other ATAGS info.
Fixes: 76155b378c ("ARM: dts: dm814x: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6ed80b3a23 upstream.
Commit 06bfb9c199 ("ARM: dts: dm816x: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if one isn't already present in the
DTB, since the decompressor relies on a pre-existing chosen node being
available to insert the command line and merge other ATAGS info.
Fixes: 06bfb9c199 ("ARM: dts: dm816x: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f6c857b12 upstream.
Commit 55871eb6e2 ("ARM: dts: dra7: Remove skeleton.dtsi usage")
removed the skeleton.dtsi usage since we want to get rid of it.
But this can cause issues when booting a kernel with a boot-loader
that doesn't create a chosen node if one isn't already present in the
DTB, since the decompressor relies on a pre-existing chosen node being
available to insert the command line and merge other ATAGS info.
Fixes: 55871eb6e2 ("ARM: dts: dra7: Remove skeleton.dtsi usage")
Reported-by: Pali Rohar <pali.rohar@gmail.com>
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7882a26d2e upstream.
It's going to be used as a temporary buffer for in-place en/decryption
with ceph_crypt() instead of on-stack buffers, so rename to enc_buf.
Ensure alignment to avoid GFP_ATOMIC allocations in the crypto stack.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a45f795c65 upstream.
Starting with 4.9, kernel stacks may be vmalloced and therefore not
guaranteed to be physically contiguous; the new CONFIG_VMAP_STACK
option is enabled by default on x86. This makes it invalid to use
on-stack buffers with the crypto scatterlist API, as sg_set_buf()
expects a logical address and won't work with vmalloced addresses.
There isn't a different (e.g. kvec-based) crypto API we could switch
net/ceph/crypto.c to and the current scatterlist.h API isn't getting
updated to accommodate this use case. Allocating a new header and
padding for each operation is a non-starter, so do the en/decryption
in-place on a single pre-assembled (header + data + padding) heap
buffer. This is explicitly supported by the crypto API:
"... the caller may provide the same scatter/gather list for the
plaintext and cipher text. After the completion of the cipher
operation, the plaintext data is replaced with the ciphertext data
in case of an encryption and vice versa for a decryption."
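As a rough illustration of that in-place pattern, here is a minimal
sketch built on the kernel skcipher API, assuming a synchronous cipher
and a kmalloc'ed buffer; the function and variable names are made up
and this is not the actual net/ceph/crypto.c code:
#include <crypto/skcipher.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>
/* Encrypt 'len' bytes of a pre-assembled heap buffer in place.
 * Illustrative sketch only; assumes a synchronous skcipher tfm. */
static int example_encrypt_in_place(struct crypto_skcipher *tfm,
                                    void *buf, unsigned int len, u8 *iv)
{
        struct scatterlist sg;
        struct skcipher_request *req;
        int ret;
        /* One scatterlist entry covering the whole heap buffer. */
        sg_init_one(&sg, buf, len);
        req = skcipher_request_alloc(tfm, GFP_NOFS);
        if (!req)
                return -ENOMEM;
        /* Same sg for source and destination: the ciphertext overwrites
         * the plaintext in place, as the crypto API explicitly allows. */
        skcipher_request_set_callback(req, 0, NULL, NULL);
        skcipher_request_set_crypt(req, &sg, &sg, len, iv);
        ret = crypto_skcipher_encrypt(req);
        skcipher_request_free(req);
        return ret;
}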
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 36721ece1e upstream.
Pass what's going to be encrypted - that's msg_b, not ticket_blob.
ceph_x_encrypt_buflen() returns the upper bound, so this doesn't change
the maxlen calculation, but makes it a bit clearer.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 864db9295b upstream.
The current Alps SS5 (SS4 v2) code generates bogus TouchPad events when
TrackStick packets are processed.
This causes the xorg synaptics driver to print
"unable to find touch point 0" and
"BUG: triggered 'if (priv->num_active_touches > priv->num_slots)'"
messages. It also causes unexpected TouchPad button release and re-click
event sequences if the TrackStick is moved while holding a TouchPad
button.
This commit corrects the problem by adjusting alps_process_packet_ss4_v2()
so that it only sends TrackStick reports when processing TrackStick
packets.
Reviewed-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Paul Donohue <linux-kernel@PaulSD.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ad9e202aa1 upstream.
We cannot preserve partial fields for hardware breakpoints, because
the values written by userspace to the hardware breakpoint
registers can't subsequently be recovered intact from the hardware.
So, just reject attempts to write incomplete fields with -EINVAL.
Fixes: 478fcb2cdb ("arm64: Debugging support")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aeb1f39d81 upstream.
This patch adds an explicit __reserved[] field to user_fpsimd_state
to replace what was previously unnamed padding.
This ensures that data in this region are propagated across
assignment rather than being left possibly uninitialised at the
destination.
Fixes: 60ffc30d56 ("arm64: Exception handling")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a672401c00 upstream.
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Fixes: 5d220ff942 ("arm64: Better native ptrace support for compat tasks")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9dd73f72f2 upstream.
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
Fixes: 766a85d7bc ("arm64: ptrace: add NT_ARM_SYSTEM_CALL regset")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a17b876b5 upstream.
Ensure that if userspace supplies insufficient data to
PTRACE_SETREGSET to fill all the registers, the thread's old
registers are preserved.
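The semantics fixed by these regset patches can be exercised from
userspace with a short test program; the following is only an
illustrative sketch (it uses NT_PRSTATUS for simplicity, while the
fixed regsets are arch-specific ones such as the arm64 debug
registers), not a kernel selftest:
#include <elf.h>
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>
int main(void)
{
        struct user_regs_struct regs;
        struct iovec iov = { .iov_base = &regs, .iov_len = sizeof(regs) };
        pid_t pid = fork();
        if (pid == 0) {                         /* tracee */
                ptrace(PTRACE_TRACEME, 0, NULL, NULL);
                raise(SIGSTOP);
                _exit(0);
        }
        waitpid(pid, NULL, 0);
        ptrace(PTRACE_GETREGSET, pid, (void *)NT_PRSTATUS, &iov);
        /* Write back only the first half of the regset; the kernel is
         * expected to preserve the untouched tail, not zero it. */
        iov.iov_len = sizeof(regs) / 2;
        if (ptrace(PTRACE_SETREGSET, pid, (void *)NT_PRSTATUS, &iov))
                perror("PTRACE_SETREGSET");
        kill(pid, SIGKILL);
        waitpid(pid, NULL, 0);
        return 0;
}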
Fixes: 478fcb2cdb ("arm64: Debugging support")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <Will.Deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d9e8f71b9 upstream.
Generally, taking an unexpected exception should be a fatal event, and
bad_mode is intended to cater for this. However, it should be possible
to contain unexpected synchronous exceptions from EL0 without bringing
the kernel down, by sending a SIGILL to the task.
We tried to apply this approach in commit 9955ac47f4 ("arm64:
don't kill the kernel on a bad esr from el0"), by sending a signal for
any bad_mode call resulting from an EL0 exception.
However, this also applies to other unexpected exceptions, such as
SError and FIQ. The entry paths for these exceptions branch to bad_mode
without configuring the link register, and have no kernel_exit. Thus, if
we take one of these exceptions from EL0, bad_mode will eventually
return to the original user link register value.
This patch fixes this by introducing a new bad_el0_sync handler to cater
for the recoverable case, and restoring bad_mode to its original state,
whereby it calls panic() and never returns. The recoverable case
branches to bad_el0_sync with a bl, and returns to userspace via the
usual ret_to_user mechanism.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 9955ac47f4 ("arm64: don't kill the kernel on a bad esr from el0")
Reported-by: Mark Salter <msalter@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 43849785e1 upstream.
Read accesses to the SPI flash are broken on da850-evm, i.e. the data
read is not what is actually programmed on the flash.
According to the datasheet for the M25P64 part present on the da850-evm,
if the SPI frequency is higher than 20MHz then the READ command is not
usable anymore and only the FAST_READ command can be used to read data.
This commit specifies in the DTS that we should use FAST_READ command
instead of the READ command.
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Fabien Parent <fparent@baylibre.com>
[nsekhar@ti.com: subject line adjustment]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
commit 87cb12910a upstream.
AHCI provides the PORTS_IMPL register to let the software know which
ports are supported. The register must be initialized by the bootloader.
However, in some cases u-boot doesn't properly initialize this value
(for example, if it is not compiled with SATA support or if the SATA
initialization fails).
The DTS entry "ports-implemented" can be used to override the value in
PORTS_IMPL.
Without this patch the SATA will not work in the following two cases:
* if there has been a failure to initialize SATA in u-boot.
* if the ahci_platform module has been removed and re-inserted. The reason is
that the content of PORTS_IMPL is lost after the module is removed.
I suspect that it's because the controller is reset by the hwmod.
Signed-off-by: Jean-Jacques Hiblot <jjhiblot@ti.com>
Acked-by: Roger Quadros <rogerq@ti.com>
[tony@atomide.com: updated comments with what goes wrong]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6df8c9d80a upstream.
sparse says:
fs/ceph/mds_client.c:291:23: warning: restricted __le32 degrades to integer
fs/ceph/mds_client.c:293:28: warning: restricted __le32 degrades to integer
fs/ceph/mds_client.c:294:28: warning: restricted __le32 degrades to integer
fs/ceph/mds_client.c:296:28: warning: restricted __le32 degrades to integer
The op value is __le32, so we need to convert it before comparing it.
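The idiom is simply to run the wire value through le32_to_cpu() before
any comparison; a tiny sketch with made-up names, not the actual
mds_client.c code:
#include <linux/types.h>
#include <asm/byteorder.h>
#define EXAMPLE_SESSION_OPEN 1          /* made-up host-endian constant */
/* Convert the little-endian wire value before comparing it against a
 * host-endian constant; comparing the raw __le32 is what sparse flags. */
static inline bool example_op_is_open(__le32 wire_op)
{
        return le32_to_cpu(wire_op) == EXAMPLE_SESSION_OPEN;
}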
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ddc37832a1 upstream.
On APQ8060, the kernel crashes in arch_hw_breakpoint_init, taking an
undefined instruction trap within write_wb_reg. This is because Scorpion
CPUs erroneously appear to set DBGPRSR.SPD when WFI is issued, even if
the core is not powered down. When DBGPRSR.SPD is set, breakpoint and
watchpoint registers are treated as undefined.
It's possible to trigger similar crashes later on from userspace, by
requesting the kernel to install a breakpoint or watchpoint, as we can
go idle at any point between the reset of the debug registers and their
later use. This has always been the case.
Given that this has always been broken, no-one has complained until now,
and there is no clear workaround, disable hardware breakpoints and
watchpoints on Scorpion to avoid these issues.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce1ca7d2d1 upstream.
In rdma_read_chunk_frmr() when ib_post_send() fails, the error code path
invokes ib_dma_unmap_sg() to unmap the sg list. It then invokes
svc_rdma_put_frmr() which in turn tries to unmap the same sg list through
ib_dma_unmap_sg() again. This second unmap is invalid and could lead to
problems when the iova being unmapped is subsequently reused. Remove
the call to unmap in rdma_read_chunk_frmr() and let svc_rdma_put_frmr()
handle it.
Fixes: 412a15c0fe ("svcrdma: Port to new memory registration API")
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cb51a15b5 upstream.
When replaying the journal it can happen that a journal entry points to
a garbage collected node.
This is the case when a power-cut occurred between a garbage collect run
and a commit. In such a case nodes have to be read using the failable
read functions to detect whether the found node matches what we expect.
One corner case was forgotten: when the journal contains an entry to
remove an inode, all xattrs have to be removed too. UBIFS models xattrs
like directory entries, so the TNC code iterates over all xattrs of the
inode and removes them too. This code re-uses the functions for walking
directories and calls ubifs_tnc_next_ent().
ubifs_tnc_next_ent() expects to be used only after the journal replay
and aborts when a node does not match the expected result. This
behavior can render a UBIFS volume unmountable after a power-cut when
xattrs are used.
Fix this issue by using the failable read functions in
ubifs_tnc_next_ent() too when replaying the journal.
Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Reported-by: Rock Lee <rockdotlee@gmail.com>
Reviewed-by: David Gstir <david@sigma-star.at>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eeb0d56fab upstream.
In AP (or VLAN) mode, when unicast 802.11 packets are received,
they might actually be multicast after conversion. In this case
the fast-RX path didn't handle them properly to send them back
to the wireless medium. Implement that by copying the SKB and
sending it back out.
The possible alternative would be to just punt the packet back
to the regular (slow) RX path, but since we have almost all of
the required code here already it's not so complicated to add
here. Punting it back would also mean acquiring the spinlock,
which would be bad for the stated purpose of the fast-RX path,
to enable well-performing parallel RX.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fc1ffd6cb3 upstream.
During code inspection, while investigating the following stack trace
seen on one of the test setups, we found that there was a possibility
of a memory leak because the driver was not unwinding properly on an
error path. This issue has not been reproduced in a test environment
or on a customer setup.
Here's the stack trace that was seen.
[1469877.797315] Call Trace:
[1469877.799940] [<ffffffffa03ab6e9>] qla2x00_mem_alloc+0xb09/0x10c0 [qla2xxx]
[1469877.806980] [<ffffffffa03ac50a>] qla2x00_probe_one+0x86a/0x1b50 [qla2xxx]
[1469877.814013] [<ffffffff813b6d01>] ? __pm_runtime_resume+0x51/0xa0
[1469877.820265] [<ffffffff8157c1f5>] ? _raw_spin_lock_irqsave+0x25/0x90
[1469877.826776] [<ffffffff8157cd2d>] ? _raw_spin_unlock_irqrestore+0x6d/0x80
[1469877.833720] [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.839885] [<ffffffff8157cd0c>] ? _raw_spin_unlock_irqrestore+0x4c/0x80
[1469877.846830] [<ffffffff81319b9c>] local_pci_probe+0x4c/0xb0
[1469877.852562] [<ffffffff810741d1>] ? preempt_count_sub+0xb1/0x100
[1469877.858727] [<ffffffff81319c89>] pci_call_probe+0x89/0xb0
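The fix follows the usual goto-based unwind pattern, releasing
everything allocated so far when a later allocation fails; the
following is a generic, hypothetical sketch rather than the actual
qla2x00_mem_alloc() hunk:
#include <linux/slab.h>
#define EXAMPLE_SIZE_A 64               /* made-up sizes */
#define EXAMPLE_SIZE_B 128
#define EXAMPLE_SIZE_C 256
struct example_ctx {                    /* made-up context structure */
        void *a, *b, *c;
};
static int example_mem_alloc(struct example_ctx *ctx)
{
        ctx->a = kzalloc(EXAMPLE_SIZE_A, GFP_KERNEL);
        if (!ctx->a)
                return -ENOMEM;
        ctx->b = kzalloc(EXAMPLE_SIZE_B, GFP_KERNEL);
        if (!ctx->b)
                goto free_a;
        ctx->c = kzalloc(EXAMPLE_SIZE_C, GFP_KERNEL);
        if (!ctx->c)
                goto free_b;
        return 0;
free_b:                                 /* unwind in reverse order */
        kfree(ctx->b);
        ctx->b = NULL;
free_a:
        kfree(ctx->a);
        ctx->a = NULL;
        return -ENOMEM;
}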
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: Fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 178f358208 upstream.
IBM bit 31 (for the rest of us - bit 0) is a reserved field in the
instruction definition of mtspr and mfspr. Hardware is encouraged to
(and does) ignore it.
As a result, if userspace executes an mtspr DSCR with the reserved bit
set, we get a DSCR facility unavailable exception. The kernel fails to
match against the expected value/mask, and we silently return to
userspace to try and re-execute the same mtspr DSCR instruction. We
loop forever until the process is killed.
We should do something here, and it seems mirroring what hardware does
is the better option vs killing the process. While here, relax the
matching of mfspr PVR too.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b34ca60148 upstream.
Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the checkpointed registers, the thread's old checkpointed
registers are preserved.
Fixes: 9d3918f7c0 ("powerpc/ptrace: Enable support for NT_PPC_CVSX")
Fixes: 19cbcbf75a ("powerpc/ptrace: Enable support for NT_PPC_CFPR")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99dfe80a2a upstream.
Ensure that if userspace supplies insufficient data to PTRACE_SETREGSET
to fill all the registers, the thread's old registers are preserved.
Fixes: c6e6771b87 ("powerpc: Introduce VSX thread_struct and CONFIG_VSX")
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d89f473ff6 upstream.
Use 0x10012 event code for PM_BRU_CMPL event in power9 event list
instead of current 0x40060.
Fixes: 34922527a2 ('powerpc/perf: Add power9 event list macros for generic and cache events')
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9728a7c8ab upstream.
The icp-opal call is missing the code from icp-native to recover
interrupts snatched by KVM. Without that, when running KVM, we can
get into a situation where an interrupt is lost and the CPU stuck
with an elevated CPPR.
Also harden replay by always checking the return from opal_int_eoi().
Fixes: d74361881f ("powerpc/xics: Add ICP OPAL backend")
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1193e6aeec upstream.
Dmitry Vyukov reported that the syzkaller fuzzer triggered a
deadlock in the vgic setup code when an error was detected, as
the cleanup code tries to take a lock that is already held by
the setup code.
The fix is to avoid retaking the lock when cleaning up, by
telling the cleanup function that we already hold it.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0447819741 upstream.
kvm_s390_get_machine() populates the facility bitmap by copying bytes
from the host results that are stored in a 256 byte array in the prefix
page. The KVM code does use the size of the target buffer (2k), thus
copying and exposing unrelated kernel memory (mostly machine check
related logout data).
Let's use the size of the source buffer instead. This is ok, as the
target buffer will always be greater than or equal to the source buffer, as
the KVM internal buffers (and thus S390_ARCH_FAC_LIST_SIZE_BYTE) cover
the maximum possible size that is allowed by STFLE, which is 256
doublewords. All structures are zero allocated so we can leave bytes
256-2047 unchanged.
Add a similar fix for kvm_arch_init_vm().
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
[found with smatch]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a272466349 upstream.
Remove the usage of module functions to make this driver compile
again. Otherwise an include of linux/module.h would be needed.
Fixes: 024366750c ("mtd: nand: xway: convert to normal platform driver")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73529c872a upstream.
The xway_nand driver accesses the ltq_ebu_membase symbol which is not
exported. This symbol also should not get exported, and we should
handle the EBU interface in a better way later. This quick fix just
deactivates support for building the driver as a module.
Fixes: 99f2b10792 ("mtd: lantiq: Add NAND support on Lantiq XWAY SoC.")
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cf9e1672a6 upstream.
The semantics of NR_IRQS differ depending on whether the SPARSE_IRQ
option is disabled or enabled; in the latter case IRQs are allocated
starting at least from the value specified by NR_IRQS and going
upwards. So the check of (irq >= NR_IRQS) to decide about an error
code returned by platform_get_irq() is completely invalid; don't
attempt to overrule the irq subsystem in the driver.
The change fixes LPC32xx NAND MLC driver initialization on boot.
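With SPARSE_IRQ the only meaningful error indication from
platform_get_irq() is a negative return value, so the probe path should
look roughly like the sketch below (generic, not the exact driver code):
#include <linux/platform_device.h>
static int example_probe(struct platform_device *pdev)
{
        int irq = platform_get_irq(pdev, 0);
        /* A negative value is the only error case; comparing against
         * NR_IRQS is meaningless once SPARSE_IRQ is enabled. */
        if (irq < 0)
                return irq;
        /* request_irq(irq, ...) and the rest of probe would follow. */
        return 0;
}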
Fixes: 8cb17b5ed0 ("irqchip: Add LPC32xx interrupt controller driver")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Sylvain Lemieux <slemieux.tyco@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05a974efa4 upstream.
From 4.9 onwards we should really avoid using the stack here, as it
will not be DMA-able on various platforms. This changes the buffers
that were already present at the time 4.9 was released. This should go
into stable as well.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01167c7b9c upstream.
According to the code, the intention is to append 8 SCK cycles
instead of 4 at the end of an MMC_STOP_TRANSMISSION command. But this
will never happen because it's an AC command, not an ADTC command.
So fix this by moving the statement into the right function.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: e4243f13d1 (mmc: mxs-mmc: add mmc host driver for i.MX23/28)
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e1d070c379 upstream.
Commit e5bbf30733 ("mmc: sdhci-acpi: Ensure connected devices are
powered when probing") introduced code to powerup any acpi child
nodes listed in the dstd. But some dstd-s list all possible devices
used on some board variants, while reporting if the device is actually
present and enabled in the status field of the device.
So we end up calling the acpi _PS0 (power-on) method for devices which
are not actually present. This does not always end well, e.g. on my
cube iwork8 air tablet, this results in freezing the entire tablet as
soon as the r8723bs module is loaded.
This commit fixes this by checking the child device's status.present
and status.enabled bits and only call acpi_device_fix_up_power()
if both are set.
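The shape of that check is roughly the following sketch (simplified
from the description above, not copied from sdhci-acpi.c; the helper
name is made up):
#include <linux/acpi.h>
/* Only power up children that the DSDT marks as both present and
 * enabled; skip devices that exist only as optional table entries. */
static void example_power_up_children(struct acpi_device *device)
{
        struct acpi_device *child;
        list_for_each_entry(child, &device->children, node)
                if (child->status.present && child->status.enabled)
                        acpi_device_fix_up_power(child);
}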
Fixes: e5bbf30733 ("mmc: sdhci-acpi: Ensure connected devices are powered when probing")
BugLink: https://github.com/hadess/rtl8723bs/issues/80
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a546af50e upstream.
Make sure to check for short control transfers in order to avoid parsing
uninitialised buffer data and leaking it to user space.
Note that the backlight and macro-mode buffer constraints are kept as
loose as possible in order to avoid any regressions should the current
buffer sizes be larger than necessary.
Fixes: 6f78193ee9 ("HID: corsair: Add Corsair Vengeance K90 driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6d104af38b upstream.
Not all platforms support DMA to the stack, and specifically since v4.9
this is no longer supported on x86 with VMAP_STACK either.
Note that the macro-mode buffer was larger than necessary.
Fixes: 6f78193ee9 ("HID: corsair: Add Corsair Vengeance K90 driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51ebfc92b7 upstream.
A PCI-to-PCIe bridge (a "reverse bridge") has a PCI or PCI-X primary
interface and a PCI Express secondary interface. The PCIe interface is a
Downstream Port that originates a Link. See the "PCI Express to PCI/PCI-X
Bridge Specification", rev 1.0, sections 1.2 and A.6.
The bug report below involves a PCI-to-PCIe bridge and a PCIe switch below
the bridge:
00:1e.0 Intel 82801 PCI Bridge to [bus 01-0a]
01:00.0 Pericom PI7C9X111SL PCIe-to-PCI Reversible Bridge to [bus 02-0a]
02:00.0 Pericom Device 8608 [PCIe Upstream Port] to [bus 03-0a]
03:01.0 Pericom Device 8608 [PCIe Downstream Port] to [bus 0a]
01:00.0 is configured as a PCI-to-PCIe bridge (despite the name printed by
lspci). As we traverse a PCIe hierarchy, device connections alternate
between PCIe Links and internal Switch logic. Previously we did not
recognize that 01:00.0 had a secondary link, so we thought the 02:00.0
Upstream Port *did* have a secondary link. In fact, it's the other way
around: 01:00.0 has a secondary link, and 02:00.0 has internal Switch logic
on its secondary side.
When we thought 02:00.0 had a secondary link, the pci_scan_slot() ->
only_one_child() path assumed 02:00.0 could have only one child, so 03:00.0
was the only possible downstream device. But 03:00.0 doesn't exist, so we
didn't look for any other devices on bus 03.
Booting with "pci=pcie_scan_all" is a workaround, but we don't want users
to have to do that.
Recognize that PCI-to-PCIe bridges originate links on their secondary
interfaces.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=189361
Fixes: d0751b98df ("PCI: Add dev->has_secondary_link to track downstream PCIe links")
Tested-by: Blake Moore <blake.moore@men.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a782b5f986 upstream.
Previously we checked for iATU unroll support by reading PCIE_ATU_VIEWPORT
even on platforms, e.g., Keystone, that do not have ATU ports. This can
cause bad behavior such as asynchronous external aborts:
OF: PCI: MEM 0x60000000..0x6fffffff -> 0x60000000
Unhandled fault: asynchronous external abort (0x1211) at 0x00000000
pgd = c0003000
[00000000] *pgd=80000800004003, *pmd=00000000
Internal error: : 1211 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.0-00009-g6ff59d2-dirty #7
Hardware name: Keystone
task: eb878000 task.stack: eb866000
PC is at dw_pcie_setup_rc+0x24/0x380
LR is at ks_pcie_host_init+0x10/0x170
Move the dw_pcie_iatu_unroll_enabled() check so we only call it on
platforms that do not use the ATU. These platforms supply their own
->rd_other_conf() and ->wr_other_conf() methods.
[bhelgaas: changelog]
Fixes: a0601a4705 ("PCI: designware: Add iATU Unroll feature")
Fixes: 416379f9eb ("PCI: designware: Check for iATU unroll support after initializing host")
Tested-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-By: Joao Pinto <jpinto@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 210675270c upstream.
Commit bcb6f6d2b9 ("fuse: use timespec64") introduced clamped nsec values
in time_to_jiffies but used the max of nsec and NSEC_PER_SEC - 1 instead of
the min. Because of this, dentries would stay in the cache longer than
requested and go stale in scenarios that relied on their timely eviction.
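The fix boils down to clamping the nanoseconds with min instead of max
before converting to jiffies; roughly, with simplified types and a
made-up function name:
#include <linux/jiffies.h>
#include <linux/kernel.h>
#include <linux/time64.h>
static u64 example_time_to_jiffies(u64 sec, u32 nsec)
{
        struct timespec64 ts = {
                .tv_sec  = sec,
                /* min, not max: keep nsec within [0, NSEC_PER_SEC - 1] */
                .tv_nsec = min_t(u32, nsec, NSEC_PER_SEC - 1),
        };
        return get_jiffies_64() + timespec64_to_jiffies(&ts);
}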
Fixes: bcb6f6d2b9 ("fuse: use timespec64")
Signed-off-by: David Sheets <dsheets@docker.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a8a86d78d6 upstream.
fuse_abort_conn() moves requests from pending list to a temporary list
before canceling them. This operation races with request_wait_answer()
which also tries to remove the request after it gets a fatal signal. It
checks FR_PENDING flag to determine whether the request is still in the
pending list.
Make fuse_abort_conn() clear FR_PENDING flag so that request_wait_answer()
does not remove the request from temporary list.
This bug causes an Oops when trying to delete an already deleted list entry
in end_requests().
Fixes: ee314a870e ("fuse: abort: no fc->lock needed for request ending")
Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb1357d942 upstream.
commit d65283f7b6 added mod->arch.secstr under
CONFIG_ARC_DW2_UNWIND, but used it unconditionally, which broke builds
when the option was disabled. Fix that by adjusting the #ifdef guard.
And while at it, add a missing guard (for the unwinder) in module.c as
well.
Reported-by: Waldemar Brodkorb <wbx@openadk.org>
Fixes: d65283f7b6 ("ARC: module: elide loop to save reference to .eh_frame")
Tested-by: Anton Kolesov <akolesov@synopsys.com>
Reviewed-by: Alexey Brodkin <abrodkin@synopsys.com>
[abrodkin: provided fixlet to Kconfig per failure in allnoconfig build]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f19b983a8 upstream.
Commit 98a29c39dc ("libnvdimm, namespace: allow creation of multiple
pmem-namespaces per region") added support for establishing additional
pmem namespace beyond the seed device, similar to blk namespaces.
However, it neglected to delete the namespace when the size is set to
zero.
Fixes: 98a29c39dc ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 78794d1890 upstream.
Context expiry times are in units of seconds since boot, not unix time.
The use of get_seconds() here therefore sets the expiry time decades in
the future. This prevents timely freeing of contexts destroyed by
client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
(when the module is unloaded or the container shut down), but a lot of
contexts could pile up before then.
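The cache layer compares expiry against seconds_since_boot(), so the
context expiry has to be computed in the same time base; a hedged
sketch of the idea (the surrounding structure is simplified and the
function name is made up):
#include <linux/sunrpc/cache.h>
/* Expiry values in this cache are seconds since boot, so derive them
 * from seconds_since_boot() rather than wall-clock get_seconds(). */
static void example_set_expiry(struct cache_head *h, long lifetime_secs)
{
        h->expiry_time = seconds_since_boot() + lifetime_secs;
}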
Fixes: c5b29f885a "sunrpc: use seconds since boot in expiry cache"
Reported-by: Andy Adamson <andros@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 546125d161 upstream.
The inet6addr_chain is an atomic notifier chain, so we can't call
anything that might sleep (like lock_sock)... instead of closing the
socket from svc_age_temp_xprts_now (which is called by the notifier
function), just have the rpc service threads do it instead.
Fixes: c3d4879e01 "sunrpc: Add a function to close..."
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52d7e48b86 upstream.
The current preemptible RCU implementation goes through three phases
during bootup. In the first phase, there is only one CPU that is running
with preemption disabled, so that a no-op is a synchronous grace period.
In the second mid-boot phase, the scheduler is running, but RCU has
not yet gotten its kthreads spawned (and, for expedited grace periods,
workqueues are not yet running). During this time, any attempt to do
a synchronous grace period will hang the system (or complain bitterly,
depending). In the third and final phase, RCU is fully operational and
everything works normally.
This has been OK for some time, but there have recently been some
synchronous grace periods showing up during the second mid-boot phase.
This code worked "by accident" for a while, but started failing as soon
as expedited RCU grace periods switched over to workqueues in commit
8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue").
Note that the code was buggy even before this commit, as it was subject
to failure on real-time systems that forced all expedited grace periods
to run as normal grace periods (for example, using the rcu_normal ksysfs
parameter). The callchain from the failure case is as follows:
early_amd_iommu_init()
|-> acpi_put_table(ivrs_base);
|-> acpi_tb_put_table(table_desc);
|-> acpi_tb_invalidate_table(table_desc);
|-> acpi_tb_release_table(...)
|-> acpi_os_unmap_memory
|-> acpi_os_unmap_iomem
|-> acpi_os_map_cleanup
|-> synchronize_rcu_expedited
The kernel showing this callchain was built with CONFIG_PREEMPT_RCU=y,
which caused the code to try using workqueues before they were
initialized, which did not go well.
This commit therefore reworks RCU to permit synchronous grace periods
to proceed during this mid-boot phase. This commit is therefore a
fix to a regression introduced in v4.9, and is therefore being put
forward post-merge-window in v4.10.
This commit sets a flag from the existing rcu_scheduler_starting()
function which causes all synchronous grace periods to take the expedited
path. The expedited path now checks this flag, using the requesting task
to drive the expedited grace period forward during the mid-boot phase.
Finally, this flag is updated by a core_initcall() function named
rcu_exp_runtime_mode(), which causes the runtime codepaths to be used.
Note that this arrangement assumes that tasks are not sent POSIX signals
(or anything similar) from the time that the first task is spawned
through core_initcall() time.
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Reported-by: "Zheng, Lv" <lv.zheng@intel.com>
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Tested-by: Stan Kain <stan.kain@gmail.com>
Tested-by: Ivan <waffolz@hotmail.com>
Tested-by: Emanuel Castelo <emanuel.castelo@gmail.com>
Tested-by: Bruno Pesavento <bpesavento@infinito.it>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Frederic Bezies <fredbezies@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f466ae66fa upstream.
It is now legal to invoke synchronize_sched() at early boot, which causes
Tiny RCU's synchronize_sched() to emit spurious splats. This commit
therefore removes the cond_resched() from Tiny RCU's synchronize_sched().
Fixes: 8b355e3bc1 ("rcu: Drive expedited grace periods from workqueue")
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89e9f7bcd8 upstream.
Martin reported that the Supermicro X8DTH-i/6/iF/6F advertises incorrect
host bridge windows via _CRS:
pci_root PNP0A08:00: host bridge window [io 0xf000-0xffff]
pci_root PNP0A08:01: host bridge window [io 0xf000-0xffff]
Both bridges advertise the 0xf000-0xffff window, which cannot be correct.
Work around this by ignoring _CRS on this system. The downside is that we
may not assign resources correctly to hot-added PCI devices (if they are
possible on this system).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=42606
Reported-by: Martin Burnicki <martin.burnicki@meinberg.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 497de07d89 upstream.
The tmpfs modification was missed in the CVE-2016-7097 commit
073931017b ("posix_acl: Clear SGID bit when setting file permissions").
This can be tested with xfstest generic/375, which failed to clear the
setgid bit in the following test case on tmpfs:
touch $testfile
chown 100:100 $testfile
chmod 2755 $testfile
_runas -u 100 -g 101 -- setfacl -m u::rwx,g::rwx,o::rwx $testfile
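The usual fix pattern from that series is to run the inode mode
through posix_acl_update_mode() when an ACCESS ACL is set, which drops
the SGID bit when appropriate; a hedged sketch of that ->set_acl shape
(not the exact tmpfs hunk, and simplified: no timestamps or dirtying):
#include <linux/fs.h>
#include <linux/posix_acl.h>
static int example_set_acl(struct inode *inode, struct posix_acl *acl, int type)
{
        int error;
        if (type == ACL_TYPE_ACCESS) {
                /* May clear SGID and may set acl to NULL. */
                error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
                if (error)
                        return error;
        }
        set_cached_acl(inode, type, acl);
        return 0;
}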
Signed-off-by: Gu Zheng <guzheng1@huawei.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7245f67f86 upstream.
Fixes: ("ab8dd3aed011 ARM: DTS: Add minimal Support for Logic PD
DM3730 SOM-LV")
This adds the dts file into the Makefile. This should have been included in
the original patch.
V2: Update patch description - same source code
V1: Original patch
Reviewed-by: Javier Martinez Canillas <javier@osg.samsung.com>
Signed-off-by: Adam Ford <aford173@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af92305e56 upstream.
On i.MX31 the AVIC interrupt controller base address is at 0x68000000.
The problem was shadowed by the AVIC driver, which takes the correct
base address from a SoC specific header file.
Fixes: d2a37b3d91 ("ARM i.MX31: Add devicetree support")
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Reviewed-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f87aee6a2 upstream.
The i.MX31 Clock Control Module controller is found on the AIPS2 bus;
move it there from the SPBA bus to avoid a device IO space mismatch.
Fixes: ef0e4a606f ("ARM: mx31: Replace clk_register_clkdev with clock DT lookup")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e575cbc93 upstream.
The AVIC interrupt controller found on i.MX31 uses one-cell interrupt
specifiers, namely 31 for CCM DVFS and 53 for CCM; however, for the
clock control module its interrupts are specified as 3-cells. Fix it.
Fixes: ef0e4a606f ("ARM: mx31: Replace clk_register_clkdev with clock DT lookup")
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 72649a4606 upstream.
According to the schematics of CompuLab's sbc-fx6 baseboard and the
vendor devicetree GPIO_16 is *not* muxed to ENET_REF_CLK but to SPDIF_IN.
Remove the wrong pinctrl setting.
Fixes: 682d055e6a ("ARM: dts: Add initial support for cm-fx6.")
Signed-off-by: Christopher Spinrath <christopher.spinrath@rwth-aachen.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7d891a685d upstream.
The address of the mailbox node in the bcm283x.dtsi also has a typo.
So fix it accordingly.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Fixes: 05b682b7a3 ("ARM: bcm2835: dt: Add the mailbox to the device tree")
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 621cb4e783 upstream.
This patch modifies the build dependencies of the jitdump support in
perf. As it stands, jitdump was wrongfully made 100% dependent on
using DWARF. However, the DWARF dependency only exists for generating
the source line table in genelf_debug.c. The rest of the support does
not need DWARF.
This patch removes the dependency on DWARF for the entire jitdump
support. It keeps it only for the genelf_debug.c support.
Signed-off-by: Maciej Debski <maciejd@google.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1476356383-30100-3-git-send-email-eranian@google.com
Fixes: e12b202f8f ("perf jitdump: Build only on supported archs")
[ Make it build only if NO_LIBELF isn't defined, as jitdump.o will only be built in that case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c56cb33b56 upstream.
Since 841e3558b2 ("perf callchain: Recording 'dwarf' callchains do not
need DWARF unwinding support"), --call-graph dwarf is allowed in 'perf
record' even without unwind support. A couple of other places don't
reflect this yet though: the help text should list dwarf as a valid
record mode and the dump_size config should be respected too.
Signed-off-by: Rabin Vincent <rabinv@axis.com>
Cc: He Kuang <hekuang@huawei.com>
Fixes: 841e3558b2 ("perf callchain: Recording 'dwarf' callchains do not need DWARF unwinding support")
Link: http://lkml.kernel.org/r/1470837148-7642-1-git-send-email-rabin.vincent@axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed6c166cc7 upstream.
Fixes a perf diff regression issue which was introduced by commit
5baecbcd9c ("perf symbols: we can now read separate debug-info files
based on a build ID")
The binary name could be the same when perf diff compares different
binaries. The build id is used to distinguish between them.
However, the previous patch assumes that the same binary name has the
same build id. So it overwrites the build id according to the binary
name, regardless of whether the build id is set or not.
Check has_build_id in dso__load(). If the build id is already set, use
it.
Before the fix:
$ perf diff 1.perf.data 2.perf.data
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... ................ .............................
#
99.83% -99.80% tchain_edit [.] f2
0.12% +99.81% tchain_edit [.] f3
0.02% -0.01% [ixgbe] [k] ixgbe_read_reg
After the fix:
$ perf diff 1.perf.data 2.perf.data
# Event 'cycles'
#
# Baseline Delta Shared Object Symbol
# ........ ....... ................ .............................
#
99.83% +0.10% tchain_edit [.] f3
0.12% -0.08% tchain_edit [.] f2
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
CC: Dima Kogan <dima@secretsauce.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 5baecbcd9c ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecf1e2253e upstream.
Instead of the one when another syscall takes place while another is being
processed (in another CPU, but we show it serialized, so need to "interrupt"
the other), and also when finally showing the sys_enter + sys_exit + duration,
where we were showing the sample->time for the sys_exit, duh.
Before:
# perf trace sleep 1
<SNIP>
0.373 ( 0.001 ms): close(fd: 3 ) = 0
1000.626 (1000.211 ms): nanosleep(rqtp: 0x7ffd6ddddfb0) = 0
1000.653 ( 0.003 ms): close(fd: 1 ) = 0
1000.657 ( 0.002 ms): close(fd: 2 ) = 0
1000.667 ( 0.000 ms): exit_group( )
#
After:
# perf trace sleep 1
<SNIP>
0.336 ( 0.001 ms): close(fd: 3 ) = 0
0.373 (1000.086 ms): nanosleep(rqtp: 0x7ffe303e9550) = 0
1000.481 ( 0.002 ms): close(fd: 1 ) = 0
1000.485 ( 0.001 ms): close(fd: 2 ) = 0
1000.494 ( 0.000 ms): exit_group( )
[root@jouet linux]#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ecbzgmu2ni6glc6zkw8p1zmx@git.kernel.org
Fixes: 752fde44fd ("perf trace: Support interrupted syscalls")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0b59970e7d upstream.
Remove the warning print of "can't use of GFP_NOIO" to avoid a print
on each QP creation when devices don't support
IB_QP_CREATE_USE_GFP_NOIO. This print becomes even more annoying when
the IPoIB interface is configured to work in connected mode.
Fixes: 09b93088d7 ('IB: Add a QP creation flag to use GFP_NOIO allocations')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f22e454df upstream.
According to the firmware spec, FLOW_STEERING_IB_UC_QP_RANGE command is
supported only if dmfs_ipoib bit is set.
If it isn't set, we want to ensure that allocating NET_IF QPs fails.
We do so by filling out the allocation bitmap. Thus, the NET_IF QP
allocating function won't find any free QP and will fail.
Fixes: c1c9850112 ('IB/mlx4: Add support for steerable IB UD QPs')
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6fa2620820 upstream.
Report the correct speed in the port attributes when using a 56Gbps
ethernet link. Without this change the field is incorrectly set to 10.
Fixes: a9c766bb75 ('IB/mlx4: Fix info returned when querying IBoE ports')
Fixes: 2e96691c31 ('IB: Use central enum for speed instead of hard-coded values')
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit befcabcd53 upstream.
If OpenSM runs over a ConnectX-3, and there are ConnectX-4 or Connect-IB
VFs active on the network, the OpenSM will receive QP1 packets containing
a GRH where the destination GID is the "Well-Known GID" -- which is not a
GID in the HCA Port's GID Table.
This GID must be tested-for separately -- and packets which contain
this destination GID should be routed to slave 0 (the PF).
Fixes: 37bfc7c1e8 ('IB/mlx4: SR-IOV multiplex and demultiplex MADs')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c482af646d upstream.
For non-special QPs, the port value becomes non-zero only at the
RESET-to-INIT transition. If the QP has not undergone that transition,
its port number value is still zero.
If such a QP is destroyed before being moved out of the RESET state,
subtracting one from the qp port number results in a negative value.
Using that negative value as an index into the qp1_proxy array
results in an out-of-bounds array reference.
Fix this by testing that the QP type is one that uses qp1_proxy before
using the port number. For special QPs of all types, the port number is
specified at QP creation time.
Fixes: 9433c18891 ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af4295c117 upstream.
Set the traffic class within sl_tclass_flowlabel when creating an iboe AH.
Without this the TOS value will be empty when running VLAN tagged
traffic, because the TOS value is taken from the traffic class in the
address handle attributes.
Fixes: 9106c41069 ('IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c73b7911de upstream.
Move the SRQ type assignment to be before actually using it
in create_srq_user() and in create_srq_kernel() functions.
Fixes: af1ba291c5 ('{net, IB}/mlx5: Refactor internal SRQ API')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit afd02cd3a9 upstream.
When enabling many VFs, the total amount of DMA mappings increases
significantly. This causes DMA allocations to take a lot of time
since they are serialized in the kernel.
As a result the driver enters a fatal condition due to a timeout and
the system hangs. To recover from this we disable the MR cache for
VFs.
PFs will still have a full cache, and the VF cache can be manipulated
as usual after driver load.
Fixes: e126ba97db ('mlx5: Add driver for Mellanox Connect-IB adapters')
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a0fa72683e upstream.
A race condition fix added an rxe_qp structure to the stack in order
to be able to perform rollback in rxe_requester(), but the structure
is large enough to trigger the warning for possible stack overflow:
drivers/infiniband/sw/rxe/rxe_req.c: In function 'rxe_requester':
drivers/infiniband/sw/rxe/rxe_req.c:757:1: error: the frame size of 2064 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]
This changes the rollback function to only save the psn inside
the qp, which is the only field we access in the rollback_qp
anyway.
Fixes: 3050b99850 ("IB/rxe: Fix race condition between requester and completer")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d680ebed91 upstream.
Increase limit of max CQE from 8K to 32K to allow demanding
applications to work over SoftRoCE with same configuration
as most RoCEv2 HW vendors have.
Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa6aae38f7 upstream.
A failure in the ib_cache_setup_one() function during
ib_register_device() will leak the allocated memory.
Fixes: 03db3a2d81 ("IB/core: Add RoCE GID table management")
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d7400c4ac upstream.
Always stating PIN_CONFIG_BIAS_DISABLE is supported gives untrue output
when examining /sys/kernel/debug/pinctrl/e6060000.pfc/pinconf-pins if
the operation get_bias() is implemented but the pin is not handled by
the get_bias() implementation. In that case the output will state
"input bias disabled", indicating that this pin has bias control
support.
Make support for PIN_CONFIG_BIAS_DISABLE depend on that the pin either
supports SH_PFC_PIN_CFG_PULL_UP or SH_PFC_PIN_CFG_PULL_DOWN. This also
solves the issue where SoC specific implementations print error messages
if their particular implementation of {set,get}_bias() is called with a
pin it does not know about.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69d012345a upstream.
In the current code, @changed always returns the status of the last
PTE for a huge page with the contiguous bit set. This is really not
what we want. Even if only one of the PTEs is changed, we should tell
the caller.
This patch fixes this issue.
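Conceptually the fix ORs together the per-PTE results instead of
keeping only the last one; a simplified sketch (the real code also
steps the pfn per entry, which is omitted here, and the function name
is made up):
#include <linux/mm.h>
/* Accumulate the result across all PTEs of a contiguous-bit huge page
 * instead of returning only the status of the last one. */
static int example_set_access_flags(struct vm_area_struct *vma,
                                    unsigned long addr, pte_t *ptep,
                                    pte_t pte, int dirty,
                                    int ncontig, unsigned long pgsize)
{
        int changed = 0;
        int i;
        for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
                changed |= ptep_set_access_flags(vma, addr, ptep, pte, dirty);
        return changed;
}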
Fixes: 66b3923a1a ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20156ce236 upstream.
The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenario:
1.) pmd entry is not present.
2.) the page fault occurs:
... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()
3.) set_huge_pte_at() will only set the first PMD entry, since
find_num_contig() just returns 1 in this case. So the PMD entries
are all empty except the first one.
4.) when kernel accesses the address mapped by the second PMD entry,
a new page fault occurs:
... hugetlb_fault() --> huge_ptep_set_access_flags()
The second PMD entry is still empty now.
5.) When the kernel returns, the access will cause a page fault again.
The kernel will run like the "4)" above.
We will see a dead loop from here on.
The dead loop was caught with a 32M hugetlb page (2M PMD + Contiguous bit).
This patch removes the wrong pmd check, and fixes this dead loop.
This patch also removes the redundant checks for PGD/PUD in
find_num_contig().
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0c2f0afe35 upstream.
libhugetlbfs hits several failures since the following functions
do not use the correct address:
huge_ptep_get_and_clear()
huge_ptep_set_access_flags()
huge_ptep_set_wrprotect()
huge_ptep_clear_flush()
This patch fixes the wrong address for them.
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d4791db527 upstream.
Whenever a PE is initialised in powernv, opal_pci_eeh_freeze_clear() is
called. This is to remove any existing freeze, and has no negative side
effects if the PE is already in an unfrozen state. On PHB backends that
don't support this operation and return OPAL_UNSUPPORTED, this creates a
scary and misleading warning message.
Skip the warning message on init if OPAL_UNSUPPORTED is returned.
As far as I'm aware, this currently only affects NPUs.
Fixes: 313483d ("powerpc/powernv: Unfreeze PE on allocation")
Signed-off-by: Russell Currey <ruscur@russell.cc>
Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fe0f316816 upstream.
Make sure to drop any reference taken by bus_find_device() in the sysfs
callbacks that are used to create and destroy devices based on
device-tree entries.
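bus_find_device() hands back its match with a reference held, so each
lookup needs a matching put_device() once the device has been dealt
with; a generic sketch of the pattern (the bus, match function and
helper name are placeholders, not the ibmebus code):
#include <linux/device.h>
static int example_remove_by_path(struct bus_type *bus, void *path,
                                  int (*match)(struct device *, void *))
{
        struct device *dev = bus_find_device(bus, NULL, path, match);
        if (!dev)
                return -ENODEV;
        device_unregister(dev);         /* drops the device's own reference */
        put_device(dev);                /* drops the bus_find_device() reference */
        return 0;
}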
Fixes: 6bccf755ff ("[POWERPC] ibmebus: dynamic addition/removal of adapters, some code cleanup")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 815a7141c4 upstream.
Make sure to drop any reference taken by bus_find_device() when creating
devices during init and driver registration.
Fixes: 55347cc996 ("[POWERPC] ibmebus: Add device creation and bus probing based on of_device")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 555c16328a upstream.
Version 3.00 of the ISA states that the PATS (partition table size) field
of the PTCR (partition table control register) and the PRTS (process table
size) field of the partition table entry must both be less than or equal
to 24. However the actual size of the partition and process tables is equal
to 2 to the power of 12 plus the PATS and PRTS fields, respectively. This
means that the max allowable size of each of these tables is 2^36 or 64GB
for both.
Thus when checking the size shift for each we should be checking for values
of greater than 36 instead of the current check for shifts larger than 24
and 23.
Fixes: 2bfd65e45e
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c090959b9d upstream.
Make sure to drop the reference to the parent device taken by
class_find_device() after populating the bus.
Fixes: 3b9334ac83 ("mfd: vexpress: Convert custom func API to regmap")
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c02ebfdddb upstream.
Commit 0e87e58bf6 ("blk-mq: improve warning for running a queue on the
wrong CPU") attempts to avoid triggering the WARN_ON in
__blk_mq_run_hw_queue when the expected CPU is dead. Problem is, in the
last batch execution before round robin, blk_mq_hctx_next_cpu can
schedule a dead CPU and also update next_cpu to the next alive CPU in
the mask, which will trigger the WARN_ON despite the previous
workaround.
The following patch fixes this scenario by always scheduling the value
in hctx->next_cpu. This changes the moment when we round-robin the CPU
running the hctx, but it really doesn't matter, since it still executes
BLK_MQ_CPU_WORK_BATCH times in a row before switching to another CPU.
Fixes: 0e87e58bf6 ("blk-mq: improve warning for running a queue on the wrong CPU")
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3bee9ea1de upstream.
The BQ27510 and BQ27520 use a slightly different register map than the
BQ27500, add a new type enum and add these gauges to it.
Fixes: d74534c277 ("power: bq27xxx_battery: Add support for additional bq27xxx family devices")
Based-on-patch-by: Kenneth R. Crudup <kenny@panix.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb43f81b84 upstream.
Commit e1399ba20e ("powercap / RAPL: handle missing MSRs") added
the contraint_to_pl() function to return an index into an array. But it
can potentially return -EINVAL if the powercap layer sends an out-of-range
constraint ID. This patch adds a sanity check.
Unnecessary RAPL domain pointer check is removed since it must be
initialized before calling rapl_unit_xlate().
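A hedged sketch of the added sanity check (shape only; not the actual
intel_rapl code):
	int id = contraint_to_pl(rd, cid);	/* index into the power-limit array */

	if (id < 0)
		return id;	/* reject out-of-range constraint IDs from powercap */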
Fixes: e1399ba20e ("powercap / RAPL: handle missing MSRs")
Reported-by: Odzioba, Lukasz <lukasz.odzioba@intel.com>
Reported-by: Koss, Marcin <marcin.koss@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a545715d2d upstream.
When removing and adding cpu 0 on a system with GHES NMI the following stack
trace is seen when re-adding the cpu:
WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
Call Trace:
dump_stack+0x63/0x8e
__warn+0xd1/0xf0
warn_slowpath_null+0x1d/0x20
setup_local_APIC+0x275/0x370
apic_ap_setup+0xe/0x20
start_secondary+0x48/0x180
set_init_arg+0x55/0x55
early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x2a/0x2c
x86_64_start_kernel+0x13d/0x14c
During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an
NMI on CPU 0. The GHES NMI handler, ghes_notify_nmi() runs the
ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR
(0xf6). The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also
0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms
that something has set the IRQ_WORK_VECTOR line prior to the APIC being
initialized.
Commit 2383844d48 ("GHES: Elliminate double-loop in the NMI handler")
incorrectly modified the behavior such that the handler returns
NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
work queue for every NMI.
This patch modifies the ghes_proc_irq_work() to run as it did prior to
2383844d48 ("GHES: Elliminate double-loop in the NMI handler") by
properly returning NMI_HANDLED and only calling the work queue if
NMI_HANDLED has been set.
Fixes: 2383844d48 (GHES: Elliminate double-loop in the NMI handler)
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebc4ff661f upstream.
cfq_cpd_alloc() which is the cpd_alloc_fn implementation for cfq was
incorrectly hard coding GFP_KERNEL instead of using the mask specified
through the @gfp parameter. This currently doesn't cause any actual
issues because all current callers specify GFP_KERNEL. Fix it.
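A hedged sketch of what the fixed allocator boils down to (simplified; only
the gfp usage matters here):
	static struct blkcg_policy_data *cfq_cpd_alloc(gfp_t gfp)
	{
		struct cfq_group_data *cgd;

		cgd = kzalloc(sizeof(*cgd), gfp);	/* was hard coded GFP_KERNEL */
		if (!cgd)
			return NULL;
		return &cgd->cpd;
	}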
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: e4a9bde958 ("blkcg: replace blkcg_policy->cpd_size with ->cpd_alloc/free_fn() methods")
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a05e7541c upstream.
With compilers which follow the C99 standard (like modern versions of
gcc and clang), "extern inline" does the opposite thing from older
versions of gcc (emits code for an externally linkable version of the
inline function).
"static inline" does the intended behavior in all cases instead.
Description taken from commit 6d91857d48 ("staging, rtl8192e,
LLVMLinux: Change extern inline to static inline").
This also fixes the following GCC warning when building with CONFIG_PM
disabled:
./include/linux/blkdev.h:1143:20: warning: no previous prototype for 'blk_set_runtime_active' [-Wmissing-prototypes]
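A hedged sketch of the resulting !CONFIG_PM stub (signature taken from the
warning above):
	static inline void blk_set_runtime_active(struct request_queue *q) {}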
Fixes: d07ab6d114 ("block: Add blk_set_runtime_active()")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 85b037442e upstream.
The handling of bypass_val_on that was added in
regulator_get_bypass_regmap is done unconditionally; however,
several drivers don't define a value for bypass_val_on. This
results in those drivers reporting bypass as enabled when
it is not. In regulator_set_bypass_regmap we use bypass_mask
if bypass_val_on is zero. This patch adds similar handling in
regulator_get_bypass_regmap.
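A hedged sketch of the get-side handling mirroring the set side (simplified;
error handling of the regmap read is omitted):
	unsigned int val, val_on = rdev->desc->bypass_val_on;

	regmap_read(rdev->regmap, rdev->desc->bypass_reg, &val);

	if (!val_on)				/* driver left bypass_val_on unset */
		val_on = rdev->desc->bypass_mask;

	*enable = (val & rdev->desc->bypass_mask) == val_on;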
Fixes: commit dd1a571dae ("regulator: helpers: Ensure bypass register field matches ON value")
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b243fcfb5 upstream.
This changes the way that we support the new ISA v3.00 HPTE format.
Instead of adapting everything that uses HPTE values to handle either
the old format or the new format, depending on which CPU we are on,
we now convert explicitly between old and new formats if necessary
in the low-level routines that actually access HPTEs in memory.
This limits the amount of code that needs to know about the new
format and makes the conversions explicit. This is OK because the
old format contains all the information that is in the new format.
This also fixes operation under a hypervisor, because the H_ENTER
hypercall (and other hypercalls that deal with HPTEs) will continue
to require the HPTE value to be supplied in the old format. At
present the kernel will not boot in HPT mode on POWER9 under a
hypervisor.
This fixes and partially reverts commit 50de596de8
("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29).
Fixes: 50de596de8 ("powerpc/mm/hash: Add support for Power9 Hash")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d701d3dd8 upstream.
Fix to return a negative error code from the st_rproc_state() error
handling case instead of 0, as done elsewhere in this function.
Fixes: 63edb0310a ("remoteproc: Supply controller driver for ST's Remote Processors")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6de1a507c4 upstream.
The tie between the main WCNSS driver and the IRIS driver causes a
circular dependency between the two modules. Neither part makes sense
on its own, so let's merge them into one module.
For the sake of picking up the clock and regulator resources described
in the iris of_node we need an associated struct device. But, to keep
the size of the patch down we continue to represent the IRIS part as its
own platform_driver, within the same module, rather than setting up a
dummy device.
Fixes: aed361adca ("remoteproc: qcom: Introduce WCNSS peripheral image loader")
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 230c5b4423 upstream.
In the loop on .timings, we should check .num_timings to see if it's the
only mode specified, not .num_modes, which should be used with .modes.
Fixes: cda553725c ("drm/panel: simple: Set appropriate mode type")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cff52e5fc4 upstream.
gcc warns about the timestamp in drm_wait_vblank being possibly
used without an initialization:
drivers/gpu/drm/drm_irq.c: In function 'drm_crtc_send_vblank_event':
drivers/gpu/drm/drm_irq.c:992:24: error: 'now.tv_usec' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/gpu/drm/drm_irq.c:1069:17: note: 'now.tv_usec' was declared here
drivers/gpu/drm/drm_irq.c:991:23: error: 'now.tv_sec' may be used uninitialized in this function [-Werror=maybe-uninitialized]
This can happen if drm_vblank_count_and_time() returns 0 in its
error path. To sanitize the error case, I'm changing that function
to return a zero timestamp when it fails.
Fixes: e6ae8687a8 ("drm: idiot-proof vblank")
Reviewed-by: David Herrmann <dh.herrmann@gmail.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20161017221355.1861551-6-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f638c1cb0 upstream.
smbus functions return -ve on error, 0 on success. However,
__i2c_transfer() has a different return signature: -ve on error, or the
number of buffers transferred (which may be zero or greater).
The upshot of this is that the sense of the test is reversed when using
the mux on a bus supporting the master_xfer method: we cache the value
and never retry if we fail to transfer any buffers, but if we succeed,
we clear the cached value.
Fix this by making pca954x_reg_write() return a negative error code for
all failure cases.
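A hedged sketch of the normalization described above (shape only, not the
full pca954x_reg_write()):
	ret = __i2c_transfer(adap, &msg, 1);
	if (ret >= 0 && ret != 1)
		ret = -EREMOTEIO;	/* "nothing transferred" is a failure too */
	/* now ret < 0 on error, matching the smbus path */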
Fixes: 463e8f845c ("i2c: mux: pca954x: retry updating the mux selection on failure")
Acked-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cfd278c280 upstream.
Various places assume that if nfs4_fl_prepare_ds() returns a non-NULL 'ds',
then ds->ds_clp will also be non-NULL.
This is not necessarily true in the case when the process received a fatal signal
while nfs4_pnfs_ds_connect() is waiting in nfs4_wait_ds_connect().
In that case ->ds_clp may not be set, and the devid may not recently have been marked
unavailable.
So add a test for ds_clp == NULL and return NULL in that case.
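A hedged sketch of the added guard (not the full nfs4_fl_prepare_ds()):
	if (ds->ds_clp == NULL) {
		/* connect was interrupted (e.g. by a fatal signal) before ds_clp was set */
		return NULL;
	}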
Fixes: c23266d532 ("NFS4.1 Fix data server connection race")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Olga Kornievskaia <aglo@umich.edu>
Acked-by: Adamson, Andy <William.Adamson@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79f687a3de upstream.
Ben Coddington reports that commit 311324ad17, by adding the function
nfs_dir_mapping_need_revalidate() that checks page cache validity on
each call to nfs_readdir(), causes a performance regression when
the directory is being modified.
If the directory is changing while we're iterating through the directory,
POSIX does not require us to invalidate the page cache unless the user
calls rewinddir(). However, we still do want to ensure that we use
readdirplus in order to avoid a load of stat() calls when the user
is doing an 'ls -l' workload.
The fix should be to invalidate the page cache immediately when we're
setting the NFS_INO_ADVISE_RDPLUS bit.
Reported-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 311324ad17 ("NFS: Be more aggressive in using readdirplus...")
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee284e35d8 upstream.
We must put the task to sleep while holding the inode->i_lock in order
to ensure atomicity with the test for NFS_LAYOUT_RETURN.
Fixes: 500d701f33 ("NFS41: make close wait for layoutreturn")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f24d311f92 upstream.
The pinctrl_gpio_request is called with the "full" gpio number, already
containing the base, then meson_pmx_request_gpio is then called with the
final pin number.
Remove the base addition when calling meson_pmx_disable_other_groups.
Fixes: 6ac7309511 ("pinctrl: add driver for Amlogic Meson SoCs")
CC: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Acked-by: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aa7c8da35d upstream.
In __btrfs_run_delayed_refs, the error path when run_delayed_extent_op
fails sets locked_ref->processing = 0 but doesn't re-increment
delayed_refs->num_heads_ready. As a result, we end up triggering
the WARN_ON in btrfs_select_ref_head.
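A hedged sketch of the error-path bookkeeping described above (field names as
quoted in the text; locking details omitted):
	/* run_delayed_extent_op() failed: put the head back for a retry */
	locked_ref->processing = 0;
	delayed_refs->num_heads_ready++;	/* the missing re-increment */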
Fixes: d7df2c796d (Btrfs: attach delayed ref updates to delayed ref heads)
Reported-by: Jon Nelson <jnelson-suse@jamponi.net>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d028099643 upstream.
In __btrfs_run_delayed_refs, when we put back a delayed ref that's too
new, we have already dropped the lock on locked_ref when we set
->processing = 0.
This patch keeps the lock to cover that assignment.
Fixes: d7df2c796d (Btrfs: attach delayed ref updates to delayed ref heads)
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5a10c5f75 upstream.
Commit 54adc01055 ("nvme/quirk: Add a delay before checking for adapter
readiness") introduced a quirk to adapters that cannot read the bit
NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
need a delay or else the action of reading the bit NVME_CSTS_RDY could
somehow corrupt the adapter's register state, and it never recovers.
When this quirk was added, we checked ctrl->tagset in order to avoid
quirking in probe time, supposing we would never require such delay
during probe. Well, it was too optimistic; we in fact need this quirk
at probe time in some cases, like after a kexec.
In some experiments, after abnormal shutdown of machine (aka power cord
unplug), we booted into our bootloader in Power, which is a Linux kernel,
and kexec'ed into another distro. If this kexec is too quick, we end up
reaching the probe of NVMe adapter in that distro when adapter is in
bad state (not fully initialized on our bootloader). What happens next
is that nvme_wait_ready() is unable to complete, except if the quirk is
enabled.
So, this patch removes the original ctrl->tagset verification in order
to enable the quirk even on probe time.
Fixes: 54adc01055 ("nvme/quirk: Add a delay before checking for adapter readiness")
Reported-by: Andrew Byrne <byrneadw@ie.ibm.com>
Reported-by: Jaime A. H. Gomez <jahgomez@mx1.ibm.com>
Reported-by: Zachary D. Myers <zdmyers@us.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Jeffrey Lien <Jeff.Lien@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 701dc207bf upstream.
On AMD's SB800 and upwards, the SMBus is shared with the Integrated
Micro Controller (IMC).
The platform provides a hardware semaphore to avoid race conditions
among them. (Check page 288 of the SB800-Series Southbridges Register
Reference Guide http://support.amd.com/TechDocs/45482.pdf)
Without this patch, many accesses to the SMBus end with an invalid
transaction or even with the bus stalled.
Reported-by: Alexandre Desnoyers <alex@qtec.com>
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e44fca504 upstream.
Do not attempt to drain the health workqueue when unloading the device in
the recovery flow, this can cause a deadlock when the recovery work
tries to cancel itself with sync.
Because the work is no longer unconditionally canceled when unloading, it
must be explicitly canceled in the AER flow.
Fixes: 689a248df8 ("net/mlx5: Cancel recovery work in remove flow")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 030ee7ae52 upstream.
The modem-control signals are managed by the tty-layer during open and
should not be asserted prematurely when set_termios is called from
driver open.
Also make sure that the signals are asserted only when changing speed
from B0.
Fixes: 664d5df92e ("USB: usb-serial ch341: support for DTR/RTS/CTS")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The backport of
2c7d0602c - "Fix PCODE polling during CDCLK change notification"
to the 4.9 stable tree used an incorrect timeout value. Fix this up
so the backport matches the upstream commit.
Reported-by: Thomas Backlund <tmb@mageia.org>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dc5367bcc5 upstream.
With commit e53743994e
("af_iucv: use paged SKBs for big outbound messages"),
we transmit paged skbs for both of AF_IUCV's transport modes
(IUCV or HiperSockets).
The qeth driver for Layer 3 HiperSockets currently doesn't
support NETIF_F_SG, so these skbs would just be linearized again
by the stack.
Avoid that overhead by using paged skbs only for IUCV transport.
cc stable, since this also circumvents a significant skb leak when
sending large messages (where the skb then needs to be linearized).
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Fixes: e53743994e ("af_iucv: use paged SKBs for big outbound messages")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2bed8a8e70 upstream.
When in RS485 emulation mode, __do_stop_tx_rs485() calls
serial8250_clear_fifos(). This not only clears the FIFOs, but also sets
all bits in their control register (UART_FCR) to 0.
One of the effects of this is the disabling of the FIFOs, which turns
them into single-byte holding registers. The rest of the driver doesn't
know this, which results in the lion's share of characters passed into a
write call to be dropped.
(I can supply logic analyzer screenshots if necessary)
This fix replaces the serial8250_clear_fifos() call with
serial8250_clear_and_reinit_fifos() - this prevents the "dropped
characters" issue from manifesting again while retaining the requirement
of clearing the RX FIFO after transmission if the SER_RS485_RX_DURING_TX
flag is disabled.
Signed-off-by: Daniel Jedrychowski <avistel@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b11ebedd6 upstream.
The function get_zeroed_page() returns a NULL pointer if there is not enough
memory. In extcon_sync(), 0 is returned if the call to
get_zeroed_page() fails. The return value 0 indicates success in this
context, which is inconsistent with the execution status. This patch
fixes the bug by returning -ENOMEM.
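A hedged sketch of the fix (shape only; the buffer name is illustrative):
	prop_buf = (char *)get_zeroed_page(GFP_KERNEL);
	if (!prop_buf)
		return -ENOMEM;		/* was: return 0, reporting success on failure */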
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188611
Signed-off-by: Pan Bian <bianpan2016@163.com>
Fixes: a580982f08
Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 802c03881f upstream.
The sysrq input handler should be attached to the input device which has
a left alt key.
On 32-bit kernels, some input devices which have a left alt key cannot
attach the sysrq handler, because the keybit bitmap in struct input_device_id
for sysrq is not correctly initialized: KEY_LEFTALT is 56, which is
greater than BITS_PER_LONG on 32-bit kernels.
I found this problem when using a matrix keypad device which defines
a KEY_LEFTALT (56) but doesn't have a KEY_O (24 == 56%32).
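A hedged sketch of the corrected initializer: a designated index places the
bit in the right word of the bitmap regardless of BITS_PER_LONG.
	.evbit  = { BIT_MASK(EV_KEY) },
	.keybit = { [BIT_WORD(KEY_LEFTALT)] = BIT_MASK(KEY_LEFTALT) },
Without the [BIT_WORD(KEY_LEFTALT)] index the mask lands in word 0, which on
32-bit kernels is the bit for KEY_O (56 % 32 == 24), exactly as observed above.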
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 570b90fa23 upstream.
Eric Biggers pointed out that the orinoco driver pointed scatterlists
at the stack.
Fix it by switching from ahash to shash. The result should be
simpler, faster, and more correct.
kvalo: cherry picked from commit 1fef293b8a as I
accidentally applied this patch to wireless-drivers-next when I was supposed to
apply it to wireless-drivers
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7c9d8d0c41 upstream.
If srp_transfer_data fails within ibmvscsis_write_pending, then
the most likely scenario is that the client timed out the op and
removed the TCE mapping. Thus it will loop forever retrying the
op that is pretty much guaranteed to fail forever. A better return
code would be EIO instead of EAGAIN.
Reported-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bgly@us.ibm.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 89d8232411 upstream.
If we don't disable the transmitter in atmel_stop_tx, the DMA buffer
continues to send data until it is emptied.
This causes problems with the flow control (CTS is asserted and data are
still sent).
So, disabling the transmitter in atmel_stop_tx is a sane thing to do.
Tested on at91sam9g35-cm(DMA)
Tested for regressions on sama5d2-xplained(Fifo) and at91sam9g20ek(PDC)
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b389f173aa upstream.
When using RS485 in half duplex, RX should be enabled when TX is
finished, and stopped when TX starts.
Before commit 0058f0871e ("tty/serial: atmel: fix RS485 half
duplex with DMA"), RX was not disabled in atmel_start_tx() if the DMA
was used. So, collisions could happen.
But disabling RX in atmel_start_tx() uncovered another bug:
RX was enabled again in the wrong place (in atmel_tx_dma) instead of
being enabled when TX is finished (in atmel_complete_tx_dma), so the
transmission simply stopped.
This bug was not triggered before commit 0058f0871e
("tty/serial: atmel: fix RS485 half duplex with DMA") because RX was
never disabled before.
Moving atmel_start_rx() in atmel_complete_tx_dma() corrects the problem.
Reported-by: Gil Weber <webergil@gmail.com>
Fixes: 0058f0871e
Tested-by: Gil Weber <webergil@gmail.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a14d749fce upstream.
Most users of BLOCK_PC requests allocate the sense buffer on the stack,
so to avoid DMA to the stack copy them to a field in the heap allocated
virtblk_req structure. Without that any attempt at SCSI passthrough I/O,
including the SG_IO ioctl from userspace will crash the kernel. Note that
this includes running tools like hdparm even when the host does not have
SCSI passthrough enabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 836c3ce256 upstream.
The original patch did not do what it was supposed to do and, even
worse, it broke legacy boot (OMAP1).
The lch_map size should be the number of available logical channels in sDMA
and the od->dma_requests should store the number of available DMA request
lines usable in sDMA.
In legacy mode we do not have a way to get the DMA request count, in that
case we use OMAP_SDMA_REQUESTS (127), despite the fact that OMAP1510 has
only 31 DMA request lines.
Fixes: 2d1a9a946f ("dmaengine: omap-dma: Dynamically allocate memory for lch_map")
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 488debb997 upstream.
When borrowing the pfn_valid() check from mmap_kmem(), somebody managed
to get physical and virtual addresses spectacularly muddled up, such
that we've ended up with checks for one being the other. Whilst this
does indeed prevent out-of-bounds accesses crashing, on most systems
it also prevents the more desirable use-case of working at all ever.
Check the *virtual* offset correctly for what it is. Furthermore, do
so in the right place - a read or write may span multiple pages, so a
single up-front check is insufficient. High memory accesses already
have a similar validity check just before the copy_to_user() call, so
just make the low memory path fully consistent with that.
Reported-by: Jason A. Donenfeld <Jason@zx2c4.com>
Fixes: 148a1bc843 ("drivers: char: mem: Check {read,write}_kmem() addresses")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3895dbf898 upstream.
Protecting the mountpoint hashtable with namespace_sem was sufficient
until a call to umount_mnt was added to mntput_no_expire, at which
point it became possible for multiple calls of put_mountpoint on
the same hash chain to happen at the same time.
Krister Johansen <kjlx@templeofstupid.com> reported:
> This can cause a panic when simultaneous callers of put_mountpoint
> attempt to free the same mountpoint. This occurs because some callers
> hold the mount_hash_lock, while others hold the namespace lock. Some
> even hold both.
>
> In this submitter's case, the panic manifested itself as a GP fault in
> put_mountpoint() when it called hlist_del() and attempted to dereference
> a m_hash.pprev that had been poisoned by another thread.
Al Viro observed that the simple fix is to switch from using the namespace_sem
to the mount_lock to protect the mountpoint hash table.
I have taken Al's suggested patch, moved put_mountpoint in pivot_root
(instead of taking mount_lock an additional time), and have replaced
new_mountpoint with get_mountpoint, a function that does the hash table
lookup and addition under the mount_lock. The introduction of get_mountpoint
ensures that only the mount_lock is needed to manipulate the mountpoint
hashtable.
d_set_mounted is modified to only set DCACHE_MOUNTED if it is not
already set. This allows get_mountpoint to use the setting of
DCACHE_MOUNTED to ensure adding a struct mountpoint for a dentry
happens exactly once.
Fixes: ce07d891a0 ("mnt: Honor MNT_LOCKED when detaching mounts")
Reported-by: Krister Johansen <kjlx@templeofstupid.com>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit add7c65ca4 upstream.
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
4.10.0-rc2-00024-g4aecec9-dirty #118 Tainted: G W
---------------------------------------------------------
swapper/1/0 just changed the state of lock:
(&(&sighand->siglock)->rlock){-.....}, at: [<ffffffffbd0a1bc6>] __lock_task_sighand+0xb6/0x2c0
but this lock took another, HARDIRQ-unsafe lock in the past:
(ucounts_lock){+.+...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Chain exists of: &(&sighand->siglock)->rlock --> &(&tty->ctrl_lock)->rlock --> ucounts_lock
Possible interrupt unsafe locking scenario:
CPU0 CPU1
---- ----
lock(ucounts_lock);
local_irq_disable();
lock(&(&sighand->siglock)->rlock);
lock(&(&tty->ctrl_lock)->rlock);
<Interrupt>
lock(&(&sighand->siglock)->rlock);
*** DEADLOCK ***
This patch removes a dependency between rlock and ucounts_lock.
Fixes: f333c700c6 ("pidns: Add a limit on the number of pid namespaces")
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8a6a09c1c upstream.
In ca91cx42_slave_get function, the value pointed by vme_base pointer is
set through:
*vme_base = ioread32(bridge->base + CA91CX42_VSI_BS[i]);
So it must be dereferenced to be used in calculation of pci_base:
*pci_base = (dma_addr_t)*vme_base + pci_offset;
This bug was caught thanks to the following gcc warning:
drivers/vme/bridges/vme_ca91cx42.c: In function ‘ca91cx42_slave_get’:
drivers/vme/bridges/vme_ca91cx42.c:467:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
*pci_base = (dma_addr_t)vme_base + pci_offset;
Signed-off-by: Augusto Mecking Caringi <augustocaringi@gmail.com>
Acked-By: Martyn Welch <martyn@welchs.me.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6741f551a0 upstream.
This commit needs to be reverted because it prevents people from
using the serial console as a secondary console with input being
directed to tty0.
IOW, if you boot with console=ttyS0 console=tty0 then all kernels
prior to this commit will produce output on both ttyS0 and tty0
but input will only be taken from tty0. With this patch the serial
console will always be the primary console instead of tty0,
potentially preventing people from getting into their machines in
emergency situations.
Fixes: d03516df83 ("tty: serial: 8250: add CON_CONSDEV to flags")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9e4d59ada4 upstream.
This is a fix for Linux 4.10-rc1.
In the C language specification, a bit-field is interpreted as a signed or
unsigned integer type consisting of the specified number of bits.
In GCC manual, the range of a signed bit field of N bits is from
-(2^N) / 2 to ((2^N) / 2) - 1
https://www.gnu.org/software/gnu-c-manual/gnu-c-manual.html#Bit-Fields
Therefore, when defined as a 1-bit bit-field with signed type, a variable
can represent only -1 and 0.
The snd-soc-hdmi-codec module includes a structure which has signed type
members with bit-fields. Code in this module assigns 0 and 1 to these
members. This seems to result in implementation-dependent behaviour.
As of v4.10-rc1 merge window, outside of sound subsystem, this structure
is referred by below GPU modules.
- tda998x
- sti-drm
- mediatek-drm-hdmi
- msm
As far as I can see from reviewing their code relevant to the structure, the
structure members are used just in condition statements and printk formats.
My proposed change is a bit intrusive to the printk formats, but this
should be acceptable.
Overall, it's reasonable to use an unsigned type for the structure members.
This bug is detected by Sparse, static code analyzer with below warnings.
./include/sound/hdmi-codec.h:39:26: error: dubious one-bit signed bitfield
./include/sound/hdmi-codec.h:40:28: error: dubious one-bit signed bitfield
./include/sound/hdmi-codec.h:41:29: error: dubious one-bit signed bitfield
./include/sound/hdmi-codec.h:42:31: error: dubious one-bit signed bitfield
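A standalone illustration of the bit-field point (not the hdmi-codec
structure itself):
	#include <stdio.h>

	struct flags {
		signed int bad:1;	/* can represent only -1 and 0 */
		unsigned int good:1;	/* represents 0 and 1 as intended */
	};

	int main(void)
	{
		struct flags f = { .bad = 1, .good = 1 };

		/* With GCC/Clang the signed field typically reads back as -1. */
		printf("signed 1-bit field: %d\n", f.bad);
		printf("unsigned 1-bit field: %u\n", f.good);
		return 0;
	}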
Fixes: 09184118a8 ("ASoC: hdmi-codec: Add hdmi-codec for external HDMI-encoders")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Acked-by: Arnaud Pouliquen <arnaud.pouliquen@st.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ac0c7cf8be upstream.
Enabling btrfs tracepoints leads to instant crash, as reported. The wq
callbacks could free the memory and the tracepoints started to
dereference the members to get to fs_info.
The proposed fix https://marc.info/?l=linux-btrfs&m=148172436722606&w=2
removed the tracepoints but we could preserve them by passing only the
required data in a safe way.
Fixes: bc074524e1 ("btrfs: prefix fsid to all trace events")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d6169d0409 upstream.
If a URB is killed while the host is removed we can end up in a situation
where the hub thread takes the roothub device lock, and waits for
the URB to be given back by xhci-hcd, blocking the host remove code.
xhci-hcd tries to stop the endpoint and give back the urb, but can't
as the host is removed from PCI bus at the same time, preventing the normal
way of giving back urb.
Instead we need to rely on the stop command timeout function to give back
the urb. This xhci_stop_endpoint_command_watchdog() timeout function
used the XHCI_STATE_DYING flag to indicate that the timeout function is already
running, but later this flag was taken into use in other places to
mark that xhci is dying.
Remove checks for XHCI_STATE_DYING in xhci_urb_dequeue. We are still
checking that reading from pci state does not return 0xffffffff or that
host is not halted before trying to stop the endpoint.
This whole area of stopping endpoints, giving back URBs, and the watchdog
timeout needs rework; this fix focuses on solving a specific deadlock
issue that we can then send to stable before any major rework.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9dc6f65bc upstream.
The logic in pipe_advance() used to release all buffers past the new
position failed in cases when the number of buffers to release was equal
to pipe->buffers. If that happened, none of them had been released,
leaving pipe full. Worse, it was trivial to trigger and we end up with
pipe full of uninitialized pages. IOW, it's an infoleak.
Reported-by: "Alan J. Wylie" <alan@wylie.me.uk>
Tested-by: "Alan J. Wylie" <alan@wylie.me.uk>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30f939feae upstream.
i2c_smbus_xfer() does not always fill an entire block, allowing
kernel stack memory disclosure through the temp variable. Clear
it before it is read.
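A hedged sketch of the fix (one way to express it; the in-tree patch may use
memset() instead of an initializer):
	union i2c_smbus_data temp = {};	/* zeroed, so bytes the transfer does not
					 * fill cannot leak stack contents */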
Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f724fb303 upstream.
In of_i2c_register_device(), when the check for
device address validity fails we print the info.addr,
which has not been assigned properly.
Fix this by printing the actual invalid address.
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: b4e2f6ac12 ("i2c: apply DT flags when probing")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a20047f36e upstream.
The private baud_rate variable is used to configure the port at open and
reset-resume and must never be set to (and left at) zero or reset-resume
and all further open attempts will fail.
Fixes: aa91def41a ("USB: ch341: set tty baud speed according to tty struct")
Fixes: 664d5df92e ("USB: usb-serial ch341: support for DTR/RTS/CTS")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d5a9c72d0 upstream.
A short control transfer would currently fail to be detected, something
which could lead to stale buffer data being used as valid input.
Check for short transfers, and make sure to log any transfer errors.
Note that this also avoids leaking heap data to user space (TIOCMGET)
and the remote device (break control).
Fixes: 6ce7610478 ("USB: Driver for CH341 USB-serial adaptor")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f2950b7854 upstream.
Make sure to stop the interrupt URB before returning on errors during
open.
Fixes: 664d5df92e ("USB: usb-serial ch341: support for DTR/RTS/CTS")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ce5e292828 upstream.
Fix reset-resume handling which failed to resubmit the read and
interrupt URBs, thereby leaving a port that was open before suspend in a
broken state until closed and reopened.
Fixes: 1ded7ea47b ("USB: ch341 serial: fix port number changed after resume")
Fixes: 2bfd1c96a9 ("USB: serial: ch341: remove reset_resume callback")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4e2da44691 upstream.
DTR and RTS will be asserted by the tty-layer when the port is opened
and deasserted on close (if HUPCL is set). Make sure the initial state
is not-asserted before the port is first opened as well.
Fixes: 664d5df92e ("USB: usb-serial ch341: support for DTR/RTS/CTS")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 146cc8a17a upstream.
The current implementation failed to detect short transfers when
attempting to read the line state, and also, to make things worse,
logged the content of the uninitialised heap transfer buffer.
Fixes: abf492e7b3 ("USB: kl5kusb105: fix DMA buffers on stack")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b6c1b4c0e upstream.
MUSB driver now has runtime PM support, but the debugfs driver misses
the PM _get/_put() calls, which could cause MUSB register access
failure.
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 620f1a632e upstream.
The driver put a constant buffer of all zeros on the stack and
pointed a scatterlist entry at it. This doesn't work with virtual
stacks. Use ZERO_PAGE instead.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3344ed3079 upstream.
The workaround for the AMD Erratum E400 (Local APIC timer stops in C1E
state) is a two step process:
- Selection of the E400 aware idle routine
- Detection whether the platform is affected
The idle routine selection happens for possibly affected CPUs depending on
family/model/stepping information. This range of CPUs is not necessarily
affected, as the decision whether to enable the C1E feature is made by the
firmware. Unfortunately there is no way to query this at early boot.
The current implementation polls a MSR in the E400 aware idle routine to
detect whether the CPU is affected. This is inefficient on non affected
CPUs because every idle entry has to do the MSR read.
There is a better way to detect this before going idle for the first time,
which requires separating the bug flags:
X86_BUG_AMD_E400 - Selects the E400 aware idle routine and
enables the detection
X86_BUG_AMD_APIC_C1E - Set when the platform is affected by E400
Replace the current X86_BUG_AMD_APIC_C1E usage by the new X86_BUG_AMD_E400
bug bit to select the idle routine which currently does an unconditional
detection poll. X86_BUG_AMD_APIC_C1E is going to be used in later patches
to remove the MSR polling and simplify the handling of this misfeature.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20161209182912.2726-3-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6a50cddbc upstream.
These changes do not affect current hw - just a cleanup:
Currently, we assume that a system has a single Last Level Cache (LLC)
per node, and that the cpu_llc_id is thus equal to the node_id. This no
longer applies since Fam17h can have multiple last level caches within a
node.
So group the cpu_llc_id assignment by topology feature and family in
order to make the computation of cpu_llc_id on the different families
more clear.
Here is how the LLC ID is being computed on the different families:
The NODEID_MSR feature only applies to Fam10h in which case the LLC is
at the node level.
The TOPOEXT feature is used on families 15h, 16h and 17h. So far we only
see multiple last level caches if L3 caches are available. Otherwise,
the cpu_llc_id will default to be the phys_proc_id.
We have L3 caches only on families 15h and 17h:
- on Fam15h, the LLC is at the node level.
- on Fam17h, the LLC is at the core complex level and can be found by
right shifting the APIC ID. Also, keep the family checks explicit so that
new families will fall back to the default, which will be node_id for
TOPOEXT systems.
Single node systems in families 10h and 15h will have a Node ID of 0
which will be the same as the phys_proc_id, so we don't need to check
for multiple nodes before using the node_id.
Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
[ Rewrote the commit message. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20161108153054.bs3sajbyevq6a6uu@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 14221cc45c upstream.
Problem:
br_nf_pre_routing_finish() calls itself instead of
br_nf_pre_routing_finish_bridge(). Due to this bug, the reverse path filter drops
packets that go through a bridge interface.
User impact:
Local docker containers on a bridge network cannot communicate with each
other.
Fixes: c5136b15ea ("netfilter: bridge: add and use br_nf_hook_thresh")
Signed-off-by: Artur Molchanov <artur.molchanov@synesis.ru>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a417b8dc1 upstream.
Commit 99579ccec4 "xfs: skip dirty pages in ->releasepage()" started
to skip dirty pages in xfs_vm_releasepage() which also has the effect
that if a dirty page is truncated, it does not get freed by
block_invalidatepage() and is lingering in LRU list waiting for reclaim.
So a simple loop like:
while true; do
dd if=/dev/zero of=file bs=1M count=100
rm file
done
will keep using more and more memory until we hit low watermarks and
start pagecache reclaim which will eventually reclaim also the truncate
pages. Keeping these truncated (and thus never usable) pages in memory
is just a waste of memory, is unnecessarily stressing page cache
reclaim, and reportedly also leads to anonymous mmap(2) returning ENOMEM
prematurely.
So instead of just skipping dirty pages in xfs_vm_releasepage(), return
to old behavior of skipping them only if they have delalloc or unwritten
buffers and fix the spurious warnings by warning only if the page is
clean.
CC: Brian Foster <bfoster@redhat.com>
CC: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Petr Tůma <petr.tuma@d3s.mff.cuni.cz>
Fixes: 99579ccec4
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5018ada69a upstream.
When removing a gpiochip that uses GPIO hogging (e.g. by unloading the
chip's DT overlay), a warning is printed:
gpio gpiochip8: REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED
This happens because gpiochip_free_hogs() is called after the gdev->chip
pointer is reset to NULL. Hence __gpiod_free() cannot determine the
chip in use, and cannot clear flags nor call the optional chip-specific
.free() callback.
Move the call to gpiochip_free_hogs() up to fix this.
Fixes: ff2b135922 ("gpio: make the gpiochip a real device")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 753aacfd2e upstream.
A single netlink socket might own multiple interfaces *and* a
scheduled scan request (which might belong to another interface),
so when it goes away both may need to be destroyed.
Remove the schedule_scan_stop indirection to fix this - it's only
needed for interface destruction because of the way this works
right now, with a single work taking care of all interfaces.
Fixes: 93a1e86ce1 ("nl80211: Stop scheduled scan if netlink client disappears")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20b1e22d01 upstream.
With the following commit:
4bc9f92e64 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
... efi_bgrt_init() calls into the memblock allocator through
efi_mem_reserve() => efi_arch_mem_reserve() *after* mm_init() has been called.
Indeed, KASAN reports a bad read access later on in efi_free_boot_services():
BUG: KASAN: use-after-free in efi_free_boot_services+0xae/0x24c
at addr ffff88022de12740
Read of size 4 by task swapper/0/0
page:ffffea0008b78480 count:0 mapcount:-127
mapping: (null) index:0x1 flags: 0x5fff8000000000()
[...]
Call Trace:
dump_stack+0x68/0x9f
kasan_report_error+0x4c8/0x500
kasan_report+0x58/0x60
__asan_load4+0x61/0x80
efi_free_boot_services+0xae/0x24c
start_kernel+0x527/0x562
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0x157/0x17a
start_cpu+0x5/0x14
The instruction at the given address is the first read from the memmap's
memory, i.e. the read of md->type in efi_free_boot_services().
Note that the writes earlier in efi_arch_mem_reserve() don't splat because
they're done through early_memremap()ed addresses.
So, after memblock is gone, allocations should be done through the "normal"
page allocator. Introduce a helper, efi_memmap_alloc() for this. Use
it from efi_arch_mem_reserve(), efi_free_boot_services() and, for the sake
of consistency, from efi_fake_memmap() as well.
Note that for the latter, the memmap allocations cease to be page aligned.
This isn't needed though.
Tested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Dave Young <dyoung@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Mika Penttilä <mika.penttila@nextfour.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Fixes: 4bc9f92e64 ("x86/efi-bgrt: Use efi_mem_reserve() to avoid copying image data")
Link: http://lkml.kernel.org/r/20170105125130.2815-1-nicstange@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0100a3e67a upstream.
Some machines, such as the Lenovo ThinkPad W541 with firmware GNET80WW
(2.28), include memory map entries with phys_addr=0x0 and num_pages=0.
These machines fail to boot after the following commit,
commit 8e80632fb2 ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()")
Fix this by removing such bogus entries from the memory map.
Furthermore, currently the log output for this case (with efi=debug)
looks like:
[ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[0x0000000000000000-0xffffffffffffffff] (0MB)
This is clearly wrong, and also not as informative as it could be. This
patch changes it so that if we find obviously invalid memory map
entries, we print an error and skip those entries. It also detects the
address range calculation overflow in the display, so the new output is:
[ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
[ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[0x0000000000000000-0x0000000000000000] (invalid)
It also detects memory map sizes that would overflow the physical
address, for example phys_addr=0xfffffffffffff000 and
num_pages=0x0200000000000001, and prints:
[ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries:
[ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[phys_addr=0xfffffffffffff000-0x20ffffffffffffffff] (invalid)
It then removes these entries from the memory map.
Signed-off-by: Peter Jones <pjones@redhat.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[ardb: refactor for clarity with no functional changes, avoid PAGE_SHIFT]
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
[Matt: Include bugzilla info in commit log]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=191121
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 129a72a0d3 upstream.
Introduces segmented_write_std.
Switches from emulated reads/writes to standard read/writes in fxsave,
fxrstor, sgdt, and sidt. This fixes CVE-2017-2584, a longstanding
kernel memory leak.
Since commit 283c95d0e3 ("KVM: x86: emulate FXSAVE and FXRSTOR",
2016-11-09), which is luckily not yet in any final release, this would
also be an exploitable kernel memory *write*!
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 96051572c8
Fixes: 283c95d0e3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 283c95d0e3 upstream.
Internal errors were reported on 16 bit fxsave and fxrstor with ipxe.
Old Intels don't have unrestricted_guest, so we have to emulate them.
The patch takes advantage of the hardware implementation.
AMD and Intel differ in saving and restoring other fields in first 32
bytes. A test wrote 0xff to the fxsave area, 0 to upper bits of MXCSR
in the fxsave area, executed fxrstor, rewrote the fxsave area to 0xee,
and executed fxsave:
Intel (Nehalem):
7f 1f 7f 7f ff 00 ff 07 ff ff ff ff ff ff 00 00
ff ff ff ff ff ff 00 00 ff ff 00 00 ff ff 00 00
Intel (Haswell -- deprecated FPU CS and FPU DS):
7f 1f 7f 7f ff 00 ff 07 ff ff ff ff 00 00 00 00
ff ff ff ff 00 00 00 00 ff ff 00 00 ff ff 00 00
AMD (Opteron 2300-series):
7f 1f 7f 7f ff 00 ee ee ee ee ee ee ee ee ee ee
ee ee ee ee ee ee ee ee ff ff 00 00 ff ff 02 00
fxsave/fxrstor will only be emulated on early Intels, so KVM can't do
much to improve the situation.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aabba3c6ab upstream.
Move the existing exception handling for inline assembly into a macro
and switch its return values to X86EMUL type.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cef84c302f upstream.
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled).
These are implemented with delayed_work structs which can still be
pending when the KVM module is unloaded. We've seen this cause kernel
panics when the kvm_intel module is quickly reloaded.
Use the new static_key_deferred_flush() API to flush pending updates on
module unload.
Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6416e6101 upstream.
Modules that use static_key_deferred need a way to synchronize with
any delayed work that is still pending when the module is unloaded.
Introduce static_key_deferred_flush() which flushes any pending
jump label updates.
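A hedged sketch of what such a flush helper boils down to, assuming (as the
jump-label code does) that the deferred key embeds a delayed_work named
'work'; the in-tree version may add usage checks:
	void static_key_deferred_flush(struct static_key_deferred *key)
	{
		flush_delayed_work(&key->work);
	}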
Signed-off-by: David Matlack <dmatlack@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f3dbdf47e upstream.
Reported syzkaller:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
PGD 0
Oops: 0002 [#1] SMP
CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1
Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
task: ffff9bbe0dfbb900 task.stack: ffffb61802014000
RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
Call Trace:
irqfd_shutdown+0x66/0xa0 [kvm]
process_one_work+0x16b/0x480
worker_thread+0x4b/0x500
kthread+0x101/0x140
? process_one_work+0x480/0x480
? kthread_create_on_node+0x60/0x60
ret_from_fork+0x25/0x30
RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20
CR2: 0000000000000008
The syzkaller folks reported a NULL pointer dereference caused by
unregistering a consumer which had failed registration before. Syzkaller
occasionally creates two VMs with an equal eventfd. So the second VM
fails to register an irqbypass consumer. This marks the irqfd as inactive
and queues a workqueue work to shut down the irqfd and unregister the irqbypass
consumer when the eventfd is closed. However, the second consumer has been
initialized even though it failed registration. So its token (the same as the
first VM's) is used to unregister the consumer through the workqueue, the
consumer of the first VM is found and unregistered, and then a NULL dereference
occurs in the path of deleting the consumer from the consumers list.
This patch fixes it by making irq_bypass_register/unregister_consumer()
looks for the consumer entry based on consumer pointer itself instead of
token matching.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 33ab91103b upstream.
This is CVE-2017-2583. On Intel this causes a failed vmentry because
SS's type is neither 3 nor 7 (even though the manual says this check is
only done for usable SS, and the dmesg splat says that SS is unusable!).
On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb.
The fix fabricates a data segment descriptor when SS is set to a null
selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb.
Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3;
this in turn ensures CPL < 3 because RPL must be equal to CPL.
Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing
the bug and deciphering the manuals.
Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com>
Fixes: 79d5b4c3cd
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5bbc8a6c9 upstream.
return_unused_surplus_pages() decrements the global reservation count,
and frees any unused surplus pages that were backing the reservation.
Commit 7848a4bf51 ("mm/hugetlb.c: add cond_resched_lock() in
return_unused_surplus_pages()") added a call to cond_resched_lock in the
loop freeing the pages.
As a result, the hugetlb_lock could be dropped, and someone else could
use the pages that will be freed in subsequent iterations of the loop.
This could result in inconsistent global hugetlb page state, application
API failures (such as mmap failures) or application crashes.
When dropping the lock in return_unused_surplus_pages, make sure that
the global reservation count (resv_huge_pages) remains sufficiently
large to prevent someone else from claiming pages about to be freed.
Analyzed by Paul Cassella.
Fixes: 7848a4bf51 ("mm/hugetlb.c: add cond_resched_lock() in return_unused_surplus_pages()")
Link: http://lkml.kernel.org/r/1483991767-6879-1-git-send-email-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Paul Cassella <cassella@cray.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Cc: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f05714293a upstream.
During development of zram-swap asynchronous writeback, I found strange
corruption of a compressed page, resulting in:
Modules linked in: zram(E)
CPU: 3 PID: 1520 Comm: zramd-1 Tainted: G E 4.8.0-mm1-00320-ge0d4894c9c38-dirty #3274
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff88007620b840 task.stack: ffff880078090000
RIP: set_freeobj.part.43+0x1c/0x1f
RSP: 0018:ffff880078093ca8 EFLAGS: 00010246
RAX: 0000000000000018 RBX: ffff880076798d88 RCX: ffffffff81c408c8
RDX: 0000000000000018 RSI: 0000000000000000 RDI: 0000000000000246
RBP: ffff880078093cb0 R08: 0000000000000000 R09: 0000000000000000
R10: ffff88005bc43030 R11: 0000000000001df3 R12: ffff880076798d88
R13: 000000000005bc43 R14: ffff88007819d1b8 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff88007e380000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc934048f20 CR3: 0000000077b01000 CR4: 00000000000406e0
Call Trace:
obj_malloc+0x22b/0x260
zs_malloc+0x1e4/0x580
zram_bvec_rw+0x4cd/0x830 [zram]
page_requests_rw+0x9c/0x130 [zram]
zram_thread+0xe6/0x173 [zram]
kthread+0xca/0xe0
ret_from_fork+0x25/0x30
Investigation revealed that stable pages are currently not supported for
anonymous pages. IOW, reuse_swap_page() can reuse the page without
waiting for writeback completion, so it can overwrite a page that zram
is still compressing.
Unfortunately, zram has used the per-cpu stream feature since v4.7.
It aims to increase the cache hit ratio of the scratch buffer used for
compression. The downside of that approach is that zram has to allocate
memory for the compressed page in per-cpu context, which requires a
restricted gfp flag and can therefore fail. If it fails, zram retries
the allocation outside of per-cpu context, where it may succeed, and
compresses the data again, copying it into the new memory.
In this scenario, zram assumes the data never changes, but that is not
true without stable page support. So, if the data is changed under us,
zram can overrun the buffer, because the second compression can produce
a bigger result than the previous trial did, and zram blindly copies the
bigger object into the smaller buffer. The overrun breaks the zsmalloc
free object chaining, so the system crashes as shown above.
I think the following is the same problem:
https://bugzilla.suse.com/show_bug.cgi?id=997574
Unfortunately, reuse_swap_page() must be atomic, so we cannot wait for
writeback there. The approach in this patch is to simply return false if
we find that the page needs stable writes. Although this temporarily
increases the memory footprint, it happens rarely and the pages should
be reclaimed easily afterwards. It is also better than waiting for I/O
completion, which is on the critical path for application latency.
Fixes: da9556a236 ("zram: user per-cpu compression streams")
Link: http://lkml.kernel.org/r/20161120233015.GA14113@bbox
Link: http://lkml.kernel.org/r/1482366980-3782-2-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Hyeoncheol Lee <cheol.lee@lge.com>
Cc: <yjay.kim@lge.com>
Cc: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4536f0c82 upstream.
Nils Holland and Klaus Ethgen have reported unexpected OOM killer
invocations with 32-bit kernels starting with the 4.8 kernel:
kworker/u4:5 invoked oom-killer: gfp_mask=0x2400840(GFP_NOFS|__GFP_NOFAIL), nodemask=0, order=0, oom_score_adj=0
kworker/u4:5 cpuset=/ mems_allowed=0
CPU: 1 PID: 2603 Comm: kworker/u4:5 Not tainted 4.9.0-gentoo #2
[...]
Mem-Info:
active_anon:58685 inactive_anon:90 isolated_anon:0
active_file:274324 inactive_file:281962 isolated_file:0
unevictable:0 dirty:649 writeback:0 unstable:0
slab_reclaimable:40662 slab_unreclaimable:17754
mapped:7382 shmem:202 pagetables:351 bounce:0
free:206736 free_pcp:332 free_cma:0
Node 0 active_anon:234740kB inactive_anon:360kB active_file:1097296kB inactive_file:1127848kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:29528kB dirty:2596kB writeback:0kB shmem:0kB shmem_thp: 0kB shmem_pmdmapped: 184320kB anon_thp: 808kB writeback_tmp:0kB unstable:0kB pages_scanned:0 all_unreclaimable? no
DMA free:3952kB min:788kB low:984kB high:1180kB active_anon:0kB inactive_anon:0kB active_file:7316kB inactive_file:0kB unevictable:0kB writepending:96kB present:15992kB managed:15916kB mlocked:0kB slab_reclaimable:3200kB slab_unreclaimable:1408kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
lowmem_reserve[]: 0 813 3474 3474
Normal free:41332kB min:41368kB low:51708kB high:62048kB active_anon:0kB inactive_anon:0kB active_file:532748kB inactive_file:44kB unevictable:0kB writepending:24kB present:897016kB managed:836248kB mlocked:0kB slab_reclaimable:159448kB slab_unreclaimable:69608kB kernel_stack:1112kB pagetables:1404kB bounce:0kB free_pcp:528kB local_pcp:340kB free_cma:0kB
lowmem_reserve[]: 0 0 21292 21292
HighMem free:781660kB min:512kB low:34356kB high:68200kB active_anon:234740kB inactive_anon:360kB active_file:557232kB inactive_file:1127804kB unevictable:0kB writepending:2592kB present:2725384kB managed:2725384kB mlocked:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:800kB local_pcp:608kB free_cma:0kB
the oom killer is clearly premature because there is still a lot of page
cache in the zone Normal which should satisfy this lowmem request.
Further debugging has shown that the reclaim cannot make any forward
progress because the page cache is hidden in the active list which
doesn't get rotated because inactive_list_is_low is not memcg aware.
The code simply subtracts per-zone highmem counters from the respective
memcg's lru sizes which doesn't make any sense. We can simply end up
always seeing the resulting active and inactive counts as 0 and return
false. This issue is not limited to 32-bit kernels but in practice the
effect on systems without CONFIG_HIGHMEM would be much harder to notice
because we do not invoke the OOM killer for allocation requests
targeting < ZONE_NORMAL.
Fix the issue by tracking per-zone lru page counts in mem_cgroup_per_node
and subtracting per-memcg highmem counts when memcg is enabled. Introduce the
helper lruvec_zone_lru_size which redirects to either zone counters or
mem_cgroup_get_zone_lru_size when appropriate.
We are losing empty LRU but non-zero lru size detection introduced by
ca707239e8 ("mm: update_lru_size warn and reset bad lru_size") because
of the inherent zone vs. node discrepancy.
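A rough sketch of the helper's shape (close to, but not necessarily identical with, the mm/vmscan.c code of this series):
	static unsigned long lruvec_zone_lru_size(struct lruvec *lruvec,
						  enum lru_list lru, int zone_idx)
	{
		/* memcg aware: read the per-memcg per-zone LRU counter */
		if (!mem_cgroup_disabled())
			return mem_cgroup_get_zone_lru_size(lruvec, lru, zone_idx);

		/* otherwise fall back to the plain per-zone vmstat counter */
		return zone_page_state(&lruvec_pgdat(lruvec)->node_zones[zone_idx],
				       NR_ZONE_LRU_BASE + lru);
	}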
Fixes: f8d1a31163 ("mm: consider whether to decivate based on eligible zones inactive ratio")
Link: http://lkml.kernel.org/r/20170104100825.3729-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Nils Holland <nholland@tisys.org>
Tested-by: Nils Holland <nholland@tisys.org>
Reported-by: Klaus Ethgen <Klaus@Ethgen.de>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7ee2c089e upstream.
The crash happens rather often when we reset some cluster nodes while
nodes contend fiercely to do truncate and append.
The crash backtrace is below:
dlm: C21CBDA5E0774F4BA5A9D4F317717495: dlm_recover_grant 1 locks on 971 resources
dlm: C21CBDA5E0774F4BA5A9D4F317717495: dlm_recover 9 generation 5 done: 4 ms
ocfs2: Begin replay journal (node 318952601, slot 2) on device (253,18)
ocfs2: End replay journal (node 318952601, slot 2) on device (253,18)
ocfs2: Beginning quota recovery on device (253,18) for slot 2
ocfs2: Finishing quota recovery on device (253,18) for slot 2
(truncate,30154,1):ocfs2_truncate_file:470 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
(truncate,30154,1):ocfs2_truncate_file:470 ERROR: Inode 290321, inode i_size = 732 != di i_size = 937, i_flags = 0x1
------------[ cut here ]------------
kernel BUG at /usr/src/linux/fs/ocfs2/file.c:470!
invalid opcode: 0000 [#1] SMP
Modules linked in: ocfs2_stack_user(OEN) ocfs2(OEN) ocfs2_nodemanager ocfs2_stackglue(OEN) quota_tree dlm(OEN) configfs fuse sd_mod iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi af_packet iscsi_ibft iscsi_boot_sysfs softdog xfs libcrc32c ppdev parport_pc pcspkr parport joydev virtio_balloon virtio_net i2c_piix4 acpi_cpufreq button processor ext4 crc16 jbd2 mbcache ata_generic cirrus virtio_blk ata_piix drm_kms_helper ahci syscopyarea libahci sysfillrect sysimgblt fb_sys_fops ttm floppy libata drm virtio_pci virtio_ring uhci_hcd virtio ehci_hcd usbcore serio_raw usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4
Supported: No, Unsupported modules are loaded
CPU: 1 PID: 30154 Comm: truncate Tainted: G OE N 4.4.21-69-default #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20151112_172657-sheep25 04/01/2014
task: ffff88004ff6d240 ti: ffff880074e68000 task.ti: ffff880074e68000
RIP: 0010:[<ffffffffa05c8c30>] [<ffffffffa05c8c30>] ocfs2_truncate_file+0x640/0x6c0 [ocfs2]
RSP: 0018:ffff880074e6bd50 EFLAGS: 00010282
RAX: 0000000000000074 RBX: 000000000000029e RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000246 RDI: 0000000000000246
RBP: ffff880074e6bda8 R08: 000000003675dc7a R09: ffffffff82013414
R10: 0000000000034c50 R11: 0000000000000000 R12: ffff88003aab3448
R13: 00000000000002dc R14: 0000000000046e11 R15: 0000000000000020
FS: 00007f839f965700(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f839f97e000 CR3: 0000000036723000 CR4: 00000000000006e0
Call Trace:
ocfs2_setattr+0x698/0xa90 [ocfs2]
notify_change+0x1ae/0x380
do_truncate+0x5e/0x90
do_sys_ftruncate.constprop.11+0x108/0x160
entry_SYSCALL_64_fastpath+0x12/0x6d
Code: 24 28 ba d6 01 00 00 48 c7 c6 30 43 62 a0 8b 41 2c 89 44 24 08 48 8b 41 20 48 c7 c1 78 a3 62 a0 48 89 04 24 31 c0 e8 a0 97 f9 ff <0f> 0b 3d 00 fe ff ff 0f 84 ab fd ff ff 83 f8 fc 0f 84 a2 fd ff
RIP [<ffffffffa05c8c30>] ocfs2_truncate_file+0x640/0x6c0 [ocfs2]
It's because ocfs2_inode_lock() gets us a stale LVB in which the i_size
is not equal to the disk i_size. We mistakenly trust the LVB because the
underlying fsdlm dlm_lock() doesn't set lkb_sbflags with
DLM_SBF_VALNOTVALID properly for us. But why?
The current code tries to downconvert the lock without the
DLM_LKF_VALBLK flag, to tell o2cb not to update the RSB's LVB if it's a
PR->NULL conversion, even if the lock resource type needs an LVB. This
is not the right way for fsdlm.
The fsdlm plugin behaves differently on DLM_LKF_VALBLK: it depends on
DLM_LKF_VALBLK to decide if we care about the LVB in the LKB. If
DLM_LKF_VALBLK is not set, fsdlm will skip recovering the RSB's LVB from
this lkb and set DLM_SBF_VALNOTVALID appropriately when a node failure
happens.
The following diagram briefly illustrates how this crash happens:
RSB1 is inode metadata lock resource with LOCK_TYPE_USES_LVB;
The 1st round:
Node1 Node2
RSB1: PR
RSB1(master): NULL->EX
ocfs2_downconvert_lock(PR->NULL, set_lvb==0)
ocfs2_dlm_lock(no DLM_LKF_VALBLK)
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
dlm_lock(no DLM_LKF_VALBLK)
convert_lock(overwrite lkb->lkb_exflags
with no DLM_LKF_VALBLK)
RSB1: NULL RSB1: EX
reset Node2
dlm_recover_rsbs()
recover_lvb()
/* The LVB is not trustable if the node with EX fails and
* no lock >= PR is left. We should set RSB_VALNOTVALID for RSB1.
*/
if (!(lkb->lkb_exflags & DLM_LKF_VALBLK))
        return;  /* so we miss the chance to invalidate the LVB here */
The 2nd round:
Node 1 Node2
RSB1(become master from recovery)
ocfs2_setattr()
ocfs2_inode_lock(NULL->EX)
/* dlm_lock() return the stale lvb without setting DLM_SBF_VALNOTVALID */
ocfs2_meta_lvb_is_trustable() return 1 /* so we don't refresh inode from disk */
ocfs2_truncate_file()
mlog_bug_on_msg(disk isize != i_size_read(inode)) /* crash! */
The fix is quite straightforward. We keep setting the DLM_LKF_VALBLK flag
for dlm_lock() if the lock resource type needs an LVB and the fsdlm
plugin is used.
Link: http://lkml.kernel.org/r/1481275846-6604-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mfasheh@versity.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f931ab479d upstream.
Both arch_add_memory() and arch_remove_memory() expect a single threaded
context.
For example, arch/x86/mm/init_64.c::kernel_physical_mapping_init() does
not hold any locks over this check and branch:
if (pgd_val(*pgd)) {
pud = (pud_t *)pgd_page_vaddr(*pgd);
paddr_last = phys_pud_init(pud, __pa(vaddr),
__pa(vaddr_end),
page_size_mask);
continue;
}
pud = alloc_low_page();
paddr_last = phys_pud_init(pud, __pa(vaddr), __pa(vaddr_end),
page_size_mask);
The result is that two threads calling devm_memremap_pages()
simultaneously can end up colliding on pgd initialization. This leads
to crash signatures like the following where the loser of the race
initializes the wrong pgd entry:
BUG: unable to handle kernel paging request at ffff888ebfff0000
IP: memcpy_erms+0x6/0x10
PGD 2f8e8fc067 PUD 0 /* <---- Invalid PUD */
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
CPU: 54 PID: 3818 Comm: systemd-udevd Not tainted 4.6.7+ #13
task: ffff882fac290040 ti: ffff882f887a4000 task.ti: ffff882f887a4000
RIP: memcpy_erms+0x6/0x10
[..]
Call Trace:
? pmem_do_bvec+0x205/0x370 [nd_pmem]
? blk_queue_enter+0x3a/0x280
pmem_rw_page+0x38/0x80 [nd_pmem]
bdev_read_page+0x84/0xb0
Hold the standard memory hotplug mutex over calls to
arch_{add,remove}_memory().
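A rough sketch of the serialization in devm_memremap_pages() (the arch_add_memory()/arch_remove_memory() signatures shown are assumptions for this kernel generation):
	mem_hotplug_begin();
	error = arch_add_memory(nid, align_start, align_size, true);
	mem_hotplug_done();
and symmetrically on teardown:
	mem_hotplug_begin();
	arch_remove_memory(align_start, align_size);
	mem_hotplug_done();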
Fixes: 41e94a8513 ("add devm_memremap_pages")
Link: http://lkml.kernel.org/r/148357647831.9498.12606007370121652979.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20f664aabe upstream.
Andreas reported [1] that a jemalloc test hangs in THP mode on arm64:
http://lkml.kernel.org/r/mvmmvfy37g1.fsf@hawking.suse.de
The problem is that the page fault handler currently doesn't support
dirty-bit emulation of a pmd on architectures without a hardware dirty
bit, so the application is stuck until the VM marks the pmd dirty.
How the emulation works depends on the architecture. On arm64, when a
pte is first set up, PTE_RDONLY is set so that a store access triggers a
page fault, giving the kernel a chance to mark the pte dirty. Once the
page fault occurs, the VM marks the pmd dirty and the arch code for
setting the pmd clears PTE_RDONLY so the application can proceed.
IOW, if the VM doesn't mark the pmd dirty, the application hangs forever
in repeated faults (i.e., a store hits a pmd that is still PTE_RDONLY).
This patch enables pmd dirty-bit emulation for those architectures.
[1] b8d3c4c300, mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called
Fixes: b8d3c4c300 ("mm/huge_memory.c: don't split THP page when MADV_FREE syscall is called")
Link: http://lkml.kernel.org/r/1482506098-6149-1-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Andreas Schwab <schwab@suse.de>
Tested-by: Andreas Schwab <schwab@suse.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jason Evans <je@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 965d004af5 upstream.
Currently in DAX if we have three read faults on the same hole address we
can end up with the following:
Thread 0 Thread 1 Thread 2
-------- -------- --------
dax_iomap_fault
grab_mapping_entry
lock_slot
<locks empty DAX entry>
dax_iomap_fault
grab_mapping_entry
get_unlocked_mapping_entry
<sleeps on empty DAX entry>
dax_iomap_fault
grab_mapping_entry
get_unlocked_mapping_entry
<sleeps on empty DAX entry>
dax_load_hole
find_or_create_page
...
page_cache_tree_insert
dax_wake_mapping_entry_waiter
<wakes one sleeper>
__radix_tree_replace
<swaps empty DAX entry with 4k zero page>
<wakes>
get_page
lock_page
...
put_locked_mapping_entry
unlock_page
put_page
<sleeps forever on the DAX
wait queue>
The crux of the problem is that once we insert a 4k zero page, all
locking from then on is done in terms of that 4k zero page and any
additional threads sleeping on the empty DAX entry will never be woken.
Fix this by waking all sleepers when we replace the DAX radix tree entry
with a 4k zero page. This will allow all sleeping threads to
successfully transition from locking based on the DAX empty entry to
locking on the 4k zero page.
With the test case reported by Xiong this happens very regularly in my
test setup, with some runs resulting in 9+ threads in this deadlocked
state. With this fix I've been able to run that same test dozens of
times in a loop without issue.
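A rough sketch of the relevant call in page_cache_tree_insert(); the helper's argument list varies between kernel versions, the point being the final wake_all flag flipping from false to true:
	/* wake *all* waiters keyed on the old empty DAX entry, since they
	 * would never be woken through the new zero page */
	dax_wake_mapping_entry_waiter(mapping, page->index, p, true);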
Fixes: ac401cc782 ("dax: New fault locking")
Link: http://lkml.kernel.org/r/1483479365-13607-1-git-send-email-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reported-by: Xiong Zhou <xzhou@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b09ab054b6 upstream.
zram has used the per-cpu stream feature since v4.7. It aims to increase
the cache hit ratio of the scratch buffer used for compression. The
downside of that approach is that zram has to allocate memory for the
compressed page in per-cpu context, which requires a restricted gfp flag
and can therefore fail. If it fails, zram retries the allocation outside
of per-cpu context, where it may succeed, and compresses the data again,
copying it into the new memory.
In this scenario zram assumes the data never changes, but that is not
true without stable page support. So, if the data is changed under us,
zram can overrun the buffer, breaking the zsmalloc free object chain, and
the system crashes as described in
https://bugzilla.suse.com/show_bug.cgi?id=997574
This patch adds BDI_CAP_STABLE_WRITES to zram to declare "I am a block
device that needs *stable writes*".
Fixes: da9556a236 ("zram: user per-cpu compression streams")
Link: http://lkml.kernel.org/r/1482366980-3782-4-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Hyeoncheol Lee <cheol.lee@lge.com>
Cc: <yjay.kim@lge.com>
Cc: Sangseok Lee <sangseok.lee@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2b1e8a20c upstream.
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3659f98b53 upstream.
Nothing in this minimal script seems to require bash. We often run these
tests on embedded devices where the only shell available is the busybox
ash. Use sh instead.
Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b2cdeb19f1 upstream.
If the allocation fails the current code returns success. If
copy_from_user() fails it returns the number of bytes remaining instead
of -EFAULT.
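A rough sketch of the corrected pattern (the buffer names here are illustrative, not the driver's actual ones):
	temp = kmalloc(temp_size, GFP_KERNEL);
	if (!temp)
		return -ENOMEM;		/* allocation failure must not look like success */

	if (copy_from_user(temp, user_ptr, temp_size)) {
		ret = -EFAULT;		/* copy_from_user() returns bytes NOT copied */
		goto fail;
	}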
Fixes: d5b1a78a77 ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9376cad207 upstream.
The devm_pinctrl_register() function returns an error pointer or a valid
handle. So checking for NULL here is pointless and can never trigger.
Check the returned value with IS_ERR instead and propagate this value as
done in the other functions which call devm_pinctrl_register().
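A rough sketch of the corrected check (the dpaux field names are assumptions):
	dpaux->pinctrl = devm_pinctrl_register(&pdev->dev, &dpaux->desc, dpaux);
	if (IS_ERR(dpaux->pinctrl))
		return PTR_ERR(dpaux->pinctrl);	/* never NULL: valid handle or ERR_PTR */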
Fixes: 0751bb5c44 ("drm/tegra: dpaux: Add pinctrl support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 618c808968 upstream.
The maximum supported voltage for ldo_io# is 3.3V, but on cold boot
the selector comes up at 0x1f, which maps to 3.8V. This was previously
corrected by Allwinner's U-boot, which set all regulators on the PMICs
to some pre-configured voltage. With recent progress in U-boot SPL
support, this is no longer the case. In any case we should handle
this quirk in the kernel driver as well.
This invalid setting causes _regulator_get_voltage() to fail with -EINVAL
which causes regulator registration to fail when constraints are used:
[ 1.054181] vcc-pg: failed to get the current voltage(-22)
[ 1.059670] axp20x-regulator axp20x-regulator.0: Failed to register ldo_io0
[ 1.069749] axp20x-regulator: probe of axp20x-regulator.0 failed with error -22
This commit makes the axp20x regulator driver accept the 0x1f register
value, fixing this.
The datasheet does not guarantee reliable operation above 3.3V, so on
boards where this regulator is used the regulator-max-microvolt setting
must be 3.3V or less.
This is essentially the same as the commit f40d4896bf ("regulator:
axp20x: Fix axp22x ldo_io registration error on cold boot") for AXP22x
PMICs.
Fixes: a51f9f4622 ("regulator: axp20x: support AXP809 variant")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8ca5bd158 upstream.
The BUCK regulators 3, 4, and 5 also have a 10mV step mode,
adjust the tables and logic to reflect the data-sheet for
these regulators.
fixes: d2a2e729a6 ("regulator: tps65086: Add regulator driver for the TPS65086 PMIC")
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c314c9f15a upstream.
On some SoCs there is no simple mapping of pins to bias register bits
and a lookup table is needed. This logic is already implemented in some
SoC specific drivers that could benefit from a generic implementation.
Add helpers to deal with the lookup which can later be used by the SoC
specific drivers. The lookup logic is different from the one it aims to
replace; this is intentional. The new method reduces memory consumption
at the cost of increased CPU usage and fixes a bug where a WARN() would
incorrectly be triggered if the register offset is 0.
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d3b861bccd upstream.
There is a bug in the r8a7795 bias code where a WARN() is triggered
anytime a pin from PUEN0/PUD0 is accessed.
# cat /sys/kernel/debug/pinctrl/e6060000.pfc/pinconf-pins
WARNING: CPU: 2 PID: 2391 at drivers/pinctrl/sh-pfc/pfc-r8a7795.c:5364 r8a7795_pinmux_get_bias+0xbc/0xc8
[..]
Call trace:
[<ffff0000083c442c>] r8a7795_pinmux_get_bias+0xbc/0xc8
[<ffff0000083c37f4>] sh_pfc_pinconf_get+0x194/0x270
[<ffff0000083b0768>] pin_config_get_for_pin+0x20/0x30
[<ffff0000083b11e8>] pinconf_generic_dump_one+0x168/0x188
[<ffff0000083b144c>] pinconf_generic_dump_pins+0x5c/0x98
[<ffff0000083b0628>] pinconf_pins_show+0xc8/0x128
[<ffff0000081fe3bc>] seq_read+0x16c/0x420
[<ffff00000831a110>] full_proxy_read+0x58/0x88
[<ffff0000081d7ad4>] __vfs_read+0x1c/0xf8
[<ffff0000081d8874>] vfs_read+0x84/0x148
[<ffff0000081d9d64>] SyS_read+0x44/0xa0
[<ffff000008082f4c>] __sys_trace_return+0x0/0x4
This is due to the WARN() check that the reg field of the pullups struct
is non-zero, while it is legitimately 0 for pins controlled by the
PUEN0/PUD0 registers since PU0 is defined as 0. Change the data structure
and use the generic sh_pfc_pin_to_bias_info() function to get the
register offset and bit information.
Fixes: 560655247b ("pinctrl: sh-pfc: r8a7795: Add bias pinconf support")
Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6fc513da5 upstream.
Currently the controllers get the same product id as the wireless
receiver. However, the controllers actually have their own product id.
The patch makes the driver expose the same product id as the Windows
driver.
This improves compatibility when running applications with WINE.
see https://github.com/paroj/xpad/issues/54
Signed-off-by: Pavel Rojtberg <rojtberg@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 60f59ce027 upstream.
These drivers need to be able to reference "struct ieee80211_hw" from
the driver's private data, and vice versa. The USB driver failed to
store the address of ieee80211_hw in the private data. Although this
bug has been present for a long time, it was not exposed until
commit ba9f93f82a ("rtlwifi: Fix enter/exit power_save").
Fixes: ba9f93f82a ("rtlwifi: Fix enter/exit power_save")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba9f93f82a upstream.
In commit a5ffbe0a19 ("rtlwifi: Fix scheduling while atomic bug") and
commit a269913c52 ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter()
to use work queue"), an error was introduced in the power-save routines
due to the fact that leaving PS was delayed by the use of a work queue.
This problem is fixed by detecting if the enter or leave routines are
in interrupt mode. If so, the workqueue is used to place the request.
If in normal mode, the enter or leave routines are called directly.
Fixes: a269913c52 ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue")
Reported-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c7d0602c8 upstream.
commit 848496e590
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Wed Jul 13 16:32:03 2016 +0300
drm/i915: Wait up to 3ms for the pcu to ack the cdclk change request on SKL
increased the timeout to match the spec, but we still see a timeout on
at least one SKL. A CDCLK change request following the failed one will
succeed nevertheless.
I could reproduce this problem easily by running kms_pipe_crc_basic in a
loop. In all failure cases _wait_for() was pre-empted for >3ms and so in
the worst case - when the pre-emption happened right after calculating
timeout__ in _wait_for() - we called skl_cdclk_wait_for_pcu_ready() only
once which failed and so _wait_for() timed out. As opposed to this the
spec says to keep retrying the request for at most a 3ms period.
To fix this send the first request explicitly to guarantee that there is
3ms between the first and last request. Though this matches the spec, I
noticed that in rare cases this can still time out if we sent only a few
requests (in the worst case 2) _and_ PCODE is busy for some reason even
after a previous request and a 3ms delay. To work around this retry the
polling with pre-emption disabled to maximize the number of requests.
Also increase the timeout to 10ms to account for interrupts that could
reduce the number of requests. With this change I couldn't trigger
the problem.
v2:
- Use 1ms poll period instead of 10us. (Chris)
v3:
- Poll with pre-emption disabled to increase the number of request
attempts. (Ville, Chris)
- Factor out a helper to poll, it's also needed by the next patch.
v4:
- Pass reply_mask, reply to skl_pcode_request(), instead of assuming the
reply is generic. (Ville)
v5:
- List the request specific timeout values as code comment. (Ville)
v6:
- Try the poll first with preemption enabled.
- Add code comment about first request being queued by PCODE. (Art)
- Add timeout_base_ms argument. (Ville)
v7:
- Clarify code comment about first queued request. (Chris)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Art Runyan <arthur.j.runyan@intel.com>
Cc: <stable@vger.kernel.org> # v4.2- : 3b2c171 : drm/i915: Wait up to 3ms
Cc: <stable@vger.kernel.org> # v4.2-
Fixes: 5d96d8afcf ("drm/i915/skl: Deinit/init the display at suspend/resume")
Reference: https://bugs.freedesktop.org/show_bug.cgi?id=97929
Testcase: igt/kms_pipe_crc_basic/suspend-read-crc-pipe-B
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1480955258-26311-1-git-send-email-imre.deak@intel.com
(cherry picked from commit a0b8a1fe34)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e40795c3b upstream.
Plantronics BT600 does not support reading the sample rate which leads
to many lines of "cannot get freq at ep 0x1" and "cannot get freq at
ep 0x82". This patch adds the USB ID of the BT600 to quirks.c and
avoids those error messages.
Signed-off-by: Dennis Kadioglu <denk@post.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7243e0b207 upstream.
The calculation of SPR and SPPR doesn't round correctly at several
places which might result in baud rates that are too big. For example
with tclk_hz = 250000001 and target rate 25000000 it determined a
divider of 10 which is wrong.
Instead of fixing all the corner cases replace the calculation by an
algorithm without a loop which should even be quicker to execute apart
from being correct.
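The rounding issue can be reproduced outside the kernel; the plain C program below uses the same rounding idiom as the kernel's DIV_ROUND_UP() macro and shows that the floored divider of 10 yields a rate above the requested 25 MHz while rounding up to 11 does not:
	#include <stdio.h>

	#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

	static void check(unsigned long long tclk_hz, unsigned long long target,
			  unsigned long long div)
	{
		/* the resulting rate exceeds the target iff div * target < tclk_hz */
		printf("div=%llu -> %s the %llu Hz target\n", div,
		       div * target < tclk_hz ? "exceeds" : "respects", target);
	}

	int main(void)
	{
		unsigned long long tclk_hz = 250000001, target = 25000000;

		check(tclk_hz, target, tclk_hz / target);              /* floor: 10 */
		check(tclk_hz, target, DIV_ROUND_UP(tclk_hz, target)); /* ceil: 11 */
		return 0;
	}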
Fixes: df59fa7f4b ("spi: orion: support armada extended baud rates")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f86a2c875f upstream.
The commit 55ee7017ee ("arm: omap2: board-generic: use
omap4_local_timer_init for AM437x") unintentionally changed the
clocksource device for AM437x from the OMAP GP Timer to SyncTimer32K.
Unfortunately, SyncTimer32K suffers from frequency deviation as
mentioned in commit 5b5c013591 ("ARM: OMAP2+: AM43x: Use gptimer
as clocksource") and, as reported by Franklin [1], even its monotonic
nature is in question (most probably there is a HW issue, but it's
still under investigation).
Taking the above facts into account, it's reasonable to roll back to
the use of omap3_gptimer_timer_init().
[1] http://www.spinics.net/lists/linux-omap/msg127425.html
Fixes: 55ee7017ee ("arm: omap2: board-generic: use
omap4_local_timer_init for AM437x")
Reported-by: Cooper Jr., Franklin <fcooper@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9388093db4 upstream.
Unlike clk_register_clkdev(), clk_hw_register_clkdev() doesn't check for
passed error objects from a previous registration call. Hence the caller
of clk_hw_register_*() has to check for errors before calling
clk_hw_register_clkdev*().
Make clk_hw_register_clkdev() more similar to clk_register_clkdev() by
adding this error check, removing the burden from callers that do mass
registration.
Fixes: e4f1b49bda ("clkdev: Add clk_hw based registration APIs")
Fixes: 944b9a41e0 ("clk: ls1x: Migrate to clk_hw based OF and registration APIs")
Fixes: 44ce9a9ae9 ("MIPS: TXx9: Convert to Common Clock Framework")
Fixes: f48d947a16 ("clk: clps711x: Migrate to clk_hw based OF and registration APIs")
Fixes: b4626a7f48 ("CLK: Add Loongson1C clock support")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cbf2642872 upstream.
We don't want to fall through to a bunch of errors for retention
if PM_OMAP4_CPU_OSWR_DISABLE is not configured for a SoC.
Fixes: 6099dd37c6 ("ARM: OMAP5 / DRA7: Enable CPU RET on suspend")
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit da6d5993bf upstream.
It's CONFIG_SOC_OMAP5, not CONFIG_ARCH_OMAP5. Looks like make randconfig
builds have not hit this one yet.
Fixes: b3bf289c1c ("ARM: OMAP2+: Fix build with CONFIG_SMP and CONFIG_PM is not set")
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8a8be46afe upstream.
We need to properly initialize mpuss also on omap5 like we do on omap4.
Otherwise we run into similar kexec problems like we had on omap4 when
trying to kexec from a kernel with PM initialized.
Fixes: 0573b957fc ("ARM: OMAP4+: Prevent CPU1 related hang with kexec")
Acked-by: Santosh Shilimkar <ssantosh@kernel.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a3cc2a7b2 upstream.
On Zynq, we haven't been reserving the correct amount of DMA-incapable
RAM to keep DMA away from it (per the Zynq TRM Section 4.1, it should be
the first 512k). In older kernels, this was masked by the
memblock_reserve call in arm_memblock_init(). Now, reserve the correct
amount explicitly rather than relying on swapper_pg_dir, which is an
address and not a size anyway.
Fixes: 46f5b96 ("ARM: zynq: Reserve not DMAable space in front of the kernel")
Signed-off-by: Kyle Roeschley <kyle.roeschley@ni.com>
Tested-by: Nathan Rossi <nathan@nathanrossi.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e413bd33ac upstream.
In the device-tree case, the root interrupt controller cannot be
accessed through the 6th coprocessor, contrary to pxa27x and pxa3xx
architectures.
Fix it to behave as in non-devicetree builds.
Fixes: 32f17997c1 ("ARM: pxa: remove irq init from dt machines")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a44e87b471 upstream.
We are incorrectly defining the pwr LED, attaching it to a gpio line
that is wired to the Wi-Fi SDIO module (which fails due to this).
The actual power LED is connected to the GPIO expander, which we don't
expose currently.
Fixes: 9d56c22a78 ("ARM: bcm2835: Add devicetree for the Raspberry Pi 3.")
Thanks-to: Eric Anholt <eric@anholt.net> [for clarifying we can't control the LED]
Signed-off-by: Andrea Merello <andrea.merello@gmail.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a3207d644f upstream.
The devicetree node for mt8173-auxadc lacks the clock and
io-channel-cells property. This leads to a non-working driver.
mt6577-auxadc 11001000.auxadc: failed to get auxadc clock
mt6577-auxadc: probe of 11001000.auxadc failed with error -2
Fix these fields to get the device up and running.
Fixes: 748c7d4de4 ("ARM64: dts: mt8173: Add thermal/auxadc device
nodes")
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8ae679c4bc upstream.
I am getting the following warning when I build kernel 4.9-git on my
PowerBook G4 with a 32-bit PPC processor:
AS arch/powerpc/kernel/misc_32.o
arch/powerpc/kernel/misc_32.S:299:7: warning: "CONFIG_FSL_BOOKE" is not defined [-Wundef]
This problem is evident after commit 989cea5c14 ("kbuild: prevent
lib-ksyms.o rebuilds"); however, this change in kbuild only exposes an
error that has been in the code since 2005 when this source file was
created. That was with commit 9994a33865 ("powerpc: Introduce
entry_{32,64}.S, misc_{32,64}.S, systbl.S").
The offending line does not make a lot of sense. This error does not
seem to cause any errors in the executable, thus I am not recommending
that it be applied to any stable versions.
Thanks to Nicholas Piggin for suggesting this solution.
Fixes: 9994a33865 ("powerpc: Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S")
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6a2a2f4556 upstream.
This module has a bug not to return error code in a case that data
structure for transmitted packets fails to be initialized.
This commit fixes the bug.
Fixes: 35efa5c489 ("ALSA: firewire-tascam: add streaming functionality")
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 24c63bbc18 ]
Frank reported that vrf devices can be created with a table id of 0.
This breaks many of the run time table id checks and should not be
allowed. Detect this condition at create time and fail with EINVAL.
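A rough sketch of the create-time check (RT_TABLE_UNSPEC is 0; the field name is the driver's private table id):
	if (vrf->tb_id == RT_TABLE_UNSPEC)
		return -EINVAL;	/* table id 0 would defeat the run-time checks */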
Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Reported-by: Frank Kellermann <frank.kellermann@atos.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7a18c5b9fb ]
fib_select_path does not call fib_select_multipath if oif is set in the
flow struct. For VRF use cases oif is always set, so multipath route
selection is bypassed. Use the FLOWI_FLAG_SKIP_NH_OIF to skip the oif
check similar to what is done in fib_table_lookup.
Add saddr and proto to the flow struct for the fib lookup done by the
VRF driver to better match hash computation for a flow.
Fixes: 613d09b30f ("net: Use VRF device index for lookups on TX")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0bbcc0a8fc ]
When bringing an interface down or changing its configuration under
heavy traffic, some of the adaptive moderation corner cases can occur
and leave a WARN_ONCE call trace in the kernel log.
Those WARN_ONCE calls are meant for debugging only and should only have
been inserted under a debug option. We avoid such call traces by
removing those WARN_ONCE calls.
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Gil Rockah <gilr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 57ea52a865 ]
The GRO fast path caches the frag0 address. This address becomes
invalid if frag0 is modified by pskb_may_pull or its variants.
So whenever that happens we must disable the frag0 optimization.
This is usually done through the combination of gro_header_hard
and gro_header_slow, however, the IPv6 extension header path did
the pulling directly and would continue to use the GRO fast path
incorrectly.
This patch fixes it by disabling the fast path when we enter the
IPv6 extension header path.
Fixes: 78a478d0ef ("gro: Inline skb_gro_header and cache frag0 virtual address")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7cfd5fd5a9 ]
On 32bit arches, (skb->end - skb->data) is not 'unsigned int',
so we shall use min_t() instead of min() to avoid a compiler error.
Fixes: 1272ce87fa ("gro: Enter slow-path if there is no tailroom")
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 1272ce87fa ]
The GRO path has a fast-path where we avoid calling pskb_may_pull
and pskb_expand by directly accessing frag0. However, this should
only be done if we have enough tailroom in the skb as otherwise
we'll have to expand it later anyway.
This patch adds the check by capping frag0_len with the skb tailroom.
Fixes: cb18978cbf ("gro: Open-code final pskb_may_pull")
Reported-by: Slava Shwartsman <slavash@mellanox.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit faf3a932fb ]
It is perfectly possible to have non-zero-indexed switches present in a
DSA switch tree; in such a case, we will be dereferencing a NULL pointer
in dsa_cpu_port_ethtool_{setup,restore}. Be more defensive and ensure
that dst->ds[0] is valid before doing anything with it.
Fixes: 0c73c523cf ("net: dsa: Initialize CPU port ethtool ops per tree")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 75dc692eda ]
Pause the rx path and make sure the rx fifo is empty when autosuspend
occurs.
If rx data arrives while the driver is canceling the rx urb, the host
controller stops fetching data from the device and continues after the
next rx urb is submitted. That is, one continuous stretch of data is
split across two different urb buffers, which makes the driver
misinterpret the data as an rx descriptor, and unexpected behavior
happens.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 2cfe8f8290 ]
We are implementing a MDIO bus which is behind another one, so use the
nested version of the accessors to get lockdep annotations correct.
Fixes: 461cd1b03e ("net: dsa: bcm_sf2: Register our slave MDIO bus")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a4c61b92b3 ]
We make the bcm_sf2 driver override ds->ops, which points to
b53_switch_ops, since b53_switch_alloc() did the assignment. This is all
well and good until a second b53 switch comes in and ends up using the
bcm_sf2 operations. Make a proper local copy, substitute the ds->ops
pointer and then override the operations.
Fixes: f458995b9a ("net: dsa: bcm_sf2: Utilize core B53 driver when possible")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9d5ecb09d5 ]
If after too many passes still no image could be emitted, then
swap back to the original program as we do in all other cases
and don't use the one with blinding.
Fixes: 959a757916 ("bpf, x86: add support for constant blinding")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5350d54f6c ]
In the case of custom rules being present we need to handle the case of the
LOCAL table being initialized after the new rule has been added. To address
that I am adding a new check so that we can make certain we don't use an
alias of MAIN for LOCAL when allocating a new table.
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Reported-by: Oliver Brunel <jjk@jjacky.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7ababb7826 ]
5.2. Action on Reception of a Query
When a system receives a Query, it does not respond immediately.
Instead, it delays its response by a random amount of time, bounded
by the Max Resp Time value derived from the Max Resp Code in the
received Query message. A system may receive a variety of Queries on
different interfaces and of different kinds (e.g., General Queries,
Group-Specific Queries, and Group-and-Source-Specific Queries), each
of which may require its own delayed response.
Before scheduling a response to a Query, the system must first
consider previously scheduled pending responses and in many cases
schedule a combined response. Therefore, the system must be able to
maintain the following state:
o A timer per interface for scheduling responses to General Queries.
o A per-group and interface timer for scheduling responses to Group-
Specific and Group-and-Source-Specific Queries.
o A per-group and interface list of sources to be reported in the
response to a Group-and-Source-Specific Query.
When a new Query with the Router-Alert option arrives on an
interface, provided the system has state to report, a delay for a
response is randomly selected in the range (0, [Max Resp Time]) where
Max Resp Time is derived from Max Resp Code in the received Query
message. The following rules are then used to determine if a Report
needs to be scheduled and the type of Report to schedule. The rules
are considered in order and only the first matching rule is applied.
1. If there is a pending response to a previous General Query
scheduled sooner than the selected delay, no additional response
needs to be scheduled.
2. If the received Query is a General Query, the interface timer is
used to schedule a response to the General Query after the
selected delay. Any previously pending response to a General
Query is canceled.
--8<--
Currently the timer is rearmed with a new random expiration time for
every incoming query, regardless of an already pending report.
This is not aligned with the RFC quoted above.
It might also happen that a higher rate of incoming queries postpones
the report beyond the expiration time of the first query, causing group
membership loss.
Now the per-interface general query timer is rearmed only when there is
no pending report already scheduled on that interface, or when the newly
selected expiration time is earlier than the already scheduled report.
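A rough sketch of the resulting rearm rule (modelled on igmp_gq_start_timer(); field names may differ slightly between versions):
	static void igmp_gq_start_timer(struct in_device *in_dev)
	{
		int tv = prandom_u32() % in_dev->mr_maxdelay;
		unsigned long exp = jiffies + tv + 2;

		/* keep an already scheduled, not-later report instead of rearming */
		if (in_dev->mr_gq_running &&
		    time_after_eq(exp, in_dev->mr_gq_timer.expires))
			return;

		in_dev->mr_gq_running = 1;
		if (!mod_timer(&in_dev->mr_gq_timer, exp))
			in_dev_hold(in_dev);
	}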
Signed-off-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 3b48ab2248 ]
Final nlmsg_len field update must reflect inserted net_dm_drop_point
data.
This patch depends on previous patch:
"drop_monitor: add missing call to genlmsg_end"
Signed-off-by: Reiter Wolfgang <wr0112358@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f5a0aab84b ]
IPv4 output routes already use l3mdev device instead of loopback for dst's
if it is applicable. Change local input routes to do the same.
This fixes icmp responses for unreachable UDP ports which are directed
to the wrong table after commit 9d1a6c4ea4 because local_input
routes use the loopback device. Moving from the ingress device to loopback
loses the L3 domain, causing responses based on the dst to get lost.
Fixes: 9d1a6c4ea4 ("net: icmp_route_lookup should use rt dev to
determine L3 domain")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit f0c16ba893 ]
When we send a packet to our own local address on a non-loopback
interface (e.g. eth0), due to the change introduced by
commit 0b922b7a82 ("net: original ingress device index in PKTINFO"), the
original ingress device index would be set to the loopback interface.
However, the packet should be considered as if it arrived via the
sending interface (eth0); otherwise it breaks the expectations of
userspace applications (e.g. the DHCPRELEASE message from the
dhcp_release binary would be ignored by the dnsmasq daemon, since it
comes from lo, which is not the interface dnsmasq is bound to).
Fixes: 0b922b7a82 ("net: original ingress device index in PKTINFO")
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Wei Zhang <asuka.com@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 4775cc1f2d ]
We fail to check whether the netlink message is actually big enough to contain
a struct if_stats_msg.
Add a check to prevent userland from sending us short messages that would
make us access memory beyond the end of the message.
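A rough sketch of the added sanity check (its exact placement in the RTM_GETSTATS handlers is an assumption):
	if (nlmsg_len(nlh) < sizeof(struct if_stats_msg))
		return -EINVAL;	/* too short to even carry the fixed header */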
Fixes: 10c9ead9f3 ("rtnetlink: add new RTM_GETSTATS message to dump...")
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 37f304d100 ]
Disabling the netdev should come after it was closed; although there is no harm
in doing it before (hence the MLX5E_STATE_DESTROYING bit), it is more natural this way.
Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 610e89e05c ]
Skip setting netdev vxlan ports and netdev rx_mode on driver load
when netdev is not yet registered.
Synchronizing with netdev state is needed only on reset flow where the
netdev remains registered for the whole reset period.
This also fixes an access before initialization of net_device.addr_list_lock
- which for some reason is initialized in register_netdev - where we queued
set_rx_mode work on driver load before netdev registration.
Fixes: 26e59d8077 ("net/mlx5e: Implement mlx5e interface attach/detach callbacks")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit ccce170026 ]
We need to check that the VF MAC address entered by the admin user is
either zero or a unicast MAC address.
Multicast MAC addresses are prohibited.
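A rough sketch of the validation (helper from <linux/etherdevice.h>):
	/* a VF MAC may be all-zero ("unset") or unicast; multicast is rejected */
	if (is_multicast_ether_addr(mac))
		return -EINVAL;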
Fixes: 77256579c6 ('net/mlx5: E-Switch, Introduce Vport administration functions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 077b1e8069 ]
We need to mask the destination mac value with the destination mac
mask when adding steering rule via ethtool.
Fixes: 1174fce8d1 ('net/mlx5e: Support l3/l4 flow type specs in ethtool flow steering')
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 689a248df8 ]
If there is pending delayed work for health recovery it must be canceled
if the device is being unloaded.
Fixes: 05ac2c0b74 ("net/mlx5: Fix race between PCI error handlers and health work")
Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 883371c453 ]
When setting HCA capabilities, set log_max_qp to be the minimum
between the selected profile's value and the HCA limitation.
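A rough sketch of the clamping (the MLX5_SET()/MLX5_CAP_GEN_MAX() accessors are the driver's own; the exact field path is an assumption):
	/* never request more QPs than the HCA reports it supports */
	MLX5_SET(cmd_hca_cap, set_hca_cap, log_max_qp,
		 min_t(int, MLX5_CAP_GEN_MAX(dev, log_max_qp), prof->log_max_qp));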
Fixes: 938fe83c8d ('net/mlx5_core: New device capabilities...')
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0df0f207aa ]
Since we now use a non zero mask on addr_type, we are matching on its
value (IPV4/IPV6). So before this fix, matching on enc_src_ip/enc_dst_ip
failed in SW/classify path since its value was zero.
This patch sets the proper value of addr_type for encapsulated packets.
Fixes: 970bfcd097 ('net/sched: cls_flower: Use mask for addr_type')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 5701659004 ]
There is currently a small window during which the network device registered by
stmmac can be made visible, yet all resources, including the clock and MDIO bus,
have not had a chance to be set up; this can lead to the following error:
[ 473.919358] stmmaceth 0000:01:00.0 (unnamed net_device) (uninitialized):
stmmac_dvr_probe: warning: cannot get CSR clock
[ 473.919382] stmmaceth 0000:01:00.0: no reset control found
[ 473.919412] stmmac - user ID: 0x10, Synopsys ID: 0x42
[ 473.919429] stmmaceth 0000:01:00.0: DMA HW capability register supported
[ 473.919436] stmmaceth 0000:01:00.0: RX Checksum Offload Engine supported
[ 473.919443] stmmaceth 0000:01:00.0: TX Checksum insertion supported
[ 473.919451] stmmaceth 0000:01:00.0 (unnamed net_device) (uninitialized):
Enable RX Mitigation via HW Watchdog Timer
[ 473.921395] libphy: PHY stmmac-1:00 not found
[ 473.921417] stmmaceth 0000:01:00.0 eth0: Could not attach to PHY
[ 473.921427] stmmaceth 0000:01:00.0 eth0: stmmac_open: Cannot attach to
PHY (error: -19)
[ 473.959710] libphy: stmmac: probed
[ 473.959724] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 0 IRQ POLL
(stmmac-1:00) active
[ 473.959728] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 1 IRQ POLL
(stmmac-1:01)
[ 473.959731] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 2 IRQ POLL
(stmmac-1:02)
[ 473.959734] stmmaceth 0000:01:00.0 eth0: PHY ID 01410cc2 at 3 IRQ POLL
(stmmac-1:03)
Fix this by making sure that register_netdev() is the last thing being done,
which guarantees that the clock and the MDIO bus are available.
Fixes: 4bfcbd7abc ("stmmac: Move the mdio_register/_unregister in probe/remove")
Reported-by: Kweh, Hock Leong <hock.leong.kweh@intel.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 628185cfdd ]
Shahar reported a soft lockup in tc_classify(), where we run into an
endless loop when walking the classifier chain due to tp->next == tp
which is a state we should never run into. The issue only seems to
trigger under load in the tc control path.
What happens is that in tc_ctl_tfilter(), thread A allocates a new
tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
with it. In that classifier callback we had to unlock/lock the rtnl
mutex and returned with -EAGAIN. One reason why we need to drop there
is, for example, that we need to request an action module to be loaded.
This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning
after we loaded and found the requested action, we need to redo the
whole request so we don't race against others. While we had to unlock
rtnl in that time, thread B's request was processed next on that CPU.
Thread B added a new tp instance successfully to the classifier chain.
When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN
and destroying its tp instance which never got linked, we goto replay
and redo A's request.
This time when walking the classifier chain in tc_ctl_tfilter() for
checking for existing tp instances we had a priority match and found
the tp instance that was created and linked by thread B. Now calling
again into tp->ops->change() with that tp was successful and returned
without error.
tp_created was never cleared in the second round, thus kernel thinks
that we need to link it into the classifier chain (once again). tp and
*back point to the same object due to the match we had earlier on. Thus
for thread B's already public tp, we reset tp->next to tp itself and
link it into the chain, which eventually causes the mentioned endless
loop in tc_classify() once a packet hits the data path.
Fix is to clear tp_created at the beginning of each request, also when
we replay it. On the paths that can cause -EAGAIN we already destroy
the original tp instance we had and on replay we really need to start
from scratch. It seems that this issue was first introduced in commit
12186be7d2 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
and avoid kernel panic when we use cls_cgroup").
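A minimal user-space sketch of the replay pattern the fix enforces (all
names and the error value are stand-ins, not the kernel code): any
"I created this instance" state must be cleared at the start of every
attempt, including replays.

#include <stdbool.h>
#include <stdio.h>

#define SIM_EAGAIN 11  /* stand-in for -EAGAIN */

/* Illustrative only: the first attempt asks for a replay, the second
 * succeeds by matching an instance someone else linked in the meantime. */
static int try_request(int attempt, bool *created)
{
    if (attempt == 0) {
        *created = true;     /* allocated a fresh instance... */
        return -SIM_EAGAIN;  /* ...but it is destroyed and we must replay */
    }
    return 0;                /* replay matches an existing, linked instance */
}

int main(void)
{
    bool tp_created;
    int attempt = 0, err;

replay:
    tp_created = false;      /* the fix: reset on the first pass AND on replay */
    err = try_request(attempt++, &tp_created);
    if (err == -SIM_EAGAIN)
        goto replay;

    /* Without the reset above, tp_created would still be true here and the
     * caller would re-link an object that is already on the chain. */
    printf("tp_created=%d\n", tp_created);
    return 0;
}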
Fixes: 12186be7d2 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
Reported-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 39b2dd765e ]
Socket cmsg IP(V6)_RECVORIGDSTADDR checks that port range lies within
the packet. For sockets that have transport headers pulled, transport
offset can be negative. Use signed comparison to avoid overflow.
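The overflow is a plain C conversion pitfall and can be shown in isolation
(the values below are illustrative, not taken from the kernel code): once a
possibly negative offset is compared against an unsigned length, the usual
arithmetic conversions turn a small negative sum into a huge unsigned value.

#include <stdio.h>

int main(void)
{
    int off = -8;            /* transport header already pulled */
    unsigned int len = 100;  /* bytes left in the packet */

    /* Buggy pattern: off + 4 is -4, which the conversions turn into a
     * huge unsigned value, so the bounds check wrongly trips. */
    if (off + 4 > len)
        printf("unsigned compare: rejected (bug)\n");

    /* Fixed pattern: keep both sides signed. */
    if (off + 4 > (int)len)
        printf("signed compare: rejected\n");
    else
        printf("signed compare: within the packet\n");
    return 0;
}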
Fixes: e6afc8ace6 ("udp: remove headers from UDP packets before queueing")
Reported-by: Nisar Jagabar <njagabar@cloudmark.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 08abb79542 ]
Prior to this patch, sctp_transport_lookup_process didn't rcu_read_unlock
when it failed to find a transport by sctp_addrs_lookup_transport.
This patch is to fix it by moving up rcu_read_unlock right before checking
transport and also to remove the out path.
Fixes: 1cceda7849 ("sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit eb63ecc170 ]
Locally originated traffic in a VRF fails in the presence of a POSTROUTING
rule. For example,
$ iptables -t nat -A POSTROUTING -s 11.1.1.0/24 -j MASQUERADE
$ ping -I red -c1 11.1.1.3
ping: Warning: source address might be selected on device other than red.
PING 11.1.1.3 (11.1.1.3) from 11.1.1.2 red: 56(84) bytes of data.
ping: sendmsg: Operation not permitted
Worse, the above causes random corruption resulting in a panic in random
places (I have not seen a consistent backtrace).
Call nf_reset to drop the conntrack info following the pass through the
VRF device. The nf_reset is needed on Tx but not Rx because of the order
in which NF_HOOKs are hit: on Rx the VRF device is after the real ingress
device and on Tx it is before the real egress device. Connection
tracking should be tied to the real egress device and not the VRF device.
Fixes: 8f58336d3f ("net: Add ethernet header for pass through VRF device")
Fixes: 35402e3136 ("net: Add IPv6 support to VRF device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a0f37efa82 ]
Connection tracking with VRF is broken because the pass through the VRF
device drops the connection tracking info. Removing the call to nf_reset
allows DNAT and MASQUERADE to work across interfaces within a VRF.
Fixes: 73e20b761a ("net: vrf: Add support for PREROUTING rules on vrf device")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eaa496ffaa upstream.
ep->mult is supposed to be set to the Isochronous and
Interrupt Endpoint's multiplier value. This value
is computed from different places depending on the
link speed.
If we're dealing with HighSpeed, then it's part of
bits [12:11] of wMaxPacketSize. This case wasn't
taken into consideration before.
While at it, also make sure that ep->mult defaults
to one so drivers can use it unconditionally and
assume they'll never multiply ep->maxpacket by zero.
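For reference, a hedged sketch of the high-speed decoding described above
(not the gadget core's exact code): per the USB 2.0 descriptor layout,
bits 10:0 of wMaxPacketSize carry the packet size and bits 12:11 carry the
additional transactions per microframe, so the multiplier is that field
plus one.

#include <stdint.h>

/* Illustrative decode of a high-speed wMaxPacketSize value. */
static unsigned int hs_ep_mult(uint16_t wMaxPacketSize)
{
    return ((wMaxPacketSize >> 11) & 0x3) + 1;  /* never less than 1 */
}

static unsigned int hs_ep_maxpacket(uint16_t wMaxPacketSize)
{
    return wMaxPacketSize & 0x7ff;
}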
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a1b7a4dea6 upstream.
There is a race window between write_cache_pages calling
clear_page_dirty_for_io and XFS calling set_page_writeback, in which
the mapping for an inode is tagged neither as dirty, nor as writeback.
If the COW shrinker hits in exactly that window we'll remove the delayed
COW extents while writepages is trying to write them back, which in release
kernels will manifest as corruption of the bmap btree, and in debug
kernels will trip the ASSERT about now calling xfs_bmapi_write with the
COWFORK flag for holes. A complex customer load manages to hit this
window fairly reliably, probably by always having COW writeback in flight
while the cow shrinker runs.
This patch adds another check for having the I_DIRTY_PAGES flag set,
which is still set during this race window. While this fixes the problem
I'm still not overly happy about the way the COW shrinker works as it
still seems a bit fragile.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 20e73b000b upstream.
We need to use the actual AG length when making per-AG reservations,
since we could otherwise end up reserving more blocks out of the last
AG than there are actual blocks.
Complained-about-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7a21272b08 upstream.
Dan Carpenter reported a double-free of rcur if _defer_finish fails
while we're recovering CUI items. Fix the error recovery to prevent
this.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e1d23370e upstream.
When we create a new attribute, we first create a shortform
attribute, and try to fit the new attribute into it.
If that fails, we copy the (empty) attribute into a leaf attribute,
and do the copy again. Thus there can be a transient state where
we have an empty leaf attribute.
If we encounter this during log replay, the verifier will fail.
So add a test to ignore this part of the leaf attr verification
during log replay.
Thanks as usual to dchinner for spotting the problem.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bb33a9870 upstream.
After various discussions on linux-fsdevel, it has been decided that it
is not necessary to cap the length of a dedupe request, and that
correctly-written userspace client programs will be able to absorb the
change. Therefore, remove the length clamping behavior.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef388e2054 upstream.
The on-disk field di_size is used to set i_size, which is a signed
integer of loff_t. If the high bit of di_size is set, we'll end up with
a negative i_size, which will cause all sorts of problems. Since the
VFS won't let us create a file with such length, we should catch them
here in the verifier too.
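A stand-alone illustration of the failure mode (not the verifier code, and
assuming the usual two's-complement representation): loff_t is a signed
64-bit type, so an on-disk di_size with bit 63 set turns into a negative
i_size, which is why the verifier should reject it.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t di_size = 0x8000000000000001ULL;  /* corrupt: high bit set */
    int64_t i_size = (int64_t)di_size;         /* what loff_t would see */

    printf("i_size = %lld\n", (long long)i_size);  /* negative */

    if (di_size >> 63)
        printf("verifier: reject inode, di_size has the sign bit set\n");
    return 0;
}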
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f352f8ee8 upstream.
We shouldn't assert if somehow we end up trying to add an attr fork to
an inode that apparently already has attr extents because this is an
indication of on-disk corruption. Instead, return an error code to
userspace.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96a3aefb8f upstream.
In xfs_dir3_data_read, we can encounter the situation where err == 0 and
*bpp == NULL if the given bno offset happens to be a hole; this leads to
a crash if we try to set the buffer type after the _da_read_buf call.
Holes can happen due to corrupt or malicious entries in the bmbt data,
so be a little more careful when we're handling buffers.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 356a322522 upstream.
When reading into memory all extents of a btree-format inode fork,
complain if the number of extents we find is not the same as the number
of extents reported in the inode core. This is needed to stop an IO
action from accessing the garbage areas of the in-core fork.
[dchinner: removed redundant assert]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2a047f31e upstream.
There is no such thing as a zero-level AG btree since even a single-node
zero-records btree has one level. Btree cursor constructors read
cur_nlevels straight from disk and then access things like
cur_bufs[cur_nlevels - 1] which is /really/ bad if cur_nlevels is zero!
Therefore, strengthen the verifiers to prevent this possibility.
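A hedged sketch of the strengthened bound (the limit and its source are
assumptions, not the patch): the level read from disk must be at least 1
and no larger than the filesystem maximum, otherwise expressions like
cur_bufs[cur_nlevels - 1] index before the start of the array.

#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: reject impossible btree levels before they are used. */
static bool ag_btree_level_ok(uint32_t level, uint32_t max_levels)
{
    return level >= 1 && level <= max_levels;
}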
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c44a1f2262 upstream.
By inspection, xfs_bmap_trace_exlist isn't handling cow forks,
and will trace the data fork instead.
Fix this by setting state appropriately if whichfork
== XFS_COW_FORK.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7710517fc3 upstream.
When xfs_bmap_trace_exlist called trace_xfs_extlist,
it sent in the "whichfork" var instead of the bmap "state"
as expected (even though state was already set up for this
purpose).
As a result, the xfs_bmap_class in tracing code used
"whichfork" not state in xfs_iext_state_to_fork(), and got
the wrong ifork pointer. It all goes downhill from
there, including an ASSERT when ifp->if_bytes is empty
by the time it reaches xfs_iext_get_ext():
XFS: Assertion failed: idx < ifp->if_bytes / sizeof(xfs_bmbt_rec_t)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 200237d674 upstream.
We've missed properly setting the buffer type for
an AGI transaction in 3 spots now, so just move it
into xfs_read_agi() and set it if we are in a transaction
to avoid the problem in the future.
This is similar to how it is done in i.e. the dir3
and attr3 read functions.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f782088c9e upstream.
xfs_file_iomap_begin_delay() implements post-eof speculative
preallocation by extending the block count of the requested delayed
allocation. Now that xfs_bmapi_reserve_delalloc() has been updated to
handle prealloc blocks separately and tag the inode, update
xfs_file_iomap_begin_delay() to use the new parameter and rely on the
former to tag the inode.
Note that this patch does not change behavior.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0260d8ff5f upstream.
COW fork reservation is implemented via delayed allocation. The code is
modeled after the traditional delalloc allocation code, but is slightly
different in terms of how preallocation occurs. Rather than post-eof
speculative preallocation, COW fork preallocation is implemented via a
COW extent size hint that is designed to minimize fragmentation as a
reflinked file is split over time.
xfs_reflink_reserve_cow() still uses logic that is oriented towards
dealing with post-eof speculative preallocation, however, and is stale
or not necessarily correct. First, the EOF alignment to the COW extent
size hint is implemented in xfs_bmapi_reserve_delalloc() (which does so
correctly by aligning the start and end offsets) and so is not necessary
in xfs_reflink_reserve_cow(). The backoff and retry logic on ENOSPC is
also ineffective for the same reason, as xfs_bmapi_reserve_delalloc()
will simply perform the same allocation request on the retry. Finally,
since the COW extent size hint aligns the start and end offset of the
range to allocate, the end_fsb != orig_end_fsb logic is not sufficient.
Indeed, if a write request happens to end on an aligned offset, it is
possible that we do not tag the inode for COW preallocation even though
xfs_bmapi_reserve_delalloc() may have preallocated at the start offset.
Kill the unnecessary, duplicate code in xfs_reflink_reserve_cow().
Remove the inode tag logic as well since xfs_bmapi_reserve_delalloc()
has been updated to tag the inode correctly.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 974ae922ef upstream.
Speculative preallocation is currently processed entirely by the callers
of xfs_bmapi_reserve_delalloc(). The caller determines how much
preallocation to include, adjusts the extent length and passes down the
resulting request.
While this works fine for post-eof speculative preallocation, it is not
as reliable for COW fork preallocation. COW fork preallocation is
implemented via the cowextszhint, which aligns the start offset as well
as the length of the extent. Further, it is difficult for the caller to
accurately identify when preallocation occurs because the returned
extent could have been merged with neighboring extents in the fork.
To simplify this situation and facilitate further COW fork preallocation
enhancements, update xfs_bmapi_reserve_delalloc() to take a separate
preallocation parameter to incorporate into the allocation request. The
preallocation blocks value is tacked onto the end of the request and
adjusted to accommodate neighboring extents and extent size limits.
Since xfs_bmapi_reserve_delalloc() now knows precisely how much
preallocation was included in the allocation, it can also tag the inodes
appropriately to support preallocation reclaim.
Note that xfs_bmapi_reserve_delalloc() callers are not yet updated to
use the preallocation mechanism. This patch should not change behavior
outside of correctly tagging reflink inodes when start offset
preallocation occurs (which the caller does not handle correctly).
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65c5f41978 upstream.
We can easily lookup the previous extent for the cases where we need it,
which saves the callers from looking it up for us later in the series.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fba3e594ef upstream.
It turns out that btrfs and xfs had differing interpretations of what
to do when the dedupe length is zero. Change xfs to follow btrfs'
semantics so that the userland interface is consistent.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fd26a88093 upstream.
When we're estimating the amount of space it's going to take to satisfy
a delalloc reservation, we need to include the space that we might need
to grow the rmapbt. This helps us to avoid running out of space later
when _iomap_write_allocate needs more space than we reserved. Eryu Guan
observed this happening on generic/224 when sunit/swidth were set.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 93533c7855 upstream.
xfs_iext_lookup_extent looks up a single extent at the passed in offset,
and returns the extent covering the area, or the one behind it in case
of a hole, along with the index of the returned extent in an argument,
and a simple bool as return value that is set to false if no
extent could be found because the offset is behind EOF. It is a simpler
replacement for xfs_bmap_search_extent that leaves looking up the rarely
needed previous extent to the caller and has a nicer calling convention.
xfs_iext_get_extent is a helper for iterating over the extent list,
it takes an extent index as input, and returns the extent at that index
in its expanded form in an argument if it exists. The actual return
value is a bool indicating whether the index is valid or not.
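Read as prototypes, the calling convention described above looks roughly
like the following sketch; the argument order and the stand-in typedefs are
assumptions based on this text, not copied from the patch.

#include <stdbool.h>
#include <stdint.h>

/* Stand-in types for illustration; the real definitions live in XFS. */
typedef uint64_t xfs_fileoff_t;
typedef int32_t  xfs_extnum_t;
struct xfs_inode;
struct xfs_ifork;
struct xfs_bmbt_irec;

/* Lookup: returns false if the offset is behind EOF. */
bool xfs_iext_lookup_extent(struct xfs_inode *ip, struct xfs_ifork *ifp,
                            xfs_fileoff_t offset, xfs_extnum_t *idx,
                            struct xfs_bmbt_irec *got);

/* Iteration helper: returns whether the index is valid. */
bool xfs_iext_get_extent(struct xfs_ifork *ifp, xfs_extnum_t idx,
                         struct xfs_bmbt_irec *got);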
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 98efe8af1c upstream.
Filesystem shutdown testing on an older distro kernel has uncovered an
imbalanced locking pattern for the inode flush lock in
xfs_reclaim_inode(). Specifically, there is a double unlock sequence
between the call to xfs_iflush_abort() and xfs_reclaim_inode() at the
"reclaim:" label.
This actually does not cause obvious problems on current kernels due to
the current flush lock implementation. Older kernels use a counting
based flush lock mechanism, however, which effectively breaks the lock
indefinitely when an already unlocked flush lock is repeatedly unlocked.
Though this only currently occurs on filesystem shutdown, it has
reproduced the effect of elevating an fs shutdown to a system-wide crash
or hang.
As it turns out, the flush lock is not actually required for the reclaim
logic in xfs_reclaim_inode() because by that time we have already cycled
the flush lock once while holding ILOCK_EXCL. Therefore, remove the
additional flush lock/unlock cycle around the 'reclaim:' label and
update branches into this label to release the flush lock where
appropriate. Add an assert to xfs_ifunlock() to help prevent future
occurrences of the same problem.
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d829300be upstream.
The open-coded pattern:
ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t)
is all over the xfs code; provide a new helper
xfs_iext_count(ifp) to count the number of inline extents
in an inode fork.
[dchinner: pick up several missed conversions]
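A sketch of what such a helper boils down to, using stand-in types for
illustration (the real helper operates on struct xfs_ifork and
xfs_bmbt_rec_t as defined by XFS):

#include <stdint.h>

/* Stand-in types for illustration only. */
typedef struct { uint64_t l0, l1; } bmbt_rec_stub_t;  /* 16-byte record */
struct ifork_stub { int if_bytes; };

/* Count the inline extents, per the open-coded pattern quoted above. */
static unsigned int iext_count_sketch(const struct ifork_stub *ifp)
{
    return ifp->if_bytes / (unsigned int)sizeof(bmbt_rec_stub_t);
}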
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04197b341f upstream.
We've had reports of generic/095 causing XFS to BUG() in
__xfs_get_blocks() due to the existence of delalloc blocks on a
direct I/O read. generic/095 issues a mix of various types of I/O,
including direct and memory mapped I/O to a single file. This is
clearly not supported behavior and is known to lead to such
problems. E.g., the lack of exclusion between the direct I/O and
write fault paths means that a write fault can allocate delalloc
blocks in a region of a file that was previously a hole after the
direct read has attempted to flush/inval the file range, but before
it actually reads the block mapping. In turn, the direct read
discovers a delalloc extent and cannot proceed.
While the appropriate solution here is to not mix direct and memory
mapped I/O to the same regions of the same file, the current
BUG_ON() behavior is probably overkill as it can crash the entire
system. Instead, localize the failure to the I/O in question by
returning an error for a direct I/O that cannot be handled safely
due to delalloc blocks. Be careful to allow the case of a direct
write to post-eof delalloc blocks. This can occur due to speculative
preallocation and is safe as post-eof blocks are not accompanied by
dirty pages in pagecache (conversely, preallocation within eof must
have been zeroed, and thus dirtied, before the inode size could have
been increased beyond said blocks).
Finally, provide an additional warning if a direct I/O write occurs
while the file is memory mapped. This may not catch all problematic
scenarios, but provides a hint that some known-to-be-problematic I/O
methods are in use.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 399372349a upstream.
The cowblocks background scanner currently clears the cowblocks tag
for inodes without any real allocations in the cow fork. This
excludes inodes with only delalloc blocks in the cow fork. While we
might never expect to clear delalloc blocks from the cow fork in the
background scanner, it is not necessarily correct to clear the
cowblocks tag from such inodes.
For example, if the background scanner happens to process an inode
between a buffered write and writeback, the scanner catches the
inode in a state after delalloc blocks have been allocated to the
cow fork but before the delalloc blocks have been converted to real
blocks by writeback. The background scanner then incorrectly clears
the cowblocks tag, even if part of the aforementioned delalloc
reservation will not be remapped to the data fork (i.e., extra
blocks due to the cowextsize hint). This means that any such
additional blocks in the cow fork might never be reclaimed by the
background scanner and could persist until the inode itself is
reclaimed.
To address this problem, only skip and clear inodes without any cow
fork allocations whatsoever from the background scanner. While we
generally do not want to cancel delalloc reservations from the
background scanner, the pagecache dirty check following the
cowblocks check should prevent that situation. If we do end up with
delalloc cow fork blocks without a dirty address space mapping, this
is probably an indication that something has gone wrong and the
blocks should be reclaimed, as they may never be converted to a real
allocation.
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6fc6fcf44 upstream.
Source xfsprogs commit: ee3754254e8c186c99b6cdd4d59f741759d04acb
Kernel commit 5ef828c4 ("xfs: avoid false quotacheck after unclean
shutdown") made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
by default.
However, when this was merged to libxfs, existing separate
calls to libxfs_sb_quota_from_disk remained, and calling it
twice in a row on a V4 superblock leads to issues, because:
	if (sbp->sb_qflags & XFS_PQUOTA_ACCT) {
		...
		sbp->sb_pquotino = sbp->sb_gquotino;
		sbp->sb_gquotino = NULLFSINO;
and after the second call, we have set both pquotino and gquotino
to NULLFSINO.
Fix this by making it safe to call twice, and also remove the extra
calls to libxfs_sb_quota_from_disk.
This is only spotted when running xfstests with "-m crc=0" because
the sb_from_disk change came about after V5 became default, and
the above behavior only exists on a V4 superblock.
Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 26a137e31f upstream.
If the TPM we're connecting to uses a static burst count, it will report
a burst count of zero throughout the response read. However, get_burstcount
assumes that a burst count of zero indicates that the TPM is not ready to
receive more data. In this case, it returns a negative error code, which
is passed on to tpm_tis_{write,read}_bytes as a u16, causing
them to read/write far too many bytes.
This patch checks for negative return codes and bails out from recv_data
and tpm_tis_send_data.
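The truncation is easy to see in isolation (values are examples, not taken
from the driver): a negative errno assigned to a u16 becomes a very large
byte count.

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int burstcnt = -5;            /* e.g. a negative errno from get_burstcount() */
    uint16_t to_read = burstcnt;  /* the implicit conversion the fix guards against */

    printf("requested transfer size: %u bytes\n", (unsigned int)to_read); /* 65531 */
    if (burstcnt < 0)
        printf("fix: return %d instead of transferring\n", burstcnt);
    return 0;
}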
Fixes: 1107d065fd (tpm_tis: Introduce intermediate layer for TPM access)
Signed-off-by: Josh Zimmerman <joshz@google.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4349bd775c upstream.
We were storing viewport relative coordinates for AVIVO/DCE display
engines. However, radeon_crtc_cursor_set2 and radeon_cursor_reset pass
radeon_crtc->cursor_x/y as the x/y parameters of
radeon_cursor_move_locked, which would break if the CRTC isn't located
at (0, 0).
Cc: stable@vger.kernel.org
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b7df3ce92 upstream.
__s390_dma_map_sg maps a dma-contiguous area. Although we only map
whole pages we have to take into account that the area doesn't start
or stop at a page boundary because we use the dma address to loop
over the individual sg entries. Failing to do that might lead to an
access of the wrong sg entry.
Fixes: ee877b81c6 ("s390/pci_dma: improve map_sg")
Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ebb299a510 upstream.
The s390 specific sched_domain_topology_level should always be used,
not only if the machine provides topology information. Luckily this
odd behaviour, which was introduced by accident with git commit
d05d15da18 ("s390/topology: delay initialization of topology cpu
masks"), currently has no side effect.
Fixes: d05d15da18 ("s390/topology: delay initialization of topology cpumasks")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 99e5cde5ea upstream.
Make sure to drop any device reference taken by vio_find_node() when
adding and removing virtual I/O slots.
Fixes: 5eeb8c63a3 ("[PATCH] PCI Hotplug: rpaphp: Move VIO registration")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c7de2b4ff upstream.
There is at least one Chelsio 10Gb card which uses VPD area to store some
non-standard blocks (example below). However pci_vpd_size() returns the
length of the first block only assuming that there can be only one VPD "End
Tag".
Since 4e1a635552 ("vfio/pci: Use kernel VPD access functions"), VFIO
blocks access beyond that offset, which prevents the guest "cxgb3" driver
from probing the device. The host system does not have this problem as its
driver accesses the config space directly without pci_read_vpd().
Add a quirk to override the VPD size to a bigger value. The maximum size
is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h.
We do not read the tag as the cxgb3 driver does, since that driver supports
writing to EEPROM/VPD and, when it writes, it only checks for the 8192-byte
boundary. The quirk is registered for all devices supported by the cxgb3
driver.
This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3
driver itself accesses VPD directly and the problem only exists with the
vfio-pci driver (when cxgb3 is not running on the host and may not be even
loaded) which blocks accesses beyond the first block of VPD data. However
vfio-pci itself does not have quirks mechanism so we add it to PCI.
This is the controller:
Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030]
This is what I parsed from its VPD:
===
b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K'
0000 Large item 42 bytes; name 0x2 Identifier String
b'10 Gigabit Ethernet-SR PCI Express Adapter'
002d Large item 74 bytes; name 0x10
#00 [EC] len=7: b'D76809 '
#0a [FN] len=7: b'46K7897'
#14 [PN] len=7: b'46K7897'
#1e [MN] len=4: b'1037'
#25 [FC] len=4: b'5769'
#2c [SN] len=12: b'YL102035603V'
#3b [NA] len=12: b'00145E992ED1'
007a Small item 1 bytes; name 0xf End Tag
0c00 Large item 16 bytes; name 0x2 Identifier String
b'S310E-SR-X '
0c13 Large item 234 bytes; name 0x10
#00 [PN] len=16: b'TBD '
#13 [EC] len=16: b'110107730D2 '
#26 [SN] len=16: b'97YL102035603V '
#39 [NA] len=12: b'00145E992ED1'
#48 [V0] len=6: b'175000'
#51 [V1] len=6: b'266666'
#5a [V2] len=6: b'266666'
#63 [V3] len=6: b'2000 '
#6c [V4] len=2: b'1 '
#71 [V5] len=6: b'c2 '
#7a [V6] len=6: b'0 '
#83 [V7] len=2: b'1 '
#88 [V8] len=2: b'0 '
#8d [V9] len=2: b'0 '
#92 [VA] len=2: b'0 '
#97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
0d00 Large item 252 bytes; name 0x11
#00 [VC] len=16: b'122310_1222 dp '
#13 [VD] len=16: b'610-0001-00 H1\x00\x00'
#26 [VE] len=16: b'122310_1353 fp '
#39 [VF] len=16: b'610-0001-00 H1\x00\x00'
#4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'...
0dff Small item 0 bytes; name 0xf End Tag
10f3 Large item 13315 bytes; name 0x62
!!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00'
===
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1600f62534 upstream.
Mellanox devices were marked as having INTx masking ability broken. As a
result, the VFIO driver fails to start when more than one device function
is passed-through to a VM if both have the same INTx pin.
Prior to Connect-IB, Mellanox devices exposed a single PCI function for all
ports to the operating system. Starting from Connect-IB, the devices are
function-per-port. When passing the second function to a VM, VFIO will
fail to start.
Exclude ConnectX-4, ConnectX4-Lx and Connect-IB from the list of Mellanox
devices marked as having broken INTx masking:
- ConnectX-4 and ConnectX4-LX firmware version is checked. If INTx
masking is supported, we unmark the broken INTx masking.
- Connect-IB does not support INTx currently so will not cause any
problem.
[bhelgaas: call pci_disable_device() always, after iounmap()]
Fixes: 11e42532ad ("PCI: Assume all Mellanox devices have broken INTx masking")
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b88214ce4d upstream.
Convert all quirk_broken_intx_masking() quirks from HEADER to FINAL.
The quirk sets dev->broken_intx_masking, which is only used by
pci_intx_mask_supported(), which is not needed until after FINAL
quirks have been run.
[bhelgaas: changelog]
Signed-off-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a45e2611b9 upstream.
We're trying to mask out bits [23:8] while retaining [31:24, 7:0], but we're
doing the inverse. That doesn't have too much effect, since we're setting
all the [23:8] bits to 1, and the other bits are only relevant for modes
we're currently not using. But we should get this right.
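In plain C terms, "mask out bits [23:8] while retaining the rest" means
clearing that field before ORing in the new value; a hedged sketch with
made-up macro names (the real register layout lives in the driver):

#include <stdint.h>

#define FTS_SHIFT 8
#define FTS_FIELD 0x00ffff00u  /* bits [23:8] */

/* Replace only the FTS field, keeping bits [31:24] and [7:0] untouched. */
static uint32_t set_fts(uint32_t reg, uint32_t fts)
{
    return (reg & ~FTS_FIELD) | ((fts << FTS_SHIFT) & FTS_FIELD);
}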
Fixes: ca19890840 ("PCI: rockchip: Fix wrong transmitted FTS count")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45e9320f3a upstream.
The calculation of negotiated lanes is wrong: it should be shifted by
PCIE_CORE_PL_CONF_LANE_SHIFT, but it is shifted by
PCIE_CORE_PL_CONF_LANE_MASK instead. Let's fix it.
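The difference between the two expressions, sketched with illustrative
constants (the real shift and mask values live in the driver):

#include <stdint.h>

#define LANE_SHIFT 20   /* illustrative values, not the real ones */
#define LANE_MASK  0x3u

static unsigned int negotiated_lanes(uint32_t link_status)
{
    /* Buggy pattern being fixed: (link_status >> LANE_MASK) & LANE_MASK
     * shifts by the mask value instead of by the field's bit position. */
    return (link_status >> LANE_SHIFT) & LANE_MASK;
}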
Fixes: e77f847df5 ("PCI: rockchip: Add Rockchip PCIe controller support")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 230436b3ef upstream.
gcc is unsure about the use of last_ofs_in_node, which might happen
without a prior initialization:
fs/f2fs//git/arm-soc/fs/f2fs/data.c: In function ‘f2fs_map_blocks’:
fs/f2fs/data.c:799:54: warning: ‘last_ofs_in_node’ may be used uninitialized in this function [-Wmaybe-uninitialized]
if (prealloc && dn.ofs_in_node != last_ofs_in_node + 1) {
As pointed out by Chao Yu, the code is actually correct as 'prealloc'
is only set if last_ofs_in_node has been set; the two always
get updated together.
This initializes last_ofs_in_node to dn.ofs_in_node for each
new dnode at the start of the 'next_block' loop, which at that
point is a correct initialization as well. I assume that compilers
that correctly track the contents of the variables and do not
warn about the condition also figure out that they can eliminate
the extra assignment here.
Fixes: 46008c6d42 ("f2fs: support in batch multi blocks preallocation")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35782b233f upstream.
This patch removes percpu_count usage due to performance regression in iozone.
Fixes: 523be8a6b3 ("f2fs: use percpu_counter for page counters")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2342ca832 upstream.
md_open() gets a counted reference on an mddev using mddev_find().
If it ends up returning an error, it must drop this reference.
There are two error paths where the reference is not dropped.
One only happens if the process is signalled at an awkward time,
which is quite unlikely.
The other was introduced recently in commit af8d8e6f0.
Change the code to ensure we drop the reference when returning an error,
and make it harder to re-introduce this sort of bug in the future.
Reported-by: Marc Smith <marc.smith@mcc.edu>
Fixes: af8d8e6f03 ("md: changes for MD_STILL_CLOSED flag")
Signed-off-by: NeilBrown <neilb@suse.com>
Acked-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1803b9a52c upstream.
The core AES cipher implementation that uses ARMv8 Crypto Extensions
instructions erroneously loads the round keys as 64-bit quantities,
which causes the algorithm to fail when built for big endian. In
addition, the key schedule generation routine fails to take endianness
into account as well, when loading and combining the input key with
the round constants. So fix both issues.
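The underlying issue in this and the following endianness fixes is generic:
a key schedule stored as an array of 32-bit words has a different in-memory
byte layout on big endian, so it must be (re)loaded as 32-bit words rather
than as raw 64-bit or byte-wise blobs. A small host-endianness illustration
in plain C (not the crypto code):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t round_key[2] = { 0x01020304, 0x05060708 };
    uint8_t raw[8];

    /* The in-memory byte order of these 32-bit words depends on the host:
     * LE: 04 03 02 01 08 07 06 05,  BE: 01 02 03 04 05 06 07 08.
     * Code that reloads them as one 64-bit quantity bakes in one of the
     * two layouts and breaks on the other. */
    memcpy(raw, round_key, sizeof(raw));
    for (int i = 0; i < 8; i++)
        printf("%02x ", raw[i]);
    printf("\n");
    return 0;
}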
Fixes: 12ac3efe74 ("arm64/crypto: use crypto instructions to generate AES key schedule")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit caf4b9e2b3 upstream.
Emit the XTS tweak literal constants in the appropriate order for a
single 128-bit scalar literal load.
Fixes: 49788fe2a1 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee71e5f1e7 upstream.
The SHA1 digest is an array of 5 32-bit quantities, so we should refer
to them as such in order for this code to work correctly when built for
big endian. So replace 16 byte scalar loads and stores with 4x4 vector
ones where appropriate.
Fixes: 2c98833a42 ("arm64/crypto: SHA-1 using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a2c435cc99 upstream.
The AES implementation using pure NEON instructions relies on the generic
AES key schedule generation routines, which store the round keys as arrays
of 32-bit quantities stored in memory using native endianness. This means
we should refer to these round keys using 4x4 loads rather than 16x1 loads.
In addition, the ShiftRows tables are loaded using a single scalar load,
which is also affected by endianness, so emit these tables in the correct
order depending on whether we are building for big endian or not.
Fixes: 49788fe2a1 ("arm64/crypto: AES-ECB/CBC/CTR/XTS using ARMv8 NEON and Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56e4e76c68 upstream.
The AES-CCM implementation that uses ARMv8 Crypto Extensions instructions
refers to the AES round keys as pairs of 64-bit quantities, which causes
failures when building the code for big endian. In addition, it byte swaps
the input counter unconditionally, while this is only required for little
endian builds. So fix both issues.
Fixes: 12ac3efe74 ("arm64/crypto: use crypto instructions to generate AES key schedule")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58010fa6f7 upstream.
The AES key schedule generation is mostly endian agnostic, with the
exception of the rotation and the incorporation of the round constant
at the start of each round. So implement a big endian specific version
of that part to make the whole routine big endian compatible.
Fixes: 86464859cc ("crypto: arm - AES in ECB/CBC/CTR/XTS modes using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c433ad508 upstream.
The GHASH key and digest are both pairs of 64-bit quantities, but the
GHASH code does not always refer to them as such, causing failures when
built for big endian. So replace the 16x1 loads and stores with 2x8 ones.
Fixes: b913a6404c ("arm64/crypto: improve performance of GHASH algorithm")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 174122c39c upstream.
The SHA256 digest is an array of 8 32-bit quantities, so we should refer
to them as such in order for this code to work correctly when built for
big endian. So replace 16 byte scalar loads and stores with 4x32 vector
ones where appropriate.
Fixes: 6ba6c74dfc ("arm64/crypto: SHA-224/SHA-256 using ARMv8 Crypto Extensions")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6afcf8ef0c upstream.
Since commit bda807d444 ("mm: migrate: support non-lru movable page
migration") isolate_migratepages_block) can isolate !PageLRU pages which
would acct_isolated account as NR_ISOLATED_*. Accounting these non-lru
pages NR_ISOLATED_{ANON,FILE} doesn't make any sense and it can misguide
heuristics based on those counters such as pgdat_reclaimable_pages resp.
too_many_isolated which would lead to unexpected stalls during the
direct reclaim without any good reason. Note that
__alloc_contig_migrate_range can isolate a lot of pages at once.
On mobile devices such as 512M ram android Phone, it may use a big zram
swap. In some cases zram (zsmalloc) uses too many non-lru but
migratable pages, such as:
MemTotal: 468148 kB
Normal free:5620kB
Free swap:4736kB
Total swap:409596kB
ZRAM: 164616kB(zsmalloc non-lru pages)
active_anon:60700kB
inactive_anon:60744kB
active_file:34420kB
inactive_file:37532kB
Fix this by only accounting lru pages to NR_ISOLATED_* in
isolate_migratepages_block right after they were isolated and we still
know they were on LRU. Drop acct_isolated because it is called after
the fact and we've lost that information. Batching per-cpu counter
doesn't make much improvement anyway. Also make sure that we uncharge
only LRU pages when putting them back on the LRU in
putback_movable_pages resp. when unmap_and_move migrates the page.
[mhocko@suse.com: replace acct_isolated() with direct counting]
Fixes: bda807d444 ("mm: migrate: support non-lru movable page migration")
Link: http://lkml.kernel.org/r/20161019080240.9682-1-mhocko@kernel.org
Signed-off-by: Ming Ling <ming.ling@spreadtrum.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91a45f7107 upstream.
Patch series "mm: workingset: radix tree subtleties & single-page file
refaults", v3.
This is another revision of the radix tree / workingset patches based on
feedback from Jan and Kirill.
This is a follow-up to d3798ae8c6 ("mm: filemap: don't plant shadow
entries without radix tree node"). That patch fixed an issue that was
caused mainly by the page cache sneaking special shadow page entries
into the radix tree and relying on subtleties in the radix tree code to
make that work. The fix also had to stop tracking refaults for
single-page files because shadow pages stored as direct pointers in
radix_tree_root->rnode weren't properly handled during tree extension.
These patches make the radix tree code explicitly support and track
such special entries, to eliminate the subtleties and to restore the
thrash detection for single-page files.
This patch (of 9):
When a radix tree iteration drops the tree lock, another thread might
swoop in and free the node holding the current slot. The iteration
needs to do another tree lookup from the current index to continue.
[kirill.shutemov@linux.intel.com: re-lookup for replacement]
Fixes: f3f0e1d215 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Link: http://lkml.kernel.org/r/20161117191138.22769-2-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <mawilcox@linuxonhyperv.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e2a91f4f42 upstream.
PDF build on Kernel 4.9-rc? returns an error with Sphinx 1.3.x
and Sphinx 1.4.x, when trying to solve some cross-references.
The solution is to redefine the \DURole macro.
However, this is redefined too late. Move such redefinition to
LaTeX preamble and bind it to just the Sphinx versions where the
error is known to be present.
Tested by building the documentation on interactive mode:
make PDFLATEX=xelatex -C Documentation/output/./latex
Fixes: e61a39baf7 ("[media] index.rst: Fix LaTeX error in interactive mode on Sphinx 1.4.x")
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3999f52e31 upstream.
We cannot use the pte value used in set_pte_at for pte_same comparison,
because archs like ppc64, filter/add new pte flag in set_pte_at.
Instead fetch the pte value inside hugetlb_cow. We are comparing pte
value to make sure the pte didn't change since we dropped the page table
lock. hugetlb_cow get called with page table lock held, and we can take
a copy of the pte value before we drop the page table lock.
With hugetlbfs, we optimize the MAP_PRIVATE write fault path with no
previous mapping (huge_pte_none entries), by forcing a cow in the fault
path. This avoids taking an additional fault to convert a read-only mapping
to read/write. Here we were comparing a recently instantiated pte (via
set_pte_at) to the pte values from linux page table. As explained above
on ppc64 such pte_same check returned wrong result, resulting in us
taking an additional fault on ppc64.
Fixes: 6a119eae94 ("powerpc/mm: Add a _PAGE_PTE bit")
Link: http://lkml.kernel.org/r/20161018154245.18023-1-aneesh.kumar@linux.vnet.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d74e7ed5d upstream.
qcom_smd_send() should return -EAGAIN for non-blocking channels with
insufficient space, so that we can propagate this event to user space.
Fixes: 53e2822e56 ("rpmsg: Introduce Qualcomm SMD backend")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0af524372 upstream.
Commit 34c3d9819f ("genirq/affinity: Provide smarter irq spreading
infrastructure") introduced a better IRQ spreading mechanism, taking
account of the available NUMA nodes in the machine.
Problem is that the algorithm of retrieving the nodemask iterates
"linearly" based on the number of online nodes - some architectures
present non-linear node distribution among the nodemask, like PowerPC.
If this is the case, the algorithm leads to a wrong node count
and therefore to a bad/incomplete IRQ affinity distribution.
For example, this problem was found in a machine with 128 CPUs and two
nodes, namely nodes 0 and 8 (instead of 0 and 1, if it was linearly
distributed). This led to a wrong affinity distribution which then led to
a bad mq allocation for nvme driver.
Finally, we take the opportunity to fix a comment regarding the affinity
distribution when we have _more_ nodes than vectors.
Fixes: 34c3d9819f ("genirq/affinity: Provide smarter irq spreading infrastructure")
Reported-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Cc: linux-pci@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: hch@lst.de
Link: http://lkml.kernel.org/r/1481738472-2671-1-git-send-email-gpiccoli@linux.vnet.ibm.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bed570307e upstream.
I noticed some wakeirq flakiness with consumer drivers not using
autosuspend. For drivers not using autosuspend, the wakeirq may never
get unmasked in rpm_suspend() because of irq desc->depth.
We are configuring dedicated wakeirqs to start with IRQ_NOAUTOEN as we
naturally don't want them running until rpm_suspend() is called.
However, when a consumer driver initially calls pm_runtime_get(), we
now wrongly start with a disable_irq_nosync() call on the dedicated
wakeirq, which is disabled to begin with.
This causes desc->depth to toggle between 1 and 2 instead of the usual
0 and 1. This can prevent enable_irq() from unmasking the wakeirq as
that only happens at desc->depth 1.
This does not necessarily show up with drivers using autosuspend as
there is time for disable_irq_nosync() before rpm_suspend() gets called
after the autosuspend timeout.
Let's fix the issue by adding wirq->status that lazily gets set on
the first rpm_suspend(). We also need PM runtime core private functions
for dev_pm_enable_wake_irq_check() and dev_pm_disable_wake_irq_check()
so we can enable the dedicated wakeirq on the first rpm_suspend().
While at it, let's also fix the comments for dev_pm_enable_wake_irq()
and dev_pm_disable_wake_irq(). Those can still be used by the consumer
drivers as needed because the IRQ core manages the interrupt usecount
for us.
Fixes: 4990d4fe32 (PM / Wakeirq: Add automated device wake IRQ handling)
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1d111e073 upstream.
If msi_setup_entry() fails to allocate an affinity mask, it logs a message
but continues on and allocates an MSI entry with entry->affinity == NULL.
Check for this case in pci_irq_get_affinity() so we don't try to
dereference a NULL pointer.
[bhelgaas: changelog]
Fixes: ee8d41e53e "pci/msi: Retrieve affinity for a vector"
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a11a18902 upstream.
When the "policy" securityfs file is opened for read, it is opened as a
sequential file. However, when it is eventually released, there is no
cleanup for the sequential file, therefore some memory is leaked.
This patch adds a call to seq_release() in ima_release_policy() to clean up
the memory when the file is opened for read.
Fixes: 80eae209d6 ("IMA: allow reading back the current policy")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Eric Richter <erichte@linux.vnet.ibm.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a91918cd3e upstream.
This iscsit_tpg_add_portal_group() function is only called from
lio_target_tiqn_addtpg(). Both functions free the "tpg" pointer on
error so it's a double free bug. The memory is allocated in the caller
so it should be freed in the caller and not here.
Fixes: e48354ce07 ("iscsi-target: Add iSCSI fabric support for target v4.1")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
[ bvanassche: Added "Fix" at start of patch title ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af15769ffa upstream.
gcc-7 notices that the condition in mvs_94xx_command_active looks
suspicious:
drivers/scsi/mvsas/mv_94xx.c: In function 'mvs_94xx_command_active':
drivers/scsi/mvsas/mv_94xx.c:671:15: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]
This was introduced when the mv_printk() statement got added, and leads
to the condition being ignored. This is probably harmless.
Changing '&&' to '&' makes the code look reasonable, as we check the
command bit before setting and printing it.
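A simplified illustration of the difference (operands are hypothetical):

    if (tmp && (1 << bit))          /* boolean '&&': true whenever tmp != 0; the bit is ignored */
            mv_printk("command active\n");

    if (tmp & (1 << bit))           /* bitwise '&': true only when that command bit is set */
            mv_printk("command active\n");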
Fixes: a4632aae8b ("[SCSI] mvsas: Add new macros and functions")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b93ca43b7 upstream.
When a SW-configurable card is specified but not found, the driver
releases wrong region, causing the following message in kernel log:
Trying to free nonexistent resource <0000000000000000-000000000000000f>
Fix it by assigning base earlier.
Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Fixes: a8cfbcaec0 ("scsi: g_NCR5380: Stop using scsi_module.c")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5faf071d08 upstream.
Unfortunately, I seem to have missed a case where an IRQ safe spinlock was
required, in samsung_i2s_dai_remove, when I fixed up the other calls in
this patch:
316fa9e09a ("ASoC: samsung: Use IRQ safe spin lock calls")
This causes a lockdep warning when unbinding and rebinding the audio card:
[  104.357664]        CPU0                    CPU1
[  104.362174]        ----                    ----
[  104.366692]   lock(&(&pri_dai->spinlock)->rlock);
[  104.371372]                                local_irq_disable();
[  104.377283]                                lock(&(&substream->self_group.lock)->rlock);
[  104.385259]                                lock(&(&pri_dai->spinlock)->rlock);
[  104.392469]   <Interrupt>
[  104.395072]     lock(&(&substream->self_group.lock)->rlock);
[  104.400710]
[  104.400710]  *** DEADLOCK ***
Fixes: ce8bcdbb61 ("ASoC: samsung: i2s: Protect more registers with a spinlock")
Signed-off-by: Charles Keepax <ckeepax@opensource.wolfsonmicro.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a823a17981 upstream.
The cht_bsw_rt5645 driver allocates its own codec_id string but doesn't
release it. For simplicity, put the string in cht_mc_private; then
the string is allocated in one shot and released together with it.
Fixes: c8560b7c91 ("ASoC: cht_bsw_rt5645: Fix writing to string literal")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3b89e4b77e upstream.
A bugfix accidentally removed the implicit initialization of the
dma channel number, causing undefined behavior when
v->alloc_dma_channel is NULL:
sound/soc/qcom/lpass-platform.c: In function ‘lpass_platform_pcmops_open’:
sound/soc/qcom/lpass-platform.c:83:29: error: ‘dma_ch’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
This adds back an explicit initialization to zero, restoring the
previous behavior for that case.
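An illustrative sketch of the restored behaviour (variable and callback
names follow the warning above; the error check is assumed):

    int dma_ch = 0;     /* explicit default when no alloc_dma_channel() callback exists */

    if (v->alloc_dma_channel)
            dma_ch = v->alloc_dma_channel(drvdata, dir);
    if (dma_ch < 0)
            return dma_ch;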
Fixes: 022d00ee0b ("ASoC: lpass-platform: Fix broken pcm data usage")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kenneth Westfield <kwestfie@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit aec0e86172 upstream.
We met the DMAR fault on both hpsa P420i and P421 SmartArray controllers
under kdump; it can be reliably reproduced on several different machines,
and the dmesg log looks like:
HP HPSA Driver (v 3.4.16-0)
hpsa 0000:02:00.0: using doorbell to reset controller
hpsa 0000:02:00.0: board ready after hard reset.
hpsa 0000:02:00.0: Waiting for controller to respond to no-op
DMAR: Setting identity map for device 0000:02:00.0 [0xe8000 - 0xe8fff]
DMAR: Setting identity map for device 0000:02:00.0 [0xf4000 - 0xf4fff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6e000 - 0xbdf6efff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf6f000 - 0xbdf7efff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf7f000 - 0xbdf82fff]
DMAR: Setting identity map for device 0000:02:00.0 [0xbdf83000 - 0xbdf84fff]
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [02:00.0] fault addr fffff000 [fault reason 06] PTE Read access is not set
hpsa 0000:02:00.0: controller message 03:00 timed out
hpsa 0000:02:00.0: no-op failed; re-trying
After some debugging, we found that the fault addr comes from DMA initiated
at the driver probe stage after reset (not in-flight DMA), and the
corresponding pte entry value is correct. The fault is likely due to the old
iommu caches left behind by the in-flight DMA before it.
Thus we need to flush the old caches after the context mapping is set up for
the device, at which point the device is supposed to have finished its reset
in the driver probe stage and no in-flight DMA exists thereafter.
I'm not sure if the hardware is responsible for invalidating all the related
caches previously allocated in the iommu hardware, but that seems not to be
the case for hpsa; actually, many device drivers have problems properly
resetting their hardware. In any case, flushing (again) by software in the
kdump kernel when the device gets context mapped, which is a quite infrequent
operation, does little harm.
With this patch, the problematic machine can survive the kdump tests.
CC: Myron Stowe <myron.stowe@gmail.com>
CC: Joseph Szczypek <jszczype@redhat.com>
CC: Don Brace <don.brace@microsemi.com>
CC: Baoquan He <bhe@redhat.com>
CC: Dave Young <dyoung@redhat.com>
Fixes: 091d42e43d ("iommu/vt-d: Copy translation tables from old kernel")
Fixes: dbcd861f25 ("iommu/vt-d: Do not re-use domain-ids from the old kernel")
Fixes: cf484d0e69 ("iommu/vt-d: Mark copied context entries")
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
Tested-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65ca7f5f7d upstream.
Different encodings are used to represent supported PASID bits
and number of PASID table entries.
The current code assigns ecap_pss directly to extended context
table entry PTS which is wrong and could result in writing
non-zero bits to the reserved fields. IOMMU fault reason
11 will be reported when reserved bits are nonzero.
This patch converts ecap_pss to the extended context entry PTS encoding
based on the VT-d spec, Chapter 9.4, as follows:
- number of PASID bits = ecap_pss + 1
- number of PASID table entries = 2^(pts + 5)
The software-assigned pasid_max limit is also respected to match the
allocation limit of the PASID table.
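A hedged sketch of the conversion described above (variable names are
illustrative):

    /* bits = ecap_pss + 1, entries = 2^(pts + 5); for 2^bits entries: pts = bits - 5 */
    pasid_bits = min_t(u32, ecap_pss(iommu->ecap) + 1, order_base_2(pasid_max));
    pts = pasid_bits - 5;   /* value written into the extended context entry PTS field */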
cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com>
Tested-by: Mika Kuoppala <mika.kuoppala@intel.com>
Fixes: 2f26e0a9c9 ('iommu/vt-d: Add basic SVM PASID support')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 432abf68a7 upstream.
The generic command buffer entry is 128 bits (16 bytes), so the tail and
head pointer offsets should be 16-byte aligned and advance by 0x10 per
command.
When the command buffer is full, head = (tail + 0x10) % CMD_BUFFER_SIZE.
So when the remaining space in the command buffer can only hold two
commands, we should additionally issue one COMPLETION_WAIT to wait for
all older commands to complete; the remaining space then grows again once
the IOMMU has fetched from the command buffer.
The check on the remaining space should therefore be left <= 0x20 (two commands).
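A hedged sketch of the space check this describes (simplified from the
command queueing path):

    next_tail = (tail + sizeof(struct iommu_cmd)) % CMD_BUFFER_SIZE;
    left      = (head - next_tail) % CMD_BUFFER_SIZE;

    if (left <= 0x20) {
            /* only two command slots left: queue a COMPLETION_WAIT and wait
             * until the IOMMU has fetched (and thereby freed) older commands
             */
    }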
Signed-off-by: Huang Rui <ray.huang@amd.com>
Fixes: ac0ea6e92b ('x86/amd-iommu: Improve handling of full command buffer')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bae203d58b upstream.
Function mx31_clocks_init() is called during clock initialization on
legacy boards with the reference clock frequency passed as its input
argument; this can be verified by examining the function declaration
found in arch/arm/mach-imx/common.h and the actual function users which
include that header file.
Inside the CCF driver the function ignores its input argument; by chance
the value used in the function body is the same as the input argument on
the side of all callers.
Fixes: d9388c8432 ("clk: imx31: Do not call mxc_timer_init twice when booting with DT")
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0f6f9302b8 upstream.
The audio module clocks are supposed to be set according to the sample
rate of the audio stream. The audio PLL provides the clock signal for
these module clocks, and only it is freely tunable.
Set CLK_SET_RATE_PARENT for the audio module clocks so their users can
properly tune the clock rate.
Fixes: 0577e4853b ("clk: sunxi-ng: Add H3 clocks")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 937ff9ded8 upstream.
The audio module clocks are supposed to be set according to the sample
rate of the audio stream. The audio PLL provides the clock signal for
these module clocks, and only it is freely tunable.
Set CLK_SET_RATE_PARENT for the audio module clocks so their users can
properly tune the clock rate.
Fixes: 5690879d93 ("clk: sunxi-ng: Add A23 CCU")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cbf2e548ca upstream.
The clocks on these boards run at 25 MHz, not 19.2 and 27 like
other platforms. Unfortunately I copy/pasted from other similar
SoCs but forgot this one is different. Fix it.
Fixes: a085f877a8 ("clk: qcom: Move cxo/pxo/xo into dt files")
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e9572fdd13 upstream.
Since commit eb1c8f4325 ("hwmon: (lm90) Convert to use new hwmon
registration API") the temp1_max_alarm and temp1_crit_alarm attributes are
mapped to the same alarm bit. Fix the typo.
Fixes: eb1c8f4325 ("hwmon: (lm90) Convert to use new hwmon registration API")
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fccd4a1e8 upstream.
Fix overflows seen when writing into fan speed limit attributes.
Also fix crash due to division by zero, seen when certain very
large values (such as 2147483648, or 0x80000000) are written
into fan speed limit attributes.
Fixes: 594fbe713b ("Add support for GMT G762/G763 PWM fan controllers")
Cc: Arnaud Ebalard <arno@natisbad.org>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0d04e9112 upstream.
Fix overflows seen when writing voltage and temperature limit attributes.
The value passed to DIV_ROUND_CLOSEST() needs to be clamped, and the
value parameter passed to nct7802_write_fan_min() is an unsigned long.
Also, writing values larger than 2700000 into a fan limit attribute results
in writing 0 into the chip's limit registers. The exact behavior when
writing this value is unspecified. For consistency, report a limit of
1350000 if the chip register reads 0. This may be wrong, and the chip
behavior should be verified with the actual chip, but it is better than
reporting a value of 0 (which, when written, results in writing a value
of 0x1fff into the chip register).
Fixes: 3434f37835 ("hwmon: Driver for Nuvoton NCT7802Y")
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e36ce99ee0 upstream.
Module test reports:
temp1_max: Suspected overflow: [160000 vs. 0]
temp1_min: Suspected overflow: [160000 vs. 0]
This is seen because the values passed when writing temperature limits
are unbound.
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Fixes: 6099469805 ("hwmon: Support for Dallas Semiconductor DS620")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4538bfbf2d upstream.
Converts the unsigned temperature values from the i2c read
to be sign extended as defined in the datasheet so that
negative temperatures are properly read.
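An illustrative example of the kind of sign extension needed for an 8-bit
two's-complement temperature register (register and field names are
hypothetical):

    int raw = i2c_smbus_read_byte_data(client, TEMP_REG);   /* 0..255, or -errno */

    if (raw >= 0)
            data->temp = (s8)raw;   /* sign extend: -128..127 degrees Celsius */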
Fixes: 28e6274d8f ("hwmon: (amc6821) Avoid forward declaration")
Signed-off-by: Jared Bents <jared.bents@rockwellcollins.com>
Signed-off-by: Matt Weber <matthew.weber@rockwellcollins.com>
[groeck: Dropped unnecessary continuation line]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 13edb767aa upstream.
If the driver is built as a module, autoload won't work because the module
alias information is not filled. So user-space can't match the registered
device with the corresponding module.
Export the module alias information using the MODULE_DEVICE_TABLE() macro.
Before this patch:
$ modinfo drivers/hwmon/scpi-hwmon.ko | grep alias
$
After this patch:
$ modinfo drivers/hwmon/scpi-hwmon.ko | grep alias
alias: of:N*T*Carm,scpi-sensorsC*
alias: of:N*T*Carm,scpi-sensors
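A minimal sketch of what exporting the alias involves (the compatible
string is taken from the modinfo output above; the table name is assumed):

    static const struct of_device_id scpi_of_match[] = {
            { .compatible = "arm,scpi-sensors" },
            { /* sentinel */ },
    };
    MODULE_DEVICE_TABLE(of, scpi_of_match);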
Signed-off-by: Javier Martinez Canillas <javier@osg.samsung.com>
Fixes: ea98b29a05 ("hwmon: Support sensors exported via ARM SCP interface")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a608a9d52f upstream.
All LED-setting functions in fujitsu-laptop are currently assigned to
the brightness_set callback, which is incorrect because they can sleep
(due to their use of call_fext_func(), which in turn issues ACPI calls)
and the documentation (in include/linux/leds.h) clearly states they must
not. Assign them to brightness_set_blocking instead and change them to
match the expected function prototype.
This change makes it possible to use Fujitsu-specific LEDs with "heavy"
triggers, like disk-activity or phy0rx.
Fixes: 3a40708609 ("fujitsu-laptop: Add BL power, LED control and radio state information")
Fixes: 4f62568c1f ("fujitsu-laptop: Support radio LED")
Fixes: d6b88f64b0 ("fujitsu-laptop: Add support for eco LED")
Signed-off-by: Michał Kępień <kernel@kempniu.pl>
Acked-by: Jonathan Woithe <jwoithe@just42.net>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7f847dd317 upstream.
The slp_s0_residency_usec debugfs file currently uses
DEFINE_DEBUGFS_ATTRIBUTE(), but that macro cannot really be used to
define files outside of the debugfs code, as it has no reference to
the get/set functions if CONFIG_DEBUG_FS is not defined:
drivers/platform/x86/intel_pmc_core.c:80:12: error: ‘pmc_core_dev_state_get’ defined but not used [-Werror=unused-function]
This fixes the macro to always contain the reference, and instead rely
on the stubbed-out debugfs_create_file to not actually refer to
its arguments so the compiler can still drop the reference.
This works because the attribute definition is always 'static',
and the dead-code removal silently drops all static symbols
that are not used.
Fixes: c646880814 ("debugfs: add support for self-protecting attribute file fops")
Fixes: df2294fb64 ("intel_pmc_core: Convert to DEFINE_DEBUGFS_ATTRIBUTE")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[nicstange@gmail.com: Add dummy implementations of debugfs_attr_read() and
debugfs_attr_write() in order to protect against possibly broken dead
code elimination and to improve readability.
Correct CONFIG_DEBUGFS_FS -> CONFIG_DEBUG_FS typo in changelog.]
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bc4725d902 upstream.
The intention was to enable the checks if debugging is enabled, not
disabled.
Fixes: f793d1e517 ("clk: shmobile: Add new CPG/MSSR driver core")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 328cf6927b upstream.
If CONFIG_ETRAX_AXISFLASHMAP is not configured, the flash rescue image
object file is empty. With recent versions of binutils, this results
in the following build error.
cris-linux-objcopy: error:
the input file 'arch/cris/boot/rescue/rescue.o' has no sections
This is seen, for example, when trying to build cris:allnoconfig
with recently generated toolchains.
Since it does not make sense to build a flash rescue image if there is
no flash, only build it if CONFIG_ETRAX_AXISFLASHMAP is enabled.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Fixes: 66ab3a74c5 ("CRIS: Merge machine dependent boot/compressed ..")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31b239824e upstream.
The word "background" contains 10 characters so the third argument of
strncmp() need to be 10 in order to match this prefix correctly.
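For illustration, the corrected comparison looks like this (the enum value
is assumed):

    if (strncmp("background", buf, 10) == 0)
            mode = SPECTRAL_BACKGROUND;   /* match the full "background" prefix */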
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Fixes: 855aed1220 ("ath10k: add spectral scan feature")
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcf7cf1551 upstream.
This partially reverts commit 2cdce425aa ("ath10k: Fix broken NULL func
data frame status for 10.4").
Unfortunately that change breaks sending NULL func frames, and the existing
issue of obtaining a proper tx status for NULL func frames will be fixed.
Also update the comments for the feature flag that was added, since it is
useless and not working.
Fixes: 2cdce425aa ("ath10k: Fix broken NULL func data frame status for 10.4")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fa436b3a2 upstream.
NL80211_ATTR_MAC was used to set both the specific BSSID to be scanned
and the random MAC address to be used when privacy is enabled. When both
the features are enabled, both the BSSID and the local MAC address were
getting same value causing Probe Request frames to go with unintended
DA. Hence, this has been fixed by using a different NL80211_ATTR_BSSID
attribute to set the specific BSSID (which was the more recent addition
in cfg80211) for a scan.
Backwards compatibility with old userspace software is maintained to
some extent by allowing NL80211_ATTR_MAC to be used to set the specific
BSSID when scanning without enabling random MAC address use.
Scanning with random source MAC address was introduced by commit
ad2b26abc1 ("cfg80211: allow drivers to support random MAC addresses
for scan") and the issue was introduced with the addition of the second
user for the same attribute in commit 818965d391 ("cfg80211: Allow a
scan request for a specific BSSID").
Fixes: 818965d391 ("cfg80211: Allow a scan request for a specific BSSID")
Signed-off-by: Vamsi Krishna <vamsin@qti.qualcomm.com>
Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c3d185a9a upstream.
On drivers setting the SUPPORTS_REORDERING_BUFFER hardware flag,
we crash when the peer sends an AddBA request while we already
have a session open on the same TID; this is because on those
drivers, the tid_agg_rx is left NULL even though the session is
valid, and the agg_session_valid bit is set.
To fix this, store the dialog tokens outside the tid_agg_rx to
be able to compare them to the received AddBA request.
Fixes: f89e07d4cf ("mac80211: agg-rx: refuse ADDBA Request with timeout update")
Reported-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d621459299 upstream.
commit 0416e494ce ("usb: dwc3: ep0: correct cache
sync issue in case of ep0_bounced") introduced a bug
where we would leak DMA resources, eventually starving
the system of them and resulting in failing DMA
transfers.
Fix the bug by making sure that we always unmap EP0
requests since those are *always* mapped.
Fixes: 0416e494ce ("usb: dwc3: ep0: correct cache sync issue in case of ep0_bounced")
Cc: <stable@vger.kernel.org>
Tested-by: Tomasz Medrek <tomaszx.medrek@intel.com>
Reported-by: Janusz Dziedzic <januszx.dziedzic@linux.intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 19ec31230e upstream.
Let's call dwc3_ep0_prepare_one_trb() explicitly
because there are occasions where we will need more
than one TRB to handle an EP0 transfer.
A follow-up patch will fix one bug related to
multiple-TRB Data Phases when it comes to
mapping/unmapping requests for DMA.
Reported-by: Janusz Dziedzic <januszx.dziedzic@linux.intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7931ec86c1 upstream.
For now this is just a cleanup patch, no functional
changes. We will be using the new function to fix a
bug introduced long ago by commit 0416e494ce
("usb: dwc3: ep0: correct cache sync issue in case
of ep0_bounced") and further worsened by commit
c0bd5456a4 ("usb: dwc3: ep0: handle non maxpacket
aligned transfers > 512")
Reported-by: Janusz Dziedzic <januszx.dziedzic@linux.intel.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65e4345c8e upstream.
The LIS3LV02 has a special bit that needs to be set to get the
read values left aligned. Before this patch we get gibberish
like this:
iio_generic_buffer -a -c10 -n lis3lv02dl_accel
(...)
0.000000 -0.010042 -0.642688 19155832931907
0.000000 -0.010042 -0.642688 19155858751073
This is because we read a raw value of 64 for 1g, which is
the nominal 1024 for 1g shifted 4 bits to the right as a result
of the data being right-aligned rather than left-aligned.
Since all other sensors are left aligned, add some code to
set the special DAS (data alignment setting) bit to 1 so that
the right value is now read like this:
iio_generic_buffer -a -c10 -n lis3lv02dl_accel
(...)
0.000000 -0.147095 -10.120135 24761614364956
-0.029419 -0.176514 -10.120135 24761631624540
The scaling was weird as well: we have a gain of 1000 for 1g
and 3000 for 6g. I don't even remember how I came up with the
old values but they are wrong.
Fixes: 3acddf74f8 ("iio: st-sensors: add support for lis3lv02d accelerometer")
Cc: Lorenzo Bianconi <lorenzo.bianconi@st.com>
Cc: Giuseppe Barba <giuseppe.barba@st.com>
Cc: Denis Ciocca <denis.ciocca@st.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b321a38d24 upstream.
The oversampling ratio is controlled using the oversampling pins,
OS [2:0] with OS2 being the MSB control bit, and OS0 the LSB control
bit.
The gpio connected to the OS2 pin is not being set correctly, only OS0
and OS1 pins are being set. Fix the typo to allow proper control of the
oversampling pins.
Signed-off-by: Eva Rachel Retuya <eraretuya@gmail.com>
Fixes: b9618c0 ("staging: IIO: ADC: New driver for AD7606/AD7606-6/AD7606-4")
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e09ee853c9 upstream.
The credentials handling was pushed to the write handlers
but error handling wasn't done properly.
Move the write callbacks to the completion queue to destroy them
and to notify a blocked writer about the failure.
Fixes: 136698e535 (mei: push credentials inside the irq write handler)
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e35d6d7c4e upstream.
Bind to the interface, but do not register any ports, after having
downloaded the firmware. The device will still disconnect and
re-enumerate, but this way we avoid an error message from being logged
as part of the process:
io_ti: probe of 1-1.3:1.0 failed with error -5
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f5ecaf985 upstream.
'buf' is malloced in dibusb_rc_query() and should be freed before
leaving the error handling cases, otherwise it will cause a
memory leak.
Fixes: ff1c123545 ("[media] dibusb: handle error code on RC query")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0cff18cbab upstream.
The 2 USB host ports are directly tied to the 2 USB hosts in the SoC.
The 2 host pairs were already enabled, but the USB PHY wasn't.
VBUS on the 2 ports are always on.
Enable the USB PHY.
Fixes: 04c85ecad3 ("ARM: dts: sun7i: Add dts file for Bananapi M1 Plus board")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 015105b121 upstream.
Make sure to drop the references taken by of_parse_phandle() and
bus_find_device() before returning from am335x_get_phy_control().
Note that there is no guarantee that the devres-managed struct
phy_control will be valid for the lifetime of the sibling phy device
regardless of this change.
Fixes: 3bb869c8b3 ("usb: phy: Add AM335x PHY driver")
Acked-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 982555fc26 upstream.
For an isoc endpoint descriptor, wMaxPacketSize is not the real max packet
size (see Table 9-13, Standard Endpoint Descriptor, USB 2.0 specification);
it may also encode the number of packets per (micro)frame, so the real max
packet size should be ep->desc->wMaxPacketSize & 0x7ff.
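For illustration, extracting the real maximum packet size (bits 10:0):

    u16 maxp = usb_endpoint_maxp(ep->desc) & 0x7ff;   /* strip the transactions-per-microframe bits */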
Cc: Felipe F. Tonello <eu@felipetonello.com>
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Fixes: 16b114a6d7 ("usb: gadget: fix usb_ep_align_maybe endianness and new usb_ep_align")
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3c3dd1e058 upstream.
Function klsi_105_open() calls usb_control_msg() (to "enable read") and
checks its return value. When the return value is unexpected, it only
assigns the error code to the return variable retval, but does not
terminate the exception path. This patch fixes the bug by inserting
"goto err_generic_close;" when the call to usb_control_msg() fails.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Pan Bian <bianpan2016@163.com>
[johan: rebase on prerequisite fix and amend commit message]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4763601a56 upstream.
The function returns -EINVAL even if it builds the stream properly.
The bogus error code sneaked in during the code refactoring, but it
wasn't noticed until now since the returned error code itself is
ignored anyway. Kill it here; there is no behavior change from
this patch, obviously.
Fixes: e5779998bf ('ALSA: usb-audio: refactor code')
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5563bb5743 upstream.
The function bfin_fifo_offset is defined but not used:
drivers/usb/musb/blackfin.c:36:12: warning: ‘bfin_fifo_offset’ defined
but not used [-Wunused-function]
static u32 bfin_fifo_offset(u8 epnum)
^~~~~~~~~~~~~~~~
Adding bfin_fifo_offset to bfin_ops fixes this warning and allows musb
core to call this function instead of default_fifo_offset.
Fixes: cc92f6818f ("usb: musb: Populate new IO functions for blackfin")
Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3bc02bce90 upstream.
If CONFIG_PM=n:
drivers/usb/core/hub.c:107: warning: ‘hub_usb3_port_prepare_disable’ declared inline after being called
drivers/usb/core/hub.c:107: warning: previous declaration of ‘hub_usb3_port_prepare_disable’ was here
To fix this, move hub_port_disable() after
hub_usb3_port_prepare_disable(), and adjust forward declarations.
Fixes: 37be66767e ("usb: hub: Fix auto-remount of safely removed or ejected USB-3 devices")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8c300fe282 upstream.
When unloading omap2430, we can get the following splat:
WARNING: CPU: 1 PID: 295 at kernel/irq/manage.c:1478 __free_irq+0xa8/0x2c8
Trying to free already-free IRQ 4
...
[<c01a8b78>] (free_irq) from [<bf0aea84>]
(musbhs_dma_controller_destroy+0x28/0xb0 [musb_hdrc])
[<bf0aea84>] (musbhs_dma_controller_destroy [musb_hdrc]) from
[<bf09f88c>] (musb_remove+0xf0/0x12c [musb_hdrc])
[<bf09f88c>] (musb_remove [musb_hdrc]) from [<c056a384>]
(platform_drv_remove+0x24/0x3c)
...
This is because the irq number in use is 260 nowadays, and the dma
controller is using u8 instead of int.
Fixes: 6995eb68aa ("USB: musb: enable low level DMA operation for Blackfin")
Signed-off-by: Tony Lindgren <tony@atomide.com>
[b-liu@ti.com: added Fixes tag]
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9418ee15f7 upstream.
DCFG.DEVSPD == 0x3 is not valid and we need to set
DCFG.DEVSPD to 0x1 for full speed mode. Same goes for
DSTS.CONNECTSPD.
Old databooks had 0x3 for full speed in 48MHz mode for
USB1.1 transceivers which was never supported. Newer databooks
don't mention 0x3 at all.
Cc: John Youn <John.Youn@synopsys.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 51c1685d95 upstream.
usb_get_dr_mode() expects the device-property to be spelled
"dr_mode" not "dr-mode".
Spelling it properly fixes the following warning showing up in dmesg:
[ 8704.500545] dwc3 dwc3.2.auto: Configuration mismatch. dr_mode forced to gadget
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c111b6c38 upstream.
The current abort operation has a race:
  xhci_handle_command_timeout()
    xhci_abort_cmd_ring()
      xhci_write_64(CMD_RING_ABORT)
      xhci_handshake(5s)
        do {
          check CMD_RING_RUNNING
          udelay(1)
          ...
                                    COMP_CMD_ABORT event
                                    COMP_CMD_STOP event
                                    xhci_handle_stopped_cmd_ring()
                                      restart cmd_ring
                                      CMD_RING_RUNNING become 1 again
        } while ()
        return -ETIMEDOUT
      xhci_write_64(CMD_RING_ABORT)
      /* can abort random command */
To do the abort operation correctly, we have to wait for both the
COMP_CMD_STOP event and the negation of CMD_RING_RUNNING.
But as shown above, while the timeout handler is waiting for
CMD_RING_RUNNING to be cleared, the event handler can restart the
cmd_ring. So the timeout handler never notices the negation of
CMD_RING_RUNNING, and the retry of CMD_RING_ABORT can abort a random
command (BTW, I guess the retry of CMD_RING_ABORT was a workaround for
this race).
To fix this race, move xhci_handle_stopped_cmd_ring() into
xhci_abort_cmd_ring(), and make the timeout handler wait for the
COMP_CMD_STOP event. At this point, the timeout handler is the owner of
the cmd_ring and can safely restart it using xhci_handle_stopped_cmd_ring().
[FWIW, as a bonus, this approach could easily be extended to add a
CMD_RING_PAUSE operation]
[locks edited as patch is rebased on other locking fixes -Mathias]
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cb4d5ce588 upstream.
This is preparation to fix abort operation race (See "xhci: Fix race
related to abort operation"). To make timeout sleepable, use
delayed_work instead of timer.
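A hedged sketch of the timer-to-delayed_work pattern (the field name
follows the description; the exact timeout and call sites are assumed):

    INIT_DELAYED_WORK(&xhci->cmd_timer, xhci_handle_command_timeout);   /* at init time     */
    schedule_delayed_work(&xhci->cmd_timer, msecs_to_jiffies(5000));    /* arm the timeout  */
    cancel_delayed_work(&xhci->cmd_timer);                              /* on completion    */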
[change a newly added pending timer fix to pending work -Mathias]
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fde1faf872 upstream.
A static usb-serial-driver structure that is used to initialise the
interrupt URB was modified during probe depending on the currently
probed device type, something which could break a parallel probe of a
device of a different type.
Fix this up by overriding the default completion callback for MCS7715
devices in attach() instead. We may want to use two usb-serial driver
instances for the two types later.
Fixes: fb088e335d ("USB: serial: add support for serial port on the moschip 7715")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 75dd211e77 upstream.
Do not submit the interrupt URB until after the parport has been
successfully registered to avoid another use-after-free in the
completion handler when accessing the freed parport private data in case
of a racing completion.
Fixes: b69578df7e ("USB: usbserial: mos7720: add support for parallel port on moschip 7715")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91a1ff4d53 upstream.
The interrupt URB was submitted on probe but never stopped on probe
errors. This can lead to use-after-free issues in the completion
handler when accessing the freed usb-serial struct:
Unable to handle kernel paging request at virtual address 6b6b6be7
...
[<bf052e70>] (mos7715_interrupt_callback [mos7720]) from [<c052a894>] (__usb_hcd_giveback_urb+0x80/0x140)
[<c052a894>] (__usb_hcd_giveback_urb) from [<c052a9a4>] (usb_hcd_giveback_urb+0x50/0x138)
[<c052a9a4>] (usb_hcd_giveback_urb) from [<c0550684>] (musb_giveback+0xc8/0x1cc)
Fixes: b69578df7e ("USB: usbserial: mos7720: add support for parallel port on moschip 7715")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b05aebc25f upstream.
Fix NULL-pointer dereference at port open if a device lacks the expected
bulk in and out endpoints.
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
[<bf071c20>] (mos7720_open [mos7720]) from [<bf0490e0>] (serial_port_activate+0x68/0x98 [usbserial])
[<bf0490e0>] (serial_port_activate [usbserial]) from [<c0470ca4>] (tty_port_open+0x9c/0xe8)
[<c0470ca4>] (tty_port_open) from [<bf049d98>] (serial_open+0x48/0x6c [usbserial])
[<bf049d98>] (serial_open [usbserial]) from [<c0469178>] (tty_open+0xcc/0x5cc)
Fixes: 0f64478cbc ("USB: add USB serial mos7720 driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c75633ef7 upstream.
Fix NULL-pointer dereference in open() should the device lack the
expected endpoints:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
PC is at mos7840_open+0x88/0x8dc [mos7840]
Note that we continue to treat the interrupt-in endpoint as optional for
now.
Fixes: 3f5429746d ("USB: Moschip 7840 USB-Serial Driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21ce578402 upstream.
Fix NULL-pointer dereference in write() should the device lack the
expected interrupt-out endpoint:
Unable to handle kernel NULL pointer dereference at virtual address 00000054
...
PC is at kobil_write+0x144/0x2a0 [kobil_sct]
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3dca01114d upstream.
Fix NULL-pointer dereference when clearing halt at open should the device
lack a bulk-out endpoint.
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
PC is at cyberjack_open+0x40/0x9c [cyberjack]
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5afeef2366 upstream.
Fix NULL-pointer dereference in open() should the device lack the
expected endpoints:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
PC is at oti6858_open+0x30/0x1d0 [oti6858]
Note that a missing interrupt-in endpoint would have caused open() to
fail.
Fixes: 49cdee0ed0 ("USB: oti6858 usb-serial driver (in Nokia CA-42 cable)")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0dd408425e upstream.
Fix NULL-pointer dereference when initialising URBs at open should a
non-EPIC device lack a bulk-in or interrupt-in endpoint.
Unable to handle kernel NULL pointer dereference at virtual address 00000028
...
PC is at edge_open+0x24c/0x3e8 [io_edgeport]
Note that the EPIC-device probe path has the required sanity checks so
this makes those checks partially redundant.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c4ac4496e8 upstream.
Make sure to free the URB transfer buffer in case submission fails (e.g.
due to a disconnect).
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90507d54f7 upstream.
Fix NULL-pointer dereference at open should the device lack a bulk-in or
bulk-out endpoint:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
PC is at iuu_open+0x78/0x59c [iuu_phoenix]
Fixes: 07c3b1a100 ("USB: remove broken usb-serial num_endpoints check")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2330d0a853 upstream.
Cancel the heartbeat work on driver unbind in order to avoid I/O after
disconnect in case the port is held open.
Note that the cancel in release() is still needed to stop the heartbeat
after late probe errors.
Fixes: 26c78daade ("USB: io_ti: Add heartbeat to keep idle EP/416 ports from disconnecting")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4f9785cc99 upstream.
In case a device is left in "boot-mode" we must not register any port
devices in order to avoid a NULL-pointer dereference on open due to
missing endpoints. This could be used by a malicious device to trigger
an OOPS:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
[<bf0caa84>] (edge_open [io_ti]) from [<bf0b0118>] (serial_port_activate+0x68/0x98 [usbserial])
[<bf0b0118>] (serial_port_activate [usbserial]) from [<c0470ca4>] (tty_port_open+0x9c/0xe8)
[<c0470ca4>] (tty_port_open) from [<bf0b0da0>] (serial_open+0x48/0x6c [usbserial])
[<bf0b0da0>] (serial_open [usbserial]) from [<c0469178>] (tty_open+0xcc/0x5cc)
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a323fefc6f upstream.
Fix NULL-pointer dereference when clearing halt at open should a
malicious device lack the expected endpoints when in download mode.
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
[<bf011ed8>] (edge_open [io_ti]) from [<bf000118>] (serial_port_activate+0x68/0x98 [usbserial])
[<bf000118>] (serial_port_activate [usbserial]) from [<c0470ca4>] (tty_port_open+0x9c/0xe8)
[<c0470ca4>] (tty_port_open) from [<bf000da0>] (serial_open+0x48/0x6c [usbserial])
[<bf000da0>] (serial_open [usbserial]) from [<c0469178>] (tty_open+0xcc/0x5cc)
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cc09092482 upstream.
Fix NULL-pointer dereference in open() should the device lack the
expected endpoints:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
PC is at spcp8x5_open+0x30/0xd0 [spcp8x5]
Fixes: 619a6f1d14 ("USB: add usb-serial spcp8x5 driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5d9b0f859b upstream.
Check for the expected endpoints in attach() and fail loudly if not
present.
Note that failing to do this appears to be benign since da280e3488
("USB: keyspan_pda: clean up write-urb busy handling") which prevents a
NULL-pointer dereference in write() by never marking a non-existent
write-urb as free.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 76ab439ed1 upstream.
Fix NULL-pointer dereference in open() should a type-0 or type-1 device
lack the expected endpoints:
Unable to handle kernel NULL pointer dereference at virtual address 00000030
...
PC is at pl2303_open+0x38/0xec [pl2303]
Note that a missing interrupt-in endpoint would have caused open() to
fail.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f09d1886a4 upstream.
The write URB was being killed using the synchronous interface while
holding a spin lock in close().
Simply drop the lock and busy-flag update, something which would have
been taken care of by the completion handler if the URB was in flight.
Fixes: f7a33e608d ("USB: serial: add quatech2 usb to serial driver")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 28bedb5ae4 upstream.
In xhci_mtk_probe(), the variable ret holds the return value and should be
negative on failure. However, when the call to platform_get_irq() fails,
the error code is not assigned to ret, so 0 is returned, which indicates
success. As a result, callers of xhci_mtk_probe() are unable to detect the
error. This patch fixes the bug by assigning the return value of
platform_get_irq() to ret when it fails.
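A hedged sketch of the corrected error propagation (the label name is
hypothetical):

    ret = platform_get_irq(pdev, 0);
    if (ret < 0)
            goto disable_clk;   /* propagate the negative error code instead of 0 */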
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dea70778c upstream.
In the command timer function, xhci_handle_command_timeout(), xhci->lock
is unlocked before calling into xhci_abort_cmd_ring(). This might cause
a race between the timer function and the event handler.
The xhci_abort_cmd_ring() function sets the CMD_RING_ABORT bit in the
command register and polls it until the setting takes effect. A stop
command ring event might be handled between writing the abort bit and
polling for it. The event handler will restart the command ring, which
causes the polling to fail, and we then believe that we failed to
stop it.
As a bonus, this also fixes some issues of calling functions without
locking in xhci_handle_command_timeout().
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a5a1b95141 upstream.
If we get a command completion event at the same time as the command
timeout work starts on another cpu we might end up aborting the wrong
command.
If the command completion takes the xhci lock before the timeout work, it
will handle the command, pick the next command, mark it as current_cmd, and
re-queue the timeout work. When the timeout work finally gets the lock,
it will start aborting the wrong command.
This case can be resolved by checking if the timeout work is pending inside
the timeout function itself. A new timeout work can only be pending if the
command completed and a new command was queued.
If there are no more commands pending then command completion will set
the current_cmd to NULL, which is already handled in the timeout work.
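A hedged sketch of the check inside the timeout handler (field name
assumed from the related patches):

    spin_lock_irqsave(&xhci->lock, flags);
    if (delayed_work_pending(&xhci->cmd_timer)) {
            /* completion already handled this command and re-armed the work */
            spin_unlock_irqrestore(&xhci->lock, flags);
            return;
    }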
Reported-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a7cfdf37b upstream.
When the current command is aborted, the host will free the command in the
handle_cmd_completion() function. But it might still be referenced by
xhci->current_cmd, which then needs to be set to NULL.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 90797aee5d upstream.
xhci_setup_device() should return failure with the correct error number
when the xhci host has died, been removed or halted.
During usb device enumeration, if the usb host is not accessible (died,
removed or halted), hc_driver->address_device() should return
a corresponding error code to usb core. But the current xhci driver just
returns success. This misleads usb core into continuing the enumeration
by reading the device descriptor, which will result in failure, and
users will get a misleading message like "device descriptor read/8,
error -110".
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ee8665e28e upstream.
The tt_info provided by a HS hub might still be in use by a child device.
Make sure we free the devices in the correct order.
This is needed in special cases such as when xhci controller is
reset when resuming from hibernate, and all virt_devices are freed.
Also free the virt_devices starting from max slot_id as children
more commonly have higher slot_id than parent.
Reported-by: Guenter Roeck <groeck@chromium.org>
Tested-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2b98546737 upstream.
handle_cmd_completion() frees a command structure which might be still
referenced by xhci->current_cmd.
This might cause a problem when xhci->current_cmd is accessed after that.
A real-life case could look like this: the host takes a very long time to
respond to a command, and the command timer fires at the same time the
command completion event arrives. The command completion handler frees
xhci->current_cmd before the timer function can grab xhci->lock.
Afterwards, the timer function grabs the lock and goes ahead checking and
setting members of xhci->current_cmd.
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e71d363d9c upstream.
Now that we're handling so many transfers at a time
and for some dwc3 revisions LPM events *must* be
enabled, we can fall into a situation where too many
events fire and we start receiving Overflow events.
Let's do what XHCI does and allocate a full page for
the Event Ring; this will avoid any future issues.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e4da3fcf7 upstream.
By convention (according to the documentation), if a function does not
provide a get_alt() callback, the composite framework should assume that
it has only altsetting 0 and should respond with an error if the host
tries to set any other one.
After commit dd4dff8b03 ("USB: composite: Fix bug: should test
set_alt function pointer before use it")
we started checking the set_alt() callback instead of get_alt().
That check is useless, as we already check whether set_alt() is set inside
usb_add_function() and fail if it is NULL.
Let's fix this check and move the comment about why we check the get
method instead of set a little bit closer, to prevent future false
fixes.
Fixes: dd4dff8b03 ("USB: composite: Fix bug: should test set_alt function pointer before use it")
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcdbeb8447 upstream.
The stop_activity() routine in dummy-hcd is supposed to unlink all
active requests for every endpoint, among other things. But it
doesn't handle ep0. As a result, fuzz testing can generate a WARNING
like the following:
WARNING: CPU: 0 PID: 4410 at drivers/usb/gadget/udc/dummy_hcd.c:672 dummy_free_request+0x153/0x170
Modules linked in:
CPU: 0 PID: 4410 Comm: syz-executor Not tainted 4.9.0-rc7+ #32
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88006a64ed10 ffffffff81f96b8a ffffffff41b58ab3 1ffff1000d4c9d35
ffffed000d4c9d2d ffff880065f8ac00 0000000041b58ab3 ffffffff8598b510
ffffffff81f968f8 0000000041b58ab3 ffffffff859410e0 ffffffff813f0590
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
[<ffffffff812b808f>] __warn+0x19f/0x1e0 kernel/panic.c:550
[<ffffffff812b831c>] warn_slowpath_null+0x2c/0x40 kernel/panic.c:585
[<ffffffff830fcb13>] dummy_free_request+0x153/0x170 drivers/usb/gadget/udc/dummy_hcd.c:672
[<ffffffff830ed1b0>] usb_ep_free_request+0xc0/0x420 drivers/usb/gadget/udc/core.c:195
[<ffffffff83225031>] gadgetfs_unbind+0x131/0x190 drivers/usb/gadget/legacy/inode.c:1612
[<ffffffff830ebd8f>] usb_gadget_remove_driver+0x10f/0x2b0 drivers/usb/gadget/udc/core.c:1228
[<ffffffff830ec084>] usb_gadget_unregister_driver+0x154/0x240 drivers/usb/gadget/udc/core.c:1357
This patch fixes the problem by iterating over all the endpoints in
the driver's ep array instead of iterating over the gadget's ep_list,
which explicitly leaves out ep0.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0a8fd13462 upstream.
When checking a new device's descriptors, the USB core does not check
for duplicate endpoint addresses. This can cause a problem when the
sysfs files for those endpoints are created; trying to create multiple
files with the same name will provoke a WARNING:
WARNING: CPU: 2 PID: 865 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x8a/0xa0
sysfs: cannot create duplicate filename
'/devices/platform/dummy_hcd.0/usb2/2-1/2-1:64.0/ep_05'
Kernel panic - not syncing: panic_on_warn set ...
CPU: 2 PID: 865 Comm: kworker/2:1 Not tainted 4.9.0-rc7+ #34
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
ffff88006bee64c8 ffffffff81f96b8a ffffffff00000001 1ffff1000d7dcc2c
ffffed000d7dcc24 0000000000000001 0000000041b58ab3 ffffffff8598b510
ffffffff81f968f8 ffffffff850fee20 ffffffff85cff020 dffffc0000000000
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81f96b8a>] dump_stack+0x292/0x398 lib/dump_stack.c:51
[<ffffffff8168c88e>] panic+0x1cb/0x3a9 kernel/panic.c:179
[<ffffffff812b80b4>] __warn+0x1c4/0x1e0 kernel/panic.c:542
[<ffffffff812b8195>] warn_slowpath_fmt+0xc5/0x110 kernel/panic.c:565
[<ffffffff819e70ca>] sysfs_warn_dup+0x8a/0xa0 fs/sysfs/dir.c:30
[<ffffffff819e7308>] sysfs_create_dir_ns+0x178/0x1d0 fs/sysfs/dir.c:59
[< inline >] create_dir lib/kobject.c:71
[<ffffffff81fa1b07>] kobject_add_internal+0x227/0xa60 lib/kobject.c:229
[< inline >] kobject_add_varg lib/kobject.c:366
[<ffffffff81fa2479>] kobject_add+0x139/0x220 lib/kobject.c:411
[<ffffffff82737a63>] device_add+0x353/0x1660 drivers/base/core.c:1088
[<ffffffff82738d8d>] device_register+0x1d/0x20 drivers/base/core.c:1206
[<ffffffff82cb77d3>] usb_create_ep_devs+0x163/0x260 drivers/usb/core/endpoint.c:195
[<ffffffff82c9f27b>] create_intf_ep_devs+0x13b/0x200 drivers/usb/core/message.c:1030
[<ffffffff82ca39d3>] usb_set_configuration+0x1083/0x18d0 drivers/usb/core/message.c:1937
[<ffffffff82cc9e2e>] generic_probe+0x6e/0xe0 drivers/usb/core/generic.c:172
[<ffffffff82caa7fa>] usb_probe_device+0xaa/0xe0 drivers/usb/core/driver.c:263
This patch prevents the problem by checking for duplicate endpoint
addresses during enumeration and skipping any duplicates.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1c069b057d upstream.
Andrey Konovalov's fuzz testing of gadgetfs showed that we should
improve the driver's checks for valid configuration descriptors passed
in by the user. In particular, the driver needs to verify that the
wTotalLength value in the descriptor is not too short (smaller
than USB_DT_CONFIG_SIZE). And the check for whether wTotalLength is
too large has to be changed, because the driver assumes there is
always enough room remaining in the buffer to hold a device descriptor
(at least USB_DT_DEVICE_SIZE bytes).
This patch adds the additional check and fixes the existing check. It
may do a little more than strictly necessary, but one extra check
won't hurt.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit add333a81a upstream.
Andrey Konovalov reports that fuzz testing with syzkaller causes a
KASAN use-after-free bug report in gadgetfs:
BUG: KASAN: use-after-free in gadgetfs_setup+0x208a/0x20e0 at addr ffff88003dfe5bf2
Read of size 2 by task syz-executor0/22994
CPU: 3 PID: 22994 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #16
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88006df06a18 ffffffff81f96aba ffffffffe0528500 1ffff1000dbe0cd6
ffffed000dbe0cce ffff88006df068f0 0000000041b58ab3 ffffffff8598b4c8
ffffffff81f96828 1ffff1000dbe0ccd ffff88006df06708 ffff88006df06748
Call Trace:
<IRQ> [ 201.343209] [< inline >] __dump_stack lib/dump_stack.c:15
<IRQ> [ 201.343209] [<ffffffff81f96aba>] dump_stack+0x292/0x398 lib/dump_stack.c:51
[<ffffffff817e4dec>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159
[< inline >] print_address_description mm/kasan/report.c:197
[<ffffffff817e5080>] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286
[< inline >] kasan_report mm/kasan/report.c:306
[<ffffffff817e562a>] __asan_report_load_n_noabort+0x3a/0x40 mm/kasan/report.c:337
[< inline >] config_buf drivers/usb/gadget/legacy/inode.c:1298
[<ffffffff8322c8fa>] gadgetfs_setup+0x208a/0x20e0 drivers/usb/gadget/legacy/inode.c:1368
[<ffffffff830fdcd0>] dummy_timer+0x11f0/0x36d0 drivers/usb/gadget/udc/dummy_hcd.c:1858
[<ffffffff814807c1>] call_timer_fn+0x241/0x800 kernel/time/timer.c:1308
[< inline >] expire_timers kernel/time/timer.c:1348
[<ffffffff81482de6>] __run_timers+0xa06/0xec0 kernel/time/timer.c:1641
[<ffffffff814832c1>] run_timer_softirq+0x21/0x80 kernel/time/timer.c:1654
[<ffffffff84f4af8b>] __do_softirq+0x2fb/0xb63 kernel/softirq.c:284
The cause of the bug is subtle. The dev_config() routine gets called
twice by the fuzzer. The first time, the user data contains both a
full-speed configuration descriptor and a high-speed config
descriptor, causing dev->hs_config to be set. But it also contains an
invalid device descriptor, so the buffer containing the descriptors is
deallocated and dev_config() returns an error.
The second time dev_config() is called, the user data contains only a
full-speed config descriptor. But dev->hs_config still has the stale
pointer remaining from the first call, causing the routine to think
that there is a valid high-speed config. Later on, when the driver
dereferences the stale pointer to copy that descriptor, we get a
use-after-free access.
The fix is simple: Clear dev->hs_config if the passed-in data does not
contain a high-speed config descriptor.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit faab50984f upstream.
Andrey Konovalov reports that fuzz testing with syzkaller causes a
KASAN warning in gadgetfs:
BUG: KASAN: slab-out-of-bounds in dev_config+0x86f/0x1190 at addr ffff88003c47e160
Write of size 65537 by task syz-executor0/6356
CPU: 3 PID: 6356 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #19
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
ffff88003c107ad8 ffffffff81f96aba ffffffff3dc11ef0 1ffff10007820eee
ffffed0007820ee6 ffff88003dc11f00 0000000041b58ab3 ffffffff8598b4c8
ffffffff81f96828 ffffffff813fb4a0 ffff88003b6eadc0 ffff88003c107738
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<ffffffff81f96aba>] dump_stack+0x292/0x398 lib/dump_stack.c:51
[<ffffffff817e4dec>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:159
[< inline >] print_address_description mm/kasan/report.c:197
[<ffffffff817e5080>] kasan_report_error+0x1f0/0x4e0 mm/kasan/report.c:286
[<ffffffff817e5705>] kasan_report+0x35/0x40 mm/kasan/report.c:306
[< inline >] check_memory_region_inline mm/kasan/kasan.c:308
[<ffffffff817e3fb9>] check_memory_region+0x139/0x190 mm/kasan/kasan.c:315
[<ffffffff817e4044>] kasan_check_write+0x14/0x20 mm/kasan/kasan.c:326
[< inline >] copy_from_user arch/x86/include/asm/uaccess.h:689
[< inline >] ep0_write drivers/usb/gadget/legacy/inode.c:1135
[<ffffffff83228caf>] dev_config+0x86f/0x1190 drivers/usb/gadget/legacy/inode.c:1759
[<ffffffff817fdd55>] __vfs_write+0x5d5/0x760 fs/read_write.c:510
[<ffffffff817ff650>] vfs_write+0x170/0x4e0 fs/read_write.c:560
[< inline >] SYSC_write fs/read_write.c:607
[<ffffffff81803a5b>] SyS_write+0xfb/0x230 fs/read_write.c:599
[<ffffffff84f47ec1>] entry_SYSCALL_64_fastpath+0x1f/0xc2
Indeed, there is a comment saying that the value of len is restricted
to a 16-bit integer, but the code doesn't actually do this.
This patch fixes the warning. It replaces the comment with a
computation that forces the amount of data copied from the user in
ep0_write() to be no larger than the wLength size for the control
transfer, which is a 16-bit quantity.
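A minimal sketch of the clamp described above; the real driver takes wLength from the control request, the variable name here is illustrative:
	/* never copy more than the control transfer's 16-bit wLength */
	len = min_t(size_t, len, (size_t)le16_to_cpu(setup_wLength));
	if (copy_from_user(kbuf, buf, len))
		return -EFAULT;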
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0994b0a257 upstream.
Andrey Konovalov reported that we were not properly checking the upper
limit of a device configuration size before calling memdup_user(), which
could cause some problems.
So set the upper limit to PAGE_SIZE * 4, which should be good enough for
all devices.
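A minimal sketch of the added bound check before the copy (placement illustrative):
	if (len > PAGE_SIZE * 4)
		return -EINVAL;	/* no device needs this much config data */
	kbuf = memdup_user(buf, len);
	if (IS_ERR(kbuf))
		return PTR_ERR(kbuf);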
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 674aea07e3 upstream.
This device gives the following error on detection.
xhci_hcd 0000:00:11.0: ERROR Transfer event for disabled endpoint or
incorrect stream ring
The same error is not seen when it is added to the unusual_device
list with US_FL_NO_REPORT_OPCODES passed.
Signed-off-by: George Cherian <george.cherian@cavium.com>
Signed-off-by: Oliver Neukum <oneukun@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c48400baa0 upstream.
During dma teardown for a dequeued urb, if the musb load is high, musb might
generate a bogus rx ep interrupt even when the rx fifo is flushed. In such
a case any of the following log messages could appear.
musb_host_rx 1853: BOGUS RX2 ready, csr 0000, count 0
musb_host_rx 1936: RX3 dma busy, csr 2020
As mentioned in the current inline comment, clearing the ep interrupt in the
teardown path avoids the bogus interrupt, so implement the clear_ep_rxintr()
callback.
This bug seems to have existed since the initial musb driver support,
but I only validated the fix back to v4.1, so only cc stable for v4.1+.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6def85a396 upstream.
During dma teardown for a dequeued urb, if the musb load is high, musb might
generate a bogus rx ep interrupt even when the rx fifo is flushed. In such
a case any of the following log messages could appear.
musb_host_rx 1853: BOGUS RX2 ready, csr 0000, count 0
musb_host_rx 1936: RX3 dma busy, csr 2020
As mentioned in the current inline comment, clearing the ep interrupt in the
teardown path avoids the bogus interrupt.
Clearing the ep interrupt is platform dependent, so this patch adds a
platform callback to allow the glue driver to clear the ep interrupt.
This bug seems to have existed since the initial musb driver support,
but I only validated the fix back to v4.1, so only cc stable for v4.1+.
Signed-off-by: Bin Liu <b-liu@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4c881451d3 upstream.
On 64-bit kernels, MIPS KVM will clear CP0_Status.UX to prevent the
guest (running in user mode) from accessing the 64-bit memory segments.
However the previous value of CP0_Status.UX is never restored when
exiting from the guest.
If the user process uses 64-bit addressing (the n64 ABI) this can result
in address error exceptions from the kernel if it needs to deliver a
signal before returning to user mode, as the kernel will need to write a
sigframe to high user addresses on the user stack which are disallowed
by CP0_Status.UX=0.
This is fixed by explicitly setting SX and UX again when exiting from
the guest, and explicitly clearing those bits when returning to the
guest. Having the SX and UX bits set when handling guest exits (rather
than only when exiting to userland) will be helpful when we support VZ,
since we shouldn't need to directly read or write guest memory, so it
will be valid for cache management IPIs to access host user addresses.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: kvm@vger.kernel.org
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8581f1b5ee upstream.
Apparently some VLV BIOSen like to leave the VDD force bit enabled
even for power sequencers that aren't properly hooked up to any
port. That will result in an imbalance in the AUX power domain
refcount when we start to use said power sequencer, as edp_panel_vdd_on()
will not grab the power domain reference if it sees that the VDD is
already on.
To fix this let's make sure we turn off the VDD force bit when we
initialize the power sequencer registers. That is, unless it's
being done from the init path since there we are actually
initializing the registers for the current power sequencer and
we don't want to turn VDD off needlessly as that would require
waiting for the power cycle delay before we turn it back on.
This fixes the following kind of warnings:
WARNING: CPU: 0 PID: 123 at ../drivers/gpu/drm/i915/intel_runtime_pm.c:1455 intel_display_power_put+0x13a/0x170 [i915]()
WARN_ON(!power_domains->domain_use_count[domain])
...
v2: Fix typos in comment (David)
Cc: Matwey V. Kornilov <matwey.kornilov@gmail.com>
Tested-by: Matwey V. Kornilov <matwey.kornilov@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98695
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161220165117.24801-1-ville.syrjala@linux.intel.com
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
(cherry picked from commit 5d5ab2d26f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 81d873a871 upstream.
This updates gcc-common.h from Emese Revfy for gcc 7. This fixes issues seen
by Kugan and Arnd. Build tested with gcc 5.4 and 7 snapshot.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7858bf16c upstream.
The asm-prototypes.h file is used to provide dummy function declarations
for genksyms, when processing asm files with EXPORT_SYMBOL. Make sure
that any architecture defines get out of our way. x86 currently has an
issue with memcpy on 64bit with CONFIG_KMEMCHECK=y and with
memset/__memset on 32bit:
$ cat init/test.c
#include <asm/asm-prototypes.h>
$ make -s init/test.o
In file included from ./arch/x86/include/asm/string.h:4:0,
from ./include/linux/string.h:18,
from ./include/linux/bitmap.h:8,
from ./include/linux/cpumask.h:11,
from ./arch/x86/include/asm/cpumask.h:4,
from ./arch/x86/include/asm/msr.h:10,
from ./arch/x86/include/asm/processor.h:20,
from ./arch/x86/include/asm/cpufeature.h:4,
from ./arch/x86/include/asm/thread_info.h:52,
from ./include/linux/thread_info.h:25,
from ./arch/x86/include/asm/preempt.h:6,
from ./include/linux/preempt.h:59,
from ./include/linux/spinlock.h:50,
from ./include/linux/seqlock.h:35,
from ./include/linux/time.h:5,
from ./include/uapi/linux/timex.h:56,
from ./include/linux/timex.h:56,
from ./include/linux/sched.h:19,
from ./include/linux/uaccess.h:4,
from ./arch/x86/include/asm/asm-prototypes.h:2,
from init/test.c:1:
./arch/x86/include/asm/string_64.h:52:47: error: expected declaration specifiers or ‘...’ before ‘(’ token
#define memcpy(dst, src, len) __inline_memcpy((dst), (src), (len))
./include/asm-generic/asm-prototypes.h:6:14: note: in expansion of macro ‘memcpy’
extern void *memcpy(void *, const void *, __kernel_size_t);
^
...
During real build, this manifests itself by genksyms segfaulting.
Fixes: 334bb77387 ("x86/kbuild: enable modversions for symbols exported from asm")
Reported-and-tested-by: Borislav Petkov <bp@alien8.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 35f432a03e upstream.
In ieee80211_xmit_fast(), 'info' is initialized to point to the skb
that's passed in, but that skb may later be replaced by a clone (if
it was shared), leading to an invalid pointer.
This can lead to use-after-free and also later crashes since the
real SKB's info->hw_queue doesn't get initialized properly.
Fix this by assigning info only later, when it's needed, after the
skb replacement (may have) happened.
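A minimal sketch of the reordering (not the literal diff; helpers as used elsewhere in mac80211):
	struct ieee80211_tx_info *info;

	skb = skb_share_check(skb, GFP_ATOMIC);	/* may replace skb with a clone */
	if (unlikely(!skb))
		return true;
	/* ... header and headroom work on the (possibly new) skb ... */
	info = IEEE80211_SKB_CB(skb);		/* take the cb pointer only now */
	memset(info, 0, sizeof(*info));
	info->hw_queue = sdata->vif.hw_queue[skb_get_queue_mapping(skb)];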
Reported-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef37427ac5 upstream.
Similarly to the aemif clock - this screws up the linked list of clock
children. Create a separate clock for mdio inheriting the rate from
emac_clk.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
[nsekhar@ti.com: add a comment over mdio_clk to explain its existence +
commit headline updates]
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 143fca77cc upstream.
While applying patch d443a0aa3a ("HID: hid-sensor-hub: clear memory to
avoid random data"), there were some issues in applying the correct version
of the patch. This resulted in the breakage of sensor functions, as all
requests like power-up will be reset by the memset() in the function
sensor_hub_set_feature().
The reset of the caller's buffer should be in the function
sensor_hub_get_feature(), not in sensor_hub_set_feature().
Fixes: d443a0aa3a ("HID: hid-sensor-hub: clear memory to avoid random data")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4174421360 upstream.
The cr16 interval timer of each CPU is not synchronized to the cr16
timers in other CPUs of an SMP system. So, delay the registration of the
cr16 clocksource until all CPUs have been detected and then - if we are
on an SMP machine - mark the cr16 clocksource as unstable and lower its
rating before registering it at the clocksource framework.
This patch fixes the stalled CPU warnings which we have seen since the
introduction of the cr16 clocksource.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42d97eb0ad upstream.
Attempting to link a device node, named pipe, or socket file into an
encrypted directory through rename(2) or link(2) always failed with
EPERM. This happened because fscrypt_has_permitted_context() saw that
the file was unencrypted and forbade creating the link. This behavior
was unexpected because such files are never encrypted; only regular
files, directories, and symlinks can be encrypted.
To fix this, make fscrypt_has_permitted_context() always return true on
special files.
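A minimal sketch of the early return, following the description above:
	/* only regular files, directories and symlinks can be encrypted,
	 * so linking any other file type is always permitted */
	if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode) &&
	    !S_ISLNK(inode->i_mode))
		return 1;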
This will be covered by a test in my encryption xfstests patchset.
Fixes: 9bd8212f98 ("ext4 crypto: add encryption policy and password salt support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1d0f953086 upstream.
Commit 16200948d8 ("ALSA: usb-audio: Fix race at stopping the stream") was
incomplete causing another more severe kernel panic, so it got reverted.
This fixes both the original problem and its fallout kernel race/crash.
The original fix is to move the endpoint member NULL clearing logic inside
wait_clear_urbs() so the irq triggering the urb completion doesn't call
retire_capture/playback_urb() after the NULL clearing and generate a panic.
However this creates a new race between snd_usb_endpoint_start()'s call
to wait_clear_urbs() and the irq urb completion handler which again calls
retire_capture/playback_urb() leading to a new NULL dereference.
We keep the EP deactivation code in snd_usb_endpoint_start() because
removing it will break the EP reference counting (see [1] [2] for info),
however we don't need the "can_sleep" mechanism anymore because a new
function was introduced (snd_usb_endpoint_sync_pending_stop()) which
synchronizes pending stops and gets called inside the pcm prepare callback.
It also makes sense to remove can_sleep because it was also removed from
deactivate_urbs() signature in [3] so we benefit from more simplification.
[1] commit 015618b90 ("ALSA: snd-usb: Fix URB cancellation at stream start")
[2] commit e9ba389c5 ("ALSA: usb-audio: Fix scheduling-while-atomic bug in PCM capture stream")
[3] commit ccc1696d5 ("ALSA: usb-audio: simplify endpoint deactivation code")
Fixes: f8114f8583 ("Revert "ALSA: usb-audio: Fix race at stopping the stream"")
Signed-off-by: Ioan-Adrian Ratiu <adi@adirat.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c7efff9284 upstream.
Although the old quirk table showed ASUS X71SL with ALC663 codec being
compatible with asus-mode3 fixup, the bugzilla reporter explained that
asus-mode8 fits better for the dual headphone controls. So be it.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=191781
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e7c9a3d9e4 upstream.
The Octeon driver calls into PHYLIB which now checks for
net_device->dev.parent, so make sure we do set it before calling into
any MDIO/PHYLIB related function.
Fixes: ec988ad78e ("phy: Don't increment MDIO bus refcount unless it's a different owner")
Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 01d1f7a99e upstream.
The datasheet specifies typical and maximum execution times for which the CMD
register is occupied after the previous command execution. We took these
values as the minimum and maximum time of the usleep_range() call before
issuing a new command.
To be sure that the CMD register is no longer occupied, we need to wait
*at least* the maximum time specified by the datasheet.
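A minimal sketch of the idea; the _US constants stand in for the datasheet values and are not the driver's actual numbers:
	#define CMD_EXEC_TIME_MAX_US	10000	/* worst case from datasheet */
	#define CMD_EXEC_SLACK_US	1000

	/* wait at least the worst-case time before touching CMD again */
	usleep_range(CMD_EXEC_TIME_MAX_US,
		     CMD_EXEC_TIME_MAX_US + CMD_EXEC_SLACK_US);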
Signed-off-by: Marcin Niestroj <m.niestroj@grinn-global.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 65c8aea07d upstream.
Using realbits as the i2c/spi read length, when that value is not byte
aligned (e.g. 12 bits), leads to skipping the msb part of the output data
registers.
Fix this by taking scan_type.shift into account, in addition to
scan_type.realbits, when computing the read length:
read_len = DIV_ROUND_UP(realbits + shift, 8)
This fix has been tested on 8, 12, 16 and 24 bit sensors.
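A minimal sketch of that computation using the iio scan_type fields:
	static unsigned int st_sensors_read_len(const struct iio_chan_spec *ch)
	{
		/* a 12-bit value left-justified by 4 bits still needs 2 bytes */
		return DIV_ROUND_UP(ch->scan_type.realbits +
				    ch->scan_type.shift, 8);
	}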
Fixes: e7385de529 ("iio:st_sensors: align on storagebits boundaries")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@st.com>
Tested-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bb98e72ada upstream.
On my Cherrytrail CUBE iwork8 Air tablet, PIPE-A would get stuck on loading
i915 at boot in 1 out of every 3 boots, resulting in a non-functional LCD.
Once the i915 driver has successfully loaded, the panel can be disabled /
enabled without hitting this issue.
The getting stuck is caused by vlv_init_display_clock_gating() clearing
the DPOUNIT_CLOCK_GATE_DISABLE bit in DSPCLK_GATE_D when called from
chv_pipe_power_well_ops.enable() on driver load, while a pipe is enabled
driving the DSI LCD by the BIOS.
Clearing this bit while DSI is in use is a known issue and
intel_dsi_pre_enable() / intel_dsi_post_disable() already set / clear it
as appropriate.
This commit modifies vlv_init_display_clock_gating() to leave the
DPOUNIT_CLOCK_GATE_DISABLE bit alone fixing the pipe getting stuck.
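A minimal sketch of the resulting read-modify-write (register and bit names as used elsewhere in i915; treat this as an approximation of the diff, not the literal code):
	u32 val = I915_READ(DSPCLK_GATE_D);

	val &= DPOUNIT_CLOCK_GATE_DISABLE;	/* preserve instead of clearing */
	val |= VRHUNIT_CLOCK_GATE_DISABLE;
	I915_WRITE(DSPCLK_GATE_D, val);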
Changes in v2:
-Replace PIPE-A with "a pipe" or "the pipe" in the commit msg and
comment
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97330
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161202142904.25613-1-hdegoede@redhat.com
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit 721d484563)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8354491c9d upstream.
Since commit 71ce391dfb ("net: mvpp2: enable proper per-CPU TX
buffers unmapping"), we are not correctly DMA unmapping TX buffers for
fragments.
Indeed, the mvpp2_txq_inc_put() function only stores in the
txq_cpu->tx_buffs[] array the physical address of the buffer to be
DMA-unmapped when skb != NULL. In addition, when DMA-unmapping, we use
skb_headlen(skb) to get the size to be unmapped. Both of these work fine
for TX descriptors that are associated directly with a SKB, but not for the
ones that are used for fragments, with a NULL pointer as skb:
- We have a NULL physical address when calling DMA unmap
- skb_headlen(skb) crashes because skb is NULL
This causes random crashes when fragments are used.
To solve this problem, we need to:
- Store the physical address of the buffer to be unmapped
unconditionally, regardless of whether it is tied to a SKB or not.
- Store the length of the buffer to be unmapped, which requires a new
field.
Instead of adding a third array to store the length of the buffer to be
unmapped, and as suggested by David Miller, this commit refactors the
tx_buffs[] and tx_skb[] arrays of 'struct mvpp2_txq_pcpu' into a
separate structure 'mvpp2_txq_pcpu_buf', to which a 'size' field is
added. Therefore, instead of having three arrays to allocate/free, we
have a single one, which also improves data locality, reducing the
impact on the CPU cache.
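A minimal sketch of the new per-CPU buffer bookkeeping (field names illustrative where the text doesn't state them):
	struct mvpp2_txq_pcpu_buf {
		struct sk_buff *skb;	/* NULL for fragment descriptors */
		dma_addr_t dma;		/* address to DMA-unmap, always stored */
		size_t size;		/* length to DMA-unmap (the new field) */
	};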
Fixes: 71ce391dfb ("net: mvpp2: enable proper per-CPU TX buffers unmapping")
Reported-by: Raphael G <raphael.glon@corp.ovh.com>
Cc: Raphael G <raphael.glon@corp.ovh.com>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 128394eff3 upstream.
Both damn things interpret userland pointers embedded into the payload;
worse, they are actually traversing those. Leaving aside the bad
API design, this is very much _not_ safe to call with KERNEL_DS.
Bail out early if that happens.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79e51b5c2d upstream.
Currently it is impossible to edit the value of a config symbol with a
prompt longer than (terminal width - 2) characters. dialog_inputbox()
calculates a negative x-offset for the input window and newwin() fails
as this is invalid. It also doesn't check for this failure, so it
busy-loops calling wgetch(NULL) which immediately returns -1.
The additions in the offset calculations also don't match the intended
size of the window.
Limit the window size and calculate the offset similarly to
show_scroll_win().
Fixes: 692d97c380 ("kconfig: new configuration interface (nconfig)")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0905ca757 upstream.
Don't free the cmd in tcmu_check_expired_cmd: it's still referenced by
an entry in our cmd_id->cmd idr, and if userspace ever resumes processing,
tcmu_handle_completions() will use the now-invalid cmd pointer.
Instead, leave the cmd allocated. It will be freed by tcmu_handle_completion()
if userspace ever recovers, or by tcmu_free_device if not.
Reported-by: Bryant G Ly <bgly@us.ibm.com>
Tested-by: Bryant G Ly <bgly@us.ibm.com>
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit af7d9f0c57 upstream.
Fix the format specifier so that the attribute can be parsed correctly.
Currently it returns decimal 1000 for a 4096-byte alignment.
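To illustrate why the old output misparses (an illustration, not the driver's exact code): printing 4096 with a hex conversion but no 0x prefix yields "1000", which userspace reads back as decimal:
	sprintf(buf, "%lx\n", 4096UL);	/* -> "1000", parsed as 1000 */
	sprintf(buf, "%ld\n", 4096UL);	/* -> "4096", the expected value */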
Reported-by: Dave Jiang <dave.jiang@intel.com>
Fixes: 315c562536 ("libnvdimm, pfn: add 'align' attribute, default to HPAGE_SIZE")
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b6cc9474e2 upstream.
On arm64 NUMA kernels we can pass "numa=off" on the command line to
disable NUMA. A side effect of this is that kmalloc_node() calls to
non-zero nodes will crash the system with an OOPS:
[ 0.000000] ITS@0x0000901000020000: allocated 2097152 Devices @10002000000 (flat, esz 8, psz 64K, shr 1)
[ 0.000000] Unable to handle kernel NULL pointer dereference at virtual address 00001680
[ 0.000000] pgd = fffffc0009470000
[ 0.000000] [00001680] *pgd=0000010ffff90003, *pud=0000010ffff90003, *pmd=0000010ffff90003, *pte=0000000000000000
[ 0.000000] Internal error: Oops: 96000006 [#1] SMP
.
.
.
[ 0.000000] [<fffffc00081c8950>] __alloc_pages_nodemask+0xa4/0xe68
[ 0.000000] [<fffffc000821fa70>] new_slab+0xd0/0x564
[ 0.000000] [<fffffc0008221e24>] ___slab_alloc+0x2e4/0x514
[ 0.000000] [<fffffc0008239498>] __slab_alloc+0x48/0x58
[ 0.000000] [<fffffc0008222c20>] __kmalloc_node+0xd0/0x2dc
[ 0.000000] [<fffffc0008115374>] __irq_domain_add+0x7c/0x164
[ 0.000000] [<fffffc0008b461dc>] its_probe+0x784/0x81c
[ 0.000000] [<fffffc0008b462bc>] its_init+0x48/0x1b0
[ 0.000000] [<fffffc0008b4543c>] gic_init_bases+0x228/0x360
[ 0.000000] [<fffffc0008b456bc>] gic_of_init+0x148/0x1cc
[ 0.000000] [<fffffc0008b5aec8>] of_irq_init+0x184/0x298
[ 0.000000] [<fffffc0008b43f9c>] irqchip_init+0x14/0x38
[ 0.000000] [<fffffc0008b12d60>] init_IRQ+0xc/0x30
[ 0.000000] [<fffffc0008b10a3c>] start_kernel+0x240/0x3b8
[ 0.000000] [<fffffc0008b101c4>] __primary_switched+0x30/0x6c
[ 0.000000] Code: 912ec2a0 b9403809 0a0902fb 37b007db (f9400300)
.
.
.
This is caused by code like this in kernel/irq/irqdomain.c
domain = kzalloc_node(sizeof(*domain) + (sizeof(unsigned int) * size),
GFP_KERNEL, of_node_to_nid(of_node));
When NUMA is disabled, the concept of a node is really undefined, so
of_node_to_nid() should unconditionally return NUMA_NO_NODE.
Fix by returning NUMA_NO_NODE when the nid is not in the set of
possible nodes.
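A minimal sketch of the check; the lookup helper is hypothetical, only the final test matters here:
	int nid = of_numa_lookup_nid(device);	/* hypothetical lookup */

	if (nid == NUMA_NO_NODE || !node_possible(nid))
		return NUMA_NO_NODE;	/* e.g. booted with numa=off */
	return nid;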
Reported-by: Gilbert Netzer <noname@pdc.kth.se>
Signed-off-by: David Daney <david.daney@cavium.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ff45000fcb upstream.
The boot wrapper performs its own relocations and does not require
PT_INTERP segment. However currently we don't tell the linker that.
Prior to binutils 2.28 that works OK. But since binutils commit
1a9ccd70f9a7 ("Fix the linker so that it will not silently generate ELF
binaries with invalid program headers. Fix readelf to report such
invalid binaries.") binutils tries to create a program header segment
due to PT_INTERP, and the link fails because there is no space for it:
ld: arch/powerpc/boot/zImage.pseries: Not enough room for program headers, try linking with -N
ld: final link failed: Bad value
So tell the linker not to do that, by passing --no-dynamic-linker.
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop dependency on ld-version.sh and massage change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6dff5b6705 upstream.
GCC 5 generates different code for this bootwrapper null check that
causes the PS3 to hang very early in its bootup. This check is of
limited value, so just get rid of it.
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f87f253bac upstream.
From 80f23935ca ("powerpc: Convert cmp to cmpd in idle enter sequence"):
PowerPC's "cmp" instruction has four operands. Normally people write
"cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently
people forget, and write "cmp" with just three operands.
With older binutils this is silently accepted as if this was "cmpw",
while often "cmpd" is wanted. With newer binutils GAS will complain
about this for 64-bit code. For 32-bit code it still silently assumes
"cmpw" is what is meant.
In this case, cmpwi is called for, so this is just a build fix for
new toolchains.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1cded9d297 upstream.
There are two problems with refcounting of auth_gss messages.
First, the reference on the pipe->pipe list (taken by a call
to rpc_queue_upcall()) is not counted. It seems to be
assumed that a message in pipe->pipe will always also be in
pipe->in_downcall, where it is correctly reference counted.
However there is no guarantee of this. I have a report of a
NULL dereference in rpc_pipe_read() which suggests a msg
that has been freed is still on the pipe->pipe list.
One way I imagine this might happen is:
- message is queued for uid=U and auth->service=S1
- rpc.gssd reads this message and starts processing.
This removes the message from pipe->pipe
- message is queued for uid=U and auth->service=S2
- rpc.gssd replies to the first message. gss_pipe_downcall()
calls __gss_find_upcall(pipe, U, NULL) and it finds the
*second* message, as new messages are placed at the head
of ->in_downcall, and the service type is not checked.
- This second message is removed from ->in_downcall and freed
by gss_release_msg() (even though it is still on pipe->pipe)
- rpc.gssd tries to read another message, and dereferences a pointer
to this message that has just been freed.
I fix this by incrementing the reference count before calling
rpc_queue_upcall(), and decrementing it if that fails, or normally in
gss_pipe_destroy_msg().
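A minimal sketch of that first fix (simplified; the helpers named are those referred to in this description):
	atomic_inc(&gss_msg->count);	/* reference owned by pipe->pipe */
	res = rpc_queue_upcall(gss_msg->pipe, &gss_msg->msg);
	if (res) {
		gss_unhash_msg(gss_msg);
		gss_release_msg(gss_msg);	/* drop it again on failure */
	}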
It seems strange that the reply doesn't target the message more
precisely, but I don't know all the details. In any case, I think the
reference counting irregularity became a measurable bug when the
extra arg was added to __gss_find_upcall(), hence the Fixes: line
below.
The second problem is that if rpc_queue_upcall() fails, the new
message is not freed. gss_alloc_msg() sets the ->count to 1,
gss_add_msg() increments this to 2, gss_unhash_msg() decrements it to 1,
then the pointer is discarded so the memory never gets freed.
Fixes: 9130b8dbc6 ("SUNRPC: allow for upcalls for same uid but different gss service")
Link: https://bugzilla.opensuse.org/show_bug.cgi?id=1011250
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 54e4a0dfa2 upstream.
We must not call nfs_pageio_init_read() on a new nfs_pageio_descriptor
while holding a reference to a layout segment, as that can deadlock
pnfs_update_layout().
Fixes: d67ae825a5 ("pnfs/flexfiles: Add the FlexFile Layout Driver")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae5a459d5f upstream.
We must ensure that we don't schedule a layoutreturn if the layout stateid
has been marked as invalid.
Fixes: 2a59a04116 ("pNFS: Fix pnfs_set_layout_stateid() to clear...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7b650994ab upstream.
If we no longer hold any layout segments, we're normally expected to
consider the layout stateid to be invalid. However we cannot assume this
if we're about to, or in the process of sending a layoutreturn.
Fixes: 334a8f3711 ("pNFS: Don't forget the layout stateid if...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6604b203fb upstream.
If there is an I/O error, we should not call LAYOUTGET until the
LAYOUTRETURN that reports the error is complete.
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c0cf3ef5e0 upstream.
What matters when deciding if we should make a page uptodate is
not how much we _wanted_ to copy, but how much we actually have
copied. As it is, on architectures that do not zero tail on
short copy we can leave uninitialized data in a page marked uptodate.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5c056fdc5b upstream.
After sending an authorizer (ceph_x_authorize_a + ceph_x_authorize_b),
the client gets back a ceph_x_authorize_reply, which it is supposed to
verify to ensure the authenticity and protect against replay attacks.
The code for doing this is there (ceph_x_verify_authorizer_reply(),
ceph_auth_verify_authorizer_reply() + plumbing), but it is never
invoked by the messenger.
AFAICT this goes back to 2009, when ceph authentication protocols
support was added to the kernel client in 4e7a5dcd1b ("ceph:
negotiate authentication protocol; implement AUTH_NONE protocol").
The second param of ceph_connection_operations::verify_authorizer_reply
is unused all the way down. Pass 0 to facilitate backporting, and kill
it in the next commit.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Sage Weil <sage@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6496ebd7ed upstream.
On some systems, the firmware does not allow certain PCI devices to be put
in deep D-states. This can cause problems for wakeup signalling, if the
device does not support PME# in the deepest allowed suspend state. For
example, Pierre reports that on his system, ACPI does not permit his xHCI
host controller to go into D3 during runtime suspend -- but D3 is the only
state in which the controller can generate PME# signals. As a result, the
controller goes into runtime suspend but never wakes up, so it doesn't work
properly. USB devices plugged into the controller are never detected.
If the device relies on PME# for wakeup signals but is not capable of
generating PME# in the target state, the PCI core should accurately report
that it cannot do wakeup from runtime suspend. This patch modifies the
pci_dev_run_wake() routine to add this check.
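A minimal sketch of the added check; pci_target_state() here stands for the PCI core's internal helper that picks the low-power state the device would be put in:
	/* the device can't signal PME# from the target runtime suspend
	 * state, so runtime wakeup cannot work */
	if (!pci_pme_capable(dev, pci_target_state(dev)))
		return false;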
Reported-by: Pierre de Villemereuil <flyos@mailoo.org>
Tested-by: Pierre de Villemereuil <flyos@mailoo.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91c42b72f8 upstream.
hw_stats is a pointer to i40_iw_dev_stats struct in i40iw_get_hw_stats().
Use hw_stats and not &hw_stats in the memcpy to copy the i40iw device stats
data into rdma_hw_stats counters.
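The one-line nature of the fix, sketched (surrounding names approximate):
	/* hw_stats is already a pointer; &hw_stats copied the wrong bytes */
	memcpy(&stats->value[0], hw_stats, sizeof(*hw_stats));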
Fixes: b40f4757da ("IB/core: Make device counter infrastructure dynamic")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f9ca75516 upstream.
New inode operations were never added to bad_inode. Most of the
time the op is checked for NULL before being called, but marking the inode
bad and the check can race (very unlikely).
However, in the case of ->get_link() only DCACHE_SYMLINK_TYPE is checked
before calling the op, so there's no race and it will definitely oops when
trying to follow links on such a beast.
Also remove comments about extinct ops.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5a8a6b89c1 upstream.
We were assigning the I2C bus controller instead of the client as the parent
device. Besides being logically wrong, it messed up devm handling of the
input device. As a result we were leaving the input device and event node
behind after rmmod-ing the driver, which led to a kernel oops if one were to
access the event node later.
Let's remove the assignment and rely on devm_input_allocate_device() to
set it up properly for us.
Signed-off-by: Jingkui Wang <jkwang@google.com>
Fixes: 7132fe4f56 ("Input: drv260x - add TI drv260x haptics driver")
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fc4b067ec upstream.
This fixes a lockup at device probing which happens on some solo6010
hardware samples. This is a regression introduced by commit e1ceb25a15
("[media] SOLO6x10: remove unneeded register locking and barriers")
The observed lockup happens in solo_set_motion_threshold() called from
solo_motion_config().
This extra "flushing" is not fundamentally needed for every write, but
apparently the code in driver assumes such behaviour at last in some
places.
Actual fix was proposed by Hans Verkuil.
Fixes: e1ceb25a15 ("[media] SOLO6x10: remove unneeded register locking and barriers")
Signed-off-by: Andrey Utkin <andrey.utkin@corp.bluecherry.net>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d930b5b5bf upstream.
A register used to identify the chip during probe was overwritten during
firmware download, and due to that later probes for a warm chip were
failing. Detect the chip from another register, which is located in a
different register bank (bank 2).
Fixes: 7908fad99a ("[media] mn88473: finalize driver")
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 365fe4e0ce upstream.
A register used to identify the chip during probe was overwritten during
firmware download, and due to that later probes for a warm chip were
failing. Detect the chip from another register, which is located in a
different register bank (bank 2).
Fixes: 94d0eaa419 ("[media] mn88472: move out of staging to media")
Signed-off-by: Antti Palosaari <crope@iki.fi>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fe2f378dd upstream.
The array ib_mad_mgmt_class_table.method_table has MAX_MGMT_CLASS
(80) elements. Hence compare the array index with that value instead
of with IB_MGMT_MAX_METHODS (128). This patch prevents Coverity from
reporting the following:
Overrunning array class->method_table of 80 8-byte elements at element index 127 (byte offset 1016) using index convert_mgmt_class(mad_hdr->mgmt_class) (which evaluates to 127).
Fixes: commit b7ab0b19a8 ("IB/mad: Verify mgmt class in received MADs")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Sean Hefty <sean.hefty@intel.com>
Reviewed-by: Hal Rosenstock <hal@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 794de08a16 upstream.
Both the wakeup and irqsoff tracers can use the function graph tracer when
the display-graph option is set. The problem is that they ignore the notrace
file, and record the entry of functions that would be ignored by the
function_graph tracer. This causes the trace->depth to be recorded into the
ring buffer. The set_graph_notrace uses a trick by adding a large negative
number to the trace->depth when a graph function is to be ignored.
On trace output, the graph function uses the depth to record a stack of
functions. But since the depth is negative, it accesses the array with a
negative number and causes an out of bounds access that can cause a kernel
oops or corrupt data.
Have the print functions handle cases where a tracer still records functions
even when they are in set_graph_notrace.
Also add warnings if the depth is below zero before accessing the array.
Note, the function graph logic will still prevent the return of these
functions from being recorded, which means that they will be left hanging
without a return. For example:
# echo '*spin*' > set_graph_notrace
# echo 1 > options/display-graph
# echo wakeup > current_tracer
# cat trace
[...]
_raw_spin_lock() {
preempt_count_add() {
do_raw_spin_lock() {
update_rq_clock();
Where it should look like:
_raw_spin_lock() {
preempt_count_add();
do_raw_spin_lock();
}
update_rq_clock();
Cc: Namhyung Kim <namhyung.kim@lge.com>
Fixes: 29ad23b004 ("ftrace: Add set_graph_notrace filter")
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9d85eb9119 upstream.
The logical package management has several issues:
- The APIC ids provided by ACPI are not required to be the same as the
initial APIC id which can be retrieved by CPUID. The APIC ids provided
by ACPI are those which are written by the BIOS into the APIC. The
initial id is set by hardware and can not be changed. The hardware
provided ids contain the real hardware package information.
Especially AMD sets the effective APIC id different from the hardware id
as they need to reserve space for the IOAPIC ids starting at id 0.
As a consequence those machines trigger the currently active firmware
bug printouts in dmesg. These are obviously wrong.
- Virtual machines have their own interesting ways of enumerating APICs and
packages which are not reliably covered by the current implementation.
The sizing of the mapping array has been tweaked to be generously large to
handle systems which provide a wrong core count when HT is disabled so the
whole magic which checks for space in the physical hotplug case is not
needed anymore.
Simplify the whole machinery and do the mapping when the CPU starts and the
CPUID derived physical package information is available. This solves the
observed problems on AMD machines and works for the virtualization issues
as well.
Remove the extra call from XEN cpu bringup code as it is no longer
required.
Fixes: d49597fd3b ("x86/cpu: Deal with broken firmware (VMWare/XEN)")
Reported-and-tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: M. Vefa Bicakci <m.v.b@runbox.com>
Cc: xen-devel <xen-devel@lists.xen.org>
Cc: Charles (Chas) Williams <ciwillia@brocade.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Alok Kataria <akataria@vmware.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1612121102260.3429@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 847fa1a6d3 upstream.
With new binutils, gcc may get smart with its optimization and change a jmp
from a 5 byte jump to a 2 byte one even though it was jumping to a global
function. But that global function existed within a 2 byte radius, and gcc
was able to optimize it. Unfortunately, that jump was also being modified
when function graph tracing begins. Since ftrace expected that jump to be 5
bytes, but it was only two, it overwrote code after the jump, causing a
crash.
This was fixed for x86_64 with commit 8329e818f1, with the same subject as
this commit, but nothing was done for x86_32.
Fixes: d61f82d066 ("ftrace: use dynamic patching for updating mcount calls")
Reported-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f83f12d660 upstream.
These fields are 64 bit, using le32_to_cpu and friends
on these will not do the right thing.
Fix this up.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5716863e0f upstream.
fsnotify_unmount_inodes() plays complex tricks to pin the next inode in the
sb->s_inodes list when iterating over all inodes. Furthermore the code has a
bug: if the current inode is the last one on i_sb_list that does not have e.g.
I_FREEING set, then we leave next_i pointing to an inode which may get removed
from the i_sb_list once we drop s_inode_list_lock, thus resulting in
use-after-free issues (usually manifesting as infinite looping in
fsnotify_unmount_inodes()).
Fix the problem by keeping the current inode pinned somewhat longer. Then we
can make the code much simpler and standard.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef85b67385 upstream.
When L2 exits to L0 due to "exception or NMI", software exceptions
(#BP and #OF) for which L1 has requested an intercept should be
handled by L1 rather than L0. Previously, only hardware exceptions
were forwarded to L1.
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f064a0de15 upstream.
The hashed page table MMU in POWER processors can update the R
(reference) and C (change) bits in a HPTE at any time until the
HPTE has been invalidated and the TLB invalidation sequence has
completed. In kvmppc_h_protect, which implements the H_PROTECT
hypercall, we read the HPTE, modify the second doubleword,
invalidate the HPTE in memory, do the TLB invalidation sequence,
and then write the modified value of the second doubleword back
to memory. In doing so we could overwrite an R/C bit update done
by hardware between when we read the HPTE and when the TLB
invalidation completed. To fix this we re-read the second
doubleword after the TLB invalidation and OR in the (possibly)
new values of R and C. We can use an OR since hardware only ever
sets R and C, never clears them.
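A minimal sketch of the re-read and OR described above (hpte[1] being the second doubleword, big-endian in memory):
	/* fold in any R/C updates the hardware made before the TLB
	 * invalidation completed; hardware only ever sets these bits */
	r |= be64_to_cpu(hpte[1]) & (HPTE_R_R | HPTE_R_C);
	hpte[1] = cpu_to_be64(r);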
This race was found by code inspection. In principle this bug could
cause occasional guest memory corruption under host memory pressure.
Fixes: a8606e20e4 ("KVM: PPC: Handle some PAPR hcalls in the kernel", 2011-06-29)
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0d808df06a upstream.
When switching from/to a guest that has a transaction in progress,
we need to save/restore the checkpointed register state. Although
XER is part of the CPU state that gets checkpointed, the code that
does this saving and restoring doesn't save/restore XER.
This fixes it by saving and restoring the XER. To allow userspace
to read/write the checkpointed XER value, we also add a new ONE_REG
specifier.
The visible effect of this bug is that the guest may see its XER
value being corrupted when it uses transactions.
Fixes: e4e3812150 ("KVM: PPC: Book3S HV: Add transactional memory support")
Fixes: 0a8eccefcb ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8d7c33232 upstream.
The current implementation employs a 16-bit counter of active stripes in the
lower bits of bio->bi_phys_segments. If a request is big enough to overflow
this counter, the bio will be completed and freed too early.
Fortunately this does not happen in the default configuration because several
other limits prevent it: stripe_cache_size * nr_disks effectively
limits the count of active stripes. And the small max_sectors_kb of the lower
disks prevents it during normal read/write operations.
The overflow easily happens in discard if it's enabled by the module parameter
"devices_handle_discard_safely" and stripe_cache_size is set big enough.
This patch limits the request size to 256MB - 8KB to prevent overflows.
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Shaohua Li <shli@kernel.org>
Cc: Neil Brown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 04da73803c upstream.
The use of IRQF_ONESHOT when registering an interrupt handler with
request_irq() is nonsensical.
Not only that, it also prevents the handler from being threaded when it
otherwise should be when IRQ_FORCED_THREADING is enabled. This causes the
following deadlock observed by Sean Nyekjaer on -rt:
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[..]
rt_spin_lock_slowlock from queue_kthread_work
queue_kthread_work from sc16is7xx_irq
sc16is7xx_irq [sc16is7xx] from handle_irq_event_percpu
handle_irq_event_percpu from handle_irq_event
handle_irq_event from handle_level_irq
handle_level_irq from generic_handle_irq
generic_handle_irq from mxc_gpio_irq_handler
mxc_gpio_irq_handler from mx3_gpio_irq_handler
mx3_gpio_irq_handler from generic_handle_irq
generic_handle_irq from __handle_domain_irq
__handle_domain_irq from gic_handle_irq
gic_handle_irq from __irq_svc
__irq_svc from rt_spin_unlock
rt_spin_unlock from kthread_worker_fn
kthread_worker_fn from kthread
kthread from ret_from_fork
Fixes: 9e6f4ca3e5 ("sc16is7xx: use kthread_worker for tx_work and irq")
Reported-by: Sean Nyekjaer <sean.nyekjaer@prevas.dk>
Signed-off-by: Josh Cartwright <joshc@ni.com>
Cc: linux-rt-users@vger.kernel.org
Cc: Jakub Kicinski <moorray3@wp.pl>
Cc: linux-serial@vger.kernel.org
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Julia Cartwright <julia@ni.com>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 21cbe3cc8a upstream.
The ARMv8 architecture allows the cycle counter to be configured
by setting PMSELR_EL0.SEL==0x1f and then accessing PMXEVTYPER_EL0,
hence accessing PMCCFILTR_EL0. But it disallows the use of
PMSELR_EL0.SEL==0x1f to access the cycle counter itself through
PMXEVCNTR_EL0.
Linux itself doesn't violate this rule, but we may end up with
PMSELR_EL0.SEL being set to 0x1f when we enter a guest. If that
guest accesses PMXEVCNTR_EL0, the access may UNDEF at EL1,
despite the guest not having done anything wrong.
In order to avoid this unfortunate course of events (haha!), let's
sanitize PMSELR_EL0 on guest entry. This ensures that the guest
won't explode unexpectedly.
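A minimal sketch of the sanitizing write on guest entry:
	/* make sure PMSELR_EL0.SEL is not left at 0x1f when entering
	 * the guest, so PMXEVCNTR_EL0 accesses cannot UNDEF at EL1 */
	write_sysreg(0, pmselr_el0);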
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9f88eb4df7 upstream.
When re-adding crash kernel memory within setup_resources() the
function memblock_add() is used. That function will add memory by
default to node "MAX_NUMNODES" instead of node 0, like the memory
detection code does. In case of !NUMA this will trigger this warning
when the kernel generates the vmemmap:
Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead
WARNING: CPU: 0 PID: 0 at mm/memblock.c:1261 memblock_virt_alloc_internal+0x76/0x220
CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc6 #16
Call Trace:
[<0000000000d0b2e8>] memblock_virt_alloc_try_nid+0x88/0xc8
[<000000000083c8ea>] __earlyonly_bootmem_alloc.constprop.1+0x42/0x50
[<000000000083e7f4>] vmemmap_populate+0x1ac/0x1e0
[<0000000000840136>] sparse_mem_map_populate+0x46/0x68
[<0000000000d0c59c>] sparse_init+0x184/0x238
[<0000000000cf45f6>] paging_init+0xbe/0xf8
[<0000000000cf1d4a>] setup_arch+0xa02/0xae0
[<0000000000ced75a>] start_kernel+0x72/0x450
[<0000000000100020>] _stext+0x20/0x80
If NUMA is selected, numa_setup_memory() will fix the node assignments
before the vmemmap is populated; so this warning will only appear
if NUMA is not selected.
To fix this simply use memblock_add_node() and re-add crash kernel
memory explicitly to node 0.
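A minimal sketch of the fix (crashk_res being the crash kernel resource):
	/* re-add the crash kernel range to node 0, matching the memory
	 * detection code, instead of memblock's MAX_NUMNODES default */
	memblock_add_node(crashk_res.start, resource_size(&crashk_res), 0);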
Reported-and-tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 4e042af463 ("s390/kexec: fix crash on resize of reserved memory")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5457e03de9 upstream.
The buffer for iucv_message_receive() needs to be below 2 GB. In
__iucv_message_receive(), the buffer address is cast to a u32, which
would result in either memory corruption or an addressing exception when
using addresses >= 2 GB.
Fix this by using GFP_DMA for the buffer allocation.
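A minimal sketch of the allocation change (buffer name illustrative):
	/* GFP_DMA on s390 keeps the allocation below 2 GB, so the 31-bit
	 * buffer address handed to __iucv_message_receive() stays valid */
	buffer = kzalloc(len, GFP_DMA | GFP_KERNEL);
	if (!buffer)
		return -ENOMEM;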
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2e700f8d85 upstream.
When you use the firmware usermode helper fallback with a timeout value set to a
value greater than INT_MAX (2147483647), a cast overflow issue causes the
timeout value to go negative and breaks all usermode helper loading. This
regression was introduced through commit 68ff2a00db ("firmware_loader:
handle timeout via wait_for_completion_interruptible_timeout()") on kernel
v4.0.
The firmware_class driver relies on the firmware usermode helper
fallback as a mechanism to look for firmware if the direct filesystem
search failed, but only if:
a) You've enabled CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros):
Then all of these callers will rely on the fallback mechanism in case
the firmware is not found through an initial direct filesystem lookup:
o request_firmware()
o request_firmware_into_buf()
o request_firmware_nowait()
b) If you've only enabled CONFIG_FW_LOADER_USER_HELPER (most distros):
Then only callers using request_firmware_nowait() with the second
argument set to false, this explicitly is requesting the UMH firmware
fallback to be relied on in case the first filesystem lookup fails.
Using Coccinelle SmPL grammar we have identified only two drivers
explicitly requesting the UMH firmware fallback mechanism:
- drivers/firmware/dell_rbu.c
- drivers/leds/leds-lp55xx-common.c
Since most distributions only enable CONFIG_FW_LOADER_USER_HELPER the
biggest impact of this regression are users of the dell_rbu and
leds-lp55xx-common device driver which required the UMH to find their
respective needed firmwares.
The default timeout for the UMH is set to 60 seconds always, as of
commit 68ff2a00db ("firmware_loader: handle timeout via
wait_for_completion_interruptible_timeout()") the timeout was bumped
to MAX_JIFFY_OFFSET ((LONG_MAX >> 1)-1). Additionally the MAX_JIFFY_OFFSET
value was also used if the timeout was configured by a user to 0.
The following works:
echo 2147483647 > /sys/class/firmware/timeout
But both of the following set the timeout to MAX_JIFFY_OFFSET even if
we display 0 back to userspace:
echo 2147483648 > /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0
echo 0> /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0
A max value of INT_MAX (2147483647) seconds is therefore implicit due to
another cast with simple_strtol().
This fixes the secondary cast (the first one is simple_strtol(), but it's an
issue only because it forces an implicit limit) by re-using the timeout
variable and only setting retval in appropriate cases.
Lastly, it is worth noting that systemd ripped out the UMH firmware fallback
mechanism from udev in 2014 via commit be2ea723b1d023b3d
("udev: remove userspace firmware loading support"), i.e. as of systemd v217.
Signed-off-by: Yves-Alexis Perez <corsac@corsac.net>
Fixes: 68ff2a00db "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[mcgrof@kernel.org: gave commit log a whole lot of love]
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08fe007968 upstream.
An ARC700 customer reported linux boot crashes when upgrading to bigger
L1 dcache (64K from 32K). Turns out they had an aliasing VIPT config and
current code only assumed 2 colours, while theirs had 4. So default to 4
colours and complain if the hardware has more. Ideally this needs to be a
Kconfig option, but heck that's too much of a hassle for a single user.
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d2a145252c upstream.
A race between scanning and fc_remote_port_delete() may result in a
permanent stop if the device gets blocked before scsi_sysfs_add_sdev()
and unblocked after. The reason is that blocking a device sets both the
SDEV_BLOCKED state and the QUEUE_FLAG_STOPPED. However,
scsi_sysfs_add_sdev() unconditionally sets SDEV_RUNNING which causes the
device to be ignored by scsi_target_unblock() and thus never have its
QUEUE_FLAG_STOPPED cleared leading to a device which is apparently
running but has a stopped queue.
We actually have two places where SDEV_RUNNING is set: once in
scsi_add_lun() which respects the blocked flag and once in
scsi_sysfs_add_sdev() which doesn't. Since the second set is entirely
spurious, simply remove it to fix the problem.
Reported-by: Zengxi Chen <chenzengxi@huawei.com>
Signed-off-by: Wei Fang <fangwei1@huawei.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6f2ce1c6af upstream.
It is unavoidable that zfcp_scsi_queuecommand() has to finish requests
with DID_IMM_RETRY (like fc_remote_port_chkready()) during the time
window when zfcp detected an unavailable rport but
fc_remote_port_delete(), which is asynchronous via
zfcp_scsi_schedule_rport_block(), has not yet blocked the rport.
However, for the case when the rport becomes available again, we should
prevent unblocking the rport too early. In contrast to other FCP LLDDs,
zfcp has to open each LUN with the FCP channel hardware before it can
send I/O to a LUN. So if a port already has LUNs attached and we
unblock the rport just after port recovery, recoveries of LUNs behind
this port can still be pending, which in turn forces
zfcp_scsi_queuecommand() to unnecessarily finish requests with
DID_IMM_RETRY.
This also opens a time window with unblocked rport (until the followup
LUN reopen recovery has finished). If a scsi_cmnd timeout occurs during
this time window fc_timed_out() cannot work as desired and such command
would indeed time out and trigger scsi_eh. This prevents a clean and
timely path failover. This should not happen if the path issue can be
recovered on FC transport layer such as path issues involving RSCNs.
Fix this by only calling zfcp_scsi_schedule_rport_register(), to
asynchronously trigger fc_remote_port_add(), after all LUN recoveries as
children of the rport have finished and no new recoveries of equal or
higher order were triggered meanwhile. Finished intentionally includes
any recovery result no matter if successful or failed (still unblock
rport so other successful LUNs work). For simplicity, we check after
each finished LUN recovery if there is another LUN recovery pending on
the same port and then do nothing. We handle the special case of a
successful recovery of a port without LUN children the same way without
changing this case's semantics.
For debugging we introduce 2 new trace records written if the rport
unblock attempt was aborted due to still unfinished or freshly triggered
recovery. The records are only written above the default trace level.
Benjamin noticed the important special case of new recovery that can be
triggered between having given up the erp_lock and before calling
zfcp_erp_action_cleanup() within zfcp_erp_strategy(). We must avoid the
following sequence:
ERP thread                rport_work     other context
------------------------- -------------- --------------------------------
port is unblocked, rport still blocked,
due to pending/running ERP action,
so ((port->status & ...UNBLOCK) != 0)
and (port->rport == NULL)
unlock ERP
zfcp_erp_action_cleanup()
case ZFCP_ERP_ACTION_REOPEN_LUN:
zfcp_erp_try_rport_unblock()
((status & ...UNBLOCK) != 0) [OLD!]
                                         zfcp_erp_port_reopen()
                                         lock ERP
                                         zfcp_erp_port_block()
                                         port->status clear ...UNBLOCK
                                         unlock ERP
                                         zfcp_scsi_schedule_rport_block()
                                         port->rport_task = RPORT_DEL
                                         queue_work(rport_work)
                          zfcp_scsi_rport_work()
                          (port->rport_task != RPORT_ADD)
                          port->rport_task = RPORT_NONE
                          zfcp_scsi_rport_block()
                          if (!port->rport) return
zfcp_scsi_schedule_rport_register()
port->rport_task = RPORT_ADD
queue_work(rport_work)
                          zfcp_scsi_rport_work()
                          (port->rport_task == RPORT_ADD)
                          port->rport_task = RPORT_NONE
                          zfcp_scsi_rport_register()
                          (port->rport == NULL)
                          rport = fc_remote_port_add()
                          port->rport = rport;
Now the rport was erroneously unblocked while the zfcp_port is blocked.
This is another situation we want to avoid due to scsi_eh
potential. This state would at least remain until the new recovery from
the other context finished successfully, or potentially forever if it
failed. In order to close this race, we take the erp_lock inside
zfcp_erp_try_rport_unblock() when checking the status of zfcp_port or
LUN. With that, the possible corresponding rport state sequences would
be: (unblock[ERP thread],block[other context]) if the ERP thread gets
erp_lock first and still sees ((port->status & ...UNBLOCK) != 0),
(block[other context],NOP[ERP thread]) if the ERP thread gets erp_lock
after the other context has already cleared ...UNBLOCK from port->status.
Since checking fields of struct erp_action is unsafe because they could
have been overwritten (re-used for new recovery) meanwhile, we only
check status of zfcp_port and LUN since these are only changed under
erp_lock elsewhere. Regarding the check of the proper status flags (port
or port_forced are similar to the shown adapter recovery):
[zfcp_erp_adapter_shutdown()]
zfcp_erp_adapter_reopen()
 zfcp_erp_adapter_block()
  * clear UNBLOCK ---------------------------------------+
 zfcp_scsi_schedule_rports_block()                       |
 write_lock_irqsave(&adapter->erp_lock, flags);-------+  |
 zfcp_erp_action_enqueue()                            |  |
  zfcp_erp_setup_act()                                |  |
   * set ERP_INUSE -----------------------------------|--|--+
 write_unlock_irqrestore(&adapter->erp_lock, flags);--+  |  |
                    .context-switch.                     |  |
zfcp_erp_thread()                                        |  |
 zfcp_erp_strategy()                                     |  |
  write_lock_irqsave(&adapter->erp_lock, flags);------+  |  |
  ...                                                 |  |  |
  zfcp_erp_strategy_check_target()                    |  |  |
   zfcp_erp_strategy_check_adapter()                  |  |  |
    zfcp_erp_adapter_unblock()                        |  |  |
     * set UNBLOCK -----------------------------------|--+  |
  zfcp_erp_action_dequeue()                           |     |
   * clear ERP_INUSE ---------------------------------|-----+
  ...                                                 |
  write_unlock_irqrestore(&adapter->erp_lock, flags);-+
Hence, we should check for both UNBLOCK and ERP_INUSE because they are
interleaved. Also we need to explicitly check ERP_FAILED for the link
down case which currently does not clear the UNBLOCK flag in
zfcp_fsf_link_down_info_eval().
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: 8830271c48 ("[SCSI] zfcp: Dont fail SCSI commands when transitioning to blocked fc_rport")
Fixes: a2fa0aede0 ("[SCSI] zfcp: Block FC transport rports early on errors")
Fixes: 5f852be9e1 ("[SCSI] zfcp: Fix deadlock between zfcp ERP and SCSI")
Fixes: 338151e066 ("[SCSI] zfcp: make use of fc_remote_port_delete when target port is unavailable")
Fixes: 3859f6a248 ("[PATCH] zfcp: add rports to enable scsi_add_device to work again")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 56d23ed7ad upstream.
Since quite a while, Linux issues enough SCSI commands per scsi_device
which successfully return with FCP_RESID_UNDER, FSF_FCP_RSP_AVAILABLE,
and SAM_STAT_GOOD. This floods the HBA trace area and we cannot see
other and important HBA trace records long enough.
Therefore, do not trace HBA response errors for pure benign residual
under counts at the default trace level.
This excludes benign residual under count combined with other validity
bits set in FCP_RSP_IU, such as FCP_SNS_LEN_VAL. For all those other
cases, we still do want to see both the HBA record and the corresponding
SCSI record by default.
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Fixes: a54ca0f62f ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dac37e15b7 upstream.
When SCSI EH invokes zFCP's callbacks for eh_device_reset_handler() and
eh_target_reset_handler(), it expects us to relent the ownership over
the given scsi_cmnd and all other scsi_cmnds within the same scope - LUN
or target - when returning with SUCCESS from the callback ('release'
them). SCSI EH can then reuse those commands.
We did not follow this rule to release commands upon SUCCESS; and if
later a reply arrived for one of those supposed to be released commands,
we would still make use of the scsi_cmnd in our ingress tasklet. This
will at least result in undefined behavior or a kernel panic because of
a wrong kernel pointer dereference.
To fix this, we NULLify all pointers to scsi_cmnds (struct zfcp_fsf_req
*)->data in the matching scope if a TMF was successful. This is done
under the locks (struct zfcp_adapter *)->abort_lock and (struct
zfcp_reqlist *)->lock to prevent the requests from being removed from
the request-hashtable, and the ingress tasklet from making use of the
scsi_cmnd-pointer in zfcp_fsf_fcp_cmnd_handler().
For cases where a reply arrives during SCSI EH, but before we get a
chance to NULLify the pointer - but before we return from the callback
-, we assume that the code is protected from races via the CAS operation
in blk_complete_request() that is called in scsi_done().
The following stacktrace shows an example for a crash resulting from the
previous behavior:
Unable to handle kernel pointer dereference at virtual kernel address fffffee17a672000
Oops: 0038 [#1] SMP
CPU: 2 PID: 0 Comm: swapper/2 Not tainted
task: 00000003f7ff5be0 ti: 00000003f3d38000 task.ti: 00000003f3d38000
Krnl PSW : 0404d00180000000 00000000001156b0 (smp_vcpu_scheduled+0x18/0x40)
R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 EA:3
Krnl GPRS: 000000200000007e 0000000000000000 fffffee17a671fd8 0000000300000015
ffffffff80000000 00000000005dfde8 07000003f7f80e00 000000004fa4e800
000000036ce8d8f8 000000036ce8d9c0 00000003ece8fe00 ffffffff969c9e93
00000003fffffffd 000000036ce8da10 00000000003bf134 00000003f3b07918
Krnl Code: 00000000001156a2: a7190000 lghi %r1,0
00000000001156a6: a7380015 lhi %r3,21
#00000000001156aa: e32050000008 ag %r2,0(%r5)
>00000000001156b0: 482022b0 lh %r2,688(%r2)
00000000001156b4: ae123000 sigp %r1,%r2,0(%r3)
00000000001156b8: b2220020 ipm %r2
00000000001156bc: 8820001c srl %r2,28
00000000001156c0: c02700000001 xilf %r2,1
Call Trace:
([<0000000000000000>] 0x0)
[<000003ff807bdb8e>] zfcp_fsf_fcp_cmnd_handler+0x3de/0x490 [zfcp]
[<000003ff807be30a>] zfcp_fsf_req_complete+0x252/0x800 [zfcp]
[<000003ff807c0a48>] zfcp_fsf_reqid_check+0xe8/0x190 [zfcp]
[<000003ff807c194e>] zfcp_qdio_int_resp+0x66/0x188 [zfcp]
[<000003ff80440c64>] qdio_kick_handler+0xdc/0x310 [qdio]
[<000003ff804463d0>] __tiqdio_inbound_processing+0xf8/0xcd8 [qdio]
[<0000000000141fd4>] tasklet_action+0x9c/0x170
[<0000000000141550>] __do_softirq+0xe8/0x258
[<000000000010ce0a>] do_softirq+0xba/0xc0
[<000000000014187c>] irq_exit+0xc4/0xe8
[<000000000046b526>] do_IRQ+0x146/0x1d8
[<00000000005d6a3c>] io_return+0x0/0x8
[<00000000005d6422>] vtime_stop_cpu+0x4a/0xa0
([<0000000000000000>] 0x0)
[<0000000000103d8a>] arch_cpu_idle+0xa2/0xb0
[<0000000000197f94>] cpu_startup_entry+0x13c/0x1f8
[<0000000000114782>] smp_start_secondary+0xda/0xe8
[<00000000005d6efe>] restart_int_handler+0x56/0x6c
[<0000000000000000>] 0x0
Last Breaking-Event-Address:
[<00000000003bf12e>] arch_spin_lock_wait+0x56/0xb0
Suggested-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Fixes: ea127f9754 ("[PATCH] s390 (7/7): zfcp host adapter.") (tglx/history.git)
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83337e5443 upstream.
If iscsit_tpg_add_network_portal() fails then
return the error code instead of 0 to user space.
If iscsi-target returns 0 then user space keeps
retrying the same command infinitely; targetcli or
echo hangs until the command completes with a
non-zero return value. In some cases it is possible
that the add network portal command never completes
successfully even after retrying multiple times,
for example cxgbit_setup_np() always returns
-EINVAL if the portal IP does not belong to a
Chelsio adapter interface.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
[ bvanassche: Added "Fixes:" and "Cc: stable" tags ]
Fixes: commit d4b3fa4b08 ("iscsi-target: Make iscsi_tpg_np driver show/store use generic code")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 18e1c7f68a upstream.
For SRIOV enabled firmware, if there is an OCR (online controller reset)
possibility the driver sets the convert flag to 1, which was not happening
if there were outstanding commands even after 180 seconds. Because the
driver did not set the convert flag to 1 and still let the OCR run, the
VF (virtual function) driver wrote directly to the register instead of
waiting for 30 seconds. Setting the convert flag to 1 makes the VF driver
wait for 30 seconds before going for reset.
Signed-off-by: Kiran Kumar Kasturi <kiran-kumar.kasturi@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31b5929d53 upstream.
There is a disagreement between drivers/tty/vt/keyboard.c and
drivers/input/input-leds.c with regard to what is a Scroll Lock LED
trigger name: input calls it "kbd-scrolllock", but vt calls it
"kbd-scrollock" (two l's).
This prevents Scroll Lock LED trigger from binding to this LED by default.
Since it is a scroLL Lock LED, the interface was introduced only about a
year ago, and in an Internet search people seem to reference this trigger
only to set it to this LED, let's simply rename it to "kbd-scrolllock".
Also, it looks like this was supposed to be changed before this code was
merged: https://lkml.org/lkml/2015/6/9/697 but it was done only on
the input side.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5f8e166c2 upstream.
pm_runtime_autosuspend can take a synchronous or an asynchronous
path. Because we call pm_runtime_mark_last_busy just before it, in
most cases it takes the asynchronous way. However, when the FW or the
driver resets during an already running runtime suspend, the call ends
up invoking the driver's rpm callback directly and results in a
deadlock on device_lock.
The simplest fix is to replace pm_runtime_autosuspend with
asynchronous pm_request_autosuspend.
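A minimal sketch of the resulting idle path, assuming a generic struct
device pointer (the wrapper name here is illustrative, not the driver's):

  #include <linux/pm_runtime.h>

  static void schedule_autosuspend(struct device *dev)
  {
      pm_runtime_mark_last_busy(dev);
      /* Always asynchronous: queues the autosuspend instead of possibly
       * invoking the rpm callback directly under device_lock. */
      pm_request_autosuspend(dev);
  }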
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10e2ca346b upstream.
If the vBIOS noFan bit is set, the fan table parameters in the thermal
controller will not get initialized. The driver should avoid using these
uninitialized parameters in its calculations. Otherwise, it may trigger a
divide-by-zero error.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b27add13f5 upstream.
This avoids an issue that occurs when we're attempting to preempt multiple
channels simultaneously. HW seems to ignore preempt requests while it's
still processing a previous one, which, well, makes sense.
Fixes random "fifo: SCHED_ERROR 0d []" + GPCCS page faults during parallel
piglit runs on (at least) GM107.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b3800a6b7 upstream.
DPAUX registers moved on Kepler, these chipsets were still using the
Fermi implementation for some reason.
This fixes detection of hotplug/sink IRQs on DP connectors.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 10dcab3e7f upstream.
TTM was changed a while back to allow for pipelining of buffer moves, and
part of this was the removal of waiting for a BO to idle before calling
move(), placing the responsibility on the driver to do this if required.
That's all well and good, except, we make use of move_notify() to handle
mapping/unmapping from the GPU VMM as move() isn't called on all paths.
This commit adds a wait before unmapping from a VMM in move_notify(), to
prevent GPU page faults where a buffer is still being accessed.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e137040e0d upstream.
Look for firmware files using the legacy ("nouveau/nvxx_fucxxxx") path
if they cannot be found in the new, "official" path. User setups were
broken by the switch, which is bad.
There are only 4 firmware files we may want to look up that way, so
hardcode them into the lookup function. All new firmware files should
use the standard "nvidia/<chip>/gr/" path.
Fixes: 8539b37ace ("drm/nouveau/gr: use NVIDIA-provided external firmwares")
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit dd31ae9ac9 upstream.
GUI idle interrupts should be enabled only after we
have enabled coarse grain clock gating (CGCG). This
prevents GFX engine generating idle interrupt even
though CGCG is not completely enabled.
Most of the time this goes unnoticed, but on some
Stoney ASICs this results in a GFX engine hang after
the system resumes from suspend. The issue is not
particular to Stoney though and could have occurred
on any ASIC. The patch fixes this issue.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Sunil Uttarwar <Sunil.Uttarwar1@amd.com>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e57ec613d upstream.
We were storing viewport relative coordinates. However, crtc_cursor_set2
and cursor_reset pass amdgpu_crtc->cursor_x/y as the x/y parameters of
cursor_move_locked, which would break if the CRTC isn't located at
(0, 0).
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6276e53fa8 upstream.
The HP Pavilion dv6 has a non-working acpi_video0 backlight interface
and an intel_backlight interface which works fine. Add a force_native
quirk for it so that the non-working acpi_video0 interface does not get
registered.
Note that there are quite a few HP Pavilion dv6 variants, some
with ATI and some with NVIDIA hybrid gfx; both seem to need this
quirk to have working backlight control. There are also some versions
with only Intel integrated gfx, these may not need this quirk, but it
should not hurt there.
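For reference, such a force_native quirk is expressed as a DMI table
entry of roughly this shape (a sketch; the match strings below are
placeholders, not necessarily the exact strings added by this patch):

  #include <linux/dmi.h>

  static const struct dmi_system_id force_native_sketch[] = {
      {
       /* video_detect_force_native() is the existing callback that
        * prefers the native (e.g. intel_backlight) interface. */
       .callback = video_detect_force_native,
       .ident = "HP Pavilion dv6",
       .matches = {
          DMI_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
          DMI_MATCH(DMI_PRODUCT_NAME, "HP Pavilion dv6 Notebook PC"),
          },
      },
      { }
  };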
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1204476
Link: https://bugs.launchpad.net/ubuntu/+source/linux-lts-trusty/+bug/1416940
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 350fa038c3 upstream.
The Dell XPS 17 L702X has a non-working acpi_video0 backlight interface
and an intel_backlight interface which works fine. Add a force_native
quirk for it so that the non-working acpi_video0 interface does not get
registered.
Note that there is also an issue with the brightness keys on this laptop:
they do not generate key-press events in any way. That is not solved by
this patch.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1123661
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 857a661020 upstream.
Commit 0557344e21 ("staging: comedi: ni_mio_common: fix local var for
32-bit read") changed the type of local variable `d` from `unsigned
short` to `unsigned int` to fix a bug introduced in
commit 9c340ac934 ("staging: comedi: ni_stc.h: add read/write
callbacks to struct ni_private") when reading AI data for NI PCI-6110
and PCI-6111 cards. Unfortunately, other parts of the function rely on
the variable being `unsigned short` when an offset value in local
variable `signbits` is added to `d` before writing the value to the
`data` array:
d += signbits;
data[n] = d;
The `signbits` variable will be non-zero in bipolar mode, and is used to
convert the hardware's 2's complement, 16-bit numbers to Comedi's
straight binary sample format (with 0 representing the most negative
voltage). This breaks because `d` is now 32 bits wide instead of 16
bits wide, so after the addition of `signbits`, `data[n]` ends up being
set to values above 65536 for negative voltages. This affects all
supported "E series" cards except PCI-6143 (and PXI-6143). Fix it by
ANDing the value written to the `data[n]` with the mask 0xffff.
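A minimal sketch of the corrected store, using the variable names from
the description above (not the full driver function):

  static void store_ai_sample(unsigned int *data, unsigned int n,
                              unsigned int d, unsigned int signbits)
  {
      d += signbits;          /* straight-binary offset (bipolar mode) */
      data[n] = d & 0xffff;   /* mask back to 16 bits, as was implicit
                               * when 'd' was an unsigned short */
  }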
Fixes: 0557344e21 ("staging: comedi: ni_mio_common: fix local var for 32-bit read")
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 655c4d442d upstream.
For NI M Series cards, the Comedi `insn_read` handler for the AI
subdevice is broken due to ANDing the value read from the AI FIFO data
register with an incorrect mask. The incorrect mask clears all but the
most significant bit of the sample data. It should preserve all the
sample data bits. Correct it.
Fixes: 817144ae7f ("staging: comedi: ni_mio_common: remove unnecessary use of 'board->adbits'")
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8cb86fd95 upstream.
James Simmons reports:
> The ldlm_pool field pl_recalc_time is set to the current
> monotonic clock value but the interval period is calculated
> with the wall clock. This means the interval period will
> always be far larger than the pl_recalc_period, which is
> just a small interval time period. The correct thing to
> do is to use monotonic clock current value instead of the
> wall clocks value when calculating recalc_interval_sec.
This broke when I converted the 32-bit get_seconds() into
ktime_get_{real_,}seconds() inconsistently. Either
one of those two would have worked, but mixing them
does not.
Staying with the original intention of the patch, this
changes the ktime_get_seconds() calls into ktime_get_real_seconds(),
using real time instead of monotonic time.
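The point is simply that the stored timestamp and the interval
computation use the same clock; a sketch with the field names from the
report (types as used after the 64-bit time conversion):

  #include <linux/ktime.h>

  static void pool_recalc_sketch(struct ldlm_pool *pl)
  {
      time64_t recalc_interval_sec;

      recalc_interval_sec = ktime_get_real_seconds() - pl->pl_recalc_time;
      if (recalc_interval_sec < pl->pl_recalc_period)
          return;

      /* ... recalculate pool statistics ... */
      pl->pl_recalc_time = ktime_get_real_seconds();
  }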
Fixes: 8f83409cf2 ("staging/lustre: use 64-bit time for pl_recalc")
Reported-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd15dd6ef4 upstream.
I have been having a lot of unexplainable crashes in osc_lru_shrink
lately that I could not see a good explanation for, and then I found
this patch that slipped under the radar somehow: it incorrectly
converted the while loop used for LRU list iteration into
list_for_each_entry_safe, totally ignoring that in the body of
the loop we drop the spinlocks guarding this list and move list
entries around.
Not sure why it was not showing up right away; perhaps some of the
more recent LRU changes caused some extra pressure on this
code that finally highlighted the breakage.
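The safe pattern when the lock is dropped inside the loop body is to
re-fetch the first entry on every iteration instead of caching a next
pointer; a generic sketch (the discard_one callback stands in for the
real per-page work, which may drop and re-take the lock):

  static void lru_shrink_sketch(spinlock_t *lru_lock, struct list_head *lru,
                                void (*discard_one)(struct osc_page *opg))
  {
      spin_lock(lru_lock);
      while (!list_empty(lru)) {
          struct osc_page *opg;

          opg = list_first_entry(lru, struct osc_page, ops_lru);
          /* may drop lru_lock and move entries around; re-fetching the
           * head on the next pass keeps the iteration valid */
          discard_one(opg);
      }
      spin_unlock(lru_lock);
  }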
Reverts: 8adddc36b1 ("staging: lustre: osc: Use list_for_each_entry_safe")
CC: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abd1026da4 upstream.
"kernel BUG at drivers/hv/channel_mgmt.c:350!" is observed when hv_vmbus
module is unloaded. BUG_ON() was introduced in commit 85d9aa7051
("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()") as
vmbus_free_channels() codepath was apparently forgotten.
Fixes: 85d9aa7051 ("Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister()")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f37fabb864 upstream.
In the critical sysfs entry the thermal hwmon was returning wrong
temperature to the user-space. It was reporting the temperature of the
first trip point instead of the temperature of critical trip point.
For example:
/sys/class/hwmon/hwmon0/temp1_crit:50000
/sys/class/thermal/thermal_zone0/trip_point_0_temp:50000
/sys/class/thermal/thermal_zone0/trip_point_0_type:active
/sys/class/thermal/thermal_zone0/trip_point_3_temp:120000
/sys/class/thermal/thermal_zone0/trip_point_3_type:critical
Since commit e68b16abd9 ("thermal: add hwmon sysfs I/F") the driver
has been registering a sysfs entry if the get_crit_temp() callback was
provided. However when accessed, it was calling get_trip_temp() instead
of get_crit_temp().
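A minimal sketch of the corrected lookup: the temp*_crit attribute has
to ask the zone for its critical trip, not trip point 0 (the helper
name here is illustrative):

  static int hwmon_crit_temp(struct thermal_zone_device *tz, int *temp)
  {
      if (!tz->ops->get_crit_temp)
          return -EINVAL;

      /* previously this path effectively did get_trip_temp(tz, 0, temp) */
      return tz->ops->get_crit_temp(tz, temp);
  }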
Fixes: e68b16abd9 ("thermal: add hwmon sysfs I/F")
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 68af4fa8f3 upstream.
bcm2835_pll_divider_off() is resetting the divider field in the A2W reg
to zero when disabling the clock.
Make sure we preserve this value by reading the previous a2w_reg value
first and ORing the result with A2W_PLL_CHANNEL_DISABLE.
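A sketch of that read-modify-write, with cprman_read()/cprman_write()
standing in for the driver's register accessors and a2w_reg being the
register field named above:

  static void pll_divider_off_sketch(struct bcm2835_cprman *cprman,
                                     const struct bcm2835_pll_divider_data *data)
  {
      /* preserve the divider bits; only OR in the disable bit */
      cprman_write(cprman, data->a2w_reg,
                   cprman_read(cprman, data->a2w_reg) |
                   A2W_PLL_CHANNEL_DISABLE);
  }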
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Fixes: 41691b8862 ("clk: bcm2835: Add support for programming the audio domain clocks")
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e6b9a89af upstream.
Add the VDD_GPU regulator (a GPIO-enabled PWM regulator) to the Jetson
TX1 board. This addition allows the GPU to be used provided the
bootloader properly enabled the GPU node.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
[as pointed out by Thierry on IRC, nobody has reported a bug
in the field, but using a new bootloader with a .dtb that
has the incorrect data will crash on boot]
Fixes: 336f79c7b6 ("arm64: tegra: Add NVIDIA Jetson TX1 Developer Kit support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f4e81c5297 upstream.
The GPIO chardev is used for management tasks (allocating line and event
handles) and does neither support read() nor write() operations. Hence it
does not make much sense to allow seek operations.
Currently the chardev uses noop_llseek() for its seek implementation. This
function does not move the pointer and simply returns the current position
(always 0 for the GPIO chardev). noop_llseek() is primarily meant for
devices that can not support seek, but where there might be a user that
depends on the seek() operation succeeding. For newly added devices that
can not support seek operations it is recommended to use no_llseek(), which
will return an error. For more information see commit 6038f373a3
("llseek: automatically add .llseek fop").
Unfortunately this was overlooked when the GPIO chardev ABI was introduced.
But it is highly unlikely that since then userspace applications have
appeared that rely on being able to perform non-failing seek operations on
a GPIO chardev file descriptor. So it should be safe to change from
noop_llseek() to no_llseek(). Also use nonseekable_open() in the chardev
open() callback to clear the FMODE_SEEK, FMODE_PREAD and FMODE_PWRITE flags
from the file. Neither of these should be set on a file that does not
support seek operations.
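Put together, the open()/file_operations pairing described above looks
roughly like this sketch (names are illustrative; resolving the GPIO
device behind the inode is elided):

  static long gpio_ioctl(struct file *filp, unsigned int cmd,
                         unsigned long arg);   /* existing ioctl handler */

  static int gpio_chrdev_open(struct inode *inode, struct file *filp)
  {
      /* ... look up the gpio_device for this inode ... */
      return nonseekable_open(inode, filp);    /* clears FMODE_*SEEK/PREAD/PWRITE */
  }

  static const struct file_operations gpio_fileops = {
      .owner          = THIS_MODULE,
      .open           = gpio_chrdev_open,
      .unlocked_ioctl = gpio_ioctl,
      .llseek         = no_llseek,             /* seeks now fail with -ESPIPE */
  };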
Fixes: 3c702e9987 ("gpio: add a userspace chardev ABI for GPIOs")
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1516c6350a upstream.
commit 43db289d00 ("gpio: stmpe: Rework registers access")
reworked the STMPE register access so as to use
[STMPE_IDX_*_LSB + i] to access the 8bit register for a
certain bank, assuming the CSB and MSB will follow after
the enumerator. For this to work the index needs to go from
(size-1) to 0 not 0 to (size-1).
However for the GPIO IRQ handler, the status registers we read
register MSB + 3 bytes ahead for the 24 bit GPIOs and index
registers from MSB upwards and run an index i over the
registers UNLESS we are STMPE1600.
This is not working when we get to clearing the interrupt
EDGE status register STMPE_IDX_GPEDR_[LCM]SB: it is indexed
like all other registers [STMPE_IDX_*_LSB + i] but in this
loop we index from 0 to get the right bank index for the
calculations, and we need to just add i to the MSB.
Before this, interrupts on the STMPE2401 were broken, this
patch fixes it so it works again.
Cc: Patrice Chotard <patrice.chotard@st.com>
Fixes: 43db289d00 ("gpio: stmpe: Rework registers access")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c1645727b upstream.
The clocksource delta to nanoseconds conversion is using signed math, but
the delta is unsigned. This makes the conversion space smaller than
necessary and in case of a multiplication overflow the conversion can
become negative. The conversion is done with scaled math:
s64 nsec_delta = ((s64)clkdelta * clk->mult) >> clk->shift;
Shifting a signed integer right obviously preserves the sign, which has
interesting consequences:
- Time jumps backwards
- __iter_div_u64_rem() which is used in one of the calling code paths
will take forever to piecewise calculate the seconds/nanoseconds part.
This has been reported by several people with different scenarios:
David observed that when stopping a VM with a debugger:
"It was essentially the stopped by debugger case. I forget exactly why,
but the guest was being explicitly stopped from outside, it wasn't just
scheduling lag. I think it was something in the vicinity of 10 minutes
stopped."
When lifting the stop the machine went dead.
The stopped by debugger case is not really interesting, but nevertheless it
would be a good thing not to die completely.
But this was also observed on a live system by Liav:
"When the OS is too overloaded, delta will get a high enough value for the
msb of the sum delta * tkr->mult + tkr->xtime_nsec to be set, and so
after the shift the nsec variable will gain a value similar to
0xffffffffff000000."
Unfortunately this has been reintroduced recently with commit 6bd58f09e1
("time: Add cycles to nanoseconds translation"). It had been fixed a year
ago already in commit 35a4933a89 ("time: Avoid signed overflow in
timekeeping_get_ns()").
Though it's not surprising that the issue has been reintroduced because the
function itself and the whole call chain uses s64 for the result and the
propagation of it. The change in this recent commit is subtle:
s64 nsec;
- nsec = (d * m + n) >> s:
+ nsec = d * m + n;
+ nsec >>= s;
d being type of cycle_t adds another level of obfuscation.
This wouldn't have happened if the previous change to unsigned computation
would have made the 'nsec' variable u64 right away and a follow up patch
had cleaned up the whole call chain.
There have been patches submitted which basically did a revert of the above
patch leaving everything else unchanged as signed. Back to square one. This
spawned an admittedly pointless discussion about potential users which rely
on the unsigned behaviour until someone pointed out that it had been fixed
before. The changelogs of said patches added further confusion as they made
finally false claims about the consequences for eventual users which expect
signed results.
Despite delta being cycle_t, aka. u64, it's very well possible to hand in
a signed negative value and the signed computation will happily return the
correct result. But nobody actually sat down and analyzed the code which
was added as a user after the probably unintended signed conversion.
Though in sensitive code like this it's better to analyze it properly and
make sure that nothing relies on this than hunting the subtle wreckage half
a year later. After analyzing all call chains it stands that no caller can
hand in a negative value (which actually would work due to the s64 cast)
and rely on the signed math to do the right thing.
Change the conversion function to unsigned math. The conversion of all call
chains is done in a follow up patch.
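The resulting conversion, as a minimal sketch (the real helper lives in
the timekeeping core; the name here is simplified):

  static inline u64 delta_to_ns_sketch(u64 delta, u32 mult, u32 shift)
  {
      /* unsigned multiply and unsigned shift: an overflow wraps instead
       * of producing a negative, backwards-jumping result */
      return (delta * mult) >> shift;
  }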
This solves the starvation issue, which was caused by the negative result,
but it does not solve the underlying problem. It merely procrastinates
it. When the timekeeper update is deferred long enough that the unsigned
multiplication overflows, then time going backwards is observable again.
It does neither solve the issue of clocksources with a small counter width
which will wrap around possibly several times and cause random time stamps
to be generated. But those are usually not found on systems used for
virtualization, so this is likely a non issue.
I took the liberty to claim authorship for this simply because
analyzing all callsites and writing the changelog took substantially
more time than just making the simple s/s64/u64/ change and ignore the
rest.
Fixes: 6bd58f09e1 ("time: Add cycles to nanoseconds translation")
Reported-by: David Gibson <david@gibson.dropbear.id.au>
Reported-by: Liav Rehana <liavr@mellanox.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Parit Bhargava <prarit@redhat.com>
Cc: Laurent Vivier <lvivier@redhat.com>
Cc: "Christopher S. Hall" <christopher.s.hall@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20161208204228.688545601@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e85baa8868 upstream.
The mmc_read_ssr() function results in DMA to the raw_ssr member of
struct mmc_card, which is not guaranteed to be cache line aligned & thus
might not meet the requirements set out in Documentation/DMA-API.txt:
Warnings: Memory coherency operates at a granularity called the cache
line width. In order for memory mapped by this API to operate
correctly, the mapped region must begin exactly on a cache line
boundary and end exactly on one (to prevent two separately mapped
regions from sharing a single cache line). Since the cache line size
may not be known at compile time, the API will not enforce this
requirement. Therefore, it is recommended that driver writers who
don't take special care to determine the cache line size at run time
only map virtual regions that begin and end on page boundaries (which
are guaranteed also to be cache line boundaries).
On some systems where DMA is non-coherent this can lead to us losing
data that shares cache lines with the raw_ssr array.
Fix this by kmalloc'ing a temporary buffer to perform DMA into. kmalloc
will ensure the buffer is suitably aligned, allowing the DMA to be
performed without any loss of data.
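A sketch of the bounce-buffer approach (do_ssr_dma() is a hypothetical
stand-in for the command that DMAs the 64-byte SD Status block):

  /* hypothetical helper: issues the SD Status read, DMAing into 'buf' */
  static int do_ssr_dma(struct mmc_card *card, __be32 *buf);

  static int read_ssr_sketch(struct mmc_card *card)
  {
      __be32 *raw_ssr;
      int i, err;

      raw_ssr = kmalloc(sizeof(card->raw_ssr), GFP_KERNEL); /* suitably aligned */
      if (!raw_ssr)
          return -ENOMEM;

      err = do_ssr_dma(card, raw_ssr);
      if (!err)
          for (i = 0; i < 16; i++)
              card->raw_ssr[i] = be32_to_cpu(raw_ssr[i]);

      kfree(raw_ssr);
      return err;
  }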
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 5275a652d2 ("mmc: sd: Export SD Status via “ssr” device attribute")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 295070e9aa upstream.
The regulator has never been properly enabled, it has been
dormant all the time. It's strange that MMC was working
at all, but it likely worked by the signals going through
the levelshifter and reaching the card anyways.
Fixes: 3615a34ea1 ("regulator: add STw481x VMMC driver")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 61e53bd004 upstream.
Clearing the tuning bits should reset the tuning circuit. However there is
more to do. Reset the command and data lines for good measure, and then
for eMMC ensure the card is not still trying to process a tuning command by
sending a stop command.
Note the JEDEC eMMC specification says the stop command (CMD12) can be used
to stop a tuning command (CMD21) whereas the SD specification is silent on
the subject with respect to the SD tuning command (CMD19). Considering that
CMD12 is not a valid SDIO command, the stop command is sent only when the
tuning command is CMD21 i.e. for eMMC. That addresses cases seen so far
which have been on eMMC.
Note that this replaces the commit fe5fb2e3b5 ("mmc: sdhci: Reset cmd and
data circuits after tuning failure") which is being reverted for v4.9+.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Dan O'Donovan <dan@emutex.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ca71c27ee upstream.
This reverts commit fe5fb2e3b5 ("mmc: sdhci: Reset cmd and data circuits
after tuning failure").
A better fix is available, and it will be applied to older stable releases,
so get this out of the way by reverting it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 79e57dd113 upstream.
The active_high LED of my Wistron DNMA-92 is still being recognized as
active_low on 4.7.6 mainline. When I was preparing my former commit
0f9edcdd88 ("ath9k: Fix LED polarity for some Mini PCI AR9220 MB92
cards.") to fix that I must have somehow messed up with testing, because
I tested the final version of that patch before sending it, and it was
apparently working; but now it is not working on 4.7.6 mainline.
I initially added the PCI_DEVICE_SUB section for 0x0029/0x2096 above the
PCI_VDEVICE section for 0x0029; but then I moved the former below the
latter after seeing how 0x002A sections were sorted in the file.
This turned out to be wrong: if a generic PCI_VDEVICE entry (that has
both subvendor and subdevice IDs set to PCI_ANY_ID) is put before a more
specific one (PCI_DEVICE_SUB), then the generic PCI_VDEVICE entry will
match first and will be used.
With this patch, 0x0029/0x2096 has finally got active_high LED on 4.7.6.
While I'm at it, let's fix 0x002A too by also moving its generic definition
below its specific ones.
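Schematically, the table has to keep the specific entry ahead of the
catch-all for the same device ID, e.g. (a sketch; IDs are the ones from
the text above, the flag name is the one ath9k uses for LED polarity):

  static const struct pci_device_id ath_pci_id_table_sketch[] = {
      { PCI_DEVICE_SUB(PCI_VENDOR_ID_ATHEROS, 0x0029,
                       PCI_VENDOR_ID_ATHEROS, 0x2096),
        .driver_data = ATH9K_PCI_LED_ACT_HI }, /* specific: active-high LED */
      { PCI_VDEVICE(ATHEROS, 0x0029) },        /* generic catch-all, last   */
      { 0 }
  };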
Fixes: 0f9edcdd88 ("ath9k: Fix LED polarity for some Mini PCI AR9220 MB92 cards.")
Signed-off-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
[kvalo@qca.qualcomm.com: improve the commit log based on email discussions]
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91851cc7a9 upstream.
Commit b2d70d4944 ("ath9k: make GPIO API to support both of WMAC and
SOC") refactored ath9k_hw_gpio_get() to support both WMAC and SOC GPIOs,
changing the return on success from 1 to BIT(gpio). This broke some callers
like ath_is_rfkill_set(). This doesn't fix any known bug in mainline at the
moment, but should be fixed anyway.
Instead of fixing all callers, change ath9k_hw_gpio_get() back to only
return 0 or 1.
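In other words, the helper now collapses the bank value back to a
boolean; a sketch, with the raw register value passed in for clarity:

  /* 'bank_val' is the raw register value for the GPIO's bank. */
  static u32 gpio_get_sketch(u32 bank_val, u32 gpio)
  {
      return !!(bank_val & BIT(gpio));    /* 0 or 1, never BIT(gpio) */
  }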
Fixes: b2d70d4944 ("ath9k: make GPIO API to support both of WMAC and SOC")
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
[kvalo@qca.qualcomm.com: mention that doesn't fix any known bug]
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e6f462df9a upstream.
When mac80211 abandons an association attempt, it may free
all the data structures, but inform cfg80211 and userspace
about it only by sending the deauth frame it received, in
which case cfg80211 has no link to the BSS struct that was
used and will not cfg80211_unhold_bss() it.
Fix this by providing a way to inform cfg80211 of this with
the BSS entry passed, so that it can clean up properly, and
use this ability in the appropriate places in mac80211.
This isn't ideal: some code is more or less duplicated and
tracing is missing. However, it's a fairly small change and
it's thus easier to backport - cleanups can come later.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 834fcd2980 upstream.
If the pmu registration fails the registered hotplug callbacks are not
removed. Wrong in any case, but fatal in case of a modular driver.
Replace the nonsensical state names with proper ones while at it.
Fixes: 77c34ef1c3 ("perf/x86/intel/cstate: Convert Intel CSTATE to hotplug state machine")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ba9f93f82a upstream.
In commit a5ffbe0a19 ("rtlwifi: Fix scheduling while atomic bug") and
commit a269913c52 ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter()
to use work queue"), an error was introduced in the power-save routines
due to the fact that leaving PS was delayed by the use of a work queue.
This problem is fixed by detecting if the enter or leave routines are
in interrupt mode. If so, the workqueue is used to place the request.
If in normal mode, the enter or leave routines are called directly.
Fixes: a269913c52 ("rtlwifi: Rework rtl_lps_leave() and rtl_lps_enter() to use work queue")
Reported-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2cac2f74a upstream.
During firmware crash (or) user requested manual restart
the system gets into a soft lock up state because of the
below root cause.
During user requested hardware restart / firmware crash
the system goes into a soft lockup state as 'napi_synchronize'
is called after 'napi_disable' (which sets 'NAPI_STATE_SCHED'
bit) and it sleeps into infinite loop as it waits for
'NAPI_STATE_SCHED' to be cleared. This condition is hit because
'ath10k_hif_stop' is called twice as below (resulting in calling
'napi_synchronize' after 'napi_disable')
'ath10k_core_restart' -> 'ath10k_hif_stop' (ATH10K_STATE_ON) ->
-> 'ieee80211_restart_hw' -> 'ath10k_start' -> 'ath10k_halt' ->
'ath10k_core_stop' -> 'ath10k_hif_stop' (ATH10K_STATE_RESTARTING)
Fix this by calling 'ath10k_halt' in ath10k_core_restart itself, as it
makes more sense, before informing mac80211 to restart h/w.
Also remove 'ath10k_halt' in ath10k_start for the 'restarting' state.
Fixes: 3c97f5de1f ("ath10k: implement NAPI support")
Signed-off-by: Mohammed Shafi Shajakhan <mohammed@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8052d7245b upstream.
When there is a CRC error in the SPROM read from the device, the code
attempts to handle a fallback SPROM. When this also fails, the driver
returns zero rather than an error code.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 334bb77387 upstream.
Commit 4efca4ed ("kbuild: modversions for EXPORT_SYMBOL() for asm") adds
modversion support for symbols exported from asm files. Architectures
must include C-style declarations for those symbols in asm/asm-prototypes.h
in order for them to be versioned.
Add these declarations for x86, and an architecture-independent file that
can be used for common symbols.
With f27c2f6 reverting 8ab2ae6 ("default exported asm symbols to zero") we
produce a scary warning on x86; this commit fixes that.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Tested-by: Kalle Valo <kvalo@codeaurora.org>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Tested-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 152b695d74 upstream.
Both Debian and kernel archs are "arm64" but UTS_MACHINE and gcc say
"aarch64". Recognizing just the latter should be enough but let's
accept both in case something regresses again or a user sets
UTS_MACHINE=arm64.
Regressed in cfa88c7: arm64: Set UTS_MACHINE in the Makefile.
Signed-off-by: Adam Borowski <kilobyte@angband.pl>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b10b23ca9 upstream.
xlog_recover_clear_agi_bucket didn't set the
type to XFS_BLFT_AGI_BUF, so we got a warning during log
replay (or an ASSERT on a debug build).
XFS (md0): Unknown buffer type 0!
XFS (md0): _xfs_buf_ioapply: no ops on block 0xaea8802/0x1
Fix this, as was done in f19b872b for 2 other locations
with the same problem.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4dfce57db6 upstream.
There have been several reports over the years of NULL pointer
dereferences in xfs_trans_log_inode during xfs_fsr processes,
when the process is doing an fput and tearing down extents
on the temporary inode, something like:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr"
[exception RIP: xfs_trans_log_inode+0x10]
#9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs]
#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs]
#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs]
#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs]
#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs]
#14 [ffff8800a57bbe00] evict at ffffffff811e1b67
#15 [ffff8800a57bbe28] iput at ffffffff811e23a5
#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8
#17 [ffff8800a57bbe88] dput at ffffffff811dd06c
#18 [ffff8800a57bbea8] __fput at ffffffff811c823b
#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e
#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27
#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c
#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d
As it turns out, this is because the i_itemp pointer, along
with the d_ops pointer, has been overwritten with zeros
when we tear down the extents during truncate. When the in-core
inode fork on the temporary inode used by xfs_fsr was originally
set up during the extent swap, we mistakenly looked at di_nextents
to determine whether all extents fit inline, but this misses extents
generated by speculative preallocation; we should be using if_bytes
instead.
This mistake corrupts the in-memory inode, and code in
xfs_iext_remove_inline eventually gets bad inputs, causing
it to memmove and memset incorrect ranges; this became apparent
because the two values in ifp->if_u2.if_inline_ext[1] contained
what should have been in d_ops and i_itemp; they were memmoved due
to incorrect array indexing and then the original locations
were zeroed with memset, again due to an array overrun.
Fix this by properly using i_df.if_bytes to determine the number
of extents, not di_nextents.
Thanks to dchinner for looking at this with me and spotting the
root cause.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30faaafdfa upstream.
Commit 9c17d96500 ("xen/gntdev: Grant maps should not be subject to
NUMA balancing") set VM_IO flag to prevent grant maps from being
subjected to NUMA balancing.
It was discovered recently that this flag causes get_user_pages() to
always fail with -EFAULT.
check_vma_flags
__get_user_pages
__get_user_pages_locked
__get_user_pages_unlocked
get_user_pages_fast
iov_iter_get_pages
dio_refill_pages
do_direct_IO
do_blockdev_direct_IO
do_blockdev_direct_IO
ext4_direct_IO_read
generic_file_read_iter
aio_run_iocb
(which can happen if guest's vdisk has direct-io-safe option).
To avoid this let's use VM_MIXEDMAP flag instead --- it prevents
NUMA balancing just as VM_IO does and has no effect on
check_vma_flags().
Reported-by: Olaf Hering <olaf@aepfle.de>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2d13bb6494 upstream.
We've got a delay loop waiting for secondary CPUs. That loop uses
loops_per_jiffy. However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures. It is quite
common to see things like this in the boot log:
Calibrating delay loop (skipped), value calculated using timer
frequency.. 48.00 BogoMIPS (lpj=24000)
In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.
Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout. We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.
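A minimal sketch of that timeout, with the cpus_arrived callback
standing in for the real "all other CPUs are in the debugger" check:

  #include <linux/delay.h>
  #include <linux/printk.h>
  #include <linux/types.h>

  static void wait_for_cpus_sketch(bool (*cpus_arrived)(void))
  {
      int time_left = 1000;               /* 1000 x 1 ms = 1 second */

      while (--time_left && !cpus_arrived())
          udelay(1000);                   /* poll every millisecond */

      if (!time_left)
          pr_crit("Timed out waiting for secondary CPUs.\n");
  }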
[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Brian Norris <briannorris@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9eff1140a8 upstream.
Systemd on reboot enables a shutdown watchdog that leaves the watchdog
device open to ensure that even if the power-down process gets stuck the
platform reboots nonetheless.
The iamt_wdt is an alarm-only watchdog and can't reboot the system, but
the FW will generate an alarm event even if the reboot completed in time,
as the watchdog is not automatically disabled during the power cycle.
So we should request to stop the watchdog on reboot to eliminate the
wrong alarm from the FW.
Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4d1f0fb096 upstream.
NMI handler doesn't call set_irq_regs(), it's set only by normal IRQ.
Thus get_irq_regs() returns NULL or stale registers snapshot with IP/SP
pointing to the code interrupted by IRQ which was interrupted by NMI.
NULL isn't a problem: in this case watchdog calls dump_stack() and
prints a full stack trace including the NMI. But if we're stuck in an IRQ
handler then the NMI watchdog will print a stack trace without the IRQ
part at all.
This patch uses registers snapshot passed into NMI handler as arguments:
these registers point exactly to the instruction interrupted by NMI.
Fixes: 55537871ef ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup")
Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e3d240e9d5 upstream.
If maxBuf is not 0 but less than the size of the SMB2 lock structure
we can end up with memory corruption.
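The missing guard is essentially a bound check before the buffer is
carved into lock elements; a minimal sketch:

  static int max_locks_per_request(unsigned int max_buf)
  {
      if (max_buf < sizeof(struct smb2_lock_element))
          return -EINVAL;     /* too small to hold even one element */

      return max_buf / sizeof(struct smb2_lock_element);
  }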
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 96a988ffeb upstream.
With the current code it is possible to lock a mutex twice when
subsequent reconnects are triggered. On the 1st reconnect we
reconnect sessions and tcons and then persistent file handles.
If the 2nd reconnect happens during the reconnecting of persistent
file handles then the following sequence of calls is observed:
cifs_reopen_file -> SMB2_open -> small_smb2_init -> smb2_reconnect
-> cifs_reopen_persistent_file_handles -> cifs_reopen_file (again!).
So, we are trying to acquire the same cfile->fh_mutex twice which
is wrong. Fix this by moving reconnecting of persistent handles to
the delayed work (smb2_reconnect_server) and submitting this work
every time we reconnect tcon in SMB2 commands handling codepath.
This can also lead to corruption of a temporary file list in
cifs_reopen_persistent_file_handles() because we can recursively
call this function twice.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 53e0e11efe upstream.
We can not unlock/lock cifs_tcp_ses_lock while walking through ses
and tcon lists because it can corrupt list iterator pointers and
a tcon structure can be released if we don't hold an extra reference.
Fix it by moving a reconnect process to a separate delayed work
and acquiring a reference to every tcon that needs to be reconnected.
Also do not send an echo request on newly established connections.
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 06deeec77a upstream.
smbencrypt() points a scatterlist to the stack, which breaks if
CONFIG_VMAP_STACK=y.
Fix it by switching to crypto_cipher_encrypt_one(). The new code
should be considerably faster as an added benefit.
This code is nearly identical to some code that Eric Biggers
suggested.
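A sketch of the single-block path using the cipher API (DES key
expansion and most error handling trimmed):

  #include <linux/crypto.h>
  #include <linux/err.h>

  static int smbhash_sketch(unsigned char *out, const unsigned char *in,
                            const unsigned char *key /* 8-byte DES key */)
  {
      struct crypto_cipher *tfm;

      tfm = crypto_alloc_cipher("des", 0, 0);
      if (IS_ERR(tfm))
          return PTR_ERR(tfm);

      crypto_cipher_setkey(tfm, key, 8);
      crypto_cipher_encrypt_one(tfm, out, in);  /* no scatterlist needed */
      crypto_free_cipher(tfm);
      return 0;
  }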
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 314c25c56c upstream.
In dm_sm_metadata_create() we temporarily change the dm_space_map
operations from 'ops' (whose .destroy function deallocates the
sm_metadata) to 'bootstrap_ops' (whose .destroy function doesn't).
If dm_sm_metadata_create() fails in sm_ll_new_metadata() or
sm_ll_extend(), it exits back to dm_tm_create_internal(), which calls
dm_sm_destroy() with the intention of freeing the sm_metadata, but it
doesn't (because the dm_space_map operations is still set to
'bootstrap_ops').
Fix this by setting the dm_space_map operations back to 'ops' if
dm_sm_metadata_create() fails when it is set to 'bootstrap_ops'.
Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d15bb3a646 upstream.
It is required to hold the queue lock when calling blk_run_queue_async()
to avoid that a race between blk_run_queue_async() and
blk_cleanup_queue() is triggered.
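The rule in code form, as a sketch (legacy request_queue path, where
queue_lock is a pointer):

  static void run_queue_locked(struct request_queue *q)
  {
      unsigned long flags;

      spin_lock_irqsave(q->queue_lock, flags);
      blk_run_queue_async(q);
      spin_unlock_irqrestore(q->queue_lock, flags);
  }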
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 265e9098ba upstream.
In crypt_set_key(), if a failure occurs while replacing the old key
(e.g. tfm->setkey() fails) the key must not have DM_CRYPT_KEY_VALID flag
set. Otherwise, the crypto layer would have an invalid key that still
has DM_CRYPT_KEY_VALID flag set.
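A sketch of the resulting flag handling (crypt_setkey() stands in for
the helper that programs the key into the tfm(s)):

  static int crypt_setkey(struct crypt_config *cc);  /* wraps tfm->setkey() */

  static int crypt_set_key_sketch(struct crypt_config *cc)
  {
      int r;

      /* never advertise the key as valid while it is being replaced */
      clear_bit(DM_CRYPT_KEY_VALID, &cc->flags);

      r = crypt_setkey(cc);
      if (!r)
          set_bit(DM_CRYPT_KEY_VALID, &cc->flags);

      return r;
  }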
Signed-off-by: Ondrej Kozina <okozina@redhat.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bff7e067ee upstream.
Fix to return error code -EINVAL instead of 0, as is done elsewhere in
this function.
Fixes: e80d1c805a ("dm: do not override error code returned from dm_get_device()")
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 301fc3f5ef upstream.
When dm_table_set_type() is used by a target to establish a DM table's
type (e.g. DM_TYPE_MQ_REQUEST_BASED in the case of DM multipath) the
DM core must go on to verify that the devices in the table are
compatible with the established type.
Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6936c12cf8 upstream.
An earlier DM multipath table could have been built on top of underlying
devices that were all using blk-mq. In that case, if that active
multipath table is replaced with an empty DM multipath table (that
reflects all paths have failed) then it is important that the
'all_blk_mq' state of the active table is transferred to the new empty DM
table. Otherwise dm-rq.c:dm_old_prep_tio() will incorrectly clone a
request that isn't needed by the DM multipath target when it is to issue
IO to an underlying blk-mq device.
Fixes: e83068a5 ("dm mpath: add optional "queue_mode" feature")
Reported-by: Bart Van Assche <bart.vanassche@sandisk.com>
Tested-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91291d9ad9 upstream.
Joonyoung Shim reported an interesting problem on his ARM octa-core
Odroid-XU3 platform. During system suspend, dev_pm_opp_put_regulator()
was failing for a struct device for which dev_pm_opp_set_regulator() is
called earlier.
This happened because an earlier call to
dev_pm_opp_of_cpumask_remove_table() function (from cpufreq-dt.c file)
removed all the entries from opp_table->dev_list apart from the last CPU
device in the cpumask of CPUs sharing the OPP.
But both dev_pm_opp_set_regulator() and dev_pm_opp_put_regulator()
routines get CPU device for the first CPU in the cpumask. And so the OPP
core failed to find the OPP table for the struct device.
This patch attempts to fix this problem by returning a pointer to the
opp_table from dev_pm_opp_set_regulator() and using that as the
parameter to dev_pm_opp_put_regulator(). This ensures that the
dev_pm_opp_put_regulator() doesn't fail to find the opp table.
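Usage-wise, the pairing then looks like this sketch (the regulator name
and the cpu_dev pointer are illustrative):

  struct opp_table *opp_table;

  opp_table = dev_pm_opp_set_regulator(cpu_dev, "vdd-cpu");
  if (IS_ERR(opp_table))
      return PTR_ERR(opp_table);

  /* ... later: operate on the very table we got back, not on a fresh
   * per-CPU lookup that may already have been emptied ... */
  dev_pm_opp_put_regulator(opp_table);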
Note that similar design problem also exists with other
dev_pm_opp_put_*() APIs, but those aren't used currently by anyone and
so we don't need to update them for now.
Reported-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ Viresh: Wrote commit log and tested on exynos 5250 ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eaa496ffaa upstream.
ep->mult is supposed to be set to Isochronous and
Interrupt Endpoint's multiplier value. This value
is computed from different places depending on the
link speed.
If we're dealing with HighSpeed, then it's part of
bits [12:11] of wMaxPacketSize. This case wasn't
taken into consideration before.
While at that, also make sure the ep->mult defaults
to one so drivers can use it unconditionally and
assume they'll never multiply ep->maxpacket to zero.
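A hedged sketch of the resulting logic in config_ep_by_speed()-style code
(local variable names are illustrative; only the HighSpeed case described
above is shown):
	/* default so drivers can always multiply by ep->mult safely */
	ep->mult = 1;

	if (gadget->speed == USB_SPEED_HIGH &&
	    (usb_endpoint_xfer_isoc(ep->desc) ||
	     usb_endpoint_xfer_int(ep->desc))) {
		u16 maxp = usb_endpoint_maxp(ep->desc);	/* raw wMaxPacketSize */

		/* bits [12:11] encode additional transactions per microframe */
		ep->mult = ((maxp >> 11) & 0x3) + 1;
	}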
Cc: <stable@vger.kernel.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a6de734bc0 upstream.
Vlastimil Babka pointed out that commit 479f854a20 ("mm, page_alloc:
defer debugging checks of pages allocated from the PCP") will allow the
per-cpu list counter to be out of sync with the per-cpu list contents if
a struct page is corrupted.
The consequence is an infinite loop if the per-cpu lists get fully
drained by free_pcppages_bulk because all the lists are empty but the
count is positive. The infinite loop occurs here
	do {
		batch_free++;
		if (++migratetype == MIGRATE_PCPTYPES)
			migratetype = 0;
		list = &pcp->lists[migratetype];
	} while (list_empty(list));
What the user sees is a bad page warning followed by a soft lockup with
interrupts disabled in free_pcppages_bulk().
This patch keeps the accounting in sync.
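A hedged sketch of the idea inside free_pcppages_bulk() (simplified;
bulkfree_pcp_prepare() is the existing deferred-debugging check, and the
surrounding loop is elided):
		page = list_last_entry(list, struct page, lru);
		list_del(&page->lru);
		pcp->count--;	/* drop the counter together with the list entry */

		/* a corrupted page is skipped, but list and count stay in sync */
		if (bulkfree_pcp_prepare(page))
			continue;

		/* ... then free the page as before ... */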
Fixes: 479f854a20 ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
Link: http://lkml.kernel.org/r/20161202112951.23346-2-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5f33a0803b upstream.
Our system uses significantly more slab memory with memcg enabled with
the latest kernel. With a 3.10 kernel, slab uses 2G of memory, while with
a 4.6 kernel, 6G of memory is used. The shrinker has a problem. Suppose we
have two memcgs for one shrinker. In do_shrink_slab:
1. Check cg1. nr_deferred = 0, assume total_scan = 700. The batch size
is 1024, so no memory is freed. nr_deferred = 700.
2. Check cg2. nr_deferred = 700. Assume freeable = 20, then
total_scan = 10 or 40. Let's assume it's 10. No memory is freed.
nr_deferred = 10.
The deferred share of cg1 is lost in this case. kswapd will free no
memory even if it runs the above steps again and again.
The fix makes sure one memcg's deferred share isn't lost.
Link: http://lkml.kernel.org/r/2414be961b5d25892060315fbb56bb19d81d0c07.1476227351.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e4fcf07cca upstream.
When removing a namespace we delete it from the subsystem namespaces
list with list_del_init, which allows us to know if it is enabled or
not.
The problem is that list_del_init reinitializes the list's next pointer
and does not respect the RCU list-traversal we do on the IO path for
locating a namespace. Instead we need to use list_del_rcu, which is
allowed to run concurrently with the _rcu list-traversal primitives (it
keeps the list's next pointer intact) and guarantees concurrent
nvmet_find_namespace forward progress.
With that change, we can no longer rely on ns->dev_link for knowing
whether the namespace is enabled, so add an 'enabled' indicator field to
nvmet_ns for that.
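A hedged sketch of the two key pieces (field and lock handling are
simplified; names follow the description above and the usual nvmet
structures, so treat them as illustrative):
	/* removal path: RCU-safe unlink plus an explicit state flag */
	ns->enabled = false;
	list_del_rcu(&ns->dev_link);	/* keeps ->next intact for readers */

	/* lookup path (nvmet_find_namespace-style) keeps its RCU traversal */
	list_for_each_entry_rcu(ns, &subsys->namespaces, dev_link)
		if (ns->nsid == nsid)
			return ns;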
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Solganik Alexander <sashas@lightbitslabs.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b4a567e811 upstream.
->queue_rq() should return one of the BLK_MQ_RQ_QUEUE_* constants, not
an errno.
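A generic, hedged sketch of the convention (not the actual loop driver
code; device_ready() is a hypothetical helper standing in for the
driver's own readiness check):
static int my_queue_rq(struct blk_mq_hw_ctx *hctx,
		       const struct blk_mq_queue_data *bd)
{
	/* hypothetical readiness check, e.g. "device not bound" */
	if (!device_ready(hctx->queue->queuedata))
		return BLK_MQ_RQ_QUEUE_ERROR;	/* not -EIO */

	blk_mq_start_request(bd->rq);
	/* ... hand the request off to the backend ... */
	return BLK_MQ_RQ_QUEUE_OK;
}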
Fixes: f4aa4c7bba ("block: loop: convert to per-device workqueue")
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8508e44ae9 upstream.
We don't guarantee cp_addr is fixed by cp_version.
This is to sync with f2fs-tools.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e87f7329bb upstream.
In the last ilen case, i was already increased, resulting in accessing an
out-of-bounds entry of do_replace and blkaddr.
Fix this by checking ilen first to exit the loop.
Fixes: 2aa8fbb9693020 ("f2fs: refactor __exchange_data_block for speed up")
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 05e6ea2685 upstream.
The struct file_operations instance serving the f2fs/status debugfs file
lacks an initialization of its ->owner.
This means that although that file might have been opened, the f2fs module
can still get removed. Any further operation on that opened file, release
included, will cause accesses to unmapped memory.
Indeed, Mike Marshall reported the following:
BUG: unable to handle kernel paging request at ffffffffa0307430
IP: [<ffffffff8132a224>] full_proxy_release+0x24/0x90
<...>
Call Trace:
[] __fput+0xdf/0x1d0
[] ____fput+0xe/0x10
[] task_work_run+0x8e/0xc0
[] do_exit+0x2ae/0xae0
[] ? __audit_syscall_entry+0xae/0x100
[] ? syscall_trace_enter+0x1ca/0x310
[] do_group_exit+0x44/0xc0
[] SyS_exit_group+0x14/0x20
[] do_syscall_64+0x61/0x150
[] entry_SYSCALL64_slow_path+0x25/0x25
<...>
---[ end trace f22ae883fa3ea6b8 ]---
Fixing recursive fault but reboot is needed!
Fix this by initializing the f2fs/status file_operations' ->owner with
THIS_MODULE.
This will allow debugfs to grab a reference to the f2fs module upon any
open on that file, thus preventing it from getting removed.
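A hedged sketch of the fix; the handler names follow the usual
single_open()/seq_file pattern and are illustrative:
static const struct file_operations stat_fops = {
	.owner   = THIS_MODULE,	/* lets debugfs pin f2fs while the file is open */
	.open    = stat_open,
	.read    = seq_read,
	.llseek  = seq_lseek,
	.release = single_release,
};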
Fixes: 902829aa0b ("f2fs: move proc files to debugfs")
Reported-by: Mike Marshall <hubcap@omnibond.com>
Reported-by: Martin Brandenburg <martin@omnibond.com>
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 204706c7ac upstream.
This reverts commit 1beba1b3a9.
The percpu_counter doesn't provide atomicity on a single core and consumes
more DRAM. That incurs an fs_mark test failure due to ENOMEM.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 73b92a2a5e upstream.
Currently data journalling is incompatible with encryption: enabling both
at the same time has never been supported by design, and would result in
unpredictable behavior. However, users are not precluded from turning on
both features simultaneously. This change programmatically replaces data
journaling for encrypted regular files with ordered data journaling mode.
Background:
Journaling encrypted data has not been supported because it operates on
buffer heads of the page in the page cache. Namely, when the commit
happens, which could be up to five seconds after caching, the commit
thread uses the buffer heads attached to the page to copy the contents of
the page to the journal. With encryption, it would have been required to
keep the bounce buffer with ciphertext for up to the aforementioned five
seconds, since the page cache can only hold plaintext and could not be
used for journaling. Alternatively, it would be required to set up the
journal to initiate a callback at commit time to perform deferred
encryption - in this case, not only would the data have to be written
twice, but it would also have to be encrypted twice. This level of
complexity was not justified for a mode that in practice is very rarely
used because of the overhead from the data journalling.
Solution:
If data=journal has been set as a mount option for a filesystem, or if
journaling is enabled on a regular file, do not perform journaling if the
file is also encrypted; instead fall back to the data=ordered mode for the
file.
Rationale:
The intent is to allow seamless and proper filesystem operation when
journaling and encryption have both been enabled, and have these two
conflicting features gracefully resolved by the filesystem.
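A hedged sketch of what the gate looks like in an
ext4_should_journal_data()-style helper (the exact placement of the new
check is illustrative):
static inline int ext4_should_journal_data(struct inode *inode)
{
	if (EXT4_JOURNAL(inode) == NULL)
		return 0;
	if (!S_ISREG(inode->i_mode))
		return 1;
	/* encrypted regular files fall back to data=ordered */
	if (ext4_encrypted_inode(inode))
		return 0;
	if (test_opt(inode->i_sb, DATA_FLAGS) == EXT4_MOUNT_JOURNAL_DATA)
		return 1;
	if (ext4_test_inode_flag(inode, EXT4_INODE_JOURNAL_DATA))
		return 1;
	return 0;
}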
Fixes: 4461471107
Signed-off-by: Sergey Karamov <skaramov@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7e6e1ef48f upstream.
Don't load an inode with a negative size; this causes integer overflow
problems in the VFS.
[ Added EXT4_ERROR_INODE() to mark file system as corrupted. -TYT]
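A hedged sketch of the check as it would sit in ext4_iget() (the error
label and local variable are illustrative):
	loff_t size = i_size_read(inode);

	if (size < 0) {
		EXT4_ERROR_INODE(inode, "bad i_size value: %lld", size);
		ret = -EFSCORRUPTED;
		goto bad_inode;
	}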
Fixes: a48380f769 (ext4: rename i_dir_acl to i_size_high)
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c48ae41baf upstream.
The commit "ext4: sanity check the block and cluster size at mount
time" should prevent any problems, but in case the superblock is
modified while the file system is mounted, add an extra safety check
to make sure we won't overrun the allocated buffer.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5aee0f8a3f upstream.
Fix a large number of problems with how we handle mount options in the
superblock. For one, if the string in the superblock is long enough
that it is not null terminated, we could run off the end of the string
and try to interpret superblock fields as characters. It's unlikely
this will cause a security problem, but it could result in an invalid
parse. Also, parse_options is destructive to the string, so in some
cases if there is a comma-separated string, it would be modified in
the superblock. (Fortunately it only happens on file systems with a
1k block size.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd6bb35bf7 upstream.
Centralize the checks for inodes_per_block and be more strict to make
sure the inodes_per_block_group can't end up being zero.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 30a9d7afe7 upstream.
The number of 'counters' elements needed in 'struct sg' is
super_block->s_blocksize_bits + 2. Presently we have 16 'counters'
elements in the array. This is insufficient for block sizes >= 32k. In
such cases the memcpy operation performed in ext4_mb_seq_groups_show()
would cause stack memory corruption.
Fixes: c9de560ded
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 69e43e8cc9 upstream.
'border' variable is set to a value of 2 times the block size of the
underlying filesystem. With 64k block size, the resulting value won't
fit into a 16-bit variable. Hence this commit changes the data type of
'border' to 'unsigned int'.
Fixes: c9de560ded
Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1566a48aaa upstream.
If there is an error reported in mballoc via ext4_grp_locked_error(),
the code is holding a spinlock, so ext4_commit_super() must not try to
lock the buffer head, or else it will trigger a BUG:
BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:358
in_atomic(): 1, irqs_disabled(): 0, pid: 993, name: mount
CPU: 0 PID: 993 Comm: mount Not tainted 4.9.0-rc1-clouder1 #62
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
ffff880006423548 ffffffff81318c89 ffffffff819ecdd0 0000000000000166
ffff880006423558 ffffffff810810b0 ffff880006423580 ffffffff81081153
ffff880006e5a1a0 ffff88000690e400 0000000000000000 ffff8800064235c0
Call Trace:
[<ffffffff81318c89>] dump_stack+0x67/0x9e
[<ffffffff810810b0>] ___might_sleep+0xf0/0x140
[<ffffffff81081153>] __might_sleep+0x53/0xb0
[<ffffffff8126c1dc>] ext4_commit_super+0x19c/0x290
[<ffffffff8126e61a>] __ext4_grp_locked_error+0x14a/0x230
[<ffffffff81081153>] ? __might_sleep+0x53/0xb0
[<ffffffff812822be>] ext4_mb_generate_buddy+0x1de/0x320
Since ext4_grp_locked_error() calls ext4_commit_super with sync == 0
(and it is the only caller which does so), avoid locking and unlocking
the buffer in this case.
This can result in races with ext4_commit_super() if there are other
problems (which is what commit 4743f83990 was trying to address),
but a warning is better than a BUG.
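A hedged sketch of the shape of the change in ext4_commit_super() (the
field updates between lock and unlock are elided):
	if (sync)
		lock_buffer(sbh);	/* may sleep, so only on the sync path */

	/* ... refresh on-disk superblock fields, recompute checksum ... */
	mark_buffer_dirty(sbh);

	if (sync) {
		unlock_buffer(sbh);
		error = __sync_dirty_buffer(sbh, WRITE_SYNC);
	}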
Fixes: 4743f83990
Reported-by: Nikolay Borisov <kernel@kyup.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d128af1787 upstream.
The AEAD givenc descriptor relies on moving the IV through the
output FIFO and then back to the CTX2 for authentication. The
SEQ FIFO STORE could be scheduled before the data can be
read from OFIFO, especially since the SEQ FIFO LOAD needs
to wait for the SEQ FIFO LOAD SKIP to finish first. The
SKIP takes more time when the input is SG than when it's
a contiguous buffer. If the SEQ FIFO LOAD is not scheduled
before the STORE, the DECO will hang waiting for data
to be available in the OFIFO so it can be transferred to C2.
In order to overcome this, first force transfer of IV to C2
by starting the "cryptlen" transfer first and then starting to
store data from OFIFO to the output buffer.
Fixes: 1acebad3d8 ("crypto: caam - faster aead implementation")
Signed-off-by: Alex Porosanu <alexandru.porosanu@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84d77d3f06 upstream.
It is a reasonable expectation that if an executable file is not
readable there will be no way for a user without special privileges to
read the file. This is enforced in ptrace_attach, but if ptrace
is already attached before exec there is no enforcement for unreadable
executables.
As the only way to read such an mm is through access_process_vm,
introduce a variant called ptrace_access_vm that will fail if the
target process is not being ptraced by the current process, or if
the current process did not, when ptracing began, have sufficient
privileges to read the target process's mm.
In the ptrace implementations, replace access_process_vm with
ptrace_access_vm. There remain several ptrace sites that still use
access_process_vm as they are reading the target executable's
instructions (for kernel consumption) or register stacks. As such it
does not appear necessary to add a permission check to those calls.
This bug has always existed in Linux.
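A simplified, hedged sketch of the new helper, closely following the
description above (the real kernel/ptrace.c helper may differ in detail;
ptracer_capable() and mm->user_ns come from the companion patches in this
series):
int ptrace_access_vm(struct task_struct *tsk, unsigned long addr,
		     void *buf, int len, unsigned int gup_flags)
{
	struct mm_struct *mm = get_task_mm(tsk);
	int ret;

	if (!mm)
		return 0;

	/* only the active tracer, and only if it was privileged enough
	 * (or the mm is dumpable) */
	if (!tsk->ptrace || current != tsk->parent ||
	    ((get_dumpable(mm) != SUID_DUMP_USER) &&
	     !ptracer_capable(tsk, mm->user_ns))) {
		mmput(mm);
		return 0;
	}

	ret = __access_remote_vm(tsk, mm, addr, buf, len, gup_flags);
	mmput(mm);
	return ret;
}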
Fixes: v1.0
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64b875f7ac upstream.
When the flag PT_PTRACE_CAP was added the PTRACE_TRACEME path was
overlooked. This can result in incorrect behavior when an application
like strace traces an exec of a setuid executable.
Further, PT_PTRACE_CAP does not have enough information for making good
security decisions as it does not report which user namespace the
capability is in. This has already allowed one mistake through
insufficient granularity.
I found this issue when I was testing another corner case of exec and
discovered that I could not get strace to set PT_PTRACE_CAP even when
running strace as root with a full set of caps.
This change fixes the above issue with strace, allowing strace running as
root to trace a setuid executable without disabling setuid. More
fundamentally, this change allows what is allowable at all times, by
using the correct information in its decision.
Fixes: 4214e42f96d4 ("v2.4.9.11 -> v2.4.9.12")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bfedb58925 upstream.
During exec dumpable is cleared if the file that is being executed is
not readable by the user executing the file. A bug in
ptrace_may_access allows reading the file if the executable happens to
enter into a subordinate user namespace (aka clone(CLONE_NEWUSER),
unshare(CLONE_NEWUSER), or setns(fd, CLONE_NEWUSER)).
This problem is fixed with only necessary userspace breakage by adding
a user namespace owner to mm_struct, captured at the time of exec, so
it is clear in which user namespace CAP_SYS_PTRACE must be present
to be able to safely give read permission to the executable.
The function ptrace_may_access is modified to verify that the ptracer
has CAP_SYS_PTRACE in task->mm->user_ns instead of task->cred->user_ns.
This ensures that if the task changes its cred into a subordinate
user namespace it does not become ptraceable.
The function ptrace_attach is modified to only set PT_PTRACE_CAP when
CAP_SYS_PTRACE is held over task->mm->user_ns. The intent of
PT_PTRACE_CAP is to be a flag to note that whatever permission changes
the task might go through the tracer has sufficient permissions for
it not to be an issue. task->cred->user_ns is always the same
as, or a descendant of, mm->user_ns, which guarantees that having
CAP_SYS_PTRACE over mm->user_ns is the worst case for the task's
credentials.
To prevent regressions, mm->dumpable and mm->user_ns are not considered
when a task has no mm, as simply failing ptrace_may_attach causes
regressions in privileged applications attempting to read things
such as /proc/<pid>/stat.
Acked-by: Kees Cook <keescook@chromium.org>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Fixes: 8409cca705 ("userns: allow ptrace from non-init user namespaces")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bcc7f5b4be upstream.
bdev->bd_contains is not stable before calling __blkdev_get().
When __blkdev_get() is called on a partition with ->bd_openers == 0
it sets
bdev->bd_contains = bdev;
which is not correct for a partition.
After a call to __blkdev_get() succeeds, ->bd_openers will be > 0
and then ->bd_contains is stable.
When FMODE_EXCL is used, blkdev_get() calls
bd_start_claiming() -> bd_prepare_to_claim() -> bd_may_claim()
This call happens before __blkdev_get() is called, so ->bd_contains
is not stable. So bd_may_claim() cannot safely use ->bd_contains.
It currently tries to use it, and this can lead to a BUG_ON().
This happens when a whole device is already open with a bd_holder (in
use by dm in my particular example) and two threads race to open a
partition of that device for the first time, one opening with O_EXCL and
one without.
The thread that doesn't use O_EXCL gets through blkdev_get() to
__blkdev_get(), gains the ->bd_mutex, and sets bdev->bd_contains = bdev;
Immediately thereafter the other thread, using FMODE_EXCL, calls
bd_start_claiming() from blkdev_get(). This should fail because the
whole device has a holder, but because bdev->bd_contains == bdev
bd_may_claim() incorrectly reports success.
This thread continues and blocks on bd_mutex.
The first thread then sets bdev->bd_contains correctly and drops the mutex.
The thread using FMODE_EXCL then continues and when it calls bd_may_claim()
again in:
BUG_ON(!bd_may_claim(bdev, whole, holder));
The BUG_ON fires.
Fix this by removing the dependency on ->bd_contains in
bd_may_claim(). As bd_may_claim() has direct access to the whole
device, it can simply test if the target bdev is the whole device.
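A hedged sketch of the fixed helper, with the key change being the
'whole == bdev' test that replaces the unstable bdev->bd_contains
comparison:
static bool bd_may_claim(struct block_device *bdev, struct block_device *whole,
			 void *holder)
{
	if (bdev->bd_holder == holder)
		return true;	/* already a holder */
	else if (bdev->bd_holder != NULL)
		return false;	/* held by someone else */
	else if (whole == bdev)
		return true;	/* whole device, not held (was: bd_contains == bdev) */
	else if (whole->bd_holder == bd_may_claim)
		return true;	/* partition of a device that is being claimed */
	else if (whole->bd_holder != NULL)
		return false;	/* partition of a held device */
	else
		return true;	/* partition of an un-held device */
}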
Fixes: 6b4517a791 ("block: implement bd_claiming and claiming block")
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52bce91165 upstream.
Commit 8924feff66 ("splice: lift pipe_lock out of splice_to_pipe()")
caused a regression when there were no more readers left on a pipe that
was being spliced into: rather than the expected SIGPIPE and -EPIPE
return value, the writer would end up waiting forever for space to free
up (which obviously was not going to happen with no readers around).
Fixes: 8924feff66 ("splice: lift pipe_lock out of splice_to_pipe()")
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Debugged-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 613cc2b6f2 upstream.
If you have a process that has set itself to be non-dumpable, and it
then undergoes exec(2), any CLOEXEC file descriptors it has open are
"exposed" during a race window between the dumpable flags of the process
being reset for exec(2) and CLOEXEC being applied to the file
descriptors. This can be exploited by a process by attempting to access
/proc/<pid>/fd/... during this window, without requiring CAP_SYS_PTRACE.
The race in question is after set_dumpable has been called (for get_link,
though the trace is basically the same for readlink):
[vfs]
-> proc_pid_link_inode_operations.get_link
-> proc_pid_get_link
-> proc_fd_access_allowed
-> ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
This will return 0 during the race window, and CLOEXEC file descriptors
will still be open during this window because do_close_on_exec has not
been called yet. As a result, the ordering of these calls should be
reversed to avoid this race window.
This is of particular concern to container runtimes, where joining a
PID namespace with file descriptors referring to the host filesystem
can result in security issues (since PR_SET_DUMPABLE doesn't protect
against access of CLOEXEC file descriptors -- file descriptors which may
reference filesystem objects the container shouldn't have access to).
Cc: dev@opencontainers.org
Reported-by: Michael Crosby <crosbymichael@gmail.com>
Signed-off-by: Aleksa Sarai <asarai@suse.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f84df2a6f2 upstream.
When the user namespace support was merged the need to prevent
ptrace from revealing the contents of an unreadable executable
was overlooked.
Correct this oversight by ensuring that the executed file
or files are in mm->user_ns, by adjusting mm->user_ns.
Use the new function privileged_wrt_inode_uidgid to see if
the executable is a member of the user namespace, and as such
if having CAP_SYS_PTRACE in the user namespace should allow
tracing the executable. If not, update mm->user_ns to
the parent user namespace until an appropriate parent is found.
Reported-by: Jann Horn <jann@thejh.net>
Fixes: 9e4a36ece6 ("userns: Fail exec for suid and sgid binaries with ids outside our user namespace.")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 035cd485a4 upstream.
The OMAP36xx DPLL5, driving EHCI USB, can be subject to a long-term
frequency drift. The frequency drift magnitude depends on the VCO update
rate, which is inversely proportional to the PLL divider. The kernel
DPLL configuration code results in a high value for the divider, leading
to a long term drift high enough to cause USB transmission errors. In
the worst case the USB PHY's ULPI interface can stop responding,
breaking USB operation completely. This manifests itself on the
Beagleboard xM by the LAN9514 reporting 'Cannot enable port 2. Maybe the
cable is bad?' in the kernel log.
Errata sprz319 advisory 2.1 documents PLL values that minimize the
drift. Use them automatically when DPLL5 is used for USB operation,
which we detect based on the requested clock rate. The clock framework
will still compute the PLL parameters and resulting rate as usual, but
the PLL M and N values will then be overridden. This can result in the
effective clock rate being slightly different than the rate cached by
the clock framework, but won't cause any adverse effect to USB
operation.
Signed-off-by: Richard Watts <rrw@kynesim.co.uk>
[Upported from v3.2 to v4.9]
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Tested-by: Ladislav Michl <ladis@linux-mips.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Cc: Adam Ford <aford173@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5e0ad0d874 upstream.
Commit [64047d7f49 ALSA: hda - ignore the assoc and seq when comparing
pin configurations] intended to ignore both seq and assoc when comparing
pins, but it only ignored seq. So that commit may still fail to
match pins on some machines.
Change the bitmask to also ignore assoc.
v2: Use a macro to do the bit masking.
Thanks to Hui Wang for the analysis.
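A hedged sketch of the masking change (the comparison helper shown is
illustrative; AC_DEFCFG_SEQUENCE and AC_DEFCFG_DEF_ASSOC are the existing
HDA pin-default-config field masks):
#define IGNORE_SEQ_ASSOC (~(AC_DEFCFG_SEQUENCE | AC_DEFCFG_DEF_ASSOC))

static bool pin_cfg_match_one(u32 cfg, u32 table_cfg)
{
	return (cfg & IGNORE_SEQ_ASSOC) == (table_cfg & IGNORE_SEQ_ASSOC);
}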
Fixes: 64047d7f49 ("ALSA: hda - ignore the assoc and seq when comparing...")
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f73cd43ac3 upstream.
HP Z1 Gen3 AiO with Conexant codec doesn't give an unsolicited event
to the headset mic pin upon the jack plugging, it reports only to the
headphone pin. This results in the mic switching being missed. Let's fix
it up by simply gating the jack event.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 989dbe4a30 upstream.
This group of new pins is not in the pin quirk table yet; add
them to the pin quirk table to fix the headset-mic problem.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 64047d7f49 upstream.
More and more pin configurations have been added to the pin quirk
table; lots of them differ only in assoc and seq, but they
all apply to the same QUIRK_FIXUP. If we don't compare assoc and seq
when matching pin configurations, it will greatly reduce the pin
quirk table size.
We have tested this change on a couple of Dell laptops, and it worked
well.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5337cfe06 upstream.
I'm using an Alienware 15 R2 and had to use the alienware quirks to
get my headphone output working.
I fixed it by adding SND_PCI_QUIRK(0x1028, 0x0708, "Alienware 15 R2
2016", QUIRK_ALIENWARE) to the patch.
Signed-off-by: Sven Hahne <hahne@zeitkunst.eu>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 995c6a7fd9 upstream.
Sampling rate changes after the first one are not reflected to the
hardware, while the driver and ALSA think the rate has been changed.
Fix the problem by properly stopping the interface at the beginning of
the prepare call, allowing the new rate to be set on the hardware. This
keeps the hardware in sync with the driver.
Signed-off-by: Jussi Laako <jussi@sonarnerd.net>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 82ffb6fc63 upstream.
The Logitech QuickCam Communicate Deluxe/S7500 microphone fails with the
following warning.
[ 6.778995] usb 2-1.2.2.2: Warning! Unlikely big volume range (=3072),
cval->res is probably wrong.
[ 6.778996] usb 2-1.2.2.2: [5] FU [Mic Capture Volume] ch = 1, val =
4608/7680/1
Adding it to the list of devices in volume_control_quirks makes it work
properly; a related typo is also fixed.
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3e448e13a6 upstream.
The ep_list inside the gadget structure doesn't contain ep0.
It is stored separately in the ep0 field.
This causes a URB hang if the gadget driver decides to
delay setup handling. On the host side this is visible as a
timeout error when setting the configuration.
This bug can be reproduced using, for example, any gadget
with a mass storage function.
Fixes: abdb295743 ("usbip: vudc: Add vudc_transfer")
Signed-off-by: Krzysztof Opasiak <k.opasiak@samsung.com>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ccdb6be9ec upstream.
The UHCI controllers in Intel chipsets rely on a platform-specific non-PME
mechanism for wakeup signalling. They can generate wakeup signals even
though they don't support PME.
We need to let the USB core know this so that it will enable runtime
suspend for UHCI controllers.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e8f29bb719 upstream.
usb_endpoint_maxp() returns wMaxPacketSize in its
raw form, without taking into consideration that it
also contains other bits reserved for isochronous
endpoints.
This patch fixes one occasion where this is a
problem by making sure that we initialize
ep->maxpacket only with the maximum packet size field
(bits 10:0) of the value returned by usb_endpoint_maxp().
Note that separate patches will be necessary to audit
all call sites of usb_endpoint_maxp() and make sure that
usb_endpoint_maxp() only returns the maximum packet size
field of wMaxPacketSize.
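A hedged sketch of the one-line fix in the gadget core (the descriptor
variable name is illustrative):
	/* keep only the maximum packet size field, bits 10:0 */
	ep->maxpacket = usb_endpoint_maxp(desc) & 0x7ff;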
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 37be66767e upstream.
USB-3 does not have any link state that will avoid negotiating a connection
with a plugged-in cable but will signal the host when the cable is
unplugged.
For USB-3 we used to first set the link to Disabled, then to RxDetect to
be able to detect cable connects or disconnects. But in RxDetect the
connected device is detected again and eventually enabled.
Instead set the link into U3 and disable remote wakeups for the device.
This is what Windows does, and what Alan Stern suggested.
Cc: Alan Stern <stern@rowland.harvard.edu>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6b9018d4c1 upstream.
In case of High-Speed, High-Bandwidth endpoints, we
need to tell DWC3 that we have more than one packet
per interval. We do that by setting PCM1 field of
Isochronous-First TRB.
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6774d5f532 upstream.
Kill urbs and disable read before returning from open on failure to
retrieve the line state.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5b09eff0c3 upstream.
This patch adds support for PIDs 0x1040, 0x1041 of Telit LE922A.
Since the interface positions are the same as the ones used
for other Telit compositions, previously defined blacklists are used.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8d9eddad19 upstream.
We were setting the qgroup_rescan_running flag to true only after the
rescan worker started (which is a task run by a queue). So if a user
space task starts a rescan and immediately after asks to wait for the
rescan worker to finish, this second call might happen before the rescan
worker task starts running, in which case the rescan wait ioctl returns
immediately, not waiting for the rescan worker to finish.
This was making the fstest btrfs/022 fail very often.
Fixes: d2c609b834 (btrfs: properly track when rescan worker is running)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ed0df618b1 upstream.
The balance status item contains the currently known filter values, but
the stripes filter was unintentionally not among them. This means that an
interrupted and automatically restarted balance does not apply the
stripes filter.
Fixes: dee32d0ac3
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 054570a1dc upstream.
During relocation of a data block group we create a relocation tree
for each fs/subvol tree by making a snapshot of each tree using
btrfs_copy_root() and the tree's commit root, and then setting the last
snapshot field for the fs/subvol tree's root to the value of the current
transaction id minus 1. However this can lead to relocation later
dropping references that it did not create if we have qgroups enabled,
leaving the filesystem in an inconsistent state that keeps aborting
transactions.
Let's consider the following example to explain the problem, which requires
qgroups to be enabled.
We are relocating data block group Y, we have a subvolume with id 258 that
has a root at level 1, that subvolume is used to store directory entries
for snapshots and we are currently at transaction 3404.
When committing transaction 3404, we have a pending snapshot and therefore
we call btrfs_run_delayed_items() at transaction.c:create_pending_snapshot()
in order to create its dentry at subvolume 258. This results in COWing
leaf A from root 258 in order to add the dentry. Note that leaf A
also contains file extent items referring to extents from some other
block group X (we are currently relocating block group Y). Later on, still
at create_pending_snapshot() we call qgroup_account_snapshot(), which
switches the commit root for root 258 when it calls switch_commit_roots(),
so now the COWed version of leaf A, let's call it leaf A', is accessible
from the commit root of tree 258. At the end of qgroup_account_snapshot(),
we call record_root_in_trans() with 258 as its argument, which results
in btrfs_init_reloc_root() being called, which in turn calls
relocation.c:create_reloc_root() in order to create a relocation tree
associated to root 258, which results in assigning the value of 3403
(which is the current transaction id minus 1 = 3404 - 1) to the
last_snapshot field of root 258. When creating the relocation tree root
at ctree.c:btrfs_copy_root() we add a shared reference for leaf A',
corresponding to the relocation tree's root, when we call btrfs_inc_ref()
against the COWed root (a copy of the commit root from tree 258), which
is at level 1. So at this point leaf A' has 2 references, one normal
reference corresponding to root 258 and one shared reference corresponding
to the root of the relocation tree.
Transaction 3404 finishes its commit and transaction 3405 is started by
relocation when calling merge_reloc_root() for the relocation tree
associated to root 258. In the meanwhile leaf A' is COWed again, in
response to some filesystem operation, when we are still at transaction
3405. However when we COW leaf A', at ctree.c:update_ref_for_cow(), we
call btrfs_block_can_be_shared() in order to figure out if other trees
refer to the leaf and if any such trees exists, add a full back reference
to leaf A' - but btrfs_block_can_be_shared() incorrectly returns false
because the following condition is false:
btrfs_header_generation(buf) <= btrfs_root_last_snapshot(&root->root_item)
which evaluates to 3404 <= 3403. So after leaf A' is COWed, it stays with
only one reference, corresponding to the shared reference we created when
we called btrfs_copy_root() to create the relocation tree's root and
btrfs_inc_ref() ends up not being called for leaf A' nor we end up setting
the flag BTRFS_BLOCK_FLAG_FULL_BACKREF in leaf A'. This results in not
adding shared references for the extents from block group X that leaf A'
refers to with its file extent items.
Later, after merging the relocation root we do a call to
btrfs_drop_snapshot() in order to delete the relocation tree. This ends
up calling do_walk_down() when path->slots[1] points to leaf A', which
results in calling btrfs_lookup_extent_info() to get the number of
references for leaf A', which is 1 at this time (only the shared reference
exists) and this value is stored at wc->refs[0]. After this walk_up_proc()
is called when wc->level is 0 and path->nodes[0] corresponds to leaf A'.
Because the current level is 0 and wc->refs[0] is 1, it does call
btrfs_dec_ref() against leaf A', which results in removing the single
references that the extents from block group X have which are associated
to root 258 - the expectation was to have each of these extents with 2
references - one reference for root 258 and one shared reference related
to the root of the relocation tree, and so we would drop only the shared
reference (because leaf A' was supposed to have the flag
BTRFS_BLOCK_FLAG_FULL_BACKREF set).
This leaves the filesystem in an inconsistent state as we now have file
extent items in a subvolume tree that point to extents from block group X
without references in the extent tree. So later on when we try to decrement
the references for these extents, for example due to a file unlink operation,
truncate operation or overwriting ranges of a file, we fail because the
expected references do not exist in the extent tree.
This leads to warnings and transaction aborts like the following:
[ 588.965795] ------------[ cut here ]------------
[ 588.965815] WARNING: CPU: 2 PID: 2479 at fs/btrfs/extent-tree.c:1625 lookup_inline_extent_backref+0x432/0x5b0 [btrfs]
[ 588.965816] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ppdev acpi_cpufreq button tpm_tis e1000 i2c_piix4 pcspkr parport_pc
parport tpm qemu_fw_cfg joydev btrfs xor raid6_pq sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci bochs_drm virtio_ring drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops virtio ttm serio_raw drm floppy sg
[ 588.965831] CPU: 2 PID: 2479 Comm: kworker/u8:7 Not tainted 4.7.3-3-default-fdm+ #1
[ 588.965832] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[ 588.965844] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[ 588.965845] 0000000000000000 ffff8802263bfa28 ffffffff813af542 0000000000000000
[ 588.965847] 0000000000000000 ffff8802263bfa68 ffffffff81081e8b 0000065900000000
[ 588.965848] ffff8801db2af000 000000012bbe2000 0000000000000000 ffff880215703b48
[ 588.965849] Call Trace:
[ 588.965852] [<ffffffff813af542>] dump_stack+0x63/0x81
[ 588.965854] [<ffffffff81081e8b>] __warn+0xcb/0xf0
[ 588.965855] [<ffffffff81081f7d>] warn_slowpath_null+0x1d/0x20
[ 588.965863] [<ffffffffa0175042>] lookup_inline_extent_backref+0x432/0x5b0 [btrfs]
[ 588.965865] [<ffffffff81143220>] ? trace_clock_local+0x10/0x30
[ 588.965867] [<ffffffff8114c5df>] ? rb_reserve_next_event+0x6f/0x460
[ 588.965875] [<ffffffffa0175215>] insert_inline_extent_backref+0x55/0xd0 [btrfs]
[ 588.965882] [<ffffffffa017531f>] __btrfs_inc_extent_ref.isra.55+0x8f/0x240 [btrfs]
[ 588.965890] [<ffffffffa017acea>] __btrfs_run_delayed_refs+0x74a/0x1260 [btrfs]
[ 588.965892] [<ffffffff810cb046>] ? cpuacct_charge+0x86/0xa0
[ 588.965900] [<ffffffffa017e74f>] btrfs_run_delayed_refs+0x9f/0x2c0 [btrfs]
[ 588.965908] [<ffffffffa017ea04>] delayed_ref_async_start+0x94/0xb0 [btrfs]
[ 588.965918] [<ffffffffa01c799a>] btrfs_scrubparity_helper+0xca/0x350 [btrfs]
[ 588.965928] [<ffffffffa01c7c5e>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[ 588.965930] [<ffffffff8109b323>] process_one_work+0x1f3/0x4e0
[ 588.965931] [<ffffffff8109b658>] worker_thread+0x48/0x4e0
[ 588.965932] [<ffffffff8109b610>] ? process_one_work+0x4e0/0x4e0
[ 588.965934] [<ffffffff810a1659>] kthread+0xc9/0xe0
[ 588.965936] [<ffffffff816f2f1f>] ret_from_fork+0x1f/0x40
[ 588.965937] [<ffffffff810a1590>] ? kthread_worker_fn+0x170/0x170
[ 588.965938] ---[ end trace 34e5232c933a1749 ]---
[ 588.966187] ------------[ cut here ]------------
[ 588.966196] WARNING: CPU: 2 PID: 2479 at fs/btrfs/extent-tree.c:2966 btrfs_run_delayed_refs+0x28c/0x2c0 [btrfs]
[ 588.966196] BTRFS: Transaction aborted (error -5)
[ 588.966197] Modules linked in: af_packet iscsi_ibft iscsi_boot_sysfs xfs libcrc32c ppdev acpi_cpufreq button tpm_tis e1000 i2c_piix4 pcspkr parport_pc
parport tpm qemu_fw_cfg joydev btrfs xor raid6_pq sr_mod cdrom ata_generic virtio_scsi ata_piix virtio_pci bochs_drm virtio_ring drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops virtio ttm serio_raw drm floppy sg
[ 588.966206] CPU: 2 PID: 2479 Comm: kworker/u8:7 Tainted: G W 4.7.3-3-default-fdm+ #1
[ 588.966207] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[ 588.966217] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[ 588.966217] 0000000000000000 ffff8802263bfc98 ffffffff813af542 ffff8802263bfce8
[ 588.966219] 0000000000000000 ffff8802263bfcd8 ffffffff81081e8b 00000b96345ee000
[ 588.966220] ffffffffa021ae1c ffff880215703b48 00000000000005fe ffff8802345ee000
[ 588.966221] Call Trace:
[ 588.966223] [<ffffffff813af542>] dump_stack+0x63/0x81
[ 588.966224] [<ffffffff81081e8b>] __warn+0xcb/0xf0
[ 588.966225] [<ffffffff81081eff>] warn_slowpath_fmt+0x4f/0x60
[ 588.966233] [<ffffffffa017e93c>] btrfs_run_delayed_refs+0x28c/0x2c0 [btrfs]
[ 588.966241] [<ffffffffa017ea04>] delayed_ref_async_start+0x94/0xb0 [btrfs]
[ 588.966250] [<ffffffffa01c799a>] btrfs_scrubparity_helper+0xca/0x350 [btrfs]
[ 588.966259] [<ffffffffa01c7c5e>] btrfs_extent_refs_helper+0xe/0x10 [btrfs]
[ 588.966260] [<ffffffff8109b323>] process_one_work+0x1f3/0x4e0
[ 588.966261] [<ffffffff8109b658>] worker_thread+0x48/0x4e0
[ 588.966263] [<ffffffff8109b610>] ? process_one_work+0x4e0/0x4e0
[ 588.966264] [<ffffffff810a1659>] kthread+0xc9/0xe0
[ 588.966265] [<ffffffff816f2f1f>] ret_from_fork+0x1f/0x40
[ 588.966267] [<ffffffff810a1590>] ? kthread_worker_fn+0x170/0x170
[ 588.966268] ---[ end trace 34e5232c933a174a ]---
[ 588.966269] BTRFS: error (device sda2) in btrfs_run_delayed_refs:2966: errno=-5 IO failure
[ 588.966270] BTRFS info (device sda2): forced readonly
This was happening often on openSUSE and SLE systems using btrfs as the
root filesystem (with its default layout where multiple subvolumes are
used) where balance happens in the background triggered by a cron job and
snapshots are automatically created before/after package installations,
upgrades and removals. The issue could be triggered simply by running the
following loop on the first system boot post installation:
while true; do
zypper -n in nfs-kernel-server
zypper -n rm nfs-kernel-server
done
(If we were fast enough and made that loop before the cron job triggered
a balance operation and the balance finished)
Fix this by setting the last_snapshot field of the root to the value of
the generation of its commit root. This way btrfs_block_can_be_shared()
behaves correctly both for the case where the relocation root is created
during a transaction commit and for the case where it's created before a
transaction commit.
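A hedged sketch of the core of the fix in relocation.c:create_reloc_root():
	/* derive last_snapshot from the commit root's generation instead of
	 * "current transid - 1", so it is correct whether the reloc root is
	 * created during or before a transaction commit */
	btrfs_set_root_last_snapshot(&root->root_item,
				     btrfs_header_generation(root->commit_root));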
Fixes: 6426c7ad69 (btrfs: qgroup: Fix qgroup accounting when creating snapshot)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2a7bf53f57 upstream.
If a log tree has a layout like the following:
leaf N:
...
item 240 key (282 DIR_LOG_ITEM 0) itemoff 8189 itemsize 8
dir log end 1275809046
leaf N + 1:
item 0 key (282 DIR_LOG_ITEM 3936149215) itemoff 16275 itemsize 8
dir log end 18446744073709551615
...
When we pass the value 1275809046 + 1 as the parameter start_ret to the
function tree-log.c:find_dir_range() (done by replay_dir_deletes()), we
end up with path->slots[0] having the value 239 (points to the last item
of leaf N, item 240). Because the dir log item in that position has an
offset value smaller than *start_ret (1275809046 + 1) we need to move on
to the next leaf; however, the logic for that is wrong since it compares
the current slot to the number of items in the leaf, which is smaller,
and therefore we don't look up the next leaf but instead we set the
slot to point to an item that does not exist, at slot 240, and we later
operate on that slot which has unexpected content or in the worst case
can result in an invalid memory access (accessing beyond the last page
of leaf N's extent buffer).
So fix the logic that checks when we need to look up the next leaf
by first incrementing the slot and only afterwards checking if that slot
is beyond the last item of the current leaf.
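A hedged sketch of the corrected advance in tree-log.c:find_dir_range()
(the error label follows the existing out path):
	path->slots[0]++;
	if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) {
		/* ran past the last item of this leaf, move to the next one */
		ret = btrfs_next_leaf(root, path);
		if (ret)
			goto out;
	}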
Signed-off-by: Robbie Ko <robbieko@synology.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Fixes: e02119d5a7 (Btrfs: Add a write ahead tree log to optimize synchronous operations)
Signed-off-by: Filipe Manana <fdmanana@suse.com>
[Modified changelog for clarity and correctness]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef85b25e98 upstream.
This can only happen with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y.
Commit 1ba98d0 ("Btrfs: detect corruption when non-root leaf has zero item")
assumes that a leaf is its root when leaf->bytenr == btrfs_root_bytenr(root);
however, we should not use btrfs_root_bytenr(root) since it mainly gets
updated during transaction commit. So the check can fail when doing
COW on this leaf while it is a root.
This changes the code to use "if (leaf == btrfs_root_node(root))" instead,
just like how we check whether a leaf is a root in __btrfs_cow_block().
Fixes: 1ba98d086f (Btrfs: detect corruption when non-root leaf has zero item)
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2939e1a86f upstream.
Problem statement: an unprivileged user who has read-write access to more
than one btrfs subvolume may easily consume all kernel memory (eventually
triggering the oom-killer).
Reproducer (./mkrmdir below essentially loops over mkdir/rmdir):
[root@kteam1 ~]# cat prep.sh
DEV=/dev/sdb
mkfs.btrfs -f $DEV
mount $DEV /mnt
for i in `seq 1 16`
do
mkdir /mnt/$i
btrfs subvolume create /mnt/SV_$i
ID=`btrfs subvolume list /mnt |grep "SV_$i$" |cut -d ' ' -f 2`
mount -t btrfs -o subvolid=$ID $DEV /mnt/$i
chmod a+rwx /mnt/$i
done
[root@kteam1 ~]# sh prep.sh
[maxim@kteam1 ~]$ for i in `seq 1 16`; do ./mkrmdir /mnt/$i 2000 2000 & done
[root@kteam1 ~]# for i in `seq 1 4`; do grep "kmalloc-128" /proc/slabinfo | grep -v dma; sleep 60; done
kmalloc-128 10144 10144 128 32 1 : tunables 0 0 0 : slabdata 317 317 0
kmalloc-128 9992352 9992352 128 32 1 : tunables 0 0 0 : slabdata 312261 312261 0
kmalloc-128 24226752 24226752 128 32 1 : tunables 0 0 0 : slabdata 757086 757086 0
kmalloc-128 42754240 42754240 128 32 1 : tunables 0 0 0 : slabdata 1336070 1336070 0
The huge numbers above come from the insane number of async_work items
allocated and queued by btrfs_wq_run_delayed_node.
The problem is caused by btrfs_wq_run_delayed_node() queuing more and more
work items if the number of delayed items is above BTRFS_DELAYED_BACKGROUND.
The worker func (btrfs_async_run_delayed_root) processes at least
BTRFS_DELAYED_BATCH items (if they are present in the list). So, the machinery
works as expected while the list is almost empty. As soon as it gets
bigger, the worker func starts to process more than one item at a time, it
takes longer, and the chances of having more async_work items queued than
needed get higher.
The problem above is worsened by another flaw of the delayed-inode
implementation: if async_work was queued in a throttling branch (number of
items >= BTRFS_DELAYED_WRITEBACK), the corresponding worker func won't quit
until the number of items < BTRFS_DELAYED_BACKGROUND / 2. So, it is possible
that the func occupies the CPU indefinitely (up to 30 sec in my experiments):
while the func is trying to drain the list, user activity may add more and
more items to the list.
The patch fixes both problems in a straightforward way: refuse to queue too
many works in btrfs_wq_run_delayed_node and bail out of the worker func if
at least BTRFS_DELAYED_WRITEBACK items have been processed.
Changed in v2: remove support of thresh == NO_THRESHOLD.
Signed-off-by: Maxim Patlasov <mpatlasov@virtuozzo.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 777c6e0dae upstream.
Yu Zhao has noticed that __unregister_cpu_notifier only unregisters its
notifiers when HOTPLUG_CPU=y while the registration might succeed even
when HOTPLUG_CPU=n if MODULE is enabled. This means that e.g. zswap
might keep a stale notifier on the list during the manual clean up on
pool tear down and thus corrupt the list, resulting in the following:
[ 144.964346] BUG: unable to handle kernel paging request at ffff880658a2be78
[ 144.971337] IP: [<ffffffffa290b00b>] raw_notifier_chain_register+0x1b/0x40
<snipped>
[ 145.122628] Call Trace:
[ 145.125086] [<ffffffffa28e5cf8>] __register_cpu_notifier+0x18/0x20
[ 145.131350] [<ffffffffa2a5dd73>] zswap_pool_create+0x273/0x400
[ 145.137268] [<ffffffffa2a5e0fc>] __zswap_param_set+0x1fc/0x300
[ 145.143188] [<ffffffffa2944c1d>] ? trace_hardirqs_on+0xd/0x10
[ 145.149018] [<ffffffffa2908798>] ? kernel_param_lock+0x28/0x30
[ 145.154940] [<ffffffffa2a3e8cf>] ? __might_fault+0x4f/0xa0
[ 145.160511] [<ffffffffa2a5e237>] zswap_compressor_param_set+0x17/0x20
[ 145.167035] [<ffffffffa2908d3c>] param_attr_store+0x5c/0xb0
[ 145.172694] [<ffffffffa290848d>] module_attr_store+0x1d/0x30
[ 145.178443] [<ffffffffa2b2b41f>] sysfs_kf_write+0x4f/0x70
[ 145.183925] [<ffffffffa2b2a5b9>] kernfs_fop_write+0x149/0x180
[ 145.189761] [<ffffffffa2a99248>] __vfs_write+0x18/0x40
[ 145.194982] [<ffffffffa2a9a412>] vfs_write+0xb2/0x1a0
[ 145.200122] [<ffffffffa2a9a732>] SyS_write+0x52/0xa0
[ 145.205177] [<ffffffffa2ff4d97>] entry_SYSCALL_64_fastpath+0x12/0x17
This can be even triggered manually by changing
/sys/module/zswap/parameters/compressor multiple times.
Fix this issue by making unregister APIs symmetric to the register so
there are no surprises.
Fixes: 47e627bc8c ("[PATCH] hotplug: Allow modules to use the cpu hotplug notifiers even if !CONFIG_HOTPLUG_CPU")
Reported-and-tested-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Streetman <ddstreet@ieee.org>
Link: http://lkml.kernel.org/r/20161207135438.4310-1-mhocko@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>