Commit Graph

34276 Commits

Author SHA1 Message Date
Boqun Feng
f08e388857 lockdep: Fix recursive read lock related safe->unsafe detection
Currently, in safe->unsafe detection, lockdep misses the fact that a
LOCK_ENABLED_IRQ_*_READ usage and a LOCK_USED_IN_IRQ_*_READ usage may
cause deadlock too, for example:

	P1                          P2
	<irq disabled>
	write_lock(l1);             <irq enabled>
				    read_lock(l2);
	write_lock(l2);
				    <in irq>
				    read_lock(l1);

Actually, all of the following cases may cause deadlocks:

	LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*
	LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*
	LOCK_USED_IN_IRQ_* -> LOCK_ENABLED_IRQ_*_READ
	LOCK_USED_IN_IRQ_*_READ -> LOCK_ENABLED_IRQ_*_READ

To fix this, we need to 1) change the calculation of exclusive_mask() so
that READ bits are not dropped and 2) always call usage() in
mark_lock_irq() to check usage deadlocks, even when the new usage of the
lock is READ.

Besides, adjust usage_match() and usage_acculumate() to recursive read
lock changes.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-12-boqun.feng@gmail.com
2020-08-26 12:42:05 +02:00
Boqun Feng
68e3056785 lockdep: Adjust check_redundant() for recursive read change
check_redundant() will report redundancy if it finds a path could
replace the about-to-add dependency in the BFS search. With recursive
read lock changes, we certainly need to change the match function for
the check_redundant(), because the path needs to match not only the lock
class but also the dependency kinds. For example, if the about-to-add
dependency @prev -> @next is A -(SN)-> B, and we find a path A -(S*)->
.. -(*R)->B in the dependency graph with __bfs() (for simplicity, we can
also say we find an -(SR)-> path from A to B), we can not replace the
dependency with that path in the BFS search. Because the -(SN)->
dependency can make a strong path with a following -(S*)-> dependency,
however an -(SR)-> path cannot.

Further, we can replace an -(SN)-> dependency with a -(EN)-> path, that
means if we find a path which is stronger than or equal to the
about-to-add dependency, we can report the redundancy. By "stronger", it
means both the start and the end of the path are not weaker than the
start and the end of the dependency (E is "stronger" than S and N is
"stronger" than R), so that we can replace the dependency with that
path.

To make sure we find a path whose start point is not weaker than the
about-to-add dependency, we use a trick: the ->only_xr of the root
(start point) of __bfs() is initialized as @prev-> == 0, therefore if
@prev is E, __bfs() will pick only -(E*)-> for the first dependency,
otherwise, __bfs() can pick -(E*)-> or -(S*)-> for the first dependency.

To make sure we find a path whose end point is not weaker than the
about-to-add dependency, we replace the match function for __bfs()
check_redundant(), we check for the case that either @next is R
(anything is not weaker than it) or the end point of the path is N
(which is not weaker than anything).

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-11-boqun.feng@gmail.com
2020-08-26 12:42:05 +02:00
Boqun Feng
9de0c9bbce lockdep: Support deadlock detection for recursive read locks in check_noncircular()
Currently, lockdep only has limit support for deadlock detection for
recursive read locks.

This patch support deadlock detection for recursive read locks. The
basic idea is:

We are about to add dependency B -> A in to the dependency graph, we use
check_noncircular() to find whether we have a strong dependency path
A -> .. -> B so that we have a strong dependency circle (a closed strong
dependency path):

	 A -> .. -> B -> A

, which doesn't have two adjacent dependencies as -(*R)-> L -(S*)->.

Since A -> .. -> B is already a strong dependency path, so if either
B -> A is -(E*)-> or A -> .. -> B is -(*N)->, the circle A -> .. -> B ->
A is strong, otherwise not. So we introduce a new match function
hlock_conflict() to replace the class_equal() for the deadlock check in
check_noncircular().

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-10-boqun.feng@gmail.com
2020-08-26 12:42:05 +02:00
Boqun Feng
61775ed243 lockdep: Make __bfs(.match) return bool
The "match" parameter of __bfs() is used for checking whether we hit a
match in the search, therefore it should return a boolean value rather
than an integer for better readability.

This patch then changes the return type of the function parameter and the
match functions to bool.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-9-boqun.feng@gmail.com
2020-08-26 12:42:05 +02:00
Boqun Feng
6971c0f345 lockdep: Extend __bfs() to work with multiple types of dependencies
Now we have four types of dependencies in the dependency graph, and not
all the pathes carry real dependencies (the dependencies that may cause
a deadlock), for example:

	Given lock A and B, if we have:

	CPU1			CPU2
	=============		==============
	write_lock(A);		read_lock(B);
	read_lock(B);		write_lock(A);

	(assuming read_lock(B) is a recursive reader)

	then we have dependencies A -(ER)-> B, and B -(SN)-> A, and a
	dependency path A -(ER)-> B -(SN)-> A.

	In lockdep w/o recursive locks, a dependency path from A to A
	means a deadlock. However, the above case is obviously not a
	deadlock, because no one holds B exclusively, therefore no one
	waits for the other to release B, so who get A first in CPU1 and
	CPU2 will run non-blockingly.

	As a result, dependency path A -(ER)-> B -(SN)-> A is not a
	real/strong dependency that could cause a deadlock.

From the observation above, we know that for a dependency path to be
real/strong, no two adjacent dependencies can be as -(*R)-> -(S*)->.

Now our mission is to make __bfs() traverse only the strong dependency
paths, which is simple: we record whether we only have -(*R)-> for the
previous lock_list of the path in lock_list::only_xr, and when we pick a
dependency in the traverse, we 1) filter out -(S*)-> dependency if the
previous lock_list only has -(*R)-> dependency (i.e. ->only_xr is true)
and 2) set the next lock_list::only_xr to true if we only have -(*R)->
left after we filter out dependencies based on 1), otherwise, set it to
false.

With this extension for __bfs(), we now need to initialize the root of
__bfs() properly (with a correct ->only_xr), to do so, we introduce some
helper functions, which also cleans up a little bit for the __bfs() root
initialization code.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-8-boqun.feng@gmail.com
2020-08-26 12:42:04 +02:00
Boqun Feng
3454a36d6a lockdep: Introduce lock_list::dep
To add recursive read locks into the dependency graph, we need to store
the types of dependencies for the BFS later. There are four types of
dependencies:

*	Exclusive -> Non-recursive dependencies: EN
	e.g. write_lock(prev) held and try to acquire write_lock(next)
	or non-recursive read_lock(next), which can be represented as
	"prev -(EN)-> next"

*	Shared -> Non-recursive dependencies: SN
	e.g. read_lock(prev) held and try to acquire write_lock(next) or
	non-recursive read_lock(next), which can be represented as
	"prev -(SN)-> next"

*	Exclusive -> Recursive dependencies: ER
	e.g. write_lock(prev) held and try to acquire recursive
	read_lock(next), which can be represented as "prev -(ER)-> next"

*	Shared -> Recursive dependencies: SR
	e.g. read_lock(prev) held and try to acquire recursive
	read_lock(next), which can be represented as "prev -(SR)-> next"

So we use 4 bits for the presence of each type in lock_list::dep. Helper
functions and macros are also introduced to convert a pair of locks into
lock_list::dep bit and maintain the addition of different types of
dependencies.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-7-boqun.feng@gmail.com
2020-08-26 12:42:04 +02:00
Boqun Feng
bd76eca10d lockdep: Reduce the size of lock_list::distance
lock_list::distance is always not greater than MAX_LOCK_DEPTH (which
is 48 right now), so a u16 will fit. This patch reduces the size of
lock_list::distance to save space, so that we can introduce other fields
to help detect recursive read lock deadlocks without increasing the size
of lock_list structure.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-6-boqun.feng@gmail.com
2020-08-26 12:42:04 +02:00
Boqun Feng
d563bc6ead lockdep: Make __bfs() visit every dependency until a match
Currently, __bfs() will do a breadth-first search in the dependency
graph and visit each lock class in the graph exactly once, so for
example, in the following graph:

	A ---------> B
	|            ^
	|            |
	+----------> C

a __bfs() call starts at A, will visit B through dependency A -> B and
visit C through dependency A -> C and that's it, IOW, __bfs() will not
visit dependency C -> B.

This is OK for now, as we only have strong dependencies in the
dependency graph, so whenever there is a traverse path from A to B in
__bfs(), it means A has strong dependencies to B (IOW, B depends on A
strongly). So no need to visit all dependencies in the graph.

However, as we are going to add recursive-read lock into the dependency
graph, as a result, not all the paths mean strong dependencies, in the
same example above, dependency A -> B may be a weak dependency and
traverse A -> C -> B may be a strong dependency path. And with the old
way of __bfs() (i.e. visiting every lock class exactly once), we will
miss the strong dependency path, which will result into failing to find
a deadlock. To cure this for the future, we need to find a way for
__bfs() to visit each dependency, rather than each class, exactly once
in the search until we find a match.

The solution is simple:

We used to mark lock_class::lockdep_dependency_gen_id to indicate a
class has been visited in __bfs(), now we change the semantics a little
bit: we now mark lock_class::lockdep_dependency_gen_id to indicate _all
the dependencies_ in its lock_{after,before} have been visited in the
__bfs() (note we only take one direction in a __bfs() search). In this
way, every dependency is guaranteed to be visited until we find a match.

Note: the checks in mark_lock_accessed() and lock_accessed() are
removed, because after this modification, we may call these two
functions on @source_entry of __bfs(), which may not be the entry in
"list_entries"

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-5-boqun.feng@gmail.com
2020-08-26 12:42:03 +02:00
Boqun Feng
b11be024de lockdep: Demagic the return value of BFS
__bfs() could return four magic numbers:

	1: search succeeds, but none match.
	0: search succeeds, find one match.
	-1: search fails because of the cq is full.
	-2: search fails because a invalid node is found.

This patch cleans things up by using a enum type for the return value
of __bfs() and its friends, this improves the code readability of the
code, and further, could help if we want to extend the BFS.

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-4-boqun.feng@gmail.com
2020-08-26 12:42:03 +02:00
Boqun Feng
e918188611 locking: More accurate annotations for read_lock()
On the archs using QUEUED_RWLOCKS, read_lock() is not always a recursive
read lock, actually it's only recursive if in_interrupt() is true. So
change the annotation accordingly to catch more deadlocks.

Note we used to treat read_lock() as pure recursive read locks in
lib/locking-seftest.c, and this is useful, especially for the lockdep
development selftest, so we keep this via a variable to force switching
lock annotation for read_lock().

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200807074238.1632519-2-boqun.feng@gmail.com
2020-08-26 12:42:02 +02:00
Valentin Schneider
4fc472f121 sched/topology: Move SD_DEGENERATE_GROUPS_MASK out of linux/sched/topology.h
SD_DEGENERATE_GROUPS_MASK is only useful for sched/topology.c, but still
gets defined for anyone who imports topology.h, leading to a flurry of
unused variable warnings.

Move it out of the header and place it next to the SD degeneration
functions in sched/topology.c.

Fixes: 4ee4ea443a ("sched/topology: Introduce SD metaflag for flags needing > 1 groups")
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200825133216.9163-2-valentin.schneider@arm.com
2020-08-26 12:41:59 +02:00
Valentin Schneider
8fca9494d4 sched/topology: Move sd_flag_debug out of linux/sched/topology.h
Defining an array in a header imported all over the place clearly is a daft
idea, that still didn't stop me from doing it.

Leave a declaration of sd_flag_debug in topology.h and move its definition
to sched/debug.c.

Fixes: b6e862f386 ("sched/topology: Define and assign sched_domain flag metadata")
Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200825133216.9163-1-valentin.schneider@arm.com
2020-08-26 12:41:59 +02:00
Sebastian Andrzej Siewior
c1cecf884a sched: Cache task_struct::flags in sched_submit_work()
sched_submit_work() is considered to be a hot path. The preempt_disable()
instruction is a compiler barrier and forces the compiler to load
task_struct::flags for the second comparison.
By using a local variable, the compiler can load the value once and keep it in
a register for the second comparison.

Verified on x86-64 with gcc-10.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200819200025.lqvmyefqnbok5i4f@linutronix.de
2020-08-26 12:41:58 +02:00
Jiang Biao
1724b95b92 sched/fair: Simplify the work when reweighting entity
The code in reweight_entity() can be simplified.

For a sched entity on the rq, the entity accounting can be replaced by
cfs_rq instantaneous load updates currently called from within the
entity accounting.

Even though an entity on the rq can't represent a task in
reweight_entity() (a task is always dequeued before calling this
function) and so the numa task accounting and the rq->cfs_tasks list
management of the entity accounting are never called, the redundant
cfs_rq->nr_running decrement/increment will be avoided.

Signed-off-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20200811113209.34057-1-benbjiang@tencent.com
2020-08-26 12:41:58 +02:00
Lukasz Luba
da0777d35f sched/fair: Fix wrong negative conversion in find_energy_efficient_cpu()
In find_energy_efficient_cpu() 'cpu_cap' could be less that 'util'.
It might be because of RT, DL (so higher sched class than CFS), irq or
thermal pressure signal, which reduce the capacity value.
In such situation the result of 'cpu_cap - util' might be negative but
stored in the unsigned long. Then it might be compared with other unsigned
long when uclamp_rq_util_with() reduced the 'util' such that is passes the
fits_capacity() check.

Prevent this situation and make the arithmetic more safe.

Fixes: 1d42509e47 ("sched/fair: Make EAS wakeup placement consider uclamp restrictions")
Signed-off-by: Lukasz Luba <lukasz.luba@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Link: https://lkml.kernel.org/r/20200810083004.26420-1-lukasz.luba@arm.com
2020-08-26 12:41:57 +02:00
Josh Don
ec73240b16 sched/fair: Ignore cache hotness for SMT migration
SMT siblings share caches, so cache hotness should be irrelevant for
cross-sibling migration.

Signed-off-by: Josh Don <joshdon@google.com>
Proposed-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200804193413.510651-1-joshdon@google.com
2020-08-26 12:41:57 +02:00
Peter Zijlstra
eb1f00237a lockdep,trace: Expose tracepoints
The lockdep tracepoints are under the lockdep recursion counter, this
has a bunch of nasty side effects:

 - TRACE_IRQFLAGS doesn't work across the entire tracepoint

 - RCU-lockdep doesn't see the tracepoints either, hiding numerous
   "suspicious RCU usage" warnings.

Pull the trace_lock_*() tracepoints completely out from under the
lockdep recursion handling and completely rely on the trace level
recusion handling -- also, tracing *SHOULD* not be taking locks in any
case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.782688941@infradead.org
2020-08-26 12:41:56 +02:00
Peter Zijlstra
9864f5b594 cpuidle: Move trace_cpu_idle() into generic code
Remove trace_cpu_idle() from the arch_cpu_idle() implementations and
put it in the generic code, right before disabling RCU. Gets rid of
more trace_*_rcuidle() users.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.428433395@infradead.org
2020-08-26 12:41:54 +02:00
Peter Zijlstra
1098582a0f sched,idle,rcu: Push rcu_idle deeper into the idle path
Lots of things take locks, due to a wee bug, rcu_lockdep didn't notice
that the locking tracepoints were using RCU.

Push rcu_idle_{enter,exit}() as deep as possible into the idle paths,
this also resolves a lot of _rcuidle()/RCU_NONIDLE() usage.

Specifically, sched_clock_idle_wakeup_event() will use ktime which
will use seqlocks which will tickle lockdep, and
stop_critical_timings() uses lock.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Link: https://lkml.kernel.org/r/20200821085348.310943801@infradead.org
2020-08-26 12:41:53 +02:00
Peter Zijlstra
fddf9055a6 lockdep: Use raw_cpu_*() for per-cpu variables
Sven reported that commit a21ee6055c ("lockdep: Change
hardirq{s_enabled,_context} to per-cpu variables") caused trouble on
s390 because their this_cpu_*() primitives disable preemption which
then lands back tracing.

On the one hand, per-cpu ops should use preempt_*able_notrace() and
raw_local_irq_*(), on the other hand, we can trivialy use raw_cpu_*()
ops for this.

Fixes: a21ee6055c ("lockdep: Change hardirq{s_enabled,_context} to per-cpu variables")
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200821085348.192346882@infradead.org
2020-08-26 12:41:53 +02:00
Xu Wang
ec02821c1d alarmtimer: Convert comma to semicolon
Replace a comma between expression statements by a semicolon.

Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20200818062651.21680-1-vulab@iscas.ac.cn
2020-08-25 12:45:53 +02:00
Tobias Klauser
7787b6fc93 bpf, sysctl: Let bpf_stats_handler take a kernel pointer buffer
Commit 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
changed ctl_table.proc_handler to take a kernel pointer. Adjust the
signature of bpf_stats_handler to match ctl_table.proc_handler which
fixes the following sparse warning:

kernel/sysctl.c:226:49: warning: incorrect type in argument 3 (different address spaces)
kernel/sysctl.c:226:49:    expected void *
kernel/sysctl.c:226:49:    got void [noderef] __user *buffer
kernel/sysctl.c:2640:35: warning: incorrect type in initializer (incompatible argument 3 (different address spaces))
kernel/sysctl.c:2640:35:    expected int ( [usertype] *proc_handler )( ... )
kernel/sysctl.c:2640:35:    got int ( * )( ... )

Fixes: 32927393dc ("sysctl: pass kernel pointers to ->proc_handler")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/bpf/20200824142047.22043-1-tklauser@distanz.ch
2020-08-24 21:11:40 -07:00
Yonghong Song
b474959d5a bpf: Fix a buffer out-of-bound access when filling raw_tp link_info
Commit f2e10bff16 ("bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link")
added link query for raw_tp. One of fields in link_info is to
fill a user buffer with tp_name. The Scurrent checking only
declares "ulen && !ubuf" as invalid. So "!ulen && ubuf" will be
valid. Later on, we do "copy_to_user(ubuf, tp_name, ulen - 1)" which
may overwrite user memory incorrectly.

This patch fixed the problem by disallowing "!ulen && ubuf" case as well.

Fixes: f2e10bff16 ("bpf: Add support for BPF_OBJ_GET_INFO_BY_FD for bpf_link")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20200821191054.714731-1-yhs@fb.com
2020-08-24 21:03:07 -07:00
Marco Elver
2e986b81f6 kcsan: Optimize debugfs stats counters
Remove kcsan_counter_inc/dec() functions, as they perform no other
logic, and are no longer needed.

This avoids several calls in kcsan_setup_watchpoint() and
kcsan_found_watchpoint(), as well as lets the compiler warn us about
potential out-of-bounds accesses as the array's size is known at all
usage sites at compile-time.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:10:23 -07:00
Marco Elver
178a1877d7 kcsan: Use pr_fmt for consistency
Use the same pr_fmt throughout for consistency. [ The only exception is
report.c, where the format must be kept precisely as-is. ]

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:10:23 -07:00
Marco Elver
2778793072 kcsan: Show message if enabled early
Show a message in the kernel log if KCSAN was enabled early.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:10:22 -07:00
Marco Elver
4700ccdf18 kcsan: Remove debugfs test command
Remove the debugfs test command, as it is no longer needed now that we
have the KUnit+Torture based kcsan-test module. This is to avoid
confusion around how KCSAN should be tested, as only the kcsan-test
module is maintained.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:10:22 -07:00
Marco Elver
a4e74fa5f0 kcsan: Simplify constant string handling
Simplify checking prefixes and length calculation of constant strings.
For the former, the kernel provides str_has_prefix(), and the latter we
should just use strlen("..") because GCC and Clang have optimizations
that optimize these into constants.

No functional change intended.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:10:22 -07:00
Marco Elver
69b2c81bc8 kcsan: Simplify debugfs counter to name mapping
Simplify counter ID to name mapping by using an array with designated
inits. This way, we can turn a run-time BUG() into a compile-time static
assertion failure if a counter name is missing.

No functional change intended.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:10:21 -07:00
Marco Elver
bec4a24748 kcsan: Test support for compound instrumentation
Changes kcsan-test module to support checking reports that include
compound instrumentation. Since we should not fail the test if this
support is unavailable, we have to add a config variable that the test
can use to decide what to check for.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:09:58 -07:00
Marco Elver
9d1335cc1e kcsan: Add missing CONFIG_KCSAN_IGNORE_ATOMICS checks
Add missing CONFIG_KCSAN_IGNORE_ATOMICS checks for the builtin atomics
instrumentation.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:09:57 -07:00
Marco Elver
106a307fd0 kcsan: Skew delay to be longer for certain access types
For compound instrumentation and assert accesses, skew the watchpoint
delay to be longer if randomized. This is useful to improve race
detection for such accesses.

For compound accesses we should increase the delay as we've aggregated
both read and write instrumentation. By giving up 1 call into the
runtime, we're less likely to set up a watchpoint and thus less likely
to detect a race. We can balance this by increasing the watchpoint
delay.

For assert accesses, we know these are of increased interest, and we
wish to increase our chances of detecting races for such checks.

Note that, kcsan_udelay_{task,interrupt} define the upper bound delays.
When randomized, delays are uniformly distributed between [0, delay].
Skewing the delay does not break this promise as long as the defined
upper bounds are still adhered to. The current skew results in delays
uniformly distributed between [delay/2, delay].

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:09:57 -07:00
Marco Elver
14e2ac8de0 kcsan: Support compounded read-write instrumentation
Add support for compounded read-write instrumentation if supported by
the compiler. Adds the necessary instrumentation functions, and a new
type which is used to generate a more descriptive report.

Furthermore, such compounded memory access instrumentation is excluded
from the "assume aligned writes up to word size are atomic" rule,
because we cannot assume that the compiler emits code that is atomic for
compound ops.

LLVM/Clang added support for the feature in:
785d41a261

The new instrumentation is emitted for sets of memory accesses in the
same basic block to the same address with at least one read appearing
before a write. These typically result from compound operations such as
++, --, +=, -=, |=, &=, etc. but also equivalent forms such as "var =
var + 1". Where the compiler determines that it is equivalent to emit a
call to a single __tsan_read_write instead of separate __tsan_read and
__tsan_write, we can then benefit from improved performance and better
reporting for such access patterns.

The new reports now show that the ops are both reads and writes, for
example:

	read-write to 0xffffffff90548a38 of 8 bytes by task 143 on cpu 3:
	 test_kernel_rmw_array+0x45/0xa0
	 access_thread+0x71/0xb0
	 kthread+0x21e/0x240
	 ret_from_fork+0x22/0x30

	read-write to 0xffffffff90548a38 of 8 bytes by task 144 on cpu 2:
	 test_kernel_rmw_array+0x45/0xa0
	 access_thread+0x71/0xb0
	 kthread+0x21e/0x240
	 ret_from_fork+0x22/0x30

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:09:32 -07:00
Marco Elver
f9ea631931 kcsan: Add atomic builtin test case
Adds test case to kcsan-test module, to test atomic builtin
instrumentation works.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:09:31 -07:00
Marco Elver
0f8ad5f2e9 kcsan: Add support for atomic builtins
Some architectures (currently e.g. s390 partially) implement atomics
using the compiler's atomic builtins (__atomic_*, __sync_*). To support
enabling KCSAN on such architectures in future, or support experimental
use of these builtins, implement support for them.

We should also avoid breaking KCSAN kernels due to use (accidental or
otherwise) of atomic builtins in drivers, as has happened in the past:
https://lkml.kernel.org/r/5231d2c0-41d9-6721-e15f-a7eedf3ce69e@infradead.org

The instrumentation is subtly different from regular reads/writes: TSAN
instrumentation replaces the use of atomic builtins with a call into the
runtime, and the runtime's job is to also execute the desired atomic
operation. We rely on the __atomic_* compiler builtins, available with
all KCSAN-supported compilers, to implement each TSAN atomic
instrumentation function.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2020-08-24 15:09:05 -07:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds
e99b2507ba Merge tag 'core-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull entry fix from Thomas Gleixner:
 "A single bug fix for the common entry code.

  The transcription of the x86 version messed up the reload of the
  syscall number from pt_regs after ptrace and seccomp which breaks
  syscall number rewriting"

* tag 'core-urgent-2020-08-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  core/entry: Respect syscall number rewrites
2020-08-23 11:05:47 -07:00
Linus Torvalds
9d045ed1eb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
 "Nothing earth shattering here, lots of small fixes (f.e. missing RCU
  protection, bad ref counting, missing memset(), etc.) all over the
  place:

   1) Use get_file_rcu() in task_file iterator, from Yonghong Song.

   2) There are two ways to set remote source MAC addresses in macvlan
      driver, but only one of which validates things properly. Fix this.
      From Alvin Šipraga.

   3) Missing of_node_put() in gianfar probing, from Sumera
      Priyadarsini.

   4) Preserve device wanted feature bits across multiple netlink
      ethtool requests, from Maxim Mikityanskiy.

   5) Fix rcu_sched stall in task and task_file bpf iterators, from
      Yonghong Song.

   6) Avoid reset after device destroy in ena driver, from Shay
      Agroskin.

   7) Missing memset() in netlink policy export reallocation path, from
      Johannes Berg.

   8) Fix info leak in __smc_diag_dump(), from Peilin Ye.

   9) Decapsulate ECN properly for ipv6 in ipv4 tunnels, from Mark
      Tomlinson.

  10) Fix number of data stream negotiation in SCTP, from David Laight.

  11) Fix double free in connection tracker action module, from Alaa
      Hleihel.

  12) Don't allow empty NHA_GROUP attributes, from Nikolay Aleksandrov"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (46 commits)
  net: nexthop: don't allow empty NHA_GROUP
  bpf: Fix two typos in uapi/linux/bpf.h
  net: dsa: b53: check for timeout
  tipc: call rcu_read_lock() in tipc_aead_encrypt_done()
  net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments() error flow
  net: sctp: Fix negotiation of the number of data streams.
  dt-bindings: net: renesas, ether: Improve schema validation
  gre6: Fix reception with IP6_TNL_F_RCV_DSCP_COPY
  hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit()
  hv_netvsc: Remove "unlikely" from netvsc_select_queue
  bpf: selftests: global_funcs: Check err_str before strstr
  bpf: xdp: Fix XDP mode when no mode flags specified
  selftests/bpf: Remove test_align leftovers
  tools/resolve_btfids: Fix sections with wrong alignment
  net/smc: Prevent kernel-infoleak in __smc_diag_dump()
  sfc: fix build warnings on 32-bit
  net: phy: mscc: Fix a couple of spelling mistakes "spcified" -> "specified"
  libbpf: Fix map index used in error message
  net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe()
  net: atlantic: Use readx_poll_timeout() for large timeout
  ...
2020-08-23 10:52:33 -07:00
Thomas Gleixner
e2d977c9f1 timekeeping: Provide multi-timestamp accessor to NMI safe timekeeper
printk wants to store various timestamps (MONOTONIC, REALTIME, BOOTTIME) to
make correlation of dmesg from several systems easier.

Provide an interface to retrieve all three timestamps in one go.

There are some caveats:

1) Boot time and late sleep time injection

  Boot time is a racy access on 32bit systems if the sleep time injection
  happens late during resume and not in timekeeping_resume(). That could be
  avoided by expanding struct tk_read_base with boot offset for 32bit and
  adding more overhead to the update. As this is a hard to observe once per
  resume event which can be filtered with reasonable effort using the
  accurate mono/real timestamps, it's probably not worth the trouble.

  Aside of that it might be possible on 32 and 64 bit to observe the
  following when the sleep time injection happens late:

  CPU 0				         CPU 1
  timekeeping_resume()
  ktime_get_fast_timestamps()
    mono, real = __ktime_get_real_fast()
  					 inject_sleep_time()
  					   update boot offset
  	boot = mono + bootoffset;
  
  That means that boot time already has the sleep time adjustment, but
  real time does not. On the next readout both are in sync again.
  
  Preventing this for 64bit is not really feasible without destroying the
  careful cache layout of the timekeeper because the sequence count and
  struct tk_read_base would then need two cache lines instead of one.

2) Suspend/resume timestamps

   Access to the time keeper clock source is disabled accross the innermost
   steps of suspend/resume. The accessors still work, but the timestamps
   are frozen until time keeping is resumed which happens very early.

   For regular suspend/resume there is no observable difference vs. sched
   clock, but it might affect some of the nasty low level debug printks.

   OTOH, access to sched clock is not guaranteed accross suspend/resume on
   all systems either so it depends on the hardware in use.

   If that turns out to be a real problem then this could be mitigated by
   using sched clock in a similar way as during early boot. But it's not as
   trivial as on early boot because it needs some careful protection
   against the clock monotonic timestamp jumping backwards on resume.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Petr Mladek <pmladek@suse.com>                                                                                                                                                                                                                                      
Link: https://lore.kernel.org/r/20200814115512.159981360@linutronix.de
2020-08-23 10:38:24 +02:00
Thomas Gleixner
71419b30ca timekeeping: Utilize local_clock() for NMI safe timekeeper during early boot
During early boot the NMI safe timekeeper returns 0 until the first
clocksource becomes available.

This prevents it from being used for printk or other facilities which today
use sched clock. sched clock can be available way before timekeeping is
initialized.

The obvious workaround for this is to utilize the early sched clock in the
default dummy clock read function until a clocksource becomes available.

After switching to the clocksource clock MONOTONIC and BOOTTIME will not
jump because the timekeeping_init() bases clock MONOTONIC on sched clock
and the offset between clock MONOTONIC and BOOTTIME is zero during boot.

Clock REALTIME cannot provide useful timestamps during early boot up to
the point where a persistent clock becomes available, which is either in
timekeeping_init() or later when the RTC driver which might depend on I2C
or other subsystems is initialized.

There is a minor difference to sched_clock() vs. suspend/resume. As the
timekeeper clock source might not be accessible during suspend, after
timekeeping_suspend() timestamps freeze up to the point where
timekeeping_resume() is invoked. OTOH this is true for some sched clock
implementations as well.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Petr Mladek <pmladek@suse.com>                                                                                                                                                                                                                                      
Link: https://lore.kernel.org/r/20200814115512.041422402@linutronix.de
2020-08-23 10:38:24 +02:00
Linus Torvalds
349111f050 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "11 patches.

  Subsystems affected by this: misc, mm/hugetlb, mm/vmalloc, mm/misc,
  romfs, relay, uprobes, squashfs, mm/cma, mm/pagealloc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, page_alloc: fix core hung in free_pcppages_bulk()
  mm: include CMA pages in lowmem_reserve at boot
  squashfs: avoid bio_alloc() failure with 1Mbyte blocks
  uprobes: __replace_page() avoid BUG in munlock_vma_page()
  kernel/relay.c: fix memleak on destroy relay channel
  romfs: fix uninitialized memory leak in romfs_dev_read()
  mm/rodata_test.c: fix missing function declaration
  mm/vunmap: add cond_resched() in vunmap_pmd_range
  khugepaged: adjust VM_BUG_ON_MM() in __khugepaged_enter()
  hugetlb_cgroup: convert comma to semicolon
  mailmap: add Andi Kleen
2020-08-21 14:44:48 -07:00
David S. Miller
4af7b32f84 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2020-08-21

The following pull-request contains BPF updates for your *net* tree.

We've added 11 non-merge commits during the last 5 day(s) which contain
a total of 12 files changed, 78 insertions(+), 24 deletions(-).

The main changes are:

1) three fixes in BPF task iterator logic, from Yonghong.

2) fix for compressed dwarf sections in vmlinux, from Jiri.

3) fix xdp attach regression, from Andrii.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-21 12:54:50 -07:00
Hugh Dickins
c17c3dc9d0 uprobes: __replace_page() avoid BUG in munlock_vma_page()
syzbot crashed on the VM_BUG_ON_PAGE(PageTail) in munlock_vma_page(), when
called from uprobes __replace_page().  Which of many ways to fix it?
Settled on not calling when PageCompound (since Head and Tail are equals
in this context, PageCompound the usual check in uprobes.c, and the prior
use of FOLL_SPLIT_PMD will have cleared PageMlocked already).

Fixes: 5a52c9df62 ("uprobe: use FOLL_SPLIT_PMD instead of FOLL_SPLIT")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[5.4+]
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2008161338360.20413@eggly.anvils
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21 09:52:53 -07:00
Wei Yongjun
71e843295c kernel/relay.c: fix memleak on destroy relay channel
kmemleak report memory leak as follows:

  unreferenced object 0x607ee4e5f948 (size 8):
  comm "syz-executor.1", pid 2098, jiffies 4295031601 (age 288.468s)
  hex dump (first 8 bytes):
  00 00 00 00 00 00 00 00 ........
  backtrace:
     relay_open kernel/relay.c:583 [inline]
     relay_open+0xb6/0x970 kernel/relay.c:563
     do_blk_trace_setup+0x4a8/0xb20 kernel/trace/blktrace.c:557
     __blk_trace_setup+0xb6/0x150 kernel/trace/blktrace.c:597
     blk_trace_ioctl+0x146/0x280 kernel/trace/blktrace.c:738
     blkdev_ioctl+0xb2/0x6a0 block/ioctl.c:613
     block_ioctl+0xe5/0x120 fs/block_dev.c:1871
     vfs_ioctl fs/ioctl.c:48 [inline]
     __do_sys_ioctl fs/ioctl.c:753 [inline]
     __se_sys_ioctl fs/ioctl.c:739 [inline]
     __x64_sys_ioctl+0x170/0x1ce fs/ioctl.c:739
     do_syscall_64+0x33/0x40 arch/x86/entry/common.c:46
     entry_SYSCALL_64_after_hwframe+0x44/0xa9

'chan->buf' is malloced in relay_open() by alloc_percpu() but not free
while destroy the relay channel.  Fix it by adding free_percpu() before
return from relay_destroy_channel().

Fixes: 017c59c042 ("relay: Use per CPU constructs for the relay channel buffer pointers")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: David Rientjes <rientjes@google.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Akash Goel <akash.goel@intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200817122826.48518-1-weiyongjun1@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-21 09:52:53 -07:00
Thomas Gleixner
d88d59b64c core/entry: Respect syscall number rewrites
The transcript of the x86 entry code to the generic version failed to
reload the syscall number from ptregs after ptrace and seccomp have run,
which both can modify the syscall number in ptregs. It returns the original
syscall number instead which is obviously not the right thing to do.

Reload the syscall number to fix that.

Fixes: 142781e108 ("entry: Provide generic syscall entry functionality")
Reported-by: Kyle Huey <me@kylehuey.com> 
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Huey <me@kylehuey.com> 
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/87blj6ifo8.fsf@nanos.tec.linutronix.de
2020-08-21 16:17:29 +02:00
Linus Torvalds
d271b51c60 Merge tag 'dma-mapping-5.9-1' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
 "Fix more fallout from the dma-pool changes (Nicolas Saenz Julienne,
  me)"

* tag 'dma-mapping-5.9-1' of git://git.infradead.org/users/hch/dma-mapping:
  dma-pool: Only allocate from CMA when in same memory zone
  dma-pool: fix coherent pool allocations for IOMMU mappings
2020-08-20 10:48:17 -07:00
Valentin Schneider
3a6712c768 sched/topology: Mark SD_PREFER_SIBLING as SDF_NEEDS_GROUPS
SD_PREFER_SIBLING is currently considered in sd_parent_degenerate() but not
in sd_degenerate(). It too hinges on load balancing, and thus won't have
any effect when set on a domain with a single group. Add it to
SD_DEGENERATE_GROUPS_MASK.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200817113003.20802-12-valentin.schneider@arm.com
2020-08-19 10:49:49 +02:00
Valentin Schneider
c200191d4c sched/topology: Propagate SD_ASYM_CPUCAPACITY upwards
We currently set this flag *only* on domains whose topology level exactly
match the level where we detect asymmetry (as returned by
asym_cpu_capacity_level()). This is rather problematic.

Say there are two clusters in the system, one with a lone big CPU and the
other with a mix of big and LITTLE CPUs (as is allowed by DynamIQ):

  DIE [                ]
  MC  [             ][ ]
       0   1   2   3  4
       L   L   B   B  B

asym_cpu_capacity_level() will figure out that the MC level is the one
where all CPUs can see a CPU of max capacity, and we will thus set
SD_ASYM_CPUCAPACITY at MC level for all CPUs.

That lone big CPU will degenerate its MC domain, since it would be alone in
there, and will end up with just a DIE domain. Since the flag was only set
at MC, this CPU ends up not seeing any SD with the flag set, which is
broken.

Rather than clearing dflags at every topology level, clear it before
entering the topology level loop. This will properly propagate upwards
flags that are set starting from a certain level.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200817113003.20802-11-valentin.schneider@arm.com
2020-08-19 10:49:49 +02:00
Valentin Schneider
ab65afb094 sched/topology: Remove SD_SERIALIZE degeneration special case
If there is only a single NUMA node in the system, the only NUMA topology
level that will be generated will be NODE (identity distance), which
doesn't have SD_SERIALIZE.

This means we don't need this special case in sd_parent_degenerate(), as
having the NODE level "naturally" covers it. Thus, remove it.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200817113003.20802-10-valentin.schneider@arm.com
2020-08-19 10:49:48 +02:00
Valentin Schneider
6f34981862 sched/topology: Use prebuilt SD flag degeneration mask
Leverage SD_DEGENERATE_GROUPS_MASK in sd_degenerate() and
sd_parent_degenerate().

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: https://lore.kernel.org/r/20200817113003.20802-9-valentin.schneider@arm.com
2020-08-19 10:49:48 +02:00