The -l/--lock-addr option implements per-lock-instance contention statistics
using LOCK_AGGR_ADDR. It displays the lock address and, optionally, the
symbol name if it exists.
$ sudo ./perf lock con -abl sleep 1
 contended   total wait     max wait     avg wait            address   symbol

         1     36.28 us     36.28 us     36.28 us   ffff92615d6448b8
         9     10.91 us      1.84 us      1.21 us   ffffffffbaed50c0   rcu_state
         1     10.49 us     10.49 us     10.49 us   ffff9262ac4f0c80
         8      4.68 us      1.67 us       585 ns   ffffffffbae07a40   jiffies_lock
         3      3.03 us      1.45 us      1.01 us   ffff9262277861e0
         1       924 ns       924 ns       924 ns   ffff926095ba9d20
         1       436 ns       436 ns       436 ns   ffff9260bfda4f60
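The gist of per-address aggregation is to key the contention map by the lock
address instead of by the caller's stack id. Below is a minimal sketch of that
idea only; the enum values, variable and function names are illustrative and
not the actual perf BPF source:

  /* Sketch only: per-lock-instance aggregation keys the map by address. */
  #include <linux/types.h>

  enum lock_aggr_mode {
          LOCK_AGGR_ADDR = 0,     /* one entry per lock instance (-l)      */
          LOCK_AGGR_CALLER,       /* one entry per calling stack (default) */
  };

  /* chosen by the tool before the BPF program is loaded */
  static enum lock_aggr_mode aggr_mode = LOCK_AGGR_ADDR;

  /*
   * With -l/--lock-addr the lock address itself becomes the hash key, so
   * all waiters of the same lock instance fold into one entry; user space
   * then tries to resolve the address to a symbol name for the output.
   */
  static __u64 contention_key(__u64 lock_addr, __s32 stack_id)
  {
          if (aggr_mode == LOCK_AGGR_ADDR)
                  return lock_addr;

          return (__u64)stack_id;
  }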
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
MSan reported a use-of-uninitialized-value warning for the struct
lock_contention_data in lock_contention_read(). While it would be filled
by bpf_map_lookup_elem(), let's just initialize it to silence the warning.
  ==12524==WARNING: MemorySanitizer: use-of-uninitialized-value
      #0 0x562b0f16b1cd in lock_contention_read util/bpf_lock_contention.c:139:7
      #1 0x562b0ef65ec6 in __cmd_contention builtin-lock.c:1737:3
      #2 0x562b0ef65ec6 in cmd_lock builtin-lock.c:1992:8
      #3 0x562b0ee7f50b in run_builtin perf.c:322:11
      #4 0x562b0ee7efc1 in handle_internal_command perf.c:376:8
      #5 0x562b0ee7e1e9 in run_argv perf.c:420:2
      #6 0x562b0ee7e1e9 in main perf.c:550:3
      #7 0x7f065f10e632 in __libc_start_main (/usr/lib64/libc.so.6+0x61632)
      #8 0x562b0edf2fa9 in _start (perf+0xfa9)

  SUMMARY: MemorySanitizer: use-of-uninitialized-value (perf+0xe15160) in lock_contention_read
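A minimal sketch of the kind of change; the struct layout and the read loop
below are illustrative, not the actual util/bpf_lock_contention.c code. The
point is only that the stack variable starts from zeroes:

  /* Sketch only: zero-initialize the buffer that bpf_map_lookup_elem()
   * fills, so MSan never sees stack garbage being read on error paths. */
  #include <bpf/bpf.h>
  #include <linux/types.h>

  struct lock_contention_data {           /* illustrative layout */
          __u64 total_time;
          __u64 min_time;
          __u64 max_time;
          __u32 count;
          __u32 flags;
  };

  static void read_contention_map(int map_fd)
  {
          __u64 key = 0, *prev_key = NULL;
          struct lock_contention_data data = {};  /* the fix: start zeroed */

          while (!bpf_map_get_next_key(map_fd, prev_key, &key)) {
                  if (bpf_map_lookup_elem(map_fd, &key, &data) == 0) {
                          /* ... copy data.count, data.total_time, ... */
                  }
                  prev_key = &key;
          }
  }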
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221028180128.3311491-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently it collects stack traces up to the max size and then skips entries,
because we have no control over how the perf callchains are skipped. But BPF
can do the skipping itself via bpf_get_stackid() with a flag.
Say we have max-stack=4 and stack-skip=2; then we get these stack traces:
  Before:                   After:
   .---> +---+ <--.         .---> +---+ <--.
   |     |   |    |         |     |   |    |
   |     +---+  usable      |     +---+    |
  max    |   |    |        max    |   |    |
 stack   +---+ <--'       stack   +---+  usable
   |     | X |              |     |   |    |
   |     +---+  skip        |     +---+    |
   |     | X |              |     |   |    |
   `---> +---+              `---> +---+ <--'  <=== collection
                                  | X |
                                  +---+  skip
                                  | X |
                                  +---+
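For reference, bpf_get_stackid() encodes the number of frames to skip in the
low 8 bits of its flags argument. A minimal sketch of using that; the map,
section and variable names here are illustrative, not the actual perf BPF
skeleton:

  /* Sketch only: let the kernel drop the leading frames for us. */
  #include <linux/bpf.h>
  #include <linux/types.h>
  #include <bpf/bpf_helpers.h>

  #define MAX_STACK_DEPTH 8

  struct {
          __uint(type, BPF_MAP_TYPE_STACK_TRACE);
          __uint(key_size, sizeof(__u32));
          __uint(value_size, MAX_STACK_DEPTH * sizeof(__u64));
          __uint(max_entries, 16384);
  } stacks SEC(".maps");

  /* frames coming from BPF, the tracepoint and the lock slowpath */
  static const int stack_skip = 2;

  SEC("tracepoint/lock/contention_begin")
  int contention_begin(void *ctx)
  {
          /*
           * The low 8 bits of the flags tell bpf_get_stackid() how many
           * frames to skip, so the skipped entries no longer eat into
           * the MAX_STACK_DEPTH collected entries.
           */
          long stack_id = bpf_get_stackid(ctx, &stacks,
                                          BPF_F_FAST_STACK_CMP | stack_skip);

          if (stack_id < 0)
                  return 0;

          /* ... account the contention keyed by stack_id ... */
          return 0;
  }

  char LICENSE[] SEC("license") = "Dual BSD/GPL";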
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220912055314.744552-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It needs stack traces to find the callers of locks. To minimize the
performance overhead, it only collects up to 8 entries for each stack trace.
And it skips the first 3 entries because they come from BPF, tracepoint and
lock functions, which most users are not interested in.
But it turned out that those numbers differ in some configurations, and
using fixed numbers can result in meaningless caller names. Let's make them
adjustable with the --stack-depth and --stack-skip options.
On my setup, the default output is like below:
# perf lock con -ab -F contended,wait_total sleep 3
 contended   total wait         type   caller

        28      4.55 ms     rwlock:W   __bpf_trace_contention_begin+0xb
        33      1.67 ms     rwlock:W   __bpf_trace_contention_begin+0xb
        12    580.28 us     spinlock   __bpf_trace_contention_begin+0xb
        60    240.54 us      rwsem:R   __bpf_trace_contention_begin+0xb
        27     64.45 us     spinlock   __bpf_trace_contention_begin+0xb
If I change the stack skip to 5, the result will be like:
# perf lock con -ab -F contended,wait_total --stack-skip 5 sleep 3
 contended   total wait         type   caller

        32    715.45 us     spinlock   folio_lruvec_lock_irqsave+0x61
        26    550.22 us     spinlock   folio_lruvec_lock_irqsave+0x61
        15    486.93 us      rwsem:R   mmap_read_lock+0x13
        12    139.66 us      rwsem:W   vm_mmap_pgoff+0x93
         1      7.04 us     spinlock   tick_do_update_jiffies64+0x25
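A minimal sketch of how the two knobs could be wired up with perf's option
parser; the variable names and help strings are illustrative, not the actual
builtin-lock.c code:

  /* Sketch only: expose the formerly hard-coded depth/skip values. */
  #include <subcmd/parse-options.h>

  static int max_stack_depth = 8; /* entries collected per stack trace  */
  static int stack_skip = 3;      /* leading BPF/tracepoint/lock frames */

  static const struct option contention_options[] = {
          OPT_INTEGER(0, "stack-depth", &max_stack_depth,
                      "maximum stack depth to collect for each contention"),
          OPT_INTEGER(0, "stack-skip", &stack_skip,
                      "number of leading stack entries to skip"),
          OPT_END()
  };

The parsed values would then be handed down to the BPF side, e.g. sizing the
stack map and providing the skip count passed to bpf_get_stackid() as in the
sketch after the diagram above.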
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220912055314.744552-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently it shows a caller function for each entry, but users sometimes need
to see the full call stack. Use the -v/--verbose option to do that.
# perf lock con -a -b -v sleep 3
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel data
Using /proc/kallsyms for symbols
 contended   total wait     max wait     avg wait         type   caller

         1     10.74 us     10.74 us     10.74 us     spinlock   __bpf_trace_contention_begin+0xb
                        0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                        0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                        0xffffffffbb8b8e75 bpf_trace_run2+0x35
                        0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb
                        0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5
                        0xffffffffbc1c26ff _raw_spin_lock+0x1f
                        0xffffffffbb841015 tick_do_update_jiffies64+0x25
                        0xffffffffbb8409ee tick_irq_enter+0x9e
         1      7.70 us      7.70 us      7.70 us     spinlock   __bpf_trace_contention_begin+0xb
                        0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                        0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                        0xffffffffbb8b8e75 bpf_trace_run2+0x35
                        0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb
                        0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5
                        0xffffffffbc1c26ff _raw_spin_lock+0x1f
                        0xffffffffbb7bc27e raw_spin_rq_lock_nested+0xe
                        0xffffffffbb7cef9c load_balance+0x66c
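A minimal sketch of how the per-entry call stack could be printed under -v;
the struct layout and the symbol-resolution helper are stand-ins, not the
actual perf lock_stat/symbol code:

  /* Sketch only: dump each saved stack entry below its aggregated line. */
  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  #define CONTENTION_STACK_DEPTH 8        /* assumed collection depth */

  struct contention_entry {               /* illustrative */
          uint64_t addrs[CONTENTION_STACK_DEPTH];
  };

  /* stub stand-in for perf's kernel symbol resolution (kallsyms etc.) */
  static const char *resolve_ksym(uint64_t ip, uint64_t *offset)
  {
          (void)ip;
          *offset = 0;
          return NULL;
  }

  static void print_callstack(const struct contention_entry *e)
  {
          for (int i = 0; i < CONTENTION_STACK_DEPTH; i++) {
                  uint64_t off, ip = e->addrs[i];
                  const char *sym;

                  if (!ip)                /* unused tail slots stay zero */
                          break;

                  sym = resolve_ksym(ip, &off);
                  if (sym)
                          printf("\t\t\t0x%016" PRIx64 " %s+0x%" PRIx64 "\n",
                                 ip, sym, off);
                  else
                          printf("\t\t\t0x%016" PRIx64 "\n", ip);
          }
  }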
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220912055314.744552-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the -b/--use-bpf option to use BPF to collect lock contention stats.
For simplicity, it currently runs system-wide and requires Ctrl-C to stop.
Upcoming changes will add the usual filtering.
$ sudo perf lock con -b
^C
 contended   total wait     max wait     avg wait         type   caller

        42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
        23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
         6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
         3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
         1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
         1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
         2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
         1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
         2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
         1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c
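The core of the BPF collector is small: a program on the lock:contention_begin
tracepoint stamps the current task, and one on lock:contention_end turns the
delta into the contended/total/max numbers above. A minimal, simplified sketch
follows; the map layouts, key choice and names are illustrative, not the
actual perf BPF skeleton:

  /* Sketch only: measure how long each task waits between the
   * contention_begin and contention_end tracepoints. */
  #include <linux/bpf.h>
  #include <linux/types.h>
  #include <bpf/bpf_helpers.h>

  struct contention_data {
          __u64 total_time;
          __u64 max_time;
          __u64 count;
  };

  struct {
          __uint(type, BPF_MAP_TYPE_HASH);
          __uint(max_entries, 16384);
          __type(key, __u32);             /* tid                        */
          __type(value, __u64);           /* timestamp at begin         */
  } tstamp SEC(".maps");

  struct {
          __uint(type, BPF_MAP_TYPE_HASH);
          __uint(max_entries, 16384);
          __type(key, __u64);             /* caller/stack id in reality */
          __type(value, struct contention_data);
  } lock_stat SEC(".maps");

  SEC("tracepoint/lock/contention_begin")
  int contention_begin(void *ctx)
  {
          __u32 tid = (__u32)bpf_get_current_pid_tgid();
          __u64 ts = bpf_ktime_get_ns();

          bpf_map_update_elem(&tstamp, &tid, &ts, BPF_ANY);
          return 0;
  }

  SEC("tracepoint/lock/contention_end")
  int contention_end(void *ctx)
  {
          __u32 tid = (__u32)bpf_get_current_pid_tgid();
          __u64 key = 0;                  /* simplified: single bucket  */
          __u64 *start, delta;
          struct contention_data *data, zero = {};

          start = bpf_map_lookup_elem(&tstamp, &tid);
          if (!start)
                  return 0;

          delta = bpf_ktime_get_ns() - *start;
          bpf_map_delete_elem(&tstamp, &tid);

          data = bpf_map_lookup_elem(&lock_stat, &key);
          if (!data) {
                  bpf_map_update_elem(&lock_stat, &key, &zero, BPF_NOEXIST);
                  data = bpf_map_lookup_elem(&lock_stat, &key);
                  if (!data)
                          return 0;
          }

          /* the real collector updates these atomically */
          data->total_time += delta;
          data->count++;
          if (delta > data->max_time)
                  data->max_time = delta;

          return 0;
  }

  char LICENSE[] SEC("license") = "Dual BSD/GPL";

User space then walks the stats map (much like the read loop sketched earlier)
and sorts the entries to produce the table above.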
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>