Commit Graph

11 Commits

Author SHA1 Message Date
Namhyung Kim
5e3febe7b7 perf lock contention: Support lock addr/name filtering for BPF
Likewise, add addr_filter BPF hash map and check it with the lock
address.

  $ sudo ./perf lock con -ab -L tasklist_lock -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.169 [sec]
   contended  total wait  max wait  avg wait      type  caller

          18   174.09 us  25.31 us   9.67 us  rwlock:W  do_exit+0x36d
           5    32.34 us  10.87 us   6.47 us  rwlock:R  do_wait+0x8b
           4    15.41 us   4.73 us   3.85 us  rwlock:W  release_task+0x6e

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
529772c4df perf lock contention: Support lock type filtering for BPF
Likewise, add type_filter BPF hash map and check it when user gave a
lock type filter.

  $ sudo ./perf lock con -ab -Y rwlock -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.203 [sec]
   contended  total wait  max wait  avg wait       type  caller

          15   156.19 us  19.45 us  10.41 us   rwlock:W  do_exit+0x36d
           1    11.12 us  11.12 us  11.12 us   rwlock:R  do_wait+0x8b
           1     5.09 us   5.09 us   5.09 us   rwlock:W  release_task+0x6e

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221219201732.460111-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-21 14:52:39 -03:00
Namhyung Kim
688d2e8de2 perf lock contention: Add -l/--lock-addr option
The -l/--lock-addr option is to implement per-lock-instance contention
stat using LOCK_AGGR_ADDR.  It displays lock address and optionally
symbol name if exists.

  $ sudo ./perf lock con -abl sleep 1
   contended   total wait     max wait     avg wait            address   symbol

           1     36.28 us     36.28 us     36.28 us   ffff92615d6448b8
           9     10.91 us      1.84 us      1.21 us   ffffffffbaed50c0   rcu_state
           1     10.49 us     10.49 us     10.49 us   ffff9262ac4f0c80
           8      4.68 us      1.67 us       585 ns   ffffffffbae07a40   jiffies_lock
           3      3.03 us      1.45 us      1.01 us   ffff9262277861e0
           1       924 ns       924 ns       924 ns   ffff926095ba9d20
           1       436 ns       436 ns       436 ns   ffff9260bfda4f60

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
eca949b2b4 perf lock contention: Implement -t/--threads option for BPF
The BPF didn't show the per-thread stat properly.  Use task's thread id (PID)
as a key instead of stack_id and add a task_data map to save task comm names.

  $ sudo ./perf lock con -abt -E 5 sleep 1
   contended   total wait     max wait     avg wait          pid   comm

           1    740.66 ms    740.66 ms    740.66 ms         1950   nv_queue
           3    305.50 ms    298.19 ms    101.83 ms         1884   nvidia-modeset/
           1     25.14 us     25.14 us     25.14 us      2725038   EventManager_De
          12     23.09 us      9.30 us      1.92 us            0   swapper
           1     20.18 us     20.18 us     20.18 us      2725033   EventManager_De

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
fd507d3e35 perf lock contention: Add lock_data.h for common data
Accessing BPF maps should use the same data types.  Add bpf_skel/lock_data.h
to define the common data structures.  No functional changes.

Committer notes:

Fixed contention_key.stack_id missing rename to contention_key.stack_or_task_id.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221209190727.759804-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:24:31 -03:00
Namhyung Kim
c66a36af7b perf lock contention: Do not use BPF task local storage
It caused some troubles when a lock inside kmalloc is contended
because task local storage would allocate memory using kmalloc.
It'd create a recusion and even crash in my system.

There could be a couple of workarounds but I think the simplest
one is to use a pre-allocated hash map.  We could fix the task
local storage to use the safe BPF allocator, but it takes time
so let's change this until it happens actually.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chris Li <chriscli@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221118190109.1512674-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23 10:42:28 -03:00
Namhyung Kim
433b31fa00 perf lock contention: Fix a build error on 32-bit
It was reported that it failed to build the BPF lock contention skeleton
on 32 bit arch due to the size of long.  The lost count is used only for
reporting errors due to lack of stackmap space through bad_hist which
type is 'int'.  Let's use int type then.

Fixes: 6d499a6b3d ("perf lock: Print the number of lost entries for BPF")
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20220926215638.3931222-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-06 08:03:51 -03:00
Namhyung Kim
c1da8dd5c1 perf lock contention: Skip stack trace from BPF
Currently it collects stack traces to max size then skip entries.
Because we don't have control how to skip perf callchains.  But BPF can
do it with bpf_get_stackid() with a flag.

Say we have max-stack=4 and stack-skip=2, we get these stack traces.

Before:                    After:

     .---> +---+ <--.           .---> +---+ <--.
     |     |   |    |           |     |   |    |
     |     +---+  usable        |     +---+    |
    max    |   |    |          max    |   |    |
   stack   +---+ <--'         stack   +---+  usable
     |     | X |                |     |   |    |
     |     +---+   skip         |     +---+    |
     |     | X |                |     |   |    |
     `---> +---+                `---> +---+ <--'   <=== collection
                                      | X |
                                      +---+   skip
                                      | X |
                                      +---+

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220912055314.744552-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04 08:55:22 -03:00
Namhyung Kim
6d499a6b3d perf lock: Print the number of lost entries for BPF
Like the normal 'perf lock contention' output, it'd print the number of
lost entries for BPF if exists or -v option is passed.

Currently it uses BROKEN_CONTENDED stat for the lost count (due to full
stack maps).

  $ sudo perf lock con -a -b --map-nr-entries 128 sleep 5
  ...
  === output for debug===

  bad: 43, total: 14903
  bad rate: 0.29 %
  histogram of events caused bad sequence
      acquire: 0
     acquired: 0
    contended: 43
      release: 0

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220802191004.347740-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 18:03:31 -03:00
Namhyung Kim
6fda2405f4 perf lock: Implement cpu and task filters for BPF
Add -a/--all-cpus and -C/--cpu options for cpu filtering.  Also -p/--pid
and --tid options are added for task filtering.  The short -t option is
taken for --threads already.  Tracking the command line workload is
possible as well.

  $ sudo perf lock contention -a -b sleep 1

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:28:51 -03:00
Namhyung Kim
407b36f69e perf lock: Use BPF for lock contention analysis
Add -b/--use-bpf option to use BPF to collect lock contention stats.
For simplicity it now runs system-wide and requires C-c to stop.
Upcoming changes will add the usual filtering.

  $ sudo perf lock con -b
  ^C
   contended   total wait     max wait     avg wait         type   caller

          42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
          23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
           6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
           3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
           1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
           1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
           2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
           1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
           2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
           1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Blake Jones <blakejones@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220729200756.666106-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-01 09:28:38 -03:00