Commit Graph

16327 Commits

Author SHA1 Message Date
Ian Rogers
faa3591640 perf vendor events: Add goldmont counter information
Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
475892a969
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-11-irogers@google.com
2024-06-20 16:53:31 -07:00
Ian Rogers
40ccd6aa3e perf vendor events: Add/update emeraldrapids events/metrics
Update events from v1.06 to v1.09.
Add TMA metrics v4.8.

Bring in the event updates v1.09:
3fd5892bb4
v1.08:
54525c4508

The TMA 4.8 information was added in:
59194d4d90

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SW_PREFETCH_ACCESS.ANY,
UNC_IIO_BANDWIDTH_OUT.PART[0-7]_FREERUN,
UOPS_ISSUED.CYCLES.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-10-irogers@google.com
2024-06-20 16:53:22 -07:00
Ian Rogers
1e56e9191f perf vendor events: Update elkhartlake events
Update events from v1.04 to v1.05. Bring in event updates from:
fb91e1851c

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
475892a969
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-9-irogers@google.com
2024-06-20 16:53:15 -07:00
Ian Rogers
4cc4994244 perf vendor events: Update cascadelakex events/metrics
Update events from v1.21 to v1.22.

Bring in the event updates v1.22
013877729c

The TMA 4.8 information was updated in:
59194d4d90

New events are:
SW_PREFETCH_ACCESS.ANY

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-8-irogers@google.com
2024-06-20 16:53:06 -07:00
Ian Rogers
87835d9f85 perf vendor events: Update broadwellx metrics add event counter information
Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
475892a969
and later patches.

The TMA 4.8 information was updated in:
59194d4d90

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-7-irogers@google.com
2024-06-20 16:52:49 -07:00
Ian Rogers
6a8ec0b65e perf vendor events: Update broadwellde metrics add event counter information
Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
475892a969
and later patches.

The TMA 4.8 information was updated in:
59194d4d90

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-6-irogers@google.com
2024-06-20 16:52:41 -07:00
Ian Rogers
39b8bd1635 perf vendor events: Update broadwell metrics add event counter information
Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
475892a969
and later patches.

The TMA 4.8 information was updated in:
59194d4d90

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-5-irogers@google.com
2024-06-20 16:52:34 -07:00
Ian Rogers
19121e877c perf vendor events: Add bonnell counter information
Add counter information necessary for optimizing event grouping the
perf tool.

The most recent RFC patch set using this information:
https://lore.kernel.org/lkml/20240412210756.309828-1-weilin.wang@intel.com/

The information was added in:
475892a969
and later patches.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-4-irogers@google.com
2024-06-20 16:52:24 -07:00
Ian Rogers
72da747ddd perf vendor events: Update alderlaken events/metrics
Update events from v1.24 to v1.27.
Update e-core TMA metrics to v3.6.

Bring in the event updates v1.27:
ea4f309a04
v1.26:
0052e68d24

The e-core TMA 3.6 information was updated in:
d9c2faa70b

New events are:
MEM_UOPS_RETIRED.LOCK_LOADS,
SERIALIZATION.C01_MS_SCB,
UOPS_ISSUED.ANY.

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-3-irogers@google.com
2024-06-20 16:52:15 -07:00
Ian Rogers
17d4b1922c perf vendor events: Update alderlake events/metrics
Update events from v1.24 to v1.27.
Update p-core TMA metrics from v4.7 to v4.8, and the e-core TMA
metrics to v3.6.

Bring in the event updates v1.27:
ea4f309a04
v1.26:
0052e68d24

The p-core TMA 4.8 information was updated in:
59194d4d90
And e-core in:
d9c2faa70b

New events are:
EXE_ACTIVITY.2_3_PORTS_UTIL,
ICACHE_DATA.STALL_PERIODS,
L2_TRANS.L2_WB,
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_1024,
MEM_UOPS_RETIRED.LOCK_LOADS,
OFFCORE_REQUESTS.DEMAND_CODE_RD,
OFFCORE_REQUESTS.DEMAND_RFO,
OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD,
OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD,
RS.EMPTY_RESOURCE,
SERIALIZATION.C01_MS_SCB,
SW_PREFETCH_ACCESS.ANY,
UOPS_ISSUED.ANY,
UOPS_ISSUED.CYCLES

Co-authored-by: Weilin Wang <weilin.wang@intel.com>
Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620181752.3945845-2-irogers@google.com
2024-06-20 16:52:00 -07:00
Ravi Bangoria
b739759c4e perf doc: Add AMD IBS usage document
Add a perf man page document that describes how to exploit AMD IBS with
Linux perf. Brief intro about IBS and simple one-liner examples will help
naive users to get started. This is not meant to be an exhaustive IBS
guide. User should refer latest AMD64 Architecture Programmer's Manual
for detailed description of IBS.

Usage:

  $ man perf-amd-ibs

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: ananth.narayan@amd.com
Cc: sandipan.das@amd.com
Cc: santosh.shukla@amd.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240620054104.815-1-ravi.bangoria@amd.com
2024-06-20 16:51:55 -07:00
Athira Rajeev
90d32e9201 tools/perf: Handle perftool-testsuite_probe testcases fail when kernel debuginfo is not present
Running "perftool-testsuite_probe" fails as below:

	./perf test -v "perftool-testsuite_probe"
	83: perftool-testsuite_probe  : FAILED

There are three fails:

1. Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
   -- [ FAIL ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf probe -l (output regexp parsing)

2. Regexp not found: "probe:vfs_mknod"
   Regexp not found: "probe:vfs_create"
   Regexp not found: "probe:vfs_rmdir"
   Regexp not found: "probe:vfs_link"
   Regexp not found: "probe:vfs_write"
   -- [ FAIL ] -- perf_probe :: test_adding_kernel :: wildcard adding support (command exitcode + output regexp parsing)

3. Regexp not found: "Failed to find"
   Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64"
   Regexp not found: "in this function|at this address"
   Line did not match any pattern: "The /boot/vmlinux file has no debug information."
   Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package."

These three tests depends on kernel debug info.
1. Fail 1 expects file name along with probe which needs debuginfo
2. Fail 2 :
    perf probe -nf --max-probes=512 -a 'vfs_* $params'
    Debuginfo-analysis is not supported.
     Error: Failed to add events.

3. Fail 3 :
   perf probe 'vfs_read somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64'
   Debuginfo-analysis is not supported.
   Error: Failed to add events.

There is already helper function skip_if_no_debuginfo in
lib/probe_vfs_getname.sh which does perf probe and returns
"2" if debug info is not present. Use the skip_if_no_debuginfo
function and skip only the three tests which needs debuginfo
based on the result.

With the patch:

    83: perftool-testsuite_probe:
   --- start ---
   test child forked, pid 3927
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission ::
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: -a
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: adding probe inode_permission :: --add
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: listing added probe :: perf list
   Regexp not found: "\s*probe:inode_permission(?:_\d+)?\s+\(on inode_permission(?:[:\+][0-9A-Fa-f]+)?@.+\)"
   -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: using added probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: deleting added probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: listing removed probe (should NOT be listed)
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: dry run :: adding probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: first probe adding
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (without force)
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: force-adding probes :: second probe adding (with force)
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: using doubled probe
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: removing multiple probes
   Regexp not found: "probe:vfs_mknod"
   Regexp not found: "probe:vfs_create"
   Regexp not found: "probe:vfs_rmdir"
   Regexp not found: "probe:vfs_link"
   Regexp not found: "probe:vfs_write"
   -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
   Regexp not found: "Failed to find"
   Regexp not found: "somenonexistingrandomstuffwhichisalsoprettylongorevenlongertoexceed64"
   Regexp not found: "in this function|at this address"
   Line did not match any pattern: "The /boot/vmlinux file has no debug information."
   Line did not match any pattern: "Rebuild with CONFIG_DEBUG_INFO=y, or install an appropriate debuginfo package."
   -- [ SKIP ] -- perf_probe :: test_adding_kernel :: 2 2 Skipped due to missing debuginfo :: testcase skipped
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: add
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: function with retval :: record
   -- [ PASS ] -- perf_probe :: test_adding_kernel :: function argument probing :: script
   ## [ PASS ] ## perf_probe :: test_adding_kernel SUMMARY
   ---- end(0) ----
   83: perftool-testsuite_probe                                        : Ok

Only the three specific tests are skipped and remaining
ran successfully.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: disgoel@linux.vnet.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240617122121.7484-1-atrajeev@linux.vnet.ibm.com
2024-06-20 10:05:04 -07:00
Namhyung Kim
eae7044b67 perf hist: Honor symbol_conf.skip_empty
So that it can skip events with no sample according to the config value.
This can omit the dummy event in the output of perf report --group.

An example output:

  $ sudo perf mem record -a sleep 1
  $ sudo perf report --group

Before)
  #
  # Samples: 232  of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P, dummy:u'
  # Event count (approx.): 3089861
  #
  #                 Overhead  Command      Shared Object      Symbol
  # ........................  ...........  .................  .....................................
  #
       9.29%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] update_blocked_averages
       5.26%   0.15%   0.00%  swapper      [kernel.kallsyms]  [k] __update_load_avg_se
       4.15%   0.00%   0.00%  perf-exec    [kernel.kallsyms]  [k] slab_update_freelist.isra.0
       3.87%   0.00%   0.00%  perf-exec    [kernel.kallsyms]  [k] memcg_slab_post_alloc_hook
       3.79%   0.17%   0.00%  swapper      [kernel.kallsyms]  [k] enqueue_task_fair
       3.63%   0.00%   0.00%  sleep        [kernel.kallsyms]  [k] next_uptodate_page
       2.86%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] __update_load_avg_cfs_rq
       2.78%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] __schedule
       2.34%   0.00%   0.00%  swapper      [kernel.kallsyms]  [k] intel_idle
       2.32%   0.97%   0.00%  swapper      [kernel.kallsyms]  [k] psi_group_change

After)
  #
  # Samples: 232  of events 'cpu/mem-loads,ldlat=30/P, cpu/mem-stores/P'
  # Event count (approx.): 3089861
  #
  #         Overhead  Command      Shared Object      Symbol
  # ................  ...........  .................  .....................................
  #
       9.29%   0.00%  swapper      [kernel.kallsyms]  [k] update_blocked_averages
       5.26%   0.15%  swapper      [kernel.kallsyms]  [k] __update_load_avg_se
       4.15%   0.00%  perf-exec    [kernel.kallsyms]  [k] slab_update_freelist.isra.0
       3.87%   0.00%  perf-exec    [kernel.kallsyms]  [k] memcg_slab_post_alloc_hook
       3.79%   0.17%  swapper      [kernel.kallsyms]  [k] enqueue_task_fair
       3.63%   0.00%  sleep        [kernel.kallsyms]  [k] next_uptodate_page
       2.86%   0.00%  swapper      [kernel.kallsyms]  [k] __update_load_avg_cfs_rq
       2.78%   0.00%  swapper      [kernel.kallsyms]  [k] __schedule
       2.34%   0.00%  swapper      [kernel.kallsyms]  [k] intel_idle
       2.32%   0.97%  swapper      [kernel.kallsyms]  [k] psi_group_change

Now it doesn't have a column for the dummy event.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-5-namhyung@kernel.org
2024-06-15 21:04:04 -07:00
Namhyung Kim
411ee13598 perf hist: Add symbol_conf.skip_empty
Add the skip_empty flag to symbol_conf and set the value from the report
command to preserve the existing behavior.  This makes the code simpler
and will be needed other code which is hard to add a new argument.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-4-namhyung@kernel.org
2024-06-15 21:04:04 -07:00
Namhyung Kim
8f6071a3dc perf hist: Simplify __hpp_fmt() using hpp_fmt_data
The struct hpp_fmt_data is to keep the values for each group members so
it doesn't need to check the event index in the group.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-3-namhyung@kernel.org
2024-06-15 21:04:04 -07:00
Namhyung Kim
cc2621cecd perf hist: Factor out __hpp__fmt_print()
Split the logic to print the histogram values according to the format
string.  This was used in 3 different places so it's better to move out
the logic into a function.

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607202918.2357459-2-namhyung@kernel.org
2024-06-15 21:04:04 -07:00
Fernand Sieber
231295a186 perf: sched map skips redundant lines with cpu filters
perf sched map supports cpu filter.
However, even with cpu filters active, any context switch currently
corresponds to a separate line.
As result, context switches on irrelevant cpus result to redundant lines,
which makes the output particlularly difficult to read on wide
architectures.

Fix it by skipping printing for irrelevant CPUs.

Example snippet of output before fix:

  *B0       1.461147 secs
   B0
   B0
   B0
  *G0       1.517139 secs

After fix:

  *B0       1.461147 secs
  *G0       1.517139 secs

Signed-off-by: Fernand Sieber <sieberf@amazon.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-and-tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240614073517.94974-1-sieberf@amazon.com
2024-06-15 21:03:50 -07:00
Ian Rogers
65b37df8c6 perf test pmu: Warn don't fail for legacy mixed case event names
PowerPC has mixed case events matching legacy hardware cache
events. Warn but don't fail in this case. Event parsing will still
work in this case by matching the legacy case.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240612124027.2712643-1-irogers@google.com
2024-06-13 21:27:49 -07:00
Athira Rajeev
245b0edf48 tools/perf: Fix timing issue with parallel threads in perf bench wake-up-parallel
perf bench futex fails as below and hangs intermittently when
attempted to run on on a powerpc system:

./perf bench futex wake-parallel
 Running 'futex/wake-parallel' benchmark:
 Run summary [PID 88588]: blocking on 640 threads (at [private] futex 0x10464b8c), 640 threads waking up 1 at a time.

[Run 1]: Avg per-thread latency (waking 1/640 threads) in 0.1309 ms (+-53.27%)
[Run 2]: Avg per-thread latency (waking 1/640 threads) in 0.0120 ms (+-31.16%)
[Run 3]: Avg per-thread latency (waking 1/640 threads) in 0.1474 ms (+-92.47%)
[Run 4]: Avg per-thread latency (waking 1/640 threads) in 0.2883 ms (+-67.75%)
[Run 5]: Avg per-thread latency (waking 1/640 threads) in 0.4108 ms (+-39.60%)
[Run 6]: Avg per-thread latency (waking 1/640 threads) in 0.7843 ms (+-78.98%)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)
perf: couldn't wakeup all tasks (0/1)

In the system, where perf bench wake-up-parallel is has system
configuration of 640 cpus. After debugging, this turned out to be
a timing issue. The benchmark creates threads equal to number of
cpus and issues a futex_wait. Then it does a usleep for .1 second
before initiating futex_wake. In system configuration with more
threads, the usleep time is not enough. Patch changes the usleep
from 100000 to 200000

With the patch, ran multiple iterations and there were no issues
further seen

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607044354.82225-3-atrajeev@linux.vnet.ibm.com
2024-06-13 21:27:49 -07:00
Athira Rajeev
3638e44542 tools/perf: Fix perf bench epoll to enable the run when some CPU's are offline
Perf bench epoll fails as below when attempted to run on
on a powerpc system:

   ./perf bench epoll wait
   Running 'epoll/wait' benchmark:
   Run summary [PID 627653]: 79 threads monitoring on 64 file-descriptors for 8 secs.

   perf: pthread_create: No such file or directory

In the setup where this perf bench was ran, difference was that
partition had 640 CPU's, but not all CPUs were online. 80 CPUs
were online. While creating threads and using epoll_wait , code
sets the affinity using cpumask. The cpumask size used is 80
which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
benchmark reports fail while setting affinity for cpu number which
is greater than 80 or higher, because it attempts to set a bit
position which is not allocated on the cpumask. Fix this by changing
the size of cpumask to number of possible cpus and not the number
of online cpus.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607044354.82225-2-atrajeev@linux.vnet.ibm.com
2024-06-13 21:27:26 -07:00
Athira Rajeev
1833735867 tools/perf: Fix perf bench futex to enable the run when some CPU's are offline
Perf bench futex fails as below when attempted to run on
on a powerpc system:

 ./perf bench futex all
 Running futex/hash benchmark...
Run summary [PID 626307]: 80 threads, each operating on 1024 [private] futexes for 10 secs.

perf: pthread_create: No such file or directory

In the setup where this perf bench was ran, difference was that
partition had 640 CPU's, but not all CPUs were online. 80 CPUs
were online. While blocking the threads with futex_wait, code
sets the affinity using cpumask. The cpumask size used is 80
which is picked from "nrcpus = perf_cpu_map__nr(cpu)". Here the
benchmark reports fail while setting affinity for cpu number which
is greater than 80 or higher, because it attempts to set a bit
position which is not allocated on the cpumask. Fix this by changing
the size of cpumask to number of possible cpus and not the number
of online cpus.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: maddy@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607044354.82225-1-atrajeev@linux.vnet.ibm.com
2024-06-13 21:26:58 -07:00
Ian Rogers
6c1785cd75 perf record: Ensure space for lost samples
Previous allocation didn't account for sample ID written after the
lost samples event. Switch from malloc/free to a stack allocation.

Reported-by: Milian Wolff <milian.wolff@kdab.com>
Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240611050626.1223155-1-irogers@google.com
2024-06-13 20:45:31 -07:00
Ian Rogers
6828d6929b perf evsel: Refactor tool events
Tool events unnecessarily open a dummy perf event which is useless
even with `perf record` which will still open a dummy event. Change
the behavior of tool events so:

 - duration_time - call `rdclock` on open and then report the count as
   a delta since the start in evsel__read_counter. This moves code out
   of builtin-stat making it more general purpose.

 - user_time/system_time - open the fd as either `/proc/pid/stat` or
   `/proc/stat` for cases like system wide. evsel__read_counter will
   read the appropriate field out of the procfs file. These values
   were previously supplied by wait4, if the procfs read fails then
   the wait4 values are used, assuming the process/thread terminated.
   By reading user_time and system_time this way, interval mode, per
   PID and per CPU can be supported although there are restrictions
   given what the files provide (e.g. per PID can't be combined with
   per CPU).

Opening any of the tool events for `perf record` is changed to return
invalid.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240503232849.17752-1-irogers@google.com
2024-06-10 16:45:10 -07:00
Thomas Richter
658a8805cb perf test: Speed up test case 70 annotate basic tests
On some s390 linux machine (mostly older models) and with debug
packages installed, the test case 'perf annotate basic tests' runs
for some longer time.
Speed up the test and save the output of command perf annotate
in a temporary file. This is used to perform pattern matching via
grep command. This saves on invocation of perf annotate which
runs for some time.

Output before:
 # time bash -x tests/shell/annotate.sh >/dev/null 2>&1; echo EXIT CODE $?

 real   4m35.543s
 user   3m19.442s
 sys    1m14.322s
 EXIT CODE 0
 #
Output after:
 # time bash -x tests/shell/annotate.sh >/dev/null 2>&1; echo EXIT CODE $?

 real   2m2.881s
 user   1m30.980s
 sys    0m30.684s
 EXIT CODE 0
 #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607054352.2774936-1-tmricht@linux.ibm.com
2024-06-07 13:06:44 -07:00
Ian Rogers
f5803651b4 perf stat: Choose the most disaggregate command line option
When multiple aggregation options are passed to perf stat the behavior
isn't clear. Consider "perf stat -A --per-socket .." and "perf stat
--per-socket -A ..", the first won't aggregate at all while the second
will do per-socket aggregation, even though the same options were
passed.

Rather than set an enum value, gather the options in a struct and
process them from most to least aggregate. This ensures the least
aggregate option always applies, so no aggregation if "-A" is passed.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605063828.195700-2-irogers@google.com
2024-06-07 13:00:03 -07:00
Ian Rogers
0dddd91ab6 perf stat: Make options local
Reduce the scope of stat_options to cmd_stat, and pass as an argument
to __cmd_record. This is done to make more localized changes to the
options in later patches. A side-effect of the change is to reduce the
size of a stripped PIE perf binary by 5952 bytes. The savings come
mainly in the dynamic relocation section.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605063828.195700-1-irogers@google.com
2024-06-07 12:59:53 -07:00
Ian Rogers
d2307fd4f9 perf maps: Add/use a sorted insert for fixup overlap and insert
Data may have lots of overlapping mmaps. The regular insert adds at
the end and relies on a later sort. For data with overlapping mappings
the sort will happen during a subsequent maps__find or
__maps__fixup_overlap_and_insert, there's never a period where the
inserted maps buffer up and a single sort happens. To avoid back to
back sorts, maintain the sort order when fixing up and
inserting. Previously the first_ending_after search was O(log n) where
n is the size of maps, and the insert was O(1) but because of the
continuous sorting was becoming O(n*log(n)). With maintaining sort
order, the insert now becomes O(n) for a memmove.

For a perf report on a perf.data file containing overlapping mappings
the time numbers are:

Before:
real    0m5.894s
user    0m5.650s
sys     0m0.231s

After:
real    0m0.675s
user    0m0.454s
sys     0m0.196s

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Steinar H . Gunderson <sesse@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521165109.708593-4-irogers@google.com
2024-06-06 23:31:30 -07:00
Ian Rogers
aeefb04393 perf maps: Reduce sorting for overlapping mappings
When an 'after' map is generated the 'new' map must be before it so
terminate iterating and don't resort. If the entry 'pos' is entirely
overlapped by the 'new' mapping then don't remove and insert the
mapping, just replace - again to remove sorting.

For a perf report on a perf.data file containing overlapping mappings
the time numbers are:

Before:
real    0m9.856s
user    0m9.637s
sys     0m0.204s

After:
real    0m5.894s
user    0m5.650s
sys     0m0.231s

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Steinar H . Gunderson <sesse@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521165109.708593-3-irogers@google.com
2024-06-06 23:31:15 -07:00
Ian Rogers
0b90dfda22 perf maps: Fix use after free in __maps__fixup_overlap_and_insert
In the case 'before' and 'after' are broken out from pos,
maps_by_address may be changed by __maps__insert, as such it needs
re-reading.

Don't ignore the return value from __maps_insert.

Fixes: 659ad3492b ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Steinar H . Gunderson <sesse@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521165109.708593-2-irogers@google.com
2024-06-06 23:30:58 -07:00
Lucas Stach
a9700511fd perf script: netdev-times: add location parameter to consume_skb
dd1b527831 ("net: add location to trace_consume_skb()") added a new
parameter to the consume_skb tracepoint. Adapt the script to match.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: kernel@pengutronix.de
Cc: patchwork-lst@pengutronix.de
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605144442.1985270-1-l.stach@pengutronix.de
2024-06-06 23:29:23 -07:00
Clément Le Goffic
9aa61d8ecb perf: parse-events: Fix compilation error while defining DEBUG_PARSER
Compiling perf tool with 'DEBUG_PARSER=1' leads to errors:

$> make -C tools/perf PARSER_DEBUG=1 NO_LIBTRACEEVENT=1
...
  CC      util/expr-flex.o
  CC      util/expr.o
util/parse-events.c:33:12: error: redundant redeclaration of ‘parse_events_debug’ [-Werror=redundant-decls]
   33 | extern int parse_events_debug;
      |            ^~~~~~~~~~~~~~~~~~
In file included from util/parse-events.c:18:
util/parse-events-bison.h:43:12: note: previous declaration of ‘parse_events_debug’ with type ‘int’
   43 | extern int parse_events_debug;
      |            ^~~~~~~~~~~~~~~~~~
util/expr.c:27:12: error: redundant redeclaration of ‘expr_debug’ [-Werror=redundant-decls]
   27 | extern int expr_debug;
      |            ^~~~~~~~~~
In file included from util/expr.c:11:
util/expr-bison.h:43:12: note: previous declaration of ‘expr_debug’ with type ‘int’
   43 | extern int expr_debug;
      |            ^~~~~~~~~~
cc-1: all warnings being treated as errors

Remove extern declaration from the parse-envents.c file as there is a
conflict with the ones generated using bison and yacc tools from the file
parse-events.[ly].

Signed-off-by: Clément Le Goffic <clement.legoffic@foss.st.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240605140453.614862-1-clement.legoffic@foss.st.com
2024-06-06 00:19:36 -07:00
Namhyung Kim
ca9680821d perf bpf: Fix handling of minimal vmlinux.h file when interrupting the build
Ingo reported that he was seeing these when hitting Control+C during a
perf tools build:

  Makefile.perf:1149: *** Missing bpftool input for generating vmlinux.h. Stop.

The failure happens when you don't have vmlinux.h or vmlinux with BTF.

ifeq ($(VMLINUX_H),)
  ifeq ($(VMLINUX_BTF),)
    $(error Missing bpftool input for generating vmlinux.h)
  endif
endif

VMLINUX_BTF can be empty if you didn't build a kernel or it doesn't have
a BTF section and the current kernel also has no BTF.  This is totally
ok.

But VMLINUX_H should be set to the minimal version in the source tree
(unless you overwrite it manually) when you don't pass GEN_VMLINUX_H=1
(which requires VMLINUX_BTF should not be empty).  The problem is that
it's defined in Makefile.config which is not included for `make clean`.

Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/CAM9d7ch5HTr+k+_GpbMrX0HUo5BZ11byh1xq0Two7B7RQACuNw@mail.gmail.com
Link: http://lore.kernel.org/lkml/ZjssGrj+abyC6mYP@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-05 11:33:00 -03:00
Arnaldo Carvalho de Melo
5b3cde1988 Revert "perf record: Reduce memory for recording PERF_RECORD_LOST_SAMPLES event"
This reverts commit 7d1405c71d.

This causes segfaults in some cases, as reported by Milian:

  ```
  sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e
  raw_syscalls:sys_enter ls
  ...
  [ perf record: Woken up 3 times to write data ]
  malloc(): invalid next size (unsorted)
  Aborted
  ```

  Backtrace with GDB + debuginfod:

  ```
  malloc(): invalid next size (unsorted)

  Thread 1 "perf" received signal SIGABRT, Aborted.
  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
  no_tid=no_tid@entry=0) at pthread_kill.c:44
  Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_kill.c
  44            return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO
  (ret) : 0;
  (gdb) bt
  #0  __pthread_kill_implementation (threadid=<optimized out>,
  signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
  #1  0x00007ffff6ea8eb3 in __pthread_kill_internal (threadid=<optimized out>,
  signo=6) at pthread_kill.c:78
  #2  0x00007ffff6e50a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/
  raise.c:26
  #3  0x00007ffff6e384c3 in __GI_abort () at abort.c:79
  #4  0x00007ffff6e39354 in __libc_message_impl (fmt=fmt@entry=0x7ffff6fc22ea
  "%s\n") at ../sysdeps/posix/libc_fatal.c:132
  #5  0x00007ffff6eb3085 in malloc_printerr (str=str@entry=0x7ffff6fc5850
  "malloc(): invalid next size (unsorted)") at malloc.c:5772
  #6  0x00007ffff6eb657c in _int_malloc (av=av@entry=0x7ffff6ff6ac0
  <main_arena>, bytes=bytes@entry=368) at malloc.c:4081
  #7  0x00007ffff6eb877e in __libc_calloc (n=<optimized out>,
  elem_size=<optimized out>) at malloc.c:3754
  #8  0x000055555569bdb6 in perf_session.do_write_header ()
  #9  0x00005555555a373a in __cmd_record.constprop.0 ()
  #10 0x00005555555a6846 in cmd_record ()
  #11 0x000055555564db7f in run_builtin ()
  #12 0x000055555558ed77 in main ()
  ```

  Valgrind memcheck:
  ```
  ==45136== Invalid write of size 8
  ==45136==    at 0x2B38A5: perf_event__synthesize_id_sample (in /usr/bin/perf)
  ==45136==    by 0x157069: __cmd_record.constprop.0 (in /usr/bin/perf)
  ==45136==    by 0x15A845: cmd_record (in /usr/bin/perf)
  ==45136==    by 0x201B7E: run_builtin (in /usr/bin/perf)
  ==45136==    by 0x142D76: main (in /usr/bin/perf)
  ==45136==  Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd
  ==45136==    at 0x4849BF3: calloc (vg_replace_malloc.c:1675)
  ==45136==    by 0x3574AB: zalloc (in /usr/bin/perf)
  ==45136==    by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf)
  ==45136==    by 0x15A845: cmd_record (in /usr/bin/perf)
  ==45136==    by 0x201B7E: run_builtin (in /usr/bin/perf)
  ==45136==    by 0x142D76: main (in /usr/bin/perf)
  ==45136==
  ==45136== Syscall param write(buf) points to unaddressable byte(s)
  ==45136==    at 0x575953D: __libc_write (write.c:26)
  ==45136==    by 0x575953D: write (write.c:24)
  ==45136==    by 0x35761F: ion (in /usr/bin/perf)
  ==45136==    by 0x357778: writen (in /usr/bin/perf)
  ==45136==    by 0x1548F7: record__write (in /usr/bin/perf)
  ==45136==    by 0x15708A: __cmd_record.constprop.0 (in /usr/bin/perf)
  ==45136==    by 0x15A845: cmd_record (in /usr/bin/perf)
  ==45136==    by 0x201B7E: run_builtin (in /usr/bin/perf)
  ==45136==    by 0x142D76: main (in /usr/bin/perf)
  ==45136==  Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd
  ==45136==    at 0x4849BF3: calloc (vg_replace_malloc.c:1675)
  ==45136==    by 0x3574AB: zalloc (in /usr/bin/perf)
  ==45136==    by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf)
  ==45136==    by 0x15A845: cmd_record (in /usr/bin/perf)
  ==45136==    by 0x201B7E: run_builtin (in /usr/bin/perf)
  ==45136==    by 0x142D76: main (in /usr/bin/perf)
  ==45136==
 -----

Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/
Reported-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org # 6.8+
Link: https://lore.kernel.org/lkml/Zl9ksOlHJHnKM70p@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-05 11:12:36 -03:00
Dr. David Alan Gilbert
0770ceaff2 perf hisi-ptt: remove unused struct 'hisi_ptt_queue'
'hisi_ptt_queue' has been unused since the original
commit 5e91e57e68 ("perf auxtrace arm64: Add support for parsing
HiSilicon PCIe Trace packet").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: yangyicong@hisilicon.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240602000709.213116-1-linux@treblig.org
2024-06-04 18:17:17 -07:00
Dr. David Alan Gilbert
f7abc0cfa8 perf genelf: remove unused struct 'options'
'options' has been unused since
commit fa7f7e7354 ("perf jit: Move test functionality in to a test").

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240602000505.213032-1-linux@treblig.org
2024-06-03 22:07:52 -07:00
Nick Forrington
f7d4485fce perf lock info: Display both map and thread by default
Change "perf lock info" argument handling to:

Display both map and thread info (rather than an error) when neither are
specified.

Display both map and thread info (rather than just thread info) when
both are requested.

Signed-off-by: Nick Forrington <nick.forrington@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240513091413.738537-2-nick.forrington@arm.com
2024-06-03 22:01:00 -07:00
Ian Rogers
af75201634 perf top: Allow filters on events
Allow filters to be added to perf top events. One use is to workaround
issues with:
```
$ perf top --uid="$(id -u)"
```
which tries to scan /proc find processes belonging to the uid and can
fail in such a pid terminates between the scan and the
perf_event_open reporting:
```
Error:
The sys_perf_event_open() syscall returned with 3 (No such process) for event (cycles:P).
/bin/dmesg | grep -i perf may provide additional information.
```
A similar filter:
```
$ perf top -e cycles:P --filter "uid == $(id -u)"
```
doesn't fail this way.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240524205227.244375-4-irogers@google.com
2024-05-30 10:05:57 -07:00
Ian Rogers
d92aa899fe perf bpf filter: Add uid and gid terms
Allow the BPF filter to use the uid and gid terms determined by the
bpf_get_current_uid_gid BPF helper. For example, the following will
record the cpu-clock event system wide discarding samples that don't
belong to the current user.

$ perf record -e cpu-clock --filter "uid == $(id -u)" -a sleep 0.1

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240524205227.244375-3-irogers@google.com
2024-05-30 10:05:57 -07:00
Ian Rogers
63b9cbd794 perf bpf filter: Give terms their own enum
Give the term types their own enum so that additional terms can be
added that don't correspond to a PERF_SAMPLE_xx flag. The term values
are numerically ascending rather than bit field positions, this means
they need translating to a PERF_SAMPLE_xx bit field in certain places
using a shift.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240524205227.244375-2-irogers@google.com
2024-05-30 10:05:57 -07:00
Changbin Du
f975c13d2a perf trace beauty: Always show mmap prot even though PROT_NONE
PROT_NONE is also useful information, so do not omit the mmap prot even
though it is 0. syscall_arg__scnprintf_mmap_prot() could print PROT_NONE
for prot 0.

Before: PROT_NONE is not shown.
$ sudo perf trace -e syscalls:sys_enter_mmap --filter prot==0  -- ls
     0.000 ls/2979231 syscalls:sys_enter_mmap(len: 4220888, flags: PRIVATE|ANONYMOUS)

After: PROT_NONE is displayed.
$ sudo perf trace -e syscalls:sys_enter_mmap --filter prot==0  -- ls
     0.000 ls/2975708 syscalls:sys_enter_mmap(len: 4220888, prot: NONE, flags: PRIVATE|ANONYMOUS)

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240522033542.1359421-3-changbin.du@huawei.com
2024-05-29 22:48:23 -07:00
Changbin Du
92968dcc03 perf trace beauty: Always show param if show_zero is set
For some parameters, it is best to also display them when they are 0,
e.g. flags.

Here we only check the show_zero property and let arg printer handle
special cases.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240522033542.1359421-2-changbin.du@huawei.com
2024-05-29 22:48:05 -07:00
Ian Rogers
a93c83eca4 perf docs: Fix typos
Assorted typo fixes.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240521223555.858859-1-irogers@google.com
2024-05-28 22:52:28 -07:00
Breno Leitao
265b71153e perf list: Fix the --no-desc option
Currently, the --no-desc option in perf list isn't functioning as
intended.

This issue arises from the overwriting of struct option->desc with the
opposite value of struct option->long_desc. Consequently, whatever
parse_options() returns at struct option->desc gets overridden later,
rendering the --desc or --no-desc arguments ineffective.

To resolve this, set ->desc as true by default and allow parse_options()
to adjust it accordingly. This adjustment will fix the --no-desc
option while preserving the functionality of the other parameters.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: leit@meta.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240517141427.1905691-1-leitao@debian.org
2024-05-28 11:29:49 -07:00
Ian Rogers
cbd446b4db perf arm-spe: Unaligned pointer work around
Use get_unaligned_leXX instead of leXX_to_cpu to handle unaligned
pointers. Such pointers occur with libFuzzer testing.

A similar change for intel-pt was done in:
https://lore.kernel.org/r/20231005190451.175568-6-adrian.hunter@intel.com

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240514052402.3031871-1-irogers@google.com
2024-05-28 11:29:49 -07:00
Ian Rogers
678be1ca30 perf tests: Add some pmu core functionality tests
Test behavior of PMU names and comparisons wrt suffixes using Intel
uncore_cha, marvell mrvl_ddr_pmu and S390's cpum_cf as examples.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: Bhaskara Budiredla <bbudiredla@marvell.com>
Cc: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240515060114.3268149-3-irogers@google.com
2024-05-28 11:29:49 -07:00
Ian Rogers
3241d46f5f perf pmus: Sort/merge/aggregate PMUs like mrvl_ddr_pmu
The mrvl_ddr_pmu is uncore and has a hexadecimal address suffix while
the previous PMU sorting/merging code assumes uncore PMU names start
with uncore_ and have a decimal suffix. Because of the previous
assumption it isn't possible to wildcard the mrvl_ddr_pmu.

Modify pmu_name_len_no_suffix but also remove the suffix number out
argument, this is because we don't know if a suffix number of say 100
is in hexadecimal or decimal. As the only use of the suffix number is
in comparisons, it is safe there to compare the values as hexadecimal.
Modify perf_pmu__match_ignoring_suffix so that hexadecimal suffixes
are ignored.

Only allow hexadecimal suffixes to be greater than length 2 (ie 3 or
more) so that S390's cpum_cf PMU doesn't lose its suffix.

Change the return type of pmu_name_len_no_suffix to size_t to
workaround GCC incorrectly determining the result could be negative.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: Bhaskara Budiredla <bbudiredla@marvell.com>
Cc: Tuan Phan <tuanphan@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240515060114.3268149-2-irogers@google.com
2024-05-28 11:29:49 -07:00
Arnaldo Carvalho de Melo
da42b5229b tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall
But also to wire up shadow stacks on 32-bit x86, picking up those
changes from these csets:

  ff388fe5c4 ("mseal: wire up mseal syscall")
  2883f01ec3 ("x86/shstk: Enable shadow stacks for x32")

This makes 'perf trace' support it, now its possible, for instance to
do:

  # perf trace -e mseal --max-stack=16

Here is an example with the 'sendmmsg' syscall:

  root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1
       0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1
                                         syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
                                         syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
                                         syscall_exit_to_user_mode ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64 ([kernel.kallsyms])
                                         [0x117ce7] (/usr/lib64/libc.so.6 (deleted))
  root@x1:~#

To do a system wide tracing of the new 'mseal' syscall with a backtrace
of at most 16 entries.

This addresses these perf tools build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H J Lu <hjl.tools@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28 11:10:00 -03:00
Arnaldo Carvalho de Melo
001821b0e7 perf trace beauty: Update the arch/x86/include/asm/irq_vectors.h copy with the kernel sources to pick POSTED_MSI_NOTIFICATION
To pick up the change in:

  f5a3562ec9 ("x86/irq: Reserve a per CPU IDT vector for posted MSIs")

That picks up this new vector:

  $ cp arch/x86/include/asm/irq_vectors.h tools/perf/trace/beauty/arch/x86/include/asm/irq_vectors.h
  $ tools/perf/trace/beauty/tracepoints/x86_irq_vectors.sh > after
  $ diff -u before after
  --- before	2024-05-27 12:50:47.708863932 -0300
  +++ after	2024-05-27 12:51:15.335113123 -0300
  @@ -1,6 +1,7 @@
   static const char *x86_irq_vectors[] = {
   	[0x02] = "NMI",
   	[0x80] = "IA32_SYSCALL",
  +	[0xeb] = "POSTED_MSI_NOTIFICATION",
   	[0xec] = "LOCAL_TIMER",
   	[0xed] = "HYPERV_STIMER0",
   	[0xee] = "HYPERV_REENLIGHTENMENT",
  $

Now those will be known when pretty printing the irq_vectors:*
tracepoints.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/ZlS34M0x30EFVhbg@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27 13:42:18 -03:00
Arnaldo Carvalho de Melo
a3eed53bee perf beauty: Update copy of linux/socket.h with the kernel sources
To pick up the fixes in:

  0645fbe760 ("net: have do_accept() take a struct proto_accept_arg argument")

That just changes a function prototype, not touching things used by the
perf scrape scripts such as:

  $ tools/perf/trace/beauty/sockaddr.sh | head -5
  static const char *socket_families[] = {
  	[0] = "UNSPEC",
  	[1] = "LOCAL",
  	[2] = "INET",
  	[3] = "AX25",
  $

This addresses this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZlSrceExgjrUiDb5@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27 12:49:18 -03:00
Arnaldo Carvalho de Melo
1437a9f06f tools headers UAPI: Sync fcntl.h with the kernel sources to pick F_DUPFD_QUERY
There is no scrape script yet for those, but the warning pointed out we
need to update the array with the F_LINUX_SPECIFIC_BASE entries, do it.

Now 'perf trace' can decode that cmd and also use it in filter, as in:

  root@number:~# perf trace -e syscalls:*enter_fcntl --filter 'cmd != SETFL && cmd != GETFL'
     0.000 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLK, arg: 0x7fffdc6a8a50)
     0.013 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLKW, arg: 0x7fffdc6a8aa0)
     0.090 sssd_kcm/303828 syscalls:sys_enter_fcntl(fd: 13</var/lib/sss/secrets/secrets.ldb>, cmd: SETLKW, arg: 0x7fffdc6a88e0)
  ^Croot@number:~#

This picks up the changes in:

  c62b758bae ("fcntl: add F_DUPFD_QUERY fcntl()")

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZlSqNQH9mFw2bmjq@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-27 12:44:09 -03:00