This patch isn't intended to have any effect on the compiled code. It
just removes one level of indirection: calling the *host* compiler to
build and then run a program that just printf:s the numerical entries of
the syscall-table. In other words, the generated syscalls.c changes
from:
[46] = "ftruncate",
to:
[__NR3264_ftruncate] = "ftruncate",
The latter is as good as the former to the user of perf, and this can be
done directly by the shell-script. The syscalls defined as non-literal
values (like "#define __NR_ftruncate __NR3264_ftruncate") are trivially
resolved at compile-time without namespace-leaking and/or collision for
its sole user, perf/util/syscalltbl.c, that just #includes the generated
file. A future "-mabi=32" support would probably have to handle this
differently, but that is a pre-existing problem not affected by this
simplification.
Calling the *host* compiler only complicates things and accidentally can
get a completely wrong set of files and syscall numbers, see earlier
commits. Note that the script parameter hostcc is now unused.
At the time of this patch, powerpc (the origin, see comments), and also
e.g. x86 has moved on, from filtering "gcc -dM -E" output to reading
separate specific text-file, a table of syscall numbers. IMHO should
arm64 consider adopting this.
Signed-off-by: Hans-Peter Nilsson <hp@axis.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20201228024159.2BB66203B5@pchp3.se.axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
ce883a2ba3 ("powerpc/32: fix syscall wrappers with 64-bit arguments")
That doesn't cause any changes in the perf tools.
This table is used in tools perf to allow features as described in the
last update to this file.
This addresses this perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Y6H0C5plZ4V4aiPm@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.
If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.
This also disables CONFIG_TRACE that controls "perf trace".
CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.
Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed. The
majority of commands continue to work including "perf test".
Committer notes:
Fixed up a tools/perf/util/Build reject and added:
#include <traceevent/event-parse.h>
to tools/perf/util/scripting-engines/trace-event-perl.c.
Committer testing:
$ rpm -qi libtraceevent-devel
Name : libtraceevent-devel
Version : 1.5.3
Release : 2.fc36
Architecture: x86_64
Install Date: Mon 25 Jul 2022 03:20:19 PM -03
Group : Unspecified
Size : 27728
License : LGPLv2+ and GPLv2+
Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm
Build Date : Fri 15 Apr 2022 10:57:01 AM -03
Build Host : buildvm-x86-05.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
Bug URL : https://bugz.fedoraproject.org/libtraceevent
Summary : Development headers of libtraceevent
Description :
Development headers of libtraceevent-libs
$
Default build:
$ ldd ~/bin/perf | grep tracee
libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
$
# perf trace -e sched:* --max-events 10
0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
#
Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.
Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
- Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y
- perf-$(CONFIG_LIBTRACEEVENT) += scripts/
- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y
- The python binding needed some fixups and util/trace-event.c can't be
built and linked with the python binding shared object, so remove it
in tools/perf/util/setup.py and exclude it from the list of
dependencies in the python/perf.so Makefile.perf target.
Building without libtraceevent-devel installed uncovered more build
failures:
- The python binding tools/perf/util/python.c was assuming that
traceevent/parse-events.h was always available, which was the case
when we defaulted to using the in-kernel tools/lib/traceevent/ files,
now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
the other parts of it that deal with tracepoints.
- We have to ifdef the rules in the Build files with
CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
detect libtraceevent-devel installed in the system. Simplification here
to avoid these two ways of disabling builtin-trace.c and not having
CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
way.
From Athira:
<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>
Then, ditto for arm64 and s390, detected by container cross build tests.
- s/390 uses test__checkevent_tracepoint() that is now only available if
HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT.
Also from Athira:
<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>
Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using "sort -nu", arm64 syscalls were lost. That is, the io_setup
syscall (number 0) and all but one (typically ftruncate; 64) of the
syscalls that are defined symbolically (like "#define __NR_ftruncate
__NR3264_ftruncate") at the point where "sort" is applied.
This creation-of-syscalls.c-scheme is, judging from comments,
copy-pasted from powerpc, and worked there because at the time, its
tools/arch/powerpc/include/uapi/asm/unistd.h had *literals*, like
"#define __NR_ftruncate 93".
With sort being numeric and the non-numeric key effectively evaluating
to 0, the sort option "-u" means these "duplicates" are removed.
There's no need to remove syscall lines with duplicate numbers for arm64
because there are none, so let's fix that by just losing the "-u".
Having the table numerically sorted on syscall-number for the rest of
the syscalls looks nice, so keep the "-n".
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Hans-Peter Nilsson <hp@axis.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20201228023941.E0DE2203B5@pchp3.se.axis.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some places were including event.h just to get 'struct perf_sample',
move it to a separate place so that we speed up a bit the build.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
e237506238 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs")
That doesn't cause any changes in the perf tools.
As a reminder, this table is used in tools perf to allow features such as:
[root@five ~]# perf trace -e set_mempolicy_home_node
^C[root@five ~]#
[root@five ~]# perf trace -v -e set_mempolicy_home_node
Using CPUID AuthenticAMD-25-21-0
event qualifier tracepoint filter: (common_pid != 253729 && common_pid != 3585) && (id == 450)
mmap size 528384B
^C[root@five ~]
[root@five ~]# perf trace -v -e set* --max-events 5
Using CPUID AuthenticAMD-25-21-0
event qualifier tracepoint filter: (common_pid != 253734 && common_pid != 3585) && (id == 38 || id == 54 || id == 105 || id == 106 || id == 109 || id == 112 || id == 113 || id == 114 || id == 116 || id == 117 || id == 119 || id == 122 || id == 123 || id == 141 || id == 160 || id == 164 || id == 170 || id == 171 || id == 188 || id == 205 || id == 218 || id == 238 || id == 273 || id == 308 || id == 450)
mmap size 528384B
0.000 ( 0.008 ms): bash/253735 setpgid(pid: 253735 (bash), pgid: 253735 (bash)) = 0
6849.011 ( 0.008 ms): bash/16046 setpgid(pid: 253736 (bash), pgid: 253736 (bash)) = 0
6849.080 ( 0.005 ms): bash/253736 setpgid(pid: 253736 (bash), pgid: 253736 (bash)) = 0
7437.718 ( 0.009 ms): gnome-shell/253737 set_robust_list(head: 0x7f34b527e920, len: 24) = 0
13445.986 ( 0.010 ms): bash/16046 setpgid(pid: 253738 (bash), pgid: 253738 (bash)) = 0
[root@five ~]#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -w set_mempolicy_home_node
tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node
tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node
tools/perf/arch/s390/entry/syscalls/syscall.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node
$
$ grep -w set_mempolicy_home_node /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
[450] = "set_mempolicy_home_node",
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Link: https://lore.kernel.org/lkml/Y01HN2DGkWz8tC%2FJ@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Add support for AMD on 'perf mem' and 'perf c2c', the kernel
enablement patches went via tip.
Example:
$ sudo perf mem record -- -c 10000
^C[ perf record: Woken up 227 times to write data ]
[ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
$ sudo perf mem report -F mem,sample,snoop
Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
Memory access Samples Snoop
N/A 700620 N/A
L1 hit 126675 N/A
L2 hit 424 N/A
L3 hit 664 HitM
L3 hit 10 N/A
Local RAM hit 2 N/A
Remote RAM (1 hop) hit 8558 N/A
Remote Cache (1 hop) hit 3 N/A
Remote Cache (1 hop) hit 2 HitM
Remote Cache (2 hops) hit 10 HitM
Remote Cache (2 hops) hit 6 N/A
Uncached hit 4 N/A
$
- "perf lock" improvements:
- Add -E/--entries option to limit the number of entries to
display, say to ask for just the top 5 contended locks.
- Add -q/--quiet option to suppress header and debug messages.
- Add a 'perf test' kernel lock contention entry to test 'perf
lock'.
- "perf lock contention" improvements:
- Ask BPF's bpf_get_stackid() to skip some callchain entries.
The ones closer to the tooling are bpf related and not that
interesting, the ones calling the locking function are the ones
we're interested in, example of a full, unskipped callstack:
- Allow changing the callstack depth and number of entries to skip.
1 10.74 us 10.74 us 10.74 us spinlock __bpf_trace_contention_begin+0xb
0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
0xffffffffc03b5c47 bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
0xffffffffbb8b8e75 bpf_trace_run2+0x35
0xffffffffbb7eab9b __bpf_trace_contention_begin+0xb
0xffffffffbb7ebe75 queued_spin_lock_slowpath+0x1f5
0xffffffffbc1c26ff _raw_spin_lock+0x1f
0xffffffffbb841015 tick_do_update_jiffies64+0x25
0xffffffffbb8409ee tick_irq_enter+0x9e
- Show full callstack in verbose mode (-v option), sometimes this
is desirable instead of showing just one callstack entry.
- Allow multiple time ranges in 'perf record --delay' to help in
reducing the amount of data collected from hardware tracing (Intel
PT, etc) when there is a rough idea of periods of time where events
of interest take time.
- Add Intel PT to record only decoder debug messages when error
happens.
- Improve layout of Intel PT man page.
- Add new branch types: alignment, data and inst faults and arch
specific ones, such as fiq, debug_halt, debug_exit, debug_inst and
debug_data on arm64.
Kernel enablement went thru the tip tree.
- Fix 'perf probe' error log check in 'perf test' when no debuginfo is
available.
- Fix 'perf stat' aggregation mode logic, it should be looking at the
CPU not at the core number.
- Fix flags parsing in 'perf trace' filters.
- Introduce compact encoding of CPU range encoding on perf.data, to
avoid having a bitmap with all the CPUs.
- Improvements to the 'perf stat' metrics, including adding
"core_wide", and computing "smt" from the CPU topology.
- Add support to the new PERF_FORMAT_LOST perf_event_attr.read_format,
that allows tooling to ask for the precise number of lost samples for
a given event.
- Add 'addr' sort key to see just the address of sampled instructions:
$ perf record -o- true | perf report -i- -s addr
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
# Samples: 12 of event 'cycles:u'
# Event count (approx.): 252512
#
# Overhead Address
# ........ ..................
42.96% 0x7f96f08443d7
29.55% 0x7f96f0859b50
14.76% 0x7f96f0852e02
8.30% 0x7f96f0855028
4.43% 0xffffffff8de01087
perf annotate: Toggle full address <-> offset display
- Add 'f' hotkey to the 'perf annotate' TUI interface when in
'disassembler output' mode ('o' hotkey) to toggle showing full
virtual address or just the offset.
- Cache DSO build-ids when synthesizing PERF_RECORD_MMAP records for
pre-existing threads, at the start of a 'perf record' session,
speeding up that record startup phase.
- Add a command line option to specify build ids in 'perf inject'.
- Update JSON event files for the Intel alderlake, broadwell,
broadwellde, broadwellx, cascadelakex, haswell, haswellx, icelake,
icelakex, ivybridge, ivytown, jaketown, sandybridge, sapphirerapids,
skylake, skylakex, and tigerlake processors.
- Update vendor JSON event files for the ARM Neoverse V1 and E1
platforms.
- Add a 'perf test' entry for 'perf mem' where a struct has false
sharing and this gets detected in the 'perf mem' output, tested with
Intel, AMD and ARM64 systems.
- Add a 'perf test' entry to test the resolution of java symbols, where
an output like this is expected:
8.18% jshell jitted-50116-29.so [.] Interpreter
0.75% Thread-1 jitted-83602-1670.so [.] jdk.internal.jimage.BasicImageReader.getString(int)
- Add tests for the ARM64 CoreSight hardware tracing feature, with
specially crafted pureloop, memcpy, thread loop and unroll tread that
then gets traced and the output compared with expected output.
Documentation explaining it is also included.
- Add per thread Intel PT 'perf test' entry to check that
PERF_RECORD_TEXT_POKE events are recorded per CPU, resulting in a
mixture of per thread and per CPU events and mmaps, verify that this
gets all recorded correctly.
- Introduce pthread mutex wrappers to allow for building with clang's
-Wthread-safety, i.e. using the "guarded_by" "pt_guarded_by"
"lockable", "exclusive_lock_function", "exclusive_trylock_function",
"exclusive_locks_required", and "no_thread_safety_analysis" compiler
function attributes.
- Fix empty version number when building outside of a git repo.
- Improve feature detection display when multiple versions of a feature
are present, such as for binutils libbfd, that has a mix of possible
ways to detect according to the Linux distribution.
Previously in some cases we had:
Auto-detecting system features
<SNIP>
... libbfd: [ on ]
... libbfd-liberty: [ on ]
... libbfd-liberty-z: [ on ]
<SNIP>
Now for this case we show just the main feature:
Auto-detecting system features
<SNIP>
... libbfd: [ on ]
<SNIP>
- Remove some unused structs, variables, macros, function prototypes
and includes from various places.
* tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
perf script: Add missing fields in usage hint
perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
perf mem/c2c: Avoid printing empty lines for unsupported events
perf mem/c2c: Add load store event mappings for AMD
perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel
tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
perf stat: Fix cpu check to use id.cpu.cpu in aggr_printout()
perf test coresight: Add relevant documentation about ARM64 CoreSight testing
perf test: Add git ignore for tmp and output files of ARM CoreSight tests
perf test coresight: Add unroll thread test shell script
perf test coresight: Add unroll thread test tool
perf test coresight: Add thread loop test shell scripts
perf test coresight: Add thread loop test tool
perf test coresight: Add memcpy thread test shell script
perf test coresight: Add memcpy thread test tool
perf test: Add git ignore for perf data generated by the ARM CoreSight tests
perf test: Add arm64 asm pureloop test shell script
perf test: Add asm pureloop test tool
...
Arch-specific implementations of syscall handlers are currently used
over generic implementations for the following reasons:
1. Semantics unique to powerpc
2. Compatibility syscalls require 'argument padding' to comply with
64-bit argument convention in ELF32 abi.
3. Parameter types or order is different in other architectures.
These syscall handlers have been defined prior to this patch series
without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros with
custom input and output types. We remove every such direct definition in
favour of the aforementioned macros.
Also update syscalls.tbl in order to refer to the symbol names generated
by each of these macros. Since ppc64_personality can be called by both
64 bit and 32 bit binaries through compatibility, we must generate both
both compat_sys_ and sys_ symbols for this handler.
As an aside:
A number of architectures including arm and powerpc agree on an
alternative argument order and numbering for most of these arch-specific
handlers. A future patch series may allow for asm/unistd.h to signal
through its defines that a generic implementation of these syscall
handlers with the correct calling convention be emitted, through the
__ARCH_WANT_COMPAT_SYS_... convention.
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220921065605.1051927-16-rmclure@linux.ibm.com
Syscall #82 has been implemented for 32-bit platforms in a unique way on
powerpc systems. This hack will in effect guess whether the caller is
expecting new select semantics or old select semantics. It does so via a
guess, based off the first parameter. In new select, this parameter
represents the length of a user-memory array of file descriptors, and in
old select this is a pointer to an arguments structure.
The heuristic simply interprets sufficiently large values of its first
parameter as being a call to old select. The following is a discussion
on how this syscall should be handled.
As discussed in this thread, the existence of such a hack suggests that for
whatever powerpc binaries may predate glibc, it is most likely that they
would have taken use of the old select semantics. x86 and arm64 both
implement this syscall with oldselect semantics.
Remove the powerpc implementation, and update syscall.tbl to refer to emit
a reference to sys_old_select and compat_sys_old_select
for 32-bit binaries, in keeping with how other architectures support
syscall #82.
Signed-off-by: Rohan McLure <rmclure@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a1a0@csgroup.eu/
Link: https://lore.kernel.org/r/20220921065605.1051927-12-rmclure@linux.ibm.com
Current perf stat uses the evlist__add_default_attrs() to add the
generic default attrs, and uses arch_evlist__add_default_attrs() to add
the Arch specific default attrs, e.g., Topdown for x86.
It works well for the non-hybrid platforms. However, for a hybrid
platform, the hard code generic default attrs don't work.
Uses arch_evlist__add_default_attrs() to replace the
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.
Add default_null_attrs[] to indicate the arch specific attrs.
No functional change for the arch specific default attrs either.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-4-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the hardware TopDown metrics feature, the sample-read feature should
be supported for a TopDown group, e.g., sample a non-topdown event and read
a Topdown metric group. But the current perf record code errors are out.
For a TopDown metric group,the slots event must be the leader of the group,
but the leader slots event doesn't support sampling. To support sample-read
the TopDown metric group, uses the 2nd event of the group as the "leader"
for the purposes of sampling.
Only the platform with the TopDown metric feature supports sample-read the
topdown group. In commit acb65150a4 ("perf record: Support sample-read
topdown metric group"), it adds arch_topdown_sample_read() to indicate
whether the TopDown group supports sample-read, it should only work on the
non-hybrid systems, this patch extends the support for hybrid platforms.
Before:
# ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/topdown-retiring/).
/bin/dmesg | grep -i perf may provide additional information.
After:
# ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.238 MB perf.data (369 samples) ]
Fixes: acb65150a4 ("perf record: Support sample-read topdown metric group")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220602153603.1884710-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The X86 specific arch__intr_reg_mask() is to check whether the kernel
and hardware can collect XMM registers. But it doesn't work on some
hybrid platform.
Without the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15
The config of the test event doesn't contain the PMU information. The
kernel may fail to initialize it on the correct hybrid PMU and return
the wrong non-supported information.
Add the PMU information into the config for the hybrid platform. The
same register set is supported among different hybrid PMUs. Checking
the first available one is good enough.
With the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Fixes: 6466ec14aa ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 94dbfd6781 ("perf parse-events: Architecture specific
leader override") introduced a feature to reorder the slots event to
fulfill the restriction of the perf metrics topdown group. But the
feature doesn't work on the hybrid machine.
$ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1
Performance counter stats for 'system wide':
<not counted> cpu_core/instructions/
<not counted> cpu_core/slots/
<not supported> cpu_core/topdown-retiring/
1.002871801 seconds time elapsed
A hybrid platform has a different PMU name for the core PMUs, while
current perf hard code the PMU name "cpu".
Introduce a new function to check whether the system supports the perf
metrics feature. The result is cached for the future usage.
For X86, the core PMU name always has "cpu" prefix.
With the patch:
$ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1
Performance counter stats for 'system wide':
76,337,010 cpu_core/slots/
10,416,809 cpu_core/instructions/
11,692,372 cpu_core/topdown-retiring/
1.002805453 seconds time elapsed
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-5-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evsel->name may have a different format for a topdown event, a pure
topdown name (e.g., topdown-fe-bound), or a PMU name + a topdown name
(e.g., cpu/topdown-fe-bound/). The cpu/topdown-fe-bound/ kind format
isn't supported by the arch_evlist__leader(). This format is a very
common format for a hybrid platform, which requires specifying the PMU
name for each event.
Without the patch,
$ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1
Performance counter stats for 'system wide':
<not counted> instructions
<not counted> slots
<not supported> cpu/topdown-fe-bound/
1.003482041 seconds time elapsed
Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
The events in group usually have to be from the same PMU. Try reorganizing the group.
With the patch,
$ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1
Performance counter stats for 'system wide':
157,383,996 slots
25,011,711 instructions
27,441,686 cpu/topdown-fe-bound/
1.003530890 seconds time elapsed
Fixes: bc355822f0 ("perf parse-events: Move slots only with topdown")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-4-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The patch ("perf evlist: Keep topdown counters in weak group") fixes the
perf metrics topdown event issue when the topdown events are in a weak
group on a non-hybrid platform. However, it doesn't work for the hybrid
platform.
$./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1
Performance counter stats for 'system wide':
751,765,068 cpu_core/slots/ (84.07%)
<not supported> cpu_core/topdown-bad-spec/
<not supported> cpu_core/topdown-be-bound/
<not supported> cpu_core/topdown-fe-bound/
<not supported> cpu_core/topdown-retiring/
12,398,197 cpu_core/branch-instructions/ (84.07%)
1,054,218 cpu_core/branch-misses/ (84.24%)
539,764,637 cpu_core/bus-cycles/ (84.64%)
14,683 cpu_core/cache-misses/ (84.87%)
7,277,809 cpu_core/cache-references/ (77.30%)
222,299,439 cpu_core/cpu-cycles/ (77.28%)
63,661,714 cpu_core/instructions/ (84.85%)
0 cpu_core/mem-loads/ (77.29%)
12,271,725 cpu_core/mem-stores/ (77.30%)
542,241,102 cpu_core/ref-cycles/ (84.85%)
8,854 cpu_core/cache-misses/ (76.71%)
7,179,013 cpu_core/cache-references/ (76.31%)
1.003245250 seconds time elapsed
A hybrid platform has a different PMU name for the core PMUs, while
the current perf hard code the PMU name "cpu".
The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
For a hybrid platform, the pmu_name must be non-NULL. Because there are
at least two core PMUs. The PMU has to be specified.
For a non-hybrid platform, the pmu_name may be NULL. Because there is
only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
it is a "cpu" PMU.
In case other PMUs also define the "slots" event, checking the PMU type
as well.
With the patch,
$ perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1
Performance counter stats for 'system wide':
766,620,266 cpu_core/slots/ (84.06%)
73,172,129 cpu_core/topdown-bad-spec/ # 9.5% bad speculation (84.06%)
193,443,341 cpu_core/topdown-be-bound/ # 25.0% backend bound (84.06%)
403,940,929 cpu_core/topdown-fe-bound/ # 52.3% frontend bound (84.06%)
102,070,237 cpu_core/topdown-retiring/ # 13.2% retiring (84.06%)
12,364,429 cpu_core/branch-instructions/ (84.03%)
1,080,124 cpu_core/branch-misses/ (84.24%)
564,120,383 cpu_core/bus-cycles/ (84.65%)
36,979 cpu_core/cache-misses/ (84.86%)
7,298,094 cpu_core/cache-references/ (77.30%)
227,174,372 cpu_core/cpu-cycles/ (77.31%)
63,886,523 cpu_core/instructions/ (84.87%)
0 cpu_core/mem-loads/ (77.31%)
12,208,782 cpu_core/mem-stores/ (77.31%)
566,409,738 cpu_core/ref-cycles/ (84.87%)
23,118 cpu_core/cache-misses/ (76.71%)
7,212,602 cpu_core/cache-references/ (76.29%)
1.003228667 seconds time elapsed
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>