commit 4a86d41404 upstream.
Currently perf saves a build-id with size but old versions assumes the
size of 20. In case the build-id is less than 20 (like for MD5), it'd
fill the rest with 0s.
I saw a problem when old version of perf record saved a binary in the
build-id cache and new version of perf reads the data. The symbols
should be read from the build-id cache (as the path no longer has the
same binary) but it failed due to mismatch in the build-id.
symsrc__init: build id mismatch for /home/namhyung/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf.
The build-id event in the data has 20 byte build-ids, but it saw a
different size (16) when it reads the build-id of the elf file in the
build-id cache.
$ readelf -n ~/.debug/.build-id/53/e4c2f42a4c61a2d632d92a72afa08f00000000/elf
Displaying notes found in: .note.gnu.build-id
Owner Data size Description
GNU 0x00000010 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 53e4c2f42a4c61a2d632d92a72afa08f
Let's fix this by allowing trailing zeros if the size is different.
Fixes: 39be8d0115 ("perf tools: Pass build_id object to dso__build_id_equal()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210910224630.1084877-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 57f0ff059e upstream.
It's later supposed to be either a correct address or NULL. Without the
initialization, it may contain an undefined value which results in the
following segmentation fault:
# perf top --sort comm -g --ignore-callees=do_idle
terminates with:
#0 0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6
#1 0x00007ffff55e3802 in strdup () from /lib64/libc.so.6
#2 0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489
#3 hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564
#4 0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420,
sample_self=sample_self@entry=true) at util/hist.c:657
#5 0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0,
sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288
#6 0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38)
at util/hist.c:1056
#7 iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056
#8 0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0)
at util/hist.c:1231
#9 0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0)
at builtin-top.c:842
#10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202
#11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244
#12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323
#13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339
#14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341
#15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339
#16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114
#17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0
#18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6
If you look at the frame #2, the code is:
488 if (he->srcline) {
489 he->srcline = strdup(he->srcline);
490 if (he->srcline == NULL)
491 goto err_rawdata;
492 }
If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish),
it gets strdupped and strdupping a rubbish random string causes the problem.
Also, if you look at the commit 1fb7d06a50, it adds the srcline property
into the struct, but not initializing it everywhere needed.
Committer notes:
Now I see, when using --ignore-callees=do_idle we end up here at line
2189 in add_callchain_ip():
2181 if (al.sym != NULL) {
2182 if (perf_hpp_list.parent && !*parent &&
2183 symbol__match_regex(al.sym, &parent_regex))
2184 *parent = al.sym;
2185 else if (have_ignore_callees && root_al &&
2186 symbol__match_regex(al.sym, &ignore_callees_regex)) {
2187 /* Treat this symbol as the root,
2188 forgetting its callees. */
2189 *root_al = al;
2190 callchain_cursor_reset(cursor);
2191 }
2192 }
And the al that doesn't have the ->srcline field initialized will be
copied to the root_al, so then, back to:
1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
1212 int max_stack_depth, void *arg)
1213 {
1214 int err, err2;
1215 struct map *alm = NULL;
1216
1217 if (al)
1218 alm = map__get(al->map);
1219
1220 err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
1221 iter->evsel, al, max_stack_depth);
1222 if (err) {
1223 map__put(alm);
1224 return err;
1225 }
1226
1227 err = iter->ops->prepare_entry(iter, al);
1228 if (err)
1229 goto out;
1230
1231 err = iter->ops->add_single_entry(iter, al);
1232 if (err)
1233 goto out;
1234
That al at line 1221 is what hist_entry_iter__add() (called from
sample__resolve_callchain()) saw as 'root_al', and then:
iter->ops->add_single_entry(iter, al);
will go on with al->srcline with a bogus value, I'll add the above
sequence to the cset and apply, thanks!
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
CC: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 1fb7d06a50 ("perf report Use srcline from callchain for hist entries")
Link: https //lore.kernel.org/r/20210719145332.29747-1-mpetlan@redhat.com
Reported-by: Juri Lelli <jlelli@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1f0e6edcd9 upstream.
Considering the following testcase:
int
foo(int a, int b)
{
for (unsigned i = 0; i < 1000000000; i++)
a += b;
return a;
}
int main()
{
foo (3, 4);
return 0;
}
'perf annotate' displays:
86.52 │40055e: → ja 40056c <foo(int, int)+0x26>
13.37 │400560: mov -0x18(%rbp),%eax
│400563: add %eax,-0x14(%rbp)
│400566: addl $0x1,-0x4(%rbp)
0.11 │40056a: → jmp 400557 <foo(int, int)+0x11>
│40056c: mov -0x14(%rbp),%eax
│40056f: pop %rbp
and the 'ja 40056c' does not link to the location in the function. It's
caused by fact that comma is wrongly parsed, it's part of function
signature.
With my patch I see:
86.52 │ ┌──ja 26
13.37 │ │ mov -0x18(%rbp),%eax
│ │ add %eax,-0x14(%rbp)
│ │ addl $0x1,-0x4(%rbp)
0.11 │ │↑ jmp 11
│26:└─→mov -0x14(%rbp),%eax
and 'o' output prints:
86.52 │4005┌── ↓ ja 40056c <foo(int, int)+0x26>
13.37 │4005│0: mov -0x18(%rbp),%eax
│4005│3: add %eax,-0x14(%rbp)
│4005│6: addl $0x1,-0x4(%rbp)
0.11 │4005│a: ↑ jmp 400557 <foo(int, int)+0x11>
│4005└─→ mov -0x14(%rbp),%eax
On the contrary, compiling the very same file with gcc -x c, the parsing
is fine because function arguments are not displayed:
jmp 400543 <foo+0x1d>
Committer testing:
Before:
$ cat cpp_args_annotate.c
int
foo(int a, int b)
{
for (unsigned i = 0; i < 1000000000; i++)
a += b;
return a;
}
int main()
{
foo (3, 4);
return 0;
}
$ gcc --version |& head -1
gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9)
$ gcc -g cpp_args_annotate.c -o cpp_args_annotate
$ perf record ./cpp_args_annotate
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.275 MB perf.data (7188 samples) ]
$ perf annotate --stdio2 foo
Samples: 7K of event 'cycles:u', 4000 Hz, Event count (approx.): 7468429289, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo>:
foo():
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
↓ jmp 1d
a += b;
13.45 13: mov -0x18(%rbp),%eax
add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.09 1d: cmpl $0x3b9ac9ff,-0x4(%rbp)
86.46 ↑ jbe 13
return a;
mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
I.e. works for C, now lets switch to C++:
$ g++ -g cpp_args_annotate.c -o cpp_args_annotate
$ perf record ./cpp_args_annotate
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.268 MB perf.data (6976 samples) ]
$ perf annotate --stdio2 foo
Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo(int, int)>:
foo(int, int):
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
cmpl $0x3b9ac9ff,-0x4(%rbp)
86.53 → ja 40112c <foo(int, int)+0x26>
a += b;
13.32 mov -0x18(%rbp),%eax
0.00 add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.15 → jmp 401117 <foo(int, int)+0x11>
return a;
mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
Reproduced.
Now with this patch:
Reusing the C++ built binary, as we can see here:
$ readelf -wi cpp_args_annotate | grep producer
<c> DW_AT_producer : (indirect string, offset: 0x2e): GNU C++14 10.2.1 20201125 (Red Hat 10.2.1-9) -mtune=generic -march=x86-64 -g
$
And furthermore:
$ file cpp_args_annotate
cpp_args_annotate: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=4fe3cab260204765605ec630d0dc7a7e93c361a9, for GNU/Linux 3.2.0, with debug_info, not stripped
$ perf buildid-list -i cpp_args_annotate
4fe3cab260204765605ec630d0dc7a7e93c361a9
$ perf buildid-list | grep cpp_args_annotate
4fe3cab260204765605ec630d0dc7a7e93c361a9 /home/acme/c/cpp_args_annotate
$
It now works:
$ perf annotate --stdio2 foo
Samples: 6K of event 'cycles:u', 4000 Hz, Event count (approx.): 7380681761, [percent: local period]
foo() /home/acme/c/cpp_args_annotate
Percent
0000000000401106 <foo(int, int)>:
foo(int, int):
int
foo(int a, int b)
{
push %rbp
mov %rsp,%rbp
mov %edi,-0x14(%rbp)
mov %esi,-0x18(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
movl $0x0,-0x4(%rbp)
11: cmpl $0x3b9ac9ff,-0x4(%rbp)
86.53 ↓ ja 26
a += b;
13.32 mov -0x18(%rbp),%eax
0.00 add %eax,-0x14(%rbp)
for (unsigned i = 0; i < 1000000000; i++)
addl $0x1,-0x4(%rbp)
0.15 ↑ jmp 11
return a;
26: mov -0x14(%rbp),%eax
}
pop %rbp
← retq
$
Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Link: http://lore.kernel.org/lkml/13e1a405-edf9-e4c2-4327-a9b454353730@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 41d5854113 upstream.
I got several memory leak reports from Asan with a simple command. It
was because VDSO is not released due to the refcount. Like in
__dsos_addnew_id(), it should put the refcount after adding to the list.
$ perf record true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ]
=================================================================
==692599==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 439 byte(s) in 1 object(s) allocated from:
#0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x559bce4aa8ee in dso__new_id util/dso.c:1256
#2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132
#3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347
#4 0x559bce50826c in map__new util/map.c:175
#5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
#6 0x559bce512f6b in machines__deliver_event util/session.c:1481
#7 0x559bce515107 in perf_session__deliver_event util/session.c:1551
#8 0x559bce51d4d2 in do_flush util/ordered-events.c:244
#9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
#10 0x559bce519bea in __perf_session__process_events util/session.c:2268
#11 0x559bce519bea in perf_session__process_events util/session.c:2297
#12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
#13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
#14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
#15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
#16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
#17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
#18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
#19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
#20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308
Indirect leak of 32 byte(s) in 1 object(s) allocated from:
#0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x559bce520907 in nsinfo__copy util/namespaces.c:169
#2 0x559bce50821b in map__new util/map.c:168
#3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787
#4 0x559bce512f6b in machines__deliver_event util/session.c:1481
#5 0x559bce515107 in perf_session__deliver_event util/session.c:1551
#6 0x559bce51d4d2 in do_flush util/ordered-events.c:244
#7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323
#8 0x559bce519bea in __perf_session__process_events util/session.c:2268
#9 0x559bce519bea in perf_session__process_events util/session.c:2297
#10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017
#11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234
#12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026
#13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858
#14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
#15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
#16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
#17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539
#18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210315045641.700430-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 67069a1f0f upstream.
ASan reported a memory leak caused by info_linear not being deallocated.
The info_linear was allocated during in perf_event__synthesize_one_bpf_prog().
This patch adds the corresponding free() when bpf_prog_info_node
is freed in perf_env__purge_bpf().
$ sudo ./perf record -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]
=================================================================
==297735==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 7688 byte(s) in 19 object(s) allocated from:
#0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f)
#1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16
#2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16
#3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9
#4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8
#5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8
#6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8
#7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
#8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
#9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2
#10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3
#11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 197eecb6ec ]
When peeking an event, it has a short path and a long path. The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".
The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten. We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.
This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.
Fixes: 5a52f33adf ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3cb17cce1e ]
If we just check whether the variable can be converted, 'tvar' should be
a null pointer. However, the null pointer check is missing in the
'Constant value' execution path.
The following cases can trigger this problem:
$ cat test.c
#include <stdio.h>
void main(void)
{
int a;
const int b = 1;
asm volatile("mov %1, %0" : "=r"(a): "i"(b));
printf("a: %d\n", a);
}
$ gcc test.c -o test -O -g
$ sudo ./perf probe -x ./test -L "main"
<main@/home/lhf/test.c:0>
0 void main(void)
{
2 int a;
const int b = 1;
asm volatile("mov %1, %0" : "=r"(a): "i"(b));
6 printf("a: %d\n", a);
}
$ sudo ./perf probe -x ./test -V "main:6"
Segmentation fault
The check on 'tvar' is added. If 'tavr' is a null pointer, we return 0
to indicate that the variable can be converted. Now, we can successfully
show the variables that can be accessed.
$ sudo ./perf probe -x ./test -V "main:6"
Available variables at main:6
@<main+13>
char* __fmt
int a
int b
However, the variable 'b' cannot be tracked.
$ sudo ./perf probe -x ./test -D "main:6 b"
Failed to find the location of the 'b' variable at this address.
Perhaps it has been optimized out.
Use -V with the --range option to show 'b' location range.
Error: Failed to add events.
This is because __die_find_variable_cb() did not successfully match
variable 'b', which has the DW_AT_const_value attribute instead of
DW_AT_location. We added support for DW_AT_const_value in
__die_find_variable_cb(). With this modification, we can successfully
track the variable 'b'.
$ sudo ./perf probe -x ./test -D "main:6 b"
p:probe_test/main_L6 /home/lhf/test:0x1156 b=\1:s32
Fixes: 66f69b2197 ("perf probe: Support DW_AT_const_value constant value")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Jianlin Lv <jianlin.lv@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
http://lore.kernel.org/lkml/20210601092750.169601-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit c954eb72b3 upstream.
The decoder reports the current instruction if it was decoded. In some
cases the current instruction is not decoded, in which case the instruction
bytes length must be set to zero. Ensure that is always done.
Note perf script can anyway get the instruction bytes for any samples where
they are not present.
Also note, that there is a redundant "ptq->insn_len = 0" statement which is
not removed until a subsequent patch in order to make this patch apply
cleanly to stable branches.
Example:
A machne that supports TSX is required. It will have flag "rtm". Kernel
parameter tsx=on may be required.
# for w in `cat /proc/cpuinfo | grep -m1 flags `;do echo $w | grep rtm ; done
rtm
Test program:
#include <stdio.h>
#include <immintrin.h>
int main()
{
int x = 0;
if (_xbegin() == _XBEGIN_STARTED) {
x = 1;
_xabort(1);
} else {
printf("x = %d\n", x);
}
return 0;
}
Compile with -mrtm i.e.
gcc -Wall -Wextra -mrtm xabort.c -o xabort
Record:
perf record -e intel_pt/cyc/u --filter 'filter main @ ./xabort' ./xabort
Before:
# perf script --itrace=xe -F+flags,+insn,-period --xed --ns
xabort 1478 [007] 92161.431348581: transactions: x 400b81 main+0x14 (/root/xabort) mov $0xffffffff, %eax
xabort 1478 [007] 92161.431348624: transactions: tx abrt 400b93 main+0x26 (/root/xabort) mov $0xffffffff, %eax
After:
# perf script --itrace=xe -F+flags,+insn,-period --xed --ns
xabort 1478 [007] 92161.431348581: transactions: x 400b81 main+0x14 (/root/xabort) xbegin 0x6
xabort 1478 [007] 92161.431348624: transactions: tx abrt 400b93 main+0x26 (/root/xabort) xabort $0x1
Fixes: faaa87680b ("perf intel-pt/bts: Report instruction bytes and length in sample")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210519074515.9262-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 77d02bd00c upstream.
Noticed on a debian:experimental mips and mipsel cross build build
environment:
perfbuilder@ec265a086e9b:~$ mips-linux-gnu-gcc --version | head -1
mips-linux-gnu-gcc (Debian 10.2.1-3) 10.2.1 20201224
perfbuilder@ec265a086e9b:~$
CC /tmp/build/perf/util/map.o
util/map.c: In function 'map__new':
util/map.c:109:5: error: '%s' directive output may be truncated writing between 1 and 2147483645 bytes into a region of size 4096 [-Werror=format-truncation=]
109 | "%s/platforms/%s/arch-%s/usr/lib/%s",
| ^~
In file included from /usr/mips-linux-gnu/include/stdio.h:867,
from util/symbol.h:11,
from util/map.c:2:
/usr/mips-linux-gnu/include/bits/stdio2.h:67:10: note: '__builtin___snprintf_chk' output 32 or more bytes (assuming 4294967321) into a destination of size 4096
67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
68 | __bos (__s), __fmt, __va_arg_pack ());
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Since we have the lenghts for what lands in that place, use it to give
the compiler more info and make it happy.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit b410ed2a85 ]
The only requirement of an auxtrace queue is that the buffers are in
time order. That is achieved by making separate queues for separate
perf buffer or AUX area buffer mmaps.
That generally means a separate queue per cpu for per-cpu contexts, and
a separate queue per thread for per-task contexts.
When buffers are added to a queue, perf checks that the buffer cpu and
thread id (tid) match the queue cpu and thread id.
However, generally, that need not be true, and perf will queue buffers
correctly anyway, so the check is not needed.
In addition, the check gets erroneously hit when using sample mode to
trace multiple threads.
Consequently, fix that case by removing the check.
Fixes: e502789302 ("perf auxtrace: Add helpers for queuing AUX area tracing data")
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210308151143.18338-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e16c2ce7c5 ]
Commit da231338ec ("perf record: Use an eventfd to wakeup when
done") uses eventfd() to solve a rare race where the setting and
checking of 'done' which add done_fd to pollfd. When draining buffer,
revents of done_fd is 0 and evlist__filter_pollfd function returns a
non-zero value. As a result, perf record does not stop profiling.
The following simple scenarios can trigger this condition:
# sleep 10 &
# perf record -p $!
After the sleep process exits, perf record should stop profiling and exit.
However, perf record keeps running.
If pollfd revents contains only POLLERR or POLLHUP, perf record
indicates that buffer is draining and need to stop profiling. Use
fdarray_flag__nonfilterable() to set done eventfd to nonfilterable
objects, so that evlist__filter_pollfd() does not filter and check done
eventfd.
Fixes: da231338ec ("perf record: Use an eventfd to wakeup when done")
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: zhangjinhao2@huawei.com
Link: http://lore.kernel.org/lkml/20210205065001.23252-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 96de68fff5 ]
GCC (GCC) 8.4.0 20200304 fails to build perf with:
: util/symbol.c: In function 'dso__load_bfd_symbols':
: util/symbol.c:1626:16: error: comparison of integer expressions of different signednes
: for (i = 0; i < symbols_count; ++i) {
: ^
: util/symbol.c:1632:16: error: comparison of integer expressions of different signednes
: while (i + 1 < symbols_count &&
: ^
: util/symbol.c:1637:13: error: comparison of integer expressions of different signednes
: if (i + 1 < symbols_count &&
: ^
: cc1: all warnings being treated as errors
It's unlikely that the symtable will be that big, but the fix is an
oneliner and as perf has CORE_CFLAGS += -Wextra, which makes build to
fail together with CORE_CFLAGS += -Werror
Fixes: eac9a4342e ("perf symbols: Try reading the symbol table with libbfd")
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitry Safonov <0x7f454c46@gmail.com>
Cc: Jacek Caban <jacek@codeweavers.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210209145148.178702-1-dima@arista.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 5501e9229a upstream.
In some cases, the number of cpus (nr_cpus_online) is confused with the
maximum cpu number (nr_cpus_avail), which results in the error in the
example below:
Example on system with 8 cpus:
Before:
# echo 0 > /sys/devices/system/cpu/cpu2/online
# ./perf record --kcore -e intel_pt// taskset --cpu-list 7 uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.147 MB perf.data ]
# ./perf script --itrace=e
Requested CPU 7 too large. Consider raising MAX_NR_CPUS
0x25908 [0x8]: failed to process type: 68 [Invalid argument]
After:
# ./perf script --itrace=e
#
Fixes: 8c7274691f ("perf machine: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Fixes: 7df4e36a47 ("perf session: Replace MAX_NR_CPUS with perf_env::nr_cpus_online")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210107174159.24897-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since some gcc generates a broken DWARF which lacks DW_AT_declaration
attribute from the subprogram DIE of function prototype.
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060)
So, in addition to the DW_AT_declaration check, we also check the
subprogram DIE has DW_AT_inline or actual entry pc.
Committer testing:
# cat /etc/fedora-release
Fedora release 33 (Thirty Three)
#
Before:
# perf test vfs_getname
78: Use vfs_getname probe to get syscall args filenames : FAILED!
79: Check open filename arg using perf trace + vfs_getname : FAILED!
81: Add vfs_getname probe to get syscall args filenames : FAILED!
#
After:
# perf test vfs_getname
78: Use vfs_getname probe to get syscall args filenames : Ok
79: Check open filename arg using perf trace + vfs_getname : Ok
81: Add vfs_getname probe to get syscall args filenames : Ok
#
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: http://lore.kernel.org/lkml/160645613571.2824037.7441351537890235895.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>