Currently, the auto-analysis is attached to the timerlat top instance.
The idea was to avoid creating another instance just for that, so one
instance could be reused.
The drawback is that, by doing so, the auto-analysis run for the entire
session, consuming CPU time. On my 24 box CPUs for timerlat with a 100
us period consumed 50 % with auto analysis, but only 16 % without.
By creating an instance for auto-analysis, we can keep the processing
stopped until a stop tracing condition is hit. Once it happens,
timerlat auto-analysis can use its own trace instance to parse only
the end of the trace.
By doing so, auto-analysis stop consuming cpu time when it is not
needed.
If the --aa-only is passed, the timerlat top instance is reused for
auto analysis.
Link: https://lkml.kernel.org/r/346b7168c1bae552a415715ec6d23c129a43bdb7.1686066600.git.bristot@kernel.org
Cc: William White <chwhite@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The "Previous IRQ interference" line is misaligned and without
a \n, breaking the tool's output:
## CPU 12 hit stop tracing, analyzing it ##
Previous IRQ interference: up to 2.22 us IRQ handler delay: 18.06 us (0.00 %)
IRQ latency: 18.52 us
Timerlat IRQ duration: 4.41 us (0.00 %)
Blocking thread: 216.93 us (0.03 %)
Fix the output:
## CPU 7 hit stop tracing, analyzing it ##
Previous IRQ interference: up to 8.93 us
IRQ handler delay: 0.98 us (0.00 %)
IRQ latency: 2.95 us
Timerlat IRQ duration: 11.26 us (0.03 %)
Link: https://lore.kernel.org/linux-trace-devel/8b5819077f15ccf24745c9bf3205451e16ee32d9.1679685525.git.bristot@kernel.org
Fixes: 27e348b221 ("rtla/timerlat: Add auto-analysis core")
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Parsing and formating timerlat data might consume a reasonable
amount of CPU time on very large systems, or when timerlat
has a too short period.
Add an option to run timerlat with auto-analysis enabled while
skipping the statistics parsing. In this mode, rtla timerlat
periodically checks if the tracing is on, going to sleep waiting
for the stop tracing condition to stop tracing, or for the
tracing session to finish.
If the stop tracing condition is hit, the tool prints the auto
analysis. Otherwise, the tool prints the max observed latency and
exit. The max observed latency is captured via tracing_max_latency.
Link: https://lore.kernel.org/linux-trace-devel/4dc514d1d5dc353c537a466a9b5af44c266b6da2.1680106912.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The hwnoise tool is a special mode for the osnoise top tool.
hwnoise dispatches the osnoise tracer and displays a summary of the noise.
The difference is that it runs the tracer with the OSNOISE_IRQ_DISABLE
option set, thus only allowing only hardware-related noise, resulting in
a simplified output. hwnoise has the same features of osnoise.
An example of the tool's output:
# rtla hwnoise -c 1-11 -T 1 -d 10m -q
Hardware-related Noise
duration: 0 00:10:00 | time is in us
CPU Period Runtime Noise % CPU Aval Max Noise Max Single HW NMI
1 #599 599000000 138 99.99997 3 3 4 74
2 #599 599000000 85 99.99998 3 3 4 75
3 #599 599000000 86 99.99998 4 3 6 75
4 #599 599000000 81 99.99998 4 4 2 75
5 #599 599000000 85 99.99998 2 2 2 75
Link: https://lkml.kernel.org/r/2d6f49a6f3a4f8b51b2c806458b1cff71ad4d014.1675805361.git.bristot@kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, timerlat top displays the timerlat tracer latency results, saving
the intuitive timerlat trace for the developer to analyze.
This patch goes a step forward in the automaton of the scheduling latency
analysis by providing a summary of the root cause of a latency higher than
the passed "stop tracing" parameter if the trace stops.
The output is intuitive enough for non-expert users to have a general idea
of the root cause by looking at each factor's contribution percentage while
keeping the technical detail in the output for more expert users to start
an in dept debug or to correlate a root cause with an existing one.
The terminology is in line with recent industry and academic publications
to facilitate the understanding of both audiences.
Here is one example of tool output:
----------------------------------------- %< -----------------------------------------------------
# taskset -c 0 timerlat -a 40 -c 1-23 -q
Timer Latency
0 00:00:12 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
1 #12322 | 0 0 1 15 | 10 3 9 31
2 #12322 | 3 0 1 12 | 10 3 9 23
3 #12322 | 1 0 1 21 | 8 2 8 34
4 #12322 | 1 0 1 17 | 10 2 11 33
5 #12322 | 0 0 1 12 | 8 3 8 25
6 #12322 | 1 0 1 14 | 16 3 11 35
7 #12322 | 0 0 1 14 | 9 2 8 29
8 #12322 | 1 0 1 22 | 9 3 9 34
9 #12322 | 0 0 1 14 | 8 2 8 24
10 #12322 | 1 0 0 12 | 9 3 8 24
11 #12322 | 0 0 0 15 | 6 2 7 29
12 #12321 | 1 0 0 13 | 5 3 8 23
13 #12319 | 0 0 1 14 | 9 3 9 26
14 #12321 | 1 0 0 13 | 6 2 8 24
15 #12321 | 1 0 1 15 | 12 3 11 27
16 #12318 | 0 0 1 13 | 7 3 10 24
17 #12319 | 0 0 1 13 | 11 3 9 25
18 #12318 | 0 0 0 12 | 8 2 8 20
19 #12319 | 0 0 1 18 | 10 2 9 28
20 #12317 | 0 0 0 20 | 9 3 8 34
21 #12318 | 0 0 0 13 | 8 3 8 28
22 #12319 | 0 0 1 11 | 8 3 10 22
23 #12320 | 28 0 1 28 | 41 3 11 41
rtla timerlat hit stop tracing
## CPU 23 hit stop tracing, analyzing it ##
IRQ handler delay: 27.49 us (65.52 %)
IRQ latency: 28.13 us
Timerlat IRQ duration: 9.59 us (22.85 %)
Blocking thread: 3.79 us (9.03 %)
objtool:49256 3.79 us
Blocking thread stacktrace
-> timerlat_irq
-> __hrtimer_run_queues
-> hrtimer_interrupt
-> __sysvec_apic_timer_interrupt
-> sysvec_apic_timer_interrupt
-> asm_sysvec_apic_timer_interrupt
-> _raw_spin_unlock_irqrestore
-> cgroup_rstat_flush_locked
-> cgroup_rstat_flush_irqsafe
-> mem_cgroup_flush_stats
-> mem_cgroup_wb_stats
-> balance_dirty_pages
-> balance_dirty_pages_ratelimited_flags
-> btrfs_buffered_write
-> btrfs_do_write_iter
-> vfs_write
-> __x64_sys_pwrite64
-> do_syscall_64
-> entry_SYSCALL_64_after_hwframe
------------------------------------------------------------------------
Thread latency: 41.96 us (100%)
The system has exit from idle latency!
Max timerlat IRQ latency from idle: 17.48 us in cpu 4
Saving trace to timerlat_trace.txt
----------------------------------------- >% -----------------------------------------------------
In this case, the major factor was the delay suffered by the IRQ handler
that handles timerlat wakeup: 65.52 %. This can be caused by the
current thread masking interrupts, which can be seen in the blocking
thread stacktrace: the current thread (objtool:49256) disabled interrupts
via raw spin lock operations inside mem cgroup, while doing write
syscall in a btrfs file system.
A simple search for the function name on Google shows that this is
a legit case for disabling the interrupts:
cgroup: Use irqsave in cgroup_rstat_flush_locked()
lore.kernel.org/linux-mm/20220301122143.1521823-2-bigeasy@linutronix.de/
The output also prints other reasons for the latency root cause, such as:
- an IRQ that happened before the IRQ handler that caused delays
- The interference from NMI, IRQ, Softirq, and Threads
The details about how these factors affect the scheduling latency
can be found here:
https://bristot.me/demystifying-the-real-time-linux-latency/
Link: https://lkml.kernel.org/r/3d45f40e630317f51ac6d678e2d96d310e495729.1675179318.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
rtla_usage(), osnoise_usage() and timerlat_usage() all exit with an
error status.
However when these are called from help, they should exit with a
non-error status.
Fix this by passing the exit status to the functions.
Note, although we remove the subsequent call to exit after calling
usage, we leave it in at the end of a function to suppress the compiler
warning "control reaches end of a non-void function".
Link: https://lkml.kernel.org/r/20221107144313.22470-1-jkacur@redhat.com
Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Avoid double free by making trace_instance_destroy indempotent. When
trace_instance_init fails, it calls trace_instance_destroy, but its only
caller osnoise_destroy_tool calls it again.
Link: https://lkml.kernel.org/r/mvmilnlkyzx.fsf_-_@suse.de
Fixes: 0605bf009f ("rtla: Add osnoise tool")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Sedat Dilek reported an error on rtla Makefile when running:
$ make -C tools/ clean
[...]
make[2]: Entering directory
'/home/dileks/src/linux-kernel/git/tools/tracing/rtla'
[...]
'/home/dileks/src/linux-kernel/git/Documentation/tools/rtla'
/bin/sh: 1: test: rtla-make[2]:: unexpected operator <------ The problem
rm: cannot remove '/home/dileks/src/linux-kernel/git': Is a directory
make[2]: *** [Makefile:120: clean] Error 1
make[2]: Leaving directory
This occurred because the rtla calls kernel's Makefile to get the
version in silence mode, e.g.,
$ make -sC ../../.. kernelversion
5.19.0-rc4
But the -s is being ignored when rtla's makefile is called indirectly,
so the output looks like this:
$ make -C ../../.. kernelversion
make: Entering directory '/root/linux'
5.19.0-rc4
make: Leaving directory '/root/linux'
Using 'grep -v make' avoids this problem, e.g.,
$ make -C ../../.. kernelversion | grep -v make
5.19.0-rc4
Thus, add | grep -v make.
Link: https://lkml.kernel.org/r/870c02d4d97a921f02a31fa3b229fc549af61a20.1657747763.git.bristot@kernel.org
Fixes: 8619e32825 ("rtla: Follow kernel version")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Wagner reported to me that readproc.h got deprecated. Also,
while the procps-ng library was available on Fedora, it was not available
on RHEL, which is a piece of evidence that it was not that used.
rtla uses procps-ng only to find the PID of the tracers' workload.
I used the procps-ng library to avoid reinventing the wheel. But in this
case, reinventing the wheel took me less time than the time we already
took trying to work around problems.
Implement a function that reads /proc/ entries, checking if:
- the entry is a directory
- the directory name is composed only of digits (PID)
- the directory contains the comm file
- the comm file contains a comm that matches the tracers'
workload prefix.
- then return true; otherwise, return false.
And use it instead of procps-ng.
Link: https://lkml.kernel.org/r/e8276e122ee9eb2c5a0ba8e673fb6488b924b825.1652423574.git.bristot@kernel.org
Cc: John Kacur <jkacur@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tao Zhou <tao.zhou@linux.dev>
Fixes: b1696371d8 ("rtla: Helper functions for rtla")
Reported-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pull Kbuild updates from Masahiro Yamada:
- Add new environment variables, USERCFLAGS and USERLDFLAGS to allow
additional flags to be passed to user-space programs.
- Fix missing fflush() bugs in Kconfig and fixdep
- Fix a minor bug in the comment format of the .config file
- Make kallsyms ignore llvm's local labels, .L*
- Fix UAPI compile-test for cross-compiling with Clang
- Extend the LLVM= syntax to support LLVM=<suffix> form for using a
particular version of LLVm, and LLVM=<prefix> form for using custom
LLVM in a particular directory path.
- Clean up Makefiles
* tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: Make $(LLVM) more flexible
kbuild: add --target to correctly cross-compile UAPI headers with Clang
fixdep: use fflush() and ferror() to ensure successful write to files
arch: syscalls: simplify uapi/kapi directory creation
usr/include: replace extra-y with always-y
certs: simplify empty certs creation in certs/Makefile
certs: include certs/signing_key.x509 unconditionally
kallsyms: ignore all local labels prefixed by '.L'
kconfig: fix missing '# end of' for empty menu
kconfig: add fflush() before ferror() check
kbuild: replace $(if A,A,B) with $(or A,B)
kbuild: Add environment variables for userprogs flags
kbuild: unify cmd_copy and cmd_shipped
Add --trigger option. This option enables a trace event trigger to the
previous -e sys:event argument, allowing some advanced tracing options.
For instance, in a system with CPUs 2:23 isolated, it is possible to get
a stack trace of thread wakeup targeting those CPUs while running
osnoise with the following command line:
# osnoise top -c 2-23 -a 50 -e sched:sched_wakeup --trigger="stacktrace if target_cpu >= 2"
This option is available for all current tools.
Link: https://lkml.kernel.org/r/07d2983d5f71261d4da89dbaf02efcad100ab8ee.1646247211.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add the -a/--auto <arg in us> option. This option sets some commonly
used options while debugging the system. It aims to help users produce
reports in the field, reducing the number of arguments passed to the
tool in the first approach to a problem.
It is equivalent to setting osnoise/stop_tracing_us with the argument,
setting tracing_thresh to 1 us, and saving the trace to osnoise_trace.txt
file if the trace is stopped automatically.
Link: https://lkml.kernel.org/r/ef04c961b227eb93a83cd0b54bfca45e1a381b77.1646247211.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>