The hwnoise tool is a special mode for the osnoise top tool.
hwnoise dispatches the osnoise tracer and displays a summary of the noise.
The difference is that it runs the tracer with the OSNOISE_IRQ_DISABLE
option set, thus only allowing only hardware-related noise, resulting in
a simplified output. hwnoise has the same features of osnoise.
An example of the tool's output:
# rtla hwnoise -c 1-11 -T 1 -d 10m -q
Hardware-related Noise
duration: 0 00:10:00 | time is in us
CPU Period Runtime Noise % CPU Aval Max Noise Max Single HW NMI
1 #599 599000000 138 99.99997 3 3 4 74
2 #599 599000000 85 99.99998 3 3 4 75
3 #599 599000000 86 99.99998 4 3 6 75
4 #599 599000000 81 99.99998 4 4 2 75
5 #599 599000000 85 99.99998 2 2 2 75
Link: https://lkml.kernel.org/r/2d6f49a6f3a4f8b51b2c806458b1cff71ad4d014.1675805361.git.bristot@kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, timerlat top displays the timerlat tracer latency results, saving
the intuitive timerlat trace for the developer to analyze.
This patch goes a step forward in the automaton of the scheduling latency
analysis by providing a summary of the root cause of a latency higher than
the passed "stop tracing" parameter if the trace stops.
The output is intuitive enough for non-expert users to have a general idea
of the root cause by looking at each factor's contribution percentage while
keeping the technical detail in the output for more expert users to start
an in dept debug or to correlate a root cause with an existing one.
The terminology is in line with recent industry and academic publications
to facilitate the understanding of both audiences.
Here is one example of tool output:
----------------------------------------- %< -----------------------------------------------------
# taskset -c 0 timerlat -a 40 -c 1-23 -q
Timer Latency
0 00:00:12 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
1 #12322 | 0 0 1 15 | 10 3 9 31
2 #12322 | 3 0 1 12 | 10 3 9 23
3 #12322 | 1 0 1 21 | 8 2 8 34
4 #12322 | 1 0 1 17 | 10 2 11 33
5 #12322 | 0 0 1 12 | 8 3 8 25
6 #12322 | 1 0 1 14 | 16 3 11 35
7 #12322 | 0 0 1 14 | 9 2 8 29
8 #12322 | 1 0 1 22 | 9 3 9 34
9 #12322 | 0 0 1 14 | 8 2 8 24
10 #12322 | 1 0 0 12 | 9 3 8 24
11 #12322 | 0 0 0 15 | 6 2 7 29
12 #12321 | 1 0 0 13 | 5 3 8 23
13 #12319 | 0 0 1 14 | 9 3 9 26
14 #12321 | 1 0 0 13 | 6 2 8 24
15 #12321 | 1 0 1 15 | 12 3 11 27
16 #12318 | 0 0 1 13 | 7 3 10 24
17 #12319 | 0 0 1 13 | 11 3 9 25
18 #12318 | 0 0 0 12 | 8 2 8 20
19 #12319 | 0 0 1 18 | 10 2 9 28
20 #12317 | 0 0 0 20 | 9 3 8 34
21 #12318 | 0 0 0 13 | 8 3 8 28
22 #12319 | 0 0 1 11 | 8 3 10 22
23 #12320 | 28 0 1 28 | 41 3 11 41
rtla timerlat hit stop tracing
## CPU 23 hit stop tracing, analyzing it ##
IRQ handler delay: 27.49 us (65.52 %)
IRQ latency: 28.13 us
Timerlat IRQ duration: 9.59 us (22.85 %)
Blocking thread: 3.79 us (9.03 %)
objtool:49256 3.79 us
Blocking thread stacktrace
-> timerlat_irq
-> __hrtimer_run_queues
-> hrtimer_interrupt
-> __sysvec_apic_timer_interrupt
-> sysvec_apic_timer_interrupt
-> asm_sysvec_apic_timer_interrupt
-> _raw_spin_unlock_irqrestore
-> cgroup_rstat_flush_locked
-> cgroup_rstat_flush_irqsafe
-> mem_cgroup_flush_stats
-> mem_cgroup_wb_stats
-> balance_dirty_pages
-> balance_dirty_pages_ratelimited_flags
-> btrfs_buffered_write
-> btrfs_do_write_iter
-> vfs_write
-> __x64_sys_pwrite64
-> do_syscall_64
-> entry_SYSCALL_64_after_hwframe
------------------------------------------------------------------------
Thread latency: 41.96 us (100%)
The system has exit from idle latency!
Max timerlat IRQ latency from idle: 17.48 us in cpu 4
Saving trace to timerlat_trace.txt
----------------------------------------- >% -----------------------------------------------------
In this case, the major factor was the delay suffered by the IRQ handler
that handles timerlat wakeup: 65.52 %. This can be caused by the
current thread masking interrupts, which can be seen in the blocking
thread stacktrace: the current thread (objtool:49256) disabled interrupts
via raw spin lock operations inside mem cgroup, while doing write
syscall in a btrfs file system.
A simple search for the function name on Google shows that this is
a legit case for disabling the interrupts:
cgroup: Use irqsave in cgroup_rstat_flush_locked()
lore.kernel.org/linux-mm/20220301122143.1521823-2-bigeasy@linutronix.de/
The output also prints other reasons for the latency root cause, such as:
- an IRQ that happened before the IRQ handler that caused delays
- The interference from NMI, IRQ, Softirq, and Threads
The details about how these factors affect the scheduling latency
can be found here:
https://bristot.me/demystifying-the-real-time-linux-latency/
Link: https://lkml.kernel.org/r/3d45f40e630317f51ac6d678e2d96d310e495729.1675179318.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
rtla_usage(), osnoise_usage() and timerlat_usage() all exit with an
error status.
However when these are called from help, they should exit with a
non-error status.
Fix this by passing the exit status to the functions.
Note, although we remove the subsequent call to exit after calling
usage, we leave it in at the end of a function to suppress the compiler
warning "control reaches end of a non-void function".
Link: https://lkml.kernel.org/r/20221107144313.22470-1-jkacur@redhat.com
Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Avoid double free by making trace_instance_destroy indempotent. When
trace_instance_init fails, it calls trace_instance_destroy, but its only
caller osnoise_destroy_tool calls it again.
Link: https://lkml.kernel.org/r/mvmilnlkyzx.fsf_-_@suse.de
Fixes: 0605bf009f ("rtla: Add osnoise tool")
Signed-off-by: Andreas Schwab <schwab@suse.de>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Sedat Dilek reported an error on rtla Makefile when running:
$ make -C tools/ clean
[...]
make[2]: Entering directory
'/home/dileks/src/linux-kernel/git/tools/tracing/rtla'
[...]
'/home/dileks/src/linux-kernel/git/Documentation/tools/rtla'
/bin/sh: 1: test: rtla-make[2]:: unexpected operator <------ The problem
rm: cannot remove '/home/dileks/src/linux-kernel/git': Is a directory
make[2]: *** [Makefile:120: clean] Error 1
make[2]: Leaving directory
This occurred because the rtla calls kernel's Makefile to get the
version in silence mode, e.g.,
$ make -sC ../../.. kernelversion
5.19.0-rc4
But the -s is being ignored when rtla's makefile is called indirectly,
so the output looks like this:
$ make -C ../../.. kernelversion
make: Entering directory '/root/linux'
5.19.0-rc4
make: Leaving directory '/root/linux'
Using 'grep -v make' avoids this problem, e.g.,
$ make -C ../../.. kernelversion | grep -v make
5.19.0-rc4
Thus, add | grep -v make.
Link: https://lkml.kernel.org/r/870c02d4d97a921f02a31fa3b229fc549af61a20.1657747763.git.bristot@kernel.org
Fixes: 8619e32825 ("rtla: Follow kernel version")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Daniel Wagner reported to me that readproc.h got deprecated. Also,
while the procps-ng library was available on Fedora, it was not available
on RHEL, which is a piece of evidence that it was not that used.
rtla uses procps-ng only to find the PID of the tracers' workload.
I used the procps-ng library to avoid reinventing the wheel. But in this
case, reinventing the wheel took me less time than the time we already
took trying to work around problems.
Implement a function that reads /proc/ entries, checking if:
- the entry is a directory
- the directory name is composed only of digits (PID)
- the directory contains the comm file
- the comm file contains a comm that matches the tracers'
workload prefix.
- then return true; otherwise, return false.
And use it instead of procps-ng.
Link: https://lkml.kernel.org/r/e8276e122ee9eb2c5a0ba8e673fb6488b924b825.1652423574.git.bristot@kernel.org
Cc: John Kacur <jkacur@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tao Zhou <tao.zhou@linux.dev>
Fixes: b1696371d8 ("rtla: Helper functions for rtla")
Reported-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pull Kbuild updates from Masahiro Yamada:
- Add new environment variables, USERCFLAGS and USERLDFLAGS to allow
additional flags to be passed to user-space programs.
- Fix missing fflush() bugs in Kconfig and fixdep
- Fix a minor bug in the comment format of the .config file
- Make kallsyms ignore llvm's local labels, .L*
- Fix UAPI compile-test for cross-compiling with Clang
- Extend the LLVM= syntax to support LLVM=<suffix> form for using a
particular version of LLVm, and LLVM=<prefix> form for using custom
LLVM in a particular directory path.
- Clean up Makefiles
* tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: Make $(LLVM) more flexible
kbuild: add --target to correctly cross-compile UAPI headers with Clang
fixdep: use fflush() and ferror() to ensure successful write to files
arch: syscalls: simplify uapi/kapi directory creation
usr/include: replace extra-y with always-y
certs: simplify empty certs creation in certs/Makefile
certs: include certs/signing_key.x509 unconditionally
kallsyms: ignore all local labels prefixed by '.L'
kconfig: fix missing '# end of' for empty menu
kconfig: add fflush() before ferror() check
kbuild: replace $(if A,A,B) with $(or A,B)
kbuild: Add environment variables for userprogs flags
kbuild: unify cmd_copy and cmd_shipped
Add --trigger option. This option enables a trace event trigger to the
previous -e sys:event argument, allowing some advanced tracing options.
For instance, in a system with CPUs 2:23 isolated, it is possible to get
a stack trace of thread wakeup targeting those CPUs while running
osnoise with the following command line:
# osnoise top -c 2-23 -a 50 -e sched:sched_wakeup --trigger="stacktrace if target_cpu >= 2"
This option is available for all current tools.
Link: https://lkml.kernel.org/r/07d2983d5f71261d4da89dbaf02efcad100ab8ee.1646247211.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Add the -a/--auto <arg in us> option. This option sets some commonly
used options while debugging the system. It aims to help users produce
reports in the field, reducing the number of arguments passed to the
tool in the first approach to a problem.
It is equivalent to setting osnoise/stop_tracing_us with the argument,
setting tracing_thresh to 1 us, and saving the trace to osnoise_trace.txt
file if the trace is stopped automatically.
Link: https://lkml.kernel.org/r/ef04c961b227eb93a83cd0b54bfca45e1a381b77.1646247211.git.bristot@kernel.org
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Currently, --entries uses -e as the short version in the hist mode of
timerlat and osnoise tools. But as -e is already used to enable events
on trace sessions by other tools, thus let's keep it available for the
same usage for all rtla tools.
Make -E the short version of --entries for hist mode on all tools.
Note: rtla was merged in this merge window, so rtla was not released yet.
Link: https://lkml.kernel.org/r/5dbf0cbe7364d3a05e708926b41a097c59a02b1e.1645206561.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
$(or ...) is available since GNU Make 3.81, and useful to shorten the
code in some places.
Covert as follows:
$(if A,A,B) --> $(or A,B)
This patch also converts:
$(if A, A, B) --> $(or A, B)
Strictly speaking, the latter is not an equivalent conversion because
GNU Make keeps spaces after commas; if A is not empty, $(if A, A, B)
expands to " A", while $(or A, B) expands to "A".
Anyway, preceding spaces are not significant in the code hunks I touched.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
rtla osnoise and timerlat are causing a segmentation fault when running
with the --trace option on a kernel that does not support multiple
instances. For example:
[root@f34 rtla]# rtla osnoise top -t
failed to enable the tracer osnoise
Could not enable osnoiser tracer for tracing
Failed to enable the trace instance
Segmentation fault (core dumped)
This error happens because the exit code of the tools is trying
to destroy the trace instance that failed to be created.
Make osnoise_destroy_tool() aware of possible NULL osnoise_tool *,
and do not attempt to destroy it. This also simplifies the exit code.
Link: https://lkml.kernel.org/r/5660a2b6bf66c2655842360f2d7f6b48db5dba23.1644327249.git.bristot@kernel.org
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Fixes: 1eceb2fc2c ("rtla/osnoise: Add osnoise top mode")
Fixes: 829a6c0b56 ("rtla/osnoise: Add the hist mode")
Fixes: a828cd18bc ("rtla: Add timerlat tool and timelart top mode")
Fixes: 1eeb6328e8 ("rtla/timerlat: Add timerlat hist mode")
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The rtla timerlat tool is an interface for the timerlat tracer.
The timerlat tracer dispatches a kernel thread per-cpu. These threads set a
periodic timer to wake themselves up and go back to sleep. After the
wakeup, they collect and generate useful information for the debugging of
operating system timer latency.
The timerlat tracer outputs information in two ways. It periodically
prints the timer latency at the timer IRQ handler and the Thread handler.
It also provides information for each noise via the osnoise tracepoints.
The rtla timerlat top mode displays a summary of the periodic output from
the timerlat tracer.
Here is one example of the rtla timerlat tool output:
---------- %< ----------
[root@alien ~]# rtla timerlat top -c 0-3 -d 1m
Timer Latency
0 00:01:00 | IRQ Timer Latency (us) | Thread Timer Latency (us)
CPU COUNT | cur min avg max | cur min avg max
0 #60001 | 0 0 0 3 | 1 1 1 6
1 #60001 | 0 0 0 3 | 2 1 1 5
2 #60001 | 0 0 1 6 | 1 1 2 7
3 #60001 | 0 0 0 7 | 1 1 1 11
---------- >% ----------
Running:
# rtla timerlat --help
# rtla timerlat top --help
provides information about the available options.
Link: https://lkml.kernel.org/r/e95032e20c2b88c962195bf7693bb53c9ebcced8.1639158831.git.bristot@kernel.org
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: linux-rt-users@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The rtla osnoise tool is an interface for the osnoise tracer. The
osnoise tracer dispatches a kernel thread per-cpu. These threads read
the time in a loop while with preemption, softirqs and IRQs enabled,
thus allowing all the sources of osnoise during its execution. The
osnoise threads take note of the entry and exit point of any source
of interferences, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source
of interference.
The rtla osnoise top mode displays information about the periodic
summary from the osnoise tracer.
One example of rtla osnoise top output is:
[root@alien ~]# rtla osnoise top -c 0-3 -d 1m -q -r 900000 -P F:1
Operating System Noise
duration: 0 00:01:00 | time is in us
CPU Period Runtime Noise % CPU Aval Max Noise Max Single HW NMI IRQ Softirq Thread
0 #58 52200000 1031 99.99802 91 60 0 0 52285 0 101
1 #59 53100000 5 99.99999 5 5 0 9 53122 0 18
2 #59 53100000 7 99.99998 7 7 0 8 53115 0 18
3 #59 53100000 8274 99.98441 277 23 0 9 53778 0 660
"rtla osnoise top --help" works and provide information about the
available options.
Link: https://lkml.kernel.org/r/0d796993abf587ae5a170bb8415c49368d4999e1.1639158831.git.bristot@kernel.org
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: linux-rt-users@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>