Pull tracing tooling updates from Steven Rostedt:
- Allow RTLA to collect data via BPF
The current implementation of rtla uses libtracefs and libtraceevent
to pull sample events generated by the timerlat tracer from the trace
buffer. rtla then processes the sample by updating the histogram and
summary (current, maximum, minimum, and sum values) as well as checks
if tracing has been stopped due to threshold overflow.
In use cases where a large number of samples is being generated, that
is, with measurements running on many CPUs and with a low interval,
this sample processing design causes a significant CPU load on the
rtla side. Furthermore, with >100 CPUs and 100us interval, rtla was
reported as not being able to keep up with the samples and dropping
most of them, leading to it being unusable.
Change the way the timerlat trace processes samples by attaching a
BPF program to the trace event using the BPF skeleton feature of
bpftool. Unlike the current implementation, the BPF implementation
does not check whether tracing is stopped (in BPF mode, tracing is
always off to improve performance), but waits for a write to a BPF
ringbuffer instead. This allows rtla to exit immediately when a
threshold is violated, without waiting for the next iteration of the
while loop.
If the requirements for the BPF implementation are not met, either at
build time or at run time, the current implementation is used as
fallback. Which implementation is being used can be seen when running
rtla timerlat with "-D" option. rtla can be forced to run in non-BPF
mode by setting the RTLA_NO_BPF option to 1, for debugging purposes.
- Fix LD_FLAGS from being dropped in build
- Refactor code to remove duplication of save_trace_to_file
- Always set options and do not rely on default settings
Do not rely on the default kernel settings of the tracers when
starting. They could have been changed by the user which gives
inconsistent results. Always set the options that rtla expects.
- Add creation of ctags and TAGS for traversing code
* tag 'trace-tools-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rtla: Add the ability to create ctags and etags
rtla/tests: Test setting default options
rtla/tests: Reset osnoise options before check
rtla: Always set all tracer options
rtla/osnoise: Set OSNOISE_WORKLOAD to true
rtla: Unify apply_config between top and hist
rtla/osnoise: Unify params struct
rtla: Fix segfault in save_trace_to_file call
tools/build: Use SYSTEM_BPFTOOL for system bpftool
rtla: Refactor save_trace_to_file
tools/rv: Keep user LDFLAGS in build
rtla/timerlat: Test BPF mode
rtla/timerlat_top: Use BPF to collect samples
rtla/timerlat_top: Move divisor to update
rtla/timerlat_hist: Use BPF to collect samples
rtla/timerlat: Add BPF skeleton to collect samples
rtla: Add optional dependency on BPF tooling
tools/build: Add bpftool-skeletons feature test
rtla/timerlat: Unify params struct
Pull latency tracing updates from Steven Rostedt:
- Add some trace events to osnoise and timerlat sample generation
This adds more information to the osnoise and timerlat tracers as
well as allows BPF programs to be attached to these locations to
extract even more data.
- Fix to DECLARE_TRACE_CONDITION() macro
It wasn't used but now will be and it happened to be broken causing
the build to fail.
- Add scheduler specification monitors to runtime verifier (RV)
This is a continuation of Daniel Bristot's work.
RV allows monitors to run and react concurrently. Running the
cumulative model is equivalent to running single components using the
same reactors, with the advantage that it's easier to point out which
specification failed in case of error.
This update introduces nested monitors to RV, in short, the sysfs
monitor folder will contain a monitor named sched, which is nothing
but an empty container for other monitors. Controlling the sched
monitor (enable, disable, set reactors) controls all nested monitors.
The following scheduling monitors are added:
- sco: scheduling context operations
Monitor to ensure sched_set_state happens only in thread context
- tss: task switch while scheduling
Monitor to ensure sched_switch happens only in scheduling context
- snroc: set non runnable on its own context
Monitor to ensure set_state happens only in the respective task's context
- scpd: schedule called with preemption disabled
Monitor to ensure schedule is called with preemption disabled
- snep: schedule does not enable preempt
Monitor to ensure schedule does not enable preempt
- sncid: schedule not called with interrupt disabled
Monitor to ensure schedule is not called with interrupt disabled
* tag 'trace-latency-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools/rv: Allow rv list to filter for container
Documentation/rv: Add docs for the sched monitors
verification/dot2k: Add support for nested monitors
tools/rv: Add support for nested monitors
rv: Add scpd, snep and sncid per-cpu monitors
rv: Add snroc per-task monitor
rv: Add sco and tss per-cpu monitors
rv: Add option for nested monitors and include sched
sched: Add sched tracepoints for RV task model
rv: Add license identifiers to monitor files
tracing: Fix DECLARE_TRACE_CONDITION
trace/osnoise: Add trace events for samples
Add possibility to supply the container name to rv list:
# rv list sched
mon1
mon2
mon3
This lists only monitors in sched, without indentation.
Supplying -h, any option (string starting with -) or more than 1
argument will still print the usage.
Passing a non-existent container prints nothing and passing no container
continues to print all monitors, showing indentation for nested
monitors, reported after their container.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-10-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
RV now supports nested monitors, this functionality requires a container
monitor, which has virtually no functionality besides holding other
monitors, and nested monitors, that have a container as parent.
Add the -p flag to pass a parent to a monitor, this sets it up while
registering the monitor and adds necessary includes and configurations.
Add the -c flag to create a container, since containers are empty, we
don't allow supplying a dot model or a monitor type, the template is
also different since functions to enable and disable the monitor are not
defined, nor any tracepoint. The generated header file only allows to
include the rv_monitor structure in children monitors.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-8-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
RV now supports nested monitors, this functionality requires a container
monitor, which has virtually no functionality besides holding other
monitors, and nested monitors, that have a container as parent.
Nested monitors' sysfs folders are physically nested in the container's
folder, and they are listed in the available_monitors file with the
notation container:monitor.
These changes go against the assumption that each line in the
available_monitors file correspond to a folder in the rv directory,
breaking the functionality of the rv tool.
Add support for nested containers in the rv userspace tool, indenting
nested monitors while listed and allowing both the notation with and
without container name, which are equivalent:
# rv list
mon1
mon2
container:
- nested1
- nested2
## notation with container name
# rv mon container:nested1
## notation without container name
# rv mon nested1
Either way, enabling a nested monitor is the same as enabling any other
non-nested monitor.
Selecting the container with rv mon enables all the nested monitors, if
-t is passed, the trace also includes the monitor name next to the
event:
# rv mon nested1 -t
<idle>-0 [004] event state1 x event -> state2
<idle>-0 [004] error event not expected in state2
# rv mon sched -t
<idle>-0 [004] event_nested1 state1 x event -> state2
<idle>-0 [004] error_nested1 event not expected in state2
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Link: https://lore.kernel.org/20250305140406.350227-7-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
rv, unlike rtla and perf, drops LDFLAGS supplied by the user and honors
only EXTRA_LDFLAGS. This is inconsistent with both perf and rtla and
can lead to all kinds of unexpected behavior.
For example, on Fedora and RHEL, it causes rv to be build without
PIE, unlike the aforementioned perf and rtla:
$ file /usr/bin/{rv,rtla,perf}
/usr/bin/rv: ELF 64-bit LSB executable, ...
/usr/bin/rtla: ELF 64-bit LSB pie executable, ...
/usr/bin/perf: ELF 64-bit LSB pie executable, ...
Keep both LDFLAGS and EXTRA_LDFLAGS for the build.
Cc: John Kacur <jkacur@redhat.com>
Cc: Luis Goncalves <lgoncalv@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/20250304142228.767658-1-tglozar@redhat.com
Fixes: 012e4e77df ("tools/verification: Use tools/build makefiles on rv")
Signed-off-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix tools/ quiet build Makefile infrastructure that was broken when
working on tools/perf/ without testing on other tools/ living
utilities.
* tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools: Remove redundant quiet setup
tools: Unify top-level quiet infrastructure
Currently dot2k treats all events equally and registers them with a
general da_handle_event. This is however just part of the work because
some events are necessary to understand when the monitor is entering the
initial state.
Specifically, the da_handle_start_event takes care of setting the
monitor in the initial state and da_handle_start_run_event also
registers the current event in the newly enabled monitor.
da_handle_start_event can be used on events that only lead to the
initial state (as it is currently done in the example monitors), while
da_handle_start_run_event could be used on events that are only valid
from the initial one.
Failing to set at least one of those functions to handle events makes
the monitor useless, since it will never be activated.
This patch adapts dot2k to parse the events that surely lead to the
initial state and set da_handle_start_event for those, if no such event
is found but some events are only valid in the initial event, we instead
set da_handle_start_run_event (it isn't necessary to set both).
We still add a comment to warn the user to make sure this change is
matching the model definition.
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20241227144752.362911-9-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
dot2k suggests a list of changes to the kernel tree while adding a
monitor: edit tracepoints header, Makefile, Kconfig and moving the
monitor folder. Those changes can be easily run automatically.
Add a flag to dot2k to alter the kernel source.
The kernel source directory can be either assumed from the PWD, or from
the running kernel, if installed.
This feature works best if the kernel tree is a git repository, so that
its easier to make sure there are no unintended changes.
The main RV files (e.g. Makefile) have now a comment placeholder that
can be useful for manual editing (e.g. to know where to add new
monitors) and it is used by the script to append the required lines.
We also slightly adapt the file handling functions in dot2k: __open_file
is now called __read_file and also closes the file before returning the
content; __create_file is now a more general __write_file, we no longer
return on FileExistsError (not thrown while opening), a new
__create_file simply calls __write_file specifying the monitor folder in
the path.
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20241227144752.362911-8-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This patch reduces and simplifies the manual steps still needed in
creating a new RV monitor.
It extends the dot2k script to create a tracepoint snippet and a
Kconfig file for the newly generated monitor. Those files can be kept
in the monitor's directory but shall be included in the main tracepoint
header and Kconfig.
Together with the checklist, dot2k now suggests the lines to add to
those files for inclusion and the Makefile line to compile the new
monitor:
Writing the monitor into the directory monitor_name
Almost done, checklist
- Edit the monitor_name/monitor_name.c to add the instrumentation
- Edit kernel/trace/rv/rv_trace.h:
Add this line where other tracepoints are included and DA_MON_EVENTS_ID is defined:
#include <monitors/monitor_name/monitor_name_trace.h>
- Edit kernel/trace/rv/Makefile:
Add this line where other monitors are included:
obj-$(CONFIG_RV_MON_MONITOR_NAME) += monitors/monitor_name/monitor_name.o
- Edit kernel/trace/rv/Kconfig:
Add this line where other monitors are included:
source "kernel/trace/rv/monitors/monitor_name/Kconfig"
- Move monitor_name/ to the kernel's monitor directory (kernel/trace/rv/monitors)
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20241227144752.362911-7-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The dot2k templates currently have variables that are automatically
filled by the script marked as an uppercase VARIABLE. This requires some
care while adding new variables to avoid using valid keywords and get
them unexpectedly substituted.
This patch switches the variables to the %%VARIABLE%% notation to make
the pattern substitution more robust.
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20241227144752.362911-4-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
dot2k can be run as installed (e.g. make install) or from the kernel
tree. In the former case it looks for templates in a known location; in
the latter, the PWD has to be `<linux>/tools/verification` to properly
import python modules. The current version looks for the template
in a wrong directory in this latter case.
This patch adjusts the directory where dot2k looks for templates if run
from the kernel tree (i.e. not installed).
Additionally we fix a few simple pylint warnings in boolean expressions.
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
Link: https://lore.kernel.org/20241227144752.362911-2-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This patch makes the dot parser used by dot2c and dot2k slightly more
robust, namely:
* allows parsing files with the gv extension (GraphViz)
* correctly parses edges with any indentation
* used to work only with a single character (e.g. '\t')
Additionally it fixes a couple of warnings reported by pylint such as
wrong indentation and comparison to False instead of `not ...`
Link: https://lore.kernel.org/20241017064238.41394-2-gmonaco@redhat.com
Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Pull tracing fix from Steven Rostedt:
"I missed this minor hardening of the kernel in the first pull.
- Make monitor structures read only"
* tag 'trace-v6.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rv/monitors: Move monitor structure in rodata
Add the ability to control and trace in-kernel monitors. This is
a generic interface, it will check for existing monitors and enable
standard setup, like enabling reactors.
For example:
# rv list
wip wakeup in preemptive per-cpu testing monitor. [OFF]
wwnr wakeup while not running per-task testing model. [OFF]
# rv mon wwnr --help
rv version 6.1.0-rc4: help
usage: rv mon wwnr [-h] [-q] [-r reactor] [-s] [-v]
-h/--help: print this menu and the reactor list
-r/--reactor 'reactor': enables the 'reactor'
-s/--self: when tracing (-t), also trace rv command
-t/--trace: trace monitor's event
-v/--verbose: print debug messages
available reactors: nop printk panic
# rv mon wwnr --trace
<TASK>-PID [CPU] TYPE ID STATE x EVENT -> NEXT_STATE FINAL
| | | | | | | | |
rv-3613 [001] event 3613 running x switch_out -> not_running Y
sshd-1248 [005] event 1248 running x switch_out -> not_running Y
<idle>-0 [005] event 71 not_running x wakeup -> not_running Y
<idle>-0 [005] event 71 not_running x switch_in -> running N
kcompactd0-71 [005] event 71 running x switch_out -> not_running Y
<idle>-0 [000] event 860 not_running x wakeup -> not_running Y
<idle>-0 [000] event 860 not_running x switch_in -> running N
systemd-oomd-860 [000] event 860 running x switch_out -> not_running Y
<idle>-0 [000] event 860 not_running x wakeup -> not_running Y
<idle>-0 [000] event 860 not_running x switch_in -> running N
systemd-oomd-860 [000] event 860 running x switch_out -> not_running Y
<idle>-0 [005] event 71 not_running x wakeup -> not_running Y
<idle>-0 [005] event 71 not_running x switch_in -> running N
kcompactd0-71 [005] event 71 running x switch_out -> not_running Y
<idle>-0 [000] event 860 not_running x wakeup -> not_running Y
<idle>-0 [000] event 860 not_running x switch_in -> running N
systemd-oomd-860 [000] event 860 running x switch_out -> not_running Y
<idle>-0 [001] event 3613 not_running x wakeup -> not_running Y
Link: https://lkml.kernel.org/r/1e57547e3acadda6e23949b2672c89e76ec2ec42.1668180100.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
This is the (user-space) runtime verification tool, named rv.
This tool aims to be the interface for in-kernel rv monitors, as
well as the home for monitors in user-space (online asynchronous),
and in *eBPF.
The tool receives a command as the first argument, the current
commands are:
list - list all available monitors
mon - run a given monitor
Each monitor is an independent piece of software inside the
tool and can have their own arguments.
There is no monitor implemented in this patch, it only
adds the basic structure of the tool, based on rtla.
# rv --help
rv version 6.1.0-rc4: help
usage: rv command [-h] [command_options]
-h/--help: print this menu
command: run one of the following command:
list: list all available monitors
mon: run a monitor
[command options]: each command has its own set of options
run rv command -h for further information
*dot2bpf is the next patch set, depends on this, doing cleanups.
Link: https://lkml.kernel.org/r/fb51184f3b95aea0d7bfdc33ec09f4153aee84fa.1668180100.git.bristot@kernel.org
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>