[ Upstream commit 16160c1946 ]
Problem: perf did not show branch predicted/mispredicted bit in brstack.
Output of perf -F brstack for profile collected
Before:
0x4fdbcd/0x4fdc03/-/-/-/0
0x45f4c1/0x4fdba0/-/-/-/0
0x45f544/0x45f4bb/-/-/-/0
0x45f555/0x45f53c/-/-/-/0
0x7f66901cc24b/0x45f555/-/-/-/0
0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0
After:
0x4fdbcd/0x4fdc03/P/-/-/0
0x45f4c1/0x4fdba0/P/-/-/0
0x45f544/0x45f4bb/P/-/-/0
0x45f555/0x45f53c/P/-/-/0
0x7f66901cc24b/0x45f555/P/-/-/0
0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0
Cause:
As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.
Solution:
Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
Signed-off-by: Jacek Tomaka <jacek.tomaka@poczta.fm>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20180802013830.10600-1-jacekt@dugeo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 0592e57b24 ]
LBR has a limited stack size. If a task has a deeper call stack than
LBR's stack size, only the overflowed part is reported. A complete call
stack may not be reconstructed by perf tool.
Current code doesn't access all LBR registers. It only read the ones
below the TOS. The LBR registers above the TOS will be discarded
unconditionally.
When a CALL is captured, the TOS is incremented by 1 , modulo max LBR
stack size. The LBR HW only records the call stack information to the
register which the TOS points to. It will not touch other LBR
registers. So the registers above the TOS probably still store the valid
call stack information for an overflowed call stack, which need to be
reported.
To retrieve complete call stack information, we need to start from TOS,
read all LBR registers until an invalid entry is detected.
0s can be used to detect the invalid entry, because:
- When a RET is captured, the HW zeros the LBR register which TOS points
to, then decreases the TOS.
- The LBR registers are reset to 0 when adding a new LBR event or
scheduling an existing LBR event.
- A taken branch at IP 0 is not expected
The context switch code is also modified to save/restore all valid LBR
registers. Furthermore, the LBR registers, which don't have valid call
stack information, need to be reset in restore, because they may be
polluted while swapped out.
Here is a small test program, tchain_deep.
Its call stack is deeper than 32.
noinline void f33(void)
{
int i;
for (i = 0; i < 10000000;) {
if (i%2)
i++;
else
i++;
}
}
noinline void f32(void)
{
f33();
}
noinline void f31(void)
{
f32();
}
... ...
noinline void f1(void)
{
f2();
}
int main()
{
f1();
}
Here is the test result on SKX. The max stack size of SKX is 32.
Without the patch:
$ perf record -e cycles --call-graph lbr -- ./tchain_deep
$ perf report --stdio
#
# Children Self Command Shared Object Symbol
# ........ ........ ........... ................ .................
#
100.00% 99.99% tchain_deep tchain_deep [.] f33
|
--99.99%--f30
f31
f32
f33
With the patch:
$ perf record -e cycles --call-graph lbr -- ./tchain_deep
$ perf report --stdio
# Children Self Command Shared Object Symbol
# ........ ........ ........... ................ ..................
#
99.99% 0.00% tchain_deep tchain_deep [.] f1
|
---f1
f2
f3
f4
f5
f6
f7
f8
f9
f10
f11
f12
f13
f14
f15
f16
f17
f18
f19
f20
f21
f22
f23
f24
f25
f26
f27
f28
f29
f30
f31
f32
f33
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: eranian@google.com
Link: https://lore.kernel.org/lkml/1528213126-4312-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2c991e408d upstream.
Markus reported that BTS is sporadically missing the tail of the trace
in the perf_event data buffer: [decode error (1): instruction overflow]
shown in GDB; and bisected it to the conversion of debug_store to PTI.
A little "optimization" crept into alloc_bts_buffer(), which mistakenly
placed bts_interrupt_threshold away from the 24-byte record boundary.
Intel SDM Vol 3B 17.4.9 says "This address must point to an offset from
the BTS buffer base that is a multiple of the BTS record size."
Revert "max" from a byte count to a record count, to calculate the
bts_interrupt_threshold correctly: which turns out to fix problem seen.
Fixes: c1961a4631 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-and-tested-by: Markus T Metzger <markus.t.metzger@intel.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <andi.kleen@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: stable@vger.kernel.org # v4.14+
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.1807141248290.1614@eggly.anvils
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit d31fc13fdc ]
There is a bug when reading event->count with large PEBS enabled.
Here is an example:
# ./read_count
0x71f0
0x122c0
0x1000000001c54
0x100000001257d
0x200000000bdc5
In fixed period mode, the auto-reload mechanism could be enabled for
PEBS events, but the calculation of event->count does not take the
auto-reload values into account.
Anyone who reads event->count will get the wrong result, e.g x86_pmu_read().
This bug was introduced with the auto-reload mechanism enabled since
commit:
851559e35f ("perf/x86/intel: Use the PEBS auto reload mechanism when possible")
Introduce intel_pmu_save_and_restart_reload() to calculate the
event->count only for auto-reload.
Since the counter increments a negative counter value and overflows on
the sign switch, giving the interval:
[-period, 0]
the difference between two consequtive reads is:
A) value2 - value1;
when no overflows have happened in between,
B) (0 - value1) + (value2 - (-period));
when one overflow happened in between,
C) (0 - value1) + (n - 1) * (period) + (value2 - (-period));
when @n overflows happened in between.
Here A) is the obvious difference, B) is the extension to the discrete
interval, where the first term is to the top of the interval and the
second term is from the bottom of the next interval and C) the extension
to multiple intervals, where the middle term is the whole intervals
covered.
The equation for all cases is:
value2 - value1 + n * period
Previously the event->count is updated right before the sample output.
But for case A, there is no PEBS record ready. It needs to be specially
handled.
Remove the auto-reload code from x86_perf_event_set_period() since
we'll not longer call that function in this case.
Based-on-code-from: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Fixes: 851559e35f ("perf/x86/intel: Use the PEBS auto reload mechanism when possible")
Link: http://lkml.kernel.org/r/1518474035-21006-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 71eb9ee959 ]
this patch fix a bug in how the pebs->real_ip is handled in the PEBS
handler. real_ip only exists in Haswell and later processor. It is
actually the eventing IP, i.e., where the event occurred. As opposed
to the pebs->ip which is the PEBS interrupt IP which is always off
by one.
The problem is that the real_ip just like the IP needs to be fixed up
because PEBS does not record all the machine state registers, and
in particular the code segement (cs). This is why we have the set_linear_ip()
function. The problem was that set_linear_ip() was only used on the pebs->ip
and not the pebs->real_ip.
We have profiles which ran into invalid callstacks because of this.
Here is an example:
..... 0: ffffffffffffff80 recent entry, marker kernel v
..... 1: 000000000040044d <= user address in kernel space!
..... 2: fffffffffffffe00 marker enter user v
..... 3: 000000000040044d
..... 4: 00000000004004b6 oldest entry
Debugging output in get_perf_callchain():
[ 857.769909] CALLCHAIN: CPU8 ip=40044d regs->cs=10 user_mode(regs)=0
The problem is that the kernel entry in 1: points to a user level
address. How can that be?
The reason is that with PEBS sampling the instruction that caused the event
to occur and the instruction where the CPU was when the interrupt was posted
may be far apart. And sometime during that time window, the privilege level may
change. This happens, for instance, when the PEBS sample is taken close to a
kernel entry point. Here PEBS, eventing IP (real_ip) captured a user level
instruction. But by the time the PMU interrupt fired, the processor had already
entered kernel space. This is why the debug output shows a user address with
user_mode() false.
The problem comes from PEBS not recording the code segment (cs) register.
The register is used in x86_64 to determine if executing in kernel vs user
space. This is okay because the kernel has a software workaround called
set_linear_ip(). But the issue in setup_pebs_sample_data() is that
set_linear_ip() is never called on the real_ip value when it is available
(Haswell and later) and precise_ip > 1.
This patch fixes this problem and eliminates the callchain discrepancy.
The patch restructures the code around set_linear_ip() to minimize the number
of times the IP has to be set.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: kan.liang@intel.com
Link: http://lkml.kernel.org/r/1521788507-10231-1-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 7ad1437d6a ]
A recent commit introduced an extra merge_attr() call in the skylake
branch, which causes a memory leak.
Store the pointer to the extra allocated memory and free it at the end of
the function.
Fixes: a5df70c354 ("perf/x86: Only show format attributes when supported")
Reported-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 40d4071ce2 upstream.
The AMD power module can be loaded on non AMD platforms, but unload fails
with the following Oops:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __list_del_entry_valid+0x29/0x90
Call Trace:
perf_pmu_unregister+0x25/0xf0
amd_power_pmu_exit+0x1c/0xd23 [power]
SyS_delete_module+0x1a8/0x2b0
? exit_to_usermode_loop+0x8f/0xb0
entry_SYSCALL_64_fastpath+0x20/0x83
Return -ENODEV instead of 0 from the module init function if the CPU does
not match.
Fixes: c7ab62bfbe ("perf/x86/amd/power: Add AMD accumulated power reporting mechanism")
Signed-off-by: Xiao Liang <xiliang@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20180122061252.6394-1-xiliang@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 42f3bdc5dd upstream.
Thomas reported the following warning:
BUG: using smp_processor_id() in preemptible [00000000] code: ovsdb-server/4498
caller is native_flush_tlb_single+0x57/0xc0
native_flush_tlb_single+0x57/0xc0
__set_pte_vaddr+0x2d/0x40
set_pte_vaddr+0x2f/0x40
cea_set_pte+0x30/0x40
ds_update_cea.constprop.4+0x4d/0x70
reserve_ds_buffers+0x159/0x410
x86_reserve_hardware+0x150/0x160
x86_pmu_event_init+0x3e/0x1f0
perf_try_init_event+0x69/0x80
perf_event_alloc+0x652/0x740
SyS_perf_event_open+0x3f6/0xd60
do_syscall_64+0x5c/0x190
set_pte_vaddr is used to map the ds buffers into the cpu entry area, but
there are two problems with that:
1) The resulting flush is not supposed to be called in preemptible context
2) The cpu entry area is supposed to be per CPU, but the debug store
buffers are mapped for all CPUs so these mappings need to be flushed
globally.
Add the necessary preemption protection across the mapping code and flush
TLBs globally.
Fixes: c1961a4631 ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hugh Dickins <hughd@google.com>
Link: https://lkml.kernel.org/r/20180104170712.GB3040@hirez.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2fe1bc1f50 upstream.
[ Note, this is a Git cherry-pick of the following commit:
a47ba4d77e ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
... for easier x86 PTI code testing and back-porting. ]
Currently free running PEBS is disabled when user or interrupt
registers are requested. Most of the registers are actually
available in the PEBS record and can be supported.
So we just need to check for the supported registers and then
allow it: it is all except for the segment register.
For user registers this only works when the counter is limited
to ring 3 only, so this also needs to be checked.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170831214630.21892-1-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull initial SPDX identifiers from Greg KH:
"License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the
'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
binding shorthand, which can be used instead of the full boiler plate
text.
This patch is based on work done by Thomas Gleixner and Kate Stewart
and Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset
of the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to
license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied
to a file was done in a spreadsheet of side by side results from of
the output of two independent scanners (ScanCode & Windriver)
producing SPDX tag:value files created by Philippe Ombredanne.
Philippe prepared the base worksheet, and did an initial spot review
of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537
files assessed. Kate Stewart did a file by file comparison of the
scanner results in the spreadsheet to determine which SPDX license
identifier(s) to be applied to the file. She confirmed any
determination that was not immediately clear with lawyers working with
the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained
>5 lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that
became the concluded license(s).
- when there was disagreement between the two scanners (one detected
a license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply
(and which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases,
confirmation by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.
The Windriver scanner is based on an older version of FOSSology in
part, so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot
checks in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect
the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial
patch version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch
license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
applied SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
License cleanup: add SPDX license identifier to uapi header files with a license
License cleanup: add SPDX license identifier to uapi header files with no license
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note 270
GPL-2.0+ WITH Linux-syscall-note 169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17
LGPL-2.1+ WITH Linux-syscall-note 15
GPL-1.0+ WITH Linux-syscall-note 14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5
LGPL-2.0+ WITH Linux-syscall-note 4
LGPL-2.1 WITH Linux-syscall-note 3
((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3
((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pull perf fixes from Ingo Molnar:
"Some tooling fixes plus three kernel fixes: a memory leak fix, a
statistics fix and a crash fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Fix memory leaks on allocation failures
perf/core: Fix cgroup time when scheduling descendants
perf/core: Avoid freeing static PMU contexts when PMU is unregistered
tools include uapi bpf.h: Sync kernel ABI header with tooling header
perf pmu: Unbreak perf record for arm/arm64 with events with explicit PMU
perf script: Add missing separator for "-F ip,brstack" (and brstackoff)
perf callchain: Compare dsos (as well) for CCKEY_FUNCTION
Pull watchddog clean-up and fixes from Thomas Gleixner:
"The watchdog (hard/softlockup detector) code is pretty much broken in
its current state. The patch series addresses this by removing all
duct tape and refactoring it into a workable state.
The reasons why I ask for inclusion that late in the cycle are:
1) The code causes lockdep splats vs. hotplug locking which get
reported over and over. Unfortunately there is no easy fix.
2) The risk of breakage is minimal because it's already broken
3) As 4.14 is a long term stable kernel, I prefer to have working
watchdog code in that and the lockdep issues resolved. I wouldn't
ask you to pull if 4.14 wouldn't be a LTS kernel or if the
solution would be easy to backport.
4) The series was around before the merge window opened, but then got
delayed due to the UP failure caused by the for_each_cpu()
surprise which we discussed recently.
Changes vs. V1:
- Addressed your review points
- Addressed the warning in the powerpc code which was discovered late
- Changed two function names which made sense up to a certain point
in the series. Now they match what they do in the end.
- Fixed a 'unused variable' warning, which got not detected by the
intel robot. I triggered it when trying all possible related config
combinations manually. Randconfig testing seems not random enough.
The changes have been tested by and reviewed by Don Zickus and tested
and acked by Micheal Ellerman for powerpc"
* 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
watchdog/core: Put softlockup_threads_initialized under ifdef guard
watchdog/core: Rename some softlockup_* functions
powerpc/watchdog: Make use of watchdog_nmi_probe()
watchdog/core, powerpc: Lock cpus across reconfiguration
watchdog/core, powerpc: Replace watchdog_nmi_reconfigure()
watchdog/hardlockup/perf: Fix spelling mistake: "permanetely" -> "permanently"
watchdog/hardlockup/perf: Cure UP damage
watchdog/hardlockup: Clean up hotplug locking mess
watchdog/hardlockup/perf: Simplify deferred event destroy
watchdog/hardlockup/perf: Use new perf CPU enable mechanism
watchdog/hardlockup/perf: Implement CPU enable replacement
watchdog/hardlockup/perf: Implement init time detection of perf
watchdog/hardlockup/perf: Implement init time perf validation
watchdog/core: Get rid of the racy update loop
watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage
watchdog/sysctl: Clean up sysctl variable name space
watchdog/sysctl: Get rid of the #ifdeffery
watchdog/core: Clean up header mess
watchdog/core: Further simplify sysctl handling
watchdog/core: Get rid of the thread teardown/setup dance
...
Pull x86 cache quality monitoring update from Thomas Gleixner:
"This update provides a complete rewrite of the Cache Quality
Monitoring (CQM) facility.
The existing CQM support was duct taped into perf with a lot of issues
and the attempts to fix those turned out to be incomplete and
horrible.
After lengthy discussions it was decided to integrate the CQM support
into the Resource Director Technology (RDT) facility, which is the
obvious choise as in hardware CQM is part of RDT. This allowed to add
Memory Bandwidth Monitoring support on top.
As a result the mechanisms for allocating cache/memory bandwidth and
the corresponding monitoring mechanisms are integrated into a single
management facility with a consistent user interface"
* 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
x86/intel_rdt: Turn off most RDT features on Skylake
x86/intel_rdt: Add command line options for resource director technology
x86/intel_rdt: Move special case code for Haswell to a quirk function
x86/intel_rdt: Remove redundant ternary operator on return
x86/intel_rdt/cqm: Improve limbo list processing
x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug
x86/intel_rdt: Modify the intel_pqr_state for better performance
x86/intel_rdt/cqm: Clear the default RMID during hotcpu
x86/intel_rdt: Show bitmask of shareable resource with other executing units
x86/intel_rdt/mbm: Handle counter overflow
x86/intel_rdt/mbm: Add mbm counter initialization
x86/intel_rdt/mbm: Basic counting of MBM events (total and local)
x86/intel_rdt/cqm: Add CPU hotplug support
x86/intel_rdt/cqm: Add sched_in support
x86/intel_rdt: Introduce rdt_enable_key for scheduling
x86/intel_rdt/cqm: Add mount,umount support
x86/intel_rdt/cqm: Add rmdir support
x86/intel_rdt: Separate the ctrl bits from rmdir
x86/intel_rdt/cqm: Add mon_data
x86/intel_rdt: Prepare for RDT monitor data support
...
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Add branch type profiling/tracing support. (Jin Yao)
- Add the PERF_SAMPLE_PHYS_ADDR ABI to allow the tracing/profiling of
physical memory addresses, where the PMU supports it. (Kan Liang)
- Export some PMU capability details in the new
/sys/bus/event_source/devices/cpu/caps/ sysfs directory. (Andi
Kleen)
- Aux data fixes and updates (Will Deacon)
- kprobes fixes and updates (Masami Hiramatsu)
- AMD uncore PMU driver fixes and updates (Janakarajan Natarajan)
On the tooling side, here's a (limited!) list of highlights - there
were many other changes that I could not list, see the shortlog and
git history for details:
UI improvements:
- Implement a visual marker for fused x86 instructions in the
annotate TUI browser, available now in 'perf report', more work
needed to have it available as well in 'perf top' (Jin Yao)
Further explanation from one of Jin's patches:
│ ┌──cmpl $0x0,argp_program_version_hook
81.93 │ ├──je 20
│ │ lock cmpxchg %esi,0x38a9a4(%rip)
│ │↓ jne 29
│ │↓ jmp 43
11.47 │20:└─→cmpxch %esi,0x38a999(%rip)
That means the cmpl+je is a fused instruction pair and they should
be considered together.
- Record the branch type and then show statistics and info about in
callchain entries (Jin Yao)
Example from one of Jin's patches:
# perf record -g -j any,save_type
# perf report --branch-history --stdio --no-children
38.50% div.c:45 [.] main div
|
---main div.c:42 (RET CROSS_2M cycles:2)
compute_flag div.c:28 (cycles:2)
compute_flag div.c:27 (RET CROSS_2M cycles:1)
rand rand.c:28 (cycles:1)
rand rand.c:28 (RET CROSS_2M cycles:1)
__random random.c:298 (cycles:1)
__random random.c:297 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (COND_BWD CROSS_2M cycles:1)
__random random.c:295 (cycles:1)
__random random.c:295 (RET CROSS_2M cycles:9)
namespaces support:
- Add initial support for namespaces, using setns to access files in
namespaces, grabbing their build-ids, etc. (Krister Johansen)
perf trace enhancements:
- Beautify pkey_{alloc,free,mprotect} arguments in 'perf trace'
(Arnaldo Carvalho de Melo)
- Add initial 'clone' syscall args beautifier in 'perf trace'
(Arnaldo Carvalho de Melo)
- Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace'
(Arnaldo Carvalho de Melo)
- Beautifiers for the 'cmd' arg of several ioctl types, including:
sound, DRM, KVM, vhost virtio and perf_events. (Arnaldo Carvalho de
Melo)
- Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data'
CTF conversion, allowing CTF trace visualization tools to show
callchains and to resolve symbols (Geneviève Bastien)
- Beautify the fcntl syscall, which is an interesting one in the
sense that infrastructure had to be put in place to change the
formatters of some arguments according to the value in a previous
one, i.e. cmd dictates how arg and the syscall return will be
formatted. (Arnaldo Carvalho de Melo
perf stat enhancements:
- Use group read for event groups in 'perf stat', reducing overhead
when groups are defined in the event specification, i.e. when using
{} to enclose a list of events, asking them to be read at the same
time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa)
pipe mode improvements:
- Process tracing data in 'perf annotate' pipe mode (David
Carrillo-Cisneros)
- Add header record types to pipe-mode, now this command:
$ perf record -o - -e cycles sleep 1 | perf report --stdio --header
Will show the same as in non-pipe mode, i.e. involving a perf.data
file (David Carrillo-Cisneros)
Vendor specific hardware event support updates/enhancements:
- Update POWER9 vendor events tables (Sukadev Bhattiprolu)
- Add POWER9 PMU events Sukadev (Bhattiprolu)
- Support additional POWER8+ PVR in PMU mapfile (Shriya)
- Add Skylake server uncore JSON vendor events (Andi Kleen)
- Support exporting Intel PT data to sqlite3 with python perf
scripts, this is in addition to the postgresql support that was
already there (Adrian Hunter)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (253 commits)
perf symbols: Fix plt entry calculation for ARM and AARCH64
perf probe: Fix kprobe blacklist checking condition
perf/x86: Fix caps/ for !Intel
perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR
perf/core, pt, bts: Get rid of itrace_started
perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments
tools headers: Sync cpu features kernel ABI headers with tooling headers
perf tools: Pass full path of FEATURES_DUMP
perf tools: Robustify detection of clang binary
tools lib: Allow external definition of CC, AR and LD
perf tools: Allow external definition of flex and bison binary names
tools build tests: Don't hardcode gcc name
perf report: Group stat values on global event id
perf values: Zero value buffers
perf values: Fix allocation check
perf values: Fix thread index bug
perf report: Add dump_read function
perf record: Set read_format for inherit_stat
perf c2c: Fix remote HITM detection for Skylake
perf tools: Fix static build with newer toolchains
...
For understanding how the workload maps to memory channels and hardware
behavior, it's very important to collect address maps with physical
addresses. For example, 3D XPoint access can only be found by filtering
the physical address.
Add a new sample type for physical address.
perf already has a facility to collect data virtual address. This patch
introduces a function to convert the virtual address to physical address.
The function is quite generic and can be extended to any architecture as
long as a virtual address is provided.
- For kernel direct mapping addresses, virt_to_phys is used to convert
the virtual addresses to physical address.
- For user virtual addresses, __get_user_pages_fast is used to walk the
pages tables for user physical address.
- This does not work for vmalloc addresses right now. These are not
resolved, but code to do that could be added.
The new sample type requires collecting the virtual address. The
virtual address will not be output unless SAMPLE_ADDR is applied.
For security, the physical address can only be exposed to root or
privileged user.
Tested-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: acme@kernel.org
Cc: mpe@ellerman.id.au
Link: http://lkml.kernel.org/r/1503967969-48278-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I just noticed that hw.itrace_started and hw.config are aliased to the
same location. Now, the PT driver happens to use both, which works out
fine by sheer luck:
- STORE(hw.itrace_start) is ordered before STORE(hw.config), in the
program order, although there are no compiler barriers to ensure that,
- to the perf_log_itrace_start() hw.itrace_start looks set at the same
time as when it is intended to be set because both stores happen in the
same path,
- hw.config is never reset to zero in the PT driver.
Now, the use of hw.config by the PT driver makes more sense (it being a
HW PMU) than messing around with itrace_started, which is an awkward API
to begin with.
This patch replaces hw.itrace_started with an attach_state bit and an
API call for the PMU drivers to use to communicate the condition.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20170330153956.25994-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>