Evan Green <evan@rivosinc.com> says:
This change detects the presence of Zba, Zbb, and Zbs extensions and exports
them per-hart to userspace via the hwprobe mechanism. Glibc can then use
these in setting up hwcaps-based library search paths.
There's a little bit of extra housekeeping here: the first change adds
Zba and Zbs to the set of extensions the kernel recognizes, and the second
change starts tracking ISA features per-hart (in addition to the ANDed
mask of features across all harts which the kernel uses to make
decisions). Now that we track the ISA information per-hart, we could
even fix up /proc/cpuinfo to accurately report extension per-hart,
though I've left that out of this series for now.
* b4-shazam-merge:
RISC-V: hwprobe: Expose Zba, Zbb, and Zbs
RISC-V: Track ISA extensions per hart
RISC-V: Add Zba, Zbs extension probing
Link: https://lore.kernel.org/r/20230509182504.2997252-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The kernel maintains a mask of ISA extensions ANDed together across all
harts. Let's also keep a bitmap of ISA extensions for each CPU. Although
the kernel is currently unlikely to enable a feature that exists only on
some CPUs, we want the ability to report asymmetric CPU extensions
accurately to usermode.
Note that riscv_fill_hwcaps() runs before the per_cpu_offsets are built,
which is why I've used a [NR_CPUS] array rather than per_cpu() data.
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230509182504.2997252-3-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Andy Chiu <andy.chiu@sifive.com> says:
This is the v21 patch series for adding Vector extension support in
Linux. Please refer to [1] for the introduction of the patchset. The
v21 patch series was aimed to solve build issues from v19, provide usage
guideline for the prctl interface, and address review comments on v20.
Thank every one who has been reviewing, suggesting on the topic. Hope
this get a step closer to the final merge.
* b4-shazam-merge: (27 commits)
selftests: add .gitignore file for RISC-V hwprobe
selftests: Test RISC-V Vector prctl interface
riscv: Add documentation for Vector
riscv: Enable Vector code to be built
riscv: detect assembler support for .option arch
riscv: Add sysctl to set the default vector rule for new processes
riscv: Add prctl controls for userspace vector management
riscv: hwcap: change ELF_HWCAP to a function
riscv: KVM: Add vector lazy save/restore support
riscv: kvm: Add V extension to KVM ISA
riscv: prevent stack corruption by reserving task_pt_regs(p) early
riscv: signal: validate altstack to reflect Vector
riscv: signal: Report signal frame size to userspace via auxv
riscv: signal: Add sigcontext save/restore for vector
riscv: signal: check fp-reserved words unconditionally
riscv: Add ptrace vector support
riscv: Allocate user's vector context in the first-use trap
riscv: Add task switch support for vector
riscv: Introduce struct/helpers to save/restore per-task Vector state
riscv: Introduce riscv_v_vsize to record size of Vector context
...
Link: https://lore.kernel.org/r/20230605110724.21391-1-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
To support Vector extension, the series exports variable-length vector
registers on the signal frame. However, this potentially breaks abi if
processing vector registers is required in the signal handler for old
binaries. For example, there is such need if user-level context switch
is triggerred via signals[1].
For this reason, it is best to leave a decision to distro maintainers,
where the enablement of userspace Vector for new launching programs can
be controlled. Developers may also need the switch to experiment with.
The parameter is configurable through sysctl interface so a distro may
turn off Vector early at init script if the break really happens in the
wild.
The switch will only take effects on new execve() calls once set. This
will not effect existing processes that do not call execve(), nor
processes which has been set with a non-default vstate_ctrl by making
explicit PR_RISCV_V_SET_CONTROL prctl() calls.
Link: https://lore.kernel.org/all/87cz4048rp.fsf@all.your.base.are.belong.to.us/
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Vincent Chen <vincent.chen@sifive.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20230605110724.21391-23-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This patch add two riscv-specific prctls, to allow usespace control the
use of vector unit:
* PR_RISCV_V_SET_CONTROL: control the permission to use Vector at next,
or all following execve for a thread. Turning off a thread's Vector
live is not possible since libraries may have registered ifunc that
may execute Vector instructions.
* PR_RISCV_V_GET_CONTROL: get the same permission setting for the
current thread, and the setting for following execve(s).
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Vincent Chen <vincent.chen@sifive.com>
Link: https://lore.kernel.org/r/20230605110724.21391-22-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Using a function is flexible to represent ELF_HWCAP. So the kernel may
encode hwcap reflecting supported hardware features just at the moment of
the start of each program.
This will be helpful when we introduce prctl/sysctl interface to control
per-process availability of Vector extension in following patches.
Programs started with V disabled should see V masked off in theirs
ELF_HWCAP.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230605110724.21391-21-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Early function calls, such as setup_vm(), relocate_enable_mmu(),
soc_early_init() etc, are free to operate on stack. However,
PT_SIZE_ON_STACK bytes at the head of the kernel stack are purposedly
reserved for the placement of per-task register context pointed by
task_pt_regs(p). Those functions may corrupt task_pt_regs if we overlap
the $sp with it. In fact, we had accidentally corrupted sstatus.VS in some
tests, treating the kernel to save V context before V was actually
allocated, resulting in a kernel panic.
Thus, we should skip PT_SIZE_ON_STACK for $sp before making C function
calls from the top-level assembly.
Co-developed-by: ShihPo Hung <shihpo.hung@sifive.com>
Signed-off-by: ShihPo Hung <shihpo.hung@sifive.com>
Co-developed-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230605110724.21391-18-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Some extensions, such as Vector, dynamically change footprint on a
signal frame, so MINSIGSTKSZ is no longer accurate. For example, an
RV64V implementation with vlen = 512 may occupy 2K + 40 + 12 Bytes of a
signal frame with the upcoming support. And processes that do not
execute any vector instructions do not need to reserve the extra
sigframe. So we need a way to guard the allocation size of the sigframe
at process runtime according to current status of V.
Thus, provide the function sigaltstack_size_valid() to validate its size
based on current allocation status of supported extensions.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230605110724.21391-17-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The vector register belongs to the signal context. They need to be stored
and restored as entering and leaving the signal handler. According to the
V-extension specification, the maximum length of the vector registers can
be 2^16. Hence, if userspace refers to the MINSIGSTKSZ to create a
sigframe, it may not be enough. To resolve this problem, this patch refers
to the commit 94b07c1f8c
("arm64: signal: Report signal frame size to userspace via auxv") to enable
userspace to know the minimum required sigframe size through the auxiliary
vector and use it to allocate enough memory for signal context.
Note that auxv always reports size of the sigframe as if V exists for
all starting processes, whenever the kernel has CONFIG_RISCV_ISA_V. The
reason is that users usually reference this value to allocate an
alternative signal stack, and the user may use V anytime. So the user
must reserve a space for V-context in sigframe in case that the signal
handler invokes after the kernel allocating V.
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230605110724.21391-16-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This patch facilitates the existing fp-reserved words for placement of
the first extension's context header on the user's sigframe. A context
header consists of a distinct magic word and the size, including the
header itself, of an extension on the stack. Then, the frame is followed
by the context of that extension, and then a header + context body for
another extension if exists. If there is no more extension to come, then
the frame must be ended with a null context header. A special case is
rv64gc, where the kernel support no extensions requiring to expose
additional regfile to the user. In such case the kernel would place the
null context header right after the first reserved word of
__riscv_q_ext_state when saving sigframe. And the kernel would check if
all reserved words are zeros when a signal handler returns.
__riscv_q_ext_state---->| |<-__riscv_extra_ext_header
~ ~
.reserved[0]--->|0 |<- .reserved
<-------|magic |<- .hdr
| |size |_______ end of sc_fpregs
| |ext-bdy|
| ~ ~
+)size ------->|magic |<- another context header
|size |
|ext-bdy|
~ ~
|magic:0|<- null context header
|size:0 |
The vector registers will be saved in datap pointer. The datap pointer
will be allocated dynamically when the task needs in kernel space. On
the other hand, datap pointer on the sigframe will be set right after
the __riscv_v_ext_state data structure.
Co-developed-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Greentime Hu <greentime.hu@sifive.com>
Suggested-by: Vineet Gupta <vineetg@rivosinc.com>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Co-developed-by: Andy Chiu <andy.chiu@sifive.com>
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230605110724.21391-15-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
In order to let kernel/user locate and identify an extension context on
the existing sigframe, we are going to utilize reserved space of fp and
encode the information there. And since the sigcontext has already
preserved a space for fp context w or w/o CONFIG_FPU, we move those
reserved words checking/setting routine back into generic code.
This commit also undone an additional logical change carried by the
refactor commit 007f5c3589
("Refactor FPU code in signal setup/return procedures"). Originally we
did not restore fp context if restoring of gpr have failed. And it was
fine on the other side. In such way the kernel could keep the regfiles
intact, and potentially react at the failing point of restore.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Tested-by: Heiko Stuebner <heiko.stuebner@vrull.eu>
Link: https://lore.kernel.org/r/20230605110724.21391-14-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Yangyu Chen <cyy@cyyself.name> says:
This patchset allows case-insensitive ISA string parsing, which is
needed in the ACPI environment. As the RISC-V Hart Capabilities Table
(RHCT) description in UEFI Forum ECR[1] shows the format of the ISA
string is defined in the RISC-V unprivileged specification[2]. However,
the RISC-V unprivileged specification defines the ISA naming strings are
case-insensitive while the current ISA string parser in the kernel only
accepts lowercase letters. In this case, the kernel should allow
case-insensitive ISA string parsing. Moreover, this reason has been
discussed in Conor's patch[3]. And I have also checked the current ISA
string parsing in the recent ACPI support patch[4] will also call
`riscv_fill_hwcap` function as DT we use now.
The original motivation for my patch v1[5] is that some SoC generators
will provide generated DT with illegal ISA string in dt-binding such as
rocket-chip, which will even cause kernel panic in some cases as I
mentioned in v1[5]. Now, the rocket-chip has been fixed in PR #3333[6].
However, when using some specific version of rocket-chip with
illegal ISA string in DT, this patchset will also work for parsing
uppercase letters correctly in DT, thus will have better compatibility.
In summary, this patch not only works for case-insensitive ISA string
parsing to meet the requirements in ECR[1] but also can be a workaround
for some specific versions of rocket-chip.
* b4-shazam-merge:
dt-bindings: riscv: drop invalid comment about riscv,isa lower-case reasoning
riscv: allow case-insensitive ISA string parsing
Link: https://lore.kernel.org/r/tencent_E6911C8D71F5624E432A1AFDF86804C3B509@qq.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
According to RISC-V Hart Capabilities Table (RHCT) description in UEFI
Forum ECR, the format of the ISA string is defined in the RISC-V
unprivileged specification which is case-insensitive. However, the
current ISA string parser in the kernel does not support ISA strings
with uppercase letters.
This patch modifies the ISA string parser in the kernel to support
case-insensitive ISA string parsing.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Link: https://lore.kernel.org/r/tencent_B30EED51C7235CA1988890E5C658BE35C107@qq.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Pull RISC-V fixes from Palmer Dabbelt:
- A build warning fix for BUILTIN_DTB=y
- Hibernation support is hidden behind NONPORTABLE, as it depends on
some undocumented early boot behavior and breaks on most platforms
- A fix for relocatable kernels on systems with early boot errata
- A fix to properly handle perf callchains for kernel tracepoints
- A pair of fixes for NAPOT to avoid inconsistencies between PTEs and
handle hardware that sets arbitrary A/D bits
* tag 'riscv-for-linus-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Implement missing huge_ptep_get
riscv: Fix huge_ptep_set_wrprotect when PTE is a NAPOT
riscv: perf: Fix callchain parse error with kernel tracepoint events
riscv: Fix relocatable kernels with early alternatives using -fno-pie
RISC-V: mark hibernation as nonportable
riscv: Fix unused variable warning when BUILTIN_DTB is set
During boot we call riscv_of_processor_hartid() for each hart that we
add to the possible cpus list. Repeating the call again here is not
required, if we iterate over the list of possible CPUs, rather than the
list of all CPUs.
The call to of_property_read_string() for "riscv,isa" cannot fail
either, as it has previously succeeded in riscv_of_processor_hartid(),
but leaving in the error checking makes the operation of the loop more
obvious & provides leeway for future refactoring of
riscv_of_processor_hartid().
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Co-developed-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230515054928.2079268-14-sunilvl@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Enable ACPI core for RISC-V after adding architecture-specific
interfaces and header files required to build the ACPI core.
1) Couple of header files are required unconditionally by the ACPI
core. Add empty acenv.h and cpu.h header files.
2) If CONFIG_PCI is enabled, a few PCI related interfaces need to
be provided by the architecture. Define dummy interfaces for now
so that build succeeds. Actual implementation will be added when
PCI support is added for ACPI along with external interrupt
controller support.
3) A few globals and memory mapping related functions specific
to the architecture need to be provided.
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230515054928.2079268-7-sunilvl@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Pull probes fixes from Masami Hiramatsu:
- Initialize 'ret' local variables on fprobe_handler() to fix the
smatch warning. With this, fprobe function exit handler is not
working randomly.
- Fix to use preempt_enable/disable_notrace for rethook handler to
prevent recursive call of fprobe exit handler (which is based on
rethook)
- Fix recursive call issue on fprobe_kprobe_handler()
- Fix to detect recursive call on fprobe_exit_handler()
- Fix to make all arch-dependent rethook code notrace (the
arch-independent code is already notrace)"
* tag 'probes-fixes-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
rethook, fprobe: do not trace rethook related functions
fprobe: add recursion detection in fprobe_exit_handler
fprobe: make fprobe_kprobe_handler recursion free
rethook: use preempt_{disable, enable}_notrace in rethook_trampoline_handler
tracing: fprobe: Initialize ret valiable to fix smatch error
Pull more RISC-V updates from Palmer Dabbelt:
- Support for hibernation
- The .rela.dyn section has been moved to the init area
- A fix for the SBI probing to allow for implementation-defined
behavior
- Various other fixes and cleanups throughout the tree
* tag 'riscv-for-linus-6.4-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: include cpufeature.h in cpufeature.c
riscv: Move .rela.dyn to the init sections
dt-bindings: riscv: explicitly mention assumption of Zicsr & Zifencei support
riscv: compat_syscall_table: Fixup compile warning
RISC-V: fixup in-flight collision with ARCH_WANT_OPTIMIZE_VMEMMAP rename
RISC-V: fix sifive and thead section mismatches in errata
RISC-V: Align SBI probe implementation with spec
riscv: mm: remove redundant parameter of create_fdt_early_page_table
riscv: Adjust dependencies of HAVE_DYNAMIC_FTRACE selection
RISC-V: Add arch functions to support hibernation/suspend-to-disk
RISC-V: mm: Enable huge page support to kernel_page_present() function
RISC-V: Factor out common code of __cpu_resume_enter()
RISC-V: Change suspend_save_csrs and suspend_restore_csrs to public function
Automation complains:
warning: symbol '__pcpu_scope_misaligned_access_speed' was not declared. Should it be static?
cpufeature.c doesn't actually include the header of the same name, as it
had not previously used anything from it.
The per-cpu variable is declared there, so include it to silence the
complaints.
Fixes: 62a31d6e38 ("RISC-V: hwprobe: Support probing of misaligned access performance")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20230420-wound-gizzard-2b2b589d9bea@spud
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The recent introduction of relocatable kernels prepared the move of
.rela.dyn to the init section, but actually forgot to do so, so do it
here.
Before this patch: "Freeing unused kernel image (initmem) memory: 2592K"
After this patch: "Freeing unused kernel image (initmem) memory: 6288K"
The difference corresponds to the size of the .rela.dyn section:
"[42] .rela.dyn RELA ffffffff8197e798 0127f798
000000000039c660 0000000000000018 A 47 0 8"
Fixes: 559d1e45a1 ("riscv: Use --emit-relocs in order to move .rela.dyn in init")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230428120932.22735-1-alexghiti@rivosinc.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>