linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-20 16:52:28 +00:00

Author	SHA1	Message	Date
Linus Torvalds	4a1d8ababd	Merge tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: - The sub-architecture selection Kconfig system has been cleaned up, the documentation has been improved, and various detections have been fixed - The vector-related extensions dependencies are now validated when parsing from device tree and in the DT bindings - Misaligned access probing can be overridden via a kernel command-line parameter, along with various fixes to misalign access handling - Support for relocatable !MMU kernels builds - Support for hpge pfnmaps, which should improve TLB utilization - Support for runtime constants, which improves the d_hash() performance - Support for bfloat16, Zicbom, Zaamo, Zalrsc, Zicntr, Zihpm - Various fixes, including: - We were missing a secondary mmu notifier call when flushing the tlb which is required for IOMMU - Fix ftrace panics by saving the registers as expected by ftrace - Fix a couple of stimecmp usage related to cpu hotplug - purgatory_start is now aligned as per the STVEC requirements - A fix for hugetlb when calculating the size of non-present PTEs * tag 'riscv-for-linus-6.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (65 commits) riscv: Add norvc after .option arch in runtime const riscv: Make sure toolchain supports zba before using zba instructions riscv/purgatory: 4B align purgatory_start riscv/kexec_file: Handle R_RISCV_64 in purgatory relocator selftests: riscv: fix v_exec_initval_nolibc.c riscv: Fix hugetlb retrieval of number of ptes in case of !present pte riscv: print hartid on bringup riscv: Add norvc after .option arch in runtime const riscv: Remove CONFIG_PAGE_OFFSET riscv: Support CONFIG_RELOCATABLE on riscv32 asm-generic: Always define Elf_Rel and Elf_Rela riscv: Support CONFIG_RELOCATABLE on NOMMU riscv: Allow NOMMU kernels to access all of RAM riscv: Remove duplicate CONFIG_PAGE_OFFSET definition RISC-V: errata: Use medany for relocatable builds dt-bindings: riscv: document vector crypto requirements dt-bindings: riscv: add vector sub-extension dependencies dt-bindings: riscv: d requires f RISC-V: add f & d extension validation checks RISC-V: add vector crypto extension validation checks ...	2025-04-04 09:49:17 -07:00
Linus Torvalds	edb0e8f6e2	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "ARM: - Nested virtualization support for VGICv3, giving the nested hypervisor control of the VGIC hardware when running an L2 VM - Removal of 'late' nested virtualization feature register masking, making the supported feature set directly visible to userspace - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers - Paravirtual interface for discovering the set of CPU implementations where a VM may run, addressing a longstanding issue of guest CPU errata awareness in big-little systems and cross-implementation VM migration - Userspace control of the registers responsible for identifying a particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1), allowing VMs to be migrated cross-implementation - pKVM updates, including support for tracking stage-2 page table allocations in the protected hypervisor in the 'SecPageTable' stat - Fixes to vPMU, ensuring that userspace updates to the vPMU after KVM_RUN are reflected into the backing perf events LoongArch: - Remove unnecessary header include path - Assume constant PGD during VM context switch - Add perf events support for guest VM RISC-V: - Disable the kernel perf counter during configure - KVM selftests improvements for PMU - Fix warning at the time of KVM module removal x86: - Add support for aging of SPTEs without holding mmu_lock. Not taking mmu_lock allows multiple aging actions to run in parallel, and more importantly avoids stalling vCPUs. This includes an implementation of per-rmap-entry locking; aging the gfn is done with only a per-rmap single-bin spinlock taken, whereas locking an rmap for write requires taking both the per-rmap spinlock and the mmu_lock. Note that this decreases slightly the accuracy of accessed-page information, because changes to the SPTE outside aging might not use atomic operations even if they could race against a clear of the Accessed bit. This is deliberate because KVM and mm/ tolerate false positives/negatives for accessed information, and testing has shown that reducing the latency of aging is far more beneficial to overall system performance than providing "perfect" young/old information. - Defer runtime CPUID updates until KVM emulates a CPUID instruction, to coalesce updates when multiple pieces of vCPU state are changing, e.g. as part of a nested transition - Fix a variety of nested emulation bugs, and add VMX support for synthesizing nested VM-Exit on interception (instead of injecting #UD into L2) - Drop "support" for async page faults for protected guests that do not set SEND_ALWAYS (i.e. that only want async page faults at CPL3) - Bring a bit of sanity to x86's VM teardown code, which has accumulated a lot of cruft over the years. Particularly, destroy vCPUs before the MMU, despite the latter being a VM-wide operation - Add common secure TSC infrastructure for use within SNP and in the future TDX - Block KVM_CAP_SYNC_REGS if guest state is protected. It does not make sense to use the capability if the relevant registers are not available for reading or writing - Don't take kvm->lock when iterating over vCPUs in the suspend notifier to fix a largely theoretical deadlock - Use the vCPU's actual Xen PV clock information when starting the Xen timer, as the cached state in arch.hv_clock can be stale/bogus - Fix a bug where KVM could bleed PVCLOCK_GUEST_STOPPED across different PV clocks; restrict PVCLOCK_GUEST_STOPPED to kvmclock, as KVM's suspend notifier only accounts for kvmclock, and there's no evidence that the flag is actually supported by Xen guests - Clean up the per-vCPU "cache" of its reference pvclock, and instead only track the vCPU's TSC scaling (multipler+shift) metadata (which is moderately expensive to compute, and rarely changes for modern setups) - Don't write to the Xen hypercall page on MSR writes that are initiated by the host (userspace or KVM) to fix a class of bugs where KVM can write to guest memory at unexpected times, e.g. during vCPU creation if userspace has set the Xen hypercall MSR index to collide with an MSR that KVM emulates - Restrict the Xen hypercall MSR index to the unofficial synthetic range to reduce the set of possible collisions with MSRs that are emulated by KVM (collisions can still happen as KVM emulates Hyper-V MSRs, which also reside in the synthetic range) - Clean up and optimize KVM's handling of Xen MSR writes and xen_hvm_config - Update Xen TSC leaves during CPUID emulation instead of modifying the CPUID entries when updating PV clocks; there is no guarantee PV clocks will be updated between TSC frequency changes and CPUID emulation, and guest reads of the TSC leaves should be rare, i.e. are not a hot path x86 (Intel): - Fix a bug where KVM unnecessarily reads XFD_ERR from hardware and thus modifies the vCPU's XFD_ERR on a #NM due to CR0.TS=1 - Pass XFD_ERR as the payload when injecting #NM, as a preparatory step for upcoming FRED virtualization support - Decouple the EPT entry RWX protection bit macros from the EPT Violation bits, both as a general cleanup and in anticipation of adding support for emulating Mode-Based Execution Control (MBEC) - Reject KVM_RUN if userspace manages to gain control and stuff invalid guest state while KVM is in the middle of emulating nested VM-Enter - Add a macro to handle KVM's sanity checks on entry/exit VMCS control pairs in anticipation of adding sanity checks for secondary exit controls (the primary field is out of bits) x86 (AMD): - Ensure the PSP driver is initialized when both the PSP and KVM modules are built-in (the initcall framework doesn't handle dependencies) - Use long-term pins when registering encrypted memory regions, so that the pages are migrated out of MIGRATE_CMA/ZONE_MOVABLE and don't lead to excessive fragmentation - Add macros and helpers for setting GHCB return/error codes - Add support for Idle HLT interception, which elides interception if the vCPU has a pending, unmasked virtual IRQ when HLT is executed - Fix a bug in INVPCID emulation where KVM fails to check for a non-canonical address - Don't attempt VMRUN for SEV-ES+ guests if the vCPU's VMSA is invalid, e.g. because the vCPU was "destroyed" via SNP's AP Creation hypercall - Reject SNP AP Creation if the requested SEV features for the vCPU don't match the VM's configured set of features Selftests: - Fix again the Intel PMU counters test; add a data load and do CLFLUSH{OPT} on the data instead of executing code. The theory is that modern Intel CPUs have learned new code prefetching tricks that bypass the PMU counters - Fix a flaw in the Intel PMU counters test where it asserts that an event is counting correctly without actually knowing what the event counts on the underlying hardware - Fix a variety of flaws, bugs, and false failures/passes dirty_log_test, and improve its coverage by collecting all dirty entries on each iteration - Fix a few minor bugs related to handling of stats FDs - Add infrastructure to make vCPU and VM stats FDs available to tests by default (open the FDs during VM/vCPU creation) - Relax an assertion on the number of HLT exits in the xAPIC IPI test when running on a CPU that supports AMD's Idle HLT (which elides interception of HLT if a virtual IRQ is pending and unmasked)" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (216 commits) RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed RISC-V: KVM: Teardown riscv specific bits after kvm_exit LoongArch: KVM: Register perf callbacks for guest LoongArch: KVM: Implement arch-specific functions for guest perf LoongArch: KVM: Add stub for kvm_arch_vcpu_preempted_in_kernel() LoongArch: KVM: Remove PGD saving during VM context switch LoongArch: KVM: Remove unnecessary header include path KVM: arm64: Tear down vGIC on failed vCPU creation KVM: arm64: PMU: Reload when resetting KVM: arm64: PMU: Reload when user modifies registers KVM: arm64: PMU: Fix SET_ONE_REG for vPMC regs KVM: arm64: PMU: Assume PMU presence in pmu-emul.c KVM: arm64: PMU: Set raw values from user to PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} KVM: arm64: Create each pKVM hyp vcpu after its corresponding host vcpu KVM: arm64: Factor out pKVM hyp vcpu creation to separate function KVM: arm64: Initialize HCRX_EL2 traps in pKVM KVM: arm64: Factor out setting HCRX_EL2 traps into separate function KVM: x86: block KVM_CAP_SYNC_REGS if guest state is protected KVM: x86: Add infrastructure for secure TSC KVM: x86: Push down setting vcpu.arch.user_set_tsc ...	2025-03-25 14:22:07 -07:00
Linus Torvalds	a50b4fe095	Merge tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "A treewide hrtimer timer cleanup hrtimers are initialized with hrtimer_init() and a subsequent store to the callback pointer. This turned out to be suboptimal for the upcoming Rust integration and is obviously a silly implementation to begin with. This cleanup replaces the hrtimer_init(T); T->function = cb; sequence with hrtimer_setup(T, cb); The conversion was done with Coccinelle and a few manual fixups. Once the conversion has completely landed in mainline, hrtimer_init() will be removed and the hrtimer::function becomes a private member" * tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits) wifi: rt2x00: Switch to use hrtimer_update_function() io_uring: Use helper function hrtimer_update_function() serial: xilinx_uartps: Use helper function hrtimer_update_function() ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup() RDMA: Switch to use hrtimer_setup() virtio: mem: Switch to use hrtimer_setup() drm/vmwgfx: Switch to use hrtimer_setup() drm/xe/oa: Switch to use hrtimer_setup() drm/vkms: Switch to use hrtimer_setup() drm/msm: Switch to use hrtimer_setup() drm/i915/request: Switch to use hrtimer_setup() drm/i915/uncore: Switch to use hrtimer_setup() drm/i915/pmu: Switch to use hrtimer_setup() drm/i915/perf: Switch to use hrtimer_setup() drm/i915/gvt: Switch to use hrtimer_setup() drm/i915/huc: Switch to use hrtimer_setup() drm/amdgpu: Switch to use hrtimer_setup() stm class: heartbeat: Switch to use hrtimer_setup() i2c: Switch to use hrtimer_setup() iio: Switch to use hrtimer_setup() ...	2025-03-25 10:54:15 -07:00
Chao Du	b3f263a98d	RISC-V: KVM: Optimize comments in kvm_riscv_vcpu_isa_disable_allowed The comments for EXT_SVADE are a bit confusing. Clarify it. Signed-off-by: Chao Du <duchao@eswincomputing.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20250221025929.31678-1-duchao@eswincomputing.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-20 11:33:56 +05:30
Atish Patra	2d117e67f3	RISC-V: KVM: Teardown riscv specific bits after kvm_exit During a module removal, kvm_exit invokes arch specific disable call which disables AIA. However, we invoke aia_exit before kvm_exit resulting in the following warning. KVM kernel module can't be inserted afterwards due to inconsistent state of IRQ. [25469.031389] percpu IRQ 31 still enabled on CPU0! [25469.031732] WARNING: CPU: 3 PID: 943 at kernel/irq/manage.c:2476 __free_percpu_irq+0xa2/0x150 [25469.031804] Modules linked in: kvm(-) [25469.031848] CPU: 3 UID: 0 PID: 943 Comm: rmmod Not tainted 6.14.0-rc5-06947-g91c763118f47-dirty #2 [25469.031905] Hardware name: riscv-virtio,qemu (DT) [25469.031928] epc : __free_percpu_irq+0xa2/0x150 [25469.031976] ra : __free_percpu_irq+0xa2/0x150 [25469.032197] epc : ffffffff8007db1e ra : ffffffff8007db1e sp : ff2000000088bd50 [25469.032241] gp : ffffffff8131cef8 tp : ff60000080b96400 t0 : ff2000000088baf8 [25469.032285] t1 : fffffffffffffffc t2 : 5249207570637265 s0 : ff2000000088bd90 [25469.032329] s1 : ff60000098b21080 a0 : 037d527a15eb4f00 a1 : 037d527a15eb4f00 [25469.032372] a2 : 0000000000000023 a3 : 0000000000000001 a4 : ffffffff8122dbf8 [25469.032410] a5 : 0000000000000fff a6 : 0000000000000000 a7 : ffffffff8122dc10 [25469.032448] s2 : ff60000080c22eb0 s3 : 0000000200000022 s4 : 000000000000001f [25469.032488] s5 : ff60000080c22e00 s6 : ffffffff80c351c0 s7 : 0000000000000000 [25469.032582] s8 : 0000000000000003 s9 : 000055556b7fb490 s10: 00007ffff0e12fa0 [25469.032621] s11: 00007ffff0e13e9a t3 : ffffffff81354ac7 t4 : ffffffff81354ac7 [25469.032664] t5 : ffffffff81354ac8 t6 : ffffffff81354ac7 [25469.032698] status: 0000000200000100 badaddr: ffffffff8007db1e cause: 0000000000000003 [25469.032738] [<ffffffff8007db1e>] __free_percpu_irq+0xa2/0x150 [25469.032797] [<ffffffff8007dbfc>] free_percpu_irq+0x30/0x5e [25469.032856] [<ffffffff013a57dc>] kvm_riscv_aia_exit+0x40/0x42 [kvm] [25469.033947] [<ffffffff013b4e82>] cleanup_module+0x10/0x32 [kvm] [25469.035300] [<ffffffff8009b150>] __riscv_sys_delete_module+0x18e/0x1fc [25469.035374] [<ffffffff8000c1ca>] syscall_handler+0x3a/0x46 [25469.035456] [<ffffffff809ec9a4>] do_trap_ecall_u+0x72/0x134 [25469.035536] [<ffffffff809f5e18>] handle_exception+0x148/0x156 Invoke aia_exit and other arch specific cleanup functions after kvm_exit so that disable gets a chance to be called first before exit. Fixes: `54e43320c2` ("RISC-V: KVM: Initial skeletal support for AIA") Fixes: `eded6754f3` ("riscv: KVM: add basic support for host vs guest profiling") Signed-off-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20250317-kvm_exit_fix-v1-1-aa5240c5dbd2@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-19 20:21:38 +05:30
Clément Léger	2d79608cf6	RISC-V: KVM: Allow Zaamo/Zalrsc extensions for Guest/VM Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zaamo/Zalrsc extensions for Guest/VM. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240619153913.867263-5-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>	2025-03-19 12:03:46 +00:00
Atish Patra	bbb6224887	RISC-V: KVM: Disable the kernel perf counter during configure The perf event should be marked disabled during the creation as it is not ready to be scheduled until there is SBI PMU start call or config matching is called with auto start. Otherwise, event add/start gets called during perf_event_create_kernel_counter function. It will be enabled and scheduled to run via perf_event_enable during either the above mentioned scenario. Fixes: `0cb74b65d2` ("RISC-V: KVM: Implement perf support without sampling") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20250303-kvm_pmu_improve-v2-1-41d177e45929@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-03-06 09:57:07 +05:30
BillXiang	d252435aca	riscv: KVM: Remove unnecessary vcpu kick Remove the unnecessary kick to the vCPU after writing to the vs_file of IMSIC in kvm_riscv_vcpu_aia_imsic_inject. For vCPUs that are running, writing to the vs_file directly forwards the interrupt as an MSI to them and does not need an extra kick. For vCPUs that are descheduled after emulating WFI, KVM will enable the guest external interrupt for that vCPU in kvm_riscv_aia_wakeon_hgei. This means that writing to the vs_file will cause a guest external interrupt, which will cause KVM to wake up the vCPU in hgei_interrupt to handle the interrupt properly. Signed-off-by: BillXiang <xiangwencheng@lanxincomputing.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Radim Krčmář <rkrcmar@ventanamicro.com> Link: https://lore.kernel.org/r/20250221104538.2147-1-xiangwencheng@lanxincomputing.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-02-21 17:27:32 +05:30
Nam Cao	92051cb9d3	riscv: kvm: Switch to use hrtimer_setup() hrtimer_setup() takes the callback function pointer as argument and initializes the timer completely. Replace hrtimer_init() and the open coded initialization of hrtimer::function with the new setup mechanism. Patch was created by using Coccinelle. Signed-off-by: Nam Cao <namcao@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/d5ededf778f59f2fc38ff4276fb7f4c893e4142c.1738746821.git.namcao@linutronix.de	2025-02-18 10:32:31 +01:00
Andrew Jones	351e02b173	riscv: KVM: Fix SBI sleep_type use The spec says sleep_type is 32 bits wide and "In case the data is defined as 32bit wide, higher privilege software must ensure that it only uses 32 bit data." Mask off upper bits of sleep_type before using it. Fixes: `023c15151f` ("RISC-V: KVM: Add SBI system suspend support") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250217084506.18763-12-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-02-17 16:28:28 +05:30
Andrew Jones	b901484852	riscv: KVM: Fix SBI TIME error generation When an invalid function ID of an SBI extension is used we should return not-supported, not invalid-param. Fixes: `5f862df558` ("RISC-V: KVM: Add v0.1 replacement SBI extensions defined in v0.2") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250217084506.18763-11-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-02-17 16:28:28 +05:30
Andrew Jones	0611f78f83	riscv: KVM: Fix SBI IPI error generation When an invalid function ID of an SBI extension is used we should return not-supported, not invalid-param. Also, when we see that at least one hartid constructed from the base and mask parameters is invalid, then we should return invalid-param. Finally, rather than relying on overflowing a left shift to result in zero and then using that zero in a condition which [correctly] skips sending an IPI (but loops unnecessarily), explicitly check for overflow and exit the loop immediately. Fixes: `5f862df558` ("RISC-V: KVM: Add v0.1 replacement SBI extensions defined in v0.2") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250217084506.18763-10-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-02-17 16:28:28 +05:30
Andrew Jones	e3219b0c49	riscv: KVM: Fix hart suspend_type use The spec says suspend_type is 32 bits wide and "In case the data is defined as 32bit wide, higher privilege software must ensure that it only uses 32 bit data." Mask off upper bits of suspend_type before using it. Fixes: `763c8bed8c` ("RISC-V: KVM: Implement SBI HSM suspend call") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250217084506.18763-9-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-02-17 16:28:28 +05:30
Andrew Jones	c7db342e3b	riscv: KVM: Fix hart suspend status check "Not stopped" means started or suspended so we need to check for a single state in order to have a chance to check for each state. Also, we need to use target_vcpu when checking for the suspend state. Fixes: `763c8bed8c` ("RISC-V: KVM: Implement SBI HSM suspend call") Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250217084506.18763-8-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2025-02-17 16:28:27 +05:30
Atish Patra	af79caa83f	RISC-V: KVM: Add new exit statstics for redirected traps Currently, kvm doesn't delegate the few traps such as misaligned load/store, illegal instruction and load/store access faults because it is not expected to occur in the guest very frequently. Thus, kvm gets a chance to act upon it or collect statistics about it before redirecting the traps to the guest. Collect both guest and host visible statistics during the traps. Enable them so that both guest and host can collect the stats about them if required. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241224-kvm_guest_stat-v2-3-08a77ac36b02@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:02 +05:30
Atish Patra	2f15b5eaff	RISC-V: KVM: Update firmware counters for various events SBI PMU specification defines few firmware counters which can be used by the guests to collect the statstics about various traps occurred in the host. Update these counters whenever a corresponding trap is taken Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241224-kvm_guest_stat-v2-2-08a77ac36b02@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:02 +05:30
Quan Zhou	51c5895673	RISC-V: KVM: Redirect instruction access fault trap to guest The M-mode redirects an unhandled instruction access fault trap back to S-mode when not delegating it to VS-mode(hedeleg). However, KVM running in HS-mode terminates the VS-mode software when back from M-mode. The KVM should redirect the trap back to VS-mode, and let VS-mode trap handler decide the next step. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241224-kvm_guest_stat-v2-1-08a77ac36b02@rivosinc.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:02 +05:30
Quan Zhou	79be257b57	RISC-V: KVM: Allow Ziccrse extension for Guest/VM Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Ziccrse extension for Guest/VM. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/d10e746d165074174f830aa3d89bf3c92017acee.1732854096.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:02 +05:30
Quan Zhou	679e132c0a	RISC-V: KVM: Allow Zabha extension for Guest/VM Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zabha extension for Guest/VM. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/4074feb27819e23bab05b0fd6441a38bf0b6a5e2.1732854096.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:02 +05:30
Quan Zhou	0f89158597	RISC-V: KVM: Allow Svvptc extension for Guest/VM Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Svvptc extension for Guest/VM. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/133509ffe5783b62cf95e8f675cc3e327bee402e.1732854096.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:01 +05:30
Andrew Jones	023c15151f	RISC-V: KVM: Add SBI system suspend support Implement a KVM SBI SUSP extension handler. The handler only validates the system suspend entry criteria and prepares for resuming in the appropriate state at the resume_addr (as specified by the SBI spec), but then it forwards the call to the VMM where any system suspend behavior may be implemented. Since VMM support is needed, KVM disables the extension by default. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241017074538.18867-5-ajones@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-30 14:01:01 +05:30
Michael Neuling	ea6398a5af	RISC-V: KVM: Fix csr_write -> csr_set for HVIEN PMU overflow bit This doesn't cause a problem currently as HVIEN isn't used elsewhere yet. Found by inspection. Signed-off-by: Michael Neuling <michaelneuling@tenstorrent.com> Fixes: `16b0bde9a3` ("RISC-V: KVM: Add perf sampling support for guests") Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241127041840.419940-1-michaelneuling@tenstorrent.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-12-06 18:42:38 +05:30
Paolo Bonzini	4d911c7abe	Merge tag 'kvm-riscv-6.13-2' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.13 part #2 - Svade and Svadu extension support for Host and Guest/VM	2024-11-27 12:00:28 -05:00
Paolo Bonzini	c1668520c9	Merge tag 'riscv-for-linus-6.13-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux into HEAD RISC-V Paches for the 6.13 Merge Window, Part 1 * Support for pointer masking in userspace, * Support for probing vector misaligned access performance. * Support for qspinlock on systems with Zacas and Zabha.	2024-11-27 11:49:44 -05:00
Yong-Xuan Wang	97eccf7db4	RISC-V: KVM: Add Svade and Svadu Extensions Support for Guest/VM We extend the KVM ISA extension ONE_REG interface to allow VMM tools to detect and enable Svade and Svadu extensions for Guest/VM. Since the henvcfg.ADUE is read-only zero if the menvcfg.ADUE is zero, the Svadu extension is available for Guest/VM and the Svade extension is allowed to disabledonly when arch_has_hw_pte_young() is true. Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20240726084931.28924-4-yongxuan.wang@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-11-21 17:40:14 +05:30
Palmer Dabbelt	64f7b77f0b	Merge patch series "Zacas/Zabha support and qspinlocks" Alexandre Ghiti <alexghiti@rivosinc.com> says: This implements [cmp]xchgXX() macros using Zacas and Zabha extensions and finally uses those newly introduced macros to add support for qspinlocks: note that this implementation of qspinlocks satisfies the forward progress guarantee. It also uses Ziccrse to provide the qspinlock implementation. Thanks to Guo and Leonardo for their work! * b4-shazam-merge: (1314 commits) riscv: Add qspinlock support dt-bindings: riscv: Add Ziccrse ISA extension description riscv: Add ISA extension parsing for Ziccrse asm-generic: ticket-lock: Add separate ticket-lock.h asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock riscv: Implement xchg8/16() using Zabha riscv: Implement arch_cmpxchg128() using Zacas riscv: Improve zacas fully-ordered cmpxchg() riscv: Implement cmpxchg8/16() using Zabha dt-bindings: riscv: Add Zabha ISA extension description riscv: Implement cmpxchg32/64() using Zacas riscv: Do not fail to build on byte/halfword operations with Zawrs riscv: Move cpufeature.h macros into their own header Link: https://lore.kernel.org/r/20241103145153.105097-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-11-11 07:35:09 -08:00
Paolo Bonzini	e3e0f9b7ae	Merge tag 'kvm-riscv-6.13-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.13 - Accelerate KVM RISC-V when running as a guest - Perf support to collect KVM guest statistics from host side	2024-11-08 12:13:48 -05:00
Björn Töpel	332fa4a802	riscv: kvm: Fix out-of-bounds array access In kvm_riscv_vcpu_sbi_init() the entry->ext_idx can contain an out-of-bound index. This is used as a special marker for the base extensions, that cannot be disabled. However, when traversing the extensions, that special marker is not checked prior indexing the array. Add an out-of-bounds check to the function. Fixes: `56d8a385b6` ("RISC-V: KVM: Allow some SBI extensions to be disabled by default") Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241104191503.74725-1-bjorn@kernel.org Signed-off-by: Anup Patel <anup@brainfault.org>	2024-11-05 13:27:32 +05:30
Yong-Xuan Wang	60821fb4dd	RISC-V: KVM: Fix APLIC in_clrip and clripnum write emulation In the section "4.7 Precise effects on interrupt-pending bits" of the RISC-V AIA specification defines that: "If the source mode is Level1 or Level0 and the interrupt domain is configured in MSI delivery mode (domaincfg.DM = 1): The pending bit is cleared whenever the rectified input value is low, when the interrupt is forwarded by MSI, or by a relevant write to an in_clrip register or to clripnum." Update the aplic_write_pending() to match the spec. Fixes: `d8dd9f113e` ("RISC-V: KVM: Fix APLIC setipnum_le/be write emulation") Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Vincent Chen <vincent.chen@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241029085542.30541-1-yongxuan.wang@sifive.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-11-05 13:27:28 +05:30
Anup Patel	5bdecd891e	RISC-V: KVM: Use NACL HFENCEs for KVM request based HFENCEs When running under some other hypervisor, use SBI NACL based HFENCEs for TLB shoot-down via KVM requests. This makes HFENCEs faster whenever SBI nested acceleration is available. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-14-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:44:08 +05:30
Anup Patel	3e7d154ad8	RISC-V: KVM: Save trap CSRs in kvm_riscv_vcpu_enter_exit() Save trap CSRs in the kvm_riscv_vcpu_enter_exit() function instead of the kvm_arch_vcpu_ioctl_run() function so that HTVAL and HTINST CSRs are accessed in more optimized manner while running under some other hypervisor. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-13-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:44:05 +05:30
Anup Patel	68c72a6557	RISC-V: KVM: Use SBI sync SRET call when available Implement an optimized KVM world-switch using SBI sync SRET call when SBI nested acceleration extension is available. This improves KVM world-switch when KVM RISC-V is running as a Guest under some other hypervisor. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-12-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:44:03 +05:30
Anup Patel	dab55604ae	RISC-V: KVM: Use nacl_csr_xyz() for accessing AIA CSRs When running under some other hypervisor, prefer nacl_csr_xyz() for accessing AIA CSRs in the run-loop. This makes CSR access faster whenever SBI nested acceleration is available. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-11-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:44:01 +05:30
Anup Patel	e28e6b6976	RISC-V: KVM: Use nacl_csr_xyz() for accessing H-extension CSRs When running under some other hypervisor, prefer nacl_csr_xyz() for accessing H-extension CSRs in the run-loop. This makes CSR access faster whenever SBI nested acceleration is available. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-10-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:59 +05:30
Anup Patel	d466c19cea	RISC-V: KVM: Add common nested acceleration support Add a common nested acceleration support which will be shared by all parts of KVM RISC-V. This nested acceleration support detects and enables SBI NACL extension usage based on static keys which ensures minimum impact on the non-nested scenario. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-9-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:57 +05:30
Anup Patel	15ff2ff3c3	RISC-V: KVM: Don't setup SGEI for zero guest external interrupts No need to setup SGEI local interrupt when there are zero guest external interrupts (i.e. zero HW IMSIC guest files). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-7-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:53 +05:30
Anup Patel	5d8f7ee928	RISC-V: KVM: Replace aia_set_hvictl() with aia_hvictl_value() The aia_set_hvictl() internally writes the HVICTL CSR which makes it difficult to optimize the CSR write using SBI NACL extension for kvm_riscv_vcpu_aia_update_hvip() function so replace aia_set_hvictl() with new aia_hvictl_value() which only computes the HVICTL value. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-6-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:50 +05:30
Anup Patel	8f57adac39	RISC-V: KVM: Break down the __kvm_riscv_switch_to() into macros Break down the __kvm_riscv_switch_to() function into macros so that these macros can be later re-used by SBI NACL extension based low-level switch function. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-5-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:45 +05:30
Anup Patel	b922307a5f	RISC-V: KVM: Save/restore SCOUNTEREN in C source The SCOUNTEREN CSR need not be saved/restored in the low-level __kvm_riscv_switch_to() function hence move the SCOUNTEREN CSR save/restore to the kvm_riscv_vcpu_swap_in_guest_state() and kvm_riscv_vcpu_swap_in_host_state() functions in C sources. Also, re-arrange the CSR save/restore and related GPR usage in the low-level __kvm_riscv_switch_to() low-level function. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-4-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:43 +05:30
Anup Patel	b6114a7e24	RISC-V: KVM: Save/restore HSTATUS in C source We will be optimizing HSTATUS CSR access via shared memory setup using the SBI nested acceleration extension. To facilitate this, we first move HSTATUS save/restore in kvm_riscv_vcpu_enter_exit(). Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-3-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:40 +05:30
Anup Patel	e403a90ad6	RISC-V: KVM: Order the object files alphabetically Order the object files alphabetically in the Makefile so that it is very predictable inserting new object files in the future. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Link: https://lore.kernel.org/r/20241020194734.58686-2-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:43:37 +05:30
Quan Zhou	eded6754f3	riscv: KVM: add basic support for host vs guest profiling For the information collected on the host side, we need to identify which data originates from the guest and record these events separately, this can be achieved by having KVM register perf callbacks. Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/00342d535311eb0629b9ba4f1e457a48e2abee33.1728957131.git.zhouquan@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>	2024-10-28 16:41:14 +05:30
Sean Christopherson	334511d468	KVM: RISC-V: Use kvm_faultin_pfn() when mapping pfns into the guest Convert RISC-V to __kvm_faultin_pfn()+kvm_release_faultin_page(), which are new APIs to consolidate arch code and provide consistent behavior across all KVM architectures. Opportunisticaly fix a s/priort/prior typo in the related comment. Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-60-seanjc@google.com>	2024-10-25 13:00:48 -04:00
Sean Christopherson	9c902aee68	KVM: RISC-V: Mark "struct page" pfns accessed before dropping mmu_lock Mark pages accessed before dropping mmu_lock when faulting in guest memory so that RISC-V can convert to kvm_release_faultin_page() without tripping its lockdep assertion on mmu_lock being held. Marking pages accessed outside of mmu_lock is ok (not great, but safe), but marking pages _dirty_ outside of mmu_lock can make filesystems unhappy (see the link below). Do both under mmu_lock to minimize the chances of doing the wrong thing in the future. Link: https://lore.kernel.org/all/cover.1683044162.git.lstoakes@gmail.com Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-59-seanjc@google.com>	2024-10-25 13:00:48 -04:00
Sean Christopherson	9b3639bb02	KVM: RISC-V: Mark "struct page" pfns dirty iff a stage-2 PTE is installed Don't mark pages dirty if KVM bails from the page fault handler without installing a stage-2 mapping, i.e. if the page is guaranteed to not be written by the guest. In addition to being a (very) minor fix, this paves the way for converting RISC-V to use kvm_release_faultin_page(). Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Acked-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20241010182427.1434605-58-seanjc@google.com>	2024-10-25 13:00:48 -04:00
Samuel Holland	1851e78362	RISC-V: KVM: Allow Smnpm and Ssnpm extensions for guests The interface for controlling pointer masking in VS-mode is henvcfg.PMM, which is part of the Ssnpm extension, even though pointer masking in HS-mode is provided by the Smnpm extension. As a result, emulating Smnpm in the guest requires (only) Ssnpm on the host. The guest configures Smnpm through the SBI Firmware Features extension, which KVM does not yet implement, so currently the ISA extension has no visible effect on the guest, and thus it cannot be disabled. Ssnpm is configured using the senvcfg CSR within the guest, so that extension cannot be hidden from the guest without intercepting writes to the CSR. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20241016202814.4061541-10-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-10-24 14:13:00 -07:00
Cyan Yang	3ec4350d4e	RISCV: KVM: use raw_spinlock for critical section in imsic For the external interrupt updating procedure in imsic, there was a spinlock to protect it already. But since it should not be preempted in any cases, we should turn to use raw_spinlock to prevent any preemption in case PREEMPT_RT was enabled. Signed-off-by: Cyan Yang <cyan.yang@sifive.com> Reviewed-by: Yong-Xuan Wang <yongxuan.wang@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Message-ID: <20240919160126.44487-1-cyan.yang@sifive.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-10-20 12:10:44 -04:00
Paolo Bonzini	c09dd2bb57	Merge branch 'kvm-redo-enable-virt' into HEAD Register KVM's cpuhp and syscore callbacks when enabling virtualization in hardware, as the sole purpose of said callbacks is to disable and re-enable virtualization as needed. The primary motivation for this series is to simplify dealing with enabling virtualization for Intel's TDX, which needs to enable virtualization when kvm-intel.ko is loaded, i.e. long before the first VM is created. That said, this is a nice cleanup on its own. By registering the callbacks on-demand, the callbacks themselves don't need to check kvm_usage_count, because their very existence implies a non-zero count. Patch 1 (re)adds a dedicated lock for kvm_usage_count. This avoids a lock ordering issue between cpus_read_lock() and kvm_lock. The lock ordering issue still exist in very rare cases, and will be fixed for good by switching vm_list to an (S)RCU-protected list. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-09-17 11:38:20 -04:00
Sean Christopherson	071f24ad28	KVM: Rename arch hooks related to per-CPU virtualization enabling Rename the per-CPU hooks used to enable virtualization in hardware to align with the KVM-wide helpers in kvm_main.c, and to better capture that the callbacks are invoked on every online CPU. No functional change intended. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> Message-ID: <20240830043600.127750-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-09-04 11:02:33 -04:00
Anup Patel	47d40d9329	RISC-V: KVM: Don't zero-out PMU snapshot area before freeing data With the latest Linux-6.11-rc3, the below NULL pointer crash is observed when SBI PMU snapshot is enabled for the guest and the guest is forcefully powered-off. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000508 Oops [#1] Modules linked in: kvm CPU: 0 UID: 0 PID: 61 Comm: term-poll Not tainted 6.11.0-rc3-00018-g44d7178dd77a #3 Hardware name: riscv-virtio,qemu (DT) epc : __kvm_write_guest_page+0x94/0xa6 [kvm] ra : __kvm_write_guest_page+0x54/0xa6 [kvm] epc : ffffffff01590e98 ra : ffffffff01590e58 sp : ffff8f80001f39b0 gp : ffffffff81512a60 tp : ffffaf80024872c0 t0 : ffffaf800247e000 t1 : 00000000000007e0 t2 : 0000000000000000 s0 : ffff8f80001f39f0 s1 : 00007fff89ac4000 a0 : ffffffff015dd7e8 a1 : 0000000000000086 a2 : 0000000000000000 a3 : ffffaf8000000000 a4 : ffffaf80024882c0 a5 : 0000000000000000 a6 : ffffaf800328d780 a7 : 00000000000001cc s2 : ffffaf800197bd00 s3 : 00000000000828c4 s4 : ffffaf800248c000 s5 : ffffaf800247d000 s6 : 0000000000001000 s7 : 0000000000001000 s8 : 0000000000000000 s9 : 00007fff861fd500 s10: 0000000000000001 s11: 0000000000800000 t3 : 00000000000004d3 t4 : 00000000000004d3 t5 : ffffffff814126e0 t6 : ffffffff81412700 status: 0000000200000120 badaddr: 0000000000000508 cause: 000000000000000d [<ffffffff01590e98>] __kvm_write_guest_page+0x94/0xa6 [kvm] [<ffffffff015943a6>] kvm_vcpu_write_guest+0x56/0x90 [kvm] [<ffffffff015a175c>] kvm_pmu_clear_snapshot_area+0x42/0x7e [kvm] [<ffffffff015a1972>] kvm_riscv_vcpu_pmu_deinit.part.0+0xe0/0x14e [kvm] [<ffffffff015a2ad0>] kvm_riscv_vcpu_pmu_deinit+0x1a/0x24 [kvm] [<ffffffff0159b344>] kvm_arch_vcpu_destroy+0x28/0x4c [kvm] [<ffffffff0158e420>] kvm_destroy_vcpus+0x5a/0xda [kvm] [<ffffffff0159930c>] kvm_arch_destroy_vm+0x14/0x28 [kvm] [<ffffffff01593260>] kvm_destroy_vm+0x168/0x2a0 [kvm] [<ffffffff015933d4>] kvm_put_kvm+0x3c/0x58 [kvm] [<ffffffff01593412>] kvm_vm_release+0x22/0x2e [kvm] Clearly, the kvm_vcpu_write_guest() function is crashing because it is being called from kvm_pmu_clear_snapshot_area() upon guest tear down. To address the above issue, simplify the kvm_pmu_clear_snapshot_area() to not zero-out PMU snapshot area from kvm_pmu_clear_snapshot_area() because the guest is anyway being tore down. The kvm_pmu_clear_snapshot_area() is also called when guest changes PMU snapshot area of a VCPU but even in this case the previous PMU snaphsot area must not be zeroed-out because the guest might have reclaimed the pervious PMU snapshot area for some other purpose. Fixes: `c2f41ddbcd` ("RISC-V: KVM: Implement SBI PMU Snapshot feature") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Link: https://lore.kernel.org/r/20240815170907.2792229-1-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>	2024-08-19 08:58:17 +05:30

1 2 3 4 5 ...

341 Commits