commit da5a3ce66b upstream.
At boot time, KVM stashes the host MDCR_EL2 value, but only does this
when the kernel is not running in hyp mode (i.e. is non-VHE). In these
cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can
lead to CONSTRAINED UNPREDICTABLE behaviour.
Since we use this value to derive the MDCR_EL2 value when switching
to/from a guest, after a guest have been run, the performance counters
do not behave as expected. This has been observed to result in accesses
via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant
counters, resulting in events not being counted. In these cases, only
the fixed-purpose cycle counter appears to work as expected.
Fix this by always stashing the host MDCR_EL2 value, regardless of VHE.
Cc: Christopher Dall <christoffer.dall@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Fixes: 1e947bad0b ("arm64: KVM: Skip HYP setup when already running in HYP")
Tested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 55e3748e89 upstream.
In order to offer ARCH_WORKAROUND_2 support to guests, we need
a bit of infrastructure.
Let's add a flag indicating whether or not the guest uses
SSBD mitigation. Depending on the state of this flag, allow
KVM to disable ARCH_WORKAROUND_2 before entering the guest,
and enable it when exiting it.
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 36989e7fd3 upstream.
kvm_host_cpu_state is a per-cpu allocation made from kvm_arch_init()
used to store the host EL1 registers when KVM switches to a guest.
Make it easier for ASM to generate pointers into this per-cpu memory
by making it a static allocation.
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f0cf47d939 upstream.
Before entering the guest, we check whether our VMID is still
part of the current generation. In order to avoid taking a lock,
we start with checking that the generation is still current, and
only if not current do we take the lock, recheck, and update the
generation and VMID.
This leaves open a small race: A vcpu can bump up the global
generation number as well as the VM's, but has not updated
the VMID itself yet.
At that point another vcpu from the same VM comes in, checks
the generation (and finds it not needing anything), and jumps
into the guest. At this point, we end-up with two vcpus belonging
to the same VM running with two different VMIDs. Eventually, the
VMID used by the second vcpu will get reassigned, and things will
really go wrong...
A simple solution would be to drop this initial check, and always take
the lock. This is likely to cause performance issues. A middle ground
is to convert the spinlock to a rwlock, and only take the read lock
on the fast path. If the check fails at that point, drop it and
acquire the write lock, rechecking the condition.
This ensures that the above scenario doesn't occur.
Cc: stable@vger.kernel.org
Reported-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 58d0d19a20 ]
Since it is perfectly legal to run the kernel at EL1, it is not
actually an error if HYP mode is not available when attempting to
initialize KVM, given that KVM support cannot be built as a module.
So demote the kvm_err() to kvm_info(), which prevents the error from
appearing on an otherwise 'quiet' console.
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 58d6b15e9d upstream.
cpu_pm_enter() calls the pm notifier chain with CPU_PM_ENTER, then if
there is a failure: CPU_PM_ENTER_FAILED.
When KVM receives CPU_PM_ENTER it calls cpu_hyp_reset() which will
return us to the hyp-stub. If we subsequently get a CPU_PM_ENTER_FAILED,
KVM does nothing, leaving the CPU running with the hyp-stub, at odds
with kvm_arm_hardware_enabled.
Add CPU_PM_ENTER_FAILED as a fallthrough for CPU_PM_EXIT, this reloads
KVM based on kvm_arm_hardware_enabled. This is safe even if CPU_PM_ENTER
never gets as far as KVM, as cpu_hyp_reinit() calls cpu_hyp_reset()
to make sure the hyp-stub is loaded before reloading KVM.
Fixes: 67f6919766 ("arm64: kvm: allows kvm cpu hotplug")
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 20b7035c66 ]
KVM API says for the signal mask you set via KVM_SET_SIGNAL_MASK, that
"any unblocked signal received [...] will cause KVM_RUN to return with
-EINTR" and that "the signal will only be delivered if not blocked by
the original signal mask".
This, however, is only true, when the calling task has a signal handler
registered for a signal. If not, signal evaluation is short-circuited for
SIG_IGN and SIG_DFL, and the signal is either ignored without KVM_RUN
returning or the whole process is terminated.
Make KVM_SET_SIGNAL_MASK behave as advertised by utilizing logic similar
to that in do_sigtimedwait() to avoid short-circuiting of signals.
Signed-off-by: Jan H. Schönherr <jschoenh@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
KVM is being a bit too optimistic, Hyp mode is said to be initialized
when Hyp segments have only been mapped.
Notify KVM's successful initialization only once it is really fully
initialized.
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
This implements the kvm_arch_vcpu_in_kernel() for ARM, and adjusts
the calls to kvm_vcpu_on_spin().
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a vcpu exits due to request a user mode spinlock, then
the spinlock-holder may be preempted in user mode or kernel mode.
(Note that not all architectures trap spin loops in user mode,
only AMD x86 and ARM/ARM64 currently do).
But if a vcpu exits in kernel mode, then the holder must be
preempted in kernel mode, so we should choose a vcpu in kernel mode
as a more likely candidate for the lock holder.
This introduces kvm_arch_vcpu_in_kernel() to decide whether the
vcpu is in kernel-mode when it's preempted. kvm_vcpu_on_spin's
new argument says the same of the spinning VCPU.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When reading the cntpct_el0 in guest with VHE (Virtual Host Extension)
enabled in host, the "Unsupported guest sys_reg access" error reported.
The reason is cnthctl_el2.EL1PCTEN is not enabled, which is expected
to be done in kvm_timer_init_vhe(). The problem is kvm_timer_init_vhe
is called by cpu_init_hyp_mode, and which is called when VHE is disabled.
This patch remove the incorrect call to kvm_timer_init_vhe() from
cpu_init_hyp_mode(), and calls kvm_timer_init_vhe() to enable
cnthctl_el2.EL1PCTEN in cpu_hyp_reinit().
Fixes: 488f94d721 ("KVM: arm64: Access CNTHCTL_EL2 bit fields correctly on VHE systems")
Cc: stable@vger.kernel.org
Signed-off-by: Hu Huajun <huhuajun@huawei.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
When injecting an IRQ to the VGIC, you now have to present an owner
token for that IRQ line to show that you are the owner of that line.
IRQ lines driven from userspace or via an irqfd do not have an owner and
will simply pass a NULL pointer.
Also get rid of the unused kvm_vgic_inject_mapped_irq prototype.
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Since we got support for devices in userspace which allows reporting the
PMU overflow output status to userspace, we should actually allow
creating the PMU on systems without an in-kernel irqchip, which in turn
requires us to slightly clarify error codes for the ABI and move things
around for the initialization phase.
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Don't use request-less VCPU kicks when injecting IRQs, as a VCPU
kick meant to trigger the interrupt injection could be sent while
the VCPU is outside guest mode, which means no IPI is sent, and
after it has called kvm_vgic_flush_hwstate(), meaning it won't see
the updated GIC state until its next exit some time later for some
other reason. The receiving VCPU only needs to check this request
in VCPU RUN to handle it. By checking it, if it's pending, a
memory barrier will be issued that ensures all state is visible.
See "Ensuring Requests Are Seen" of
Documentation/virtual/kvm/vcpu-requests.rst
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
A request called EXIT is too generic. All requests are meant to cause
exits, but different requests have different flags. Let's not make
it difficult to decide if the EXIT request is correct for some case
by just always providing unique requests for each case. This patch
changes EXIT to SLEEP, because that's what the request is asking the
VCPU to do.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
We can make a small optimization by not checking the state of
the power_off field on each run. This is done by treating
power_off like pause, only checking it when we get the EXIT
VCPU request. When a VCPU powers off another VCPU the EXIT
request is already made, so we just need to make sure the
request is also made on self power off. kvm_vcpu_kick() isn't
necessary for these cases, as the VCPU would just be kicking
itself, but we add it anyway as a self kick doesn't cost much,
and it makes the code more future-proof.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
The current use of KVM_REQ_VCPU_EXIT for pause is fine. Even the
requester clearing the request is OK, as this is the special case
where the sole requesting thread and receiving VCPU are executing
synchronously (see "Clearing Requests" in
Documentation/virtual/kvm/vcpu-requests.rst) However, that's about
to change, so let's ensure only the receiving VCPU clears the
request. Additionally, by guaranteeing KVM_REQ_VCPU_EXIT is always
set when pause is, we can avoid checking pause directly in VCPU RUN.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
arm/arm64 already has one VCPU request used when setting pause,
but it doesn't properly check requests in VCPU RUN. Check it
and also make sure we set vcpu->mode at the appropriate time
(before the check) and with the appropriate barriers. See
Documentation/virtual/kvm/vcpu-requests.rst. Also make sure we
don't leave any vcpu requests we don't intend to handle later
set in the request bitmap. If we don't clear them, then
kvm_request_pending() may return true when it shouldn't.
Using VCPU requests properly fixes a small race where pause
could get set just as a VCPU was entering guest mode.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
We don't need to stop a specific VCPU when changing the active state,
because private IRQs can only be modified by a running VCPU for the
VCPU itself and it is therefore already stopped.
However, it is also possible for two VCPUs to be modifying the active
state of SPIs at the same time, which can cause the thread being stuck
in the loop that checks other VCPU threads for a potentially very long
time, or to modify the active state of a running VCPU. Fix this by
serializing all accesses to setting and clearing the active state of
interrupts using the KVM mutex.
Reported-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Second round of KVM/ARM Changes for v4.12.
Changes include:
- A fix related to the 32-bit idmap stub
- A fix to the bitmask used to deode the operands of an AArch32 CP
instruction
- We have moved the files shared between arch/arm/kvm and
arch/arm64/kvm to virt/kvm/arm
- We add support for saving/restoring the virtual ITS state to
userspace
Instead of waiting with registering KVM iodevs until the first VCPU is
run, we can actually create the iodevs when the redist base address is
set. The only downside is that we must now also check if we need to do
this for VCPUs which are created after creating the VGIC, because there
is no enforced ordering between creating the VGIC (and setting its base
addresses) and creating the VCPUs.
Signed-off-by: Christoffer Dall <cdall@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
For some time now we have been having a lot of shared functionality
between the arm and arm64 KVM support in arch/arm, which not only
required a horrible inter-arch reference from the Makefile in
arch/arm64/kvm, but also created confusion for newcomers to the code
base, as was recently seen on the mailing list.
Further, it causes confusion for things like cscope, which needs special
attention to index specific shared files for arm64 from the arm tree.
Move the shared files into virt/kvm/arm and move the trace points along
with it. When moving the tracepoints we have to modify the way the vgic
creates definitions of the trace points, so we take the chance to
include the VGIC tracepoints in its very own special vgic trace.h file.
Signed-off-by: Christoffer Dall <cdall@linaro.org>