KVM: s390: pv: fix asynchronous teardown for small VMs

[ Upstream commit 292a7d6fca ]

On machines without the Destroy Secure Configuration Fast UVC, the
topmost level of page tables is set aside and freed asynchronously
as last step of the asynchronous teardown.

Each gmap has a host_to_guest radix tree mapping host (userspace)
addresses (with 1M granularity) to gmap segment table entries (pmds).

If a guest is smaller than 2GB, the topmost level of page tables is the
segment table (i.e. there are only 2 levels). Replacing it means that
the pointers in the host_to_guest mapping would become stale and cause
all kinds of nasty issues.

This patch fixes the issue by disallowing asynchronous teardown for
guests with only 2 levels of page tables. Userspace should (and already
does) try using the normal destroy if the asynchronous one fails.

Update s390_replace_asce so it refuses to replace segment type ASCEs.
This is still needed in case the normal destroy VM fails.

Fixes: fb491d5500 ("KVM: s390: pv: asynchronous destroy for reboot")
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20230421085036.52511-2-imbrenda@linux.ibm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
This commit is contained in:
Claudio Imbrenda
2023-04-21 10:50:36 +02:00
committed by Greg Kroah-Hartman
parent 2d9b91b6ce
commit 83a320c038
2 changed files with 12 additions and 0 deletions

View File

@@ -314,6 +314,11 @@ int kvm_s390_pv_set_aside(struct kvm *kvm, u16 *rc, u16 *rrc)
*/ */
if (kvm->arch.pv.set_aside) if (kvm->arch.pv.set_aside)
return -EINVAL; return -EINVAL;
/* Guest with segment type ASCE, refuse to destroy asynchronously */
if ((kvm->arch.gmap->asce & _ASCE_TYPE_MASK) == _ASCE_TYPE_SEGMENT)
return -EINVAL;
priv = kzalloc(sizeof(*priv), GFP_KERNEL); priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv) if (!priv)
return -ENOMEM; return -ENOMEM;

View File

@@ -2830,6 +2830,9 @@ EXPORT_SYMBOL_GPL(s390_unlist_old_asce);
* s390_replace_asce - Try to replace the current ASCE of a gmap with a copy * s390_replace_asce - Try to replace the current ASCE of a gmap with a copy
* @gmap: the gmap whose ASCE needs to be replaced * @gmap: the gmap whose ASCE needs to be replaced
* *
* If the ASCE is a SEGMENT type then this function will return -EINVAL,
* otherwise the pointers in the host_to_guest radix tree will keep pointing
* to the wrong pages, causing use-after-free and memory corruption.
* If the allocation of the new top level page table fails, the ASCE is not * If the allocation of the new top level page table fails, the ASCE is not
* replaced. * replaced.
* In any case, the old ASCE is always removed from the gmap CRST list. * In any case, the old ASCE is always removed from the gmap CRST list.
@@ -2844,6 +2847,10 @@ int s390_replace_asce(struct gmap *gmap)
s390_unlist_old_asce(gmap); s390_unlist_old_asce(gmap);
/* Replacing segment type ASCEs would cause serious issues */
if ((gmap->asce & _ASCE_TYPE_MASK) == _ASCE_TYPE_SEGMENT)
return -EINVAL;
page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER); page = alloc_pages(GFP_KERNEL_ACCOUNT, CRST_ALLOC_ORDER);
if (!page) if (!page)
return -ENOMEM; return -ENOMEM;