Commit Graph

45170 Commits

Author SHA1 Message Date
Yazen Ghannam
48da1ad8ba x86/mce: Define amd_mce_usable_address()
Currently, all valid MCA_ADDR values are assumed to be usable on AMD
systems. However, this is not correct in most cases. Notifiers expecting
usable addresses may then operate on inappropriate values.

Define a helper function to do AMD-specific checks for a usable memory
address. List out all known cases.

  [ bp: Tone down the capitalized words. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230613141142.36801-3-yazen.ghannam@amd.com
2023-10-16 15:31:32 +02:00
Yazen Ghannam
495a91d099 x86/MCE/AMD: Split amd_mce_is_memory_error()
Define helper functions for legacy and SMCA systems in order to reuse
individual checks in later changes.

Describe what each function is checking for, and correct the XEC bitmask
for SMCA.

No functional change intended.

  [ bp: Use "else in amd_mce_is_memory_error() to make the conditional
    balanced, for readability. ]

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230613141142.36801-2-yazen.ghannam@amd.com
2023-10-16 15:04:53 +02:00
Hou Wenlong
7f6874eddd x86/head/64: Add missing __head annotation to startup_64_load_idt()
This function is currently only used in the head code and is only called
from startup_64_setup_env(). Although it would be inlined by the
compiler, it would be better to mark it as __head too in case it doesn't.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/efcc5b5e18af880e415d884e072bf651c1fa7c34.1689130310.git.houwenlong.hwl@antgroup.com
2023-10-16 13:38:24 +02:00
Hou Wenlong
dc62830090 x86/head/64: Mark 'startup_gdt[]' and 'startup_gdt_descr' as __initdata
As 'startup_gdt[]' and 'startup_gdt_descr' are only used in booting,
mark them as __initdata to allow them to be freed after boot.

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/c85903a7cfad37d14a7e5a4df9fc7119a3669fb3.1689130310.git.houwenlong.hwl@antgroup.com
2023-10-16 13:38:24 +02:00
Sandipan Das
744940f192 perf/x86/amd/uncore: Pass through error code for initialization failures, instead of -ENODEV
Pass through the appropriate error code when the registration of hotplug
callbacks fail during initialization, instead of returning a blanket -ENODEV.

[ mingo: Updated the changelog. ]

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231016060743.332051-1-sandipan.das@amd.com
2023-10-16 11:42:01 +02:00
Linus Torvalds
fbe1bf1e5f Revert "x86/smp: Put CPUs into INIT on shutdown if possible"
This reverts commit 45e34c8af5, and the
two subsequent fixes to it:

  3f874c9b2a ("x86/smp: Don't send INIT to non-present and non-booted CPUs")
  b1472a60a5 ("x86/smp: Don't send INIT to boot CPU")

because it seems to result in hung machines at shutdown.  Particularly
some Dell machines, but Thomas says

 "The rest seems to be Lenovo and Sony with Alderlake/Raptorlake CPUs -
  at least that's what I could figure out from the various bug reports.

  I don't know which CPUs the DELL machines have, so I can't say it's a
  pattern.

  I agree with the revert for now"

Ashok Raj chimes in:

 "There was a report (probably this same one), and it turns out it was a
  bug in the BIOS SMI handler.

  The client BIOS's were waiting for the lowest APICID to be the SMI
  rendevous master. If this is MeteorLake, the BSP wasn't the one with
  the lowest APIC and it triped here.

  The BIOS change is also being pushed to others for assimilation :)

  Server BIOS's had this correctly for a while now"

and it does look likely to be some bad interaction between SMI and the
non-BSP cores having put into INIT (and thus unresponsive until reset).

Link: https://bbs.archlinux.org/viewtopic.php?pid=2124429
Link: https://www.reddit.com/r/openSUSE/comments/16qq99b/tumbleweed_shutdown_did_not_finish_completely/
Link: https://forum.artixlinux.org/index.php/topic,5997.0.html
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2241279
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-10-15 12:02:02 -07:00
Linus Torvalds
ddf2085598 Merge tag 'smp-urgent-2023-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar:
 "Fix a Longsoon build warning by harmonizing the
  arch_[un]register_cpu() prototypes between architectures"

* tag 'smp-urgent-2023-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu-hotplug: Provide prototypes for arch CPU registration
2023-10-15 08:44:56 -07:00
Paolo Bonzini
88e4cd893f Merge tag 'kvm-x86-pmu-6.6-fixes' of https://github.com/kvm-x86/linux into HEAD
KVM x86/pmu fixes for 6.6:

 - Truncate writes to PMU counters to the counter's width to avoid spurious
   overflows when emulating counter events in software.

 - Set the LVTPC entry mask bit when handling a PMI (to match Intel-defined
   architectural behavior).

 - Treat KVM_REQ_PMI as a wake event instead of queueing host IRQ work to
   kick the guest out of emulated halt.
2023-10-15 08:24:18 -04:00
Linus Torvalds
dc9b2e683b Merge tag 'x86-urgent-2023-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Fix a false-positive KASAN warning, fix an AMD erratum on Zen4 CPUs,
  and fix kernel-doc build warnings"

* tag 'x86-urgent-2023-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/alternatives: Disable KASAN in apply_alternatives()
  x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs
  x86/resctrl: Fix kernel-doc warnings
2023-10-14 15:32:20 -07:00
Linus Torvalds
23931d9353 Merge tag 'perf-urgent-2023-10-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf event fix from Ingo Molnar:
 "Fix an LBR sampling bug"

* tag 'perf-urgent-2023-10-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/lbr: Filter vsyscall addresses
2023-10-14 15:09:55 -07:00
Brian Gerst
1a09a27153 x86/entry/32: Clean up syscall fast exit tests
Merge compat and native code and clarify comments.

No change in functionality expected.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20231011224351.130935-4-brgerst@gmail.com
2023-10-13 13:05:28 +02:00
Brian Gerst
58978b44df x86/entry/64: Use TASK_SIZE_MAX for canonical RIP test
Using shifts to determine if an address is canonical is difficult for
the compiler to optimize when the virtual address width is variable
(LA57 feature) without using inline assembly.  Instead, compare RIP
against TASK_SIZE_MAX.  The only user executable address outside of that
range is the deprecated vsyscall page, which can fall back to using IRET.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20231011224351.130935-3-brgerst@gmail.com
2023-10-13 13:05:28 +02:00
Brian Gerst
ca282b486a x86/entry/64: Convert SYSRET validation tests to C
No change in functionality expected.

Signed-off-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Uros Bizjak <ubizjak@gmail.com>
Link: https://lore.kernel.org/r/20231011224351.130935-2-brgerst@gmail.com
2023-10-13 13:05:28 +02:00
Ingo Molnar
92fe9bb77b x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too
The data type for APIC IDs was standardized to 'u32' in the
following recent commit:

   db4a4086a2 ("x86/apic: Use u32 for wakeup_secondary_cpu[_64]()")

Which changed the function arguments type signature of the
apic->wakeup_secondary_cpu() APIC driver function.

Propagate this to hv_snp_boot_ap() as well, which also addresses a
'assignment from incompatible pointer type' build warning that triggers
under the -Werror=incompatible-pointer-types GCC warning.

Fixes: db4a4086a2 ("x86/apic: Use u32 for wakeup_secondary_cpu[_64]()")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230814085113.233274223@linutronix.de
2023-10-13 12:26:58 +02:00
Dan Carpenter
7543365739 perf/x86/amd/uncore: Fix uninitialized return value in amd_uncore_init()
Some of the error paths in this function return don't initialize the
error code.  Return -ENODEV by default.

Fixes: d6389d3ccc ("perf/x86/amd/uncore: Refactor uncore management")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/cec62eba-c4b8-4cb7-9671-58894dd4b974@moroto.mountain
2023-10-13 09:32:50 +02:00
Jakub Kicinski
0e6bb5b7f4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

kernel/bpf/verifier.c
  829955981c ("bpf: Fix verifier log for async callback return values")
  a923819fb2 ("bpf: Treat first argument as return value for bpf_throw")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-12 17:07:34 -07:00
Kirill A. Shutemov
d35652a5fc x86/alternatives: Disable KASAN in apply_alternatives()
Fei has reported that KASAN triggers during apply_alternatives() on
a 5-level paging machine:

	BUG: KASAN: out-of-bounds in rcu_is_watching()
	Read of size 4 at addr ff110003ee6419a0 by task swapper/0/0
	...
	__asan_load4()
	rcu_is_watching()
	trace_hardirqs_on()
	text_poke_early()
	apply_alternatives()
	...

On machines with 5-level paging, cpu_feature_enabled(X86_FEATURE_LA57)
gets patched. It includes KASAN code, where KASAN_SHADOW_START depends on
__VIRTUAL_MASK_SHIFT, which is defined with cpu_feature_enabled().

KASAN gets confused when apply_alternatives() patches the
KASAN_SHADOW_START users. A test patch that makes KASAN_SHADOW_START
static, by replacing __VIRTUAL_MASK_SHIFT with 56, works around the issue.

Fix it for real by disabling KASAN while the kernel is patching alternatives.

[ mingo: updated the changelog ]

Fixes: 6657fca06e ("x86/mm: Allow to boot without LA57 if CONFIG_X86_5LEVEL=y")
Reported-by: Fei Yang <fei.yang@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20231012100424.1456-1-kirill.shutemov@linux.intel.com
2023-10-12 20:27:16 +02:00
Borislav Petkov
deedec0a15 x86/cpu: Fix the AMD Fam 17h, Fam 19h, Zen2 and Zen4 MSR enumerations
The comments introduced in <asm/msr-index.h> in the merge conflict fixup in:

  8f4156d587 ("Merge branch 'x86/urgent' into perf/core, to resolve conflict")

... aren't right: AMD naming schemes are more complex than implied,
family 0x17 is Zen1 and 2, family 0x19 is spread around Zen 3 and 4.

So there's indeed four separate MSR namespaces for:

  MSR_F17H_
  MSR_F19H_
  MSR_ZEN2_
  MSR_ZEN4_

... and the namespaces cannot be merged.

Fix it up. No change in functionality.

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com
2023-10-12 20:10:39 +02:00
Lukas Bulwahn
faed498d0d hardening: x86: drop reference to removed config AMD_IOMMU_V2
Commit 5a0b11a180 ("iommu/amd: Remove iommu_v2 module") removes the
config AMD_IOMMU_V2.

Remove the reference to this config in the x86 architecture-specific
hardening config fragment as well.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20231012045040.22088-1-lukas.bulwahn@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-10-12 09:08:57 -07:00
Tom Lendacky
3e93467346 KVM: SVM: Fix build error when using -Werror=unused-but-set-variable
Commit 916e3e5f26 ("KVM: SVM: Do not use user return MSR support for
virtualized TSC_AUX") introduced a local variable used for the rdmsr()
function for the high 32-bits of the MSR value. This variable is not used
after being set and triggers a warning or error, when treating warnings
as errors, when the unused-but-set-variable flag is set. Mark this
variable as __maybe_unused to fix this.

Fixes: 916e3e5f26 ("KVM: SVM: Do not use user return MSR support for virtualized TSC_AUX")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <0da9874b6e9fcbaaa5edeb345d7e2a7c859fc818.1696271334.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:09:36 -04:00
Maxim Levitsky
3fdc6087df x86: KVM: SVM: refresh AVIC inhibition in svm_leave_nested()
svm_leave_nested() similar to a nested VM exit, get the vCPU out of nested
mode and thus should end the local inhibition of AVIC on this vCPU.

Failure to do so, can lead to hangs on guest reboot.

Raise the KVM_REQ_APICV_UPDATE request to refresh the AVIC state of the
current vCPU in this case.

Fixes: f44509f849 ("KVM: x86: SVM: allow AVIC to co-exist with a nested guest running")
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928173354.217464-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:09:00 -04:00
Maxim Levitsky
2dcf37abf9 x86: KVM: SVM: add support for Invalid IPI Vector interception
In later revisions of AMD's APM, there is a new 'incomplete IPI' exit code:

"Invalid IPI Vector - The vector for the specified IPI was set to an
illegal value (VEC < 16)"

Note that tests on Zen2 machine show that this VM exit doesn't happen and
instead AVIC just does nothing.

Add support for this exit code by doing nothing, instead of filling
the kernel log with errors.

Also replace an unthrottled 'pr_err()' if another unknown incomplete
IPI exit happens with vcpu_unimpl()

(e.g in case AMD adds yet another 'Invalid IPI' exit reason)

Cc: <stable@vger.kernel.org>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928173354.217464-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:08:59 -04:00
Maxim Levitsky
b65235f6e1 x86: KVM: SVM: always update the x2avic msr interception
The following problem exists since x2avic was enabled in the KVM:

svm_set_x2apic_msr_interception is called to enable the interception of
the x2apic msrs.

In particular it is called at the moment the guest resets its apic.

Assuming that the guest's apic was in x2apic mode, the reset will bring
it back to the xapic mode.

The svm_set_x2apic_msr_interception however has an erroneous check for
'!apic_x2apic_mode()' which prevents it from doing anything in this case.

As a result of this, all x2apic msrs are left unintercepted, and that
exposes the bare metal x2apic (if enabled) to the guest.
Oops.

Remove the erroneous '!apic_x2apic_mode()' check to fix that.

This fixes CVE-2023-5090

Fixes: 4d1d7942e3 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928173354.217464-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:08:59 -04:00
Sean Christopherson
8647c52e95 KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}
Mask off xfeatures that aren't exposed to the guest only when saving guest
state via KVM_GET_XSAVE{2} instead of modifying user_xfeatures directly.
Preserving the maximal set of xfeatures in user_xfeatures restores KVM's
ABI for KVM_SET_XSAVE, which prior to commit ad856280dd ("x86/kvm/fpu:
Limit guest user_xfeatures to supported bits of XCR0") allowed userspace
to load xfeatures that are supported by the host, irrespective of what
xfeatures are exposed to the guest.

There is no known use case where userspace *intentionally* loads xfeatures
that aren't exposed to the guest, but the bug fixed by commit ad856280dd
was specifically that KVM_GET_SAVE{2} would save xfeatures that weren't
exposed to the guest, e.g. would lead to userspace unintentionally loading
guest-unsupported xfeatures when live migrating a VM.

Restricting KVM_SET_XSAVE to guest-supported xfeatures is especially
problematic for QEMU-based setups, as QEMU has a bug where instead of
terminating the VM if KVM_SET_XSAVE fails, QEMU instead simply stops
loading guest state, i.e. resumes the guest after live migration with
incomplete guest state, and ultimately results in guest data corruption.

Note, letting userspace restore all host-supported xfeatures does not fix
setups where a VM is migrated from a host *without* commit ad856280dd,
to a target with a subset of host-supported xfeatures.  However there is
no way to safely address that scenario, e.g. KVM could silently drop the
unsupported features, but that would be a clear violation of KVM's ABI and
so would require userspace to opt-in, at which point userspace could
simply be updated to sanitize the to-be-loaded XSAVE state.

Reported-by: Tyler Stachecki <stachecki.tyler@gmail.com>
Closes: https://lore.kernel.org/all/20230914010003.358162-1-tstachecki@bloomberg.net
Fixes: ad856280dd ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0")
Cc: stable@vger.kernel.org
Cc: Leonardo Bras <leobras@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Message-Id: <20230928001956.924301-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:08:58 -04:00
Sean Christopherson
18164f66e6 x86/fpu: Allow caller to constrain xfeatures when copying to uabi buffer
Plumb an xfeatures mask into __copy_xstate_to_uabi_buf() so that KVM can
constrain which xfeatures are saved into the userspace buffer without
having to modify the user_xfeatures field in KVM's guest_fpu state.

KVM's ABI for KVM_GET_XSAVE{2} is that features that are not exposed to
guest must not show up in the effective xstate_bv field of the buffer.
Saving only the guest-supported xfeatures allows userspace to load the
saved state on a different host with a fewer xfeatures, so long as the
target host supports the xfeatures that are exposed to the guest.

KVM currently sets user_xfeatures directly to restrict KVM_GET_XSAVE{2} to
the set of guest-supported xfeatures, but doing so broke KVM's historical
ABI for KVM_SET_XSAVE, which allows userspace to load any xfeatures that
are supported by the *host*.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-12 11:08:58 -04:00
Suma Hegde
5150542b8e platform/x86/amd/hsmp: add support for metrics tbl
AMD MI300 MCM provides GET_METRICS_TABLE message to retrieve
all the system management information from SMU.

The metrics table is made available as hexadecimal sysfs binary file
under per socket sysfs directory created at
/sys/devices/platform/amd_hsmp/socket%d/metrics_bin

Metrics table definitions will be documented as part of Public PPR.
The same is defined in the amd_hsmp.h header.

Signed-off-by: Suma Hegde <suma.hegde@amd.com>
Reviewed-by: Naveen Krishna Chatradhi <nchatrad@amd.com>
Link: https://lore.kernel.org/r/20231010120310.3464066-2-suma.hegde@amd.com
[ij: lseek -> lseek(), dram -> DRAM in dev_err()]
[ij: added period to terminate a documentation sentence]
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-10-12 16:29:58 +03:00
Peter Zijlstra
f577cd57bf sched/topology: Rename 'DIE' domain to 'PKG'
While reworking the x86 topology code Thomas tripped over creating a 'DIE' domain
for the package mask. :-)

Since these names are CONFIG_SCHED_DEBUG=y only, rename them to make the
name less ambiguous.

[ Shrikanth Hegde: rename on s390 as well. ]
[ Valentin Schneider: also rename it in the comments. ]
[ mingo: port to recent kernels & find all remaining occurances. ]

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Valentin Schneider <vschneid@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20230712141056.GI3100107@hirez.programming.kicks-ass.net
2023-10-12 09:38:16 +02:00
Thomas Zimmermann
052ddf7b86 fbdev: Replace fb_pgprotect() with pgprot_framebuffer()
Rename the fbdev mmap helper fb_pgprotect() to pgprot_framebuffer().
The helper sets VMA page-access flags for framebuffers in device I/O
memory.

Also clean up the helper's parameters and return value. Instead of
the VMA instance, pass the individial parameters separately: existing
page-access flags, the VMAs start and end addresses and the offset
in the underlying device memory rsp file. Return the new page-access
flags. These changes align pgprot_framebuffer() with other pgprot_()
functions.

v4:
	* fix commit message (Christophe)
v3:
	* rename fb_pgprotect() to pgprot_framebuffer() (Arnd)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922080636.26762-3-tzimmermann@suse.de
2023-10-12 09:20:46 +02:00
Paul E. McKenney
f44075ecaf x86/nmi: Fix out-of-order NMI nesting checks & false positive warning
The ->idt_seq and ->recv_jiffies variables added by:

  1a3ea611fc ("x86/nmi: Accumulate NMI-progress evidence in exc_nmi()")

... place the exit-time check of the bottom bit of ->idt_seq after the
this_cpu_dec_return() that re-enables NMI nesting.  This can result in
the following sequence of events on a given CPU in kernels built with
CONFIG_NMI_CHECK_CPU=y:

  o   An NMI arrives, and ->idt_seq is incremented to an odd number.
      In addition, nmi_state is set to NMI_EXECUTING==1.

  o   The NMI is processed.

  o   The this_cpu_dec_return(nmi_state) zeroes nmi_state and returns
      NMI_EXECUTING==1, thus opting out of the "goto nmi_restart".

  o   Another NMI arrives and ->idt_seq is incremented to an even
      number, triggering the warning.  But all is just fine, at least
      assuming we don't get so many closely spaced NMIs that the stack
      overflows or some such.

Experience on the fleet indicates that the MTBF of this false positive
is about 70 years.  Or, for those who are not quite that patient, the
MTBF appears to be about one per week per 4,000 systems.

Fix this false-positive warning by moving the "nmi_restart" label before
the initial ->idt_seq increment/check and moving the this_cpu_dec_return()
to follow the final ->idt_seq increment/check.  This way, all nested NMIs
that get past the NMI_NOT_RUNNING check get a clean ->idt_seq slate.
And if they don't get past that check, they will set nmi_state to
NMI_LATCHED, which will cause the this_cpu_dec_return(nmi_state)
to restart.

Fixes: 1a3ea611fc ("x86/nmi: Accumulate NMI-progress evidence in exc_nmi()")
Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/0cbff831-6e3d-431c-9830-ee65ee7787ff@paulmck-laptop
2023-10-12 08:35:15 +02:00
Lu Yao
441ccc3512 x86/msi: Fix compile error caused by CONFIG_GENERIC_MSI_IRQ=y && !CONFIG_X86_LOCAL_APIC
When compiling the x86 kernel, if X86_LOCAL_APIC is not enabled but
GENERIC_MSI_IRQ is selected in '.config', the following compilation
error will occur:

  include/linux/gpio/driver.h:38:19: error:
    field 'msiinfo' has incomplete type

  kernel/irq/msi.c:752:5: error: invalid use of incomplete typedef
    'msi_alloc_info_t' {aka 'struct irq_alloc_info'}

  kernel/irq/msi.c:740:1: error: control reaches end of non-void function

This is because file such as 'kernel/irq/msi.c' only depends on
'GENERIC_MSI_IRQ', and uses 'struct msi_alloc_info_t'. However,
this struct depends on 'X86_LOCAL_APIC'.

When enable 'GENERIC_MSI_IRQ' or 'X86_LOCAL_APIC' will select
'IRQ_DOMAIN_HIERARCHY', so exposing this struct using
'IRQ_DOMAIN_HIERARCHY' rather than 'X86_LOCAL_APIC'.

Under the above conditions, if 'HPET_TIMER' is selected, the following
compilation error will occur:

  arch/x86/kernel/hpet.c:550:13: error: ‘x86_vector_domain’ undeclared

  arch/x86/kernel/hpet.c:600:9: error: implicit declaration of
    function ‘init_irq_alloc_info’

This is because 'x86_vector_domain' is defined in 'kernel/apic/vector.c'
which is compiled only when 'X86_LOCAL_APIC' is enabled. Besides,
function 'msi_domain_set_affinity' is defined in 'include/linux/msi.h'
which depends on 'GENERIC_MSI_IRQ'. So use 'X86_LOCAL_APIC' and
'GENERIC_MSI_IRQ' to expose these code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20231012032659.323251-1-yaolu@kylinos.cn
2023-10-12 08:13:27 +02:00
Ingo Molnar
8f4156d587 Merge branch 'x86/urgent' into perf/core, to resolve conflict
Resolve an MSR enumeration conflict.

Conflicts:
	arch/x86/include/asm/msr-index.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2023-10-11 22:54:03 +02:00
Fenghua Yu
4dba8f10b8 x86/resctrl: Add sparse_masks file in info
Add the interface in resctrl FS to show if sparse cache allocation
bit masks are supported on the platform. Reading the file returns
either a "1" if non-contiguous 1s are supported and "0" otherwise.
The file path is /sys/fs/resctrl/info/{resource}/sparse_masks, where
{resource} can be either "L2" or "L3".

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Peter Newman <peternewman@google.com>
Link: https://lore.kernel.org/r/7300535160beba41fd8aa073749ec1ee29b4621f.1696934091.git.maciej.wieczor-retman@intel.com
2023-10-11 21:51:24 +02:00
Maciej Wieczor-Retman
0e3cd31f6e x86/resctrl: Enable non-contiguous CBMs in Intel CAT
The setting for non-contiguous 1s support in Intel CAT is
hardcoded to false. On these systems, writing non-contiguous
1s into the schemata file will fail before resctrl passes
the value to the hardware.

In Intel CAT CPUID.0x10.1:ECX[3] and CPUID.0x10.2:ECX[3] stopped
being reserved and now carry information about non-contiguous 1s
value support for L3 and L2 cache respectively. The CAT
capacity bitmask (CBM) supports a non-contiguous 1s value if
the bit is set.

The exception are Haswell systems where non-contiguous 1s value
support needs to stay disabled since they can't make use of CPUID
for Cache allocation.

Originally-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Peter Newman <peternewman@google.com>
Link: https://lore.kernel.org/r/1849b487256fe4de40b30f88450cba3d9abc9171.1696934091.git.maciej.wieczor-retman@intel.com
2023-10-11 21:48:52 +02:00
Maciej Wieczor-Retman
39c6eed1f6 x86/resctrl: Rename arch_has_sparse_bitmaps
Rename arch_has_sparse_bitmaps to arch_has_sparse_bitmasks to ensure
consistent terminology throughout resctrl.

Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Peter Newman <peternewman@google.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Tested-by: Peter Newman <peternewman@google.com>
Link: https://lore.kernel.org/r/e330fcdae873ef1a831e707025a4b70fa346666e.1696934091.git.maciej.wieczor-retman@intel.com
2023-10-11 19:43:43 +02:00
Russell King (Oracle)
c4dd854f74 cpu-hotplug: Provide prototypes for arch CPU registration
Provide common prototypes for arch_register_cpu() and
arch_unregister_cpu(). These are called by acpi_processor.c, with weak
versions, so the prototype for this is already set. It is generally not
necessary for function prototypes to be conditional on preprocessor macros.

Some architectures (e.g. Loongarch) are missing the prototype for this, and
rather than add it to Loongarch's asm/cpu.h, do the job once for everyone.

Since this covers everyone, remove the now unnecessary prototypes in
asm/cpu.h, and therefore remove the 'static' from one of ia64's
arch_register_cpu() definitions.

[ tglx: Bring back the ia64 part and remove the ACPI prototypes ]

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1qkoRr-0088Q8-Da@rmk-PC.armlinux.org.uk
2023-10-11 14:27:37 +02:00
Borislav Petkov (AMD)
f454b18e07 x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs
Fix erratum #1485 on Zen4 parts where running with STIBP disabled can
cause an #UD exception. The performance impact of the fix is negligible.

Reported-by: René Rebe <rene@exactcode.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: René Rebe <rene@exactcode.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com
2023-10-11 11:00:11 +02:00
Alexander Shishkin
d6f274b7c8 x86/sev: Drop unneeded #include
Commit:

  20f07a044a ("x86/sev: Move common memory encryption code to mem_encrypt.c")

... forgot to remove the include of virtio_config.h from mem_encrypt_amd.c
when it moved the related code to mem_encrypt.c (from where this include
subsequently got removed by a later commit).

Remove it now.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20231010145220.3960055-3-alexander.shishkin@linux.intel.com
2023-10-11 10:15:47 +02:00
Alexander Shishkin
6e74b12515 x86/sev: Move sev_setup_arch() to mem_encrypt.c
Since commit:

  4d96f91091 ("x86/sev: Replace occurrences of sev_active() with cc_platform_has()")

... the SWIOTLB bounce buffer size adjustment and restricted virtio memory
setting also inadvertently apply to TDX: the code is using
cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) as a gatekeeping condition,
which is also true for TDX, and this is also what we want.

To reflect this, move the corresponding code to generic mem_encrypt.c.

No functional changes intended.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20231010145220.3960055-2-alexander.shishkin@linux.intel.com
2023-10-11 10:15:47 +02:00
Maciej Wieczor-Retman
f05fd4ce99 x86/resctrl: Fix remaining kernel-doc warnings
The kernel test robot reported kernel-doc warnings here:

  arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'of' not described in 'rdt_bit_usage_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'seq' not described in 'rdt_bit_usage_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:915: warning: Function parameter or member 'v' not described in 'rdt_bit_usage_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1144: warning: Function parameter or member 'type' not described in '__rdtgroup_cbm_overlaps'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1224: warning: Function parameter or member 'rdtgrp' not described in 'rdtgroup_mode_test_exclusive'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'of' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'buf' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'nbytes' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1261: warning: Function parameter or member 'off' not described in 'rdtgroup_mode_write'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 'of' not described in 'rdtgroup_size_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 's' not described in 'rdtgroup_size_show'
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1370: warning: Function parameter or member 'v' not described in 'rdtgroup_size_show'

The first two functions are missing an argument description while the
other three are file callbacks and don't require a kernel-doc comment.

Closes: https://lore.kernel.org/oe-kbuild-all/202310070434.mD8eRNAz-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Newman <peternewman@google.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/r/20231011064843.246592-1-maciej.wieczor-retman@intel.com
2023-10-11 09:44:41 +02:00
Yan Zhao
c9f65a3f2d KVM: VMX: drop IPAT in memtype when CD=1 for KVM_X86_QUIRK_CD_NW_CLEARED
For KVM_X86_QUIRK_CD_NW_CLEARED is on, remove the IPAT (ignore PAT) bit in
EPT memory types when cache is disabled and non-coherent DMA are present.

To correctly emulate CR0.CD=1, UC + IPAT are required as memtype in EPT.
However, as with commit fb279950ba ("KVM: vmx: obey
KVM_QUIRK_CD_NW_CLEARED"), WB + IPAT are now returned to workaround a BIOS
issue that guest MTRRs are enabled too late. Without this workaround, a
super slow guest boot-up is expected during the pre-guest-MTRR-enabled
period due to UC as the effective memory type for all guest memory.

Absent emulating CR0.CD=1 with UC, it makes no sense to set IPAT when KVM
is honoring the guest memtype.
Removing the IPAT bit in this patch allows effective memory type to honor
PAT values as well, as WB is the weakest memtype. It means if a guest
explicitly claims UC as the memtype in PAT, the effective memory is UC
instead of previous WB. If, for some unknown reason, a guest meets a slow
boot-up issue with the removal of IPAT, it's desired to fix the blamed PAT
in the guest.

Returning guest MTRR type as if CR0.CD=0 is also not preferred because
KVMs ABI for the quirk also requires KVM to force WB memtype regardless of
guest MTRRs to workaround the slow guest boot-up issue.

In the future, honoring guest PAT will also allow KVM to more precisely
zap SPTEs when the effective memtype changes.  E.g. by not forcing WB when
CR0.CD=1, instead of zapping SPTEs when guest MTRRs change, KVM can skip
MTRR-induced zaps if CR0.CD=1 and zap SPTEs for non-WB MTRR ranges when
CR0.CD is toggled (WB MTRR SPTEs can be kept because they're WB regardless
of CR0.CD).

The change of removing IPAT has been verified with normal boot-up time
on old OVMF of commit c9e5618f84b0cb54a9ac2d7604f7b7e7859b45a7 as well,
dated back to Apr 14 2015.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/r/20230714065326.20557-1-yan.y.zhao@intel.com
[sean: massage changelog to apply patch without full series]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-10-10 17:04:42 -07:00
Yan Zhao
362ff6dca5 KVM: x86/mmu: Zap KVM TDP when noncoherent DMA assignment starts/stops
Zap KVM TDP when noncoherent DMA assignment starts (noncoherent dma count
transitions from 0 to 1) or stops (noncoherent dma count transitions
from 1 to 0). Before the zap, test if guest MTRR is to be honored after
the assignment starts or was honored before the assignment stops.

When there's no noncoherent DMA device, EPT memory type is
((MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT)

When there're noncoherent DMA devices, EPT memory type needs to honor
guest CR0.CD and MTRR settings.

So, if noncoherent DMA count transitions between 0 and 1, EPT leaf entries
need to be zapped to clear stale memory type.

This issue might be hidden when the device is statically assigned with
VFIO adding/removing MMIO regions of the noncoherent DMA devices for
several times during guest boot, and current KVM MMU will call
kvm_mmu_zap_all_fast() on the memslot removal.

But if the device is hot-plugged, or if the guest has mmio_always_on for
the device, the MMIO regions of it may only be added for once, then there's
no path to do the EPT entries zapping to clear stale memory type.

Therefore do the EPT zapping when noncoherent assignment starts/stops to
ensure stale entries cleaned away.

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/r/20230714065223.20432-1-yan.y.zhao@intel.com
[sean: fix misspelled words in comment and changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-10-10 17:04:38 -07:00
Joel Granados
ecd5d66383 x86/vdso: Remove now superfluous sentinel element from ctl_table array
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove sentinel element from abi_table2. This removal is safe because
register_sysctl implicitly uses ARRAY_SIZE() in addition to checking for
the sentinel.

Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-10-10 15:22:02 -07:00
Joel Granados
83e291d3f5 arch/x86: Remove now superfluous sentinel elem from ctl_table arrays
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)

Remove sentinel element from sld_sysctl and itmt_kern_table. This
removal is safe because register_sysctl_init and register_sysctl
implicitly use the array size in addition to checking for the sentinel.

Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-10-10 15:22:02 -07:00
Linus Torvalds
b711538a40 Merge tag 'hyperv-fixes-signed-20231009' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:

 - fixes for Hyper-V VTL code (Saurabh Sengar and Olaf Hering)

 - fix hv_kvp_daemon to support keyfile based connection profile
   (Shradha Gupta)

* tag 'hyperv-fixes-signed-20231009' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv/hv_kvp_daemon:Support for keyfile based connection profile
  hyperv: reduce size of ms_hyperv_info
  x86/hyperv: Add common print prefix "Hyper-V" in hv_init
  x86/hyperv: Remove hv_vtl_early_init initcall
  x86/hyperv: Restrict get_vtl to only VTL platforms
2023-10-10 11:01:21 -07:00
Thomas Gleixner
48525fd1ea x86/cpu: Provide debug interface
Provide debug files which dump the topology related information of
cpuinfo_x86. This is useful to validate the upcoming conversion of the
topology evaluation for correctness or bug compatibility.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085113.353191313@linutronix.de
2023-10-10 14:38:19 +02:00
Thomas Gleixner
90781f0c4c x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids
Per CPU cpuinfo is used to persist the logical package and die IDs. That's
really not the right place simply because cpuinfo is subject to be
reinitialized when a CPU goes through an offline/online cycle.

This works by chance today, but that's far from correct and neither obvious
nor documented.

Add a per cpu datastructure which persists those logical IDs, which allows
to cleanup the CPUID evaluation code.

This is a temporary workaround until the larger topology management is in
place, which makes all of this logical management mechanics obsolete.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085113.292947071@linutronix.de
2023-10-10 14:38:19 +02:00
Thomas Gleixner
db4a4086a2 x86/apic: Use u32 for wakeup_secondary_cpu[_64]()
APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085113.233274223@linutronix.de
2023-10-10 14:38:19 +02:00
Thomas Gleixner
59f7928cd4 x86/apic: Use u32 for [gs]et_apic_id()
APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085113.172569282@linutronix.de
2023-10-10 14:38:19 +02:00
Thomas Gleixner
01ccf9bbd2 x86/apic: Use u32 for phys_pkg_id()
APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width even if that callback going to be removed soonish.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085113.113097126@linutronix.de
2023-10-10 14:38:19 +02:00
Thomas Gleixner
8aa2a4178d x86/apic: Use u32 for cpu_present_to_apicid()
APIC IDs are used with random data types u16, u32, int, unsigned int,
unsigned long.

Make it all consistently use u32 because that reflects the hardware
register width and fixup a few related usage sites for consistency sake.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230814085113.054064391@linutronix.de
2023-10-10 14:38:19 +02:00