Commit Graph

26719 Commits

Author SHA1 Message Date
peterz@infradead.org
9f6c532f59 futex: Add sys_futex_wake()
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.

The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105247.936205525@noisy.programming.kicks-ass.net
2023-09-21 19:22:07 +02:00
Peter Lafreniere
c945e6f453 powerpc/configs: Remove ReiserFS from defconfig
ReiserFS has been deprecated for a year and a half, yet is still built
as part of a defconfig kernel.

According to commit eb103a5164 ("reiserfs: Deprecate reiserfs"), the
filesystem is slated to be removed in 2025. Remove it from the defconfig
profiles now, as part of its deprecation process.

Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230918175529.19011-3-peter@n8pjl.ca
2023-09-19 20:36:20 +10:00
Geert Uytterhoeven
6fcb13972b powerpc: Replace GPL 2.0+ README.legal boilerplate with SPDX
Upstream Linux never had a "README.legal" file, but it was present
in early source releases of Linux/m68k.  It contained a simple copyright
notice and a link to a version of the "COPYING" file that predated the
addition of the "only valid GPL version is v2" clause.

Get rid of the references to non-existent files by replacing the
boilerplate with SPDX license identifiers.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/d91725ff1ed5d4b6ba42474e2ebfeebe711cba23.1695031668.git.geert@linux-m68k.org
2023-09-19 20:05:58 +10:00
Benjamin Gray
c3f4309693 powerpc/dexcr: Move HASHCHK trap handler
Syzkaller reported a sleep in atomic context bug relating to the HASHCHK
handler logic:

  BUG: sleeping function called from invalid context at arch/powerpc/kernel/traps.c:1518
  in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 25040, name: syz-executor
  preempt_count: 0, expected: 0
  RCU nest depth: 0, expected: 0
  no locks held by syz-executor/25040.
  irq event stamp: 34
  hardirqs last  enabled at (33): [<c000000000048b38>] prep_irq_for_enabled_exit arch/powerpc/kernel/interrupt.c:56 [inline]
  hardirqs last  enabled at (33): [<c000000000048b38>] interrupt_exit_user_prepare_main+0x148/0x600 arch/powerpc/kernel/interrupt.c:230
  hardirqs last disabled at (34): [<c00000000003e6a4>] interrupt_enter_prepare+0x144/0x4f0 arch/powerpc/include/asm/interrupt.h:176
  softirqs last  enabled at (0): [<c000000000281954>] copy_process+0x16e4/0x4750 kernel/fork.c:2436
  softirqs last disabled at (0): [<0000000000000000>] 0x0
  CPU: 15 PID: 25040 Comm: syz-executor Not tainted 6.5.0-rc5-00001-g3ccdff6bb06d #3
  Hardware name: IBM,9105-22A POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1040.00 (NL1040_021) hv:phyp pSeries
  Call Trace:
  [c0000000a8247ce0] [c00000000032b0e4] __might_resched+0x3b4/0x400 kernel/sched/core.c:10189
  [c0000000a8247d80] [c0000000008c7dc8] __might_fault+0xa8/0x170 mm/memory.c:5853
  [c0000000a8247dc0] [c00000000004160c] do_program_check+0x32c/0xb20 arch/powerpc/kernel/traps.c:1518
  [c0000000a8247e50] [c000000000009b2c] program_check_common_virt+0x3bc/0x3c0

To determine if a trap was caused by a HASHCHK instruction, we inspect
the user instruction that triggered the trap. However this may sleep
if the page needs to be faulted in (get_user_instr() reaches
__get_user(), which calls might_fault() and triggers the bug message).

Move the HASHCHK handler logic to after we allow IRQs, which is fine
because we are only interested in HASHCHK if it's a user space trap.

Fixes: 5bcba4e6c1 ("powerpc/dexcr: Handle hashchk exception")
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230915034604.45393-1-bgray@linux.ibm.com
2023-09-18 12:23:48 +10:00
Christophe Leroy
6901a9f9ef powerpc/82xx: Select FSL_SOC
It used to be impossible to select CONFIG_CPM2 without selecting
CONFIG_FSL_SOC at the same time because CONFIG_CPM2 was dependent
on CONFIG_8260 and CONFIG_8260 was selecting CONFIG_FSL_SOC.

But after commit eb5aa21372 ("powerpc/82xx: Remove CONFIG_8260
and CONFIG_8272") CONFIG_CPM2 depends on CONFIG_PPC_82xx instead
but CONFIG_PPC_82xx doesn't directly selects CONFIG_FSL_SOC.

Fix it by forcing CONFIG_PPC_82xx to select CONFIG_FSL_SOC just
like already done by PPC_8xx, PPC_MPC512x, PPC_83xx, PPC_86xx.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes: eb5aa21372 ("powerpc/82xx: Remove CONFIG_8260 and CONFIG_8272")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/7ab513546148ebe33ddd4b0ea92c7bfd3cce3ad7.1694705016.git.christophe.leroy@csgroup.eu
2023-09-18 12:23:48 +10:00
Naveen N Rao
60d77ed24b powerpc: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION and FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY
We recently added support for -fpatchable-function-entry and it is
enabled by default on ppc32 (ppc64 needs gcc v13.1.0). When building the
kernel for ppc32 and also enabling CONFIG_LD_DEAD_CODE_DATA_ELIMINATION,
we see the below build error with older gcc versions:
  powerpc-linux-gnu-ld: init/main.o(__patchable_function_entries): error: need linked-to section for --gc-sections

This error is thrown since __patchable_function_entries section would be
garbage collected with --gc-sections since it does not reference any
other kept sections. This has subsequently been fixed with:
  https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=b7d072167715829eed0622616f6ae0182900de3e

Disable LD_DEAD_CODE_DATA_ELIMINATION for gcc versions before v11.1.0 if
using -fpatchable-function-entry to avoid this bug.

Fixes: 0f71dcfb4a ("powerpc/ftrace: Add support for -fpatchable-function-entry")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N Rao <naveen@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230913134129.2782088-1-naveen@kernel.org
2023-09-18 12:23:47 +10:00
Benjamin Gray
27646b2e02 powerpc/watchpoints: Annotate atomic context in more places
It can be easy to miss that the notifier mechanism invokes the callbacks
in an atomic context, so add some comments to that effect on the two
handlers we register here.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-4-bgray@linux.ibm.com
2023-09-18 12:23:47 +10:00
Benjamin Gray
3241f260eb powerpc/watchpoint: Disable pagefaults when getting user instruction
This is called in an atomic context, so is not allowed to sleep if a
user page needs to be faulted in and has nowhere it can be deferred to.
The pagefault_disabled() function is documented as preventing user
access methods from sleeping.

In practice the page will be mapped in nearly always because we are
reading the instruction that just triggered the watchpoint trap.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-3-bgray@linux.ibm.com
2023-09-18 12:23:47 +10:00
Benjamin Gray
cc879ab3ce powerpc/watchpoints: Disable preemption in thread_change_pc()
thread_change_pc() uses CPU local data, so must be protected from
swapping CPUs while it is reading the breakpoint struct.

The error is more noticeable after 1e60f3564b ("powerpc/watchpoints:
Track perf single step directly on the breakpoint"), which added an
unconditional __this_cpu_read() call in thread_change_pc(). However the
existing __this_cpu_read() that runs if a breakpoint does need to be
re-inserted has the same issue.

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230829063457.54157-2-bgray@linux.ibm.com
2023-09-18 12:23:47 +10:00
Kajol Jain
4ff3ba4db5 powerpc/perf/hv-24x7: Update domain value check
Valid domain value is in range 1 to HV_PERF_DOMAIN_MAX. Current code has
check for domain value greater than or equal to HV_PERF_DOMAIN_MAX. But
the check for domain value 0 is missing.

Fix this issue by adding check for domain value 0.

Before:
  # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1
  Using CPUID 00800200
  Control descriptor is not initialized
  Error:
  The sys_perf_event_open() syscall returned with 5 (Input/output error) for
  event (hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/).
  /bin/dmesg | grep -i perf may provide additional information.

  Result from dmesg:
  [   37.819387] hv-24x7: hcall failed: [0 0x60040000 0x100 0] => ret
  0xfffffffffffffffc (-4) detail=0x2000000 failing ix=0

After:
  # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1
  Using CPUID 00800200
  Control descriptor is not initialized
  Warning:
  hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ event is not supported by the kernel.
  failed to read counter hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/

Fixes: ebd4a5a3eb ("powerpc/perf/hv-24x7: Minor improvements")
Reported-by: Krishan Gopal Sarawast <krishang@linux.vnet.ibm.com>
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Disha Goel <disgoel@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230825055601.360083-1-kjain@linux.ibm.com
2023-09-18 12:23:47 +10:00
Benjamin Gray
ff25ad0aa2 powerpc/configs: Set more PPC debug configs
Add more config options that wouldn't be done by the generic debug
config in kernel/configs/debug.config

CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG
	Adds an initialized check on each (cpu|mmu)_has_feature()

CONFIG_PPC_IRQ_SOFT_MASK_DEBUG
	Adds some extra checks around IRQ mask manipulation

CONFIG_PPC_KUAP_DEBUG
	Adds some extra KAUP checks around interrupts/context switching

CONFIG_PPC_RFI_SRR_DEBUG
	Adds some extra SSR checks around interrupts/syscalls

Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230830044238.578840-1-bgray@linux.ibm.com
2023-09-18 12:23:27 +10:00
Hari Bathini
b098f1c323 powerpc/fadump: make is_kdump_kernel() return false when fadump is active
Currently, is_kdump_kernel() returns true in crash dump capture kernel
for both kdump and fadump crash dump capturing methods, as both these
methods set elfcorehdr_addr. Some restrictions enforced for crash dump
capture kernel, based on is_kdump_kernel(), are specifically meant for
kdump case and not desirable for fadump - eg. IO queues restriction in
device drivers. So, define is_kdump_kernel() to return false when f/w
assisted dump is active.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230912082950.856977-2-hbathini@linux.ibm.com
2023-09-18 12:23:27 +10:00
Aditya Gupta
7135b921b3 powerpc: add cur_cpu_spec symbol to vmcoreinfo
Presently, while reading a vmcore, makedumpfile uses
`cur_cpu_spec.mmu_features` to decide whether the crashed system had
RADIX MMU or not.

Currently, makedumpfile fails to get the `cur_cpu_spec` symbol (unless
a vmlinux is passed with the `-x` flag to makedumpfile), and hence
assigns offsets and shifts (such as pgd_offset_l4) incorrecly considering
MMU to be hash MMU.

Add `cur_cpu_spec` symbol and offset of `mmu_features` in the
`cpu_spec` struct, to VMCOREINFO, so that the symbol address and offset
is accessible to makedumpfile, without needing the vmlinux file

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230911091409.415662-1-adityag@linux.ibm.com
2023-09-18 12:23:27 +10:00
Yuan Tan
f84b727d13 powerpc/32: Enable POWER_RESET in pmac32_defconfig
Enable POWER_RESET, and drop ADB_CUDA as it will now be selected.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Suggested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/73716a14e9a24bcea4ca6c454aae308c419933c6.1694685860.git.tanyuan@tinylab.org
2023-09-18 12:23:27 +10:00
Yuan Tan
a3ef2fef19 powerpc/32: Add dependencies of POWER_RESET for pmac32
pmac32's power off depends on ADB_CUDA to work. Enable it when
POWER_RESET is set for convenience.

Suggested-by: Zhangjin Wu <falcon@tinylab.org>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/0cca5d5afb6c4a1b78648e98339b4b7c9def46d5.1694685860.git.tanyuan@tinylab.org
2023-09-18 12:23:26 +10:00
Julia Lawall
06b627c123 powerpc/kexec_file: add missing of_node_put
for_each_node_with_property performs an of_node_get on each
iteration, so a break out of the loop requires an
of_node_put.

This was done using the Coccinelle semantic patch
iterators/for_each_child.cocci

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230907095521.14053-7-Julia.Lawall@inria.fr
2023-09-18 12:23:26 +10:00
Julia Lawall
a59e9eb252 powerpc/powermac: add missing of_node_put
for_each_node_by_name performs an of_node_get on each
iteration, so a break out of the loop requires an
of_node_put.

This was done using the Coccinelle semantic patch
iterators/for_each_child.cocci

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230907095521.14053-4-Julia.Lawall@inria.fr
2023-09-18 12:23:26 +10:00
Jordan Niethe
19d31c5f11 KVM: PPC: Add support for nestedv2 guests
A series of hcalls have been added to the PAPR which allow a regular
guest partition to create and manage guest partitions of its own. KVM
already had an interface that allowed this on powernv platforms. This
existing interface will now be called "nestedv1". The newly added PAPR
interface will be called "nestedv2".  PHYP will support the nestedv2
interface. At this time the host side of the nestedv2 interface has not
been implemented on powernv but there is no technical reason why it
could not be added.

The nestedv1 interface is still supported.

Add support to KVM to utilize these hcalls to enable running nested
guests as a pseries guest on PHYP.

Overview of the new hcall usage:

- L1 and L0 negotiate capabilities with
  H_GUEST_{G,S}ET_CAPABILITIES()

- L1 requests the L0 create a L2 with
  H_GUEST_CREATE() and receives a handle to use in future hcalls

- L1 requests the L0 create a L2 vCPU with
  H_GUEST_CREATE_VCPU()

- L1 sets up the L2 using H_GUEST_SET and the
  H_GUEST_VCPU_RUN input buffer

- L1 requests the L0 runs the L2 vCPU using H_GUEST_VCPU_RUN()

- L2 returns to L1 with an exit reason and L1 reads the
  H_GUEST_VCPU_RUN output buffer populated by the L0

- L1 handles the exit using H_GET_STATE if necessary

- L1 reruns L2 vCPU with H_GUEST_VCPU_RUN

- L1 frees the L2 in the L0 with H_GUEST_DELETE()

Support for the new API is determined by trying
H_GUEST_GET_CAPABILITIES. On a successful return, use the nestedv2
interface.

Use the vcpu register state setters for tracking modified guest state
elements and copy the thread wide values into the H_GUEST_VCPU_RUN input
buffer immediately before running a L2. The guest wide
elements can not be added to the input buffer so send them with a
separate H_GUEST_SET call if necessary.

Make the vcpu register getter load the corresponding value from the real
host with H_GUEST_GET. To avoid unnecessarily calling H_GUEST_GET, track
which values have already been loaded between H_GUEST_VCPU_RUN calls. If
an element is present in the H_GUEST_VCPU_RUN output buffer it also does
not need to be loaded again.

Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
Signed-off-by: Amit Machhiwal <amachhiw@linux.vnet.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-11-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
dfcaacc8f9 KVM: PPC: Book3s HV: Hold LPIDs in an unsigned long
The LPID register is 32 bits long. The host keeps the lpids for each
guest in an unsigned word struct kvm_arch. Currently, LPIDs are already
limited by mmu_lpid_bits and KVM_MAX_NESTED_GUESTS_SHIFT.

The nestedv2 API returns a 64 bit "Guest ID" to be used be the L1 host
for each L2 guest. This value is used as an lpid, e.g. it is the
parameter used by H_RPT_INVALIDATE. To minimize needless special casing
it makes sense to keep this "Guest ID" in struct kvm_arch::lpid.

This means that struct kvm_arch::lpid is too small so prepare for this
and make it an unsigned long. This is not a problem for the KVM-HV and
nestedv1 cases as their lpid values are already limited to valid ranges
so in those contexts the lpid can be used as an unsigned word safely as
needed.

In the PAPR, the H_RPT_INVALIDATE pid/lpid parameter is already
specified as an unsigned long so change pseries_rpt_invalidate() to
match that.  Update the callers of pseries_rpt_invalidate() to also take
an unsigned long if they take an lpid value.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-10-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
6ccbbc33f0 KVM: PPC: Add helper library for Guest State Buffers
The PAPR "Nestedv2" guest API introduces the concept of a Guest State
Buffer for communication about L2 guests between L1 and L0 hosts.

In the new API, the L0 manages the L2 on behalf of the L1. This means
that if the L1 needs to change L2 state (e.g. GPRs, SPRs, partition
table...), it must request the L0 perform the modification. If the
nested host needs to read L2 state likewise this request must
go through the L0.

The Guest State Buffer is a Type-Length-Value style data format defined
in the PAPR which assigns all relevant partition state a unique
identity. Unlike a typical TLV format the length is redundant as the
length of each identity is fixed but is included for checking
correctness.

A guest state buffer consists of an element count followed by a stream
of elements, where elements are composed of an ID number, data length,
then the data:

  Header:

   <---4 bytes--->
  +----------------+-----
  | Element Count  | Elements...
  +----------------+-----

  Element:

   <----2 bytes---> <-2 bytes-> <-Length bytes->
  +----------------+-----------+----------------+
  | Guest State ID |  Length   |      Data      |
  +----------------+-----------+----------------+

Guest State IDs have other attributes defined in the PAPR such as
whether they are per thread or per guest, or read-only.

Introduce a library for using guest state buffers. This includes support
for actions such as creating buffers, adding elements to buffers,
reading the value of elements and parsing buffers. This will be used
later by the nestedv2 guest support.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-9-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
6de2e837ba KVM: PPC: Book3S HV: Introduce low level MSR accessor
kvmppc_get_msr() and kvmppc_set_msr_fast() serve as accessors for the
MSR. However because the MSR is kept in the shared regs they include a
conditional check for kvmppc_shared_big_endian() and endian conversion.

Within the Book3S HV specific code there are direct reads and writes of
shregs::msr. In preparation for Nested APIv2 these accesses need to be
replaced with accessor functions so it is possible to extend their
behavior. However, using the kvmppc_get_msr() and kvmppc_set_msr_fast()
functions is undesirable because it would introduce a conditional branch
and endian conversion that is not currently present.

kvmppc_set_msr_hv() already exists, it is used for the
kvmppc_ops::set_msr callback.

Introduce a low level accessor __kvmppc_{s,g}et_msr_hv() that simply
gets and sets shregs::msr. This will be extend for Nested APIv2 support.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-8-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
ebc88ea7a6 KVM: PPC: Book3S HV: Use accessors for VCPU registers
Introduce accessor generator macros for Book3S HV VCPU registers. Use
the accessor functions to replace direct accesses to this registers.

This will be important later for Nested APIv2 support which requires
additional functionality for accessing and modifying VCPU state.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-7-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
c8ae9b3c6e KVM: PPC: Use accessors for VCORE registers
Introduce accessor generator macros for VCORE registers. Use the accessor
functions to replace direct accesses to this registers.

This will be important later for Nested APIv2 support which requires
additional functionality for accessing and modifying VCPU state.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-6-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
7028ac8d17 KVM: PPC: Use accessors for VCPU registers
Introduce accessor generator macros for VCPU registers. Use the accessor
functions to replace direct accesses to this registers.

This will be important later for Nested APIv2 support which requires
additional functionality for accessing and modifying VCPU state.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-5-jniethe5@gmail.com
2023-09-14 22:04:24 +10:00
Jordan Niethe
2a64bc6731 KVM: PPC: Rename accessor generator macros
More "wrapper" style accessor generating macros will be introduced for
the nestedv2 guest support. Rename the existing macros with more
descriptive names now so there is a consistent naming convention.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-4-jniethe5@gmail.com
2023-09-14 22:04:23 +10:00
Jordan Niethe
52425a3b3c KVM: PPC: Introduce FPR/VR accessor functions
Introduce accessor functions for floating point and vector registers
like the ones that exist for GPRs. Use these to replace the existing FPR
and VR accessor macros.

This will be important later for Nested APIv2 support which requires
additional functionality for accessing and modifying VCPU state.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-3-jniethe5@gmail.com
2023-09-14 22:04:23 +10:00
Jordan Niethe
0e85b7df9c KVM: PPC: Always use the GPR accessors
Always use the GPR accessor functions. This will be important later for
Nested APIv2 support which requires additional functionality for
accessing and modifying VCPU state.

Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230914030600.16993-2-jniethe5@gmail.com
2023-09-14 22:04:23 +10:00
Linus Torvalds
4b3d6e0c6c Merge tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:

 - Fix OF include file for ata platform drivers (Rob)

 - Simplify various ahci, sata and pata platform drivers using the
   function devm_platform_ioremap_resource() (Yangtao)

 - Cleanup libata time related argument types (e.g. timeouts values)
   (Sergey)

 - Cleanup libata code around error handling as all ata drivers now
   define a error_handler operation (Hannes and Niklas)

 - Remove functions intended for libsas that are in fact unused (Niklas)

 - Change the remove device callback of platform drivers to a null
   function (Uwe)

 - Simplify the pata_imx driver using devm_clk_get_enabled() (Li)

 - Remove old and uinused remnants of the ide code in arm, parisc,
   powerpc, sparc and m68k architectures and associated drivers
   (pata_buddha, pata_falcon and pata_gayle) (Geert)

 - Add missing MODULE_DESCRIPTION() in the sata_gemini and pata_ftide010
   drivers (me)

 - Several fixes for the pata_ep93xx and pata_falcon drivers (Nikita,
   Michael)

 - Add Elkhart Lake AHCI controller support to the ahci driver (Werner)

 - Disable NCQ trim on Micron 1100 drives (Pawel)

* tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (60 commits)
  ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives
  ata: ahci: Add Elkhart Lake AHCI controller
  ata: pata_falcon: add data_swab option to byte-swap disk data
  ata: pata_falcon: fix IO base selection for Q40
  ata: pata_ep93xx: use soc_device_match for UDMA modes
  ata: pata_ep93xx: fix error return code in probe
  ata: sata_gemini: Add missing MODULE_DESCRIPTION
  ata: pata_ftide010: Add missing MODULE_DESCRIPTION
  m68k: Remove <asm/ide.h>
  ata: pata_gayle: Remove #include <asm/ide.h>
  ata: pata_falcon: Remove #include <asm/ide.h>
  ata: pata_buddha: Remove #include <asm/ide.h>
  asm-generic: Remove ide_iops.h
  sparc: Remove <asm/ide.h>
  powerpc: Remove <asm/ide.h>
  parisc: Remove <asm/ide.h>
  ARM: Remove <asm/ide.h>
  ata: pata_imx: Use helper function devm_clk_get_enabled()
  ata: sata_rcar: Convert to platform remove callback returning void
  ata: sata_mv: Convert to platform remove callback returning void
  ...
2023-09-05 12:37:28 -07:00
Linus Torvalds
61401a8724 Merge tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - Enable -Wenum-conversion warning option

 - Refactor the rpm-pkg target

 - Fix scripts/setlocalversion to consider annotated tags for rt-kernel

 - Add a jump key feature for the search menu of 'make nconfig'

 - Support Qt6 for 'make xconfig'

 - Enable -Wformat-overflow, -Wformat-truncation, -Wstringop-overflow,
   and -Wrestrict warnings for W=1 builds

 - Replace <asm/export.h> with <linux/export.h> for alpha, ia64, and
   sparc

 - Support DEB_BUILD_OPTIONS=parallel=N for the debian source package

 - Refactor scripts/Makefile.modinst and fix some modules_sign issues

 - Add a new Kconfig env variable to warn symbols that are not defined
   anywhere

 - Show help messages of config fragments in 'make help'

* tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (62 commits)
  kconfig: fix possible buffer overflow
  kbuild: Show marked Kconfig fragments in "help"
  kconfig: add warn-unknown-symbols sanity check
  kbuild: dummy-tools: make MPROFILE_KERNEL checks work on BE
  Documentation/llvm: refresh docs
  modpost: Skip .llvm.call-graph-profile section check
  kbuild: support modules_sign for external modules as well
  kbuild: support 'make modules_sign' with CONFIG_MODULE_SIG_ALL=n
  kbuild: move more module installation code to scripts/Makefile.modinst
  kbuild: reduce the number of mkdir calls during modules_install
  kbuild: remove $(MODLIB)/source symlink
  kbuild: move depmod rule to scripts/Makefile.modinst
  kbuild: add modules_sign to no-{compiler,sync-config}-targets
  kbuild: do not run depmod for 'make modules_sign'
  kbuild: deb-pkg: support DEB_BUILD_OPTIONS=parallel=N in debian/rules
  alpha: remove <asm/export.h>
  alpha: replace #include <asm/export.h> with #include <linux/export.h>
  ia64: remove <asm/export.h>
  ia64: replace #include <asm/export.h> with #include <linux/export.h>
  sparc: remove <asm/export.h>
  ...
2023-09-05 11:01:47 -07:00
Kees Cook
feec5e1f74 kbuild: Show marked Kconfig fragments in "help"
Currently the Kconfig fragments in kernel/configs and arch/*/configs
that aren't used internally aren't discoverable through "make help",
which consists of hard-coded lists of config fragments. Instead, list
all the fragment targets that have a "# Help: " comment prefix so the
targets can be generated dynamically.

Add logic to the Makefile to search for and display the fragment and
comment. Add comments to fragments that are intended to be direct targets.

Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-09-04 02:04:20 +09:00
Linus Torvalds
0fe2b86c21 Merge tag 'v6.6-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a random config build failure on powerpc"

* tag 'v6.6-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: powerpc/chacha20,poly1305-p10 - Add dependency on VSX
2023-09-01 15:39:25 -07:00
Linus Torvalds
8e1e49550d Merge tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
 "Here is the big set of tty and serial driver changes for 6.6-rc1.

  Lots of cleanups in here this cycle, and some driver updates. Short
  summary is:

   - Jiri's continued work to make the tty code and apis be a bit more
     sane with regards to modern kernel coding style and types

   - cpm_uart driver updates

   - n_gsm updates and fixes

   - meson driver updates

   - sc16is7xx driver updates

   - 8250 driver updates for different hardware types

   - qcom-geni driver fixes

   - tegra serial driver change

   - stm32 driver updates

   - synclink_gt driver cleanups

   - tty structure size reduction

  All of these have been in linux-next this week with no reported
  issues. The last bit of cleanups from Jiri and the tty structure size
  reduction came in last week, a bit late but as they were just style
  changes and size reductions, I figured they should get into this merge
  cycle so that others can work on top of them with no merge conflicts"

* tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
  tty: shrink the size of struct tty_struct by 40 bytes
  tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw()
  tty: n_tty: extract ECHO_OP processing to a separate function
  tty: n_tty: unify counts to size_t
  tty: n_tty: use u8 for chars and flags
  tty: n_tty: simplify chars_in_buffer()
  tty: n_tty: remove unsigned char casts from character constants
  tty: n_tty: move newline handling to a separate function
  tty: n_tty: move canon handling to a separate function
  tty: n_tty: use MASK() for masking out size bits
  tty: n_tty: make n_tty_data::num_overrun unsigned
  tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun()
  tty: n_tty: use 'num' for writes' counts
  tty: n_tty: use output character directly
  tty: n_tty: make flow of n_tty_receive_buf_common() a bool
  Revert "tty: serial: meson: Add a earlycon for the T7 SoC"
  Documentation: devices.txt: Fix minors for ttyCPM*
  Documentation: devices.txt: Remove ttySIOC*
  Documentation: devices.txt: Remove ttyIOC*
  serial: 8250_bcm7271: improve bcm7271 8250 port
  ...
2023-09-01 09:38:00 -07:00
Ingo Molnar
f441ff73f1 powerpc: Fix pud_mkwrite() definition after pte_mkwrite() API changes
Fix up missed semantic mis-merge between commits

  161e393c0f ("mm: Make pte_mkwrite() take a VMA")
  27af67f356 ("powerpc/book3s64/mm: enable transparent pud hugepage")

where the newly introduced powerpc use of 'pte_mkwrite()' needs to use
the 'novma()' versions as per commit 2f0584f3f4 ("mm: Rename arch
pte_mkwrite()'s to pte_mkwrite_novma()").

Fixes: df57721f9a ("Merge tag 'x86_shstk_for_6.6-rc1' of [...]")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-31 14:49:38 -07:00
Linus Torvalds
4ad0a4c234 Merge tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:

 - Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the
   configured SMT state when hotplugging CPUs into the system

 - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the
   Radix MMU to avoid a broadcast TLBIE flush on exit

 - Drop the exclusion between ptrace/perf watchpoints, and drop the now
   unused associated arch hooks

 - Add support for the "nohlt" command line option to disable CPU idle

 - Add support for -fpatchable-function-entry for ftrace, with GCC >=
   13.1

 - Rework memory block size determination, and support 256MB size on
   systems with GPUs that have hotpluggable memory

 - Various other small features and fixes

Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira
Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam
Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel
Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof
Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar,
Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor,
Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar
Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh
Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain,
Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai.

* tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits)
  macintosh/ams: linux/platform_device.h is needed
  powerpc/xmon: Reapply "Relax frame size for clang"
  powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
  powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
  powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
  powerpc/mpc5xxx: Add missing fwnode_handle_put()
  powerpc/config: Disable SLAB_DEBUG_ON in skiroot
  powerpc/pseries: Remove unused hcall tracing instruction
  powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n
  powerpc: dts: add missing space before {
  powerpc/eeh: Use pci_dev_id() to simplify the code
  powerpc/64s: Move CPU -mtune options into Kconfig
  powerpc/powermac: Fix unused function warning
  powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT
  powerpc: Don't include lppaca.h in paca.h
  powerpc/pseries: Move hcall_vphn() prototype into vphn.h
  powerpc/pseries: Move VPHN constants into vphn.h
  cxl: Drop unused detach_spa()
  powerpc: Drop zalloc_maybe_bootmem()
  powerpc/powernv: Use struct opal_prd_msg in more places
  ...
2023-08-31 12:43:10 -07:00
Linus Torvalds
df57721f9a Merge tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 shadow stack support from Dave Hansen:
 "This is the long awaited x86 shadow stack support, part of Intel's
  Control-flow Enforcement Technology (CET).

  CET consists of two related security features: shadow stacks and
  indirect branch tracking. This series implements just the shadow stack
  part of this feature, and just for userspace.

  The main use case for shadow stack is providing protection against
  return oriented programming attacks. It works by maintaining a
  secondary (shadow) stack using a special memory type that has
  protections against modification. When executing a CALL instruction,
  the processor pushes the return address to both the normal stack and
  to the special permission shadow stack. Upon RET, the processor pops
  the shadow stack copy and compares it to the normal stack copy.

  For more information, refer to the links below for the earlier
  versions of this patch set"

Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/
Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/

* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
  x86/shstk: Change order of __user in type
  x86/ibt: Convert IBT selftest to asm
  x86/shstk: Don't retry vm_munmap() on -EINTR
  x86/kbuild: Fix Documentation/ reference
  x86/shstk: Move arch detail comment out of core mm
  x86/shstk: Add ARCH_SHSTK_STATUS
  x86/shstk: Add ARCH_SHSTK_UNLOCK
  x86: Add PTRACE interface for shadow stack
  selftests/x86: Add shadow stack test
  x86/cpufeatures: Enable CET CR4 bit for shadow stack
  x86/shstk: Wire in shadow stack interface
  x86: Expose thread features in /proc/$PID/status
  x86/shstk: Support WRSS for userspace
  x86/shstk: Introduce map_shadow_stack syscall
  x86/shstk: Check that signal frame is shadow stack mem
  x86/shstk: Check that SSP is aligned on sigreturn
  x86/shstk: Handle signals for shadow stack
  x86/shstk: Introduce routines modifying shstk
  x86/shstk: Handle thread shadow stack
  x86/shstk: Add user-mode shadow stack support
  ...
2023-08-31 12:20:12 -07:00
Linus Torvalds
1a35914f73 Merge tag 'integrity-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity
Pull integrity subsystem updates from Mimi Zohar:

 - With commit 099f26f22f ("integrity: machine keyring CA
   configuration") certificates may be loaded onto the IMA keyring,
   directly or indirectly signed by keys on either the "builtin" or the
   "machine" keyrings.

   With the ability for the system/machine owner to sign the IMA policy
   itself without needing to recompile the kernel, update the IMA
   architecture specific policy rules to require the IMA policy itself
   be signed.

   [ As commit 099f26f22f was upstreamed in linux-6.4, updating the
     IMA architecture specific policy now to require signed IMA policies
     may break userspace expectations. ]

 - IMA only checked the file data hash was not on the system blacklist
   keyring for files with an appended signature (e.g. kernel modules,
   Power kernel image).

   Check all file data hashes regardless of how it was signed

 - Code cleanup, and a kernel-doc update

* tag 'integrity-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  kexec_lock: Replace kexec_mutex() by kexec_lock() in two comments
  ima: require signed IMA policy when UEFI secure boot is enabled
  integrity: Always reference the blacklist keyring with appraisal
  ima: Remove deprecated IMA_TRUSTED_KEYRING Kconfig
2023-08-30 09:16:56 -07:00
Herbert Xu
ba22e81872 crypto: powerpc/chacha20,poly1305-p10 - Add dependency on VSX
Add dependency on VSX as otherwise the build will fail without
it.

Fixes: 161fca7e3e ("crypto: powerpc - Add chacha20/poly1305-p10 to Kconfig and Makefile")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308251906.SYawej6g-lkp@intel.com/
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-08-30 13:48:39 +08:00
Linus Torvalds
6c1b980a7e Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-maping updates from Christoph Hellwig:

 - allow dynamic sizing of the swiotlb buffer, to cater for secure
   virtualization workloads that require all I/O to be bounce buffered
   (Petr Tesarik)

 - move a declaration to a header (Arnd Bergmann)

 - check for memory region overlap in dma-contiguous (Binglei Wang)

 - remove the somewhat dangerous runtime swiotlb-xen enablement and
   unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)

 - per-node CMA improvements (Yajun Deng)

* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
  swiotlb: optimize get_max_slots()
  swiotlb: move slot allocation explanation comment where it belongs
  swiotlb: search the software IO TLB only if the device makes use of it
  swiotlb: allocate a new memory pool when existing pools are full
  swiotlb: determine potential physical address limit
  swiotlb: if swiotlb is full, fall back to a transient memory pool
  swiotlb: add a flag whether SWIOTLB is allowed to grow
  swiotlb: separate memory pool data from other allocator data
  swiotlb: add documentation and rename swiotlb_do_find_slots()
  swiotlb: make io_tlb_default_mem local to swiotlb.c
  swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
  dma-contiguous: check for memory region overlap
  dma-contiguous: support numa CMA for specified node
  dma-contiguous: support per-numa CMA for all architectures
  dma-mapping: move arch_dma_set_mask() declaration to header
  swiotlb: unexport is_swiotlb_active
  x86: always initialize xen-swiotlb when xen-pcifront is enabling
  xen/pci: add flag for PCI passthrough being possible
2023-08-29 20:32:10 -07:00
Linus Torvalds
d68b4b6f30 Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:

 - An extensive rework of kexec and crash Kconfig from Eric DeVolder
   ("refactor Kconfig to consolidate KEXEC and CRASH options")

 - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
   couple of macros to args.h")

 - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
   commands")

 - vsprintf inclusion rationalization from Andy Shevchenko
   ("lib/vsprintf: Rework header inclusions")

 - Switch the handling of kdump from a udev scheme to in-kernel
   handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
   hot un/plug")

 - Many singleton patches to various parts of the tree

* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
  document while_each_thread(), change first_tid() to use for_each_thread()
  drivers/char/mem.c: shrink character device's devlist[] array
  x86/crash: optimize CPU changes
  crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
  crash: hotplug support for kexec_load()
  x86/crash: add x86 crash hotplug support
  crash: memory and CPU hotplug sysfs attributes
  kexec: exclude elfcorehdr from the segment digest
  crash: add generic infrastructure for crash hotplug support
  crash: move a few code bits to setup support of crash hotplug
  kstrtox: consistently use _tolower()
  kill do_each_thread()
  nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
  scripts/bloat-o-meter: count weak symbol sizes
  treewide: drop CONFIG_EMBEDDED
  lockdep: fix static memory detection even more
  lib/vsprintf: declare no_hash_pointers in sprintf.h
  lib/vsprintf: split out sprintf() and friends
  kernel/fork: stop playing lockless games for exe_file replacement
  adfs: delete unused "union adfs_dirtail" definition
  ...
2023-08-29 14:53:51 -07:00
Linus Torvalds
b96a3e9142 Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:

 - Some swap cleanups from Ma Wupeng ("fix WARN_ON in
   add_to_avail_list")

 - Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
   reduces the special-case code for handling hugetlb pages in GUP. It
   also speeds up GUP handling of transparent hugepages.

 - Peng Zhang provides some maple tree speedups ("Optimize the fast path
   of mas_store()").

 - Sergey Senozhatsky has improved te performance of zsmalloc during
   compaction (zsmalloc: small compaction improvements").

 - Domenico Cerasuolo has developed additional selftest code for zswap
   ("selftests: cgroup: add zswap test program").

 - xu xin has doe some work on KSM's handling of zero pages. These
   changes are mainly to enable the user to better understand the
   effectiveness of KSM's treatment of zero pages ("ksm: support
   tracking KSM-placed zero-pages").

 - Jeff Xu has fixes the behaviour of memfd's
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
   MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").

 - David Howells has fixed an fscache optimization ("mm, netfs, fscache:
   Stop read optimisation when folio removed from pagecache").

 - Axel Rasmussen has given userfaultfd the ability to simulate memory
   poisoning ("add UFFDIO_POISON to simulate memory poisoning with
   UFFD").

 - Miaohe Lin has contributed some routine maintenance work on the
   memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
   check").

 - Peng Zhang has contributed some maintenance work on the maple tree
   code ("Improve the validation for maple tree and some cleanup").

 - Hugh Dickins has optimized the collapsing of shmem or file pages into
   THPs ("mm: free retracted page table by RCU").

 - Jiaqi Yan has a patch series which permits us to use the healthy
   subpages within a hardware poisoned huge page for general purposes
   ("Improve hugetlbfs read on HWPOISON hugepages").

 - Kemeng Shi has done some maintenance work on the pagetable-check code
   ("Remove unused parameters in page_table_check").

 - More folioification work from Matthew Wilcox ("More filesystem folio
   conversions for 6.6"), ("Followup folio conversions for zswap"). And
   from ZhangPeng ("Convert several functions in page_io.c to use a
   folio").

 - page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").

 - Baoquan He has converted some architectures to use the
   GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
   architectures to take GENERIC_IOREMAP way").

 - Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
   batched/deferred tlb shootdown during page reclamation/migration").

 - Better maple tree lockdep checking from Liam Howlett ("More strict
   maple tree lockdep"). Liam also developed some efficiency
   improvements ("Reduce preallocations for maple tree").

 - Cleanup and optimization to the secondary IOMMU TLB invalidation,
   from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
   upgrade").

 - Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
   for arm64").

 - Kemeng Shi provides some maintenance work on the compaction code
   ("Two minor cleanups for compaction").

 - Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
   most file-backed faults under the VMA lock").

 - Aneesh Kumar contributes code to use the vmemmap optimization for DAX
   on ppc64, under some circumstances ("Add support for DAX vmemmap
   optimization for ppc64").

 - page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
   data in page_ext"), ("minor cleanups to page_ext header").

 - Some zswap cleanups from Johannes Weiner ("mm: zswap: three
   cleanups").

 - kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").

 - VMA handling cleanups from Kefeng Wang ("mm: convert to
   vma_is_initial_heap/stack()").

 - DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
   implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
   address ranges and DAMON monitoring targets").

 - Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").

 - Liam Howlett has improved the maple tree node replacement code
   ("maple_tree: Change replacement strategy").

 - ZhangPeng has a general code cleanup - use the K() macro more widely
   ("cleanup with helper macro K()").

 - Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
   memmap on memory feature on ppc64").

 - pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
   in page_alloc"), ("Two minor cleanups for get pageblock
   migratetype").

 - Vishal Moola introduces a memory descriptor for page table tracking,
   "struct ptdesc" ("Split ptdesc from struct page").

 - memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
   for vm.memfd_noexec").

 - MM include file rationalization from Hugh Dickins ("arch: include
   asm/cacheflush.h in asm/hugetlb.h").

 - THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
   output").

 - kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
   object_cache instead of kmemleak_initialized").

 - More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
   and _folio_order").

 - A VMA locking scalability improvement from Suren Baghdasaryan
   ("Per-VMA lock support for swap and userfaults").

 - pagetable handling cleanups from Matthew Wilcox ("New page table
   range API").

 - A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
   using page->private on tail pages for THP_SWAP + cleanups").

 - Cleanups and speedups to the hugetlb fault handling from Matthew
   Wilcox ("Change calling convention for ->huge_fault").

 - Matthew Wilcox has also done some maintenance work on the MM
   subsystem documentation ("Improve mm documentation").

* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
  maple_tree: shrink struct maple_tree
  maple_tree: clean up mas_wr_append()
  secretmem: convert page_is_secretmem() to folio_is_secretmem()
  nios2: fix flush_dcache_page() for usage from irq context
  hugetlb: add documentation for vma_kernel_pagesize()
  mm: add orphaned kernel-doc to the rst files.
  mm: fix clean_record_shared_mapping_range kernel-doc
  mm: fix get_mctgt_type() kernel-doc
  mm: fix kernel-doc warning from tlb_flush_rmaps()
  mm: remove enum page_entry_size
  mm: allow ->huge_fault() to be called without the mmap_lock held
  mm: move PMD_ORDER to pgtable.h
  mm: remove checks for pte_index
  memcg: remove duplication detection for mem_cgroup_uncharge_swap
  mm/huge_memory: work on folio->swap instead of page->private when splitting folio
  mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
  mm/swap: use dedicated entry for swap in folio
  mm/swap: stop using page->private on tail pages for THP_SWAP
  selftests/mm: fix WARNING comparing pointer to 0
  selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
  ...
2023-08-29 14:25:26 -07:00
Linus Torvalds
bd6c11bc43 Merge tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
 "Core:

   - Increase size limits for to-be-sent skb frag allocations. This
     allows tun, tap devices and packet sockets to better cope with
     large writes operations

   - Store netdevs in an xarray, to simplify iterating over netdevs

   - Refactor nexthop selection for multipath routes

   - Improve sched class lifetime handling

   - Add backup nexthop ID support for bridge

   - Implement drop reasons support in openvswitch

   - Several data races annotations and fixes

   - Constify the sk parameter of routing functions

   - Prepend kernel version to netconsole message

  Protocols:

   - Implement support for TCP probing the peer being under memory
     pressure

   - Remove hard coded limitation on IPv6 specific info placement inside
     the socket struct

   - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per
     socket scaling factor

   - Scaling-up the IPv6 expired route GC via a separated list of
     expiring routes

   - In-kernel support for the TLS alert protocol

   - Better support for UDP reuseport with connected sockets

   - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR
     header size

   - Get rid of additional ancillary per MPTCP connection struct socket

   - Implement support for BPF-based MPTCP packet schedulers

   - Format MPTCP subtests selftests results in TAP

   - Several new SMC 2.1 features including unique experimental options,
     max connections per lgr negotiation, max links per lgr negotiation

  BPF:

   - Multi-buffer support in AF_XDP

   - Add multi uprobe BPF links for attaching multiple uprobes and usdt
     probes, which is significantly faster and saves extra fds

   - Implement an fd-based tc BPF attach API (TCX) and BPF link support
     on top of it

   - Add SO_REUSEPORT support for TC bpf_sk_assign

   - Support new instructions from cpu v4 to simplify the generated code
     and feature completeness, for x86, arm64, riscv64

   - Support defragmenting IPv(4|6) packets in BPF

   - Teach verifier actual bounds of bpf_get_smp_processor_id() and fix
     perf+libbpf issue related to custom section handling

   - Introduce bpf map element count and enable it for all program types

   - Add a BPF hook in sys_socket() to change the protocol ID from
     IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy

   - Introduce bpf_me_mcache_free_rcu() and fix OOM under stress

   - Add uprobe support for the bpf_get_func_ip helper

   - Check skb ownership against full socket

   - Support for up to 12 arguments in BPF trampoline

   - Extend link_info for kprobe_multi and perf_event links

  Netfilter:

   - Speed-up process exit by aborting ruleset validation if a fatal
     signal is pending

   - Allow NLA_POLICY_MASK to be used with BE16/BE32 types

  Driver API:

   - Page pool optimizations, to improve data locality and cache usage

   - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the
     need for raw ioctl() handling in drivers

   - Simplify genetlink dump operations (doit/dumpit) providing them the
     common information already populated in struct genl_info

   - Extend and use the yaml devlink specs to [re]generate the split ops

   - Introduce devlink selective dumps, to allow SF filtering SF based
     on handle and other attributes

   - Add yaml netlink spec for netlink-raw families, allow route, link
     and address related queries via the ynl tool

   - Remove phylink legacy mode support

   - Support offload LED blinking to phy

   - Add devlink port function attributes for IPsec

  New hardware / drivers:

   - Ethernet:
      - Broadcom ASP 2.0 (72165) ethernet controller
      - MediaTek MT7988 SoC
      - Texas Instruments AM654 SoC
      - Texas Instruments IEP driver
      - Atheros qca8081 phy
      - Marvell 88Q2110 phy
      - NXP TJA1120 phy

   - WiFi:
      - MediaTek mt7981 support

   - Can:
      - Kvaser SmartFusion2 PCI Express devices
      - Allwinner T113 controllers
      - Texas Instruments tcan4552/4553 chips

   - Bluetooth:
      - Intel Gale Peak
      - Qualcomm WCN3988 and WCN7850
      - NXP AW693 and IW624
      - Mediatek MT2925

  Drivers:

   - Ethernet NICs:
      - nVidia/Mellanox:
         - mlx5:
            - support UDP encapsulation in packet offload mode
            - IPsec packet offload support in eswitch mode
            - improve aRFS observability by adding new set of counters
            - extends MACsec offload support to cover RoCE traffic
            - dynamic completion EQs
         - mlx4:
            - convert to use auxiliary bus instead of custom interface
              logic
      - Intel
         - ice:
            - implement switchdev bridge offload, even for LAG
              interfaces
            - implement SRIOV support for LAG interfaces
         - igc:
            - add support for multiple in-flight TX timestamps
      - Broadcom:
         - bnxt:
            - use the unified RX page pool buffers for XDP and non-XDP
            - use the NAPI skb allocation cache
      - OcteonTX2:
         - support Round Robin scheduling HTB offload
         - TC flower offload support for SPI field
      - Freescale:
         - add XDP_TX feature support
      - AMD:
         - ionic: add support for PCI FLR event
         - sfc:
            - basic conntrack offload
            - introduce eth, ipv4 and ipv6 pedit offloads
      - ST Microelectronics:
         - stmmac: maximze PTP timestamping resolution

   - Virtual NICs:
      - Microsoft vNIC:
         - batch ringing RX queue doorbell on receiving packets
         - add page pool for RX buffers
      - Virtio vNIC:
         - add per queue interrupt coalescing support
      - Google vNIC:
         - add queue-page-list mode support

   - Ethernet high-speed switches:
      - nVidia/Mellanox (mlxsw):
         - add port range matching tc-flower offload
         - permit enslavement to netdevices with uppers

   - Ethernet embedded switches:
      - Marvell (mv88e6xxx):
         - convert to phylink_pcs
      - Renesas:
         - r8A779fx: add speed change support
         - rzn1: enables vlan support

   - Ethernet PHYs:
      - convert mv88e6xxx to phylink_pcs

   - WiFi:
      - Qualcomm Wi-Fi 7 (ath12k):
         - extremely High Throughput (EHT) PHY support
      - RealTek (rtl8xxxu):
         - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
           RTL8192EU and RTL8723BU
      - RealTek (rtw89):
         - Introduce Time Averaged SAR (TAS) support

   - Connector:
      - support for event filtering"

* tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits)
  net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show
  net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler
  net: stmmac: clarify difference between "interface" and "phy_interface"
  r8152: add vendor/device ID pair for D-Link DUB-E250
  devlink: move devlink_notify_register/unregister() to dev.c
  devlink: move small_ops definition into netlink.c
  devlink: move tracepoint definitions into core.c
  devlink: push linecard related code into separate file
  devlink: push rate related code into separate file
  devlink: push trap related code into separate file
  devlink: use tracepoint_enabled() helper
  devlink: push region related code into separate file
  devlink: push param related code into separate file
  devlink: push resource related code into separate file
  devlink: push dpipe related code into separate file
  devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper
  devlink: push shared buffer related code into separate file
  devlink: push port related code into separate file
  devlink: push object register/unregister notifications into separate helpers
  inet: fix IP_TRANSPARENT error handling
  ...
2023-08-29 11:33:01 -07:00
Linus Torvalds
68cf01760b Merge tag 'v6.6-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "API:
   - Move crypto engine callback from tfm ctx into algorithm object
   - Fix atomic sleep bug in crypto_destroy_instance
   - Move lib/mpi into lib/crypto

  Algorithms:
   - Add chacha20 and poly1305 implementation for powerpc p10

  Drivers:
   - Add AES skcipher and aead support to starfive
   - Add Dynamic Boost Control support to ccp
   - Add support for STM32P13 platform to stm32"

* tag 'v6.6-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (149 commits)
  Revert "dt-bindings: crypto: qcom,prng: Add SM8450"
  crypto: chelsio - Remove unused declarations
  X.509: if signature is unsupported skip validation
  crypto: qat - fix crypto capability detection for 4xxx
  crypto: drivers - Explicitly include correct DT includes
  crypto: engine - Remove crypto_engine_ctx
  crypto: zynqmp - Use new crypto_engine_op interface
  crypto: virtio - Use new crypto_engine_op interface
  crypto: stm32 - Use new crypto_engine_op interface
  crypto: jh7110 - Use new crypto_engine_op interface
  crypto: rk3288 - Use new crypto_engine_op interface
  crypto: omap - Use new crypto_engine_op interface
  crypto: keembay - Use new crypto_engine_op interface
  crypto: sl3516 - Use new crypto_engine_op interface
  crypto: caam - Use new crypto_engine_op interface
  crypto: aspeed - Remove non-standard sha512 algorithms
  crypto: aspeed - Use new crypto_engine_op interface
  crypto: amlogic - Use new crypto_engine_op interface
  crypto: sun8i-ss - Use new crypto_engine_op interface
  crypto: sun8i-ce - Use new crypto_engine_op interface
  ...
2023-08-29 11:23:29 -07:00
Nick Desaulniers
90bae4d99b powerpc/xmon: Reapply "Relax frame size for clang"
This is a manual revert of commit 7f3c5d099b ("Revert "powerpc/xmon:
Relax frame size for clang"") but using ccflags-$(CONFIG_CC_IS_CLANG)
which is shorter.

Turns out that this is reproducible still under specific compiler
versions (mea culpa: I did not test every supported version of clang),
and even a few randconfigs bots found.

We'll have to revisit this again in the future, for now back this out.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Closes: https://github.com/ClangBuiltLinux/linux/issues/252#issuecomment-1690371256
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/llvm/202308260344.Vc4Giuk7-lkp@intel.com/
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230828-ppc_rerevert-v2-1-46b71a3656c6@google.com
2023-08-29 17:06:45 +10:00
Linus Torvalds
475d4df827 Merge tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull fchmodat2 system call from Christian Brauner:
 "This adds the fchmodat2() system call. It is a revised version of the
  fchmodat() system call, adding a missing flag argument. Support for
  both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH are included.

  Adding this system call revision has been a longstanding request but
  so far has always fallen through the cracks. While the kernel
  implementation of fchmodat() does not have a flag argument the libc
  provided POSIX-compliant fchmodat(3) version does. Both glibc and musl
  have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW
  (see [1] and [2]).

  The workaround is brittle because it relies not just on O_PATH and
  O_NOFOLLOW semantics and procfs magic links but also on our rather
  inconsistent symlink semantics.

  This gives userspace a proper fchmodat2() system call that libcs can
  use to properly implement fchmodat(3) and allows them to get rid of
  their hacks. In this case it will immediately benefit them as the
  current workaround is already defunct because of aformentioned
  inconsistencies.

  In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use
  AT_EMPTY_PATH with fchmodat2(). This is already possible with
  fchownat() so there's no reason to not also support it for
  fchmodat2().

  The implementation is simple and comes with selftests. Implementation
  of the system call and wiring up the system call are done as separate
  patches even though they could arguably be one patch. But in case
  there are merge conflicts from other system call additions it can be
  beneficial to have separate patches"

Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1]
Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2]

* tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  selftests: fchmodat2: remove duplicate unneeded defines
  fchmodat2: add support for AT_EMPTY_PATH
  selftests: Add fchmodat2 selftest
  arch: Register fchmodat2, usually as syscall 452
  fs: Add fchmodat2()
  Non-functional cleanup of a "__user * filename"
2023-08-28 11:25:27 -07:00
Linus Torvalds
615e95831e Merge tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs timestamp updates from Christian Brauner:
 "This adds VFS support for multi-grain timestamps and converts tmpfs,
  xfs, ext4, and btrfs to use them. This carries acks from all relevant
  filesystems.

  The VFS always uses coarse-grained timestamps when updating the ctime
  and mtime after a change. This has the benefit of allowing filesystems
  to optimize away a lot of metadata updates, down to around 1 per
  jiffy, even when a file is under heavy writes.

  Unfortunately, this has always been an issue when we're exporting via
  NFSv3, which relies on timestamps to validate caches. A lot of changes
  can happen in a jiffy, so timestamps aren't sufficient to help the
  client decide to invalidate the cache.

  Even with NFSv4, a lot of exported filesystems don't properly support
  a change attribute and are subject to the same problems with timestamp
  granularity. Other applications have similar issues with timestamps
  (e.g., backup applications).

  If we were to always use fine-grained timestamps, that would improve
  the situation, but that becomes rather expensive, as the underlying
  filesystem would have to log a lot more metadata updates.

  This introduces fine-grained timestamps that are used when they are
  actively queried.

  This uses the 31st bit of the ctime tv_nsec field to indicate that
  something has queried the inode for the mtime or ctime. When this flag
  is set, on the next mtime or ctime update, the kernel will fetch a
  fine-grained timestamp instead of the usual coarse-grained one.

  As POSIX generally mandates that when the mtime changes, the ctime
  must also change the kernel always stores normalized ctime values, so
  only the first 30 bits of the tv_nsec field are ever used.

  Filesytems can opt into this behavior by setting the FS_MGTIME flag in
  the fstype. Filesystems that don't set this flag will continue to use
  coarse-grained timestamps.

  Various preparatory changes, fixes and cleanups are included:

   - Fixup all relevant places where POSIX requires updating ctime
     together with mtime. This is a wide-range of places and all
     maintainers provided necessary Acks.

   - Add new accessors for inode->i_ctime directly and change all
     callers to rely on them. Plain accesses to inode->i_ctime are now
     gone and it is accordingly rename to inode->__i_ctime and commented
     as requiring accessors.

   - Extend generic_fillattr() to pass in a request mask mirroring in a
     sense the statx() uapi. This allows callers to pass in a request
     mask to only get a subset of attributes filled in.

   - Rework timestamp updates so it's possible to drop the @now
     parameter the update_time() inode operation and associated helpers.

   - Add inode_update_timestamps() and convert all filesystems to it
     removing a bunch of open-coding"

* tag 'v6.6-vfs.ctime' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (107 commits)
  btrfs: convert to multigrain timestamps
  ext4: switch to multigrain timestamps
  xfs: switch to multigrain timestamps
  tmpfs: add support for multigrain timestamps
  fs: add infrastructure for multigrain timestamps
  fs: drop the timespec64 argument from update_time
  xfs: have xfs_vn_update_time gets its own timestamp
  fat: make fat_update_time get its own timestamp
  fat: remove i_version handling from fat_update_time
  ubifs: have ubifs_update_time use inode_update_timestamps
  btrfs: have it use inode_update_timestamps
  fs: drop the timespec64 arg from generic_update_time
  fs: pass the request_mask to generic_fillattr
  fs: remove silly warning from current_time
  gfs2: fix timestamp handling on quota inodes
  fs: rename i_ctime field to __i_ctime
  selinux: convert to ctime accessor functions
  security: convert to ctime accessor functions
  apparmor: convert to ctime accessor functions
  sunrpc: convert to ctime accessor functions
  ...
2023-08-28 09:31:32 -07:00
Aneesh Kumar K.V
4c33bf1472 powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached
Commit 4d15721177 ("powerpc/mm: Cleanup memory block size probing")
used 256MB as the memory block size when we have
ibm,coherent-device-memory device tree node present. Instead of
returning with 256MB memory block size, continue to check the rest of
the memory regions and make sure we can still map them using a 256MB
memory block size.

Fixes: 4d15721177 ("powerpc/mm: Cleanup memory block size probing")
Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230828074658.59553-2-aneesh.kumar@linux.ibm.com
2023-08-28 22:40:58 +10:00
Aneesh Kumar K.V
f1424755db powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled
With CONFIG_SPARSEMEM disabled the below kernel build error is observed.

 arch/powerpc/mm/init_64.c:477:38: error: use of undeclared identifier 'SECTION_SIZE_BITS'

CONFIG_MEMORY_HOTPLUG depends on CONFIG_SPARSEMEM and it is more clear
to describe the code dependency in terms of MEMORY_HOTPLUG. Outside
memory hotplug the kernel uses memory_block_size for kernel directmap.
Instead of depending on SECTION_SIZE_BITS to compute the direct map
page size, add a new #define which defaults to 16M(same as existing
SECTION_SIZE)

Fixes: 4d15721177 ("powerpc/mm: Cleanup memory block size probing")
Signed-off-by: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308251532.k9PpWEAD-lkp@intel.com/
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230828074658.59553-1-aneesh.kumar@linux.ibm.com
2023-08-28 22:22:06 +10:00
Linus Torvalds
6f0edbb833 Merge tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "18 hotfixes. 13 are cc:stable and the remainder pertain to post-6.4
  issues or aren't considered suitable for a -stable backport"

* tag 'mm-hotfixes-stable-2023-08-25-11-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  shmem: fix smaps BUG sleeping while atomic
  selftests: cachestat: catch failing fsync test on tmpfs
  selftests: cachestat: test for cachestat availability
  maple_tree: disable mas_wr_append() when other readers are possible
  madvise:madvise_free_pte_range(): don't use mapcount() against large folio for sharing check
  madvise:madvise_free_huge_pmd(): don't use mapcount() against large folio for sharing check
  madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check
  mm: multi-gen LRU: don't spin during memcg release
  mm: memory-failure: fix unexpected return value in soft_offline_page()
  radix tree: remove unused variable
  mm: add a call to flush_cache_vmap() in vmap_pfn()
  selftests/mm: FOLL_LONGTERM need to be updated to 0x100
  nilfs2: fix general protection fault in nilfs_lookup_dirty_data_buffers()
  mm/gup: handle cont-PTE hugetlb pages correctly in gup_must_unshare() via GUP-fast
  selftests: cgroup: fix test_kmem_basic less than error
  mm: enable page walking API to lock vmas during the walk
  smaps: use vm_normal_page_pmd() instead of follow_trans_huge_pmd()
  mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
2023-08-25 11:44:43 -07:00
Russell Currey
c37b6908f7 powerpc/iommu: Fix notifiers being shared by PCI and VIO buses
fail_iommu_setup() registers the fail_iommu_bus_notifier struct to both
PCI and VIO buses.  struct notifier_block is a linked list node, so this
causes any notifiers later registered to either bus type to also be
registered to the other since they share the same node.

This causes issues in (at least) the vgaarb code, which registers a
notifier for PCI buses.  pci_notify() ends up being called on a vio
device, converted with to_pci_dev() even though it's not a PCI device,
and finally makes a bad access in vga_arbiter_add_pci_device() as
discovered with KASAN:

 BUG: KASAN: slab-out-of-bounds in vga_arbiter_add_pci_device+0x60/0xe00
 Read of size 4 at addr c000000264c26fdc by task swapper/0/1

 Call Trace:
   dump_stack_lvl+0x1bc/0x2b8 (unreliable)
   print_report+0x3f4/0xc60
   kasan_report+0x244/0x698
   __asan_load4+0xe8/0x250
   vga_arbiter_add_pci_device+0x60/0xe00
   pci_notify+0x88/0x444
   notifier_call_chain+0x104/0x320
   blocking_notifier_call_chain+0xa0/0x140
   device_add+0xac8/0x1d30
   device_register+0x58/0x80
   vio_register_device_node+0x9ac/0xce0
   vio_bus_scan_register_devices+0xc4/0x13c
   __machine_initcall_pseries_vio_device_init+0x94/0xf0
   do_one_initcall+0x12c/0xaa8
   kernel_init_freeable+0xa48/0xba8
   kernel_init+0x64/0x400
   ret_from_kernel_thread+0x5c/0x64

Fix this by creating separate notifier_block structs for each bus type.

Fixes: d6b9a81b2a ("powerpc: IOMMU fault injection")
Reported-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
[mpe: Add #ifdef to fix CONFIG_IBMVIO=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230322035322.328709-1-ruscur@russell.cc
2023-08-25 23:44:35 +10:00
Matthew Wilcox (Oracle)
9fee28baa6 powerpc: implement the new page table range API
Add set_ptes(), update_mmu_cache_range() and flush_dcache_folio().  Change
the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page to per-folio.

[willy@infradead.org: re-export flush_dcache_icache_folio()]
  Link: https://lkml.kernel.org/r/ZMx1daYwvD9EM7Cv@casper.infradead.org
Link: https://lkml.kernel.org/r/20230802151406.3735276-22-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:20:23 -07:00