Commit Graph

3680 Commits

Author SHA1 Message Date
Phil Elwell
1ac2f86fef arch/arm: Add model string to cpuinfo
Signed-off-by: Phil Elwell <phil@raspberrypi.org>
2024-03-05 13:54:46 +00:00
Phil Elwell
f550dc969a ARM: Activate FIQs to avoid __irq_startup warnings
There is a new test in __irq_startup that the IRQ is activated, which
hasn't been the case for FIQs since they bypass some of the usual setup.

Augment enable_fiq to include a call to irq_activate to avoid the
warning.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>
2024-03-05 13:54:42 +00:00
popcornmix
55cf481631 Add dwc_otg driver
Signed-off-by: popcornmix <popcornmix@gmail.com>

usb: dwc: fix lockdep false positive

Signed-off-by: Kari Suvanto <karis79@gmail.com>

usb: dwc: fix inconsistent lock state

Signed-off-by: Kari Suvanto <karis79@gmail.com>

Add FIQ patch to dwc_otg driver. Enable with dwc_otg.fiq_fix_enable=1. Should give about 10% more ARM performance.
Thanks to Gordon and Costas

Avoid dynamic memory allocation for channel lock in USB driver. Thanks ddv2005.

Add NAK holdoff scheme. Enabled by default, disable with dwc_otg.nak_holdoff_enable=0. Thanks gsh

Make sure we wait for the reset to finish

dwc_otg: fix bug in dwc_otg_hcd.c resulting in silent kernel
	 memory corruption, escalating to OOPS under high USB load.

dwc_otg: Fix unsafe access of QTD during URB enqueue

In dwc_otg_hcd_urb_enqueue during qtd creation, it was possible that the
transaction could complete almost immediately after the qtd was assigned
to a host channel during URB enqueue, which meant the qtd pointer was no
longer valid having been completed and removed. Usually, this resulted in
an OOPS during URB submission. By predetermining whether transactions
need to be queued or not, this unsafe pointer access is avoided.

This bug was only evident on the Pi model A where a device was attached
that had no periodic endpoints (e.g. USB pendrive or some wlan devices).

dwc_otg: Fix incorrect URB allocation error handling

If the memory allocation for a dwc_otg_urb failed, the kernel would OOPS
because for some reason a member of the *unallocated* struct was set to
zero. Error handling changed to fail correctly.

dwc_otg: fix potential use-after-free case in interrupt handler

If a transaction had previously aborted, certain interrupts are
enabled to track error counts and reset where necessary. On IN
endpoints the host generates an ACK interrupt near-simultaneously
with completion of transfer. In the case where this transfer had
previously had an error, this results in a use-after-free on
the QTD memory space with a 1-byte length being overwritten to
0x00.

dwc_otg: add handling of SPLIT transaction data toggle errors

Previously a data toggle error on packets from a USB1.1 device behind
a TT would result in the Pi locking up as the driver never handled
the associated interrupt. Patch adds basic retry mechanism and
interrupt acknowledgement to cater for either a chance toggle error or
for devices that have a broken initial toggle state (FT8U232/FT232BM).

dwc_otg: implement tasklet for returning URBs to usbcore hcd layer

The dwc_otg driver interrupt handler for transfer completion will spend
a very long time with interrupts disabled when a URB is completed -
this is because usb_hcd_giveback_urb is called from within the handler
which for a USB device driver with complicated processing (e.g. webcam)
will take an exorbitant amount of time to complete. This results in
missed completion interrupts for other USB packets which lead to them
being dropped due to microframe overruns.

This patch splits returning the URB to the usb hcd layer into a
high-priority tasklet. This will have most benefit for isochronous IN
transfers but will also have incidental benefit where multiple periodic
devices are active at once.

dwc_otg: fix NAK holdoff and allow on split transactions only

This corrects a bug where if a single active non-periodic endpoint
had at least one transaction in its qh, on frnum == MAX_FRNUM the qh
would get skipped and never get queued again. This would result in
a silent device until error detection (automatic or otherwise) would
either reset the device or flush and requeue the URBs.

Additionally the NAK holdoff was enabled for all transactions - this
would potentially stall a HS endpoint for 1ms if a previous error state
enabled this interrupt and the next response was a NAK. Fix so that
only split transactions get held off.

dwc_otg: Call usb_hcd_unlink_urb_from_ep with lock held in completion handler

usb_hcd_unlink_urb_from_ep must be called with the HCD lock held.  Calling it
asynchronously in the tasklet was not safe (regression in
c4564d4a1a).

This change unlinks it from the endpoint prior to queueing it for handling in
the tasklet, and also adds a check to ensure the urb is OK to be unlinked
before doing so.

NULL pointer dereference kernel oopses had been observed in usb_hcd_giveback_urb
when a USB device was unplugged/replugged during data transfer.  This effect
was reproduced using automated USB port power control, hundreds of replug
events were performed during active transfers to confirm that the problem was
eliminated.

USB fix using a FIQ to implement split transactions

This commit adds a FIQ implementaion that schedules
the split transactions using a FIQ so we don't get
held off by the interrupt latency of Linux

dwc_otg: fix device attributes and avoid kernel warnings on boot

dcw_otg: avoid logging function that can cause panics

See: https://github.com/raspberrypi/firmware/issues/21
Thanks to cleverca22 for fix

dwc_otg: mask correct interrupts after transaction error recovery

The dwc_otg driver will unmask certain interrupts on a transaction
that previously halted in the error state in order to reset the
QTD error count. The various fine-grained interrupt handlers do not
consider that other interrupts besides themselves were unmasked.

By disabling the two other interrupts only ever enabled in DMA mode
for this purpose, we can avoid unnecessary function calls in the
IRQ handler. This will also prevent an unneccesary FIQ interrupt
from being generated if the FIQ is enabled.

dwc_otg: fiq: prevent FIQ thrash and incorrect state passing to IRQ

In the case of a transaction to a device that had previously aborted
due to an error, several interrupts are enabled to reset the error
count when a device responds. This has the side-effect of making the
FIQ thrash because the hardware will generate multiple instances of
a NAK on an IN bulk/interrupt endpoint and multiple instances of ACK
on an OUT bulk/interrupt endpoint. Make the FIQ mask and clear the
associated interrupts.

Additionally, on non-split transactions make sure that only unmasked
interrupts are cleared. This caused a hard-to-trigger but serious
race condition when you had the combination of an endpoint awaiting
error recovery and a transaction completed on an endpoint - due to
the sequencing and timing of interrupts generated by the dwc_otg core,
it was possible to confuse the IRQ handler.

Fix function tracing

dwc_otg: whitespace cleanup in dwc_otg_urb_enqueue

dwc_otg: prevent OOPSes during device disconnects

The dwc_otg_urb_enqueue function is thread-unsafe. In particular the
access of urb->hcpriv, usb_hcd_link_urb_to_ep, dwc_otg_urb->qtd and
friends does not occur within a critical section and so if a device
was unplugged during activity there was a high chance that the
usbcore hub_thread would try to disable the endpoint with partially-
formed entries in the URB queue. This would result in BUG() or null
pointer dereferences.

Fix so that access of urb->hcpriv, enqueuing to the hardware and
adding to usbcore endpoint URB lists is contained within a single
critical section.

dwc_otg: prevent BUG() in TT allocation if hub address is > 16

A fixed-size array is used to track TT allocation. This was
previously set to 16 which caused a crash because
dwc_otg_hcd_allocate_port would read past the end of the array.

This was hit if a hub was plugged in which enumerated as addr > 16,
due to previous device resets or unplugs.

Also add #ifdef FIQ_DEBUG around hcd->hub_port_alloc[], which grows
to a large size if 128 hub addresses are supported. This field is
for debug only for tracking which frame an allocate happened in.

dwc_otg: make channel halts with unknown state less damaging

If the IRQ received a channel halt interrupt through the FIQ
with no other bits set, the IRQ would not release the host
channel and never complete the URB.

Add catchall handling to treat as a transaction error and retry.

dwc_otg: fiq_split: use TTs with more granularity

This fixes certain issues with split transaction scheduling.

- Isochronous multi-packet OUT transactions now hog the TT until
  they are completed - this prevents hubs aborting transactions
  if they get a periodic start-split out-of-order
- Don't perform TT allocation on non-periodic endpoints - this
  allows simultaneous use of the TT's bulk/control and periodic
  transaction buffers

This commit will mainly affect USB audio playback.

dwc_otg: fix potential sleep while atomic during urb enqueue

Fixes a regression introduced with eb1b482a. Kmalloc called from
dwc_otg_hcd_qtd_add / dwc_otg_hcd_qtd_create did not always have
the GPF_ATOMIC flag set. Force this flag when inside the larger
critical section.

dwc_otg: make fiq_split_enable imply fiq_fix_enable

Failing to set up the FIQ correctly would result in
"IRQ 32: nobody cared" errors in dmesg.

dwc_otg: prevent crashes on host port disconnects

Fix several issues resulting in crashes or inconsistent state
if a Model A root port was disconnected.

- Clean up queue heads properly in kill_urbs_in_qh_list by
  removing the empty QHs from the schedule lists
- Set the halt status properly to prevent IRQ handlers from
  using freed memory
- Add fiq_split related cleanup for saved registers
- Make microframe scheduling reclaim host channels if
  active during a disconnect
- Abort URBs with -ESHUTDOWN status response, informing
  device drivers so they respond in a more correct fashion
  and don't try to resubmit URBs
- Prevent IRQ handlers from attempting to handle channel
  interrupts if the associated URB was dequeued (and the
  driver state was cleared)

dwc_otg: prevent leaking URBs during enqueue

A dwc_otg_urb would get leaked if the HCD enqueue function
failed for any reason. Free the URB at the appropriate points.

dwc_otg: Enable NAK holdoff for control split transactions

Certain low-speed devices take a very long time to complete a
data or status stage of a control transaction, producing NAK
responses until they complete internal processing - the USB2.0
spec limit is up to 500mS. This causes the same type of interrupt
storm as seen with USB-serial dongles prior to c8edb238.

In certain circumstances, usually while booting, this interrupt
storm could cause SD card timeouts.

dwc_otg: Fix for occasional lockup on boot when doing a USB reset

dwc_otg: Don't issue traffic to LS devices in FS mode

Issuing low-speed packets when the root port is in full-speed mode
causes the root port to stop responding. Explicitly fail when
enqueuing URBs to a LS endpoint on a FS bus.

Fix ARM architecture issue with local_irq_restore()

If local_fiq_enable() is called before a local_irq_restore(flags) where
the flags variable has the F bit set, the FIQ will be erroneously disabled.

Fixup arch_local_irq_restore to avoid trampling the F bit in CPSR.

Also fix some of the hacks previously implemented for previous dwc_otg
incarnations.

dwc_otg: fiq_fsm: Base commit for driver rewrite

This commit removes the previous FIQ fixes entirely and adds fiq_fsm.

This rewrite features much more complete support for split transactions
and takes into account several OTG hardware bugs. High-speed
isochronous transactions are also capable of being performed by fiq_fsm.

All driver options have been removed and replaced with:
  - dwc_otg.fiq_enable (bool)
  - dwc_otg.fiq_fsm_enable (bool)
  - dwc_otg.fiq_fsm_mask (bitmask)
  - dwc_otg.nak_holdoff (unsigned int)

Defaults are specified such that fiq_fsm behaves similarly to the
previously implemented FIQ fixes.

fiq_fsm: Push error recovery into the FIQ when fiq_fsm is used

If the transfer associated with a QTD failed due to a bus error, the HCD
would retry the transfer up to 3 times (implementing the USB2.0
three-strikes retry in software).

Due to the masking mechanism used by fiq_fsm, it is only possible to pass
a single interrupt through to the HCD per-transfer.

In this instance host channels would fall off the radar because the error
reset would function, but the subsequent channel halt would be lost.

Push the error count reset into the FIQ handler.

fiq_fsm: Implement timeout mechanism

For full-speed endpoints with a large packet size, interrupt latency
runs the risk of the FIQ starting a transaction too late in a full-speed
frame. If the device is still transmitting data when EOF2 for the
downstream frame occurs, the hub will disable the port. This change is
not reflected in the hub status endpoint and the device becomes
unresponsive.

Prevent high-bandwidth transactions from being started too late in a
frame. The mechanism is not guaranteed: a combination of bit stuffing
and hub latency may still result in a device overrunning.

fiq_fsm: fix bounce buffer utilisation for Isochronous OUT

Multi-packet isochronous OUT transactions were subject to a few bounday
bugs. Fix them.

Audio playback is now much more robust: however, an issue stands with
devices that have adaptive sinks - ALSA plays samples too fast.

dwc_otg: Return full-speed frame numbers in HS mode

The frame counter increments on every *microframe* in high-speed mode.
Most device drivers expect this number to be in full-speed frames - this
caused considerable confusion to e.g. snd_usb_audio which uses the
frame counter to estimate the number of samples played.

fiq_fsm: save PID on completion of interrupt OUT transfers

Also add edge case handling for interrupt transports.

Note that for periodic split IN, data toggles are unimplemented in the
OTG host hardware - it unconditionally accepts any PID.

fiq_fsm: add missing case for fiq_fsm_tt_in_use()

Certain combinations of bitrate and endpoint activity could
result in a periodic transaction erroneously getting started
while the previous Isochronous OUT was still active.

fiq_fsm: clear hcintmsk for aborted transactions

Prevents the FIQ from erroneously handling interrupts
on a timed out channel.

fiq_fsm: enable by default

fiq_fsm: fix dequeues for non-periodic split transactions

If a dequeue happened between the SSPLIT and CSPLIT phases of the
transaction, the HCD would never receive an interrupt.

fiq_fsm: Disable by default

fiq_fsm: Handle HC babble errors

The HCTSIZ transfer size field raises a babble interrupt if
the counter wraps. Handle the resulting interrupt in this case.

dwc_otg: fix interrupt registration for fiq_enable=0

Additionally make the module parameter conditional for wherever
hcd->fiq_state is touched.

fiq_fsm: Enable by default

dwc_otg: Fix various issues with root port and transaction errors

Process the host port interrupts correctly (and don't trample them).
Root port hotplug now functional again.

Fix a few thinkos with the transaction error passthrough for fiq_fsm.

fiq_fsm: Implement hack for Split Interrupt transactions

Hubs aren't too picky about which endpoint we send Control type split
transactions to. By treating Interrupt transfers as Control, it is
possible to use the non-periodic queue in the OTG core as well as the
non-periodic FIFOs in the hub itself. This massively reduces the
microframe exclusivity/contention that periodic split transactions
otherwise have to enforce.

It goes without saying that this is a fairly egregious USB specification
violation, but it works.

Original idea by Hans Petter Selasky @ FreeBSD.org.

dwc_otg: FIQ support on SMP. Set up FIQ stack and handler on Core 0 only.

dwc_otg: introduce fiq_fsm_spin(un|)lock()

SMP safety for the FIQ relies on register read-modify write cycles being
completed in the correct order. Several places in the DWC code modify
registers also touched by the FIQ. Protect these by a bare-bones lock
mechanism.

This also makes it possible to run the FIQ and IRQ handlers on different
cores.

fiq_fsm: fix build on bcm2708 and bcm2709 platforms

dwc_otg: put some barriers back where they should be for UP

bcm2709/dwc_otg: Setup FIQ on core 1 if >1 core active

dwc_otg: fixup read-modify-write in critical paths

Be more careful about read-modify-write on registers that the FIQ
also touches.

Guard fiq_fsm_spin_lock with fiq_enable check

fiq_fsm: Falling out of the state machine isn't fatal

This edge case can be hit if the port is disabled while the FIQ is
in the middle of a transaction. Make the effects less severe.

Also get rid of the useless return value.

squash: dwc_otg: Allow to build without SMP

usb: core: make overcurrent messages more prominent

Hub overcurrent messages are more serious than "debug". Increase loglevel.

usb: dwc_otg: Don't use dma_to_virt()

Commit 6ce0d20 changes dma_to_virt() which breaks this driver.
Open code the old dma_to_virt() implementation to work around this.

Limit the use of __bus_to_virt() to cases where transfer_buffer_length
is set and transfer_buffer is not set. This is done to increase the
chance that this driver will also work on ARCH_BCM2835.

transfer_buffer should not be NULL if the length is set, but the
comment in the code indicates that there are situations where this
might happen. drivers/usb/isp1760/isp1760-hcd.c also has a similar
comment pointing to a possible: 'usb storage / SCSI bug'.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Fix crash when fiq_enable=0

dwc_otg: fiq_fsm: Make high-speed isochronous strided transfers work properly

Certain low-bandwidth high-speed USB devices (specialist audio devices,
compressed-frame webcams) have packet intervals > 1 microframe.

Stride these transfers in the FIQ by using the start-of-frame interrupt
to restart the channel at the right time.

dwc_otg: Force host mode to fix incorrect compute module boards

dwc_otg: Add ARCH_BCM2835 support

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Simplify FIQ irq number code

Dropping ATAGS means we can simplify the FIQ irq number code.
Also add error checking on the returned irq number.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: Remove duplicate gadget probe/unregister function

dwc_otg: Properly set the HFIR

Douglas Anderson reported:

According to the most up to date version of the dwc2 databook, the FRINT
field of the HFIR register should be programmed to:
* 125 us * (PHY clock freq for HS) - 1
* 1000 us * (PHY clock freq for FS/LS) - 1

This is opposed to older versions of the doc that claimed it should be:
* 125 us * (PHY clock freq for HS)
* 1000 us * (PHY clock freq for FS/LS)

and reported lower timing jitter on a USB analyser

dcw_otg: trim xfer length when buffer larger than allocated size is received

dwc_otg: Don't free qh align buffers in atomic context

dwc_otg: Enable the hack for Split Interrupt transactions by default

dwc_otg.fiq_fsm_mask=0xF has long been a suggestion for users with audio stutters or other USB bandwidth issues.
So far we are aware of many success stories but no failure caused by this setting.
Make it a default to learn more.

See: https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=70437

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: Use kzalloc when suitable

dwc_otg: Pass struct device to dma_alloc*()

This makes it possible to get the bus address from Device Tree.

Signed-off-by: Noralf Trønnes <noralf@tronnes.org>

dwc_otg: fix summarize urb->actual_length for isochronous transfers

Kernel does not copy input data of ISO transfers to userspace
if actual_length is set only in ISO transfers and not summarized
in urb->actual_length. Fixes raspberrypi/linux#903

fiq_fsm: Use correct states when starting isoc OUT transfers

In fiq_fsm_start_next_periodic() if an isochronous OUT transfer
was selected, no regard was given as to whether this was a single-packet
transfer or a multi-packet staged transfer.

For single-packet transfers, this had the effect of repeatedly sending
OUT packets with bogus data and lengths.

Eventually if the channel was repeatedly enabled enough times, this
would lock up the OTG core and no further bus transfers would happen.

Set the FSM state up properly if we select a single-packet transfer.

Fixes https://github.com/raspberrypi/linux/issues/1842

dwc_otg: make nak_holdoff work as intended with empty queues

If URBs reading from non-periodic split endpoints were dequeued and
the last transfer from the endpoint was a NAK handshake, the resulting
qh->nak_frame value was stale which would result in unnecessarily long
polling intervals for the first subsequent transfer with a fresh URB.

Fixup qh->nak_frame in dwc_otg_hcd_urb_dequeue and also guard against
a case where a single URB is submitted to the endpoint, a NAK was
received on the transfer immediately prior to receiving data and the
device subsequently resubmits another URB past the qh->nak_frame interval.

Fixes https://github.com/raspberrypi/linux/issues/1709

dwc_otg: fix split transaction data toggle handling around dequeues

See https://github.com/raspberrypi/linux/issues/1709

Fix several issues regarding endpoint state when URBs are dequeued
- If the HCD is disconnected, flush FIQ-enabled channels properly
- Save the data toggle state for bulk endpoints if the last transfer
  from an endpoint where URBs were dequeued returned a data packet
- Reset hc->start_pkt_count properly in assign_and_init_hc()

dwc_otg: fix several potential crash sources

On root port disconnect events, the host driver state is cleared and
in-progress host channels are forcibly stopped. This doesn't play
well with the FIQ running in the background, so:
- Guard the disconnect callback with both the host spinlock and FIQ
  spinlock
- Move qtd dereference in dwc_otg_handle_hc_fsm() after the early-out
  so we don't dereference a qtd that has gone away
- Turn catch-all BUG()s in dwc_otg_handle_hc_fsm() into warnings.

dwc_otg: delete hcd->channel_lock

The lock serves no purpose as it is only held while the HCD spinlock
is already being held.

dwc_otg: remove unnecessary dma-mode channel halts on disconnect interrupt

Host channels are already halted in kill_urbs_in_qh_list() with the
subsequent interrupt processing behaving as if the URB was dequeued
via HCD callback.

There's no need to clobber the host channel registers a second time
as this exposes races between the driver and host channel resulting
in hcd->free_hc_list becoming corrupted.

dwcotg: Allow to build without FIQ on ARM64

Signed-off-by: popcornmix <popcornmix@gmail.com>

dwc_otg: make periodic scheduling behave properly for FS buses

If the root port is in full-speed mode, transfer times at 12mbit/s
would be calculated but matched against high-speed quotas.

Reinitialise hcd->frame_usecs[i] on each port enable event so that
full-speed bandwidth can be tracked sensibly.

Also, don't bother using the FIQ for transfers when in full-speed
mode - at the slower bus speed, interrupt frequency is reduced by
an order of magnitude.

Related issue: https://github.com/raspberrypi/linux/issues/2020

dwc_otg: fiq_fsm: Make isochronous compatibility checks work properly

Get rid of the spammy printk and local pointer mangling.
Also, there is a nominal benefit for using fiq_fsm for isochronous
transfers in FS mode (~1.1k IRQs per second vs 2.1k IRQs per second)
so remove the root port speed check.

dwc_otg: add module parameter int_ep_interval_min

Add a module parameter (defaulting to ignored) that clamps the polling rate
of high-speed Interrupt endpoints to a minimum microframe interval.

The parameter is modifiable at runtime as it is used when activating new
endpoints (such as on device connect).

dwc_otg: fiq_fsm: Add non-periodic TT exclusivity constraints

Certain hub types do not discriminate between pipe direction (IN or OUT)
when considering non-periodic transfers. Therefore these hubs get confused
if multiple transfers are issued in different directions with the same
device address and endpoint number.

Constrain queuing non-periodic split transactions so they are performed
serially in such cases.

Related: https://github.com/raspberrypi/linux/issues/2024

dwc_otg: Fixup change to DRIVER_ATTR interface

dwc_otg: Fix compilation warnings

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

USB_DWCOTG: Disable building dwc_otg as a module (#2265)

When dwc_otg is built as a module, build will fail with the following
error:

ERROR: "DWC_TASK_HI_SCHEDULE" [drivers/usb/host/dwc_otg/dwc_otg.ko] undefined!
scripts/Makefile.modpost:91: recipe for target '__modpost' failed
make[1]: *** [__modpost] Error 1
Makefile:1199: recipe for target 'modules' failed
make: *** [modules] Error 2

Even if the error is solved by including the missing
DWC_TASK_HI_SCHEDULE function, the kernel will panic when loading
dwc_otg.

As a workaround, simply prevent user from building dwc_otg as a module
as the current kernel does not support it.

See: https://github.com/raspberrypi/linux/issues/2258

Signed-off-by: Malik Olivier Boussejra <malik@boussejra.com>

dwc_otg: New timer API

dwc_otg: Fix removed ACCESS_ONCE->READ_ONCE

dwc_otg: don't unconditionally force host mode in dwc_otg_cil_init()

Add the ability to disable force_host_mode for those that want to use
dwc_otg in both device and host modes.

dwc_otg: Fix a regression when dequeueing isochronous transfers

In 282bed95 (dwc_otg: make nak_holdoff work as intended with empty queues)
the dequeue mechanism was changed to leave FIQ-enabled transfers to run
to completion - to avoid leaving hub TT buffers with stale packets lying
around.

This broke FIQ-accelerated isochronous transfers, as this then meant that
dozens of transfers were performed after the dequeue function returned.

Restore the state machine fence for isochronous transfers.

fiq_fsm: rewind DMA pointer for OUT transactions that fail (#2288)

See: https://github.com/raspberrypi/linux/issues/2140

dwc_otg: add smp_mb() to prevent driver state corruption on boot

Occasional crashes have been seen where the FIQ code dereferences
invalid/random pointers immediately after being set up, leading to
panic on boot.

The crash occurs as the FIQ code races against hcd_init_fiq() and
the hcd_init_fiq() code races against the outstanding memory stores
from dwc_otg_hcd_init(). Use explicit barriers after touching
driver state.

usb: dwc_otg: fix memory corruption in dwc_otg driver

[Upstream commit 51b1b64917]

The move from the staging tree to the main tree exposed a
longstanding memory corruption bug in the dwc2 driver. The
reordering of the driver initialization caused the dwc2 driver
to corrupt the initialization data of the sdhci driver on the
Raspberry Pi platform, which made the bug show up.

The error is in calling to_usb_device(hsotg->dev), since ->dev
is not a member of struct usb_device. The easiest fix is to
just remove the offending code, since it is not really needed.

Thanks to Stephen Warren for tracking down the cause of this.

Reported-by: Andre Heider <a.heider@gmail.com>
Tested-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Paul Zimmerman <paulz@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[lukas: port from upstream dwc2 to out-of-tree dwc_otg driver]
Signed-off-by: Lukas Wunner <lukas@wunner.de>

usb: dwb_otg: Fix unreachable switch statement warning

This warning appears with GCC 7.3.0 from toolchains.bootlin.com:

../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c: In function ‘fiq_fsm_update_hs_isoc’:
../drivers/usb/host/dwc_otg/dwc_otg_fiq_fsm.c:595:61: warning: statement will never be executed [-Wswitch-unreachable]
   st->hctsiz_copy.b.xfersize = nrpackets * st->hcchar_copy.b.mps;
                                            ~~~~~~~~~~~~~~~~~^~~~

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

dwc_otg: fiq_fsm: fix incorrect DMA register offset calculation

Rationalise the offset and update all call sites.

Fixes https://github.com/raspberrypi/linux/issues/2408

dwc_otg: fix bug with port_addr assignment for single-TT hubs

See https://github.com/raspberrypi/linux/issues/2734

The "Hub Port" field in the split transaction packet was always set
to 1 for single-TT hubs. The majority of single-TT hub products
apparently ignore this field and broadcast to all downstream enabled
ports, which masked the issue. A subset of hub devices apparently
need the port number to be exact or split transactions will fail.

usb: dwc_otg: Clean up build warnings on 64bit kernels

No functional changes. Almost all are changes to logging lines.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

usb: dwc_otg: Use dma allocation for mphi dummy_send buffer

The FIQ driver used a kzalloc'ed buffer for dummy_send,
passing a kernel virtual address to the hardware block.
The buffer is only ever used for a dummy read, so it
should be harmless, but there is the chance that it will
cause exceptions.

Use a dma allocation so that we have a genuine bus address,
and read from that.
Free the allocation when done for good measure.

Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.org>

dwc_otg: only do_split when we actually need to do a split

The previous test would fail if the root port was in fullspeed mode
and there was a hub between the FS device and the root port. While
the transfer worked, the schedule mangling performed for high-speed
split transfers would break leading to an 8ms polling interval.

dwc_otg: fix locking around dequeueing and killing URBs

kill_urbs_in_qh_list() is practically only ever called with the fiq lock
already held, so don't spinlock twice in the case where we need to cancel
an isochronous transfer.

Also fix up a case where the global interrupt register could be read with
the fiq lock not held.

Fixes the deadlock seen in https://github.com/raspberrypi/linux/issues/2907

ARM64/DWC_OTG: Port dwc_otg driver to ARM64

In ARM64, the FIQ mechanism used by this driver is not current
implemented.   As a workaround, reqular IRQ is used instead
of FIQ.

In a separate change, the IRQ-CPU mapping is round robined
on ARM64 to increase concurrency and allow multiple interrupts
to be serviced at a time.  This reduces the need for FIQ.

Tests Run:

This mechanism is most likely to break when multiple USB devices
are attached at the same time.  So the system was tested under
stress.

Devices:

1. USB Speakers playing back a FLAC audio through VLC
   at 96KHz.(Higher then typically, but supported on my speakers).

2. sftp transferring large files through the buildin ethernet
   connection which is connected through USB.

3. Keyboard and mouse attached and being used.

Although I do occasionally hear some glitches, the music seems to
play quite well.

Signed-off-by: Michael Zoran <mzoran@crowfest.net>

usb: dwc_otg: Clean up interrupt claiming code

The FIQ/IRQ interrupt number identification code is scattered through
the dwc_otg driver. Rationalise it, simplifying the code and solving
an existing issue.

See: https://github.com/raspberrypi/linux/issues/2612

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: Choose appropriate IRQ handover strategy

2711 has no MPHI peripheral, but the ARM Control block can fake
interrupts. Use the size of the DTB "mphi" reg block to determine
which is required.

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

usb: host: dwc_otg: fix compiling in separate directory

The dwc_otg Makefile does not respect the O=path argument correctly:
include paths in CFLAGS are given relatively to object path, not source
path. Compiling in a separate directory yields #include errors.

Signed-off-by: Marek Behún <marek.behun@nic.cz>

dwc_otg: use align_buf for small IN control transfers (#3150)

The hardware will do a 4-byte write to memory on any IN packet received
that is between 1 and 3 bytes long. This tramples memory in the uvcvideo
driver, as it uses a sequence of 1- and 2-byte control transfers to
query the min/max/range/step of each individual camera control and
gives us buffers that are offsets into a struct.

Catch small control transfers in the data phase and use the align_buf
to bounce the correct number of bytes into the URB's buffer.

In general, short packets on non-control endpoints should be OK as URBs
should have enough buffer space for a wMaxPacket size transfer.

See: https://github.com/raspberrypi/linux/issues/3148

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: Declare DMA capability with HCD_DMA flag

Following [1], USB controllers have to declare DMA capabilities in
order for them to be used by adding the HCD_DMA flag to their hc_driver
struct.

[1] 7b81cb6bdd ("usb: add a HCD_DMA flag instead of guestimating DMA capabilities")

Signed-off-by: Phil Elwell <phil@raspberrypi.org>

dwc_otg: checking the urb->transfer_buffer too early (#3332)

After enable the HIGHMEM and VMSPLIT_3G, the dwc_otg driver doesn't
work well on Pi2/3 boards with 1G physical ram. Users experience
the failure when copying a file of 600M size to the USB stick. And
at the same time, the dmesg shows:
usb 1-1.1.2: reset high-speed USB device number 8 using dwc_otg
sd 0:0:0:0: [sda] tag#0 FAILED Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
blk_update_request: I/O error, dev sda, sector 3024048 op 0x1:(WRITE) flags 0x4000 phys_seg 15 prio class 0

When this happens, the sg_buf sent to the driver is located in the
highmem region, the usb_sg_init() in the core/message.c will leave
transfer_buffer to NULL if the sg_buf is in highmem, but in the
dwc_otg driver, it returns -EINVAL unconditionally if transfer_buffer
is NULL.

The driver can handle the situation of buffer to be NULL, if it is in
DMA mode, it will convert an address from transfer_dma.

But if the conversion fails or it is in the PIO mode, we should check
buffer and return -EINVAL if it is NULL.

BugLink: https://bugs.launchpad.net/bugs/1852510
Signed-off-by: Hui Wang <hui.wang@canonical.com>

dwc_otg: constrain endpoint max packet and transfer size on split IN

The hcd would unconditionally set the transfer length to the endpoint
packet size for non-isoc IN transfers. If the remaining buffer length
was less than the length of returned data, random memory would get
scribbled over, with bad effects if it crossed a page boundary.

Force a babble error if this happens by limiting the max transfer size
to the available buffer space. DMA will stop writing to memory on a
babble condition.

The hardware expects xfersize to be an integer multiple of maxpacket
size, so override hcchar.b.mps as well.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: pause when cancelling split transactions

Non-periodic splits will DMA to/from the driver-provided transfer_buffer,
which may be freed immediately after the dequeue call returns. Block until
we know the transfer is complete.

A similar delay is needed when cleaning up disconnects, as the FIQ could
have started a periodic transfer in the previous microframe to the one
that triggered a disconnect.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: fiq_fsm: add a barrier on entry into FIQ handler(s)

On BCM2835, there is no hardware guarantee that multiple outstanding
reads to different peripherals will complete in-order. The FIQ code
uses peripheral reads without barriers for performance, so in the case
where a read to a slow peripheral was issued immediately prior to FIQ
entry, the first peripheral read that the FIQ did could end up with
wrong read data returned.

Add dsb(sy) on entry so that all outstanding reads are retired.

The FIQ only issues reads to the dwc_otg core, so per-read barriers
in the handler itself are not required.

On BCM2836 and BCM2837 the barrier is not strictly required due to
differences in how the peripheral bus is implemented, but having
arch-specific handlers that introduce different latencies is risky.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.org>

dwc_otg: whitelist_table is now productlist_table

dwc_otg: initialise sched_frame for periodic QHs that were parked

If a periodic QH has no remaining QTDs, then it is removed from all
periodic schedules. When re-adding, initialise the sched_frame and
start_split_frame from the current value of the frame counter.

See https://bugs.launchpad.net/raspbian/+bug/1819560
and
 https://github.com/raspberrypi/linux/issues/3883

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

dwc_otg: Minimise header and fix build warnings

Delete a large amount of unused declaration from "usb.h", some of which
were causing build warnings, and get the module building cleanly.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc-otg: fix clang -Wignored-attributes warning

warning: attribute declaration must precede definition

dwc-otg: fix clang -Wsometimes-uninitialized warning

warning: variable 'retval' is used uninitialized whenever 'if' condition is false

dwc-otg: fix clang -Wpointer-bool-conversion warning

warning: address of array 'desc->wMaxPacketSize' will always evaluate to 'true'

The wMaxPacketSize field is actually a two element array which content should
be accessed via the UGETW macro.

dwc_otg: fix an undeclared variable
Replace an undeclared variable used by DWC_DEBUGPL with the real endpoint address. DWC_DEBUGPL does nothing with DEBUG undefined so it did not go wrong before.
Signed-off-by: Zixuan Wang <wangzixuan@sjtu.edu.cn>

dwc_otg: Update NetBSD usb.h header licence

NetBSD have changed their licensing requirements such that the 2-clause
licence is preferred. Update usb.h in the downstream dwc_otg code
accordingly.

See https://www.netbsd.org/about/redistribution.html for more
information.

Signed-off-by: Phil Elwell <phil@raspberrypi.com>

dwc_otg: pay attention to qh->interval when rescheduling periodic queues

A regression introduced in https://github.com/raspberrypi/linux/pull/3887
meant that if the newly scheduled transfer immediately returned data, and
the driver resubmitted a single URB after every transfer, then the effective
polling interval would end up being approx 1ms.

Use the larger of SCHEDULE_SLOP or the configured endpoint interval.

Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>

drivers: dwc_otg: Fix fallthrough warnings

Signed-off-by: Alexander Winkowski <dereference23@outlook.com>
2024-03-05 13:54:39 +00:00
popcornmix
7b4fcaaaf4 reboot: Use power off rather than busy spinning when halt is requested
reboot: Use power off rather than busy spinning when halt is requested

Busy spinning after halt is dumb
We've previously applied this patch to arch/arm
but it is currenltly missing in arch/arm64

Pi4 after "sudo halt" uses 520mA
Pi4 after "sudo shutdown now" uses 310mA

Make them both use the lower powered option

Signed-off-by: Dom Cobley <popcornmix@gmail.com>
2024-03-05 13:54:38 +00:00
Baoquan He
2d16a9f778 kernel/Kconfig.kexec: drop select of KEXEC for CRASH_DUMP
[ Upstream commit dccf78d39f ]

Ignat Korchagin complained that a potential config regression was
introduced by commit 89cde45591 ("kexec: consolidate kexec and crash
options into kernel/Kconfig.kexec").  Before the commit, CONFIG_CRASH_DUMP
has no dependency on CONFIG_KEXEC.  After the commit, CRASH_DUMP selects
KEXEC.  That enforces system to have CONFIG_KEXEC=y as long as
CONFIG_CRASH_DUMP=Y which people may not want.

In Ignat's case, he sets CONFIG_CRASH_DUMP=y, CONFIG_KEXEC_FILE=y and
CONFIG_KEXEC=n because kexec_load interface could have security issue if
kernel/initrd has no chance to be signed and verified.

CRASH_DUMP has select of KEXEC because Eric, author of above commit, met a
LKP report of build failure when posting patch of earlier version.  Please
see below link to get detail of the LKP report:

    https://lore.kernel.org/all/3e8eecd1-a277-2cfb-690e-5de2eb7b988e@oracle.com/T/#u

In fact, that LKP report is triggered because arm's <asm/kexec.h> is
wrapped in CONFIG_KEXEC ifdeffery scope.  That is wrong.  CONFIG_KEXEC
controls the enabling/disabling of kexec_load interface, but not kexec
feature.  Removing the wrongly added CONFIG_KEXEC ifdeffery scope in
<asm/kexec.h> of arm allows us to drop the select KEXEC for CRASH_DUMP.
Meanwhile, change arch/arm/kernel/Makefile to let machine_kexec.o
relocate_kernel.o depend on KEXEC_CORE.

Link: https://lkml.kernel.org/r/20231128054457.659452-1-bhe@redhat.com
Fixes: 89cde45591 ("kexec: consolidate kexec and crash options into kernel/Kconfig.kexec")
Signed-off-by: Baoquan He <bhe@redhat.com>
Reported-by: Ignat Korchagin <ignat@cloudflare.com>
Tested-by: Ignat Korchagin <ignat@cloudflare.com>	[compile-time only]
Tested-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Eric DeVolder <eric_devolder@yahoo.com>
Tested-by: Eric DeVolder <eric_devolder@yahoo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13 18:45:19 +01:00
Linus Torvalds
87dfd85c38 Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:

 - Refactor VFP code and convert to C code (Ard Biesheuvel)

 - Fix hardware breakpoint single-stepping using bpf_overflow_handler

 - Make SMP stop calls asynchronous allowing panic from irq context to
   work

 - Fix for kernel-doc warnings for locomo

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  Revert part of ae1f8d793a ("ARM: 9304/1: add prototype for function called only from asm")
  ARM: 9318/1: locomo: move kernel-doc to prevent warnings
  ARM: 9317/1: kexec: Make smp stop calls asynchronous
  ARM: 9316/1: hw_breakpoint: fix single-stepping when using bpf_overflow_handler
  ARM: entry: Make asm coproc dispatch code NWFPE only
  ARM: iwmmxt: Use undef hook to enable coprocessor for task
  ARM: entry: Disregard Thumb undef exception in coproc dispatch
  ARM: vfp: Use undef hook for handling VFP exceptions
  ARM: kernel: Get rid of thread_info::used_cp[] array
  ARM: vfp: Reimplement VFP exception entry in C code
  ARM: vfp: Remove workaround for Feroceon CPUs
  ARM: vfp: Record VFP bounces as perf emulation faults
2023-08-31 12:49:10 -07:00
Linus Torvalds
df57721f9a Merge tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 shadow stack support from Dave Hansen:
 "This is the long awaited x86 shadow stack support, part of Intel's
  Control-flow Enforcement Technology (CET).

  CET consists of two related security features: shadow stacks and
  indirect branch tracking. This series implements just the shadow stack
  part of this feature, and just for userspace.

  The main use case for shadow stack is providing protection against
  return oriented programming attacks. It works by maintaining a
  secondary (shadow) stack using a special memory type that has
  protections against modification. When executing a CALL instruction,
  the processor pushes the return address to both the normal stack and
  to the special permission shadow stack. Upon RET, the processor pops
  the shadow stack copy and compares it to the normal stack copy.

  For more information, refer to the links below for the earlier
  versions of this patch set"

Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/
Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/

* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
  x86/shstk: Change order of __user in type
  x86/ibt: Convert IBT selftest to asm
  x86/shstk: Don't retry vm_munmap() on -EINTR
  x86/kbuild: Fix Documentation/ reference
  x86/shstk: Move arch detail comment out of core mm
  x86/shstk: Add ARCH_SHSTK_STATUS
  x86/shstk: Add ARCH_SHSTK_UNLOCK
  x86: Add PTRACE interface for shadow stack
  selftests/x86: Add shadow stack test
  x86/cpufeatures: Enable CET CR4 bit for shadow stack
  x86/shstk: Wire in shadow stack interface
  x86: Expose thread features in /proc/$PID/status
  x86/shstk: Support WRSS for userspace
  x86/shstk: Introduce map_shadow_stack syscall
  x86/shstk: Check that signal frame is shadow stack mem
  x86/shstk: Check that SSP is aligned on sigreturn
  x86/shstk: Handle signals for shadow stack
  x86/shstk: Introduce routines modifying shstk
  x86/shstk: Handle thread shadow stack
  x86/shstk: Add user-mode shadow stack support
  ...
2023-08-31 12:20:12 -07:00
Linus Torvalds
461f35f014 Merge tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
 "The drm core grew a new generic gpu virtual address manager, and new
  execution locking helpers. These are used by nouveau now to provide
  uAPI support for the userspace Vulkan driver. AMD had a bunch of new
  IP core support, loads of refactoring around fbdev, but mostly just
  the usual amount of stuff across the board.

  core:
   - fix gfp flags in drmm_kmalloc

  gpuva:
   - add new generic GPU VA manager (for nouveau initially)

  syncobj:
   - add new DRM_IOCTL_SYNCOBJ_EVENTFD ioctl

  dma-buf:
   - acquire resv lock for mmap() in exporters
   - support dma-buf self import automatically
   - docs fixes

  backlight:
   - fix fbdev interactions

  atomic:
   - improve logging

  prime:
   - remove struct gem_prim_mmap plus driver updates

  gem:
   - drm_exec: add locking over multiple GEM objects
   - fix lockdep checking

  fbdev:
   - make fbdev userspace interfaces optional
   - use linux device instead of fbdev device
   - use deferred i/o helper macros in various drivers
   - Make FB core selectable without drivers
   - Remove obsolete flags FBINFO_DEFAULT and FBINFO_FLAG_DEFAULT
   - Add helper macros and Kconfig tokens for DMA-allocated framebuffer

  ttm:
   - support init_on_free
   - swapout fixes

  panel:
   - panel-edp: Support AUO B116XAB01.4
   - Support Visionox R66451 plus DT bindings
   - ld9040:
      - Backlight support
      - magic improved
      - Kconfig fix
   - Convert to of_device_get_match_data()
   - Fix Kconfig dependencies
   - simple:
      - Set bpc value to fix warning
      - Set connector type for AUO T215HVN01
      - Support Innolux G156HCE-L01 plus DT bindings
   - ili9881: Support TDO TL050HDV35 LCD panel plus DT bindings
   - startek: Support KD070FHFID015 MIPI-DSI panel plus DT bindings
   - sitronix-st7789v:
      - Support Inanbo T28CP45TN89 plus DT bindings
      - Support EDT ET028013DMA plus DT bindings
      - Various cleanups
   - edp: Add timings for N140HCA-EAC
   - Allow panels and touchscreens to power sequence together
   - Fix Innolux G156HCE-L01 LVDS clock

  bridge:
   - debugfs for chains support
   - dw-hdmi:
      - Improve support for YUV420 bus format
      - CEC suspend/resume
      - update EDID on HDMI detect
   - dw-mipi-dsi: Fix enable/disable of DSI controller
   - lt9611uxc: Use MODULE_FIRMWARE()
   - ps8640: Remove broken EDID code
   - samsung-dsim: Fix command transfer
   - tc358764:
      - Handle HS/VS polarity
      - Use BIT() macro
      - Various cleanups
   - adv7511: Fix low refresh rate
   - anx7625:
      - Switch to macros instead of hardcoded values
      - locking fixes
   - tc358767: fix hardware delays
   - sitronix-st7789v:
      - Support panel orientation
      - Support rotation property
      - Add support for Jasonic JT240MHQS-HWT-EK-E3 plus DT bindings

  amdgpu:
   - SDMA 6.1.0 support
   - HDP 6.1 support
   - SMUIO 14.0 support
   - PSP 14.0 support
   - IH 6.1 support
   - Lots of checkpatch cleanups
   - GFX 9.4.3 updates
   - Add USB PD and IFWI flashing documentation
   - GPUVM updates
   - RAS fixes
   - DRR fixes
   - FAMS fixes
   - Virtual display fixes
   - Soft IH fixes
   - SMU13 fixes
   - Rework PSP firmware loading for other IPs
   - Kernel doc fixes
   - DCN 3.0.1 fixes
   - LTTPR fixes
   - DP MST fixes
   - DCN 3.1.6 fixes
   - SMU 13.x fixes
   - PSP 13.x fixes
   - SubVP fixes
   - GC 9.4.3 fixes
   - Display bandwidth calculation fixes
   - VCN4 secure submission fixes
   - Allow building DC on RISC-V
   - Add visible FB info to bo_print_info
   - HBR3 fixes
   - GFX9 MCBP fix
   - GMC10 vmhub index fix
   - GMC11 vmhub index fix
   - Create a new doorbell manager
   - SR-IOV fixes
   - initial freesync panel replay support
   - revert zpos properly until igt regression is fixeed
   - use TTM to manage doorbell BAR
   - Expose both current and average power via hwmon if supported

  amdkfd:
   - Cleanup CRIU dma-buf handling
   - Use KIQ to unmap HIQ
   - GFX 9.4.3 debugger updates
   - GFX 9.4.2 debugger fixes
   - Enable cooperative groups fof gfx11
   - SVM fixes
   - Convert older APUs to use dGPU path like newer APUs
   - Drop IOMMUv2 path as it is no longer used
   - TBA fix for aldebaran

  i915:
   - ICL+ DSI modeset sequence
   - HDCP improvements
   - MTL display fixes and cleanups
   - HSW/BDW PSR1 restored
   - Init DDI ports in VBT order
   - General display refactors
   - Start using plane scale factor for relative data rate
   - Use shmem for dpt objects
   - Expose RPS thresholds in sysfs
   - Apply GuC SLPC min frequency softlimit correctly
   - Extend Wa_14015795083 to TGL, RKL, DG1 and ADL
   - Fix a VMA UAF for multi-gt platform
   - Do not use stolen on MTL due to HW bug
   - Check HuC and GuC version compatibility on MTL
   - avoid infinite GPU waits due to premature release of request memory
   - Fixes and updates for GSC memory allocation
   - Display SDVO fixes
   - Take stolen handling out of FBC code
   - Make i915_coherent_map_type GT-centric
   - Simplify shmem_create_from_object map_type

  msm:
   - SM6125 MDSS support
   - DPU: SM6125 DPU support
   - DSI: runtime PM support, burst mode support
   - DSI PHY: SM6125 support in 14nm DSI PHY driver
   - GPU: prepare for a7xx
   - fix a690 firmware
   - disable relocs on a6xx and newer

  radeon:
   - Lots of checkpatch cleanups

  ast:
   - improve device-model detection
   - Represent BMV as virtual connector
   - Report DP connection status

  nouveau:
   - add new exec/bind interface to support Vulkan
   - document some getparam ioctls
   - improve VRAM detection
   - various fixes/cleanups
   - workraound DPCD issues

  ivpu:
   - MMU updates
   - debugfs support
   - Support vpu4

  virtio:
   - add sync object support

  atmel-hlcdc:
   - Support inverted pixclock polarity

  etnaviv:
   - runtime PM cleanups
   - hang handling fixes

  exynos:
   - use fbdev DMA helpers
   - fix possible NULL ptr dereference

  komeda:
   - always attach encoder

  omapdrm:
   - use fbdev DMA helpers
ingenic:
   - kconfig regmap fixes

  loongson:
   - support display controller

  mediatek:
   - Small mtk-dpi cleanups
   - DisplayPort: support eDP and aux-bus
   - Fix coverity issues
   - Fix potential memory leak if vmap() fail

  mgag200:
   - minor fixes

  mxsfb:
   - support disabling overlay planes

  panfrost:
   - fix sync in IRQ handling

  ssd130x:
   - Support per-controller default resolution plus DT bindings
   - Reduce memory-allocation overhead
   - Improve intermediate buffer size computation
   - Fix allocation of temporary buffers
   - Fix pitch computation
   - Fix shadow plane allocation

  tegra:
   - use fbdev DMA helpers
   - Convert to devm_platform_ioremap_resource()
   - support bridge/connector
   - enable PM

  tidss:
   - Support TI AM625 plus DT bindings
   - Implement new connector model plus driver updates

  vkms:
   - improve write back support
   - docs fixes
   - support gamma LUT

  zynqmp-dpsub:
   - misc fixes"

* tag 'drm-next-2023-08-30' of git://anongit.freedesktop.org/drm/drm: (1327 commits)
  drm/gpuva_mgr: remove unused prev pointer in __drm_gpuva_sm_map()
  drm/tests/drm_kunit_helpers: Place correct function name in the comment header
  drm/nouveau: uapi: don't pass NO_PREFETCH flag implicitly
  drm/nouveau: uvmm: fix unset region pointer on remap
  drm/nouveau: sched: avoid job races between entities
  drm/i915: Fix HPD polling, reenabling the output poll work as needed
  drm: Add an HPD poll helper to reschedule the poll work
  drm/i915: Fix TLB-Invalidation seqno store
  drm/ttm/tests: Fix type conversion in ttm_pool_test
  drm/msm/a6xx: Bail out early if setting GPU OOB fails
  drm/msm/a6xx: Move LLC accessors to the common header
  drm/msm/a6xx: Introduce a6xx_llc_read
  drm/ttm/tests: Require MMU when testing
  drm/panel: simple: Fix Innolux G156HCE-L01 LVDS clock
  Revert "Revert "drm/amdgpu/display: change pipe policy for DCN 2.0""
  drm/amdgpu: Add memory vendor information
  drm/amd: flush any delayed gfxoff on suspend entry
  drm/amdgpu: skip fence GFX interrupts disable/enable for S0ix
  drm/amdgpu: Remove gfxoff check in GFX v9.4.3
  drm/amd/pm: Update pci link speed for smu v13.0.6
  ...
2023-08-30 13:34:34 -07:00
Linus Torvalds
daa22f5a78 Merge tag 'modules-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull modules updates from Luis Chamberlain:
 "Summary of the changes worth highlighting from most interesting to
  boring below:

   - Christoph Hellwig's symbol_get() fix to Nvidia's efforts to
     circumvent the protection he put in place in year 2020 to prevent
     proprietary modules from using GPL only symbols, and also ensuring
     proprietary modules which export symbols grandfather their taint.

     That was done through year 2020 commit 262e6ae708 ("modules:
     inherit TAINT_PROPRIETARY_MODULE"). Christoph's new fix is done by
     clarifing __symbol_get() was only ever intended to prevent module
     reference loops by Linux kernel modules and so making it only find
     symbols exported via EXPORT_SYMBOL_GPL(). The circumvention tactic
     used by Nvidia was to use symbol_get() to purposely swift through
     proprietary module symbols and completely bypass our traditional
     EXPORT_SYMBOL*() annotations and community agreed upon
     restrictions.

     A small set of preamble patches fix up a few symbols which just
     needed adjusting for this on two modules, the rtc ds1685 and the
     networking enetc module. Two other modules just needed some build
     fixing and removal of use of __symbol_get() as they can't ever be
     modular, as was done by Arnd on the ARM pxa module and Christoph
     did on the mmc au1xmmc driver.

     This is a good reminder to us that symbol_get() is just a hack to
     address things which should be fixed through Kconfig at build time
     as was done in the later patches, and so ultimately it should just
     go.

   - Extremely late minor fix for old module layout 055f23b74b
     ("module: check for exit sections in layout_sections() instead of
     module_init_section()") by James Morse for arm64. Note that this
     layout thing is old, it is *not* Song Liu's commit ac3b432839
     ("module: replace module_layout with module_memory"). The issue
     however is very odd to run into and so there was no hurry to get
     this in fast.

   - Although the fix did not go through the modules tree I'd like to
     highlight the fix by Peter Zijlstra in commit 5409730962
     ("x86/static_call: Fix __static_call_fixup()") now merged in your
     tree which came out of what was originally suspected to be a
     fallout of the the newer module layout changes by Song Liu commit
     ac3b432839 ("module: replace module_layout with module_memory")
     instead of module_init_section()"). Thanks to the report by
     Christian Bricart and the debugging by Song Liu & Peter that turned
     to be noted as a kernel regression in place since v5.19 through
     commit ee88d363d1 ("x86,static_call: Use alternative RET
     encoding").

     I highlight this to reflect and clarify that we haven't seen more
     fallout from ac3b432839 ("module: replace module_layout with
     module_memory").

   - RISC-V toolchain got mapping symbol support which prefix symbols
     with "$" to help with alignment considerations for disassembly.

     This is used to differentiate between incompatible instruction
     encodings when disassembling. RISC-V just matches what ARM/AARCH64
     did for alignment considerations and Palmer Dabbelt extended
     is_mapping_symbol() to accept these symbols for RISC-V. We already
     had support for this for all architectures but it also checked for
     the second character, the RISC-V check Dabbelt added was just for
     the "$". After a bit of testing and fallout on linux-next and based
     on feedback from Masahiro Yamada it was decided to simplify the
     check and treat the first char "$" as unique for all architectures,
     and so we no make is_mapping_symbol() for all archs if the symbol
     starts with "$".

     The most relevant commit for this for RISC-V on binutils was:

       https://sourceware.org/pipermail/binutils/2021-July/117350.html

   - A late fix by Andrea Righi (today) to make module zstd
     decompression use vmalloc() instead of kmalloc() to account for
     large compressed modules. I suspect we'll see similar things for
     other decompression algorithms soon.

   - samples/hw_breakpoint minor fixes by Rong Tao, Arnd Bergmann and
     Chen Jiahao"

* tag 'modules-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  module/decompress: use vmalloc() for zstd decompression workspace
  kallsyms: Add more debug output for selftest
  ARM: module: Use module_init_layout_section() to spot init sections
  arm64: module: Use module_init_layout_section() to spot init sections
  module: Expose module_init_layout_section()
  modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules
  rtc: ds1685: use EXPORT_SYMBOL_GPL for ds1685_rtc_poweroff
  net: enetc: use EXPORT_SYMBOL_GPL for enetc_phc_index
  mmc: au1xmmc: force non-modular build and remove symbol_get usage
  ARM: pxa: remove use of symbol_get()
  samples/hw_breakpoint: mark sample_hbp as static
  samples/hw_breakpoint: fix building without module unloading
  samples/hw_breakpoint: Fix kernel BUG 'invalid opcode: 0000'
  modpost, kallsyms: Treat add '$'-prefixed symbols as mapping symbols
  kernel: params: Remove unnecessary ‘0’ values from err
  module: Ignore RISC-V mapping symbols too
2023-08-29 17:32:32 -07:00
Linus Torvalds
d68b4b6f30 Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:

 - An extensive rework of kexec and crash Kconfig from Eric DeVolder
   ("refactor Kconfig to consolidate KEXEC and CRASH options")

 - kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
   couple of macros to args.h")

 - gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
   commands")

 - vsprintf inclusion rationalization from Andy Shevchenko
   ("lib/vsprintf: Rework header inclusions")

 - Switch the handling of kdump from a udev scheme to in-kernel
   handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
   hot un/plug")

 - Many singleton patches to various parts of the tree

* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
  document while_each_thread(), change first_tid() to use for_each_thread()
  drivers/char/mem.c: shrink character device's devlist[] array
  x86/crash: optimize CPU changes
  crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
  crash: hotplug support for kexec_load()
  x86/crash: add x86 crash hotplug support
  crash: memory and CPU hotplug sysfs attributes
  kexec: exclude elfcorehdr from the segment digest
  crash: add generic infrastructure for crash hotplug support
  crash: move a few code bits to setup support of crash hotplug
  kstrtox: consistently use _tolower()
  kill do_each_thread()
  nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
  scripts/bloat-o-meter: count weak symbol sizes
  treewide: drop CONFIG_EMBEDDED
  lockdep: fix static memory detection even more
  lib/vsprintf: declare no_hash_pointers in sprintf.h
  lib/vsprintf: split out sprintf() and friends
  kernel/fork: stop playing lockless games for exe_file replacement
  adfs: delete unused "union adfs_dirtail" definition
  ...
2023-08-29 14:53:51 -07:00
Linus Torvalds
542034175c Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
 "I think we have a bit less than usual on the architecture side, but
  that's somewhat balanced out by a large crop of perf/PMU driver
  updates and extensions to our selftests.

  CPU features and system registers:

   - Advertise hinted conditional branch support (FEAT_HBC) to userspace

   - Avoid false positive "SANITY CHECK" warning when xCR registers
     differ outside of the length field

  Documentation:

   - Fix macro name typo in SME documentation

  Entry code:

   - Unmask exceptions earlier on the system call entry path

  Memory management:

   - Don't bother clearing PTE_RDONLY for dirty ptes in pte_wrprotect()
     and pte_modify()

  Perf and PMU drivers:

   - Initial support for Coresight TRBE devices on ACPI systems (the
     coresight driver changes will come later)

   - Fix hw_breakpoint single-stepping when called from bpf

   - Fixes for DDR PMU on i.MX8MP SoC

   - Add NUMA-awareness to Hisilicon PCIe PMU driver

   - Fix locking dependency issue in Arm DMC620 PMU driver

   - Workaround Hisilicon erratum 162001900 in the SMMUv3 PMU driver

   - Add support for Arm CMN-700 r3 parts to the CMN PMU driver

   - Add support for recent Arm Cortex CPU PMUs

   - Update Hisilicon PMU maintainers

  Selftests:

   - Add a bunch of new features to the hwcap test (JSCVT, PMULL, AES,
     SHA1, etc)

   - Fix SSVE test to leave streaming-mode after grabbing the signal
     context

   - Add new test for SVE vector-length changes with SME enabled

  Miscellaneous:

   - Allow compiler to warn on suspicious looking system register
     expressions

   - Work around SDEI firmware bug by aborting any running handlers on a
     kernel crash

   - Fix some harmless warnings when building with W=1

   - Remove some unused function declarations

   - Other minor fixes and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (62 commits)
  drivers/perf: hisi: Update HiSilicon PMU maintainers
  arm_pmu: acpi: Add a representative platform device for TRBE
  arm_pmu: acpi: Refactor arm_spe_acpi_register_device()
  kselftest/arm64: Fix hwcaps selftest build
  hw_breakpoint: fix single-stepping when using bpf_overflow_handler
  arm64/sysreg: refactor deprecated strncpy
  kselftest/arm64: add jscvt feature to hwcap test
  kselftest/arm64: add pmull feature to hwcap test
  kselftest/arm64: add AES feature check to hwcap test
  kselftest/arm64: add SHA1 and related features to hwcap test
  arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments
  kselftest/arm64: build BTI tests in output directory
  perf/imx_ddr: don't enable counter0 if none of 4 counters are used
  perf/imx_ddr: speed up overflow frequency of cycle
  drivers/perf: hisi: Schedule perf session according to locality
  kselftest/arm64: fix a memleak in zt_regs_run()
  perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock dependency
  perf/smmuv3: Add MODULE_ALIAS for module auto loading
  perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
  kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
  ...
2023-08-28 17:34:54 -07:00
Douglas Anderson
8d539b84f1 nmi_backtrace: allow excluding an arbitrary CPU
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU.  This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.

Let's extend the API to take a CPU ID to exclude instead of just a
boolean.  This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.

Arguably, this new API also encourages safer behavior.  Specifically if
the caller wants to avoid tracing the current CPU (maybe because they
already traced the current CPU) this makes it more obvious to the caller
that they need to make sure that the current CPU ID can't change.

[akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-18 10:19:00 -07:00
Tomislav Novak
d11a69873d hw_breakpoint: fix single-stepping when using bpf_overflow_handler
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.

Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:

  # bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
  Attaching 1 probe...
  hit
  hit
  [...]
  ^C

(./test performs a single 4-byte store to 0x10000)

This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.

Signed-off-by: Tomislav Novak <tnovak@meta.com>
Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/
Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com
Signed-off-by: Will Deacon <will@kernel.org>
2023-08-18 17:04:09 +01:00
Kees Cook
4697b5848b ARM: ptrace: Restore syscall skipping for tracers
Since commit 4e57a4ddf6 ("ARM: 9107/1: syscall: always store
thread_info->abi_syscall"), the seccomp selftests "syscall_errno"
and "syscall_faked" have been broken. Both seccomp and PTRACE depend
on using the special value of "-1" for skipping syscalls. This value
wasn't working because it was getting masked by __NR_SYSCALL_MASK in
both PTRACE_SET_SYSCALL and get_syscall_nr().

Explicitly test for -1 in PTRACE_SET_SYSCALL and get_syscall_nr(),
leaving it exposed when present, allowing tracers to skip syscalls
again.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 4e57a4ddf6 ("ARM: 9107/1: syscall: always store thread_info->abi_syscall")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230810195422.2304827-2-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-16 13:58:49 -07:00
Kees Cook
cf00764747 ARM: ptrace: Restore syscall restart tracing
Since commit 4e57a4ddf6 ("ARM: 9107/1: syscall: always store
thread_info->abi_syscall"), the seccomp selftests "syscall_restart" has
been broken. This was caused by the restart syscall not being stored to
"abi_syscall" during restart setup before branching to the "local_restart"
label. Tracers would see the wrong syscall, and scno would get overwritten
while returning from the TIF_WORK path. Add the missing store.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Fixes: 4e57a4ddf6 ("ARM: 9107/1: syscall: always store thread_info->abi_syscall")
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230810195422.2304827-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-16 13:58:49 -07:00
Russell King (Oracle)
f493fedcc3 Merge branch 'devel-stable' into for-next 2023-08-14 12:18:06 +01:00
Mårten Lindahl
8922ba71c9 ARM: 9317/1: kexec: Make smp stop calls asynchronous
If a panic is triggered by a hrtimer interrupt all online cpus will be
notified and set offline. But as highlighted by commit 19dbdcb803
("smp: Warn on function calls from softirq context") this call should
not be made synchronous with disabled interrupts:

 softdog: Initiating panic
 Kernel panic - not syncing: Software Watchdog Timer expired
 WARNING: CPU: 1 PID: 0 at kernel/smp.c:753 smp_call_function_many_cond
   unwind_backtrace:
     show_stack
     dump_stack_lvl
     __warn
     warn_slowpath_fmt
     smp_call_function_many_cond
     smp_call_function
     crash_smp_send_stop.part.0
     machine_crash_shutdown
     __crash_kexec
     panic
     softdog_fire
     __hrtimer_run_queues
     hrtimer_interrupt

Make the smp call for machine_crash_nonpanic_core() asynchronous.

Signed-off-by: Mårten Lindahl <marten.lindahl@axis.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-08-14 12:16:59 +01:00
Tomislav Novak
e6b51532d5 ARM: 9316/1: hw_breakpoint: fix single-stepping when using bpf_overflow_handler
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.

Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:

  # bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
  Attaching 1 probe...
  hit
  hit
  [...]
  ^C

(./test performs a single 4-byte store to 0x10000)

This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.

Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/

Signed-off-by: Tomislav Novak <tnovak@fb.com>
Tested-by: Samuel Gosselin <sgosselin@google.com> # arm64
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-08-14 12:16:58 +01:00
James Morse
a6846234f4 ARM: module: Use module_init_layout_section() to spot init sections
Today module_frob_arch_sections() spots init sections from their
'init' prefix, and uses this to keep the init PLTs separate from the rest.

get_module_plt() uses within_module_init() to determine if a
location is in the init text or not, but this depends on whether
core code thought this was an init section.

Naturally the logic is different.

module_init_layout_section() groups the init and exit text together if
module unloading is disabled, as the exit code will never run. The result
is kernels with this configuration can't load all their modules because
there are not enough PLTs for the combined init+exit section.

A previous patch exposed module_init_layout_section(), use that so the
logic is the same.

Fixes: 055f23b74b ("module: check for exit sections in layout_sections() instead of module_init_section()")
Cc: stable@vger.kernel.org
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-08-03 13:42:02 -07:00
Rick Edgecombe
a5f6c2ace9 x86/shstk: Add user control-protection fault handler
A control-protection fault is triggered when a control-flow transfer
attempt violates Shadow Stack or Indirect Branch Tracking constraints.
For example, the return address for a RET instruction differs from the copy
on the shadow stack.

There already exists a control-protection fault handler for handling kernel
IBT faults. Refactor this fault handler into separate user and kernel
handlers, like the page fault handler. Add a control-protection handler
for usermode. To avoid ifdeffery, put them both in a new file cet.c, which
is compiled in the case of either of the two CET features supported in the
kernel: kernel IBT or user mode shadow stack. Move some static inline
functions from traps.c into a header so they can be used in cet.c.

Opportunistically fix a comment in the kernel IBT part of the fault
handler that is on the end of the line instead of preceding it.

Keep the same behavior for the kernel side of the fault handler, except for
converting a BUG to a WARN in the case of a #CP happening when the feature
is missing. This unifies the behavior with the new shadow stack code, and
also prevents the kernel from crashing under this situation which is
potentially recoverable.

The control-protection fault handler works in a similar way as the general
protection fault handler. It provides the si_code SEGV_CPERR to the signal
handler.

Co-developed-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: John Allen <john.allen@amd.com>
Tested-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/all/20230613001108.3040476-28-rick.p.edgecombe%40intel.com
2023-08-02 15:01:50 -07:00
Daniel Vetter
6c7f27441d Merge tag 'drm-misc-next-2023-07-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v6.6:

UAPI Changes:

 * fbdev:
   * Make fbdev userspace interfaces optional; only leaves the
     framebuffer console active

 * prime:
   * Support dma-buf self-import for all drivers automatically: improves
     support for many userspace compositors

Cross-subsystem Changes:

 * backlight:
   * Fix interaction with fbdev in several drivers

 * base: Convert struct platform.remove to return void; part of a larger,
   tree-wide effort

 * dma-buf: Acquire reservation lock for mmap() in exporters; part
   of an on-going effort to simplify locking around dma-bufs

 * fbdev:
   * Use Linux device instead of fbdev device in many places
   * Use deferred-I/O helper macros in various drivers

 * i2c: Convert struct i2c from .probe_new to .probe; part of a larger,
   tree-wide effort

 * video:
   * Avoid including <linux/screen_info.h>

Core Changes:

 * atomic:
   * Improve logging

 * prime:
   * Remove struct drm_driver.gem_prime_mmap plus driver updates: all
     drivers now implement this callback with drm_gem_prime_mmap()

 * gem:
   * Support execution contexts: provides locking over multiple GEM
     objects

 * ttm:
   * Support init_on_free
   * Swapout fixes

Driver Changes:

 * accel:
   * ivpu: MMU updates; Support debugfs

 * ast:
   * Improve device-model detection
   * Cleanups

 * bridge:
   * dw-hdmi: Improve support for YUV420 bus format
   * dw-mipi-dsi: Fix enable/disable of DSI controller
   * lt9611uxc: Use MODULE_FIRMWARE()
   * ps8640: Remove broken EDID code
   * samsung-dsim: Fix command transfer
   * tc358764: Handle HS/VS polarity; Use BIT() macro; Various cleanups
   * Cleanups

 * ingenic:
   * Kconfig REGMAP fixes

 * loongson:
   * Support display controller

 * mgag200:
   * Minor fixes

 * mxsfb:
   * Support disabling overlay planes

 * nouveau:
   * Improve VRAM detection
   * Various fixes and cleanups

 * panel:
   * panel-edp: Support AUO B116XAB01.4
   * Support Visionox R66451 plus DT bindings
   * Cleanups

 * ssd130x:
   * Support per-controller default resolution plus DT bindings
   * Reduce memory-allocation overhead
   * Cleanups

 * tidss:
   * Support TI AM625 plus DT bindings
   * Implement new connector model plus driver updates

 * vkms
   * Improve write-back support
   * Documentation fixes

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230713090830.GA23281@linux-uq9g
2023-07-17 15:37:57 +02:00
Thomas Zimmermann
8b0d13545b efi: Do not include <linux/screen_info.h> from EFI header
The header file <linux/efi.h> does not need anything from
<linux/screen_info.h>. Declare struct screen_info and remove
the include statements. Update a number of source files that
require struct screen_info's definition.

v2:
	* update loongarch (Jingfeng)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230706104852.27451-2-tzimmermann@suse.de
2023-07-08 20:26:36 +02:00
Linus Torvalds
7b82e90411 Merge tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
 "These are cleanups for architecture specific header files:

   - the comments in include/linux/syscalls.h have gone out of sync and
     are really pointless, so these get removed

   - The asm/bitsperlong.h header no longer needs to be architecture
     specific on modern compilers, so use a generic version for newer
     architectures that use new enough userspace compilers

   - A cleanup for virt_to_pfn/virt_to_bus to have proper type checking,
     forcing the use of pointers"

* tag 'asm-generic-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  syscalls: Remove file path comments from headers
  tools arch: Remove uapi bitsperlong.h of hexagon and microblaze
  asm-generic: Unify uapi bitsperlong.h for arm64, riscv and loongarch
  m68k/mm: Make pfn accessors static inlines
  arm64: memory: Make virt_to_pfn() a static inline
  ARM: mm: Make virt_to_pfn() a static inline
  asm-generic/page.h: Make pfn accessors static inlines
  xen/netback: Pass (void *) to virt_to_page()
  netfs: Pass a pointer to virt_to_page()
  cifs: Pass a pointer to virt_to_page() in cifsglob
  cifs: Pass a pointer to virt_to_page()
  riscv: mm: init: Pass a pointer to virt_to_page()
  ARC: init: Pass a pointer to virt_to_pfn() in init
  m68k: Pass a pointer to virt_to_pfn() virt_to_page()
  fs/proc/kcore.c: Pass a pointer to virt_addr_valid()
2023-07-06 10:06:04 -07:00
Linus Torvalds
04fc8904d5 Merge tag 'docs-arm-move' of git://git.lwn.net/linux
Pull arm documentation move from Jonathan Corbet:
 "Move the Arm architecture documentation under Documentation/arch/.

  This brings some order to the documentation directory, declutters the
  top-level directory, and makes the documentation organization more
  closely match that of the source"

* tag 'docs-arm-move' of git://git.lwn.net/linux:
  dt-bindings: Update Documentation/arm references
  docs: update some straggling Documentation/arm references
  crypto: update some Arm documentation references
  mips: update a reference to a moved Arm Document
  arm64: Update Documentation/arm references
  arm: update in-source documentation references
  arm: docs: Move Arm documentation to Documentation/arch/
2023-06-27 11:58:16 -07:00
Linus Torvalds
2b603cd5b7 Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM updates from Russell King:

 - lots of build cleanups from Arnd spread throughout the arch/arm tree

 - replace strlcpy() with the preferred strscpy()

 - use sign_extend32() in the module linker

 - drop handle_irq() machine descriptor method

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9315/1: fiq: include asm/mach/irq.h for prototypes
  ARM: 9314/1: tcm: move tcm_init() prototype to asm/tcm.h
  ARM: 9313/1: vdso: add missing prototypes
  ARM: 9312/1: vfp: include asm/neon.h in vfpmodule.c
  ARM: 9311/1: decompressor: move function prototypes to misc.h
  ARM: 9310/1: xip-kernel: add __inflate_kernel_data prototype
  ARM: 9309/1: add missing syscall prototypes
  ARM: 9308/1: move setup functions to header
  ARM: 9307/1: nommu: include asm/idmap.h
  ARM: 9306/1: cacheflush: avoid __flush_anon_page() missing-prototype warning
  ARM: 9305/1: add clear/copy_user_highpage declarations
  ARM: 9304/1: add prototype for function called only from asm
  ARM: 9303/1: kprobes: avoid missing-declaration warnings
  ARM: 9302/1: traps: hide unused functions on NOMMU
  ARM: 9301/1: dma-mapping: hide unused dma_contiguous_early_fixup function
  ARM: 9300/1: Replace all non-returning strlcpy with strscpy
  ARM: 9299/1: module: use sign_extend32() to extend the signedness
  ARM: 9298/1: Drop custom mdesc->handle_irq()
2023-06-26 17:07:53 -07:00
Linus Torvalds
9244724fbf Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP updates from Thomas Gleixner:
 "A large update for SMP management:

   - Parallel CPU bringup

     The reason why people are interested in parallel bringup is to
     shorten the (kexec) reboot time of cloud servers to reduce the
     downtime of the VM tenants.

     The current fully serialized bringup does the following per AP:

       1) Prepare callbacks (allocate, intialize, create threads)
       2) Kick the AP alive (e.g. INIT/SIPI on x86)
       3) Wait for the AP to report alive state
       4) Let the AP continue through the atomic bringup
       5) Let the AP run the threaded bringup to full online state

     There are two significant delays:

       #3 The time for an AP to report alive state in start_secondary()
          on x86 has been measured in the range between 350us and 3.5ms
          depending on vendor and CPU type, BIOS microcode size etc.

       #4 The atomic bringup does the microcode update. This has been
          measured to take up to ~8ms on the primary threads depending
          on the microcode patch size to apply.

     On a two socket SKL server with 56 cores (112 threads) the boot CPU
     spends on current mainline about 800ms busy waiting for the APs to
     come up and apply microcode. That's more than 80% of the actual
     onlining procedure.

     This can be reduced significantly by splitting the bringup
     mechanism into two parts:

       1) Run the prepare callbacks and kick the AP alive for each AP
          which needs to be brought up.

          The APs wake up, do their firmware initialization and run the
          low level kernel startup code including microcode loading in
          parallel up to the first synchronization point. (#1 and #2
          above)

       2) Run the rest of the bringup code strictly serialized per CPU
          (#3 - #5 above) as it's done today.

          Parallelizing that stage of the CPU bringup might be possible
          in theory, but it's questionable whether required surgery
          would be justified for a pretty small gain.

     If the system is large enough the first AP is already waiting at
     the first synchronization point when the boot CPU finished the
     wake-up of the last AP. That reduces the AP bringup time on that
     SKL from ~800ms to ~80ms, i.e. by a factor ~10x.

     The actual gain varies wildly depending on the system, CPU,
     microcode patch size and other factors. There are some
     opportunities to reduce the overhead further, but that needs some
     deep surgery in the x86 CPU bringup code.

     For now this is only enabled on x86, but the core functionality
     obviously works for all SMP capable architectures.

   - Enhancements for SMP function call tracing so it is possible to
     locate the scheduling and the actual execution points. That allows
     to measure IPI delivery time precisely"

* tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  trace,smp: Add tracepoints for scheduling remotelly called functions
  trace,smp: Add tracepoints around remotelly called functions
  MAINTAINERS: Add CPU HOTPLUG entry
  x86/smpboot: Fix the parallel bringup decision
  x86/realmode: Make stack lock work in trampoline_compat()
  x86/smp: Initialize cpu_primary_thread_mask late
  cpu/hotplug: Fix off by one in cpuhp_bringup_mask()
  x86/apic: Fix use of X{,2}APIC_ENABLE in asm with older binutils
  x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it
  x86/smpboot: Support parallel startup of secondary CPUs
  x86/smpboot: Implement a bit spinlock to protect the realmode stack
  x86/apic: Save the APIC virtual base address
  cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE
  x86/apic: Provide cpu_primary_thread mask
  x86/smpboot: Enable split CPU startup
  cpu/hotplug: Provide a split up CPUHP_BRINGUP mechanism
  cpu/hotplug: Reset task stack state in _cpu_up()
  cpu/hotplug: Remove unused state functions
  riscv: Switch to hotplug core state synchronization
  parisc: Switch to hotplug core state synchronization
  ...
2023-06-26 13:59:56 -07:00
Arnd Bergmann
85e18ed32e ARM: 9315/1: fiq: include asm/mach/irq.h for prototypes
There are two global functions in fiq.c that get called from
other files through an extern declaration, but a W=1 build
warns about the header not being included before the definition:

arch/arm/kernel/fiq.c:85:5: error: no previous prototype for 'show_fiq_list' [-Werror=missing-prototypes]
arch/arm/kernel/fiq.c:159:13: error: no previous prototype for 'init_FIQ' [-Werror=missing-prototypes]

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:36:00 +01:00
Arnd Bergmann
c9a1d4f672 ARM: 9310/1: xip-kernel: add __inflate_kernel_data prototype
The kernel .data decompression is called from assembler, so it does
not need a prototype, but adding one avoids this W=1 warning:

arch/arm/kernel/head-inflate-data.c:35:12: error: no previous prototype for '__inflate_kernel_data' [-Werror=missing-prototypes]

The same file contains a few extern declarations for assembler
symbols, move those into the header as well for consistency.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:56 +01:00
Arnd Bergmann
be0796b07b ARM: 9309/1: add missing syscall prototypes
All architecture-independent system calls have prototypes in
include/linux/syscalls.h, but there are a few that only exist
on arm or that take the pt_regs directly. These cause a W=1
warning such as:

arch/arm/kernel/signal.c:186:16: error: no previous prototype for 'sys_sigreturn' [-Werror=missing-prototypes]
arch/arm/kernel/signal.c:216:16: error: no previous prototype for 'sys_rt_sigreturn' [-Werror=missing-prototypes]
arch/arm/kernel/sys_arm.c:32:17: error: no previous prototype for 'sys_arm_fadvise64_64' [-Werror=missing-prototypes]

Add prototypes for all custom syscalls on arm and add them
to asm/syscalls.h.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:55 +01:00
Arnd Bergmann
ad1cfe62b8 ARM: 9308/1: move setup functions to header
A couple of functions are declared in arch/arm/mm/mmu.c rather than in a header,
which causes W=1 build warnings:

arch/arm/mm/init.c:97:13: error: no previous prototype for 'setup_dma_zone' [-Werror=missing-prototypes]
arch/arm/mm/mmu.c:118:13: error: no previous prototype for 'init_default_cache_policy' [-Werror=missing-prototypes]
arch/arm/mm/mmu.c:1195:13: error: no previous prototype for 'adjust_lowmem_bounds' [-Werror=missing-prototypes]
arch/arm/mm/mmu.c:1761:13: error: no previous prototype for 'paging_init' [-Werror=missing-prototypes]
arch/arm/mm/mmu.c:1794:13: error: no previous prototype for 'early_mm_init' [-Werror=missing-prototypes]

Move the declaratsion to asm/setup.h so they can be seen by the compiler
while building the definition.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:55 +01:00
Arnd Bergmann
4b026ca3e2 ARM: 9302/1: traps: hide unused functions on NOMMU
A couple of functions in this file are only used on MMU-enabled
builds, and never even declared otherwise, causing these build
warnings:

arch/arm/kernel/traps.c:759:6: error: no previous prototype for '__pte_error' [-Werror=missing-prototypes]
arch/arm/kernel/traps.c:764:6: error: no previous prototype for '__pmd_error' [-Werror=missing-prototypes]
arch/arm/kernel/traps.c:769:6: error: no previous prototype for '__pgd_error' [-Werror=missing-prototypes]

Protect these in an #ifdef to avoid the warnings and save a little
bit of .text space.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:50 +01:00
Azeem Shaikh
7611b3358a ARM: 9300/1: Replace all non-returning strlcpy with strscpy
strlcpy() reads the entire source buffer first.  This read may exceed
the destination size limit.  This is both inefficient and can lead to
linear read overflows if a source string is not NUL-terminated [1].  In
an effort to remove strlcpy() completely [2], replace strlcpy() here
with strscpy().  No return values were used, so direct replacement is
safe.

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

[ardb: submitting to the patch tracker on behalf of Azeem]

Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:49 +01:00
Masahiro Yamada
ddbb7ea96a ARM: 9299/1: module: use sign_extend32() to extend the signedness
The function name clarifies the intention.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:48 +01:00
Linus Walleij
5bb578a0c1 ARM: 9298/1: Drop custom mdesc->handle_irq()
ARM exclusively uses GENERIC_IRQ_MULTI_HANDLER, so at some point
set_handle_irq() needs to be called to handle system-wide
interrupts.

For all DT-enabled boards, this call happens down in the
drivers/irqchip subsystem, after locating the target irqchip
driver from the device tree.

We still have a few instances of the boardfiles with machine
descriptors passing a machine-specific .handle_irq() to the
ARM kernel core.

Get rid of this by letting the few remaining machines consistently
call set_handle_irq() from the end of the .init_irq() callback
instead and diet down one member from the machine descriptor.

Cc: Marc Zyngier <maz@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-06-19 09:35:48 +01:00
Thomas Gleixner
ee31bb0524 ARM: cpu: Switch to arch_cpu_finalize_init()
check_bugs() is about to be phased out. Switch over to the new
arch_cpu_finalize_init() implementation.

No functional change.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20230613224545.078124882@linutronix.de
2023-06-16 10:15:59 +02:00
Jonathan Corbet
e318b36ed3 arm: update in-source documentation references
The Arm documentation has moved to Documentation/arch/arm; update
references within arch/arm to match.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Alim Akhtar <alim.akhtar@samsung.com>
Cc: Patrice Chotard <patrice.chotard@foss.st.com>
Cc: linux-doc@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-06-12 06:33:48 -06:00
Linus Walleij
a9ff696160 ARM: mm: Make virt_to_pfn() a static inline
Making virt_to_pfn() a static inline taking a strongly typed
(const void *) makes the contract of a passing a pointer of that
type to the function explicit and exposes any misuse of the
macro virt_to_pfn() acting polymorphic and accepting many types
such as (void *), (unitptr_t) or (unsigned long) as arguments
without warnings.

Doing this is a bit intrusive: virt_to_pfn() requires
PHYS_PFN_OFFSET and PAGE_SHIFT to be defined, and this is defined in
<asm/page.h>, so this must be included *before* <asm/memory.h>.

The use of macros were obscuring the unclear inclusion order here,
as the macros would eventually be resolved, but a static inline
like this cannot be compiled with unresolved macros.

The naive solution to include <asm/page.h> at the top of
<asm/memory.h> does not work, because <asm/memory.h> sometimes
includes <asm/page.h> at the end of itself, which would create a
confusing inclusion loop. So instead, take the approach to always
unconditionally include <asm/page.h> at the end of <asm/memory.h>

arch/arm uses <asm/memory.h> explicitly in a lot of places,
however it turns out that if we just unconditionally include
<asm/memory.h> into <asm/page.h> and switch all inclusions of
<asm/memory.h> to <asm/page.h> instead, we enforce the right
order and <asm/memory.h> will always have access to the
definitions.

Put an inclusion guard in place making it impossible to include
<asm/memory.h> explicitly.

Link: https://lore.kernel.org/linux-mm/20220701160004.2ffff4e5ab59a55499f4c736@linux-foundation.org/
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2023-05-29 11:27:08 +02:00
Ard Biesheuvel
47ba5f39ea ARM: entry: Make asm coproc dispatch code NWFPE only
Now that we can dispatch all VFP and iWMMXT related undef exceptions
using undef hooks implemented in C code, we no longer need the asm entry
code that takes care of this unless we are using FPE, so we can move it
into the FPE entry code. As this means it is ARM only, we can remove the
Thumb2 specific decorations as well.

It also means the non-standard, asm-only calling convention where
returning via LR means failure and returning via R9 means success is now
only used on legacy platforms that lack any kind of function return
prediction, avoiding the associated performance impact.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-17 15:08:22 +02:00
Ard Biesheuvel
303d6da167 ARM: iwmmxt: Use undef hook to enable coprocessor for task
Define a undef hook to deal with undef exceptions triggered by iwmmxt
instructions that were issued with the coprocessor disabled. This
removes the dependency on the coprocessor dispatch code in entry-armv.S,
which will be made NWFPE-only in a subsequent patch.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-17 15:08:22 +02:00
Ard Biesheuvel
8bcba70cb5 ARM: entry: Disregard Thumb undef exception in coproc dispatch
Now that the only remaining coprocessor instructions being handled via
the dispatch in entry-armv.S are ones that only exist in a ARM (A32)
encoding, we can simplify the handling of Thumb undef exceptions, and
send them straight to the undefined instruction handlers in C code.

This also means we can drop the code that partially decodes the
instruction to decide whether it is a 16-bit or 32-bit Thumb
instruction: this is all taken care of by the undef hook.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-17 15:08:22 +02:00
Ard Biesheuvel
cdd87465ad ARM: vfp: Use undef hook for handling VFP exceptions
Now that the VFP support code has been reimplemented as a C function
that takes a struct pt_regs pointer and an opcode, we can use the
existing undef_hook framework to deal with undef exceptions triggered by
VFP instructions instead of having special handling in assembler.

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-17 15:08:22 +02:00
Ard Biesheuvel
6ee1e6772e ARM: kernel: Get rid of thread_info::used_cp[] array
We keep track of which coprocessor triggered a fault in the used_cp[]
array in thread_info, but this data is never used anywhere. So let's
remove it.

Linus did some digging and found out that the last user of this field
was removed in commit bb1a773d5b ("kill unused dump_fpu() instances").

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-17 15:08:22 +02:00
Thomas Gleixner
5490e769cd ARM: smp: Switch to hotplug core state synchronization
Switch to the CPU hotplug core state tracking and synchronization
mechanim. No functional change intended.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205256.635326070@linutronix.de
2023-05-15 13:44:57 +02:00
Linus Torvalds
01bc932561 Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:

 - fix unwinder for uleb128 case

 - fix kernel-doc warnings for HP Jornada 7xx

 - fix unbalanced stack on vfp success path

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9297/1: vfp: avoid unbalanced stack on 'success' return path
  ARM: 9296/1: HP Jornada 7XX: fix kernel-doc warnings
  ARM: 9295/1: unwind:fix unwind abort for uleb128 case
2023-05-14 09:17:32 -07:00
Haibo Li
fa3eeb638d ARM: 9295/1: unwind:fix unwind abort for uleb128 case
When unwind instruction is 0xb2,the subsequent instructions
are uleb128 bytes.
For now,it uses only the first uleb128 byte in code.

For vsp increments of 0x204~0x400,use one uleb128 byte like below:
0xc06a00e4 <unwind_test_work>: 0x80b27fac
  Compact model index: 0
  0xb2 0x7f vsp = vsp + 1024
  0xac      pop {r4, r5, r6, r7, r8, r14}

For vsp increments larger than 0x400,use two uleb128 bytes like below:
0xc06a00e4 <unwind_test_work>: @0xc0cc9e0c
  Compact model index: 1
  0xb2 0x81 0x01 vsp = vsp + 1032
  0xac      pop {r4, r5, r6, r7, r8, r14}
The unwind works well since the decoded uleb128 byte is also 0x81.

For vsp increments larger than 0x600,use two uleb128 bytes like below:
0xc06a00e4 <unwind_test_work>: @0xc0cc9e0c
  Compact model index: 1
  0xb2 0x81 0x02 vsp = vsp + 1544
  0xac      pop {r4, r5, r6, r7, r8, r14}
In this case,the decoded uleb128 result is 0x101(vsp=0x204+(0x101<<2)).
While the uleb128 used in code is 0x81(vsp=0x204+(0x81<<2)).
The unwind aborts at this frame since it gets incorrect vsp.

To fix this,add uleb128 decode to cover all the above case.

Signed-off-by: Haibo Li <haibo.li@mediatek.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2023-05-05 10:16:40 +01:00
Linus Torvalds
f20730efbd Merge tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP cross-CPU function-call updates from Ingo Molnar:

 - Remove diagnostics and adjust config for CSD lock diagnostics

 - Add a generic IPI-sending tracepoint, as currently there's no easy
   way to instrument IPI origins: it's arch dependent and for some major
   architectures it's not even consistently available.

* tag 'smp-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  trace,smp: Trace all smp_function_call*() invocations
  trace: Add trace_ipi_send_cpu()
  sched, smp: Trace smp callback causing an IPI
  smp: reword smp call IPI comment
  treewide: Trace IPIs sent via smp_send_reschedule()
  irq_work: Trace self-IPIs sent via arch_irq_work_raise()
  smp: Trace IPIs sent via arch_send_call_function_ipi_mask()
  sched, smp: Trace IPIs sent via send_call_function_single_ipi()
  trace: Add trace_ipi_send_cpumask()
  kernel/smp: Make csdlock_debug= resettable
  locking/csd_lock: Remove per-CPU data indirection from CSD lock debugging
  locking/csd_lock: Remove added data from CSD lock debugging
  locking/csd_lock: Add Kconfig option for csd_debug default
2023-04-28 15:03:43 -07:00
Linus Torvalds
2aff7c706c Merge tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:

 - Mark arch_cpu_idle_dead() __noreturn, make all architectures &
   drivers that did this inconsistently follow this new, common
   convention, and fix all the fallout that objtool can now detect
   statically

 - Fix/improve the ORC unwinder becoming unreliable due to
   UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
   and UNWIND_HINT_UNDEFINED to resolve it

 - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code

 - Generate ORC data for __pfx code

 - Add more __noreturn annotations to various kernel startup/shutdown
   and panic functions

 - Misc improvements & fixes

* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  x86/hyperv: Mark hv_ghcb_terminate() as noreturn
  scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
  x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
  btrfs: Mark btrfs_assertfail() __noreturn
  objtool: Include weak functions in global_noreturns check
  cpu: Mark nmi_panic_self_stop() __noreturn
  cpu: Mark panic_smp_self_stop() __noreturn
  arm64/cpu: Mark cpu_park_loop() and friends __noreturn
  x86/head: Mark *_start_kernel() __noreturn
  init: Mark start_kernel() __noreturn
  init: Mark [arch_call_]rest_init() __noreturn
  objtool: Generate ORC data for __pfx code
  x86/linkage: Fix padding for typed functions
  objtool: Separate prefix code from stack validation code
  objtool: Remove superfluous dead_end_function() check
  objtool: Add symbol iteration helpers
  objtool: Add WARN_INSN()
  scripts/objdump-func: Support multiple functions
  context_tracking: Fix KCSAN noinstr violation
  objtool: Add stackleak instrumentation to uaccess safe list
  ...
2023-04-28 14:02:54 -07:00
Linus Torvalds
888d3c9f7f Merge tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull sysctl updates from Luis Chamberlain:
 "This only does a few sysctl moves from the kernel/sysctl.c file, the
  rest of the work has been put towards deprecating two API calls which
  incur recursion and prevent us from simplifying the registration
  process / saving memory per move. Most of the changes have been
  soaking on linux-next since v6.3-rc3.

  I've slowed down the kernel/sysctl.c moves due to Matthew Wilcox's
  feedback that we should see if we could *save* memory with these moves
  instead of incurring more memory. We currently incur more memory since
  when we move a syctl from kernel/sysclt.c out to its own file we end
  up having to add a new empty sysctl used to register it. To achieve
  saving memory we want to allow syctls to be passed without requiring
  the end element being empty, and just have our registration process
  rely on ARRAY_SIZE(). Without this, supporting both styles of sysctls
  would make the sysctl registration pretty brittle, hard to read and
  maintain as can be seen from Meng Tang's efforts to do just this [0].
  Fortunately, in order to use ARRAY_SIZE() for all sysctl registrations
  also implies doing the work to deprecate two API calls which use
  recursion in order to support sysctl declarations with subdirectories.

  And so during this development cycle quite a bit of effort went into
  this deprecation effort. I've annotated the following two APIs are
  deprecated and in few kernel releases we should be good to remove
  them:

   - register_sysctl_table()
   - register_sysctl_paths()

  During this merge window we should be able to deprecate and unexport
  register_sysctl_paths(), we can probably do that towards the end of
  this merge window.

  Deprecating register_sysctl_table() will take a bit more time but this
  pull request goes with a few example of how to do this.

  As it turns out each of the conversions to move away from either of
  these two API calls *also* saves memory. And so long term, all these
  changes *will* prove to have saved a bit of memory on boot.

  The way I see it then is if remove a user of one deprecated call, it
  gives us enough savings to move one kernel/sysctl.c out from the
  generic arrays as we end up with about the same amount of bytes.

  Since deprecating register_sysctl_table() and register_sysctl_paths()
  does not require maintainer coordination except the final unexport
  you'll see quite a bit of these changes from other pull requests, I've
  just kept the stragglers after rc3"

Link: https://lkml.kernel.org/r/ZAD+cpbrqlc5vmry@bombadil.infradead.org [0]

* tag 'sysctl-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (29 commits)
  fs: fix sysctls.c built
  mm: compaction: remove incorrect #ifdef checks
  mm: compaction: move compaction sysctl to its own file
  mm: memory-failure: Move memory failure sysctls to its own file
  arm: simplify two-level sysctl registration for ctl_isa_vars
  ia64: simplify one-level sysctl registration for kdump_ctl_table
  utsname: simplify one-level sysctl registration for uts_kern_table
  ntfs: simplfy one-level sysctl registration for ntfs_sysctls
  coda: simplify one-level sysctl registration for coda_table
  fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctls
  xfs: simplify two-level sysctl registration for xfs_table
  nfs: simplify two-level sysctl registration for nfs_cb_sysctls
  nfs: simplify two-level sysctl registration for nfs4_cb_sysctls
  lockd: simplify two-level sysctl registration for nlm_sysctls
  proc_sysctl: enhance documentation
  xen: simplify sysctl registration for balloon
  md: simplify sysctl registration
  hv: simplify sysctl registration
  scsi: simplify sysctl registration with register_sysctl()
  csky: simplify alignment sysctl registration
  ...
2023-04-27 16:52:33 -07:00
Linus Torvalds
b6a7828502 Merge tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
 "The summary of the changes for this pull requests is:

   - Song Liu's new struct module_memory replacement

   - Nick Alcock's MODULE_LICENSE() removal for non-modules

   - My cleanups and enhancements to reduce the areas where we vmalloc
     module memory for duplicates, and the respective debug code which
     proves the remaining vmalloc pressure comes from userspace.

  Most of the changes have been in linux-next for quite some time except
  the minor fixes I made to check if a module was already loaded prior
  to allocating the final module memory with vmalloc and the respective
  debug code it introduces to help clarify the issue. Although the
  functional change is small it is rather safe as it can only *help*
  reduce vmalloc space for duplicates and is confirmed to fix a bootup
  issue with over 400 CPUs with KASAN enabled. I don't expect stable
  kernels to pick up that fix as the cleanups would have also had to
  have been picked up. Folks on larger CPU systems with modules will
  want to just upgrade if vmalloc space has been an issue on bootup.

  Given the size of this request, here's some more elaborate details:

  The functional change change in this pull request is the very first
  patch from Song Liu which replaces the 'struct module_layout' with a
  new 'struct module_memory'. The old data structure tried to put
  together all types of supported module memory types in one data
  structure, the new one abstracts the differences in memory types in a
  module to allow each one to provide their own set of details. This
  paves the way in the future so we can deal with them in a cleaner way.
  If you look at changes they also provide a nice cleanup of how we
  handle these different memory areas in a module. This change has been
  in linux-next since before the merge window opened for v6.3 so to
  provide more than a full kernel cycle of testing. It's a good thing as
  quite a bit of fixes have been found for it.

  Jason Baron then made dynamic debug a first class citizen module user
  by using module notifier callbacks to allocate / remove module
  specific dynamic debug information.

  Nick Alcock has done quite a bit of work cross-tree to remove module
  license tags from things which cannot possibly be module at my request
  so to:

   a) help him with his longer term tooling goals which require a
      deterministic evaluation if a piece a symbol code could ever be
      part of a module or not. But quite recently it is has been made
      clear that tooling is not the only one that would benefit.
      Disambiguating symbols also helps efforts such as live patching,
      kprobes and BPF, but for other reasons and R&D on this area is
      active with no clear solution in sight.

   b) help us inch closer to the now generally accepted long term goal
      of automating all the MODULE_LICENSE() tags from SPDX license tags

  In so far as a) is concerned, although module license tags are a no-op
  for non-modules, tools which would want create a mapping of possible
  modules can only rely on the module license tag after the commit
  8b41fc4454 ("kbuild: create modules.builtin without
  Makefile.modbuiltin or tristate.conf").

  Nick has been working on this *for years* and AFAICT I was the only
  one to suggest two alternatives to this approach for tooling. The
  complexity in one of my suggested approaches lies in that we'd need a
  possible-obj-m and a could-be-module which would check if the object
  being built is part of any kconfig build which could ever lead to it
  being part of a module, and if so define a new define
  -DPOSSIBLE_MODULE [0].

  A more obvious yet theoretical approach I've suggested would be to
  have a tristate in kconfig imply the same new -DPOSSIBLE_MODULE as
  well but that means getting kconfig symbol names mapping to modules
  always, and I don't think that's the case today. I am not aware of
  Nick or anyone exploring either of these options. Quite recently Josh
  Poimboeuf has pointed out that live patching, kprobes and BPF would
  benefit from resolving some part of the disambiguation as well but for
  other reasons. The function granularity KASLR (fgkaslr) patches were
  mentioned but Joe Lawrence has clarified this effort has been dropped
  with no clear solution in sight [1].

  In the meantime removing module license tags from code which could
  never be modules is welcomed for both objectives mentioned above. Some
  developers have also welcomed these changes as it has helped clarify
  when a module was never possible and they forgot to clean this up, and
  so you'll see quite a bit of Nick's patches in other pull requests for
  this merge window. I just picked up the stragglers after rc3. LWN has
  good coverage on the motivation behind this work [2] and the typical
  cross-tree issues he ran into along the way. The only concrete blocker
  issue he ran into was that we should not remove the MODULE_LICENSE()
  tags from files which have no SPDX tags yet, even if they can never be
  modules. Nick ended up giving up on his efforts due to having to do
  this vetting and backlash he ran into from folks who really did *not
  understand* the core of the issue nor were providing any alternative /
  guidance. I've gone through his changes and dropped the patches which
  dropped the module license tags where an SPDX license tag was missing,
  it only consisted of 11 drivers. To see if a pull request deals with a
  file which lacks SPDX tags you can just use:

    ./scripts/spdxcheck.py -f \
	$(git diff --name-only commid-id | xargs echo)

  You'll see a core module file in this pull request for the above, but
  that's not related to his changes. WE just need to add the SPDX
  license tag for the kernel/module/kmod.c file in the future but it
  demonstrates the effectiveness of the script.

  Most of Nick's changes were spread out through different trees, and I
  just picked up the slack after rc3 for the last kernel was out. Those
  changes have been in linux-next for over two weeks.

  The cleanups, debug code I added and final fix I added for modules
  were motivated by David Hildenbrand's report of boot failing on a
  systems with over 400 CPUs when KASAN was enabled due to running out
  of virtual memory space. Although the functional change only consists
  of 3 lines in the patch "module: avoid allocation if module is already
  present and ready", proving that this was the best we can do on the
  modules side took quite a bit of effort and new debug code.

  The initial cleanups I did on the modules side of things has been in
  linux-next since around rc3 of the last kernel, the actual final fix
  for and debug code however have only been in linux-next for about a
  week or so but I think it is worth getting that code in for this merge
  window as it does help fix / prove / evaluate the issues reported with
  larger number of CPUs. Userspace is not yet fixed as it is taking a
  bit of time for folks to understand the crux of the issue and find a
  proper resolution. Worst come to worst, I have a kludge-of-concept [3]
  of how to make kernel_read*() calls for modules unique / converge
  them, but I'm currently inclined to just see if userspace can fix this
  instead"

Link: https://lore.kernel.org/all/Y/kXDqW+7d71C4wz@bombadil.infradead.org/ [0]
Link: https://lkml.kernel.org/r/025f2151-ce7c-5630-9b90-98742c97ac65@redhat.com [1]
Link: https://lwn.net/Articles/927569/ [2]
Link: https://lkml.kernel.org/r/20230414052840.1994456-3-mcgrof@kernel.org [3]

* tag 'modules-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (121 commits)
  module: add debugging auto-load duplicate module support
  module: stats: fix invalid_mod_bytes typo
  module: remove use of uninitialized variable len
  module: fix building stats for 32-bit targets
  module: stats: include uapi/linux/module.h
  module: avoid allocation if module is already present and ready
  module: add debug stats to help identify memory pressure
  module: extract patient module check into helper
  modules/kmod: replace implementation with a semaphore
  Change DEFINE_SEMAPHORE() to take a number argument
  module: fix kmemleak annotations for non init ELF sections
  module: Ignore L0 and rename is_arm_mapping_symbol()
  module: Move is_arm_mapping_symbol() to module_symbol.h
  module: Sync code of is_arm_mapping_symbol()
  scripts/gdb: use mem instead of core_layout to get the module address
  interconnect: remove module-related code
  interconnect: remove MODULE_LICENSE in non-modules
  zswap: remove MODULE_LICENSE in non-modules
  zpool: remove MODULE_LICENSE in non-modules
  x86/mm/dump_pagetables: remove MODULE_LICENSE in non-modules
  ...
2023-04-27 16:36:55 -07:00
Linus Torvalds
34b62f186d Merge tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
 "Resource management:

   - Add pci_dev_for_each_resource() and pci_bus_for_each_resource()
     iterators

  PCIe native device hotplug:

   - Fix AB-BA deadlock between reset_lock and device_lock

  Power management:

   - Wait longer for devices to become ready after resume (as we do for
     reset) to accommodate Intel Titan Ridge xHCI devices

   - Extend D3hot delay for NVIDIA HDA controllers to avoid
     unrecoverable devices after a bus reset

  Error handling:

   - Clear PCIe Device Status after EDR since generic error recovery now
     only clears it when AER is native

  ASPM:

   - Work around Chromebook firmware defect that clobbers Capability
     list (including ASPM L1 PM Substates Cap) when returning from
     D3cold to D0

  Freescale i.MX6 PCIe controller driver:

   - Install imprecise external abort handler only when DT indicates
     PCIe support

  Freescale Layerscape PCIe controller driver:

   - Add ls1028a endpoint mode support

  Qualcomm PCIe controller driver:

   - Add SM8550 DT binding and driver support

   - Add SDX55 DT binding and driver support

   - Use bulk APIs for clocks of IP 1.0.0, 2.3.2, 2.3.3

   - Use bulk APIs for reset of IP 2.1.0, 2.3.3, 2.4.0

   - Add DT "mhi" register region for supported SoCs

   - Expose link transition counts via debugfs to help debug low power
     issues

   - Support system suspend and resume; reduce interconnect bandwidth
     and turn off clock and PHY if there are no active devices

   - Enable async probe by default to reduce boot time

  Miscellaneous:

   - Sort controller Kconfig entries by vendor"

* tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (56 commits)
  PCI: xilinx: Drop obsolete dependency on COMPILE_TEST
  PCI: mobiveil: Sort Kconfig entries by vendor
  PCI: dwc: Sort Kconfig entries by vendor
  PCI: Sort controller Kconfig entries by vendor
  PCI: Use consistent controller Kconfig menu entry language
  PCI: xilinx-nwl: Add 'Xilinx' to Kconfig prompt
  PCI: hv: Add 'Microsoft' to Kconfig prompt
  PCI: meson: Add 'Amlogic' to Kconfig prompt
  PCI: Use of_property_present() for testing DT property presence
  PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
  dt-bindings: PCI: qcom: Document msi-map and msi-map-mask properties
  PCI: qcom: Add SM8550 PCIe support
  dt-bindings: PCI: qcom: Add SM8550 compatible
  PCI: qcom: Add support for SDX55 SoC
  dt-bindings: PCI: qcom-ep: Fix the unit address used in example
  dt-bindings: PCI: qcom: Add SDX55 SoC
  dt-bindings: PCI: qcom: Update maintainers entry
  PCI: qcom: Enable async probe by default
  PCI: qcom: Add support for system suspend and resume
  PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter
  ...
2023-04-27 10:45:30 -07:00