Merge tag 'v6.18-rc6' into for-linus

Sync up with the mainline to bring in definition of
INPUT_PROP_HAPTIC_TOUCHPAD.
Dmitry Torokhov
2025-11-17 23:16:55 -08:00
13477 changed files with 573015 additions and 320479 deletions


@@ -294,7 +294,6 @@ ForEachMacros:
- 'for_each_fib6_node_rt_rcu'
- 'for_each_fib6_walker_rt'
- 'for_each_file_lock'
- 'for_each_free_mem_pfn_range_in_zone_from'
- 'for_each_free_mem_range'
- 'for_each_free_mem_range_reverse'
- 'for_each_func_rsrc'


@@ -1,5 +1,6 @@
Alan Cox <alan@lxorguk.ukuu.org.uk>
Alan Cox <root@hraefn.swansea.linux.org.uk>
Alyssa Rosenzweig <alyssa@rosenzweig.io>
Christoph Hellwig <hch@lst.de>
Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Marc Gonzalez <marc.w.gonzalez@free.fr>

2
.gitignore vendored

@@ -176,7 +176,7 @@ x509.genkey
*.kdev4
# Clang's compilation database file
/compile_commands.json
compile_commands.json
# Documentation toolchain
sphinx_*/


@@ -27,6 +27,7 @@ Alan Cox <alan@lxorguk.ukuu.org.uk>
Alan Cox <root@hraefn.swansea.linux.org.uk>
Aleksandar Markovic <aleksandar.markovic@mips.com> <aleksandar.markovic@imgtec.com>
Aleksey Gorelov <aleksey_gorelov@phoenix.com>
Alex Williamson <alex@shazbot.org> <alex.williamson@redhat.com>
Alexander Lobakin <alobakin@pm.me> <alobakin@dlink.ru>
Alexander Lobakin <alobakin@pm.me> <alobakin@marvell.com>
Alexander Lobakin <alobakin@pm.me> <bloodyreaper@yandex.ru>
@@ -134,6 +135,7 @@ Ben M Cahill <ben.m.cahill@intel.com>
Ben Widawsky <bwidawsk@kernel.org> <ben@bwidawsk.net>
Ben Widawsky <bwidawsk@kernel.org> <ben.widawsky@intel.com>
Ben Widawsky <bwidawsk@kernel.org> <benjamin.widawsky@intel.com>
Bence Csókás <bence98@sch.bme.hu> <csokas.bence@prolan.hu>
Benjamin Poirier <benjamin.poirier@gmail.com> <bpoirier@suse.de>
Benjamin Tissoires <bentiss@kernel.org> <benjamin.tissoires@gmail.com>
Benjamin Tissoires <bentiss@kernel.org> <benjamin.tissoires@redhat.com>
@@ -164,6 +166,8 @@ Casey Connolly <casey.connolly@linaro.org> <caleb@connolly.tech>
Casey Connolly <casey.connolly@linaro.org> <caleb@postmarketos.org>
Can Guo <quic_cang@quicinc.com> <cang@codeaurora.org>
Carl Huang <quic_cjhuang@quicinc.com> <cjhuang@codeaurora.org>
Carl Vanderlip <carl.vanderlip@oss.qualcomm.com> <carlv@codeaurora.org>
Carl Vanderlip <carl.vanderlip@oss.qualcomm.com> <quic_carlv@quicinc.com>
Carlos Bilbao <carlos.bilbao@kernel.org> <carlos.bilbao@amd.com>
Carlos Bilbao <carlos.bilbao@kernel.org> <carlos.bilbao.osdev@gmail.com>
Carlos Bilbao <carlos.bilbao@kernel.org> <bilbao@vt.edu>
@@ -202,6 +206,7 @@ Danilo Krummrich <dakr@kernel.org> <dakr@redhat.com>
David Brownell <david-b@pacbell.net>
David Collins <quic_collinsd@quicinc.com> <collinsd@codeaurora.org>
David Heidelberg <david@ixit.cz> <d.okias@gmail.com>
David Hildenbrand <david@kernel.org> <david@redhat.com>
David Rheinsberg <david@readahead.eu> <dh.herrmann@gmail.com>
David Rheinsberg <david@readahead.eu> <dh.herrmann@googlemail.com>
David Rheinsberg <david@readahead.eu> <david.rheinsberg@gmail.com>
@@ -213,7 +218,8 @@ Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@gmail.com>
Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@imgtec.com>
Dengcheng Zhu <dzhu@wavecomp.com> <dengcheng.zhu@mips.com>
<dev.kurt@vandijck-laurijssen.be> <kurt.van.dijck@eia.be>
Dikshita Agarwal <quic_dikshita@quicinc.com> <dikshita@codeaurora.org>
Dikshita Agarwal <dikshita.agarwal@oss.qualcomm.com> <dikshita@codeaurora.org>
Dikshita Agarwal <dikshita.agarwal@oss.qualcomm.com> <quic_dikshita@quicinc.com>
Dmitry Baryshkov <lumag@kernel.org> <dbaryshkov@gmail.com>
Dmitry Baryshkov <lumag@kernel.org> <[dbaryshkov@gmail.com]>
Dmitry Baryshkov <lumag@kernel.org> <dmitry_baryshkov@mentor.com>
@@ -223,9 +229,12 @@ Dmitry Safonov <0x7f454c46@gmail.com> <dima@arista.com>
Dmitry Safonov <0x7f454c46@gmail.com> <d.safonov@partner.samsung.com>
Dmitry Safonov <0x7f454c46@gmail.com> <dsafonov@virtuozzo.com>
Domen Puncer <domen@coderock.org>
Dong Aisheng <aisheng.dong@nxp.com> <b29396@freescale.com>
Douglas Gilbert <dougg@torque.net>
Drew Fustini <fustini@kernel.org> <drew@pdp7.com>
<duje@dujemihanovic.xyz> <duje.mihanovic@skole.hr>
Easwar Hariharan <easwar.hariharan@linux.microsoft.com> <easwar.hariharan@intel.com>
Easwar Hariharan <easwar.hariharan@linux.microsoft.com> <eahariha@linux.microsoft.com>
Ed L. Cashin <ecashin@coraid.com>
Elliot Berman <quic_eberman@quicinc.com> <eberman@codeaurora.org>
Enric Balletbo i Serra <eballetbo@kernel.org> <enric.balletbo@collabora.com>
@@ -418,7 +427,7 @@ Kenneth W Chen <kenneth.w.chen@intel.com>
Kenneth Westfield <quic_kwestfie@quicinc.com> <kwestfie@codeaurora.org>
Kiran Gunda <quic_kgunda@quicinc.com> <kgunda@codeaurora.org>
Kirill Tkhai <tkhai@ya.ru> <ktkhai@virtuozzo.com>
Kirill A. Shutemov <kas@kernel.org> <kirill.shutemov@linux.intel.com>
Kiryl Shutsemau <kas@kernel.org> <kirill.shutemov@linux.intel.com>
Kishon Vijay Abraham I <kishon@kernel.org> <kishon@ti.com>
Konrad Dybcio <konradybcio@kernel.org> <konrad.dybcio@linaro.org>
Konrad Dybcio <konradybcio@kernel.org> <konrad.dybcio@somainline.org>
@@ -587,6 +596,7 @@ Nikolay Aleksandrov <razor@blackwall.org> <nikolay@redhat.com>
Nikolay Aleksandrov <razor@blackwall.org> <nikolay@cumulusnetworks.com>
Nikolay Aleksandrov <razor@blackwall.org> <nikolay@nvidia.com>
Nikolay Aleksandrov <razor@blackwall.org> <nikolay@isovalent.com>
Nobuhiro Iwamatsu <nobuhiro.iwamatsu.x90@mail.toshiba> <nobuhiro1.iwamatsu@toshiba.co.jp>
Odelu Kukatla <quic_okukatla@quicinc.com> <okukatla@codeaurora.org>
Oleksandr Natalenko <oleksandr@natalenko.name> <oleksandr@redhat.com>
Oleksij Rempel <linux@rempel-privat.de> <bug-track@fisher-privat.net>
@@ -596,7 +606,8 @@ Oleksij Rempel <o.rempel@pengutronix.de>
Oleksij Rempel <o.rempel@pengutronix.de> <ore@pengutronix.de>
Oliver Hartkopp <socketcan@hartkopp.net> <oliver.hartkopp@volkswagen.de>
Oliver Hartkopp <socketcan@hartkopp.net> <oliver@hartkopp.net>
Oliver Upton <oliver.upton@linux.dev> <oupton@google.com>
Oliver Upton <oupton@kernel.org> <oupton@google.com>
Oliver Upton <oupton@kernel.org> <oliver.upton@linux.dev>
Ondřej Jirman <megi@xff.cz> <megous@megous.com>
Oza Pawandeep <quic_poza@quicinc.com> <poza@codeaurora.org>
Pali Rohár <pali@kernel.org> <pali.rohar@gmail.com>
@@ -620,6 +631,7 @@ Paulo Alcantara <pc@manguebit.org> <palcantara@suse.com>
Paulo Alcantara <pc@manguebit.org> <pc@manguebit.com>
Pavankumar Kondeti <quic_pkondeti@quicinc.com> <pkondeti@codeaurora.org>
Peter A Jonsson <pj@ludd.ltu.se>
Peter Hilber <peter.hilber@oss.qualcomm.com> <quic_philber@quicinc.com>
Peter Oruba <peter.oruba@amd.com>
Peter Oruba <peter@oruba.de>
Pierre-Louis Bossart <pierre-louis.bossart@linux.dev> <pierre-louis.bossart@linux.intel.com>
@@ -634,6 +646,7 @@ Qais Yousef <qyousef@layalina.io> <qais.yousef@arm.com>
Quentin Monnet <qmo@kernel.org> <quentin.monnet@netronome.com>
Quentin Monnet <qmo@kernel.org> <quentin@isovalent.com>
Quentin Perret <qperret@qperret.net> <quentin.perret@arm.com>
Rae Moar <raemoar63@gmail.com> <rmoar@google.com>
Rafael J. Wysocki <rjw@rjwysocki.net> <rjw@sisk.pl>
Rajeev Nandan <quic_rajeevny@quicinc.com> <rajeevny@codeaurora.org>
Rajendra Nayak <quic_rjendra@quicinc.com> <rnayak@codeaurora.org>
@@ -703,6 +716,7 @@ Sergey Senozhatsky <senozhatsky@chromium.org> <sergey.senozhatsky@mail.by>
Sergey Senozhatsky <senozhatsky@chromium.org> <senozhatsky@google.com>
Seth Forshee <sforshee@kernel.org> <seth.forshee@canonical.com>
Shakeel Butt <shakeel.butt@linux.dev> <shakeelb@google.com>
Shameer Kolothum <skolothumtho@nvidia.com> <shameerali.kolothum.thodi@huawei.com>
Shannon Nelson <sln@onemain.com> <shannon.nelson@amd.com>
Shannon Nelson <sln@onemain.com> <snelson@pensando.io>
Shannon Nelson <sln@onemain.com> <shannon.nelson@intel.com>
@@ -713,7 +727,8 @@ Shuah Khan <shuah@kernel.org> <shuahkhan@gmail.com>
Shuah Khan <shuah@kernel.org> <shuah.khan@hp.com>
Shuah Khan <shuah@kernel.org> <shuahkh@osg.samsung.com>
Shuah Khan <shuah@kernel.org> <shuah.kh@samsung.com>
Sibi Sankar <quic_sibis@quicinc.com> <sibis@codeaurora.org>
Sibi Sankar <sibi.sankar@oss.qualcomm.com> <sibis@codeaurora.org>
Sibi Sankar <sibi.sankar@oss.qualcomm.com> <quic_sibis@quicinc.com>
Sid Manning <quic_sidneym@quicinc.com> <sidneym@codeaurora.org>
Simon Arlott <simon@octiron.net> <simon@fire.lp0.eu>
Simona Vetter <simona.vetter@ffwll.ch> <daniel.vetter@ffwll.ch>
@@ -737,6 +752,8 @@ Sriram Yagnaraman <sriram.yagnaraman@ericsson.com> <sriram.yagnaraman@est.tech>
Stanislav Fomichev <sdf@fomichev.me> <sdf@google.com>
Stanislav Fomichev <sdf@fomichev.me> <stfomichev@gmail.com>
Stefan Wahren <wahrenst@gmx.net> <stefan.wahren@i2se.com>
Stéphane Grosjean <stephane.grosjean@hms-networks.com> <s.grosjean@peak-system.com>
Stéphane Grosjean <stephane.grosjean@hms-networks.com> <stephane.grosjean@free.fr>
Stéphane Witzmann <stephane.witzmann@ubpmes.univ-bpclermont.fr>
Stephen Hemminger <stephen@networkplumber.org> <shemminger@linux-foundation.org>
Stephen Hemminger <stephen@networkplumber.org> <shemminger@osdl.org>
@@ -791,6 +808,7 @@ Tvrtko Ursulin <tursulin@ursulin.net> <tvrtko.ursulin@onelan.co.uk>
Tvrtko Ursulin <tursulin@ursulin.net> <tvrtko@ursulin.net>
Tycho Andersen <tycho@tycho.pizza> <tycho@tycho.ws>
Tzung-Bi Shih <tzungbi@kernel.org> <tzungbi@google.com>
Umang Jain <uajain@igalia.com> <umang.jain@ideasonboard.com>
Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Uwe Kleine-König <u.kleine-koenig@baylibre.com> <ukleinek@baylibre.com>
Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
@@ -812,7 +830,9 @@ Valentin Schneider <vschneid@redhat.com> <valentin.schneider@arm.com>
Veera Sundaram Sankaran <quic_veeras@quicinc.com> <veeras@codeaurora.org>
Veerabhadrarao Badiganti <quic_vbadigan@quicinc.com> <vbadigan@codeaurora.org>
Venkateswara Naralasetty <quic_vnaralas@quicinc.com> <vnaralas@codeaurora.org>
Vikash Garodia <quic_vgarodia@quicinc.com> <vgarodia@codeaurora.org>
Vikash Garodia <vikash.garodia@oss.qualcomm.com> <vgarodia@codeaurora.org>
Vikash Garodia <vikash.garodia@oss.qualcomm.com> <quic_vgarodia@quicinc.com>
Vincent Mailhol <mailhol@kernel.org> <mailhol.vincent@wanadoo.fr>
Vinod Koul <vkoul@kernel.org> <vinod.koul@intel.com>
Vinod Koul <vkoul@kernel.org> <vinod.koul@linux.intel.com>
Vinod Koul <vkoul@kernel.org> <vkoul@infradead.org>


@@ -1,2 +1,2 @@
[MASTER]
init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi"]'
init-hook='import sys; sys.path += ["scripts/lib/kdoc", "scripts/lib/abi", "tools/docs/lib"]'

22
CREDITS

@@ -1890,6 +1890,11 @@ S: Reading
S: RG6 2NU
S: United Kingdom
N: Michael Jamet
E: michael.jamet@intel.com
D: Thunderbolt/USB4 driver maintainer
D: Thunderbolt/USB4 networking driver maintainer
N: Dave Jeffery
E: dhjeffery@gmail.com
D: SCSI hacks and IBM ServeRAID RAID driver maintenance
@@ -2031,6 +2036,10 @@ S: Botanicka' 68a
S: 602 00 Brno
S: Czech Republic
N: Karsten Keil
E: isdn@linux-pingi.de
D: ISDN subsystem maintainer
N: Jakob Kemi
E: jakob.kemi@telia.com
D: V4L W9966 Webcam driver
@@ -3222,6 +3231,10 @@ D: AIC5800 IEEE 1394, RAW I/O on 1394
D: Starter of Linux1394 effort
S: ask per mail for current address
N: Boris Pismenny
E: borisp@mellanox.com
D: Kernel TLS implementation and offload support.
N: Nicolas Pitre
E: nico@fluxnic.net
D: StrongARM SA1100 support integrator & hacker
@@ -3908,6 +3921,12 @@ S: C/ Federico Garcia Lorca 1 10-A
S: Sevilla 41005
S: Spain
N: Björn Töpel
E: bjorn@kernel.org
D: AF_XDP
S: Gothenburg
S: Sweden
N: Linus Torvalds
E: torvalds@linux-foundation.org
D: Original kernel hacker
@@ -4168,6 +4187,9 @@ S: 1513 Brewster Dr.
S: Carrollton, TX 75010
S: USA
N: Dave Watson
D: Kernel TLS implementation.
N: Tim Waugh
E: tim@cyberelk.net
D: Co-architect of the parallel-port sharing system

1191
Documentation/.renames.txt Normal file

File diff suppressed because it is too large


@@ -603,16 +603,10 @@ Date: July 2003
Contact: linux-block@vger.kernel.org
Description:
[RW] This controls how many requests may be allocated in the
block layer for read or write requests. Note that the total
allocated number may be twice this amount, since it applies only
to reads or writes (not the accumulated sum).
To avoid priority inversion through request starvation, a
request queue maintains a separate request pool per each cgroup
when CONFIG_BLK_CGROUP is enabled, and this parameter applies to
each such per-block-cgroup request pool. IOW, if there are N
block cgroups, each request queue may have up to N request
pools, each independently regulated by nr_requests.
block layer. Noted this value only represents the quantity for a
single blk_mq_tags instance. The actual number for the entire
device depends on the hardware queue count, whether elevator is
enabled, and whether tags are shared.
What: /sys/block/<disk>/queue/nr_zones


@@ -1,6 +1,6 @@
What: /sys/kernel/debug/cec/*/error-inj
Date: March 2018
Contact: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Contact: Hans Verkuil <hverkuil@kernel.org>
Description:
The CEC Framework allows for CEC error injection commands through


@@ -19,6 +19,20 @@ Description:
is returned to the user. The inject_poison attribute is only
visible for devices supporting the capability.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
poison injection can result in permanent data loss. Injected
poison may render data permanently inaccessible even after
clearing, as the clear operation writes zeros and does not
recover original data.
SYSTEM STABILITY RISK: For volatile memory, poison injection
can cause kernel crashes, system instability, or unpredictable
behavior if the poisoned addresses are accessed by running code
or critical kernel structures.
What: /sys/kernel/debug/cxl/memX/clear_poison
Date: April, 2023
@@ -35,6 +49,79 @@ Description:
The clear_poison attribute is only visible for devices
supporting the capability.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
specified address range and removes the address from the poison
list. It does NOT recover or restore original data that may have
been present before poison injection. Any original data at the
cleared address is permanently lost and replaced with zeros.
CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
purposes only and should not be used as a data repair tool.
Clearing poison is fundamentally different from data recovery
or error correction.
What: /sys/kernel/debug/cxl/regionX/inject_poison
Date: August, 2025
Contact: linux-cxl@vger.kernel.org
Description:
(WO) When a Host Physical Address (HPA) is written to this
attribute, the region driver translates it to a Device
Physical Address (DPA) and identifies the corresponding
memdev. It then sends an inject poison command to that memdev
at the translated DPA. Refer to the memdev ABI entry at:
/sys/kernel/debug/cxl/memX/inject_poison for the detailed
behavior. This attribute is only visible if all memdevs
participating in the region support both inject and clear
poison commands.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
DATA LOSS RISK: For CXL persistent memory (PMEM) devices,
poison injection can result in permanent data loss. Injected
poison may render data permanently inaccessible even after
clearing, as the clear operation writes zeros and does not
recover original data.
SYSTEM STABILITY RISK: For volatile memory, poison injection
can cause kernel crashes, system instability, or unpredictable
behavior if the poisoned addresses are accessed by running code
or critical kernel structures.
What: /sys/kernel/debug/cxl/regionX/clear_poison
Date: August, 2025
Contact: linux-cxl@vger.kernel.org
Description:
(WO) When a Host Physical Address (HPA) is written to this
attribute, the region driver translates it to a Device
Physical Address (DPA) and identifies the corresponding
memdev. It then sends a clear poison command to that memdev
at the translated DPA. Refer to the memdev ABI entry at:
/sys/kernel/debug/cxl/memX/clear_poison for the detailed
behavior. This attribute is only visible if all memdevs
participating in the region support both inject and clear
poison commands.
TEST-ONLY INTERFACE: This interface is intended for testing
and validation purposes only. It is not a data repair mechanism
and should never be used on production systems or live data.
CLEAR IS NOT DATA RECOVERY: This operation writes zeros to the
specified address range and removes the address from the poison
list. It does NOT recover or restore original data that may have
been present before poison injection. Any original data at the
cleared address is permanently lost and replaced with zeros.
CLEAR IS NOT A REPAIR MECHANISM: This interface is for testing
purposes only and should not be used as a data repair tool.
Clearing poison is fundamentally different from data recovery
or error correction.
What: /sys/kernel/debug/cxl/einj_types
Date: January, 2024
KernelVersion: v6.9


@@ -57,6 +57,7 @@ Description: (RO) Reports device telemetry counters.
gp_lat_acc_avg average get to put latency [ns]
bw_in PCIe, write bandwidth [Mbps]
bw_out PCIe, read bandwidth [Mbps]
re_acc_avg average ring empty time [ns]
at_page_req_lat_avg Address Translator(AT), average page
request latency [ns]
at_trans_lat_avg AT, average page translation latency [ns]
@@ -85,6 +86,32 @@ Description: (RO) Reports device telemetry counters.
exec_cph<N> execution count of Cipher slice N
util_ath<N> utilization of Authentication slice N [%]
exec_ath<N> execution count of Authentication slice N
cmdq_wait_cnv<N> wait time for cmdq N to get Compression and verify
slice ownership
cmdq_exec_cnv<N> Compression and verify slice execution time while
owned by cmdq N
cmdq_drain_cnv<N> time taken for cmdq N to release Compression and
verify slice ownership
cmdq_wait_dcprz<N> wait time for cmdq N to get Decompression
slice N ownership
cmdq_exec_dcprz<N> Decompression slice execution time while
owned by cmdq N
cmdq_drain_dcprz<N> time taken for cmdq N to release Decompression
slice ownership
cmdq_wait_pke<N> wait time for cmdq N to get PKE slice ownership
cmdq_exec_pke<N> PKE slice execution time while owned by cmdq N
cmdq_drain_pke<N> time taken for cmdq N to release PKE slice
ownership
cmdq_wait_ucs<N> wait time for cmdq N to get UCS slice ownership
cmdq_exec_ucs<N> UCS slice execution time while owned by cmdq N
cmdq_drain_ucs<N> time taken for cmdq N to release UCS slice
ownership
cmdq_wait_ath<N> wait time for cmdq N to get Authentication slice
ownership
cmdq_exec_ath<N> Authentication slice execution time while owned
by cmdq N
cmdq_drain_ath<N> time taken for cmdq N to release Authentication
slice ownership
======================= ========================================
The telemetry report file can be read with the following command::


@@ -23,3 +23,9 @@ Contact: Longfang Liu <liulongfang@huawei.com>
Description: Read the live migration status of the vfio device.
The contents of the state file reflects the migration state
relative to those defined in the vfio_device_mig_state enum
What: /sys/kernel/debug/vfio/<device>/migration/features
Date: Oct 2025
KernelVersion: 6.18
Contact: Cédric Le Goater <clg@redhat.com>
Description: Read the migration features of the vfio device.


@@ -239,3 +239,9 @@ Date: March 2020
KernelVersion: 5.7
Contact: Mike Leach or Mathieu Poirier
Description: (Write) Clear all channel / trigger programming.
What: /sys/bus/coresight/devices/<cti-name>/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -13,3 +13,9 @@ KernelVersion: 6.14
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (R) Show the trace ID that will appear in the trace stream
coming from this trace entity.
What: /sys/bus/coresight/devices/dummy_source<N>/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -19,6 +19,12 @@ Description: (RW) Disables write access to the Trace RAM by stopping the
into the Trace RAM following the trigger event is equal to the
value stored in this register+1 (from ARM ETB-TRM).
What: /sys/bus/coresight/devices/<memory_map>.etb/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.
What: /sys/bus/coresight/devices/<memory_map>.etb/mgmt/rdp
Date: March 2016
KernelVersion: 4.7


@@ -251,6 +251,12 @@ KernelVersion: 4.4
Contact: Mathieu Poirier <mathieu.poirier@linaro.org>
Description: (RO) Holds the cpu number this tracer is affined to.
What: /sys/bus/coresight/devices/<memory_map>.[etm|ptm]/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.
What: /sys/bus/coresight/devices/<memory_map>.[etm|ptm]/mgmt/etmccr
Date: September 2015
KernelVersion: 4.4


@@ -329,6 +329,12 @@ Contact: Mathieu Poirier <mathieu.poirier@linaro.org>
Description: (RW) Access the selected single show PE comparator control
register.
What: /sys/bus/coresight/devices/etm<N>/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.
What: /sys/bus/coresight/devices/etm<N>/mgmt/trcoslsr
Date: April 2015
KernelVersion: 4.01


@@ -10,3 +10,9 @@ Date: November 2014
KernelVersion: 3.19
Contact: Mathieu Poirier <mathieu.poirier@linaro.org>
Description: (RW) Defines input port priority order.
What: /sys/bus/coresight/devices/<memory_map>.funnel/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -51,3 +51,9 @@ KernelVersion: 4.7
Contact: Mathieu Poirier <mathieu.poirier@linaro.org>
Description: (RW) Holds the trace ID that will appear in the trace stream
coming from this trace entity.
What: /sys/bus/coresight/devices/<memory_map>.stm/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -107,3 +107,9 @@ Contact: Anshuman Khandual <anshuman.khandual@arm.com>
Description: (RW) Current Coresight TMC-ETR buffer mode selected. But user could
only provide a mode which is supported for a given ETR device. This
file is available only for TMC ETR devices.
What: /sys/bus/coresight/devices/<memory_map>.tmc/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -272,3 +272,9 @@ KernelVersion 6.15
Contact: Jinlong Mao (QUIC) <quic_jinlmao@quicinc.com>, Tao Zhang (QUIC) <quic_taozha@quicinc.com>
Description:
(RW) Set/Get the enablement of the individual lane.
What: /sys/bus/coresight/devices/<tpdm-name>/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -12,3 +12,9 @@ Contact: Anshuman Khandual <anshuman.khandual@arm.com>
Description: (Read) Shows if TRBE updates in the memory are with access
and dirty flag updates as well. This value is fetched from
the TRBIDR register.
What: /sys/bus/coresight/devices/trbe<cpu>/label
Date: Aug 2025
KernelVersion 6.18
Contact: Mao Jinlong <quic_jinlmao@quicinc.com>
Description: (Read) Show hardware context information of device.


@@ -309,26 +309,26 @@ Description:
What: /sys/bus/counter/devices/counterX/cascade_counts_enable_component_id
What: /sys/bus/counter/devices/counterX/external_input_phase_clock_select_component_id
What: /sys/bus/counter/devices/counterX/countY/compare_component_id
What: /sys/bus/counter/devices/counterX/countY/capture_component_id
What: /sys/bus/counter/devices/counterX/countY/ceiling_component_id
What: /sys/bus/counter/devices/counterX/countY/floor_component_id
What: /sys/bus/counter/devices/counterX/countY/compare_component_id
What: /sys/bus/counter/devices/counterX/countY/count_mode_component_id
What: /sys/bus/counter/devices/counterX/countY/direction_component_id
What: /sys/bus/counter/devices/counterX/countY/enable_component_id
What: /sys/bus/counter/devices/counterX/countY/error_noise_component_id
What: /sys/bus/counter/devices/counterX/countY/floor_component_id
What: /sys/bus/counter/devices/counterX/countY/num_overflows_component_id
What: /sys/bus/counter/devices/counterX/countY/prescaler_component_id
What: /sys/bus/counter/devices/counterX/countY/preset_component_id
What: /sys/bus/counter/devices/counterX/countY/preset_enable_component_id
What: /sys/bus/counter/devices/counterX/countY/signalZ_action_component_id
What: /sys/bus/counter/devices/counterX/countY/num_overflows_component_id
What: /sys/bus/counter/devices/counterX/signalY/cable_fault_component_id
What: /sys/bus/counter/devices/counterX/signalY/cable_fault_enable_component_id
What: /sys/bus/counter/devices/counterX/signalY/filter_clock_prescaler_component_id
What: /sys/bus/counter/devices/counterX/signalY/frequency_component_id
What: /sys/bus/counter/devices/counterX/signalY/index_polarity_component_id
What: /sys/bus/counter/devices/counterX/signalY/polarity_component_id
What: /sys/bus/counter/devices/counterX/signalY/synchronous_mode_component_id
What: /sys/bus/counter/devices/counterX/signalY/frequency_component_id
KernelVersion: 5.16
Contact: linux-iio@vger.kernel.org
Description:


@@ -0,0 +1,25 @@
What: /sys/bus/event_source/devices/vpa_dtl/format
Date: February 2025
Contact: Linux on PowerPC Developer List <linuxppc-dev at lists.ozlabs.org>
Description: Read-only. Attribute group to describe the magic bits
that go into perf_event_attr.config for a particular pmu.
(See ABI/testing/sysfs-bus-event_source-devices-format).
Each attribute under this group defines a bit range of the
perf_event_attr.config. Supported attribute are listed
below::
event = "config:0-7" - event ID
For example::
dtl_cede = "event=0x1"
What: /sys/bus/event_source/devices/vpa_dtl/events
Date: February 2025
Contact: Linux on PowerPC Developer List <linuxppc-dev at lists.ozlabs.org>
Description: (RO) Attribute group to describe performance monitoring events
for the Virtual Processor Dispatch Trace Log. Each attribute in
this group describes a single performance monitoring event
supported by vpa_dtl pmu. The name of the file is the name of
the event (See ABI/testing/sysfs-bus-event_source-devices-events).
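
To illustrate the format description above, the following Python sketch packs an event ID into a perf_event_attr.config value using the bit range named by a format string such as "config:0-7". The helper name encode_format_field is hypothetical and is not part of the kernel or of the perf tooling. It assumes a single "config:LO-HI" range and nothing more.

```python
import re


def encode_format_field(spec: str, value: int) -> int:
    """Place `value` into the bit range described by a "config:LO-HI" spec.

    Hypothetical helper for illustration only; it handles one contiguous
    range in perf_event_attr.config.
    """
    m = re.fullmatch(r"config:(\d+)-(\d+)", spec)
    if m is None:
        raise ValueError(f"unsupported format spec: {spec!r}")
    lo, hi = int(m.group(1)), int(m.group(2))
    if lo > hi:
        raise ValueError(f"invalid bit range in {spec!r}")
    width = hi - lo + 1
    if not 0 <= value < (1 << width):
        raise ValueError(f"value {value:#x} does not fit in {width} bits")
    # Shift the value so its least significant bit lands on bit `lo`.
    return value << lo


# The entry above lists dtl_cede as "event=0x1" with event = "config:0-7".
config = encode_format_field("config:0-7", 0x1)
print(hex(config))
```

Rejecting out-of-range values instead of masking them silently is a design choice of this sketch, so that an oversized event ID is noticed early.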


@@ -0,0 +1,100 @@
What: /sys/bus/i2c/devices/<busnum>-<primary-addr>/unlock
Date: 2025-07-04
KernelVersion: 6.17
Contact: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Description:
Write-only attribute used to present a password and unlock
access to protected areas of the M24LR chip, including
configuration registers such as the Sector Security Status
(SSS) bytes. A valid password must be written to enable write
access to these regions via the I2C interface.
Format:
- Hexadecimal string representing a 32-bit (4-byte) password
- Accepts 1 to 8 hex digits (e.g., "c", "1F", "a1b2c3d4")
- No "0x" prefix, whitespace, or trailing newline
- Case-insensitive
Behavior:
- If the password matches the internal stored value,
access to protected memory/configuration is granted
- If the password does not match the internally stored value,
it will fail silently
What: /sys/bus/i2c/devices/<busnum>-<primary-addr>/new_pass
Date: 2025-07-04
KernelVersion: 6.17
Contact: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Description:
Write-only attribute used to update the password required to
unlock the M24LR chip.
Format:
- Hexadecimal string representing a new 32-bit password
- Accepts 1 to 8 hex digits (e.g., "1A", "ffff", "c0ffee00")
- No "0x" prefix, whitespace, or trailing newline
- Case-insensitive
Behavior:
- Overwrites the current password stored in the I2C password
register
- Requires the device to be unlocked before changing the
password
- If the device is locked, the write silently fails
What: /sys/bus/i2c/devices/<busnum>-<primary-addr>/uid
Date: 2025-07-04
KernelVersion: 6.17
Contact: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Description:
Read-only attribute that exposes the 8-byte unique identifier
programmed into the M24LR chip at the factory.
Format:
- Lowercase hexadecimal string representing a 64-bit value
- 1 to 16 hex digits (e.g., "e00204f12345678")
- No "0x" prefix
- Includes a trailing newline
What: /sys/bus/i2c/devices/<busnum>-<primary-addr>/total_sectors
Date: 2025-07-04
KernelVersion: 6.17
Contact: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Description:
Read-only attribute that exposes the total number of EEPROM
sectors available in the M24LR chip.
Format:
- 1 to 2 hex digits (e.g. "F")
- No "0x" prefix
- Includes a trailing newline
Notes:
- Value is encoded by the chip and corresponds to the EEPROM
size (e.g., 3 = 4 kbit for M24LR04E-R)
What: /sys/bus/i2c/devices/<busnum>-<primary-addr>/sss
Date: 2025-07-04
KernelVersion: 6.17
Contact: Abd-Alrhman Masalkhi <abd.masalkhi@gmail.com>
Description:
Read/write binary attribute representing the Sector Security
Status (SSS) bytes for all EEPROM sectors in STMicroelectronics
M24LR chips.
Each EEPROM sector has one SSS byte, which controls I2C and
RF access through protection bits and optional password
authentication.
Format:
- The file contains one byte per EEPROM sector
- Byte at offset N corresponds to sector N
- Binary access only; use tools like dd, Python, or C that
support byte-level I/O and offset control.
Notes:
- The number of valid bytes in this file is equal to the
value exposed by 'total_sectors' file
- Write access requires prior password authentication in
I2C mode
- Refer to the M24LR datasheet for full SSS bit layout
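
The password format described for the unlock and new_pass attributes above (1 to 8 hex digits, no "0x" prefix, no whitespace or trailing newline, case-insensitive) can be checked in user space before writing. This is a minimal Python sketch under those stated rules; the function names are hypothetical and the code does not touch any sysfs file.

```python
import re

# 1 to 8 hexadecimal digits, either case, nothing else.
_HEX_1_TO_8 = re.compile(r"[0-9A-Fa-f]{1,8}")


def is_valid_m24lr_password(text: str) -> bool:
    """Return True if `text` matches the documented password format."""
    return _HEX_1_TO_8.fullmatch(text) is not None


def format_m24lr_password(value: int) -> str:
    """Render a 32-bit password as hex digits without a "0x" prefix."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("password must fit in 32 bits")
    return format(value, "x")


print(is_valid_m24lr_password("a1b2c3d4"))
print(is_valid_m24lr_password("0x1F"))
print(format_m24lr_password(0xC0FFEE00))
```

When the resulting string is written to the attribute, it should be written as-is, since the format above excludes a trailing newline.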


@@ -167,7 +167,18 @@ Description:
is required is a consistent labeling. Units after application
of scale and offset are millivolts.
What: /sys/bus/iio/devices/iio:deviceX/in_altvoltageY_rms_raw
KernelVersion: 6.18
Contact: linux-iio@vger.kernel.org
Description:
Raw (unscaled) Root Mean Square (RMS) voltage measurement from
channel Y. Units after application of scale and offset are
millivolts.
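
As a sketch of what "after application of scale and offset" means for a raw reading, the following Python function applies the usual IIO convention of adding the offset to the raw value and then multiplying by the scale. The numeric values are invented for illustration, and whether a given driver exposes matching scale and offset attributes for this channel is an assumption.

```python
def iio_processed(raw: int, scale: float, offset: float = 0.0) -> float:
    """Convert a raw IIO reading to its documented unit.

    Uses the IIO convention: processed = (raw + offset) * scale.
    For in_altvoltageY_rms_raw the result is in millivolts.
    """
    return (raw + offset) * scale


# Hypothetical values: raw count 1000, scale 0.5 mV per count, no offset.
print(iio_processed(1000, 0.5))
```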
What: /sys/bus/iio/devices/iio:deviceX/in_powerY_raw
What: /sys/bus/iio/devices/iio:deviceX/in_powerY_active_raw
What: /sys/bus/iio/devices/iio:deviceX/in_powerY_reactive_raw
What: /sys/bus/iio/devices/iio:deviceX/in_powerY_apparent_raw
KernelVersion: 4.5
Contact: linux-iio@vger.kernel.org
Description:
@@ -176,6 +187,13 @@ Description:
unique to allow association with event codes. Units after
application of scale and offset are milliwatts.
What: /sys/bus/iio/devices/iio:deviceX/in_powerY_powerfactor
KernelVersion: 6.18
Contact: linux-iio@vger.kernel.org
Description:
Power factor measurement from channel Y. Power factor is the
ratio of active power to apparent power. The value is unitless.
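
The ratio stated above can be written out directly. This Python sketch assumes both inputs are already scaled to the same unit (for example milliwatts); the function name is hypothetical.

```python
def power_factor(active: float, apparent: float) -> float:
    """Return active power divided by apparent power (unitless)."""
    if apparent == 0:
        raise ValueError("apparent power must be non-zero")
    return active / apparent


# Hypothetical readings: 800 mW active, 1000 mW apparent.
print(power_factor(800.0, 1000.0))
```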
What: /sys/bus/iio/devices/iio:deviceX/in_capacitanceY_raw
KernelVersion: 3.2
Contact: linux-iio@vger.kernel.org
@@ -1569,6 +1587,9 @@ Description:
What: /sys/.../iio:deviceX/in_energy_input
What: /sys/.../iio:deviceX/in_energy_raw
What: /sys/.../iio:deviceX/in_energyY_active_raw
What: /sys/.../iio:deviceX/in_energyY_reactive_raw
What: /sys/.../iio:deviceX/in_energyY_apparent_raw
KernelVersion: 4.0
Contact: linux-iio@vger.kernel.org
Description:
@@ -1707,6 +1728,14 @@ Description:
component of the signal while the 'q' channel contains the quadrature
component.
What: /sys/bus/iio/devices/iio:deviceX/in_altcurrentY_rms_raw
KernelVersion: 6.18
Contact: linux-iio@vger.kernel.org
Description:
Raw (unscaled, no bias removal etc.) Root Mean Square (RMS) current
measurement from channel Y. Units after application of scale and
offset are milliamps.
What: /sys/.../iio:deviceX/in_energy_en
What: /sys/.../iio:deviceX/in_distance_en
What: /sys/.../iio:deviceX/in_velocity_sqrt(x^2+y^2+z^2)_en
@@ -2281,21 +2310,28 @@ Description:
conversion time. Poor noise performance.
* "sinc3" - The digital sinc3 filter. Moderate 1st
conversion time. Good noise performance.
* "sinc4" - Sinc 4. Excellent noise performance. Long
1st conversion time.
* "sinc5" - The digital sinc5 filter. Excellent noise
performance
* "sinc4+sinc1" - Sinc4 + averaging by 8. Low 1st conversion
time.
* "sinc3+rej60" - Sinc3 + 60Hz rejection.
* "sinc3+sinc1" - Sinc3 + averaging by 8. Low 1st conversion
time.
* "sinc3+pf1" - Sinc3 + device specific Post Filter 1.
* "sinc3+pf2" - Sinc3 + device specific Post Filter 2.
* "sinc3+pf3" - Sinc3 + device specific Post Filter 3.
* "sinc3+pf4" - Sinc3 + device specific Post Filter 4.
* "sinc5+pf1" - Sinc5 + device specific Post Filter 1.
* "sinc3+rej60" - Sinc3 + 60Hz rejection.
* "sinc3+sinc1" - Sinc3 + averaging by 8. Low 1st conversion
time.
* "sinc4" - Sinc 4. Excellent noise performance. Long
1st conversion time.
* "sinc4+lp" - Sinc4 + Low Pass Filter.
* "sinc4+sinc1" - Sinc4 + averaging by 8. Low 1st conversion
time.
* "sinc4+rej60" - Sinc4 + 60Hz rejection.
* "sinc5" - The digital sinc5 filter. Excellent noise
performance
* "sinc5+avg" - Sinc5 + averaging by 4.
* "sinc5+pf1" - Sinc5 + device specific Post Filter 1.
* "sinc5+sinc1" - Sinc5 + Sinc1.
* "sinc5+sinc1+pf1" - Sinc5 + Sinc1 + device specific Post Filter 1.
* "sinc5+sinc1+pf2" - Sinc5 + Sinc1 + device specific Post Filter 2.
* "sinc5+sinc1+pf3" - Sinc5 + Sinc1 + device specific Post Filter 3.
* "sinc5+sinc1+pf4" - Sinc5 + Sinc1 + device specific Post Filter 4.
* "wideband" - filter with wideband low ripple passband
and sharp transition band.

View File

@@ -7,16 +7,6 @@ Description:
corresponding calibration offsets can be read from `*_calibbias`
entries.
What: /sys/bus/iio/devices/iio:deviceX/location
Date: July 2015
KernelVersion: 4.7
Contact: linux-iio@vger.kernel.org
Description:
This attribute returns a string with the physical location where
the motion sensor is placed. For example, in a laptop a motion
sensor can be located on the base or on the lid. Current valid
values are 'base' and 'lid'.
What: /sys/bus/iio/devices/iio:deviceX/id
Date: September 2017
KernelVersion: 4.14

View File

@@ -612,3 +612,12 @@ Description:
# ls doe_features
0001:01 0001:02 doe_discovery
What: /sys/bus/pci/devices/.../serial_number
Date: December 2025
Contact: Matthew Wood <thepacketgeek@gmail.com>
Description:
This is visible only for PCI devices that support the serial
number extended capability. The file is read only and due to
the possible sensitivity of accessible serial numbers, admin
only.

View File

@@ -0,0 +1,8 @@
What: /sys/class/drm/.../boot_display
Date: January 2026
Contact: Linux DRI developers <dri-devel@vger.kernel.org>
Description:
This file indicates that displays connected to the device were
used to display the boot sequence. If a display connected to
the device was used to display the boot sequence the file will
be present and contain "1".

View File

@@ -553,6 +553,43 @@ Description:
Integer > 0: representing full cycles
Integer = 0: cycle_count info is not available
What: /sys/class/power_supply/<supply_name>/internal_resistance
Date: August 2025
Contact: linux-arm-msm@vger.kernel.org
Description:
Represents the battery's internal resistance, often referred
to as Equivalent Series Resistance (ESR). It is a dynamic
parameter that reflects the opposition to current flow within
the cell. It is not a fixed value but varies significantly
based on several operational conditions, including battery
state of charge (SoC), temperature, and whether the battery
is in a charging or discharging state.
Access: Read
Valid values: Represented in microohms
What: /sys/class/power_supply/<supply_name>/state_of_health
Date: August 2025
Contact: linux-arm-msm@vger.kernel.org
Description:
The state_of_health parameter quantifies the overall condition
of a battery as a percentage, reflecting its ability to deliver
rated performance relative to its original specifications. It is
dynamically computed using a combination of learned capacity
and impedance-based degradation indicators, both of which evolve
over the battery's lifecycle.
Note that the exact algorithms are kept secret by most battery
vendors and the value from different battery vendors cannot be
compared with each other as there is no vendor-agnostic definition
of "performance". Also this usually cannot be used for any
calculations (i.e. this is not the factor between charge_full and
charge_full_design).
Access: Read
Valid values: 0 - 100 (percent)
**USB Properties**
What: /sys/class/power_supply/<supply_name>/input_current_limit

View File

@@ -274,15 +274,15 @@ What: /sys/devices/.../power/runtime_active_time
Date: Jul 2010
Contact: Arjan van de Ven <arjan@linux.intel.com>
Description:
Reports the total time that the device has been active.
Used for runtime PM statistics.
Reports the total time that the device has been active, in
milliseconds. Used for runtime PM statistics.
What: /sys/devices/.../power/runtime_suspended_time
Date: Jul 2010
Contact: Arjan van de Ven <arjan@linux.intel.com>
Description:
Reports total time that the device has been suspended.
Used for runtime PM statistics.
Reports total time that the device has been suspended, in
milliseconds. Used for runtime PM statistics.
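
Because both counters are reported in milliseconds, they can be combined directly. The sketch below is a hypothetical userspace helper that takes the text read from the two attributes and computes the fraction of time the device was active; reading the files themselves is left to the caller.

```python
def parse_ms(text):
    """Parse the content of a runtime PM time attribute (milliseconds)."""
    return int(text.strip())


def active_ratio(active_ms, suspended_ms):
    """Fraction of accounted time spent active, or None if nothing is accounted."""
    total = active_ms + suspended_ms
    if total == 0:
        return None
    return active_ms / total
```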
What: /sys/devices/.../power/runtime_usage
Date: Apr 2010

View File

@@ -586,6 +586,7 @@ What: /sys/devices/system/cpu/vulnerabilities
/sys/devices/system/cpu/vulnerabilities/srbds
/sys/devices/system/cpu/vulnerabilities/tsa
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/vmscape
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities

View File

@@ -0,0 +1,8 @@
What: /sys/bus/platform/devices/xxx/version
Date: Sep 2025
Contact: netdev@vger.kernel.org
Description: Reports the version of the PEF2256 framer
Access: Read
Valid values: Represented as string

View File

@@ -822,8 +822,8 @@ What: /sys/fs/f2fs/<disk>/gc_valid_thresh_ratio
Date: September 2024
Contact: "Daeho Jeong" <daehojeong@google.com>
Description: It controls the valid block ratio threshold not to trigger excessive GC
for zoned deivces. The initial value of it is 95(%). F2FS will stop the
background GC thread from intiating GC for sections having valid blocks
for zoned devices. The initial value of it is 95(%). F2FS will stop the
background GC thread from initiating GC for sections having valid blocks
exceeding the ratio.
What: /sys/fs/f2fs/<disk>/max_read_extent_count
@@ -847,7 +847,7 @@ Description: For several zoned storage devices, vendors will provide extra space
filesystem level GC. To do that, we can reserve the space using
reserved_blocks. However, it is not enough, since this extra space should
not be shown to users. So, with this new sysfs node, we can hide the space
by substracting reserved_blocks from total bytes.
by subtracting reserved_blocks from total bytes.
What: /sys/fs/f2fs/<disk>/encoding_flags
Date: April 2025
@@ -883,3 +883,53 @@ Date: June 2025
Contact: "Daeho Jeong" <daehojeong@google.com>
Description: Control GC algorithm for boost GC. 0: cost benefit, 1: greedy
Default: 1
What: /sys/fs/f2fs/<disk>/effective_lookup_mode
Date: August 2025
Contact: "Daniel Lee" <chullee@google.com>
Description:
This is a read-only entry to show the effective directory lookup mode
F2FS is currently using for casefolded directories.
This considers both the "lookup_mode" mount option and the on-disk
encoding flag, SB_ENC_NO_COMPAT_FALLBACK_FL.
Possible values are:
- "perf": Hash-only lookup.
- "compat": Hash-based lookup with a linear search fallback enabled
- "auto:perf": lookup_mode is auto and fallback is disabled on-disk
- "auto:compat": lookup_mode is auto and fallback is enabled on-disk
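
The four possible values can be modelled as a function of the mount option and the on-disk flag. This is a sketch of the mapping as listed above, not F2FS code. It assumes that an explicit "perf" or "compat" mount option is reported unchanged, and it represents SB_ENC_NO_COMPAT_FALLBACK_FL as a boolean that is true when the fallback is disabled on-disk.

```python
def effective_lookup_mode(lookup_mode, fallback_disabled_on_disk):
    """Model the value shown by effective_lookup_mode."""
    if lookup_mode in ("perf", "compat"):
        # Assumption: an explicit mount option is reported as-is.
        return lookup_mode
    if lookup_mode == "auto":
        # In auto mode, the on-disk flag decides the effective behaviour.
        return "auto:perf" if fallback_disabled_on_disk else "auto:compat"
    raise ValueError(f"unknown lookup_mode: {lookup_mode!r}")
```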
What: /sys/fs/f2fs/<disk>/bggc_io_aware
Date: August 2025
Contact: "Liao Yuanhong" <liaoyuanhong@vivo.com>
Description: Used to adjust the BG_GC priority when IO is pending, with a default value
of 0. For ZUFS, the default value is 1.
================== ======================================================
value description
bggc_io_aware = 0 skip background GC if there is any kind of pending IO
bggc_io_aware = 1 skip background GC if there is pending read IO
bggc_io_aware = 2 ignore pending IO for background GC
================== ======================================================
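
The table can be read as a small decision function. The sketch below is a model of the documented behaviour, not the F2FS implementation. It assumes that "any kind of pending IO" means pending read or write IO, and the function and parameter names are hypothetical.

```python
def skip_background_gc(bggc_io_aware, pending_read_io, pending_write_io):
    """Return True if background GC would be skipped for the given setting."""
    if bggc_io_aware == 0:
        # Skip on any kind of pending IO (modelled here as read or write).
        return pending_read_io or pending_write_io
    if bggc_io_aware == 1:
        # Skip only when read IO is pending.
        return pending_read_io
    if bggc_io_aware == 2:
        # Pending IO is not taken into account.
        return False
    raise ValueError(f"unsupported bggc_io_aware value: {bggc_io_aware}")
```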
What: /sys/fs/f2fs/<disk>/allocate_section_hint
Date: August 2025
Contact: "Liao Yuanhong" <liaoyuanhong@vivo.com>
Description: Indicates the hint section between the first device and others in a multi-device
setup. It defaults to the end of the first device in sections. For a single storage
device, it defaults to the total number of sections. It can be manually set to match
scenarios where multiple devices are mapped to the same dm device.
What: /sys/fs/f2fs/<disk>/allocate_section_policy
Date: August 2025
Contact: "Liao Yuanhong" <liaoyuanhong@vivo.com>
Description: Controls write priority in multi-device setups. A value of 0 means normal writing.
A value of 1 prioritizes writing to devices before the allocate_section_hint. A value of 2
prioritizes writing to devices after the allocate_section_hint. The default is 0.
=========================== ==========================================================
value description
allocate_section_policy = 0 Normal writing
allocate_section_policy = 1 Prioritize writing to section before allocate_section_hint
allocate_section_policy = 2 Prioritize writing to section after allocate_section_hint
=========================== ==========================================================

View File

@@ -77,6 +77,13 @@ Description: Writing a keyword for a monitoring operations set ('vaddr' for
Note that only the operations sets that listed in
'avail_operations' file are valid inputs.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/addr_unit
Date: Aug 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing an integer to this file sets the 'address unit'
parameter of the given operations set of the context. Reading
the file returns the last-written 'address unit' value.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/monitoring_attrs/intervals/sample_us
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>

View File

@@ -60,8 +60,8 @@ ifeq ($(HAVE_LATEXMK),1)
endif #HAVE_LATEXMK
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
PAPEROPT_a4 = -D latex_elements.papersize=a4paper
PAPEROPT_letter = -D latex_elements.papersize=letterpaper
ALLSPHINXOPTS = -D kerneldoc_srctree=$(srctree) -D kerneldoc_bin=$(KERNELDOC)
ALLSPHINXOPTS += $(PAPEROPT_$(PAPER)) $(SPHINXOPTS)
ifneq ($(wildcard $(srctree)/.config),)
@@ -87,7 +87,7 @@ loop_cmd = $(echo-cmd) $(cmd_$(1)) || exit;
PYTHONPYCACHEPREFIX ?= $(abspath $(BUILDDIR)/__pycache__)
quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
cmd_sphinx = $(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media $2 && \
cmd_sphinx = \
PYTHONPYCACHEPREFIX="$(PYTHONPYCACHEPREFIX)" \
BUILDDIR=$(abspath $(BUILDDIR)) SPHINX_CONF=$(abspath $(src)/$5/$(SPHINX_CONF)) \
$(PYTHON3) $(srctree)/scripts/jobserver-exec \
@@ -104,26 +104,13 @@ quiet_cmd_sphinx = SPHINX $@ --> file://$(abspath $(BUILDDIR)/$3/$4)
cp $(if $(patsubst /%,,$(DOCS_CSS)),$(abspath $(srctree)/$(DOCS_CSS)),$(DOCS_CSS)) $(BUILDDIR)/$3/_static/; \
fi
YNL_INDEX:=$(srctree)/Documentation/networking/netlink_spec/index.rst
YNL_RST_DIR:=$(srctree)/Documentation/networking/netlink_spec
YNL_YAML_DIR:=$(srctree)/Documentation/netlink/specs
YNL_TOOL:=$(srctree)/tools/net/ynl/pyynl/ynl_gen_rst.py
YNL_RST_FILES_TMP := $(patsubst %.yaml,%.rst,$(wildcard $(YNL_YAML_DIR)/*.yaml))
YNL_RST_FILES := $(patsubst $(YNL_YAML_DIR)%,$(YNL_RST_DIR)%, $(YNL_RST_FILES_TMP))
$(YNL_INDEX): $(YNL_RST_FILES)
$(Q)$(YNL_TOOL) -o $@ -x
$(YNL_RST_DIR)/%.rst: $(YNL_YAML_DIR)/%.yaml $(YNL_TOOL)
$(Q)$(YNL_TOOL) -i $< -o $@
htmldocs texinfodocs latexdocs epubdocs xmldocs: $(YNL_INDEX)
htmldocs:
@$(srctree)/scripts/sphinx-pre-install --version-check
@+$(foreach var,$(SPHINXDIRS),$(call loop_cmd,sphinx,html,$(var),,$(var)))
htmldocs-redirects: $(srctree)/Documentation/.renames.txt
@tools/docs/gen-redirects.py --output $(BUILDDIR) < $<
# If Rust support is available and .config exists, add rustdoc generated contents.
# If there are any, the errors from this make rustdoc will be displayed but
# won't stop the execution of htmldocs
@@ -186,13 +173,12 @@ refcheckdocs:
$(Q)cd $(srctree);scripts/documentation-file-ref-check
cleandocs:
$(Q)rm -f $(YNL_INDEX) $(YNL_RST_FILES)
$(Q)rm -rf $(BUILDDIR)
$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/userspace-api/media clean
dochelp:
@echo ' Linux kernel internal documentation in different formats from ReST:'
@echo ' htmldocs - HTML'
@echo ' htmldocs-redirects - generate HTML redirects for moved pages'
@echo ' texinfodocs - Texinfo'
@echo ' infodocs - Info'
@echo ' latexdocs - LaTeX'

View File

@@ -86,7 +86,7 @@ The <EPF Device> directory can have a list of symbolic links
be created by the user to represent the virtual functions that are bound to
the physical function. In the above directory structure <EPF Device 11> is a
physical function and <EPF Device 31> is a virtual function. An EPF device once
it's linked to another EPF device, cannot be linked to a EPC device.
it's linked to another EPF device, cannot be linked to an EPC device.
EPC Device
==========
@@ -108,7 +108,7 @@ entries corresponding to EPC device will be created by the EPC core.
The <EPC Device> directory will have a list of symbolic links to
<EPF Device>. These symbolic links should be created by the user to
represent the functions present in the endpoint device. Only <EPF Device>
that represents a physical function can be linked to a EPC device.
that represents a physical function can be linked to an EPC device.
The <EPC Device> directory will also have a *start* field. Once
"1" is written to this field, the endpoint device will be ready to

View File

@@ -197,8 +197,8 @@ by the PCI endpoint function driver.
* pci_epf_register_driver()
The PCI Endpoint Function driver should implement the following ops:
* bind: ops to perform when a EPC device has been bound to EPF device
* unbind: ops to perform when a binding has been lost between a EPC
* bind: ops to perform when an EPC device has been bound to EPF device
* unbind: ops to perform when a binding has been lost between an EPC
device and EPF device
* add_cfs: optional ops to create function specific configfs
attributes
@@ -251,7 +251,7 @@ pci-ep-cfs.c can be used as reference for using these APIs.
* pci_epf_bind()
pci_epf_bind() should be invoked when the EPF device has been bound to
a EPC device.
an EPC device.
* pci_epf_unbind()

View File

@@ -90,8 +90,9 @@ of the function device and is populated with the following NTB specific
attributes that can be configured by the user::
# ls functions/pci_epf_vntb/func1/pci_epf_vntb.0/
db_count mw1 mw2 mw3 mw4 num_mws
spad_count
ctrl_bar db_count mw1_bar mw2_bar mw3_bar mw4_bar spad_count
db_bar mw1 mw2 mw3 mw4 num_mws vbus_number
vntb_vid vntb_pid
A sample configuration for NTB function is given below::
@@ -100,6 +101,10 @@ A sample configuration for NTB function is given below::
# echo 1 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
# echo 0x100000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
By default, each construct is assigned a BAR, as needed and in order.
Should a specific BAR setup be required by the platform, BAR may be assigned
to each construct using the related ``XYZ_bar`` entry.
A sample configuration for virtual NTB driver for virtual PCI bus::
# echo 0x1957 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid

View File

@@ -13,7 +13,7 @@ PCI Error Recovery
Many PCI bus controllers are able to detect a variety of hardware
PCI errors on the bus, such as parity errors on the data and address
buses, as well as SERR and PERR errors. Some of the more advanced
chipsets are able to deal with these errors; these include PCI-E chipsets,
chipsets are able to deal with these errors; these include PCIe chipsets,
and the PCI-host bridges found on IBM Power4, Power5 and Power6-based
pSeries boxes. A typical action taken is to disconnect the affected device,
halting all I/O to it. The goal of a disconnection is to avoid system
@@ -108,8 +108,8 @@ A driver does not have to implement all of these callbacks; however,
if it implements any, it must implement error_detected(). If a callback
is not implemented, the corresponding feature is considered unsupported.
For example, if mmio_enabled() and resume() aren't there, then it
is assumed that the driver is not doing any direct recovery and requires
a slot reset. Typically a driver will want to know about
is assumed that the driver does not need these callbacks
for recovery. Typically a driver will want to know about
a slot_reset().
The actual steps taken by a platform to recover from a PCI error
@@ -122,6 +122,10 @@ A PCI bus error is detected by the PCI hardware. On powerpc, the slot
is isolated, in that all I/O is blocked: all reads return 0xffffffff,
all writes are ignored.
Similarly, on platforms supporting Downstream Port Containment
(PCIe r7.0 sec 6.2.11), the link to the sub-hierarchy with the
faulting device is disabled. Any device in the sub-hierarchy
becomes inaccessible.
STEP 1: Notification
--------------------
@@ -141,6 +145,9 @@ shouldn't do any new IOs. Called in task context. This is sort of a
All drivers participating in this system must implement this call.
The driver must return one of the following result codes:
- PCI_ERS_RESULT_RECOVERED
Driver returns this if it thinks the device is usable despite
the error and does not need further intervention.
- PCI_ERS_RESULT_CAN_RECOVER
Driver returns this if it thinks it might be able to recover
the HW by just banging IOs or if it wants to be given
@@ -199,7 +206,25 @@ reset or some such, but not restart operations. This callback is made if
all drivers on a segment agree that they can try to recover and if no automatic
link reset was performed by the HW. If the platform can't just re-enable IOs
without a slot reset or a link reset, it will not call this callback, and
instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset)
instead will have gone directly to STEP 3 (Link Reset) or STEP 4 (Slot Reset).
.. note::
On platforms supporting Advanced Error Reporting (PCIe r7.0 sec 6.2),
the faulting device may already be accessible in STEP 1 (Notification).
Drivers should nevertheless defer accesses to STEP 2 (MMIO Enabled)
to be compatible with EEH on powerpc and with s390 (where devices are
inaccessible until STEP 2).
On platforms supporting Downstream Port Containment, the link to the
sub-hierarchy with the faulting device is re-enabled in STEP 3 (Link
Reset). Hence devices in the sub-hierarchy are inaccessible until
STEP 4 (Slot Reset).
For errors such as Surprise Down (PCIe r7.0 sec 6.2.7), the device
may not even be accessible in STEP 4 (Slot Reset). Drivers can detect
accessibility by checking whether reads from the device return all 1's
(PCI_POSSIBLE_ERROR()).
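
The all-1's check mentioned above can be modelled outside the kernel as follows. This is a Python illustration of the idea only: a driver would use the C macro PCI_POSSIBLE_ERROR() on the value it read, and `possible_error` with its `width_bits` parameter is a hypothetical stand-in.

```python
def possible_error(value, width_bits=32):
    """Return True if `value` is all 1's for the given register width."""
    mask = (1 << width_bits) - 1
    return (value & mask) == mask


# A 32-bit read from an inaccessible device returns 0xffffffff.
device_unreachable = possible_error(0xFFFFFFFF)
```

An all-1's result only suggests an error, since a register may legitimately hold that value.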
.. note::
@@ -234,14 +259,14 @@ The driver should return one of the following result codes:
The next step taken depends on the results returned by the drivers.
If all drivers returned PCI_ERS_RESULT_RECOVERED, then the platform
proceeds to either STEP3 (Link Reset) or to STEP 5 (Resume Operations).
proceeds to either STEP 3 (Link Reset) or to STEP 5 (Resume Operations).
If any driver returned PCI_ERS_RESULT_NEED_RESET, then the platform
proceeds to STEP 4 (Slot Reset).
STEP 3: Link Reset
------------------
The platform resets the link. This is a PCI-Express specific step
The platform resets the link. This is a PCIe specific step
and is done whenever a fatal error has been detected that can be
"solved" by resetting the link.
@@ -263,13 +288,13 @@ that is equivalent to what it would be after a fresh system
power-on followed by power-on BIOS/system firmware initialization.
Soft reset is also known as hot-reset.
Powerpc fundamental reset is supported by PCI Express cards only
Powerpc fundamental reset is supported by PCIe cards only
and results in device's state machines, hardware logic, port states and
configuration registers to initialize to their default conditions.
For most PCI devices, a soft reset will be sufficient for recovery.
Optional fundamental reset is provided to support a limited number
of PCI Express devices for which a soft reset is not sufficient
of PCIe devices for which a soft reset is not sufficient
for recovery.
If the platform supports PCI hotplug, then the reset might be
@@ -313,7 +338,7 @@ Result codes:
- PCI_ERS_RESULT_DISCONNECT
Same as above.
Drivers for PCI Express cards that require a fundamental reset must
Drivers for PCIe cards that require a fundamental reset must
set the needs_freset bit in the pci_dev structure in their probe function.
For example, the QLogic qla2xxx driver sets the needs_freset bit for certain
PCI card types::

View File

@@ -70,16 +70,16 @@ AER error output
----------------
When a PCIe AER error is captured, an error message will be output to
console. If it's a correctable error, it is output as an info message.
console. If it's a correctable error, it is output as a warning message.
Otherwise, it is printed as an error. So users can choose a different
log level to filter out correctable error messages.
Below shows an example::
0000:50:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, id=0500(Requester ID)
0000:50:00.0: PCIe Bus Error: severity=Uncorrectable (Fatal), type=Transaction Layer, (Requester ID)
0000:50:00.0: device [8086:0329] error status/mask=00100000/00000000
0000:50:00.0: [20] Unsupported Request (First)
0000:50:00.0: TLP Header: 04000001 00200a03 05010000 00050100
0000:50:00.0: [20] UnsupReq (First)
0000:50:00.0: TLP Header: 0x04000001 0x00200a03 0x05010000 0x00050100
In the example, 'Requester ID' means the ID of the device that sent
the error message to the Root Port. Please refer to PCIe specs for other
@@ -138,7 +138,7 @@ error message to the Root Port above it when it captures
an error. The Root Port, upon receiving an error reporting message,
internally processes and logs the error message in its AER
Capability structure. Error information being logged includes storing
the error reporting agent's requestor ID into the Error Source
the error reporting agent's Requester ID into the Error Source
Identification Registers and setting the error bits of the Root Error
Status Register accordingly. If AER error reporting is enabled in the Root
Error Command Register, the Root Port generates an interrupt when an
@@ -152,18 +152,6 @@ the device driver.
Provide callbacks
-----------------
callback reset_link to reset PCIe link
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This callback is used to reset the PCIe physical link when a
fatal error happens. The Root Port AER service driver provides a
default reset_link function, but different Upstream Ports might
have different specifications to reset the PCIe link, so
Upstream Port drivers may provide their own reset_link functions.
Section 3.2.2.2 provides more detailed info on when to call
reset_link.
PCI error-recovery callbacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -174,8 +162,8 @@ when performing error recovery actions.
Data struct pci_driver has a pointer, err_handler, to point to
pci_error_handlers, which consists of a couple of callback function
pointers. The AER driver follows the rules defined in
pci-error-recovery.rst except PCIe-specific parts (e.g.
reset_link). Please refer to pci-error-recovery.rst for detailed
pci-error-recovery.rst except PCIe-specific parts (see
below). Please refer to pci-error-recovery.rst for detailed
definitions of the callbacks.
The sections below specify when to call the error callback functions.
@@ -189,10 +177,21 @@ software intervention or any loss of data. These errors do not
require any recovery actions. The AER driver clears the device's
correctable error status register accordingly and logs these errors.
Non-correctable (non-fatal and fatal) errors
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Uncorrectable (non-fatal and fatal) errors
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If an error message indicates a non-fatal error, performing link reset
The AER driver performs a Secondary Bus Reset to recover from
uncorrectable errors. The reset is applied at the port above
the originating device: If the originating device is an Endpoint,
only the Endpoint is reset. If on the other hand the originating
device has subordinate devices, those are all affected by the
reset as well.
If the originating device is a Root Complex Integrated Endpoint,
there's no port above where a Secondary Bus Reset could be applied.
In this case, the AER driver instead applies a Function Level Reset.
If an error message indicates a non-fatal error, performing a reset
at upstream is not required. The AER driver calls error_detected(dev,
pci_channel_io_normal) to all drivers associated within a hierarchy in
question. For example::
@@ -204,38 +203,34 @@ Downstream Port B and Endpoint.
A driver may return PCI_ERS_RESULT_CAN_RECOVER,
PCI_ERS_RESULT_DISCONNECT, or PCI_ERS_RESULT_NEED_RESET, depending on
whether it can recover or the AER driver calls mmio_enabled as next.
whether it can recover without a reset, considers the device unrecoverable
or needs a reset for recovery. If all affected drivers agree that they can
recover without a reset, it is skipped. Should one driver request a reset,
it overrides all other drivers.
If an error message indicates a fatal error, kernel will broadcast
error_detected(dev, pci_channel_io_frozen) to all drivers within
a hierarchy in question. Then, performing link reset at upstream is
necessary. As different kinds of devices might use different approaches
to reset link, AER port service driver is required to provide the
function to reset link via callback parameter of pcie_do_recovery()
function. If reset_link is not NULL, recovery function will use it
to reset the link. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER
and reset_link returns PCI_ERS_RESULT_RECOVERED, the error handling goes
to mmio_enabled.
a hierarchy in question. Then, performing a reset at upstream is
necessary. If error_detected returns PCI_ERS_RESULT_CAN_RECOVER
to indicate that recovery without a reset is possible, the error
handling goes to mmio_enabled, but afterwards a reset is still
performed.
Frequent Asked Questions
------------------------
In other words, for non-fatal errors, drivers may opt in to a reset.
But for fatal errors, they cannot opt out of a reset, based on the
assumption that the link is unreliable.
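
The opt-in/opt-out rule can be summarised in a short model. This is a simplified sketch of the decision described above, not the AER driver's code. The vote strings stand in for the PCI_ERS_RESULT_* codes returned by error_detected(), and handling of PCI_ERS_RESULT_DISCONNECT is deliberately left out.

```python
def reset_performed(fatal, driver_votes):
    """Return True if a reset is performed for this error.

    `driver_votes` holds one of "CAN_RECOVER" or "NEED_RESET" per driver.
    """
    if fatal:
        # Fatal errors: drivers cannot opt out of the reset.
        return True
    # Non-fatal errors: one driver requesting a reset overrides the others.
    return "NEED_RESET" in driver_votes
```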
Frequently Asked Questions
--------------------------
Q:
What happens if a PCIe device driver does not provide an
error recovery handler (pci_driver->err_handler is equal to NULL)?
A:
The devices attached with the driver won't be recovered. If the
error is fatal, kernel will print out warning messages. Please refer
to section 3 for more information.
Q:
What happens if an upstream port service driver does not provide
callback reset_link?
A:
Fatal error recovery will fail if the errors are reported by the
upstream ports who are attached by the service driver.
The devices attached with the driver won't be recovered.
The kernel will print out informational messages to identify
unrecoverable devices.
Software error injection

View File

@@ -1973,9 +1973,7 @@ code, and the FQS loop, all of which refer to or modify this bookkeeping.
Note that grace period initialization (rcu_gp_init()) must carefully sequence
CPU hotplug scanning with grace period state changes. For example, the
following race could occur in rcu_gp_init() if rcu_seq_start() were to happen
after the CPU hotplug scanning.
.. code-block:: none
after the CPU hotplug scanning::
CPU0 (rcu_gp_init) CPU1 CPU2
--------------------- ---- ----
@@ -2008,22 +2006,22 @@ after the CPU hotplug scanning.
kfree(r1);
r2 = *r0; // USE-AFTER-FREE!
By incrementing gp_seq first, CPU1's RCU read-side critical section
By incrementing ``gp_seq`` first, CPU1's RCU read-side critical section
is guaranteed to not be missed by CPU2.
**Concurrent Quiescent State Reporting for Offline CPUs**
Concurrent Quiescent State Reporting for Offline CPUs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RCU must ensure that CPUs going offline report quiescent states to avoid
blocking grace periods. This requires careful synchronization to handle
race conditions.
**Race condition causing Offline CPU to hang GP**
Race condition causing Offline CPU to hang GP
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
A race between CPU offlining and new GP initialization (gp_init) may occur
because `rcu_report_qs_rnp()` in `rcutree_report_cpu_dead()` must temporarily
release the `rcu_node` lock to wake the RCU grace-period kthread:
.. code-block:: none
A race between CPU offlining and new GP initialization (gp_init()) may occur
because rcu_report_qs_rnp() in rcutree_report_cpu_dead() must temporarily
release the ``rcu_node`` lock to wake the RCU grace-period kthread::
CPU1 (going offline) CPU0 (GP kthread)
-------------------- -----------------
@@ -2044,15 +2042,14 @@ release the `rcu_node` lock to wake the RCU grace-period kthread:
// Reacquire lock (but too late)
rnp->qsmaskinitnext &= ~mask // Finally clears bit
Without `ofl_lock`, the new grace period includes the offline CPU and waits
Without ``ofl_lock``, the new grace period includes the offline CPU and waits
forever for its quiescent state causing a GP hang.
**A solution with ofl_lock**
A solution with ofl_lock
^^^^^^^^^^^^^^^^^^^^^^^^
The `ofl_lock` (offline lock) prevents `rcu_gp_init()` from running during
the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
.. code-block:: none
The ``ofl_lock`` (offline lock) prevents rcu_gp_init() from running during
the vulnerable window when rcu_report_qs_rnp() has released ``rnp->lock``::
CPU0 (rcu_gp_init) CPU1 (rcutree_report_cpu_dead)
------------------ ------------------------------
@@ -2065,21 +2062,20 @@ the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
arch_spin_unlock(&ofl_lock) ---> // Now CPU1 can proceed
} // But snapshot already taken
**Another race causing GP hangs in rcu_gp_init(): Reporting QS for Now-offline CPUs**
Another race causing GP hangs in rcu_gp_init(): Reporting QS for Now-offline CPUs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
After the first loop takes an atomic snapshot of online CPUs, as shown above,
the second loop in `rcu_gp_init()` detects CPUs that went offline between
releasing `ofl_lock` and acquiring the per-node `rnp->lock`. This detection is
crucial because:
the second loop in rcu_gp_init() detects CPUs that went offline between
releasing ``ofl_lock`` and acquiring the per-node ``rnp->lock``.
This detection is crucial because:
1. The CPU might have gone offline after the snapshot but before the second loop
2. The offline CPU cannot report its own QS if it's already dead
3. Without this detection, the grace period would wait forever for CPUs that
are now offline.
The second loop performs this detection safely:
.. code-block:: none
The second loop performs this detection safely::
rcu_for_each_node_breadth_first(rnp) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -2093,10 +2089,10 @@ The second loop performs this detection safely:
}
This approach ensures atomicity: quiescent state reporting for offline CPUs
happens either in `rcu_gp_init()` (second loop) or in `rcutree_report_cpu_dead()`,
never both and never neither. The `rnp->lock` held throughout the sequence
prevents races - `rcutree_report_cpu_dead()` also acquires this lock when
clearing `qsmaskinitnext`, ensuring mutual exclusion.
happens either in rcu_gp_init() (second loop) or in rcutree_report_cpu_dead(),
never both and never neither. The ``rnp->lock`` held throughout the sequence
prevents races - rcutree_report_cpu_dead() also acquires this lock when
clearing ``qsmaskinitnext``, ensuring mutual exclusion.
Scheduler and RCU
~~~~~~~~~~~~~~~~~

View File

@@ -641,7 +641,7 @@ Orran Krieger and Rusty Russell and Dipankar Sarma and Maneesh Soni"
,Month="July"
,Year="2001"
,note="Available:
\url{http://www.linuxsymposium.org/2001/abstracts/readcopy.php}
\url{https://kernel.org/doc/ols/2001/read-copy.pdf}
\url{http://www.rdrop.com/users/paulmck/RCU/rclock_OLS.2001.05.01c.pdf}
[Viewed June 23, 2004]"
,annotation={
@@ -1480,7 +1480,7 @@ Suparna Bhattacharya"
,Year="2006"
,pages="v2 123-138"
,note="Available:
\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
\url{https://kernel.org/doc/ols/2006/ols2006v2-pages-131-146.pdf}
\url{http://www.rdrop.com/users/paulmck/RCU/OLSrtRCU.2006.08.11a.pdf}
[Viewed January 1, 2007]"
,annotation={
@@ -1511,7 +1511,7 @@ Canis Rufus and Zoicon5 and Anome and Hal Eisen"
,Year="2006"
,pages="v2 249-254"
,note="Available:
\url{http://www.linuxsymposium.org/2006/view_abstract.php?content_key=184}
\url{https://kernel.org/doc/ols/2006/ols2006v2-pages-249-262.pdf}
[Viewed January 11, 2009]"
,annotation={
Uses RCU-protected radix tree for a lockless page cache.

View File

@@ -69,7 +69,13 @@ over a rather long period of time, but improvements are always welcome!
Explicit disabling of preemption (preempt_disable(), for example)
can serve as rcu_read_lock_sched(), but is less readable and
prevents lockdep from detecting locking issues. Acquiring a
spinlock also enters an RCU read-side critical section.
raw spinlock also enters an RCU read-side critical section.
The guard(rcu)() and scoped_guard(rcu) primitives designate
the remainder of the current scope or the next statement,
respectively, as the RCU read-side critical section. Use of
these guards can be less error-prone than rcu_read_lock(),
rcu_read_unlock(), and friends.
Please note that you *cannot* rely on code known to be built
only in non-preemptible kernels. Such code can and will break,
@@ -405,9 +411,11 @@ over a rather long period of time, but improvements are always welcome!
13. Unlike most flavors of RCU, it *is* permissible to block in an
SRCU read-side critical section (demarked by srcu_read_lock()
and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
Please note that if you don't need to sleep in read-side critical
sections, you should be using RCU rather than SRCU, because RCU
is almost always faster and easier to use than is SRCU.
As with RCU, guard(srcu)() and scoped_guard(srcu) forms are
available, and often provide greater ease of use. Please note
that if you don't need to sleep in read-side critical sections,
you should be using RCU rather than SRCU, because RCU is almost
always faster and easier to use than is SRCU.
Also unlike other forms of RCU, explicit initialization and
cleanup is required either at build time via DEFINE_SRCU()
@@ -443,10 +451,13 @@ over a rather long period of time, but improvements are always welcome!
real-time workloads than is synchronize_rcu_expedited().
It is also permissible to sleep in RCU Tasks Trace read-side
critical section, which are delimited by rcu_read_lock_trace() and
rcu_read_unlock_trace(). However, this is a specialized flavor
of RCU, and you should not use it without first checking with
its current users. In most cases, you should instead use SRCU.
critical sections, which are delimited by rcu_read_lock_trace()
and rcu_read_unlock_trace(). However, this is a specialized
flavor of RCU, and you should not use it without first checking
with its current users. In most cases, you should instead
use SRCU. As with RCU and SRCU, guard(rcu_tasks_trace)() and
scoped_guard(rcu_tasks_trace) are available, and often provide
greater ease of use.
Note that rcu_assign_pointer() relates to SRCU just as it does to
other forms of RCU, but instead of rcu_dereference() you should

View File

@@ -1,13 +1,13 @@
.. SPDX-License-Identifier: GPL-2.0
.. _rcu_concepts:
.. _rcu_handbook:
============
RCU concepts
RCU Handbook
============
.. toctree::
:maxdepth: 3
:maxdepth: 2
checklist
lockdep

View File

@@ -106,7 +106,7 @@ or the RCU-protected data that it points to can change concurrently.
Like rcu_dereference(), when lockdep is enabled, RCU list and hlist
traversal primitives check for being called from within an RCU read-side
critical section. However, a lockdep expression can be passed to them
as a additional optional argument. With this lockdep expression, these
as an additional optional argument. With this lockdep expression, these
traversal primitives will complain only if the lockdep expression is
false and they are called from outside any RCU read-side critical section.

View File

@@ -119,7 +119,7 @@ warnings:
uncommon in large datacenter. In one memorable case some decades
back, a CPU failed in a running system, becoming unresponsive,
but not causing an immediate crash. This resulted in a series
of RCU CPU stall warnings, eventually leading the realization
of RCU CPU stall warnings, eventually leading to the realization
that the CPU had failed.
The RCU, RCU-sched, RCU-tasks, and RCU-tasks-trace implementations have

View File

@@ -344,7 +344,7 @@ painstaking and error-prone.
And this is why the kvm-remote.sh script exists.
If you the following command works::
If the following command works::
ssh system0 date
@@ -364,7 +364,7 @@ systems must come first.
The kvm.sh ``--dryrun scenarios`` argument is useful for working out
how many scenarios may be run in one batch across a group of systems.
You can also re-run a previous remote run in a manner similar to kvm.sh:
You can also re-run a previous remote run in a manner similar to kvm.sh::
kvm-remote.sh "system0 system1 system2 system3 system4 system5" \
tools/testing/selftests/rcutorture/res/2022.11.03-11.26.28-remote \

View File

@@ -1021,32 +1021,41 @@ RCU list traversal::
list_entry_rcu
list_entry_lockless
list_first_entry_rcu
list_first_or_null_rcu
list_tail_rcu
list_next_rcu
list_next_or_null_rcu
list_for_each_entry_rcu
list_for_each_entry_continue_rcu
list_for_each_entry_from_rcu
list_first_or_null_rcu
list_next_or_null_rcu
list_for_each_entry_lockless
hlist_first_rcu
hlist_next_rcu
hlist_pprev_rcu
hlist_for_each_entry_rcu
hlist_for_each_entry_rcu_notrace
hlist_for_each_entry_rcu_bh
hlist_for_each_entry_from_rcu
hlist_for_each_entry_continue_rcu
hlist_for_each_entry_continue_rcu_bh
hlist_nulls_first_rcu
hlist_nulls_next_rcu
hlist_nulls_for_each_entry_rcu
hlist_nulls_for_each_entry_safe
hlist_bl_first_rcu
hlist_bl_for_each_entry_rcu
RCU pointer/list update::
rcu_assign_pointer
rcu_replace_pointer
INIT_LIST_HEAD_RCU
list_add_rcu
list_add_tail_rcu
list_del_rcu
list_replace_rcu
list_splice_init_rcu
list_splice_tail_init_rcu
hlist_add_behind_rcu
hlist_add_before_rcu
hlist_add_head_rcu
@@ -1054,34 +1063,53 @@ RCU pointer/list update::
hlist_del_rcu
hlist_del_init_rcu
hlist_replace_rcu
list_splice_init_rcu
list_splice_tail_init_rcu
hlist_nulls_del_init_rcu
hlist_nulls_del_rcu
hlist_nulls_add_head_rcu
hlist_nulls_add_tail_rcu
hlist_nulls_add_fake
hlists_swap_heads_rcu
hlist_bl_add_head_rcu
hlist_bl_del_init_rcu
hlist_bl_del_rcu
hlist_bl_set_first_rcu
RCU::
Critical sections Grace period Barrier
Critical sections Grace period Barrier
rcu_read_lock synchronize_net rcu_barrier
rcu_read_unlock synchronize_rcu
rcu_dereference synchronize_rcu_expedited
rcu_read_lock_held call_rcu
rcu_dereference_check kfree_rcu
rcu_dereference_protected
rcu_read_lock synchronize_net rcu_barrier
rcu_read_unlock synchronize_rcu
guard(rcu)() synchronize_rcu_expedited
scoped_guard(rcu) synchronize_rcu_mult
rcu_dereference call_rcu
rcu_dereference_check call_rcu_hurry
rcu_dereference_protected kfree_rcu
rcu_read_lock_held kvfree_rcu
rcu_read_lock_any_held kfree_rcu_mightsleep
rcu_pointer_handoff cond_synchronize_rcu
unrcu_pointer cond_synchronize_rcu_full
cond_synchronize_rcu_expedited
cond_synchronize_rcu_expedited_full
get_completed_synchronize_rcu
get_completed_synchronize_rcu_full
get_state_synchronize_rcu
get_state_synchronize_rcu_full
poll_state_synchronize_rcu
poll_state_synchronize_rcu_full
same_state_synchronize_rcu
same_state_synchronize_rcu_full
start_poll_synchronize_rcu
start_poll_synchronize_rcu_full
start_poll_synchronize_rcu_expedited
start_poll_synchronize_rcu_expedited_full
bh::
Critical sections Grace period Barrier
rcu_read_lock_bh call_rcu rcu_barrier
rcu_read_unlock_bh synchronize_rcu
[local_bh_disable] synchronize_rcu_expedited
rcu_read_lock_bh [Same as RCU] [Same as RCU]
rcu_read_unlock_bh
[local_bh_disable]
[and friends]
rcu_dereference_bh
rcu_dereference_bh_check
@@ -1092,9 +1120,9 @@ sched::
Critical sections Grace period Barrier
rcu_read_lock_sched call_rcu rcu_barrier
rcu_read_unlock_sched synchronize_rcu
[preempt_disable] synchronize_rcu_expedited
rcu_read_lock_sched [Same as RCU] [Same as RCU]
rcu_read_unlock_sched
[preempt_disable]
[and friends]
rcu_read_lock_sched_notrace
rcu_read_unlock_sched_notrace
@@ -1104,46 +1132,104 @@ sched::
rcu_read_lock_sched_held
RCU: Initialization/cleanup/ordering::
RCU_INIT_POINTER
RCU_INITIALIZER
RCU_POINTER_INITIALIZER
init_rcu_head
destroy_rcu_head
init_rcu_head_on_stack
destroy_rcu_head_on_stack
SLAB_TYPESAFE_BY_RCU
RCU: Quiescent states and control::
cond_resched_tasks_rcu_qs
rcu_all_qs
rcu_softirq_qs_periodic
rcu_end_inkernel_boot
rcu_expedite_gp
rcu_gp_is_expedited
rcu_unexpedite_gp
rcu_cpu_stall_reset
rcu_head_after_call_rcu
rcu_is_watching
RCU-sync primitive::
rcu_sync_is_idle
rcu_sync_init
rcu_sync_enter
rcu_sync_exit
rcu_sync_dtor
RCU-Tasks::
Critical sections Grace period Barrier
Critical sections Grace period Barrier
N/A call_rcu_tasks rcu_barrier_tasks
N/A call_rcu_tasks rcu_barrier_tasks
synchronize_rcu_tasks
RCU-Tasks-Rude::
Critical sections Grace period Barrier
Critical sections Grace period Barrier
N/A N/A
synchronize_rcu_tasks_rude
N/A synchronize_rcu_tasks_rude rcu_barrier_tasks_rude
call_rcu_tasks_rude
RCU-Tasks-Trace::
Critical sections Grace period Barrier
Critical sections Grace period Barrier
rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace
rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace
rcu_read_unlock_trace synchronize_rcu_tasks_trace
guard(rcu_tasks_trace)()
scoped_guard(rcu_tasks_trace)
SRCU list traversal::
list_for_each_entry_srcu
hlist_for_each_entry_srcu
SRCU::
Critical sections Grace period Barrier
Critical sections Grace period Barrier
srcu_read_lock call_srcu srcu_barrier
srcu_read_unlock synchronize_srcu
srcu_dereference synchronize_srcu_expedited
srcu_read_lock call_srcu srcu_barrier
srcu_read_unlock synchronize_srcu
srcu_read_lock_fast synchronize_srcu_expedited
srcu_read_unlock_fast get_state_synchronize_srcu
srcu_read_lock_nmisafe start_poll_synchronize_srcu
srcu_read_unlock_nmisafe start_poll_synchronize_srcu_expedited
srcu_read_lock_notrace poll_state_synchronize_srcu
srcu_read_unlock_notrace
srcu_down_read
srcu_up_read
srcu_down_read_fast
srcu_up_read_fast
guard(srcu)()
scoped_guard(srcu)
srcu_read_lock_held
srcu_dereference
srcu_dereference_check
srcu_dereference_notrace
srcu_read_lock_held
SRCU: Initialization/cleanup::
SRCU: Initialization/cleanup/ordering::
DEFINE_SRCU
DEFINE_STATIC_SRCU
init_srcu_struct
cleanup_srcu_struct
smp_mb__after_srcu_read_unlock
All: lockdep-checked RCU utility APIs::

View File

@@ -223,13 +223,13 @@ Userspace components
Compiler
--------
Peano is an LLVM based open-source compiler for AMD XDNA Array compute tile
available at:
Peano is an LLVM based open-source single core compiler for AMD XDNA Array
compute tile. Peano is available at:
https://github.com/Xilinx/llvm-aie
The open-source IREE compiler supports graph compilation of ML models for AMD
NPU and uses Peano underneath. It is available at:
https://github.com/nod-ai/iree-amd-aie
IRON is an open-source array compiler for AMD XDNA Array based NPU which uses
Peano underneath. IRON is available at:
https://github.com/Xilinx/mlir-aie
Usermode Driver (UMD)
---------------------

View File

@@ -10,6 +10,7 @@ Compute Accelerators
introduction
amdxdna/index
qaic/index
rocket/index
.. only:: subproject and html

View File

@@ -0,0 +1,19 @@
.. SPDX-License-Identifier: GPL-2.0-only
=====================================
accel/rocket Rockchip NPU driver
=====================================
The accel/rocket driver supports the Neural Processing Units (NPUs) inside some
Rockchip SoCs such as the RK3588. Rockchip calls it RKNN and sometimes RKNPU.
The hardware is described in chapter 36 in the RK3588 TRM.
This driver just powers the hardware on and off, allocates and maps buffers to
the device and submits jobs to the frontend unit. Everything else is done in
userspace, as a Gallium driver (also called rocket) that is part of the Mesa3D
project.
Hardware currently supported:
* RK3588

View File

@@ -134,47 +134,72 @@ The above command can be used with -v to get more debug information.
After the system starts, use `delaytop` to get the system-wide delay information,
which includes system-wide PSI information and Top-N high-latency tasks.
Note: PSI support requires `CONFIG_PSI=y` and `psi=1` for full functionality.
`delaytop` supports sorting by CPU latency in descending order by default,
displays the top 20 high-latency tasks by default, and refreshes the latency
data every 2 seconds by default.
`delaytop` is an interactive tool for monitoring system pressure and task delays.
It supports multiple sorting options, display modes, and real-time keyboard controls.
Get PSI information and Top-N tasks delay, since system boot::
Basic usage with default settings (sorts by CPU delay, shows top 20 tasks, refreshes every 2 seconds)::
bash# ./delaytop
System Pressure Information: (avg10/avg60/avg300/total)
CPU some: 0.0%/ 0.0%/ 0.0%/ 345(ms)
System Pressure Information: (avg10/avg60/avg300/total)
CPU some: 0.0%/ 0.0%/ 0.0%/ 106137(ms)
CPU full: 0.0%/ 0.0%/ 0.0%/ 0(ms)
Memory full: 0.0%/ 0.0%/ 0.0%/ 0(ms)
Memory some: 0.0%/ 0.0%/ 0.0%/ 0(ms)
IO full: 0.0%/ 0.0%/ 0.0%/ 65(ms)
IO some: 0.0%/ 0.0%/ 0.0%/ 79(ms)
IO full: 0.0%/ 0.0%/ 0.0%/ 2240(ms)
IO some: 0.0%/ 0.0%/ 0.0%/ 2783(ms)
IRQ full: 0.0%/ 0.0%/ 0.0%/ 0(ms)
Top 20 processes (sorted by CPU delay):
PID TGID COMMAND CPU(ms) IO(ms) SWAP(ms) RCL(ms) THR(ms) CMP(ms) WP(ms) IRQ(ms)
----------------------------------------------------------------------------------------------
161 161 zombie_memcg_re 1.40 0.00 0.00 0.00 0.00 0.00 0.00 0.00
130 130 blkcg_punt_bio 1.37 0.00 0.00 0.00 0.00 0.00 0.00 0.00
444 444 scsi_tmf_0 0.73 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1280 1280 rsyslogd 0.53 0.04 0.00 0.00 0.00 0.00 0.00 0.00
12 12 ksoftirqd/0 0.47 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1277 1277 nbd-server 0.44 0.00 0.00 0.00 0.00 0.00 0.00 0.00
308 308 kworker/2:2-sys 0.41 0.00 0.00 0.00 0.00 0.00 0.00 0.00
55 55 netns 0.36 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1187 1187 acpid 0.31 0.03 0.00 0.00 0.00 0.00 0.00 0.00
6184 6184 kworker/1:2-sys 0.24 0.00 0.00 0.00 0.00 0.00 0.00 0.00
186 186 kaluad 0.24 0.00 0.00 0.00 0.00 0.00 0.00 0.00
18 18 ksoftirqd/1 0.24 0.00 0.00 0.00 0.00 0.00 0.00 0.00
185 185 kmpath_rdacd 0.23 0.00 0.00 0.00 0.00 0.00 0.00 0.00
190 190 kstrp 0.23 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2759 2759 agetty 0.20 0.03 0.00 0.00 0.00 0.00 0.00 0.00
1190 1190 kworker/0:3-sys 0.19 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1272 1272 sshd 0.15 0.04 0.00 0.00 0.00 0.00 0.00 0.00
1156 1156 license 0.15 0.11 0.00 0.00 0.00 0.00 0.00 0.00
134 134 md 0.13 0.00 0.00 0.00 0.00 0.00 0.00 0.00
6142 6142 kworker/3:2-xfs 0.13 0.00 0.00 0.00 0.00 0.00 0.00 0.00
[o]sort [M]memverbose [q]quit
Top 20 processes (sorted by cpu delay):
PID TGID COMMAND CPU(ms) IO(ms) IRQ(ms) MEM(ms)
------------------------------------------------------------------------
110 110 kworker/15:0H-s 27.91 0.00 0.00 0.00
57 57 cpuhp/7 3.18 0.00 0.00 0.00
99 99 cpuhp/14 2.97 0.00 0.00 0.00
51 51 cpuhp/6 0.90 0.00 0.00 0.00
44 44 kworker/4:0H-sy 0.80 0.00 0.00 0.00
60 60 ksoftirqd/7 0.74 0.00 0.00 0.00
76 76 idle_inject/10 0.31 0.00 0.00 0.00
100 100 idle_inject/14 0.30 0.00 0.00 0.00
1309 1309 systemsettings 0.29 0.00 0.00 0.00
45 45 cpuhp/5 0.22 0.00 0.00 0.00
63 63 cpuhp/8 0.20 0.00 0.00 0.00
87 87 cpuhp/12 0.18 0.00 0.00 0.00
93 93 cpuhp/13 0.17 0.00 0.00 0.00
1265 1265 acpid 0.17 0.00 0.00 0.00
1552 1552 sshd 0.17 0.00 0.00 0.00
2584 2584 sddm-helper 0.16 0.00 0.00 0.00
1284 1284 rtkit-daemon 0.15 0.00 0.00 0.00
1326 1326 nde-netfilter 0.14 0.00 0.00 0.00
27 27 cpuhp/2 0.13 0.00 0.00 0.00
631 631 kworker/11:2-rc 0.11 0.00 0.00 0.00
Dynamic interactive interface of delaytop::
Interactive keyboard controls during runtime::
o - Select sort field (CPU, IO, IRQ, Memory, etc.)
M - Toggle display mode (Default/Memory Verbose)
q - Quit
Available sort fields (use -s/--sort or the interactive command)::
cpu(c) - CPU delay
blkio(i) - I/O delay
irq(q) - IRQ delay
mem(m) - Total memory delay
swapin(s) - Swapin delay (memory verbose mode only)
freepages(r) - Freepages reclaim delay (memory verbose mode only)
thrashing(t) - Thrashing delay (memory verbose mode only)
compact(p) - Compaction delay (memory verbose mode only)
wpcopy(w) - Write page copy delay (memory verbose mode only)
Advanced usage examples::
# ./delaytop -s blkio
Sorted by IO delay
# ./delaytop -s mem -M
Sorted by memory delay in memory verbose mode
# ./delaytop -p pid
Print delayacct stats

View File

@@ -41,7 +41,7 @@ namespace). The higher level goal is to allow for uid-based sandboxing of system
services without having to give out CAP_SETUID all over the place just so that
non-root programs can drop to even-lesser-privileged uids. This is especially
relevant when one non-root daemon on the system should be allowed to spawn other
processes as different uids, but its undesirable to give the daemon a
processes as different uids, but it's undesirable to give the daemon a
basically-root-equivalent CAP_SETUID.

View File

@@ -253,7 +253,7 @@ interface.
Some architectures have ECC detectors for L1, L2 and L3 caches,
along with DMA engines, fabric switches, main data path switches,
interconnections, and various other hardware data paths. If the hardware
reports it, then a edac_device device probably can be constructed to
reports it, then an edac_device device probably can be constructed to
harvest and present that to userspace.

View File

@@ -2,7 +2,7 @@
# They may be installed along the following lines. Check the section
# 8 udev manpage to see whether your udev supports SUBSYSTEM, and
# whether it uses one or two equal signs for SUBSYSTEM and KERNEL.
#
#
# ecashin@makki ~$ su
# Password:
# bash# find /etc -type f -name udev.conf
@@ -13,7 +13,7 @@
# 10-wacom.rules 50-udev.rules
# bash# cp /path/to/linux/Documentation/admin-guide/aoe/udev.txt \
# /etc/udev/rules.d/60-aoe.rules
#
#
# aoe char devices
SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220"
@@ -22,5 +22,5 @@ SUBSYSTEM=="aoe", KERNEL=="interfaces", NAME="etherd/%k", GROUP="disk", MODE="02
SUBSYSTEM=="aoe", KERNEL=="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220"
SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220"
# aoe block devices
# aoe block devices
KERNEL=="etherd*", GROUP="disk"

View File

@@ -118,7 +118,7 @@ and high-level drivers that you would use:
================ ============ ========
All parports and all protocol drivers are probed automatically unless probe=0
parameter is used. So just "modprobe epat" is enough for a Imation SuperDisk
parameter is used. So just "modprobe epat" is enough for an Imation SuperDisk
drive to work.
Manual device creation::

View File

@@ -252,7 +252,7 @@ For example, if you find a bug at the gspca's sonixj.c file, you can get
its maintainers with::
$ ./scripts/get_maintainer.pl --bug -f drivers/media/usb/gspca/sonixj.c
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
Hans Verkuil <hverkuil@kernel.org> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
Mauro Carvalho Chehab <mchehab@kernel.org> (maintainer:MEDIA INPUT INFRASTRUCTURE (V4L/DVB),commit_signer:1/1=100%)
Tejun Heo <tj@kernel.org> (commit_signer:1/1=100%)
Bhaktipriya Shridhar <bhaktipriya96@gmail.com> (commit_signer:1/1=100%,authored:1/1=100%,added_lines:4/4=100%,removed_lines:9/9=100%)

View File

@@ -15,6 +15,9 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgrou
.. CONTENTS
[Whenever any new section is added to this document, please also add
an entry here.]
1. Introduction
1-1. Terminology
1-2. What is cgroup?
@@ -25,9 +28,10 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgrou
2-2-2. Threads
2-3. [Un]populated Notification
2-4. Controlling Controllers
2-4-1. Enabling and Disabling
2-4-2. Top-down Constraint
2-4-3. No Internal Process Constraint
2-4-1. Availability
2-4-2. Enabling and Disabling
2-4-3. Top-down Constraint
2-4-4. No Internal Process Constraint
2-5. Delegation
2-5-1. Model of Delegation
2-5-2. Delegation Containment
@@ -61,14 +65,15 @@ v1 is available under :ref:`Documentation/admin-guide/cgroup-v1/index.rst <cgrou
5-4-1. PID Interface Files
5-5. Cpuset
5-5-1. Cpuset Interface Files
5-6. Device
5-6. Device controller
5-7. RDMA
5-7-1. RDMA Interface Files
5-8. DMEM
5-8-1. DMEM Interface Files
5-9. HugeTLB
5-9-1. HugeTLB Interface Files
5-10. Misc
5.10-1 Miscellaneous cgroup Interface Files
5-10-1. Misc Interface Files
5-10-2. Migration and Ownership
5-11. Others
5-11-1. perf_event
@@ -435,8 +440,8 @@ both cgroups.
Controlling Controllers
-----------------------
Availablity
~~~~~~~~~~~
Availability
~~~~~~~~~~~~
A controller is available in a cgroup when it is supported by the kernel (i.e.,
compiled in, not disabled and not attached to a v1 hierarchy) and listed in the
@@ -1001,6 +1006,24 @@ All cgroup core files are prefixed with "cgroup."
Total number of dying cgroup subsystems (e.g. memory
cgroup) at and beneath the current cgroup.
cgroup.stat.local
A read-only flat-keyed file which exists in non-root cgroups.
The following entry is defined:
frozen_usec
Cumulative time that this cgroup has spent between freezing and
thawing, regardless of whether by self or ancestor groups.
NB: (not) reaching "frozen" state is not accounted here.
Using the following ASCII representation of a cgroup's freezer
state, ::
1 _____
frozen 0 __/ \__
ab cd
the duration being measured is the span between a and c.
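The flat-keyed layout described above is simple to consume from a script. The following is a minimal sketch, assuming the file contains one ``key value`` pair per line, as other flat-keyed cgroup files do. The helper names and the cgroup path are hypothetical and are not part of any kernel or library interface.

```python
def parse_flat_keyed(text):
    """Parse flat-keyed content ("key value" per line) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def read_frozen_usec(cgroup_dir):
    """Return frozen_usec for a non-root cgroup, or None if the key is absent.

    cgroup_dir is a hypothetical path such as /sys/fs/cgroup/mygroup.
    """
    with open(f"{cgroup_dir}/cgroup.stat.local") as f:
        return parse_flat_keyed(f.read()).get("frozen_usec")


if __name__ == "__main__":
    # Sample content invented for illustration only.
    print(parse_flat_keyed("frozen_usec 1500000\n"))
```

Sampling the value twice and subtracting gives the time spent between freezing and thawing during that interval.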
cgroup.freeze
A read-write single value file which exists on non-root cgroups.
Allowed values are "0" and "1". The default is "0".

View File

@@ -3,7 +3,7 @@ dm-delay
========
Device-Mapper's "delay" target delays reads and/or writes
and/or flushs and optionally maps them to different devices.
and/or flushes and optionally maps them to different devices.
Arguments::
@@ -18,7 +18,7 @@ Table line has to either have 3, 6 or 9 arguments:
to write and flush operations on optionally different write_device with
optionally different sector offset
9: same as 6 arguments plus define flush_offset and flush_delay explicitely
9: same as 6 arguments plus define flush_offset and flush_delay explicitly
on/with optionally different flush_device/flush_offset.
Offsets are specified in sectors.
@@ -40,7 +40,7 @@ Example scripts
#!/bin/sh
#
# Create mapped device delaying write and flush operations for 400ms and
# splitting reads to device $1 but writes and flushs to different device $2
# splitting reads to device $1 but writes and flushes to different device $2
# to different offsets of 2048 and 4096 sectors respectively.
#
dmsetup create delayed --table "0 `blockdev --getsz $1` delay $1 2048 0 $2 4096 400"
@@ -48,7 +48,7 @@ Example scripts
::
#!/bin/sh
#
# Create mapped device delaying reads for 50ms, writes for 100ms and flushs for 333ms
# Create mapped device delaying reads for 50ms, writes for 100ms and flushes for 333ms
# onto the same backing device at offset 0 sectors.
#
dmsetup create delayed --table "0 `blockdev --getsz $1` delay $1 0 50 $2 0 100 $1 0 333"

View File

@@ -0,0 +1,202 @@
.. SPDX-License-Identifier: GPL-2.0
=================================
dm-pcache — Persistent Cache
=================================
*Author: Dongsheng Yang <dongsheng.yang@linux.dev>*
This document describes *dm-pcache*, a Device-Mapper target that lets a
byte-addressable *DAX* (persistent-memory, “pmem”) region act as a
high-performance, crash-persistent cache in front of a slower block
device. The code lives in `drivers/md/dm-pcache/`.
Quick feature summary
=====================
* *Write-back* caching (only mode currently supported).
* *16 MiB segments* allocated on the pmem device.
* *Data CRC32* verification (optional, per cache).
* Crash-safe: every metadata structure is duplicated (`PCACHE_META_INDEX_MAX
== 2`) and protected with CRC+sequence numbers.
* *Multi-tree indexing* (indexing trees sharded by logical address) for high PMem parallelism.
* Pure *DAX path* I/O: no extra BIO round-trips.
* *Log-structured write-back* that preserves backend crash-consistency.
Constructor
===========
::
pcache <cache_dev> <backing_dev> [<number_of_optional_arguments> <cache_mode writeback> <data_crc true|false>]
========================= ====================================================
``cache_dev``             Any DAX-capable block device (``/dev/pmem0``…).
                          All metadata *and* cached blocks are stored here.
``backing_dev``           The slow block device to be cached.
``cache_mode``            Optional. Only ``writeback`` is accepted at the
                          moment.
``data_crc``              Optional; defaults to ``false``.

                          * ``true``: store CRC32 for every cached entry
                            and verify on reads
                          * ``false``: skip CRC (faster)
========================= ====================================================
Example
-------
.. code-block:: shell
dmsetup create pcache_sdb --table \
"0 $(blockdev --getsz /dev/sdb) pcache /dev/pmem0 /dev/sdb 4 cache_mode writeback data_crc true"
The first time a pmem device is used, dm-pcache formats it automatically
(super-block, cache_info, etc.).
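When the table line is generated by tooling rather than typed by hand, the argument order matters more than anything else. The sketch below assembles the same line as the example above, assuming the size is given in 512-byte sectors (the unit that ``blockdev --getsz`` reports). ``build_pcache_table`` is a hypothetical helper, and it always emits both optional arguments so that the count matches the documented example.

```python
def build_pcache_table(sectors, cache_dev, backing_dev, data_crc=False):
    """Return a dm table line for a pcache target spanning `sectors` sectors."""
    if sectors <= 0:
        raise ValueError("sectors must be positive")
    # Optional arguments are key/value words preceded by the word count.
    optional = ["cache_mode", "writeback",
                "data_crc", "true" if data_crc else "false"]
    fields = ["0", str(sectors), "pcache", cache_dev, backing_dev,
              str(len(optional))] + optional
    return " ".join(fields)


if __name__ == "__main__":
    # 2097152 sectors (1 GiB) is an arbitrary example size.
    print(build_pcache_table(2097152, "/dev/pmem0", "/dev/sdb", data_crc=True))
```

The resulting string can be passed to ``dmsetup create <name> --table``.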
Status line
===========
``dmsetup status <device>`` (``STATUSTYPE_INFO``) prints:
::
<sb_flags> <seg_total> <cache_segs> <segs_used> \
<gc_percent> <cache_flags> \
<key_head_seg>:<key_head_off> \
<dirty_tail_seg>:<dirty_tail_off> \
<key_tail_seg>:<key_tail_off>
Field meanings
--------------
=============================== =============================================
``sb_flags``                    Super-block flags (e.g. endian marker).
``seg_total``                   Number of physical *pmem* segments.
``cache_segs``                  Number of segments used for cache.
``segs_used``                   Segments currently allocated (bitmap weight).
``gc_percent``                  Current GC high-water mark (0-90).
``cache_flags``                 Bit 0: DATA_CRC enabled.
                                Bit 1: INIT_DONE (cache initialised).
                                Bits 2-5: cache mode (0 == WB).
``key_head``                    Where new key-sets are being written.
``dirty_tail``                  First dirty key-set that still needs
                                write-back to the backing device.
``key_tail``                    First key-set that may be reclaimed by GC.
=============================== =============================================
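The field list above maps directly onto a small parser. This is a sketch under stated assumptions: it takes only the nine target-specific fields (the ``<start> <length> <target>`` prefix that ``dmsetup status`` prints is expected to be stripped by the caller), and it assumes each number is printed either in decimal or as ``0x``-prefixed hex, which the text above does not specify. The function names and the sample line are invented for illustration.

```python
STATUS_FIELDS = ("sb_flags", "seg_total", "cache_segs", "segs_used",
                 "gc_percent", "cache_flags")
POSITIONS = ("key_head", "dirty_tail", "key_tail")


def decode_cache_flags(flags):
    """Split cache_flags into the bit fields listed in the table above."""
    return {
        "data_crc": bool(flags & 0x1),     # bit 0
        "init_done": bool(flags & 0x2),    # bit 1
        "cache_mode": (flags >> 2) & 0xF,  # bits 2-5, 0 == write-back
    }


def parse_pcache_status(fields_text):
    """Parse the target-specific part of a pcache status line into a dict."""
    tokens = fields_text.split()
    if len(tokens) != len(STATUS_FIELDS) + len(POSITIONS):
        raise ValueError(f"expected 9 fields, got {len(tokens)}")
    # int(tok, 0) accepts decimal and 0x-prefixed hex (assumed encodings).
    status = {name: int(tok, 0) for name, tok in zip(STATUS_FIELDS, tokens)}
    for name, tok in zip(POSITIONS, tokens[len(STATUS_FIELDS):]):
        seg, off = tok.split(":")
        status[name] = (int(seg, 0), int(off, 0))
    status["cache_flags_decoded"] = decode_cache_flags(status["cache_flags"])
    return status


if __name__ == "__main__":
    # Made-up sample values, not captured from a real device.
    print(parse_pcache_status("0 64 60 12 70 3 5:4096 3:0 1:0"))
```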
Messages
========
*Change GC trigger*
::
dmsetup message <dev> 0 gc_percent <0-90>
Theory of operation
===================
Sub-devices
-----------
==================== =========================================================
backing_dev          Any block device (SSD/HDD/loop/LVM, etc.).
cache_dev            DAX device; must expose direct-access memory.
==================== =========================================================
Segments and key-sets
---------------------
* The pmem space is divided into *16 MiB segments*.
* Each write allocates space from a per-CPU *data_head* inside a segment.
* A *cache-key* records a logical range on the origin and where it lives
inside pmem (segment + offset + generation).
* 128 keys form a *key-set* (kset); ksets are written sequentially in pmem
and are themselves crash-safe (CRC).
* The pair *(key_tail, dirty_tail)* delimit clean/dirty and live/dead ksets.
Write-back
----------
Dirty keys are queued into a tree; a background worker copies data
back to the backing_dev and advances *dirty_tail*. A FLUSH/FUA bio from the
upper layers forces an immediate metadata commit.
Garbage collection
------------------
GC starts when ``segs_used >= seg_total * gc_percent / 100``. It walks
from *key_tail*, frees segments whose every key has been invalidated, and
advances *key_tail*.
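The trigger condition can be checked against the ``seg_total``, ``segs_used`` and ``gc_percent`` values from the status line. The sketch below evaluates the inequality stated above with both sides multiplied by 100, so that no division is needed. The driver's own integer rounding may differ by one segment at the boundary, and ``gc_should_start`` is a hypothetical name.

```python
def gc_should_start(segs_used, seg_total, gc_percent):
    """Evaluate segs_used >= seg_total * gc_percent / 100 without dividing."""
    if not 0 <= gc_percent <= 90:
        raise ValueError("gc_percent must be in the range 0-90")
    return segs_used * 100 >= seg_total * gc_percent


if __name__ == "__main__":
    # With 64 segments and a threshold of 70 percent, GC starts at 45 segments.
    for used in (44, 45):
        print(used, gc_should_start(used, 64, 70))
```

Lowering ``gc_percent`` with ``dmsetup message`` makes GC start earlier, at the cost of reclaiming segments that might still have served reads.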
CRC verification
----------------
If ``data_crc`` is enabled, dm-pcache computes a CRC32 over every cached data
range when it is inserted and stores it in the on-media key. Reads
validate the CRC before copying to the caller.
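The insert-then-verify pattern described above can be illustrated in user space. This sketch uses Python's ``zlib.crc32`` as a stand-in; the text does not say which CRC-32 polynomial or seed the driver uses, so the checksum values here need not match what dm-pcache stores on media. The dictionary layout and function names are hypothetical.

```python
import zlib


def store_with_crc(data):
    """Record a data range together with its CRC32, as done on insert."""
    data = bytes(data)
    return {"data": data, "crc": zlib.crc32(data)}


def load_verified(entry):
    """Return the data only if its CRC32 still matches the stored value."""
    if zlib.crc32(entry["data"]) != entry["crc"]:
        raise ValueError("CRC mismatch: cached data is corrupt")
    return entry["data"]


if __name__ == "__main__":
    entry = store_with_crc(b"cached block")
    print(load_verified(entry))
```

The extra checksum pass on every insert and read is the cost that ``data_crc false`` avoids.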
Failure handling
================
* *pmem media errors*: all metadata copies are read with
  ``copy_mc_to_kernel``; an uncorrectable error logs and aborts initialisation.
* *Cache full*: if no free segment can be found, writes return ``-EBUSY``;
  dm-pcache retries internally (request deferral).
* *System crash*: on attach, the driver replays ksets from *key_tail* to
  rebuild the in-core trees; every segment's generation guards against
  use-after-free keys.
Limitations & TODO
==================
* Only *write-back* mode; other modes planned.
* Only FIFO cache invalidate; other (LRU, ARC...) planned.
* Table reload is not supported currently.
* Discard planned.
Example workflow
================
.. code-block:: shell
# 1. Create devices
dmsetup create pcache_sdb --table \
"0 $(blockdev --getsz /dev/sdb) pcache /dev/pmem0 /dev/sdb 4 cache_mode writeback data_crc true"
# 2. Put a filesystem on top
mkfs.ext4 /dev/mapper/pcache_sdb
mount /dev/mapper/pcache_sdb /mnt
# 3. Tune GC threshold to 80 %
dmsetup message pcache_sdb 0 gc_percent 80
# 4. Observe status
watch -n1 'dmsetup status pcache_sdb'
# 5. Shutdown
umount /mnt
dmsetup remove pcache_sdb
``dm-pcache`` is under active development; feedback, bug reports and patches
are very welcome!

View File

@@ -18,6 +18,7 @@ Device Mapper
dm-integrity
dm-io
dm-log
dm-pcache
dm-queue-length
dm-raid
dm-service-time

View File

@@ -600,7 +600,7 @@ lock and return itself to the pool.
All storage within vdo is managed as 4KB blocks, but it can accept writes
as small as 512 bytes. Processing a write that is smaller than 4K requires
a read-modify-write operation that reads the relevant 4K block, copies the
new data over the approriate sectors of the block, and then launches a
new data over the appropriate sectors of the block, and then launches a
write operation for the modified data block. The read and write stages of
this operation are nearly identical to the normal read and write
operations, and a single data_vio is used throughout this operation.

View File

@@ -1,5 +1,6 @@
.. SPDX-License-Identifier: GPL-2.0-only
======
dm-vdo
======

View File

@@ -398,7 +398,7 @@ There are 3 different data modes:
* writeback mode
In data=writeback mode, ext4 does not journal data at all. This mode provides
a similar level of journaling as that of XFS, JFS, and ReiserFS in its default
a similar level of journaling as that of XFS and JFS in its default
mode - metadata journaling. A crash+recovery can cause incorrect data to
appear in files which were written shortly before the crash. This mode will
typically provide the best ext4 performance.

View File

@@ -215,9 +215,10 @@ Spectre_v2 X X
Spectre_v2_user X X * (Note 1)
SRBDS X X X X
SRSO X X X X
SSB (Note 4)
SSB X
TAA X X X X * (Note 2)
TSA X X X X
VMSCAPE X
=============== ============== ============ ============= ============== ============ ========
Notes:
@@ -229,9 +230,6 @@ Notes:
3 -- Disables SMT if cross-thread mitigations are fully enabled, the CPU is
vulnerable, and STIBP is not supported
4 -- Speculative store bypass is always enabled by default (no kernel
mitigation applied) unless overridden with spec_store_bypass_disable option
When an attack-vector is disabled, all mitigations for the vulnerabilities
listed in the above table are disabled, unless mitigation is required for a
different enabled attack-vector or a mitigation is explicitly selected via a

View File

@@ -26,3 +26,4 @@ are configurable at compile, boot or run time.
rsb
old_microcode
indirect-target-selection
vmscape

View File

@@ -214,7 +214,7 @@ XEON PHI specific considerations
command line with the 'ring3mwait=disable' command line option.
XEON PHI is not affected by the other MDS variants and MSBDS is mitigated
before the CPU enters a idle state. As XEON PHI is not affected by L1TF
before the CPU enters an idle state. As XEON PHI is not affected by L1TF
either disabling SMT is not required for full protection.
.. _mds_smt_control:

View File

@@ -664,7 +664,7 @@ Intel white papers:
.. _spec_ref1:
[1] `Intel analysis of speculative execution side channels <https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/Intel-Analysis-of-Speculative-Execution-Side-Channels.pdf>`_.
[1] `Intel analysis of speculative execution side channels <https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/analysis-of-speculative-execution-side-channels-white-paper.pdf>`_.
.. _spec_ref2:
@@ -682,7 +682,7 @@ AMD white papers:
.. _spec_ref5:
[5] `AMD64 technology indirect branch control extension <https://developer.amd.com/wp-content/resources/Architecture_Guidelines_Update_Indirect_Branch_Control.pdf>`_.
[5] `AMD64 technology indirect branch control extension <https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/white-papers/111006-architecture-guidelines-update-amd64-technology-indirect-branch-control-extension.pdf>`_.
.. _spec_ref6:
@@ -708,7 +708,7 @@ MIPS white paper:
.. _spec_ref10:
[10] `MIPS: response on speculative execution and side channel vulnerabilities <https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-channel-vulnerabilities/>`_.
[10] `MIPS: response on speculative execution and side channel vulnerabilities <https://web.archive.org/web/20220512003005if_/https://www.mips.com/blog/mips-response-on-speculative-execution-and-side-channel-vulnerabilities/>`_.
Academic papers:

View File

@@ -0,0 +1,110 @@
.. SPDX-License-Identifier: GPL-2.0
VMSCAPE
=======
VMSCAPE is a vulnerability that may allow a guest to influence the branch
prediction in host userspace. It particularly affects hypervisors like QEMU.
Even if a hypervisor may not have any sensitive data like disk encryption keys,
guest-userspace may be able to attack the guest-kernel using the hypervisor as
a confused deputy.
Affected processors
-------------------
The following CPU families are affected by VMSCAPE:
**Intel processors:**
- Skylake generation (Parts without Enhanced-IBRS)
- Cascade Lake generation (Parts affected by ITS guest/host separation)
- Alder Lake and newer (Parts affected by BHI)
Note that BHI-affected parts that use the BHB-clearing software mitigation,
e.g. Icelake, are not vulnerable to VMSCAPE.
**AMD processors:**
- Zen series (families 0x17, 0x19, 0x1a)
**Hygon processors:**
- Family 0x18
Mitigation
----------
Conditional IBPB
----------------
The kernel tracks when a CPU has run a potentially malicious guest and issues
an IBPB before the first exit to userspace after VM-exit. If userspace did not
run between VM-exit and the next VM-entry, no IBPB is issued.
Note that the existing userspace mitigations against Spectre-v2 are effective in
protecting the userspace, but they are insufficient to protect the userspace
VMMs from a malicious guest. This is because Spectre-v2 mitigations are applied
at context switch time, while the userspace VMM can run after a VM-exit without
a context switch.
Vulnerability enumeration and mitigation are not applied inside a guest. This is
because nested hypervisors should already be deploying IBPB to isolate
themselves from nested guests.
SMT considerations
------------------
When Simultaneous Multi-Threading (SMT) is enabled, hypervisors can be
vulnerable to cross-thread attacks. For complete protection against VMSCAPE
attacks in SMT environments, STIBP should be enabled.
The kernel will issue a warning if SMT is enabled without adequate STIBP
protection. The warning is not issued when:
- SMT is disabled
- STIBP is enabled system-wide
- Intel eIBRS is enabled (which implies STIBP protection)
System information and options
------------------------------
The sysfs file showing VMSCAPE mitigation status is:
/sys/devices/system/cpu/vulnerabilities/vmscape
The possible values in this file are:
* 'Not affected':
The processor is not vulnerable to VMSCAPE attacks.
* 'Vulnerable':
The processor is vulnerable and no mitigation has been applied.
* 'Mitigation: IBPB before exit to userspace':
Conditional IBPB mitigation is enabled. The kernel tracks when a CPU has
run a potentially malicious guest and issues an IBPB before the first
exit to userspace after VM-exit.
* 'Mitigation: IBPB on VMEXIT':
IBPB is issued on every VM-exit. This occurs when other mitigations like
RETBLEED or SRSO are already issuing IBPB on VM-exit.
Mitigation control on the kernel command line
----------------------------------------------
The mitigation can be controlled via the ``vmscape=`` command line parameter:
* ``vmscape=off``:
Disable the VMSCAPE mitigation.
* ``vmscape=ibpb``:
Enable conditional IBPB mitigation (default when CONFIG_MITIGATION_VMSCAPE=y).
* ``vmscape=force``:
Force vulnerability detection and mitigation even on processors that are
not known to be affected.
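As a rough illustration of the status values listed above, the following
Python sketch reads the sysfs file and maps its contents onto a coarse state.
The function names (``classify_vmscape``, ``read_vmscape_status``) and the
state labels are assumptions made for this example; only the sysfs path and
the status strings come from the text above.

```python
from pathlib import Path

# Path taken from the documentation above.
VMSCAPE_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/vmscape")


def classify_vmscape(status: str) -> str:
    """Map a status string to a coarse label (labels are illustrative)."""
    status = status.strip()
    if status == "Not affected":
        return "not-affected"
    if status.startswith("Mitigation:"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


def read_vmscape_status(path: Path = VMSCAPE_SYSFS) -> str:
    """Return the raw status line, or an empty string if the file cannot be
    read (for example on kernels without VMSCAPE reporting)."""
    try:
        return path.read_text().strip()
    except OSError:
        return ""


if __name__ == "__main__":
    raw = read_vmscape_status()
    print(f"{raw or '<no vmscape entry>'} -> {classify_vmscape(raw)}")
```

Matching on the ``Mitigation:`` prefix rather than on the full string keeps the
sketch tolerant of both mitigation variants described above.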

View File

@@ -471,7 +471,7 @@ Notes on loading the dump-capture kernel:
performance degradation. To enable multi-cpu support, you should bring up an
SMP dump-capture kernel and specify maxcpus/nr_cpus options while loading it.
* For s390x there are two kdump modes: If a ELF header is specified with
* For s390x there are two kdump modes: If an ELF header is specified with
the elfcorehdr= kernel parameter, it is used by the kdump kernel as it
is done on all other architectures. If no elfcorehdr= kernel parameter is
specified, the s390x kdump kernel dynamically creates the header. The

View File

@@ -1,3 +1,5 @@
.. SPDX-License-Identifier: GPL-2.0
.. _kernelparameters:
The kernel's command-line parameters
@@ -213,7 +215,7 @@ need or coordination with <Documentation/arch/x86/boot.rst>.
There are also arch-specific kernel-parameters not documented here.
Note that ALL kernel parameters listed below are CASE SENSITIVE, and that
a trailing = on the name of any parameter states that that parameter will
a trailing = on the name of any parameter states that the parameter will
be entered as an environment variable, whereas its absence indicates that
it will appear as a kernel argument readable via /proc/cmdline by programs
running once the system is up.

View File

@@ -608,6 +608,24 @@
ccw_timeout_log [S390]
See Documentation/arch/s390/common_io.rst for details.
cfi= [X86-64] Set Control Flow Integrity checking features
when CONFIG_FINEIBT is enabled.
Format: feature[,feature...]
Default: auto
auto: Use FineIBT if IBT available, otherwise kCFI.
Under FineIBT, enable "paranoid" mode when
FRED is not available.
off: Turn off CFI checking.
kcfi: Use kCFI (disable FineIBT).
fineibt: Use FineIBT (even if IBT not available).
norand: Do not re-randomize CFI hashes.
paranoid: Add caller hash checking under FineIBT.
bhi: Enable register poisoning to stop speculation
across FineIBT. (Disabled by default.)
warn: Do not enforce CFI checking: warn only.
debug: Report CFI initialization details.
cgroup_disable= [KNL] Disable a particular controller or optional feature
Format: {name of the controller(s) or feature(s) to disable}
The effects of cgroup_disable=foo are:
@@ -2606,6 +2624,11 @@
for it. Intended to get systems with badly broken
firmware running.
irqhandler.duration_warn_us= [KNL]
Warn if an IRQ handler exceeds the specified duration
threshold in microseconds. Useful for identifying
long-running IRQs in the system.
irqpoll [HW]
When an interrupt is not handled search all handlers
for it. Also check all handlers each timer
@@ -2957,6 +2980,27 @@
(enabled). Disable by KVM if hardware lacks support
for NPT.
kvm-amd.ciphertext_hiding_asids=
[KVM,AMD] Ciphertext hiding prevents disallowed accesses
to SNP private memory from reading ciphertext. Instead,
reads will see constant default values (0xff).
If ciphertext hiding is enabled, the joint SEV-ES and
SEV-SNP ASID space is partitioned into separate SEV-ES
and SEV-SNP ASID ranges, with the SEV-SNP range being
[1..max_snp_asid] and the SEV-ES range being
(max_snp_asid..min_sev_asid), where min_sev_asid is
enumerated by CPUID.0x.8000_001F[EDX].
A non-zero value enables SEV-SNP ciphertext hiding and
adjusts the ASID ranges for SEV-ES and SEV-SNP guests.
KVM caps the number of SEV-SNP ASIDs at the maximum
possible value, e.g. specifying -1u will assign all
joint SEV-ES and SEV-SNP ASIDs to SEV-SNP. Note,
assigning all joint ASIDs to SEV-SNP, i.e. configuring
max_snp_asid == min_sev_asid-1, will effectively make
SEV-ES unusable.
kvm-arm.mode=
[KVM,ARM,EARLY] Select one of KVM/arm64's modes of
operation.
@@ -3700,7 +3744,7 @@
looking for corruption. Enabling this will
both detect corruption and prevent the kernel
from using the memory being corrupted.
However, its intended as a diagnostic tool; if
However, it's intended as a diagnostic tool; if
repeatable BIOS-originated corruption always
affects the same memory, you can use memmap=
to prevent the kernel from using that memory.
@@ -3767,8 +3811,16 @@
mga= [HW,DRM]
microcode.force_minrev= [X86]
Format: <bool>
microcode= [X86] Control the behavior of the microcode loader.
Available options, comma separated:
base_rev=X - with <X> in the format: <u32>
Set the base microcode revision of each thread when in
debug mode.
dis_ucode_ldr: disable the microcode loader
force_minrev:
Enable or disable the microcode minimal revision
enforcement for the runtime microcode loader.
@@ -3829,6 +3881,7 @@
srbds=off [X86,INTEL]
ssbd=force-off [ARM64]
tsx_async_abort=off [X86]
vmscape=off [X86]
Exceptions:
This does not have any effect on
@@ -4589,7 +4642,7 @@
bit 2: print timer info
bit 3: print locks info if CONFIG_LOCKDEP is on
bit 4: print ftrace buffer
bit 5: replay all messages on consoles at the end of panic
bit 5: replay all kernel messages on consoles at the end of panic
bit 6: print all CPUs backtrace (if available in the arch)
bit 7: print only tasks in uninterruptible (blocked) state
*Be aware* that this option may print a _lot_ of lines,
@@ -6154,7 +6207,7 @@
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
mba, smba, bmec.
mba, smba, bmec, abmc.
E.g. to turn on cmt and turn off mba use:
rdt=cmt,!mba
@@ -6405,8 +6458,9 @@
rodata= [KNL,EARLY]
on Mark read-only kernel memory as read-only (default).
off Leave read-only kernel memory writable for debugging.
full Mark read-only kernel memory and aliases as read-only
[arm64]
noalias Mark read-only kernel memory as read-only but retain
writable aliases in the direct map for regions outside
of the kernel image. [arm64]
rockchip.usb_uart
[EARLY]
@@ -6428,6 +6482,9 @@
rootflags= [KNL] Set root filesystem mount option string
initramfs_options= [KNL]
Specify mount options for the initramfs mount.
rootfstype= [KNL] Set root filesystem type
rootwait [KNL] Wait (indefinitely) for root device to show up.
@@ -7382,7 +7439,7 @@
(converted into nanoseconds). Fast, but
depending on the architecture, may not be
in sync between CPUs.
global - Event time stamps are synchronize across
global - Event time stamps are synchronized across
CPUs. May be slower than the local clock,
but better for some race conditions.
counter - Simple counting of events (1, 2, ..)
@@ -7502,12 +7559,12 @@
section.
trace_trigger=[trigger-list]
[FTRACE] Add a event trigger on specific events.
[FTRACE] Add an event trigger on specific events.
Set a trigger on top of a specific event, with an optional
filter.
The format is is "trace_trigger=<event>.<trigger>[ if <filter>],..."
Where more than one trigger may be specified that are comma deliminated.
The format is "trace_trigger=<event>.<trigger>[ if <filter>],..."
Where more than one trigger may be specified that are comma delimited.
For example:
@@ -7515,7 +7572,7 @@
The above will enable the "stacktrace" trigger on the "sched_switch"
event but only trigger it if the "prev_state" of the "sched_switch"
event is "2" (TASK_UNINTERUPTIBLE).
event is "2" (TASK_UNINTERRUPTIBLE).
See also "Event triggers" in Documentation/trace/events.rst
@@ -8041,6 +8098,16 @@
vmpoff= [KNL,S390] Perform z/VM CP command after power off.
Format: <command>
vmscape= [X86] Controls mitigation for VMscape attacks.
VMscape attacks can leak information from a userspace
hypervisor to a guest via speculative side-channels.
off - disable the mitigation
ibpb - use Indirect Branch Prediction Barrier
(IBPB) mitigation (default)
force - force vulnerability detection even on
unaffected processors
vsyscall= [X86-64,EARLY]
Controls the behavior of vsyscalls (i.e. calls to
fixed addresses of 0xffffffffff600x00 from legacy

View File

@@ -61,7 +61,7 @@ Caveats
Check your drive's rating, and don't wear down your drive's lifetime if you
don't need to.
* If you mount some of your ext3/reiserfs filesystems with the -n option, then
* If you mount some of your ext3 filesystems with the -n option, then
the control script will not be able to remount them correctly. You must set
DO_REMOUNTS=0 in the control script, otherwise it will remount them with the
wrong options -- or it will fail because it cannot write to /etc/mtab.
@@ -96,7 +96,7 @@ control script increases dirty_expire_centisecs and dirty_writeback_centisecs in
dirtied are not forced to be written to disk as often. The control script also
changes the dirty background ratio, so that background writeback of dirty pages
is not done anymore. Combined with a higher commit value (also 10 minutes) for
ext3 or ReiserFS filesystems (also done automatically by the control script),
ext3 filesystems (also done automatically by the control script),
this results in concentration of disk activity in a small time interval which
occurs only once every 10 minutes, or whenever the disk is forced to spin up by
a cache miss. The disk can then be spun down in the periods of inactivity.
@@ -587,7 +587,7 @@ Control script::
FST=$(deduce_fstype $MP)
fi
case "$FST" in
"ext3"|"reiserfs")
"ext3")
PARSEDOPTS="$(parse_mount_opts commit "$OPTS")"
mount $DEV -t $FST $MP -o remount,$PARSEDOPTS,commit=$MAX_AGE$NOATIME_OPT
;;
@@ -647,7 +647,7 @@ Control script::
FST=$(deduce_fstype $MP)
fi
case "$FST" in
"ext3"|"reiserfs")
"ext3")
PARSEDOPTS="$(parse_mount_opts_wfstab $DEV commit $OPTS)"
PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $PARSEDOPTS)"
mount $DEV -t $FST $MP -o remount,$PARSEDOPTS

View File

@@ -48,8 +48,8 @@ This value is reset to 100 when the kernel boots.
Fan mode
--------
Writing 1/0 to /sys/devices/platform/lg-laptop/fan_mode disables/enables
the fan silent mode.
Writing 0/1/2 to /sys/devices/platform/lg-laptop/fan_mode sets fan mode to
Optimal/Silent/Performance respectively.
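The mapping above can be sketched in Python as follows. The helper names
(``fan_mode_value``, ``set_fan_mode``) are hypothetical; the sysfs path and
the 0/1/2 values are taken from the text. Writing the file normally requires
root privileges and an lg-laptop platform device, so the sketch only defines
the writer and does not call it.

```python
# Values from the documentation above: 0/1/2 -> Optimal/Silent/Performance.
FAN_MODE_PATH = "/sys/devices/platform/lg-laptop/fan_mode"
FAN_MODES = {"optimal": 0, "silent": 1, "performance": 2}


def fan_mode_value(name: str) -> str:
    """Return the string to write to fan_mode for a mode name."""
    try:
        return str(FAN_MODES[name.lower()])
    except KeyError:
        raise ValueError(f"unknown fan mode: {name!r}") from None


def set_fan_mode(name: str, path: str = FAN_MODE_PATH) -> None:
    """Write the selected mode to sysfs (needs suitable permissions)."""
    with open(path, "w") as f:
        f.write(fan_mode_value(name))
```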
USB charge

View File

@@ -25,7 +25,7 @@ generate, like:
(when available)
Those events (see linux/sonypi.h) can be polled using the character device node
/dev/sonypi (major 10, minor auto allocated or specified as a option).
/dev/sonypi (major 10, minor auto allocated or specified as an option).
A simple daemon which translates the jogdial movements into mouse wheel events
can be downloaded at: <http://popies.net/sonypi/>

View File

@@ -347,6 +347,54 @@ All md devices contain:
active-idle
like active, but no writes have been seen for a while (safe_mode_delay).
consistency_policy
This indicates how the array maintains consistency in case of unexpected
shutdown. It can be:
none
Array has no redundancy information, e.g. raid0, linear.
resync
Full resync is performed and all redundancy is regenerated when the
array is started after unclean shutdown.
bitmap
Resync assisted by a write-intent bitmap.
journal
For raid4/5/6, journal device is used to log transactions and replay
after unclean shutdown.
ppl
For raid5 only, Partial Parity Log is used to close the write hole and
eliminate resync.
The accepted values when writing to this file are ``ppl`` and ``resync``,
used to enable and disable PPL.
uuid
This indicates the UUID of the array in the following format:
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
bitmap_type
[RW] When read, this file will display the current and available
bitmaps for this array. The currently active bitmap will be enclosed
in [] brackets. Writing a bitmap name or ID to this file will switch
control of this array to that new bitmap. Note that writing a new
bitmap for an already created array is forbidden.
none
No bitmap
bitmap
The default internal bitmap
llbitmap
The lockless internal bitmap
If bitmap_type is not none, then additional bitmap attributes bitmap/xxx or
llbitmap/xxx will be created after the md device KOBJ_CHANGE event.
If bitmap_type is bitmap, then the md device will also contain:
bitmap/location
This indicates where the write-intent bitmap for the array is
stored.
@@ -401,35 +449,23 @@ All md devices contain:
once the array becomes non-degraded, and this fact has been
recorded in the metadata.
consistency_policy
This indicates how the array maintains consistency in case of unexpected
shutdown. It can be:
If bitmap_type is llbitmap, then the md device will also contain:
none
Array has no redundancy information, e.g. raid0, linear.
llbitmap/bits
This is read-only and shows the status of the bitmap bits, i.e. the
number of bits with each value.
resync
Full resync is performed and all redundancy is regenerated when the
array is started after unclean shutdown.
llbitmap/metadata
This is read-only and shows the bitmap metadata, including chunksize,
chunkshift, chunks, offset and daemon_sleep.
bitmap
Resync assisted by a write-intent bitmap.
journal
For raid4/5/6, journal device is used to log transactions and replay
after unclean shutdown.
ppl
For raid5 only, Partial Parity Log is used to close the write hole and
eliminate resync.
The accepted values when writing to this file are ``ppl`` and ``resync``,
used to enable and disable PPL.
uuid
This indicates the UUID of the array in the following format:
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
llbitmap/daemon_sleep
This is read-write: the interval in seconds at which the daemon function
is triggered to clear dirty bits.
llbitmap/barrier_idle
This is read-write: the time in seconds after which the page barrier is
idled, meaning that dirty bits in the page will be cleared.
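The ``bitmap_type`` file described above marks the active bitmap with []
brackets. A minimal Python sketch of parsing such a line is shown below. It
assumes the entries are separated by whitespace, which the text does not
state explicitly, and the function name ``parse_bitmap_type`` is hypothetical.

```python
from typing import List, Optional, Tuple


def parse_bitmap_type(text: str) -> Tuple[Optional[str], List[str]]:
    """Return (active, available) from the contents of bitmap_type.

    Assumes whitespace-separated names with the active one in [] brackets,
    e.g. "none [bitmap] llbitmap".
    """
    active = None
    available = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # strip the brackets around the active entry
            active = token
        available.append(token)
    return active, available
```

For example, ``parse_bitmap_type("none [bitmap] llbitmap")`` yields ``bitmap``
as the active type together with all three available names.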
As component devices are added to an md array, they appear in the ``md``
directory as new directories named::
@@ -758,7 +794,7 @@ These currently include:
journal_mode (currently raid5 only)
The cache mode for raid5. raid5 could include an extra disk for
caching. The mode can be "write-throuth" and "write-back". The
caching. The mode can be "write-through" or "write-back". The
default is "write-through".
ppl_write_hint

View File

@@ -91,7 +91,6 @@ ov5647 OmniVision OV5647 sensor
ov5670 OmniVision OV5670 sensor
ov5675 OmniVision OV5675 sensor
ov5695 OmniVision OV5695 sensor
ov6650 OmniVision OV6650 sensor
ov7251 OmniVision OV7251 sensor
ov7640 OmniVision OV7640 sensor
ov7670 OmniVision OV7670 sensor

View File

@@ -96,7 +96,7 @@ Some of the features of this driver include:
motion compensation modes: low, medium, and high motion. Pipelines are
defined that allow sending frames to the VDIC subdev directly from the
CSI. There is also support in the future for sending frames to the
VDIC from memory buffers via a output/mem2mem devices.
VDIC from memory buffers via output/mem2mem devices.
- Includes a Frame Interval Monitor (FIM) that can correct vertical sync
problems with the ADV718x video decoders.

View File

@@ -3,7 +3,7 @@
The ivtv driver
===============
Author: Hans Verkuil <hverkuil@xs4all.nl>
Author: Hans Verkuil <hverkuil@kernel.org>
This is a v4l2 device driver for the Conexant cx23415/6 MPEG encoder/decoder.
The cx23415 can do both encoding and decoding, the cx23416 can only do MPEG

View File

@@ -13,7 +13,7 @@ Contact: Eduardo Valentin <eduardo.valentin@nokia.com>
Information about the Device
----------------------------
This chip is a Silicon Labs product. It is a I2C device, currently on 0x63 address.
This chip is a Silicon Labs product. It is an I2C device, currently on 0x63 address.
Basically, it has transmission and signal noise level measurement features.
The Si4713 integrates transmit functions for FM broadcast stereo transmission.
@@ -28,7 +28,7 @@ Users must comply with local regulations on radio frequency (RF) transmission.
Device driver description
-------------------------
There are two modules to handle this device. One is a I2C device driver
There are two modules to handle this device. One is an I2C device driver
and the other is a platform driver.
The I2C device driver exports a v4l2-subdev interface to the kernel.
@@ -113,7 +113,7 @@ Here is a summary of them:
- acomp_attack_time - Sets the attack time for audio dynamic range control.
- acomp_release_time - Sets the release time for audio dynamic range control.
* Limiter setups audio deviation limiter feature. Once a over deviation occurs,
* Limiter sets up the audio deviation limiter feature. Once an over deviation occurs,
it is possible to adjust the front-end gain of the audio input and always
prevent over deviation.

View File

@@ -175,4 +175,4 @@ Below command makes every memory region of size >=4K that has not accessed for
$ sudo damo start --damos_access_rate 0 0 --damos_sz_region 4K max \
--damos_age 60s max --damos_action pageout \
<pid of your workload>
--target_pid <pid of your workload>

View File

@@ -61,7 +61,7 @@ comma (",").
:ref:`kdamonds <sysfs_kdamonds>`/nr_kdamonds
│ │ :ref:`0 <sysfs_kdamond>`/state,pid,refresh_ms
│ │ │ :ref:`contexts <sysfs_contexts>`/nr_contexts
│ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations
│ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations,addr_unit
│ │ │ │ │ :ref:`monitoring_attrs <sysfs_monitoring_attrs>`/
│ │ │ │ │ │ intervals/sample_us,aggr_us,update_us
│ │ │ │ │ │ │ intervals_goal/access_bp,aggrs,min_sample_us,max_sample_us
@@ -188,9 +188,9 @@ details). At the moment, only one context per kdamond is supported, so only
contexts/<N>/
-------------
In each context directory, two files (``avail_operations`` and ``operations``)
and three directories (``monitoring_attrs``, ``targets``, and ``schemes``)
exist.
In each context directory, three files (``avail_operations``, ``operations``
and ``addr_unit``) and three directories (``monitoring_attrs``, ``targets``,
and ``schemes``) exist.
DAMON supports multiple types of :ref:`monitoring operations
<damon_design_configurable_operations_set>`, including those for virtual address
@@ -205,6 +205,9 @@ You can set and get what type of monitoring operations DAMON will use for the
context by writing one of the keywords listed in ``avail_operations`` file and
reading from the ``operations`` file.
The ``addr_unit`` file is for setting and getting the :ref:`address unit
<damon_design_addr_unit>` parameter of the operations set.
.. _sysfs_monitoring_attrs:
contexts/<N>/monitoring_attrs/
@@ -357,7 +360,7 @@ The directory for the :ref:`quotas <damon_design_damos_quotas>` of the given
DAMON-based operation scheme.
Under ``quotas`` directory, four files (``ms``, ``bytes``,
``reset_interval_ms``, ``effective_bytes``) and two directores (``weights`` and
``reset_interval_ms``, ``effective_bytes``) and two directories (``weights`` and
``goals``) exist.
You can set the ``time quota`` in milliseconds, ``size quota`` in bytes, and

View File

@@ -225,6 +225,42 @@ to "always" or "madvise"), and it'll be automatically shutdown when
PMD-sized THP is disabled (when both the per-size anon control and the
top-level control are "never")
process THP controls
--------------------
A process can control its own THP behaviour using the ``PR_SET_THP_DISABLE``
and ``PR_GET_THP_DISABLE`` pair of prctl(2) calls. The THP behaviour set using
``PR_SET_THP_DISABLE`` is inherited across fork(2) and execve(2). These calls
support the following arguments::
prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0):
This will disable THPs completely for the process, irrespective
of global THP controls or madvise(..., MADV_COLLAPSE) being used.
prctl(PR_SET_THP_DISABLE, 1, PR_THP_DISABLE_EXCEPT_ADVISED, 0, 0):
This will disable THPs for the process except when the usage of THPs is
advised. Consequently, THPs will only be used when:
- Global THP controls are set to "always" or "madvise" and
madvise(..., MADV_HUGEPAGE) or madvise(..., MADV_COLLAPSE) is used.
- Global THP controls are set to "never" and madvise(..., MADV_COLLAPSE)
is used. This is the same behavior as if THPs were not disabled at
the process level.
Note that MADV_COLLAPSE is currently always rejected if
madvise(..., MADV_NOHUGEPAGE) is set on an area.
prctl(PR_SET_THP_DISABLE, 0, 0, 0, 0):
This will re-enable THPs for the process, as if they were never disabled.
Whether THPs will actually be used depends on global THP controls and
madvise() calls.
prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0):
This returns a value whose bits indicate how THP-disable is configured:
Bits
1 0 Value Description
|0|0| 0 No THP-disable behaviour specified.
|0|1| 1 THP is entirely disabled for this process.
|1|1| 3 THP-except-advised mode is set for this process.
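The return values in the table above can be decoded as in the following
Python sketch. The constant and function names are hypothetical and exist
only for this example; the bit positions and descriptions follow the table.
The combination with only bit 1 set is not listed above, so the sketch
rejects it rather than guessing its meaning.

```python
# Bit positions follow the PR_GET_THP_DISABLE table above; the names are
# illustrative and not taken from any header.
THP_DISABLED_BIT = 1 << 0        # bit 0: THP-disable is configured
THP_EXCEPT_ADVISED_BIT = 1 << 1  # bit 1: except where THPs are advised


def describe_thp_disable(value: int) -> str:
    """Translate a PR_GET_THP_DISABLE return value into its description."""
    if value == 0:
        return "No THP-disable behaviour specified."
    if value == THP_DISABLED_BIT:
        return "THP is entirely disabled for this process."
    if value == (THP_DISABLED_BIT | THP_EXCEPT_ADVISED_BIT):
        return "THP-except-advised mode is set for this process."
    raise ValueError(f"value not listed in the table: {value}")
```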
Khugepaged controls
-------------------
@@ -383,6 +419,8 @@ option: ``huge=``. It can have following values:
always
Attempt to allocate huge pages every time we need a new page;
Always try PMD-sized huge pages first, and fall back to smaller-sized
huge pages if the PMD-sized huge page allocation fails;
never
Do not allocate huge pages. Note that ``madvise(..., MADV_COLLAPSE)``
@@ -390,7 +428,9 @@ never
is specified everywhere;
within_size
Only allocate huge page if it will be fully within i_size.
Only allocate huge page if it will be fully within i_size;
Always try PMD-sized huge pages first, and fall back to smaller-sized
huge pages if the PMD-sized huge page allocation fails;
Also respect madvise() hints;
advise

View File

@@ -53,26 +53,17 @@ Zswap receives pages for compression from the swap subsystem and is able to
evict pages from its own compressed pool on an LRU basis and write them back to
the backing swap device in the case that the compressed pool is full.
Zswap makes use of zpool for the managing the compressed memory pool. Each
allocation in zpool is not directly accessible by address. Rather, a handle is
Zswap makes use of zsmalloc for managing the compressed memory pool. Each
allocation in zsmalloc is not directly accessible by address. Rather, a handle is
returned by the allocation routine and that handle must be mapped before being
accessed. The compressed memory pool grows on demand and shrinks as compressed
pages are freed. The pool is not preallocated. By default, a zpool
of type selected in ``CONFIG_ZSWAP_ZPOOL_DEFAULT`` Kconfig option is created,
but it can be overridden at boot time by setting the ``zpool`` attribute,
e.g. ``zswap.zpool=zsmalloc``. It can also be changed at runtime using the sysfs
``zpool`` attribute, e.g.::
echo zsmalloc > /sys/module/zswap/parameters/zpool
The zsmalloc type zpool has a complex compressed page storage method, and it
can achieve great storage densities.
pages are freed. The pool is not preallocated.
When a swap page is passed from swapout to zswap, zswap maintains a mapping
of the swap entry, a combination of the swap type and swap offset, to the zpool
handle that references that compressed swap page. This mapping is achieved
with a red-black tree per swap type. The swap offset is the search key for the
tree nodes.
of the swap entry, a combination of the swap type and swap offset, to the
zsmalloc handle that references that compressed swap page. This mapping is
achieved with a red-black tree per swap type. The swap offset is the search
key for the tree nodes.
During a page fault on a PTE that is a swap entry, the swapin code calls the
zswap load function to decompress the page into the page allocated by the page
@@ -96,11 +87,11 @@ attribute, e.g.::
echo lzo > /sys/module/zswap/parameters/compressor
When the zpool and/or compressor parameter is changed at runtime, any existing
compressed pages are not modified; they are left in their own zpool. When a
request is made for a page in an old zpool, it is uncompressed using its
original compressor. Once all pages are removed from an old zpool, the zpool
and its compressor are freed.
When the compressor parameter is changed at runtime, any existing compressed
pages are not modified; they are left in their own pool. When a request is
made for a page in an old pool, it is uncompressed using its original
compressor. Once all pages are removed from an old pool, the pool and its
compressor are freed.
Some of the pages in zswap are same-value filled pages (i.e. contents of the
page have same value or repetitive pattern). These pages include zero-filled

View File

@@ -342,7 +342,7 @@ They depend on various facilities being available:
When using pxelinux, the kernel image is specified using
"kernel <relative-path-below /tftpboot>". The nfsroot parameters
are passed to the kernel by adding them to the "append" line.
It is common to use serial console in conjunction with pxeliunx,
It is common to use serial console in conjunction with pxelinux,
see Documentation/admin-guide/serial-console.rst for more information.
For more information on isolinux, including how to create bootdisks

View File

@@ -16,8 +16,8 @@ provides the following two features:
- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
time spent in each low-power LTSSM state) and
- one 32-bit counter for Event Counting (error and non-error events for
a specified lane)
- one 32-bit counter per event for Event Counting (error and non-error
events for a specified lane)
Note: There is no interrupt for counter overflow.

View File

@@ -0,0 +1,115 @@
.. SPDX-License-Identifier: GPL-2.0-only
================================================
Fujitsu Uncore Performance Monitoring Unit (PMU)
================================================
This driver supports the Uncore MAC PMUs and the Uncore PCI PMUs found
in Fujitsu chips.
Each MAC PMU on these chips is exposed as an uncore perf PMU with device name
mac_iod<iod>_mac<mac>_ch<ch>.
And each PCI PMU on these chips is exposed as an uncore perf PMU with device name
pci_iod<iod>_pci<pci>.
The driver provides a description of its available events and configuration
options in sysfs, see /sys/bus/event_source/devices/mac_iod<iod>_mac<mac>_ch<ch>/
and /sys/bus/event_source/devices/pci_iod<iod>_pci<pci>/.
This driver exports:
- formats, used by perf user space and other tools to configure events
- events, used by perf user space and other tools to create events
symbolically, e.g.::
perf stat -a -e mac_iod0_mac0_ch0/event=0x21/ ls
perf stat -a -e pci_iod0_pci0/event=0x24/ ls
- cpumask, used by perf user space and other tools to know on which CPUs
to open the events
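The device naming scheme and the raw event syntax shown above can be combined
programmatically. The Python sketch below builds the PMU names and the
``-e`` argument strings for perf; the helper names are hypothetical, and the
event codes 0x21 and 0x24 are simply the ones used in the examples above.

```python
def mac_pmu_name(iod: int, mac: int, ch: int) -> str:
    """Device name of a MAC PMU: mac_iod<iod>_mac<mac>_ch<ch>."""
    return f"mac_iod{iod}_mac{mac}_ch{ch}"


def pci_pmu_name(iod: int, pci: int) -> str:
    """Device name of a PCI PMU: pci_iod<iod>_pci<pci>."""
    return f"pci_iod{iod}_pci{pci}"


def perf_event(pmu: str, event: int) -> str:
    """Raw event specification in the form <pmu>/event=0x<hex>/."""
    return f"{pmu}/event={event:#x}/"


if __name__ == "__main__":
    # Print the two specifications used in the perf stat examples above.
    print(perf_event(mac_pmu_name(0, 0, 0), 0x21))
    print(perf_event(pci_pmu_name(0, 0), 0x24))
```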
This driver supports the following events for MAC:
- cycles
This event counts MAC cycles at MAC frequency.
- read-count
This event counts the number of read requests to MAC.
- read-count-request
This event counts the number of read requests including retry to MAC.
- read-count-return
This event counts the number of responses to read requests to MAC.
- read-count-request-pftgt
This event counts the number of read requests including retry with PFTGT
flag.
- read-count-request-normal
This event counts the number of read requests including retry without PFTGT
flag.
- read-count-return-pftgt-hit
This event counts the number of responses to read requests which hit the
PFTGT buffer.
- read-count-return-pftgt-miss
This event counts the number of responses to read requests which miss the
PFTGT buffer.
- read-wait
This event counts outstanding read requests issued by DDR memory controller
per cycle.
- write-count
This event counts the number of write requests to MAC (including zero write,
full write, partial write, write cancel).
- write-count-write
This event counts the number of full write requests to MAC (not including
zero write).
- write-count-pwrite
This event counts the number of partial write requests to MAC.
- memory-read-count
This event counts the number of read requests from MAC to memory.
- memory-write-count
This event counts the number of full write requests from MAC to memory.
- memory-pwrite-count
This event counts the number of partial write requests from MAC to memory.
- ea-mac
This event counts energy consumption of MAC.
- ea-memory
This event counts energy consumption of memory.
- ea-memory-mac-write
This event counts the number of write requests from MAC to memory.
- ea-ha
This event counts energy consumption of HA.
'ea' is the abbreviation for 'Energy Analyzer'.
Examples for use with perf::
perf stat -e mac_iod0_mac0_ch0/ea-mac/ ls
And, this driver supports the following events for PCI:
- pci-port0-cycles
This event counts PCI cycles at PCI frequency in port0.
- pci-port0-read-count
This event counts read transactions for data transfer in port0.
- pci-port0-read-count-bus
This event counts read transactions for bus usage in port0.
- pci-port0-write-count
This event counts write transactions for data transfer in port0.
- pci-port0-write-count-bus
This event counts write transactions for bus usage in port0.
- pci-port1-cycles
This event counts PCI cycles at PCI frequency in port1.
- pci-port1-read-count
This event counts read transactions for data transfer in port1.
- pci-port1-read-count-bus
This event counts read transactions for bus usage in port1.
- pci-port1-write-count
This event counts write transactions for data transfer in port1.
- pci-port1-write-count-bus
This event counts write transactions for bus usage in port1.
- ea-pci
This event counts energy consumption of PCI.
'ea' is the abbreviation for 'Energy Analyzer'.
Examples for use with perf::
perf stat -e pci_iod0_pci0/ea-pci/ ls
Given that these are uncore PMUs the driver does not support sampling, therefore
"perf record" will not work. Per-task perf sessions are not supported.
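As a rough sketch of the naming scheme described above, the following shell helpers build a PMU device name from the iod/mac/ch (or iod/pci) indices and print a matching ``perf stat`` command line. The helper names (``fj_mac_pmu``, ``fj_pci_pmu``, ``fj_perf_cmd``) are hypothetical and not part of the driver; the script only prints commands and does not touch any hardware.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the driver): build Fujitsu uncore PMU
# device names following the mac_iod<iod>_mac<mac>_ch<ch> and
# pci_iod<iod>_pci<pci> patterns described in the text.
fj_mac_pmu() {
    printf 'mac_iod%s_mac%s_ch%s\n' "$1" "$2" "$3"
}

fj_pci_pmu() {
    printf 'pci_iod%s_pci%s\n' "$1" "$2"
}

# Print (do not run) a system-wide counting command for one named event.
# Only "perf stat" is generated because these PMUs do not support sampling.
fj_perf_cmd() {
    pmu=$1
    event=$2
    shift 2
    printf 'perf stat -a -e %s/%s/ %s\n' "$pmu" "$event" "$*"
}

fj_perf_cmd "$(fj_mac_pmu 0 0 0)" read-count sleep 1
fj_perf_cmd "$(fj_pci_pmu 0 0)" pci-port0-cycles sleep 1
```

The event names passed in (``read-count``, ``pci-port0-cycles``) are taken from the lists above; which indices actually exist on a given machine has to be checked under /sys/bus/event_source/devices/.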


@@ -18,9 +18,10 @@ HiSilicon SoC uncore PMU driver
Each device PMU has separate registers for event counting, control and
interrupt, and the PMU driver shall register perf PMU drivers like L3C,
HHA and DDRC etc. The available events and configuration options shall
be described in the sysfs, see:
be described in the sysfs, see::
/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>
/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}/hha{Y}/ddrc{Y}>.
The "perf list" command shall list the available events from sysfs.
Each L3C, HHA and DDRC is registered as a separate PMU with perf. The PMU
@@ -65,6 +66,10 @@ specified as a bitmap::
This will only count the operations from core/thread 0 and 1 in this cluster.
User should not use tt_core_deprecated to specify the core/thread filtering.
This option is provided for backward compatibility and only supports 8 bits,
which may not cover all the cores/threads sharing the L3C.
2. Tracetag allows the user to choose to count only read, write or atomic
operations via the tt_req parameter in perf. The default value counts all
operations. tt_req is 3bits, 3'b100 represents read operations, 3'b101
@@ -109,8 +114,52 @@ uring channel. It is 2 bits. Some important codes are as follows:
- 2'b11: count the events which sent to the uring_ext (MATA) channel;
- 2'b01: is the same as 2'b11;
- 2'b10: count the events which sent to the uring (non-MATA) channel;
- 2'b00: default value, count the events which sent to the both uring and
uring_ext channel;
- 2'b00: default value, count the events which sent to both uring and
uring_ext channels;
6. ch: NoC PMU supports filtering the event counts of a certain transaction
channel with this option. The currently supported channels are as follows:
- 3'b010: Request channel
- 3'b100: Snoop channel
- 3'b110: Response channel
- 3'b111: Data channel
7. tt_en: NoC PMU supports counting only transactions that have tracetag set
if this option is set. See the 2nd list for more information about tracetag.
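The channel codes listed for the ``ch`` option can be sketched as a small lookup. This is only an illustration: the helper name ``noc_ch_term`` is hypothetical, and it assumes the filter is passed to perf as a ``ch=`` config term in the same way as the other options in this list.

```shell
#!/bin/sh
# Hypothetical helper: map a NoC transaction channel name to the 3-bit "ch"
# filter value from the list above and print it as a perf config term.
noc_ch_term() {
    case $1 in
        request)  printf 'ch=0x2\n' ;;  # 3'b010
        snoop)    printf 'ch=0x4\n' ;;  # 3'b100
        response) printf 'ch=0x6\n' ;;  # 3'b110
        data)     printf 'ch=0x7\n' ;;  # 3'b111
        *)
            echo "unknown channel: $1" >&2
            return 1
            ;;
    esac
}

noc_ch_term snoop
```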
For HiSilicon uncore PMU v3, whose identifier is 0x40, some uncore PMUs are
further divided into parts for finer granularity of tracing. Each part has its
own dedicated PMU, and all such PMUs together cover the monitoring of events
on a particular uncore device. Such PMUs are described in sysfs with a slightly
changed name format::
/sys/bus/event_source/devices/hisi_sccl{X}_<l3c{Y}_{Z}/ddrc{Y}_{Z}/noc{Y}_{Z}>
Z is the sub-id, indicating the different PMUs that each cover part of the
hardware device.
Usage of most PMUs with different sub-ids is identical. The L3C PMU additionally
provides the ``ext`` option to allow exploration of even finer-grained statistics
of the L3C PMU. The L3C PMU driver uses that as a hint of termination when
delivering the perf command to hardware:
- ext=0: Default, could be used with event names.
- ext=1 and ext=2: Must be used with event codes, event names are not supported.
An example of perf command could be::
$# perf stat -a -e hisi_sccl0_l3c1_0/rd_spipe/ sleep 5
or::
$# perf stat -a -e hisi_sccl0_l3c1_0/event=0x1,ext=1/ sleep 5
As above, ``hisi_sccl0_l3c1_0`` locates the PMU of Super CPU Cluster 0, L3 cache 1
pipe0.
The first command locates the first part of L3C since ``ext=0`` is implied by
default. The second command issues the counting on another part of L3C with the
event ``0x1``.
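The naming and ``ext`` rules for the L3C sub-PMUs can be sketched as follows. The helper names are hypothetical, and treating an event code as valid with the default ``ext=0`` is an assumption; the text only states that names are not supported with ``ext=1`` and ``ext=2``.

```shell
#!/bin/sh
# Hypothetical helper: name of an L3C sub-PMU, hisi_sccl{X}_l3c{Y}_{Z}.
hisi_l3c_pmu() {
    printf 'hisi_sccl%s_l3c%s_%s\n' "$1" "$2" "$3"
}

# Build the event term for a given ext value (default 0).
# ext=0: event names are accepted (codes assumed to be accepted as well).
# ext=1 or ext=2: only a hexadecimal event code is accepted.
hisi_l3c_event() {
    event=$1
    ext=${2:-0}
    case $ext in
        0)
            case $event in
                0x*) printf 'event=%s\n' "$event" ;;
                *)   printf '%s\n' "$event" ;;
            esac
            ;;
        1|2)
            case $event in
                0x*) printf 'event=%s,ext=%s\n' "$event" "$ext" ;;
                *)
                    echo "ext=$ext requires an event code" >&2
                    return 1
                    ;;
            esac
            ;;
        *)
            echo "unsupported ext value: $ext" >&2
            return 1
            ;;
    esac
}

# Print (do not run) the two commands from the examples above.
pmu=$(hisi_l3c_pmu 0 1 0)
echo "perf stat -a -e $pmu/$(hisi_l3c_event rd_spipe)/ sleep 5"
echo "perf stat -a -e $pmu/$(hisi_l3c_event 0x1 1)/ sleep 5"
```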
Users could configure IDs to count data coming from a specific CCL/ICL, by setting
srcid_cmd & srcid_msk, and data destined for a specific CCL/ICL by setting


@@ -29,3 +29,4 @@ Performance monitor support
cxl
ampere_cspmu
mrvl-pem-pmu
fujitsu_uncore_pmu


@@ -274,10 +274,6 @@ are the following:
The time it takes to switch the CPUs belonging to this policy from one
P-state to another, in nanoseconds.
If unknown or if known to be so high that the scaling driver does not
work with the `ondemand`_ governor, -1 (:c:macro:`CPUFREQ_ETERNAL`)
will be returned by reads from this attribute.
``related_cpus``
List of all (online and offline) CPUs belonging to this policy.


@@ -273,7 +273,7 @@ again.
does nothing at all; in that case you have to manually install your kernel,
as outlined in the reference section.
If you are running a immutable Linux distribution, check its documentation
If you are running an immutable Linux distribution, check its documentation
and the web to find out how to install your own kernel there.
[:ref:`details<install>`]
@@ -884,7 +884,7 @@ When a build error occurs, it might be caused by some aspect of your machine's
setup that often can be fixed quickly; other times though the problem lies in
the code and can only be fixed by a developer. A close examination of the
failure messages coupled with some research on the internet will often tell you
which of the two it is. To perform such a investigation, restart the build
which of the two it is. To perform such an investigation, restart the build
process like this::
make V=1


@@ -611,7 +611,7 @@ better place.
How to read the MAINTAINERS file
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To illustrate how to use the :ref:`MAINTAINERS <maintainers>` file, lets assume
To illustrate how to use the :ref:`MAINTAINERS <maintainers>` file, let's assume
the WiFi in your Laptop suddenly misbehaves after updating the kernel. In that
case it's likely an issue in the WiFi driver. Obviously it could also be some
code it builds upon, but unless you suspect something like that stick to the
@@ -1543,7 +1543,7 @@ as well, because that will speed things up.
And note, it helps developers a great deal if you can specify the exact version
that introduced the problem. Hence if possible within a reasonable time frame,
try to find that version using vanilla kernels. Lets assume something broke when
try to find that version using vanilla kernels. Let's assume something broke when
your distributor released an update from Linux kernel 5.10.5 to 5.10.8. Then as
instructed above go and check the latest kernel from that version line, say
5.10.9. If it shows the problem, try a vanilla 5.10.5 to ensure that no patches


@@ -164,8 +164,8 @@ pipe-user-pages-soft
--------------------
Maximum total number of pages a non-privileged user may allocate for pipes
before the pipe size gets limited to a single page. Once this limit is reached,
new pipes will be limited to a single page in size for this user in order to
before the pipe size gets limited to two pages. Once this limit is reached,
new pipes will be limited to two pages in size for this user in order to
limit total memory usage, and trying to increase them using ``fcntl()`` will be
denied until usage goes below the limit again. The default value allows
allocating up to 1024 pipes at their default size. When set to 0, no limit is


@@ -66,25 +66,31 @@ This documentation is about:
=============== ===============================================================
abi/ execution domains & personalities
debug/ <empty>
dev/ device specific information (eg dev/cdrom/info)
<$ARCH> tuning controls for various CPU architecture (e.g. csky, s390)
crypto/ <undocumented>
debug/ <undocumented>
dev/ device specific information (e.g. dev/cdrom/info)
fs/ specific filesystems
filehandle, inode, dentry and quota tuning
binfmt_misc <Documentation/admin-guide/binfmt-misc.rst>
kernel/ global kernel info / tuning
miscellaneous stuff
some architecture-specific controls
security (LSM) stuff
net/ networking stuff, for documentation look in:
<Documentation/networking/>
proc/ <empty>
sunrpc/ SUN Remote Procedure Call (NFS)
user/ Per user namespace limits
vm/ memory management tuning
buffer and cache management
user/ Per user per user namespace limits
xen/ <undocumented>
=============== ===============================================================
These are the subdirs I have on my system. There might be more
or other subdirs in another setup. If you see another dir, I'd
really like to hear about it :-)
These are the subdirs I have on my system or have been discovered by
searching through the source code. There might be more or other subdirs
in another setup. If you see another dir, I'd really like to hear about
it :-)
.. toctree::
:maxdepth: 1


@@ -890,7 +890,7 @@ bit 1 print system memory info
bit 2 print timer info
bit 3 print locks info if ``CONFIG_LOCKDEP`` is on
bit 4 print ftrace buffer
bit 5 replay all messages on consoles at the end of panic
bit 5 replay all kernel messages on consoles at the end of panic
bit 6 print all CPUs backtrace (if available in the arch)
bit 7 print only tasks in uninterruptible (blocked) state
===== ============================================
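Because each row of the table is one bit of a mask, a value is formed by OR-ing together the selected bits. A minimal sketch, assuming the table belongs to the ``panic_print`` sysctl (the name is not visible in this hunk) and using a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: combine the given bit numbers into a decimal mask.
panic_print_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '%d\n' "$mask"
}

# Bits 1, 5 and 6: system memory info, replay of kernel messages on
# consoles, and backtraces of all CPUs.
panic_print_mask 1 5 6
```

The helper only computes the number; writing it to the sysctl is left to the administrator.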


@@ -222,6 +222,8 @@ rmem_max
The maximum receive socket buffer size in bytes.
Default: 4194304
rps_default_mask
----------------
@@ -247,6 +249,8 @@ wmem_max
The maximum send socket buffer size in bytes.
Default: 4194304
message_burst and message_cost
------------------------------


@@ -1757,7 +1757,7 @@ or all of these tasks:
to your bootloader's configuration.
You have to take care of some or all of the tasks yourself, if your
distribution lacks a installkernel script or does only handle part of them.
distribution lacks an installkernel script or does only handle part of them.
Consult the distribution's documentation for details. If in doubt, install the
kernel manually::


@@ -34,22 +34,6 @@ When mounting an XFS filesystem, the following options are accepted.
to the file. Specifying a fixed ``allocsize`` value turns off
the dynamic behaviour.
attr2 or noattr2
The options enable/disable an "opportunistic" improvement to
be made in the way inline extended attributes are stored
on-disk. When the new form is used for the first time when
``attr2`` is selected (either when setting or removing extended
attributes) the on-disk superblock feature bit field will be
updated to reflect this format being in use.
The default behaviour is determined by the on-disk feature
bit indicating that ``attr2`` behaviour is active. If either
mount option is set, then that becomes the new default used
by the filesystem.
CRC enabled filesystems always use the ``attr2`` format, and so
will reject the ``noattr2`` mount option if it is set.
discard or nodiscard (default)
Enable/disable the issuing of commands to let the block
device reclaim space freed by the filesystem. This is
@@ -75,12 +59,6 @@ When mounting an XFS filesystem, the following options are accepted.
across the entire filesystem rather than just on directories
configured to use it.
ikeep or noikeep (default)
When ``ikeep`` is specified, XFS does not delete empty inode
clusters and keeps them around on disk. When ``noikeep`` is
specified, empty inode clusters are returned to the free
space pool.
inode32 or inode64 (default)
When ``inode32`` is specified, it indicates that XFS limits
inode creation to locations which will not result in inode
@@ -253,9 +231,8 @@ latest version and try again.
The deprecation will take place in two parts. Support for mounting V4
filesystems can now be disabled at kernel build time via Kconfig option.
The option will default to yes until September 2025, at which time it
will be changed to default to no. In September 2030, support will be
removed from the codebase entirely.
These options were changed to default to no in September 2025. In
September 2030, support will be removed from the codebase entirely.
Note: Distributors may choose to withdraw V4 format support earlier than
the dates listed above.
@@ -268,8 +245,6 @@ Deprecated Mount Options
============================ ================
Mounting with V4 filesystem September 2030
Mounting ascii-ci filesystem September 2030
ikeep/noikeep September 2025
attr2/noattr2 September 2025
============================ ================
@@ -285,6 +260,8 @@ Removed Mount Options
osyncisdsync/osyncisosync v4.0
barrier v4.19
nobarrier v4.19
ikeep/noikeep v6.18
attr2/noattr2 v6.18
=========================== =======
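Since ``ikeep``/``noikeep`` and ``attr2``/``noattr2`` now appear in the removed table, it may be worth checking existing mount option strings (for example from /etc/fstab) before upgrading. A minimal sketch with a hypothetical helper name; it covers only the v6.18 rows of the table and assumes a plain comma-separated option string:

```shell
#!/bin/sh
# Hypothetical helper: print the removed XFS mount options (v6.18 rows only)
# that occur in a comma-separated option string, separated by spaces.
xfs_removed_opts() {
    found=
    old_ifs=$IFS
    IFS=,
    # Unquoted on purpose so that the string is split on commas.
    for opt in $1; do
        case $opt in
            ikeep|noikeep|attr2|noattr2)
                found="${found:+$found }$opt"
                ;;
        esac
    done
    IFS=$old_ifs
    printf '%s\n' "$found"
}

xfs_removed_opts "rw,noatime,attr2,inode64,noikeep"
```

An empty line of output means none of these options were found.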
sysctls
@@ -312,9 +289,6 @@ The following sysctls are available for the XFS filesystem:
removes unused preallocation from clean inodes and releases
the unused space back to the free pool.
fs.xfs.speculative_cow_prealloc_lifetime
This is an alias for speculative_prealloc_lifetime.
fs.xfs.error_level (Min: 0 Default: 3 Max: 11)
A volume knob for error reporting when internal errors occur.
This will generate detailed messages & backtraces for filesystem
@@ -341,17 +315,6 @@ The following sysctls are available for the XFS filesystem:
This option is intended for debugging only.
fs.xfs.irix_symlink_mode (Min: 0 Default: 0 Max: 1)
Controls whether symlinks are created with mode 0777 (default)
or whether their mode is affected by the umask (irix mode).
fs.xfs.irix_sgid_inherit (Min: 0 Default: 0 Max: 1)
Controls files created in SGID directories.
If the group ID of the new file does not match the effective group
ID or one of the supplementary group IDs of the parent dir, the
ISGID bit is cleared if the irix_sgid_inherit compatibility sysctl
is set.
fs.xfs.inherit_sync (Min: 0 Default: 1 Max: 1)
Setting this to "1" will cause the "sync" flag set
by the **xfs_io(8)** chattr command on a directory to be
@@ -387,24 +350,20 @@ The following sysctls are available for the XFS filesystem:
Deprecated Sysctls
==================
=========================================== ================
Name Removal Schedule
=========================================== ================
fs.xfs.irix_sgid_inherit September 2025
fs.xfs.irix_symlink_mode September 2025
fs.xfs.speculative_cow_prealloc_lifetime September 2025
=========================================== ================
None currently.
Removed Sysctls
===============
============================= =======
Name Removed
============================= =======
fs.xfs.xfsbufd_centisec v4.0
fs.xfs.age_buffer_centisecs v4.0
============================= =======
========================================== =======
Name Removed
========================================== =======
fs.xfs.xfsbufd_centisec v4.0
fs.xfs.age_buffer_centisecs v4.0
fs.xfs.irix_symlink_mode v6.18
fs.xfs.irix_sgid_inherit v6.18
fs.xfs.speculative_cow_prealloc_lifetime v6.18
========================================== =======
Error handling
==============


@@ -15,7 +15,7 @@ It features:
- SD/MMC/SDIO support
- Ethernet controller
- USB OTG FS & HS controllers
- I2C, SPI, CAN busses support
- I2C, SPI, CAN buses support
- Several 16 & 32 bits general purpose timers
- Serial Audio interface
- LCD controller

Some files were not shown because too many files have changed in this diff