linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-06 01:49:46 +00:00

Author	SHA1	Message	Date
Sabrina Dubroca	8d2a2a49c3	xfrm: drop SA reference in xfrm_state_update if dir doesn't match We're not updating x1, but we still need to put() it. Fixes: `a4a87fa4e9` ("xfrm: Add Direction to the SA in or out") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2025-10-21 10:42:42 +02:00
Wang Liang	f584239a9e	net/smc: fix general protection fault in __smc_diag_dump The syzbot report a crash: Oops: general protection fault, probably for non-canonical address 0xfbd5a5d5a0000003: 0000 [#1] SMP KASAN NOPTI KASAN: maybe wild-memory-access in range [0xdead4ead00000018-0xdead4ead0000001f] CPU: 1 UID: 0 PID: 6949 Comm: syz.0.335 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025 RIP: 0010:smc_diag_msg_common_fill net/smc/smc_diag.c:44 [inline] RIP: 0010:__smc_diag_dump.constprop.0+0x3ca/0x2550 net/smc/smc_diag.c:89 Call Trace: <TASK> smc_diag_dump_proto+0x26d/0x420 net/smc/smc_diag.c:217 smc_diag_dump+0x27/0x90 net/smc/smc_diag.c:234 netlink_dump+0x539/0xd30 net/netlink/af_netlink.c:2327 __netlink_dump_start+0x6d6/0x990 net/netlink/af_netlink.c:2442 netlink_dump_start include/linux/netlink.h:341 [inline] smc_diag_handler_dump+0x1f9/0x240 net/smc/smc_diag.c:251 __sock_diag_cmd net/core/sock_diag.c:249 [inline] sock_diag_rcv_msg+0x438/0x790 net/core/sock_diag.c:285 netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2552 netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline] netlink_unicast+0x5a7/0x870 net/netlink/af_netlink.c:1346 netlink_sendmsg+0x8d1/0xdd0 net/netlink/af_netlink.c:1896 sock_sendmsg_nosec net/socket.c:714 [inline] __sock_sendmsg net/socket.c:729 [inline] ____sys_sendmsg+0xa95/0xc70 net/socket.c:2614 ___sys_sendmsg+0x134/0x1d0 net/socket.c:2668 __sys_sendmsg+0x16d/0x220 net/socket.c:2700 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0x4e0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> The process like this: (CPU1) \| (CPU2) ---------------------------------\|------------------------------- inet_create() \| // init clcsock to NULL \| sk = sk_alloc() \| \| // unexpectedly change clcsock \| inet_init_csk_locks() \| \| // add sk to hash table \| smc_inet_init_sock() \| smc_sk_init() \| smc_hash_sk() \| \| // traverse the hash table \| smc_diag_dump_proto \| __smc_diag_dump() \| // visit wrong clcsock \| smc_diag_msg_common_fill() // alloc clcsock \| smc_create_clcsk \| sock_create_kern \| With CONFIG_DEBUG_LOCK_ALLOC=y, the smc->clcsock is unexpectedly changed in inet_init_csk_locks(). The INET_PROTOSW_ICSK flag is no need by smc, just remove it. After removing the INET_PROTOSW_ICSK flag, this patch alse revert commit `6fd27ea183` ("net/smc: fix lacks of icsk_syn_mss with IPPROTO_SMC") to avoid casting smc_sock to inet_connection_sock. Reported-by: syzbot+f775be4458668f7d220e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f775be4458668f7d220e Tested-by: syzbot+f775be4458668f7d220e@syzkaller.appspotmail.com Fixes: `d25a92ccae` ("net/smc: Introduce IPPROTO_SMC") Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: D. Wythe <alibuda@linux.alibaba.com> Link: https://patch.msgid.link/20251017024827.3137512-1-wangliang74@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-20 17:46:06 -07:00
Jakub Kicinski	bbca867846	Merge branch 'fix-generating-skb-from-non-linear-xdp_buff-for-mlx5' Tariq Toukan says: ==================== Fix generating skb from non-linear xdp_buff for mlx5 Link: https://lore.kernel.org/20250915225857.3024997-1-ameryhung@gmail.com ==================== Link: https://patch.msgid.link/1760644540-899148-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-20 17:39:37 -07:00
Amery Hung	87bcef158a	net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ XDP programs can change the layout of an xdp_buff through bpf_xdp_adjust_tail() and bpf_xdp_adjust_head(). Therefore, the driver cannot assume the size of the linear data area nor fragments. Fix the bug in mlx5 by generating skb according to xdp_buff after XDP programs run. Currently, when handling multi-buf XDP, the mlx5 driver assumes the layout of an xdp_buff to be unchanged. That is, the linear data area continues to be empty and fragments remain the same. This may cause the driver to generate erroneous skb or triggering a kernel warning. When an XDP program added linear data through bpf_xdp_adjust_head(), the linear data will be ignored as mlx5e_build_linear_skb() builds an skb without linear data and then pull data from fragments to fill the linear data area. When an XDP program has shrunk the non-linear data through bpf_xdp_adjust_tail(), the delta passed to __pskb_pull_tail() may exceed the actual nonlinear data size and trigger the BUG_ON in it. To fix the issue, first record the original number of fragments. If the number of fragments changes after the XDP program runs, rewind the end fragment pointer by the difference and recalculate the truesize. Then, build the skb with the linear data area matching the xdp_buff. Finally, only pull data in if there is non-linear data and fill the linear part up to 256 bytes. Fixes: `f52ac7028b` ("net/mlx5e: RX, Add XDP multi-buffer support in Striding RQ") Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1760644540-899148-3-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-20 17:39:13 -07:00
Amery Hung	afd5ba577c	net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ XDP programs can release xdp_buff fragments when calling bpf_xdp_adjust_tail(). The driver currently assumes the number of fragments to be unchanged and may generate skb with wrong truesize or containing invalid frags. Fix the bug by generating skb according to xdp_buff after the XDP program runs. Fixes: `ea5d49bdae` ("net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ") Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1760644540-899148-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-20 17:39:13 -07:00
Xin Long	a73ca0449b	selftests: net: fix server bind failure in sctp_vrf.sh sctp_vrf.sh could fail: TEST 12: bind vrf-2 & 1 in server, connect from client 1 & 2, N [FAIL] not ok 1 selftests: net: sctp_vrf.sh # exit=3 The failure happens when the server bind in a new run conflicts with an existing association from the previous run: [1] ip netns exec $SERVER_NS ./sctp_hello server ... [2] ip netns exec $CLIENT_NS ./sctp_hello client ... [3] ip netns exec $SERVER_NS pkill sctp_hello ... [4] ip netns exec $SERVER_NS ./sctp_hello server ... It occurs if the client in [2] sends a message and closes immediately. With the message unacked, no SHUTDOWN is sent. Killing the server in [3] triggers a SHUTDOWN the client also ignores due to the unacked message, leaving the old association alive. This causes the bind at [4] to fail until the message is acked and the client responds to a second SHUTDOWN after the server’s T2 timer expires (3s). This patch fixes the issue by preventing the client from sending data. Instead, the client blocks on recv() and waits for the server to close. It also waits until both the server and the client sockets are fully released in stop_server and wait_client before restarting. Additionally, replace 2>&1 >/dev/null with -q in sysctl and grep, and drop other redundant 2>&1 >/dev/null redirections, and fix a typo from N to Y (connect successfully) in the description of the last test. Fixes: `a61bd7b9fe` ("selftests: add a selftest for sctp vrf") Reported-by: Hangbin Liu <liuhangbin@gmail.com> Tested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/be2dacf52d0917c4ba5e2e8c5a9cb640740ad2b6.1760731574.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-20 16:41:33 -07:00
Aleksander Jan Bajkowski	ffff5c8fc2	net: phy: realtek: fix rtl8221b-vm-cg name When splitting the RTL8221B-VM-CG into C22 and C45 variants, the name was accidentally changed to RTL8221B-VN-CG. This patch brings back the previous part number. Fixes: `ad5ce743a6` ("net: phy: realtek: Add driver instances for rtl8221b via Clause 45") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251016192325.2306757-1-olek2@wp.pl Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-17 16:34:37 -07:00
Ioana Ciornei	902e81e679	dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path The blamed commit increased the needed headroom to account for alignment. This means that the size required to always align a Tx buffer was added inside the dpaa2_eth_needed_headroom() function. By doing that, a manual adjustment of the pointer passed to PTR_ALIGN() was no longer correct since the 'buffer_start' variable was already pointing to the start of the skb's memory. The behavior of the dpaa2-eth driver without this patch was to drop frames on Tx even when the headroom was matching the 128 bytes necessary. Fix this by removing the manual adjust of 'buffer_start' from the PTR_MODE call. Closes: https://lore.kernel.org/netdev/70f0dcd9-1906-4d13-82df-7bbbbe7194c6@app.fastmail.com/T/#u Fixes: `f422abe3f2` ("dpaa2-eth: increase the needed headroom to account for alignment") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Mathew McBride <matt@traverse.com.au> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251016135807.360978-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-17 16:31:40 -07:00
Tonghao Zhang	e0caeb24f5	net: bonding: update the slave array for broadcast mode This patch fixes `ce7a381697` ("net: bonding: add broadcast_neighbor option for 802.3ad"). Before this commit, on the broadcast mode, all devices were traversed using the bond_for_each_slave_rcu. This patch supports traversing devices by using all_slaves. Therefore, we need to update the slave array when enslave or release slave. Fixes: `ce7a381697` ("net: bonding: add broadcast_neighbor option for 802.3ad") Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: <stable@vger.kernel.org> Reported-by: Jiri Slaby <jirislaby@kernel.org> Tested-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/all/a97e6e1e-81bc-4a79-8352-9e4794b0d2ca@kernel.org/ Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Link: https://patch.msgid.link/20251016125136.16568-1-tonghao@bamaicloud.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-17 16:29:59 -07:00
Bagas Sanjaya	cb74f8c952	Documentation: net: net_failover: Separate cloud-ifupdown-helper and reattach-vf.sh code blocks marker cloud-ifupdown-helper patch and reattach-vf.sh script are rendered in htmldocs output as normal paragraphs instead of literal code blocks due to missing separator from respective code block marker. Add it. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251016093936.29442-2-bagasdotme@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-17 16:28:29 -07:00
Wei Fang	e59bc32df2	net: enetc: correct the value of ENETC_RXB_TRUESIZE The ENETC RX ring uses the page halves flipping mechanism, each page is split into two halves for the RX ring to use. And ENETC_RXB_TRUESIZE is defined to 2048 to indicate the size of half a page. However, the page size is configurable, for ARM64 platform, PAGE_SIZE is default to 4K, but it could be configured to 16K or 64K. When PAGE_SIZE is set to 16K or 64K, ENETC_RXB_TRUESIZE is not correct, and the RX ring will always use the first half of the page. This is not consistent with the description in the relevant kernel doc and commit messages. This issue is invisible in most cases, but if users want to increase PAGE_SIZE to receive a Jumbo frame with a single buffer for some use cases, it will not work as expected, because the buffer size of each RX BD is fixed to 2048 bytes. Based on the above two points, we expect to correct ENETC_RXB_TRUESIZE to (PAGE_SIZE >> 1), as described in the comment. Fixes: `d4fd0404c1` ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://patch.msgid.link/20251016080131.3127122-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-17 16:27:58 -07:00
Jianpeng Chang	50bd33f6b3	net: enetc: fix the deadlock of enetc_mdio_lock After applying the workaround for err050089, the LS1028A platform experiences RCU stalls on RT kernel. This issue is caused by the recursive acquisition of the read lock enetc_mdio_lock. Here list some of the call stacks identified under the enetc_poll path that may lead to a deadlock: enetc_poll -> enetc_lock_mdio -> enetc_clean_rx_ring OR napi_complete_done -> napi_gro_receive -> enetc_start_xmit -> enetc_lock_mdio -> enetc_map_tx_buffs -> enetc_unlock_mdio -> enetc_unlock_mdio After enetc_poll acquires the read lock, a higher-priority writer attempts to acquire the lock, causing preemption. The writer detects that a read lock is already held and is scheduled out. However, readers under enetc_poll cannot acquire the read lock again because a writer is already waiting, leading to a thread hang. Currently, the deadlock is avoided by adjusting enetc_lock_mdio to prevent recursive lock acquisition. Fixes: `6d36ecdbc4` ("net: enetc: take the MDIO lock only once per NAPI poll cycle") Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com> Acked-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251015021427.180757-1-jianpeng.chang.cn@windriver.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-17 16:12:44 -07:00
Sebastian Reichel	7f864458e9	net: stmmac: dwmac-rk: Fix disabling set_clock_selection On all platforms set_clock_selection() writes to a GRF register. This requires certain clocks running and thus should happen before the clocks are disabled. This has been noticed on RK3576 Sige5, which hangs during system suspend when trying to suspend the second network interface. Note, that suspending the first interface works, because the second device ensures that the necessary clocks for the GRF are enabled. Cc: stable@vger.kernel.org Fixes: `2f2b60a0ec` ("net: ethernet: stmmac: dwmac-rk: Add gmac support for rk3588") Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251014-rockchip-network-clock-fix-v1-1-c257b4afdf75@collabora.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 17:12:46 -07:00
Johannes Wiesböck	bf29555f5b	rtnetlink: Allow deleting FDB entries in user namespace Creating FDB entries is possible from a non-initial user namespace when having CAP_NET_ADMIN, yet, when deleting FDB entries, processes receive an EPERM because the capability is always checked against the initial user namespace. This restricts the FDB management from unprivileged containers. Drop the netlink_capable check in rtnl_fdb_del as it was originally dropped in `c5c351088a` and reintroduced in `1690be63a2` without intention. This patch was tested using a container on GyroidOS, where it was possible to delete FDB entries from an unprivileged user namespace and private network namespace. Fixes: `1690be63a2` ("bridge: Add vlan support to static neighbors") Reviewed-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Tested-by: Harshal Gohel <hg@simonwunderlich.de> Signed-off-by: Johannes Wiesböck <johannes.wiesboeck@aisec.fraunhofer.de> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20251015201548.319871-1-johannes.wiesboeck@aisec.fraunhofer.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 16:09:56 -07:00
Bagas Sanjaya	1b0124ad50	net: rmnet: Fix checksum offload header v5 and aggregation packet formatting Packet format for checksum offload header v5 and aggregation, and header type table for the former, are shown in normal paragraphs instead. Use appropriate markup. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251015092540.32282-2-bagasdotme@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 15:50:52 -07:00
Cosmin Ratiu	5348d63124	net/mlx5e: psp, avoid 'accel' NULL pointer dereference The 'accel' parameter of mlx5e_txwqe_build_eseg_csum() and the similar 'state' parameter of mlx5e_accel_tx_ids_len() were NULL when called from mlx5i_sq_xmit() and were causing kernel panics from that context. Fix that by passing in a local empty mlx5e_accel_tx_state variable, thus guaranteeing that 'accel' is never NULL. Also remove an unnecessary check from mlx5e_tx_wqe_inline_mode(). Fixes: `e5a1861a29` ("net/mlx5e: Implement PSP Tx data path") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1760511923-890650-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 15:43:15 -07:00
Eric Dumazet	d0d3e9c286	net: gro: clear skb_shinfo(skb)->hwtstamps in napi_reuse_skb() Some network drivers assume this field is zero after napi_get_frags(). We must clear it in napi_reuse_skb() otherwise the following can happen: 1) A packet is received, and skb_shinfo(skb)->hwtstamps is populated because a bit in the receive descriptor announced hwtstamp availability for this packet. 2) Packet is given to gro layer via napi_gro_frags(). 3) Packet is merged to a prior one held in GRO queues. 4) skb is saved after some cleanup in napi->skb via a call to napi_reuse_skb(). 5) Next packet is received 10 seconds later, gets the recycled skb from napi_get_frags(). 6) The receive descriptor does not announce hwtstamp availability. Driver does not clear shinfo->hwtstamps. 7) We have in shinfo->hwtstamps an old timestamp. Fixes: `ac45f602ee` ("net: infrastructure for hardware time stamping") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://patch.msgid.link/20251015063221.4171986-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 15:42:49 -07:00
Nathan Chancellor	aaf043a568	net/mlx5e: Return 1 instead of 0 in invalid case in mlx5e_mpwrq_umr_entry_size() When building with Clang 20 or newer, there are some objtool warnings from unexpected fallthroughs to other functions: vmlinux.o: warning: objtool: mlx5e_mpwrq_mtts_per_wqe() falls through to next function mlx5e_mpwrq_max_num_entries() vmlinux.o: warning: objtool: mlx5e_mpwrq_max_log_rq_size() falls through to next function mlx5e_get_linear_rq_headroom() LLVM 20 contains an (admittedly problematic [1]) optimization [2] to convert divide by zero into the equivalent of __builtin_unreachable(), which invokes undefined behavior and destroys code generation when it is encountered in a control flow graph. mlx5e_mpwrq_umr_entry_size() returns 0 in the default case of an unrecognized mlx5e_mpwrq_umr_mode value. mlx5e_mpwrq_mtts_per_wqe(), which is inlined into mlx5e_mpwrq_max_log_rq_size(), uses the result of mlx5e_mpwrq_umr_entry_size() in a divide operation without checking for zero, so LLVM is able to infer there will be a divide by zero in this case and invokes undefined behavior. While there is some proposed work to isolate this undefined behavior and avoid the destructive code generation that results in these objtool warnings, code should still be defensive against divide by zero. As the WARN_ONCE() implies that an invalid value should be handled gracefully, return 1 instead of 0 in the default case so that the results of this division operation is always valid. Fixes: `168723c1f8` ("net/mlx5e: xsk: Use umr_mode to calculate striding RQ parameters") Link: https://lore.kernel.org/CAGG=3QUk8-Ak7YKnRziO4=0z=1C_7+4jF+6ZeDQ9yF+kuTOHOQ@mail.gmail.com/ [1] Link: `37932643ab` [2] Closes: https://github.com/ClangBuiltLinux/linux/issues/2131 Closes: https://github.com/ClangBuiltLinux/linux/issues/2132 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20251014-mlx5e-avoid-zero-div-from-mlx5e_mpwrq_umr_entry_size-v1-1-dc186b8819ef@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 15:12:41 -07:00
Michal Pecio	75cea9860a	net: usb: rtl8150: Fix frame padding TX frames aren't padded and unknown memory is sent into the ether. Theoretically, it isn't even guaranteed that the extra memory exists and can be sent out, which could cause further problems. In practice, I found that plenty of tailroom exists in the skb itself (in my test with ping at least) and skb_padto() easily succeeds, so use it here. In the event of -ENOMEM drop the frame like other drivers do. The use of one more padding byte instead of a USB zero-length packet is retained to avoid regression. I have a dodgy Etron xHCI controller which doesn't seem to support sending ZLPs at all. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Michal Pecio <michal.pecio@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251014203528.3f9783c4.michal.pecio@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-16 15:10:02 -07:00
Linus Torvalds	634ec1fc79	Merge tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN Current release - regressions: - udp: do not use skb_release_head_state() before skb_attempt_defer_free() - gro_cells: use nested-BH locking for gro_cell - dpll: zl3073x: increase maximum size of flash utility Previous releases - regressions: - core: fix lockdep splat on device unregister - tcp: fix tcp_tso_should_defer() vs large RTT - tls: - don't rely on tx_work during send() - wait for pending async decryptions if tls_strp_msg_hold fails - can: j1939: add missing calls in NETDEV_UNREGISTER notification handler - eth: lan78xx: fix lost EEPROM write timeout in lan78xx_write_raw_eeprom Previous releases - always broken: - ip6_tunnel: prevent perpetual tunnel growth - dpll: zl3073x: handle missing or corrupted flash configuration - can: m_can: fix pm_runtime and CAN state handling - eth: - ixgbe: fix too early devlink_free() in ixgbe_remove() - ixgbevf: fix mailbox API compatibility - gve: Check valid ts bit on RX descriptor before hw timestamping - idpf: cleanup remaining SKBs in PTP flows - r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H" * tag 'net-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) udp: do not use skb_release_head_state() before skb_attempt_defer_free() net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset netdevsim: set the carrier when the device goes up selftests: tls: add test for short splice due to full skmsg selftests: net: tls: add tests for cmsg vs MSG_MORE tls: don't rely on tx_work during send() tls: wait for pending async decryptions if tls_strp_msg_hold fails tls: always set record_type in tls_process_cmsg tls: wait for async encrypt in case of error during latter iterations of sendmsg tls: trim encrypted message to match the plaintext on short splice tg3: prevent use of uninitialized remote_adv and local_adv variables MAINTAINERS: new entry for IPv6 IOAM gve: Check valid ts bit on RX descriptor before hw timestamping net: core: fix lockdep splat on device unregister MAINTAINERS: add myself as maintainer for b53 selftests: net: check jq command is supported net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit() tcp: fix tcp_tso_should_defer() vs large RTT r8152: add error handling in rtl8152_driver_init usbnet: Fix using smp_processor_id() in preemptible code warnings ...	2025-10-16 09:41:21 -07:00
Linus Torvalds	ef25485516	Merge tag 'ata-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Niklas Cassel: - Do not print an error message (and assume that the General Purpose Log Directory log page is not supported) for a device that reports a bogus General Purpose Logging Version. Unsurprisingly, many vendors fail to report the only valid General Purpose Logging Version (Damien) * tag 'ata-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata-core: relax checks in ata_read_log_directory()	2025-10-16 09:39:29 -07:00
Eric Dumazet	6de1dec1c1	udp: do not use skb_release_head_state() before skb_attempt_defer_free() Michal reported and bisected an issue after recent adoption of skb_attempt_defer_free() in UDP. The issue here is that skb_release_head_state() is called twice per skb, one time from skb_consume_udp(), then a second time from skb_defer_free_flush() and napi_consume_skb(). As Sabrina suggested, remove skb_release_head_state() call from skb_consume_udp(). Add a DEBUG_NET_WARN_ON_ONCE(skb_nfct(skb)) in skb_attempt_defer_free() Many thanks to Michal, Sabrina, Paolo and Florian for their help. Fixes: `6471658dc6` ("udp: use skb_attempt_defer_free()") Reported-and-bisected-by: Michal Kubecek <mkubecek@suse.cz> Closes: https://lore.kernel.org/netdev/gpjh4lrotyephiqpuldtxxizrsg6job7cvhiqrw72saz2ubs3h@g6fgbvexgl3r/ Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Michal Kubecek <mkubecek@suse.cz> Cc: Sabrina Dubroca <sd@queasysnail.net> Cc: Florian Westphal <fw@strlen.de> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/20251015052715.4140493-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-16 16:03:07 +02:00
I Viswanath	8d93ff40d4	net: usb: lan78xx: fix use of improperly initialized dev->chipid in lan78xx_reset dev->chipid is used in lan78xx_init_mac_address before it's initialized: lan78xx_reset() { lan78xx_init_mac_address() lan78xx_read_eeprom() lan78xx_read_raw_eeprom() <- dev->chipid is used here dev->chipid = ... <- dev->chipid is initialized correctly here } Reorder initialization so that dev->chipid is set before calling lan78xx_init_mac_address(). Fixes: `a0db7d10b7` ("lan78xx: Add to handle mux control per chip id") Signed-off-by: I Viswanath <viswanathiyyappan@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Khalid Aziz <khalid@kernel.org> Link: https://patch.msgid.link/20251013181648.35153-1-viswanathiyyappan@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 18:27:48 -07:00
Jakub Kicinski	5e655aadda	Merge tag 'linux-can-fixes-for-6.18-20251014' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2025-10-14 The first 2 paches are by Celeste Liu and target the gS_usb driver. The first patch remove the limitation to 3 CAN interface per USB device. The second patch adds the missing population of net_device->dev_port. The next 4 patches are by me and fix the m_can driver. They add a missing pm_runtime_disable(), fix the CAN state transition back to Error Active and fix the state after ifup and suspend/resume. Another patch by me targets the m_can driver, too and replaces Dong Aisheng's old email address. The next 2 patches are by Vincent Mailhol and update the CAN networking Documentation. Tetsuo Handa contributes the last patch that add missing cleanup calls in the NETDEV_UNREGISTER notification handler. * tag 'linux-can-fixes-for-6.18-20251014' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: j1939: add missing calls in NETDEV_UNREGISTER notification handler can: add Transmitter Delay Compensation (TDC) documentation can: remove false statement about 1:1 mapping between DLC and length can: m_can: replace Dong Aisheng's old email address can: m_can: fix CAN state in system PM can: m_can: m_can_chip_config(): bring up interface in correct state can: m_can: m_can_handle_state_errors(): fix CAN state transition to Error Active can: m_can: m_can_plat_remove(): add missing pm_runtime_disable() can: gs_usb: gs_make_candev(): populate net_device->dev_port can: gs_usb: increase max interface to U8_MAX ==================== Link: https://patch.msgid.link/20251014122140.990472-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:56:20 -07:00
Breno Leitao	1a8fed52f7	netdevsim: set the carrier when the device goes up Bringing a linked netdevsim device down and then up causes communication failure because both interfaces lack carrier. Basically a ifdown/ifup on the interface make the link broken. Commit `3762ec05a9` ("netdevsim: add NAPI support") added supported for NAPI, calling netif_carrier_off() in nsim_stop(). This patch re-enables the carrier symmetrically on nsim_open(), in case the device is linked and the peer is up. Signed-off-by: Breno Leitao <leitao@debian.org> Fixes: `3762ec05a9` ("netdevsim: add NAPI support") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251014-netdevsim_fix-v2-1-53b40590dae1@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:43:17 -07:00
Jakub Kicinski	cf51d617c3	Merge branch 'tls-misc-bugfixes' Sabrina Dubroca says: ==================== tls: misc bugfixes Jann Horn reported multiple bugs in kTLS. This series addresses them, and adds some corresponding selftests for those that are reproducible (and without failure injection). ==================== Link: https://patch.msgid.link/cover.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:48 -07:00
Sabrina Dubroca	3667e9b442	selftests: tls: add test for short splice due to full skmsg We don't have a test triggering a partial splice caused by a full skmsg. Add one, based on a program by Jann Horn. Use MAX_FRAGS=48 to make sure the skmsg will be full for any allowed value of CONFIG_MAX_SKB_FRAGS (17..45). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/1d129a15f526ea3602f3a2b368aa0b6f7e0d35d5.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:46 -07:00
Sabrina Dubroca	f95fce1e95	selftests: net: tls: add tests for cmsg vs MSG_MORE We don't have a test to check that MSG_MORE won't let us merge records of different types across sendmsg calls. Add new tests that check: - MSG_MORE is only allowed for DATA records - a pending DATA record gets closed and pushed before a non-DATA record is processed Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/b34feeadefe8a997f068d5ed5617afd0072df3c0.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:45 -07:00
Sabrina Dubroca	7f846c65ca	tls: don't rely on tx_work during send() With async crypto, we rely on tx_work to actually transmit records once encryption completes. But while send() is running, both the tx_lock and socket lock are held, so tx_work_handler cannot process the queue of encrypted records, and simply reschedules itself. During a large send(), this could last a long time, and use a lot of memory. Transmit any pending encrypted records before restarting the main loop of tls_sw_sendmsg_locked. Fixes: `a42055e8d2` ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/8396631478f70454b44afb98352237d33f48d34d.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:45 -07:00
Sabrina Dubroca	b8a6ff84ab	tls: wait for pending async decryptions if tls_strp_msg_hold fails Async decryption calls tls_strp_msg_hold to create a clone of the input skb to hold references to the memory it uses. If we fail to allocate that clone, proceeding with async decryption can lead to various issues (UAF on the skb, writing into userspace memory after the recv() call has returned). In this case, wait for all pending decryption requests. Fixes: `84c61fe1a7` ("tls: rx: do not use the standard strparser") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/b9fe61dcc07dab15da9b35cf4c7d86382a98caf2.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:45 -07:00
Sabrina Dubroca	b6fe4c29bb	tls: always set record_type in tls_process_cmsg When userspace wants to send a non-DATA record (via the TLS_SET_RECORD_TYPE cmsg), we need to send any pending data from a previous MSG_MORE send() as a separate DATA record. If that DATA record is encrypted asynchronously, tls_handle_open_record will return -EINPROGRESS. This is currently treated as an error by tls_process_cmsg, and it will skip setting record_type to the correct value, but the caller (tls_sw_sendmsg_locked) handles that return value correctly and proceeds with sending the new message with an incorrect record_type (DATA instead of whatever was requested in the cmsg). Always set record_type before handling the open record. If tls_handle_open_record returns an error, record_type will be ignored. If it succeeds, whether with synchronous crypto (returning 0) or asynchronous (returning -EINPROGRESS), the caller will proceed correctly. Fixes: `a42055e8d2` ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/0457252e578a10a94e40c72ba6288b3a64f31662.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:45 -07:00
Sabrina Dubroca	b014a4e066	tls: wait for async encrypt in case of error during latter iterations of sendmsg If we hit an error during the main loop of tls_sw_sendmsg_locked (eg failed allocation), we jump to send_end and immediately return. Previous iterations may have queued async encryption requests that are still pending. We should wait for those before returning, as we could otherwise be reading from memory that userspace believes we're not using anymore, which would be a sort of use-after-free. This is similar to what tls_sw_recvmsg already does: failures during the main loop jump to the "wait for async" code, not straight to the unlock/return. Fixes: `a42055e8d2` ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/c793efe9673b87f808d84fdefc0f732217030c52.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:45 -07:00
Sabrina Dubroca	ce5af41e32	tls: trim encrypted message to match the plaintext on short splice During tls_sw_sendmsg_locked, we pre-allocate the encrypted message for the size we're expecting to send during the current iteration, but we may end up sending less, for example when splicing: if we're getting the data from small fragments of memory, we may fill up all the slots in the skmsg with less data than expected. In this case, we need to trim the encrypted message to only the length we actually need, to avoid pushing uninitialized bytes down the underlying TCP socket. Fixes: `fe1e81d4f7` ("tls/sw: Support MSG_SPLICE_PAGES") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/66a0ae99c9efc15f88e9e56c1f58f902f442ce86.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:41:45 -07:00
Alexey Simakov	0c3f2e6281	tg3: prevent use of uninitialized remote_adv and local_adv variables Some execution paths that jump to the fiber_setup_done label could leave the remote_adv and local_adv variables uninitialized and then use it. Initialize this variables at the point of definition to avoid this. Fixes: `85730a631f` ("tg3: Add SGMII phy support for 5719/5718 serdes") Co-developed-by: Alexandr Sapozhnikov <alsp705@gmail.com> Signed-off-by: Alexandr Sapozhnikov <alsp705@gmail.com> Signed-off-by: Alexey Simakov <bigalex934@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20251014164736.5890-1-bigalex934@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:16:49 -07:00
Justin Iurman	bc384963bc	MAINTAINERS: new entry for IPv6 IOAM Create a maintainer entry for IPv6 IOAM. Add myself as I authored most if not all of the IPv6 IOAM code in the kernel and actively participate in the related IETF groups. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20251014170650.27679-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 17:13:46 -07:00
Linus Torvalds	7ea30958b3	Merge tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Handle inode number mismatches in nsfs file handles - Update the comment to init_file() - Add documentation link for EBADF in the rust file code - Skip read lock assertion for read-only filesystems when using dax - Don't leak disconnected dentries during umount - Fix new coredump input pattern validation - Handle ENOIOCTLCMD conversion in vfs_fileattr_{g,s}et() correctly - Remove redundant IOCB_DIO_CALLER_COMP clearing in overlayfs * tag 'vfs-6.18-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ovl: remove redundant IOCB_DIO_CALLER_COMP clearing fs: return EOPNOTSUPP from file_setattr/file_getattr syscalls Revert "fs: make vfs_fileattr_[get\|set] return -EOPNOTSUPP" coredump: fix core_pattern input validation vfs: Don't leak disconnected dentries on umount dax: skip read lock assertion for read-only filesystems rust: file: add intra-doc link for 'EBADF' fs: update comment in init_file() nsfs: handle inode number mismatches gracefully in file handles	2025-10-15 15:12:58 -07:00
Tim Hostetler	bfdd74166a	gve: Check valid ts bit on RX descriptor before hw timestamping The device returns a valid bit in the LSB of the low timestamp byte in the completion descriptor that the driver should check before setting the SKB's hardware timestamp. If the timestamp is not valid, do not hardware timestamp the SKB. Cc: stable@vger.kernel.org Fixes: `b2c7aeb490` ("gve: Implement ndo_hwtstamp_get/set for RX timestamping") Reviewed-by: Joshua Washington <joshwash@google.com> Signed-off-by: Tim Hostetler <thostet@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251014004740.2775957-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-15 09:04:58 -07:00
Linus Torvalds	5a6f65d150	Merge tag 'bitmap-for-v6.18-rc2' of https://github.com/norov/linux Pull bitmap fixes from Yury Norov: "A "unnecessary `unsafe`" warning fix for bitmap/rust, and one leftover patch for FIELD_PREP_WM16() conversion. - rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning (Miguel) - FIELD_PREP_WM16() rework leftover (Nicolas)" * tag 'bitmap-for-v6.18-rc2' of https://github.com/norov/linux: PM / devfreq: rockchip-dfi: switch to FIELD_PREP_WM16 macro rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning	2025-10-15 08:48:17 -07:00
Linus Torvalds	1f4a222b0e	Remove long-stale ext3 defconfig option Inspired by commit `c065b6046b` ("Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs") I looked around for any other left-over EXT3 config options, and found some old defconfig files still mentioned CONFIG_EXT3_DEFAULTS_TO_ORDERED. That config option was removed a decade ago in commit `c290ea01ab` ("fs: Remove ext3 filesystem driver"). It had a good run, but let's remove it for good. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-10-15 07:57:28 -07:00
Linus Torvalds	66f8e4df00	Merge tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 bug fixes from Ted Ts'o: - Fix regression caused by removing CONFIG_EXT3_FS when testing some very old defconfigs - Avoid a BUG_ON when opening a file on a maliciously corrupted file system - Avoid mm warnings when freeing a very large orphan file metadata - Avoid a theoretical races between metadata writeback and checkpoints (it's very hard to hit in practice, since the race requires that the writeback take a very long time) * tag 'ext4_for_linus-6.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs ext4: free orphan info with kvfree ext4: detect invalid INLINE_DATA + EXTENTS flag combination ext4, doc: fix and improve directory hash tree description ext4: wait for ongoing I/O to complete before freeing blocks jbd2: ensure that all ongoing I/O complete before freeing blocks	2025-10-15 07:51:57 -07:00
Nicolas Frattaroli	7e85ac9da1	PM / devfreq: rockchip-dfi: switch to FIELD_PREP_WM16 macro The era of hand-rolled HIWORD_UPDATE macros is over, at least for those drivers that use constant masks. Like many other Rockchip drivers, rockchip-dfi brings with it its own HIWORD_UPDATE macro. This variant doesn't shift the value (and like the others, doesn't do any checking). Remove it, and replace instances of it with hw_bitfield.h's FIELD_PREP_WM16. Since FIELD_PREP_WM16 requires contiguous masks and shifts the value for us, some reshuffling of definitions needs to happen. This gives us better compile-time error checking, and in my opinion, nicer code. Tested on an RK3568 ODROID-M1 board (LPDDR4X at 1560 MHz, an RK3588 Radxa ROCK 5B board (LPDDR4X at 2112 MHz) and an RK3588 Radxa ROCK 5T board (LPDDR5 at 2400 MHz). perf measurements were consistent with the measurements of stress-ng --stream in all cases. Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>	2025-10-15 10:39:54 -04:00
Miguel Ojeda	0f5878834d	rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning Starting with Rust 1.92.0 (expected 2025-12-11), Rust allows to safely take the address of a union field [1][2]: CLIPPY L rust/kernel.o error: unnecessary `unsafe` block --> rust/kernel/bitmap.rs:169:13 \| 169 \| unsafe { core::ptr::addr_of!(self.repr.bitmap) } \| ^^^^^^ unnecessary `unsafe` block \| = note: `-D unused-unsafe` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_unsafe)]` error: unnecessary `unsafe` block --> rust/kernel/bitmap.rs:185:13 \| 185 \| unsafe { core::ptr::addr_of_mut!(self.repr.bitmap) } \| ^^^^^^ unnecessary `unsafe` block Thus allow both instances to clean the warning in newer compilers. Link: https://github.com/rust-lang/rust/issues/141264 [1] Link: https://github.com/rust-lang/rust/pull/141469 [2] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>	2025-10-15 10:39:54 -04:00
Florian Westphal	7f0fddd817	net: core: fix lockdep splat on device unregister Since blamed commit, unregister_netdevice_many_notify() takes the netdev mutex if the device needs it. If the device list is too long, this will lock more device mutexes than lockdep can handle: unshare -n \ bash -c 'for i in $(seq 1 100);do ip link add foo$i type dummy;done' BUG: MAX_LOCK_DEPTH too low! turning off the locking correctness validator. depth: 48 max: 48! 48 locks held by kworker/u16:1/69: #0: ..148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work #1: ..d40 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work #2: ..bd0 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net #3: ..aa8 (rtnl_mutex){+.+.}-{4:4}, at: default_device_exit_batch #4: ..cb0 (&dev_instance_lock_key#3){+.+.}-{4:4}, at: unregister_netdevice_many_notify [..] Add a helper to close and then unlock a list of net_devices. Devices that are not up have to be skipped - netif_close_many always removes them from the list without any other actions taken, so they'd remain in locked state. Close devices whenever we've used up half of the tracking slots or we processed entire list without hitting the limit. Fixes: `7e4d784f58` ("net: hold netdev instance lock during rtnetlink operations") Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20251013185052.14021-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-14 19:27:20 -07:00
Jonas Gorski	df5a1f4aeb	MAINTAINERS: add myself as maintainer for b53 I wrote the original OpenWrt driver that Florian used as the base for the dsa driver, I might as well take responsibility for it. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20251013180347.133246-1-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-10-14 13:47:58 -07:00
Linus Torvalds	9b332cece9	Merge tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix a crasher reported by rtm@csail.mit.edu * tag 'nfsd-6.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Define a proc_layoutcommit for the FlexFiles layout type	2025-10-14 09:28:12 -07:00
Linus Torvalds	5bd0116d92	Merge tag 'for-linus-6.18-2' of https://github.com/cminyard/linux-ipmi Pull IPMI fixes from Corey Minyard: "A few bug fixes for patches that went in this release: a refcount error and some missing or incorrect error checks" * tag 'for-linus-6.18-2' of https://github.com/cminyard/linux-ipmi: ipmi: Fix handling of messages with provided receive message pointer mfd: ls2kbmc: check for devm_mfd_add_devices() failure mfd: ls2kbmc: Fix an IS_ERR() vs NULL check in probe()	2025-10-14 09:15:45 -07:00
Wang Liang	4f86eb0a38	selftests: net: check jq command is supported The jq command is used in vlan_bridge_binding.sh, if it is not supported, the test will spam the following log. # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # TEST: Test bridge_binding on->off when lower down [FAIL] # Got operstate of , expected 0 The rtnetlink.sh has the same problem. It makes sense to check if jq is installed before running these tests. After this patch, the vlan_bridge_binding.sh skipped if jq is not supported: # timeout set to 3600 # selftests: net: vlan_bridge_binding.sh # TEST: jq not installed [SKIP] Fixes: `dca12e9ab7` ("selftests: net: Add a VLAN bridge binding selftest") Fixes: `6a414fd77f` ("selftests: rtnetlink: Add an address proto test") Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251013080039.3035898-1-wangliang74@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-14 15:12:18 +02:00
Lorenzo Bianconi	bd5afca115	net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit() Completion napi can free out-of-order tx descriptors if hw QoS is enabled and packets with different priority are queued to same DMA ring. Take into account possible out-of-order reports checking if the tx queue is full using circular buffer head/tail pointer instead of the number of queued packets. Fixes: `23020f0493` ("net: airoha: Introduce ethernet support for EN7581 SoC") Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251012-airoha-tx-busy-queue-v2-1-a600b08bab2d@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-14 12:33:46 +02:00
Eric Dumazet	295ce1eb36	tcp: fix tcp_tso_should_defer() vs large RTT Neal reported that using neper tcp_stream with TCP_TX_DELAY set to 50ms would often lead to flows stuck in a small cwnd mode, regardless of the congestion control. While tcp_stream sets TCP_TX_DELAY too late after the connect(), it highlighted two kernel bugs. The following heuristic in tcp_tso_should_defer() seems wrong for large RTT: delta = tp->tcp_clock_cache - head->tstamp; /* If next ACK is likely to come too late (half srtt), do not defer / if ((s64)(delta - (u64)NSEC_PER_USEC (tp->srtt_us >> 4)) < 0) goto send_now; If next ACK is expected to come in more than 1 ms, we should not defer because we prefer a smooth ACK clocking. While blamed commit was a step in the good direction, it was not generic enough. Another patch fixing TCP_TX_DELAY for established flows will be proposed when net-next reopens. Fixes: `50c8339e92` ("tcp: tso: restore IW10 after TSO autosizing") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20251011115742.1245771-1-edumazet@google.com [pabeni@redhat.com: fixed whitespace issue] Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-14 12:21:48 +02:00
Yi Cong	75527d61d6	r8152: add error handling in rtl8152_driver_init rtl8152_driver_init() is missing the error handling. When rtl8152_driver registration fails, rtl8152_cfgselector_driver should be deregistered. Fixes: `ec51fbd1b8` ("r8152: add USB device driver for config selection") Cc: stable@vger.kernel.org Signed-off-by: Yi Cong <yicong@kylinos.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251011082415.580740-1-yicongsrfy@163.com [pabeni@redhat.com: clarified the commit message] Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-10-14 12:07:21 +02:00

1 2 3 4 5 ...

1396753 Commits