Commit Graph

2231 Commits

Author SHA1 Message Date
Saravanan Vajravel
2f884e6df6 IB/isert: Fix incorrect release of isert connection
[ Upstream commit 699826f4e3 ]

The ib_isert module is releasing the isert connection both in
isert_wait_conn() handler as well as isert_free_conn() handler.
In isert_wait_conn() handler, it is expected to wait for iSCSI
session logout operation to complete. It should free the isert
connection only in isert_free_conn() handler.

When a bunch of iSER target is cleared, this issue can lead to
use-after-free memory issue as isert conn is twice released

Fixes: b02efbfc9a ("iser-target: Fix implicit termination of connections")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/20230606102531.162967-4-saravanan.vajravel@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21 16:02:15 +02:00
Saravanan Vajravel
fb0f216bf0 IB/isert: Fix possible list corruption in CMA handler
[ Upstream commit 7651e2d6c5 ]

When ib_isert module receives connection error event, it is
releasing the isert session and removes corresponding list
node but it doesn't take appropriate mutex lock to remove
the list node.  This can lead to linked  list corruption

Fixes: bd3792205a ("iser-target: Fix pending connections handling in target stack shutdown sequnce")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Link: https://lore.kernel.org/r/20230606102531.162967-3-saravanan.vajravel@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21 16:02:15 +02:00
Saravanan Vajravel
d578ba2d9c IB/isert: Fix dead lock in ib_isert
[ Upstream commit 691b048093 ]

- When a iSER session is released, ib_isert module is taking a mutex
  lock and releasing all pending connections. As part of this, ib_isert
  is destroying rdma cm_id. To destroy cm_id, rdma_cm module is sending
  CM events to CMA handler of ib_isert. This handler is taking same
  mutex lock. Hence it leads to deadlock between ib_isert & rdma_cm
  modules.

- For fix, created local list of pending connections and release the
  connection outside of mutex lock.

Calltrace:
---------
[ 1229.791410] INFO: task kworker/10:1:642 blocked for more than 120 seconds.
[ 1229.791416]       Tainted: G           OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[ 1229.791418] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1229.791419] task:kworker/10:1    state:D stack:    0 pid:  642 ppid:     2 flags:0x80004000
[ 1229.791424] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 1229.791436] Call Trace:
[ 1229.791438]  __schedule+0x2d1/0x830
[ 1229.791445]  ? select_idle_sibling+0x23/0x6f0
[ 1229.791449]  schedule+0x35/0xa0
[ 1229.791451]  schedule_preempt_disabled+0xa/0x10
[ 1229.791453]  __mutex_lock.isra.7+0x310/0x420
[ 1229.791456]  ? select_task_rq_fair+0x351/0x990
[ 1229.791459]  isert_cma_handler+0x224/0x330 [ib_isert]
[ 1229.791463]  ? ttwu_queue_wakelist+0x159/0x170
[ 1229.791466]  cma_cm_event_handler+0x25/0xd0 [rdma_cm]
[ 1229.791474]  cma_ib_handler+0xa7/0x2e0 [rdma_cm]
[ 1229.791478]  cm_process_work+0x22/0xf0 [ib_cm]
[ 1229.791483]  cm_work_handler+0xf4/0xf30 [ib_cm]
[ 1229.791487]  ? move_linked_works+0x6e/0xa0
[ 1229.791490]  process_one_work+0x1a7/0x360
[ 1229.791491]  ? create_worker+0x1a0/0x1a0
[ 1229.791493]  worker_thread+0x30/0x390
[ 1229.791494]  ? create_worker+0x1a0/0x1a0
[ 1229.791495]  kthread+0x10a/0x120
[ 1229.791497]  ? set_kthread_struct+0x40/0x40
[ 1229.791499]  ret_from_fork+0x1f/0x40

[ 1229.791739] INFO: task targetcli:28666 blocked for more than 120 seconds.
[ 1229.791740]       Tainted: G           OE    --------- -  - 4.18.0-372.9.1.el8.x86_64 #1
[ 1229.791741] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1229.791742] task:targetcli       state:D stack:    0 pid:28666 ppid:  5510 flags:0x00004080
[ 1229.791743] Call Trace:
[ 1229.791744]  __schedule+0x2d1/0x830
[ 1229.791746]  schedule+0x35/0xa0
[ 1229.791748]  schedule_preempt_disabled+0xa/0x10
[ 1229.791749]  __mutex_lock.isra.7+0x310/0x420
[ 1229.791751]  rdma_destroy_id+0x15/0x20 [rdma_cm]
[ 1229.791755]  isert_connect_release+0x115/0x130 [ib_isert]
[ 1229.791757]  isert_free_np+0x87/0x140 [ib_isert]
[ 1229.791761]  iscsit_del_np+0x74/0x120 [iscsi_target_mod]
[ 1229.791776]  lio_target_np_driver_store+0xe9/0x140 [iscsi_target_mod]
[ 1229.791784]  configfs_write_file+0xb2/0x110
[ 1229.791788]  vfs_write+0xa5/0x1a0
[ 1229.791792]  ksys_write+0x4f/0xb0
[ 1229.791794]  do_syscall_64+0x5b/0x1a0
[ 1229.791798]  entry_SYSCALL_64_after_hwframe+0x65/0xca

Fixes: bd3792205a ("iser-target: Fix pending connections handling in target stack shutdown sequnce")
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Link: https://lore.kernel.org/r/20230606102531.162967-2-saravanan.vajravel@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21 16:02:15 +02:00
Li Zhijian
cb1461ecf0 RDMA/rtrs: Fix rxe_dealloc_pd warning
[ Upstream commit 9c29c8c7df ]

In current design:
1. PD and clt_path->s.dev are shared among connections.
2. every con[n]'s cleanup phase will call destroy_con_cq_qp()
3. clt_path->s.dev will be always decreased in destroy_con_cq_qp(), and
   when clt_path->s.dev become zero, it will destroy PD.
4. when con[1] failed to create, con[1] will not take clt_path->s.dev,
   but it try to decreased clt_path->s.dev

So, in case create_cm(con[0]) succeeds but create_cm(con[1]) fails,
destroy_con_cq_qp(con[1]) will be called first which will destroy the PD
while this PD is still taken by con[0].

Here, we refactor the error path of create_cm() and init_conns(), so that
we do the cleanup in the order they are created.

The warning occurs when destroying RXE PD whose reference count is not
zero.

 rnbd_client L597: Mapping device /dev/nvme0n1 on session client, (access_mode: rw, nr_poll_queues: 0)
 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 26407 at drivers/infiniband/sw/rxe/rxe_pool.c:256 __rxe_cleanup+0x13a/0x170 [rdma_rxe]
 Modules linked in: rpcrdma rdma_ucm ib_iser rnbd_client libiscsi rtrs_client scsi_transport_iscsi rtrs_core rdma_cm iw_cm ib_cm crc32_generic rdma_rxe udp_tunnel ib_uverbs ib_core kmem device_dax nd_pmem dax_pmem nd_vme crc32c_intel fuse nvme_core nfit libnvdimm dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_mirror dm_region_hash dm_log dm_mod
 CPU: 0 PID: 26407 Comm: rnbd-client.sh Kdump: loaded Not tainted 6.2.0-rc6-roce-flush+ #53
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
 RIP: 0010:__rxe_cleanup+0x13a/0x170 [rdma_rxe]
 Code: 45 84 e4 0f 84 5a ff ff ff 48 89 ef e8 5f 18 71 f9 84 c0 75 90 be c8 00 00 00 48 89 ef e8 be 89 1f fa 85 c0 0f 85 7b ff ff ff <0f> 0b 41 bc ea ff ff ff e9 71 ff ff ff e8 84 7f 1f fa e9 d0 fe ff
 RSP: 0018:ffffb09880b6f5f0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff99401f15d6a8 RCX: 0000000000000000
 RDX: 0000000000000001 RSI: ffffffffbac8234b RDI: 00000000ffffffff
 RBP: ffff99401f15d6d0 R08: 0000000000000001 R09: 0000000000000001
 R10: 0000000000002d82 R11: 0000000000000000 R12: 0000000000000001
 R13: ffff994101eff208 R14: ffffb09880b6f6a0 R15: 00000000fffffe00
 FS:  00007fe113904740(0000) GS:ffff99413bc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ff6cde656c8 CR3: 000000001f108004 CR4: 00000000001706f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  rxe_dealloc_pd+0x16/0x20 [rdma_rxe]
  ib_dealloc_pd_user+0x4b/0x80 [ib_core]
  rtrs_ib_dev_put+0x79/0xd0 [rtrs_core]
  destroy_con_cq_qp+0x8a/0xa0 [rtrs_client]
  init_path+0x1e7/0x9a0 [rtrs_client]
  ? __pfx_autoremove_wake_function+0x10/0x10
  ? lock_is_held_type+0xd7/0x130
  ? rcu_read_lock_sched_held+0x43/0x80
  ? pcpu_alloc+0x3dd/0x7d0
  ? rtrs_clt_init_stats+0x18/0x40 [rtrs_client]
  rtrs_clt_open+0x24f/0x5a0 [rtrs_client]
  ? __pfx_rnbd_clt_link_ev+0x10/0x10 [rnbd_client]
  rnbd_clt_map_device+0x6a5/0xe10 [rnbd_client]

Fixes: 6a98d71dae ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/1682384563-2-4-git-send-email-lizhijian@fujitsu.com
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Tested-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21 16:02:13 +02:00
Li Zhijian
53f93d46e1 RDMA/rtrs: Fix the last iu->buf leak in err path
[ Upstream commit 3bf3a7c698 ]

The last iu->buf will leak if ib_dma_mapping_error() fails.

Fixes: c0894b3ea6 ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/1682384563-2-3-git-send-email-lizhijian@fujitsu.com
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21 16:02:12 +02:00
Saravanan Vajravel
b713623bfe RDMA/srpt: Add a check for valid 'mad_agent' pointer
[ Upstream commit eca5cd9474 ]

When unregistering MAD agent, srpt module has a non-null check
for 'mad_agent' pointer before invoking ib_unregister_mad_agent().
This check can pass if 'mad_agent' variable holds an error value.
The 'mad_agent' can have an error value for a short window when
srpt_add_one() and srpt_remove_one() is executed simultaneously.

In srpt module, added a valid pointer check for 'sport->mad_agent'
before unregistering MAD agent.

This issue can hit when RoCE driver unregisters ib_device

Stack Trace:
------------
BUG: kernel NULL pointer dereference, address: 000000000000004d
PGD 145003067 P4D 145003067 PUD 2324fe067 PMD 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 10 PID: 4459 Comm: kworker/u80:0 Kdump: loaded Tainted: P
Hardware name: Dell Inc. PowerEdge R640/06NR82, BIOS 2.5.4 01/13/2020
Workqueue: bnxt_re bnxt_re_task [bnxt_re]
RIP: 0010:_raw_spin_lock_irqsave+0x19/0x40
Call Trace:
  ib_unregister_mad_agent+0x46/0x2f0 [ib_core]
  IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
  ? __schedule+0x20b/0x560
  srpt_unregister_mad_agent+0x93/0xd0 [ib_srpt]
  srpt_remove_one+0x20/0x150 [ib_srpt]
  remove_client_context+0x88/0xd0 [ib_core]
  bond0: (slave p2p1): link status definitely up, 100000 Mbps full duplex
  disable_device+0x8a/0x160 [ib_core]
  bond0: active interface up!
  ? kernfs_name_hash+0x12/0x80
 (NULL device *): Bonding Info Received: rdev: 000000006c0b8247
  __ib_unregister_device+0x42/0xb0 [ib_core]
 (NULL device *):         Master: mode: 4 num_slaves:2
  ib_unregister_device+0x22/0x30 [ib_core]
 (NULL device *):         Slave: id: 105069936 name:p2p1 link:0 state:0
  bnxt_re_stopqps_and_ib_uninit+0x83/0x90 [bnxt_re]
  bnxt_re_alloc_lag+0x12e/0x4e0 [bnxt_re]

Fixes: a42d985bd5 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Link: https://lore.kernel.org/r/20230406042549.507328-1-saravanan.vajravel@broadcom.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11 23:17:32 +09:00
Mike Christie
32d530c85c scsi: target: iscsit: isert: Alloc per conn cmd counter
[ Upstream commit 6d256bee60 ]

This has iscsit allocate a per conn cmd counter and converts iscsit/isert
to use it instead of the per session one.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230319015620.96006-5-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stable-dep-of: 395cee83d0 ("scsi: target: iscsit: Stop/wait on cmds during conn close")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-11 23:17:14 +09:00
Linus Torvalds
8cbd92339d Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
 "Quite a small cycle this time, even with the rc8. I suppose everyone
  went to sleep over xmas.

   - Minor driver updates for hfi1, cxgb4, erdma, hns, irdma, mlx5, siw,
     mana

   - inline CQE support for hns

   - Have mlx5 display device error codes

   - Pinned DMABUF support for irdma

   - Continued rxe cleanups, particularly converting the MRs to use
     xarray

   - Improvements to what can be cached in the mlx5 mkey cache"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (61 commits)
  IB/mlx5: Extend debug control for CC parameters
  IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors
  IB/hfi1: Fix math bugs in hfi1_can_pin_pages()
  RDMA/irdma: Add support for dmabuf pin memory regions
  RDMA/mlx5: Use query_special_contexts for mkeys
  net/mlx5e: Use query_special_contexts for mkeys
  net/mlx5: Change define name for 0x100 lkey value
  net/mlx5: Expose bits for querying special mkeys
  RDMA/rxe: Fix missing memory barriers in rxe_queue.h
  RDMA/mana_ib: Fix a bug when the PF indicates more entries for registering memory on first packet
  RDMA/rxe: Remove rxe_alloc()
  RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size
  Subject: RDMA/rxe: Handle zero length rdma
  iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry()
  RDMA/mlx5: Use rdma_umem_for_each_dma_block()
  RDMA/umem: Remove unused 'work' member from struct ib_umem
  RDMA/irdma: Cap MSIX used to online CPUs + 1
  RDMA/mlx5: Check reg_create() create for errors
  RDMA/restrack: Correct spelling
  RDMA/cxgb4: Fix potential null-ptr-deref in pass_establish()
  ...
2023-02-24 15:11:03 -08:00
Linus Torvalds
d2980d8d82 Merge tag 'mm-nonmm-stable-2023-02-20-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
 "There is no particular theme here - mainly quick hits all over the
  tree.

  Most notable is a set of zlib changes from Mikhail Zaslonko which
  enhances and fixes zlib's use of S390 hardware support: 'lib/zlib: Set
  of s390 DFLTCC related patches for kernel zlib'"

* tag 'mm-nonmm-stable-2023-02-20-15-29' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (55 commits)
  Update CREDITS file entry for Jesper Juhl
  sparc: allow PM configs for sparc32 COMPILE_TEST
  hung_task: print message when hung_task_warnings gets down to zero.
  arch/Kconfig: fix indentation
  scripts/tags.sh: fix the Kconfig tags generation when using latest ctags
  nilfs2: prevent WARNING in nilfs_dat_commit_end()
  lib/zlib: remove redundation assignement of avail_in dfltcc_gdht()
  lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default
  lib/zlib: DFLTCC always switch to software inflate for Z_PACKET_FLUSH option
  lib/zlib: DFLTCC support inflate with small window
  lib/zlib: Split deflate and inflate states for DFLTCC
  lib/zlib: DFLTCC not writing header bits when avail_out == 0
  lib/zlib: fix DFLTCC ignoring flush modes when avail_in == 0
  lib/zlib: fix DFLTCC not flushing EOBS when creating raw streams
  lib/zlib: implement switching between DFLTCC and software
  lib/zlib: adjust offset calculation for dfltcc_state
  nilfs2: replace WARN_ONs for invalid DAT metadata block requests
  scripts/spelling.txt: add "exsits" pattern and fix typo instances
  fs: gracefully handle ->get_block not mapping bh in __mpage_writepage
  cramfs: Kconfig: fix spelling & punctuation
  ...
2023-02-23 17:55:40 -08:00
Li Zhijian
2de49fb1c9 RDMA/rtrs: Don't call kobject_del for srv_path->kobj
As the mention in commmit f7452a7e96 ("RDMA/rtrs-srv: fix memory leak by missing kobject free"),
it was intended to remove the kobject_del for srv_path->kobj.

f7452a7e96 said:
>This patch moves kobject_del() into free_sess() so that the kobject of
>    rtrs_srv_sess can be freed.

This patch also move rtrs_srv_destroy_once_sysfs_root_folders back to
'if (srv_path->kobj.state_in_sysfs)' block to avoid a 'held lock freed!'

A kernel panic will be triggered by following script
-----------------------
$ while true
do
        echo "sessname=foo path=ip:<ip address> device_path=/dev/nvme0n1" > /sys/devices/virtual/rnbd-client/ctl/map_device
        echo "normal" > /sys/block/rnbd0/rnbd/unmap_device
done
-----------------------
The bisection pointed to commit 6af4609c18 ("RDMA/rtrs-srv: Fix several issues in rtrs_srv_destroy_path_files")
at last.

 rnbd_server L777: </dev/nvme0n1@foo>: Opened device 'nvme0n1'
 general protection fault, probably for non-canonical address 0x765f766564753aea: 0000 [#1] PREEMPT SMP PTI
 CPU: 0 PID: 3558 Comm: systemd-udevd Kdump: loaded Not tainted 6.1.0-rc3-roce-flush+ #51
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
 RIP: 0010:kernfs_dop_revalidate+0x36/0x180
 Code: 00 00 41 55 41 54 55 53 48 8b 47 68 48 89 fb 48 85 c0 0f 84 db 00 00 00 48 8b a8 60 04 00 00 48 8b 45 30 48 85 c0 48 0f 44 c5 <4c> 8b 60 78 49 81 c4 d8 00 00 00 4c 89 e7 e8 b7 78 7b 00 8b 05 3d
 RSP: 0018:ffffaf1700b67c78 EFLAGS: 00010206
 RAX: 765f766564753a72 RBX: ffff89e2830849c0 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff89e2830849c0
 RBP: ffff89e280361bd0 R08: 0000000000000000 R09: 0000000000000001
 R10: 0000000000000065 R11: 0000000000000000 R12: ffff89e2830849c0
 R13: ffff89e283084888 R14: d0d0d0d0d0d0d0d0 R15: 2f2f2f2f2f2f2f2f
 FS:  00007f13fbce7b40(0000) GS:ffff89e2bbc00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f93e055d340 CR3: 0000000104664002 CR4: 00000000001706f0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  <TASK>
  lookup_fast+0x7b/0x100
  walk_component+0x21/0x160
  link_path_walk.part.0+0x24d/0x390
  path_openat+0xad/0x9a0
  do_filp_open+0xa9/0x150
  ? lock_release+0x13c/0x2e0
  ? _raw_spin_unlock+0x29/0x50
  ? alloc_fd+0x124/0x1f0
  do_sys_openat2+0x9b/0x160
  __x64_sys_openat+0x54/0xa0
  do_syscall_64+0x3b/0x90
  entry_SYSCALL_64_after_hwframe+0x63/0xcd
 RIP: 0033:0x7f13fc9d701b
 Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 54 24 28 64 48 2b 14 25
 RSP: 002b:00007ffddf242640 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f13fc9d701b
 RDX: 0000000000080000 RSI: 00007ffddf2427c0 RDI: 00000000ffffff9c
 RBP: 00007ffddf2427c0 R08: 00007f13fcc5b440 R09: 21b2131aa64b1ef2
 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000080000
 R13: 00007ffddf2427c0 R14: 000055ed13be8db0 R15: 0000000000000000

Fixes: 6af4609c18 ("RDMA/rtrs-srv: Fix several issues in rtrs_srv_destroy_path_files")
Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/1675332721-2-1-git-send-email-lizhijian@fujitsu.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-07 11:21:32 +02:00
Luca Ceresoli
1b381f6fe4 scripts/spelling.txt: add "exsits" pattern and fix typo instances
Fix typos and add the following to the scripts/spelling.txt:

  exsits||exists

Link: https://lkml.kernel.org/r/20230126152205.959277-1-luca.ceresoli@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02 22:50:07 -08:00
Dragos Tatulea
e632291a2d IB/IPoIB: Fix legacy IPoIB due to wrong number of queues
The cited commit creates child PKEY interfaces over netlink will
multiple tx and rx queues, but some devices doesn't support more than 1
tx and 1 rx queues. This causes to a crash when traffic is sent over the
PKEY interface due to the parent having a single queue but the child
having multiple queues.

This patch fixes the number of queues to 1 for legacy IPoIB at the
earliest possible point in time.

BUG: kernel NULL pointer dereference, address: 000000000000036b
PGD 0 P4D 0
Oops: 0000 [#1] SMP
CPU: 4 PID: 209665 Comm: python3 Not tainted 6.1.0_for_upstream_min_debug_2022_12_12_17_02 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:kmem_cache_alloc+0xcb/0x450
Code: ce 7e 49 8b 50 08 49 83 78 10 00 4d 8b 28 0f 84 cb 02 00 00 4d 85 ed 0f 84 c2 02 00 00 41 8b 44 24 28 48 8d 4a
01 49 8b 3c 24 <49> 8b 5c 05 00 4c 89 e8 65 48 0f c7 0f 0f 94 c0 84 c0 74 b8 41 8b
RSP: 0018:ffff88822acbbab8 EFLAGS: 00010202
RAX: 0000000000000070 RBX: ffff8881c28e3e00 RCX: 00000000064f8dae
RDX: 00000000064f8dad RSI: 0000000000000a20 RDI: 0000000000030d00
RBP: 0000000000000a20 R08: ffff8882f5d30d00 R09: ffff888104032f40
R10: ffff88810fade828 R11: 736f6d6570736575 R12: ffff88810081c000
R13: 00000000000002fb R14: ffffffff817fc865 R15: 0000000000000000
FS:  00007f9324ff9700(0000) GS:ffff8882f5d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000036b CR3: 00000001125af004 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 skb_clone+0x55/0xd0
 ip6_finish_output2+0x3fe/0x690
 ip6_finish_output+0xfa/0x310
 ip6_send_skb+0x1e/0x60
 udp_v6_send_skb+0x1e5/0x420
 udpv6_sendmsg+0xb3c/0xe60
 ? ip_mc_finish_output+0x180/0x180
 ? __switch_to_asm+0x3a/0x60
 ? __switch_to_asm+0x34/0x60
 sock_sendmsg+0x33/0x40
 __sys_sendto+0x103/0x160
 ? _copy_to_user+0x21/0x30
 ? kvm_clock_get_cycles+0xd/0x10
 ? ktime_get_ts64+0x49/0xe0
 __x64_sys_sendto+0x25/0x30
 do_syscall_64+0x3d/0x90
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f9374f1ed14
Code: 42 41 f8 ff 44 8b 4c 24 2c 4c 8b 44 24 20 89 c5 44 8b 54 24 28 48 8b 54 24 18 b8 2c 00 00 00 48 8b 74 24 10 8b
7c 24 08 0f 05 <48> 3d 00 f0 ff ff 77 34 89 ef 48 89 44 24 08 e8 68 41 f8 ff 48 8b
RSP: 002b:00007f9324ff7bd0 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f9324ff7cc8 RCX: 00007f9374f1ed14
RDX: 00000000000002fb RSI: 00007f93000052f0 RDI: 0000000000000030
RBP: 0000000000000000 R08: 00007f9324ff7d40 R09: 000000000000001c
R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000000
R13: 000000012a05f200 R14: 0000000000000001 R15: 00007f9374d57bdc
 </TASK>

Fixes: dbc94a0fb8 ("IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/95eb6b74c7cf49fa46281f9d056d685c9fa11d38.1674584576.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 21:18:27 +02:00
Leon Romanovsky
1ca49d26af Merge branch 'mlx5-next' into HEAD
Bring HW bits for mlx5 QP events series.

Link: https://lore.kernel.org/all/cover.1672821186.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-15 12:22:36 +02:00
Mark Zhang
ccae0447af RDMA/cma: Refactor the inbound/outbound path records process flow
Refactors based on comments [1] of the multiple path records support
patchset:
- Return failure if not able to set inbound/outbound PRs;
- Simplify the flow when receiving the PRs from netlink channel: When
  a good PR response is received, unpack it and call the path_query
  callback directly. This saves two memory allocations;
- Define RDMA_PRIMARY_PATH_MAX_REC_NUM in a proper place.

[1] https://lore.kernel.org/linux-rdma/Yyxp9E9pJtUids2o@nvidia.com/

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org> #srp
Link: https://lore.kernel.org/r/7610025d57342b8b6da0f19516c9612f9c3fdc37.1672819376.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-10 10:49:50 +02:00
Jiri Slaby (SUSE)
56c5dab20a RDMA/srp: Move large values to a new enum for gcc13
Since gcc13, each member of an enum has the same type as the enum [1]. And
that is inherited from its members. Provided these two:
  SRP_TAG_NO_REQ        = ~0U,
  SRP_TAG_TSK_MGMT	= 1U << 31
all other members are unsigned ints.

Esp. with SRP_MAX_SGE and SRP_TSK_MGMT_SQ_SIZE and their use in min(),
this results in the following warnings:
  include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast
  drivers/infiniband/ulp/srp/ib_srp.c:563:42: note: in expansion of macro 'min'

  include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast
  drivers/infiniband/ulp/srp/ib_srp.c:2369:27: note: in expansion of macro 'min'

So move the large values away to a separate enum, so that they don't
affect other members.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36113

Link: https://lore.kernel.org/r/20221212120411.13750-1-jirislaby@kernel.org
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-12-29 09:07:58 +02:00
Linus Torvalds
ab425febda Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
 "Usual size of updates, a new driver, and most of the bulk focusing on
  rxe:

   - Usual typos, style, and language updates

   - Driver updates for mlx5, irdma, siw, rts, srp, hfi1, hns, erdma,
     mlx4, srp

   - Lots of RXE updates:
      * Improve reply error handling for bad MR operations
      * Code tidying
      * Debug printing uses common loggers
      * Remove half implemented RD related stuff
      * Support IBA's recently defined Atomic Write and Flush operations

   - erdma support for atomic operations

   - New driver 'mana' for Ethernet HW available in Azure VMs. This
     driver only supports DPDK"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (122 commits)
  IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces
  RDMA: Add missed netdev_put() for the netdevice_tracker
  RDMA/rxe: Enable RDMA FLUSH capability for rxe device
  RDMA/cm: Make QP FLUSHABLE for supported device
  RDMA/rxe: Implement flush completion
  RDMA/rxe: Implement flush execution in responder side
  RDMA/rxe: Implement RC RDMA FLUSH service in requester side
  RDMA/rxe: Extend rxe packet format to support flush
  RDMA/rxe: Allow registering persistent flag for pmem MR only
  RDMA/rxe: Extend rxe user ABI to support flush
  RDMA: Extend RDMA kernel verbs ABI to support flush
  RDMA: Extend RDMA user ABI to support flush
  RDMA/rxe: Fix incorrect responder length checking
  RDMA/rxe: Fix oops with zero length reads
  RDMA/mlx5: Remove not-used IB_FLOW_SPEC_IB define
  RDMA/hns: Fix XRC caps on HIP08
  RDMA/hns: Fix error code of CMD
  RDMA/hns: Fix page size cap from firmware
  RDMA/hns: Fix PBL page MTR find
  RDMA/hns: Fix AH attr queried by query_qp
  ...
2022-12-14 09:27:13 -08:00
Linus Torvalds
75f4d9af8b Merge tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter updates from Al Viro:
 "iov_iter work; most of that is about getting rid of direction
  misannotations and (hopefully) preventing more of the same for the
  future"

* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  use less confusing names for iov_iter direction initializers
  iov_iter: saner checks for attempt to copy to/from iterator
  [xen] fix "direction" argument of iov_iter_kvec()
  [vhost] fix 'direction' argument of iov_iter_{init,bvec}()
  [target] fix iov_iter_bvec() "direction" argument
  [s390] memcpy_real(): WRITE is "data source", not destination...
  [s390] zcore: WRITE is "data source", not destination...
  [infiniband] READ is "data destination", not source...
  [fsi] WRITE is "data source", not destination...
  [s390] copy_oldmem_kernel() - WRITE is "data source", not destination
  csum_and_copy_to_iter(): handle ITER_DISCARD
  get rid of unlikely() on page_copy_sane() calls
2022-12-12 18:29:54 -08:00
Dragos Tatulea
dbc94a0fb8 IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces
There are 2 ways to create IPoIB PKEY child interfaces:
1) Writing a PKEY to /sys/class/net/<ib parent interface>/create_child.
2) Using netlink with iproute.

While with sysfs the child interface has the same number of tx and
rx queues as the parent, with netlink there will always be 1 tx
and 1 rx queue for the child interface. That's because the
get_num_tx/rx_queues() netlink ops are missing and the default value
of 1 is taken for the number of queues (in rtnl_create_link()).

This change adds the get_num_tx/rx_queues() ops which allows for
interfaces with multiple queues to be created over netlink. This
constant only represents the max number of tx and rx queues on that
net device.

Fixes: 9baa0b0364 ("IB/ipoib: Add rtnl_link_ops support")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/f4a42c8aa43c02d5ae5559a60c3e5e0f18c82531.1670485816.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-12-11 11:04:19 +02:00
Wang Yufen
ed461b30b2 RDMA/srp: Fix error return code in srp_parse_options()
In the previous iteration of the while loop, the "ret" may have been
assigned a value of 0, so the error return code -EINVAL may have been
incorrectly set to 0. To fix set valid return code before calling to
goto. Also investigate each case separately as Andy suggessted.

Fixes: e711f968c4 ("IB/srp: replace custom implementation of hex2bin()")
Fixes: 2a174df0c6 ("IB/srp: Use kstrtoull() instead of simple_strtoull()")
Fixes: 19f313438c ("IB/srp: Add RDMA/CM support")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/1669953638-11747-2-git-send-email-wangyufen@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-12-04 15:26:58 +02:00
Al Viro
de4eda9de2 use less confusing names for iov_iter direction initializers
READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.

Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-11-25 13:01:55 -05:00
Al Viro
355d2c2798 [infiniband] READ is "data destination", not source...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-11-25 13:01:20 -05:00
Maurizio Lombardi
2a402120a8 IB/isert: use the ISCSI_LOGIN_CURRENT_STAGE macro
Use the proper macro to get the current_stage value.

Link: https://lore.kernel.org/r/20221116094535.138298-1-mlombard@redhat.com
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-21 14:49:21 -04:00
Jason A. Donenfeld
8032bf1233 treewide: use get_random_u32_below() instead of deprecated function
This is a simple mechanical transformation done by:

@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
  (E)

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-11-18 02:15:15 +01:00
Guoqing Jiang
34a046f08b RDMA/rtrs-srv: Remove kobject_del from rtrs_srv_destroy_once_sysfs_root_folders
The kobj_paths which is created dynamically by kobject_create_and_add,
and per the comment above kobject_create_and_add, we only need to call
kobject_put which is not same as other kobjs such as stats->kobj_stats
and srv_path->kobj.

Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-9-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
6af4609c18 RDMA/rtrs-srv: Fix several issues in rtrs_srv_destroy_path_files
There are several issues in the function which is supposed to be paired
with rtrs_srv_create_path_files.

1. rtrs_srv_stats_attr_group is not removed though it is created in
   rtrs_srv_create_stats_files.

2. it makes more sense to check kobj_stats.state_in_sysfs before destroy
   kobj_stats instead of rely on kobj.state_in_sysfs.

3. kobject_init_and_add is used for both kobjs (srv_path->kobj and
   srv_path->stats->kobj_stats), however we missed to call kobject_del
   for srv_path->kobj which was called in free_path.

4. rtrs_srv_destroy_once_sysfs_root_folders is independent of either
   kobj or kobj_stats.

Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-8-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
7526198f27 RDMA/rtrs: Clean up rtrs_rdma_dev_pd_ops
Let's remove them since the three members are not used.

Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-7-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
a439956335 RDMA/rtrs-srv: Remove outdated comments from create_con
Remove the orphan comments.

Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-6-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
f5708e6699 RDMA/rtrs-clt: Correct the checking of ib_map_mr_sg
We should check with count, also the only successful case is that
all sg elements are mapped, so make it explicitly.

Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-5-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
102d2f70ec RDMA/rtrs-srv: Correct the checking of ib_map_mr_sg
We should check with nr_sgt, also the only successful case is that
all sg elements are mapped, so make it explicitly.

Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-4-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
0f597ac618 RDMA/rtrs-srv: Refactor the handling of failure case in map_cont_bufs
Let's call unmap_cont_bufs when failure happens, and also only update
mrs_num after everything is settled which means we can remove 'mri'.

Acked-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-3-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Guoqing Jiang
d7115727e3 RDMA/rtrs-srv: Refactor rtrs_srv_rdma_cm_handler
The RDMA_CM_EVENT_CONNECT_REQUEST is quite different to other types,
let's check it separately at the beginning of routine, then we can
avoid the indentation accordingly.

Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20221117101945.6317-2-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 13:47:28 +02:00
Max Gurtovoy
c1842f34fc IB/iser: open code iser_disconnected_handler
There is a single caller to iser_disconnected_handler. Open code its
logic and remove it.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20221016093833.12537-4-mgurtovoy@nvidia.com
Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-19 10:38:35 +03:00
Max Gurtovoy
a75243ae08 IB/iser: add safety checks for state_mutex lock
In some cases, we need to make sure that state_mutex is taken. Use
lockdep_assert_held to warn us in case it doesn't while it should.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Link: https://lore.kernel.org/r/20221016093833.12537-3-mgurtovoy@nvidia.com
Reviewed-by: Sergey Gorenko <sergeygo@nvidia.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-19 10:38:35 +03:00
Sergey Gorenko
acc7d94ab4 IB/iser: open code iser_conn_state_comp_exch
There is a single caller to iser_conn_state_comp_exch. Open code its
logic and remove it.

Acked-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Link: https://lore.kernel.org/r/20221016093833.12537-2-mgurtovoy@nvidia.com
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-19 10:38:35 +03:00
Jason A. Donenfeld
a251c17aa5 treewide: use get_random_u32() when possible
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> # for sch_cake
Acked-by: Chuck Lever <chuck.lever@oracle.com> # for nfsd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> # for thunderbolt
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Acked-by: Helge Deller <deller@gmx.de> # for parisc
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:58 -06:00
Jason A. Donenfeld
81895a65ec treewide: use prandom_u32_max() when possible, part 1
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:

@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)

@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@

-       RAND = get_random_u32();
        ... when != RAND
-       RAND %= (E);
+       RAND = prandom_u32_max(E);

// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@

        ((T)get_random_u32()@p & (LITERAL))

// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@

value = None
if literal.startswith('0x'):
        value = int(literal, 16)
elif literal[0] in '123456789':
        value = int(literal, 10)
if value is None:
        print("I don't know how to handle %s" % (literal))
        cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
        print("Skipping 0x%x for cleanup elsewhere" % (value))
        cocci.include_match(False)
elif value & (value + 1) != 0:
        print("Skipping 0x%x because it's not a power of two minus one" % (value))
        cocci.include_match(False)
elif literal.startswith('0x'):
        coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
        coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))

// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@

-       (FUNC()@p & (LITERAL))
+       prandom_u32_max(RESULT)

@collapse_ret@
type T;
identifier VAR;
expression E;
@@

 {
-       T VAR;
-       VAR = (E);
-       return VAR;
+       return E;
 }

@drop_var@
type T;
identifier VAR;
@@

 {
-       T VAR;
        ... when != VAR
 }

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-10-11 17:42:55 -06:00
Jason Gunthorpe
33331a728c Merge tag 'v6.0' into rdma.git for-next
Trvial merge conflicts against rdma.git for-rc resolved matching
linux-next:
            drivers/infiniband/hw/hns/hns_roce_hw_v2.c
            drivers/infiniband/hw/hns/hns_roce_main.c

https://lore.kernel.org/r/20220929124005.105149-1-broonie@kernel.org

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-06 19:48:45 -03:00
Mikhael Goikhman
b05398aff9 RDMA/srp: Support more than 255 rdma ports
Currently ib_srp module does not support devices with more than 256
ports. Switch from u8 to u32 to fix the problem.

Fixes: 1fb7f8973f ("RDMA: Support more than 255 rdma ports")
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Mikhael Goikhman <migo@nvidia.com>
Link: https://lore.kernel.org/r/7d80d8844f1abb3a54170b7259f0a02be38080a6.1663747327.git.leonro@nvidia.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-22 12:53:13 +03:00
Mark Zhang
5a37494933 RDMA/cma: Multiple path records support with netlink channel
Support receiving inbound and outbound IB path records (along with GMP
PathRecord) from user-space service through the RDMA netlink channel.
The LIDs in these 3 PRs can be used in this way:
1. GMP PR: used as the standard local/remote LIDs;
2. DLID of outbound PR: Used as the "dlid" field for outbound traffic;
3. DLID of inbound PR: Used as the "dlid" field for outbound traffic in
   responder side.

This is aimed to support adaptive routing. With current IB routing
solution when a packet goes out it's assigned with a fixed DLID per
target, meaning a fixed router will be used.
The LIDs in inbound/outbound path records can be used to identify group
of routers that allow communication with another subnet's entity. With
them packets from an inter-subnet connection may travel through any
router in the set to reach the target.

As confirmed with Jason, when sending a netlink request, kernel uses
LS_RESOLVE_PATH_USE_ALL so that the service knows kernel supports
multiple PRs.

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Link: https://lore.kernel.org/r/2fa2b6c93c4c16c8915bac3cfc4f27be1d60519d.1662631201.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-22 12:35:21 +03:00
Hangyu Hua
4b46a6079d RDMA/srpt: Use flex array destination for memcpy()
In preparation for FORTIFY_SOURCE performing run-time destination buffer
bounds checking for memcpy(), specify the destination output buffer
explicitly, instead of asking memcpy() to write past the end of what looked
like a fixed-size object.

Notice that srp_rsp[] is a pointer to a structure that contains
flexible-array member data[]:

struct srp_rsp {
	...
	__be32	sense_data_len;
	__be32	resp_data_len;
	u8	data[];
};

link: https://github.com/KSPP/linux/issues/201
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Link: https://lore.kernel.org/r/20220909022943.8896-1-hbh25y@gmail.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 15:05:29 +03:00
Bart Van Assche
6dbe4a8dea RDMA/srp: Fix srp_abort()
Fix the code for converting a SCSI command pointer into an SRP request
pointer.

Cc: Xiao Yang <yangx.jy@fujitsu.com>
Fixes: ad215aaea4 ("RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacent")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220908233139.3042628-1-bvanassche@acm.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 14:15:09 +03:00
Guoqing Jiang
db77d84cfe RDMA/rtrs-clt: Kill xchg_paths
Let's call try_cmpxchg directly for the same purpose.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220903040252.29397-1-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-06 14:12:03 +03:00
Guoqing Jiang
57eb938237 RDMA/rtrs-clt: Break the loop once one path is connected
No need to iterate all paths after find one connected path.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220902101922.26273-3-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-06 14:05:22 +03:00
Guoqing Jiang
2aa9e4a2c3 RDMA/rtrs: Update comments for MAX_SESS_QUEUE_DEPTH
The maximum queue_depth should be 65535 per check_module_params,
also update other relevant comments.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220902101922.26273-2-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-06 14:05:22 +03:00
yangx.jy@fujitsu.com
12f35199a2 RDMA/srp: Set scmnd->result only when scmnd is not NULL
This change fixes the following kernel NULL pointer dereference
which is reproduced by blktests srp/007 occasionally.

BUG: kernel NULL pointer dereference, address: 0000000000000170
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 9 Comm: kworker/0:1H Kdump: loaded Not tainted 6.0.0-rc1+ #37
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
Workqueue:  0x0 (kblockd)
RIP: 0010:srp_recv_done+0x176/0x500 [ib_srp]
Code: 00 4d 85 ff 0f 84 52 02 00 00 48 c7 82 80 02 00 00 00 00 00 00 4c 89 df 4c 89 14 24 e8 53 d3 4a f6 4c 8b 14 24 41 0f b6 42 13 <41> 89 87 70 01 00 00 41 0f b6 52 12 f6 c2 02 74 44 41 8b 42 1c b9
RSP: 0018:ffffaef7c0003e28 EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff9bc9486dea60 RCX: 0000000000000000
RDX: 0000000000000102 RSI: ffffffffb76bbd0e RDI: 00000000ffffffff
RBP: ffff9bc980099a00 R08: 0000000000000001 R09: 0000000000000001
R10: ffff9bca53ef0000 R11: ffff9bc980099a10 R12: ffff9bc956e14000
R13: ffff9bc9836b9cb0 R14: ffff9bc9557b4480 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9bc97ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000170 CR3: 0000000007e04000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 __ib_process_cq+0xb7/0x280 [ib_core]
 ib_poll_handler+0x2b/0x130 [ib_core]
 irq_poll_softirq+0x93/0x150
 __do_softirq+0xee/0x4b8
 irq_exit_rcu+0xf7/0x130
 sysvec_apic_timer_interrupt+0x8e/0xc0
 </IRQ>

Fixes: ad215aaea4 ("RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacent")
Link: https://lore.kernel.org/r/20220831081626.18712-1-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-01 09:51:18 +03:00
Mark Zhang
91a3f14ec9 IB/cm: Remove the service_mask parameter from ib_cm_listen()
Remove the service_mask parameter of ib_cm_listen(), as all callers
use 0.

Link: https://lore.kernel.org/r/20220819090859.957943-2-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00
Guoqing Jiang
6edd86a2d2 RDMA/rtrs: Remove 'dir' argument from rnbd_srv_rdma_ev
Since process_{read,write} already prints direction info if ctx->ops.rdma_ev
fails, no need to pass 'dir'.

Link: https://lore.kernel.org/r/20220826081117.21687-1-guoqing.jiang@linux.dev
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:13:57 +03:00
Bart Van Assche
b8a9c18c2f RDMA/srp: Use the attribute group mechanism for sysfs attributes
Simplify the SRP driver by using the attribute group mechanism instead
of calling device_create_file() explicitly.

Link: https://lore.kernel.org/r/20220825213900.864587-5-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:28 +03:00
Bart Van Assche
351e458f72 RDMA/srp: Handle dev_set_name() failure
Instead of ignoring dev_set_name() failure, handle dev_set_name()
failure. Convert a device_register() call into device_initialize() and
device_add() calls.

Link: https://lore.kernel.org/r/20220825213900.864587-4-bvanassche@acm.org
Reported-by: Bo Liu <liubo03@inspur.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:20 +03:00
Bart Van Assche
0766fcaa1e RDMA/srp: Remove the srp_host.released completion
Move the kfree(host) calls into srp_release_dev(). Convert a
device_unregister() call into a device_del() and a device_put() call.
Remove the host->released completion object. This patch prepares for
handling dev_set_name() failure in srp_add_port().

Link: https://lore.kernel.org/r/20220825213900.864587-3-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:12 +03:00