Commit Graph

63 Commits

Author SHA1 Message Date
Mustafa Ismail
0d516e61cc RDMA/irdma: Fix Local Invalidate fencing
[ Upstream commit 5842d1d9c1 ]

If the local invalidate fence is indicated in the WR, only the read fence
is currently being set in WQE. Fix this to set both the read and local
fence in the WQE.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20230522155654.1309-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09 10:47:52 +02:00
Mustafa Ismail
6ee53f8254 RDMA/irdma: Prevent QP use after free
[ Upstream commit c8f304d75f ]

There is a window where the poll cq may use a QP that has been freed.
This can happen if a CQE is polled before irdma_clean_cqes() can clear the
CQE's related to the QP and the destroy QP races to free the QP memory.
then the QP structures are used in irdma_poll_cq.  Fix this by moving the
clearing of CQE's before the reference is removed and the QP is destroyed.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20230522155654.1309-3-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-09 10:47:52 +02:00
Zhu Yanjun
d2225b838c RDMA/irdma: Add support for dmabuf pin memory regions
This is a followup to the EFA dmabuf[1]. Irdma driver currently does
not support on-demand-paging(ODP). So it uses habanalabs as the
dmabuf exporter, and irdma as the importer to allow for peer2peer
access through libibverbs.

In this commit, the function ib_umem_dmabuf_get_pinned() is used.
This function is introduced in EFA dmabuf[1] which allows the driver
to get a dmabuf umem which is pinned and does not require move_notify
callback implementation. The returned umem is pinned and DMA mapped
like standard cpu umems, and is released through ib_umem_release().

[1]https://lore.kernel.org/lkml/20211007114018.GD2688930@ziepe.ca/t/

Link: https://lore.kernel.org/r/20230217011425.498847-1-yanjun.zhu@intel.com
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-17 16:36:14 -04:00
Zhu Yanjun
2f25e3bab0 RDMA/irdma: Split CQ handler into irdma_reg_user_mr_type_cq
Split the source codes related with CQ handling into a new function.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-5-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
e965ef0e7b RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp
Split the source codes related with QP handling into a new function.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-4-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
693a5386ef RDMA/irdma: Split mr alloc and free into new functions
In the function irdma_reg_user_mr, the mr allocation and free
will be used by other functions. As such, the source codes related
with mr allocation and free are split into the new functions.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-3-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
01798df198 RDMA/irdma: Split MEM handler into irdma_reg_user_mr_type_mem
The source codes related with IRDMA_MEMREG_TYPE_MEM are split
into a new function irdma_reg_user_mr_type_mem.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-2-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
bd99ede8ef RDMA/irdma: Remove extra ret variable in favor of existing err
In the function irdma_reg_user_mr, err and ret exist. Actually,
one variable err is enough.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230104064333.660344-1-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-04 08:59:02 +02:00
Mustafa Ismail
9907526d25 RDMA/irdma: Initialize net_type before checking it
The av->net_type is not initialized before it is checked in
irdma_modify_qp_roce. This leads to an incorrect update to the ARP cache
and QP context. RoCEv2 connections might fail as result.

Set the net_type using rdma_gid_attr_network_type.

Fixes: 80005c43d4 ("RDMA/irdma: Use net_type to check network type")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221122004410.1471-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-22 16:13:44 +02:00
Mustafa Ismail
8f7e2daa63 RDMA/irdma: Do not request 2-level PBLEs for CQ alloc
When allocating PBLE's for a large CQ, it is possible
that a 2-level PBLE is returned which would cause the
CQ allocation to fail since 1-level is assumed and checked for.
Fix this by requesting a level one PBLE only.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Mustafa Ismail
24419777e9 RDMA/irdma: Fix RQ completion opcode
The opcode written by HW, in the RQ CQE, is the
RoCEv2/iWARP protocol opcode from the received
packet and not the SW opcode as currently assumed.
Fix this by returning the raw operation type and
queue type in the CQE to irdma_process_cqe and add
2 helpers set_ib_wc_op_sq set_ib_wc_op_rq to map
IRDMA HW op types to IB op types.

Note that for iWARP, only Write with Immediate is
supported so the opcode can only be IB_WC_RECV_RDMA_WITH_IMM
when there is immediate data present.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Mustafa Ismail
4f44e519b6 RDMA/irdma: Fix inline for multiple SGE's
Currently, inline send and inline write assume a single
SGE and only copy data from the first one. Add support
for multiple SGE's.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Shiraz Saleem
4eace75e08 RDMA/irdma: Report the correct link speed
The active link speed is currently hard-coded in irdma_query_port due
to which the port rate in ibstatus does reflect the active link speed.

Call ib_get_eth_speed in irdma_query_port to get the active link speed.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Reported-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221104234957.1135-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-07 09:12:42 +02:00
Jason Gunthorpe
33331a728c Merge tag 'v6.0' into rdma.git for-next
Trvial merge conflicts against rdma.git for-rc resolved matching
linux-next:
            drivers/infiniband/hw/hns/hns_roce_hw_v2.c
            drivers/infiniband/hw/hns/hns_roce_main.c

https://lore.kernel.org/r/20220929124005.105149-1-broonie@kernel.org

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-06 19:48:45 -03:00
Shiraz Saleem
34acb833cc RDMA/irdma: Validate udata inlen and outlen
Currently ib_copy_from_udata and ib_copy_to_udata could underfill
the request and response buffer if the user-space passes an undersized
value for udata->inlen or udata->outlen respectively [1]
This could lead to undesirable behavior.

Zero initing the buffer only goes as far as preventing using the buffer
uninitialized.

Validate udata->inlen and udata->outlen passed from user-space to ensure
they are at least the required minimum size.

[1] https://lore.kernel.org/linux-rdma/MWHPR11MB0029F37D40D9D4A993F8F549E9D79@MWHPR11MB0029.namprd11.prod.outlook.com/

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220907191324.1173-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 13:19:53 +03:00
Sindhu-Devale
7f51a961f8 RDMA/irdma: Align AE id codes to correct flush code and event
A number of asynchronous event (AE) ids were not aligned to the
correct flush_code and event_type. Fix these up so that the
correct IBV error and event codes are returned to application.

Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to
return the correct WC error code.

Fixes: 44d9e52977 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 13:19:52 +03:00
Sindhu-Devale
a261786fdc RDMA/irdma: Report RNR NAK generation in device caps
Report RNR NAK generation when device capabilities are queried

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-6-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
6b227bd32d RDMA/irdma: Return error on MR deregister CQP failure
The MR deregister CQP can fail if an MW is bound to it.
Return an appropriate error for this case.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:17 +03:00
Sindhu-Devale
12faad5e5c RDMA/irdma: Report the correct max cqes from query device
Report the correct max cqes available to an application taking
into account a reserved entry to detect overflow.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:17 +03:00
Linus Torvalds
e495274793 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
 "This cycle we got a new RDMA driver "ERDMA" for the Alibaba cloud
  environment. Otherwise the changes are dominated by rxe fixes.

  There is another RDMA driver on the list that might get merged next
  cycle, 'MANA' for the Azure cloud environment.

  Summary:

   - Bug fixes and small features for irdma, hns, siw, qedr, hfi1, mlx5

   - General spelling/grammer fixes

   - rdma cm can follow changes in neighbours for control packets

   - Significant amounts of rxe fixes and spec compliance changes

   - Use the modern NAPI API

   - Use the bitmap API instead of open coding

   - Performance improvements for rtrs

   - Add the ERDMA driver for Alibaba cloud

   - Fix a use after free bug in SRP"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits)
  RDMA/ib_srpt: Unify checking rdma_cm_id condition in srpt_cm_req_recv()
  RDMA/rxe: Fix error unwind in rxe_create_qp()
  RDMA/mlx5: Add missing check for return value in get namespace flow
  RDMA/rxe: Split qp state for requester and completer
  RDMA/rxe: Generate error completion for error requester QP state
  RDMA/rxe: Update wqe_index for each wqe error completion
  RDMA/srpt: Fix a use-after-free
  RDMA/srpt: Introduce a reference count in struct srpt_device
  RDMA/srpt: Duplicate port name members
  IB/qib: Fix repeated "in" within comments
  RDMA/erdma: Add driver to kernel build environment
  RDMA/erdma: Add the ABI definitions
  RDMA/erdma: Add the erdma module
  RDMA/erdma: Add connection management (CM) support
  RDMA/erdma: Add verbs implementation
  RDMA/erdma: Add verbs header file
  RDMA/erdma: Add event queue implementation
  RDMA/erdma: Add cmdq implementation
  RDMA/erdma: Add main include file
  RDMA/erdma: Add the hardware related definitions
  ...
2022-08-04 19:54:32 -07:00
Mustafa Ismail
8ecef7890b RDMA/irdma: Fix a window for use-after-free
During a destroy CQ an interrupt may cause processing of a CQE after CQ
resources are freed by irdma_cq_free_rsrc(). Fix this by moving the call
to irdma_cq_free_rsrc() after the irdma_sc_cleanup_ceqes(), which is
called under the cq_lock.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20220705230815.265-6-shiraz.saleem@intel.com
Signed-off-by: Bartosz Sobczak <bartosz.sobczak@intel.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18 10:39:57 +03:00
Mustafa Ismail
137d264c6f RDMA/irdma: Add 2 level PBLE support for FMR
Level 2 Physical Buffer List Entry (PBLE) is currently not supported for
Fast MRs which limits memory registrations to 256K pages.

Adapt irdma_set_page and irdma_alloc_mr to allow for 2 level PBLEs.

Link: https://lore.kernel.org/r/20220705230815.265-2-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18 10:39:28 +03:00
Mustafa Ismail
5e8afb8792 RDMA/irdma: Do not advertise 1GB page size for x722
x722 does not support 1GB page size but the irdma driver incorrectly
advertises 1GB page size support for x722 device to ib_core to compute the
best page size to use on this MR.  This could lead to incorrect start
offsets computed by hardware on the MR.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-07-11 03:03:50 -03:00
Jason Gunthorpe
a6f844da39 Merge tag 'v5.18' into rdma.git for-next
Following patches have dependencies.

Resolve the merge conflict in
drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names
for the fs functions following linux-next:

https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-24 12:40:28 -03:00
Mustafa Ismail
81091d7696 RDMA/irdma: Add SW mechanism to generate completions on error
HW flushes after QP in error state is not reliable. This can lead to
   application hang waiting on a completion for outstanding WRs.  Implement a
SW mechanism to generate completions for any outstanding WR's after the QP
is modified to error.

This is accomplished by starting a delayed worker after the QP is modified
to error and the HW flush is performed. The worker will generate
completions that will be returned to the application when it polls the
CQ. This mechanism only applies to Kernel applications.

Link: https://lore.kernel.org/r/20220425181624.1617-1-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-11 15:58:40 -03:00
Tatyana Nikolova
7b8943b821 RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state
When connection establishment fails in iWARP mode, an app can drain the
QPs and hang because flush isn't issued when the QP is modified from RTR
state to error. Issue a flush in this case using function
irdma_cm_disconn().

Update irdma_cm_disconn() to do flush when cm_id is NULL, which is the
case when the QP is in RTR state and there is an error in the connection
establishment.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20220425181703.1634-2-shiraz.saleem@intel.com
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-02 11:10:33 -03:00
Jason Gunthorpe
e945c653c8 RDMA: Split kernel-only global device caps from uverbs device caps
Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part of the
uapi. This limits the device_cap_flags to being the same bitmap that will
be copied to userspace.

This cleanly splits out the uverbs flags from the kernel flags to avoid
confusion in the flags bitmap.

Add some short comments describing which each of the kernel flags is
connected to. Remove unused kernel flags.

Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-06 15:02:13 -03:00
Mustafa Ismail
51cad28724 RDMA/irdma: Add support for address handle re-use
Address handles (AH) are a limited HW resource and some user applications
may create large numbers of identical AH's.  Avoid running out of AH's by
reusing existing identical ones.

Link: https://lore.kernel.org/r/20220228183650.290-1-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-03-15 16:22:55 -03:00
Mustafa Ismail
17850f2b0b RDMA/irdma: Remove incorrect masking of PD
The PD id is masked with 0x7fff, while PD can be 18 bits for GEN2 HW.
Remove the masking as it should not be needed and can cause incorrect PD
id to be used.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20220225163211.127-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-28 12:07:40 -04:00
Zhu Yanjun
884194ef26 RDMA/irdma: Move union irdma_sockaddr to header file
The union irdma_sockaddr is used frequently. So move it to the header
file.

Link: https://lore.kernel.org/r/20220223024252.3873736-4-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:38:56 -04:00
Zhu Yanjun
8627da62cc RDMA/irdma: Remove the unnecessary variable saddr
Firstly the variable saddr was to check the type of a network. Now the
variable net_type is used to do the same work. So it is removed.

Link: https://lore.kernel.org/r/20220223024252.3873736-3-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:38:56 -04:00
Zhu Yanjun
80005c43d4 RDMA/irdma: Use net_type to check network type
The member variable net_type is to check the type of network.

Link: https://lore.kernel.org/r/20220223024252.3873736-2-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 20:38:56 -04:00
Shiraz Saleem
2322d17abf RDMA/irdma: Remove excess error variables
As irdma_status_code is replaced with an int, there is no need for two
variables to hold error codes.

Remove the excess variable in functions where this occurs.  Also, remove
any redundant initializations which are no longer needed.

Link: https://lore.kernel.org/r/20220217151851.1518-4-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 15:24:19 -04:00
Shiraz Saleem
45225a93cc RDMA/irdma: Propagate error codes
All functions now return linux error codes. Propagate the return from
these functions as opposed to converting them to generic values.

Link: https://lore.kernel.org/r/20220217151851.1518-3-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 15:24:18 -04:00
Shiraz Saleem
2c4b14ea95 RDMA/irdma: Remove enum irdma_status_code
Replace use of custom irdma_status_code with linux error codes.

Remove enum irdma_status_code and header in which its defined.

Link: https://lore.kernel.org/r/20220217151851.1518-2-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-23 15:24:18 -04:00
Mustafa Ismail
8348305532 RDMA/irdma: Refactor DCB bits in prep for DSCP support
Rename dcb flag to dcb_vlan_mode in irdma_device struct.  Add a new helper
function, irdma_set_qos_info, to set the VSI QoS information passed by the
PCI driver.

Link: https://lore.kernel.org/r/20220202191921.1638-3-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-02-08 12:54:47 -04:00
Zhu Yanjun
69e609ba96 RDMA/irdma: Make the source udp port vary
Get the source udp port number for a QP based on the grh.flow_label or
lqpn/rqrpn. This provides a better spread of traffic across NIC RX queues.

Link: https://lore.kernel.org/r/20220106180359.2915060-4-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-07 19:34:55 -04:00
Jason Gunthorpe
4922f09209 Merge tag 'v5.16-rc5' into rdma.git for-next
Required due to dependencies in following patches.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-14 20:18:48 -04:00
Tatyana Nikolova
10467ce09f RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
Completion events (CEs) are lost if the application is allowed to arm the
CQ more than two times when no new CE for this CQ has been generated by
the HW.

Check if arming has been done for the CQ and if not, arm the CQ for any
event otherwise promote to arm the CQ for any event only when the last arm
event was solicited.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20211201231509.1930-2-shiraz.saleem@intel.com
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-07 13:53:01 -04:00
Kamal Heib
fc9d19e18a RDMA/irdma: Use helper function to set GUIDs
Use the addrconf_addr_eui48() helper function to set the GUIDs for both
RoCE and iWARP modes, Also make sure the GUIDs are valid EUI-64
identifiers.

Link: https://lore.kernel.org/r/20211107212227.44610-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-16 13:54:24 -04:00
Jason Gunthorpe
a2a2a69d14 Merge tag 'v5.15' into rdma.git for-next
Pull in the accepted for-rc patches as the next merge needs a newer base.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 14:49:20 -03:00
Zhu Yanjun
9ed8110c9b RDMA/irdma: optimize rx path by removing unnecessary copy
In the function irdma_post_recv, the function irdma_copy_sg_list is
not needed since the struct irdma_sge and ib_sge have the similar
member variables. The struct irdma_sge can be replaced with the
struct ib_sge totally.

This can increase the rx performance of irdma.

Link: https://lore.kernel.org/r/20211030104226.253346-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-01 14:39:33 -03:00
Zhu Yanjun
50604757e7 RDMA/irdma: Remove the unused variable local_qp
Since the member variable local_qp is not used, remove it.

Link: https://lore.kernel.org/r/20211027175457.201822-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-28 08:58:26 -03:00
Zhu Yanjun
86479f8a3f RDMA/irdma: Remove the unused spin lock in struct irdma_qp_uk
The spin lock in struct irdma_qp_uk is not used. So remove it.

Link: https://lore.kernel.org/r/20211021230612.153812-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:36:58 -03:00
Jakub Kicinski
fd92213e9a RDMA: Constify netdev->dev_addr accesses
netdev->dev_addr will become const soon, make sure drivers propagate the
qualifier.

Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:33:09 -03:00
Mustafa Ismail
cc07b73ef1 RDMA/irdma: Set VLAN in UD work completion correctly
Currently VLAN is reported in UD work completion when VLAN id is zero,
i.e. no VLAN case.

Report VLAN in UD work completion only when VLAN id is non-zero.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20211019151654.1943-1-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-19 20:22:01 -03:00
Aharon Landau
13f30b0fa0 RDMA/counter: Add a descriptor in struct rdma_hw_stats
Add a counter statistic descriptor structure in rdma_hw_stats. In addition
to the counter name, more meta-information will be added.  This code
extension is needed for optional-counter support in the following patches.

Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:04 -03:00
Sindhu Devale
9f7fa37a6b RDMA/irdma: Report correct WC error when there are MW bind errors
Report the correct WC error when MW bind error related asynchronous events
are generated by HW.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-5-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Sindhu Devale
d3bdcd5963 RDMA/irdma: Report correct WC error when transport retry counter is exceeded
When the retry counter exceeds, as the remote QP didn't send any Ack or
Nack an asynchronous event (AE) for too many retries is generated. Add
code to handle the AE and set the correct IB WC error code
IB_WC_RETRY_EXC_ERR.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-4-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Sindhu Devale
f4475f2494 RDMA/irdma: Validate number of CQ entries on create CQ
Add lower bound check for CQ entries at creation time.

Fixes: b48c24c2d7 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-3-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00