linux

mirror of https://github.com/raspberrypi/linux.git synced 2025-12-15 22:41:38 +00:00

Author	SHA1	Message	Date
Peter Zijlstra	cdd30ebb1b	module: Convert symbol namespace to string literal Clean up the existing export namespace code along the same lines of commit `33def8498f` ("treewide: Convert macro and uses of __section(foo) to __section("foo")") and for the same reason, it is not desired for the namespace argument to be a macro expansion itself. Scripted using git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS \| while read file; do awk -i inplace ' /^#define EXPORT_SYMBOL_NS/ { gsub(/__stringify$ns$/, "ns"); print; next; } /^#define MODULE_IMPORT_NS/ { gsub(/__stringify$ns$/, "ns"); print; next; } /MODULE_IMPORT_NS/ { $0 = gensub(/MODULE_IMPORT_NS$([^)])$/, "MODULE_IMPORT_NS(\"\\1\")", "g"); } /EXPORT_SYMBOL_NS/ { if ($0 ~ /(EXPORT_SYMBOL_NS[^(])$([^,]+),/) { if ($0 !~ /(EXPORT_SYMBOL_NS[^(])\(([^,]+), ([^)]+)$/ && $0 !~ /(EXPORT_SYMBOL_NS[^(])/ && $0 !~ /^my/) { getline line; gsub(/[[:space:]]\\$/, ""); gsub(/[[:space:]]/, "", line); $0 = $0 " " line; } $0 = gensub(/(EXPORT_SYMBOL_NS[^(])$([^,]+), ([^)]+)$/, "\\1(\\2, \"\\3\")", "g"); } } { print }' $file; done Requested-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc Acked-by: Greg KH <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2024-12-02 11:34:44 -08:00
Linus Torvalds	55cb93fd24	Merge tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is a small set of driver core changes for 6.13-rc1. Nothing major for this merge cycle, except for the two simple merge conflicts are here just to make life interesting. Included in here are: - sysfs core changes and preparations for more sysfs api cleanups that can come through all driver trees after -rc1 is out - fw_devlink fixes based on many reports and debugging sessions - list_for_each_reverse() removal, no one was using it! - last-minute seq_printf() format string bug found and fixed in many drivers all at once. - minor bugfixes and changes full details in the shortlog" * tag 'driver-core-6.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (35 commits) Fix a potential abuse of seq_printf() format string in drivers cpu: Remove spurious NULL in attribute_group definition s390/con3215: Remove spurious NULL in attribute_group definition perf: arm-ni: Remove spurious NULL in attribute_group definition driver core: Constify bin_attribute definitions sysfs: attribute_group: allow registration of const bin_attribute firmware_loader: Fix possible resource leak in fw_log_firmware_info() drivers: core: fw_devlink: Fix excess parameter description in docstring driver core: class: Correct WARN() message in APIs class_(for_each\|find)_device() cacheinfo: Use of_property_present() for non-boolean properties cdx: Fix cdx_mmap_resource() after constifying attr in ->mmap() drivers: core: fw_devlink: Make the error message a bit more useful phy: tegra: xusb: Set fwnode for xusb port devices drm: display: Set fwnode for aux bus devices driver core: fw_devlink: Stop trying to optimize cycle detection logic driver core: Constify attribute arguments of binary attributes sysfs: bin_attribute: add const read/write callback variants sysfs: implement all BIN_ATTR_* macros in terms of __BIN_ATTR() sysfs: treewide: constify attribute callback of bin_attribute::llseek() sysfs: treewide: constify attribute callback of bin_attribute::mmap() ...	2024-11-29 11:43:29 -08:00
Linus Torvalds	2a163a4cea	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Seveal fixes scattered across the drivers and a few new features: - Minor updates and bug fixes to hfi1, efa, iopob, bnxt, hns - Force disassociate the userspace FD when hns does an async reset - bnxt new features for optimized modify QP to skip certain stayes, CQ coalescing, better debug dumping - mlx5 new data placement ordering feature - Faster destruction of mlx5 devx HW objects - Improvements to RDMA CM mad handling" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (51 commits) RDMA/bnxt_re: Correct the sequence of device suspend RDMA/bnxt_re: Use the default mode of congestion control RDMA/bnxt_re: Support different traffic class IB/cm: Rework sending DREQ when destroying a cm_id IB/cm: Do not hold reference on cm_id unless needed IB/cm: Explicitly mark if a response MAD is a retransmission RDMA/mlx5: Move events notifier registration to be after device registration RDMA/bnxt_re: Cache MSIx info to a local structure RDMA/bnxt_re: Refurbish CQ to NQ hash calculation RDMA/bnxt_re: Refactor NQ allocation RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved RDMA/hns: Fix different dgids mapping to the same dip_idx RDMA/bnxt_re: Add set_func_resources support for P5/P7 adapters RDMA/bnxt_re: Enhance RoCE SRIOV resource configuration design bnxt_en: Add support for RoCE sriov configuration RDMA/hns: Fix NULL pointer derefernce in hns_roce_map_mr_sg() RDMA/hns: Fix out-of-order issue of requester when setting FENCE RDMA/nldev: Add IB device and net device rename events RDMA/mlx5: Add implementation for ufile_hw_cleanup device operation RDMA/core: Move ib_uverbs_file struct to uverbs_types.h ...	2024-11-22 20:03:57 -08:00
Kalesh AP	68b3bca2df	RDMA/bnxt_re: Correct the sequence of device suspend When in fatal error condition, mark device as detached first and then complete all pending HWRM commands as firmware is not going to process them and eventually time out. Move the device to error only if suspend is called when device is in Fatal state. Also, remove some outdated comments. Remove the stop_irq call which is no longer required. Fixes: `cc5b9b48d4` ("RDMA/bnxt_re: Recover the device when FW error is detected") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1731660464-27838-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-17 04:56:39 -05:00
Kalesh AP	bfb27ae6d0	RDMA/bnxt_re: Use the default mode of congestion control Instead of driver setting the congestion mode, use the default values setup by Firmware. Enable the tos_ecn field in FW. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1731660464-27838-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-17 04:56:38 -05:00
Chandramohan Akula	c64b16a37b	RDMA/bnxt_re: Support different traffic class Adding support for different traffic class passed to driver. Fix the traffic class setting in modify_qp by skipping the ECN bits. Pass the service level received from applications to the firmware. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1731660464-27838-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-17 04:56:38 -05:00
Patrisious Haddad	ede132a5cf	RDMA/mlx5: Move events notifier registration to be after device registration Move pkey change work initialization and cleanup from device resources stage to notifier stage, since this is the stage which handles this work events. Fix a race between the device deregistration and pkey change work by moving MLX5_IB_STAGE_DEVICE_NOTIFIER to be after MLX5_IB_STAGE_IB_REG in order to ensure that the notifier is deregistered before the device during cleanup. Which ensures there are no works that are being executed after the device has already unregistered which can cause the panic below. BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 630071 Comm: kworker/1:2 Kdump: loaded Tainted: G W OE --------- --- 5.14.0-162.6.1.el9_1.x86_64 #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 02/27/2023 Workqueue: events pkey_change_handler [mlx5_ib] RIP: 0010:setup_qp+0x38/0x1f0 [mlx5_ib] Code: ee 41 54 45 31 e4 55 89 f5 53 48 89 fb 48 83 ec 20 8b 77 08 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 48 8b 07 48 8d 4c 24 16 <4c> 8b 38 49 8b 87 80 0b 00 00 4c 89 ff 48 8b 80 08 05 00 00 8b 40 RSP: 0018:ffffbcc54068be20 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff954054494128 RCX: ffffbcc54068be36 RDX: ffff954004934000 RSI: 0000000000000001 RDI: ffff954054494128 RBP: 0000000000000023 R08: ffff954001be2c20 R09: 0000000000000001 R10: ffff954001be2c20 R11: ffff9540260133c0 R12: 0000000000000000 R13: 0000000000000023 R14: 0000000000000000 R15: ffff9540ffcb0905 FS: 0000000000000000(0000) GS:ffff9540ffc80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000010625c001 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_ib_gsi_pkey_change+0x20/0x40 [mlx5_ib] process_one_work+0x1e8/0x3c0 worker_thread+0x50/0x3b0 ? rescuer_thread+0x380/0x380 kthread+0x149/0x170 ? set_kthread_struct+0x50/0x50 ret_from_fork+0x22/0x30 Modules linked in: rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) mlx5_fwctl(OE) fwctl(OE) ib_uverbs(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlx_compat(OE) psample mlxfw(OE) tls knem(OE) netconsole nfsv3 nfs_acl nfs lockd grace fscache netfs qrtr rfkill sunrpc intel_rapl_msr intel_rapl_common rapl hv_balloon hv_utils i2c_piix4 pcspkr joydev fuse ext4 mbcache jbd2 sr_mod sd_mod cdrom t10_pi sg ata_generic pci_hyperv pci_hyperv_intf hyperv_drm drm_shmem_helper drm_kms_helper hv_storvsc syscopyarea hv_netvsc sysfillrect sysimgblt hid_hyperv fb_sys_fops scsi_transport_fc hyperv_keyboard drm ata_piix crct10dif_pclmul crc32_pclmul crc32c_intel libata ghash_clmulni_intel hv_vmbus serio_raw [last unloaded: ib_core] CR2: 0000000000000000 ---[ end trace f6f8be4eae12f7bc ]--- Fixes: `7722f47e71` ("IB/mlx5: Create GSI transmission QPs when P_Key table is changed") Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Michael Guralnik <michaelgur@nvidia.com> Link: https://patch.msgid.link/d271ceeff0c08431b3cbbbb3e2d416f09b6d1621.1731496944.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-14 09:53:53 -05:00
Kalesh AP	31bad59805	RDMA/bnxt_re: Cache MSIx info to a local structure L2 driver allocates the vectors for RoCE and pass it through the en_dev structure to RoCE. During probe, cache the MSIx related info to a local structure. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/1731577748-1804-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-14 09:52:37 -05:00
Kalesh AP	cb97b377a1	RDMA/bnxt_re: Refurbish CQ to NQ hash calculation There are few use cases where CQ create and destroy is seen before re-creating the CQ, this kind of use case is disturbing the RR distribution and all the active CQ getting mapped to only 2 NQ alternatively. Fixing the CQ to NQ hash calculation by implementing a quick load sorting mechanism under a mutex. Using this, if the CQ was allocated and destroyed before using it, the nq selecting algorithm still obtains the least loaded CQ. Thus balancing the load on NQs. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Link: https://patch.msgid.link/1731577748-1804-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-14 09:52:37 -05:00
Kalesh AP	30b871338c	RDMA/bnxt_re: Refactor NQ allocation Move NQ related data structures from rdev to a new structure named "struct bnxt_re_nq_record" by keeping a pointer to in the rdev structure. Allocate the memory for it dynamically. This change is needed for subsequent patches in the series. Also, removed the nq_task variable from rdev structure as it is redundant and no longer used. This change would help to reduce the size of the driver private structure as well. Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1731577748-1804-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-14 09:52:37 -05:00
Kalesh AP	65ecee1327	RDMA/bnxt_re: Fail probe early when not enough MSI-x vectors are reserved L2 driver allocates and populates the MSI-x vector details for RoCE in the en_dev structure. RoCE driver requires minimum 2 MSIx vectors. Hence during probe, driver has to check and bail out if there are not enough MSI-x vectors reserved for it before proceeding further initialization. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1731577748-1804-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-14 09:52:37 -05:00
Feng Fang	faa62440a5	RDMA/hns: Fix different dgids mapping to the same dip_idx DIP algorithm requires a one-to-one mapping between dgid and dip_idx. Currently a queue 'spare_idx' is used to store QPN of QPs that use DIP algorithm. For a new dgid, use a QPN from spare_idx as dip_idx. This method lacks a mechanism for deduplicating QPN, which may result in different dgids sharing the same dip_idx and break the one-to-one mapping requirement. This patch replaces spare_idx with xarray and introduces a refcnt of a dip_idx to indicate the number of QPs that using this dip_idx. The state machine for dip_idx management is implemented as: * The entry at an index in xarray is empty -- This indicates that the corresponding dip_idx hasn't been created. * The entry at an index in xarray is not empty but with 0 refcnt -- This indicates that the corresponding dip_idx has been created but not used as dip_idx yet. * The entry at an index in xarray is not empty and with non-0 refcnt -- This indicates that the corresponding dip_idx is being used by refcnt number of DIP QPs. Fixes: `eb653eda1e` ("RDMA/hns: Bugfix for incorrect association between dip_idx and dgid") Fixes: `f91696f2f0` ("RDMA/hns: Support congestion control type selection according to the FW") Signed-off-by: Feng Fang <fangfeng4@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241112055553.3681129-1-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-14 04:56:14 -05:00
Kalesh AP	cdb21c12ad	RDMA/bnxt_re: Add set_func_resources support for P5/P7 adapters Enable set_func_resources for P5 and P7 adapters to handle VF resource distribution. Remove setting max resources per VF during PF initialization. This change is required for firmwares which does not support RoCE VF resource management by NIC driver. The code is same for all adapters now. Reviewed-by: Stephen Shi <stephen.shi@broadcom.com> Reviewed-by: Rukhsana Ansari <rukhsana.ansari@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730882676-24434-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-12 03:04:04 -05:00
Bhargava Chenna Marreddy	304cc83807	RDMA/bnxt_re: Enhance RoCE SRIOV resource configuration design Refine RoCE SRIOV resource configuration design, using the INITIALIZE_FW's flag as an indication for the new design to the firmware. RoCE driver does not have to provision resources to VF when firmware advertises support for RoCE resource management by NIC driver. Signed-off-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> CC: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730882676-24434-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-12 03:04:04 -05:00
Junxian Huang	6b526d17ee	RDMA/hns: Fix NULL pointer derefernce in hns_roce_map_mr_sg() ib_map_mr_sg() allows ULPs to specify NULL as the sg_offset argument. The driver needs to check whether it is a NULL pointer before dereferencing it. Fixes: `d387d4b54e` ("RDMA/hns: Fix missing pagesize and alignment check in FRMR") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241108075743.2652258-3-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-10 09:29:53 -05:00
Junxian Huang	5dbcb1c190	RDMA/hns: Fix out-of-order issue of requester when setting FENCE The FENCE indicator in hns WQE doesn't ensure that response data from a previous Read/Atomic operation has been written to the requester's memory before the subsequent Send/Write operation is processed. This may result in the subsequent Send/Write operation accessing the original data in memory instead of the expected response data. Unlike FENCE, the SO (Strong Order) indicator blocks the subsequent operation until the previous response data is written to memory and a bresp is returned. Set the SO indicator instead of FENCE to maintain strict order. Fixes: `9a4435375c` ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241108075743.2652258-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-10 09:29:53 -05:00
Thomas Weißschuh	b626816fdd	sysfs: treewide: constify attribute callback of bin_is_visible() The is_bin_visible() callbacks should not modify the struct bin_attribute passed as argument. Enforce this by marking the argument as const. As there are not many callback implementers perform this change throughout the tree at once. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Krzysztof Wilczyński <kw@linux.com> Link: https://lore.kernel.org/r/20241103-sysfs-const-bin_attr-v2-5-71110628844c@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-11-05 14:00:28 +01:00
Patrisious Haddad	7c891a4dbc	RDMA/mlx5: Add implementation for ufile_hw_cleanup device operation Implement the device API for ufile_hw_cleanup operation, which iterates over the ufile uobjects lists, and attempts to destroy DevX QPs, by issuing up to 8 commands in parallel. This function is responsible only for cleaning the FW resources of the QP, and doesn't necessarily cleanup all of its resources. Hence the normal serialized cleanup flow is still executed after it in __uverbs_cleanup_ufile() to cleanup the remaining resources and handle the cleanup of SW objects. In order to avoid double cleanup for the FW resources, new DevX flag was added DEVX_OBJ_FLAGS_HW_FREED, which marks the object's FW resources as already freed. Since QP destruction is the most time-consuming operation in FW, parallelizing it reduces the cleanup time of applications that use DevX QPs. Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Link: https://patch.msgid.link/2f82675d0412542cba1c47a6b86f589521ae41e1.1730373303.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:57:21 -05:00
Chiara Meiohas	0bd2c61df9	RDMA/mlx5: Ensure active slave attachment to the bond IB device Fix a race condition when creating a lag bond in active backup mode where after the bond creation the backup slave was attached to the IB device, instead of the active slave. This caused stale entries in the GID table, as the gid updating mechanism relies on ib_device_get_netdev(), which would return the backup slave. Send an MLX5_DRIVER_EVENT_ACTIVE_BACKUP_LAG_CHANGE_LOWERSTATE event when activating the lag, additionally to when modifying the lag. This ensures that eventually the active netdevice is stored in the bond IB device. When handling this event remove the GIDs of the previously attached netdevice in this port and rescan the GIDs of the newly attached netdevice. This ensures that eventually the active slave netdevice is correctly stored in the IB device port. While there might be a brief moment where the backup slave GIDs appear in the GID table, it will eventually stabilize with the correct GIDs (of the bond and the active slave). Fixes: `8d159eb211` ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Link: https://patch.msgid.link/91fc2cb24f63add266a528c1c702668a80416d9f.1730381292.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:57:21 -05:00
Chiara Meiohas	6d9c7b2729	RDMA/mlx5: Call dev_put() after the blocking notifier Move dev_put() call to occur directly after the blocking notifier, instead of within the event handler. Fixes: `8d159eb211` ("RDMA/mlx5: Use IB set_netdev and get_netdev functions") Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com> Link: https://patch.msgid.link/342ff94b3dcbb07da1c7dab862a73933d604b717.1730381292.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:57:20 -05:00
Mark Zhang	eb3d354efb	RDMA/mlx5: Support querying per-plane IB PortCounters On a SMI device, set requested plane_num when querying PPCNT register with the PortCounters Attribute group. Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Link: https://patch.msgid.link/828d57444a0a41042556bb0a4394ecf2fcaed639.1730368052.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:57:20 -05:00
Edward Srouji	8b36f7c3c6	RDMA/mlx5: Support OOO RX WQE consumption Support QP with out-of-order (OOO) capabilities enabled. This allows WRs on the receiver side of the QP to be consumed OOO, permitting the sender side to transmit messages without guaranteeing arrival order on the receiver side. When enabled, the completion ordering of WRs remains in-order, regardless of the Receive WRs consumption order. RDMA Read and RDMA Atomic operations on the responder side continue to be executed in-order, while the ordering of data placement for RDMA Write and Send operations is not guaranteed. Atomic operations larger than 8 bytes are currently not supported. Therefore, when this feature is enabled, the created QP restricts its atomic support to 8 bytes at most. In addition, when querying the device, a new flag is returned in response to indicate that the Kernel supports OOO QP. Signed-off-by: Edward Srouji <edwards@nvidia.com> Reviewed-by: Yishai Hadas <yishaih@nvidia.com> Link: https://patch.msgid.link/06ac609a5f358c8fb0a090d22c61a2f9329d82e6.1725362773.git.leon@kernel.org Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:57:20 -05:00
Kalesh AP	d7d54769c0	RDMA/bnxt_re: Add debugfs hook in the driver Adding support for a per device debugfs folder for exporting some of the device specific debug information. Added support to get QP info for now. The same folder can be used to export other debug features in future. Signed-off-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730428483-17841-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:53:06 -05:00
Kashyap Desai	e4bcf8eb2a	RDMA/bnxt_re: Support raw data query for each resources Support interfaces to get the raw data for each of the resources. Use this interface to get some of the HW structures from active resources. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730428483-17841-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:53:06 -05:00
Kashyap Desai	3b72946bdf	RDMA/bnxt_re: Add support for querying HW contexts Implements support for querying the hardware resource contexts. This raw data can be used for the debugging of the field issues. Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730428483-17841-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:53:06 -05:00
Kashyap Desai	7363eb76b7	RDMA/bnxt_re: Support driver specific data collection using rdma tool Allow users to dump driver specific resource details when queried through rdma tool. This supports the driver data for QP, CQ, MR and SRQ. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730428483-17841-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-04 06:53:06 -05:00
Christophe JAILLET	aceee63a3a	RDMA/bnxt_re: Remove some dead code If the probe succeeds, then auxiliary_get_drvdata() can't return a NULL pointer. So several NULL checks can be removed to simplify code. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/f02eb630734ee530315dce9f60b078f631ae93d0.1730477345.git.christophe.jaillet@wanadoo.fr Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-03 11:02:02 -05:00
Christophe JAILLET	cf90a4d1b9	RDMA/bnxt_re: Fix some error handling paths in bnxt_re_probe() If bnxt_re_add_device() fails, 'en_info' still needs to be freed, as already done in the .remove() function. The commit in Fixes incorrectly removed this call, certainly because it was expecting the .remove() function was called anyway. But if the probe fails, the remove function is not called. There is no need to call bnxt_re_remove() as it was done before, kfree() is enough. Fixes: `a5e099e0c4` ("RDMA/bnxt_re: Fix an error path in bnxt_re_add_device") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://patch.msgid.link/9e48ff955ae55fc39a9eb1eb590d374539eab5ba.1730477345.git.christophe.jaillet@wanadoo.fr Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-11-03 11:02:02 -05:00
Michael Margolin	1103579d6e	RDMA/efa: Report link speed according to device attributes Set port link speed and width based on max bandwidth acquired from the device instead of using constant 100 Gbps. Use a default value in case the device didn't set the field. Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Firas Jahjah <firasj@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Link: https://patch.msgid.link/20241030093006.21352-1-mrgolin@amazon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 15:36:59 +02:00
Kashyap Desai	808ca6de98	RDMA/bnxt_re: Check cqe flags to know imm_data vs inv_irkey Invalidate rkey is cpu endian and immediate data is in big endian format. Both immediate data and invalidate the remote key returned by HW is in little endian format. While handling the commit in fixes tag, the difference between immediate data and invalidate rkey endianness was not considered. Without changes of this patch, Kernel ULP was failing while processing inv_rkey. dmesg log snippet - nvme nvme0: Bogus remote invalidation for rkey 0x2000019Fix in this patch Do endianness conversion based on completion queue entry flag. Also, the HW completions are already converted to host endianness in bnxt_qplib_cq_process_res_rc and bnxt_qplib_cq_process_res_ud and there is no need to convert it again in bnxt_re_poll_cq. Modified the union to hold the correct data type. Fixes: `95b087f87b` ("bnxt_re: Fix imm_data endianness") Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1730110014-20755-1-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 14:24:50 +02:00
wenglianfa	323275ac2f	RDMA/hns: Fix cpu stuck caused by printings during reset During reset, cmd to destroy resources such as qp, cq, and mr may fail, and error logs will be printed. When a large number of resources are destroyed, there will be lots of printings, and it may lead to a cpu stuck. Delete some unnecessary printings and replace other printing functions in these paths with the ratelimited version. Fixes: `9a4435375c` ("IB/hns: Add driver files for hns RoCE driver") Fixes: `c7bcb13442` ("RDMA/hns: Add SRQ support for hip08 kernel mode") Fixes: `70f9252158` ("RDMA/hns: Use the reserved loopback QPs to free MR before destroying MPT") Fixes: `926a01dc00` ("RDMA/hns: Add QP operations support for hip08 SoC") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241024124000.2931869-6-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 14:13:55 +02:00
Junxian Huang	d81fb6511a	RDMA/hns: Use dev_* printings in hem code instead of ibdev_* The hem code is executed before ib_dev is registered, so use dev_* printing instead of ibdev_* to avoid log like this: (null): set HEM address to HW failed! Fixes: `2f49de21f3` ("RDMA/hns: Optimize mhop get flow for multi-hop addressing") Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241024124000.2931869-5-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 14:13:55 +02:00
Yuyu Li	370a9351bf	RDMA/hns: Modify debugfs name The sub-directory of hns_roce debugfs is named after the device's kernel name currently, but it will be inconvenient to use when the device is renamed. Modify the name to pci name as users can always easily find the correspondence between an RDMA device and its pci name. Fixes: `eb7854d63d` ("RDMA/hns: Support SW stats with debugfs") Signed-off-by: Yuyu Li <liyuyu6@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241024124000.2931869-4-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 14:13:55 +02:00
wenglianfa	377a209770	RDMA/hns: Fix flush cqe error when racing with destroy qp QP needs to be modified to IB_QPS_ERROR to trigger HW flush cqe. But when this process races with destroy qp, the destroy-qp process may modify the QP to IB_QPS_RESET first. In this case flush cqe will fail since it is invalid to modify qp from IB_QPS_RESET to IB_QPS_ERROR. Add lock and bit flag to make sure pending flush cqe work is completed first and no more new works will be added. Fixes: `ffd541d457` ("RDMA/hns: Add the workqueue framework for flush cqe handler") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241024124000.2931869-3-huangjunxian6@hisilicon.com Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 14:13:54 +02:00
wenglianfa	571e4ab8a4	RDMA/hns: Fix an AEQE overflow error caused by untimely update of eq_db_ci eq_db_ci is updated only after all AEQEs are processed in the AEQ interrupt handler, which is not timely enough and may result in AEQ overflow. Two optimization methods are proposed: 1. Set an upper limit for AEQE processing. 2. Move time-consuming operations such as printings to the bottom half of the interrupt. cmd events and flush_cqe events are still fully processed in the top half to ensure timely handling. Fixes: `a5073d6054` ("RDMA/hns: Add eq support of hip08") Signed-off-by: wenglianfa <wenglianfa@huawei.com> Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com> Link: https://patch.msgid.link/20241024124000.2931869-2-huangjunxian6@hisilicon.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-30 14:13:54 +02:00
Hongguang Gao	52f70dea42	RDMA/bnxt_re: Fix access flags for MR and QP modify Access flag definition in MR and QP is different in FW. Currently both reg/bind MR and modify/query QP uses the same flags. Add a different function to map the QP access flags for newer adapters. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1729065346-1364-6-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-28 20:59:32 +02:00
Kalesh AP	9a420bb2b4	RDMA/bnxt_re: Add support for modify_device hook Adds support for modify_device in the driver for node desc changes. Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1729065346-1364-5-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-28 20:59:32 +02:00
Chandramohan Akula	9c4927cacc	RDMA/bnxt_re: Add support for CQ rx coalescing RoCE message rate performance is heavily degraded without the use of cq coalescing. With proper coalescing, message rates get better. Furthermore, coalescing significantly reduces contention on the PCIe Root Complex/Memory subsystems. Add the changes to configure CQ rx colascing parameters based on adapter revision when CQ is created. Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1729065346-1364-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-28 20:59:32 +02:00
Kalesh AP	82c32d2192	RDMA/bnxt_re: Add support for optimized modify QP Modify QP improvements are for state transitions from INIT -> RTR and RTR -> RTS. In order to support the Modify QP Optimization feature, the driver is expected to check for the feature support in the CMDQ_QUERY_FUNC and register its support for this feature with the FW in CMDQ_INITIALIZE_FIRMWARE. Additionally, the driver is required to specify the new fields and attribute masks for the transitions as follows: 1. INIT -> RTR: - New fields: srq_used, type. - enable srq_used when RC QP is configured to use SRQ. - set the type based on the QP type. - Mandatory masks: - RC: CMDQ_MODIFY_QP_MODIFY_MASK_ACCESS, CMDQ_MODIFY_QP_MODIFY_MASK_PKEY - UD QP and QP1: CMDQ_MODIFY_QP_MODIFY_MASK_PKEY, CMDQ_MODIFY_QP_MODIFY_MASK_QKEY 2. RTR -> RTS: - New fields: type - set the type based on the QP type. - Mandatory masks: - RC: CMDQ_MODIFY_QP_MODIFY_MASK_ACCESS - UD QP and QP1: CMDQ_MODIFY_QP_MODIFY_MASK_QKEY Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Tushar Rane <tushar.rane@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1729065346-1364-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>	2024-10-28 20:59:32 +02:00
Michael Margolin	48931f65e9	RDMA/efa: Add option to set QP service level on create Using modify QP with AH attributes and IB_QP_AV flag set doesn't make much sense for connectionless QP types like SRD. Add SL parameter to EFA create QP user ABI and pass it to the device. Link: https://patch.msgid.link/r/20241015174242.3490-3-mrgolin@amazon.com Reviewed-by: Firas Jahjah <firasj@amazon.com> Reviewed-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:58:07 -03:00
Michael Margolin	1e7b86f1b2	RDMA/efa: Update device interface Update device interface header files. Link: https://patch.msgid.link/r/20241015174242.3490-2-mrgolin@amazon.com Reviewed-by: Daniel Kranzdorf <dkkranzd@amazon.com> Reviewed-by: Yonatan Nachum <ynachum@amazon.com> Signed-off-by: Michael Margolin <mrgolin@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:58:07 -03:00
Selvin Xavier	76d3ddff71	RDMA/bnxt_re: synchronize the qp-handle table array There is a race between the CREQ tasklet and destroy qp when accessing the qp-handle table. There is a chance of reading a valid qp-handle in the CREQ tasklet handler while the QP is already moving ahead with the destruction. Fixing this race by implementing a table-lock to synchronize the access. Fixes: `f218d67ef0` ("RDMA/bnxt_re: Allow posting when QPs are in error") Fixes: `84cf229f40` ("RDMA/bnxt_re: Fix the qp table indexing") Link: https://patch.msgid.link/r/1728912975-19346-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:28:15 -03:00
Selvin Xavier	d71f4acd58	RDMA/bnxt_re: Fix the usage of control path spin locks Control path completion processing always runs in tasklet context. To synchronize with the posting thread, there is no need to use the irq variant of spin lock. Use spin_lock_bh instead. Fixes: `1ac5a40479` ("RDMA/bnxt_re: Add bnxt_re RoCE driver") Link: https://patch.msgid.link/r/1728912975-19346-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:28:15 -03:00
Patrisious Haddad	78ed28e08e	RDMA/mlx5: Round max_rd_atomic/max_dest_rd_atomic up instead of down After the cited commit below max_dest_rd_atomic and max_rd_atomic values are being rounded down to the next power of 2. As opposed to the old behavior and mlx4 driver where they used to be rounded up instead. In order to stay consistent with older code and other drivers, revert to using fls round function which rounds up to the next power of 2. Fixes: `f18e26af6a` ("RDMA/mlx5: Convert modify QP to use MLX5_SET macros") Link: https://patch.msgid.link/r/d85515d6ef21a2fa8ef4c8293dce9b58df8a6297.1728550179.git.leon@kernel.org Signed-off-by: Patrisious Haddad <phaddad@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:25:44 -03:00
Leon Romanovsky	89f8c6f197	RDMA/cxgb4: Dump vendor specific QP details Restore the missing functionality to dump vendor specific QP details, which was mistakenly removed in the commit mentioned in Fixes line. Fixes: `5cc34116cc` ("RDMA: Add dedicated QP resource tracker function") Link: https://patch.msgid.link/r/ed9844829135cfdcac7d64285688195a5cd43f82.1728323026.git.leonro@nvidia.com Reported-by: Dr. David Alan Gilbert <linux@treblig.org> Closes: https://lore.kernel.org/all/Zv_4qAxuC0dLmgXP@gallifrey Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-21 13:22:33 -03:00
Kalesh AP	dc5006cfcf	RDMA/bnxt_re: Fix the GID table length GID table length is reported by FW. The gid index which is passed to the driver during modify_qp/create_ah is restricted by the sgid_index field of struct ib_global_route. sgid_index is u8 and the max sgid possible is 256. Each GID entry in HW will have 2 GID entries in the kernel gid table. So we can support twice the gid table size reported by FW. Also, restrict the max GID to 256 also. Fixes: `847b97887e` ("RDMA/bnxt_re: Restrict the max_gids to 256") Link: https://patch.msgid.link/r/1728373302-19530-11-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:02 -03:00
Bhargava Chenna Marreddy	7988bdbbb8	RDMA/bnxt_re: Fix a bug while setting up Level-2 PBL pages Avoid memory corruption while setting up Level-2 PBL pages for the non MR resources when num_pages > 256K. There will be a single PDE page address (contiguous pages in the case of > PAGE_SIZE), but, current logic assumes multiple pages, leading to invalid memory access after 256K PBL entries in the PDE. Fixes: `0c4dcd6028` ("RDMA/bnxt_re: Refactor hardware queue memory allocation") Link: https://patch.msgid.link/r/1728373302-19530-10-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:02 -03:00
Chandramohan Akula	2df411353d	RDMA/bnxt_re: Change the sequence of updating the CQ toggle value Currently the CQ toggle value in the shared page (read by the userlib) is updated as part of the cqn_handler. There is a potential race of application calling the CQ ARM doorbell immediately and using the old toggle value. Change the sequence of updating CQ toggle value to update in the bnxt_qplib_service_nq function immediately after reading the toggle value to be in sync with the HW updated value. Fixes: `e275919d96` ("RDMA/bnxt_re: Share a page to expose per CQ info with userspace") Link: https://patch.msgid.link/r/1728373302-19530-9-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Kalesh AP	a5e099e0c4	RDMA/bnxt_re: Fix an error path in bnxt_re_add_device In bnxt_re_add_device(), when register netdev notifier fails, driver is not unregistering the IB device in the error cleanup path. Also, removed the duplicate cleanup in error path of bnxt_re_probe. Fixes: `94a9dc6ac8` ("RDMA/bnxt_re: Group all operations under add_device and remove_device") Link: https://patch.msgid.link/r/1728373302-19530-8-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00
Selvin Xavier	8be3e5b0c9	RDMA/bnxt_re: Avoid CPU lockups due fifo occupancy check loop Driver waits indefinitely for the fifo occupancy to go below a threshold as soon as the pacing interrupt is received. This can cause soft lockup on one of the processors, if the rate of DB is very high. Add a loop count for FPGA and exit the __wait_for_fifo_occupancy_below_th if the loop is taking more time. Pacing will be continuing until the occupancy is below the threshold. This is ensured by the checks in bnxt_re_pacing_timer_exp and further scheduling the work for pacing based on the fifo occupancy. Fixes: `2ad4e6303a` ("RDMA/bnxt_re: Implement doorbell pacing algorithm") Link: https://patch.msgid.link/r/1728373302-19530-7-git-send-email-selvin.xavier@broadcom.com Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Chandramohan Akula <chandramohan.akula@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2024-10-11 20:49:01 -03:00

1 2 3 4 5 ...

8933 Commits