Pull SCSI updates from James Bottomley:
"Only a couple of driver updates this time (lpfc and mpt3sas) plus the
usual assorted minor fixes and updates. The major core update is a set
of patches moving retries out of the drivers and into the core"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (84 commits)
scsi: core: Constify the struct device_type usage
scsi: libfc: replace deprecated strncpy() with memcpy()
scsi: lpfc: Replace deprecated strncpy() with strscpy()
scsi: bfa: Fix function pointer type mismatch for state machines
scsi: bfa: Fix function pointer type mismatch for hcb_qe->cbfn
scsi: bfa: Remove additional unnecessary struct declarations
scsi: csiostor: Avoid function pointer casts
scsi: qla1280: Remove redundant assignment to variable 'mr'
scsi: core: Make scsi_bus_type const
scsi: core: Really include kunit tests with SCSI_LIB_KUNIT_TEST
scsi: target: tcm_loop: Make tcm_loop_lld_bus const
scsi: scsi_debug: Make pseudo_lld_bus const
scsi: iscsi: Make iscsi_flashnode_bus const
scsi: fcoe: Make fcoe_bus_type const
scsi: lpfc: Copyright updates for 14.4.0.0 patches
scsi: lpfc: Update lpfc version to 14.4.0.0
scsi: lpfc: Change lpfc_vport load_flag member into a bitmask
scsi: lpfc: Change lpfc_vport fc_flag member into a bitmask
scsi: lpfc: Protect vport fc_nodes list with an explicit spin lock
scsi: lpfc: Change nlp state statistic counters into atomic_t
...
Implement support for MQ in fnic driver:
The block multiqueue layer issues IO to the fnic driver with an MQ tag. Use
the mqtag and derive a tag from it. Derive the hardware queue from the
mqtag and use it in all paths. Modify queuecommand to handle mqtag.
Replace wq and cq indices to support MQ. Replace the zeroth queue with a
hardware queue. Implement spin locks on a per hardware queue basis.
Replace io_lock with per hardware queue spinlock. Implement out of range
tag checks.
Allocate an io_req_table to track status of the io_req.
Test the driver by building it, loading it, and configuring 64 queues in
UCSM. Issue IOs using Medusa on multiple fnics. Enable/disable links to
exercise the abort and clean up path.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310300032.2awCqkfn-lkp@intel.com/
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Reviewed-by: Arulprabhu Ponnusamy <arulponn@cisco.com>
Tested-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/r/20231211173617.932990-12-kartilak@cisco.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull in the fixes tree for a commit that missed 6.5. Also resolve a
trivial merge conflict in fnic.
* 6.5/scsi-fixes: (36 commits)
scsi: storvsc: Handle additional SRB status values
scsi: snic: Fix double free in snic_tgt_create()
scsi: core: raid_class: Remove raid_component_add()
scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW major version > 5
scsi: ufs: mcq: Fix the search/wrap around logic
scsi: qedf: Fix firmware halt over suspend and resume
scsi: qedi: Fix firmware halt over suspend and resume
scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
scsi: lpfc: Remove reftag check in DIF paths
scsi: ufs: renesas: Fix private allocation
scsi: snic: Fix possible memory leak if device_add() fails
scsi: core: Fix possible memory leak if device_add() fails
scsi: core: Fix legacy /proc parsing buffer overflow
scsi: 53c700: Check that command slot is not NULL
scsi: fnic: Replace return codes in fnic_clean_pending_aborts()
scsi: storvsc: Fix handling of virtual Fibre Channel timeouts
scsi: pm80xx: Fix error return code in pm8001_pci_probe()
scsi: zfcp: Defer fc_rport blocking until after ADISC response
scsi: storvsc: Limit max_sectors for virtual Fibre Channel devices
scsi: sg: Fix checking return value of blk_get_queue()
...
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
sgreset is issued with a SCSI command pointer. The device reset code
assumes that it was issued on a hardware queue, and calls block multiqueue
layer. However, the assumption is broken, and there is no hardware queue
associated with the sgreset, and this leads to a crash due to a null
pointer exception.
Fix the code to use the max_tag_id as a tag which does not overlap with the
other tags issued by mid layer.
Tested by running FC traffic for a few minutes, and by issuing sgreset on
the device in parallel. Without the fix, the crash is observed right away.
With this fix, no crash is observed.
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Tested-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/r/20230817182146.229059-1-kartilak@cisco.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
fnic_clean_pending_aborts() was returning a non-zero value irrespective of
failure or success. This caused the caller of this function to assume that
the device reset had failed, even though it would succeed in most cases. As
a consequence, a successful device reset would escalate to host reset.
Reviewed-by: Sesidhar Baddela <sebaddel@cisco.com>
Tested-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Karan Tilak Kumar <kartilak@cisco.com>
Link: https://lore.kernel.org/r/20230727193919.2519-1-kartilak@cisco.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SPDX updates from Greg KH:
"Here is the set of SPDX comment updates for 6.0-rc1.
Nothing huge here, just a number of updated SPDX license tags and
cleanups based on the review of a number of common patterns in GPLv2
boilerplate text.
Also included in here are a few other minor updates, two USB files,
and one Documentation file update to get the SPDX lines correct.
All of these have been in the linux-next tree for a very long time"
* tag 'spdx-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: (28 commits)
Documentation: samsung-s3c24xx: Add blank line after SPDX directive
x86/crypto: Remove stray comment terminator
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_406.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_398.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_391.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_390.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_385.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_320.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_319.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_318.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_298.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_292.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_179.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 2)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_168.RULE (part 1)
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_160.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_152.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_149.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_147.RULE
treewide: Replace GPLv2 boilerplate/reference with SPDX - gpl-2.0_133.RULE
...
Based on the normalized pattern:
this program is free software you may redistribute it and/or modify it
under the terms of the gnu general public license as published by the
free software foundation version 2 of the license the software is
provided as is without warranty of any kind express or implied
including but not limited to the warranties of merchantability fitness
for a particular purpose and noninfringement in no event shall the
authors or copyright holders be liable for any claim damages or other
liability whether in an action of contract tort or otherwise arising
from out of or in connection with the software or the use or other
dealings in the software
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When aborting a SCSI command through fnic, there is a race with the fnic
interrupt handler which can result in the SCSI command and its request
being completed twice. If the interrupt handler claims the command by
setting CMD_SP to NULL first, the abort handler assumes the interrupt
handler has completed the command and returns SUCCESS, causing the request
for the scsi_cmnd to be re-queued.
But the interrupt handler may not have finished the command yet. After it
drops the spinlock protecting CMD_SP, it does memory cleanup before finally
calling scsi_done() to complete the scsi_cmnd. If the call to scsi_done
occurs after the abort handler finishes and re-queues the request, the
completion of the scsi_cmnd will advance and try to double complete a
request already queued for retry.
This patch fixes the issue by moving scsi_done() and any other use of
scsi_cmnd to before the spinlock is released by the interrupt handler.
Link: https://lore.kernel.org/r/20220311184359.2345319-1-djeffery@redhat.com
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The DEF_SCSI_QCMD() macro passes the addresses of the SCSI host lock and
also that of the scsi_done function to the queuecommand_lck() function
implementations. Remove the 'scsi_done' argument since its address is
now a constant and instead call 'scsi_done' directly from inside the
queuecommand_lck() functions.
Link: https://lore.kernel.org/r/20211007204618.2196847-14-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/fnic/fnic_scsi.c:183: warning: Function parameter or member 'fnic' not described in '__fnic_set_state_flags'
drivers/scsi/fnic/fnic_scsi.c:183: warning: Function parameter or member 'st_flags' not described in '__fnic_set_state_flags'
drivers/scsi/fnic/fnic_scsi.c:183: warning: Function parameter or member 'clearbits' not described in '__fnic_set_state_flags'
drivers/scsi/fnic/fnic_scsi.c:2296: warning: Function parameter or member 'fnic' not described in 'fnic_scsi_host_start_tag'
drivers/scsi/fnic/fnic_scsi.c:2296: warning: Function parameter or member 'sc' not described in 'fnic_scsi_host_start_tag'
drivers/scsi/fnic/fnic_scsi.c:2316: warning: Function parameter or member 'fnic' not described in 'fnic_scsi_host_end_tag'
drivers/scsi/fnic/fnic_scsi.c:2316: warning: Function parameter or member 'sc' not described in 'fnic_scsi_host_end_tag'
Link: https://lore.kernel.org/r/20210317091230.2912389-18-lee.jones@linaro.org
Cc: Satish Kharat <satishkh@cisco.com>
Cc: Sesidhar Baddela <sebaddel@cisco.com>
Cc: Karan Tilak Kumar <kartilak@cisco.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The fnic drivers assigns an ioreq structure to each command and severs this
assignment once scsi_done() has been called and the command has been
completed.
When traversing commands to terminate outstanding I/O we should not call
scsi_done() on commands which do not have a corresponding ioreq structure;
these commands have either never entered the driver or have already been
completed.
[mkp: fixed unused label warning]
Link: https://lore.kernel.org/r/20200515112647.49260-1-hare@suse.de
Reported-by: kbuild test robot <lkp@intel.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Satish Kharat <satishkh@cisco.com>
Acked-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a link is going down the driver will be calling fnic_cleanup_io(),
which will traverse all commands and calling 'done' for each found command.
While the traversal is handled under the host_lock, calling 'done' happens
after the host_lock is being dropped.
As fnic_queuecommand_lck() is being called with the host_lock held, it
might well be that it will pick the command being selected for abortion
from the above routine and enqueue it for sending, but then 'done' is being
called on that very command from the above routine.
Which of course confuses the hell out of the scsi midlayer.
So fix this by not queueing commands when fnic_cleanup_io is active.
Link: https://lore.kernel.org/r/20200116102053.62755-1-hare@suse.de
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The change is to print warning when scsi done is called for an IO that has
not yet been issued to the fw. Also adding sc and tag to debug print when
IO is cleaned up.
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This change is to add fnic stats for the max number of CQs (corresponding
to copy WQ) processed in a given interrupt, max time taken by the ISR.
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Need to use fnic_lock as well as host lock in that order to set state
flags.
[mkp: typos]
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The way these functions abuse ->special to try to store the dummy
request looks completely broken, given that it actually stores the
original scsi command.
Instead switch to ->host_scribble and store the actual dummy command.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Switch from the legacy PCI DMA API to the generic DMA API.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
fnic_fcpio_icmnd_cmpl_handler() displays the value of sc with:
FNIC_SCSI_DBG(KERN_INFO...
"... sc = 0x%p"
"scsi_status ..."
...
As the literal strings get merged, the function uses %ps instead of the
intended raw %p format. Fix this by inserting a space.
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Command abort already returns FAILED, which will then be escalated to a
host reset. So no need to call host_reset directly.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the queue command returns DID_NO_CONNECT anytime the rport is
not in RPORT_ST_READY state. Changing it to return DID_NO_CONNECT only
when the rport is in RPORT_ST_DELETE state. When the rport is in one of
the init states retruning DID_IMM_RETRY.
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
io_cmpl_skip keep track of number of completions to skip when stats are
reset. If a fw_reset happens immediately after stats reset it could put
it out of sync so need to reset io_cmpl_skip when fw reset is completed.
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The IO and Abort latency counter counts the time taken to complete the
IO and abort command into broad buckets. This is not intended for
performance measurement, just a debug statistic. current_max_io_time
tries to keep track of the maximum time an IO has taken to complete if
it is > 30sec.
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If SCSI-ML has already issued abort on a command i.e
FNIC_IOREQ_ABTS_PENDING is set and we get a IO completion, avoid this
being flagged as out-of-order completion by setting the FNIC_IO_DONE
flag in fnic_fcpio_icmnd_cmpl_handler
Signed-off-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Sesidhar Baddela <sebaddel@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>