There are some scenarios that can occur, such as performing an upgrade of
the virtual I/O server, where the supported max transfer of the backing
device for an ibmvfc HBA can change. If the max transfer of the backing
device decreases, this can cause issues with previously discovered
LUNs. This patch accomplishes two things. First, it changes the default
ibmvfc max transfer value to 1MB. This is generally supported by all
backing devices, which should mitigate this issue out of the box. Secondly,
it adds a module parameter, enabling a user to increase the max transfer
value to values that are larger than 1MB, as long as they have configured
these larger values on the virtual I/O server as well.
[mkp: fix checkpatch warnings]
Signed-off-by: Brian King <brking@linux.ibm.com>
Link: https://lore.kernel.org/r/20240903134708.139645-2-brking@linux.ibm.com
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A previous change was introduced to prevent data loss during a power-on
reset when a tape is present inside the drive. This commit set the
"pos_unknown" flag to true to avoid operations that could compromise data
by performing actions from an untracked position. The relevant change is
commit 9604eea5bd ("scsi: st: Add third party poweron reset handling")
As a consequence of this change, a new issue has surfaced: the driver now
returns an "Input/output error" even for empty drives when the drive, host,
or bus is reset. This issue stems from the "flush_buffer" function, which
first checks whether the "pos_unknown" flag is set. If the flag is set, the
user will encounter an "Input/output error" until the tape position is
known again. This behavior differs from the previous implementation, where
empty drives were not affected at system start up time, allowing tape
software to send commands to the driver to retrieve the drive's status and
other information.
The current behavior prioritizes the "pos_unknown" flag over the
"ST_NO_TAPE" status, leading to issues for software that detects drives
during system startup. This software will receive an "Input/output error"
until a tape is loaded and its position is known.
To resolve this, the "ST_NO_TAPE" status should take priority when the
drive is empty, allowing communication with the drive following a power-on
reset. At the same time, the change should continue to protect data by
maintaining the "pos_unknown" flag when the drive contains a tape and its
position is unknown.
Signed-off-by: Rafael Rocha <rrochavi@fnal.gov>
Link: https://lore.kernel.org/r/20240905173921.10944-1-rrochavi@fnal.gov
Fixes: 9604eea5bd ("scsi: st: Add third party poweron reset handling")
Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ability to handle maximum FCoE frames of 2158 bytes can never be changed
and thus more of an attribute, not a toggleable feature.
Move it from netdev_features_t to "cold" priv flags (bitfield bool) and
free yet another feature bit.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pull SCSI fixes from James Bottomley:
"Minor fixes only.
The sd.c one ignores a sync cache request if format is in progress
which can happen if formatting a drive across suspend/resume"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Ignore command SYNCHRONIZE CACHE error if format in progress
scsi: aacraid: Fix double-free on probe failure
scsi: lpfc: Fix overflow build issue
A NULL dev->dma_parms indicates either a bus that is not DMA capable or
grave bug in the implementation of the bus code.
There isn't much the driver can do in terms of error handling for either
case, so just warn and continue as DMA operations will fail anyway.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC
We'll start throwing warnings soon when dma_set_seg_boundary and
dma_set_max_seg_size are called on devices for buses that don't fully
support the DMA API. Prepare for that by making the calls in the SCSI
midlayer conditional.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace <don.brace@microchip.com> says:
These patches are based on Martin Petersen's 6.12/scsi-queue tree
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git
6.12/scsi-queue
There are two functional changes:
smartpqi-add-fw-log-to-kdump
smartpqi-add-counter-for-parity-write-stream-requests
There are three minor bug fixes:
smartpqi-fix-stream-detection
smartpqi-fix-rare-system-hang-during-LUN-reset
smartpqi-fix-volume-size-updates
The other two patches add PCI-IDs for new controllers and change the
driver version.
This set of changes consists of:
* smartpqi-add-fw-log-to-kdump
During a kdump, the driver tells the controller to copy its logging
information to some pre-allocated buffers that can be analyzed
later.
This is a "feature" driven capability and is backward compatible
with existing controller FW.
This patch renames some prefixes for OFA (Online-Firmware Activation
ofa_*) buffers to host_memory_*. So, not a lot of actual functional
changes to smartpqi_init.c, mainly determining the memory size
allocation.
We added a function to notify the controller to copy debug data into
host memory before continuing kdump.
Most of the functional changes are in smartpqi_sis.c where the
actual handshaking is done.
* smartpqi-fix-stream-detection
Correct some false write-stream detections. The data structure used
to check for write-streams was not initialized to all 0's causing
some false write stream detections. The driver sends down streamed
requests to the raid engine instead of using AIO bypass for some
extra performance. (Potential full-stripe write verses Read Modify
Write).
False detections have not caused any data corruption. Found by
internal testing. No known externally reported bugs.
* smartpqi-add-counter-for-parity-write-stream-requests
Adding some counters for raid_bypass and write streams. These two
counters are related because write stream detection is only checked
if an I/O request is eligible for bypass (AIO).
The bypass counter (raid_bypass_cnt) was moved into a common
structure (pqi_raid_io_stats) and changed to type __percpu. The
write stream counter is (write_stream_cnt) has been added to this
same structure.
These counters are __percpu counters for performance. We added a
sysfs entry to show the write stream count. The raid bypass counter
sysfs entry already exists.
Useful for checking streaming writes. The change in the sysfs entry
write_stream_cnt can be checked during AIO eligible write
operations.
* smartpqi-add-new-controller-PCI-IDs
Adding support for new controller HW. No functional changes.
* smartpqi-fix-rare-system-hang-during-LUN-reset
We found a rare race condition that can occur during a LUN reset. We
were not emptying our internal queue completely.
There have been some rare conditions where our internal request
queue has requests for multiple LUNs and a reset comes in for one of
the LUNs. The driver waits for this internal queue to empty. We were
only clearing out the requests for the LUN being reset so the
request queue was never empty causing a hang.
The Fix:
For all requests in our internal request queue:
Complete requests with DID_RESET for queued requests for the
device undergoing a reset.
Complete requests with DID_REQUEUE for all other queued requests.
Found by internal testing. No known externally reported bugs.
* smartpqi-fix-volume-size-updates
The current code only checks for a size change if there is also a
queue depth change. We are separating the check for queue depth and
the size changes.
Found by internal testing. No known bugs were filed.
* smartpqi-update-version-to-2.1.30-031
No functional changes.
Link: https://lore.kernel.org/r/20240827185501.692804-1-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct a rare case where in a LUN reset occurs on a device and I/O
requests for other devices persist in the driver's internal request queue.
Part of a LUN reset involves waiting for our internal request queue to
empty before proceeding. The internal request queue contains requests not
yet sent down to the controller.
We were clearing the requests queued for the LUN undergoing a reset, but
not all of the queued requests. Causing a hang.
For all requests in our internal request queue:
Complete requests with DID_RESET for queued requests for the device
undergoing a reset.
Complete requests with DID_REQUEUE for all other queued requests.
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-6-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add sysfs entry to check for write stream requests.
Move existing raid_bypass_cnt into a structure named pqi_raid_io_stats and
add member write_stream_cnt. These two counters are related because write
stream detection is only checked if an I/O request is eligible for bypass
(AIO).
Example usage:
lsscsi
[15:1:0:0] disk Adaptec LOGICAL VOLUME 0129 /dev/sdae
cat /sys/block/sdae/device/ssd_smart_path_enabled
1
^
|
+---- NOTE: here bypass has been enabled on device sdae
To read the counter for parity write stream requests:
cat /sys/block/sdae/device/write_stream_cnt
0x60cd507
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Co-developed-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-4-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Correct stream detection by initializing the structure
pqi_scsi_dev_raid_map_data to 0s.
When the OS issues SCSI READ commands, the driver erroneously considers
them as SCSI WRITES. If they are identified as sequential IOs, the driver
then submits those requests via the RAID path instead of the AIO path.
The 'is_write' flag might be set for SCSI READ commands also. The driver
may interpret SCSI READ commands as SCSI WRITE commands, resulting in IOs
being submitted through the RAID path.
Note: This does not cause data corruption.
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Scott Teel <scott.teel@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Mahesh Rajashekhara <mahesh.rajashekhara@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-3-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add controller logs to kdump.
Driver allocates DMA memory and communicates this address to FW. In the
event of system crash, host driver notifies the firmware about the crash
and firmware posts all the necessary logs in the pre-allocated host buffer
for firmware debugging.
Once firmware notifies the completion of the log uploading to the host
memory and host continues with the OS crash dump saving.
This is a "feature" driven capability and is backward compatible with
existing controller FW.
Rename some prefixes for OFA (Online-Firmware Activation ofa_*) buffers to
host_memory_*. So, not a lot of actual functional changes to
smartpqi_init.c, mainly determining the memory size allocation.
Added a function to notify the controller to copy debug data into host
memory before continuing kdump.
Most of the functional changes are in smartpqi_sis.c where the actual
handshaking is done.
Reviewed-by: Scott Benesh <scott.benesh@microchip.com>
Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com>
Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com>
Signed-off-by: Don Brace <don.brace@microchip.com>
Link: https://lore.kernel.org/r/20240827185501.692804-2-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI fixes from James Bottomley:
"The important core fix is another tweak to our discard discovery
issues. The off by 512 in logical block count seems bad, but in fact
the inline was only ever used in debug prints, which is why no-one
noticed"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Do not attempt to configure discard unless LBPME is set
scsi: MAINTAINERS: Add header files to SCSI SUBSYSTEM
scsi: ufs: qcom: Add UFSHCD_QUIRK_BROKEN_LSDBS_CAP for SM8550 SoC
scsi: ufs: core: Add a quirk for handling broken LSDBS field in controller capabilities register
scsi: core: Fix the return value of scsi_logical_block_count()
scsi: MAINTAINERS: Update HiSilicon SAS controller driver maintainer
Bart Van Assche <bvanassche@acm.org> says:
Hi Martin,
Multiple SCSI drivers use snprintf() to format a workqueue name before
invoking one of the create*_workqueue() macros. This patch series
simplifies such code by passing the format string and arguments to
alloc_workqueue(). Additionally, the structure members that are only
used as a temporary buffer for formatting workqueue names are
removed. Please consider this patch series for the next merge window.
Thanks,
Bart.
Link: https://lore.kernel.org/r/20240822195944.654691-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The workqueue maintainer wants to remove the create*_workqueue() macros
because these macros always set the WQ_MEM_RECLAIM flag and because these
only support literal workqueue names. Hence this patch that replaces the
create*_workqueue() invocations with the definition of this macro. The
WQ_MEM_RECLAIM flag has been retained because I think that flag is necessary
for workqueues created by storage drivers. This patch has been generated by
running spatch and git clang-format. spatch has been invoked as follows:
spatch --in-place --sp-file expand-create-workqueue.spatch $(git grep -lEw 'create_(freezable_|singlethread_|)workqueue' */scsi */ufs)
The contents of the expand-create-workqueue.spatch file is as follows:
@@
expression name;
@@
-create_workqueue(name)
+alloc_workqueue("%s", WQ_MEM_RECLAIM, 1, name)
@@
expression name;
@@
-create_freezable_workqueue(name)
+alloc_workqueue("%s", WQ_FREEZABLE | WQ_UNBOUND | WQ_MEM_RECLAIM, 1, name)
@@
expression name;
@@
-create_singlethread_workqueue(name)
+alloc_ordered_workqueue("%s", WQ_MEM_RECLAIM, name)
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240822195944.654691-2-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If formatting a suspended disk (such as formatting with different DIF
type), the disk will be resuming first, and then the format command will
submit to the disk through SG_IO ioctl.
When the disk is processing the format command, the system does not
submit other commands to the disk. Therefore, the system attempts to
suspend the disk again and sends the SYNCHRONIZE CACHE command. However,
the SYNCHRONIZE CACHE command will fail because the disk is in the
formatting process. This will cause the runtime_status of the disk to
error and it is difficult for user to recover it. Error info like:
[ 669.925325] sd 6:0:6:0: [sdg] Synchronizing SCSI cache
[ 670.202371] sd 6:0:6:0: [sdg] Synchronize Cache(10) failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
[ 670.216300] sd 6:0:6:0: [sdg] Sense Key : 0x2 [current]
[ 670.221860] sd 6:0:6:0: [sdg] ASC=0x4 ASCQ=0x4
To solve the issue, ignore the error and return success/0 when format is
in progress.
Cc: stable@vger.kernel.org
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Link: https://lore.kernel.org/r/20240819090934.2130592-1-liyihang9@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_probe_one() calls hardware-specific init functions through the
aac_driver_ident::init pointer, all of which eventually call down to
aac_init_adapter().
If aac_init_adapter() fails after allocating memory for aac_dev::queues,
it frees the memory but does not clear that member.
After the hardware-specific init function returns an error,
aac_probe_one() goes down an error path that frees the memory pointed to
by aac_dev::queues, resulting.in a double-free.
Reported-by: Michael Gordon <m.gordon.zelenoborsky@gmail.com>
Link: https://bugs.debian.org/1075855
Fixes: 8e0c5ebde8 ("[SCSI] aacraid: Newer adapter communication iterface support")
Signed-off-by: Ben Hutchings <benh@debian.org>
Link: https://lore.kernel.org/r/ZsZvfqlQMveoL5KQ@decadent.org.uk
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Build failed while enabling "CONFIG_GCOV_KERNEL=y" and
"CONFIG_GCOV_PROFILE_ALL=y" with following error:
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c: In function 'lpfc_get_cgnbuf_info':
BUILDSTDERR: ./include/linux/fortify-string.h:114:33: error: '__builtin_memcpy' accessing 18446744073709551615 bytes at offsets 0 and 0 overlaps 9223372036854775807 bytes at offset -9223372036854775808 [-Werror=restrict]
BUILDSTDERR: 114 | #define __underlying_memcpy __builtin_memcpy
BUILDSTDERR: | ^
BUILDSTDERR: ./include/linux/fortify-string.h:637:9: note: in expansion of macro '__underlying_memcpy'
BUILDSTDERR: 637 | __underlying_##op(p, q, __fortify_size); \
BUILDSTDERR: | ^~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/fortify-string.h:682:26: note: in expansion of macro '__fortify_memcpy_chk'
BUILDSTDERR: 682 | #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \
BUILDSTDERR: | ^~~~~~~~~~~~~~~~~~~~
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c:5468:9: note: in expansion of macro 'memcpy'
BUILDSTDERR: 5468 | memcpy(cgn_buff, cp, cinfosz);
BUILDSTDERR: | ^~~~~~
This happens from the commit 06bb7fc0fe ("kbuild: turn on -Wrestrict by
default"). Address this issue by using size_t type.
Signed-off-by: Sherry Yang <sherry.yang@oracle.com>
Link: https://lore.kernel.org/r/20240821065131.1180791-1-sherry.yang@oracle.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI fixes from James Bottomley:
"Two small fixes to the mpi3mr driver. One to avoid oversize
allocations in tracing and the other to fix an uninitialized spinlock
in the user to driver feature request code (used to trigger dumps and
the like)"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpi3mr: Avoid MAX_PAGE_ORDER WARNING for buffer allocations
scsi: mpi3mr: Add missing spin_lock_init() for mrioc->trigger_lock
Finn Thain <fthain@linux-m68k.org> says:
This series begins with some work on the mac_scsi driver to improve
compatibility with SCSI2SD v5 devices. Better error handling is needed
there because the PDMA hardware does not tolerate the write latency
spikes which SD cards can produce.
A bug is fixed in the 5380 core driver so that scatter/gather can be
enabled in mac_scsi.
Several patches at the end of this series improve robustness and
correctness in the core driver.
This series has been tested on a variety of mac_scsi hosts. A variety
of SCSI targets was also tested, including Quantum HDD, Fujitsu HDD,
Iomega FDD, Ricoh CD-RW, Matsushita CD-ROM, SCSI2SD and BlueSCSI.
Link: https://lore.kernel.org/r/cover.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>