If command timeout happens and cq complete IRQ is raised at the same time,
ufshcd_mcq_abort clears lprb->cmd and a NULL pointer deref happens in the
ISR. Error log:
ufshcd_abort: Device abort task at tag 18
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000108
pc : [0xffffffe27ef867ac] scsi_dma_unmap+0xc/0x44
lr : [0xffffffe27f1b898c] ufshcd_release_scsi_cmd+0x24/0x114
Fixes: f1304d4420 ("scsi: ufs: mcq: Added ufshcd_mcq_abort()")
Cc: stable@vger.kernel.org
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20231106075117.8995-1-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc,
scsi_debug) plus the usual assorted minor fixes and updates.
The major change this time around is a prep patch for rethreading of
the driver reset handler API not to take a scsi_cmd structure which
starts to reduce various drivers' dependence on scsi_cmd in error
handling"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (132 commits)
scsi: ufs: core: Leave space for '\0' in utf8 desc string
scsi: ufs: core: Conversion to bool not necessary
scsi: ufs: core: Fix race between force complete and ISR
scsi: megaraid: Fix up debug message in megaraid_abort_and_reset()
scsi: aic79xx: Fix up NULL command in ahd_done()
scsi: message: fusion: Initialize return value in mptfc_bus_reset()
scsi: mpt3sas: Fix loop logic
scsi: snic: Remove useless code in snic_dr_clean_pending_req()
scsi: core: Add comment to target_destroy in scsi_host_template
scsi: core: Clean up scsi_dev_queue_ready()
scsi: pmcraid: Add missing scsi_device_put() in pmcraid_eh_target_reset_handler()
scsi: target: core: Fix kernel-doc comment
scsi: pmcraid: Fix kernel-doc comment
scsi: core: Handle depopulation and restoration in progress
scsi: ufs: core: Add support for parsing OPP
scsi: ufs: core: Add OPP support for scaling clocks and regulators
scsi: ufs: dt-bindings: common: Add OPP table
scsi: scsi_debug: Add param to control sdev's allow_restart
scsi: scsi_debug: Add debugfs interface to fail target reset
scsi: scsi_debug: Add new error injection type: Reset LUN failed
...
utf16s_to_utf8s does not NULL terminate the output string. For us to be
able to add a NULL character when utf16s_to_utf8s returns, we need to make
sure that there is space for such NULL character at the end of the output
buffer. We can achieve this by passing an output buffer size to
utf16s_to_utf8s that is one character less than what we allocated.
Other call sites of utf16s_to_utf8s appear to be using the same technique
where they artificially reduce the buffer size by one to leave space for a
NULL character or line feed character.
Fixes: 4b828fe156 ("scsi: ufs: revamp string descriptor reading")
Reviewed-by: Mars Cheng <marscheng@google.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Yen-lin Lai <yenlinlai@google.com>
Signed-off-by: Daniel Mentz <danielmentz@google.com>
Link: https://lore.kernel.org/r/20231017182026.2141163-1-danielmentz@google.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While error handler force complete command (Thread A) and completion IRQ
raising (Thread B) of the same command, it may cause race condition.
Below is racing step (from 1 to 6):
ufshcd_mcq_compl_pending_transfer (Thread A)
1 if (cmd && !test_bit(SCMD_STATE_COMPLETE, &cmd->state)) {
5 spin_lock_irqsave(&hwq->cq_lock, flags); // wait lock release
set_host_byte(cmd, DID_REQUEUE);
6 ufshcd_release_scsi_cmd(hba, lrbp); // access null pointer
scsi_done(cmd);
spin_unlock_irqrestore(&hwq->cq_lock, flags);
}
ufshcd_mcq_poll_cqe_lock (Thread B)
2 spin_lock_irqsave(&hwq->cq_lock, flags);
ufshcd_mcq_poll_cqe_nolock()
ufshcd_compl_one_cqe()
3 ufshcd_release_scsi_cmd() // lrbp->cmd = NULL;
4 spin_unlock_irqrestore(&hwq->cq_lock, flags);
Signed-off-by: Alice Chao <alice.chao@mediatek.com>
Link: https://lore.kernel.org/r/20231024084324.12197-1-alice.chao@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
OPP framework can be used to scale the clocks along with other entities
such as regulators, performance state etc... So let's add support for
parsing OPP from devicetree. OPP support in devicetree is added through the
"operating-points-v2" property which accepts the OPP table defining clock
frequency, regulator voltage, power domain performance state etc...
Since the UFS controller requires multiple clocks to be controlled for
proper working, devm_pm_opp_set_config() has been used which supports
scaling multiple clocks through custom ufshcd_opp_config_clks() callback.
It should be noted that the OPP support is not compatible with the old
"freq-table-hz" property. So only one can be used at a time even though
the UFS core supports both.
Co-developed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20231012172129.65172-4-manivannan.sadhasivam@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
UFS core is only scaling the clocks during devfreq scaling and
initialization. But for an optimum power saving, regulators should also be
scaled along with the clocks.
So let's use the OPP framework which supports scaling clocks, regulators,
and performance state using OPP table defined in devicetree. For
accomodating the OPP support, the existing APIs (ufshcd_scale_clks,
ufshcd_is_devfreq_scaling_required and ufshcd_devfreq_scale) are modified
to accept "freq" as an argument which in turn used by the OPP helpers.
The OPP support is added along with the old freq-table based clock scaling
so that the existing platforms work as expected.
Co-developed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20231012172129.65172-3-manivannan.sadhasivam@linaro.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When no active_reqs, devfreq_monitor (thread A) will suspend clock scaling.
But it may have racing with clk_scaling.suspend_work (thread B) and
actually not suspend clock scaling (requeue after suspend). Next time
after polling_ms, devfreq_monitor read clk_scaling.window_start_t = 0 then
scale up clock abnormal.
Below is racing step:
devfreq->work (Thread A)
devfreq_monitor
update_devfreq
.....
ufshcd_devfreq_target
queue_work(hba->clk_scaling.workq,
1 &hba->clk_scaling.suspend_work)
.....
5 queue_delayed_work(devfreq_wq, &devfreq->work,
msecs_to_jiffies(devfreq->profile->polling_ms));
2 hba->clk_scaling.suspend_work (Thread B)
ufshcd_clk_scaling_suspend_work
__ufshcd_suspend_clkscaling
devfreq_suspend_device(hba->devfreq);
3 cancel_delayed_work_sync(&devfreq->work);
4 hba->clk_scaling.window_start_t = 0;
.....
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20230831130826.5592-4-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When ufshcd_clk_scaling_suspend_work (thread A) running and new command
coming, ufshcd_clk_scaling_start_busy (thread B) may get host_lock after
thread A first time release host_lock. Then thread A second time get
host_lock will set clk_scaling.window_start_t = 0 which scale up clock
abnormal next polling_ms time. Also inlines another
__ufshcd_suspend_clkscaling calls.
Below is racing step:
1 hba->clk_scaling.suspend_work (Thread A)
ufshcd_clk_scaling_suspend_work
2 spin_lock_irqsave(hba->host->host_lock, irq_flags);
3 hba->clk_scaling.is_suspended = true;
4 spin_unlock_irqrestore(hba->host->host_lock, irq_flags);
__ufshcd_suspend_clkscaling
7 spin_lock_irqsave(hba->host->host_lock, flags);
8 hba->clk_scaling.window_start_t = 0;
9 spin_unlock_irqrestore(hba->host->host_lock, flags);
ufshcd_send_command (Thread B)
ufshcd_clk_scaling_start_busy
5 spin_lock_irqsave(hba->host->host_lock, flags);
....
6 spin_unlock_irqrestore(hba->host->host_lock, flags);
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20230831130826.5592-3-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a dev command times out, clk_scaling.active_reqs is not decreased which
causes abnormal clock scaling.
It is complicated to handle different dev command timeout cases in both
legacy mode and MCQ mode. Besides, dev cmds are rarely used and the busy
time is short.
Remove clock scaling busy window for dev cmds like we do for UIC or TM cmds
which don't update busy window either.
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20231004062454.29165-1-peter.wang@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart from
emitting a warning) and this typically results in resource leaks. To
improve here there is a quest to make the remove callback return void. In
the first step of this quest all drivers are converted to .remove_new()
which already returns void. Eventually after all drivers are converted,
.remove_new() is renamed to .remove().
All platform drivers below drivers/ufs/ unconditionally return zero in
their remove callback and so can be converted trivially to the variant
returning void.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230917145722.1131557-1-u.kleine-koenig@pengutronix.de
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Damien Le Moal <dlemoal@kernel.org> says:
The first patch of this series fixes an issue with IRQ setup which
prevents the controller from resuming after a system suspend. The
following patches are code cleanup without any functional changes.
[mkp: The first patch went into v6.6-rc2 and thus this merge
constitutes the remaining patches of the series]
Link: https://lore.kernel.org/r/20230911232745.325149-1-dlemoal@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Nitin Rawat <quic_nitirawa@quicinc.com> says:
This patch series adds programming support for Qualcomm UFS V4 and
above to align avoid with Hardware Specification. This patch series
will address stability and performance issues.
In this patch series below changes are taken care.
1) Register layout for DME_VS_CORE_CLK_CTRL has changed for v4 and above.
2) Adds Support to configure PA_VS_CORE_CLK_40NS_CYCLES attibute for UFS V4
and above.
3) Adds Support to configure multiple unipro frequencies like 403MHz,
300MHz, 202MHz, 150 MHz, 75Mhz, 37.5 MHz for Qualcomm UFS Controller V4
and above.
4) Allow configuration of SYS1CLK_1US_REG for UFS V4 and above.
Link: https://lore.kernel.org/r/20230905052400.13935-1-quic_nitirawa@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The "hs_gear" variable is used to program the PHY settings (submode) during
ufs_qcom_power_up_sequence(). Currently, it is being updated every time the
agreed gear changes. Due to this, if the gear got downscaled before suspend
(runtime/system), then while resuming, the PHY settings for the lower gear
will be applied first and later when scaling to max gear with REINIT, the
PHY settings for the max gear will be applied.
This adds a latency while resuming and also really not needed as the PHY
gear settings are backwards compatible i.e., we can continue using the PHY
settings for max gear with lower gear speed.
So let's update the "hs_gear" variable _only_ when the agreed gear is
greater than the current one. This guarantees that the PHY settings will be
changed only during probe time and fatal error condition.
Due to this, UFSHCD_QUIRK_REINIT_AFTER_MAX_GEAR_SWITCH can now be skipped
when the PM operation is in progress.
Cc: stable@vger.kernel.org
Fixes: 96a7141da3 ("scsi: ufs: core: Add support for reinitializing the UFS device")
Reported-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://lore.kernel.org/r/20230908145329.154024-1-manivannan.sadhasivam@linaro.org
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Tested-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SYS1CLK_1US represents the required number of system 1-clock cycles for one
microsecond. UFS Host Controller V4.0 and above mandates to write
SYS1CLK_1US_REG register and also these timer configuration needs to be
called from clk scaling pre ops as per HPG.
Refactor ufs_qcom_cfg_timers and add the below code support to align
with HPG.
a)Configure SYS1CLK_1US_REG for UFS V4 and above.
b)Introduce a new argument is_pre_scale_up for ufs_qcom_cfg_timers
to configure SYS1CLK_1US for max freq during prescale and link startup
condition.
c)Move ufs_qcom_cfg_timers from clk scaling post change ops
to clk scaling pre change ops.
Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-6-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
PA_VS_CORE_CLK_40NS_CYCLES attribute represents the required number of
Qunipro core clock for 40 nanoseconds. For UFS host controller V4 and above
PA_VS_CORE_CLK_40NS_CYCLES needs to be programmed as per frequency of
unipro core clk of UFS host controller.
Add Support to configure PA_VS_CORE_CLK_40NS_CYCLES for Controller V4 and
above to align with the hardware specification and to avoid functionality
issues like h8 enter/exit failure, command timeout.
Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-4-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Qualcomm UFS Controller V4 and above supports multiple unipro frequencies
like 403MHz, 300MHz, 202MHz, 150 MHz, 75Mhz, 37.5 MHz. Current code
supports only 150MHz and 75MHz which have performance impact due to low UFS
controller frequencies.
For targets which supports frequencies other than 150 MHz and 75 Mhz, needs
an update of MAX_CORE_CLK_1US_CYCLES to match the configured frequency to
avoid functionality issues. Add multiple frequency support for
MAX_CORE_CLK_1US_CYCLES based on the frequency configured.
Co-developed-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Naveen Kumar Goud Arepalli <quic_narepall@quicinc.com>
Signed-off-by: Nitin Rawat <quic_nitirawa@quicinc.com>
Link: https://lore.kernel.org/r/20230905052400.13935-3-quic_nitirawa@quicinc.com
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The logical unit information is missing from the UFS command tracing
output. Although the device name is logged, e.g. 13200000.ufs, this name
does not include logical unit information. Hence this patch that replaces
the device name with the SCSI ID in the tracing output. An example of
tracing output with this patch applied:
kworker/8:0H-80 [008] ..... 89.106063: ufshcd_command: send_req: 0:0:0:4: tag: 10, DB: 0x7ffffbff, size: 524288, IS: 0, LBA: 1085538, opcode: 0x8a (WRITE_16), group_id: 0x0
dd-4225 [000] d.h.. 89.106219: ufshcd_command: complete_rsp: 0:0:0:4: tag: 11, DB: 0x7ffff7ff, size: 524288, IS: 0, LBA: 1081728, opcode: 0x8a (WRITE_16), group_id: 0x0
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230907183739.905938-1-bvanassche@acm.org
Reviewed-by: Can Guo <quic_cang@quicinc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
With auto hibern8 enabled, UIC could be busy processing a hibern8 operation
and the HCI would reports UIC not ready for a short while through
HCS.UCRDY. The UFS driver doesn't currently handle this situation. The
UFSHCI spec specifies UCRDY like this: whether the host controller is ready
to process UIC COMMAND
The 'ready' could be seen as many different meanings. If the meaning
includes not processing any request from HCI, processing a hibern8
operation can be 'not ready'. In this situation, the driver needs to wait
until the operations is completed.
Signed-off-by: Kiwoong Kim <kwmad.kim@samsung.com>
Link: https://lore.kernel.org/r/550484ffb66300bdcec63d3e304dfd55cb432f1f.1693790060.git.kwmad.kim@samsung.com
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Chanwoo Lee <cw9316.lee@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
According to UFSHCI 4.0 specification:
5.2 Host Controller Capabilities Registers
5.2.1 Offset 00h: CAP – Controller Capabilities:
"EHS Length in UTRD Supported (EHSLUTRDS): Indicates whether the host
controller supports EHS Length field in UTRD.
0 – Host controller takes EHS length from CMD UPIU, and SW driver use EHS
Length field in CMD UPIU.
1 – HW controller takes EHS length from UTRD, and SW driver use EHS
Length field in UTRD.
NOTE Recommend Host controllers move to taking EHS length from UTRD, and
in UFS-5, it will be mandatory."
So, when UFSHCI 4.0 doesn't support EHS Length field in UTRD, we could use
EHS Length field in CMD UPIU. Remove the limitation that advanced RPMB only
works when EHS length is supported in UTRD.
Fixes: 6ff265fc5e ("scsi: ufs: core: bsg: Add advanced RPMB support in ufs_bsg")
Co-developed-by: "jonghwi.rha" <jonghwi.rha@samsung.com>
Signed-off-by: "jonghwi.rha" <jonghwi.rha@samsung.com>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Link: https://lore.kernel.org/r/20230809181847.102123-2-beanhuo@iokpp.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull in the fixes tree for a commit that missed 6.5. Also resolve a
trivial merge conflict in fnic.
* 6.5/scsi-fixes: (36 commits)
scsi: storvsc: Handle additional SRB status values
scsi: snic: Fix double free in snic_tgt_create()
scsi: core: raid_class: Remove raid_component_add()
scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW major version > 5
scsi: ufs: mcq: Fix the search/wrap around logic
scsi: qedf: Fix firmware halt over suspend and resume
scsi: qedi: Fix firmware halt over suspend and resume
scsi: qedi: Fix potential deadlock on &qedi_percpu->p_work_lock
scsi: lpfc: Remove reftag check in DIF paths
scsi: ufs: renesas: Fix private allocation
scsi: snic: Fix possible memory leak if device_add() fails
scsi: core: Fix possible memory leak if device_add() fails
scsi: core: Fix legacy /proc parsing buffer overflow
scsi: 53c700: Check that command slot is not NULL
scsi: fnic: Replace return codes in fnic_clean_pending_aborts()
scsi: storvsc: Fix handling of virtual Fibre Channel timeouts
scsi: pm80xx: Fix error return code in pm8001_pci_probe()
scsi: zfcp: Defer fc_rport blocking until after ADISC response
scsi: storvsc: Limit max_sectors for virtual Fibre Channel devices
scsi: sg: Fix checking return value of blk_get_queue()
...
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche <bvanassche@acm.org> says:
Hi Martin,
This patch includes the following changes, none of which should change the
functionality of the UFS host controller driver:
- Improve the kernel-doc headers further.
- Fix multiple W=2 compiler warnings.
- Simplify ufshcd_abort_all().
- Simplify the code for creating and parsing UFS Transport Protocol (UTP)
headers.
Please consider this patch series for the next merge window.
Thanks,
Bart.
Link: https://lore.kernel.org/r/20230727194457.3152309-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>