In order to later add the ability to do deeper resets of the
device when it crashes, first restructure the firmware error
handling. Instead of having just a single nic_error() method
that handles all, split it:
- nic_error() just handles and prints the error itself,
- dump_error() synchronously creates an error dump, and
- sw_reset() will be called to request doing a SW reset.
This changes the architecture so that the transport is now
responsible for deciding how to do the reset, and therefore
the handling of reprobe if error occurs during reconfig
moves there, which necessitates adding a method there that
notifies the transport that the recovery was completed.
Actually introducing the model under which deeper resets can
be done will be in future patches.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20241227095718.6d4f741ae907.I96a9243e7877808ed6d1bff6967c15d6c24882f0@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Approximately three years ago, in commit ddb6b76b6f
("iwlwifi: yoyo: support TLV-based firmware reset"), the code
was (likely erroneously) changed to no longer treat error
interrupts as firmware errors. As a result, this meant that
the fw_restart counter was only applied in case of command
queue being stuck, which never seems to happen. Also, there's
no longer any way to set the mvm->fw_restart to a value that
doesn't match exactly the module parameter behaviour.
Instead of trying to fix this, simply remove the logic that
limits the number of restarts, it's clearly unused.
However, restore the logic that restart isn't unconditional,
by checking the module parameter.
Since the "fw_error" argument to iwl_mvm_nic_restart() is now
always true (except in the "never happens" case of CMD queue
stuck), just remove it too and treat command queue stuck the
same way as everything else.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20241227095718.b0489daf323c.I0cd3233b2214c5f06e059f746041b19d08647e40@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Instead of differentiating only sync/async, differentiate
the type of error, and document that only reset handshake
timeout (IWL_ERR_TYPE_RESET_HS_TIMEOUT) needs sync handling.
The special sync handling is somewhat temporary, the idea
is to later split the nic_error() method into error dump,
synchronizing the dump, and SW reset methods, and the type
is mostly in order to unify command queue full handling
into that new architecture as well.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20241227095718.aed9c9e4fac0.I2288042bec4728a75b61cb7f6ded5214bfa3ce85@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Currently, the driver handles SMPS decisions by tracking AP
capabilities, BT coexistence changes, sending necessary SMPS
frames to the AP, and updating firmware with RX chain info
using the RLC_CONFIG_CMD.
Starting with version 3 of the RLC_CONFIG_CMD, the firmware
takes over this responsibility. It now tracks SMPS, sends
frames, and configures the RLC.
In this patch:
1. Stop sending RLC_CONFIG_CMD when firmware supports RLC
offload (version 3), as rlc.rx_chain_info is not needed by
firmware, and no other field in the cmd is used.
2. Prevent the driver from forwarding any SMPS requests to
mac80211, i.e., the driver should not transmit SMPS frames
to the AP as firmware handles that.
3. Set NL80211_FEATURE_DYNAMIC_SMPS and NL80211_FEATURE_STATIC_SMPS
conditionally based on RLC version.
Signed-off-by: Daniel Gabay <daniel.gabay@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240808232017.45da23be1f65.I0d46db82dd990a82e8a66876fe2f5310bc9513be@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Not doing so will make us send a host command to the transport while the
firmware is not alive, which will trigger a WARNING.
bad state = 0
WARNING: CPU: 2 PID: 17434 at drivers/net/wireless/intel/iwlwifi/iwl-trans.c:115 iwl_trans_send_cmd+0x1cb/0x1e0 [iwlwifi]
RIP: 0010:iwl_trans_send_cmd+0x1cb/0x1e0 [iwlwifi]
Call Trace:
<TASK>
iwl_mvm_send_cmd+0x40/0xc0 [iwlmvm]
iwl_mvm_config_scan+0x198/0x260 [iwlmvm]
iwl_mvm_recalc_tcm+0x730/0x11d0 [iwlmvm]
iwl_mvm_tcm_work+0x1d/0x30 [iwlmvm]
process_one_work+0x29e/0x640
worker_thread+0x2df/0x690
? rescuer_thread+0x540/0x540
kthread+0x192/0x1e0
? set_kthread_struct+0x90/0x90
ret_from_fork+0x22/0x30
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.5abe71ca1b6b.I97a968cb8be1f24f94652d9b110ecbf6af73f89e@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fast resume is a feature that was recently introduced to speed up the
resume time. It basically keeps the firmware alive while the system
is suspended and that avoids starting again the whole device.
This flow can't work for hibernation, since when the system boots,
before the frozen image is loaded, the kernel may touch the device. As a
result, we can't assume the device is in the exact same state as before
the hibernation.
Detect that we are resuming from hibernation through the PCI device and
forbid the fast resume flow. We also need to shut down the device
cleanly when that happens.
In addition, in case the device is power gated during S3, we won't be
able to keep the device alive. Detect this situation with BE200 at least
with the help of the CSR_FUNC_SCRATCH register and reset the device upon
resume if it was power gated during S3.
Fixes: e8bb19c1d5 ("wifi: iwlwifi: support fast resume")
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240825191257.24eb3b19e74f.I3837810318dbef0a0a773cf4c4fcf89cdc6fdbd3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
PLDR (product level device reset) is a Windows term, and
is something the driver triggers there, AFAICT.
Really what 'pldr_sync' here wants to capture is whether
or not the firmware will/may do a product reset during
initialization, which makes the device drop off the bus,
requiring a rescan. If this is the case, obviously the
init will fail/time out, so we don't want to report all
kinds of errors etc., hence this tracking variable.
Rename it to 'fw_product_reset' to capture the meaning
better.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://patch.msgid.link/20240618194245.ccf849642af8.I01dded6b2393771b7baf8b4b17336784d987c7c2@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The move of the scan complete notification handling to the wiphy worker
introduced a race between scan complete notification and scan abort:
- The wiphy lock is held, e.g., for rfkill handling etc.
- Scan complete notification is received but not handled yet.
- Scan abort is triggered, and scan abort is sent to the FW. Once the
scan abort command is sent successfully, the flow synchronously waits
for the scan complete notification. However, as the scan complete
notification was already received but not processed yet, this hangs for
a second and continues leaving the scan status in an inconsistent
state.
- Once scan complete handling is started (when the wiphy lock is not held)
since the scan status is not an inconsistent state, a warning is issued
and the scan complete notification is not handled.
To fix this issue, switch back the scan complete notification to be
asynchronously handling, and only move the link selection logic to
a worker (which was the original reason for the move to use wiphy lock).
While at it, refactor some prints to improve debug data.
Fixes: 07bf5297d3 ("wifi: iwlwifi: mvm: Implement new link selection algorithm")
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240506095953.1f484a86324b.I63ed445a47f144546948c74ae6df85587fdb4ce3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When there's an active link in a non-station vif, the station vif is
not allowed to enter EMLSR
Note that blocking EMLSR by calling iwl_mvm_block_esr() we will schedule
an exit from EMLSR worker, but the worker cannot run before the
activation of the non-BSS link, as ieee80211_remain_on_channel already
holds the wiphy mutex.
Handle that by explicitly calling ieee80211_set_active_links()
to leave EMLSR, and then doing iwl_mvm_block_esr() only for
consistency and to avoid re-entering it before ready.
Note that a call to ieee80211_set_active_links requires to release the
mvm mutex, but that's ok since we still hold the wiphy lock. The only
thing that might race here is the ESR_MODE_NOTIF, so this changes its
handler to run under the wiphy lock.
Signed-off-by: Yedidya Benshimol <yedidya.ben.shimol@intel.com>
Co-developed-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Link: https://msgid.link/20240505091420.916193759f8a.Idf3a3caf5cdc3e69c81710b7ceb57e87f2de87e4@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Replaces the current logic with a new algorithm based on the link
grading introduced in a previous patch.
The new selection algorithm will be invoked upon successful scan to ensure
it has the necessary updated data it needs.
This update delegates the selection logic as the primary link
determiner in EMLSR mode, storing it in mvmvif to avoid repeated
calculations, as the result may vary.
Additionally, includes tests for iwl_mvm_valid_link_pair to validate
link pairs for EMLSR.
Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Link: https://msgid.link/20240416134215.309fb1b3fe44.I5baf0c293c89a5a28bd1a6386bf9ca6d2bf61ab8@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>