struct net_device shouldn't be embedded into any structure; instead,
the owner should use the priv space to embed its state into the
net_device. Embedding net_device into other structures prohibits the
use of flexible arrays in the net_device structure. For more details,
see the discussion at [1].
Un-embed the net_device from struct iwl_trans_pcie by converting it
into a pointer, and leverage alloc_netdev() to allocate the
net_device object in iwl_trans_pcie_alloc.
The private data of the net_device becomes a pointer to the struct
iwl_trans_pcie, so it is easy to get back to the iwl_trans_pcie
parent given the net_device object.
[1] https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/
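A minimal sketch of the resulting pattern (the setup callback, helper
name and napi_dev field below are illustrative assumptions, not the
exact driver code):

#include <linux/netdevice.h>

/* illustrative stand-in; the real struct has many more members */
struct iwl_trans_pcie {
	struct net_device *napi_dev;	/* now a pointer, no longer embedded */
};

static void iwl_dummy_netdev_setup(struct net_device *dev)
{
	/* nothing to set up for a NAPI-only dummy device */
}

static int iwl_pcie_alloc_napi_dev(struct iwl_trans_pcie *trans_pcie)
{
	struct iwl_trans_pcie **priv;
	struct net_device *dev;

	/* the priv space holds just a back-pointer to the parent */
	dev = alloc_netdev(sizeof(*priv), "dummy%d", NET_NAME_UNKNOWN,
			   iwl_dummy_netdev_setup);
	if (!dev)
		return -ENOMEM;

	priv = netdev_priv(dev);
	*priv = trans_pcie;
	trans_pcie->napi_dev = dev;
	return 0;
}

Tearing down is then just a free_netdev() on the pointer.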
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://msgid.link/20240501165417.3406039-1-leitao@debian.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The Bz devices got a new completion descriptor again since
we only ever really used 4 out of 32 bytes anyway. Adjust
the code to deal with that. Note that the intention was to
reduce the size, but the hardware was implemented wrongly.
While at it, do some cleanups: remove the union to simplify the
code, clean up iwl_pcie_free_bd_size() so it no longer needs an
argument, and add iwl_pcie_used_bd_size() with the logic to select
the completion descriptor size.
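Roughly, the selection logic amounts to the following sketch (the Bz
completion descriptor struct name is an assumption here):

static int iwl_pcie_used_bd_size(struct iwl_trans *trans)
{
	/* Bz: new, smaller completion descriptor */
	if (trans->trans_cfg->device_family >= IWL_DEVICE_FAMILY_BZ)
		return sizeof(struct iwl_rx_completion_desc_bz);

	/* older multi-queue devices keep the 32-byte descriptor */
	return sizeof(struct iwl_rx_completion_desc);
}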
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20220204122220.bef461a04110.I90c8885550fa54eb0aaa4363d322f50e301175a6@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
The TR/CR tail data are meant to be per-queue arrays; however, we
allocate them completely wrong (with a separate allocation per
queue).
Looking at this more closely, it turns out that the hardware
never uses these - we have a separate free list per RX queue
and maintain a write pointer for that in a register, and the
RX itself is indicated in the RB status (rb_stts) DMA region.
Despite nothing using the tail pointers, the hardware will
unconditionally access them to write updates, even when we aren't
using CRs/TRs.
Give it dummy values that we never use/update so it can do that
without causing trouble.
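The idea, roughly sketched (names other than the DMA API are
illustrative): allocate one small throwaway DMA buffer and point
every queue's TR/CR tail at it, so the hardware has somewhere
harmless to write.

#include <linux/dma-mapping.h>

struct dummy_tail {
	__le16 *cpu_addr;	/* never read by the driver */
	dma_addr_t dma_addr;	/* programmed as the TR/CR tail address */
};

static int iwl_alloc_dummy_tail(struct device *dev, struct dummy_tail *dt)
{
	dt->cpu_addr = dma_alloc_coherent(dev, sizeof(__le16) * 2,
					  &dt->dma_addr, GFP_KERNEL);
	return dt->cpu_addr ? 0 : -ENOMEM;
}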
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210617110647.5f5764e04c46.I4d5de1929be048085767f1234a1e07b517ab6a2d@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
After the fix from Jiri that disabled local IRQs instead of just
BHs (necessary to fix an issue with submitting a command with IRQs
already disabled), there was still a situation in which we could,
deep inside that code, re-enable BHs: if the device config sets the
apmg_wake_up_wa configuration, which is true on all 7000 series
devices.
To fix that without reverting commit 1ed08f6fb5 ("iwlwifi: remove
flags argument for nic_access"), split up nic access into a version
with BH manipulation, to be used most of the time, and one without
it for this specific case where local IRQs are already disabled.
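Roughly, the resulting shape (the lock and symbol names below are
illustrative, not the exact driver symbols):

#include <linux/spinlock.h>

struct nic_ctx {
	spinlock_t reg_lock;
};

/* the guts; the caller already holds reg_lock with BHs or IRQs disabled */
static bool __grab_nic_access(struct nic_ctx *ctx)
{
	lockdep_assert_held(&ctx->reg_lock);
	/* wake the device, poll for MAC clock readiness, etc. */
	return true;
}

/* common path: free to disable/enable BHs around the register lock */
static bool grab_nic_access_bh(struct nic_ctx *ctx)
{
	spin_lock_bh(&ctx->reg_lock);
	if (!__grab_nic_access(ctx)) {
		spin_unlock_bh(&ctx->reg_lock);
		return false;
	}
	return true;	/* lock stays held until nic access is released */
}

/* command-submission path where local IRQs are already disabled */
static bool grab_nic_access_noirq(struct nic_ctx *ctx)
{
	spin_lock(&ctx->reg_lock);	/* no BH manipulation here */
	if (!__grab_nic_access(ctx)) {
		spin_unlock(&ctx->reg_lock);
		return false;
	}
	return true;
}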
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210415164821.d0f2edda1651.I75f762e0bed38914d1300ea198b86dd449b4b206@changeid
To avoid duplicating code, we need to call the
iwl_pcie_txq_update_byte_cnt_tbl() function from bus-independent
code, so make it bus independent.
The following spatch rule was used:
@r1@
struct iwl_trans_pcie *trans_pcie;
@@
(
-trans_pcie->scd_bc_tbls
+trans->txqs.scd_bc_tbls
|
-iwl_pcie_txq_update_byte_cnt_tbl
+iwl_txq_gen1_update_byte_cnt_tbl
|
-iwl_pcie_txq_inval_byte_cnt_tbl
+iwl_txq_gen1_inval_byte_cnt_tbl
|
-iwl_pcie_tfd_unmap
+iwl_txq_gen1_tfd_unmap
|
-iwl_pcie_tfd_tb_get_addr
+iwl_txq_gen1_tfd_tb_get_addr
|
-iwl_pcie_tfd_tb_get_len
+iwl_txq_gen1_tfd_tb_get_len
|
-iwl_pcie_tfd_get_num_tbs
+iwl_txq_gen1_tfd_get_num_tbs
)
/* clean all new unused variables */
@ depends on r1@
type T;
identifier i;
expression E;
@@
- T i = E;
... when != i
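With that, a bus-independent caller only needs struct iwl_trans;
roughly (the exact signature of the renamed helper is assumed here):

static void example_tx_update(struct iwl_trans *trans, struct iwl_txq *txq,
			      u16 byte_cnt, int num_tbs)
{
	/* previously only reachable through struct iwl_trans_pcie */
	iwl_txq_gen1_update_byte_cnt_tbl(trans, txq, byte_cnt, num_tbs);
}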
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200930191738.8d33e791ec8c.Ica35125ed640aa3aa1ecc38fb5e8f1600caa8df6@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We don't want to have txq code in the PCIe transport code, so move all
the relevant elements to a new iwl_txq structure and store it in
iwl_trans.
The following spatch rule was used:
@ replace_pcie @
struct iwl_trans_pcie *trans_pcie;
@@
(
-trans_pcie->queue_stopped
+trans->txqs.queue_stopped
|
-trans_pcie->queue_used
+trans->txqs.queue_used
|
-trans_pcie->txq
+trans->txqs.txq
|
-trans_pcie->cmd_queue
+trans->txqs.cmd.q_id
|
-trans_pcie->cmd_fifo
+trans->txqs.cmd.fifo
|
-trans_pcie->cmd_q_wdg_timeout
+trans->txqs.cmd.wdg_timeout
)
// clean all new unused variables
@ depends on replace_pcie @
type T;
identifier i;
expression E;
@@
- T i = E;
... when != i
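A rough sketch of the state that moves into the bus-independent
struct, derived from the replacements above (not the exact upstream
layout; field types and the queue-count constant are assumptions):

#define EXAMPLE_MAX_QUEUES 512	/* placeholder for the real maximum */

struct iwl_trans_txqs {
	unsigned long queue_used[BITS_TO_LONGS(EXAMPLE_MAX_QUEUES)];
	unsigned long queue_stopped[BITS_TO_LONGS(EXAMPLE_MAX_QUEUES)];
	struct iwl_txq *txq[EXAMPLE_MAX_QUEUES];
	struct {
		u8 fifo;		  /* was trans_pcie->cmd_fifo */
		u8 q_id;		  /* was trans_pcie->cmd_queue */
		unsigned int wdg_timeout; /* was trans_pcie->cmd_q_wdg_timeout */
	} cmd;
};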
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200529092401.a428d3c9d66f.Ie04ae55f33954636a39c98e7ae1e739c0507435b@changeid
We don't really expect fragmented RBs, and don't seem to be seeing
them in practice, since that would've caused a crash. Nevertheless,
we should be prepared for the hardware to send them.
Parse the flag indicating a fragmented buffer, but then discard it,
and any fragments thereof, at least for now. We need to do more work
in the higher layers to properly deal with this: fragmentation may
not happen for "normal" firmware notifications, only for RX, and
then we'd need to put the pieces back together and add the necessary
API to report a chain of buffers to the higher layers, which doesn't
fit into struct iwl_rx_cmd_buffer today.
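Roughly what the interim handling amounts to (descriptor layout and
names below are placeholders, not the real completion descriptor):

struct example_rx_cd { u32 flags; };		/* placeholder descriptor */
#define EXAMPLE_RX_CD_FRAGMENTED BIT(0)		/* assumed flag bit */

static bool rx_should_drop(const struct example_rx_cd *cd)
{
	/* for now, any fragmented RB (and its continuations) is discarded,
	 * since iwl_rx_cmd_buffer can't represent a chain of buffers yet */
	return cd->flags & EXAMPLE_RX_CD_FRAGMENTED;
}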
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200425130140.e78a59f70b1d.Ica656a98a4e4220d73edc97600edd680cbc97241@changeid
There's no need for this to be exposed outside of the tx.c file;
make it static.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Kalle Valo says:
====================
wireless-drivers-next patches for v5.6
Second set of patches for v5.6. Nothing special standing out;
smaller new features and fixes all over.
Major changes:
ar5523
* add support for SMCWUSBT-G2 USB device
iwlwifi
* support new versions of the FTM FW APIs
* support new version of the beacon template FW API
* print some extra information when the driver is loaded
rtw88
* support wowlan feature for 8822c
* add support for WIPHY_WOWLAN_NET_DETECT
brcmfmac
* add initial support for monitor mode
qtnfmac
* add module parameter to enable DFS offloading in firmware
* add support for STA HE rates
* add support for TWT responder and spatial reuse
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If we have only 2k RBs like on the latest (AX210) hardware, then
even on x86 where PAGE_SIZE is 4k we currently waste half of the
memory.
If this is the case, return partial pages from the allocator and
track the offset in each RBD (to be able to find the data in them
and remap them later).
This might also address other platforms with larger PAGE_SIZE by
putting more RBs into a single large page.
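The partial-page idea, sketched (allocator state and names here are
illustrative): carve rb_size-sized chunks out of one page and
remember each chunk's offset in the RBD.

#include <linux/gfp.h>
#include <linux/mm.h>

struct rb_page_alloc {
	struct page *page;		/* page currently being carved up */
	unsigned int next_offset;
};

static struct page *rb_alloc_partial(struct rb_page_alloc *pa,
				     unsigned int rb_size,
				     unsigned int *offset)
{
	if (!pa->page || pa->next_offset + rb_size > PAGE_SIZE) {
		pa->page = alloc_page(GFP_KERNEL);
		if (!pa->page)
			return NULL;
		pa->next_offset = 0;
	}

	*offset = pa->next_offset;	/* stored in the RBD for later remap */
	pa->next_offset += rb_size;
	get_page(pa->page);		/* one reference per chunk handed out;
					 * the allocator's own initial ref is
					 * dropped once the page is used up
					 * (omitted here) */
	return pa->page;
}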
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
We don't need to map all of the RX buffers; we won't use that much,
so map only the part we're going to use. This saves some IOMMU space
(if applicable, and if it can deal with that) and also prepares a
bit for mapping partial pages for 2K buffers later.
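In other words, roughly (a sketch, not the driver code):

#include <linux/dma-mapping.h>

static dma_addr_t map_rx_buf(struct device *dev, struct page *page,
			     unsigned int offset, unsigned int rb_size)
{
	/* map only the rb_size bytes the hardware may actually fill,
	 * not the whole page; callers check dma_mapping_error() */
	return dma_map_page(dev, page, offset, rb_size, DMA_FROM_DEVICE);
}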
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
For HE-capable devices, we need to allocate more receive buffers as
there could be 256 frames aggregated into a single A-MPDU, and then
they might contain A-MSDUs as well. Until 22000 family, the devices
are able to put multiple frames into a single RB and the default RB
size is 4k, but starting from AX210 family this is no longer true.
On the other hand, those newer devices only use 2k receive buffers
(by default).
Modify the code and configuration to allocate an appropriate number
of RBs depending on the device capabilities:
* 4096 for AX210 HE devices, which use 2k buffers by default,
* 2048 for 22000 family devices which use 4k buffers by default,
* 512 for existing 9000 family devices, which doesn't really
change anything since that was the default before this patch,
* 512 also for AX210/22000 family devices that don't do HE.
Theoretically, for devices lower than AX210, we wouldn't have to
allocate that many RBs if the RB size was manually increased, but
supporting that would have made the code more complex, and it didn't
really seem necessary as that's a use case for monitor mode only,
where hopefully the wasted memory isn't really much of a concern.
Note that AX210 devices actually support a VID that is larger than
12 bits, which is required here as we want to allocate 4096 buffers
plus some for quick recycling, so adjust the code for that as well.
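The resulting selection, roughly (a simplified sketch; the real code
keys off the device configuration):

static unsigned int num_rbds_for_device(bool he_capable, bool is_ax210,
					bool is_22000)
{
	if (!he_capable)
		return 512;		/* unchanged default */
	if (is_ax210)
		return 4096;		/* 2k RBs by default */
	if (is_22000)
		return 2048;		/* 4k RBs by default */
	return 512;
}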
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
After more investigation on the hardware side, it appears that the
hardware bug regarding 2^32 boundary reaching/crossing also affects
other uses of the DMA engine, in particular the ones triggered by
the context-info (image loader) mechanism.
It also turns out that the bug only affects devices with the gen2
TX hardware engine, so we don't need to change the context info for
gen3. It's simpler, though, to keep the TX path workarounds for
both.
Add the workaround to that code as well; this is a lot simpler as
we have just a single way to allocate DMA memory there.
I made the algorithm recursive (with a small limit) since it's
actually (almost) impossible to hit this today: dma_alloc_coherent
is currently documented to always return 32-bit addressable memory
regardless of the DMA mask for it, so we'd have to get REALLY
unlucky and be handed the very last page in that area.
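A sketch of that recursive workaround (the helper name and retry
limit are illustrative):

#include <linux/dma-mapping.h>
#include <linux/kernel.h>

static void *alloc_coherent_below_4g_edge(struct device *dev, size_t size,
					  dma_addr_t *phys, int depth)
{
	dma_addr_t p;
	void *cpu;

	cpu = dma_alloc_coherent(dev, size, &p, GFP_KERNEL);
	if (!cpu)
		return NULL;

	/* does [p, p + size) reach or cross a 2^32 boundary? */
	if (unlikely(upper_32_bits(p) != upper_32_bits(p + size)) &&
	    depth < 2) {
		/* retry first so we don't just get the same region back,
		 * then free the offending buffer */
		void *retry = alloc_coherent_below_4g_edge(dev, size, phys,
							   depth + 1);

		dma_free_coherent(dev, size, cpu, p);
		return retry;
	}

	*phys = p;
	return cpu;
}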
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
As noted in the previous commit, due to the way we allocate the
dev_cmd headers with a 324-byte size and 4/8-byte alignment, the
part of them we use (bytes 20..40-68) could still cross a page and
thus a 2^32 boundary.
Address this by using alignment to ensure that the allocation cannot
cross a page boundary on hardware that's affected. So that this
doesn't increase memory consumption, reduce the allocations to the
necessary size: we go from 324 bytes per allocation to 60/68 on
gen2, depending on family, and roughly 120 on gen1 (so on gen1 it's
a pure reduction in size, since we don't need the alignment there).
To avoid size and clearing issues, add a new structure that's
just the header, and use kmem_cache_zalloc().
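A hedged sketch of the allocation scheme: a cache sized to just the
command header, with an alignment that keeps each object inside one
power-of-two slot on affected hardware (struct name and size below
are stand-ins):

#include <linux/slab.h>
#include <linux/log2.h>

/* hypothetical stand-in for the header-only structure */
struct example_cmd_header {
	u8 hdr[68];
};

static struct kmem_cache *cmd_hdr_cache;

static int cmd_hdr_cache_init(bool needs_boundary_wa)
{
	size_t size = sizeof(struct example_cmd_header);
	/* aligning each object to a power of two >= its size means it can
	 * never straddle a page (and therefore a 2^32) boundary */
	size_t align = needs_boundary_wa ? roundup_pow_of_two(size) : 0;

	cmd_hdr_cache = kmem_cache_create("example_cmd_hdr", size, align,
					  SLAB_HWCACHE_ALIGN, NULL);
	return cmd_hdr_cache ? 0 : -ENOMEM;
}

Allocations from the cache then go through kmem_cache_zalloc(), so
the header comes back zeroed.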
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Move the tracking that records the page in the SKB for later
free (refcount decrement) into the get_page_hdr() function
for better code reuse.
While at it, also add an assertion that this doesn't overwrite
any existing page pointer in the skb.
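Sketched out (the cb-area bookkeeping below is an assumption about
where the page pointer lives, not the exact driver layout):

#include <linux/skbuff.h>
#include <linux/mm.h>

static void *get_page_hdr_tracked(struct page *page, size_t offset,
				  struct sk_buff *skb, size_t page_ptr_offs)
{
	struct page **page_ptr = (void *)(skb->cb + page_ptr_offs);

	/* the assertion added by this patch: never silently overwrite an
	 * already-tracked page pointer */
	if (WARN_ON(*page_ptr))
		return NULL;

	*page_ptr = page;
	get_page(page);		/* ref dropped when the skb is reclaimed */
	return page_address(page) + offset;
}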
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>