phy-handle is not handled properly for the ast2400/2500, which have an
embedded MDIO controller. Add ftgmac100_mdio_setup for the ast2400/2500
and initialize PHYs from the "mdio" child node with of_mdiobus_register.
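A condensed sketch of the idea, not the exact driver code (the accessor and
field names here are assumptions): register an MDIO bus backed by the
embedded controller against the "mdio" child node, so PHYs described there
are probed by of_mdiobus_register().

static int ftgmac100_mdio_setup(struct net_device *netdev)
{
	struct ftgmac100 *priv = netdev_priv(netdev);
	struct device_node *np = netdev->dev.parent->of_node;
	struct device_node *mdio_np;
	int err;

	priv->mii_bus = devm_mdiobus_alloc(priv->dev);
	if (!priv->mii_bus)
		return -ENOMEM;

	/* MDIO accessors for the embedded controller (bodies omitted) */
	priv->mii_bus->name = "ftgmac100_mdio";
	priv->mii_bus->read = ftgmac100_mdiobus_read;
	priv->mii_bus->write = ftgmac100_mdiobus_write;
	priv->mii_bus->priv = netdev;

	/* Probe the PHYs listed under the "mdio" child node */
	mdio_np = of_get_child_by_name(np, "mdio");
	err = of_mdiobus_register(priv->mii_bus, mdio_np);
	of_node_put(mdio_np);
	return err;
}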
Signed-off-by: Ivan Mikhaylov <i.mikhaylov@yadro.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Split MDIO registration and PHY connect into ftgmac100_setup_mdio and
ftgmac100_mii_probe.
Signed-off-by: Ivan Mikhaylov <i.mikhaylov@yadro.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The TI CPTS does not natively support PTPv1, only PTPv2. But, as it
happens, the CPTS can provide a HW timestamp for PTPv1 Sync messages,
because the CPTS HW parser looks for the PTP messageType id in PTP message
octet 0, whose value is 0 for PTPv1. As a result, the CPTS HW can detect
Sync messages for both PTPv1 and PTPv2 (Sync messageType = 0 for both),
but it fails for any other PTPv1 messages (Delay_req/resp) and returns
PTP messageType id 0 for them.
The commit e9523a5a32 ("net: ethernet: ti: cpsw: enable
HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter") added the PTPv1 HW timestamping
advertisement by mistake, only to make the Linux kernel "timestamping"
utility work, and this causes issues with PTPv1-only compatible HW/SW:
Sync is HW timestamped, but Delay_req/resp are not.
Hence, fix it by disabling the PTPv1 HW timestamping advertisement, so
PTPv1-only compatible HW/SW can properly fall back to SW timestamping.
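A hedged sketch of the resulting rx_filter handling in the hwtstamp set
path (the exact cpsw function, variable names and case grouping are
assumptions here): PTPv1 L4 filters are rejected so callers fall back to
SW timestamping.

	switch (cfg.rx_filter) {
	case HWTSTAMP_FILTER_NONE:
		priv->rx_ts_enabled = 0;
		break;
	case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
	case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
	case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
		/* CPTS cannot timestamp PTPv1 Delay_req/resp, so do not
		 * advertise PTPv1 support at all.
		 */
		return -ERANGE;
	case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
	case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
	case HWTSTAMP_FILTER_PTP_V2_EVENT:
		priv->rx_ts_enabled = HWTSTAMP_FILTER_PTP_V2_EVENT;
		cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
		break;
	default:
		return -ERANGE;
	}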
Fixes: e9523a5a32 ("net: ethernet: ti: cpsw: enable HWTSTAMP_FILTER_PTP_V1_L4_EVENT filter")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Link: https://lore.kernel.org/r/20201029190910.30789-1-grygorii.strashko@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The headroom reserved for received frames needs to be aligned to an
RX specific value. There is currently a discrepancy between the values
used in the Ethernet driver and the values passed to the FMan.
Coincidentally, the resulting aligned values are identical.
Fixes: 3c68b8fffb ("dpaa_eth: FMan erratum A050385 workaround")
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Impose a larger RX private data area only when the A050385 erratum is
present on the hardware. A smaller buffer size is sufficient in all
other scenarios. This enables a wider range of linear jumbo frame
sizes in non-erratum scenarios, instead of falling back to multi-buffer
Scatter/Gather frames. The maximum linear frame size is
increased by 128 bytes for non-erratum arm64 platforms.
Clean up the hardware annotation header defines in the process.
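A minimal sketch of the intent; the constant names are illustrative, and
fman_has_errata_a050385() is used on the assumption that the FMan erratum
check helper from the original workaround is available:

static u16 dpaa_get_rx_priv_data_size(void)
{
	/* Constant names here are illustrative, not the exact driver macros */
	u16 rx_priv_data_size = DPAA_RX_PRIV_DATA_DEFAULT_SIZE;

#ifdef CONFIG_DPAA_ERRATUM_A050385
	/* Only the A050385 workaround needs the enlarged private area */
	if (fman_has_errata_a050385())
		rx_priv_data_size = DPAA_RX_PRIV_DATA_A050385_SIZE;
#endif

	return rx_priv_data_size;
}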
Fixes: 3c68b8fffb ("dpaa_eth: FMan erratum A050385 workaround")
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When we read device memory, we lock a spinlock, write the address we
want to read from the device and then spin in a loop reading the data
in 32-bit quantities from another register.
As the description makes clear, this is rather inefficient, incurring
a PCIe bus transaction for every read. In a typical device today, we
want to read 786k SMEM if it crashes, leading to 192k register reads.
Occasionally, we've seen the whole loop take over 20 seconds and then
trigger the soft lockup detector.
Clearly, it is unreasonable to spin here for such extended periods of
time.
To fix this, break the loop down into an outer and an inner loop, and
break out of the inner loop if more than half a second elapsed. To
avoid too much overhead, check for that only every 128 reads, though
there's no particular reason for that number. Then, unlock and relock
to obtain NIC access again, reprogram the start address and continue.
This will keep (interrupt) latencies on the CPU down to a reasonable
time.
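An illustrative sketch of the reworked loop, not the exact iwlwifi code
(the surrounding variables offs/dwords/addr/vals/flags are assumed from
context, and the HBUS register names follow the existing access pattern):

	while (offs < dwords) {
		/* Bound how long we hold NIC access in one go */
		unsigned long end = jiffies + HZ / 2;

		if (!iwl_trans_grab_nic_access(trans, &flags))
			return -EIO;

		/* (Re)program the start address after every relock */
		iwl_write32(trans, HBUS_TARG_MEM_RADDR,
			    addr + offs * sizeof(u32));

		while (offs < dwords) {
			vals[offs++] = iwl_read32(trans, HBUS_TARG_MEM_RDAT);

			/* Only look at the clock every 128 reads */
			if (!(offs % 128) && time_after(jiffies, end))
				break;
		}

		iwl_trans_release_nic_access(trans, &flags);
	}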
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20201022165103.45878a7e49aa.I3b9b9c5a10002915072312ce75b68ed5b3dc6e14@changeid
The clang build reports this warning
fw.c:1485:21: warning: address of array 'rtwdev->chip->fw_fifo_addr'
will always evaluate to 'true'
if (!rtwdev->chip->fw_fifo_addr) {
fw_fifo_addr is an array in struct rtw_chip_info, so its address always
evaluates to true. A better check is whether the first element of the
array is nonzero. In the cases where fw_fifo_addr is initialized, by
rtw8822b and rtw8822c, the first array element is 0x780.
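The corrected check then looks roughly like this (the debug message text
is illustrative, not quoted from the driver):

	/* fw_fifo_addr is an array member, so its address is never NULL;
	 * test the first element instead to see if the chip filled it in.
	 */
	if (!rtwdev->chip->fw_fifo_addr[0]) {
		rtw_dbg(rtwdev, RTW_DBG_FW,
			"chip does not support FW FIFO dump\n");
		return -ENOTSUPP;
	}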
Fixes: 0fbc2f0f34 ("rtw88: add dump firmware fifo support")
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Tzu-En Huang <tehuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201011155438.15892-1-trix@redhat.com
In my test setup, I had a SAMA5D27 device configured with IP forwarding, and
a second device with USB Ethernet (r8152) sending ICMP packets. If the packet
was larger than about 220 bytes, the SAMA5 device would "oops" with the
following trace:
kernel BUG at net/core/skbuff.c:1863!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in: xt_MASQUERADE ppp_async ppp_generic slhc iptable_nat xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 can_raw can bridge stp llc ipt_REJECT nf_reject_ipv4 sd_mod cdc_ether usbnet usb_storage r8152 scsi_mod mii option usb_wwan usbserial micrel macb at91_sama5d2_adc phylink gpio_sama5d2_piobu m_can_platform m_can industrialio_triggered_buffer kfifo_buf of_mdio can_dev fixed_phy sdhci_of_at91 sdhci_pltfm libphy sdhci mmc_core ohci_at91 ehci_atmel ohci_hcd iio_rescale industrialio sch_fq_codel spidev prox2_hal(O)
CPU: 0 PID: 0 Comm: swapper Tainted: G O 5.9.1-prox2+ #1
Hardware name: Atmel SAMA5
PC is at skb_put+0x3c/0x50
LR is at macb_start_xmit+0x134/0xad0 [macb]
pc : [<c05258cc>] lr : [<bf0ea5b8>] psr: 20070113
sp : c0d01a60 ip : c07232c0 fp : c4250000
r10: c0d03cc8 r9 : 00000000 r8 : c0d038c0
r7 : 00000000 r6 : 00000008 r5 : c59b66c0 r4 : 0000002a
r3 : 8f659eff r2 : c59e9eea r1 : 00000001 r0 : c59b66c0
Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control: 10c53c7d Table: 2640c059 DAC: 00000051
Process swapper (pid: 0, stack limit = 0x75002d81)
<snipped stack>
[<c05258cc>] (skb_put) from [<bf0ea5b8>] (macb_start_xmit+0x134/0xad0 [macb])
[<bf0ea5b8>] (macb_start_xmit [macb]) from [<c053e504>] (dev_hard_start_xmit+0x90/0x11c)
[<c053e504>] (dev_hard_start_xmit) from [<c0571180>] (sch_direct_xmit+0x124/0x260)
[<c0571180>] (sch_direct_xmit) from [<c053eae4>] (__dev_queue_xmit+0x4b0/0x6d0)
[<c053eae4>] (__dev_queue_xmit) from [<c05a5650>] (ip_finish_output2+0x350/0x580)
[<c05a5650>] (ip_finish_output2) from [<c05a7e24>] (ip_output+0xb4/0x13c)
[<c05a7e24>] (ip_output) from [<c05a39d0>] (ip_forward+0x474/0x500)
[<c05a39d0>] (ip_forward) from [<c05a13d8>] (ip_sublist_rcv_finish+0x3c/0x50)
[<c05a13d8>] (ip_sublist_rcv_finish) from [<c05a19b8>] (ip_sublist_rcv+0x11c/0x188)
[<c05a19b8>] (ip_sublist_rcv) from [<c05a2494>] (ip_list_rcv+0xf8/0x124)
[<c05a2494>] (ip_list_rcv) from [<c05403c4>] (__netif_receive_skb_list_core+0x1a0/0x20c)
[<c05403c4>] (__netif_receive_skb_list_core) from [<c05405c4>] (netif_receive_skb_list_internal+0x194/0x230)
[<c05405c4>] (netif_receive_skb_list_internal) from [<c0540684>] (gro_normal_list.part.0+0x14/0x28)
[<c0540684>] (gro_normal_list.part.0) from [<c0541280>] (napi_complete_done+0x16c/0x210)
[<c0541280>] (napi_complete_done) from [<bf14c1c0>] (r8152_poll+0x684/0x708 [r8152])
[<bf14c1c0>] (r8152_poll [r8152]) from [<c0541424>] (net_rx_action+0x100/0x328)
[<c0541424>] (net_rx_action) from [<c01012ec>] (__do_softirq+0xec/0x274)
[<c01012ec>] (__do_softirq) from [<c012d6d4>] (irq_exit+0xcc/0xd0)
[<c012d6d4>] (irq_exit) from [<c0160960>] (__handle_domain_irq+0x58/0xa4)
[<c0160960>] (__handle_domain_irq) from [<c0100b0c>] (__irq_svc+0x6c/0x90)
Exception stack(0xc0d01ef0 to 0xc0d01f38)
1ee0: 00000000 0000003d 0c31f383 c0d0fa00
1f00: c0d2eb80 00000000 c0d2e630 4dad8c49 4da967b0 0000003d 0000003d 00000000
1f20: fffffff5 c0d01f40 c04e0f88 c04e0f8c 30070013 ffffffff
[<c0100b0c>] (__irq_svc) from [<c04e0f8c>] (cpuidle_enter_state+0x7c/0x378)
[<c04e0f8c>] (cpuidle_enter_state) from [<c04e12c4>] (cpuidle_enter+0x28/0x38)
[<c04e12c4>] (cpuidle_enter) from [<c014f710>] (do_idle+0x194/0x214)
[<c014f710>] (do_idle) from [<c014fa50>] (cpu_startup_entry+0xc/0x14)
[<c014fa50>] (cpu_startup_entry) from [<c0a00dc8>] (start_kernel+0x46c/0x4a0)
Code: e580c054 8a000002 e1a00002 e8bd8070 (e7f001f2)
---[ end trace 146c8a334115490c ]---
The solution was to force nonlinear buffers to be cloned. This was previously
reported by Klaus Doth (https://www.spinics.net/lists/netdev/msg556937.html)
but never formally submitted as a patch.
This is the third revision, hopefully the formatting is correct this time!
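A condensed sketch of the idea in macb_pad_and_fcs() (variable handling
abridged): any nonlinear skb is treated as needing a clone before padding
and the FCS are appended, so shared data is never modified in place.

	/* Clone nonlinear skbs as well, not just explicitly cloned ones */
	bool cloned = skb_cloned(*skb) || skb_header_cloned(*skb) ||
		      skb_is_nonlinear(*skb);

	if (cloned) {
		nskb = skb_copy_expand(*skb, 0, padlen, GFP_ATOMIC);
		if (!nskb)
			return -ENOMEM;

		dev_consume_skb_any(*skb);
		*skb = nskb;
	}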
Suggested-by: Klaus Doth <krnl@doth.eu>
Fixes: 653e92a917 ("net: macb: add support for padding and fcs computation")
Signed-off-by: Mark Deneen <mdeneen@saucontech.com>
Link: https://lore.kernel.org/r/20201030155814.622831-1-mdeneen@saucontech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tp->dirty_tx isn't changed outside rtl_tx(). Therefore I see no need
to guarantee a specific order of reading tp->dirty_tx and tp->cur_tx,
which means we can remove the memory barrier.
In addition use READ_ONCE() when reading tp->cur_tx because it can
change in parallel to rtl_tx().
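A condensed sketch of the reclaim loop in rtl_tx() after the change (the
per-descriptor bookkeeping is omitted):

	unsigned int dirty_tx = tp->dirty_tx;

	/* dirty_tx is only written by rtl_tx() itself, so no barrier is
	 * needed; cur_tx advances concurrently in the xmit path, so it
	 * is accessed with READ_ONCE().
	 */
	while (READ_ONCE(tp->cur_tx) != dirty_tx) {
		/* ... unmap and complete one descriptor ... */
		dirty_tx++;
	}
	tp->dirty_tx = dirty_tx;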
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/2264563a-fa9e-11b0-2c42-31bc6b8e2790@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds support for the 10GBASE-R interface to the Linux driver for
Cadence's Ethernet controller.
This controller has separate MACs and PCSes for the low- and high-speed paths.
The high-speed PCS supports 100M, 1G, 2.5G, 5G and 10G through a rate
adaptation implementation. However, since it doesn't support auto-negotiation,
the Linux driver is modified to support 10GBASE-R instead of USXGMII.
Signed-off-by: Parshuram Thombare <pthombar@cadence.com>
Link: https://lore.kernel.org/r/1603975627-18338-1-git-send-email-pthombar@cadence.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pull more flexible-array member conversions from Gustavo A. R. Silva:
"Replace zero-length arrays with flexible-array members"
* tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
printk: ringbuffer: Replace zero-length array with flexible-array member
net/smc: Replace zero-length array with flexible-array member
net/mlx5: Replace zero-length array with flexible-array member
mei: hw: Replace zero-length array with flexible-array member
gve: Replace zero-length array with flexible-array member
Bluetooth: btintel: Replace zero-length array with flexible-array member
scsi: target: tcmu: Replace zero-length array with flexible-array member
ima: Replace zero-length array with flexible-array member
enetc: Replace zero-length array with flexible-array member
fs: Replace zero-length array with flexible-array member
Bluetooth: Replace zero-length array with flexible-array member
params: Replace zero-length array with flexible-array member
tracepoint: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member
platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member
mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member
dmaengine: ti-cppi5: Replace zero-length array with flexible-array member
Unlike earlier silicon variants, OcteonTx2 98xx silicon has two NIX
blocks, and each CGX is mapped to either of the NIX blocks. Each NIX
block supports 100G. The mapping between NIX blocks and CGX is done by
firmware based on the CGX speed configuration, to achieve the maximum
possible network bandwidth. Since the mapping is not fixed, it is
difficult for a user to figure out. Hence add a debugfs entry which
displays the mapping between CGX LMAC, NIX block and RVU PF.
Sample output of this entry:
~# cat /sys/kernel/debug/octeontx2/rvu_pf_cgx_map
PCI dev RVU PF Func NIX block CGX LMAC
0002:02:00.0 0x400 NIX0 CGX0 LMAC0
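A hedged sketch of how such an entry is typically wired up with debugfs
(the fops symbol and parent dentry names are assumptions, not the exact
driver code):

	/* Expose the CGX LMAC <-> NIX block <-> RVU PF mapping */
	debugfs_create_file("rvu_pf_cgx_map", 0444, rvu->rvu_dbg.root,
			    rvu, &rvu_dbg_rvu_pf_cgx_map_fops);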
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the NIX1 block is also implemented, add a new directory for NIX1
under the debugfs root. Stats of the NIX1 block can be read from or
written to the files in the directory
"/sys/kernel/debug/octeontx2/nix1/".
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
CGX links are followed by LBK links, but the number of CGX and LBK
links varies between platforms. Hence get the number of links present
in hardware from the AF and use it to calculate the LBK link number.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch puts together all mailbox changes
for 98xx silicon:
Attach ->
Modify the resource attach mailbox handler to
request LFs from a block address out of multiple
blocks of the same type. If a PF/VF needs LFs from
two blocks of the same type, then the attach mbox
should be called twice.
Example:
struct rsrc_attach *attach;
.. Allocate memory for message ..
attach->cptlfs = 3; /* 3 LFs from CPT0 */
.. Send message ..
.. Allocate memory for message ..
attach->modify = 1;
attach->cpt_blkaddr = BLKADDR_CPT1;
attach->cptlfs = 2; /* 2 LFs from CPT1 */
.. Send message ..
Detach ->
Update detach mailbox and its handler to detach
resources from CPT1 and NIX1 blocks.
MSIX ->
Update the MSIX mailbox and its handler to return
MSIX offsets for the new block CPT1.
Free resources ->
Update the free_rsrc mailbox and its handler to return
the free resource counts of the new blocks NIX1 and CPT1.
Links ->
The number of CGX, LBK and SDP links may vary between
platforms. For example, 98xx has more CGX and LBK links
than 96xx. Hence the info about the number of links
present in hardware is useful for consumers to request
link configuration properly. This patch sends this info
in nix_lf_alloc_rsp.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
On 98xx silicon, the NPC block has additional
mcam entries, counters and NIX1 interfaces.
An extended set of registers is present for the
new mcam entries and counters.
This patch does the following:
- updates the register access macros
to use the extended set if present.
- configures the MKEX profile for the NIX1 interfaces as well.
- updates the mcam entry write functions to use the assigned
NIX0/NIX1 interface for the PF/VF.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Initialize the MCE context for the assigned NIX0/NIX1
block for a CGX mapped PF. Modify the rvu_nix_aq_enq_inst
function to work with nix_hw so that the MCE contexts
for both NIX blocks can be initialized.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Firmware configures the NIX block mapping for all CGXs
to achieve maximum throughput. This patch reads
the configuration and creates the mapping between RVU
PFs and NIX blocks. For LBK VFs, assign NIX0 to
even-numbered VFs and NIX1 to odd-numbered VFs.
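The even/odd assignment for LBK VFs boils down to something like this
sketch (the surrounding lookup code is omitted):

	/* Even-numbered LBK VFs land on NIX0, odd-numbered ones on NIX1 */
	blkaddr = (vf & 1) ? BLKADDR_NIX1 : BLKADDR_NIX0;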
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch modifies the NIX functions to operate
with a nix_hw context so that the existing functions
can be used for both the NIX0 and NIX1 blocks. The
NIX blocks present in the system are initialized
during driver init and freed during exit.
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The AF manages allocating and freeing LFs
from RVU blocks to PFs and VFs. With the new
NIX1 and CPT1 blocks in 98xx, this patch
adds support for handling the new blocks too.
Co-developed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since multiple blocks of the same type are present in
98xx, modify the functions which get and update the
resource count to work with an individual block
address instead of a block type.
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The previous commit added support for IPA having up to six source
and destination resources. But currently nothing uses more than
four. (Five of each are used in a newer version of the hardware.)
I find that in one of my build environments the compiler complains
about newly-added code in two spots. Inspection shows that the
warnings have no merit, but this compiler does not recognize that.
ipa_main.c:457:39: warning: array index 5 is past the end of the
array (which contains 4 elements) [-Warray-bounds]
(and the same warning at line 483)
We can make this warning go away by changing the number of elements
in the source and destination resource limit arrays now, rather than
waiting until we need it to support the newer hardware. This change
was coming soon anyway; make it now to get rid of the warning.
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20201031151524.32132-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After switching to the net core rx/tx byte/packet counters we can
remove the now unused private version.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Switch to the net core rx/tx byte/packet counter infrastructure.
This simplifies the code; the only small drawback is some memory overhead,
because we use just one queue but allocate the counters per CPU.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver uses in_irq() to determine if the tlan_priv::lock has to be
acquired in tlan_mii_read_reg() and tlan_mii_write_reg().
The interrupt handler acquires the lock outside of these functions so the
in_irq() check is meant to prevent a lock recursion deadlock. But this
check is incorrect when interrupt force threading is enabled because then
the handler runs in thread context and in_irq() correctly returns false.
The usage of in_*() in drivers is phased out and Linus clearly requested
that code which changes behaviour depending on context should either be
separated or the context be conveyed in an argument passed by the caller,
which usually knows the context.
tlan_set_timer() has this conditional as well, but this function is only
invoked from task context or the timer callback itself. So it always has to
lock and the check can be removed.
tlan_mii_read_reg(), tlan_mii_write_reg() and tlan_phy_print() are invoked
from interrupt and other contexts.
Split out the actual function body into helper variants which are called
from interrupt context and make the original functions wrappers which
acquire tlan_priv::lock unconditionally.
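The pattern looks roughly like this (signatures condensed; the
double-underscore helper names follow the usual kernel convention and are
assumptions rather than quotes from the driver):

/* Callers that already hold priv->lock (the interrupt handler) use this */
static void __tlan_mii_read_reg(struct net_device *dev, u16 phy, u16 reg,
				u16 *val)
{
	/* actual MII register access, lock held by the caller */
}

/* All other contexts go through the wrapper, which always locks */
static void tlan_mii_read_reg(struct net_device *dev, u16 phy, u16 reg,
			      u16 *val)
{
	struct tlan_priv *priv = netdev_priv(dev);
	unsigned long flags;

	spin_lock_irqsave(&priv->lock, flags);
	__tlan_mii_read_reg(dev, phy, reg, val);
	spin_unlock_irqrestore(&priv->lock, flags);
}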
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Samuel Chessman <chessman@tux.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
nv_update_stats() triggers a WARN_ON() when invoked from hard interrupt
context because the locks in use are not hard interrupt safe. It also has
an assert_spin_locked() which was the lock check before the lockdep era.
Lockdep has way broader locking correctness checks and covers both issues,
so replace the warning and the lock assert with lockdep_assert_held().
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Rain River <rain.1986.08.12@gmail.com>
Cc: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
wait_for_cmd_complete() uses in_interrupt() to detect whether it is safe to
sleep or not.
The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.
in_interrupt() is also only partially correct because it fails to choose the
correct code path when just preemption or interrupts are disabled.
The following call chains which end up invoking wait_for_cmd_complete()
were analyzed to be safe to sleep:
s2io_card_up()
s2io_set_multicast()
init_nic()
init_tti()
s2io_close()
do_s2io_delete_unicast_mc()
do_s2io_add_mac()
s2io_set_mac_addr()
do_s2io_prog_unicast()
do_s2io_add_mac()
s2io_reset()
do_s2io_restore_unicast_mc()
do_s2io_add_mc()
do_s2io_add_mac()
s2io_open()
do_s2io_prog_unicast()
do_s2io_add_mac()
The following call chains which end up invoking wait_for_cmd_complete()
were analyzed to be unsafe to sleep (atomic or interrupt context):
__dev_set_rx_mode()
s2io_set_multicast()
s2io_txpic_intr_handle()
s2io_link()
init_tti()
Add a may_sleep argument to wait_for_cmd_complete(), s2io_set_multicast()
and init_tti() and hand the context information in from the call sites.
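A condensed sketch of the change (the real completion check is more
involved; only the context handling is shown, and the delay values are
illustrative):

static int wait_for_cmd_complete(void __iomem *addr, u64 busy_bit,
				 int bit_state, bool may_sleep)
{
	int ret = FAILURE, cnt = 0;
	u64 val64;

	do {
		val64 = readq(addr);
		if (bit_state == S2IO_BIT_RESET) {
			if (!(val64 & busy_bit)) {
				ret = SUCCESS;
				break;
			}
		} else if (val64 & busy_bit) {
			ret = SUCCESS;
			break;
		}

		/* The caller tells us whether sleeping is allowed */
		if (may_sleep)
			msleep(1);
		else
			mdelay(1);
	} while (cnt++ < 10);

	return ret;
}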
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Jon Mason <jdmason@kudzu.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is one main difference in mscc_ocelot between IP multicast and L2
multicast. With IP multicast, destination ports are encoded into the
upper bytes of the multicast MAC address. Example: to deliver the
address 01:00:5E:11:22:33 to ports 3, 8, and 9, one would need to
program the address of 00:03:08:11:22:33 into hardware. Whereas for L2
multicast, the MAC table entry points to a Port Group ID (PGID), and
that PGID contains the port mask that the packet will be forwarded to.
As to why it is this way, no clue. My guess is that not all port
combinations can be supported simultaneously with the limited number of
PGIDs, and this was somehow an issue for IP multicast but not for L2
multicast. Anyway.
Prior to this change, the raw L2 multicast code was bogus, due to the
fact that there wasn't really any way to test it using the bridge code.
There were 2 issues:
- A multicast PGID was allocated for each MDB entry, but it wasn't in
fact programmed to hardware. It was dummy.
- In fact we don't want to reserve a multicast PGID for every single MDB
entry. That would be odd because we can only have ~60 PGIDs, but
thousands of MDB entries. So instead, we want to reserve a multicast
PGID for every single port combination for multicast traffic. And
since we can have 2 (or more) MDB entries delivered to the same port
group (and therefore PGID), we need to reference-count the PGIDs.
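A sketch of the resulting bookkeeping (struct layout and field names
follow the description above and are assumptions, not quotes from the
driver): each distinct destination port mask gets one PGID, shared and
reference-counted across the MDB entries that resolve to it.

struct ocelot_pgid {
	unsigned long ports;	/* destination port mask */
	int index;		/* PGID number programmed in hardware */
	refcount_t refcount;	/* how many MDB entries point here */
	struct list_head list;
};

static struct ocelot_pgid *ocelot_pgid_lookup(struct ocelot *ocelot,
					      unsigned long ports)
{
	struct ocelot_pgid *pgid;

	list_for_each_entry(pgid, &ocelot->pgids, list) {
		if (pgid->ports == ports) {
			refcount_inc(&pgid->refcount);
			return pgid;
		}
	}

	return NULL;	/* caller allocates a fresh PGID */
}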
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ocelot.h says:
/* MAC table entry types.
* ENTRYTYPE_NORMAL is subject to aging.
* ENTRYTYPE_LOCKED is not subject to aging.
* ENTRYTYPE_MACv4 is not subject to aging. For IPv4 multicast.
* ENTRYTYPE_MACv6 is not subject to aging. For IPv6 multicast.
*/
We don't want the permanent entries added with 'bridge mdb' to be
subject to aging.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
AIUI, the NETIF_F_TSO_MANGLEID flag is a signal to the stack that a
driver may _need_ to mangle IDs in order to do TSO, and conversely
a signal from the stack that the driver is permitted to do so.
Since we support both fixed and incrementing IPIDs, we should rely
on the SKB_GSO_FIXEDID flag on a per-skb basis, rather than using
the MANGLEID feature to make all TSOs fixed-id.
Includes other minor cleanups of ef100_make_tso_desc() coding style.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The NIC only needs to know where the headers it has to edit (TCP and
inner and outer IPv4) are, which fits GSO_PARTIAL nicely.
It also supports non-PARTIAL offload of UDP tunnels, again just
needing to be told the outer transport offset so that it can edit
the UDP length field.
(It's not clear to me whether the stack will ever use the non-PARTIAL
version with the netdev feature flags we're setting here.)
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We need EFX_POPULATE_OWORD_17 for an encap TSO descriptor on EF100.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The minimum and maximum limits for resources assigned to a given
resource group are programmed in pairs, with the limits for two
groups set in a single register.
If the number of supported resource groups is odd, only half of the
register that defines these limits is valid for the last group; that
group has no second group in the pair.
Currently we ignore this constraint, and it turns out to be harmless,
but it is not guaranteed to be. This patch addresses that, and adds
support for programming the 5th resource group's limits.
Rework how the resource group limit registers are programmed by
having a single function program all group pairs rather than having
one function program each pair. Add the programming of the 4-5
resource group pair limits to this function. If a resource group is
not supported, pass a null pointer to ipa_resource_config_common()
for that group and have that function write zeroes in that case.
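A hedged sketch of the combined function (the X_/Y_ MIN/MAX limit field
masks are treated as assumptions here, following the register layout
described above):

static void ipa_resource_config_common(struct ipa *ipa, u32 offset,
				       const struct ipa_resource_limits *xlimits,
				       const struct ipa_resource_limits *ylimits)
{
	u32 val;

	val = u32_encode_bits(xlimits->min, X_MIN_LIM_FMASK);
	val |= u32_encode_bits(xlimits->max, X_MAX_LIM_FMASK);

	/* A NULL ylimits means the second group of the pair doesn't
	 * exist; leave its half of the register zeroed.
	 */
	if (ylimits) {
		val |= u32_encode_bits(ylimits->min, Y_MIN_LIM_FMASK);
		val |= u32_encode_bits(ylimits->max, Y_MAX_LIM_FMASK);
	}

	iowrite32(val, ipa->reg_virt + offset);
}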
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The number of resource groups supported by the hardware can be
different for source and destination resources. Determine the
number supported for each using separate functions. Make the
functions inline and move their definitions into "ipa_reg.h",
because they determine whether certain register definitions are
valid. Pass just the IPA hardware version as argument.
IPA_RESOURCE_GROUP_COUNT represents the maximum number of resource
groups the driver supports for any hardware version. Change that
symbol to be two separate constants, one for source and the other
for destination resource groups. Rename them to end with "_MAX"
rather than "_COUNT", to reflect their true purpose.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The IPA hardware manages various resources (e.g. descriptors)
internally to perform its functions. The resources are grouped,
allowing different endpoints to use separate resource pools. This
way one group of endpoints can be configured to operate unaffected
by the resource use of endpoints in a different group.
Endpoints should be assigned to a resource group, but we currently
don't do that.
Define a new resource_group field in the endpoint configuration
data, and use it to assign the proper resource group to use for
each AP endpoint.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The mask for the RSRC_GRP field in the INIT_RSRC_GRP endpoint
initialization register is incorrectly defined for IPA v4.2 (where
it is only one bit wide). So we need to fix this.
The fix is not straightforward, however. Field masks are passed to
functions like u32_encode_bits(), and for that they must be constant.
To address this, we define a new inline function that returns the
*encoded* value to use for a given RSRC_GRP field, which depends on
the IPA version. The caller can then use something like this to
assign resource group 1 to a given endpoint:
u32 offset = IPA_REG_ENDP_INIT_RSRC_GRP_N_OFFSET(endpoint_id);
u32 val = rsrc_grp_encoded(ipa->version, 1);
iowrite32(val, ipa->reg_virt + offset);
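A condensed sketch of the encoder itself (only two cases shown, and the
mask widths are assumptions; the real function distinguishes more IPA
versions):

static inline u32 rsrc_grp_encoded(enum ipa_version version, u32 rsrc_grp)
{
	/* IPA v4.2 has a one-bit RSRC_GRP field, others use two bits */
	if (version == IPA_VERSION_4_2)
		return u32_encode_bits(rsrc_grp, GENMASK(0, 0));

	return u32_encode_bits(rsrc_grp, GENMASK(1, 0));
}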
The next patch requires this fix.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
At the end of ipa_mem_setup() we write the local packet processing
context base register to tell the hardware where the processing context
memory is. But we are writing the wrong value.
The value written turns out to be the offset of the modem header
memory region (assigned earlier in the function). Fix this bug.
Tested-by: Sujit Kautkar <sujitka@chromium.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver does not implement a shutdown handler, which leads to issues
when using kexec in certain scenarios. The NIC keeps fetching
descriptors, which gets flagged by the IOMMU with errors like these:
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
DMAR: DMAR:[DMA read] Request device [5e:00.0]fault addr fffff000
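A generic sketch of the approach (driver and function names are
placeholders, not the actual driver's symbols): register a PCI .shutdown
hook that quiesces the device, typically by reusing the remove path, so
DMA stops before kexec hands over the machine.

static void example_pci_shutdown(struct pci_dev *pdev)
{
	/* Stop queues and disable DMA; reusing the remove path is the
	 * simplest way to guarantee no descriptors are fetched anymore.
	 */
	example_pci_remove(pdev);
}

static struct pci_driver example_pci_driver = {
	.name     = "example",
	.id_table = example_pci_ids,
	.probe    = example_pci_probe,
	.remove   = example_pci_remove,
	.shutdown = example_pci_shutdown,
};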
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201028172125.496942-1-mdf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code
should always use “flexible array members”[1] for these cases. The
older style of one-element or zero-length arrays should no longer be
used[2].
Refactor the code according to the use of a flexible-array member in
struct gve_stats_report, instead of a zero-length array, and use the
struct_size() helper to calculate the size for the resource allocation.
[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays
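The pattern, in abridged form (the real struct gve_stats_report has more
members than shown, and the allocation is simplified purely to illustrate
struct_size()):

struct gve_stats_report {
	__be64 written_count;
	struct stats stats[];	/* flexible-array member, was stats[0] */
};

	struct gve_stats_report *report;
	size_t len;

	/* Header plus n_stats trailing entries, computed without
	 * open-coded sizeof arithmetic.
	 */
	len = struct_size(report, stats, n_stats);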
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
DSA assumes that a bridge which has vlan filtering disabled is not
vlan aware, and ignores all vlan configuration. However, the kernel
software bridge code allows configuration in this state.
This causes the kernel's idea of the bridge vlan state and the
hardware state to disagree, so "bridge vlan show" indicates a correct
configuration but the hardware lacks all configuration. Even worse,
enabling vlan filtering on a DSA bridge immediately blocks all traffic
which, given the output of "bridge vlan show", is very confusing.
Allow the VLAN configuration to be updated on Marvell DSA bridges,
otherwise we end up cutting all traffic when enabling vlan filtering.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1kYAU3-00071C-1G@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>