otx2 only supports additional indirection tables (no separate keys
etc.) so the conversion to dedicated callbacks and core-allocated
context is mostly removing the code which stores the extra tables
in the driver. Core already stores the indirection tables for
additional contexts, and doesn't call .get for them.
One subtle change here is that we'll now start with the table
covering all queues, not directing all traffic to queue 0.
This is what core expects if the user doesn't pass the initial
indir table explicitly (there's a WARN_ON() in the core trying
to make sure driver authors don't forget to populate ctx to
defaults).
Drivers implementing .create_rxfh_context don't have to set
cap_rss_ctx_supported, so remove it.
Tested-by: Geetha Sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250707184115.2285277-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The error code was intended to be -EINVAL here, but it was accidentally
changed to returning success. Set the error code.
Fixes: e53ee4acb2 ("octeontx2-af: CN20k basic mbox operations and structures")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current implementation considers only first advertise
mode and passes the same to firmware to process.
This patch extends code such that user can advertise
multiple modes on the given interface.
Below are high level changes:
1. Remove unnecessary speed/duplex/autoneg validation as its
already verified as part of "set_link_ksettings"
2. Since scratch csr framework designed to support single mode at a time,
use "shared firmware data" for multi mode support.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250625092107.9746-4-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kernel and firmware communicates via scratch register which is
64 bit in size.
[MODE_ID PORT AUTONEG DUPLEX SPEED CMD_ID OWNERSHIP ]
63-22 21-14 13 12 11-8 7-2 1-0
The existing MODE_ID bitmap can only support up to 42 modes.
To resolve the issue, the unused port field is modified as below
uint64_t reserved2:6;
uint64_t mode_group_idx:2;
'mode_group_idx' categorize the mode ID range to accommodate more modes.
To specify mode ID range of 0 - 41, this field will be 0.
To specify mode ID range of 42 - 83, this field will be 1.
mode ID will be still mentioned as 1 << (0 - 41). But the mode_group_idx
decides the actual mode range
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250625092107.9746-3-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current implementation maps ethtool link modes 10baseT/100baseT/1000baseT
to single firmware mode SGMII. This create a problem for end users who want
to advertise only one speed among them.
This patch addresses the issue by mapping each ethtool link mode
to a corresponding firmware mode also updates new modes supported
by firmware.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250625092107.9746-2-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
rvu_mbox_init function makes use of error path for
freeing memory which are local to the function in
both success and failure conditions. This is unusual hence
fix it by returning zero on success. With new cn20k code this
is freeing valid memory in success case also.
Fixes: e53ee4acb2 ("octeontx2-af: CN20k basic mbox operations and structures")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
NIX block can receive packets from multiple links such as
MAC (RPM), LBK and CPT.
-----------------
RPM --| NIX |
-----------------
|
|
LBK
Each link supports multiple channels for example RPM link supports
16 channels. In case of link oversubsribe, NIX will assert backpressure
on receive channels.
The previous patch considered a single channel per link, resulting in
backpressure not being enabled on the remaining channels
Fixes: a7ef63dbd5 ("octeontx2-af: Disable backpressure between CPT and NIX")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250617063403.3582210-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Number of RVU PFs on CN20K silicon have increased to 96 from maximum
of 32 that were supported on earlier silicons. Every RVU PF and VF is
identified by HW using a 16bit PF_FUNC value. Due to the change in
Max number of PFs in CN20K, the bit encoding of this PF_FUNC has changed.
This patch handles the change by using helper functions(using silicon
check) to use PF,VF masks and shifts to support both new silicon CN20K,
OcteonTx series. These helper functions are used in different modules.
Also moved the NIX AF register offset macros to other files which
will be posted in coming patches.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Link: https://patch.msgid.link/1749639716-13868-2-git-send-email-sbhatta@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch addresses below issues,
1. Active traffic on the leaf node must be stopped before its send queue
is reassigned to the parent. This patch resolves the issue by marking
the node as 'Inner'.
2. During a system reboot, the interface receives TC_HTB_LEAF_DEL
and TC_HTB_LEAF_DEL_LAST callbacks to delete its HTB queues.
In the case of TC_HTB_LEAF_DEL_LAST, although the same send queue
is reassigned to the parent, the current logic still attempts to update
the real number of queues, leadning to below warnings
New queues can't be registered after device unregistration.
WARNING: CPU: 0 PID: 6475 at net/core/net-sysfs.c:1714
netdev_queue_update_kobjects+0x1e4/0x200
Fixes: 5e6808b4c6 ("octeontx2-pf: Add support for HTB offload")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250522115842.1499666-1-hkelam@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
QOS is designed to create a new send queue whenever a class
is created, ensuring proper shaping and scheduling. However,
when multiple send queues are created and deleted in a loop,
SMMU errors are observed.
This patch addresses the issue by performing an data cache sync
during the teardown of QOS send queues.
Fixes: ab6dddd2a6 ("octeontx2-pf: qos send queues management")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250522094742.1498295-1-hkelam@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Steffen Klassert says:
====================
1) Remove some unnecessary strscpy_pad() size arguments.
From Thorsten Blum.
2) Correct use of xso.real_dev on bonding offloads.
Patchset from Cosmin Ratiu.
3) Add hardware offload configuration to XFRM_MSG_MIGRATE.
From Chiachang Wang.
4) Refactor migration setup during cloning. This was
done after the clone was created. Now it is done
in the cloning function itself.
From Chiachang Wang.
5) Validate assignment of maximal possible SEQ number.
Prevent from setting to the maximum sequrnce number
as this would cause for traffic drop.
From Leon Romanovsky.
6) Prevent configuration of interface index when offload
is used. Hardware can't handle this case.i
From Leon Romanovsky.
7) Always use kfree_sensitive() for SA secret zeroization.
From Zilin Guan.
ipsec-next-2025-05-23
* tag 'ipsec-next-2025-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: use kfree_sensitive() for SA secret zeroization
xfrm: prevent configuration of interface index when offload is used
xfrm: validate assignment of maximal possible SEQ number
xfrm: Refactor migration setup during the cloning process
xfrm: Migrate offload configuration
bonding: Fix multiple long standing offload races
bonding: Mark active offloaded xfrm_states
xfrm: Add explicit dev to .xdo_dev_state_{add,delete,free}
xfrm: Remove unneeded device check from validate_xmit_xfrm
xfrm: Use xdo.dev instead of xdo.real_dev
net/mlx5: Avoid using xso.real_dev unnecessarily
xfrm: Remove unnecessary strscpy_pad() size arguments
====================
Link: https://patch.msgid.link/20250523075611.3723340-1-steffen.klassert@secunet.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The AF driver assigns reserved MCAM entries (for unicast, broadcast,
etc.) based on the NIXLF number. When a NIXLF is detached, these entries
are disabled.
For example,
PF NIXLF
--------------------
PF0 0
SDP-VF0 1
If the user unbinds both PF0 and SDP-VF0 interfaces and then binds them in
reverse order
PF NIXLF
---------------------
SDP-VF0 0
PF0 1
In this scenario, the PF0 unicast entry is getting corrupted because
the MCAM entry contains stale data (SDP-VF0 ucast data)
This patch resolves the issue by clearing the unicast MCAM entry during
NIXLF detach
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current implementation maps the APR table using a fixed size,
which can lead to incorrect mapping when the number of PFs and VFs
varies.
This patch corrects the mapping by calculating the APR table
size dynamically based on the values configured in the
APR_LMT_CFG register, ensuring accurate representation
of APR entries in debugfs.
Fixes: 0daa55d033 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table").
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Link: https://patch.msgid.link/20250521060834.19780-3-gakula@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Apart from netdev interface Octeontx2 PF does the following:
1. Sends its own requests to AF and receives responses from AF.
2. Receives async messages from AF.
3. Forwards VF requests to AF, sends respective responses from AF to VFs.
4. Sends async messages to VFs.
This patch adds new tracepoint otx2_msg_status to display the status
of PF wrt mailbox handling.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1747136408-30685-5-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch adds pcifunc which represents PF and VF device to the
tracepoints otx2_msg_alloc, otx2_msg_send, otx2_msg_process so that
it is easier to correlate which device allocated the message, which
device forwarded it and which device processed that message.
Also add message id in otx2_msg_send tracepoint to check which
message is sent at any point of time from a device.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1747136408-30685-4-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
If ntuple filters count is modified followed by
unicast filters count using devlink then the ntuple count
set by user is ignored and all the ntuple filters are
being reallocated. Fix this by storing the ntuple count
set by user. Without this patch, say if user tries
to modify ntuple count as 8 followed by ucast filter count as 4
using devlink commands then ntuple count is being reverted to
default value 16 i.e, not retaining user set value 8.
Fixes: 39c469188b ("octeontx2-pf: Add ucast filter count configurability via devlink.")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1747054357-5850-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Each CGX block supports 4 logical MACs (LMACS). Receive
counters CGX_CMR_RX_STAT0-8 are per LMAC and CGX_CMR_RX_STAT9-12
are per CGX.
Due a bug in previous patch, stale Per CGX counters values observed.
Fixes: 66208910e5 ("octeontx2-af: Support to retrieve CGX LMAC stats")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Link: https://patch.msgid.link/20250513071554.728922-1-hkelam@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
MASCEC hardware block has a field called maximum transmit size for
TX secy. Max packet size going out of MCS block has be programmed
taking into account full packet size which has L2 header,SecTag
and ICV. MACSEC offload driver is configuring max transmit size as
macsec interface MTU which is incorrect. Say with 1500 MTU of real
device, macsec interface created on top of real device will have MTU of
1468(1500 - (SecTag + ICV)). This is causing packets from macsec
interface of size greater than or equal to 1468 are not getting
transmitted out because driver programmed max transmit size as 1468
instead of 1514(1500 + ETH_HDR_LEN).
Fixes: c54ffc7360 ("octeontx2-pf: mcs: Introduce MACSEC hardware offloading")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1747053756-4529-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The hardware supports multiple MAC types, including RPM, SDP, and LBK.
However, features such as link settings and pause frames are only available
on RPM MAC, and not supported on SDP or LBK.
This patch updates the ethtool operations logic accordingly to reflect
this behavior.
Fixes: 2f7f33a095 ("octeontx2-pf: Add representors for sdp MAC")
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
New timestamping API was introduced in commit 66f7223039 ("net: add
NDOs for configuring hardware timestamping") from kernel v6.6. It is
time to convert the mvpp2 driver to the new API, so that the
ndo_eth_ioctl() path can be removed completely.
Note that on the !port->hwtstamp condition, the old code used to fall
through in mvpp2_ioctl(), and return either -ENOTSUPP if !port->phylink,
or -EOPNOTSUPP, in phylink_mii_ioctl(). Keep the test for port->hwtstamp
in the newly introduced net_device_ops, but consolidate the error code
to just -EOPNOTSUPP. The other one is documented as NFS-specific, it's
best to avoid it anyway.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20250508144630.1979215-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When the host loses heartbeat messages from the device,
the driver calls the device-specific ndo_stop function,
which frees the resources. If the driver is unloaded in
this scenario, it calls ndo_stop again, attempting to free
resources that have already been freed, leading to a host
hang issue. To resolve this, dev_close should be called
instead of the device-specific stop function.dev_close
internally calls ndo_stop to stop the network interface
and performs additional cleanup tasks. During the driver
unload process, if the device is already down, ndo_stop
is not called.
Fixes: 5cb96c29aa ("octeon_ep: add heartbeat monitor")
Signed-off-by: Sathesh B Edara <sedara@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250429114624.19104-1-sedara@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Furthermore, the PCI function being managed implies that it's not
necessary to call pci_release_regions() manually.
Remove the calls to pci_release_regions().
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://patch.msgid.link/20250425085740.65304-4-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The currently used function pci_request_regions() is one of the
problematic "hybrid devres" PCI functions, which are sometimes managed
through devres, and sometimes not (depending on whether
pci_enable_device() or pcim_enable_device() has been called before).
The PCI subsystem wants to remove this behavior and, therefore, needs to
port all users to functions that don't have this problem.
Furthermore, the PCI function being managed implies that it's not
necessary to call pci_release_regions() manually.
Remove the calls to pci_release_regions().
Replace pci_request_regions() with pcim_request_all_regions().
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Acked-by: Elad Nachman <enachman@marvell.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250425085740.65304-3-phasta@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>