mlxplat_pci_fpga_device_init() calls pci_get_device() but does not
release the refcount on error path. Call pci_dev_put() on the error path
and in mlxplat_pci_fpga_device_exit() to fix this.
This bug was found by an experimental static analysis tool that I am
developing.
Fixes: 02daa222fb ("platform: mellanox: Add initial support for PCIe based programming logic device")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20241216022538.381209-1-joe@pf.is.s.u-tokyo.ac.jp
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
After commit 0edb555a65 ("platform: Make platform_driver::remove()
return void") .remove() is (again) the right callback to implement for
platform drivers.
Convert all platform drivers below drivers/platform/x86/ to use
.remove(), with the eventual goal to drop struct
platform_driver::remove_new(). As .remove() and .remove_new() have the
same prototypes, conversion is done by just changing the structure
member name in the driver initializer.
While touching these files, make indention of the struct initializer
consistent in several files.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Link: https://lore.kernel.org/r/20241017073802.53235-2-u.kleine-koenig@baylibre.com
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Found with git grep 'MODULE_AUTHOR(".*([^)]*@'
Fixed with
sed -i '/MODULE_AUTHOR(".*([^)]*@/{s/ (/ </g;s/)"/>"/;s/)and/> and/}' \
$(git grep -l 'MODULE_AUTHOR(".*([^)]*@')
Also:
in drivers/media/usb/siano/smsusb.c normalise ", INC" to ", Inc";
this is what every other MODULE_AUTHOR for this company says,
and it's what the header says
in drivers/sbus/char/openprom.c normalise a double-spaced separator;
this is clearly copied from the copyright header,
where the names are aligned on consecutive lines thusly:
* Linux/SPARC PROM Configuration Driver
* Copyright (C) 1996 Thomas K. Dyas (tdyas@noc.rutgers.edu)
* Copyright (C) 1996 Eddie C. Dost (ecd@skynet.be)
but the authorship branding is single-line
Link: https://lkml.kernel.org/r/mk3geln4azm5binjjlfsgjepow4o73domjv6ajybws3tz22vb3@tarta.nabijaczleweli.xyz
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230927081040.2198742-22-u.kleine-koenig@pengutronix.de
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Extend driver to support logic implemented by FPGA device connected
through PCIe bus.
The motivation two support new generation of Nvidia COME module
equipped with Lattice LFD2NX-40 FPGA device.
In order to support new Nvidia COME module FPGA device driver
initialization flow is modified. In case FPGA device is detected,
system resources are to be mapped to this device, otherwise system
resources are to be mapped same as it has been done before for Lattice
LPC based CPLD.
FPGA device is associated with three PCIe devices:
- PCIe-LPC bridge for main register space access.
- PCIe-I2C bridge for I2C controller access.
- PCIe-JTAG bridge for JTAG access.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230822113451.13785-14-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Split logic in mlxplat_init()/mlxplat_exit() routines.
Separate initialization of I2C infrastructure and others platform
drivers.
Motivation is to provide synchronization between I2C bus and mux
drivers and other drivers using this infrastructure.
I2C main bus and MUX busses are implemented in FPGA logic. On some new
systems the numbers allocated for these busses could be variable
depending on order of initialization of I2C native busses. Since bus
numbers are passed to some other platform drivers during initialization
flow, it is necessary to synchronize completion of I2C infrastructure
drivers and activation of rest of drivers.
Thus initialization flow will be performed in synchronized order.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-8-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Split mlxplat_init() into two by adding mlxplat_pre_init().
Motivation is to prepare 'mlx-platform' driver to support systems
equipped PCIe based programming logic device.
Such systems are supposed to use different system resources, thus this
commit separates resources allocation related code.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-7-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add support for new L1 switch nodes providing L1 connectivity for
multi-node networking chassis.
The purpose is to provide compute server with full management and IO
subsystems with connections to L1 switches.
System contains the following components:
- COMe module based on Intel Coffee Lake CPU
- Switch baseboard with two ASICs, while
24 ports of each ASICs are connected to one backplane connector
32 ports of each ASIC are connected to 8 OSFPs
- Integrated 60mm dual-rotor FANs inside L1 node (N+2 redundancy)
- Support 48V or 54V DC input from the external power server.
Add the structures related to the new systems to allow proper activation
of the all required platform driver.
Add poweroff callback to support deep power cycle flow, which should
include special actions against CPLD device for performing graceful
operation.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-6-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Introduce support for Nvidia next-generation 800GB/s ethernet switch
SN5600.
SN5600 is 51.2 Tbps Ethernet switch based on Nvidia Spectrum-4 ASIC.
It can provide up to 64x800Gb/s (ETH) full bidirectional bandwidth per
port using PAM-4 modulations. The system supports 64 Belly to Belly 2x4
OSFP cages.
The switch was designed to fit standard 2U racks.
Features:
- 64 OSFP ports support 800GbE - 10GbE speed.
- Additional 25GbE - 1GbE service port on the front panel.
- Air-cooled with 3 + 1 redundant fan units.
- 1 + 1 redundant 3000W or 3600W PSUs.
- System management board is based on Intel Coffee-lake CPU E-2276
with secure-boot support.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-5-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Change "reset_voltmon_upgrade_fail" attribute name to
"reset_pwr_converter_fail".
For systems using "mlxplat_mlxcpld_default_ng_regs_io_data", relevant
CPLD 'register.bit' indicates the failure of power converter, while on
older systems same 'register.bit' indicates failure of voltage monitor
devices upgrade failure.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-3-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The rack switch is designed to provide high bandwidth, low latency
connectivity using optical fiber as the primary interconnect.
System supports 32 OSFP ports, non-blocking switching capacity of
25.6Tbps.
System equipped with:
- 2 replaceable power supplies (AC) with 1+1 redundancy model.
- 7 replaceable fan drawers with 6+1 redundancy model.
- 2 External Root of Trust or EROT (Glacier) devices for securing
ASICs firmware.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20230208063331.15560-2-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
The Vulcan is chassis containing Nvidia's Hopper dGPU (GH100), NVswitch
(LS10) based HGX baseboard and COMe NVSwitch management module.
The system is built for artificial intelligence and accelerated
analytics applications. Vulcan is offered as an HGX product to cloud
service providers and OEMs, who intend to build fully interconnected
GPU systems for large scale deployments.
Driver is extended to support new COMe NVSwitch management module.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Oleksandr Shamray <oleksandrs@nvidia.com>
Link: https://lore.kernel.org/r/20220711084559.62447-5-vadimp@nvidia.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add support for new system type, which is a water-cooling flavor
of the VMOD001 system class, equipped with 48xSFP28 and 8xQSFP28
100G Ethernet ports.
System is recognized by "DMI_BOARD_NAME" and " DMI_PRODUCT_SKU"
matches, when these fields are set respectively to "VMOD001" and
"HI138".
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Oleksandr Shamray <oleksandrs@nvidia.com>
Link: https://lore.kernel.org/r/20211023094022.4193813-4-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Extend systems of class VMOD0010 equipped with CoffeeLake COMEx module
with BIOS related attributes to represent various BIOS statuses.
These attributes "bios_active_image", "bios_auth_fail",
"bios_upgrade_fail", "bios_safe_mode" has been already added to modular
system. This all of them are already documented.
- "bios_active_image" - location of current active BIOS image (0: Top,
1: Bottom. The reported value should correspond to value expected by
OS in case of BIOS safe mode is 0. This bit is related to Intel
top-swap feature of DualBios on the same flash.
- "bios_auth_fail": BIOS upgrade is failed because provided BIOS image
is not signed correctly.
- "bios_upgrade_fail" BIOS upgrade is failed by some reason not related
to authentication. For example, due to physical SPI flash problem.
- "bios_safe_mod": - 0 : if BIOS is booted from a supposed active image;
1 : BIOS safe mechanism was enforced by hardware (CPLD).
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Oleksandr Shamray <oleksandrs@nvidia.com>
Link: https://lore.kernel.org/r/20211023094022.4193813-3-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add support for new system types "MQM97xx", which is based on Mellanox
Quantum-2 ASIC. It provides up to 64x400GB/s (IB) full bidirectional
bandwidth per port using PAM-4 modulation. The system support 32 OSFP
cages that can provide 64x400GB/s per port (two ports/cage). The system
fits standard 1U racks.
System is equipped with seven fan drawers and with per fan drawer LED
on backport panel and uses two-bytes for exposing CPLD Part Number
versions.
System is recognized by "DMI_BOARD_NAME" match, when this field is set
to "VMOD0010".
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Oleksandr Shamray <oleksandrs@nvidia.com>
Link: https://lore.kernel.org/r/20211023094022.4193813-2-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add event notifier callbacks for modular system line cards. These
callbacks are to be passed to "mlxreg-hotplug" driver by line card
driver during probing. Then, when any line card related hotplug event
is received (insertion ,power, synch, ready), hotplug driver will
invoke callback for the relevant line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20211002093238.3771419-5-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add initial chassis management support for Nvidia modular Ethernet
switch systems MSN4800, providing a high performance switching solution
for Enterprise Data Centers (EDC) for building Ethernet based clusters,
High-Performance Computing (HPC) and embedded environments.
This system could be equipped with the different types of replaceable
line cards and management board. The first system flavor will support
the line card type MSN4800-C16 equipped with Lattice CPLD devices aimed
for system and ASIC control, one Nvidia FPGA for gearboxes (PHYs)
management, and four Nvidia gearboxes for the port control and with
16x100GbE QSFP28 ports and also with various devices for electrical
control.
The system is equipped with eight slots for line cards, four slots for
power supplies and six slots for fans. It could be configured as fully
populated or with even only one line card. The line cards are
hot-pluggable.
In the future when more line card flavors are to be available (for
example line cards with 8x200Gb Eth port, with 4x400 Eth ports, or with
some kind of smart cards for offloading purpose), any type of line card
could be inserted at any slot.
The system is based on Nvidia Spectrum-3 ASIC. The switch height is
4U and it fits standard rack size.
System could be configured as fully populated or with even only one
line card. The line cards are hot-pluggable.
Line cards are connected to the chassis through I2C interface for the
chassis management operations and through PCIe for the networking
operations. Future line cards could be connected to the chassis through
InfiniBand fabric, instead of PCIe.
The first type of line card supports 16x100GbE QSFP28 Ethernet ports.
Those line cards equipped with the programmable devices aimed for
system control of Nvidia Ethernet switch ASIC control, Nvidia FPGA,
Nvidia gearboxes (PHYs).
The next coming card generations are supposed to support:
- Line cards with 8x200Gbe QSFP28 Ethernet ports.
- Line cards with 4x400Gbe QSFP-DD Ethernet ports.
- Smart cards equipped with Nvidia ARM CPU for offloading and for fast
access to the storage (EBoF).
- Fabric cards for inter-connection.
The basic system initialization flow with input signals from the
programmable device to kernel hotplug driver and with OS response
to some of these signals is depicted below.
lc#n_prsnt *-> Input: line card presence in/out events.
Informational event. Required action - 'udev' event
generation for logging.
lc#n_verified *-> Input: line card verification status events coming
after line card security signature validation by
hardware. Required action - connect line card
driver and initialized line card devices feeding
from system auxiliary power domain.
lc#n_pwr <-* Output: line card power on / off from OS. Action
should be performed by platform power management
driver.
lc#n_powered *-> Input: line card power on/off events coming after
line card "power good" on/off events, mean that
line card power up sequence has been successfully
completed or line card "power good" status has been
dropped. Required action - connect line card
devices feeding from system main power domain.
lc#n_synced *-> Input: line card synchronization events, coming
after hardware-firmware synchronization handshake.
Required action - to enable line card, in case
lc#n_ready has been received before.
lc#n_ready *-> Input: line card ready events, indicating line card
PHYs ready / unready states. Required action -
enable line card, in case lc#n_synced has been
received before.
lc#n_enable <-* Output: line card enable from OS - release FPGA and
PHYs line card devices from reset state. Action
should be performed by platform power management
driver.
lc#n_active *-> Input: when line card "active event" is received
for particular line card, its network, hardware
monitoring and thermal interfaces should be
configured according to the configuration obtained
from the firmware. When opposite "inactive event"
is received all the above interfaces should be
teared down. Required action - connect / disconnect
the above line card interfaces through ASIC I2C
chassis management driver.
For initial support:
- Define new system type 'VMOD0011' to support new modular system.
- Provide initial platform configuration for new system type.
- Extend the registers definitions.
- Add support for modular system registers related to line card
specific events - insertion/removal, power on/off, verification
and activation.
- Add hotplug configuration for the above events.
- Add configurations for hotplug actions for the modular system.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20211002093238.3771419-3-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Fix array names to match assignments for data items and data items
counter in 'mlxplat_mlxcpld_comex_items' structure for:
.data = mlxplat_mlxcpld_default_pwr_items_data,
.count = ARRAY_SIZE(mlxplat_mlxcpld_pwr),
and
.data = mlxplat_mlxcpld_default_fan_items_data,
.count = ARRAY_SIZE(mlxplat_mlxcpld_fan),
Replace:
- 'mlxplat_mlxcpld_pwr' by 'mlxplat_mlxcpld_default_pwr_items_data' for
ARRAY_SIZE() calculation.
- 'mlxplat_mlxcpld_fan' by 'mlxplat_mlxcpld_default_fan_items_data'
for ARRAY_SIZE() calculation.
Fixes: bdd6e155e0 ("platform/x86: mlx-platform: Add support for new system type")
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/20201207174745.22889-3-vadimp@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>