Cadence Macb Linux Driver for Zynq, Zynq Ultrascale+ MPSoC and Versal
Paths, files, links and documentation on this page are given relative to the Linux kernel source tree.
HW IP features
- Speed support for 10/100/1000 Mbps
- MAC loopback and PHY loopback
- Partial store and forward option
- Packet buffer option
- Flow control - TX/RX pause
- Checksum offload support, CRC checking, FCS stripping
- Promiscuous mode, Broadcast mode
- Collision detection and enforcement - this is an IP feature, no SW support required
- MDIO support for PHY layer management
- Multicasting support
- VLAN tagged frames
- Half duplex support
- Programmable IPG
- External FIFO interface
- Wake on LAN
- IEEE1588 support for ZynqMP and Versal
- Jumbo frame size support for ZynqMP and Versal
- 64 bit addressing for ZynqMP and Versal
- Priority queue support for ZynqMP and Versal
- PS SGMII support (hardwired to 1Gbps) is present in ZynqMP
Features supported in driver
(Functional HW IP and stack related features)
- Speed support for 10/100/1000 Mbps with clock framework
- Packet buffer option
- Checksum offload support, CRC checking, FCS stripping
- MDIO support for PHY layer management
- Multicasting support
- Programmable IPG
- IEEE1588 support for ZynqMP and Versal
- Jumbo frame size support for ZynqMP and Versal
- 64 bit addressing for ZynqMP and Versal
- Priority queue support for ZynqMP and Versal
- PS SGMII support is present in ZynqMP and supported in the driver
- This driver can be used with PL SGMII/1000BaseX driver on Zynq, ZynqMP and Versal
- This driver can be used with gmii2rgmii converter driver
- Support for EthTool queries
- RX NAPI support
- Clock adaptation on Zynq, ZynqMP and Versal
- Runtime PM and suspend/resume supported on ZynqMP and Versal
- Partial store and forward
- Wake on LAN support using ARP on ZynqMP and Versal
- Dynamic SGMII configuration support on Xilinx Zynq Ultrascale+ MPSoC
Missing Features, Known Issues and Limitations
- Linux does not support loopback
- Flow control support is not present in the driver. RX pause frames can be received by the IP but TX pause frame support is not provided.
- External FIFO interface is not supported by the driver - this implementation is DMA based.
- No interrupt support for PHY events in the driver. The current implementation relies on polling for PHY events.
- No IEEE 1588 support for Zynq-7000 as the timestamp implementation in IP is not accurate enough.
- The timestamp generated on a PTP event is stored in a non-latching register. This means that the timestamp is overwritten whenever a PTP event packet arrives. Hence there is no foolproof way to associate a timestamp with the packet.
- An application using sync, follow up, pdelay request, pdelay response with a sync cycle of 1 second and NO errors in between might possibly work but it is not reliable because sync will fail the moment there is any deviation: i.e. multiple back to back PTP event packets in the same direction (or) a small sync interval on a high traffic system where the SW is unable to process the timestamp register before it is overwritten.
- WOL does not work on warm restart designs because GEM WOL requires an RX BD scratch area that is accessible even during suspend (OCM is used for this) and OCM is secure in this design which is a limitation for this feature.
Important AR links
- WOL does not work on warm restart designs due to some limitations (2018.1/2/3) - AR-71028
- PTP time adjustment for a large negative delta fails in 2018.1/2 - AR-71332
- MACB MDIO bus support - Please find the patches for 2017.1, 2017.2, 2017.3, 2017.4, 2018.1, 2018.2 and 2018.3 at the AR - AR-69132
- ZynqMP PS SGMII GT initialization and related - AR-68866
- ZynqMP PS SGMII fixed link - AR-69769 (applicable till 2021.2 release)
- TI PHY design on ZynqMP evaluation board has incorrect straps and can be remedied with a SW workaround (already implemented in drivers) - AR-70686
- PL PCS PMA initialization in fsbl for Zynq and ZynqMP - refer to xapp1026 and xapp130
- For custom Versal designs using AIE on 2020.1, make sure the low DDR region is accessible to LPD slaves (including GEM) using a workaround (<link to AR>).
- There is a performance drop of ~100Mbps between 2020.1 (5.4 Linux kernel) and 2019.2 (4.19 Linux kernel) observed on both GEM and Axi Ethernet on Zynq. This is currently suspected to be the result of change in the net framework and there is no workaround yet. Further updates will be documented in AR-75195
- Macb + PL PCS PMA ifconfig down/up may fail without proper reset and clock reinitialization. Please refer to AR-72806.
- Timestamping issue in gPTP master mode (applicable only for 2022.2/2023.1) - AR-000035307
- For full list of ARs, search XKB
Kernel Configuration
The following config options should be enabled in order to build the macb driver:
CONFIG_ETHERNET
...
Use IEEE 1588 hwstamp (only supported in ZynqMP and Versal) - This config option supports use of 1588 HW TSTAMP support in ZynqMP & Versal and depends on MACB.
This option enables IEEE 1588 Precision Time Protocol (PTP) support for MACB.
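One way to confirm these options on an already built kernel is to grep its config file. The helper below is a minimal sketch, not part of any Xilinx tooling: `check_macb_config` is a hypothetical name, and only `CONFIG_ETHERNET` is named on this page. `CONFIG_NET_CADENCE`, `CONFIG_MACB` and `CONFIG_MACB_USE_HWSTAMP` are the symbol names used by the mainline Cadence Kconfig and are assumed to apply to the kernel being checked.

```shell
# Report whether the kernel options relevant to macb are enabled (=y or =m)
# in a kernel config file. Returns non-zero if any option is missing.
# Usage: check_macb_config /path/to/.config
check_macb_config() {
    config_file="$1"
    missing=0
    # CONFIG_MACB_USE_HWSTAMP is only needed for IEEE 1588 (ZynqMP and Versal).
    # Symbol names other than CONFIG_ETHERNET are assumptions taken from mainline.
    for opt in CONFIG_ETHERNET CONFIG_NET_CADENCE CONFIG_MACB CONFIG_MACB_USE_HWSTAMP; do
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
            missing=1
        fi
    done
    return "$missing"
}
```

On a running target the config is often available as /proc/config.gz (if the kernel was built with that feature); decompress it to a file first and pass that file to the helper.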
Devicetree
Compatible string can be:
- "xlnx,zynq-gem" for Zynq-7000
- "xlnx,zynqmp-gem" for ZynqMP. This compatible string enables use of jumbo frame sizes, 1588 and HW timestamping support and any features exclusive to ZynqMP.
- "xlnx,versal-gem" for Versal. This compatible string enables use of jumbo frame sizes, 1588 and HW timestamping support, automatic flow control, 802.1AS and any features exclusive to Versal.
...
Note: Compatible string of format "cdns,XXXX" is deprecated.
For more details on phy bindings please refer to "Documentation/devicetree/bindings/net/cdns,macb.yaml" (macb.txt in older versions).
```
gem0: ethernet@e000b000 {
    compatible = "cdns,gem";
    reg = <0xe000b000 0x1000>;
    status = "disabled";
    interrupt-parent = <&gic>;
    interrupts = <0 22 4>;
    clocks = <&clkc 30>, <&clkc 30>, <&clkc 13>;
    clock-names = "pclk", "hclk", "tx_clk";
    #address-cells = <1>;
    #size-cells = <0>;
    phy-handle = <&ethernet_phy>;
    phy-mode = "rgmii-id";
    ethernet_phy: ethernet-phy@7 {
        reg = <7>;
    };
};
```
...
Clock adaptation is present by default for all device families. For more details refer to devicetree clock bindings and respective wiki pages.
ZynqMP and Versal also have tsu-clk adaptation support in addition to all the other reference clocks.
...
This driver can be used for a MAC - MAC fixed link connection. In order to do so, please update the devicetree fixed link node as per
https://github.com/Xilinx/linux-xlnx/blob/master/Documentation/devicetree/bindings/net/fixedethernet-link.txt
and set the phy-mode to "moca" (https://github.com/Xilinx/linux-xlnx/blob/master/include/linux/phy.h)
Common MDIO DT
To use multiple GEM→PHY connections using a common MDIO bus, please use the following devicetree convention:
```
gem0 {
    ......
    phy-handle = <&phya>;
    mdio {
        phya {
            reg = <0xa>;
        };
        phyb {
            reg = <0xb>;
        };
    };
};

gem1 {
    .....
    phy-handle = <&phyb>;
};
```
...
→ gem0 is communicating via phya and gem1 is communicating via phyb
Note: For versions up to 2022.1, gem0 needs to come up before gem1 and stay up, because the MDIO interface is expected to be up first; otherwise, the dependent MAC-PHY link (gem1-phyb) will only come up on the next ifconfig up/down.
As a result of this gem0's runtime PM will not be effective if gem1 is still active in this configuration.
For versions starting 2022.2, probe order and PM suspend/resume order is automatically handled in the driver based on MDIO producer and consumer.
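With a shared MDIO bus, it can help to confirm at runtime that both PHYs were discovered on the bus owned by gem0. The sketch below lists the entries under the standard sysfs MDIO bus directory together with the bound PHY driver. `list_mdio_phys` is a hypothetical helper name, and the exact device names printed depend on the platform.

```shell
# List devices registered on MDIO buses and the driver bound to each.
# The sysfs directory is a parameter so the function can be tried on a copy.
# Usage: list_mdio_phys [/sys/bus/mdio_bus/devices]
list_mdio_phys() {
    root="${1:-/sys/bus/mdio_bus/devices}"
    for dev in "$root"/*; do
        [ -e "$dev" ] || continue          # directory empty or missing
        name=$(basename "$dev")
        if [ -L "$dev/driver" ]; then
            driver=$(basename "$(readlink "$dev/driver")")
        else
            driver="(no driver)"
        fi
        echo "$name $driver"
    done
}
```

In the common MDIO configuration above, both PHY addresses (0xa and 0xb) would be expected to show up under the same bus prefix.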
PS SGMII DTs (ZynqMP only)
...
→ If there is no MDIO access to the SGMII PHY or if SFPs are used, then the phy-mode should be set to sgmii and a fixed link node should be used instead of a phy node. This means that the Linux SW assumes there is no PHY or autoneg. PCS autoneg will be disabled and the PCS_status register will always report link up (to be read twice because of sticky bits).
→ Alternately, patch in SGMII fixed link AR mentioned above can be used (especially if there is no PHY) with "is-internal-pcspma" property and a fixed link node. In this case, both Linux SW and PCS block do not attempt autoneg and the link status in PCS_status register will always report link up.
Performance
...
Zynq
...
This solution is available in releases 2022.1 and above. For previous releases, please refer to "Important AR links"
Pointers on PHY reset via GPIO
→ For boards which require a PHY reset via GPIO, please see the generic framework provisions here: https://github.com/Xilinx/linux-xlnx/blob/master/Documentation/devicetree/bindings/net/ethernet-phy.yaml#L141
This can be used for multiple PHYs with independent GPIO resets as well.
→ If reset is required before PHY detection, please see the MDIO bus provision here: https://github.com/Xilinx/linux-xlnx/blob/master/Documentation/devicetree/bindings/net/mdio.yaml#L30
→ When using PHY reset via GPIO, please check manufacturer specific datasheet for the reset polarity, reset assert duration and post de-assert delay for PHY to be functional. These values can then be passed to PHY and MDIO framework via Devicetree documentation above.
Performance
These benchmark performance numbers were obtained by connecting Xilinx boards to Linux PCs/server machines (Ubuntu/Red Hat Enterprise). The tool used is netperf (refer to the tool information below).
The protocol, MTU size and option to note CPU load can all be selected from netperf/netserver options.
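As an illustration of how one such run could be set up, the sketch below prints the commands for a given interface, server and MTU rather than executing them. `bench_cmds` is a hypothetical helper. The `taskset 2 ./netperf` form follows the test procedure on this page, `-c` and `-C` are the netperf options that request local and remote CPU utilisation, and setting the MTU with `ip link` is an assumption about how the interface is configured.

```shell
# Print the commands for one TCP and one UDP benchmark run at a given MTU.
# Usage: bench_cmds <interface> <server-ip> [mtu]
bench_cmds() {
    iface="$1"
    server="$2"
    mtu="${3:-1500}"
    # The MTU is a property of the interface, so it is set before netperf runs.
    echo "ip link set dev $iface mtu $mtu"
    # -c / -C report local and remote CPU utilisation alongside throughput.
    echo "taskset 2 ./netperf -H $server -t TCP_STREAM -c -C"
    echo "taskset 2 ./netperf -H $server -t UDP_STREAM -c -C"
}
```

The output can be reviewed first and then piped to `sh` on the board once netserver is running on the remote machine.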
Zynq
Board: ZC706
CPU Freq: 666MHz (A9)
Link Speed: 1000Mbps, Full duplex
Linux version: 6.1
MTU | TCP TX (Mbps) | CPU (%) | TCP RX (Mbps) | CPU (%) | UDP TX (Mbps) | CPU (%) | UDP RX (Mbps) | CPU (%) |
---|---|---|---|---|---|---|---|---|
1500 | 728.76 | 97.29 | 548.70 | 95.96 | 565.6 | 65.00 | 444.8 | 99.55 |
Linux version: 5.4 and above
NOTE: There is a ~10% drop in performance (compared to 2019.2) for 1500 MTU.
The drop is due to this commit forcibly enabling CONFIG_OPTIMIZE_INLINING in the Linux kernel. It is observed on the GEM and Xilinx Axi Ethernet drivers on Zynq.
The kernel and networking stack have a large number of inline functions, and some unoptimized inline function (possibly dependent on the gcc version) could be causing the performance drop.
The plan is to document this performance drop on Zynq and initiate a discussion with the mainline community so that it is analyzed by the respective kernel maintainers.
MTU | TCP TX (Mbps) | CPU (%) | TCP RX (Mbps) | CPU (%) | UDP TX (Mbps) | CPU (%) | UDP RX (Mbps) | CPU (%) |
---|---|---|---|---|---|---|---|---|
1500 | 654.79 | 93.11 | 737.63 | 81.43 | 486.8 | 63.56 | 303 | 96.23 |
Linux version: 5.10
MTU | TCP TX (Mbps) | CPU (%) | TCP RX (Mbps) | CPU (%) | UDP TX (Mbps) | CPU (%) | UDP RX (Mbps) | CPU (%) |
---|---|---|---|---|---|---|---|---|
1500 | 675.79 | 90.68 | 759.22 | 86.45 | 455.0 | 62.95 | 690.1 | 82.99 |
ZynqMP
Board: ZCU102
CPU Freq 1100MHz (A53)
Link Speed 1000Mbps, Full duplex
DDR 533MHz
CCU: No
Linux version: 6.1
MTU | TCP TX (Mbps) | CPU (%) | TCP RX (Mbps) | CPU (%) | UDP TX (Mbps) | CPU (%) | UDP RX (Mbps) | CPU (%) |
---|---|---|---|---|---|---|---|---|
1500 | 941.37 | 5.0 | 930.64 | 54.94 | 957.0 | 20.3 | 961.6 | 22.07 |
8192 | 988.95 | 2.34 | 989.07 | 7.94 | 991.9 | 5.80 | 992.0 | 5.54 |
Test Procedure
Diagnostic and Protocol Tests
PING
This utility is used to test the reachability of a host on an Internet Protocol (IP) network and to measure the round trip time for messages sent from the originating host to a destination computer.
How to run:
```
ping <Remote IP Address>
```
WebServer
Connect the Zynq board to a Linux x86 machine. Open a web browser on the host machine and enter the static IP assigned to the Zynq board. The webpage is expected to be displayed properly.
Telnet
This tests remote access to the Zynq board from the host machine. Ensure that the telnet server is running on the Zynq board.
```
telnet <Server IP Address>
```
FTP & TFTP
How to run:
Open an FTP client on the host and connect to the Zynq board.
```
x86> ftp 192.168.1.10
x86> mput <file_name>
```
Pkt Generator
Please refer to the link below for how to run and the various options:
https://www.kernel.org/doc/Documentation/networking/pktgen.txt
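For orientation, a minimal single-thread pktgen run could look like the sketch below. It assumes the pktgen module is already loaded (`modprobe pktgen`) and uses only the control commands described in the pktgen documentation linked above. `pktgen_run` and the `PGROOT` override are hypothetical names introduced for illustration; `PGROOT` exists only so the writes can be inspected without a real `/proc/net/pktgen`.

```shell
# Configure one device on pktgen thread 0 and start the run.
# Usage: pktgen_run <device> <dst-ip> <dst-mac> [count]
pktgen_run() {
    dev="$1"
    dst_ip="$2"
    dst_mac="$3"
    count="${4:-100000}"
    pg="${PGROOT:-/proc/net/pktgen}"

    # Bind the device to kernel thread 0, dropping any earlier binding.
    echo "rem_device_all"   > "$pg/kpktgend_0"
    echo "add_device $dev"  > "$pg/kpktgend_0"

    # Per-device parameters: number of packets and destination.
    echo "count $count"     > "$pg/$dev"
    echo "dst $dst_ip"      > "$pg/$dev"
    echo "dst_mac $dst_mac" > "$pg/$dev"

    # Start transmission; on a real system this write returns when the run ends.
    echo "start"            > "$pg/pgctrl"
}
```

After a real run, the results can be read back from the per-device file, for example `cat /proc/net/pktgen/eth0`.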
Performance Tests
Netperf
How to run:
Server:
```
netserver
```
Client:
```
taskset 2 ./netperf -H <Server IP> -t TCP_STREAM
taskset 2 ./netperf -H <Server IP> -t UDP_STREAM
```
http://www.netperf.org/netperf/
Iperf
How to run:
Server:
```
./iperf_arm -s -u
./iperf_arm -s
```
Client:
```
./iperf_arm -c <Server IP> -u -b <bandwidth>
./iperf_arm -c <Server IP>
```
http://en.wikipedia.org/wiki/Iperf
Stress Test
Iperf with option -d
Run iperf in dual testing mode. This will cause the server to connect back to the client on the port specified in the -L option (or defaults to the port the client connected to the server on). This is done immediately, therefore running the tests simultaneously.
```
./iperf_arm -c <Server IP> -d
```
Ping flood test
Users can send a hundred or more packets per second using the -f option. It prints a ‘.’ when a packet is sent, and a backspace when a packet is received.
```
ping -f localhost
```
PTP
1588 synchronization can be tested on ZynqMP and Versal using the open source linuxptp application: http://linuxptp.sourceforge.net/
The setup requires a master with a precise clock and timestamping capabilities, typically a NIC or another 1588 capable device.
How to run
master:
```
ptp4l -i <interface name> -m
```
slave:
```
ptp4l -i <interface name> -s -m
```
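Before starting ptp4l it can be useful to confirm that the interface actually advertises hardware timestamping. The sketch below checks the report printed by `ethtool -T` for the hardware transmit and receive capabilities. `has_hw_timestamping` is a hypothetical helper name, and the check relies only on the capability keywords `hardware-transmit` and `hardware-receive` appearing in that report.

```shell
# Succeeds if the `ethtool -T` report read from stdin lists both
# hardware transmit and hardware receive timestamping.
# Usage: ethtool -T <interface name> | has_hw_timestamping
has_hw_timestamping() {
    report=$(cat)
    echo "$report" | grep -q "hardware-transmit" &&
        echo "$report" | grep -q "hardware-receive"
}
```

For example, `ethtool -T eth0 | has_hw_timestamping && ptp4l -i eth0 -s -m` would start the slave only when both capabilities are reported (eth0 is a placeholder interface name).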
Mainline status
The macb driver is currently at mainline kernel 6.1 with some patches pulled in from later kernels. The patches that are not yet in any mainline kernel are as follows:
- WOL via ARP support (~70 lines)
- Partial store and forward support (~80 lines) - in upstream 6.5 kernel
- Minor differences around PCS PMA handling
PHY details
The following PHYs were tested with ZynqMP GEM:
- TI DP83867IR
- TI DP83867E (SGMII)
- Marvell 88E1112
- Marvell 88E1510/2
- Realtek RTL8211
- Vitesse VSC8211
- Micrel KSZ9031
- VSC8531_02
Change Log
2023.2
Summary:
- Bugfixes from mainline to support pclk>160MHz and to fix PTP Timestamp failure due to packet padding
Commits:
https://github.com/Xilinx/linux-xlnx/commits/xilinx-v2023.2/drivers/net/ethernet/cadence
2023.1
Summary:
- Fixes for macb SGMII wake source configuration
- Upgrade to 6.1 mainline version including minor updates around napi weight removal, macb phy PM support
Commits:
https://github.com/Xilinx/linux-xlnx/commits/xilinx-v2023.1/drivers/net/ethernet/cadence
2022.2
Summary:
- Macb common MDIO bus enhancements in initialization and suspend/resume flows including SGMII phy handling.
- Fixes on ethtool WOL helper and inclusion of macb pad and fcs support for fragmented packets.
- PTP fixes on one step sync support and to add PTP TX timestamps for all packets in alignment with HWTSTAMP_TX_ON definition.
- Mainline fixes for TX restart handling under TXUBR conditions, compatible string prefix (deprecation of cdns to use xlnx)
Commits:
https://github.com/Xilinx/linux-xlnx/commits/xilinx-v2022.2/drivers/net/ethernet/cadence
2022.1
Summary:
- Align with mainline 5.15 driver to use phylink and PCS functionality.
- Add support for (Internal) SGMII dynamic configuration on Zynq Ultrascale+ MPSoC
- Minor bugfixes including fix for possible rotting packet in RX queue due to NAPI ordering.
Commits:
https://github.com/Xilinx/linux-xlnx/commits/xilinx-v2022.1/drivers/net/ethernet/cadence
2021.2
No Changes
2021.1
Summary:
- New updates from 5.10 including phylink support.
- Minor bugfixes including DMA & coherent DMA mask handling, error handling for WOL ARP etc.
Commits:
https://github.com/Xilinx/linux-xlnx/commits/xilinx-v2021.1/drivers/net/ethernet/cadence
2020.2
Summary:
- Minor bugfix for high memory DMA handling.
Commits:
https://github.com/Xilinx/linux-xlnx/commits/xilinx-v2020.2/drivers/net/ethernet/cadence
...