Axi Ethernet Linux driver for Microblaze, Zynq, Zynq Ultrascale+ MPSoC and Versal


Introduction

This page gives an overview of the Axi Ethernet Linux driver, which is available as part of the Linux distribution.

...

  • Hardened Ethernet IP block on Versal.
  • Multi rate Ethernet MAC supporting speeds from 10G to 100G.
    • The driver supports 25GE and 10GE with 1 to 4 lanes.
  • Hardened IP (to be used with Soft DMA and logic for driver subsystems)
  • High performance, low latency.
  • Low data path latency
  • User-side AXI4-Stream interface for data
  • AXI4-Lite register interface
  • Detailed statistics gathering
  • IEEE1588 support

...

Features supported in the driver

  • Supported Ethernet IPs: AXI 1G/2.5G Ethernet subsystem (PG138), 10G Ethernet subsystem (PG157), 10G Ethernet subsystem (PG210), USXGMII (PG251) and MRMAC.
  • IEEE 1588 support for 1G and legacy 10G MAC (PG157), 10G Ethernet subsystem (PG210) and MRMAC
  • Speed support for 10/100/1000 Mbps for 1G MAC
  • 10G Base-R support for legacy 10G MAC (PG157) and 10G MAC (PG210)
  • 10G and 25G speed support for MRMAC
  • Support for GMII/RGMII/SGMII/1000Base-X PHY configurations
  • Supports independent 4 KB, 8 KB, 16 KB, or 32 KB TX and RX frame buffer memory
  • Support for common ethtool queries.
  • NAPI support.
  • Full/partial checksum offload support
  • Support for jumbo frames
  • Supports AXI DMA and AXI MCDMA configurations.
  • Multi-queue support

Missing Features and Known Issues/Limitations in Driver

  • The driver assumes that the Axi Ethernet IP is connected to the DMA at the hardware level.
  • The driver doesn't use the dmaengine framework; it contains the DMA programming sequence itself, i.e. it doesn't use a separate DMA driver. Hence the compatible string of the axidma node (DTS) is set to a dummy value: compatible = "xlnx,eth-dma";
  • The driver doesn't support software time-stamping. It supports only hardware time-stamping.
  • PTP synchronization together with high-speed traffic (iperf or netperf) is not supported: under heavy load, the timestamp in the FIFO and the DMA data in the BD are expected to go out of sync and remain so until the interface is reset.
  • No support for fixed-link.
  • For 1588 testing, the current driver assumes that the AXI Stream FIFO is connected to the MAC TX timestamp stream interface at the design level. For the axiethernet 1G/10G subsystem only 2-step PTP is supported.
  • 10G/25G and USXGMII configurations do not support dynamic link status/change detection in the background, as there is no external PHY using the PHY framework.
  • Pause frames are not supported, so RX overrun errors can occur during bidirectional throughput tests.
  • MCDMA support is selected at build time through the CONFIG_AXIENET_HAS_MCDMA kernel config option. In a multi-instance scenario the driver therefore supports only a single DMA type, e.g. 1G + MCDMA and 10G + MCDMA.
  • The driver doesn't support extended multicast and VLAN. Multicast and VLAN support has received only limited validation.
  • Runtime switchable mode is not supported.
  • The 25G Ethernet Subsystem (PG210) is not supported.
  • MRMAC speeds 40G/50G/100G are not supported yet.
  • MRMAC lanes are not independent of each other if common GT reset logic exists in the subsystem.
  • On Versal, support is limited to the AXI 1G/2.5G Ethernet subsystem (without PTP) and MRMAC.

NOTE: Relevant missing Features and Known Issues/Limitations in IP:

  • Multiple TX and RX channels in MCDMA share common configuration and reset registers and hence cannot be used independently by multiple MACs. For example, if XXV Ethernet instance 1 uses channels 0-4 of MCDMA and XXV Ethernet instance 2 uses channels 5-15, then resets during driver initialization and error management affect all channels, and both instances need to use the common registers. Due to this limitation, multiple MACs cannot be used with a single MCDMA.
  • The AXI Ethernet driver throws a swiotlb full error with jumbo frames in a specific MCDMA configuration. Please refer to 2020.x AR-75128.
  • Default DTG generation for XXV Ethernet designs fails on 2020.2. Please refer to AR-76113.

Kernel Configuration

The following config options should be enabled in order to build the Axi Ethernet driver:
CONFIG_ETHERNET
CONFIG_NET_VENDOR_XILINX
CONFIG_XILINX_AXI_EMAC
CONFIG_AXIENET_HAS_MCDMA (select this option if Axi Ethernet is configured with AXI MCDMA in the design)
CONFIG_XILINX_PHY (for testing the SGMII/1000Base-X configuration with the PCS/PMA core)
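
One way to confirm these options before building is to search the kernel .config for them. The helper below is a minimal sketch: the function name check_axienet_config is illustrative and not part of the driver or the kernel tree, and it only checks that each option is set to y or m.

```shell
#!/bin/sh
# check_axienet_config FILE [EXTRA_OPTION...]
# Hypothetical helper: prints "missing: <option>" for every required option
# that is not set to y or m in FILE, and returns non-zero if any is missing.
check_axienet_config() {
    config_file=$1
    shift
    status=0
    # The three options always required by the driver, plus any
    # design-specific ones passed by the caller.
    for opt in CONFIG_ETHERNET CONFIG_NET_VENDOR_XILINX CONFIG_XILINX_AXI_EMAC "$@"; do
        if ! grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "missing: ${opt}"
            status=1
        fi
    done
    return "$status"
}
```

For an MCDMA design this could be called as check_axienet_config .config CONFIG_AXIENET_HAS_MCDMA from the top of the kernel build directory.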




Device-tree

For more details on PHY bindings, please refer to "Documentation/devicetree/bindings/net/phy.txt".
axi_ethernet_eth_buf: ethernet@40c00000 {
	axistream-connected = <&axi_dma_1>;
	axistream-control-connected = <&axi_dma_1>;
	clock-frequency = <100000000>;
	clocks = <&clk_bus_0>;
	compatible = "xlnx,axi-ethernet-1.00.a";
	device_type = "network";
	interrupt-parent = <&microblaze_1_axi_intc>;
	interrupts = <4 2>;
	reg = <0x40c00000 0x40000>;
	xlnx,phy-type = <0x4>;
	xlnx,phyaddr = <0x1>;
	xlnx,rxcsum = <0x0>;
	xlnx,rxmem = <0x8000>;
	xlnx,txcsum = <0x0>;
	phy-handle = <&phy0>;
	mdio {
		#address-cells = <1>;
		#size-cells = <0>;
		phy0: phy@7 {
			device_type = "ethernet-phy";
			reg = <7>;
		};
	};
};

Soft Ethernet MAC(1G, legacy 10G or 10G/25G MAC, MRMAC) Configured with MCDMA

When Axi Ethernet (10G/25G MAC) is configured with MCDMA, the device-tree node looks like the following:
 xxv_ethernet_0: ethernet@80020000 {
	axistream-connected = <&axi_dma_hier_axi_mcdma_0>;
	axistream-control-connected = <&axi_dma_hier_axi_mcdma_0>;
	clock-frequency = <100000000>;
	clock-names = "rx_core_clk_0", "dclk", "s_axi_aclk_0";
	clocks = <&misc_clk_0>, <&clk 72>, <&clk 71>;
	compatible = "xlnx,xxv-ethernet-2.5", "xlnx,xxv-ethernet-1.0";
	device_type = "network";
	local-mac-address = [00 0a 35 00 00 00];
	phy-mode = "base-r";
	reg = <0x0 0x80020000 0x0 0x10000>;
	xlnx = <0x0>;
	xlnx,add-gt-cntrl-sts-ports = <0x0>;
	xlnx,anlt-clk-in-mhz = <0x64>;
	xlnx,axis-tdata-width = <0x40>;
	xlnx,axis-tkeep-width = <0x7>;
	xlnx,base-r-kr = "BASE-R";
	xlnx,channel-ids = "1","2","3","4","5","6","7","8","9","a","b","c","d","e","f","10";
	xlnx,clocking = "Asynchronous";
	xlnx,core = "Ethernet MAC+PCS/PMA 64-bit";
	xlnx,data-path-interface = "AXI Stream";
	xlnx,enable-datapath-parity = <0x0>;
	xlnx,enable-pipeline-reg = <0x0>;
	xlnx,enable-preemption = <0x0>;
	xlnx,enable-preemption-fifo = <0x0>;
	xlnx,enable-rx-flow-control-logic = <0x0>;
	xlnx,enable-time-stamping = <0x1>;
	xlnx,enable-tx-flow-control-logic = <0x0>;
	xlnx,enable-vlane-adjust-mode = <0x0>;
	xlnx,family-chk = "zynquplus";
	xlnx,fast-sim-mode = <0x0>;
	xlnx,gt-diffctrl-width = <0x4>;
	xlnx,gt-drp-clk = "100.00";
	xlnx,gt-group-select = "Quad X0Y0";
	xlnx,gt-location = <0x1>;
	xlnx,gt-ref-clk-freq = "156.25";
	xlnx,gt-type = "GTH";
	xlnx,include-auto-neg-lt-logic = "None";
	xlnx,include-axi4-interface = <0x1>;
	xlnx,include-fec-logic = <0x0>;
	xlnx,include-rsfec-logic = <0x0>;
	xlnx,include-shared-logic = <0x1>;
	xlnx,include-user-fifo = <0x1>;
	xlnx,lane1-gt-loc = "X0Y4";
	xlnx,lane2-gt-loc = "NA";
	xlnx,lane3-gt-loc = "NA";
	xlnx,lane4-gt-loc = "NA";
	xlnx,line-rate = <0xa>;
	xlnx,mii-ctrl-width = <0x4>;
	xlnx,mii-data-width = <0x20>;
	xlnx,num-of-cores = <0x1>;
	xlnx,num-queues = /bits/ 16 <0x10>;
	xlnx,ptp-clocking-mode = <0x0>;
	xlnx,ptp-operation-mode = <0x2>;
	xlnx,runtime-switch = <0x0>;
	xlnx,rxmem = <0x40000>;
	xlnx,switch-1-10-25g = <0x0>;
	xlnx,tx-latency-adjust = <0x0>;
	xlnx,tx-total-bytes-width = <0x4>;
	xlnx,xgmii-interface = <0x1>;
	interrupt-names = "mm2s_ch1_introut", "mm2s_ch2_introut", "mm2s_ch3_introut", "mm2s_ch4_introut", "mm2s_ch5_introut", "mm2s_ch6_introut", "mm2s_ch7_introut", "mm2s_ch8_introut", "mm2s_ch9_introut", "mm2s_ch10_introut", "mm2s_ch11_introut", "mm2s_ch12_introut", "mm2s_ch13_introut", "mm2s_ch14_introut", "mm2s_ch15_introut", "mm2s_ch16_introut", "s2mm_ch1_introut", "s2mm_ch2_introut", "s2mm_ch3_introut", "s2mm_ch4_introut", "s2mm_ch5_introut", "s2mm_ch6_introut", "s2mm_ch7_introut", "s2mm_ch8_introut", "s2mm_ch9_introut", "s2mm_ch10_introut", "s2mm_ch11_introut", "s2mm_ch12_introut", "s2mm_ch13_introut", "s2mm_ch14_introut", "s2mm_ch15_introut", "s2mm_ch16_introut";
	interrupt-parent = <&gic>;
	interrupts = <0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 89 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4 0 90 4>;

	xxv_ethernet_0_mdio: mdio {
			#address-cells = <1>;
			#size-cells = <0>;
	};
};
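
The interrupt-names property in the node above follows a regular pattern: one mm2s_chN_introut entry per TX channel followed by one s2mm_chN_introut entry per RX channel. The sketch below regenerates that list for a given channel count. It is only an illustration of the pattern seen in this example; the function name is hypothetical, and the output of the device-tree generator remains the authoritative source for a real design.

```shell
#!/bin/sh
# mcdma_interrupt_names NUM_CHANNELS
# Hypothetical helper: prints an interrupt-names property with NUM_CHANNELS
# mm2s (TX) entries followed by NUM_CHANNELS s2mm (RX) entries, mirroring
# the naming used in the example node.
mcdma_interrupt_names() {
    n=$1
    out=""
    for dir in mm2s s2mm; do
        i=1
        while [ "$i" -le "$n" ]; do
            # Append ", " only when the list already has an entry.
            out="${out}${out:+, }\"${dir}_ch${i}_introut\""
            i=$((i + 1))
        done
    done
    printf 'interrupt-names = %s;\n' "$out"
}
```

Calling mcdma_interrupt_names 16 yields the 32 names used in the node above, which also matches xlnx,num-queues = 0x10.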



...

          
  • The driver supports the channel observer feature through sysfs. This custom feature is useful in a multi-core (Observer) system where MCDMA is a shared resource for all cores. The MCDMA IP supports a maximum of six cores, and the 16 channels can be distributed across the cores as a static configuration. The Channel Observer is available for each group and reports the status of the channels being serviced in that group.
  • The driver supports per-channel weight configuration through sysfs. This custom feature specifies the channel weight, i.e. the number of packets to be sent in one iteration.
  • The driver supports Linux multiqueue networking. It uses the alloc_etherdev_mq() function to allocate the subqueues for the device.

    The userspace command 'tc', part of the iproute2 package, is used to configure qdiscs. To add the MULTIQ qdisc, assuming the device is called eth0, run the following command:

    # tc qdisc add dev eth0 root handle 1: multiq

    The qdisc will allocate the number of bands to equal the number of queues that the device reports, and bring the qdisc online.

    Assuming eth0 has 4 Tx queues, the band mapping would look like:

    band 0 => queue 0
    band 1 => queue 1
    band 2 => queue 2
    band 3 => queue 3
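
The two helpers below sketch how the queue count can be read and how traffic might then be steered to a particular band. Both function names are hypothetical. The queue count relies on the standard /sys/class/net/<iface>/queues layout, and the filter command follows the skbedit queue_mapping form described in the kernel's multiqueue networking documentation; the destination address is a placeholder. The second helper only prints the command so that it can be reviewed before being run.

```shell
#!/bin/sh
# count_tx_queues QUEUES_DIR
# Counts the tx-N entries in QUEUES_DIR. On a live system QUEUES_DIR would
# be /sys/class/net/<iface>/queues.
count_tx_queues() {
    ls "$1" | grep -c '^tx-'
}

# multiq_steer_cmd DEV DST_IP QUEUE
# Prints (does not run) a tc filter that maps IP traffic destined for
# DST_IP to TX queue QUEUE under the multiq qdisc with handle 1:.
multiq_steer_cmd() {
    echo "tc filter add dev $1 parent 1: protocol ip prio 1 u32 match ip dst $2 action skbedit queue_mapping $3"
}
```

For example, count_tx_queues /sys/class/net/eth0/queues reports how many bands the multiq qdisc will create, and multiq_steer_cmd eth0 192.168.0.3 3 prints a filter that sends traffic for that address to queue 3.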

...

NOTE: There is a ~10% drop in performance (compared to 2019.2) for 1500 MTU.
The drop is due to the "enable CONFIG_OPTIMIZE_INLINING forcibly" commit in the Linux kernel.

The kernel and networking stack are full of inline functions, and some unoptimized
inline function (possibly dependent on the gcc version) could be causing the performance drop.

The performance drop is observed on GEM and Xilinx Axi Ethernet MACs on Zynq.

The plan is to document the performance drop on Zynq and initiate a discussion with
the mainline community so that it is analyzed by the respective kernel maintainers.



         TCP (Mbps)                          UDP (Mbps)
MTU      TX     CPU(%)    RX     CPU(%)      TX     CPU(%)    RX     CPU(%)
1500     740    67.53     537    89.39       453    52.86     456    88.72
8192     977    60.69     732    50.26       743    36.10     643    50.32


...

Traditionally, MicroBlaze designs are not targeted at high-performance applications, so only functional sanity testing is done.

10G Ethernet with AXIMCDMA

Kernel version: 5.10

ZynqMP
Board: ZCU102 board (production silicon) + SFP Module


         TCP (Gbps)                          UDP (Gbps)
MTU      TX     CPU(%)    RX     CPU(%)      TX     CPU(%)    RX     CPU(%)
1500     2.29   50.91     1.7    30.98       3.03   99.93     1.65   70.04
9000     5.0    52.35     3.56   53.9        6.51   65.09     4.63   54.28
NOTE: In this design 1588 is not enabled.

Setup Details
Host setup: Dell System Precision Tower 7910 (0619)
Iperf: iperf 3-CURRENT (cJSON 1.5.2)
OS : Linux 3.13.0-147-generic #196-Ubuntu SMP Wed May 2 15:51:34 UTC 2018 x86_64
NIC (10G Solarflare's SFN6322F Dual-Port 10GbE SFP+ Adapter) : Default

Performance benchmarking

Pre-requisites:
  • Set Ethernet MCDMA TX interrupt affinity to core-1
root@10g-mcdma-no1588-build:~# echo 2 > /proc/irq/xx/smp_affinity
  • Run iperf servers on ZynqMP (core2 and core3)
root@10g-mcdma-no1588-build:~# taskset -c 2 iperf3 -s -p 5101 &
root@10g-mcdma-no1588-build:~# taskset -c 3 iperf3 -s -p 5102 &
  • CPU Utilization reporting
root@10g-mcdma-no1588-build:~# ./mpstat -P ALL 1 50
  • Run iperf servers on the remote host
server:~# iperf3 -s -p 5101 & iperf3 -s -p 5102 & iperf3 -s -p 5103 & iperf3 -s -p 5104 &
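
The value written to smp_affinity above is a hexadecimal CPU bitmask, which is why 2 (bit 1 set) selects core 1. The helper below is a small sketch for building such a mask from CPU indices; the function name is illustrative only.

```shell
#!/bin/sh
# cpu_affinity_mask CPU...
# Hypothetical helper: prints the hexadecimal smp_affinity bitmask that
# selects the given zero-based CPU indices.
cpu_affinity_mask() {
    mask=0
    for cpu in "$@"; do
        # Set the bit corresponding to this CPU index.
        mask=$((mask | (1 << cpu)))
    done
    printf '%x\n' "$mask"
}
```

With this helper, the affinity step could be written as echo "$(cpu_affinity_mask 1)" > /proc/irq/xx/smp_affinity, where xx is still the IRQ number of the MCDMA TX interrupt on the target.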

Steps:

...

  • Set the CROSS_COMPILE environment variable to the ARM toolchain prefix
  • Install the kernel headers

          https://www.kernel.org/doc/Documentation/kbuild/headers_install.txt

  • Include the headers path in the Makefile

          INC = -I/proj/epdsw/punnaiah/git/test/ethernet/1588/header/include
          CFLAGS = -Wall $(VER) $(incdefs) $(DEBUG) $(INC) $(EXTRA_CFLAGS)

  • Run make

Execution steps

In order to perform master-slave sync, run the following:

Master (Linux server) : ptp4l -i <interface name> -m
Slave (Xilinx board) :  ptp4l -i <interface name> -m -s

NOTE: If desired, phc2sys -s <devicename> -w & can be run before synchronization to synchronize the system clock to the PTP hardware clock.

Synchronization stabilizes within a few seconds.
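
To judge when the slave has settled, the offsets printed by ptp4l -m can be extracted and compared against a threshold. The sketch below assumes the slave log contains lines of the form "master offset <ns> s<state> freq <ppb> path delay <ns>"; that format, the helper names and the threshold are assumptions and may need adjusting for the linuxptp version in use.

```shell
#!/bin/sh
# ptp_offsets
# Reads ptp4l -m output on stdin and prints the value that follows the
# words "master offset" on each line that has them (assumed to be in ns).
ptp_offsets() {
    awk '{
        for (i = 1; i + 2 <= NF; i++)
            if ($i == "master" && $(i + 1) == "offset") { print $(i + 2); break }
    }'
}

# ptp_is_stable LIMIT_NS
# Succeeds if the last offset read from stdin has an absolute value that
# does not exceed LIMIT_NS; fails if no offset line was seen.
ptp_is_stable() {
    last=$(ptp_offsets | tail -n 1)
    [ -n "$last" ] || return 1
    # Strip a leading minus sign to compare the magnitude.
    [ "${last#-}" -le "$1" ]
}
```

A captured log could then be checked with, for example, ptp_is_stable 100 < ptp4l.log, where ptp4l.log is a hypothetical file holding the slave output.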


Mainline status

The Axi Ethernet driver is currently in sync with mainline except for the following:

...

33ebfdb net: xilinx: axiethernet: Fix crash in ifconfig down
f5b9e58 net: xilinx: axiethernet: Fix axiethernet register description
e491e78 net: xilinx: axiethernet: Check for queue full in transmit path
0ba2b93 net: xilinx: axiethernet: Fix code checker warnings
d4c6c09 net: xilinx: axiethernet: Use %pa format specifier for phys_addr_t type
270968c net: xilinx: axiethernet: Add 64-bit support
d139077 net: xilinx: axiethernet: Extend clocking support
fdce589 net: xilinx: axiethernet: Fix kernel crash on MII ioctl
3f2d6cd net: xilinx: axiethernet: use channel-id for mcdma interrupt names
aaad9c0 net: xilinx: axiethernet: Fix netconsole implementation

2018.3
  • Sync kconfig description.
  • Return TX_BUSY on axienet_skb_tstsmp() failure.
  • Add error output when DMA allocation fails.
  • Fix memory leak in axienet bd_free().
  • Refactor and split axidma and mcdma programming in separate sources.
  • Fix dma name buffer size and skb_free in xmit.
  • Format XXV error output.
  • Fix compiler warnings.

...