XAPP1289 PCIe Root DMA

This page provides supplemental information for XAPP1289.

The Zynq UltraScale+ Controller for PCI Express has a built-in DMA engine that can be used in Endpoint as well as Root Port mode.
This application note provides an example that demonstrates how to configure and use the DMA in the Controller for PCI Express when configured as a Root Port.

This wiki provides instructions on how to build the hardware and software components of the design and how to test them.
The design requires Vivado 2016.1, PetaLinux 2016.1, a ZCU102 board, and a KCU105 card to be used as the Endpoint.

Design Build Instructions

This section captures hardware and software component build steps.
Design is available here

Directory Structure

  • ready_to_test folder contains prebuilt SD image binaries for the design
  • software/patch contains the patch that must be applied to the root port driver in the kernel. This patch sets up the DMA register aperture (DREG) and enables the DMA interrupts from the AXI-PCIe bridge in the AXI domain
  • software/hw_export contains the exported HDF
  • software/root_dma contains the source code for the Root DMA driver and application

Note: No hardware design is provided as it requires ZCU102 board support files available on the MPSoC lounge. Steps to build hardware after using the board preset template are provided below.

Hardware Design

This design configures the controller for PCI Express in Root mode with x4 Gen2 Link. The following provides step by step instructions with Vivado PCW snapshot.
This requires ZCU102 board support files which can be obtained from Zynq UltraScale+ MPSoC lounge.
  • Create Vivado project with ZCU102 board support template. Create a block design, add Zynq UltraScale+ MPSoC IP.
  • Click on the "Run Block Automation" ribbon and let it apply board preset settings. Once the ZCU102 preset settings are done, make changes as below for x4 Gen2 PCIe root port.
  • Double click on Zynq UltraScale+ MPSoC to customize the configuration.
  • On the IO Configuration page, disable USB 3.0 and SATA and enable PCIe for 4 lanes.
  • On the Clock configuration page, select 100MHz reference clock for all 4 GTR lanes.
  • On the PS-PL Configuration page, disable HPM-LPD master interface as it is not being used here.

  • On Advanced configuration page, set the reference clocks to REFCLK0 as shown

  • On the PCIe Configuration page, change the link width and speed to x4 Gen2. Set the device ID to 0xD024; the rest of the settings can be retained as is.

  • This completes PCW customization for Root mode.
  • Save the block design and validate it.
  • Create HDL wrapper and Export Hardware to get HDF file which can be consumed by PetaLinux for building the various components.

Note: No build steps or source code are provided for the KCU105 Endpoint design. It uses the Xilinx AXI Bridge for PCIe Gen3 IP with its AXI interface connected to DDR4 (via the MIG IP). An AXI Performance Monitor (APM) is added on this AXI interface.
Address translations are set as follows:
  • BAR2 to MIG-DDR4 (used for Root DMA transfers)
  • BAR4 to APM registers (used for performance measurement)

Software Build

Follow the instructions below to create the software binaries.
$XAPP1289 is used to refer to the path where you copied the design zip file
$PETALINUX_PROJECT_PATH is used to refer to the path of your PetaLinux project.

Important Note
The PS-DDR APM on the ZCU102 needs to be brought out of reset so that the Root DMA drivers can poll the APM for performance numbers. This can be done in one of two ways:
  • Add the code as an FSBL hook in xfsbl_hooks.c. The following can be added under the XFsbl_HookBeforeHandoff function.

//- Bring PS-DDR APM out of Reset
XFsbl_Out32(0xFD1A0108, 0x3);
  • Alternatively, the same can be done from Linux using devmem before inserting the Root DMA drivers
$ devmem 0xFD1A0108 32 0x3

Building Kernel

  1. Create a project using the ZCU102 BSP from PetaLinux.
> petalinux-create -t project -s $PETALINUX_INSTALL/Xilinx-ZCU102-2016.1.bsp
  2. Clone the kernel, copy the provided patch to the cloned kernel folder, and apply it.
> cd Xilinx-ZCU102-2016.1/components
> mkdir linux-kernel
> cd linux-kernel
> git clone git://github.com/Xilinx/linux-xlnx.git
> cd linux-xlnx
> git checkout xilinx-v2016.1.01
> git am 0001-PATCH-PCI-Xilinx-NWL-PCIe-Bridge-Adding-DREG-capabil.patch
  3. Configure PetaLinux to use the cloned kernel.
> petalinux-config
  4. Import the HDF exported from Vivado.
> petalinux-config --get-hw-description=<Path to folder containing HDF>
5. Configure rootfs for the following:

6. Device-tree update: Add the following to $PETALINUX_PROJECT_PATH/Xilinx-ZCU102-2016.1/subsystem/linux/configs/device-tree/zynqmp.dtsi
        /* PCIe DMA node */
        pci_dma: pdma@fd0f0000 {
          compatible = "xlnx,pcie_dma-1.00.a";
          reg = <0x0 0xfd0f0000 0x0 0x1000>,
                <0x0 0xfd0b0000 0x0 0x1000>;
          reg-names = "dma_reg", "apm_reg";
          interrupts = <0 117 4>;
          interrupt-names = "pcie_dma";
          interrupt-parent = <&gic>;
        };
7. Build kernel
> petalinux-build
8. Create the BIN file for SD boot using the BIF file provided under the $XAPP1289/software/util folder. Copy this BIF file to $PETALINUX_PROJECT_PATH/images/linux and ensure the XSDK environment is set up so that bootgen is available, then run bootgen from that folder:
> bootgen -arch zynqmp -image bootsd.bif -o boot.bin

Building Root DMA Drivers

This design uses loadable kernel modules and this section describes how to build the Root DMA driver modules and associated application.
In the steps below, $KERNEL_BUILD_PATH refers to the kernel build directory from the PetaLinux build above, that is, $PETALINUX_PROJECT_PATH/build/linux/kernel/linux-xlnx

Ensure that the PetaLinux build environment is set up and define CROSS_COMPILE as aarch64-linux-gnu- (this is also set in the Makefile provided).
  1. Modify the KDIR path in the Makefile.variable file provided ($XAPP1289/software/root_dma/) to reflect $KERNEL_BUILD_PATH
  2. Navigate to $XAPP1289/software/root_dma and build the kernel modules and application by using 'make'
  3. This should build the following
    • xdma/root_dma_pcie.ko
    • Appdriver/XRaw_Data0.ko
    • Appdriver/XRaw_Data1.ko
  4. Navigate to $XAPP1289/software/root_dma/root_dma_test_app/ and build the application by using 'make'. This builds Root_dma_test_App
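
Before copying files to the SD card, it can help to confirm that all of the build outputs listed above exist. The following is a minimal sketch, not part of the XAPP1289 package: check_root_dma_outputs is a hypothetical helper name, and it assumes that Root_dma_test_App is produced inside the root_dma_test_app folder where 'make' was run.

```shell
#!/bin/sh
# Sketch: report which of the expected Root DMA build outputs are missing.
# check_root_dma_outputs is a hypothetical helper, not shipped with the design.
# $1 is expected to be $XAPP1289/software/root_dma
check_root_dma_outputs() {
    root_dir=$1
    missing=0
    for f in \
        xdma/root_dma_pcie.ko \
        Appdriver/XRaw_Data0.ko \
        Appdriver/XRaw_Data1.ko \
        root_dma_test_app/Root_dma_test_App   # assumed output location
    do
        if [ ! -f "$root_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Returns 0 when every file exists, 1 otherwise
    return "$missing"
}
```

For example, check_root_dma_outputs "$XAPP1289/software/root_dma" prints one "missing:" line per absent file and returns non-zero if any are absent.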

Setup Details

The XAPP uses ZCU102 as Root port and KCU105 as Endpoint. Bitstream for Endpoint design is provided (source code for this design is not provided).
  1. Connect the setup as shown below.

Test Instructions

Things to know before running the design:
  • The SD-MMC card has to be formatted as FAT32 using a SD-MMC card reader.
  • To use the prebuilt binaries provided, copy the entire folder content from $XAPP1289/ready_to_test/zcu102_sd onto the primary partition of the SD-MMC.
  • Alternatively, you can use the images built with the instructions above and copy the following to the SD card:
    • boot.bin created above
    • image.ub (from $PETALINUX_PROJECT_PATH/images/linux/)
    • Kernel modules and application executable from $XAPP1289/software/root_dma
  • Petalinux console login details
User : root
Password : root

  1. Connect the setup as shown above.
  2. Set the ZCU102 to boot from SD (SW6 set to the 0101 configuration) and connect the USB-UART cable. Set up a terminal program such as Tera Term on the laptop at 115200 baud to view the UART output.
  3. Power on the KCU105 Endpoint card and program it with the provided bitfile, $XAPP1289/ready_to_test/kcu105_axi_pcie.bit
  4. Then power on the ZCU102 and let Linux boot (boot messages appear on the UART terminal).
  5. Once Linux boots, log in and run 'lspci' to list the PCI devices. It should show the PCIe root port on the MPSoC and the KCU105 Endpoint. Check the Link Status to confirm that the link trained at x4 Gen2.
> Xilinx-ZCU102-2016_1 login: root
> Password:
> login[1919]: root login on 'ttyPS0'
> root@Xilinx-ZCU102-2016_1:~# lspci
> 00:00.0 PCI bridge: Xilinx Corporation Device d024
> 01:00.0 Memory controller: Xilinx Corporation Device 8024
To get verbose information on the Endpoint, run 'lspci -d 10ee:8024 -vv'
> 01:00.0 Memory controller: Xilinx Corporation Device 8024
> Subsystem: Xilinx Corporation Device 0007
> Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Interrupt: pin A routed to IRQ 223
> Region 0: Memory at e1120000 (64-bit, non-prefetchable) [size=64K]
> Region 2: Memory at e1000000 (64-bit, non-prefetchable) [size=1M]
> Region 4: Memory at e1100000 (64-bit, non-prefetchable) [size=128K]
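
The Link Status check can also be scripted. The sketch below is an illustration rather than part of the design: link_status and is_x4_gen2 are hypothetical helper names, and it assumes that the verbose 'lspci' output includes a PCI Express capability line of the form "LnkSta: Speed 5GT/s, Width x4, ..." (that line is not shown in the excerpt above). Gen2 corresponds to a speed of 5GT/s.

```shell
#!/bin/sh
# Sketch: extract negotiated speed and width from `lspci -vv` text on stdin.
# Both helpers are hypothetical and not part of the XAPP1289 package.

# Prints "<speed> <width>" taken from the first LnkSta line, e.g. "5GT/s x4"
link_status() {
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([^ ,]*\).*Width \(x[0-9]*\).*/\1 \2/p' | head -n 1
}

# Succeeds only if the link trained at Gen2 (5GT/s) with 4 lanes
is_x4_gen2() {
    [ "$(link_status)" = "5GT/s x4" ]
}
```

A possible use at the ZCU102 prompt is: lspci -d 10ee:8024 -vv | is_x4_gen2 && echo "x4 Gen2 link"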
6. Mount the SD card at /mnt, navigate to the driver folder on the SD card, and insert the drivers.
> mount /dev/mmcblk0 /mnt
> cd /mnt/root_dma
> root@Xilinx-ZCU102-2016_1:/mnt/root_dma# insmod root_dma_pcie.ko
> root@Xilinx-ZCU102-2016_1:/mnt/root_dma# insmod XRaw_Data0.ko
> root@Xilinx-ZCU102-2016_1:/mnt/root_dma# insmod XRaw_Data1.ko
7. Run the test to see performance
The MPS is 128B with the AXI-PCIe Bridge Burst size on MPSoC configured as 128B.

DMA channel-0 and channel-2 for System to Card traffic (Root DMA pushing data into Endpoint)
> root@Xilinx-ZCU102-2016_1:~/mnt# ./Root_dma_test_App 2 CH 32768 32768 30s i 60s
> Read bandwidth: 13.069 Gbps, Write bandwidth: 0.715 Gbps
> PCIE Endpoint DDR APM:
> Write bandwidth: 11.906 Gbps, Read bandwidth: 0.000 Gbps
DMA channel-1 and channel-3 for Card to System traffic (Root DMA pulling data from Endpoint)
> root@Xilinx-ZCU102-2016_1:~/mnt# ./Root_dma_test_App 2 GN 32768 32768 30s i 60s
> Read bandwidth: 1.346 Gbps, Write bandwidth: 12.365 Gbps
> PCIE Endpoint DDR APM:
> Write bandwidth: 0.000 Gbps, Read bandwidth: 11.488 Gbps
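
As a rough sanity check, the Endpoint APM bandwidth can be compared against the x4 Gen2 link rate. A Gen2 lane runs at 5 GT/s with 8b/10b encoding, so four lanes carry at most 16 Gbps before TLP and DLLP overhead is taken into account. The sketch below uses a hypothetical helper, pcie_efficiency, which is not part of the XAPP1289 package, and assumes awk is available on the machine where it is run.

```shell
#!/bin/sh
# Sketch: express a measured bandwidth (in Gbps) as a percentage of the
# x4 Gen2 link rate. pcie_efficiency is a hypothetical helper name.
pcie_efficiency() {
    awk -v bw="$1" 'BEGIN {
        lanes = 4        # x4 link
        gts   = 5.0      # Gen2: 5 GT/s per lane
        enc   = 8 / 10   # 8b/10b encoding efficiency
        max   = lanes * gts * enc   # 16 Gbps before packet overhead
        printf "%.1f\n", 100 * bw / max
    }'
}
```

With the Endpoint APM figures above, pcie_efficiency 11.906 prints 74.4 and pcie_efficiency 11.488 prints 71.8. The remaining gap includes per-packet header overhead, which is relatively large at a 128B MPS.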


  • 'lspci' does not show the Endpoint card
  1. Ensure the Endpoint card is seated properly in the PCIe slot on the ZCU102.
  2. Check that the KCU105 Endpoint board has been programmed with the provided bitfile.
  3. At the ZCU102 Linux prompt, read the PCIe status register by running 'devmem 0xFD480228'. If bit[0] is set, the link is up.
  4. If the PCIe status register reports that the link is up but the device does not appear in 'lspci':
    • Check that the Endpoint card was programmed before the ZCU102 booted.
    • Try rescanning the PCIe bus with the following commands at the ZCU102 Linux prompt:

$ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/remove
$ echo 1 > /sys/bus/pci/rescan
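
The bit[0] test on the PCIe status register can be done in the shell instead of by eye. This is a minimal sketch: link_is_up is a hypothetical helper name, and it assumes devmem prints the register value as a hexadecimal string such as 0x00000001.

```shell
#!/bin/sh
# Sketch: succeed when bit[0] of the given register value is set.
# link_is_up is a hypothetical helper, not part of the XAPP1289 package.
# $1: register value as printed by devmem, e.g. 0x00000001
link_is_up() {
    [ $(( $1 & 1 )) -ne 0 ]
}
```

A possible use at the ZCU102 Linux prompt is: link_is_up "$(devmem 0xFD480228)" && echo "PCIe link is up"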
