Zynq UltraScale+ MPSoC VCU TRD 2020.1 - PCIe
This page provides detailed information related to Design Module 5 - VCU TRD PCIe design.
Table of Contents
- 1 Overview
- 1.1 Board Setup
- 1.2 Run Flow
- 1.2.1 HOST PACKAGE
- 1.2.2 Steps to run use cases
- 1.2.3 HOST APPLICATION
- 1.2.4 DEVICE APPLICATION
- 1.3 Build Flow
- 2 Other Information
1 Overview
The primary goal of this design is to demonstrate the file-based VCU transcode, encode, and decode capabilities over PCIe present in Zynq UltraScale+ EV devices.
This design supports the following interfaces:
VCU Codec:
Video Encode/Decode capability using VCU hard block in PL
AVC/HEVC encoding.
Encoder/decoder parameter configuration.
Communication Interface:
PCIe
Video format:
NV12
NV16
XV15
XV20
Supported Resolution:
The table below lists the resolutions supported in this design; they are exercised through the command-line application only.
Resolution | Command Line |
---|---|
Single Stream | |
4kp60 | √ |
4kp30 | √ |
1080p60 | √ |
1080p30 | √ |
720p30 | √ |
√ - Supported
NA - Not applicable
x - Not supported
Hardware Overview
This design uses the PCI Express (PCIe®) Endpoint block in an x4 Gen3 configuration along with the DMA/Bridge Subsystem for PCI Express for data transfers between the host system memory and the Endpoint.
The DMA/Bridge Subsystem for PCI Express provides protocol conversion between PCIe Transaction Layer Packets (TLPs) and AXI transactions. The hardware scatter-gather list (SGL) DMA interface is exercised to handle buffer management at the Endpoint and enable the memory-mapped interface.
The downstream AXI4-Lite slaves include userspace registers, which are responsible for the handshaking mechanism between the host and the endpoint.
In the system-to-card direction, the DMA block moves data from the host memory to the PL side through PCIe and then writes the data to PS-DDR via the AXI-MM interface. The VCU IP then reads the data from PS-DDR, performs video encoding/decoding, and writes the result back to the same memory. Finally, in the card-to-system direction, the DMA reads PS-DDR via the AXI-MM interface and writes to the host system memory through PCIe.
Figure 1: VCU PCIe Hardware Block Diagram
Components, Features, and Functions
4-lane integrated PCIe block with a maximum link speed of 8 GT/s (GT/s is Giga Transfers per second)
128-bit at 250 MHz
DMA/Bridge Subsystem for PCIe
AXI Memory-mapped enabled
One DMA read (H2C) channel and one DMA write (C2H) channel
Apart from the PCIe-related IPs, the design contains the VCU IP.
Software
The figure below shows the PCIe software block diagram.
1.1 Board Setup
Refer to the link below for the board setup.
1.2 Run Flow
The TRD package is released with the source code, Vivado project, PetaLinux BSP, the host software required for PCIe, and an SD card image that enables the user to run the demonstration. It also includes the binaries necessary to configure and boot the ZCU106 board. Prior to running the steps mentioned in this wiki page, download the TRD package and extract its contents to a directory referred to as 'TRD_HOME', the home directory of the TRD.
Refer to Section 4.1: Download the TRD of the Zynq UltraScale+ MPSoC VCU TRD 2020.1 wiki page to download all TRD contents.
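For example, assuming the package was downloaded as a .zip archive (the archive name and extraction location below are illustrative), it can be extracted and TRD_HOME set as follows:
$ unzip rdf0428-zcu106-vcu-trd-2020-1.zip
$ export TRD_HOME=$(pwd)/rdf0428-zcu106-vcu-trd-2020-1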
TRD package contents are placed in the following directory structure.
rdf0428-zcu106-vcu-trd-2020-1
├── apu
│ └── vcu_petalinux_bsp
├── images
│ ├── vcu_10g
│ ├── vcu_audio
│ ├── vcu_hdmi_multistream_xv20
│ ├── vcu_hdmi_rx
│ ├── vcu_hdmi_tx
│ ├── vcu_llp2_hdmi_nv12
│ ├── vcu_llp2_hdmi_nv16
│ ├── vcu_llp2_hdmi_xv20
│ ├── vcu_llp2_sdi_xv20
│ ├── vcu_multistream_nv12
│ ├── vcu_pcie
│ ├── vcu_sdirx
│ ├── vcu_sditx
│ └── vcu_sdi_xv20
├── pcie_host_package
│ ├── COPYING
│ ├── include
│ ├── libxdma
│ ├── LICENSE
│ ├── readme.txt
│ ├── RELEASE
│ ├── tools
│ ├── test
│ └── xdma
├── pl
│ ├── constrs
│ ├── designs
│ ├── prebuild
│ ├── README.md
│ └── srcs
└── README.txt
TRD package contents specific to VCU PCIe design are placed in the following directory structure.
rdf0428-zcu106-vcu-trd-2020-1
├── apu
│ └── vcu_petalinux_bsp
│ └── xilinx-vcu-zcu106-v2020.1-final.bsp
├── images
│ ├── vcu_pcie
│ │ ├── autostart.sh
│ │ ├── BOOT.BIN
│ │ ├── boot.scr
│ │ ├── image.ub
│ │ ├── system.dtb
│ │ └── vcu
├── pcie_host_package
├── pl
│ ├── constrs
│ ├── designs
│ │ ├── zcu106_pcie
│ ├── prebuild
│ │ ├── zcu106_pcie
│ ├── README.md
│ └── srcs
│ ├── hdl
│ └── ip
└── README.txt
HOST PACKAGE
The PCIe host application (pcie_host_app) supports the transcode, decode, and encode use cases.
Transcode
The host application reads an input (.mp4 or .ts) file on the HOST machine and sends it to the ZCU106 board, which is connected as an endpoint device in the PCIe slot of the HOST machine. The received data is decoded, then re-encoded with the requested encoder type and muxed with mpegtsmux using the VCU hardware. The transcoded data is written back to the HOST machine as a .ts file.
NV12, NV16, XV15, and XV20 formats are supported; the desired format must be specified. For the NV12 format with the AVC encoder type, three profiles are supported: Baseline, High, and Main. For the NV12 format with the HEVC encoder type, the application selects a default profile. For NV16, XV15, and XV20, no profile argument is needed; the profile is selected by the application.
Decode
The host application reads an input (.mp4 or .ts) file on the HOST machine and sends it to the ZCU106 board, which is connected as an endpoint device in the PCIe slot of the HOST machine. The received data is decoded using the VCU hardware, and the decoded data is written back to the HOST machine as a .yuv file.
NV12, NV16, XV15, and XV20 formats are supported. There is no need to specify the format; the decoder handles it based on the given input file.
Encode
The host application reads an input (.yuv) file on the HOST machine and sends it to the ZCU106 board, which is connected as an endpoint device in the PCIe slot of the HOST machine. The received data is encoded with the requested encoder type and muxed with mpegtsmux using the VCU hardware, and the encoded data is written back to the HOST machine as a .ts file.
NV12, NV16, XV15, and XV20 formats are supported; the desired format must be specified. For the NV12 format with the AVC encoder type, three profiles are supported: Baseline, High, and Main. For the NV12 format with the HEVC encoder type, the application selects a default profile. For NV16, XV15, and XV20, no profile argument is needed; the profile is selected by the application. Since the main aim of this design is to showcase the VCU capability, no additional mux options are provided.
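For reference, the device-side processing in the transcode case is conceptually similar to the GStreamer pipeline sketched below. The file names and element choices are illustrative only; the actual pcie_transcode application exchanges the data with the host over the PCIe DMA channels rather than local files.
$ gst-launch-1.0 filesrc location=input.ts ! tsdemux ! h264parse ! omxh264dec \
    ! omxh265enc ! h265parse ! mpegtsmux ! filesink location=output.ts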
The files in the pcie_host_package directory provide the Xilinx PCIe DMA drivers and example software used to exercise file transfers over the Xilinx PCIe DMA IP and to run the transcode, encode, or decode use cases using the Xilinx VCU IP on the ZCU106 board.
Directory and file description:
Directory | File description |
---|---|
xdma/ | Contains the Xilinx PCIe DMA kernel module driver files |
libxdma/ | Contains support files for the kernel driver module, which interfaces directly with the XDMA IP |
include/ | Contains all include files required for compiling drivers |
etc/ | Contains rules for the Xilinx PCIe DMA kernel module and software. The files in this directory should be copied to the /etc/ directory on your Linux system. |
tools/ | Contains example application software and PCIe host application to exercise the provided kernel module driver and Xilinx PCIe DMA IP |
test/ | Contains example application software to exercise the provided kernel module driver and Xilinx PCIe DMA IP |
Steps to run use cases
- Copy all the files from $TRD_HOME/images/vcu_pcie/ to a FAT32-formatted SD card.
- Insert the ZCU106 board into the PCIe slot of the HOST machine and power on the board; then power on the HOST machine.
- Make sure the ZCU106 board is powered on before booting the HOST machine so that the board enumerates successfully as a PCIe endpoint device.
- Execute the "lspci" command in the HOST machine's terminal and make sure that the "Processing accelerators: Xilinx Corporation Device a883" and "Processing accelerators: Xilinx Corporation Device a884" entries are listed; otherwise the XDMA driver will not be able to recognize the PCIe endpoint device and throws an error like "Error: The Kernel module installed correctly, but no devices were recognized" (an example check is shown after the installation commands below).
- Copy the host package onto an Ubuntu 18.04 machine.
- Run the commands below to install the XDMA driver and compile the host application (pcie_host_app). Root permissions are required to install the xdma driver. The xdma driver transfers data in chunks of up to 512 KB.
$ cd $TRD_HOME/pcie_host_package
$ cd xdma
$ make
$ make install
$ insmod xdma.ko
$ cd ../tools
$ make
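Once the driver is loaded, a quick sanity check (the commands below are illustrative) is to confirm that the Xilinx endpoint is visible on the PCIe bus and that the XDMA character devices listed in the next table have been created:
$ lspci -d 10ee:     # list PCIe devices with the Xilinx vendor ID (10ee)
$ ls /dev/xdma0_*    # character devices created by the xdma driver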
The Host software consists of the XDMA module with the following user access devices.
Devices | Access |
---|---|
xdma0_control | To access XDMA registers |
xdma0_xvc | To access userspace registers from HOST |
xdma0_user | To access AXI-Lite Master interface |
xdma0_bypass | To access DMA-Bypass interface |
xdma0_h2c_0, xdma0_c2h_0 | To access each channel |
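As a simple illustration of the H2C and C2H channel devices (the file names and sizes here are only an example; the actual use cases are driven by pcie_host_app, which also performs the handshaking with the endpoint), a raw buffer can be pushed to and pulled from the endpoint memory with dd:
$ dd if=test.bin of=/dev/xdma0_h2c_0 bs=512K count=1    # host-to-card transfer
$ dd if=/dev/xdma0_c2h_0 of=out.bin bs=512K count=1     # card-to-host transfer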
Finally, run pcie_host_app on the HOST machine and then run the pcie_transcode device application on the ZCU106 board to initiate the transfer.
HOST APPLICATION
Run the command below to initiate a file transfer from the HOST machine and transcode, encode, or decode it on the ZCU106 device. After running the application on the HOST, the user needs to start the device application (pcie_transcode) on the ZCU106 target to initiate the transfer.
The user is given the following options for transcoding, encoding, or decoding the file:
DEVICE APPLICATION
After booting the ZCU106 board with the SD card images, to run the transcode, encode, or decode use case, first run the host-side application mentioned above and then run the device application on the ZCU106 device with the commands mentioned below. The host application sends the file data to the device, which transcodes, encodes, or decodes it on the ZCU106; the host application then receives the processed data and saves it on the host machine.
1.3 Build Flow
Refer to the link below for the build flow.
Zynq UltraScale+ MPSoC VCU TRD 2020.1 - Build Flow
2 Other Information
2.1 Known Issues
For PetaLinux related known issues, please refer to PetaLinux 2020.1 - Product Update Release Notes and Known Issues.
For VCU related known issues, please refer to AR# 66763: LogiCORE H.264/H.265 Video Codec Unit (VCU) - Release Notes and Known Issues and Xilinx Zynq UltraScale+ MPSoC Video Codec Unit.
2.2 Limitations
For PetaLinux related limitations, please refer to PetaLinux 2020.1 - Product Update Release Notes and Known Issues.
For VCU related limitations, please refer to AR# 66763: LogiCORE H.264/H.265 Video Codec Unit (VCU) - Release Notes and Known Issues, Xilinx Zynq UltraScale+ MPSoC Video Codec Unit, and PG252.
The xdma driver can transfer data with a maximum chunk size of 512 KB.
2.3 Optimum VCU Encoder parameters for use-cases:
Quality: Low bitrate AVC encoding:
Enable profile=high and use qp-mode=auto for low-bitrate encoding use-cases. The high profile enables the 8x8 transform, which results in better video quality at low bitrates.
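As an illustration only (this pipeline is not taken from the TRD; the input file, resolution, and bitrate are assumptions, while qp-mode, control-rate, and target-bitrate are properties of the Xilinx GStreamer OMX H.264 encoder), a low-bitrate AVC encode with these settings could look like:
$ gst-launch-1.0 filesrc location=input.yuv \
    ! rawvideoparse format=nv12 width=1920 height=1080 framerate=30/1 \
    ! omxh264enc qp-mode=auto control-rate=constant target-bitrate=1000 \
    ! video/x-h264,profile=high ! filesink location=out.h264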
© Copyright 2019 - 2022 Xilinx Inc.