Zynq UltraScale+ MPSoC VCU TRD 2019.2 - VCU TRD: Multi Stream


1 Overview

The primary goal of this design is to demonstrate the capabilities of the VCU hard block present in Zynq UltraScale+ EV devices. The TRD serves as a platform for tuning the performance parameters of the VCU and arriving at optimal configurations for the encoder and decoder blocks.

This design supports the following video interfaces:

Sources:

Sinks:

VCU Codec:

Streaming Interfaces:

Video format:


The below figure shows the TRD block diagram.

Other features:

Supported Resolution:

The table below lists the resolutions supported by the GUI and the command-line application in this design.

Resolution | GUI: Single Stream | Command Line: Single Stream | Command Line: Multi-stream
4kp60XNA
4kp30√ (Max 2)
1080p60√ (Max 4)
1080p30X√ (Max 8)


√ - Supported
NA - Not applicable
X - Not supported
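
As a quick sanity check on the multi-stream limits above (Max 2, Max 4, Max 8), the sketch below computes the aggregate pixel rate for each case. The function name pixel_rate is hypothetical and not part of the TRD, and the frame sizes (3840x2160 and 1920x1080) are taken from the Width and Height options in Appendix A. The result is an arithmetic observation only, not a statement from the design documentation: all four cases amount to the same number of pixels per second.

```shell
#!/bin/sh
# Hypothetical helper: aggregate pixel rate (pixels per second)
# for N streams of WIDTH x HEIGHT at FPS frames per second.
pixel_rate() {
    # $1 = number of streams, $2 = width, $3 = height, $4 = frames per second
    echo $(( $1 * $2 * $3 * $4 ))
}

pixel_rate 1 3840 2160 60   # one 4kp60 stream      -> 497664000
pixel_rate 2 3840 2160 30   # two 4kp30 streams     -> 497664000
pixel_rate 4 1920 1080 60   # four 1080p60 streams  -> 497664000
pixel_rate 8 1920 1080 30   # eight 1080p30 streams -> 497664000
```

Each call prints 497664000, which may be a convenient way to reason about why the stream count doubles each time the resolution or frame rate halves.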


The table below lists the features supported in this design.

Pipeline | Input source | Output Type | Resolution | VCU codec
Capture --> Display (Passthrough pipeline) | HDMI-Rx/MIPI/TPG | HDMI-Tx/DP | 4KP/1080p | None
Single Stream: Capture --> SCD --> Encode --> Decode --> Display | HDMI-Rx/MIPI/TPG | HDMI-Tx/DP | 4KP/1080p | HEVC/AVC
Multi-Stream (2 input sources): Capture --> SCD --> Encode --> Decode --> Display | HDMI-Rx/MIPI/TPG | HDMI-Tx | 4KP30 | HEVC/AVC
Multi-Stream (4 input sources): Capture --> SCD --> Encode --> Decode --> Display | HDMI-Rx/MIPI/TPG | HDMI-Tx | 1080P60 | HEVC/AVC
Multi-Stream (8 input sources): Capture --> SCD --> Encode --> Decode --> Display | 7-HDMI-Rx + 1 MIPI | HDMI-Tx | 1080P30 | HEVC/AVC
Single Stream: Record/Stream-Out pipeline | HDMI-Rx/MIPI/TPG | File Sink/Stream-Out | 4K/1080p | HEVC/AVC
Multi-Stream (2 or 4 input sources): Record/Stream-Out pipeline | HDMI-Rx/MIPI/TPG | File Sink/Stream-Out | 2-4KP30/4-1080p60 | HEVC/AVC
Multi-Stream (8 input sources): Record/Stream-Out pipeline | 7-HDMI-Rx + 1 MIPI | File Sink/Stream-Out | 8-1080P30 | HEVC/AVC
File/Streaming Playback pipeline | File Source/Stream-In | HDMI-Tx/DP | 4K/1080p | HEVC/AVC

NOTE: DP supports a maximum resolution of 4kp30. TPG does not support the 1080p30 resolution mode.


The below figure shows the VCU TRD design hardware block diagram.


VCU TRD Design hardware block diagram

The below figure shows the VCU TRD design software block diagram.

VCU TRD Design software block diagram

1.1 Board Setup

Refer to the link below for board setup.


1.2 Run Flow

The TRD package is released with the source code, Vivado project, PetaLinux BSP, and SD card image that enable the user to run the demonstration. It also includes the binaries necessary to configure and boot the ZCU106 board. Before running the steps on this wiki page, download the TRD package and extract its contents to a directory referred to as 'TRD_HOME', which is the home directory.

Refer to the link below to download all TRD contents.

TRD package contents are placed in the following directory structure. Copy all the files from $TRD_HOME/images/vcu_multistream_nv12/ to a FAT32-formatted SD card.

└── rdf0428-zcu106-vcu-trd-2019-2
	├── apu
	│   └── vcu_petalinux_bsp
	├── images
	│   ├── vcu_10g
	│   ├── vcu_audio
	│   ├── vcu_hdmi_multistream_xv20
	│   ├── vcu_hdmi_rx
	│   ├── vcu_hdmi_tx
	│   ├── vcu_llp2_hdmi_nv12
	│   ├── vcu_llp2_hdmi_nv16
	│   ├── vcu_llp2_hdmi_xv20
	│   ├── vcu_llp2_sdi_xv20
	│   ├── vcu_multistream_nv12
	│   ├── vcu_pcie
	│   ├── vcu_sdirx
	│   ├── vcu_sditx
	│   └── vcu_sdi_xv20
	├── pcie_host_package
	│   ├── COPYING
	│   ├── etc
	│   ├── include
	│   ├── libxdma
	│   ├── LICENSE
	│   ├── README.md
	│   ├── tools
	│   └── xdma
	├── pl
	│   ├── constrs
	│   ├── designs
	│   ├── prebuild
	│   ├── README.md
	│   └── srcs
	└── README.txt


TRD package contents specific to VCU TRD design are placed in the following directory structure.

└── rdf0428-zcu106-vcu-trd-2019-2
	├── apu
	│   └── vcu_petalinux_bsp
	├── images
	│   └── vcu_multistream_nv12
	│       ├── autostart.sh
	│       ├── bin
	│       ├── BOOT.BIN
	│       ├── config
	│       ├── image.ub
	│       ├── system.dtb
	│       └── vcu
	├── pcie_host_package
	│   ├── COPYING
	│   ├── etc
	│   ├── include
	│   ├── libxdma
	│   ├── LICENSE
	│   ├── README.md
	│   ├── tools
	│   └── xdma
	├── pl
	│   ├── constrs
	│   ├── designs
	│   ├── prebuild
	│   ├── README.md
	│   └── srcs
	└── README.txt
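
The copy step described above can be sketched as a small shell helper. This is a minimal sketch, not part of the TRD package: the function name copy_trd_images and the SD card mount point in the usage comment are hypothetical, and the source directory is assumed to be the design-specific image folder (vcu_multistream_nv12 in the tree above).

```shell
#!/bin/sh
# Hypothetical helper: copy the contents of an image directory to a mounted
# SD card. Both arguments are assumptions supplied by the caller.
copy_trd_images() {
    src=$1   # e.g. $TRD_HOME/images/vcu_multistream_nv12
    dst=$2   # mount point of the FAT32-formatted SD card on the host
    [ -d "$src" ] || { echo "missing image directory: $src" >&2; return 1; }
    [ -d "$dst" ] || { echo "SD card not mounted at: $dst" >&2; return 1; }
    # "src/." copies the directory contents rather than the directory itself.
    cp -r "$src"/. "$dst"/ && sync
}

# Usage (the mount point is an example only):
# copy_trd_images "$TRD_HOME/images/vcu_multistream_nv12" /media/$USER/SDCARD
```

The sync call flushes pending writes, which is worth doing before the card is unmounted and removed.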

The snippet below shows the configuration files (input.cfg) for running the various multi-stream Display, Record, and Streaming use cases. These configuration files are placed in the images folder mentioned above. On the board, the directory structure under /media/card is:

config/
├── 1-4kp60
│   ├── Display
│   ├── Record
│   ├── Stream-out
│   └── Stream-in
├── 2-4kp30
│   ├── Display
│   ├── Record
│   ├── Stream-out
│   └── Stream-in 
├── 4-1080p60
│   ├── Display
│   ├── Record
│   ├── Stream-out
│   └── Stream-in 
├── 8-1080p30 
│   ├── Display 
│   ├── Record
│   ├── Stream-out 
│   └── Stream-in 
└── input.cfg

1.2.1 GStreamer Application (vcu_gst_app)

The vcu_gst_app is a multi-threaded, command-line Linux application. It requires an input configuration file (input.cfg) in plain text.

For the VCU TRD Multi-stream design, run the following modetest command manually in the background to set the CRTC configuration before executing vcu_gst_app.

% modetest -D a0070000.v_mix -s 41:3840x2160-60@BG24

Execution of the application is shown below:

% vcu_gst_app <path to *.cfg file>


Example:

4kp60 HEVC_HIGH Display Pipeline execution

% vcu_gst_app /media/card/config/1-4kp60/Display/Single_4kp60_HEVC_HIGH.cfg

4kp60 HEVC_HIGH Record Pipeline execution

% vcu_gst_app /media/card/config/1-4kp60/Record/Single_4kp60_HEVC_HIGH.cfg

4kp60 HEVC_HIGH Stream-out Pipeline execution

% vcu_gst_app /media/card/config/1-4kp60/Stream-out/Single_4kp60_HEVC_HIGH.cfg

4kp60 HEVC_HIGH Stream-in Pipeline execution

% vcu_gst_app /media/card/config/1-4kp60/Stream-in/input.cfg

NOTE: Make sure HDMI-Rx is configured for 4kp60 mode.

To measure the latency of the pipeline, run the command below. The tracer output is large, so redirect it to a file.

% GST_DEBUG="GST_TRACER:7" GST_TRACERS="latency;scheduletime" ./vcu_gst_app ./input.cfg >& dump_log.txt
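
Once dump_log.txt has been written, the latency records can be summarized with a short filter. The sketch below is an assumption-laden example rather than a TRD tool: the function name latency_summary is hypothetical, and it assumes each latency record has the form "latency, ..., time=(guint64)<nanoseconds>, ...", which should be checked against the actual log before relying on the numbers.

```shell
#!/bin/sh
# Hypothetical helper: summarize GST_TRACER latency records read on stdin.
# Assumes lines of the form:
#   ... GST_TRACER :0:: latency, src=..., sink=..., time=(guint64)NNN, ts=...;
# where NNN is the latency in nanoseconds.
latency_summary() {
    sed -n 's/.* latency, .*time=(guint64)\([0-9][0-9]*\).*/\1/p' |
    awk '
        NR == 1 { min = $1; max = $1 }
        { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
        END {
            if (NR == 0) { print "no latency records found"; exit 1 }
            printf "samples=%d min=%.3fms max=%.3fms mean=%.3fms\n",
                   NR, min / 1e6, max / 1e6, sum / NR / 1e6
        }'
}

# Usage on the board:
# latency_summary < dump_log.txt
```

Records from the scheduletime tracer are ignored by the filter, since only lines containing " latency, " are matched.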


Refer to the link below for detailed run flow steps.


1.3 Build Flow

Refer to the link below for detailed build flow steps.



2 Other Information

2.1 Known Issues

2.2 Limitations

2.3 Optimum VCU Encoder parameters for use-cases:

Video streaming:

Performance: AVC Encoder settings:

Quality: Low bitrate AVC encoding:



3 Appendix A - Input Configuration File (input.cfg)

The example configuration files are stored in the /media/card/config/ folder.

Common Configuration:
It is the starting point of common configuration.
 
Num of Input:
Number of inputs, from 1 to 8.

Output:
Select the video interface.
Options: HDMI or DP

Out Type:
Options: display, record, and stream

Display Rate:
Pipeline frame rate.
Options: 30 FPS or 60 FPS for each stream.

Exit:
It indicates to the application that the configuration is over.

Input Configuration:
It is the starting point of the input configuration.

Input Num:
Starting Nth input configuration.
Options: 1-8

Input Type:
Input source type.
Options: TPG, HDMI, HDMI_2, HDMI_3, HDMI_4, HDMI_5, HDMI_6, HDMI_7, MIPI, File, Stream

Uri:
File path or Network URL. Applicable for file playback and stream-in pipeline only. Supported file formats for playback are ts, mp4, and mkv.
NOTE: See Mount Location for additional file paths.
Options: file:///run/media/sda/abc.ts (for file path), udp://192.168.25.89:5004/ (for Network streaming, Here 192.168.25.89 is IP address and 5004 is port no)

Raw:
Specifies whether the pipeline is processed or pass-through.
Options: True, False

Width:
The width of the live source.
Options: 3840, 1920

Height:
The height of the live source.
Options: 2160, 1080

Enable SCD:
Enable or Disable Memory based SCD in the pipeline.
Options: True, False

Exit:
It indicates to the application that the configuration is over.

Encoder Configuration:
It is the starting point of encoder configuration.

Encoder Num:
Starting Nth encoder configuration.
Options: 1-8

Encoder Name:
Name of the encoder.
Options: AVC, HEVC

Profile:
Name of the profile.
Options: baseline, main or high for AVC. Main for HEVC.

Rate Control:
Rate control options.
Options: CBR, VBR, and low-latency.

Filler Data:
Filler Data NAL units for CBR rate control.
Options: True, False

QP:
QP control mode used by the VCU encoder.
Options: Uniform, Auto

L2 Cache:
Enable or disable the L2 cache buffer in the encoding process.
Options: True, False

Latency Mode:
Encoder latency mode.
Options: normal, sub_frame

Low Bandwidth:
If enabled, decrease the vertical search range used for P-frame motion estimation to reduce the bandwidth.
Options: True, False

Gop Mode:
Group of Pictures mode.
Options: Basic, low_delay_p, low_delay_b

Bitrate:
Target bitrate in Kbps
Options: 1-60000

B Frames:
Number of B-frames between two consecutive P-frames
Options: 0-4

Slice:
The number of slices produced for each frame. Each slice contains one or more complete macroblock/CTU row(s). Slices are distributed over the frame as regularly as possible. If slice-size is defined as well, more slices may be produced to fit the slice-size requirement.
Options:
4-22 4kp resolution with HEVC codec
4-32 4kp resolution with AVC codec
4-32 1080p resolution with HEVC codec
4-32 1080p resolution with AVC codec

GoP Length:
The distance between two consecutive I frames
Options: 1-1000

Format:
The format of input data.
Options: NV12

Preset:
Options: HEVC_HIGH, HEVC_MEDIUM, HEVC_LOW, AVC_HIGH, AVC_MEDIUM, AVC_LOW, Custom

Exit
It indicates to the application that the configuration is over.
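
To get a feel for what the Bitrate option means for storage or network load, the sketch below estimates the size of an encoded stream from the target bitrate and duration. The function name stream_size_mb is hypothetical. The estimate assumes the encoder holds the target bitrate (as with CBR) and ignores container and packet overhead, so treat the result as a rough figure only.

```shell
#!/bin/sh
# Hypothetical helper: approximate encoded size in megabytes (10^6 bytes).
# Assumes the stream stays at the target bitrate; container overhead ignored.
stream_size_mb() {
    # $1 = target bitrate in Kbps, $2 = duration in seconds
    # Kbit -> Kbyte (divide by 8) -> Mbyte (divide by 1000)
    echo $(( $1 * $2 / 8 / 1000 ))
}

stream_size_mb 60000 60    # 60000 Kbps for 1 minute  -> 450
stream_size_mb 60000 180   # 60000 Kbps for 3 minutes -> 1350
```

At the maximum bitrate of 60000 Kbps, a one-minute recording comes to roughly 450 MB per stream.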

Record Configuration:
It is the starting point of record configuration.

Record Num:
Starting Nth record configuration.
Options: 1-8

Out-File Name:
Record file path.
NOTE: See Mount Location for additional file paths.
Options: /run/media/sda/abc.ts

Duration:
Duration in minutes.
Options: 1-3

Exit
It indicates to the application that the configuration is over.

Streaming Configuration:
It is the starting point of streaming configuration.

Streaming Num:
Starting Nth Streaming configuration.
Options: 1-8

Host IP:
The host to send the packets to
Options: 192.168.25.89 or Windows PC IP

Port:
The port to send the packets to
Options: 5004, 5008, 5012, 5016, 5020, 5024, 5028, and 5032.

Exit
It indicates to the application that the configuration is over.
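
The listed ports are spaced four apart, starting at 5004. The sketch below derives a port from a streaming configuration number. The mapping of stream N to the Nth port in the list is an assumption made for illustration; the source only lists the allowed port values, and the function name stream_port is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pick the Nth port from the documented list
# 5004, 5008, ..., 5032. The stream-to-port mapping is an assumption.
stream_port() {
    # $1 = streaming configuration number (1-8)
    case $1 in
        [1-8]) echo $(( 5004 + 4 * ($1 - 1) )) ;;
        *) echo "stream number must be 1-8" >&2; return 1 ;;
    esac
}

stream_port 1   # -> 5004
stream_port 8   # -> 5032
```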

Trace Configuration:
It is the starting point of trace configuration.

FPS Info:
To display fps info on the console.
Options: True, False

APM Info:
To display APM counter number on the console.
Options: True, False

Pipeline Info:
To display pipeline info on console.
Options: True, False

Exit
It indicates to the application that the configuration is over.
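
When scripting around vcu_gst_app, it can help to read a value back out of a configuration file. The sketch below is a minimal reader, assuming the "Key : Value" layout with "<Section> : START" and "Exit" markers shown in the example configuration in Appendix B. The function name cfg_get is hypothetical; it is not part of the TRD and does not validate the file.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY inside SECTION of an input.cfg.
# Assumes "<Section> : START" opens a section and "Exit" closes it.
cfg_get() {
    # $1 = section name, $2 = key, $3 = configuration file
    awk -v section="$1" -v key="$2" '
        function trim(s) { gsub(/^[ \t\r]+|[ \t\r]+$/, "", s); return s }
        {
            # Split on the first colon only, so values such as URIs survive.
            pos = index($0, ":")
            name  = (pos > 0) ? trim(substr($0, 1, pos - 1)) : trim($0)
            value = (pos > 0) ? trim(substr($0, pos + 1)) : ""
        }
        name == section && value == "START" { inside = 1; next }
        name == "Exit" { inside = 0 }
        inside && name == key { print value; exit }
    ' "$3"
}

# Usage on the board:
# cfg_get "Common Configuration" "Out Type" /media/card/config/input.cfg
```

If a section appears more than once (for example, one Input Configuration per input), this sketch returns the value from the first matching section only.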

Mount Locations:

Mount locations can vary. Use lsblk or mount to find where a device is mounted. The table below lists some example mount points.

Device | Mount Location
SD Card | /run/media/mmcblk0p2
SATA Drive | /run/media/sda
USB Drive | /run/media/usb
RAM Disk | /run/media/
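
As an alternative to reading lsblk or mount output by eye, the sketch below filters a mount table for entries under /run/media. It reads /proc/mounts-style lines (device, mount point, file system type, ...) from standard input; using /proc/mounts rather than lsblk is a choice made here for easy parsing, and the function name list_media_mounts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list device, mount point and file system type for
# everything mounted under /run/media. Expects /proc/mounts-style input.
list_media_mounts() {
    awk '$2 ~ /^\/run\/media\// { printf "%s %s %s\n", $1, $2, $3 }'
}

# Usage on the board:
# list_media_mounts < /proc/mounts
```

The file system type column is useful when checking that a SATA drive is formatted as ext4 before recording to it.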




4 Appendix B

$ killall -9 run_vcu.sh
$ killall -9 vcu_qt
$ killall -9 Xorg
$ xmedia-ctl -p -d /dev/mediaX

When the HDMI source is connected at 4KP60 resolution, the output is:

root@zcu106_vcu_trd:/media/card# xmedia-ctl -p -d /dev/mediaX
Media controller API version 4.19.0

Media device information
------------------------
driver          xilinx-video
model           Xilinx Video Composite Device
serial          
bus info        
hw revision     0x0
driver version  4.19.0

Device topology
- entity 1: vcap_hdmi output 0 (1 pad, 1 link)
            type Node subtype V4L flags 0
            device node name /dev/video0 -----> Video node for HDMI Rx source
        pad0: Sink
                <- "a0080000.v_proc_ss":1 [ENABLED]

- entity 5: a0080000.v_proc_ss (2 pads, 2 links)
            type V4L2 subdev subtype Unknown flags 0
            device node name /dev/v4l-subdev21
        pad0: Sink
                [fmt:RBG888_1X24/3840x2160 field:none]
                <- "a0000000.v_hdmi_rx_ss":0 [ENABLED]
        pad1: Source
                [fmt:VYYUYY8_1X24/3840x2160 field:none]
                -> "vcap_hdmi output 0":0 [ENABLED]

- entity 8: a0000000.v_hdmi_rx_ss (1 pad, 1 link)
            type V4L2 subdev subtype Unknown flags 0
            device node name /dev/v4l-subdev22
        pad0: Source
                [fmt:RBG888_1X24/3840x2160 field:none colorspace:srgb]
                [dv.caps:BT.656/1120 min:0x0@25000000 max:4096x2160@297000000 stds:CEA-861,DMT,CVT,GTF caps:progressive,reduced-blanking,custom]
                [dv.detect:BT.656/1120 3840x2160p60 (4400x2250) stds:CEA-861 flags:CE-video] -----> Resolution and Frame-rate of HDMI Rx source
                -> "a0080000.v_proc_ss":0 [ENABLED]

NOTE: Check resolution and frame-rate of "dv.detect" under "v_hdmi_rx_ss" node.

When the HDMI source is not connected, it shows:


root@zcu106_vcu_trd:/media/card# xmedia-ctl -p -d /dev/mediaX
Media controller API version 4.19.0

Media device information
------------------------
driver          xilinx-video
model           Xilinx Video Composite Device
serial          
bus info        
hw revision     0x0
driver version  4.19.0

Device topology
- entity 1: vcap_hdmi output 0 (1 pad, 1 link)
            type Node subtype V4L flags 0
            device node name /dev/video0 -----> Video node for HDMI Rx source
        pad0: Sink
                <- "a0080000.v_proc_ss":1 [ENABLED]

- entity 5: a0080000.v_proc_ss (2 pads, 2 links)
            type V4L2 subdev subtype Unknown flags 0
            device node name /dev/v4l-subdev21
        pad0: Sink
                [fmt:RBG888_1X24/3840x2160 field:none]
                <- "a0000000.v_hdmi_rx_ss":0 [ENABLED]
        pad1: Source
                [fmt:VYYUYY8_1X24/3840x2160 field:none]
                -> "vcap_hdmi output 0":0 [ENABLED]

- entity 8: a0000000.v_hdmi_rx_ss (1 pad, 1 link)
            type V4L2 subdev subtype Unknown flags 0
            device node name /dev/v4l-subdev22
        pad0: Source
                [fmt:RBG888_1X24/3840x2160 field:none colorspace:srgb]
                [dv.caps:BT.656/1120 min:0x0@25000000 max:4096x2160@297000000 stds:CEA-861,DMT,CVT,GTF caps:progressive,reduced-blanking,custom]
                [dv.query:no-link] -----> HDMI Rx Link Status
                -> "a0080000.v_proc_ss":0 [ENABLED]

NOTE: Here "dv.query:no-link" under the "v_hdmi_rx_ss" node shows that the HDMI-Rx source is not connected or is not active (try waking up the device by pressing a key on the remote).
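
The two cases above can be told apart in a script by looking for the dv.detect and dv.query fields. The sketch below is a minimal example that reads xmedia-ctl -p output on standard input; it assumes the output has the shape shown in the two listings above, and the function name hdmi_rx_status is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report the HDMI-Rx state from `xmedia-ctl -p` output.
# Prints "no-link", the detected mode (e.g. 3840x2160p60), or "unknown".
# Only the first dv.query/dv.detect line in the input is considered.
hdmi_rx_status() {
    awk '
        /dv\.query:no-link/ { print "no-link"; found = 1; exit }
        /dv\.detect:/ {
            line = $0
            # Drop everything up to and including the standard name,
            # then keep the next space-delimited token (the mode).
            sub(/.*dv\.detect:[^ ]* /, "", line)
            sub(/ .*/, "", line)
            print line; found = 1; exit
        }
        END { if (!found) { print "unknown"; exit 1 } }
    '
}

# Usage on the board (mediaX is the HDMI-Rx media node):
# xmedia-ctl -p -d /dev/mediaX | hdmi_rx_status
```

A design with several HDMI-Rx inputs would need this run once per media node, since the sketch stops at the first match.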


Notes for gst-launch-1.0 commands:

$ xmedia-ctl -p -d /dev/mediaX

When HDMI Input Source is NVIDIA SHIELD

$ xmedia-ctl -d /dev/mediaX -V "\"a0080000.v_proc_ss\":0  [fmt:RBG888_1X24/3840x2160 field:none]"
$ xmedia-ctl -d /dev/mediaX -V "\"a0080000.v_proc_ss\":1  [fmt:VYYUYY8_1X24/3840x2160 field:none]"

NOTE: Make sure NVIDIA SHIELD is configured for 4kp resolution and RGB888 colour format.

When HDMI Input Source is ABOX

$ xmedia-ctl -d /dev/mediaX -V "\"a0080000.v_proc_ss\":0  [fmt:VYYUYY8_1X24/3840x2160 field:none]"
$ xmedia-ctl -d /dev/mediaX -V "\"a0080000.v_proc_ss\":1  [fmt:VYYUYY8_1X24/3840x2160 field:none]"

NOTE: Make sure ABOX is configured for 4kp resolution and VYYUYY8 colour format.

Notes to set the format of SCD media node:

$ xmedia-ctl -p -d /dev/mediaX

Run the following commands to change the resolution of the SCD nodes (here mediaX is the SCD media node and xlnx-scdchan.Y is the SCD channel):

$ xmedia-ctl -d /dev/mediaX -V "\"xlnx-scdchan.Y\":0 [fmt:VYYUYY8_1X24/3840x2160 field:none]"
$ xmedia-ctl -d /dev/mediaX -V "\"xlnx-scdchan.Y\":0 [fmt:VYYUYY8_1X24/1920x1080 field:none]"


$ vcu_gst_app /media/card/config/input.cfg

The following configuration needs to be set in input.cfg for HDMI-1080p60:

Common Configuration    : START
Num Of Input            : 1
Output                  : HDMI
Out Type                : Display
Frame Rate              : 60
Exit

Input Configuration     : START
Input Num               : 1
Input Type              : hdmi
Raw                     : TRUE
Width                   : 1920
Height                  : 1080
Exit

Set the CRTC configuration:

$ modetest -D a0070000.v_mix -s 41:3840x2160-60@BG24

4kp60 pass-through Display pipeline (capture to display):

$ gst-launch-1.0 v4l2src device=/dev/videoX io-mode=4 ! video/x-raw, width=3840, height=2160, format=NV12, framerate=60/1 ! queue ! kmssink bus-id="a0070000.v_mix"

4kp60 HEVC Display pipeline (capture, SCD, encode, decode, display):

$ gst-launch-1.0 v4l2src device=/dev/videoX io-mode=4 ! video/x-raw, width=3840, height=2160, format=NV12, framerate=60/1 ! xilinxscd io-mode=5 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=60000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 ! video/x-h265, profile=main, alignment=au ! queue ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! kmssink bus-id="a0070000.v_mix"

4kp60 HEVC Record pipeline (capture, SCD, encode, write to file):

$ gst-launch-1.0 v4l2src device=/dev/videoX io-mode=4 num-buffers=3600 ! video/x-raw, format=NV12,width=3840,height=2160,framerate=60/1 ! xilinxscd io-mode=5 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=60000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 ! queue ! video/x-h265, profile=main, alignment=au ! mpegtsmux alignment=7 name=mux ! filesink location="/run/media/sda/test.ts"

NOTE: File location should be SATA SSD(ext4 format) to avoid the read-write bandwidth issue.
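
In the record pipeline, num-buffers on v4l2src sets how many frames are captured before the pipeline ends, so it determines the recording length. The sketch below computes the value for a given frame rate and duration in minutes (the unit used by the Duration option in Appendix A). The function name record_buffers is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of frames to capture for a recording.
record_buffers() {
    # $1 = frames per second, $2 = duration in minutes
    echo $(( $1 * 60 * $2 ))
}

record_buffers 60 1   # 60 fps for 1 minute  -> 3600 (the value used above)
record_buffers 30 3   # 30 fps for 3 minutes -> 5400
```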

File Playback pipeline:

$ gst-launch-1.0 uridecodebin uri="file:///run/media/sda/test.ts" ! queue max-size-bytes=0 ! kmssink bus-id="a0070000.v_mix"

NOTE: File location should be SATA SSD(ext4 format) to avoid the read-write bandwidth issue.

4kp60 HEVC Stream-out pipeline:

$ gst-launch-1.0 v4l2src device=/dev/videoX io-mode=4 ! video/x-raw, format=NV12, width=3840, height=2160, framerate=60/1 ! xilinxscd io-mode=5 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=60000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! queue ! mpegtsmux alignment=7 name=mux ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5004

NOTE: Here 192.168.25.89 is host/client IP address and 5004 is port no.

HEVC Stream-in pipeline:

$ gst-launch-1.0 udpsrc port=5004 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux name=demux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! kmssink bus-id="a0070000.v_mix"

4kp60 HEVC low-latency Stream-out pipeline:

$ gst-launch-1.0 v4l2src io-mode=4 device=/dev/videoX ! video/x-raw, width=3840, height=2160, format=NV12, framerate=60/1 ! omxh265enc qp-mode=auto gop-mode=low-delay-p gop-length=60 periodicity-idr=60 b-frames=0 target-bitrate=25000 num-slices=8 control-rate=low-latency prefetch-buffer=TRUE low-bandwidth=false filler-data=0 cpb-size=1000 initial-delay=500 ! video/x-h265, alignment=nal ! queue max-size-buffers=0 ! rtph265pay ! udpsink host=192.168.25.89 port=5004 buffer-size=60000000 max-bitrate=120000000 max-lateness=-1 qos-dscp=60 async=false

NOTE: Here 192.168.25.89 is host/client IP address and 5004 is port no.

HEVC low-latency Stream-in pipeline:

$ gst-launch-1.0 udpsrc port=5004 buffer-size=60000000 caps="application/x-rtp, media=video, clock-rate=90000, payload=96, encoding-name=H265" ! rtpjitterbuffer latency=5 ! rtph265depay ! h265parse ! video/x-h265, alignment=nal ! omxh265dec low-latency=1 ! video/x-raw ! queue max-size-bytes=0 ! fpsdisplaysink name=fpssink text-overlay=false video-sink="kmssink bus-id=a0070000.v_mix plane-id=30" sync=true

NOTE: In vcu_gst_app, the low-latency stream-out pipeline is supported with tsmux. The low-latency stream-in pipeline is not supported in vcu_gst_app.