Zynq UltraScale+ MPSoC ZCU106 VCU Multi-Stream ROI TRD using Avnet Quad Sensor 2021.2
This page provides all the information related to VCU Multi-Stream ROI TRD using Avnet Quad Sensor design for ZCU106.
1 Overview
The primary goal of this VCU Multi-Stream ROI TRD using Avnet Quad Sensor design is to demonstrate the use of the Deep Learning Processor Unit (DPU) block to extract the Region of Interest (ROI) from input video frames, and to use this information to perform ROI-based encoding with the Video Codec Unit (VCU) encoder hard block present in Zynq UltraScale+ EV devices. Four video streams are captured from the Avnet Multi-Camera FMC module (quad sensor) through the MIPI CSI-2 Rx interface, which is implemented in the PL.
The design will serve as a platform to accelerate Deep Neural Network inference algorithms using DPU and demonstrate the ROI feature of the VCU encoder. The design uses a Deep Convolutional Neural Network (CNN) named Densebox, running on DPU to extract ROI Information (for example, ‘face’ in this case). The design will also use the Vitis Video Analytics SDK (VVAS) framework to leverage its rich set of highly optimized and ready to use Kernels and GStreamer plugins.
The design will use the Vivado IP Integrator flow for building the hardware platform and the Xilinx Yocto PetaLinux flow for software design. It will use Xilinx IP and software drivers to demonstrate the capabilities of different components.
The Vitis platform will be created from the Vivado/PetaLinux build artifacts, and then the Vitis acceleration flow will be used to insert the DPU into the platform to create the final bitstream.
The following figure shows the streaming pipeline use-case with enhanced ROI + face detection model on a ZCU106 board. For a detailed view of the VVAS block, please refer to Section 1.3.3 - GStreamer Pipeline Flow.
Streaming: Face detection with enhanced ROI on ZCU106.
This ZCU106 VCU Multi Stream ROI TRD design supports the encoding feature of the VCU only. For decoding on the ZCU106 Board-2 setup, you would need to use the VCU TRD Multi Stream Video Capture and Display design.
1.1 System Architecture
The following figure shows the block diagram of the ROI design.
1.2 Hardware Architecture
This section gives a detailed description of the blocks used in the hardware design. The functional block diagram of the design is shown in the below figure.
There are five primary Sections in the design.
MIPI Capture Pipeline:
Captures video frame buffers from the Capture source (Avnet Quad Sensor) at a resolution of 1080p30
The AXI Switch distributes the captured video to multiple streams using the round-robin method
Each stream writes the frame buffers into DDR Memory with the Frame Buffer Write IP
Multi-Scaler Block:
Reads the Video Buffers from DDR Memory for the first two sensors only
Scales down the buffer to the 640x360 size (suitable for DPU)
Converts the format from NV12 to BGR
Writes the downscaled buffer to DDR Memory
DPU Block:
Reads the downscaled buffers from DDR Memory for the first two sensors only
Runs the Densebox algorithm to generate the ROI information for each frame buffer
Passes the ROI information to the VCU Encoder
VCU Encoder:
Reads the 4 x 1080p30 NV12 Buffer from DDR Memory
Receives the ROI metadata from the DPU IP from the first two sensors only.
Encodes the video buffers based on the ROI Information for the first two sensors
Encodes the video buffers for the other two sensors
Finally, it writes the encoded stream to DDR Memory
PS GEM:
Reads the encoded stream from DDR Memory
Streams out the encoded stream via Ethernet
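As a rough sizing aid for the capture path described above, the raw NV12 data rate written to DDR can be estimated from the resolution and frame rate. The sketch below assumes tightly packed NV12 frames (12 bits per pixel, no stride or alignment padding, which an actual Frame Buffer Write configuration may add); the function name is illustrative and not part of the TRD.

```shell
#!/bin/sh
# Estimate raw NV12 traffic for a number of streams of WxH video.
# Assumption: tightly packed NV12 (full-size Y plane plus half-size
# interleaved UV plane), i.e. width * height * 3 / 2 bytes per frame.
nv12_bytes_per_second() {
    width=$1; height=$2; fps=$3; streams=$4
    frame_bytes=$(( width * height * 3 / 2 ))
    echo $(( frame_bytes * fps * streams ))
}

# Four 1080p30 sensors, as captured in this design.
nv12_bytes_per_second 1920 1080 30 4
```

For 4 x 1080p30 this evaluates to 373248000 bytes per second (roughly 373 MB/s) of capture writes alone, before the multi-scaler, DPU, and VCU accesses to the same DDR memory are counted.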
This design supports the following video interfaces:
Sources | |
|---|---|
Sinks | |
VCU Codec | |
DPU | |
Streaming Interfaces | 1G Ethernet PS GEM |
Video Format | NV12 |
Supported Resolution | 4 x 1080p30 |
1.3 VCU ROI Software
1.3.1 Vitis Video Analytics SDK (VVAS)
VVAS is being developed to provide an easy to use and scalable framework which users will be able to use to build their solutions on Xilinx FPGA. VVAS provides infrastructure that will cover a wide variety of applications in Embedded, Vision, Datacenter, Machine Learning, Automotive and many other domains.
VVAS provides a set of generic framework plugins that abstract the complexities of writing a GStreamer plugin. These framework plugins interact with the kernel libraries through a simple VVAS kernel interface. Using this interface, a user can easily integrate and test their kernels in the GStreamer framework.
VVAS also provides a rich set of highly optimized, ready-to-use kernels and GStreamer plugins, such as the video encoder, video decoder, multiscaler, ML, and bounding box cropping, so that users can create their applications in a very short span of time.
VVAS will also provide the infrastructure needed to bridge the gap between Edge and Cloud solutions.
In the VCU Multi Stream ROI TRD using Avnet Quad Sensor design, the following VVAS plugins are used:
vvas_xfilter: The vvas_xfilter efficiently works with hard-kernel/soft-kernel/software (user-space) acceleration software library types. It can operate in Passthrough/in-place/transform mode. In the Multi-Stream ROI TRD using Avnet Quad Sensor design, it was used in in-place mode with the below acceleration software library and can alter the input buffer.
vvas_xdpuinfer: The vvas_xdpuinfer is the acceleration software library that controls the DPU through the Vitis AI interface. The vvas_xdpuinfer does not modify the contents of the input buffer. The input buffer is passed to the Vitis AI model library that generates the inference data. This inference data is then mapped into the VVAS meta data structure and attached to the input buffer. The same input buffer is then pushed to the downstream plug-in.
vvas_xboundingbox: The vvas_xboundingbox acceleration software library is used to draw a bounding box and label information using the VVAS infrastructure plug-in vvas_xfilter. The vvas_xboundingbox interprets machine learning inference results from the vvas_xdpuinfer acceleration software library and uses an OpenCV library to draw the bounding box and label on the identified objects.
vvas_xmetaaffixer: This plug-in scales the incoming metadata for different resolutions: metadata received on the master sink pad is scaled in relation to the resolution of the slave pads.
vvas_xroigen: This plug-in generates the ROI metadata that the GStreamer OMX encoder plug-ins expect in order to encode raw frames with the desired quantization parameter (QP) values/levels for the specified ROIs.
Refer to the VVAS document for more detail on VVAS.
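To illustrate the kind of metadata scaling that vvas_xmetaaffixer is used for in this design, the sketch below maps a bounding box found on the 640x360 DPU input back onto the 1920x1080 capture frame. It is a simplified illustration of proportional scaling, not the plug-in's actual implementation; the function name and the x/y/width/height argument order are assumptions made for the example.

```shell
#!/bin/sh
# Scale a bounding box from a source resolution to a destination resolution.
# Arguments: src_w src_h dst_w dst_h x y w h (integer pixels).
# Hypothetical helper for illustration only; integer division truncates.
scale_bbox() {
    src_w=$1; src_h=$2; dst_w=$3; dst_h=$4
    x=$5; y=$6; w=$7; h=$8
    echo "$(( x * dst_w / src_w )) $(( y * dst_h / src_h )) $(( w * dst_w / src_w )) $(( h * dst_h / src_h ))"
}

# A 64x64 face at (100,50) in the 640x360 frame, mapped to 1920x1080.
scale_bbox 640 360 1920 1080 100 50 64 64
```

Because 1920/640 and 1080/360 are both 3, every coordinate is simply tripled here, so the example prints 300 150 192 192.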
VVAS Top-level Block diagram
1.3.2 Deep Learning Processor Unit (DPU)
DPU is a programmable engine optimized for deep neural networks. It is a group of parameterizable IP cores pre-implemented on the hardware with no place and route required. The DPU is released with the Vitis AI specialized instruction set, allowing for efficient implementation of many deep learning networks.
Refer to DPU IP PG338 and UG1354 for more details about the DPU.
The following figure shows the DPU Top-Level Block Diagram.
DPU Top-level Block Diagram
PE - Processing Engine, DPU - Deep Learning Processor Unit, APU - Application Processing Unit
The DPU IP can be implemented in the programmable logic (PL) of the selected Zynq® UltraScale+™ MPSoC device with direct connections to the processing system (PS). The DPU requires instructions to implement a neural network and accessible memory locations for input images as well as temporary and output data. A program running on the Application Processing Unit (APU) is also required to service interrupts and coordinate data transfers.
The following figure shows the sequence of operations performed on the DPU device.
The following sequence of steps is performed to access the DPU device and run face detection:
Initialize the DPU device
Instantiate a DPU task from the DPU kernel and allocate the corresponding DPU memory buffer
Set the input image for the created DPU task
Run the DPU task to find the faces in the input image
Uninitialize the DPU device
1.3.3 GStreamer Pipelines Flow
The GStreamer plugin demonstrates the DPU capabilities with the Xilinx VCU encoder’s ROI (Region of Interest) feature. The plugin will detect ROI (i.e. face co-ordinates) from input frames using the DPU IP and pass the detected ROI information to the Xilinx VCU encoder. The following figure shows the data flow for the GStreamer pipeline of the stream-out use case.
Block Diagram of Stream-out Pipeline
fd = v4l2 frame data, fd' = DPU compatible frame data
As shown in the above figure, the stream-out GStreamer pipeline performs the below list of operations:
Sensors capture the stream through the camera and pass it to the FMC module which passes it to the MIPI CSI-2 Rx interface on the ZCU106 Board.
The MIPI CSI-2 Rx interface will capture the data in NV12 format and pass it to the tee element, which will split the input stream to the metaaffixer and preprocessor elements.
Preprocessors (v4l2convert GStreamer plugin) will scale-down the input frame resolution to 640x360 and convert the data into BGR format as per the input requirement of the DPU.
360p BGR frame will be provided to the DPU IP (via the xfilter plugin) as an input to find ROI (i.e. face co-ordinates).
Extracted ROI information will be passed to the VVAS Metaaffixer plug-in along with the original capture stream (via tee), which will embed the ROI metadata with the original stream.
The ROI Generator will generate ROI metadata and it is given to the Bounding box (via the xfilter plugin) as an input to draw bounding-boxes according to the ROI (i.e. face co-ordinates).
The updated stream is passed to the VCU encoder, which uses the received ROI information to encode ROI regions at higher quality than non-ROI regions.
Stream-out the encoded data using the RTP protocol.
Use the below Stream-in use case with another ZCU106 Board along with the VCU TRD Multi Stream Video Capture and Display design.
The following figure shows the data flow for the GStreamer pipeline of the stream-in use case.
Block Diagram of Stream-in Pipeline
fd = VCU decoded frame data
As shown in the above figure, the stream-in GStreamer pipeline performs the below list of operations:
Stream-in the encoded data using the RTP protocol
The Xilinx VCU decoder will decode the data
Display the decoded data on the HDMI-Tx display
1.4 Software Tools and System Requirements
Hardware
Required:
Two ZCU106 rev 1.0 evaluation boards with power cables
Monitor with HDMI input supporting 3840x2160 resolution or 1920x1080 resolution (for example, an LG 27UD88, Samsung LU28ES90DS/XL)
HDMI 2.0 certified cable
Class-10 SD card
Ethernet cable
Optional:
USB pen drive formatted with the FAT32 file system and hub
SATA drive formatted with the FAT32 file system, external power supply, and data cable
Software Tools
Required:
Linux host machine for all tool flow tutorials (see UG1144 for detailed OS requirements)
PetaLinux Tools version 2021.2 (see UG1144 for installation instructions)
Git, a distributed version control system
Serial terminal emulator, for example TeraTerm
Download, Installation, and Licensing of Vivado Design Suite 2021.2
The Vivado Design Suite User Guide explains how to download and install the Vivado® Design Suite tools, which include the Vivado Integrated Design Environment (IDE), High-Level Synthesis tool, and System Generator for DSP. This guide also provides information about licensing and administering evaluation and full copies of Xilinx design tools and intellectual property (IP) products. The Vivado Design Suite can be downloaded here.
LogiCORE IP Licensing
The following IP cores require a license to build the design.
Video Processing Subsystem (VPSS) - Included with Vivado - PG231
MIPI CSI Controller Subsystems (mipi_csi2_rx_subsystem) - Purchase license (Hardware evaluation available) - PG232
To obtain the LogiCORE IP license, please visit the respective IP product page and get the license.
(Xilinx Answer 44029) - Licensing - LogiCORE IP Core licensing questions
The below table provides performance information:
Resolution | FPS Achieved |
|---|---|
4 x 1080p30 | 30 |
1.5 Board Setup
The below section will provide the information on the ZCU106 board setup for running the ROI design.
Connect the Micro USB cable into the ZCU106 Board Micro USB port J83, and the other end into an open USB port on the host PC. This cable is used for UART over USB communication.
Insert the SD card with the images copied onto the SD card slot J100. Please find here how to prepare the SD card for a specific design.
Set the SW6 switches as shown in the below Figure. This configures the boot settings to boot from SD.
Connect 12V Power to the ZCU106 6-Pin Molex connector.
For a USB storage device, connect the USB hub along with the mouse. (Optional)
For a SATA storage device, connect the SATA data cable to the SATA 3.0 port. (Optional)
For MIPI CSI-2, Insert the Avnet Multi-Camera FMC module into the FMC0 connector and set VADJ to 1.2V
Important Note: VADJ on the FMC0 connector must be set to 1.2V. See FMC VADJ Voltage Settings for more information.
Set up a terminal session between a PC COM port and the serial port on the evaluation board (See the Determine which COM to use to access the USB serial port on the ZCU106 board for more details).
Copy the VCU Multi Stream ROI TRD images into the SD card and insert the SD card on the board
The below images show how to connect interfaces on the ZCU106 board:
1.6 Run Flow
The VCU Multi Stream ROI TRD package is released with the source code, Vivado project, PetaLinux BSP, and SD card image that enable the user to run the demonstration.
It also includes the binaries necessary to configure and boot the ZCU106 board. Prior to running the steps mentioned in this wiki page, download the VCU Multi Stream ROI TRD package and extract its contents to the directory referred to as $TRD_HOME which is the home directory.
See the below link to download the VCU Multi Stream ROI TRD package.
The TRD package contents are placed in the following directory structure.
rdf0617-zcu106-vcu-multi-stream-roi-2021-2/
├── apu
│   └── vcu_petalinux_bsp
│       └── xilinx-vcu-multi-stream-roi-zcu106-v2021.2-final.bsp
├── dpu
│   ├── 0001-Added-ZCU106-configuration-to-support-DPU-in-ZCU106.patch
│   ├── dpu_conf.vh
│   └── vitis_platform
│       └── zcu106_dpu
├── image
│   ├── bootfiles
│   │   ├── bl31.elf
│   │   ├── linux.bif
│   │   ├── pmufw.elf
│   │   ├── system.bit
│   │   ├── system.dtb
│   │   ├── u-boot.elf
│   │   └── zynqmp_fsbl.elf
│   ├── license_zcu106_multistream_roi_trd_dpu_xclbin.txt
│   ├── README.txt
│   ├── sd_card
│   │   ├── boot
│   │   └── root
│   └── sd_card.img
├── pl
│   ├── constrs
│   │   ├── quad_mipi_rx_ROI.xdc
│   │   └── quad_sensor_async.xdc
│   ├── designs
│   │   └── zcu106_Quad_Sensor_ROI
│   ├── prebuild
│   │   └── zcu106_Quad_Sensor_ROI_wrapper.xsa
│   ├── README.md
│   └── srcs
│       └── hdl
├── README.txt
└── zcu106_vcu_multistream_roi_trd_sources_and_licenses.tar.gz

17 directories, 19 files
The below snippet shows the directory structure of various binary files placed in the $TRD_HOME/image/sd_card/boot directory.
└── image
    └── sd_card
        └── boot
            ├── autostart.sh
            ├── bd.hwh
            ├── BOOT.BIN
            ├── boot.scr
            ├── dpu.xclbin
            ├── Image
            ├── quad_sensor_isp_tuning.sh
            ├── quad_sensor_media_graph_setting.sh
            ├── setup.sh
            ├── system.dtb
            ├── vcu
            │   └── configure_qos.sh
            ├── vitis
            │   └── densebox_640_360-zcu102_zcu104_kv260-r2.0.0.tar.gz
            └── vvas
                └── json
                    ├── kernel_ML.json
                    └── kernel_swbbox.json

1.6.1 Preparing the SD card
There are three ways to prepare the SD card for booting. Each method is detailed below.
Using the ready-to-test image
Flash the SD card with sd_card.img using Etcher or Win32DiskImager
Boot the board with the flashed SD card
sd_card.img is available at rdf0617-zcu106-vcu-multi-stream-roi-2021-2/image/sd_card.img
All of the required Vitis packages are already installed in the ready-to-test rdf0617-zcu106-vcu-multi-stream-roi-2021-2/image/sd_card.img
Using pre-built images
To create an SD card with two partitions, Boot (FAT32, bootable) and Root (EXT4), refer to this link.
Copy the boot content from rdf0617-zcu106-vcu-multi-stream-roi-2021-2/image/sd_card/boot to the Boot partition in the SD card
Extract rootfs.ext4 from rdf0617-zcu106-vcu-multi-stream-roi-2021-2/image/sd_card/root to the Root partition in the SD card
Boot the board with the flashed SD card
Using the output of the build flow
To create an SD card with two partitions, Boot (FAT32, bootable) and Root (EXT4), refer to this link.
For the build flow, refer to these steps. Copy the generated DPU build images bd.hwh, BOOT.BIN, boot.scr, dpu.xclbin, Image, and system.dtb into the Boot partition of the SD card, and extract the generated rootfs.ext4 into the Root partition of the SD card
Copy the boot content vcu, vitis, vvas, autostart.sh, and setup.sh from the rdf0617-zcu106-vcu-multi-stream-roi-2021-2/image/sd_card/boot/ directory to the Boot partition in the SD card
Boot the board with the flashed SD card
All of the required Densebox models are already available in the rdf0617-zcu106-vcu-multi-stream-roi-2021-2/image/sd_card/boot/vitis directory and are installed automatically during the first boot. Please wait until the target setup completes and the models are installed.
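When preparing the card from the build flow output, it may help to confirm that every generated boot file named above is present before booting. The following is a minimal sketch of such a check; the function name is hypothetical, the mount point in the usage comment is a placeholder, and the file list is simply the one given in the build flow method above.

```shell
#!/bin/sh
# Boot-partition files the build flow method above asks to copy.
REQUIRED_BOOT_FILES="bd.hwh BOOT.BIN boot.scr dpu.xclbin Image system.dtb"

# Print the name of every required file missing from directory $1.
# Returns 0 when nothing is missing, 1 otherwise.
missing_boot_files() {
    dir=$1
    rc=0
    for f in $REQUIRED_BOOT_FILES; do
        if [ ! -f "$dir/$f" ]; then
            echo "$f"
            rc=1
        fi
    done
    return $rc
}

# Example (mount point is a placeholder, adjust to your system):
# missing_boot_files /mnt/sd_boot || echo "copy the files listed above first"
```

The check only looks at file names; it does not verify that the images match the design or that the Root partition was populated.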
1.6.2 GStreamer Pipelines using mediasrcbin plugin
This section covers the GStreamer pipelines using the mediasrcbin plugin for streaming ROI use-cases. This mediasrcbin plugin is a Xilinx specific plugin which is a bin element on top of v4l2src. It parses and configures the media graph of a media device automatically.
For more information on JSON configurations used in the following pipelines, please refer to VVAS JSON object members
Stream-out ( Server ):
                          → v4l2convert → vvas_xfilter (DPU) →
Capture (Sensor-1) → tee -|                                    |- vvas_xmetaaffixer → vvas_xroigen → vvas_xfilter (Bounding-box) → Encode → Stream-out
                          →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→
                          → v4l2convert → vvas_xfilter (DPU) →
Capture (Sensor-2) → tee -|                                    |- vvas_xmetaaffixer → vvas_xroigen → vvas_xfilter (Bounding-box) → Encode → Stream-out
                          →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→
Capture (Sensor-3) → Encode → Stream-out
Capture (Sensor-4) → Encode → Stream-out
Set the IP address for the server:
ifconfig eth0 192.168.25.90
Run the following gst-launch-1.0 command for the stream-out pipeline:
Stream-out Pipeline
gst-launch-1.0 mediasrcbin media-device=/dev/media0 v4l2src0::io-mode=4 v4l2src1::io-mode=4 v4l2src2::io-mode=4 v4l2src3::io-mode=4 name=src \
src. ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! tee name=t0 \
t0. ! queue ! v4l2convert capture-io-mode=4 output-io-mode=5 ! video/x-raw, width=640, height=360, format=BGR ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_ML.json" ! scalem0.sink_master vvas_xmetaaffixer name=scalem0 \
t0. ! queue ! scalem0.sink_slave_0 scalem0.src_slave_0 ! queue min-threshold-buffers=2 max-size-bytes=0 max-size-buffers=4 max-size-time=0 ! vvas_xroigen roi-type=2 roi-qp-delta=-21 roi-max-num=50 ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_swbbox.json" ! queue ! omxh265enc qp-mode=roi gop-mode=basic gop-length=60 b-frames=0 target-bitrate=1500 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux0 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5004 \
src. ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! tee name=t1 \
t1. ! queue ! v4l2video5convert capture-io-mode=4 output-io-mode=5 ! video/x-raw, width=640, height=360, format=BGR ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_ML.json" ! scalem1.sink_master vvas_xmetaaffixer name=scalem1 \
t1. ! queue ! scalem1.sink_slave_0 scalem1.src_slave_0 ! queue min-threshold-buffers=2 max-size-bytes=0 max-size-buffers=4 max-size-time=0 ! vvas_xroigen roi-type=2 roi-qp-delta=-21 roi-max-num=50 ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_swbbox.json" ! queue ! omxh265enc qp-mode=roi gop-mode=basic gop-length=60 b-frames=0 target-bitrate=1500 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux1 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5008 \
src. ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=15000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux2 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5012 \
src. ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=15000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux3 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5016
Here 192.168.25.89 is the host/client IP address and 5004, 5008, 5012 and 5016 are port numbers.
Use the below Stream-in use case with another ZCU106 Board along with VCU TRD Multi Stream Video Capture and Display design
Stream-in ( Client ): 4 X (Stream-in→ Decode → Display)
Set the IP address for the client:
ifconfig eth0 192.168.25.89
Run the following gst-launch-1.0 command for the stream-in pipeline, where 5004, 5008, 5012 and 5016 are port numbers.
Stream-in Pipeline
gst-launch-1.0 \
udpsrc port=5004 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=34 render-rectangle=<0,0,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true \
udpsrc port=5008 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=35 render-rectangle=<1920,0,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true \
udpsrc port=5012 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=36 render-rectangle=<0,1080,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true \
udpsrc port=5016 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=37 render-rectangle=<1920,1080,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true -v
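The four decode branches of the stream-in pipeline above differ only in the UDP port, the mixer plane, and the render rectangle, which together tile four 1080p streams into a 2x2 grid. The sketch below derives those three values from a stream index (0 to 3). The function name is illustrative, and the base values (port 5004 in steps of 4, plane-id 34 upward) are read off the pipeline above rather than taken from any plug-in documentation.

```shell
#!/bin/sh
# Derive per-stream parameters for a 2x2 grid of 1920x1080 tiles.
# idx 0..3 maps to top-left, top-right, bottom-left, bottom-right.
stream_in_params() {
    idx=$1
    port=$(( 5004 + 4 * idx ))
    plane=$(( 34 + idx ))
    x=$(( (idx % 2) * 1920 ))
    y=$(( (idx / 2) * 1080 ))
    echo "port=$port plane-id=$plane render-rectangle=<$x,$y,1920,1080>"
}

for n in 0 1 2 3; do
    stream_in_params "$n"
done
```

The four printed lines reproduce the port, plane-id, and render-rectangle values used in the pipeline, which makes it easier to see how a different tile size or port spacing would change the command.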
1.6.3 GStreamer Pipelines using v4l2src plugin
This section covers GStreamer pipelines using the v4l2src plugin for streaming ROI use-cases.
Make sure that the MIPI CSI-2 Rx media pipeline is configured for 1080p resolution and that the source/sink have the same color format. Run the below script to set the resolution and format of the MIPI CSI-2 Rx media pipeline nodes, where "media0" indicates the media node for the MIPI CSI-2 Rx input source.
$ sh /media/card/quad_sensor_media_graph_setting.sh
For more information on JSON configurations used in the following pipelines, please refer to VVAS JSON object members.
Stream-out ( Server ):
                          → v4l2convert → vvas_xfilter (DPU) →
Capture (Sensor-1) → tee -|                                    |- vvas_xmetaaffixer → vvas_xroigen → vvas_xfilter (Bounding-box) → Encode → Stream-out
                          →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→
                          → v4l2convert → vvas_xfilter (DPU) →
Capture (Sensor-2) → tee -|                                    |- vvas_xmetaaffixer → vvas_xroigen → vvas_xfilter (Bounding-box) → Encode → Stream-out
                          →→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→→
Capture (Sensor-3) → Encode → Stream-out
Capture (Sensor-4) → Encode → Stream-out
Set the IP address for the server:
ifconfig eth0 192.168.25.90
Run the following gst-launch-1.0 command for the stream-out pipeline:
Stream-out Pipeline
gst-launch-1.0 \
v4l2src device=/dev/video0 io-mode=4 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! tee name=t0 \
t0. ! queue ! v4l2convert capture-io-mode=4 output-io-mode=5 ! video/x-raw, width=640, height=360, format=BGR ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_ML.json" ! scalem0.sink_master vvas_xmetaaffixer name=scalem0 \
t0. ! queue ! scalem0.sink_slave_0 scalem0.src_slave_0 ! queue min-threshold-buffers=2 max-size-bytes=0 max-size-buffers=4 max-size-time=0 ! vvas_xroigen roi-type=2 roi-qp-delta=-21 roi-max-num=50 ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_swbbox.json" ! queue ! omxh265enc qp-mode=roi gop-mode=basic gop-length=60 b-frames=0 target-bitrate=1500 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux0 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5004 \
v4l2src device=/dev/video1 io-mode=4 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! tee name=t1 \
t1. ! queue ! v4l2video5convert capture-io-mode=4 output-io-mode=5 ! video/x-raw, width=640, height=360, format=BGR ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_ML.json" ! scalem1.sink_master vvas_xmetaaffixer name=scalem1 \
t1. ! queue ! scalem1.sink_slave_0 scalem1.src_slave_0 ! queue min-threshold-buffers=2 max-size-bytes=0 max-size-buffers=4 max-size-time=0 ! vvas_xroigen roi-type=2 roi-qp-delta=-21 roi-max-num=50 ! queue ! vvas_xfilter kernels-config="/media/card/vvas/json/kernel_swbbox.json" ! queue ! omxh265enc qp-mode=roi gop-mode=basic gop-length=60 b-frames=0 target-bitrate=1500 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux1 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5008 \
v4l2src device=/dev/video2 io-mode=4 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=15000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux2 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5012 \
v4l2src device=/dev/video3 io-mode=4 ! video/x-raw, format=NV12, width=1920, height=1080, framerate=30/1 ! omxh265enc qp-mode=auto gop-mode=basic gop-length=60 b-frames=0 target-bitrate=15000 num-slices=8 control-rate=constant prefetch-buffer=true low-bandwidth=false filler-data=true cpb-size=1000 initial-delay=500 periodicity-idr=60 ! video/x-h265, profile=main, alignment=au ! h265parse ! queue ! mpegtsmux alignment=7 name=mux3 ! rtpmp2tpay ! udpsink host=192.168.25.89 port=5016
Here 192.168.25.89 is the host/client IP address and 5004, 5008, 5012 and 5016 are port numbers.
Use the below Stream-in use case with another ZCU106 Board along with the VCU TRD Multi Stream Video Capture and Display design
Stream-in ( Client ): 4 X (Stream-in→ Decode → Display)
Set the IP address for the client:
ifconfig eth0 192.168.25.89
Run the following gst-launch-1.0 command for the stream-in pipeline, where 5004, 5008, 5012 and 5016 are port numbers.
Stream-in Pipeline
gst-launch-1.0 \
udpsrc port=5004 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=34 render-rectangle=<0,0,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true \
udpsrc port=5008 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=35 render-rectangle=<1920,0,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true \
udpsrc port=5012 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=36 render-rectangle=<0,1080,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true \
udpsrc port=5016 buffer-size=60000000 caps="application/x-rtp, clock-rate=90000" ! rtpjitterbuffer latency=1000 ! rtpmp2tdepay ! tsparse ! video/mpegts ! tsdemux ! queue ! h265parse ! video/x-h265, profile=main, alignment=au ! omxh265dec internal-entropy-buffers=5 low-latency=0 ! queue max-size-bytes=0 ! fpsdisplaysink text-overlay=false video-sink="kmssink bus-id="a0070000.v_mix" plane-id=37 render-rectangle=<1920,1080,1920,1080> hold-extra-sample=1 show-preroll-frame=false sync=true" sync=true -v
1.7 Build Flow
Refer to the below link to download the VCU Multi Stream ROI TRD package.
Unzip the released package.
unzip </path/to/downloaded/zipfile>/rdf0617-zcu106-vcu-multi-stream-roi-2021-2.zip
The following tutorials assume that the $TRD_HOME environment variable is set as shown below.
export TRD_HOME=</path/to/downloaded/zipfile>/rdf0617-zcu106-vcu-multi-stream-roi-2021-2
1.7.1 Hardware Build Flow
This section explains the steps to build the hardware platform and generate XSA using the Vivado tool.
Refer to the Vivado Design Suite User Guide: Using the Vivado IDE, UG893, for setting up the Vivado environment.
Refer to the Vivado Design Suite User Guide: Release Notes, Installation, and Licensing (UG973) for installation.
Make sure that the necessary IP licenses are in place.
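Since the build steps rely on $TRD_HOME pointing at the extracted package, a quick sanity check before starting can save a failed run. The sketch below only verifies that the four top-level directories shown in the package layout (apu, dpu, image, pl) exist; the function name is hypothetical and not part of the TRD scripts.

```shell
#!/bin/sh
# Verify that a directory looks like the extracted TRD package by checking
# for the top-level directories listed in the package layout.
check_trd_home() {
    root=$1
    for d in apu dpu image pl; do
        if [ ! -d "$root/$d" ]; then
            echo "missing directory: $root/$d" >&2
            return 1
        fi
    done
    return 0
}

# Example: only enter the hardware project folder if the layout looks right.
# check_trd_home "$TRD_HOME" && cd "$TRD_HOME/pl"
```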
On Linux:
Open a Linux terminal
Change directory to the $TRD_HOME/pl folder
Source the Vivado settings script
source <path/to/Vivado-installer>/tool/Vivado/2021.2/settings64.sh
Run the following command to create the Vivado IPI project, invoke the GUI, and generate the XSA required for the platform
vivado -source ./designs/zcu106_Quad_Sensor_ROI/project.tcl
The project.tcl script does the following:
Creates the project in the ../pl/build/zcu106_Quad_Sensor_ROI directory
Creates the IPI Block design with platform interfaces
Runs Synthesis and Implementation
Builds bitstream with no accelerators
Exports the HW to XSA (zcu106_Quad_Sensor_ROI_wrapper.xsa)
zcu106_Quad_Sensor_ROI_wrapper.xsa is stored at location $TRD_HOME/pl/build/zcu106_Quad_Sensor_ROI/zcu106_Quad_Sensor_ROI.xsa/
This XSA is used by PetaLinux for platform creation and also by the Vitis tool for DPU Kernel integration.
After executing the script, the Vivado IPI block design comes up as shown in the below figure.
The Platform Setup tab has the settings and AXI Ports, as shown in the below image
1.7.1.1 Platform Interfaces
The screenshots below show the platform interfaces that have been made available to the Vitis tool for linking the acceleration IP dynamically.
In the case of this reference design, the DPU Kernel will be inserted.
After the DPU Kernel is integrated dynamically with the platform using Vitis Flow, the connections are as shown below:
The DPU Data ports are connected to the HP0 port (S_AXI_HP0_FPD) of the PS
The DPU Instruction port is connected to the S_AXI_HPC1 port of the PS
The DPU S_AXI_Control port is connected to the M_AXI_HPM0_LPD port of the PS through interconnect_hpm0_lpd
The DPU interrupt is connected to the AXI interrupt controller dynamically
1.7.2 PetaLinux build Flow
This tutorial shows how to build the Linux image and boot image using the PetaLinux build tool.
PetaLinux Installation: Refer to the PetaLinux Tools Documentation (UG1144) for installation.
Kernel patches Documentation: Refer to this article for the kernel patches required for the ZCU106 VCU Multi-Stream ROI TRD using Avnet Quad Sensor BSP.
It is recommended to follow the build steps in sequence
Source the PetaLinux settings.sh
source <path/to/petalinux-installer>/tool/petalinux-v2021.2-final/settings.sh
Create the PetaLinux project
cd $TRD_HOME/apu/vcu_petalinux_bsp
petalinux-create -t project -s xilinx-vcu-multi-stream-roi-zcu106-v2021.2-final.bsp
Configure the PetaLinux project
cd xilinx-vcu-multi-stream-roi-zcu106-v2021.2-final
petalinux-config --silentconfig --get-hw-description=<Path to directory of XSA>
© Copyright 2019 - 2022 Xilinx Inc.