Zynq UltraScale+ MPSoC VCU Single Sensor ROI 2020.1
This page provides all the information related to the VCU Single Sensor ROI design.
Table of Contents
- 1 Overview
- 2 Other Information
- 3 Appendix A - Input Configuration File (input.cfg)
- 4 Appendix B - MIPI-Rx/HDMI-Tx Link-up and GStreamer Commands
1 Overview
The primary goal of this VCU Single Sensor ROI design is to demonstrate the use of the DPU (Deep Learning Processor Unit) block to extract the ROI (Region of Interest) from input video frames, and to use this information to perform ROI-based encoding with the VCU (Video Codec Unit) encoder hard block present in Zynq UltraScale+ EV devices.
The design serves as a platform to accelerate Deep Neural Network inference algorithms using the DPU and to demonstrate the ROI feature of the VCU encoder. The design uses a deep Convolutional Neural Network (CNN) named Densebox, running on the DPU, to extract ROI information (faces, in this case).
The design uses the Vivado IPI flow to build the hardware platform and the Xilinx Yocto PetaLinux flow for the software design. It uses Xilinx IP and software drivers to demonstrate the capabilities of the different components. The Vitis platform is created from the Vivado/PetaLinux build artifacts, and the Vitis acceleration flow is then used to insert the DPU into the platform to create the final bitstream.
The following figure shows one of the use cases (serial pipeline) with face detection and enhanced ROI quality on ZCU106.
Serial: Face detection and enhanced ROI quality on ZCU106
The following figure shows one of the use cases (streaming pipeline) with face detection and enhanced ROI quality on ZCU106.
Streaming: Face detection and enhanced ROI quality on ZCU106
1.1 System Architecture
The following figure shows the system-level diagram, which includes the components of the evaluation board.
1.2 Hardware Architecture
This section gives a detailed description of the blocks used in the hardware design. The functional block diagram of the design is shown in the figure below.
There are seven primary sections in the design.
MIPI Capture Pipeline:
- Captures video frame buffers from the capture source in 4K resolution, NV12 format
- Writes the buffers into DDR memory with the Frame Buffer Write IP
Multi-scaler Block:
- Reads the video buffers from DDR memory
- Scales the buffers down to 640x360 (the size suitable for the DPU)
- Converts the format from NV12 to BGR
- Writes the downscaled buffers to DDR memory
DPU Block:
- Reads the downscaled buffers from DDR memory
- Runs the Densebox algorithm to generate the ROI information for each frame buffer
- Passes the ROI information to the VCU encoder
VCU Encoder:
- Reads the 4K NV12 buffers from DDR memory
- Receives the ROI metadata from the DPU IP
- Encodes the video buffers based on the ROI information
- Writes the encoded stream to DDR memory
PS GEM:
- Reads the encoded stream from DDR memory
- Streams out the encoded stream via Ethernet
VCU Decoder:
- Decodes the received encoded frames and writes them to memory
HDMI-Tx:
- Displays the decoded frames on the HDMI display
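Because the DPU works on a 640x360 frame while the VCU encoder works on the 4K frame, ROI rectangles found in the downscaled frame have to be mapped back to 4K coordinates before they are meaningful to the encoder. The following Python sketch illustrates that arithmetic only; it is not taken from the design's software. It assumes that "4K" means 3840x2160, that an ROI is an axis-aligned rectangle given as (x, y, width, height) in pixels, and the names `Roi`, `scale_roi`, `nv12_frame_bytes` and `bgr_frame_bytes` are hypothetical helpers introduced here for illustration.

```python
from dataclasses import dataclass

# Assumption: "4K" in the text is taken to mean 3840x2160.
ENC_W, ENC_H = 3840, 2160
# From the text: the multi-scaler produces 640x360 frames for the DPU.
DPU_W, DPU_H = 640, 360


@dataclass(frozen=True)
class Roi:
    """Hypothetical ROI rectangle in pixels (top-left corner plus size)."""
    x: int
    y: int
    width: int
    height: int


def scale_roi(roi, src=(DPU_W, DPU_H), dst=(ENC_W, ENC_H)):
    """Map an ROI from the source frame size to the destination frame size.

    Both corners are scaled and clamped to the destination frame, so a box
    that touches the edge of the small frame cannot spill outside the
    large one.
    """
    sx = dst[0] / src[0]
    sy = dst[1] / src[1]
    x0 = max(0, min(dst[0], round(roi.x * sx)))
    y0 = max(0, min(dst[1], round(roi.y * sy)))
    x1 = max(0, min(dst[0], round((roi.x + roi.width) * sx)))
    y1 = max(0, min(dst[1], round((roi.y + roi.height) * sy)))
    return Roi(x0, y0, x1 - x0, y1 - y0)


def nv12_frame_bytes(width, height):
    """NV12 at 8 bits per sample: full-size Y plane plus a half-size interleaved UV plane."""
    return width * height * 3 // 2


def bgr_frame_bytes(width, height):
    """Packed BGR at 8 bits per channel: three bytes per pixel."""
    return width * height * 3


if __name__ == "__main__":
    face = Roi(100, 50, 64, 64)               # a box in the 640x360 frame
    print(scale_roi(face))                    # the same box in the 3840x2160 frame
    print(nv12_frame_bytes(ENC_W, ENC_H))     # bytes per 4K NV12 frame
    print(bgr_frame_bytes(DPU_W, DPU_H))      # bytes per 640x360 BGR frame
```

Under these assumptions the scale factor is 6 in both directions, so the aspect ratio of each ROI is preserved. The two size helpers give a rough sense of the DDR traffic per frame on the capture side (NV12) versus the DPU side (BGR).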
This design supports the following video interfaces: