Zynq UltraScale+ MPSoC VCU TRD 2022.1 - Xilinx Low Latency PS DDR NV12 HDMI Audio Video Capture and Display

This page describes Design Module 7 of the VCU TRD: the Xilinx low latency (LLP2) PS DDR NV12 HDMI Audio Video Capture and Display design.

1 Overview

This module captures video and audio data from an HDMI-Rx subsystem implemented in the PL. The video and audio data can be displayed through the HDMI-Tx subsystem, also implemented in the PL. The module can stream live captured video and audio out and in through an Ethernet interface at ultra-low latency using the Sync IP. For the NV12 pixel format, it supports four video streams, using an AXI broadcaster on the capture side and a mixer on the display side. It also supports single-stream audio.

The VCU encoder and decoder operate in slice mode. Each input frame is divided horizontally into multiple slices (8 or 16). The encoder generates a slice_done interrupt at the end of every slice, so the generated NAL unit data can be passed to a downstream element immediately, without waiting for the frame_done interrupt. Likewise, the VCU decoder starts processing as soon as one slice of data is ready in its circular buffer instead of waiting for the complete frame. The Sync IP tracks AXI transactions so that the producer and consumer are synchronized at the granularity of AXI transactions rather than whole video buffers. It is responsible for synchronizing buffers between the capture DMA and the VCU encoder, because both work on the same buffer.
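To give a rough sense of why slice mode lowers latency, the following sketch computes how often a slice becomes available when a frame is split into 8 or 16 slices. It is an idealized calculation, not taken from the TRD: the function names are hypothetical, and it assumes lines arrive at a uniform rate over the whole frame period (blanking intervals and encoder processing time are ignored).

```python
def slice_period_ms(fps: float, num_slices: int) -> float:
    """Idealized time between consecutive slice completions, in milliseconds.

    Assumption: the frame period is spread evenly across all slices.
    """
    if fps <= 0 or num_slices <= 0:
        raise ValueError("fps and num_slices must be positive")
    frame_period_ms = 1000.0 / fps
    return frame_period_ms / num_slices


def rows_per_slice(frame_height: int, num_slices: int) -> float:
    """Average number of picture rows covered by one horizontal slice.

    The result may be fractional; real encoders align slices to coding
    block rows, which this sketch does not model.
    """
    if frame_height <= 0 or num_slices <= 0:
        raise ValueError("frame_height and num_slices must be positive")
    return frame_height / num_slices


if __name__ == "__main__":
    for slices in (8, 16):
        print(
            f"2160 rows at 60 fps, {slices} slices: "
            f"{rows_per_slice(2160, slices):.1f} rows per slice, "
            f"one slice every {slice_period_ms(60, slices):.3f} ms"
        )
```

Under these assumptions, a downstream element waiting for slice_done can start work after roughly one slice period instead of one full frame period.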

The capture element (FB write DMA) writes video buffers in raster-scan order. The Sync IP monitors the buffer level while the capture element writes into DRAM. It allows the encoder to read input buffer data only if the requested data has already been written by the DMA; otherwise it blocks the encoder until the DMA completes its writes. On the decoder side, the VCU decoder writes decoded video buffer data into DRAM in block-raster scan order, and the display reads the data in raster-scan order. To avoid display under-run, software maintains a phase difference of approximately frame_period/2 so that the decoder stays ahead of the display.
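The gating rule described above can be illustrated with a small software model. This is only a toy sketch: the real Sync IP tracks AXI transactions in hardware, and the class `SyncGate`, its methods, and the helper `display_phase_offset_ms` are hypothetical names introduced here for illustration. The model assumes the producer fills a single buffer linearly from offset 0.

```python
class SyncGate:
    """Toy model of producer/consumer gating on one shared frame buffer."""

    def __init__(self, buffer_size: int) -> None:
        if buffer_size <= 0:
            raise ValueError("buffer_size must be positive")
        self.buffer_size = buffer_size
        self.bytes_written = 0  # producer progress, counted from offset 0

    def producer_write(self, num_bytes: int) -> None:
        """Record that the capture side wrote num_bytes more bytes."""
        if num_bytes < 0:
            raise ValueError("num_bytes must be non-negative")
        self.bytes_written = min(self.buffer_size, self.bytes_written + num_bytes)

    def consumer_may_read(self, offset: int, length: int) -> bool:
        """Return True if the requested range has been fully written.

        A False result corresponds to the encoder being blocked until
        the producer catches up.
        """
        if offset < 0 or length <= 0 or offset + length > self.buffer_size:
            raise ValueError("read range is outside the buffer")
        return offset + length <= self.bytes_written


def display_phase_offset_ms(fps: float) -> float:
    """Half of the frame period in milliseconds (the ~frame_period/2 lead)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps / 2.0


if __name__ == "__main__":
    gate = SyncGate(buffer_size=4096)
    gate.producer_write(1024)
    print(gate.consumer_may_read(0, 512))     # range already written
    print(gate.consumer_may_read(1024, 512))  # range not yet written
    print(f"{display_phase_offset_ms(60):.2f} ms")
```

In this model, a read request is granted only when its whole range lies behind the producer's write position, which mirrors the behavior described for the encoder input buffer. At 60 fps the half-frame-period lead works out to about 8.33 ms.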

This design supports the following video interfaces:

Sources:

  • HDMI-Rx capture pipeline implemented in the PL.

  • Stream-In from network or internet.

Sinks:

  • HDMI-Tx display pipeline implemented in the PL.

VCU Codec:

  • Video Encode/Decode capability using VCU hard block in PL

    • AVC/HEVC encoding

    • Encoder/decoder parameter configuration.

Video format:

  • NV12

Supported Resolutions:

The table below lists the supported resolutions for this design.

| Resolution | Command Line: Single Stream | Command Line: Multi-stream |
|---|---|---|
| 4kp60 | √ | NA |
| 4kp30 | √ | √ (Max 2) |
| 1080p60 | √ | √ (Max 4 for encoder) (Max 2 for decoder) |

√ - Supported
NA – Not applicable
x – Not supported

In Low Latency mode (LLP1/LLP2), the number of encoder and decoder streams is limited by the number of internal cores. The encoder supports a maximum of four streams and the decoder a maximum of two streams.
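The stream limits from the resolution table and the note above can be captured in a small lookup for sanity-checking a planned configuration. This is a hypothetical helper, not part of the TRD software. It assumes that "Max 2" at 4kp30 applies to both encoder and decoder, and that 4kp60 is limited to a single stream because multi-stream is marked NA.

```python
# Maximum concurrent streams per resolution in low latency mode.
# Values are transcribed from the table above; the 4kp60 and 4kp30
# entries rely on the assumptions stated in the lead-in.
MAX_STREAMS = {
    "4kp60": {"encoder": 1, "decoder": 1},
    "4kp30": {"encoder": 2, "decoder": 2},
    "1080p60": {"encoder": 4, "decoder": 2},
}


def streams_supported(resolution: str, role: str, count: int) -> bool:
    """Return True if `count` streams fit within the limit for this role.

    `role` is "encoder" or "decoder". Unknown resolutions or roles raise
    KeyError so that typos are not silently accepted.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return count <= MAX_STREAMS[resolution][role]


if __name__ == "__main__":
    print(streams_supported("1080p60", "encoder", 4))
    print(streams_supported("1080p60", "decoder", 3))
    print(streams_supported("4kp60", "encoder", 2))
```

A check like this could run before launching pipelines, so that an unsupported stream count is rejected early rather than failing at runtime.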

The table below lists the features supported in this design.

| Pipeline | Video Input Source | Audio Input Source | Video Format | Video Output Type | Audio Output Type | Resolution | VCU Codec |
|---|---|---|---|---|---|---|---|
| Serial pipeline (Capture -> Encode -> Decode -> Display) | HDMI-Rx | HDMI-Rx | NV12 | HDMI-Tx | HDMI-Tx | 4kp60/4kp30/1080p60 | HEVC/AVC |
| Stream-Out pipeline (Capture -> Encode -> Stream-out) | HDMI-Rx | HDMI-Rx | NV12 | Stream-Out | Stream-Out | 4kp60/4kp30/1080p60 | HEVC/AVC |
| Stream-in pipeline (Stream-in -> Decode -> Display) | Stream-In | Stream-In | NV12 | HDMI-Tx | HDMI-Tx | 4kp60/4kp30/1080p60 | HEVC/AVC |

The below figure shows the Xilinx Low Latency PS DDR NV12 HDMI Audio Video Capture and Display design hardware block diagram.

The below figure shows the Xilinx Low Latency PS DDR NV12 HDMI Audio Video Capture and Display design software block diagram.

1.1 Board Setup

Refer to the below link for Board Setup