Xilinx Zynq UltraScale+ MPSoC Video Codec Unit

Overview

The Xilinx Zynq UltraScale+ MPSoC Video Codec Unit (VCU) provides multi-standard video encoding and decoding, covering the High Efficiency Video Coding (HEVC, i.e. H.265) and Advanced Video Coding (AVC, i.e. H.264) standards. The VCU software stack consists of a custom kernel module and a custom user-space library known as Control Software (CtrlSW). The OpenMAX IL (OMX) layer is integrated on top of CtrlSW, and the GStreamer framework is used to integrate the OMX-IL component with other multimedia elements.

OpenMAX™ is a cross-platform API that provides comprehensive streaming media codec and application portability by enabling accelerated multimedia components. GStreamer is a cross-platform, open-source multimedia framework that provides the infrastructure to integrate multiple multimedia components and create pipelines.

Users can develop applications at any of the three levels: CtrlSW, OMX-IL, or GStreamer.

VCU Software Stack

 

Supported Features

  • Multi-standard encoding/decoding support, including:

    • Advanced Video Coding (AVC) H.264

    • High Efficiency Video Coding (HEVC) H.265

    • HEVC Main profiles, level up to 5.1, High Tier, 4kp60 encoding/decoding

    • AVC BP/MP/HP, level up to 5.2, 4kp60 encoding/decoding

  • Multi-stream encoding/decoding of up to eight 1080p30 streams

  • 420/422 chroma-format support

  • 8/10 bit encoding/decoding

  • Slice level Encoding/Decoding

  • Flexible rate control: CBR, VBR, constant QP, and low-latency RC (optimized for streaming applications)

  • On-the-fly change of multiple encoding parameters

    • Target bitrate

    • GOP length

    • Number of B frames

    • Insertion of IDR picture

  • Region of Interest based Encoding (ROI)
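
As a rough sanity check on the figures above, the single-stream 4kp60 limit and the eight-stream 1080p30 limit correspond to the same aggregate pixel rate. The sketch below is plain arithmetic, not a VCU API: it assumes that "4k" means 3840x2160 and that aggregate luma pixels per second is the quantity being compared. The function name is illustrative only.

```python
def pixel_rate(width: int, height: int, fps: int, streams: int = 1) -> int:
    """Return the aggregate luma pixels per second for identical streams."""
    return width * height * fps * streams


# One 4kp60 stream (assumed to be 3840x2160 at 60 frames per second).
single_4kp60 = pixel_rate(3840, 2160, 60)

# Eight 1080p30 streams (1920x1080 at 30 frames per second each).
eight_1080p30 = pixel_rate(1920, 1080, 30, streams=8)

print(single_4kp60)                   # 497664000
print(eight_1080p30)                  # 497664000
print(single_4kp60 == eight_1080p30)  # True
```

Other stream mixes can be compared against the same budget in the same way, although real capacity may also depend on factors this arithmetic ignores, such as memory bandwidth and encoder settings.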

 

API information and Example pipelines

 

Refer to the VCU product guide for more details on example pipelines, API information, encoder/decoder application flow charts, encoder parameters, etc.:
H.264/H.265 Video Codec Unit v1.2

Source code repos

VCU:

Gstreamer:

CMA size setting instructions for Yocto and PetaLinux users:

  • Using kernel menuconfig option

    1. Run kernel menuconfig using one of the commands below:

      1. PetaLinux: "petalinux-config -c kernel"

      2. Yocto: "bitbake -f virtual/kernel -c menuconfig"

    2. Go to Device Drivers → Generic Driver options → DMA Contiguous Memory Allocator

    3. Set the default contiguous memory size to a value suited to your use case

    4. Save the configuration, exit menuconfig, and build the images

  • Using bootargs at the U-Boot prompt

    1. During boot, stop at the U-Boot prompt by pressing any key repeatedly, then set a custom CMA size using the commands below:

      • "setenv bootargs $bootargs cma=1500m"

      • boot

  • Using custom uEnv.txt for Yocto

    1. The Yocto build generates a custom uEnv.txt file at work/build/tmp/deploy/images/zcu106-zynqmp, which can be edited to set a custom CMA size.

    2. Modify the bootargs option in uEnv.txt to set the required CMA size, as below:

      • "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait cma=1500m"

    3. Copy the uEnv.txt containing the CMA bootargs setting to the boot partition (i.e. the first partition) of the SD card and boot the board.
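
The uEnv.txt edit described above amounts to replacing or appending a `cma=` token on the `bootargs=` line. The Python sketch below shows one way to script that edit on the build host. The helper names are hypothetical, and it assumes the file carries a single `bootargs=` line like the example above. A second helper reads the `CmaTotal` field that Linux reports in /proc/meminfo when CMA is enabled, which can be used on the target to check the value after boot.

```python
from typing import Optional


def set_cma(bootargs_line: str, size: str) -> str:
    """Return the bootargs= line with exactly one cma=<size> argument."""
    key, sep, value = bootargs_line.rstrip("\n").partition("=")
    if key != "bootargs" or not sep:
        raise ValueError("expected a line of the form 'bootargs=...'")
    # Drop any existing cma= argument, then append the requested one.
    args = [arg for arg in value.split() if not arg.startswith("cma=")]
    args.append(f"cma={size}")
    return "bootargs=" + " ".join(args)


def cma_total_kib(meminfo_text: str) -> Optional[int]:
    """Return the CmaTotal value (in kB) from /proc/meminfo text, if present."""
    for line in meminfo_text.splitlines():
        if line.startswith("CmaTotal:"):
            return int(line.split()[1])
    return None


line = "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait"
print(set_cma(line, "1500m"))
```

On the target, `cma_total_kib(open("/proc/meminfo").read())` returns the size the kernel actually reserved, or `None` if the kernel does not report a `CmaTotal` field.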

Device Tree Binding

If the core is configured in the hardware design, the device tree node is generated automatically by the Device Tree BSP.

The steps to generate the device tree are documented here:
http://www.wiki.xilinx.com/Build+Device+Tree+Blob

A sample binding is shown below; the DT properties are described in Documentation/devicetree/bindings/clock/xlnx,vcu.txt

Known Issues

  • AR66763 - LogiCORE H.264/H.265 Video Codec Unit (VCU) - Release Notes and Known Issues for the Vivado 2017.3 tool and later versions

2022.2 Release

New Feature Support:

  • Dynamic insertion of IDR frames in low-latency mode

Bug Fixes:

  • Fixed race condition when destroying the channel

  • Fixed incorrect computation of iMaxSlices on the decoder side, which caused several channel variables to be computed incorrectly and the decoder to crash

  • Handled SDI input cable disconnect and reconnect scenarios based on video locking/unlocking events

  • Fixed NULL pointer dereference in the ALSA framework while setting DAI data for Xilinx DP audio-based use cases

Known Issues:

| S.No | Issue Description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | Audio loss/distortion observed with the audio-only use case and with audio+video in LLP2 serial pipelines during long runs | No workaround available | NA |
| 2 | Memory leak observed with decoder app | No workaround available | NA |
| 3 | Fps drops observed with the following use cases: (a) LLP2 2x-4kp30 XV20 AVC serial use case; (b) LLP2 4x-1080p60 XV20 AVC (2x-1080p60 serial + 2x-1080p60 cap->enc->fakesink) pipeline; (c) YUV444 4kp30 10-bit (X403) AVC streaming use case. In cases (a) and (b), the pipeline takes longer to stabilize at start (around 3-4 seconds). | No workaround available | NA |
| 4 | VCU decoder hangs and cannot recover if the input stream exceeds its capability a few times | No workaround available | NA |
| 5 | Broken bitstream observed with the h264 parser when decoding specific streams | No workaround available | NA |

2022.1 Release

New Feature Support:

  • DMA fd support for VCU Encoder output.

  • YUV444 support for encoder and decoder using Xilinx custom solution.

Bug Fixes:

  • Fixed memory leak observed with GStreamer-based low-latency VCU pipelines.

  • Release the DP reset before accessing DP registers. Accessing DP registers without releasing the reset hangs the system.

Known Issues:

| S.No | Issue Description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | Audio loss/distortion observed with the audio-only use case and with audio+video in LLP2 serial pipelines during long runs | No workaround available | NA |
| 2 | SIGINT does not work properly when killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA |
| 3 | Fps drops observed with the following use cases: (a) LLP2 2x-4kp30 XV20 AVC serial use case; (b) LLP2 4x-1080p60 XV20 AVC (2x-1080p60 serial + 2x-1080p60 cap->enc->fakesink) pipeline; (c) YUV444 4kp30 10-bit (X403) AVC streaming use case. In cases (a) and (b), the pipeline takes longer to stabilize at start (around 3-4 seconds). | No workaround available | NA |
| 4 | ctrlsw encoder application does not close properly after running rigorously in a loop (1000+ iterations) | Sending another SIGINT_KILL or Ctrl+C closes the application | NA |

2021.2 Release

New Feature Support:

  • 4 byte start code for all slices.

Bug Fixes:

  • Fixed: 4K HEVC Encoder with skip-frames enabled produces invalid bitstream syntax

  • Fixed: Scheduler changes for skip-frame issue

  • Fixed: MCU trace error reported when receiving RTSP stream

  • Fixed: Memory leak in gst-plugins-bad

  • Fixed: High decoder latency when receiving RTSP multicast stream

  • Fixed: DPDMA example shows black screen with 1080p display monitors

  • Fixed: Handling of the case where the mixer cannot scale to application-requested dimensions

  • Fixed: Allow media pipeline to be enabled with a single DMA start

Known Issues:

 

| S.No | Issue Description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | SIGINT does not work properly when killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA |
| 2 | Fps drops observed with LLP2 2-4kp30 XV20 AVC serial use case. The pipeline takes longer to stabilize (around 3-4 seconds). | No workaround available | NA |
| 3 | ctrlsw encoder application does not close properly after running rigorously in a loop (1000+ iterations) | Sending another SIGINT_KILL or Ctrl+C closes the application | NA |
| 4 | Audio loss/distortion observed with the audio-only use case and with audio+video in LLP2 serial pipelines during long runs | No workaround available | NA |
| 5 | Interlaced file playback shows distortion on the display in the long run | No workaround available | NA |
| 6 | PS_DP is not sending the correct channel_status message for audio | No workaround available | NA |

2021.1 Release

New Feature Support:

  • Gstreamer version upgraded to 1.16.3

  • GRAY8/GRAY10 format support is added for VCU encoder/decoder at Gstreamer

  • Dynamic IDR insertion support is added for Pyramidal GOP

  • Added external CRTC (e.g. PL video mixer) support to the PS-DP subsystem.

  • Uniform slice_type parameter support is added for VCU encoder

  • Vertical alignment/crop on input buffers is supported for VCU encoder

    • Used in 486i to 480i conversion.

Bug Fixes:

  • Fixed V4L2 mem2mem driver load/reload issue.

  • Fixed gstreamer kmssink to display full screen mode for 4k wider monitors.

  • Fixed DMA driver to capture and encode resolutions that are not aligned to 32, e.g. 1400x1050.

  • Fixed overwriting SEI messages on BP and PT SEI messages.

  • Fixed kmssink to display planar 420 I420 video using PS_DP.

  • LLP2:

    • Fixed issues related to the 720p resolution capture + encode use case

    • Fixed display ripple effects and issues with starting/stopping live sources on the fly.

Known Issues:

 

| S.No | Issue Description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | SIGINT does not work properly when killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA |
| 2 | VCU encoder MCU trace error is reported when receiving an erroneous RTSP stream | No workaround available | Need to handle error concealment for such errors |
| 3 | Interlaced file playback shows distortion on the display in the long run | No workaround available | NA |
| 4 | 4x1080p60 decode (PL_DDR) → display use case shows occasional frame drops when B-frames are present in the input video | Use b-frames=0 | NA |

 

2020.2 Release

New Feature Support:

  • HDR10 is supported for capture, VCU encode/decode and display at gstreamer level

  • Added max-consecutive-skip parameter to VCU encoder

    • Allows users to specify a maximum number of consecutive skipped frames

  • Interlaced video support added for SCD MM

  • NTSC 4:2:0 interlace support is added to VCU encoder/decoder

  • Interlaced video/audio support is validated

  • Added example for passing DMA buffers between the VCU and appsrc/appsink

Bug Fixes:

  • Fixed XAVC compliance errors

  • Fixed incorrect POC on skipped interlaced frames

  • Fixed LLP2 encoder latency issues

  • Fixed frame drop issue with AVC, HIGH Profile 4kp60 with num-slices=16

  • Fixed coverity check errors in V4L2 and DRM

  • Fixed gstreamer parser bugs which caused crashes with third party video files

    • Fixed issue with improper handling of prefix NALs

Known Issues:

| S.No | Issue Description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | Use b-frames=0 | NA |
| 2 | LLP2: 1280x720 XV20 (422 10 bit) doesn't work | Fix available, AR will be published | NA |
| 3 | Memory leak when using v4l2src | Fix available, AR will be published | NA |
| 4 | LLP2: Switching live source resolutions on the fly or removing and re-inserting live source cable when VCU encoder LLP2 pipeline is running, causes hang for next immediate pipeline launch. | Remove and re-insert VCU kernel modules when it happens | NA |
| 5 | LLP2: Ripple effect on display | Fix available, AR will be published | NA |

2020.1 Release

New Feature Support:

  • Gstreamer version upgraded to 1.16.1

  • Added support for HDR10 metadata insertion and extraction at control software level

  • Enabled buffer metadata to indicate encoder-skipped frames to the application at control software level

    • Users can determine whether or not a frame is skipped by calling "AL_Buffer_GetMetaData()" and extracting the bSkipped flag

  • Added custom ROI delta-qp (roi-by-value) support to encoder.

  • LLP2 Video + Audio pipeline support is verified.

Bug Fixes:

  • Fixed gstreamer parser bugs which caused crashes with third party video files

    • Fixed picture timing SEI parsing

    • Fixed crash caused by incomplete AUs

    • Fixed interlaced transport stream parsing

  • Fixed gradual horizontal/vertical line sweep when using GDR mode

    • Provided custom gstreamer application (zynqmp_relative_qp_insertion) to improve visual quality by modifying delta-qp, alpha, and beta offsets 

  • Fixed coverity check errors in control software

  • Fixed target bitrate in interlaced mode

  • XAVC SEI message buffer size is increased to avoid video hang/corruption for AVC encoder.

Known Issues:

| S.No | Issue Description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | 4x1080p60 decode (PL_DDR) → display use case shows occasional frame drops when B-frames are present in the input video | No workaround | NA |
| 2 | LLP2: Higher encoder latency is observed in the 4x 1080p60 use case. An extra 3 to 4 msec is observed, which may lead to occasional frame drops or lower fps. | Use extra processing-deadline for the gst-pipeline sink; it may reduce frame drops to some extent | NA |
| 3 | Frame drops observed in serial/streaming use case for AVC, HIGH Profile 4kp60 with num-slices=16 | Use num-slices=8 for slice encoding | NA |
| 4 | Switching live source resolutions on the fly, or removing and re-inserting the live source cable, while a VCU encoder LLP2 pipeline is running causes a hang on the immediately following pipeline launch. | Remove and re-insert the VCU kernel modules when it happens | NA |

2019.2 Release

New Feature Support:

  • ZDMA copy is supported for VCU encoder output buffer reconstruction, which helps reduce CPU load for encode use cases.

  • Added IntraMB forcing at block level (encoder) through external QP table and ROI map

  • Added support for 15 B-frames in the pyramidal GOP structure

  • Added dynamic resolution change support without port reconfiguration at GStreamer level; the previous release supported it at VCU control software level.

  • Added a sample GStreamer test application to showcase the 32x stream video transcoding use case.

  • Added support for generating separate codec-config data for the VCU encoder, useful in the Android framework

  • Added max-picture-sizes control based on frame type <I, P, B>

  • Added support for the following XAVC profiles:

    1. AL_PROFILE_XAVC_HIGH10_INTRA_CBG

    2. AL_PROFILE_XAVC_HIGH10_INTRA_VBR

    3. AL_PROFILE_XAVC_HIGH_422_INTRA_CBG

    4. AL_PROFILE_XAVC_HIGH_422_INTRA_VBR

    5. AL_PROFILE_XAVC_LONG_GOP_MAIN_MP4

    6. AL_PROFILE_XAVC_LONG_GOP_HIGH_MP4

    7. AL_PROFILE_XAVC_LONG_GOP_HIGH_MXF

    8. AL_PROFILE_XAVC_LONG_GOP_HIGH_422_MXF

  • Added support for an external rate control plugin at MCU level

    • Enables users to develop their own rate control and plug it into the VCU firmware.

  • Added support for loading an external QP map at GStreamer level

  • Added support for transferring decoder pts/dts data using one-to-one buffer mapping instead of FIFO at application level.