Xilinx Zynq UltraScale+ MPSoC Video Codec Unit


Overview

The Xilinx Zynq UltraScale+ MPSoC Video Codec Unit (VCU) provides multi-standard video encoding and decoding, supporting the High Efficiency Video Coding (HEVC, i.e., H.265) and Advanced Video Coding (AVC, i.e., H.264) standards. The VCU software stack consists of a custom kernel module and a custom user-space library known as the Control Software (CtrlSW). The OpenMAX IL (OMX) layer is integrated on top of CtrlSW, and the GStreamer framework is used to integrate the OMX-IL component with other multimedia elements.

OpenMAX™ is a cross-platform API that provides comprehensive streaming media codec and application portability by enabling accelerated multimedia components. GStreamer is a cross-platform, open-source multimedia framework that provides the infrastructure to integrate multiple multimedia components and create pipelines.

Users can develop applications at any of the three levels: CtrlSW, OMX-IL, or GStreamer.

VCU Software Stack

Supported Features

  • Multi-standard encoding/decoding support, including:
    • Advanced Video Coding (AVC) H.264
    • High Efficiency Video Coding (HEVC) H.265
    • HEVC Main profiles, levels up to 5.1, High Tier, 4kp60 encoding/decoding
    • AVC BP/MP/HP, levels up to 5.2, 4kp60 encoding/decoding
  • Supports multi-stream encoding/decoding of up to 8 streams (1080p30)
  • 420/422 chroma-format support
  • 8/10 bit encoding/decoding
  • Slice level Encoding/Decoding
  • Flexible rate control: CBR, VBR, Constant QP, Low-latency RC (optimized for streaming applications)
  • On-the-fly change of multiple encoding parameters
    • Target bitrate
    • GOP length
    • Number of B frames
    • Insertion of IDR picture
  • Region of Interest based Encoding (ROI)
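
The single-stream 4kp60 figure and the 8-stream 1080p30 figure in the list above correspond to the same aggregate pixel rate, which a quick calculation illustrates. This is only a sketch: it assumes "4k" means 3840x2160 (UHD), and the helper name `pixel_rate` is illustrative and not part of any VCU API.

```python
def pixel_rate(width, height, fps, streams=1):
    """Return the aggregate number of pixels per second for identical streams."""
    return width * height * fps * streams


# Assumption: "4kp60" is taken to mean 3840x2160 at 60 frames per second.
single_4kp60 = pixel_rate(3840, 2160, 60)

# Eight concurrent 1080p30 streams.
eight_1080p30 = pixel_rate(1920, 1080, 30, streams=8)

print(single_4kp60, eight_1080p30)  # 497664000 497664000
```

Under that assumption both configurations come to 497,664,000 pixels per second, so other stream mixes can be sanity-checked against the same budget before consulting the product guide for the officially supported combinations.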

API information and Example pipelines


Refer to the VCU product guide for more details on example pipelines, API information, encoder/decoder application flow charts, encoder parameters, and so on.
H.264/H.265 Video Codec Unit v1.2 

Source code repos

VCU:

Gstreamer:

CMA size setting instructions for Yocto and PetaLinux users:

  • Using kernel menuconfig option

    1. Run kernel menuconfig using one of the commands below
      1. PetaLinux: "petalinux-config -c kernel"
      2. Yocto: "bitbake -f virtual/kernel -c menuconfig"
    2. Go to Device Drivers → Generic Driver Options → DMA Contiguous Memory Allocator
    3. Set the default contiguous memory size to a value suited to your use case
    4. Save the configuration, exit menuconfig, and build the images
  • Using bootargs at the U-Boot prompt

    1. During boot-up, stop at the U-Boot prompt by pressing any key, then set a custom CMA size using the commands below
      • "setenv bootargs $bootargs cma=1500m"
      • boot
  • Using a custom uEnv.txt for Yocto

    1. The Yocto build generates a uEnv.txt file at work/build/tmp/deploy/images/zcu106-zynqmp, which can be edited to set a custom CMA size.
    2. Modify the bootargs option in uEnv.txt to set the required CMA size, as below
      • "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait cma=1500m"
    3. Copy the uEnv.txt containing the CMA bootargs setting to the boot partition (i.e., the first partition) of the SD card and boot the board.
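
The uEnv.txt edit described above can also be scripted. The following is a minimal sketch, assuming the file contains a single line starting with "bootargs=" as in the example; the function name `set_cma_bootarg` and the starting value `cma=700m` are hypothetical and used only for illustration.

```python
def set_cma_bootarg(uenv_text, size="1500m"):
    """Return uEnv.txt content with cma=<size> set on the bootargs line.

    Any existing cma= token on that line is replaced; other lines are
    returned unchanged.
    """
    out = []
    for line in uenv_text.splitlines():
        if line.startswith("bootargs="):
            key, _, args = line.partition("=")
            # Drop any existing cma= token, then append the requested one.
            tokens = [t for t in args.split() if not t.startswith("cma=")]
            tokens.append(f"cma={size}")
            line = key + "=" + " ".join(tokens)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical starting content; cma=700m is an illustrative value only.
example = "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait cma=700m\n"
print(set_cma_bootarg(example), end="")
```

After booting with the modified file, the CmaTotal field in /proc/meminfo should reflect the requested size, provided the kernel was built with CMA support.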

Known Issues

  • AR66763 - LogiCORE H.264/H.265 Video Codec Unit (VCU) - Release Notes and Known Issues for the Vivado 2017.3 tool and later versions


2020.2 Release

New Feature Support:

  • HDR10 is supported for capture, VCU encode/decode and display at gstreamer level
  • Added max-consecutive-skip parameter to VCU encoder
    • Allows users to specify a maximum number of consecutive skipped frames
  • Interlaced video support added for SCD MM
  • NTSC 4:2:0 interlace support is added to VCU encoder/decoder
  • Interlaced video/audio support is validated
  • Added example for passing DMA buffers between the VCU and appsrc/appsink

Bug Fixes:

  • Fixed XAVC compliance errors
  • Fixed incorrect POC on skipped interlaced frames
  • Fixed LLP2 encoder latency issues
  • Fixed frame drop issue with AVC, HIGH Profile 4kp60 with num-slices=16
  • Fixed coverity check errors in V4L2 and DRM
  • Fixed gstreamer parser bugs which caused crashes with third party video files
    • Fixed issue with improper handling of prefix NALs

Known issues:

| S.No | Issue description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | Use b-frames=0 | NA |
| 2 | LLP2: 1280x720 XV20 (422 10 bit) doesn't work | Fix available, AR will be published | NA |
| 3 | Memory leak when using v4l2src | Fix available, AR will be published | NA |
| 4 | LLP2: Switching live source resolutions on the fly, or removing and re-inserting the live source cable while a VCU encoder LLP2 pipeline is running, causes a hang on the next pipeline launch | Remove and re-insert the VCU kernel modules when it happens | NA |
| 5 | LLP2: Ripple effect on display | Fix available, AR will be published | NA |

2020.1 Release

New Feature Support:

  • Gstreamer version upgraded to 1.16.1
  • Added support for HDR10 metadata insertion and extraction at control software level
  • Enabled buffer metadata that indicates encoder frame skips to the application at control software level
    • Users can determine whether or not a frame is skipped by calling "AL_Buffer_GetMetaData()" and extracting the bSkipped flag
  • Added custom ROI delta-qp (roi-by-value) support to encoder.
  • LLP2 Video + Audio pipeline support is verified.

Bug Fixes:

  • Fixed gstreamer parser bugs which caused crashes with third party video files
    • Fixed picture timing SEI parsing
    • Fixed crash caused by incomplete AUs
    • Fixed interlaced transport stream parsing
  • Fixed gradual horizontal/vertical line sweep when using GDR mode
    • Provided custom gstreamer application (zynqmp_relative_qp_insertion) to improve visual quality by modifying delta-qp, alpha, and beta offsets 
  • Fixed coverity check errors in control software
  • Fixed target bitrate in interlaced mode
  • XAVC SEI message buffer size is increased to avoid video hang/corruption for AVC encoder.

Known issues:

| S.No | Issue description | Workaround | Comments/AR link |
|------|-------------------|------------|------------------|
| 1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | No workaround | NA |
| 2 | LLP2: Higher encoder latency is observed in the 4x 1080p60 use-case. An extra 3 to 4 msec is observed, which may lead to occasional frame drops or lower fps. | Use extra processing-deadline for the gst-pipeline sink; it may reduce frame drops to some extent | NA |
| 3 | Frame drops observed in serial/streaming use-case for AVC, HIGH Profile 4kp60 with num-slices=16 | Use num-slices=8 for slice encoding | NA |
| 4 | Switching live source resolutions on the fly, or removing and re-inserting the live source cable while a VCU encoder LLP2 pipeline is running, causes a hang on the next pipeline launch | Remove and re-insert the VCU kernel modules when it happens | NA |

2019.2 Release:

New Feature Support

  • ZDMA copy is supported for VCU encoder output buffer reconstruction, which helps reduce CPU load in encode use cases.
  • Added IntraMB forcing at block level (encoder) through the external QP table and ROI map
  • Added support for 15 B-frames in the pyramidal GOP structure
  • Added dynamic resolution change support without port reconfiguration at GStreamer level; the previous release supported it only at VCU control software level.
  • Added a sample GStreamer test application to showcase a 32-stream video transcoding use case.
  • Added support for generating separate codec-config data for the VCU encoder, useful in the Android framework
  • Added max-picture-sizes control based on frame type <I, P, B>
  • Added support for the following XAVC profiles
    1. AL_PROFILE_XAVC_HIGH10_INTRA_CBG
    2. AL_PROFILE_XAVC_HIGH10_INTRA_VBR
    3. AL_PROFILE_XAVC_HIGH_422_INTRA_CBG
    4. AL_PROFILE_XAVC_HIGH_422_INTRA_VBR
    5. AL_PROFILE_XAVC_LONG_GOP_MAIN_MP4
    6. AL_PROFILE_XAVC_LONG_GOP_HIGH_MP4
    7. AL_PROFILE_XAVC_LONG_GOP_HIGH_MXF
    8. AL_PROFILE_XAVC_LONG_GOP_HIGH_422_MXF
  • Added support for an external rate control plugin at MCU level
    • Enables users to develop their own rate control and plug it into the VCU firmware.
  • Added support for loading an external QP map at GStreamer level
  • Added support for transferring decoder pts/dts data using one-to-one buffer mapping instead of a FIFO at application level.
  • Added new GOP structure support.
    • default-gop-b and pyramidal-gop-b modes.
  • Added support for changing loop filter alpha and beta coefficients dynamically for both AVC and HEVC.
  • Added Xilinx low-latency (XLNXLL) / low-latency phase 2 support for the VCU encoder/decoder. It uses an extra HW IP to synchronize video buffers with other IPs (e.g., capture) on the fly.

Bug Fixes

  • Fixed multi-stream support in reduced-latency mode (no-reorder mode); unlike low-latency mode, it supports more than 2 decoder streams.
  • Fixed a race condition in the OMX encoder flush mechanism, which resolves an encoder crash in the regular video-recording start/stop use case.
  • Fixed the HEVC picture timing SEI metadata field; unlike AVC, HEVC should use the frame number, not the field number.
  • Fixed a hang with 32 (VGA) streams; added mailbox size and channel creation optimizations.
  • Fixed a VCU encoder hang when FrameSkip is enabled by excluding IDR pictures from the frame-skip logic.
  • Fixed dmabuf handling in the userbuffer() API in GStreamer omxvideodec.
  • Updated the ZCU104 PetaLinux BSP design to use an external clock as the source for the VCU PLL reference clock.

2019.1 Release:

  • New Feature Support

    • Dynamic Resolution Change
      • VCU Decoder and Encoder support at Control-software
    • Frame skip support for VCU encoder
    • New rate control mode Capped VBR support 
    • SEI NAL Unit insertion at Gstreamer/OMX Level
    • DCI 4K (4096x2160 @ 60 fps) support
    • VCU Encoder – VQ improvement option
      • Temporal layer ID support for Pyramidal GOP
      • Frame DeltaQP support based on temporal layer
      • Lambda Table update based on temporal layer
    • 32 streams - 420P (Encode and Decode)
    • Adaptive GOP support (ability to change the number of B-frames dynamically)
    • VCU PL DDR Controller support for Limited DRAM parts
    • Multistream Audio/Video pipeline support
  • Bug fixes

    • Fix glitches in H.264 decode and display during long runs
    • Improve robustness while decoding erroneous streams
    • Fix performance issue in CONST_QP mode compared to VBR mode
    • Fix for decoder always generating 10-bit output for chroma-4:2:2 (control-sw application)
    • Fix encoder scheduling issue in multi-stream use-case
    • Fix IDR frame insertion issue when gop-length=1
    • Fix decoder control-sw generating blocky video output for specific stream.
    • Fix incorrect parameter passing for reduced-latency mode
    • Fix encoder assert on multiple flush for empty frame

2018.3 Release:

  • New Feature Support
    • Scene Change Detection
    • HEVC Interlaced Encoding/Decoding
    • Adaptive B-Frame support
    • Long Term Reference picture
    • SEI Insertion at Control-SW Encoder
    • SEI Extraction at Control-SW Decoder
    • Dual Pass Encoding at frame level
    • Low Latency Mode 
    • GDR support Enhancement
      • Insertion of SPS/PPS at each GDR start frame
      • Decoder synchronization without IDR picture
    • Multi-stream support in Low latency mode
  • Stabilized streaming use cases
    • Fixed Hang/Crash issues observed in various streaming pipelines.
    • Fixed Assertion errors observed in multiple start/stop at both server & client side.
    • Fixed inability to resume decoding when the server/client is started/stopped.
    • Fixed time stamping errors in low-latency mode.
  • Fixed frame drop issues observed in 4kp60 pipeline at 60 Mbps in CBR/VBR/low-latency modes.
  • Fixed flickering and block noise issue observed with 4kp60 pipeline at lower bitrate (10 Mbps).
  • Fixed I-frame flickering effect with lower CPBSize.
  • Fixed memory leak where the control software failed to release memory after encoding.
  • Fixed memory leak observed when running pipeline in full-screen overlay mode.
  • Fixed Hang issue while seeking qtmux or mkv generated file. 
  • Fixed Assertion errors observed in repeat Playback with EOS
  • Fixed Encoder port flush issue observed while seeking.
  • Fixed Crash issue at Control Software when performing multiple concurrent transcoding.
  • Fixed Hang issue in HEVC decoder on EOS in low latency mode.
  • Fixed Encoder to achieve expected Bitrates
  • Fixed VCU Decoder to conceal errors & decode corrupted streams
  • Fixed decoding of various non-compliant streams
  • Fixed DMA fd import issue when videocrop element used in pipeline
  • Fixed VCU Init driver issue giving segmentation fault while loading & unloading driver.
  • Removed requirement for Gop.length to be multiple of B+1 frames.

2018.2 Release:

  • Fixed AVC Decoder hang issue for corrupted input file
  • Fixed data corruption issue observed with B-frames enabled
  • Fixed zynqmp_vcu_encode application to support B-frames in AVC.
  • Fixed long-run frame drop issue for AVC 1080p60 decode→display with large input file
  • Fixed bad parameter error when setting baseline profile and level=5.1
  • Fixed MCU clock division calculation in VCU Init driver.
  • Improved CBR/VBR rate control for static video sequences