Xilinx Zynq UltraScale+ MPSoC Video Codec Unit


Overview

The Xilinx Zynq UltraScale+ MPSoC Video Codec Unit (VCU) provides multi-standard video encoding and decoding, covering High Efficiency Video Coding (HEVC, i.e. H.265) and Advanced Video Coding (AVC, i.e. H.264). The VCU software stack consists of a custom kernel module and a custom user-space library known as Control Software (CtrlSW). The OpenMAX IL (OMX) layer is integrated on top of CtrlSW, and the GStreamer framework is used to integrate the OMX-IL component with other multimedia elements.

OpenMAX™ is a cross-platform API that provides comprehensive streaming media codec and application portability by enabling accelerated multimedia components. GStreamer is a cross-platform, open-source multimedia framework that provides the infrastructure to integrate multiple multimedia components and create pipelines.

Users can develop their applications at any of the three levels: CtrlSW, OMX-IL, or GStreamer.
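
As a rough illustration of the GStreamer level, the sketch below only builds a gst-launch-1.0 pipeline description string; it does not run anything. The element names omxh264enc and omxh265enc are assumed to be the gst-omx encoder elements on the target, and the property names used (gop-length, b-frames, num-slices) are the ones that appear in the release notes on this page. The helper name encoder_pipeline is hypothetical. Check the available properties on the board with gst-inspect-1.0 before relying on any of them.

```python
def encoder_pipeline(codec="h264", num_buffers=300, **props):
    """Build a gst-launch-1.0 description: test source -> VCU encoder -> fakesink.

    Keyword arguments become encoder properties, with underscores turned into
    dashes (gop_length=30 becomes gop-length=30). Nothing is validated against
    the real element; this is string assembly only.
    """
    if codec not in ("h264", "h265"):
        raise ValueError("codec must be 'h264' or 'h265'")
    # Assumed element names from gst-omx: omxh264enc / omxh265enc.
    encoder = f"omx{codec}enc"
    prop_str = " ".join(f"{k.replace('_', '-')}={v}" for k, v in props.items())
    stages = [
        f"videotestsrc num-buffers={num_buffers}",
        f"{encoder} {prop_str}".strip(),
        "fakesink",
    ]
    return " ! ".join(stages)


if __name__ == "__main__":
    # Pass the printed string to gst-launch-1.0 on the target.
    print(encoder_pipeline("h264", gop_length=30, b_frames=0))
```

The same pipeline could equally be written by hand; the helper only makes it easier to vary encoder properties when scripting test runs.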

VCU Software Stack

Supported Features

  • Multi-standard encoding/decoding support, including:
    • Advanced Video Coding (AVC) H.264
    • High Efficiency Video Coding (HEVC) H.265
    • HEVC Main profiles, up to Level 5.1, High Tier, 4kp60 encoding/decoding
    • AVC BP/MP/HP, up to Level 5.2, 4kp60 encoding/decoding
  • Multi-stream encoding/decoding of up to 8 streams at 1080p30
  • 420/422 chroma-format support
  • 8/10 bit encoding/decoding
  • Slice level Encoding/Decoding
  • Flexible rate control: CBR, VBR, Constant QP, Low-latency RC (optimized for streaming applications)
  • On-the-fly change of multiple encoding parameters
    • Target bitrate
    • GOP length
    • Number of B frames
    • Insertion of IDR picture
  • Region of Interest based Encoding (ROI)
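
The single-stream and multi-stream figures in the list above line up arithmetically: eight 1080p30 streams carry the same number of pixels per second as one 4kp60 stream. The sketch below works through that calculation. It assumes 4kp60 means 3840x2160 at 60 fps and treats aggregate pixel rate as a rough proxy for codec load; that proxy is an assumption made for illustration, not a limit stated here, and the names Stream and fits_budget are hypothetical. Real limits also depend on factors such as memory bandwidth, so the product guide remains the reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    width: int
    height: int
    fps: int

    @property
    def pixel_rate(self) -> int:
        """Pixels per second carried by this stream."""
        return self.width * self.height * self.fps


# Reference point taken from the feature list: one 4kp60 stream
# (assumed here to be 3840x2160 at 60 fps).
BUDGET_4KP60 = Stream(3840, 2160, 60).pixel_rate


def fits_budget(streams, budget=BUDGET_4KP60):
    """Return True if the summed pixel rate does not exceed the budget."""
    return sum(s.pixel_rate for s in streams) <= budget


if __name__ == "__main__":
    eight_1080p30 = [Stream(1920, 1080, 30)] * 8
    print(BUDGET_4KP60)                                  # 497664000
    print(sum(s.pixel_rate for s in eight_1080p30))      # 497664000
    print(fits_budget(eight_1080p30))                    # True
    print(fits_budget([Stream(1920, 1080, 30)] * 9))     # False
```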

API information and Example pipelines


Refer to the VCU product guide for more details on example pipelines, API information, encoder/decoder application flow charts, encoder parameters, and so on:
H.264/H.265 Video Codec Unit v1.2

Source code repos

VCU:

Gstreamer:

CMA size setting instructions for Yocto and PetaLinux users:

  • Using kernel menuconfig option

    1. Run kernel menuconfig using one of the commands below:
      1. PetaLinux: "petalinux-config -c kernel"
      2. Yocto: "bitbake -f virtual/kernel -c menuconfig"
    2. Go to Device Drivers → Generic Driver Options → DMA Contiguous Memory Allocator
    3. Set the default contiguous memory size to a value that suits your use case
    4. Save the configuration, exit menuconfig, and build the images
  • Using bootargs at the U-Boot prompt

    1. During boot-up, stop at the U-Boot prompt by pressing any key, then set a custom CMA size and continue booting with the commands below:
      • "setenv bootargs $bootargs cma=1500m"
      • "boot"
  • Using a custom uEnv.txt for Yocto

    1. The Yocto build generates a uEnv.txt file at work/build/tmp/deploy/images/zcu106-zynqmp, which can be edited to set a custom CMA size.
    2. Modify the bootargs option in uEnv.txt to set the required CMA size, as below:
      • "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait cma=1500m"
    3. Copy the uEnv.txt containing the CMA bootargs setting to the boot partition (i.e. the first partition) of the SD card and boot the board.
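
The uEnv.txt edit above can also be scripted. The following is a minimal sketch, assuming the file contains a single line starting with "bootargs=" as in the example; the helper names set_cma, patch_uenv, and cma_total_kb are hypothetical. The last helper parses the CmaTotal field that the Linux kernel reports in /proc/meminfo when CMA is enabled, which is one way to confirm after boot that the requested size took effect (the kernel may round the value for alignment).

```python
import re


def set_cma(bootargs: str, size: str) -> str:
    """Return bootargs with exactly one cma=<size> token, placed last."""
    tokens = [t for t in bootargs.split() if not t.startswith("cma=")]
    tokens.append(f"cma={size}")
    return " ".join(tokens)


def patch_uenv(text: str, size: str) -> str:
    """Rewrite the bootargs= line of a uEnv.txt body; other lines are kept as-is."""
    out = []
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        eol = line[len(body):]  # preserve the original line ending
        if body.startswith("bootargs="):
            body = "bootargs=" + set_cma(body[len("bootargs="):], size)
        out.append(body + eol)
    return "".join(out)


def cma_total_kb(meminfo: str):
    """Return CmaTotal in kB from /proc/meminfo text, or None if absent."""
    match = re.search(r"^CmaTotal:\s+(\d+)\s+kB", meminfo, re.MULTILINE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    uenv = "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait cma=1000m\n"
    print(patch_uenv(uenv, "1500m"), end="")
    # On the running board, pass the contents of /proc/meminfo instead:
    print(cma_total_kb("MemTotal: 4000000 kB\nCmaTotal:        1536000 kB\n"))
```

For reference, cma=1500m corresponds to 1500 x 1024 = 1536000 kB, which is the value to compare against CmaTotal.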

Device Tree Binding

If the core is configured in the HW design, the device tree node is generated automatically using the Device Tree BSP.

The steps to generate the device tree are documented here:
http://www.wiki.xilinx.com/Build+Device+Tree+Blob

A sample binding is shown below; the DT properties are described in Documentation/devicetree/bindings/clock/xlnx,vcu.txt

Known Issues

  • AR66763 - LogiCORE H.264/H.265 Video Codec Unit (VCU) - Release Notes and Known Issues for the Vivado 2017.3 tool and later versions

2022.2 Release

New Feature Support:

  • Dynamic Insertion of IDR frame in Low latency

Bug Fixes:

  • Fixed race condition when destroying the channel
  • Fixed incorrect computation of iMaxSlices on the decoder side, which caused several channel variables to be computed incorrectly and the decoder to crash
  • Handled SDI input cable disconnect and reconnect scenarios based on video locking/unlocking events
  • Fixed NULL pointer dereference in the ALSA framework while setting DAI data for Xilinx DP audio based use cases

Known Issues:

S.No | Issue Description | Workaround | Comments/AR link
1 | Audio lost/distortion observed with only audio usecase and audio+video in LLP2 serial pipelines during long-run | No workaround available | NA
2 | Memory leak observed with decoder app | No workaround available | NA
3 | Fps drops observed with below usecase: LLP2 2x-4kp30 XV20 AVC serial use-case; LLP2 4x-1080p60 XV20 AVC (2x-1080p60 serial + 2x-1080p60 cap->enc->fakesink) pipeline (in the above two cases the pipeline is taking more time to stabilize, around 3-4 seconds, at start); YUV444 4kp30 10-bit (X403) AVC Streaming usecase | No workaround available. | NA
4 | VCU decoder will hang and could not recover itself if the input stream exceeds its capability for a few times | No workaround available. | NA
5 | Broken bitstream observed with h264 parser on decoding specific streams | No workaround available. | NA

2022.1 Release

New Feature Support:

  • DMA fd support for VCU Encoder output.
  • YUV444 support for encoder and decoder using Xilinx custom solution.

Bug Fixes:

  • Fixed memory leak observed with GStreamer-based low-latency VCU pipelines.
  • Released the DP reset before accessing DP registers. Accessing the DP registers without releasing the reset makes the system hang.

Known Issues:

S.No | Issue Description | Workaround | Comments/AR link
1 | Audio loss/distortion observed in audio-only and audio+video use cases in LLP2 serial pipelines during long runs | No workaround available | NA
2 | SIGINT is not working properly while killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA
3 | Fps drops observed with the following use cases: LLP2 2x-4kp30 XV20 AVC serial use case; LLP2 4x-1080p60 XV20 AVC (2x-1080p60 serial + 2x-1080p60 cap->enc->fakesink) pipeline (in these two cases the pipeline takes more time to stabilize, around 3-4 seconds, at start); YUV444 4kp30 10-bit (X403) AVC streaming use case | No workaround available | NA
4 | ctrlsw encoder application is not closing properly after running rigorously in a loop (1000+ iterations) | Sending another SIGINT_KILL or Ctrl+C closes the application | NA

2021.2 Release

New Feature Support:

  • 4 byte start code for all slices.

Bug Fixes:

  • Fixed: 4K HEVC Encoder with skip-frames enabled produces invalid bitstream syntax
  • Fixed: Scheduler changes for skip-frame issue
  • Fixed: MCU trace error reported when receiving RTSP stream
  • Fixed: Memory leak in gst-plugins-bad
  • Fixed: Decoder latency is high when receiving RTSP multicast stream
  • Fixed: DPDMA example shows black screen with 1080p display monitors
  • Fixed: Handling case where mixer is not able to scale to application requested dimensions
  • Fixed: Allow media pipeline enable with single dma start

Known Issues:


S.No | Issue Description | Workaround | Comments/AR link
1 | SIGINT is not working properly while killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA
2 | Fps drops observed with LLP2 2-4kp30 XV20 AVC serial use case. Pipeline takes more time to stabilize (around 3-4 seconds). | No workaround available | NA
3 | ctrlsw encoder application is not closing properly after running rigorously in a loop (1000+ iterations) | Sending another SIGINT_KILL or Ctrl+C closes the application | NA
4 | Audio loss/distortion observed in audio-only and audio+video use cases in LLP2 serial pipelines during long runs | No workaround available | NA
5 | Interlaced file playback shows distortion on the display in the long run | No workaround available | NA
6 | PS_DP is not sending correct channel_status message for audio | No workaround available | NA

2021.1 Release

New Feature Support:

  • Gstreamer version upgraded to 1.16.3
  • GRAY8/GRAY10 format support is added for VCU encoder/decoder at Gstreamer
  • Dynamic IDR insertion support is added for Pyramidal GOP
  • Added external CRTC (ex: PL video-mixer) support to PS-DP subsystem.
  • Uniform slice_type parameter support is added for VCU encoder
  • Vertical alignment/crop on input buffers is supported for VCU encoder
    • used in 486i to 480i conversion.

Bug Fixes:

  • Fixed V4L2 mem2mem driver reload/load issue.
  • Fixed gstreamer kmssink to display full screen mode for 4k wider monitors.
  • Fixed dma driver to capture and encode resolutions which are not aligned to 32, ex: 1400x1050.
  • Fixed overwriting SEI messages on BP and PT SEI messages.
  • Fixed kmssink to display planar 420 I420 video using PS_DP.
  • LLP2: 
    • Fixed issues related to 720p resolution capture + encode use-case
    • Fixed Display Ripple effects and start/stop live sources on the fly.

Known Issues:


S.No | Issue Description | Workaround | Comments/AR link
1 | SIGINT is not working properly while killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA
2 | VCU encoder MCU trace error is reported when receiving erroneous RTSP stream | No workaround available; error concealment needs to be handled for such errors |
3 | Interlaced file playback shows distortion on the display in the long run | No workaround available | NA
4 | 4x1080p60 decode (PL_DDR) → display use case shows occasional frame drops when B-frames are present in input video | Use b-frames=0 | NA


2020.2 Release

New Feature Support:

  • HDR10 is supported for capture, VCU encode/decode and display at gstreamer level
  • Added max-consecutive-skip parameter to VCU encoder
    • Allows users to specify a maximum number of consecutive skipped frames
  • Interlaced video support added for SCD MM
  • NTSC 4:2:0 interlace support is added to VCU encoder/decoder
  • Interlaced video/audio support is validated
  • Added example for passing DMA buffers between the VCU and appsrc/appsink

Bug Fixes:

  • Fixed XAVC compliance errors
  • Fixed incorrect POC on skipped interlaced frames
  • Fixed LLP2 encoder latency issues
  • Fixed frame drop issue with AVC, HIGH Profile 4kp60 with num-slices=16
  • Fixed coverity check errors in V4L2 and DRM
  • Fixed gstreamer parser bugs which caused crashes with third party video files
    • Fixed issue with improper handling of prefix NALs

Known issues:

S.No | Issue description | Workaround | Comments/AR link
1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | Use b-frames=0 | NA
2 | LLP2: 1280x720 XV20 (422 10 bit) doesn't work | Fix available, AR will be published | NA
3 | Memory leak when using v4l2src | Fix available, AR will be published | NA
4 | LLP2: Switching Live source resolutions on the fly or removing and re-inserting live source cable when VCU encoder LLP2 pipeline is running, causes hang for next immediate pipeline launch. | remove and re-insert vcu kernel modules when it happens | NA
5 | LLP2: Ripple effect on display | Fix available, AR will be published | NA

2020.1 Release

New Feature Support:

  • Gstreamer version upgraded to 1.16.1
  • Added support for HDR10 metadata insertion and extraction at control software level
  • Enabled buffer metadata to indicate encoder frame-skip frame to application at control software level
    • Users can determine whether or not a frame is skipped by calling "AL_Buffer_GetMetaData()" and extracting the bSkipped flag
  • Added custom ROI delta-qp (roi-by-value) support to encoder.
  • LLP2 Video + Audio pipeline support is verified.

Bug Fixes:

  • Fixed gstreamer parser bugs which caused crashes with third party video files
    • Fixed picture timing SEI parsing
    • Fixed crash caused by incomplete AUs
    • Fixed interlaced transport stream parsing
  • Fixed gradual horizontal/vertical line sweep when using GDR mode
    • Provided custom gstreamer application (zynqmp_relative_qp_insertion) to improve visual quality by modifying delta-qp, alpha, and beta offsets 
  • Fixed coverity check errors in control software
  • Fixed target bitrate in interlaced mode
  • XAVC SEI message buffer size is increased to avoid video hang/corruption for AVC encoder.

Known issues:

S.No | Issue description | Workaround | Comments/AR link
1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | No workaround | NA
2 | LLP2: Higher encoder latency is observed in 4x 1080p60 use-case. Extra 3 to 4 msec is observed, this may lead to occasional frame drops or lower fps problem. | Use extra processing-deadline for gst-pipeline sink, it may reduce frame drops to some extent | NA
3 | Frame drops observed in serial/streaming use-case for AVC, HIGH Profile 4kp60 with num-slices=16 use-case. | use num-slices=8 for slice encoding | NA
4 | Switching Live source resolutions on the fly or removing and re-inserting live source cable when VCU encoder LLP2 pipeline is running, causes hang for next immediate pipeline launch. | remove and re-insert vcu kernel modules when it happens | NA

2019.2 Release:

New Feature Support

  • ZDMA copy is supported for VCU encoder output buffer reconstruction, which helps reduce CPU load for encode use cases.
  • Added IntraMB forcing at block level (encoder) through the external QP table and ROI map
  • Added support for 15 B-frames in the pyramidal GOP structure
  • Dynamic resolution change without port reconfiguration is now supported at the GStreamer level; the previous release supported it at the VCU control-software level.
  • Added a sample GStreamer test application to showcase the 32x stream video transcoding use case.
  • Added support for generating separate codec-config data for the VCU encoder, useful in the Android framework
  • Added max-picture-sizes control based on frame type <I, P, B>
  • Added support for the XAVC profiles below
    1. AL_PROFILE_XAVC_HIGH10_INTRA_CBG
    2. AL_PROFILE_XAVC_HIGH10_INTRA_VBR
    3. AL_PROFILE_XAVC_HIGH_422_INTRA_CBG
    4. AL_PROFILE_XAVC_HIGH_422_INTRA_VBR
    5. AL_PROFILE_XAVC_LONG_GOP_MAIN_MP4
    6. AL_PROFILE_XAVC_LONG_GOP_HIGH_MP4
    7. AL_PROFILE_XAVC_LONG_GOP_HIGH_MXF
    8. AL_PROFILE_XAVC_LONG_GOP_HIGH_422_MXF
  • Added support for an external rate control plugin at MCU level
    • Enables users to develop their own rate control and plug it into the VCU firmware.
  • Added support for loading an external QP map at GStreamer level
  • Added support for transferring decoder pts/dts data using one-to-one buffer mapping instead of a FIFO at application level.
  • Added support for new GOP structures.
    • default-gop-b and pyramidal-gop-b modes.
  • Added support for changing loop filter alpha and beta coefficients dynamically for both AVC and HEVC.
  • Xilinx Low-latency (XLNXLL) / Low-latency phase 2 support is added for the VCU encoder/decoder. It uses an extra HW IP to synchronize video buffers with other IPs (e.g. capture) on the fly.

Bug Fixes

  • Fixed multi-stream support in reduced-latency (no-reorder) mode; unlike low-latency mode, it supports more than 2 decoder streams.
  • Fixed a race condition in the OMX encoder flush mechanism, which helps fix an encoder crash in the regular video-recording start/stop use case.
  • Fixed the HEVC picture timing SEI metadata field: unlike AVC, HEVC should use the frame number, not the field number.
  • Fixed a hang with 32 (VGA) streams; added mailbox size and channel creation optimizations.
  • Fixed VCU encoder hang when FrameSkip is enabled by excluding IDR pictures from the frame-skip logic.
  • Fixed dmabuf handling in the userbuffer() API of the omxvideodec GStreamer element.
  • Updated the ZCU104 PetaLinux BSP design to use an external clock as the source for the VCU PLL reference clock.

2019.1 Release:

  • New Feature Support

    • Dynamic Resolution Change
      • VCU Decoder and Encoder support at Control-software
    • Frame skip support for VCU encoder
    • New rate control mode Capped VBR support 
    • SEI NAL Unit insertion at Gstreamer/OMX Level
    • DCI 4K (4096x2160 @ 60 fps) support
    • VCU Encoder – VQ improvement option
      • Temporal layer ID support for Pyramidal GOP
      • Frame DeltaQP support based on temporal layer
      • Lambda Table update based on temporal layer
    • 32 streams - 420P (Encode and Decode)
    • Adaptive GOP support (ability to change the number of B-frames dynamically)
    • VCU PL DDR Controller support for Limited DRAM parts
    • Multistream Audio/Video pipeline support
  • Bug fixes

    • Fix glitches in H.264 decode and display on long run
    • Improve robustness while decoding erroneous stream
    • Fix performance issue in CONST_QP mode compared to VBR mode
    • Fix for decoder always generating 10-bit output for chroma-4:2:2 (control-sw application)
    • Fix encoder scheduling issue in multi-stream use-case
    • Fix IDR frame insertion issue when gop-length=1
    • Fix decoder control-sw generating blocky video output for specific stream.
    • Fix incorrect parameter passing for reduced-latency mode
    • Fix encoder assert on multiple flush for empty frame

2018.3 Release:

  • New Feature Support
    • Scene Change Detection
    • HEVC Interlaced Encoding/Decoding
    • Adaptive B-Frame support
    • Long Term Reference picture
    • SEI Insertion at Control-SW Encoder
    • SEI Extraction at Control-SW Decoder
    • Dual Pass Encoding at frame level
    • Low Latency Mode 
    • GDR support Enhancement
      • Insertion of SPS/PPS at each GDR start frame
      • Decoder synchronization without IDR picture
    • Multi-stream support in Low latency mode
  • Stabilized streaming use cases
    • Fixed Hang/Crash issues observed in various streaming pipelines.
    • Fixed Assertion errors observed in multiple start/stop at both server & client side.
    • Fixed inability to resume decoding when the server/client is started/stopped.
    • Fixed time stamping errors in low-latency mode.
  • Fixed frame drop issues observed in 4kp60 pipeline at 60Mbps in CBR/VBR/low latency modes.
  • Fixed flickering and block noise issue observed with 4kp60 pipeline at lower bitrate (10Mbps).
  • Fixed I-frame flickering effect with lower CPBSize.
  • Fixed memory leak where control software failed to release memory after encoding.
  • Fixed Memory leak observed when running pipeline in full-screen overlay mode.
  • Fixed Hang issue while seeking qtmux or mkv generated file. 
  • Fixed Assertion errors observed in repeat Playback with EOS
  • Fixed Encoder port flush issue observed while seeking.
  • Fixed Crash issue at Control Software when performing multiple concurrent transcoding.
  • Fixed Hang issue in HEVC decoder on EOS in low latency mode.
  • Fixed Encoder to achieve expected Bitrates
  • Fixed VCU Decoder to conceal errors & decode corrupted streams
  • Fixed decoding of various non-compliant streams
  • Fixed DMA fd import issue when videocrop element used in pipeline
  • Fixed VCU Init driver issue giving segmentation fault while loading & unloading driver.
  • Removed requirement for Gop.length to be multiple of B+1 frames.

2018.2 Release:

  • Fixed AVC Decoder hang issue for corrupted input file
  • Fixed data corruption issue observed with b-frame enable
  • Fixed zynqmp_vcu_encode application to support b-frames in AVC.
  • Fixed long-run frame drop issue for AVC 1080p60 decode→display with large input file
  • Fixed bad parameter error when setting baseline profile and level=5.1
  • Fixed MCU clock division calculation in VCU Init driver.
  • Improved CBR/VBR rate control for static video sequences

© Copyright 2019 - 2022 Xilinx Inc.