Xilinx Zynq UltraScale+ MPSoC Video Codec Unit
Overview
The Xilinx Zynq UltraScale+ MPSoC Video Codec Unit (VCU) provides multi-standard video encoding and decoding, supporting the High Efficiency Video Coding (HEVC, i.e., H.265) and Advanced Video Coding (AVC, i.e., H.264) standards. The VCU software stack consists of a custom kernel module and a custom user-space library known as Control Software (CtrlSW). The OpenMAX IL (OMX) layer is integrated on top of CtrlSW, and the Gstreamer framework is used to integrate the OMX-IL component with other multimedia elements. OpenMAX™ is a cross-platform API that provides comprehensive streaming media codec and application portability by enabling accelerated multimedia components. Gstreamer is a cross-platform, open-source multimedia framework that provides the infrastructure to integrate multiple multimedia components and create pipelines.
Users can develop their applications at any of the three levels: CtrlSW, OMX-IL, or Gstreamer.
Figure: VCU Software Stack
Supported Features
- Multi-standard encoding/decoding support, including:
  - Advanced Video Coding (AVC) H.264
  - High Efficiency Video Coding (HEVC) H.265
- HEVC Main profiles, up to Level 5.1, High Tier, 4kp60 encoding/decoding
- AVC BP/MP/HP, up to Level 5.2, 4kp60 encoding/decoding
- Multi-stream encoding/decoding of up to 8 streams at 1080p30
- 4:2:0/4:2:2 chroma format support
- 8-bit/10-bit encoding/decoding
- Slice-level encoding/decoding
- Flexible rate control: CBR, VBR, constant QP, and low-latency RC (optimized for streaming applications)
- On-the-fly change of multiple encoding parameters:
  - Target bitrate
  - GOP length
  - Number of B-frames
  - Insertion of IDR picture
- Region of Interest based Encoding (ROI)
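As a rough sanity check on the multi-stream figure in the feature list above, the pixel throughput of eight 1080p30 streams matches that of a single 4kp60 stream. The sketch below only does this arithmetic. It assumes "4k" means 3840x2160, and the helper name `pixel_rate` is illustrative. Real VCU capacity also depends on factors such as bitrate, memory bandwidth, and encoder settings, so this is not a guarantee for any particular stream mix.

```shell
#!/bin/sh
# Pixel rate (pixels per second) for a given width, height, and frame rate.
pixel_rate() {
    echo $(( $1 * $2 * $3 ))
}

# One 4kp60 stream, assuming 3840x2160.
uhd_rate=$(pixel_rate 3840 2160 60)

# Eight 1080p30 streams.
hd_rate=$(( 8 * $(pixel_rate 1920 1080 30) ))

echo "1 x 4kp60   : $uhd_rate pixels/s"
echo "8 x 1080p30 : $hd_rate pixels/s"
```

Both lines report 497664000 pixels/s, which is also the rate of four 1080p60 streams.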
API information and Example pipelines
H.264/H.265 Video Codec Unit v1.2
Source code repos
VCU:
Gstreamer:
CMA Size setting instructions for Yocto and Petalinux users:
Using kernel menuconfig option
- Run the kernel menuconfig using one of the commands below:
  - PetaLinux: "petalinux-config -c kernel"
  - Yocto: "bitbake -f virtual/kernel -c menuconfig"
- Go to Device Drivers → Generic Driver Options → DMA Contiguous Memory Allocator
- Set the default contiguous memory size to a value suited to your use case
- Save the configuration, exit menuconfig, and build the images
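After saving the configuration, it can be useful to confirm what was written. In mainline Linux the default CMA size set in this menu is stored as `CONFIG_CMA_SIZE_MBYTES`; the sketch below assumes that option name applies to your kernel version. The helper name `cma_size_from_config` and the `.config` path are illustrative.

```shell
#!/bin/sh
# Print the default CMA size (in MiB) recorded in a kernel .config file.
# Prints nothing if the option is not set in that file.
cma_size_from_config() {
    sed -n 's/^CONFIG_CMA_SIZE_MBYTES=\([0-9][0-9]*\)$/\1/p' "$1"
}

# Example (the path to the kernel build's .config is an assumption):
#   cma_size_from_config /path/to/kernel/build/.config
```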
Using bootargs at uboot prompt
- During boot-up, stop at the U-Boot prompt by pressing any key, then set a custom CMA size and boot using the commands below:
  - "setenv bootargs $bootargs cma=1500m"
  - "boot"
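Once Linux has booted, the setting can be checked from the target. The sketch below is a minimal helper (the name `cma_from_cmdline` is illustrative) that extracts the `cma=` value from a kernel command line string. If the parameter appears more than once, it prints the last occurrence. The commented commands assume `/proc` is mounted and that the kernel was built with CMA support, in which case `/proc/meminfo` reports `CmaTotal` and `CmaFree`.

```shell
#!/bin/sh
# Print the value of the cma= parameter found in a kernel command line string.
# Prints nothing if the parameter is absent.
cma_from_cmdline() {
    value=""
    for arg in $1; do
        case "$arg" in
            cma=*) value="${arg#cma=}" ;;
        esac
    done
    if [ -n "$value" ]; then
        echo "$value"
    fi
}

# On the target:
#   cma_from_cmdline "$(cat /proc/cmdline)"
#   grep -i '^Cma' /proc/meminfo    # CmaTotal / CmaFree, reported in kB
```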
Using custom uEnv.txt for Yocto
- The Yocto build generates a custom uEnv.txt file at work/build/tmp/deploy/images/zcu106-zynqmp, which can be edited to set a custom CMA size.
- Modify the bootargs option in uEnv.txt to set the required CMA size, as below:
  - "bootargs=earlycon clk_ignore_unused root=/dev/mmcblk0p2 rw rootwait cma=1500m"
- Copy the uEnv.txt containing the CMA bootargs setting to the boot partition (i.e., the first partition) of the SD card and boot the board.
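The uEnv.txt edit described above can also be scripted. The following is a sketch only: the function name `set_cma_in_uenv` is hypothetical, it assumes the file has a single line starting with `bootargs=`, and it relies on `sed -i` as provided by GNU sed or BusyBox.

```shell
#!/bin/sh
# Set (or replace) the cma= value on the bootargs line of a uEnv.txt file.
# Usage: set_cma_in_uenv <file> <size>    e.g. set_cma_in_uenv uEnv.txt 1500m
set_cma_in_uenv() {
    file=$1
    size=$2
    if grep -q '^bootargs=.*cma=' "$file"; then
        # A cma= entry already exists on the bootargs line: replace its value.
        sed -i "/^bootargs=/ s/cma=[^ ]*/cma=$size/" "$file"
    else
        # No cma= entry yet: append one to the end of the bootargs line.
        sed -i "/^bootargs=/ s/\$/ cma=$size/" "$file"
    fi
}
```

Run it on the generated uEnv.txt before copying the file to the boot partition, and keep a backup of the original file.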
Device Tree Binding
The device tree node is generated automatically by the Device Tree BSP if the core is configured in the HW design.
The steps to generate the device tree are documented here:
http://www.wiki.xilinx.com/Build+Device+Tree+Blob
A sample binding is shown below, and the DT properties are described in Documentation/devicetree/bindings/clock/xlnx,vcu.txt
Known Issues
- AR66763 - LogiCORE H.264/H.265 Video Codec Unit (VCU) - Release Notes and Known Issues for the Vivado 2017.3 tool and later versions
2022.2 Release
New Feature Support:
- Dynamic Insertion of IDR frame in Low latency
Bug Fixes:
- Fix race condition when destroying the channel
- Fixed incorrect computation of iMaxSlices on the decoder side, which caused several channel variables to be computed incorrectly and the decoder to crash
- Handled SDI input cable disconnect and reconnect scenarios based on video locking/unlocking event
- Fixed NULL pointer dereference in the ALSA framework while setting DAI data for Xilinx DP audio-based use cases
Known Issues:
S.No | Issue Description | Workaround | comments/AR link |
---|---|---|---|
1 | Audio loss/distortion observed in audio-only and audio+video use cases in LLP2 serial pipelines during long runs | No workaround available | NA |
2 | Memory leak observed with decoder app | No workaround available | NA |
3 | FPS drops observed with the following use case: | No workaround available. | NA |
4 | The VCU decoder hangs and cannot recover if the input stream exceeds its capability several times | No workaround available. | NA |
5 | Broken bitstream observed with h264 parser on decoding specific streams | No workaround available. | NA |
2022.1 Release
New Feature Support:
- DMA fd support for VCU Encoder output.
- YUV444 support for encoder and decoder using Xilinx custom solution.
Bug Fixes:
- Fix for memory leak issue that was observed with gstreamer based low-latency VCU pipelines.
- Release the DP reset before accessing DP registers. Accessing a DP register without releasing the reset causes the system to hang.
Known Issues:
S.No | Issue Description | Workaround | comments/AR link |
---|---|---|---|
1 | Audio loss/distortion observed in audio-only and audio+video use cases in LLP2 serial pipelines during long runs | No workaround available | NA |
2 | SIGINT is not working properly while killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA |
3 | FPS drops observed with the following use case: | No workaround available. | NA |
4 | The ctrlsw encoder application does not close properly after running repeatedly in a loop (1000+ iterations) | Sending another SIGINT_KILL or Ctrl+C closes the application | NA |
2021.2 Release
New Feature Support:
- 4 byte start code for all slices.
Bug Fixes:
- Fixed: 4K HEVC Encoder with skip-frames enabled produces invalid bitstream syntax
- Fixed: Scheduler changes for skip-frame issue
- Fixed: MCU trace error reported when receiving RTSP stream
- Fixed: Memory leak in gst-plugins-bad
- Fixed: Decoder latency is high when receiving RTSP multicast stream
- Fixed: DPDMA example shows black screen with 1080p display monitors
- Fixed: Handling case where mixer is not able to scale to application requested dimensions
- Fixed: Allow media pipeline enable with single dma start
Known Issues:
S.No | Issue Description | Workaround | comments/AR link |
---|---|---|---|
1 | SIGINT is not working properly while killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA |
2 | FPS drops observed with the LLP2 2-4kp30 XV20 AVC serial use-case. The pipeline takes longer to stabilize (around 3-4 seconds). | No workaround available. | NA |
3 | The ctrlsw encoder application does not close properly after running repeatedly in a loop (1000+ iterations) | Sending another SIGINT_KILL or Ctrl+C closes the application | NA |
4 | Audio loss/distortion observed in audio-only and audio+video use cases in LLP2 serial pipelines during long runs | No workaround available | NA |
5 | Interlaced file playback shows distortion on the Display in the long run | No workaround available | NA |
6 | PS_DP is not sending correct channel_status message for audio | No workaround available | NA |
2021.1 Release
New Feature Support:
- Gstreamer version upgraded to 1.16.3
- GRAY8/GRAY10 format support is added for VCU encoder/decoder at Gstreamer
- Dynamic IDR insertion support is added for Pyramidal GOP
- Added external CRTC (ex: PL video-mixer) support to PS-DP subsystem.
- Uniform slice_type parameter support is added for VCU encoder
- Vertical alignment/crop on input buffers is supported for VCU encoder
  - Used in 486i to 480i conversion.
Bug Fixes:
- Fixed V4l2 mem2mem driver reload/load issue.
- Fixed gstreamer kmssink to display full screen mode for 4k wider monitors.
- Fixed dma driver to capture and encode resolutions which are not aligned to 32, ex: 1400x1050.
- Fixed overwriting SEI messages on BP and PT SEI messages.
- Fixed kmssink to display planar 420 I420 video using PS_DP.
- LLP2:
  - Fixed issues related to 720p resolution capture + encode use-case
  - Fixed display ripple effects and start/stop of live sources on the fly.
Known Issues:
S.No | Issue Description | Workaround | comments/AR link |
---|---|---|---|
1 | SIGINT is not working properly while killing low-latency stream-out pipelines | Sending two consecutive Ctrl+C signals kills the pipeline | NA |
2 | VCU encoder MCU trace error is reported when receiving erroneous RTSP stream | No workaround available. | Need to handle error concealment for such errors |
3 | Interlaced file playback shows distortion on the Display in the long run | No workaround available | NA |
4 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | Use b-frames=0 | NA |
2020.2 Release
New Feature Support:
- HDR10 is supported for capture, VCU encode/decode and display at gstreamer level
- Added max-consecutive-skip parameter to VCU encoder
  - Allows users to specify a maximum number of consecutive skipped frames
- Interlaced video support added for SCD MM
- NTSC 4:2:0 interlace support is added to VCU encoder/decoder
- Interlaced video/audio support is validated
- Added example for passing DMA buffers between the VCU and appsrc/appsink
Bug Fixes:
- Fixed XAVC compliance errors
- Fixed incorrect POC on skipped interlaced frames
- Fixed LLP2 encoder latency issues
- Fixed frame drop issue with AVC, HIGH Profile 4kp60 with num-slices=16
- Fixed coverity check errors in V4L2 and DRM
- Fixed gstreamer parser bugs which caused crashes with third party video files
- Fixed issue with improper handling of prefix NALs
Known issues:
S.No | Issue description | Work around | Comments/AR link |
---|---|---|---|
1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | Use b-frames=0 | NA |
2 | LLP2: 1280x720 XV20 (422 10 bit) doesn't work | Fix available, AR will be published | NA |
3 | Memory leak when using v4l2src | Fix available, AR will be published | NA |
4 | LLP2: Switching live source resolutions on the fly, or removing and re-inserting the live source cable while a VCU encoder LLP2 pipeline is running, causes a hang on the next pipeline launch. | Remove and re-insert the VCU kernel modules when this happens | NA |
5 | LLP2: Ripple effect on display | Fix available, AR will be published | NA |
2020.1 Release
New Feature Support:
- Gstreamer version upgraded to 1.16.1
- Added support for HDR10 metadata insertion and extraction at control software level
- Enabled buffer metadata to indicate encoder frame skips to the application at control software level
  - Users can determine whether or not a frame is skipped by calling "AL_Buffer_GetMetaData()" and extracting the bSkipped flag
- Added custom ROI delta-qp (roi-by-value) support to encoder.
- LLP2 Video + Audio pipeline support is verified.
Bug Fixes:
- Fixed gstreamer parser bugs which caused crashes with third party video files
- Fixed picture timing SEI parsing
- Fixed crash caused by incomplete AUs
- Fixed interlaced transport stream parsing
- Fixed gradual horizontal/vertical line sweep when using GDR mode
- Provided custom gstreamer application (zynqmp_relative_qp_insertion) to improve visual quality by modifying delta-qp, alpha, and beta offsets
- Fixed coverity check errors in control software
- Fixed target bitrate in interlaced mode
- XAVC SEI message buffer size is increased to avoid video hang/corruption for AVC encoder.
Known issues:
S.No | Issue description | Work around | Comments/AR link |
---|---|---|---|
1 | 4x1080p60 decode (PL_DDR) → display use-case shows occasional frame drops when B-frames are present in input video | No workaround | NA |
2 | LLP2: Higher encoder latency is observed in the 4x 1080p60 use-case. An extra 3 to 4 msec is observed; this may lead to occasional frame drops or lower fps. | Use an extra processing-deadline on the gst-pipeline sink; it may reduce frame drops to some extent | NA |
3 | Frame drops observed in the serial/streaming use-case for AVC, HIGH Profile 4kp60 with num-slices=16. | Use num-slices=8 for slice encoding | NA |
4 | Switching live source resolutions on the fly, or removing and re-inserting the live source cable while a VCU encoder LLP2 pipeline is running, causes a hang on the next pipeline launch. | Remove and re-insert the VCU kernel modules when this happens | NA |
2019.2 Release:
New Feature Support
- ZDMA copy is supported for VCU Encoder output buffer reconstruction, which helps reduce CPU load for encode use-cases.
- Added IntraMB forcing at block level (Encoder) through External QP table and ROI Map
- Added support for 15 B-frames in Pyramidal GOP structure
- Added dynamic resolution change support without port reconfiguration at the gstreamer level; the previous release supported it at the VCU control-sw level.
- Added a sample gstreamer test application to showcase the 32-stream video transcoding use-case.
- Added support for generating separate codec-config data for VCU encoder, useful in the Android framework
- Added max-picture-sizes control based on Frame type <I, P, B>
- Added support for the following XAVC profiles:
  - AL_PROFILE_XAVC_HIGH10_INTRA_CBG
  - AL_PROFILE_XAVC_HIGH10_INTRA_VBR
  - AL_PROFILE_XAVC_HIGH_422_INTRA_CBG
  - AL_PROFILE_XAVC_HIGH_422_INTRA_VBR
  - AL_PROFILE_XAVC_LONG_GOP_MAIN_MP4
  - AL_PROFILE_XAVC_LONG_GOP_HIGH_MP4
  - AL_PROFILE_XAVC_LONG_GOP_HIGH_MXF
  - AL_PROFILE_XAVC_LONG_GOP_HIGH_422_MXF
- Added support for an external rate control plugin at MCU level
  - Enables users to develop their own rate control and plug it into the VCU firmware.
- Added support for loading an external QP map at gstreamer level
- Added support for transferring decoder pts/dts data using one-to-one buffer mapping instead of FIFO at application level.
- Added new GOP structure support:
  - default-gop-b and pyramidal-gop-b modes.
- Added support for changing the loop filter alpha and beta coefficients dynamically for both AVC and HEVC.
- Xilinx Low-latency (XLNXLL) / Low-latency phase 2 support is added for VCU encoder/decoder. It uses an extra HW IP to synchronize video buffers with other IPs (e.g., capture) on the fly.
Bug Fixes
- Fixed multi-stream support in reduced-latency mode (no-reorder mode); unlike low-latency mode, it supports more than 2 decoder streams.
- Fixed a race condition in the OMX Encoder flush mechanism, which helps fix an encoder crash in the regular video-recording start/stop use-case.
- Fixed the HEVC Picture timing SEI metadata field; unlike AVC, HEVC should use the frame number, not the field number.
- Fixed a hang with 32 (VGA) streams by adding mailbox size and channel creation optimizations.
- Fixed a VCU encoder hang when FrameSkip is enabled by excluding IDR pictures from the frame-skip logic.
- Fixed dmabuf handling in userbuffer() API at omxvideodec gstreamer.
- Updated ZCU104 Petalinux BSP design to use external clock as source for VCU PLL Ref clock.
2019.1 Release:
New Feature Support
- Dynamic Resolution Change
  - VCU Decoder and Encoder support at Control-software
- Frame skip support for VCU encoder
- New rate control mode Capped VBR support
- SEI NAL Unit insertion at Gstreamer/OMX Level
- DCI 4K (4096x2160 @ 60 fps) support
- VCU Encoder – VQ improvement option
- Temporal layer ID support for Pyramidal GOP
- Frame DeltaQP support based on temporal layer
- Lambda Table update based on temporal layer
- 32 streams - 420P (Encode and Decode)
- Adaptive GOP Support (ability to change number of dynamically)
- VCU PL DDR Controller support for Limited DRAM parts
- Multistream Audio/Video pipeline support
Bug fixes
- Fix glitches in H.264 decode and display on long run
- Improve robustness while decoding erroneous stream
- Fix performance issue in CONST_QP mode compared to VBR mode
- Fix for decoder always generating 10-bit output for chroma-4:2:2 (control-sw application)
- Fix encoder scheduling issue in multi-stream use-case
- Fix IDR frame insertion issue when gop-length=1
- Fix decoder control-sw generating blocky video output for specific stream.
- Fix incorrect parameter passing for reduced-latency mode
- Fix encoder assert on multiple flush for empty frame
2018.3 Release:
- New Feature Support
- Scene Change Detection
- HEVC Interlaced Encoding/Decoding
- Adaptive B-Frame support
- Long Term Reference picture
- SEI Insertion at Control-SW Encoder
- SEI Extraction at Control-SW Decoder
- Dual Pass Encoding at frame level
- Low Latency Mode
- GDR support Enhancement
- Insertion of SPS/PPS at each GDR start frame
- Decoder synchronization without IDR picture
- Multi-stream support in Low latency mode
- Stabilized streaming use cases
- Fixed Hang/Crash issues observed in various streaming pipelines.
- Fixed Assertion errors observed in multiple start/stop at both server & client side.
- Unable to resume decoding if server/client is start/stopped.
- Fixed time stamping errors in low-latency mode.
- Fixed Frame drops issues observed in 4kp60 pipeline at 60Mbps in CBR/VBR/low latency modes.
- Fixed Flickering & Block noise issue observed with 4kp60 pipeline at lower bitrate (10 Mbps).
- Fixed I-Frame flickering effect with lower CPBSize.
- Fixed Memory leak where control software fails to release memory after encoding.
- Fixed Memory leak observed when running pipeline in full-screen overlay mode.
- Fixed Hang issue while seeking qtmux or mkv generated file.
- Fixed Assertion errors observed in repeat Playback with EOS
- Fixed Encoder port flush issue observed while seeking.
- Fixed Crash issue at Control Software when performing multiple concurrent transcoding.
- Fixed Hang issue in HEVC decoder on EOS in low latency mode.
- Fixed Encoder to achieve expected Bitrates
- Fixed VCU Decoder to conceal errors & decode corrupted streams
- Fixed decoding of various non-compliant streams
- Fixed DMA fd import issue when videocrop element used in pipeline
- Fixed VCU Init driver issue giving segmentation fault while loading & unloading driver.
- Removed requirement for Gop.length to be multiple of B+1 frames.
2018.2 Release:
- Fixed AVC Decoder hang issue for corrupted input file
- Fixed data corruption issue observed with B-frames enabled
- Fixed zynqmp_vcu_encode application to support B-frames in AVC.
- Fixed long-run frame drop issue for AVC 1080p60 decode→display with large input file
- Fixed bad parameter error when setting baseline profile and level=5.1
- Fixed MCU clock division calculation in VCU Init driver.
- Improve CBR/VBR rate control for static video sequences
© Copyright 2019 - 2022 Xilinx Inc.