August 2025
Name | averne |
---|---|
Organization | FFmpeg |
Project | ProRes Vulkan decoder |
Mentors | Lynne, Niklas Haas |
FFmpeg is the ubiquitous media framework, with wide capabilities such
as video/audio decoding/encoding, filtering, scaling, etc.
In order to improve video decoding efficiency, FFmpeg makes use of
hardware decoders found on GPU cards. However, only common consumer
codecs are supported by this approach. For niche usages, a recent trend
has been shader-based decoders, which make use of the parallelism
offered by GPU hardware to improve decoding times and lift
compute-intensive tasks from the CPU.
Apple ProRes is a video codec designed as an intermediate format, meaning it is mainly used during video editing. As such, it provides less compression than other codecs such as H.264, but has a vastly simpler decoding process.
My work during GSoC 2025 has been to write a shader-based decoder for Apple ProRes. It has been submitted for review and inclusion in the FFmpeg project here: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20381.
The project is integrated directly within the FFmpeg framework. It
makes use of the “hwaccel” framework designed to implement hardware
video accelerators, and of the Vulkan helpers found in FFmpeg.
The project also builds upon the pre-existing software ProRes decoder.
This existing code handles parsing frame and picture headers.
My development branch can be found here: https://github.com/averne/FFmpeg/tree/vk-proresdec, and the submission branch here: https://code.ffmpeg.org/averne/FFmpeg/src/branch/vk-proresdec-submit.
The main code changes are in:
libavcodec/proresdec.c
: Reworking the hwaccel code to accommodate the Vulkan decoder.
The main code additions are in:
libavcodec/vulkan_prores.c
: Vulkan decoder implementation. It initializes acceleration structures,
creates the decoding pipeline and dispatches the compute shaders.
libavcodec/vulkan/prores_vld.comp
: Entropy decoding shader, does the variable-length decoding for color and
alpha components.
libavcodec/vulkan/prores_idct.comp
: Inverse discrete cosine transform shader, does the frequency→spatial
domain transform for color components.
Other merged miscellaneous fixes/changes:
The decoding process is divided into two shaders: VLD and IDCT. VLD does the variable-length decoding work, i.e. processing the compressed bitstream to extract frequency-domain coefficients. IDCT converts these coefficients into the final spatial-domain pixel values.
The VLD shader receives two buffers and a metadata structure.
The buffers contain a list of slice offsets and the compressed
bitstream. The metadata structure contains useful codec state such as
picture width, height and other parameters, and is accessed as a uniform
constant.
This shader writes frequency coefficients to the result texture (or the
final values in the case of alpha decoding).
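To make the buffer layout more concrete, here is a minimal CPU-side sketch in C of how one invocation could locate its slice from the offsets list. The names `PictureMeta`, `SliceSpan` and `locate_slice` are hypothetical and do not mirror the actual structures in vulkan_prores.c. The sketch also assumes that each offset marks the start of a slice and that slices are stored back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the metadata structure passed to the shaders. */
typedef struct {
    uint32_t width;       /* picture width in pixels */
    uint32_t height;      /* picture height in pixels */
    uint32_t slice_count; /* number of entries in the offsets buffer */
} PictureMeta;

/* A slice located inside the compressed bitstream. */
typedef struct {
    const uint8_t *data;
    size_t         size;
} SliceSpan;

/*
 * Assumption: offsets[i] is the byte offset of slice i, and a slice ends
 * where the next one begins (or at the end of the bitstream for the last).
 */
static SliceSpan locate_slice(const PictureMeta *meta, const uint32_t *offsets,
                              const uint8_t *bitstream, size_t bitstream_size,
                              uint32_t index)
{
    assert(index < meta->slice_count);

    size_t begin = offsets[index];
    size_t end   = (index + 1 < meta->slice_count) ? offsets[index + 1]
                                                   : bitstream_size;
    assert(begin <= end && end <= bitstream_size);

    return (SliceSpan){ bitstream + begin, end - begin };
}
```

Because every slice can be found from the offsets alone, each one can be handed to a separate invocation without first parsing the slices before it.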
The IDCT shader receives the same metadata structure as
VLD.
It reads the coefficients from the result texture, transforms them, and
writes them back to the same texture.
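For reference, the transform itself can be sketched in C as a textbook floating-point 8x8 inverse DCT, applied in place as a row pass followed by a column pass. This is only an illustration of the math. It is not the code from prores_idct.comp, it leaves out dequantization and the codec's exact scaling and rounding, and the function names are made up for this example.

```c
#include <assert.h>

#define BLOCK_DIM 8

/* cos(k * pi / 16) for k = 0..8 */
static const double COS16[9] = {
    1.0,
    0.98078528040323044913, 0.92387953251128675613,
    0.83146961230254523708, 0.70710678118654752440,
    0.55557023301960222474, 0.38268343236508977173,
    0.19509032201612826785, 0.0
};

/* cos(n * pi / 16) for any n >= 0, using the period and symmetries of cosine. */
static double cos_pi16(int n)
{
    n %= 32;                         /* period of 2*pi */
    if (n > 16)
        n = 32 - n;                  /* cos(2*pi - t) =  cos(t) */
    return (n > 8) ? -COS16[16 - n]  /* cos(pi - t)   = -cos(t) */
                   :  COS16[n];
}

/* One-dimensional 8-point inverse DCT (orthonormal), strided access. */
static void idct8_1d(const double *in, double *out, int stride)
{
    assert(stride > 0);
    for (int x = 0; x < BLOCK_DIM; x++) {
        double sum = 0.0;
        for (int u = 0; u < BLOCK_DIM; u++) {
            /* sqrt(1/8) for the DC term, sqrt(2/8) = 0.5 for the others */
            double scale = (u == 0) ? 0.35355339059327376220 : 0.5;
            sum += scale * in[u * stride] * cos_pi16((2 * x + 1) * u);
        }
        out[x * stride] = sum;
    }
}

/* In-place 2D inverse DCT of one 8x8 block: rows first, then columns. */
static void idct8x8(double block[BLOCK_DIM * BLOCK_DIM])
{
    double tmp[BLOCK_DIM * BLOCK_DIM];

    for (int row = 0; row < BLOCK_DIM; row++)
        idct8_1d(block + row * BLOCK_DIM, tmp + row * BLOCK_DIM, 1);
    for (int col = 0; col < BLOCK_DIM; col++)
        idct8_1d(tmp + col, block + col, BLOCK_DIM);
}
```

Each 8x8 block is transformed independently of the others, and within a block every row (then every column) is independent too, which is what makes this stage map well to GPU threads.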
The decoder supports all existing codec profiles (up to 4444 XQ) and all codec features, i.e.:
Quick benchmarking shows good performance. For a 4K yuv422p10 sample, I measured:
The performance is largely bottlenecked by the VLD shader, which is less parallelizable than the IDCT operation.
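The reason variable-length decoding resists parallelization can be shown with a small C sketch. It uses a plain Golomb-Rice code as a stand-in for the codec's actual codebooks, which are not reproduced here, and all names are hypothetical. The point is that the position of each code in the bitstream is only known once the previous code has been fully read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t         size_bits;
    size_t         pos;       /* current position, in bits */
} BitReader;

static unsigned read_bit(BitReader *br)
{
    assert(br->pos < br->size_bits);
    unsigned bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1u;
    br->pos++;
    return bit;
}

/*
 * Decode one Golomb-Rice code with parameter k: a unary quotient
 * (zeros terminated by a one) followed by a k-bit remainder.
 */
static unsigned read_rice(BitReader *br, unsigned k)
{
    unsigned q = 0;
    while (read_bit(br) == 0)
        q++;

    unsigned r = 0;
    for (unsigned i = 0; i < k; i++)
        r = (r << 1) | read_bit(br);

    return (q << k) | r;
}

/*
 * Decode `count` values. Iteration i cannot start before iteration i - 1
 * has finished, since code lengths vary and nothing marks code boundaries.
 */
static void decode_run(BitReader *br, unsigned k, unsigned *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] = read_rice(br, k);
}
```

With k = 1, the two bytes 0x39 0x00 decode to the values 5, 0 and 2, consuming 9 bits. Parallelism in this stage therefore comes from decoding independent slices side by side, not from splitting the work inside one slice.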
In-depth profiling and optimization have been started but not yet
completed.
Several optimization strategies are being considered:
Listed below are the commands required to build and use the decoder. Note that a working Vulkan installation is required, along with the glslang SPIR-V compiler.
git clone https://github.com/averne/FFmpeg.git -b vk-proresdec --depth 1 && cd FFmpeg
./configure --disable-doc --enable-gpl --enable-libglslang --enable-vulkan
Additionally, you can use a leaner configuration to speed up the
compilation step:
./configure --disable-everything --disable-doc --enable-avcodec --enable-avformat --enable-ffmpeg --disable-optimizations --enable-debug --enable-gpl --enable-libshaderc --enable-vulkan --enable-libx264 --enable-decoder=prores,h264,hevc --enable-parser=prores,h264,hevc --enable-hwaccel=prores_vulkan,h264_nvdec,hevc_nvdec --enable-protocol=file --enable-demuxer=matroska,mxf,mp4,mov,matroska --enable-muxer=null,rawvideo,mp4,mov,matroska --enable-encoder=wrapped_avframe,rawvideo,libx264 --enable-filter=scale,setparams
make -j$(nproc)
./ffmpeg -hwaccel vulkan -i <file> -an -c:v libx264 output.mkv
./ffmpeg -benchmark -hwaccel vulkan -i <file> -an -f null -
Finally, samples can be created using the prores_ks
encoder, like so:
ffmpeg -f lavfi -i testsrc2=size=1280x720:duration=10 -pix_fmt yuv444p12le -c:v prores_ks -profile:v 4444 -y prores-ks-444p12-prog.mkv
Varying the pixel format and video dimensions is possible by tweaking
the appropriate values.