Vulkan ProRes decoder: GSoC report

Name	averne
Organization	FFmpeg
Project	ProRes Vulkan decoder
Mentors	Lynne, Niklas Haas

Introduction

FFmpeg is a ubiquitous media framework with wide-ranging capabilities, including video and audio decoding and encoding, filtering and scaling.
To improve video decoding efficiency, FFmpeg makes use of the hardware decoders found on GPUs. However, only common consumer codecs are supported by this approach. For niche use cases, a recent trend has been shader-based decoders, which exploit the parallelism of GPU hardware to reduce decoding times and offload compute-intensive tasks from the CPU.

Apple ProRes is a video codec designed as an intermediate format, meaning it is mainly used during video editing. As such, it provides less compression than other codecs such as H.264, but has a vastly simpler decoding process.

My work during GSoC 2025 has been to write a shader-based decoder for Apple ProRes. It has been submitted for review and inclusion in the FFmpeg project here: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20381.

Project structure

The project is integrated directly within the FFmpeg framework. It makes use of the “hwaccel” framework designed to implement hardware video accelerators, and of the Vulkan helpers found in FFmpeg.
The project also builds upon the pre-existing software ProRes decoder. This existing code handles parsing frame and picture headers.

My development branch can be found here: https://github.com/averne/FFmpeg/tree/vk-proresdec, and the submission branch here: https://code.ffmpeg.org/averne/FFmpeg/src/branch/vk-proresdec-submit.

Changes

The main code changes are in:

libavcodec/proresdec.c: Reworking the hwaccel code to accommodate the Vulkan decoder.

The main code additions are in:

libavcodec/vulkan_prores.c: Vulkan decoder implementation. It initializes acceleration structures, creates the decoding pipeline and dispatches the compute shaders.
libavcodec/vulkan/prores_vld.comp: Entropy decoding shader, does the variable-length decoding for color and alpha components.
libavcodec/vulkan/prores_idct.comp: Inverse discrete cosine transform shader, does the frequency→spatial domain transform for color components.

Other merged miscellaneous fixes/changes:

Implementation

The decoding process is divided into two shaders: VLD and IDCT. VLD does the variable-length decoding work, i.e. processing the compressed bitstream to extract frequency-domain coefficients. IDCT converts these coefficients into the final spatial-domain pixel values.

The VLD shader receives two buffers and a metadata structure. The buffers contain a list of slice offsets and the compressed bitstream. The metadata structure holds codec state such as picture width, height and other parameters, and is accessed as a uniform constant.
This shader writes frequency coefficients to the result texture (or the final values in the case of alpha decoding).
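
As an illustration of how a list of slice offsets lets each shader invocation locate its own data in the shared bitstream buffer, here is a minimal Python sketch. It assumes the per-slice sizes are already known (for instance from header parsing on the CPU); the names slice_offsets and slice_bytes are hypothetical and do not correspond to FFmpeg code.

```python
from itertools import accumulate

def slice_offsets(slice_sizes):
    """Exclusive prefix sum: offsets[i] is where slice i starts in the bitstream."""
    return [0] + list(accumulate(slice_sizes))

def slice_bytes(bitstream, offsets, index):
    """Return the bytes of one slice, as a single invocation would address them."""
    return bitstream[offsets[index]:offsets[index + 1]]

# Hypothetical example: three slices of 4, 2 and 3 bytes packed back to back.
bitstream = bytes(range(9))
offsets = slice_offsets([4, 2, 3])
second_slice = slice_bytes(bitstream, offsets, 1)
```

With the offsets available up front, every slice can be handed to a separate invocation without first scanning the slices that precede it.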

The IDCT shader receives the same metadata structure as VLD.
It reads the coefficients from the result texture, transforms them, and writes them to the same texture.
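
For reference, the frequency-to-spatial transform can be sketched in plain Python as a separable 8x8 inverse DCT. This is a floating-point textbook formulation (orthonormal DCT-III), not the arithmetic used by the shader: the 8x8 block size is assumed, dequantization, level shifting and clipping are left out, and the function names are hypothetical.

```python
import math

N = 8  # block size assumed for this sketch

def idct_1d(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a length-N sequence."""
    out = []
    for x in range(N):
        total = 0.0
        for u in range(N):
            scale = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
            total += scale * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        out.append(total)
    return out

def idct_8x8(block):
    """Apply the 1D transform to every row, then to every column."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    # cols[x][y] holds the sample at row y, column x; transpose back.
    return [[cols[x][y] for x in range(N)] for y in range(N)]

# A block holding only a DC coefficient decodes to a flat block (80 / 8 = 10).
dc_only = [[0.0] * N for _ in range(N)]
dc_only[0][0] = 80.0
flat = idct_8x8(dc_only)
```

Each row pass and each column pass is independent of the others, which is what makes this stage a natural fit for many GPU threads working on the same block.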

The decoder supports all existing codec profiles (up to 4444 XQ) and all codec features, i.e.:

4:2:2 and 4:4:4 chroma subsampling
10- and 12-bit component depth
Interlacing
Alpha plane

Quick benchmarking shows good performance. For a 4K yuv422p10 sample, I measured:

AMD Radeon 6700XT: 178 fps
Intel i7 Tiger Lake: 37 fps
NVIDIA Orin Nano: 70 fps
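
To put these frame rates in perspective, the sketch below estimates the size of one decoded frame and the resulting per-frame time and output bandwidth. It assumes that "4K" means 3840x2160 and that each 10-bit sample is stored in a 16-bit word, as in the yuv422p10 pixel format; the helper name frame_bytes is hypothetical.

```python
def frame_bytes(width, height, chroma_shift_w=1, bytes_per_sample=2):
    """Size of one planar YUV frame; chroma_shift_w=1 halves chroma width (4:2:2)."""
    luma = width * height * bytes_per_sample
    chroma = 2 * (width >> chroma_shift_w) * height * bytes_per_sample
    return luma + chroma

size = frame_bytes(3840, 2160)  # 33,177,600 bytes, roughly 33 MB per frame

# Frame rates taken from the measurements listed above.
for name, fps in [("Radeon 6700XT", 178), ("Tiger Lake", 37), ("Orin Nano", 70)]:
    print(f"{name}: {1000 / fps:.1f} ms/frame, {size * fps / 1e9:.2f} GB/s written")
```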

The performance is largely bottlenecked by the VLD shader, which is less parallelizable than the IDCT operation.
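
The serial nature of variable-length decoding can be seen in a small sketch. It uses plain order-0 Exp-Golomb codes as a stand-in; the actual ProRes entropy codes differ, and the BitReader class and function names are hypothetical. The point is only that the start of each codeword is known once the previous codeword has been fully decoded.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object."""
    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def read_bit(self):
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, count):
        value = 0
        for _ in range(count):
            value = (value << 1) | self.read_bit()
        return value

def read_exp_golomb(reader):
    """Decode one unsigned order-0 Exp-Golomb codeword."""
    zeros = 0
    while reader.read_bit() == 0:
        zeros += 1
    return (1 << zeros) - 1 + reader.read_bits(zeros)

def decode_values(data, count):
    """Each codeword starts where the previous one ended, so this loop is sequential."""
    reader = BitReader(data)
    return [read_exp_golomb(reader) for _ in range(count)]

# Codewords "1", "010", "011", "00100" encode 0, 1, 2, 3 (padded with zero bits).
values = decode_values(bytes([0b10100110, 0b01000000]), 4)
```

In such a scheme parallelism is available across independently addressable units of the bitstream, such as slices, but not inside one of them, whereas the IDCT can process rows, columns and blocks independently.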

Using the Vulkan ProRes decoder

The commands below build and run the decoder. Note that a working Vulkan installation is required, along with the glslang SPIR-V compiler.

Clone my FFmpeg branch: git clone https://github.com/averne/FFmpeg.git -b vk-proresdec --depth 1 && cd FFmpeg.
Configure the build: ./configure --disable-doc --enable-gpl --enable-libglslang --enable-vulkan. Alternatively, a leaner configuration speeds up the compilation step (note that this variant selects libshaderc rather than glslang as the SPIR-V compiler): ./configure --disable-everything --disable-doc --enable-avcodec --enable-avformat --enable-ffmpeg --disable-optimizations --enable-debug --enable-gpl --enable-libshaderc --enable-vulkan --enable-libx264 --enable-decoder=prores,h264,hevc --enable-parser=prores,h264,hevc --enable-hwaccel=prores_vulkan,h264_nvdec,hevc_nvdec --enable-protocol=file --enable-demuxer=matroska,mxf,mp4,mov --enable-muxer=null,rawvideo,mp4,mov,matroska --enable-encoder=wrapped_avframe,rawvideo,libx264 --enable-filter=scale,setparams.
Compile: make -j$(nproc)
Launch the decoder: ./ffmpeg -hwaccel vulkan -i <file> -an -c:v libx264 output.mkv.
Benchmarking can be done with: ./ffmpeg -benchmark -hwaccel vulkan -i <file> -an -f null -.

Finally, samples can be created using the prores_ks encoder, like so: ffmpeg -f lavfi -i testsrc2=size=1280x720:duration=10 -pix_fmt yuv444p12le -c:v prores_ks -profile:v 4444 -y prores-ks-444p12-prog.mkv.
Varying the pixel format and video dimensions is possible by tweaking the appropriate values.
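
To generate several such samples in one go, the command above can be parameterized. The following Python sketch only builds the argument lists; the pairing of pixel formats with prores_ks profiles ("hq" for 4:2:2, "4444" for 4:4:4) and the output naming scheme are assumptions made for illustration, and the helper name is hypothetical.

```python
def prores_sample_command(pix_fmt, profile, size="1280x720", duration=10):
    """Build an ffmpeg argument list that encodes a synthetic ProRes sample."""
    output = f"prores-ks-{pix_fmt}-{size}.mkv"
    return [
        "ffmpeg", "-f", "lavfi",
        "-i", f"testsrc2=size={size}:duration={duration}",
        "-pix_fmt", pix_fmt,
        "-c:v", "prores_ks", "-profile:v", profile,
        "-y", output,
    ]

# Assumed pairings of pixel format and prores_ks profile.
variants = [("yuv422p10le", "hq"), ("yuv444p12le", "4444")]
commands = [prores_sample_command(fmt, prof, size="3840x2160") for fmt, prof in variants]
for cmd in commands:
    print(" ".join(cmd))
    # To actually run a command: subprocess.run(cmd, check=True)
```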
