Matt McManis

Reputation: 4675

How to use FFmpeg Colorspace Options

I'm trying to understand the difference between FFmpeg's colorspace arguments:

The colorspace categories are:

  • Stream tags: -pix_fmt, -colorspace, -color_primaries, -color_trc, -color_range
  • Codec parameters: -x264-params (colorprim, transfer, colormatrix)
  • Filters: -vf colormatrix, -vf colorspace

Question

When should each be used when encoding from raw, or when transcoding/converting from one format to another, such as MPG to MP4?

Do I need to specify the input colorspace or will FFmpeg auto-detect?


Problems

There is no -vp8-params option, only -x264-params. For VP8 and other codecs, should I use the normal output options or a -vf filter instead?


Resources

https://ffmpeg.org/ffmpeg-filters.html#colormatrix
https://ffmpeg.org/ffmpeg-filters.html#colorspace
https://trac.ffmpeg.org/wiki/colorspace

Upvotes: 14

Views: 34593

Answers (1)

F.X.

Reputation: 7317

This might not completely answer your questions, but I found the intricacies of FFmpeg, libswscale and the various filters to be poorly documented, so here is my understanding of how it all fits together.

  • -pix_fmt, -colorspace, -color_primaries, -color_trc and -color_range are for setting values inside the AVFrame structure described here. They don't do any conversion on their own, they just tag the input or output stream (depending on whether they're placed before or after -i). As far as I know this doesn't modify the pixel values themselves.

  • Some codecs or formats also need to know what colorspace they're operating in because they need to insert that in the encoded stream, and sometimes FFmpeg doesn't pass what it knows. -x264-params and -x265-params do that manually. As far as I know this alone doesn't modify the pixel values either.

  • The various -vf options are there to convert between colorspaces. They actively change the pixel values and layout in memory.
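As a sketch, the three mechanisms above might look like this on the command line (filenames and target values are placeholders, not a recommendation):

```shell
# 1) Tag only: mark the output stream as BT.709 without touching pixel values
#    (placed after -i, so the options apply to the output)
ffmpeg -i in.mp4 -c:v libx264 -colorspace bt709 -color_primaries bt709 \
       -color_trc bt709 -color_range tv out.mp4

# 2) Codec-level signalling: write the same information into the x264 bitstream
ffmpeg -i in.mp4 -c:v libx264 \
       -x264-params "colorprim=bt709:transfer=bt709:colormatrix=bt709" out.mp4

# 3) Actual conversion: change the pixel values with a filter
#    (here, remap the YUV matrix from BT.601 to BT.709)
ffmpeg -i in.mp4 -vf colormatrix=bt601:bt709 -c:v libx264 out.mp4
```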

When running ffprobe <video file> or ffmpeg -i <video file>, the console standard error contains the details about each stream, such as for example:

...
  Stream #0:0: Video: hevc (High 10), yuv420p10le(tv, bt2020nc/bt2020/arib-std-b67), 1920x1080, SAR 1:1 DAR 16:9, 23.98 fps, 23.98 tbr, 1k tbn (default)
...

The yuv420p10le(tv, bt2020nc/bt2020/arib-std-b67) in the example output above contains the following information about how the pixels are stored in memory and how they should be interpreted as colors:

  • Pixel format is yuv420p10le, i.e. the memory layout consists of Y'CbCr signals stored as one full plane for Y', then Cb, then Cr at reduced resolution, where each pixel value is 16 bits wide and little-endian but only contains 10-bit values between 0 and 1023 (-pix_fmt option)
  • Range is tv, i.e. video signals use limited range (-color_range option)
  • The YUV <-> RGB color matrix is bt2020nc i.e. ITU-R BT.2020 non-constant luminance matrix (-colorspace option)
  • The primaries are bt2020 i.e. ITU-R BT.2020 primaries and white point (-color_primaries option)
  • The transfer function is arib-std-b67 i.e. HLG (-color_trc option)

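To read those same fields without parsing the banner, ffprobe can print them directly (a sketch; in.mp4 is a placeholder):

```shell
# Print pixel format, range, matrix, primaries and transfer for the first
# video stream, one key=value pair per line
ffprobe -v error -select_streams v:0 \
        -show_entries stream=pix_fmt,color_range,color_space,color_primaries,color_transfer \
        -of default=noprint_wrappers=1 in.mp4
```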
Each option in the list above can be provided before the input (e.g. -color_primaries bt2020 -i <video file>) to override the default, but this is usually only needed if the input has incorrect or missing tags.
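For instance, if an HLG/BT.2020 file arrives with its tags stripped, a hypothetical fix-up could re-tag it without re-encoding:

```shell
# Tell FFmpeg what the untagged pixels really are (input-side options),
# then stream-copy so the pixel values themselves are untouched
ffmpeg -color_primaries bt2020 -color_trc arib-std-b67 -colorspace bt2020nc \
       -color_range tv -i untagged.mp4 -c:v copy tagged.mp4
```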

Another example is yuv420p(tv, bt709), which translates to:

  • 8-bit planar Y'CbCr pixel format
  • Limited range
  • ITU-R BT.709 primaries, white point and RGB/YUV conversion matrix

Now, this is where it gets fun: some filters are automatically applied depending on what FFmpeg knows of the input and output streams, usually using the libswscale component (-vf scale if invoked explicitly). This is obviously needed if you're converting between pixel formats (e.g. from RGB to YUV or between different YUV formats), but you can also use that filter to do some conversions manually or to change the image size, for example.
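Invoked explicitly, that same libswscale path looks something like this (values are illustrative):

```shell
# Resize and convert pixel format in one libswscale pass
ffmpeg -i in.mp4 -vf "scale=1280:720:flags=lanczos" -pix_fmt yuv420p out.mp4

# scale also exposes colorspace knobs, e.g. forcing the YUV matrices it uses
ffmpeg -i in.mp4 \
       -vf "scale=in_color_matrix=bt601:out_color_matrix=bt709" out.mp4
```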

The big issue is that libswscale doesn't handle all cases very well, especially on the command line where you can't set all of its parameters, and especially for wide-gamut / HDR spaces like BT.2020. There are additional filters (colorspace, colormatrix and zscale) which don't cause as much trouble and which replace parts of libswscale. They do appear to set the correct AVFrame pixel formats as far as I could tell, so libswscale doesn't try to apply a conversion twice, but I'm a bit hazy on which conversions are automatic and which are not.
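A hedged sketch of those dedicated filters (zscale requires an FFmpeg build linked against libzimg; filenames and target values are illustrative):

```shell
# colorspace filter: convert matrix, primaries and transfer to BT.709 in one go
ffmpeg -i hdr_in.mp4 -vf colorspace=all=bt709 -c:v libx264 sdr_out.mp4

# zscale: same BT.709 target, but via the zimg library instead of libswscale
ffmpeg -i hdr_in.mp4 \
       -vf "zscale=matrix=bt709:primaries=bt709:transfer=bt709" \
       -c:v libx264 sdr_out.mp4
```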

If needed, each option above can also be specified after the input to override the output configuration instead (e.g. -i <video> -color_primaries bt2020), but I prefer using filters explicitly as it avoids having to trust that the correct conversions happen.
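That explicit-filter preference might look like the following sketch: do the conversion with a filter, then tag the output to match, so nothing is left to automatic conversion:

```shell
# Convert the pixel values to BT.709, then tag the output stream accordingly
ffmpeg -i in.mp4 -vf colorspace=all=bt709 \
       -colorspace bt709 -color_primaries bt709 -color_trc bt709 \
       -color_range tv -c:v libx264 out.mp4
```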

Upvotes: 18
