Xelpi

Reputation: 11

How can this YUV420 video have uneven dimensions?

Just trying to cement my understanding of video codecs and can't seem to find any information about this.

https://www.mediafire.com/file/cbx8sciq5mie94m/arrow.webm/file contains a tiny VP9 WebM. I generated it with ffmpeg by doing a simple gif -> webm conversion.

The source gif had a 13x11 resolution. Somehow, the output video also has a 13x11 resolution. I'm trying to understand how that is possible.

As far as I understand:

  1. The YUV420 pixel format should make this impossible, because chroma is subsampled by a factor of 2 in each direction, which forces both dimensions to be divisible by two.

  2. VP9 itself has a minimum block size of 16x16(?), so at least that much data must be encoded(?)

Consequently, I assume the stream actually encoded here is either ~14x12 or ~16x16 and is somehow being scaled or cropped down to 13x11.
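To make the arithmetic behind that assumption concrete, here is a small Python sketch. The helper names (`round_up`, `chroma_plane_size`) are mine and purely illustrative. The 16-pixel block size is the unverified guess from point 2, and rounding the chroma planes up for odd luma sizes is an assumption on my part, not something I have confirmed against libvpx.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return -(-value // multiple) * multiple


def chroma_plane_size(width: int, height: int) -> tuple[int, int]:
    """Chroma plane size for 4:2:0, ASSUMING odd luma sizes round up."""
    return (width + 1) // 2, (height + 1) // 2


width, height = 13, 11  # resolution reported by ffprobe

# Guess 1: padded so both dimensions are divisible by two.
even_padded = (round_up(width, 2), round_up(height, 2))

# Guess 2: padded to the (unverified) 16x16 block size.
block_padded = (round_up(width, 16), round_up(height, 16))

# What the chroma planes would be if the codec simply rounds up.
chroma = chroma_plane_size(width, height)

print(even_padded)   # (14, 12)
print(block_padded)  # (16, 16)
print(chroma)        # (7, 6)
```

If the chroma planes really are rounded up like this, a 13x11 luma plane would pair with 7x6 chroma planes and no even-size requirement would be needed, but I don't know whether that is what happens here.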

The problem is that I can't find any explanation of how this works.

Here's the ffprobe output for the stream:

[STREAM]
index=0
codec_name=vp9
codec_long_name=Google VP9
profile=Profile 0
codec_type=video
codec_time_base=1/60
codec_tag_string=[0][0][0][0]
codec_tag=0x0000
width=13
height=11
coded_width=13
coded_height=11
closed_captions=0
has_b_frames=0
sample_aspect_ratio=1:1
display_aspect_ratio=13:11
pix_fmt=yuv420p
level=-99
color_range=tv
color_space=unknown
color_transfer=unknown
color_primaries=unknown
chroma_location=unspecified
field_order=unknown
timecode=N/A
refs=1
id=N/A
r_frame_rate=60/1
avg_frame_rate=60/1
time_base=1/1000
start_pts=0
start_time=0.000000
duration_ts=N/A
duration=N/A
bit_rate=N/A
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=N/A
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
TAG:alpha_mode=1
TAG:ENCODER=Lavc58.91.100 libvpx-vp9
TAG:DURATION=00:00:00.600000000
[/STREAM]

and for the last frame:

[FRAME]
media_type=video
stream_index=0
key_frame=0
pkt_pts=583
pkt_pts_time=0.583000
pkt_dts=583
pkt_dts_time=0.583000
best_effort_timestamp=583
best_effort_timestamp_time=0.583000
pkt_duration=16
pkt_duration_time=0.016000
pkt_pos=3639
pkt_size=15
width=13
height=11
pix_fmt=yuv420p
sample_aspect_ratio=1:1
pict_type=P
coded_picture_number=0
display_picture_number=0
interlaced_frame=0
top_field_first=0
repeat_pict=0
color_range=tv
color_space=unknown
color_primaries=unknown
color_transfer=unknown
chroma_location=unspecified
[/FRAME]

Going by the coded_width and coded_height values (which are supposed to represent the "true" width/height before any scaling(?)) plus the sample aspect ratio of 1:1, this is, as far as I can tell, genuinely a 13x11 video stream. But that should be impossible, no?

My question is, why is this a valid video file?

If I try to, e.g., zscale something to an uneven resolution in the YUV420 pixel format, I hit the expected chroma subsampling errors.

Upvotes: 1

Views: 897

Answers (0)
