Reputation: 979
Here's a curious option listed in the man pages of ffmpeg:
-aframes number (output)
Set the number of audio frames to output. This is an obsolete alias for "-frames:a", which you should use instead.
What an 'audio frame' is seems unclear to me. This SO answer says that frame is synonymous with sample, but that can't be what ffmpeg thinks a frame is. Just look at this example, where I resample some audio to 22.05 kHz and limit the output to exactly 313 frames:
$ ffmpeg -i input.mp3 -frames:a 313 -ar:a 22.05K output.wav
If 'frame' and 'sample' were synonymous, we would expect the audio duration to be 0.014 seconds (313 / 22050), but the actual duration is 8 seconds. That means ffmpeg treats my input as having a frame rate of about 39.125 frames per second (313 / 8).
What's going on here? What does ffmpeg think an audio frame really is? How do I go about finding this frame rate of my input audio?
Upvotes: 3
Views: 2014
Reputation: 92988
FFmpeg uses an AVFrame structure internally to convey and process all media data in chunks. The number of samples per frame depends on the decoder. For video, a frame consists of all pixel data for one picture, which is a logical grouping, although it can also contain pixel data for two half-pictures of an interlaced video stream.
For audio, decoders of DCT-based codecs typically fill a frame with the number of samples used in the DCT window: that's 1024 for AAC and 576 or 1152 for MP3, depending on sampling rate, as Brad mentioned. PCM samples are independent, so there is no inherent concept of framing and thus no inherent frame size. However, the samples still need to be accommodated within AVFrames, and ffmpeg defaults to 1024 samples per frame for planar PCM in each buffer (one buffer per channel).
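To make those frame sizes concrete, here is a minimal Python sketch that maps a codec and sample rate to the nominal samples per frame and the resulting frame duration. The function names are hypothetical, and the table covers only the two codecs mentioned here. The MP3 split (1152 samples at the MPEG-1 rates of 32, 44.1 and 48 kHz, 576 at the lower MPEG-2/2.5 rates) is the standard Layer III layout, not something queried from ffmpeg.

```python
# Illustrative only: function names are hypothetical, and the values are
# nominal codec frame sizes rather than anything read back from ffmpeg.

MPEG1_RATES = {32000, 44100, 48000}  # MPEG-1 Layer III sample rates


def samples_per_frame(codec: str, sample_rate: int) -> int:
    """Return the nominal number of samples per channel in one decoded frame."""
    if codec == "aac":
        return 1024
    if codec == "mp3":
        # Lower (MPEG-2 / MPEG-2.5) sample rates use half-size frames.
        return 1152 if sample_rate in MPEG1_RATES else 576
    raise ValueError(f"no nominal frame size known for codec {codec!r}")


def frame_duration_ms(codec: str, sample_rate: int) -> float:
    """Duration of one frame in milliseconds."""
    return 1000.0 * samples_per_frame(codec, sample_rate) / sample_rate


if __name__ == "__main__":
    for codec, rate in [("mp3", 44100), ("mp3", 22050), ("aac", 48000)]:
        print(codec, rate, samples_per_frame(codec, rate),
              f"{frame_duration_ms(codec, rate):.2f} ms")
```

Note that a 44.1 kHz MP3 frame and a 22.05 kHz MP3 frame come out at the same duration, because the frame size halves along with the sample rate.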
You can use the ashowinfo filter to display the frame size. You can also use the asetnsamples filter to regroup the data in a custom frame size.
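As a rough sketch of how those two filters might be invoked, the Python helpers below build the ffmpeg argument lists. The helper names and file names are placeholders. The first command decodes the input, discards the output through the null muxer, and lets ashowinfo log one line per frame; the second regroups the audio into frames of a chosen size with asetnsamples.

```python
import subprocess


def ashowinfo_command(input_path: str) -> list[str]:
    """Build a command that logs per-frame info and writes no output file."""
    return ["ffmpeg", "-hide_banner", "-i", input_path,
            "-af", "ashowinfo", "-f", "null", "-"]


def regroup_command(input_path: str, output_path: str, n: int) -> list[str]:
    """Build a command that regroups audio into frames of n samples each."""
    return ["ffmpeg", "-hide_banner", "-i", input_path,
            "-af", f"asetnsamples=n={n}", output_path]


def show_frame_info(input_path: str) -> str:
    """Run ashowinfo (requires ffmpeg on PATH) and return its log text."""
    result = subprocess.run(ashowinfo_command(input_path),
                            capture_output=True, text=True, check=False)
    # ffmpeg writes its log, including the ashowinfo lines, to stderr.
    return result.stderr
```

Calling `show_frame_info("input.mp3")` and looking at the `nb_samples` field of each logged line should show the frame size the decoder is producing.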
Upvotes: 4
Reputation: 163234
A "frame" is a bit of an overloaded term here.
In PCM, a frame is a set of samples occurring at the same time (one per channel). If your audio were 22.05 kHz and you had 313 PCM frames, its length in time would be about 14 milliseconds, as you expect.
However, your audio isn't PCM... it's MP3. An MP3 frame is about 26 milliseconds long (1152 samples at 44.1 kHz). 313 of them add up to about 8 seconds. The frame here is the smallest block of audio that can be decoded; the samples inside it cannot be decoded independently of one another. (In fact, some frames actually depend on other frames via the bit reservoir!)
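The two readings of "313 frames" can be checked with a few lines of Python. This is only a worked calculation, and it assumes the input is a 44.1 kHz MPEG-1 Layer III file (1152 samples per frame); the question doesn't state the input's sample rate.

```python
FRAMES = 313

# PCM reading: one frame is one sample per channel at the 22.05 kHz output rate.
pcm_seconds = FRAMES / 22050

# MP3 reading: ASSUMES 44.1 kHz MPEG-1 Layer III input, 1152 samples per frame.
mp3_seconds = FRAMES * 1152 / 44100
mp3_frames_per_second = 44100 / 1152

print(f"PCM reading: {pcm_seconds * 1000:.1f} ms")
print(f"MP3 reading: {mp3_seconds:.2f} s")
print(f"MP3 frame rate: {mp3_frames_per_second:.2f} frames/s")
```

Under that assumption the MP3 reading comes to roughly 8.18 seconds at about 38.28 frames per second. The 39.125 figure in the question is what you get from dividing 313 by a duration rounded down to 8 seconds.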
Upvotes: 3