Concatenating audio files with ffmpeg results in a wrong total duration

Question

By "wrong total duration" I mean that the duration of the concatenated file differs from the sum of the durations of the individual audio files.

sum_duration_files != duration( concatenation of files )

In particular, I am concatenating two OGG audio files with this command:

ffmpeg -safe 0 -loglevel quiet \
  -f concat -segment_time_metadata 1 -i {m3u_file_name} \
  -vf select=concatdec_select \
  -af aselect=concatdec_select,aresample=async=1 \
  {ogg_file_name}

And I get the following

# Output of:  ffprobe .ogg


======== files_in 

Input #0, ogg, from 'f1.ogg':
  Duration: 00:00:04.32, start: 0.000000, bitrate: 28 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, mono, fltp


Input #0, ogg, from 'f2.ogg':
  Duration: 00:00:00.70, start: 0.000000, bitrate: 68 kb/s
    Stream #0:0: Audio: vorbis, 44100 Hz, mono, fltp, 160 kb/s
    Metadata:
      ENCODER         : Lavc57.107.100 libvorbis

Note the durations: 4.32 s and 0.70 s.

And this is the output file.

========== files out (concatenation of files_in)

Input #0, ogg, from 'f_concat_v1.ogg':
  Duration: 00:00:04.61, start: 0.000000, bitrate: 61 kb/s
    Stream #0:0: Audio: vorbis, 48000 Hz, mono, fltp, 80 kb/s
    Metadata:
      ENCODER         : Lavc57.107.100 libvorbis

Duration: 4.61 sec

Since 4.61 s != 4.32 s + 0.70 s (= 5.02 s), I have a problem.

kesh · Accepted Answer

The issue here is that the concatenation approach is wrong for these files. As the FFmpeg wiki article suggests, file-level concatenation (the concat demuxer, -f concat) requires all files in the listing to have exactly the same codec parameters. In your case, only the number of channels (mono) and the sample format (fltp) match. The codec (opus vs. vorbis) and the sampling rate (48000 Hz vs. 44100 Hz) differ.

-f concat grabs the first set of parameters and runs with it. In your case, it uses 48000 S/s for all the files. Although the second file is sampled at 44100 S/s, it is treated as 48000 S/s, so it plays back faster (and therefore shorter) than it should. I don't know how the difference in codec played out in the output.

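So, the standard approach is to use the concat filter, -filter_complex concat=n=2:v=0:a=1, with these files given as separate inputs. Use v=0 here because the files are audio-only and contain no video stream to concatenate.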
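The concat-filter approach for two audio-only inputs could be sketched roughly as follows. This is a minimal sketch, not a tested recipe for your exact files: the helper name concat_audio, the FFMPEG override variable, and the output file name are all assumptions made for illustration. It uses v=0 because the inputs have no video stream, and it relies on the filter graph negotiating a common sample rate for the two inputs, which is my understanding of how the concat filter behaves.

```shell
#!/bin/sh
# Sketch: concatenate two audio-only files with the concat *filter*
# instead of the concat demuxer (-f concat).
# concat_audio is a hypothetical helper name. FFMPEG is an assumed
# override variable; it defaults to the "ffmpeg" binary on PATH.
concat_audio() {
  in1=$1
  in2=$2
  out=$3
  # [0:a] and [1:a] select the audio of the first and second input.
  # n=2 segments, v=0 video streams, a=1 audio stream per segment.
  "${FFMPEG:-ffmpeg}" -i "$in1" -i "$in2" \
    -filter_complex '[0:a][1:a]concat=n=2:v=0:a=1[out]' \
    -map '[out]' "$out"
}

# Usage (input names from the question; the output name is illustrative):
#   concat_audio f1.ogg f2.ogg f_concat_v2.ogg
# The resulting duration can then be checked with:
#   ffprobe -v error -show_entries format=duration \
#     -of default=noprint_wrappers=1:nokey=1 f_concat_v2.ogg
```

Because each file is opened as its own input, each is decoded with its own codec and sample rate before the audio reaches the filter, so the output duration should come out close to the sum of the input durations.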

~~Out of curiosity, have you listened to the wrong-duration output file?~~ [edit: never mind, your self-answer indicates one of them is a silent track]
