Isaac

Reputation: 1401

FFMPEG - concatenating mp4s from different sources - unable to stop "Non-monotonous DTS in output stream" warning

I need to concatenate mp4 files from different sources, which means some properties are out of my control, such as timebase, aspect ratio and encoding. To get around this, I re-encode the files and attempt to standardise them before concatenating. Despite this, I get "Non-monotonous DTS in output stream" warnings during the concatenation stage, and the audio/video sync in the output is always broken by the last segment.

I know there are a lot of other questions about resolving this warning. I've been through them all and reviewed the documentation, but I've still been unable to solve it.

I think the thing which I don't understand is: if I have mp4s from different sources, what exactly do I need to do to ensure that the files will always neatly concatenate together?

What I've tried so far

The script I'm using to standardise the mp4 files before concatenation is the following (it sets resolution, frame rate, timebase, audio bitrate, video bitrate, audio codec and video codec):

ffmpeg -y -i $1 -vf 'scale=1280:720:force_original_aspect_ratio=1,pad=1280:720:(ow-iw)/2:(oh-ih)/2' -r 30 -video_track_timescale 90000 -b:a 128K -b:v 1200K -c:a aac -c:v libx264 $2

Here's the ffprobe output for two of the files. There are some differences, but I'm not sure whether they are significant.

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'intro.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:08.98, start: 0.000000, bitrate: 1210 kb/s
    Stream #0:0(eng): Video: h264 (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1069 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 132 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'middle.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:59.72, start: 0.000000, bitrate: 1200 kb/s
    Stream #0:0(und): Video: h264 (avc1 / 0x31637661), yuv420p, 1280x720 [SAR 1:1 DAR 16:9], 1063 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

They all have normal video and audio at this point.

After that I concatenate them and add a watermark using the following (it sucks that I need to re-encode here):

  ffmpeg -y \
    -f concat \
    -safe 0 \
    -i $INFILES \
    -c:v libx264 \
    -c:a copy \
    -preset fast \
    -vf drawtext=enable="'between(t, $DRAW_TEXT_DELAY, $DRAW_TEXT_DURATION)': fontfile=$FONT_DIR/$FONT: text='$TEXT': fontcolor=$FONTCOLOR: fontsize=$FONTSIZE: $POSITION" \
    $OUTFILE

INFILES is a path to a text file formatted like:

file /usr/src/app/data/test/out/intro.mp4
file /usr/src/app/data/test/out/middle.mp4
file /usr/src/app/data/test/out/outro.mp4

What am I missing here? Is there a way to debug this further?

Upvotes: 2

Views: 1336

Answers (2)

Gergely Lukacsy

Reputation: 3074

There are three methods to concatenate files in FFmpeg.

  1. Demuxer (You are using this)

    This method can be used to concat files with the same parameters, like codecs, size, PAR, etc.

    $ ffmpeg -f concat -i files.txt [...] output.mp4
    
  2. Protocol

    Same as the first one, but on top of that, this method is useful for files that can be joined bitwise, so it doesn't involve re-encoding (some formats support this, like MPEG-TS or some lossless formats).

    $ ffmpeg -i "concat:FILE_0| ... |FILE_N" [...] output.mp4
    
  3. Filter

    If you have videos with different codecs, you have to use this method:

    $ ffmpeg -i <FILE_0> ... -i <FILE_N> [...] -filter_complex "[0:0][0:1]...[<N>:0][<N>:1] concat=n=<N+1>:v=1:a=1[v_out][a_out]" -map "[v_out]" -map "[a_out]" output.mp4
    

The concat filter decodes the video and re-encodes it with the same parameters. It also takes care of the audio streams. I'm not entirely sure what it does if the resolutions are different, but this should be a good start.
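Writing the filter graph by hand gets tedious once there are more than a couple of inputs. A minimal sketch of a helper that builds the string for a given number of inputs follows. The function name build_concat_filter is made up for illustration, it assumes each input has exactly one video and one audio stream, and it selects streams by type ([0:v][0:a]) instead of by index ([0:0][0:1]).

```shell
#!/bin/sh
# Hypothetical helper: print a concat filter graph for N inputs.
# Assumes every input has one video stream and one audio stream.
build_concat_filter() {  # usage: build_concat_filter <number_of_inputs>
  n=$1
  i=0
  labels=''
  # Append one "[i:v][i:a]" pair per input, in input order.
  while [ "$i" -lt "$n" ]; do
    labels="${labels}[${i}:v][${i}:a]"
    i=$((i + 1))
  done
  # n= must equal the number of segments being joined.
  printf '%sconcat=n=%s:v=1:a=1[v_out][a_out]\n' "$labels" "$n"
}

# Example use (not run here):
# ffmpeg -i a.mp4 -i b.mp4 -i c.mp4 \
#   -filter_complex "$(build_concat_filter 3)" \
#   -map "[v_out]" -map "[a_out]" output.mp4
```

As far as I know, the filter still expects every segment to have the same frame size, so scaling and padding the inputs beforehand is probably still needed.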

Upvotes: 0

Gyan

Reputation: 93299

Your audio streams have different sampling rates (48000 Hz in intro.mp4, 44100 Hz in middle.mp4), and may have different channel counts as well. Also, compressed MPEG audio streams will introduce slight async upon concat.
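To check for this kind of mismatch before concatenating, something along these lines should work. It is only a sketch: probe_audio and all_same are hypothetical helper names, and only the first audio stream of each file is inspected.

```shell
#!/bin/sh
# Print the sample rate and channel count of the first audio stream,
# as one comma-separated line per file.
probe_audio() {  # usage: probe_audio file.mp4
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=sample_rate,channels -of csv=p=0 "$1"
}

# Succeed only if every line read from stdin is identical.
all_same() {
  awk 'NR == 1 { first = $0 } NR > 1 && $0 != first { bad = 1 } END { exit bad + 0 }'
}

# Example use (not run here):
# for f in intro.mp4 middle.mp4 outro.mp4; do probe_audio "$f"; done \
#   | all_same || echo "audio parameters differ"
```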

Use

ffmpeg -y -i $1 -vf 'scale=1280:720:force_original_aspect_ratio=1,pad=1280:720:(ow-iw)/2:(oh-ih)/2,setsar=1,format=yuv420p' -r 30 -c:v libx264 -b:v 1200K -ac 2 -ar 48000 -c:a pcm_s16le -video_track_timescale 90000 $2

to standardize, but save the output to a MOV file rather than MP4, because the audio is now PCM.

Then during concat, change -c:a copy to -c:a aac.
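Put together, the workflow could be sketched like this. The function names are hypothetical, the ffmpeg options are the ones from above and from the question, and the drawtext filter is left out for brevity. The list writer assumes paths without single quotes.

```shell
#!/bin/sh
# Step 1: re-encode one clip to common parameters, PCM audio in a MOV container.
standardize() {  # usage: standardize input.mp4 output.mov
  ffmpeg -y -i "$1" \
    -vf 'scale=1280:720:force_original_aspect_ratio=1,pad=1280:720:(ow-iw)/2:(oh-ih)/2,setsar=1,format=yuv420p' \
    -r 30 -c:v libx264 -b:v 1200K \
    -ac 2 -ar 48000 -c:a pcm_s16le \
    -video_track_timescale 90000 "$2"
}

# Step 2: write the concat demuxer list, one "file '<path>'" line per clip.
# Assumption: no path contains a single quote.
write_list() {  # usage: write_list list.txt clip1.mov clip2.mov ...
  list=$1
  shift
  : > "$list"
  for f in "$@"; do
    printf "file '%s'\n" "$f" >> "$list"
  done
}

# Step 3: concatenate, compressing the audio to AAC only once, at the end.
concat_clips() {  # usage: concat_clips list.txt output.mp4
  ffmpeg -y -f concat -safe 0 -i "$1" \
    -c:v libx264 -preset fast -c:a aac "$2"
}

# Example use (not run here):
# standardize intro.mp4 out/intro.mov
# standardize middle.mp4 out/middle.mov
# write_list list.txt "$PWD/out/intro.mov" "$PWD/out/middle.mov"
# concat_clips list.txt final.mp4
```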

Upvotes: 4
