Nutela

Reputation: 85

Merging multichannel audio tracks from Mumble with ffmpeg

We record talks through Mumble, and because Mumble has a nifty multichannel recording feature, I figured we could get per-speaker subtitles from YouTube by uploading each track separately with

for file in *; do ffmpeg -loop 1 -r 2 -i "$img" -i "$file" -vf scale=-1:380 -c:v libx264 -preset slow -tune stillimage -crf 18 -c:a copy -shortest -pix_fmt yuv420p -threads 0 "$file".mkv; done

I can then prepend a nickname for each speaker to the automatic captions (i.e. the subtitles from YouTube), e.g. with a sed shell script. Works like a charm.
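The nickname step could look roughly like the sketch below. It assumes the captions were saved as SRT files with Unix line endings (cue number, timing line, text, blank line); the question does not say which caption format was used, and `prefix_captions` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: prefix every caption text line of an SRT file
# with a speaker nickname and print the result to stdout.
# Assumption: SRT layout with Unix line endings.
prefix_captions() {
    nick=$1
    srt=$2
    # Leave cue numbers, timing lines and blank lines untouched;
    # every remaining line is treated as caption text and gets the prefix.
    sed -e '/^[0-9][0-9]*$/b' \
        -e '/-->/b' \
        -e '/^$/b' \
        -e "s/^/${nick}: /" "$srt"
}

# Usage (file names are placeholders):
#   prefix_captions chrisaiki2 chrisaiki2.srt > chrisaiki2.named.srt
```

Two limitations of this sketch: a caption line that consists only of digits is mistaken for a cue number and left unprefixed, and a nickname containing `/` or `&` would need escaping before it is passed to sed.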

But merging those tracks with ffmpeg gets tricky. I use

ffmpeg -i input1.ogg -i input2.ogg -i input3.ogg -i input4.ogg -i input5.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amerge=inputs=5[aout]" -map "[aout]" -ac 2 output.ogg

Somehow ffmpeg shortens the resulting audio track, and I don't yet know why. I tried putting the longest track first and last, since including the silent tracks made the mixdown even shorter. Here are the warnings:

[Parsed_amerge_0 @ 0x7f8b29f02d20] No channel layout for input 1

[Parsed_amerge_0 @ 0x7f8b29f02d20] Input channel layouts overlap: output layout will be determined by the number of distinct input channels

But it says

[Parsed_amerge_0 @ 0x7f8b29f02d20] No channel layout for input 1

even when I change the order of inputs.

Although the tracks should be of equal length according to Mumble's documentation, VLC's media info shows different durations. However, the tracks are not out of sync, just cut off at the end.
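To compare the track lengths without opening each file in VLC, something like the following sketch may help. It assumes `ffprobe` (shipped with ffmpeg) is on the PATH; `list_durations` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "<seconds> <file>" for each given file,
# shortest track first.
# Assumption: ffprobe is installed alongside ffmpeg.
list_durations() {
    for f in "$@"; do
        # Ask ffprobe for the container duration only, as a bare number.
        d=$(ffprobe -v error -show_entries format=duration \
              -of default=noprint_wrappers=1:nokey=1 "$f")
        printf '%s %s\n' "$d" "$f"
    done | LC_ALL=C sort -n
}

# Usage:
#   list_durations Mumble-*.ogg
```

The first line of the output is the shortest track, which is the one that decides where a filter that stops with its shortest input will end.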

I also have no idea why ffmpeg mentions FLAC; all the files are Vorbis.

ffmpeg -i Mumble-2017-09-09-16-33-18-149.210.187.155-chrisaiki2.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-Recorder.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-steempowerpics.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-Taconator.ogg -i Mumble-2017-09-09-16-33-18-149.210.187.155-fuzzynewest.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amerge=inputs=5[aout]" -map "[aout]" -ac 2 output5.ogg

ffmpeg version 2.8.4 Copyright (c) 2000-2015 the FFmpeg developers
  built with Apple LLVM version 7.0.2 (clang-700.1.81)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/2.8.4 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-opencl --enable-libx264 --enable-libmp3lame --enable-libvo-aacenc --enable-libxvid --enable-vda
  libavutil      54. 31.100 / 54. 31.100
  libavcodec     56. 60.100 / 56. 60.100
  libavformat    56. 40.101 / 56. 40.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 40.101 /  5. 40.101
  libavresample   2.  1.  0 /  2.  1.  0
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.101 /  1.  2.101
  libpostproc    53.  3.100 / 53.  3.100
Input #0, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-chrisaiki2.ogg':
  Duration: 00:40:01.19, start: 0.000000, bitrate: 17 kb/s
    Stream #0:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
    Metadata:
      ENCODER         : libsndfile
      TITLE           : chrisaiki2
Input #1, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-Recorder.ogg':
  Duration: 00:33:57.88, start: 0.000000, bitrate: 1 kb/s
    Stream #1:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
    Metadata:
      ENCODER         : libsndfile
      TITLE           : Recorder
Input #2, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-steempowerpics.ogg':
  Duration: 00:33:53.93, start: 0.000000, bitrate: 1 kb/s
    Stream #2:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
    Metadata:
      ENCODER         : libsndfile
      TITLE           : steempowerpics
Input #3, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-Taconator.ogg':
  Duration: 00:35:36.37, start: 0.000000, bitrate: 6 kb/s
    Stream #3:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
    Metadata:
      ENCODER         : libsndfile
      TITLE           : Taconator
Input #4, ogg, from 'Mumble-2017-09-09-16-33-18-149.210.187.155-fuzzynewest.ogg':
  Duration: 00:41:53.23, start: 0.000000, bitrate: 30 kb/s
    Stream #4:0: Audio: vorbis, 48000 Hz, mono, fltp, 86 kb/s
    Metadata:
      ENCODER         : libsndfile
      TITLE           : fuzzynewest
File 'output5.ogg' already exists. Overwrite ? [y/N] y
[Parsed_amerge_0 @ 0x7f8b29f02d20] No channel layout for input 1
[Parsed_amerge_0 @ 0x7f8b29f02d20] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[flac @ 0x7f8b2b005600] encoding as 24 bits-per-sample
Output #0, ogg, to 'output5.ogg':
  Metadata:
    encoder         : Lavf56.40.101
    Stream #0:0: Audio: flac, 48000 Hz, stereo, s32 (24 bit), 128 kb/s (default)
    Metadata:
      encoder         : Lavc56.60.100 flac
Stream mapping:
  Stream #0:0 (vorbis) -> amerge:in0
  Stream #1:0 (vorbis) -> amerge:in1
  Stream #2:0 (vorbis) -> amerge:in2
  Stream #3:0 (vorbis) -> amerge:in3
  Stream #4:0 (vorbis) -> amerge:in4
  amerge -> Stream #0:0 (flac)
Press [q] to stop, [?] for help
size=  100900kB time=00:33:53.94 bitrate= 406.4kbits/s    
video:0kB audio:100441kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.457024%

Mumble multichannel talk on reddit

Upvotes: 1

Views: 1090

Answers (2)

Nutela

Reputation: 85

I used amix in the end, like this:

ffmpeg -i input1.ogg -i input2.ogg -i input3.ogg -i input4.ogg -i input5.ogg -filter_complex "[0:a][1:a][2:a][3:a][4:a] amix=inputs=5:duration=longest[aout]" -map "[aout]" -ac 2 -c:a libvorbis -b:a 128k output.ogg

ffmpeg didn't recognize libvorbis so I had to reinstall with brew first: brew reinstall ffmpeg --with-libvorbis
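Since the number of speakers changes from talk to talk, the filter graph string can be generated instead of typed by hand. This is only a sketch: `amix_filter` is a hypothetical helper name, and it assumes every input file carries exactly one audio stream, as the Mumble recordings in the question do.

```shell
#!/bin/sh
# Hypothetical helper: print an amix filter graph for N inputs, e.g.
# "[0:a][1:a]amix=inputs=2:duration=longest[aout]" for N=2.
# Assumption: each input file has a single audio stream.
amix_filter() {
    n=$1
    i=0
    pads=""
    # Build one "[i:a]" input label per file.
    while [ "$i" -lt "$n" ]; do
        pads="${pads}[${i}:a]"
        i=$((i + 1))
    done
    printf '%samix=inputs=%s:duration=longest[aout]\n' "$pads" "$n"
}

# Usage with three tracks (file names are placeholders):
#   ffmpeg -i a.ogg -i b.ogg -i c.ogg \
#     -filter_complex "$(amix_filter 3)" -map "[aout]" \
#     -ac 2 -c:a libvorbis -b:a 128k output.ogg
```

With `duration=longest` the mix runs until the longest input ends, so shorter tracks no longer cut the result off.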

I then used

ffmpeg -loop 1 -r 2 -i "$img" -i "$snd" -vf scale=-1:380 -c:v libx264 -preset slow -tune stillimage -crf 18 -c:a copy -shortest -pix_fmt yuv420p -threads 0 output.mkv

to turn the mixed audio track into a video that I could upload to YouTube.

I had also merged the subtitles that YouTube generated for the individual tracks, and I simply added those to the resulting video. Works like a charm.

Upvotes: 0

llogan

Reputation: 133853

The amerge documentation states:

If inputs do not have the same duration, the output will stop with the shortest.

amix may be a better filter for this case.

Upvotes: 1
