SebMa

Reputation: 4719

Add coverart into ogg containing an opus audio stream with ffmpeg without re-encoding the audio stream

I'm trying to add cover art to an Ogg file with ffmpeg.

Here are my source.ogg and source.jpg files:

$ ffprobe -hide_banner source.ogg 
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
$ identify source.jpg 
source.jpg JPEG 480x360 480x360+0+0 8-bit DirectClass 15.1KB 0.000u 0:00.000

I tried this :

$ ffmpeg -hide_banner -i source.ogg -i source.jpg -map 0 -map 1 -c:a copy -c copy -map_metadata 0 dest.ogg -y && echo && ffprobe -hide_banner dest.ogg 
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
Input #1, image2, from 'source.jpg':
  Duration: 00:00:00.04, start: 0.000000, bitrate: 3023 kb/s
    Stream #1:0: Video: mjpeg, yuvj420p(pc, bt470bg/unknown/unknown), 480x360 [SAR 1:1 DAR 4:3], 25 tbr, 25 tbn, 25 tbc
[ogg @ 0x5655578064c0] Unsupported codec id in stream 1
Could not write header for output file #0 (incorrect codec parameters ?): Invalid argument
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #1:0 -> #0:1 (copy)
    Last message repeated 1 times
[ogg @ 0x5655577e8540] Format ogg detected only with low score of 1, misdetection possible!
dest.ogg: End of file

I've also found this answer but it does not explain how to do it with ffmpeg.

I've read that the ogg container accepts a "METADATA_BLOCK_PICTURE" metadata tag that can hold the picture as base64, so I tried this:

$ ffmpeg -hide_banner -i source.ogg -map 0 -c:a copy -c copy -metadata METADATA_BLOCK_PICTURE="$(base64 source.jpg)" dest.ogg
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
File 'dest.ogg' already exists. Overwrite ? [y/N] y
Output #0, ogg, to 'dest.ogg':
  Metadata:
    METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz
                    : ODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj
                    ..............................................................................
                    : nVmaS2E/urUWVbH6ORI9z2l8zyRfFpkLooIHSBuk9lFFoC6OBnP1SON8rEooqM2WOVHDdRRAAUVK
                    : KiiCWRRRRBJ//9k=
    encoder         : Lavf58.20.100
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
      METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkz
                      : ODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2Nj
                      : Y2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQED
                      ..............................................................................
                      : nVmaS2E/urUWVbH6ORI9z2l8zyRfFpkLooIHSBuk9lFFoC6OBnP1SON8rEooqM2WOVHDdRRAAUVK
                      : KiiCWRRRRBJ//9k=
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    1658kB time=00:03:02.41 bitrate=  74.5kbits/s speed=1.01e+03x    
video:0kB audio:1624kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.100392%

It kinda "worked", but neither ffplay nor mpv can parse the cover art :

$ ffplay -hide_banner dest.ogg
[ogg @ 0x5655577e8540] Failed to parse cover art block.
Input #0, ogg, from 'dest.ogg':
  Duration: 00:03:02.44, start: 0.000000, bitrate: 74 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
   3.95 M-A: -0.000 fd=   0 aq=   14KB vq=    0KB sq=    0B f=0/0    
$ mpv dest.ogg 
Playing: dest.ogg
[ffmpeg/demuxer] ogg: Failed to parse cover art block.
 (+) Audio --aid=1 (opus 2ch 48000Hz)
AO: [pulse] 48000Hz stereo 2ch float
A: 00:00:03 / 00:03:02 (2%)


Exiting... (Quit)

I also tried -metadata:s:a along with base64's --wrap 0 option (which I had forgotten to specify, oops :) ):

$ ffmpeg -i source.ogg -map 0 -c:a copy -c copy -metadata:s:a METADATA_BLOCK_PICTURE="$(base64 --wrap 0 source.jpg)" dest.ogg
Input #0, ogg, from 'source.ogg':
  Duration: 00:03:02.45, start: 0.007500, bitrate: 73 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
File 'dest.ogg' already exists. Overwrite ? [y/N] y
Output #0, ogg, to 'dest.ogg':
  Metadata:
    encoder         : Lavf58.20.100
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100
      METADATA_BLOCK_PICTURE: /9j/4AAQSkZJRgABAQAAAQABAAD/2wCEABALDA4MChAODQ4SERATGCgaGBYWGDEjJR0oOjM9PDkzODdASFxOQERXRTc4UG1RV19iZ2hnPk1xeXBkeFxlZ2MBERISGBUYLxoaL2NCOEJjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY2NjY//AABEIAWgB4AMBIgACEQEDEQH/xAAaAAACAwEBAAAAAAAAAAA
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
Press [q] to stop, [?] for help
size=    1658kB time=00:03:02.41 bitrate=  74.5kbits/s speed=1.22e+03x    
video:0kB audio:1624kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.084397%

But the JPEG cover art in dest.ogg still cannot be read properly:

$ ffprobe -hide_banner dest.ogg 
[ogg @ 0x5655577e8540] Invalid picture type: -2555936.
[ogg @ 0x5655577e8540] Could not read mimetype from an attached picture.
Input #0, ogg, from 'dest.ogg':
  Duration: 00:03:02.44, start: 0.000000, bitrate: 74 kb/s
    Stream #0:0: Audio: opus, 48000 Hz, stereo, fltp
    Metadata:
      DURATION        : 00:03:02.441000000
      ENCODER         : Lavf58.20.100

Can you please help me?

Upvotes: 4

Views: 4368

Answers (4)

Stefan Brüns

Reputation: 11

FFmpeg does not have native support for cover art in Ogg containers.

You can manually add an image as cover art using the -metadata:s:a METADATA_BLOCK_PICTURE=... parameter, but the supplied data has to:

  1. Start with a header in accordance with the Xiph Vorbis comment specification (the same layout as a FLAC picture block)
  2. Be encoded as base64, header and image data together

For a more complete answer, see https://superuser.com/a/1816195/1860118
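
The two requirements above can be sketched in Python. This is a minimal illustration, not a tested tool: the function name build_picture_block is made up for this example, and the field order (picture type, MIME type, description, width, height, colour depth, colour count, data length, data, all big-endian 32-bit integers) follows the FLAC picture block layout as I understand it.

```python
import base64
import struct


def build_picture_block(image: bytes, mime: str, width: int, height: int,
                        depth: int, description: str = "",
                        picture_type: int = 3) -> str:
    """Return a base64 string laid out like a FLAC picture block.

    Hypothetical helper for illustration; width, height and depth must be
    supplied by the caller because this sketch does not parse the image.
    """
    mime_bytes = mime.encode("ascii")
    desc_bytes = description.encode("utf-8")
    block = b"".join([
        struct.pack(">I", picture_type),        # 3 = front cover
        struct.pack(">I", len(mime_bytes)),     # MIME type length
        mime_bytes,
        struct.pack(">I", len(desc_bytes)),     # description length
        desc_bytes,
        struct.pack(">IIII", width, height, depth, 0),  # 0 = not an indexed image
        struct.pack(">I", len(image)),          # picture data length
        image,
    ])
    # The header and the image are base64-encoded together, as one block.
    return base64.b64encode(block).decode("ascii")
```

For the files in the question this would be called with the contents of source.jpg, "image/jpeg", 480 and 360; a depth of 24 is an assumption for an 8-bit, three-channel JPEG. The returned string is what goes after METADATA_BLOCK_PICTURE= in the -metadata:s:a option.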

Upvotes: 1

Somebody

Reputation: 343

Using METADATA_BLOCK_PICTURE does work, but you're doing it wrong. The base64-encoded data needs to be formatted in a specific way given in the FLAC format specification. I wrote a simple Python tool (which requires file and ffprobe) you can use to convert pictures into the corresponding metadata blocks.

Usage: ffmpeg -i <input file> -metadata $(./mkpblock.py 3 <description> <cover art>) <output file> (Note that you might run into problems with large images due to argument list length limitations)
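
The "Invalid picture type: -2555936" message in the question is consistent with this: when raw JPEG data is base64-encoded without a header, the first four bytes of the JPEG end up where the picture type is expected. A small sketch (assuming the demuxer reads that field as a big-endian 32-bit integer) reproduces the number from the "/9j/4AAQ" prefix visible in the question's output:

```python
import base64
import struct

# "/9j/4AAQ" is the start of the base64 string shown in the question.
jpeg_start = base64.b64decode("/9j/4AAQ")

# Interpret the first four JPEG bytes as a big-endian signed 32-bit integer,
# which is where a picture block would carry its picture type.
(picture_type,) = struct.unpack(">i", jpeg_start[:4])
print(picture_type)  # -2555936
```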

Upvotes: 0

user17549713

Reputation: 94

FFmpeg version 4.4 automatically supports embedding album art into Ogg containers with the Theora video codec (see "Ogg codecs" on Wikipedia for a list of supported codecs, although they may not all be supported by FFmpeg).

This is not the same as MP3 files, which store album art as binary-encoded strings in special-purpose tags. That approach allows media players to correctly detect the file as an audio file (e.g. with mpv's --audio-display option) and prevents frame redrawing during playback. Ogg containers do not support this functionality, so FFmpeg simply adds a regular video stream to the file. The framerate of this video stream is set (at least for JPEGs) to 90000, resulting in a harmless warning.

This does not decrease performance, at least with mpv, which only redraws as fast as the screen refresh rate allows. Only a single frame is encoded in the video stream, which can be manually verified by running ffprobe -v error -select_streams v:0 -count_packets -show_entries stream=nb_read_packets -of csv=p=0 input.ogg as suggested in this answer. The framerate can be manually set to 1 with the -r:v 1 option if desired. See the comments for additional discussion.

Here's an example converting an MP3 file with a video track containing album art to an Ogg file with Opus encoded audio and Theora encoded video:

$ ffprobe -hide_banner '01 - State of Grace.mp3' 
[mp3 @ 0x5594cbafe320] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '01 - State of Grace.mp3':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
  Duration: 00:04:55.81, start: 0.000000, bitrate: 321 kb/s
  Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
  Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
$ ffmpeg -hide_banner -i '01 - State of Grace.mp3' -c:a libopus -b:a 128000 -c:v libtheora -q:v 10 '01 - State of Grace.ogg'
[mp3 @ 0x55ebe6d3cc40] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '01 - State of Grace.mp3':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
  Duration: 00:04:55.81, start: 0.000000, bitrate: 321 kb/s
  Stream #0:0: Audio: mp3, 44100 Hz, stereo, fltp, 320 kb/s
  Stream #0:1: Video: mjpeg (Baseline), yuvj444p(pc, bt470bg/unknown/unknown), 600x600 [SAR 72:72 DAR 1:1], 90k tbr, 90k tbn, 90k tbc (attached pic)
    Metadata:
      title           : Cover
      comment         : Cover (front)
Stream mapping:
  Stream #0:1 -> #0:0 (mjpeg (native) -> theora (libtheora))
  Stream #0:0 -> #0:1 (mp3 (mp3float) -> opus (libopus))
Press [q] to stop, [?] for help
[swscaler @ 0x55ebe6db69e0] deprecated pixel format used, make sure you did set range correctly
[ogg @ 0x55ebe6d44c80] Frame rate very high for a muxer not efficiently supporting it.
Please consider specifying a lower framerate, a different muxer or -vsync 2
Output #0, ogg, to '01 - State of Grace.ogg':
  Metadata:
    lyrics-eng      :  
    copyright       : š 2012 Big Machine Records, LLC.
    title           : State of Grace
    album_artist    : Taylor Swift
    album           : Red (Deluxe Version)
    date            : 2012
    track           : 01/22
    genre           : Country
    composer        : Taylor Swift
    disc            : 1/1
    comment         : Taylor Swift
    encoder         : Lavf58.76.100
  Stream #0:0: Video: theora, yuv444p(tv, bt470bg/unknown/unknown, progressive), 600x600 [SAR 1:1 DAR 1:1], q=2-31, 200 kb/s, 90k fps, 90k tbn (attached pic)
    Metadata:
      title           : Cover
      DESCRIPTION     : Cover (front)
      encoder         : Lavc58.134.100 libtheora
      lyrics-eng      :  
      copyright       : š 2012 Big Machine Records, LLC.
      ALBUMARTIST     : Taylor Swift
      album           : Red (Deluxe Version)
      date            : 2012
      TRACKNUMBER     : 01/22
      genre           : Country
      composer        : Taylor Swift
      DISCNUMBER      : 1/1
  Stream #0:1: Audio: opus, 48000 Hz, stereo, flt, 128 kb/s
    Metadata:
      encoder         : Lavc58.134.100 libopus
      lyrics-eng      :  
      copyright       : š 2012 Big Machine Records, LLC.
      title           : State of Grace
      ALBUMARTIST     : Taylor Swift
      album           : Red (Deluxe Version)
      date            : 2012
      TRACKNUMBER     : 01/22
      genre           : Country
      composer        : Taylor Swift
      DISCNUMBER      : 1/1
      DESCRIPTION     : Taylor Swift
[mp3float @ 0x55ebe6d96360] Header missing time=00:04:31.63 bitrate=   0.1kbits/s speed=59.8x    64x    
Error while decoding stream #0:0: Invalid data found when processing input
frame=    1 fps=0.2 q=-0.0 Lsize=    4929kB time=00:04:55.79 bitrate= 136.5kbits/s speed=59.8x    
video:58kB audio:4830kB subtitle:0kB other streams:0kB global headers:3kB muxing overhead: 0.845459%
$ mpv '01 - State of Grace.ogg'
 (+) Video --vid=1 'Cover' (theora 600x600)
 (+) Audio --aid=1 'State of Grace' (opus 2ch 48000Hz)
AO: [alsa] 48000Hz stereo 2ch float
VO: [gpu] 600x600 yuv444p
(Paused) AV: -00:00:00 / 00:04:55 (0%)

Exiting... (Quit)
$ 

Note that the -q:v 10 Theora video codec option is used for the highest possible video quality. Without this option the album art is extremely low resolution by default, and the size difference when using the highest quality is negligible since only a single frame is being encoded.

This requires FFmpeg to be built with libtheora (and libopus for Opus encoded audio). Here is the output of ffmpeg -codecs with unrelated codecs removed and better formatting:

$ ffmpeg -codecs
ffmpeg version 4.4.1 Copyright (c) 2000-2021 the FFmpeg developers
  built with gcc 11.1.0
  configuration: --prefix=/usr --libdir=/usr/lib64 --shlibdir=/usr/lib64
  --docdir=/usr/share/doc/ffmpeg-4.4.1-r1/html --mandir=/usr/share/man
  --enable-shared --cc=x86_64-pc-linux-gnu-gcc
  --cxx=x86_64-pc-linux-gnu-g++ --ar=x86_64-pc-linux-gnu-ar
  --nm=x86_64-pc-linux-gnu-nm --ranlib=x86_64-pc-linux-gnu-ranlib
  --pkg-config=x86_64-pc-linux-gnu-pkg-config --optflags='-O2 -pipe
  -march=native -ggdb3' --extra-libs= --enable-static --enable-avfilter
  --enable-avresample --disable-stripping --disable-optimizations
  --disable-libcelt --enable-nonfree --disable-indev=v4l2
  --disable-outdev=v4l2 --disable-indev=oss --disable-indev=jack
  --disable-indev=sndio --disable-outdev=oss --disable-outdev=sndio
  --enable-bzlib --enable-runtime-cpudetect --disable-debug
  --disable-gcrypt --enable-gnutls --disable-gmp --enable-gpl
  --disable-hardcoded-tables --enable-iconv --disable-libxml2 --enable-lzma
  --enable-network --disable-opencl --enable-openssl --enable-postproc
  --disable-libsmbclient --disable-ffplay --disable-sdl2 --disable-vaapi
  --disable-vdpau --disable-vulkan --enable-xlib --enable-libxcb
  --enable-libxcb-shm --enable-libxcb-xfixes --enable-zlib
  --disable-libcdio --disable-libiec61883 --disable-libdc1394
  --disable-libcaca --enable-openal --enable-opengl --disable-libv4l2
  --disable-libpulse --disable-libdrm --disable-libjack
  --disable-libopencore-amrwb --disable-libopencore-amrnb
  --disable-libcodec2 --enable-libdav1d --disable-libfdk-aac
  --disable-libopenjpeg --disable-libbluray --disable-libgme
  --disable-libgsm --disable-libaribb24 --disable-mmal --disable-libmodplug
  --enable-libopus --disable-libilbc --disable-librtmp --disable-libssh
  --disable-libspeex --disable-libsrt --disable-librsvg --disable-ffnvcodec
  --disable-libvorbis --disable-libvpx --disable-libzvbi --disable-appkit
  --disable-libbs2b --disable-chromaprint --disable-cuda-llvm
  --disable-libflite --disable-frei0r --disable-libfribidi
  --enable-fontconfig --disable-ladspa --disable-libass
  --disable-libtesseract --disable-lv2 --disable-libfreetype
  --disable-libvidstab --disable-librubberband --disable-libzmq
  --disable-libzimg --disable-libsoxr --enable-pthreads
  --disable-libvo-amrwbenc --disable-libmp3lame --disable-libkvazaar
  --enable-libaom --disable-libopenh264 --disable-librav1e
  --disable-libsnappy --enable-libtheora --disable-libtwolame
  --disable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid
  --disable-gnutls --disable-armv5te --disable-armv6 --disable-armv6t2
  --disable-neon --disable-vfp --disable-vfpv3 --disable-armv8
  --disable-mipsdsp --disable-mipsdspr2 --disable-mipsfpu --disable-altivec
  --disable-vsx --disable-power8 --disable-amd3dnow --disable-amd3dnowext
  --disable-aesni --disable-avx --disable-avx2 --disable-fma3
  --disable-fma4 --disable-sse3 --disable-ssse3 --disable-sse4
  --disable-sse42 --disable-xop --cpu=host --disable-doc
  --disable-htmlpages --enable-manpages
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Codecs:
 D..... = Decoding supported
 .E.... = Encoding supported
 ..V... = Video codec
 ..A... = Audio codec
 ..S... = Subtitle codec
 ...I.. = Intra frame-only codec
 ....L. = Lossy compression
 .....S = Lossless compression
 -------
 [...]
 DEV.L. theora               Theora (encoders: libtheora )
 [...]
 DEAIL. opus                 Opus (Opus Interactive Audio Codec)
                             (decoders: opus libopus ) (encoders: opus libopus )
 [...]
$ 

FFmpeg can also add album art (or any video track) from a separate file, instead of directly mapping the original album art to the output. Here's an example of how to extract the original MJPEG album art as a separate file, then pass it back in, using the -map option to take only the audio track from the MP3 and only the video track from the MJPEG (I removed most of the output of the commands since they are basically the same):

$ ffmpeg -i '01 - State of Grace.mp3' -map 0:v -c:v copy '01 - State of Grace.jpg'
[...]
$ ffmpeg -i '01 - State of Grace.mp3' -i '01 - State of Grace.jpg' -map 0:a -map 1:v '01 - State of Grace.ogg'
[...]
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> flac (native))
Stream #1:0 -> #0:1 (mjpeg (native) -> theora (libtheora))
[...]

I also omitted the audio and video codecs and their options (which I wouldn't suggest) so FFmpeg used FLAC as the default audio codec and Theora as the default video codec for an Ogg container.
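
For the original question (keeping the existing Opus stream untouched), the same mapping can be written with explicit codecs. The sketch below only builds the argument list; cover_art_command is a hypothetical helper name, and combining -c:a copy with a Theora cover stream is an assumption based on the options shown above rather than something tested here.

```python
import shlex
from typing import List


def cover_art_command(audio_in: str, image_in: str, out: str) -> List[str]:
    """Build an ffmpeg argument list: copy the audio, encode the image as Theora."""
    return [
        "ffmpeg",
        "-i", audio_in,
        "-i", image_in,
        "-map", "0:a",        # audio from the first input only
        "-map", "1:v",        # picture from the second input
        "-c:a", "copy",       # leave the existing audio stream as it is
        "-c:v", "libtheora",  # needs an FFmpeg build with libtheora
        "-q:v", "10",         # highest Theora quality, as used above
        out,
    ]


# Print a shell-ready command line for the files named in the question.
print(shlex.join(cover_art_command("source.ogg", "source.jpg", "dest.ogg")))
```

Passing the list to subprocess.run instead of printing it avoids shell quoting problems with file names that contain spaces.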

Hope this helps!

Upvotes: 4

user12711

Reputation: 713

This works for me:

ffmpeg -i mysong.ogg -i coverart.jpg song_with_art.ogg

Upvotes: -1
