FFmpeg has a concept of “dispositions” – a property that describes the purpose of a stream in a media file. For example, here are the streams in a file I have lying around, with the dispositions emphasized:
Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo,
fltp, 251 kb/s (default)
Metadata:
creation_time : 2021-11-10T20:14:06.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
Stream #0:1[0x2](und): Video: mjpeg (Baseline) (jpeg / 0x6765706A),
yuvj420p(pc, bt470bg/unknown/unknown), 1024x1024, 0 kb/s, 0.0006 fps, 3.08 tbr,
600 tbn (default) (attached pic) (timed thumbnails)
Metadata:
creation_time : 2021-11-10T20:14:06.000000Z
handler_name : Core Media Video
vendor_id : [0][0][0][0]
Stream #0:2[0x3](und): Data: bin_data (text / 0x74786574)
Metadata:
creation_time : 2021-11-10T20:14:06.000000Z
handler_name : Core Media Text
Stream #0:3[0x0]: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/
unknown), 1024x1024 [SAR 144:144 DAR 1:1], 90k tbr, 90k tbn (attached pic)
However, if I make any modification to this file’s chapter markers using the C++ library MP4v2, even just re-saving the existing ones:
auto f = MP4Modify("test.m4a");
MP4Chapter_t* chapterList;
uint32_t chapterCount;
MP4GetChapters(f, &chapterList, &chapterCount);
MP4SetChapters(f, chapterList, chapterCount);
MP4Close(f);
then some of these dispositions are removed:
Stream #0:0[0x1](und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo,
fltp, 251 kb/s (default)
Metadata:
creation_time : 2021-11-10T20:14:06.000000Z
handler_name : Core Media Audio
vendor_id : [0][0][0][0]
Stream #0:1[0x2](und): Video: mjpeg (Baseline) (jpeg / 0x6765706A),
yuvj420p(pc, bt470bg/unknown/unknown), 1024x1024, 0 kb/s, 0.0006 fps, 3.08 tbr,
600 tbn (default) ← “attached pic” and “timed thumbnails” removed!
Metadata:
creation_time : 2021-11-10T20:14:06.000000Z
handler_name : Core Media Video
vendor_id : [0][0][0][0]
Stream #0:2[0x0]: Video: mjpeg (Baseline), yuvj420p(pc, bt470bg/unknown/
unknown), 1024x1024 [SAR 144:144 DAR 1:1], 90k tbr, 90k tbn (attached pic)
Stream #0:3[0x4](und): Data: bin_data (text / 0x74786574) ← This stream was moved to the end, but that’s intended behavior: it contains chapter titles, and we just edited the chapters.
Metadata:
creation_time : 2025-03-05T09:56:31.000000Z
It also renders the file unplayable in MPC-HC (but not in VLC!), which is apparently a bug in MP4v2. I’m currently investigating that bug to report and potentially fix it, but that’s a separate issue – in my journey there, I’m wracking my brain trying to understand what it is that MP4v2 changes to make FFmpeg stop reporting the “attached pic” and “timed thumbnails” dispositions. I’ve explored the before-and-afters in MP4 Box, and I can’t for the life of me find which atom it is that differs in a relevant way.
(I’d love to share the files, but unfortunately the contents are under copyright – if anyone knows of a way to remove the audio from an MP4 file without changing anything else, let me know and I’ll upload dummied-out versions. Without them, I can’t really ask about the issue directly. I can at least show you the files’ respective atom trees, but I’m not sure how relevant that is.)
I thought I’d read FFmpeg’s source code to find out how it determines dispositions for MP4 streams, but of course, FFmpeg is very complex. Could someone who’s more familiar with C and/or FFmpeg’s codebase help me sleuth out how FFmpeg determines dispositions for MP4 files (in particular, “attached pic” and “timed thumbnails”)?
movenc.c be helpful?
Upvotes: 2
Views: 43
Though I figured it out by reverse-engineering my MP4 files rather than reading the FFmpeg source code, here’s the answer:
It’s possible for a chap atom to refer not only to chapter text tracks, but also to JPEG video tracks. If a video track is referenced by a chap atom, FFmpeg sets the “attached pic” and “timed thumbnails” dispositions. Other dispositions are set in other ways.
(At the time of writing, MP4v2 doesn’t handle references to video tracks in chap atoms, and instead removes those references but leaves the track intact, resulting in the situation in the question.)
Upvotes: 2