Etienne

Reputation: 2287

How can I detect a corrupt or incomplete MP3 file from a Node.js app?

The most common cause of a broken MP3 file is a partial upload to the server. In that case, the reported audio duration doesn't match what is actually in the file: the beginning plays, but at some point playback stops and the duration shown by the audio player is wrong.

I tried libraries like node-ffprobe, but they seem to read only the metadata, without comparing it against the actual audio data in the file. Is there an efficient way to detect a corrupted or incomplete MP3 file from Node.js?

Note: the client uploading the MP3 files is a hardware device (an audio recorder) that uploads to an FTP server, not a browser, so I can't send additional, potentially more useful data from the client.

Upvotes: 1

Views: 1775

Answers (2)

lys

Reputation: 1037

The expression for calculating the file size of a constant-bitrate MP3 from its duration and bitrate (from this answer) is quite simple:

x = length of song in seconds

y = bitrate in kilobits per second

(x * y) / 8 = filesize (kilobytes)

(x * y) / (8 * 1024) ≈ filesize (MB)
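
As a rough sanity check on an upload, the same arithmetic can be done in bytes: duration times bitrate, divided by 8 bits per byte. The sketch below assumes a constant-bitrate file, that 1 kbit = 1000 bits (the convention MP3 bitrates use), and that you already know the expected duration from somewhere else. The function names and the 5% tolerance are made up for illustration.

```javascript
// Estimate the audio size in bytes of a constant-bitrate MP3.
// Assumption: bitrateKbps counts 1000 bits per kilobit.
function estimateMp3Bytes(durationSeconds, bitrateKbps) {
  return (durationSeconds * bitrateKbps * 1000) / 8;
}

// Hypothetical helper: flag a file whose actual size falls clearly short
// of the estimate. The default tolerance is an arbitrary choice.
function looksTooSmall(actualBytes, durationSeconds, bitrateKbps, tolerance = 0.05) {
  const expected = estimateMp3Bytes(durationSeconds, bitrateKbps);
  return actualBytes < expected * (1 - tolerance);
}

// A 3-minute recording at 128 kbps:
console.log(estimateMp3Bytes(180, 128)); // 2880000
```

Tags and VBR encoding will throw the estimate off, so this is only useful as a coarse first filter.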

There is also a JavaScript implementation for the Web Audio API in another answer on that same question. Perhaps that would be useful in your Node implementation.

mp3diags is an older open-source tool for fixing MP3s, and it was great for batch processing stuff like this. The source is C++ and still available if you're feeling nosy and want to see how some of these features are implemented.

Worth a look since it has some features that might be useful in your context:

From the project's description ("What is MP3 Diags and what does it do?"), it can identify issues such as:

  • low quality audio
  • missing VBR header
  • missing normalization data

and it can fix some problems, including:

  • correcting files that show incorrect song duration
  • correcting files in which the player cannot seek correctly

Upvotes: 1

Brad

Reputation: 163478

MP3 files don't normally have a duration. They're just a series of MPEG frames. Sometimes, there is an ID3 tag indicating duration, but not always.

Players can determine duration by choosing one of a few methods:

  • Decode the entire audio file.
    This is the slowest method, but if you're going to decode the file anyway, you might as well go this route as it gives you an exact duration.
  • Read the whole file, skimming through frame headers.
    You'll have to read the whole file from disk, but you won't have to decode it. Can be slow if I/O is slow, but gives you an exact duration.
  • Read the first frame's bitrate and estimate duration by file size.
    Definitely the fastest method, and the one most commonly used by players. Duration is an estimate only, and is reasonably accurate for CBR, but can be wildly inaccurate for VBR.
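
The third method can be sketched in plain Node.js with no libraries. This is a minimal illustration, not a robust parser: it assumes MPEG-1 Layer III audio, takes the bitrate from the first plausible frame header it finds, and treats everything after that point as audio at that bitrate. The function names are invented for the example.

```javascript
// MPEG-1 Layer III bitrates in kbps, indexed by the header's 4-bit bitrate field.
// Index 0 ("free" format) and 15 (invalid) are treated as unsupported here.
const MPEG1_L3_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0];

// Return the offset just past a leading ID3v2 tag, or 0 if there is none.
function skipId3v2(buf) {
  if (buf.length < 10 || buf.toString('latin1', 0, 3) !== 'ID3') return 0;
  // The tag size is a "synchsafe" integer: 7 usable bits in each of 4 bytes.
  const size = (buf[6] << 21) | (buf[7] << 14) | (buf[8] << 7) | buf[9];
  const footer = buf[5] & 0x10 ? 10 : 0; // optional 10-byte footer
  return 10 + size + footer;
}

// Estimate duration from the first frame's bitrate and the remaining byte count.
// Returns null if no MPEG-1 Layer III frame header is found.
function estimateDurationSeconds(buf) {
  for (let i = skipId3v2(buf); i + 4 <= buf.length; i++) {
    // Frame sync: 11 set bits at the start of the header.
    if (buf[i] !== 0xff || (buf[i + 1] & 0xe0) !== 0xe0) continue;
    const version = (buf[i + 1] >> 3) & 3; // 3 = MPEG-1
    const layer = (buf[i + 1] >> 1) & 3;   // 1 = Layer III
    const kbps = MPEG1_L3_KBPS[buf[i + 2] >> 4];
    const sampleRateIndex = (buf[i + 2] >> 2) & 3; // 3 is reserved
    if (version !== 3 || layer !== 1 || kbps === 0 || sampleRateIndex === 3) continue;
    return ((buf.length - i) * 8) / (kbps * 1000);
  }
  return null;
}
```

You might call it as `estimateDurationSeconds(fs.readFileSync(path))`. In practice only the first few kilobytes and the file size are needed, which is what makes this method fast.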

What I'm getting at is that these files might not actually be broken. They might just be VBR files that your player doesn't know the duration of.

If you're convinced they are broken (such as stopping in the middle of content), then you'll have to figure out how you want to handle it. There are probably only a couple ways to determine this:

  • Ideally, there's an ID3 tag indicating duration, and you can decode the whole file and determine its real duration to compare.
  • Usually, that ID3 tag won't exist, so you'll have to check to see if the last frame is complete or not.
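
The last-frame check can be sketched by walking the file frame by frame: each header gives the length of its frame, so a final frame whose declared length runs past the end of the file is a sign of truncation. This is a sketch under assumptions, not a hardened parser: it handles Layer III only, trusts the first valid-looking header it finds, and every name in it is made up for illustration.

```javascript
// Layer III bitrate tables in kbps; index 0 ("free") and 15 (invalid) map to 0.
const KBPS_MPEG1 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0];
const KBPS_MPEG2 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 0];
// Sample rates keyed by the 2-bit version field (1 is reserved).
const SAMPLE_RATES = {
  3: [44100, 48000, 32000], // MPEG-1
  2: [22050, 24000, 16000], // MPEG-2
  0: [11025, 12000, 8000],  // MPEG-2.5
};

// Return the offset just past a leading ID3v2 tag, or 0 if there is none.
function skipId3v2(buf) {
  if (buf.length < 10 || buf.toString('latin1', 0, 3) !== 'ID3') return 0;
  const size = (buf[6] << 21) | (buf[7] << 14) | (buf[8] << 7) | buf[9]; // synchsafe
  return 10 + size + (buf[5] & 0x10 ? 10 : 0); // plus optional footer
}

// Parse the 4-byte header at pos; null if it is not a valid Layer III header.
function parseHeader(buf, pos) {
  if (pos + 4 > buf.length) return null;
  if (buf[pos] !== 0xff || (buf[pos + 1] & 0xe0) !== 0xe0) return null;
  const version = (buf[pos + 1] >> 3) & 3;
  const layer = (buf[pos + 1] >> 1) & 3; // 1 = Layer III
  if (version === 1 || layer !== 1) return null;
  const mpeg1 = version === 3;
  const kbps = (mpeg1 ? KBPS_MPEG1 : KBPS_MPEG2)[buf[pos + 2] >> 4];
  const sampleRate = SAMPLE_RATES[version][(buf[pos + 2] >> 2) & 3];
  if (!kbps || !sampleRate) return null;
  const padding = (buf[pos + 2] >> 1) & 1;
  const samples = mpeg1 ? 1152 : 576; // samples per Layer III frame
  const length = Math.floor(((samples / 8) * kbps * 1000) / sampleRate) + padding;
  return { length, seconds: samples / sampleRate };
}

// Walk all frames; report the frame count, summed duration, and whether the
// final frame is cut off before its declared length.
function checkMp3(buf) {
  let pos = Math.min(skipId3v2(buf), buf.length);
  while (pos < buf.length && !parseHeader(buf, pos)) pos++; // skip leading junk
  let frames = 0;
  let seconds = 0;
  let truncated = false;
  let header;
  while ((header = parseHeader(buf, pos))) {
    if (pos + header.length > buf.length) { truncated = true; break; }
    frames++;
    seconds += header.seconds;
    pos += header.length;
  }
  // A tail shorter than a header that starts with a sync byte also looks cut off.
  if (!truncated && pos < buf.length && buf.length - pos < 4 && buf[pos] === 0xff) truncated = true;
  return { frames, seconds, truncated, trailingBytes: buf.length - pos };
}
```

You might run `checkMp3(fs.readFileSync(path))` once the upload has finished and treat `truncated: true` as an incomplete file. The frame count includes any Xing/Info header frame, and a trailing ID3v1 tag simply shows up in `trailingBytes`.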

Beyond that, you don't really have a good way of knowing if the stream is incomplete, since there is no outer container that actually specifies number of frames to expect.

Upvotes: 2
