Reputation: 12579
Is it possible to read the ID3 tags of an MP3 stored online without actually downloading the entire file?
I've used TagLib Sharp, but to my knowledge you actually have to open the file to read the ID3 tags.
Upvotes: 2
Views: 1272
Reputation: 20725
As Florian said above, you can use an HTTP Range request to read the first part of the file and see whether there is an ID3 tag, then read the rest of the tag (if present and necessary). For example:
Range: bytes=0-65535
An ID3 tag may include an image, so it can be quite large (I've seen some that are 500 KB). However, most of the useful information, such as the title, description, etc., is likely to be available in the first few KB. Depending on your connection (or your expected clients' connections), I would pick an initial number of KB to download. For most connections, 64 KB is going to be really fast nowadays (maybe it was less so in 2014).
Note that the entire file could also be smaller than 64 KB. The Range request should still work, only it will return just as many bytes as the file contains. In that case, you never need to send a second request for more data.
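As a rough illustration of that first request, here is a minimal Python sketch using only the standard library. The function names `build_range_request` and `fetch_range` are made up for this example, and it assumes the server honors the Range header; the read is capped so that a server ignoring the header does not make you download the whole file.

```python
import urllib.request


def build_range_request(url: str, first: int, last: int) -> urllib.request.Request:
    """Build a GET request asking only for bytes first..last (inclusive)."""
    return urllib.request.Request(url, headers={"Range": f"bytes={first}-{last}"})


def fetch_range(url: str, first: int, last: int) -> bytes:
    """Send the request and return at most (last - first + 1) bytes.

    A file shorter than the requested range simply yields fewer bytes.
    """
    with urllib.request.urlopen(build_range_request(url, first, last)) as resp:
        # Cap the read in case the server ignores Range and sends everything.
        return resp.read(last - first + 1)
```

Something like `fetch_range(url, 0, 65535)` would then give you the buffer that the rest of this answer inspects.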
An MP3 file with an ID3 tag starts like so:
    0x49 0x44 0x33         "ID3"
    0x03 0x00              major.revision (2.0, 3.0, or 4.0)
    0x00                   flags
    0xSS 0xSS 0xSS 0xSS    size
Notes about the version:
- The tag is ID3; that 3 is not part of the version.
- The first version is 2 because MP3 already had a TAG capability, and that was considered to be version 1 (and 1.1 in certain conditions).
- At this time, I've not seen any revision other than 0. This is why tags are usually referred to by their major version only: ID3v1 (TAG), ID3v2.2 (ID3 + 0x02), ID3v2.3 (ID3 + 0x03), and ID3v2.4 (ID3 + 0x04).
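To make the version bytes concrete, here is a small Python sketch that checks the magic and reports the version using the conventional ID3v2.<major>.<revision> naming. The helper name `id3v2_version` is hypothetical, not something from TagLib or any other library.

```python
from typing import Optional


def id3v2_version(buf: bytes) -> Optional[str]:
    """Return e.g. 'ID3v2.3.0' if buf starts with an ID3v2 header, else None."""
    if len(buf) < 5 or buf[:3] != b"ID3":
        return None
    major, revision = buf[3], buf[4]
    # 0xFF is never a valid major or revision byte.
    if major == 0xFF or revision == 0xFF:
        return None
    return f"ID3v2.{major}.{revision}"
```

A buffer that starts with MPEG audio data instead of "ID3" just returns `None`, which tells you there is no ID3v2 tag at the start of the file.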
The 0xSS bytes represent the size. This is an interesting one because only 7 bits are used in each byte, to avoid 0xFF, which is the synchronization code for MP3 (MPEG) files. Only they forgot to do something about the 0xFF found in PNG and JPEG images... Anyway...
The way to calculate the size is like this:
    size = (buffer[pos + 6] << 21) +
           (buffer[pos + 7] << 14) +
           (buffer[pos + 8] <<  7) +
           (buffer[pos + 9] <<  0)
IMPORTANT: You should verify that bit 7 is not set in any of those bytes. If it is set, then it's not a valid ID3 tag. This is why I don't do a (buffer[pos + n] & 0x7F): the & 0x7F part is not required if you properly verified the size bytes beforehand.
Note that this size does not include the size of the header, so keep in mind that there are 10 more bytes for the header.
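The size calculation and the bit 7 check can be sketched in Python as follows. The names `synchsafe_size` and `total_tag_length` are made up for this example, and `pos` is assumed to be the offset of the "ID3" marker in the buffer (normally 0).

```python
def synchsafe_size(buf: bytes, pos: int = 0) -> int:
    """Decode the 4 size bytes at pos+6..pos+9 (7 useful bits each)."""
    size_bytes = buf[pos + 6 : pos + 10]
    if len(size_bytes) < 4:
        raise ValueError("buffer too short for an ID3 header")
    if any(b & 0x80 for b in size_bytes):
        raise ValueError("bit 7 set in a size byte: not a valid ID3 tag")
    size = 0
    for b in size_bytes:
        size = (size << 7) | b  # same as the << 21, << 14, << 7, << 0 sum
    return size


def total_tag_length(buf: bytes, pos: int = 0) -> int:
    """Length of the tag including its 10-byte header."""
    return synchsafe_size(buf, pos) + 10
```

For instance, size bytes `00 00 02 01` decode to (2 << 7) + 1 = 257, so the tag occupies 267 bytes of the file.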
The rest of the buffer is organized in frames. These are either 3 letters, a size, and the data of that frame, or 4 letters, a size, flags, and data. Which frame header layout is used is determined by the major version: the 3-letter form in version 2, the 4-letter form in version 3 and later.
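Here is a simplified Python sketch of walking those frames. It assumes `body` is the tag data after the 10-byte header, and it deliberately ignores extended headers, unsynchronisation, and compressed or encrypted frames, so treat it as an illustration rather than a full parser. The name `iter_frames` is hypothetical.

```python
from typing import Iterator, Tuple


def iter_frames(body: bytes, major: int) -> Iterator[Tuple[str, bytes]]:
    """Yield (frame_id, payload) pairs from the tag body."""
    # Major version 2: 3-letter ID + 3-byte size, no flags.
    # Major version 3 and later: 4-letter ID + 4-byte size + 2 flag bytes.
    id_len, size_len, flags_len = (3, 3, 0) if major == 2 else (4, 4, 2)
    header_len = id_len + size_len + flags_len
    pos = 0
    while pos + header_len <= len(body):
        frame_id = body[pos : pos + id_len]
        if frame_id[0] == 0:  # zero padding marks the end of the frames
            break
        raw = body[pos + id_len : pos + id_len + size_len]
        if major >= 4:
            # Major version 4 stores frame sizes with 7 useful bits per byte.
            size = 0
            for b in raw:
                size = (size << 7) | (b & 0x7F)
        else:
            size = int.from_bytes(raw, "big")
        start = pos + header_len
        # The payload may be truncated if only part of the tag was downloaded.
        yield frame_id.decode("latin-1"), body[start : start + size]
        pos = start + size
```

With a version 3 tag, the title would show up as a `TIT2` frame whose payload starts with a one-byte text encoding marker followed by the text itself.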
Anyway, once you have that size, if you want to read the entire ID3 tag, you can do another GET to the HTTP server and retrieve the remaining data, unless the first 64 KB (or whatever size you used first) already covers the necessary size.
Range: bytes=65536-<size + 10 - 1>
The size is the length of the data within the ID3 tag. The +10 is for the header. The -1 is because the end of an HTTP range is an inclusive position, not a length.
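The arithmetic for that second request could be sketched like this in Python. The helper name `remaining_range` is made up, and `already_read` is assumed to be the number of bytes fetched by the first request (65536 in the example above).

```python
from typing import Optional


def remaining_range(size: int, already_read: int = 65536) -> Optional[str]:
    """Return the Range header value for the rest of the tag.

    Returns None if the first request already covered size + 10 bytes.
    """
    tag_length = size + 10  # tag data plus the 10-byte header
    if already_read >= tag_length:
        return None
    # The end position is inclusive, hence the - 1.
    return f"bytes={already_read}-{tag_length - 1}"
```

For a 500000-byte tag this yields `bytes=65536-500009`, while a small tag that fits in the first 64 KB yields `None` and no second request is needed.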
IMPORTANT NOTE: Not all servers accept the Range header. If you are in control and your server doesn't support range requests, you may want to consider adding a proxy in front of the server. nginx is really good at that. It can cache the entire file and return just the range(s) requested in the HTTP header.
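One way to tell whether a server honored the Range header is to look at the response: a server that supports ranges answers with status 206 (Partial Content) and a Content-Range header, while one that ignores the header typically answers 200 with the whole file. A minimal Python sketch, with a hypothetical helper name `range_was_honored`:

```python
from typing import Mapping


def range_was_honored(status: int, headers: Mapping[str, str]) -> bool:
    """True if the response is a partial one (206 with a Content-Range header)."""
    names = {name.lower() for name in headers}  # header names are case-insensitive
    return status == 206 and "content-range" in names
```

If this returns `False`, it is probably best to stop reading after the first chunk and close the connection rather than let the full download run.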
Upvotes: 0