rbaleksandar

Reputation: 9701

How to retrieve ID3 tag information from PCM (WAV) file using the LAME encoder library?

I have a problem retrieving tag information from a test WAV file (no sound, just empty WAV file).

Here is how I initialize LAME:

lame_global_flags*  lame = NULL;
lame = lame_init();
lame_set_in_samplerate(lame, 44100);
lame_set_VBR(lame, vbr_default);
id3tag_init(lame);
id3tag_add_v2(lame);
id3tag_v2_only(lame);
lame_set_write_id3tag_automatic(lame, 1);
// id3tag_set_artist(lame, "Test artist");
// id3tag_set_album(lame, "Test album");
lame_init_params(lame);

And here is how I try to retrieve the tag:

unsigned char buffer[256];
size_t id3tagSize = lame_get_id3v2_tag(lame, buffer, sizeof(buffer));

To be honest, this doesn't really help, since the call doesn't involve the file that has been read (see code below).

Here is a HEX view of my input PCM file (just the ID3 tag info data chunk):

[Image: hex view of the ID3 tag chunk in the input file]

You can see information chunks containing TIT2 (title), TALB (album), TPUB (publisher) etc. (see list of all ID3 tag names here), which means that the information is there but the LAME encoder seems to ignore it.

If a tag name is not empty it is followed by a specific number of bytes based on my observations. For example TIT2 is followed by 0x00 0x00 0x00 0x08, which converted to decimal represents the length+1 of the title, which is "Message" plus padding with 0x00.

However, if I enter a total number of tracks (TRACKTOTAL), I get one 0x00 byte (probably padding) followed by an arbitrary number of bytes, each containing the hexadecimal code of a digit. For example, if I put 300 as the number of total tracks I get 54 52 41 43 4b 54 4f 54 41 4c 00 33 30 30, which in turn can be converted to the string TRACKTOTAL 300.
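The byte layout described above can be decoded with a few lines of C. This is only a minimal sketch, assuming the tag follows the ID3v2.3 frame layout (4-byte frame ID, 4-byte big-endian size, 2 flag bytes, then the payload); the hex dump itself is not visible here, so that is a guess. Under that assumption the "+1" in the TIT2 size would be the text-encoding byte that starts every text frame rather than padding, and the 0x00 between TRACKTOTAL and 300 would be the terminator that separates description and value in a user-defined TXXX frame. The names id3_frame, id3_read_frame and id3_text_latin1 are hypothetical helpers and not part of LAME.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type (not part of LAME): one decoded ID3v2.3 frame. */
typedef struct {
    char id[5];                   /* four-character frame ID plus NUL */
    uint32_t size;                /* payload size, excluding the 10-byte header */
    const unsigned char *payload; /* points into the caller's buffer */
} id3_frame;

/* Decode the frame starting at buf. Returns the number of bytes consumed,
 * or 0 if the data is padding (leading 0x00) or truncated.
 * Assumes ID3v2.3: the size is a plain big-endian 32-bit integer. */
size_t id3_read_frame(const unsigned char *buf, size_t len, id3_frame *out)
{
    if (len < 10 || buf[0] == 0)
        return 0;
    memcpy(out->id, buf, 4);
    out->id[4] = '\0';
    out->size = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    /* buf[8] and buf[9] are frame flags; this sketch ignores them. */
    if (out->size > len - 10)
        return 0;
    out->payload = buf + 10;
    return 10 + (size_t)out->size;
}

/* Copy the text of a Latin-1 text frame (encoding byte 0x00) into dst.
 * Returns 0 on success, -1 for other encodings or empty frames. */
int id3_text_latin1(const id3_frame *f, char *dst, size_t dst_len)
{
    size_t n;
    if (f->size < 1 || f->payload[0] != 0x00 || dst_len == 0)
        return -1;
    n = f->size - 1;              /* skip the encoding byte */
    if (n >= dst_len)
        n = dst_len - 1;          /* truncate to fit dst */
    memcpy(dst, f->payload + 1, n);
    dst[n] = '\0';
    return 0;
}
```

Calling id3_read_frame repeatedly, advancing by its return value each time, walks all frames until the padding begins. ID3v2.4 stores frame sizes as synchsafe integers, so the size decoding would need adjusting for tags of that version.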

I can try to manually parse this information but I'd rather spend my time on working on another part of my code. There are non-standard tag names, tag information is usually placed either at the beginning or at the end of a file (based on the specification for RIFF WAVE files) etc. Don't want to write my own ID3 tag parsers. -_-
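If manual parsing turns out to be unavoidable, finding the tag inside the WAV container is the smaller part of the job. The sketch below walks the RIFF chunk list of a file image held in memory and returns the data of a named chunk. It assumes the tag sits in a chunk called "id3 " (some tools reportedly write "ID3 " instead), which is a convention rather than something confirmed for this particular file. riff_le32 and riff_find_chunk are hypothetical helper names, not LAME functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit value, as used for RIFF chunk sizes. */
uint32_t riff_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: locate the chunk named fourcc in a RIFF/WAVE image.
 * Returns a pointer to the chunk data and stores its size in *size_out,
 * or returns NULL if the chunk is missing or the file is malformed. */
const unsigned char *riff_find_chunk(const unsigned char *buf, size_t len,
                                     const char *fourcc, uint32_t *size_out)
{
    size_t pos = 12; /* skip "RIFF", the 4-byte size and "WAVE" */

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return NULL;

    while (pos <= len && len - pos >= 8) {
        uint32_t size = riff_le32(buf + pos + 4);
        if (size > len - pos - 8)
            return NULL;          /* chunk claims more data than exists */
        if (memcmp(buf + pos, fourcc, 4) == 0) {
            *size_out = size;
            return buf + pos + 8;
        }
        /* Chunks are padded to an even number of bytes. */
        pos += 8 + (size_t)size + (size & 1u);
    }
    return NULL;
}
```

The same walk also yields the "data" chunk, which would let the encoder be fed only the audio samples instead of the whole file including its headers.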

At first I didn't configure anything related to tagging. According to lame.h, automatic tag writing is enabled by default (unless tagging is disabled), i.e. LAME behaves as if lame_set_write_id3tag_automatic(lame_global_flags * gfp, int) had been called with a non-zero int parameter. My initial code was:

#include <stdio.h>
#include <lame.h>

int main(int argc, char* argv[])
{
    int read = 0;
    int write = 0;

    char* path = argv[1];
    printf("WAV input: %s\n", path);
    FILE *pcm = fopen(path, "rb");
    FILE *mp3 = fopen("output.mp3", "wb");

    fseek(pcm, 0, SEEK_END);
    const int PCM_SIZE = ftell(pcm);
    fseek(pcm, 0, SEEK_SET);
    const int MP3_SIZE = PCM_SIZE;

    short int pcm_buffer[PCM_SIZE * 2];
    unsigned char mp3_buffer[MP3_SIZE];


    lame_global_flags*  lame = NULL;
    lame = lame_init();
    lame_set_in_samplerate(lame, 44100);
    lame_set_VBR(lame, vbr_default);
    lame_init_params(lame);

    do
    {
        read = fread(pcm_buffer, 2 * sizeof(short int), PCM_SIZE, pcm);
        if (read == 0)
        {
            // TODO Retrieve ID3 tag here? Copy?
            write = lame_encode_flush(lame, mp3_buffer, MP3_SIZE);
        }
        else
        {
            write = lame_encode_buffer_interleaved(lame, pcm_buffer, read, mp3_buffer, MP3_SIZE);
        }
        fwrite(mp3_buffer, write, 1, mp3);
    }
    while (read != 0);

    lame_close(lame);
    fclose(mp3);
    fclose(pcm);

    return 0;
}

However, even though an MP3 file was produced and all the audio data was in it (I also tried with other WAV files, of course), the ID3 tag information was gone.

Setting the ID3 tag manually by using id3tag_set_artist(...), id3tag_set_album(...) etc. works. However, I am looking for a way to copy the ID3 tag from the source (PCM) into the destination (MP3). The documentation of LAME is basically its header, which is not really that great. I was unable to find anything related to copying the tag except for lame_set_write_id3tag_automatic(...).

I tried both ID3v1 and ID3v2, since an ID3v2 tag is generated only if one of the text fields doesn't fit into an ID3v1 tag; forcing ID3v2, on the other hand, should override this behaviour.

In addition lame_encode_flush(...) is supposed to write any ID3v1 tags.

Upvotes: 2

Views: 1490

Answers (0)
