mp3 stream decoding in browser

Question

I am trying to set up an MP3 stream receiver in the browser using Emscripten and libmad.
I managed to decode an MP3 file with the low-level API by loading it into memory completely. My next step was to load it in chunks.
In the given example I emulate fragmented packets with allocated buffers of random size (from 20 to 40 KB) and copy the file into those buffers part by part.
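
The fragmentation step described above might look roughly like the following. This is only a minimal sketch: `splitIntoFragments` and its parameters are hypothetical names that do not come from the repository, and it works on `std::vector` copies rather than on the raw allocated buffers the real code passes to the decoder.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not from the repository): splits `data` into
// consecutive fragments whose sizes are drawn uniformly from
// [minSize, maxSize]. The last fragment holds whatever is left and may
// therefore be shorter than minSize.
std::vector<std::vector<uint8_t>> splitIntoFragments(
    const std::vector<uint8_t>& data,
    std::size_t minSize,
    std::size_t maxSize,
    uint32_t seed)
{
    assert(minSize > 0 && minSize <= maxSize);

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(minSize, maxSize);

    std::vector<std::vector<uint8_t>> fragments;
    std::size_t offset = 0;
    while (offset < data.size()) {
        // Never read past the end of the input.
        std::size_t size = std::min(pick(rng), data.size() - offset);
        auto first = data.begin() + static_cast<std::ptrdiff_t>(offset);
        fragments.emplace_back(first, first + static_cast<std::ptrdiff_t>(size));
        offset += size;
    }
    return fragments;
}
```

Calling `splitIntoFragments(file, 20 * 1024, 40 * 1024, 42)` would give fragments in the 20 to 40 KB range, except possibly the last one. Note that `addFragment` in the code below ignores any fragment shorter than `GLUE_LENGTH / 2`, so a short final fragment would be dropped by the decoder.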

My decoding algorithm is similar to an answer in this question, but it is a bit different. The main object is Decoder; it receives fragments through the addFragment method. Decoder has a pool of pending fragments and a glue buffer. When the user adds the first fragment, its tail is copied into the first half of the glue buffer. When the second fragment is added, its beginning is copied into the second half of the glue buffer. When the decoder reaches the end of the active buffer, it switches to the glue buffer, and vice versa when the glue buffer is finished. I make sure all those buffer parts are consistent and that mad_stream points to the same logical byte it was pointing to before the switch.
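
The glue-buffer idea can be illustrated in isolation, without libmad. The sketch below is an assumption-laden simplification: `kGlueLength`, `makeGlue` and `glueOffsetFor` are hypothetical names, and the tiny glue size is chosen only to keep the example readable (the question does not state the real value of `GLUE_LENGTH`). The offset arithmetic mirrors the `GLUE_LENGTH / 2 - left` expression used in `switchBuffer` below.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical size for illustration; the real GLUE_LENGTH is not given.
constexpr std::size_t kGlueLength = 8;
constexpr std::size_t kHalf = kGlueLength / 2;

// Builds a glue buffer: the tail of `current` fills the first half,
// the head of `next` fills the second half, so data that straddles the
// fragment boundary becomes contiguous.
std::vector<uint8_t> makeGlue(const std::vector<uint8_t>& current,
                              const std::vector<uint8_t>& next)
{
    assert(current.size() >= kHalf && next.size() >= kHalf);
    std::vector<uint8_t> glue(kGlueLength);
    std::memcpy(glue.data(), current.data() + current.size() - kHalf, kHalf);
    std::memcpy(glue.data() + kHalf, next.data(), kHalf);
    return glue;
}

// Given `left` unread bytes at the end of the current fragment, returns
// the glue offset that points at the same logical byte.
std::size_t glueOffsetFor(std::size_t left)
{
    assert(left <= kHalf);
    return kHalf - left;
}
```

For example, with fragments `{1,2,3,4,5,6}` and `{7,8,9,10,11}` the glue buffer holds `{3,4,5,6,7,8,9,10}`; if two bytes of the first fragment are still unread, `glueOffsetFor(2)` is 2 and `glue[2]` is 5, the same byte the reader was about to consume.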

Significant fragments from decoder.cpp

//adds the fragment to the decoding queue
void Decoder::addFragment(intptr_t bufferPtr, uint32_t length)
{
    if (length < GLUE_LENGTH / 2) {
        return;
    }
    uint8_t* buffer = (uint8_t(*))bufferPtr;
    RawBuffer rb = {buffer, length};
    pending.push_back(rb);

    switch (state) {
        case empty:
            mad_stream_buffer(&stream, buffer, length);

            for (int i = 0; i < GLUE_LENGTH/2; ++i) {
                glue[i] = buffer[length - GLUE_LENGTH/2 + i];
            }

            state = onBufferHalf;
            prepareNextBuffer();
            break;
        case onBufferHalf:
            for (int i = 0; i < GLUE_LENGTH/2; ++i) {
                glue[GLUE_LENGTH/2 + i] = buffer[i];
            }

            state = onBufferFull;
            break;
        case onGlueHalf:
            for (int i = 0; i < GLUE_LENGTH/2; ++i) {
                glue[GLUE_LENGTH/2 + i] = buffer[i];
            }

            state = onGlueFull;
            cached = false;
            prepareNextBuffer();
            break;
        default:
            break;
    }
}

//decodes up to the requested number of frames
emscripten::val Decoder::decode(uint32_t count)
{
    emscripten::val ret = emscripten::val::undefined();

    int available = framesLeft(count);
    if (available > 0) {
        ret = context.call<emscripten::val>("createBuffer", channels, available * samplesPerFrame, sampleRate);

        std::vector<emscripten::val> chans(channels, emscripten::val::undefined());
        for (int i = 0; i < channels; ++i) {
            chans[i] = ret.call<emscripten::val>("getChannelData", i);
        }

        for (int i = 0; i < available; ++i) {
            int res = mad_frame_decode(&frame, &stream);

            if (res != 0) {
                if (MAD_RECOVERABLE(stream.error)) {
                    continue;
                } else {
                    break;
                }
            }

            mad_synth_frame(&synth, &frame);
            for (int j = 0; j < samplesPerFrame; ++j) {
                for (int k = 0; k < channels; ++k) {
                    float value = mad_f_todouble(synth.pcm.samples[k][j]);
                    chans[k].set(std::to_string(success * samplesPerFrame + j), emscripten::val(value));
                }
            }
        }

        cachedLength -= available;
        if (cachedLength == 0) {
            cached = false;
            prepareNextBuffer();
        }
    }
    return ret;
}


//tells how many frames can be decoded at the same
//sample rate and channel count without switching buffers;
//Decoder::decode needs this to know the size of the
//AudioBuffer to allocate.

uint32_t Decoder::framesLeft(uint32_t max)
{
    if (state == empty || state == onGlueHalf) {
        return 0;
    }

    if (cached == false) {
        mad_stream probe;
        mad_header ph;
        initializeProbe(probe);
        mad_header_init(&ph);

        while (cachedLength < max) {
            if (mad_header_decode(&ph, &probe) == 0) {
                if (sampleRate == 0) {
                    sampleRate = ph.samplerate;
                    channels = MAD_NCHANNELS(&ph);
                    samplesPerFrame = MAD_NSBSAMPLES(&ph) * 32;
                } else {
                    if (sampleRate != ph.samplerate || channels != MAD_NCHANNELS(&ph) || samplesPerFrame != MAD_NSBSAMPLES(&ph) * 32) {
                        break;
                    }
                }
                if (probe.next_frame > probe.this_frame) {
                    ++cachedLength;
                }
            } else {
                if (!MAD_RECOVERABLE(probe.error)) {
                    break;
                }
            }
        }

        cachedNext = probe.next_frame;
        cachedThis = probe.this_frame;
        cachedError = probe.error;
        mad_header_finish(&ph);
        mad_stream_finish(&probe);
        cached = true;
    }

    return std::min(cachedLength, max);
}

//this method fastforwards the stream
//to the cached end
void Decoder::pullBuffer()
{
    if (cached == false) {
        throw 2;
    }
    stream.this_frame = cachedThis;
    stream.next_frame = cachedNext;
    stream.error = cachedError;
}

//this method switches the stream to glue buffer
//or to the next pending buffer
//copies the parts to the glue buffer if required

void Decoder::changeBuffer()
{
    uint32_t left;
    switch (state) {
        case empty:
            throw 3;
        case onBufferHalf:
            switchToGlue();
            state = onGlueHalf;
            break;
        case onBufferFull:
            switchToGlue();
            state = onGlueFull;
            break;
        case onGlueHalf:
            throw 4;
            break;
        case onGlueFull:
            switchBuffer(pending[0].ptr, pending[0].length);

            for (int i = 0; i < GLUE_LENGTH/2; ++i) {
                glue[i] = pending[0].ptr[pending[0].length - GLUE_LENGTH/2 + i];
            }
            state = onBufferHalf;

            if (pending.size() > 1) {
                for (int i = 0; i < GLUE_LENGTH/2; ++i) {
                    glue[GLUE_LENGTH/2 + i] = pending[1].ptr[i];
                }
                state = onBufferFull;
            }
    }

    cached = false;
}

//this method looks for decodable data in the pending buffers
//and prepares the stream if any is found
void Decoder::prepareNextBuffer()
{
    bool shift;
    do {
        shift = false;
        framesLeft();
        if (cachedLength == 0 && state != empty && state != onGlueHalf) {
            pullBuffer();
            changeBuffer();
            shift = true;
        }
    } while (shift);
}

//low level method to switch to glue buffer, also frees the drained fragment
void Decoder::switchToGlue()
{
    switchBuffer(glue, GLUE_LENGTH);
    stream.error = MAD_ERROR_NONE;

    free(pending[0].ptr);
    pending.pop_front();
}

//low level method which actually switches mad_stream
//to another buffer
void Decoder::switchBuffer(uint8_t* bufferPtr, uint32_t length)
{
    uint32_t left;
    left = stream.bufend - stream.next_frame;
    mad_stream_buffer(&stream, bufferPtr + GLUE_LENGTH / 2 - left, length - (GLUE_LENGTH / 2 - left));
    stream.error = MAD_ERROR_NONE;
}

Here is my repo with the full code. To try it, you need to build it with CMake (Emscripten must be installed) and open index.html from the build directory in your browser.

The problem
The playback is distorted. I checked the bytes around the last successful frame before and after the switch, as well as all the substructures of mad_stream. Everything seems to work properly, but the distortion remains. My latest progress is built and hosted here. I'm really stuck and I don't know what to do to eliminate the distortion in the playback.

I would really appreciate any help.

Blue · Accepted Answer

I've found it! MAD works perfectly; because of my inner counter, I kept skipping the first decoded frames in the output.

for (int i = 0; success < available; ++i) {
    int res = mad_frame_decode(frame, stream);

    if (res == 0) {
        ++success;      //the inner counter mentioned above
    } else {
        if (MAD_RECOVERABLE(stream->error)) {
            std::cout << "Unexpected error during the decoding process: " << mad_stream_errorstr(stream) << std::endl;
            continue;
        } else {
            break;
        }
    }

    mad_synth_frame(synth, frame);

    for (int j = 0; j < samplesPerFrame; ++j) {
        for (int k = 0; k < channels; ++k) {
            float value = mad_f_todouble(synth->pcm.samples[k][j]);
            chans[k].set(std::to_string(success * samplesPerFrame + j), emscripten::val(value));
        }
    }
}

I changed success to i and it worked.
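
To see why a counter that is incremented before it is used as a write offset shifts the output, here is a small stand-alone sketch without libmad or Emscripten. It reflects one reading of the loop above, not the author's actual code: `fillFrames` and its parameters are hypothetical, each "frame" is just a run of identical integers, and out-of-range writes are simply dropped.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration: writes `frames` frames of `samplesPerFrame`
// samples into a zero-initialized buffer. Frame n is written as the value
// n + 1 so that an untouched slot (0) is easy to spot.
// When incrementBeforeWrite is true, the counter is bumped before it is
// used to compute the write offset, as in the loop shown above.
std::vector<int> fillFrames(std::size_t frames,
                            std::size_t samplesPerFrame,
                            bool incrementBeforeWrite)
{
    assert(samplesPerFrame > 0);
    std::vector<int> out(frames * samplesPerFrame, 0);
    std::size_t written = 0;

    for (std::size_t i = 0; i < frames; ++i) {
        if (incrementBeforeWrite) {
            ++written;
        }
        std::size_t base = written * samplesPerFrame;
        for (std::size_t j = 0; j < samplesPerFrame; ++j) {
            std::size_t index = base + j;
            if (index < out.size()) {   // assumption: drop out-of-range writes
                out[index] = static_cast<int>(i + 1);
            }
        }
        if (!incrementBeforeWrite) {
            ++written;
        }
    }
    return out;
}
```

With three frames of two samples each, incrementing first yields `{0, 0, 1, 1, 2, 2}`: the first slot stays silent and the last frame has nowhere to go. Incrementing after the write yields `{1, 1, 2, 2, 3, 3}`.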
