simone viozzi

Reputation: 460

How to decode an m4a audio file on Android

I'm trying to decode an audio file on Android and get the raw data so I can apply a filter.

I'm using MediaExtractor to extract the encoded data from the file, and that seems to work. Then I tried to combine the code from the MediaExtractor docs with the MediaCodec docs on Synchronous Processing using Buffers, to extract the data and decode it in blocks.

So I first configured the decoder with the format taken from extractor.getTrackFormat(0);

MediaExtractor extractor = new MediaExtractor();

String path = "...";
extractor.setDataSource(path);

MediaFormat format = extractor.getTrackFormat(0);
mAudioKeyMine = format.getString(MediaFormat.KEY_MIME);

extractor.selectTrack(0);

MediaCodec decoder;
decoder = MediaCodec.createDecoderByType(mAudioKeyMine);
decoder.configure(format, null, null, 0);

And then tried to get the data:

public void getData(MediaExtractor extractor)
{
    int offset = 0;

    ByteBuffer inputBuffer = ByteBuffer.allocate(2048);

    MediaFormat outputFormat = decoder.getOutputFormat();
    Log.v(TAG, "outputFormat: " + outputFormat.toString());

    decoder.start();
    int index = decoder.dequeueInputBuffer(1000);

    boolean sawInputEOS = false;

    int sample = 0;
    while (sample >= 0)
    {

        int inputBufferId = decoder.dequeueInputBuffer(1000);
        if (inputBufferId >= 0)
        {
            inputBuffer = decoder.getInputBuffer(index);

            sample = extractor.readSampleData(inputBuffer, 0);

            long presentationTimeUs = 0;

            if (sample < 0)
            {
                sawInputEOS = true;
                sample = 0;
            }
            else
            {
                int trackIndex = extractor.getSampleTrackIndex();
                presentationTimeUs = extractor.getSampleTime();

                Log.v(TAG, "trackIndex: " + trackIndex + ", presentationTimeUs: " + presentationTimeUs);
                Log.v(TAG, "sample: " + sample + ", offset: " + offset);
                Log.v(TAG, "inputBuffer: " + inputBuffer.toString());
            }

            decoder.queueInputBuffer(inputBufferId, 0, sample, presentationTimeUs, sawInputEOS ? MediaCodec.BUFFER_FLAG_END_OF_STREAM : 0);

            if (!sawInputEOS)
            {
                extractor.advance();
            }

        }
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();

        int outputBufferId = decoder.dequeueOutputBuffer(info, 1000);
        Log.v(TAG, "info: " + info.toString());

        if (outputBufferId >= 0)
        {
            ByteBuffer outputBuffer = decoder.getOutputBuffer(outputBufferId);
            MediaFormat bufferFormat = decoder.getOutputFormat(outputBufferId);

            Log.v(TAG, "option A");
            Log.v(TAG, "outputBufferId: " + outputBufferId);
            if (outputBuffer != null)
            {
                Log.v(TAG, "outputBuffer: " + outputBuffer.toString());
            }
            else
            {
                Log.v(TAG, "outputBuffer: null");
            }
            Log.v(TAG, "bufferFormat: " + bufferFormat.toString());

            if (outputBuffer != null)
            {
                int cont = 0;
                while (outputBuffer.hasRemaining())
                {
                    int pos = outputBuffer.position();
                    byte data = outputBuffer.get();

                    // do something with the data
                    if (cont < 10)
                    {
                        Log.v(TAG, "outputBuffer: " + pos + " -> " + data);
                    }
                    cont++;
                }
            }
            else
            {
                Log.v(TAG, "outputBuffer: null");
            }
            decoder.releaseOutputBuffer(outputBufferId, 0);
        }
        else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED)
        {
            Log.v(TAG, "option B");
            outputFormat = decoder.getOutputFormat(); 
            Log.v(TAG, "outputFormat: " + outputFormat.toString());
        }
        Log.v(TAG, "extractor.advance()");
        offset += sample;
    }
    Log.v(TAG, "end of track");
    extractor.release();
    extractor = null;
    decoder.stop();
    decoder.release();
}

But I get an IllegalStateException at the line int outputBufferId = decoder.dequeueOutputBuffer(info, 1000);.

I searched for the error and for how to properly decode an m4a file, but most of the solutions rely on APIs deprecated in API 21, and now I'm stuck on this error.

So, is there an example of audio decoding for API 26/28, or can someone please explain how to do it correctly?

The entire project is hosted on GitHub.

Upvotes: 3

Views: 1843

Answers (1)

simone viozzi

Reputation: 460

I solved it by running the decoder in asynchronous mode using callbacks.

The basic workflow is to:

  • extract the encoded data from the file using MediaExtractor
  • pass it to the MediaCodec for decoding
  • pass the decoded data to AudioTrack to play it (or do whatever you want to do with the data)

First we need some initialization; I put this in the constructor of the class I use to decode and play the file:

// initialize the mediaExtractor and set the source file
mediaExtractor = new MediaExtractor();
mediaExtractor.setDataSource(fileName);

// select the first audio track in the file and get its format
mediaFormat = null;
int i;
int numTracks = mediaExtractor.getTrackCount();
for (i = 0; i < numTracks; i++)
{
    mediaFormat = mediaExtractor.getTrackFormat(i);
    if (mediaFormat.getString(MediaFormat.KEY_MIME).startsWith("audio/"))
    {
        mediaExtractor.selectTrack(i);
        break;
    }
}
// we get the parameters from the mediaFormat
channelCount = mediaFormat.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
sampleRate = mediaFormat.getInteger(MediaFormat.KEY_SAMPLE_RATE);
duration = mediaFormat.getLong(MediaFormat.KEY_DURATION);
mimeType = mediaFormat.getString(MediaFormat.KEY_MIME);

// we can get the minimum buffer size from AudioTrack by passing the audio parameters;
// to stay safe it's good practice to allocate a buffer that is 8 times bigger
int minBuffSize = AudioTrack.getMinBufferSize(sampleRate,
                                              AudioFormat.CHANNEL_OUT_STEREO,
                                              AudioFormat.ENCODING_PCM_16BIT);

// to play the data we need to initialize the audioTrack by passing the audio parameters;
// we use MODE_STREAM so we can push more data dynamically with audioTrack.write()
audioTrack = new AudioTrack(AudioManager.STREAM_MUSIC,
                            sampleRate,
                            AudioFormat.CHANNEL_OUT_STEREO,
                            AudioFormat.ENCODING_PCM_16BIT,
                            minBuffSize * 8,
                            AudioTrack.MODE_STREAM);

According to the developer guide, the AudioTrack constructor I used is deprecated, but it works for me and the newer method didn't, so for the purpose of this example I kept this type of initialization.
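
For reference, this is roughly what the newer, Builder-based initialization would look like (a minimal sketch assuming API 23+; as said above, it didn't work in my setup, so take it only as a reference):

// hypothetical sketch of the non-deprecated AudioTrack.Builder initialization,
// equivalent in intent to the constructor used above
audioTrack = new AudioTrack.Builder()
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                .build())
        .setAudioFormat(new AudioFormat.Builder()
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .setSampleRate(sampleRate)
                .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
                .build())
        .setTransferMode(AudioTrack.MODE_STREAM)
        .setBufferSizeInBytes(minBuffSize * 8)
        .build();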

After the initialization phase we need to create the decoder, set the callbacks for it, and start the decoder and the audioTrack. The callbacks of the mediaCodec are:

  • onInputBufferAvailable: Called when an input buffer becomes available.
  • onOutputBufferAvailable: Called when an output buffer becomes available.
  • onError: Called when the MediaCodec encounters an error.
  • onOutputFormatChanged: Called when the output format has changed.

So we need to:

  • use the extractor to read the encoded data from the file and fill the inputBuffers that we get from the codec.
  • after the codec has finished decoding, get the decoded data from an outputBuffer and pass it to the audioTrack.

My code is:

// we get the mediaCodec by creating it using the mime_type extracted form the track
MediaCodec decoder = MediaCodec.createDecoderByType(mimeType);

// to decode the file in asynchronous mode we set the callbacks
decoder.setCallback(new MediaCodec.Callback()
{
    private boolean mOutputEOS = false;
    private boolean mInputEOS = false;

    @Override
    public void onInputBufferAvailable (@NonNull MediaCodec codec,
                                        int index)
    {
        // if I reached EOS on either the input or the output stream I just skip
        if (mOutputEOS | mInputEOS) return;

        // I must use the index to get the right ByteBuffer from the codec
        ByteBuffer inputBuffer = codec.getInputBuffer(index);

        // if the buffer is null I just skip and wait for another one
        if (inputBuffer == null) return;

        long sampleTime = 0;
        int result;

        // with this call I fill the inputBuffer with the data read from the mediaExtractor
        result = mediaExtractor.readSampleData(inputBuffer, 0);
        // the return value of readSampleData is the number of bytes read from the file,
        // and if it's -1 it means that I reached EOS
        if (result >= 0)
        {
            // if I read some bytes I can pass the index of the buffer, the number of bytes
            // that are in the buffer and the sampleTime to the codec, so that it can decode
            // that data
            sampleTime = mediaExtractor.getSampleTime();
            codec.queueInputBuffer(index, 0, result, sampleTime, 0);
            mediaExtractor.advance();
        }
        else
        {
            // if I reached EOS I need to tell the codec
            codec.queueInputBuffer(index, 0, 0, -1, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
            mInputEOS = true;
        }
    }

    @Override
    public void onOutputBufferAvailable (@NonNull MediaCodec codec,
                                         int index,
                                         @NonNull MediaCodec.BufferInfo info)
    {
        // I can get the outputBuffer from the codec using its index
        ByteBuffer outputBuffer = codec.getOutputBuffer(index);

        // if I got a non-null buffer
        if (outputBuffer != null)
        {
            outputBuffer.rewind();
            outputBuffer.order(ByteOrder.LITTLE_ENDIAN);

            // I just need to write the outputBuffer into the audioTrack, passing the number of
            // bytes it contains and using WRITE_BLOCKING so that this call blocks
            // until it has finished writing the data
            int ret = audioTrack.write(outputBuffer,
                                       outputBuffer.remaining(),
                                       AudioTrack.WRITE_BLOCKING);
        }

        // if the flags in the MediaCodec.BufferInfo contain BUFFER_FLAG_END_OF_STREAM
        // it means that I reached EOS, so I set mOutputEOS to true; to make sure
        // that it stays true even if this callback is called again I use the logical or
        mOutputEOS |= ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0);

        // I always need to release the buffers I use so the system can recycle
        // and reuse them
        codec.releaseOutputBuffer(index, false);

        // if I reached the end of the output stream I need to stop and release the codec
        // and the extractor
        if (mOutputEOS)
        {
            codec.stop();
            codec.release();
            mediaExtractor.release();
            audioTrack.release();
        }
    }

    @Override
    public void onError (@NonNull MediaCodec codec,
                         @NonNull MediaCodec.CodecException e)
    {
        Timber.e(e, "mediaCodec callback onError: %s", e.getMessage());
    }

    @Override
    public void onOutputFormatChanged (@NonNull MediaCodec codec,
                                       @NonNull MediaFormat format)
    {
        Timber.d("onOutputFormatChanged: %s", format.toString());
    }

});
// now we can configure the codec by passing the mediaFormat and start it
decoder.configure(mediaFormat, null, null, 0);
decoder.start();
// also we need to start the audioTrack.
audioTrack.play();
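
Since the original goal was to get the raw data and apply a filter, this is a minimal sketch (my assumption, not part of the code above) of what could be done inside onOutputBufferAvailable instead of writing to the audioTrack: copy the decoded bytes into a short[] of 16-bit PCM samples and filter those.

// sketch: read the decoded 16-bit PCM samples out of the output buffer so that
// a filter can be applied to them, instead of writing them to the audioTrack
ShortBuffer pcmBuffer = outputBuffer.order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
short[] samples = new short[pcmBuffer.remaining()];
pcmBuffer.get(samples);
// samples now holds the interleaved PCM data (left/right for a stereo track), ready for filtering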

Upvotes: 3
