snachmsm

Reputation: 19243

AMR decoding from RTP

I'm receiving an RTP stream about which I only know that it is AMR-WB, octet-aligned, 100 ms per packet. A third party can receive the same stream and it is audible, so the stream itself is fine. I'm receiving this data and trying to decode it, without luck...

init:

val sampleRate = 16000
val mc = MediaCodec.createDecoderByType(MediaFormat.MIMETYPE_AUDIO_AMR_WB)
val mf = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AMR_WB, sampleRate, 1)
mf.setInteger(MediaFormat.KEY_SAMPLE_RATE, sampleRate) // is it needed?
mc.configure(mf, null, null, 0)
mc.start()

decode each packet separately:

private fun decode(decoder: MediaCodec, mediaFormat: MediaFormat, rtpPacket: RtpPacket): ByteArray {
    var outputBuffer: ByteBuffer
    var outputBufferIndex: Int

    val inputBuffers: Array<ByteBuffer> = decoder.inputBuffers
    var outputBuffers: Array<ByteBuffer> = decoder.outputBuffers

    // input
    val inputBufferIndex = decoder.dequeueInputBuffer(-1L)
    if (inputBufferIndex >= 0) {
        val inputBuffer = inputBuffers[inputBufferIndex]
        inputBuffer.clear()
        inputBuffer.put(rtpPacket.payload)
        // native ACodec/MediaCodec crash in here (log below)
        decoder.queueInputBuffer(inputBufferIndex, 0, rtpPacket.payload.size, System.nanoTime()/1000, 0)
    }

    // output
    val bufferInfo: MediaCodec.BufferInfo = MediaCodec.BufferInfo()
    outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, -1L)
    Timber.i("outputBufferIndex: ${outputBufferIndex}")
    when (outputBufferIndex) {
        MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED -> {
            Timber.d("INFO_OUTPUT_BUFFERS_CHANGED")
            outputBuffers = decoder.outputBuffers
        }
        MediaCodec.INFO_OUTPUT_FORMAT_CHANGED -> {
            val format: MediaFormat = decoder.outputFormat
            Timber.d("INFO_OUTPUT_FORMAT_CHANGED $format")
            audioTrack.playbackRate = format.getInteger(MediaFormat.KEY_SAMPLE_RATE)
        }
        MediaCodec.INFO_TRY_AGAIN_LATER -> Timber.d("INFO_TRY_AGAIN_LATER")
        else -> {
            val outBuffer = outputBuffers[outputBufferIndex]
            outBuffer.position(bufferInfo.offset)
            outBuffer.limit(bufferInfo.offset + bufferInfo.size)

            val chunk = ByteArray(bufferInfo.size)
            outBuffer.get(chunk)
            outBuffer.clear()
            // chunk already starts at the decoded data, so write it from index 0
            audioTrack.write(chunk, 0, chunk.size)
            decoder.releaseOutputBuffer(outputBufferIndex, false)
            Timber.v("chunk size:${chunk.size}")
            return chunk
        }
    }

    // All decoded frames have been rendered, we can stop playing now
    if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM != 0) {
        Timber.d("BUFFER_FLAG_END_OF_STREAM")
    }
    return ByteArray(0)
}

Sadly, on a (clean) Android 10 device I'm getting:

E/ACodec: [OMX.google.amrwb.decoder] ERROR(0x80001001)
E/ACodec: signalError(omxError 0x80001001, internalError -2147483648)
E/MediaCodec: Codec reported err 0x80001001, actionCode 0, while in state 6
E/RtpReceiver: java.lang.IllegalStateException
    at android.media.MediaCodec.native_dequeueInputBuffer(Native Method)
    at android.media.MediaCodec.dequeueInputBuffer(MediaCodec.java:2727)

I should probably wrap dequeueOutputBuffer + when in a while(true) loop, but then I get logs similar to the ones above, only with 0x8000100b.

On another device, a Pixel with Android 12, I'm getting something similar:

D/BufferPoolAccessor2.0: bufferpool2 0xb400007067901978 : 4(32768 size) total buffers - 4(32768 size) used buffers - 0/5 (recycle/alloc) - 0/0 (fetch/transfer)
D/CCodecBufferChannel: [c2.android.amrwb.decoder#471] work failed to complete: 14
E/MediaCodec: Codec reported err 0xe, actionCode 0, while in state 6/STARTED
E/RtpReceiver: java.lang.IllegalStateException
    at android.media.MediaCodec.native_dequeueOutputBuffer(Native Method)
    at android.media.MediaCodec.dequeueOutputBuffer(MediaCodec.java:3535)

I'm obviously cutting off the RTP header (hence payload used above), but nothing else is done. Should I also parse the payload/AMR header? Inside it there is e.g. FT, the frame type index, which determines the bitrate, so the decoder should get this parameter before the start() call, right? Or can I pass the whole payload, with CMR, ToC with FT, Q etc., straight to the decoder, and I have just initialized it badly? Or is my decode method implemented wrongly? In short: how do I properly decode (and play) AMR-WB received from an RTP stream?

Edit: worth mentioning that the payload starts with F0 84 84 84 84 04 for every packet.

Upvotes: 5

Views: 608

Answers (1)

snachmsm

It turned out that I also have to "unpack" the AMR header and "re-pack" the data into AMR frames. The first bytes of the payload posted in the question are the CMR followed by the ToC list.

F0 is the CMR and may be skipped. Starting at position 1 we can calculate the ToC size: the number of consecutive bytes with the msb set (as an unsigned int >= 128, or with a first hex digit >= 8), plus 1. So if payload[1] starts with 0 (hex), or more generally has its msb clear, the ToC size is 1, the payload holds a single frame, and we can pass it to the decoder (don't forget to skip the first CMR byte!). In my sample the ToC size is 5, so I have to split the rest of the payload and interleave it with the ToC bytes, where a "frame" = one ToC byte + the frame payload.
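To make the ToC layout concrete, here is a minimal sketch that reads the ToC list from an octet-aligned payload. It assumes the usual entry layout (F in bit 7, FT in bits 6..3, Q in bit 2); TocEntry and parseToc are names made up for this example, not part of any Android API.

```kotlin
// One entry of the ToC list in an octet-aligned AMR-WB RTP payload.
// (Hypothetical helper type for illustration.)
data class TocEntry(val followed: Boolean, val frameType: Int, val quality: Boolean)

// Parses the ToC list that starts at payload[1] (payload[0] carries the CMR).
// Returns an empty list if the payload ends before an entry with F = 0 is seen.
fun parseToc(payload: ByteArray): List<TocEntry> {
    val entries = ArrayList<TocEntry>()
    var i = 1
    while (i < payload.size) {
        val b = payload[i].toInt() and 0xFF      // treat the byte as unsigned
        val entry = TocEntry(
            followed = (b and 0x80) != 0,        // F: another ToC entry follows
            frameType = (b shr 3) and 0x0F,      // FT: index into the frame size table
            quality = (b and 0x04) != 0          // Q: frame quality indicator
        )
        entries.add(entry)
        if (!entry.followed) return entries      // last ToC entry reached
        i++
    }
    return emptyList()
}
```

For the bytes from the question (F0 84 84 84 84 04) this gives five entries, all with FT 0; the first four have F set and the last one does not.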

My whole payload has 91 bytes: minus 1 for the CMR and minus 5 ToC bytes leaves 85 bytes for 5 frames (the ToC size), i.e. 5 frames of 1 (ToC byte) + 17 (85/5, AMR payload) bytes each.
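The same arithmetic as a small sanity check one could run on each packet. This is only a sketch: frameSizesByFt and expectedPayloadLength are names invented here, and only frame types 0..9 (the speech modes plus SID) are covered.

```kotlin
// Frame payload sizes in bytes for AMR-WB frame types 0..8, plus SID (9).
val frameSizesByFt = intArrayOf(17, 23, 32, 36, 40, 46, 50, 58, 60, 5)

// Expected RTP payload length: 1 CMR byte + one ToC byte per frame + the frame data.
fun expectedPayloadLength(frameTypes: List<Int>): Int {
    require(frameTypes.all { it in frameSizesByFt.indices }) { "only FT 0..9 are covered here" }
    return 1 + frameTypes.size + frameTypes.sumOf { frameSizesByFt[it] }
}
```

With five frames of type 0 this comes to 1 + 5 + 5 * 17 = 91, which matches the payload length above.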

We could simply divide the rest of the payload evenly, but it is worth verifying that size: check the bitrate mode carried in each frame's ToC byte and compare it against the fixed frame size for that bitrate (see index in the code below).

fun decode(rtpPacket: RtpPacket): ByteArray {
    var outData = ByteArray(0)
    var position = 0
    position++ // skip payload header, ignore CMR - rtpPacket.payload[0]

    var tocLen = 0
    while (getBit(rtpPacket.payload[position].toInt(), 7)) {
        // this byte has its msb set, so another ToC byte follows
        position++
        tocLen++
    }
    if (tocLen > 0) { // if any ToC byte with msb set was detected
        // the first byte which does NOT have its msb set also belongs to the ToC
        position++
        tocLen++
    }
    //Timber.i("decoded tocListSize: $tocLen")

    if (tocLen > 0) {
        // frame payload sizes in bytes for frame types 0..8, plus SID (9)
        val amr_frame_sizes = intArrayOf(17, 23, 32, 36, 40, 46, 50, 58, 60, 5)
        // starting from 1 because this is the first ToC byte position after omitting the CMR
        for (i in 1 until (tocLen + 1)) {
            // the frame type (FT) sits in bits 6..3 of the ToC byte
            val index = rtpPacket.payload[i].toInt() shr 3 and 0xf
            if (index >= 9) { // only the speech modes 0..8 are accepted here
                Timber.w("Bad AMR ToC, index=$index")
                break
            }
            val frameSize = amr_frame_sizes[index]
            //Timber.i("decoded i:$i index:$index frameSize:$frameSize position:$position")
            if (position + frameSize > rtpPacket.payloadLength) {
                Timber.w("Truncated AMR frame")
                break
            }
            val frame = ByteArray(1 + frameSize)
            frame[0] = rtpPacket.payload[i]
            System.arraycopy(rtpPacket.payload, position, frame, 1, frameSize)

            outData = outData.plus(decode(frame))
            position += frameSize
        }
    } else { // single frame case, NOT TESTED!!
        outData = ByteArray(rtpPacket.payloadLength - 1) // without CMR
        System.arraycopy(rtpPacket.payload, 1, outData, 0, outData.size)
        outData = decode(outData)
    }
    return outData
}

The re-packed frames may be used instead of rtpPacket.payload in the decode method posted in the question (the decoder code itself could be improved a bit, as its last lines are unreachable, but it works even in this form).

amr_frame_sizes is a constant array for my case, in which 100 ms of AMR is divided into 5 frames. The sizes apply to exactly that case, 20 ms frames, and the position in the array corresponds to index (the "changeable" bitrate).
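For reference, here is a self-contained variant of just the splitting step, with no decoder calls and with bounds checks, so it can be tried on a plain ByteArray. It is a sketch under assumptions, not the code I actually run: splitAmrWbPayload and AMR_WB_FRAME_SIZES are names invented for this example, frame types 14 and 15 are treated as carrying no data, and the header byte of each output frame is masked with 0x7C so that only FT and Q survive (the code above copies the ToC byte unchanged, and that works too).

```kotlin
// Frame payload sizes in bytes for AMR-WB frame types 0..8, plus SID (9).
val AMR_WB_FRAME_SIZES = intArrayOf(17, 23, 32, 36, 40, 46, 50, 58, 60, 5)

// Splits an octet-aligned AMR-WB RTP payload (CMR + ToC list + frame data)
// into frames of the form [header byte][frame data].
fun splitAmrWbPayload(payload: ByteArray): List<ByteArray> {
    val tocStart = 1                                  // payload[0] is the CMR
    var tocEnd = tocStart
    // walk over ToC bytes that have the F bit (msb) set
    while (tocEnd < payload.size && (payload[tocEnd].toInt() and 0x80) != 0) tocEnd++
    if (tocEnd >= payload.size) return emptyList()    // ToC list never terminated
    tocEnd++                                          // include the last entry (F = 0)

    val frames = ArrayList<ByteArray>()
    var position = tocEnd                             // frame data starts after the ToC
    for (i in tocStart until tocEnd) {
        val toc = payload[i].toInt() and 0xFF
        val ft = (toc shr 3) and 0x0F
        if (ft == 14 || ft == 15) continue            // assumed: no data bytes for these types
        if (ft >= AMR_WB_FRAME_SIZES.size) break      // remaining values are not handled
        val size = AMR_WB_FRAME_SIZES[ft]
        if (position + size > payload.size) break     // truncated frame
        val frame = ByteArray(1 + size)
        frame[0] = (toc and 0x7C).toByte()            // keep FT and Q, clear F and padding
        System.arraycopy(payload, position, frame, 1, size)
        frames.add(frame)
        position += size
    }
    return frames
}
```

For a 91-byte packet that starts with F0 84 84 84 84 04 this returns five 18-byte frames, each of which can be queued to the decoder as its own input buffer.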

Upvotes: 2
