tiagocarvalho92

Reputation: 427

Decoding and Encoding Video generates file without stream

I'm trying to write some logic that selects a video from the gallery (in any format or encoding) and converts it to MP4 with the H.264 codec. The output file is generated with a positive size and duration, but its video stream is empty:

Duration: 00:00:02.83, bitrate: 4462 kb/s
Stream #0:0(eng): Video: none, none, 90k tbr, 90k tbn, 90k tbc (default)
Unsupported codec with id 0 for input stream 0

This is my current code:

fun processVideo(uri: Uri, filename: String): Result<File> {
    val newFileName = filename.substringBeforeLast(".") + ".mp4"
    val outputFile = File(context.cacheDir, newFileName)
    Timber.e("Output file is $outputFile")

    val extractor = MediaExtractor()
    var muxer: MediaMuxer? = null
    var decoder: MediaCodec? = null
    var encoder: MediaCodec? = null

    try {
        // Initialize MediaExtractor
        extractor.setDataSource(context, uri, null)

        // Initialize MediaMuxer
        muxer = MediaMuxer(outputFile.absolutePath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)

        // Loop through tracks to find video track
        Timber.e("Loop through tracks to find video track")
        val oldTrackIndex = findTrackIndex(extractor, "video/")
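        if (oldTrackIndex == -1) {
            Timber.e("No video track found in input video")
            // Bail out here: selectTrack(-1) below would throw
            return Result.failure(IllegalArgumentException("No video track found in input video"))
        }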
        extractor.selectTrack(oldTrackIndex)

        extractor.seekTo(0, MediaExtractor.SEEK_TO_CLOSEST_SYNC)

        val inputVideoFormat = extractor.getTrackFormat(oldTrackIndex)

        encoder = createVideoEncoder(inputVideoFormat)
        decoder = createVideoDecoder(inputVideoFormat)

        decoder.start()
        encoder.start()

        var videoTrackIndex = -1

        val retrieverSrc = MediaMetadataRetriever()
        retrieverSrc.setDataSource(context, uri)
        val degreesString = retrieverSrc.extractMetadata(
            MediaMetadataRetriever.METADATA_KEY_VIDEO_ROTATION
        )
        if (degreesString != null) {
            val degrees = degreesString.toInt()
            if (degrees >= 0) {
                muxer.setOrientationHint(degrees)
            }
        }

        var inputBuffer: ByteBuffer
        var outputBuffer: ByteBuffer

        var inputBufferIndex: Int
        var outputBufferIndex: Int

        val bufferInfo = MediaCodec.BufferInfo()

        val startTime = System.currentTimeMillis()

        var isDecoderEOS = false
        var isEncoderEOS = false

        while (!isEncoderEOS) {
            if (!isDecoderEOS) {
                // Step 1: Extract data from MediaExtractor to MediaCodec (decoder)
                inputBufferIndex = decoder.dequeueInputBuffer(10000)
                if (inputBufferIndex >= 0) {
                    inputBuffer = decoder.getInputBuffer(inputBufferIndex)!!
                    val sampleSize = extractor.readSampleData(inputBuffer, 0)
                    if (sampleSize < 0) {
                        // End of stream
                        Timber.e("End of stream reached.")
                        decoder.queueInputBuffer(
                            inputBufferIndex,
                            0,
                            0,
                            0,
                            MediaCodec.BUFFER_FLAG_END_OF_STREAM
                        )
                        isDecoderEOS = true
                    } else {
                        Timber.e("Queueing input buffer for decoding.")
                        decoder.queueInputBuffer(
                            inputBufferIndex,
                            0,
                            sampleSize,
                            extractor.sampleTime,
                            extractor.sampleFlags
                        )
                        extractor.advance()
                    }
                }
            }

            // Step 2: Decode and re-encode (from decoder to encoder)
            outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 10000)
            if (outputBufferIndex >= 0) {
                outputBuffer = decoder.getOutputBuffer(outputBufferIndex)!!
                val outBufferIndex = encoder.dequeueInputBuffer(10000)
                if (outBufferIndex >= 0) {
                    Timber.e("Re-encoding video frame.")
                    val outBuffer = encoder.getInputBuffer(outBufferIndex)!!
                    outBuffer.put(outputBuffer)
                    encoder.queueInputBuffer(
                        outBufferIndex,
                        0,
                        bufferInfo.size,
                        bufferInfo.presentationTimeUs,
                        bufferInfo.flags
                    )

                    // Release the decoder's output buffer after successfully filling encoder's input buffer
                    decoder.releaseOutputBuffer(outputBufferIndex, false)
                }
            }

            // Step 3: Write re-encoded data to MediaMuxer
            val encoderOutputBufferIndex = encoder.dequeueOutputBuffer(bufferInfo, 10000)
            if (encoderOutputBufferIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                // The output format has changed, add the new format to the MediaMuxer
                if (videoTrackIndex == -1) {
                    videoTrackIndex = muxer.addTrack(encoder.outputFormat)
                    Timber.e("Encoder format: ${encoder.outputFormat}")
                    muxer.start() // Start the muxer here after adding the track
                }
            } else if (encoderOutputBufferIndex >= 0) {
                if (videoTrackIndex != -1) {
                    Timber.e("Writing re-encoded data to MediaMuxer.")
                    outputBuffer = encoder.getOutputBuffer(encoderOutputBufferIndex)!!
                    muxer.writeSampleData(videoTrackIndex, outputBuffer, bufferInfo)
                    encoder.releaseOutputBuffer(encoderOutputBufferIndex, false)
                }
            }

            // Check for end of stream
            if ((bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
                Timber.e("Received end of stream flag.")
                isEncoderEOS = true
            }
        }
        val endTime = System.currentTimeMillis()
        val timeTaken = endTime - startTime
        val timeTakenSeconds = timeTaken / 1000
        val timeTakenRemainingMillis = timeTaken % 1000

        Timber.e("Time taken for main loop: ${timeTakenSeconds}s ${timeTakenRemainingMillis}ms")
        Timber.e("Main processing loop completed")
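    } catch (e: Exception) {
        // Timber expects the Throwable first; report the failure instead of falling through to success
        Timber.e(e, "Error during video conversion")
        return Result.failure(e)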
    } finally {
        extractor.safeRelease()
        decoder?.safeStop()
        decoder?.safeRelease()
        encoder?.safeStop()
        encoder?.safeRelease()
        muxer?.safeStop()
        muxer?.safeRelease()
    }

    return Result.success(outputFile)
}

private fun createVideoEncoder(inputFormat: MediaFormat): MediaCodec {
    Timber.e("Create video encoder")
    val codecInfo = selectCodec("video/avc")
    val videoEncoder = MediaCodec.createByCodecName(codecInfo!!.name)

    val width = inputFormat.getInteger(MediaFormat.KEY_WIDTH)
    val height = inputFormat.getInteger(MediaFormat.KEY_HEIGHT)
    val frameRate = inputFormat.getSafeInteger(MediaFormat.KEY_FRAME_RATE, 30)
    val iFrameInterval = inputFormat.getSafeInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 5)
    val colorFormat = inputFormat.getSafeInteger(
        MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Flexible
    )
    val bitrateMode = inputFormat.getSafeInteger(
        MediaFormat.KEY_BITRATE_MODE,
        MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_CBR
    )
    val quality = 0.1f
    val estimatedBitRate = estimateBitRate(width, height, frameRate, quality)

    val outputFormat = MediaFormat.createVideoFormat(
        "video/avc",
        width,
        height
    )

    // Set the codec to H.264
    outputFormat.setString(MediaFormat.KEY_MIME, "video/avc")

    // Copy properties from input format to output format
    outputFormat.setInteger(MediaFormat.KEY_BIT_RATE, estimatedBitRate)
    outputFormat.setInteger(MediaFormat.KEY_FRAME_RATE, frameRate)
    outputFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, iFrameInterval)
    outputFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, colorFormat)
    outputFormat.setInteger(MediaFormat.KEY_BITRATE_MODE, bitrateMode)

    videoEncoder.configure(outputFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
    return videoEncoder
}

private fun createVideoDecoder(inputFormat: MediaFormat): MediaCodec {
    Timber.e("Create video decoder")
    val codecInfo = selectCodec(inputFormat.getString(MediaFormat.KEY_MIME)!!, shouldUseEncode = false)
    val decoder = MediaCodec.createByCodecName(codecInfo!!.name)
    decoder.configure(inputFormat, null, null, 0)
    return decoder
}

Sorry for the long snippet :) Does anyone have any idea of what could be the cause?

EDIT 1: I updated the code following the suggestions, and ffmpeg now shows:

  Duration: 00:00:30.03, start: 0.000000, bitrate: 684 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, unknown/bt470bg/unknown), 640x360, 682 kb/s, 30 fps, 30 tbr, 90k tbn, 180k tbc (default)
    Metadata:
      creation_time   : 2023-11-03T07:02:10.000000Z
      handler_name    : VideoHandle

Upvotes: 1

Views: 214

Answers (1)

dev.bmax

Reputation: 10581

After a quick glance over your code I can see a few possible issues:

  1. If you call encoder.outputFormat immediately after configuring the codec you will get an incomplete format (as stated in the docs). You are supposed to do it either after you receive an INFO_OUTPUT_FORMAT_CHANGED signal or after receiving the first output buffer.

    See an asynchronous mode example.

    See a synchronous mode example.

  2. You don't control the color format that the decoder outputs. So you should not hardcode the value of MediaFormat.KEY_COLOR_FORMAT. You should instead pass the color format value that you get from encoder.outputFormat.

  3. The same applies to MediaFormat.KEY_FRAME_RATE (if you don't want to play the video at the wrong speed).

  4. The estimate of MediaFormat.KEY_BIT_RATE is too low. It should ideally be calculated based on the video resolution and frame rate.

  5. It's a good practice to always seek the MediaExtractor before reading the samples.
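
As a rough sketch of points 3 and 4 (an illustration only, not the code from the question, whose estimateBitRate is not shown): the frame rate can be derived from the sample timestamps when the track format does not carry KEY_FRAME_RATE, and the bit rate can then be scaled by resolution and frame rate using a bits-per-pixel factor. The names estimateFrameRate and estimateBitRateFromBitsPerPixel, the bitsPerPixel parameter, and the fallback of 30 fps are all assumptions made for this example.

```kotlin
import kotlin.math.roundToInt

// Hypothetical helper: average frame rate from sample presentation timestamps
// in microseconds (for example, values collected from MediaExtractor.sampleTime).
// Uses min/max because samples are not guaranteed to arrive in presentation order.
fun estimateFrameRate(presentationTimesUs: List<Long>, fallback: Int = 30): Int {
    if (presentationTimesUs.size < 2) return fallback
    val spanUs = presentationTimesUs.maxOf { it } - presentationTimesUs.minOf { it }
    if (spanUs <= 0L) return fallback
    val intervals = presentationTimesUs.size - 1
    return (intervals * 1_000_000.0 / spanUs).roundToInt()
}

// Hypothetical helper: bit rate in bits per second, scaled by resolution and frame rate.
// bitsPerPixel is a tuning knob assumed for this sketch; higher values mean higher quality.
fun estimateBitRateFromBitsPerPixel(
    width: Int,
    height: Int,
    frameRate: Int,
    bitsPerPixel: Double
): Int {
    require(width > 0 && height > 0 && frameRate > 0) { "Dimensions and frame rate must be positive" }
    require(bitsPerPixel > 0.0) { "bitsPerPixel must be positive" }
    // Multiply as Long first so large resolutions cannot overflow Int
    val pixelsPerSecond = width.toLong() * height * frameRate
    return (pixelsPerSecond * bitsPerPixel).roundToInt()
}
```

For example, estimateBitRateFromBitsPerPixel(640, 360, 30, 0.1) returns 691200, and the same factor at 1280x720 gives 2764800. Whether a given factor is "enough" depends on the content and the encoder, so it is worth checking the result visually.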

Upvotes: 1
