FFMPEG: Working of parser of a video decoder

Question

I'm studying how the H.263 video decoder's parser works in the FFmpeg multimedia framework.

What I know:

Every video decoder needs a parser to extract frames from the input stream; once the data for one frame has been collected, it is sent to the decoder.

Every codec's parser needs to define a structure of type AVCodecParser. This structure holds function pointers, including:

.parser_parse -> Points to the function which deals with the parsing functionality

.parser_close -> points to a function that performs buffer deallocation.

Taking the H.263 video decoder as an example, it has the parser function shown below:

static int h263_parse(AVCodecParserContext *s,
                      AVCodecContext *avctx,
                      const uint8_t **poutbuf, int *poutbuf_size,
                      const uint8_t *buf, int buf_size)
{
    ParseContext *pc = s->priv_data;
    int next;

    if (s->flags & PARSER_FLAG_COMPLETE_FRAMES) {
        next = buf_size;
    } else {
        next = ff_h263_find_frame_end(pc, buf, buf_size);

        if (ff_combine_frame(pc, next, &buf, &buf_size) < 0) {
            *poutbuf = NULL;
            *poutbuf_size = 0;
            return buf_size;
        }
    }

    *poutbuf = buf;
    *poutbuf_size = buf_size;
    return next;
}

Could anyone please explain the parameters of the above function?

As I understand it:

poutbuf -> is a pointer that points to parsed frame data.

poutbuf_size -> contains the size of the data.

Are my above assumptions right? Which parameter holds the input buffer data? And what is the above parse function returning? Also, a brief explanation of the above code will be helpful to anyone who is referring to the post. Any information on this would be really helpful.

nmaier · Accepted Answer

First off, a disclaimer: it has been some time since I last looked into this stuff...

Are my above assumptions right?

Yes, poutbuf is a pointer to (start of) the frame data, and poutbuf_size contains the size of the frame data.

Which parameter holds the input buffer data?

buf, with a size of buf_size. Please note that a parser will not necessarily copy the data, so buf and poutbuf might point into the same allocation (e.g. when parsing is skipped). Keep this in mind if you want to modify the output buffer: doing so could also modify the input buffer and have unexpected side effects.

The output buffer will point either into the same allocation as the input buffer, or into the internal buffer the parser context holds. As such, the output buffer becomes invalid either when the input buffer is freed, or when the parser context reallocates its buffer or is destroyed.
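The aliasing described above can be illustrated with a small self-contained sketch. Nothing here is FFmpeg API: toy_passthrough, toy_output_aliases_input and toy_own_frame are hypothetical names, and the pass-through function only mimics the "parsing is skipped" case where the output is the input.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy pass-through "parser" (hypothetical, not FFmpeg code): the output
 * simply aliases the input, no bytes are copied. */
static void toy_passthrough(const uint8_t **poutbuf, int *poutbuf_size,
                            const uint8_t *buf, int buf_size)
{
    *poutbuf = buf;
    *poutbuf_size = buf_size;
}

/* Returns 1 when the output pointer is the very same memory as the input. */
int toy_output_aliases_input(const uint8_t *input, int size)
{
    const uint8_t *out;
    int out_size;

    assert(size >= 0);
    toy_passthrough(&out, &out_size, input, size);
    return out == input && out_size == size;
}

/* Returns a heap copy that the caller owns and must free(), or NULL for an
 * empty frame or on allocation failure. */
uint8_t *toy_own_frame(const uint8_t *frame, int frame_size)
{
    uint8_t *copy;

    if (!frame || frame_size <= 0)
        return NULL;
    copy = malloc((size_t)frame_size);
    if (copy)
        memcpy(copy, frame, (size_t)frame_size);
    return copy;
}
```

If the frame has to outlive the input buffer or the next parser call, or has to be modified, taking a private copy like this is the safe option.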

And what is the above parse function returning?

The function has three outputs:

- The actual return value. This value specifies a relative offset into the input at which the next call should start.
- poutbuf. The actual frame data, if any, or the input buffer if parsing is skipped.
- poutbuf_size. The size of the actual frame data, if any, or the input buffer size if parsing is skipped.

One not-so-obvious thing is that this parser function reuses its own parameters buf and buf_size (ff_combine_frame may overwrite them through the pointers it is given), but these changes do not leave the function, because both are passed by value.

Also, a brief explanation of the above code will be helpful to anyone who is referring to the post.
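A minimal sketch of that pass-by-value point, using made-up names (repoint, toy_parse_size, internal_frame) rather than anything from FFmpeg: the helper overwrites the function's own parameters through pointers, and the caller's variables are not affected.

```c
#include <assert.h>
#include <stdint.h>

/* Stands in for a parser-owned buffer (hypothetical). */
static const uint8_t internal_frame[] = {9, 9, 9, 9, 9};

/* Hypothetical helper: repoints the variables whose addresses it is given,
 * similar in spirit to passing &buf and &buf_size to ff_combine_frame. */
static void repoint(const uint8_t **buf, int *buf_size)
{
    *buf = internal_frame;
    *buf_size = (int)sizeof internal_frame;
}

/* buf and buf_size are copies of the caller's values. */
int toy_parse_size(const uint8_t *buf, int buf_size)
{
    assert(buf_size >= 0);
    repoint(&buf, &buf_size); /* changes only this function's copies */
    return buf_size;          /* now 5, whatever the caller passed in */
}
```

After a call such as toy_parse_size(p, n), the caller's p and n still hold their original values; only the explicit out-parameters (poutbuf and poutbuf_size in the real function) carry results back.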

PARSER_FLAG_COMPLETE_FRAMES these days essentially means "no actual parsing needed" (see libavformat/utils.c on how this is set up). Hence, if the flag is specified, just set output = input and return the full buffer size as the next offset.
If the flag is not set, the parser will try to find the frame end (ff_h263_find_frame_end(pc, buf, buf_size)).
Next it will buffer either the data up to the found end, or the whole data if no end was found (ff_combine_frame).
- If the frame is still incomplete (no frame end found), the function sets the output buffer to NULL and the size to 0, indicating that this call didn't yield a complete frame, and returns buf_size because the whole input was consumed. The data is still buffered, so that the frame can be completed in a subsequent call.
- If the frame cannot be combined, e.g. due to an out-of-memory condition, ff_combine_frame drops the buffered data (and the frame), and the function likewise returns a NULL buffer with buf_size as the next offset.
At this point we either have a complete frame or skipped parsing altogether, so the function sets the output buffer to the frame buffer (and size) and returns the calculated offset (which is either the frame end, or the whole buffer size when not parsed).
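The steps above can be sketched as a self-contained toy. Everything prefixed toy_ or TOY_ is hypothetical: frames here simply end with a 0xFF marker byte, and toy_find_frame_end and toy_combine_frame are simplified stand-ins for the idea, not the real FFmpeg implementations, which are more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_END_NOT_FOUND (-100)
#define TOY_MARKER 0xFF /* toy convention: each frame ends with this byte */

/* Hypothetical stand-in for ParseContext: bytes of an unfinished frame. */
typedef struct ToyParseContext {
    uint8_t *buffer;
    int index; /* number of bytes currently buffered */
} ToyParseContext;

/* Offset just past the first marker, or TOY_END_NOT_FOUND. */
static int toy_find_frame_end(const uint8_t *buf, int buf_size)
{
    for (int i = 0; i < buf_size; i++)
        if (buf[i] == TOY_MARKER)
            return i + 1;
    return TOY_END_NOT_FOUND;
}

/* Returns 0 with *buf / *buf_size describing one whole frame, or -1 when
 * the frame is still incomplete or memory ran out. */
static int toy_combine_frame(ToyParseContext *pc, int next,
                             const uint8_t **buf, int *buf_size)
{
    int take = (next == TOY_END_NOT_FOUND) ? *buf_size : next;
    uint8_t *grown;

    if (next != TOY_END_NOT_FOUND && pc->index == 0) {
        *buf_size = next; /* frame lies entirely in the input: no copy */
        return 0;
    }
    /* +1 keeps the request non-zero even for an empty input */
    grown = realloc(pc->buffer, (size_t)pc->index + (size_t)take + 1);
    if (!grown) {
        pc->index = 0; /* drop the partial frame */
        return -1;
    }
    pc->buffer = grown;
    if (take > 0)
        memcpy(pc->buffer + pc->index, *buf, (size_t)take);
    pc->index += take;
    if (next == TOY_END_NOT_FOUND)
        return -1; /* keep the bytes for a later call */
    *buf = pc->buffer; /* hand out the internal buffer */
    *buf_size = pc->index;
    pc->index = 0;
    return 0;
}

/* Same shape as h263_parse; complete_frames replaces the flag test. */
static int toy_parse(ToyParseContext *pc, int complete_frames,
                     const uint8_t **poutbuf, int *poutbuf_size,
                     const uint8_t *buf, int buf_size)
{
    int next;

    if (complete_frames) {
        next = buf_size;
    } else {
        next = toy_find_frame_end(buf, buf_size);
        if (toy_combine_frame(pc, next, &buf, &buf_size) < 0) {
            *poutbuf = NULL;
            *poutbuf_size = 0;
            return buf_size; /* whole input consumed, no frame yet */
        }
    }
    *poutbuf = buf;
    *poutbuf_size = buf_size;
    return next;
}

/* Caller loop: feeds data in chunks, advances by the return value, and
 * counts the frames and frame bytes that come out. */
int toy_count_frames(const uint8_t *data, int size, int chunk, int *total)
{
    ToyParseContext pc = {0};
    int frames = 0;

    assert(chunk > 0);
    *total = 0;
    for (int pos = 0; pos < size;) {
        int avail = size - pos < chunk ? size - pos : chunk;
        const uint8_t *out;
        int out_size;

        pos += toy_parse(&pc, 0, &out, &out_size, data + pos, avail);
        if (out_size > 0) {
            frames++;
            *total += out_size;
        }
    }
    free(pc.buffer);
    return frames;
}
```

The caller loop shows how the three outputs are meant to be used together: the return value advances the read position, while poutbuf and poutbuf_size are only meaningful when the size is non-zero.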

Those *_parse routines pretty much all look the same and work the same IIRC.
