How to properly close a FFmpeg stream and AVFormatContext without leaking memory?

Question

I have built an app that uses FFmpeg to connect to remote IP cameras in order to receive video and audio frames via RTSP 2.0.

The app is built using Xcode 10-11 and Objective-C with a custom FFmpeg build config.

The architecture is the following:

MyApp


Document_0

    RTSPContainerObject_0
        RTSPObject_0

    RTSPContainerObject_1
        RTSPObject_1

    ...
Document_1
...

GOAL:

After closing Document_0 no FFmpeg objects should be leaked.
The closing process should stop-frame reading and destroy all objects which use FFmpeg.

PROBLEM:

Somehow Xcode's memory debugger shows two instances of MyApp.

FACTS:

macOS'es Activity Monitor doesn't show two instances of MyApp.
macOS'es Activity Monitor doesn't any instances of FFmpeg or other child processes.
The issue is not related to some leftover memory due to a late memory snapshot since it can be reproduced easily.
Xcode's memory debugger shows that the second instance only having RTSPObject's AVFormatContext and no other objects.
1. The second instance has an AVFormatContext and the RTPSObject still has a pointer to the AVFormatContext.

FACTS:

Opening and closing the second document Document_1 leads to the same problem and having two objects leaked. This means that there is a bug that creates scalable problems. More and more memory is used and unavailable.

Here is my termination code:

   - (void)terminate
{
    // * Video and audio frame provisioning termination *
    [self stopVideoStream];
    [self stopAudioStream];
    // *

    // * Video codec termination *
    avcodec_free_context(&_videoCodecContext); // NULL pointer safe.
    self.videoCodecContext = NULL;
    // *

// * Audio codec termination *
avcodec_free_context(&_audioCodecContext); // NULL pointer safe.
self.audioCodecContext = NULL;
// *

if (self.packet)
{
    // Free the packet that was allocated by av_read_frame.
    av_packet_unref(&packet); // The documentation doesn't mention NULL safety.
    self.packet = NULL;
}

if (self.currentAudioPacket)
{
    av_packet_unref(_currentAudioPacket);
    self.currentAudioPacket = NULL;
}

// Free raw frame data.
av_freep(&_rawFrameData); // NULL pointer safe.

// Free the swscaler context swsContext.
self.isFrameConversionContextAllocated = NO;
sws_freeContext(scallingContext); // NULL pointer safe.

[self.audioPacketQueue removeAllObjects];

self.audioPacketQueue = nil;

self.audioPacketQueueLock = nil;
self.packetQueueLock = nil;
self.audioStream = nil;
BXLogInDomain(kLogDomainSources, kLogLevelVerbose, @"%s:%d: All streams have been terminated!", __FUNCTION__, __LINE__);

// * Session context termination *
AVFormatContext *pFormatCtx = self.sessionContext;
BOOL shouldProceedWithInputSessionTermination = self.isInputStreamOpen && self.shouldTerminateStreams && pFormatCtx;
NSLog(@"
Terminating session context...");
if (shouldProceedWithInputSessionTermination)
{
    NSLog(@"
Terminating...");
    //av_write_trailer(pFormatCtx);
    // Discard all internally buffered data.
    avformat_flush(pFormatCtx); // The documentation doesn't mention NULL safety.
    // Close an opened input AVFormatContext and free it and all its contents.
    // WARNING: Closing an non-opened stream will cause avformat_close_input to crash.
    avformat_close_input(&pFormatCtx); // The documentation doesn't mention NULL safety.
    NSLog(@"Logging leftovers - %p, %p  %p", self.sessionContext, _sessionContext, pFormatCtx);
    avformat_free_context(pFormatCtx);

    NSLog(@"Logging content = %c", *self.sessionContext);
    //avformat_free_context(pFormatCtx); - Not needed because avformat_close_input is closing it.
    self.sessionContext = NULL;
}
// *

}

IMPORTANT: The termination sequence is:

    New frame will be read.
-[(RTSPObject)StreamInput currentVideoFrameDurationSec]
-[(RTSPObject)StreamInput frameDuration:]
-[(RTSPObject)StreamInput currentCGImageRef]
-[(RTSPObject)StreamInput convertRawFrameToRGB]
-[(RTSPObject)StreamInput pixelBufferFromImage:]
-[(RTSPObject)StreamInput cleanup]
-[(RTSPObject)StreamInput dealloc]
-[(RTSPObject)StreamInput stopVideoStream]
-[(RTSPObject)StreamInput stopAudioStream]

Terminating session context...
Terminating...
Logging leftovers - 0x109ec6400, 0x109ec6400  0x109ec6400
Logging content = \330
-[Document dealloc]

NOT WORKING SOLUTIONS:

Changing the order of object releases (The AVFormatContext has been freed first but it didn't lead to any change).
Calling RTSPObject's cleanup method much sooner to give FFmpeg more time to handle object releases.
Reading a lot of SO answers and FFmpeg documentation to find a clean cleanup process or newer code which might highlight why the object release doesn't happen properly.

I am currently reading the documentation on AVFormatContext since I believe that I am forgetting to release something. This believe is based on the memory debuggers output that AVFormatContext is still around.

Here is my creation code:

#pragma mark # Helpers - Start

- (NSError *)openInputStreamWithVideoStreamId:(int)videoStreamId
                                audioStreamId:(int)audioStreamId
                                     useFirst:(BOOL)useFirstStreamAvailable
                                       inInit:(BOOL)isInitProcess
{
    // NSLog(@"%s", __PRETTY_FUNCTION__); // RTSP
    self.status = StreamProvisioningStatusStarting;
    AVCodec *decoderCodec;
    NSString *rtspURL = self.streamURL;
    NSString *errorMessage = nil;
    NSError *error = nil;

    self.sessionContext = NULL;
    self.sessionContext = avformat_alloc_context();

    AVFormatContext *pFormatCtx = self.sessionContext;
    if (!pFormatCtx)
    {
        // Create approp error.
        return error;
    }


    // MUST be called before avformat_open_input().
    av_dict_free(&_sessionOptions);

        self.sessionOptions = 0;
        if (self.usesTcp)
        {
            // "rtsp_transport" - Set RTSP transport protocols.
            // Allowed are: udp_multicast, tcp, udp, http.
            av_dict_set(&_sessionOptions, "rtsp_transport", "tcp", 0);
        }
        av_dict_set(&_sessionOptions, "rtsp_transport", "tcp", 0);

    // Open an input stream and read the header with the demuxer options.
    // WARNING: The stream must be closed with avformat_close_input()
    if (avformat_open_input(&pFormatCtx, rtspURL.UTF8String, NULL, &_sessionOptions) != 0)
    {
        // WARNING: Note that a user-supplied AVFormatContext (pFormatCtx) will be freed on failure.
        self.isInputStreamOpen = NO;
        // Create approp error.
        return error;
    }

    self.isInputStreamOpen = YES;

    // user-supplied AVFormatContext pFormatCtx might have been modified.
    self.sessionContext = pFormatCtx;

    // Retrieve stream information.
    if (avformat_find_stream_info(pFormatCtx,NULL) < 0)
    {
        // Create approp error.
        return error;
    }

    // Find the first video stream
    int streamCount = pFormatCtx->nb_streams;

    if (streamCount == 0)
    {
        // Create approp error.
        return error;
    }

    int noStreamsAvailable = pFormatCtx->streams == NULL;

    if (noStreamsAvailable)
    {
        // Create approp error.
        return error;
    }

    // Result. An Index can change, an identifier shouldn't.
    self.selectedVideoStreamId = STREAM_NOT_FOUND;
    self.selectedAudioStreamId = STREAM_NOT_FOUND;

    // Fallback.
    int firstVideoStreamIndex = STREAM_NOT_FOUND;
    int firstAudioStreamIndex = STREAM_NOT_FOUND;

    self.selectedVideoStreamIndex = STREAM_NOT_FOUND;
    self.selectedAudioStreamIndex = STREAM_NOT_FOUND;

    for (int i = 0; i < streamCount; i++)
    {
        // Looking for video streams.
        AVStream *stream = pFormatCtx->streams[i];
        if (!stream) { continue; }
        AVCodecParameters *codecPar = stream->codecpar;
        if (!codecPar) { continue; }

        if (codecPar->codec_type==AVMEDIA_TYPE_VIDEO)
        {
            if (stream->id == videoStreamId)
            {
                self.selectedVideoStreamId = videoStreamId;
                self.selectedVideoStreamIndex = i;
            }

            if (firstVideoStreamIndex == STREAM_NOT_FOUND)
            {
                firstVideoStreamIndex = i;
            }
        }
        // Looking for audio streams.
        if (codecPar->codec_type==AVMEDIA_TYPE_AUDIO)
        {
            if (stream->id == audioStreamId)
            {
                self.selectedAudioStreamId = audioStreamId;
                self.selectedAudioStreamIndex = i;
            }

            if (firstAudioStreamIndex == STREAM_NOT_FOUND)
            {
                firstAudioStreamIndex = i;
            }
        }
    }

    // Use first video and audio stream available (if possible).

    if (self.selectedVideoStreamIndex == STREAM_NOT_FOUND && useFirstStreamAvailable && firstVideoStreamIndex != STREAM_NOT_FOUND)
    {
        self.selectedVideoStreamIndex = firstVideoStreamIndex;
        self.selectedVideoStreamId = pFormatCtx->streams[firstVideoStreamIndex]->id;
    }

    if (self.selectedAudioStreamIndex == STREAM_NOT_FOUND && useFirstStreamAvailable && firstAudioStreamIndex != STREAM_NOT_FOUND)
    {
        self.selectedAudioStreamIndex = firstAudioStreamIndex;
        self.selectedAudioStreamId = pFormatCtx->streams[firstAudioStreamIndex]->id;
    }

    if (self.selectedVideoStreamIndex == STREAM_NOT_FOUND)
    {
        // Create approp error.
        return error;
    }

    // See AVCodecID for codec listing.

    // * Video codec setup:
    // 1. Find the decoder for the video stream with the gived codec id.
    AVStream *stream = pFormatCtx->streams[self.selectedVideoStreamIndex];
    if (!stream)
    {
        // Create approp error.
        return error;
    }
    AVCodecParameters *codecPar = stream->codecpar;
    if (!codecPar)
    {
        // Create approp error.
        return error;
    }

    decoderCodec = avcodec_find_decoder(codecPar->codec_id);
    if (decoderCodec == NULL)
    {
        // Create approp error.
        return error;
    }

    // Get a pointer to the codec context for the video stream.
    // WARNING: The resulting AVCodecContext should be freed with avcodec_free_context().
    // Replaced:
    // self.videoCodecContext = pFormatCtx->streams[self.selectedVideoStreamIndex]->codec;
    // With:
    self.videoCodecContext = avcodec_alloc_context3(decoderCodec);
    avcodec_parameters_to_context(self.videoCodecContext,
                                  codecPar);

    self.videoCodecContext->thread_count = 4;
    NSString *description = [NSString stringWithUTF8String:decoderCodec->long_name];

    // 2. Open codec.
    if (avcodec_open2(self.videoCodecContext, decoderCodec, NULL) < 0)
    {
        // Create approp error.
        return error;
    }

    // * Audio codec setup:
    if (self.selectedAudioStreamIndex > -1)
    {
        [self setupAudioDecoder];
    }

    // Allocate a raw video frame data structure. Contains audio and video data.
    self.rawFrameData = av_frame_alloc();

    self.outputWidth = self.videoCodecContext->width;
    self.outputHeight = self.videoCodecContext->height;

    if (!isInitProcess)
    {
        // Triggering notifications in init process won't change UI since the object is created locally. All
        // objects which need data access to this object will not be able to get it. Thats why we don't notifiy anyone about the changes.
        [NSNotificationCenter.defaultCenter postNotificationName:NSNotification.rtspVideoStreamSelectionChanged
                                                          object:nil userInfo: self.selectedVideoStream];

        [NSNotificationCenter.defaultCenter postNotificationName:NSNotification.rtspAudioStreamSelectionChanged
                                                          object:nil userInfo: self.selectedAudioStream];
    }

    return nil;
}

UPDATE 1

The initial architecture allowed using any given thread. Most of the below code would mostly run on the main thread. This solution was not appropriate since the opening of the stream input can take several seconds for which the main thread is blocked while waiting for a network response inside FFmpeg. To solve this issue I have implemented the following solution:

Creation and the initial setup are only allowed on the background_thread (see code snippet "1" below).
Changes are allowed on the current_thread(Any).
Termination is allowed on the current_thread(Any).

After removing main thread checks and dispatch_asyncs to background threads, leaking has stopped and I can't reproduce the issue anymore:

// Code that produces the issue.   
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
    // 1 - Create and do initial setup. 
    // This block creates the issue. 
[self.rtspObject = [[RTSPObject alloc] initWithURL: ... ];
[self.rtspObject openInputStreamWithVideoStreamId: ...
                                audioStreamId: ...
                                     useFirst: ...
                                       inInit: ...];
});

I still don't understand why Xcode's memory debugger says that this block is retained?

Any advice or idea is welcome.

How to properly close a FFmpeg stream and AVFormatContext without leaking memory?

Answers (1)

Related Questions