Scott Ward

Reputation: 31

Decoding elementary h264 stream from gstreamer in iOS8

I am attempting to use the AVSampleBufferDisplayLayer to render an elementary h264 stream coming over a UDP connection. For a source, I am using this gstreamer command:

gst-launch-1.0 -v videotestsrc is-live=true pattern=ball ! video/x-raw,width=120,height=90,framerate=15/1 ! x264enc tune=zerolatency ! h264parse ! video/x-h264,stream-format=byte-stream ! rtph264pay mtu=100000000 ! udpsink host=127.0.0.1 port=1234

After getting the SPS and PPS parameters, the client reads the stream, packages the payload into a CMBlockBuffer (with the correct NAL header), packages the CMBlockBuffer into a CMSampleBuffer, and then passes the CMSampleBuffer to the AVSampleBufferDisplayLayer. This post on using the AVSampleBufferDisplayLayer provided a great example of how to do this, and my prototype implementation is very similar. For this base case, it works great, and my very small (120x90) video renders perfectly.
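For reference, the format description is built from the captured SPS and PPS roughly like this (spsData and ppsData are placeholder names for wherever the raw parameter sets, without start codes, are stored; error handling is trimmed):

const uint8_t *parameterSets[2] = { spsData.bytes, ppsData.bytes };
const size_t parameterSetSizes[2] = { spsData.length, ppsData.length };

CMVideoFormatDescriptionRef formatDescription = NULL;
OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(
    kCFAllocatorDefault,
    2,                 // SPS + PPS
    parameterSets,
    parameterSetSizes,
    4,                 // 4-byte AVCC length prefix on each NAL unit
    &formatDescription);
if (status != noErr) {
    NSLog(@"CMVideoFormatDescriptionCreateFromH264ParameterSets failed: %d", (int)status);
}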

However, when I try to increase the resolution of the source video, everything falls apart. At a 400x300 resolution video, I start getting 4 NAL units (each in a separate UDP packet) in between each access unit delimiter packet. I expect that each of those NAL units now represents a slice of the final frame, rather than the entire frame. So, I am collecting all of the slices for a frame and bundling them into a single CMSampleBuffer to pass to the AVSampleBufferDisplayLayer.

When the slices are bundled, the AVSampleBufferDisplayLayer renders nothing. When the slices are passed individually to the layer as separate CMSampleBuffers, then the AVSampleBufferDisplayLayer shows mostly green with a flickering black strip at the top.

Here’s the code that is bundling the packet data before sending it into the AVSampleBufferDisplayLayer:

-(void)udpStreamSource:(UdpSource*)source didReceiveCodedSlicePacket:(NSData*)packet withPayloadOffset:(NSInteger)payloadOffset withPayloadLength:(NSInteger)payloadLength {
    const size_t nalLength = payloadLength+4;
    uint8_t *videoData = malloc(sizeof(uint8_t)*nalLength);

    // first byte of payload is NAL header byte
    [packet getBytes:videoData+4 range:NSMakeRange(payloadOffset, payloadLength)];
    // prepend payloadLength to NAL
    videoData[0] = (uint8_t)(payloadLength >> 24);
    videoData[1] = (uint8_t)(payloadLength >> 16);
    videoData[2] = (uint8_t)(payloadLength >> 8);
    videoData[3] = (uint8_t) payloadLength;

    sliceSizes[sliceCount] = nalLength;
    sliceList[sliceCount++] = videoData;
}

-(void)udpStreamSource:(UdpSource*)source didReceiveAccessUnitDelimiter:(NSData*)packet {
    if (sliceCount <= 0) {
        return;
    }
    CMBlockBufferRef videoBlock = NULL;
    OSStatus status = CMBlockBufferCreateWithMemoryBlock(NULL, sliceList[0], sliceSizes[0], NULL, NULL, 0, sliceSizes[0], 0, &videoBlock);
    if (status != noErr) {
        NSLog(@"CMBlockBufferCreateWithMemoryBlock failed with error: %d", status);
    }

    for (int i=1; i < sliceCount; i++) {
        status = CMBlockBufferAppendMemoryBlock(videoBlock, sliceList[i], sliceSizes[i], NULL, NULL, 0, sliceSizes[i], 0);
        if (status != noErr) {
            NSLog(@"CMBlockBufferAppendMemoryBlock failed with error: %d", status);
        }
    }

    CMSampleBufferRef sampleBuffer = NULL;
    status = CMSampleBufferCreate(NULL, videoBlock, TRUE, NULL, NULL, _formatDescription, sliceCount, 0, NULL, sliceCount, sliceSizes, &sampleBuffer);
    if (status != noErr) {
        NSLog(@"CMSampleBufferCreate failed with error: %d", status);
    }

    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, YES);
    CFMutableDictionaryRef dict = (CFMutableDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
    CFDictionarySetValue(dict, kCMSampleAttachmentKey_DisplayImmediately, kCFBooleanTrue);

    sliceCount = 0;
    // videoLayer is an AVSampleBufferDisplayLayer
    [videoLayer enqueueSampleBuffer:sampleBuffer];
    [videoLayer setNeedsDisplay];
}

Any help or thoughts would be greatly appreciated.

Note that when I’m working with a low-resolution video source, the implementation works fine. It only runs into problems when I increase the resolution of the video feed.

I have also attempted using VTDecompressionSession to decode the data. In that case, the decoder gave me back frames with no errors, but I haven’t been able to figure out how to render them onto the screen.
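For context, a minimal sketch of the direction I have been exploring there, assuming the decoded frames come back as pixel buffers in the output callback (imageView is a hypothetical UIImageView passed in as the refCon, not something from the code above):

#import <UIKit/UIKit.h>
#import <CoreImage/CoreImage.h>
#import <VideoToolbox/VideoToolbox.h>

// VTDecompressionSession output callback: wrap the decoded pixel buffer in a
// CIImage and push it to a UIImageView on the main thread. Inefficient, but
// enough to verify that frames are decoding.
static void decompressionOutputCallback(void *decompressionOutputRefCon,
                                        void *sourceFrameRefCon,
                                        OSStatus status,
                                        VTDecodeInfoFlags infoFlags,
                                        CVImageBufferRef imageBuffer,
                                        CMTime presentationTimeStamp,
                                        CMTime presentationDuration)
{
    if (status != noErr || imageBuffer == NULL) {
        return;
    }
    CIImage *ciImage = [CIImage imageWithCVPixelBuffer:imageBuffer];
    UIImage *frame = [UIImage imageWithCIImage:ciImage];
    dispatch_async(dispatch_get_main_queue(), ^{
        // imageView was passed in as decompressionOutputRefCon when the session was created
        UIImageView *imageView = (__bridge UIImageView *)decompressionOutputRefCon;
        imageView.image = frame;
    });
}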

Upvotes: 2

Views: 1888

Answers (1)

Scott Ward

Reputation: 31

It turns out that the problem was with the "mtu=100000000" part of the gst-launch-1.0 pipeline.

GStreamer would attempt to send out packets larger than the network could handle. Even though the theoretical packet size limit over UDP is ~65k, a network (or even the OS) may impose its own "Maximum Transmission Unit" limit. I determined that the network I was using had an MTU of 1492.

So, in order for the data to transfer correctly (and for all of the packets to be received), I had to change the MTU to a smaller size. I chose 1400. On the client side, this means that the NAL units will be fragmented across several RTP packets using the FU-A fragmentation unit structure. So, I had to aggregate those packets back into their original, full-size NAL units before sending them to the decoder.
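For reference, the server-side pipeline stays the same apart from the payloader's MTU, so it ends up looking something like this (with the 400x300 resolution from the question):

gst-launch-1.0 -v videotestsrc is-live=true pattern=ball ! video/x-raw,width=400,height=300,framerate=15/1 ! x264enc tune=zerolatency ! h264parse ! video/x-h264,stream-format=byte-stream ! rtph264pay mtu=1400 ! udpsink host=127.0.0.1 port=1234

On the client side, the reassembly boils down to checking the NAL type in each RTP payload and, for FU-A fragments (type 28), rebuilding the original NAL header and appending the fragment data until the end bit is set. A rough sketch (the names fragmentBuffer and handleCompleteNal: are just illustrative, not from my actual code):

- (void)appendRtpPayload:(NSData *)payload {
    const uint8_t *bytes = payload.bytes;
    uint8_t nalType = bytes[0] & 0x1F;

    if (nalType != 28) {
        // Not fragmented: the payload already contains a complete NAL unit.
        [self handleCompleteNal:payload];
        return;
    }

    uint8_t fuIndicator = bytes[0];
    uint8_t fuHeader    = bytes[1];
    BOOL isStart = (fuHeader & 0x80) != 0;
    BOOL isEnd   = (fuHeader & 0x40) != 0;

    if (isStart) {
        // Reconstruct the original NAL header from the FU indicator's F/NRI bits
        // and the FU header's NAL type, then start a new buffer for this NAL.
        uint8_t nalHeader = (fuIndicator & 0xE0) | (fuHeader & 0x1F);
        self.fragmentBuffer = [NSMutableData dataWithBytes:&nalHeader length:1];
    }

    // Everything after the 2-byte FU indicator/header is slice data.
    [self.fragmentBuffer appendBytes:bytes + 2 length:payload.length - 2];

    if (isEnd) {
        [self handleCompleteNal:self.fragmentBuffer];
        self.fragmentBuffer = nil;
    }
}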

Upvotes: 1
