Sean Connor
Sean Connor

Reputation: 31

How to pack Android MediaCodec encoded H264 into RTP packets

How can I properly pack a H264 byte stream into RTP packets so I can receive frames with FFMPEG?

When I start the FFMPEG receiver, it pumps out a lot of errors like these:

Invalid UE golomb code
[h264 @ 0xd63060] pps_id 3199971767 out of range
[h264 @ 0xd63060] slice type 32 too large at -1
[h264 @ 0xd63060] decode_slice_header error
[h264 @ 0xd63060] non-existing PPS 0 referenced
[h264 @ 0xd63060] decode_slice_header error
[h264 @ 0xd63060] no frame!
[h264 @ 0xd63060] decode_slice_header error
[h264 @ 0xd63060] Unknown NAL code: 0 (0 bits)
[h264 @ 0xd63060] no frame!
[h264 @ 0xd63060] non-existing PPS 0 referenced

Here is the SDP file I use:

c=IN IP4 192.168.2.30
t=0 0
m=video 51372 RTP/AVP 96
a=rtpmap:96 H264/90000
a=recv only

The pps_id error is curious, its as if its looking for the next PPS, but can't find it, although I tried embedding the PPS into each NALU.

I've been reading RFC 6184 and trying to understand it. But I feel I still don't quite understand how H264 and RTP interact. Currently I'm trying to encode pixels from a camera and stream 1920x1080 H264 encoded frames through RTP across the network where it is then received by FFMPEG and decoded. I'm assembling the RTP and FU-A headers in Java and fragmenting the NALU when they are to large for the MTU.

I've been watching the stream closely in Wireshark, here is the output of my first packet:

Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
1... .... = Marker: True
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 0
Timestamp: 2727179012
Synchronization Source identifier: 0x00000000 (0)
H.264
NAL unit header or first byte of the payload
    0... .... = F bit: No bit errors or other syntax violations
    .00. .... = Nal_ref_idc (NRI): 0
    ...0 0000 = Type: Undefined (0)
H264 NAL Unit Payload

I don't understand why the first payload has the the NALU type of 0. Nevertheless, here is my second packet:

Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
0... .... = Marker: False
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 1
Timestamp: 2727179019
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
    0... .... = F bit: No bit errors or other syntax violations
    .11. .... = Nal_ref_idc (NRI): 3
    ...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
    1... .... = Start bit: the first packet of FU-A picture
    .0.. .... = End bit: Not the last packet of FU-A picture
    ..0. .... = Forbidden bit: 0
    ...0 0101 = Nal_unit_type: Coded slice of an IDR picture (5)
H264 NAL Unit Payload
    0000 0000  0000 0000  0000 0000  0000 0001  0110 0101  1011 1000  0000 0100  0000 010. = first_mb_in_slice: 3000762881
    .... ...1 = slice_type: P (P slice) (0)
    0011 1... = pic_parameter_set_id: 6

So I think the last packet was a I-Frame? Here is a fragment between the start and end fragments:

Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
0... .... = Marker: False
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 1
Timestamp: 2727179019
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
    0... .... = F bit: No bit errors or other syntax violations
    .11. .... = Nal_ref_idc (NRI): 3
    ...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
    0... .... = Start bit: Not the first packet of FU-A picture
    .0.. .... = End bit: Not the last packet of FU-A picture
    ..0. .... = Forbidden bit: 0
    ...0 0101 = Nal_unit_type: Coded slice of an IDR picture (5)

And of course here is the last packet of the supposed I-Frame:

Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
1... .... = Marker: True
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 1
Timestamp: 2727179019
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
    0... .... = F bit: No bit errors or other syntax violations
    .11. .... = Nal_ref_idc (NRI): 3
    ...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
    0... .... = Start bit: Not the first packet of FU-A picture
    .1.. .... = End bit: the last packet of FU-A picture
    ..0. .... = Forbidden bit: 0
    ...0 0101 = Nal_unit_type: Coded slice of an IDR picture (5)

Now here is my packet for the next bytes the encoder gave me:

Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
0... .... = Marker: False
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 2
Timestamp: 2727179089
Synchronization Source identifier: 0x00000000 (0)
H.264
FU identifier
    0... .... = F bit: No bit errors or other syntax violations
    .11. .... = Nal_ref_idc (NRI): 3
    ...1 1100 = Type: Fragmentation unit A (FU-A) (28)
FU Header
    1... .... = Start bit: the first packet of FU-A picture
    .0.. .... = End bit: Not the last packet of FU-A picture
    ..0. .... = Forbidden bit: 0
    ...0 0001 = Nal_unit_type: Coded slice of a non-IDR picture (1)
H264 NAL Unit Payload
    0000 0000  0000 0000  0000 0000  0000 0001  0110 0001  1110 0000  0010 0000  0001 100. = first_mb_in_slice: 2968522763
    .... ...0  0111 .... = slice_type: B (B slice) (6)
    .... 0001  110. .... = pic_parameter_set_id: 13

This part confuses me, when the camera is stationary, the encoder gives me smaller and smaller NALU with undefined types, and I'm not entirely sure why, anyways, the packet below gets sent as one whole NALU to FFMPEG.

Real-Time Transport Protocol
10.. .... = Version: RFC 1889 Version (2)
..0. .... = Padding: False
...0 .... = Extension: False
.... 0000 = Contributing source identifiers count: 0
1... .... = Marker: True
Payload type: DynamicRTP-Type-96 (96)
Sequence number: 36
Timestamp: 2727180258
Synchronization Source identifier: 0x00000000 (0)
H.264
NAL unit header or first byte of the payload
    0... .... = F bit: No bit errors or other syntax violations
    .00. .... = Nal_ref_idc (NRI): 0
    ...0 0000 = Type: Undefined (0)
H264 NAL Unit Payload

I'm using Android MediaCodec encoder, and here is some code where I configure the encoder:

mediaCodec = MediaCodec.createByCodecName("OMX.Nvidia.h264.encoder");
mediaFormat = MediaFormat.createVideoFormat("video/avc", 1920, 1080);
mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, 125000);
mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 0);
mediaFormat.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 1920 * 1080);

Is the encoder giving me whole access units or only NALU?

Here is my logic:

I feel like I'm close, but its clearly not working for any RTP receivers. I appreciate any thoughts or ideas on the matter.

Thanks,

Upvotes: 2

Views: 2128

Answers (1)

Sean Connor
Sean Connor

Reputation: 31

I finally managed to work it out, my packets were not configured properly.

  • I must iterate the sequence number per packet.
  • I must set the time stamp per NALU instead of per packet.
  • I must strip the NALU prefix of 00 00 01 ** sending bytes after index 4.
  • The bitwise operations in my headers were incorrect.

I can even start FFmpeg in the middle of the stream and it works!

Upvotes: 1

Related Questions