Reputation: 327
I receive H.264 RTP packets from an RTSP stream and want to detect whether a frame is an I-frame or not.
Below is the first packet I received right after opening the stream, so I believe it is an I-frame. Here are the first 160 bytes:
packet:
00 00 00 01 67 4D 00 1F : 95 A8 14 01 6E 40 00 00
00 01 68 EE 3C 80 00 00 : 00 01 06 E5 01 33 80 00
00 00 01 65 B8 00 00 08 : 52 90 9F F6 BE D6 C6 9C
3D F6 D4 2F 49 FB F7 13 : F2 A9 C7 27 2D A4 75 59
6C DB FF 35 27 A4 C7 B6 : E7 69 A2 E0 FB 0E FF 2D
0E E0 6F 25 43 78 BF B9 : 69 22 1B 24 E3 CA 60 56
44 16 6C 15 44 DA 55 29 : C2 39 24 86 CE D6 75 BB
E0 0C F4 F4 EC C5 76 E4 : 7B 59 B9 40 2D B3 ED 19
E4 1D 94 B7 54 9B B3 D0 : 8F 24 58 CD 3C F3 FA E0
D4 7D 88 70 0E 49 79 12 : B2 14 92 BA B6 9C 3A F7
8D 13 78 6B 4C CD C0 CC : C8 39 6A AC BE 3D AA 00
9A DB D2 68 70 5F C4 20 : B7 5C FC 45 93 DB 00 12
9F 87 5A 66 2C B2 B8 E7 : 63 C4 87 0B A4 AA 2E 6D
AB 42 3F 02 C2 A6 F9 41 : E5 FE 80 64 49 14 38 3D
52 4B F6 B2 E7 53 DD 3E : F6 BB A8 EB 13 23 BB 71
B1 C9 90 06 92 3E 5F 15 : F2 C0 39 43 EA 24 5A 86
AE 11 27 D4 C5 4B 5C CD : 6C 90 2B 44 80 18 76 95
6E 16 DF 5D 86 49 25 5A : B6 66 23 E6 40 D4 25 6B
CE A2 4C EE 13 DD 7B 88 : FF A0 64 EC 33 44 B1 DC
B7 0B 89 5B 8F 85 68 3C : 65 3E 55 0F 41 4B 32 C9
C8 56 78 1A 15 14 8C C7 : F5 17 40 D4 EC BC 5B 62
8A 24 66 6A C3 7E 3B DB : 44 A8 EC D8 EE 37 E0 DE
.. .. .. .. .. .. .. .. : .. .. .. .. .. .. .. ..
Then I used the below piece of code to determine the frame:
public static bool isH264iFrame(byte[] paket)
{
    int RTPHeaderBytes = 0;
    int fragment_type = paket[RTPHeaderBytes + 0] & 0x1F;
    int nal_type = paket[RTPHeaderBytes + 1] & 0x1F;
    int start_bit = paket[RTPHeaderBytes + 1] & 0x80;
    if (((fragment_type == 28 || fragment_type == 29) && nal_type == 5 && start_bit == 128) || fragment_type == 5)
    {
        return true;
    }
    return false;
}
My problem is that I do not know the correct value of RTPHeaderBytes. In this case my packets always start with "00 00 00 01".
Upvotes: 6
Views: 9524
Reputation: 20725
Actually, this looks wrong:
int fragment_type = paket[RTPHeaderBytes + 0] & 0x1F;
int nal_type = paket[RTPHeaderBytes + 1] & 0x1F;
int start_bit = paket[RTPHeaderBytes + 1] & 0x80;
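In data that starts with 00 00 00 01, as yours does, the NAL header byte comes first, right after the marker, and the bytes after it are payload. Bit 7 of the NAL header byte is always 0, bits 6-5 hold nal_ref_idc and the low five bits hold the NAL type.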
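To make the layout of that header byte concrete, here is a minimal sketch in C++. The names NalHeader and parse_nal_header are made up for illustration; only the bit positions come from the H.264 NAL header layout.

```cpp
#include <cstdint>

// Fields of the one-byte H.264 NAL unit header.
struct NalHeader
{
    bool forbidden_zero_bit; // bit 7, must be 0 in a valid stream
    int  nal_ref_idc;        // bits 6-5, 0 means "not used for reference"
    int  nal_unit_type;      // bits 4-0, e.g. 5 = IDR slice, 7 = SPS
};

// Hypothetical helper: split the byte that follows a start code.
inline NalHeader parse_nal_header(std::uint8_t byte)
{
    NalHeader h;
    h.forbidden_zero_bit = (byte & 0x80) != 0;
    h.nal_ref_idc        = (byte >> 5) & 0x03;
    h.nal_unit_type      = byte & 0x1F;
    return h;
}
```

For example, 0x67 splits into nal_ref_idc 3 and type 7, and 0x65 into nal_ref_idc 3 and type 5.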
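You can simply search for two or three zero bytes followed by a 1; that is the marker (start code) for a NAL, and the NAL follows immediately after it. Both forms mark the same thing: the longer one just carries an extra leading zero byte, which the byte stream format (Annex B of the spec) requires before an SPS, a PPS and the first NAL of each access unit.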
So in your example, you have the following NALs:
00 00 00 01 67 4D 00 1F : 95 A8 14 01 6E 40 00 00
^^ ^^ ^^ ^^ ^^                              ^^ ^^
00 01 68 EE 3C 80 00 00 : 00 01 06 E5 01 33 80 00
^^ ^^ ^^          ^^ ^^   ^^ ^^ ^^             ^^
00 00 01 65 B8 00 00 08 : 52 90 9F F6 BE D6 C6 9C
^^ ^^ ^^ ^^
3D F6 D4 2F 49 FB F7 13 : F2 A9 C7 27 2D A4 75 59
.. .. .. .. .. .. .. .. : .. .. .. .. .. .. .. ..
So the NAL header bytes are 0x67, 0x68, 0x06 and 0x65. Computing type = (byte & 0x1F) and looking the results up in the table linked by szatmary gives:
7  Sequence parameter set                        non-VCL
8  Picture parameter set                         non-VCL
6  Supplemental enhancement information (SEI)    non-VCL
5  Coded slice of an IDR picture                 VCL
The 5 means you have an IDR picture, which is an I-Frame.
Looking at one of my files, the next group of NALs uses 0x41 or 0x01, which is a coded slice of a non-IDR picture (usually a P-Frame or B-Frame; the NAL type alone does not say which). Once in a while, I see a 5 instead of the 1 (i.e. an I-Frame). By default, x264 inserts a keyframe at least every 250 frames (the keyint option). You can change that parameter.
So to detect whether a set of NALs represents an I-Frame or another frame, your code needs to search through the NALs within the frame until it finds a 1 (non-IDR slice, reported as B_FRAME below) or a 5 (IDR slice, reported as I_FRAME).
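// "in" is a std::ifstream; B_FRAME here means "non-IDR slice" (P or B)
in.open("source-file.h264", std::ios::binary);
unsigned char window[3] = { 0xFF, 0xFF, 0xFF }; // last three bytes read
char c;
while(in.get(c))
{
    // slide the window forward by one byte
    window[0] = window[1];
    window[1] = window[2];
    window[2] = static_cast<unsigned char>(c);
    if(window[0] != 0
    || window[1] != 0
    || window[2] != 1)
    {
        continue;
    }
    // found 00 00 01; the long marker 00 00 00 01 ends with the
    // same three bytes, so this single test covers both forms
    if(!in.get(c))
    {
        break;
    }
    int const type = c & 0x1F;
    if(type == 1)
    {
        return B_FRAME;
    }
    if(type == 5)
    {
        return I_FRAME;
    }
    // any other NAL (SPS, PPS, SEI...): keep searching
}
return NOT_FOUND;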
WARNING: This is C++ code, and it reads one byte at a time, so it is going to be slow unless your "in" file has a good buffer behind it. If you already have the data in a buffer, you should replace the "in" file with a pointer or an index within your buffer, and that will definitely be very fast.
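As a rough sketch of that buffer-based variant, the function below walks a byte array by index and collects the type of every NAL it finds. The names list_nal_types and contains_idr_slice are hypothetical, and the code assumes the buffer holds start-code-delimited data like the dump in the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: return the NAL type of every NAL unit found in a
// buffer that uses 00 00 01 or 00 00 00 01 markers.
inline std::vector<int> list_nal_types(const std::uint8_t* data, std::size_t size)
{
    std::vector<int> types;
    // i + 3 < size guarantees that the header byte after the marker exists
    for (std::size_t i = 0; i + 3 < size; ++i)
    {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1)
        {
            types.push_back(data[i + 3] & 0x1F);
            i += 2; // skip the marker; the loop increment lands on the header
        }
    }
    return types;
}

// True as soon as any coded slice of an IDR picture (type 5) is present.
inline bool contains_idr_slice(const std::uint8_t* data, std::size_t size)
{
    for (int type : list_nal_types(data, size))
    {
        if (type == 5)
        {
            return true;
        }
    }
    return false;
}
```

Run over the first 40 bytes of the packet in the question, list_nal_types yields 7, 8, 6, 5, so contains_idr_slice reports true. Testing only for 00 00 01 is enough because the four-byte marker ends with the same three bytes.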
Note: The H.264 format makes sure a marker can never appear inside a NAL. Whenever the data would contain a 0x00 0x00 0x00 or a 0x00 0x00 0x01 sequence, the encoder inserts a 0x03 after the two zeroes, so those sequences look like this instead: 0x00 0x00 0x03 0x00 and 0x00 0x00 0x03 0x01. You can try to compress pure black frames, and you'll see many of those 0x03 appearing in the NAL picture data.
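If you need the original payload bytes back (for instance before reading fields bit by bit), those inserted 0x03 bytes have to be dropped again. A minimal sketch, with a made-up function name and assuming the input is the content of a single NAL without its marker:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: copy a NAL while dropping every 0x03 that directly
// follows two zero bytes (the inserted escape byte described above).
inline std::vector<std::uint8_t> strip_emulation_prevention(
        const std::uint8_t* nal, std::size_t size)
{
    std::vector<std::uint8_t> out;
    out.reserve(size);
    int zeros = 0; // consecutive 0x00 bytes seen so far in the input
    for (std::size_t i = 0; i < size; ++i)
    {
        if (zeros >= 2 && nal[i] == 0x03)
        {
            zeros = 0; // drop the escape byte and restart the count
            continue;
        }
        zeros = (nal[i] == 0x00) ? zeros + 1 : 0;
        out.push_back(nal[i]);
    }
    return out;
}
```

With this, 0x00 0x00 0x03 0x01 turns back into 0x00 0x00 0x01, while a lone 0x03 that does not follow two zeroes is left untouched.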
Upvotes: 1
Reputation: 31101
You will have to parse the payload. See the SO answer "Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream". For IDR, all VCL NALUs will be type 5. As for B/P, you will need to parse out the Exp-Golomb encoded data to find the slice type.
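Here is a rough C++ sketch of that Exp-Golomb step. It relies on the slice header starting with two unsigned Exp-Golomb values, first_mb_in_slice and then slice_type, where slice_type modulo 5 gives 0 = P, 1 = B, 2 = I, 3 = SP, 4 = SI. BitReader, SliceKind and slice_kind are names invented for this example, and the code assumes no escape (0x03) byte falls within the first few bytes it reads.

```cpp
#include <cstddef>
#include <cstdint>

// Minimal MSB-first bit reader over a byte buffer (hypothetical helper).
class BitReader
{
public:
    BitReader(const std::uint8_t* data, std::size_t size)
        : data_(data), size_(size), pos_(0) {}

    // Returns the next bit, or -1 when the buffer is exhausted.
    int read_bit()
    {
        if (pos_ >= size_ * 8) return -1;
        int bit = (data_[pos_ / 8] >> (7 - pos_ % 8)) & 1;
        ++pos_;
        return bit;
    }

    // Decodes one unsigned Exp-Golomb value; returns -1 on error.
    std::int64_t read_ue()
    {
        int leading_zeros = 0;
        for (;;)
        {
            int bit = read_bit();
            if (bit < 0) return -1;
            if (bit == 1) break;
            if (++leading_zeros > 31) return -1; // not a sane value
        }
        std::int64_t suffix = 0;
        for (int i = 0; i < leading_zeros; ++i)
        {
            int bit = read_bit();
            if (bit < 0) return -1;
            suffix = (suffix << 1) | bit;
        }
        return (std::int64_t(1) << leading_zeros) - 1 + suffix;
    }

private:
    const std::uint8_t* data_;
    std::size_t size_;
    std::size_t pos_; // position in bits
};

enum class SliceKind { P, B, I, SP, SI, Unknown };

// payload points just past the one-byte NAL header of a type 1 or type 5 NAL.
inline SliceKind slice_kind(const std::uint8_t* payload, std::size_t size)
{
    BitReader reader(payload, size);
    if (reader.read_ue() < 0) return SliceKind::Unknown; // first_mb_in_slice
    std::int64_t slice_type = reader.read_ue();
    if (slice_type < 0 || slice_type > 9) return SliceKind::Unknown;
    switch (slice_type % 5)
    {
    case 0:  return SliceKind::P;
    case 1:  return SliceKind::B;
    case 2:  return SliceKind::I;
    case 3:  return SliceKind::SP;
    default: return SliceKind::SI;
    }
}
```

For the packet in the question, the bytes after 0x65 start with 0xB8, which decodes to first_mb_in_slice 0 and slice_type 2, i.e. an I slice.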
Upvotes: 4