Reputation: 508
I am trying to parse MPEG-4 frames from an RTP stream coming from an Axis camera and feed the packets to the ffmpeg library using the avcodec_decode_video function. Here are the steps I am doing:
1. The RTSP stream is initialized.
2. The RTP stream starts flowing in.
3. The first packet I get starts with 000001b0..., followed by the configuration data, and after that a frame starts with 000001b6... The following RTP payloads are different, until I get an RTP packet where the marker bit is set. After that I again get a packet starting with 000001b6, followed by around 5-10 more RTP packets. This pattern repeats.
What I am doing: if I detect 000001b0/b6, I accumulate all the packets coming after that and feed the bigger buffer to the avcodec_decode_video function of libavcodec, after initializing the decoder context properly.
But I am getting a corrupted picture: the topmost portion, a horizontal bar, is crystal clear and the rest is garbage. I am not sure why it is behaving like this. Please help me.
The payload type I am getting in the RTP packets is dynamic-96.
Point to note: when I pass I-frames and P-frames that are wrapped in the proprietary protocol of another manufacturer, ffmpeg is able to parse them and gives very good pictures.
Any help is appreciated
Upvotes: 3
Views: 6823
Reputation: 11343
Try to fiddle with the MPEG4 stream settings on your AXIS IP camera. Pay attention to the Video & Image/Advanced section, where you should set this:
Also, try changing the "Priority" or "Optimize video stream for" setting (the options should be frame rate, image quality, bandwidth, and none).
If none of this works, then read more...
I hope that you understand how the MPEG4 stream is transmitted over RTP. In short (if you are not sure how):
"Configuration frame" (Visual Object Sequence Start, VOS) starts with the start code 000001B0 (hex). It contains the data the decoder needs to decode the video. You need to send it to the decoder only the first time you are trying to decode a stream, and it is used to decode all VOPs that come after it. Note that AXIS also sends this data in the SDP (the response to DESCRIBE in RTSP), for example:
a=fmtp:96 profile-level-id=245; config=000001B0F5000001B5891300000100000001200086C40FA28A021E0A21
So if the stream never changes and you are getting this in the SDP, you don't need to pass the VOS to the decoder... but if you do, there is no harm.
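A minimal sketch of pulling the configuration bytes out of that SDP line, in case it helps to compare them with what arrives over RTP. The function name `parse_fmtp_config` is made up for this illustration, and the regex assumes the `config=` value is a plain hex string as in the example above.

```python
import re

def parse_fmtp_config(fmtp_line: str) -> bytes:
    """Return the raw bytes of the config= parameter of an SDP a=fmtp line."""
    match = re.search(r"config=([0-9A-Fa-f]+)", fmtp_line)
    if match is None:
        raise ValueError("no config= parameter in fmtp line")
    # bytes.fromhex raises ValueError if the hex string has an odd length
    return bytes.fromhex(match.group(1))

fmtp = ("a=fmtp:96 profile-level-id=245; "
        "config=000001B0F5000001B5891300000100000001200086C40FA28A021E0A21")
config = parse_fmtp_config(fmtp)

# The blob starts with the VOS start code 000001B0, and the byte right
# after it (0xF5 = 245) matches profile-level-id in the same line.
print(config[:4].hex(), config[4])
```

Note that the byte after the start code is the same value as `profile-level-id`, which is a quick sanity check that the right field was parsed.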
Video Object Plane (I-VOP, P-VOP, B-VOP) starts with the start code 000001B6. If you set the GOV length to 10 and the structure of the stream to "IP", you will get 1 I-frame (I-VOP) and 9 P-VOPs, but all of them will have the 000001B6 start code. The trick to differentiate between them is to check the next two BITS after the start code, that is, the two most significant bits of the FIFTH byte. Check the table to determine the type of VOP you are getting:
VOP_CODING_TYPE (binary)    Coding method
00                          intra-coded (I)
01                          predictive-coded (P)
10                          bidirectionally-predictive-coded (B)
11                          sprite (S)
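The table lookup can be sketched in a few lines. The names `VOP_START_CODE`, `VOP_TYPES` and `vop_coding_type` are invented for this example; the only assumption is that the buffer begins exactly at the 000001B6 start code.

```python
VOP_START_CODE = b"\x00\x00\x01\xb6"
VOP_TYPES = {0b00: "I", 0b01: "P", 0b10: "B", 0b11: "S"}

def vop_coding_type(frame: bytes):
    """Return 'I', 'P', 'B' or 'S' for a buffer that starts with a VOP
    start code, or None if it does not start with one."""
    if len(frame) < 5 or not frame.startswith(VOP_START_CODE):
        return None
    # vop_coding_type is the top two bits of the byte after the start code
    return VOP_TYPES[frame[4] >> 6]

print(vop_coding_type(bytes.fromhex("000001b610")))  # top bits 00
print(vop_coding_type(bytes.fromhex("000001b650")))  # top bits 01
```

A packet that starts with 000001B0 (the configuration) returns None here, so it has to be handled separately before looking for the first I-VOP.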
Now, to decode video you must send the VOS to the decoder, immediately followed by an I-VOP. But first, your way of extracting these frames from the RTP stream is awkward... If an I-VOP is 10000B in size and your network MTU is 1400B, you can't send it as it is without causing network congestion. So the AXIS camera splits I-VOPs and all other BIG frames into FRAGMENTS, which it sends over RTP as RTP packets whose size doesn't exceed the MTU. Main idea is this (example):
Now, when you are receiving this, you kinda get the idea, but you need to restore the original 10KB FRAME in order for the decoder to decode it. The way you are doing it, you are only decoding the first MTU bytes of a much larger frame, and all other fragments that you send to the decoder are discarded. That's why you get the bad picture...
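To make the fragmentation concrete, here is a rough simulation of the sender side using the numbers above. It is only a sketch: `fragment_frame` is a hypothetical helper, and it treats the full 1400 bytes as payload, whereas a real camera has to leave room for the IP, UDP and RTP headers.

```python
def fragment_frame(frame: bytes, max_payload: int = 1400):
    """Split one frame into (marker, payload) pairs.
    The marker is 1 only on the last fragment of the frame."""
    chunks = [frame[i:i + max_payload]
              for i in range(0, len(frame), max_payload)]
    last = len(chunks) - 1
    return [(1 if i == last else 0, chunk) for i, chunk in enumerate(chunks)]

# A 10000-byte stand-in for an I-VOP: start code plus dummy data.
frame = b"\x00\x00\x01\xb6" + bytes(9996)
fragments = fragment_frame(frame)

for marker, payload in fragments:
    print(marker, len(payload), payload[:4].hex())
```

Only the first fragment begins with 000001B6; the remaining ones are raw continuation bytes, which is why feeding them to the decoder one by one does not work.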
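To restore the original frame, look for a packet whose payload starts with 000001B6 or 000001B0 and whose RTP MARKER bit is set to 0. If the MARKER is set to 1, that is the whole frame, and you can decode it as it is! If it is 0, more parts follow: keep appending the payloads of the following packets until you receive one with the MARKER bit set to 1, which is the last fragment of the frame.
There, hope I helped... :)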
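The reassembly described above could look roughly like this. `parse_rtp` and `FrameAssembler` are names made up for the sketch; the header layout follows the fixed RTP header (version, padding, extension, CSRC count, marker, payload type, sequence number, timestamp, SSRC). It assumes packets arrive in order with none lost, so a real receiver would also want to check sequence numbers.

```python
import struct

def parse_rtp(packet: bytes):
    """Return (marker, sequence_number, timestamp, payload) of one RTP packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, timestamp = struct.unpack("!BBHI", packet[:8])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    offset = 12 + 4 * (b0 & 0x0F)        # fixed header plus CSRC list
    if b0 & 0x10:                        # header extension present
        ext_words = struct.unpack("!H", packet[offset + 2:offset + 4])[0]
        offset += 4 + 4 * ext_words
    end = len(packet)
    if b0 & 0x20:                        # padding: last byte holds its length
        end -= packet[-1]
    return b1 >> 7, seq, timestamp, packet[offset:end]

class FrameAssembler:
    """Collects RTP payloads until a packet with the marker bit arrives."""

    def __init__(self):
        self._parts = []

    def push(self, packet: bytes):
        """Feed one RTP packet; return a complete frame, or None if more
        fragments are still expected."""
        marker, _seq, _timestamp, payload = parse_rtp(packet)
        self._parts.append(payload)
        if not marker:
            return None
        frame = b"".join(self._parts)
        self._parts = []
        return frame
```

Each non-None value returned by `push` is one complete frame, and that whole buffer is what should be handed to the decoder in a single call.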
Upvotes: 9