CompNet

Reputation: 77

PES structure of mpeg4 AVC video for encapsulation into Transport Stream

I want to understand how I, B, and P pictures are packetized and multiplexed when MPEG-4 AVC/H.264 coded video is encapsulated in a Transport Stream container (for streaming protocols like HTTP Live Streaming). For MPEG-2 video, as I understand it, each PES packet starts in a new TS packet, but I, B, and P pictures can overlap within a single PES packet.

But for MPEG-4 AVC video, can anyone explain how I, B, and P frames are multiplexed into PES packets? Can they overlap within a PES packet, which would mean that a single TS packet loss can potentially lose multiple I/B/P frames? I tried to go through the payload structures in the RFC and some other documents but could not understand them clearly.

Upvotes: 3

Views: 1234

Answers (2)

Svetlin Mladenov

Reputation: 4427

I, B, and P frames are NOT multiplexed in a single PES packet, because different frames have different DTSs and PTSs, but a single PES packet can carry only one DTS/PTS pair. What the muxer does is take the frame (be it I, B, or P), package it in a PES packet, put the DTS and PTS on the packet, and that's it. The next frame is packaged in another PES packet. Sometimes, depending on the encoder and muxer, a very large frame (for example, an I frame of an HD video) is split across multiple PES packets that have the same DTS/PTS.
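To illustrate the one-timestamp-pair-per-PES point, here is a minimal Python sketch of how a muxer might wrap one frame in a PES packet. The function names (`encode_timestamp`, `decode_timestamp`, `build_pes`) and the sample values are made up for illustration. The header layout follows the PES syntax of ISO/IEC 13818-1, and every optional field other than PTS/DTS is left out.

```python
def encode_timestamp(prefix: int, ts: int) -> bytes:
    """Pack a 33-bit PTS or DTS into the 5-byte PES form with marker bits."""
    ts &= (1 << 33) - 1
    return bytes([
        (prefix << 4) | (((ts >> 30) & 0x07) << 1) | 1,  # 4-bit prefix, ts[32..30], marker
        (ts >> 22) & 0xFF,                               # ts[29..22]
        (((ts >> 15) & 0x7F) << 1) | 1,                  # ts[21..15], marker
        (ts >> 7) & 0xFF,                                # ts[14..7]
        ((ts & 0x7F) << 1) | 1,                          # ts[6..0], marker
    ])


def decode_timestamp(b: bytes) -> int:
    """Inverse of encode_timestamp (the 4-bit prefix is ignored)."""
    return (((b[0] >> 1) & 0x07) << 30) | (b[1] << 22) | ((b[2] >> 1) << 15) \
        | (b[3] << 7) | (b[4] >> 1)


def build_pes(frame: bytes, pts: int, dts=None, stream_id: int = 0xE0) -> bytes:
    """Wrap one coded frame in one PES packet carrying a single PTS/DTS pair."""
    if dts is None or dts == pts:
        flags = 0x80                                     # PTS_DTS_flags = '10': PTS only
        stamps = encode_timestamp(0b0010, pts)
    else:
        flags = 0xC0                                     # PTS_DTS_flags = '11': PTS and DTS
        stamps = encode_timestamp(0b0011, pts) + encode_timestamp(0b0001, dts)
    header = (
        b"\x00\x00\x01"                                  # packet_start_code_prefix
        + bytes([stream_id])                             # 0xE0 = first video stream
        + b"\x00\x00"                                    # PES_packet_length 0: unbounded (video in TS)
        + bytes([0x80, flags, len(stamps)])              # '10' marker bits, flags, header data length
        + stamps
    )
    return header + frame


# Assumed sample values on the 90 kHz clock: a frame decoded 40 ms before it is shown.
pes = build_pes(b"<coded frame bytes>", pts=183600, dts=180000)
```

The header has room for exactly one PTS field and one DTS field, which is why a second frame with different timestamps needs a PES packet of its own.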

However, the SPS and PPS of the H.264 stream are packed together with an I frame into a single PES packet. This means that if the TS packet containing the SPS and PPS is lost, the decoder has to wait until the next SPS and PPS are transmitted, because without them it cannot decode the stream.

Please note that this is just the way most (if not all) encoders and muxers work. The standard does not (and cannot possibly) describe each and every case.
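One way to check this on a real stream is to list the NAL unit types inside a PES payload. The sketch below assumes the payload is an H.264 Annex B byte stream (NAL units separated by `00 00 01` start codes); `nal_unit_types` and `has_parameter_sets` are hypothetical helper names, and the sample bytes are hand-made stand-ins, not real encoder output.

```python
SPS, PPS, IDR_SLICE = 7, 8, 5  # nal_unit_type values from the H.264 specification


def nal_unit_types(payload: bytes) -> list:
    """Return the nal_unit_type of every NAL unit found in an Annex B payload."""
    types = []
    i = payload.find(b"\x00\x00\x01")
    while i != -1 and i + 3 < len(payload):
        types.append(payload[i + 3] & 0x1F)      # low 5 bits of the NAL header byte
        i = payload.find(b"\x00\x00\x01", i + 3)
    return types


def has_parameter_sets(payload: bytes) -> bool:
    """True if the payload carries both an SPS and a PPS."""
    return {SPS, PPS} <= set(nal_unit_types(payload))


# Hand-made example: access unit delimiter, SPS, PPS, then an IDR slice.
sample = (
    b"\x00\x00\x00\x01\x09\xf0"
    b"\x00\x00\x00\x01\x67\x42"
    b"\x00\x00\x00\x01\x68\xce"
    b"\x00\x00\x01\x65\x88"
)
print(nal_unit_types(sample))
```

A PES payload that starts a keyframe would typically show types 7 and 8 ahead of type 5, while P and B frames show type 1 slices only.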

Upvotes: 1

shri

Reputation: 874

As a quick summary: in broadcast applications, a PES packet normally does not contain data from more than one video frame. Hence, when a single TS packet is lost, we should not lose data from multiple frames. Having said that, the packet loss will affect the quality of subsequent frames, and if the damaged frame happens to be a reference frame, the distortion will be very high.
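A receiver can see which frame a lost TS packet belonged to by watching the 4-bit continuity counter and the payload_unit_start_indicator bit in the TS header. The following is a rough sketch under two assumptions: one frame per PES packet, and the lost packet did not itself carry the start of a PES packet. `make_ts` and `damaged_frames` are hypothetical names, and `make_ts` only builds dummy packets for the demonstration.

```python
def make_ts(pid: int, cc: int, pusi: bool = False) -> bytes:
    """Build a dummy 188-byte TS packet (payload only, zero-filled) for testing."""
    byte1 = (0x40 if pusi else 0x00) | ((pid >> 8) & 0x1F)
    byte3 = 0x10 | (cc & 0x0F)               # adaptation_field_control = '01': payload only
    return bytes([0x47, byte1, pid & 0xFF, byte3]) + bytes(184)


def damaged_frames(packets, pid: int) -> list:
    """Return indices of frames (PES packets) hit by a continuity counter gap."""
    frame_index = -1
    expected_cc = None
    damaged = set()
    for pkt in packets:
        if len(pkt) != 188 or pkt[0] != 0x47:
            continue                         # not a valid TS packet
        if (((pkt[1] & 0x1F) << 8) | pkt[2]) != pid:
            continue                         # belongs to another elementary stream
        if not pkt[3] & 0x10:
            continue                         # no payload: counter is not incremented
        cc = pkt[3] & 0x0F
        if expected_cc is not None and cc != expected_cc and frame_index >= 0:
            damaged.add(frame_index)         # the frame in progress lost data
        if pkt[1] & 0x40:                    # payload_unit_start_indicator: new PES starts
            frame_index += 1
        expected_cc = (cc + 1) & 0x0F
    return sorted(damaged)


# Three frames on PID 0x100; the packet with counter 4 (middle of frame 1) is dropped.
stream = [make_ts(0x100, 0, True), make_ts(0x100, 1), make_ts(0x100, 2),
          make_ts(0x100, 3, True), make_ts(0x100, 4), make_ts(0x100, 5),
          make_ts(0x100, 6, True), make_ts(0x100, 7)]
received = [p for k, p in enumerate(stream) if k != 4]
print(damaged_frames(received, 0x100))
```

In this example only frame 1 is flagged, which matches the point above: with one frame per PES packet, a single lost TS packet damages a single frame directly, although frames that reference it may still decode incorrectly.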

Hence, on the decoder side, we need an error correction mechanism. Forward error correction is the common error recovery mechanism in continuous video transmission systems. Alternatively, when a packet is lost, the sender can retransmit it. This works fine as long as the network latency is low. However, for interactive TV, these conventional error recovery mechanisms may not be well suited.
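As a toy illustration of the forward error correction idea, the sketch below sends one XOR parity packet per group of TS packets, which lets the receiver rebuild any single lost packet in that group without a retransmission. This is only the simplest possible scheme, chosen for illustration; real systems use stronger codes, and the function names and group size here are assumptions.

```python
def xor_parity(packets) -> bytes:
    """XOR equal-length packets together to form one parity packet."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)


def recover_missing(received, parity: bytes) -> bytes:
    """Rebuild the single missing packet (marked None) from the rest plus parity."""
    if received.count(None) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet per group")
    present = [pkt for pkt in received if pkt is not None]
    return xor_parity(present + [parity])


# Group of four dummy 188-byte packets; the second one is lost in transit.
group = [bytes([n]) * 188 for n in (1, 2, 3, 4)]
parity = xor_parity(group)
received = [group[0], None, group[2], group[3]]
restored = recover_missing(received, parity)
```

The cost is one extra packet per group, and a second loss in the same group cannot be repaired, which is the trade-off between overhead and protection that any FEC scheme has to make.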

Upvotes: 0
