Tozar

Reputation: 986

Why does decreasing the framerate with videorate incur a significant CPU performance penalty?

My understanding of the videorate element is that framerate correction is performed by simply dropping frames; no "fancy algorithm" is used. I've profiled CPU usage for a gst-launch-1.0 pipeline, and I've observed that as the framerate decreases below 1 FPS, CPU usage, counter-intuitively, increases dramatically. Sample pipeline (you can observe the performance penalty by changing the framerate fraction):

gst-launch-1.0 filesrc location=test.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videorate drop-only=true ! video/x-raw,framerate=1/10 ! autovideosink

I would expect that decreasing the framerate would reduce the amount of processing required throughout the rest of the pipeline. Any insight into this phenomenon would be appreciated.

System info: CentOS 7, GStreamer 1.4.5

EDIT: This seems to happen with videotestsrc as well, but only if you specify a high framerate on the source.

videotestsrc pattern=snow ! video/x-raw,width=1920,height=1080,framerate=25/1 ! videorate drop-only=true ! video/x-raw,framerate=1/10 ! autovideosink

Removing the framerate from the videotestsrc caps puts CPU usage at 1%, and usage increases as the videorate framerate increases. Meanwhile, setting the source to 25/1 FPS increases CPU usage to 50%, and usage drops as the videorate framerate increases.

Upvotes: 1

Views: 9476

Answers (2)

mpr

Reputation: 3378

Tozar, I'm going to specifically address the pipeline you posted in your comment above.

If you're only going to be sending a frame once every ten seconds, there's probably no need to use h264. In ten seconds' time the frame will have changed completely, and there will be no data similarities to exploit for bandwidth savings; the encoder will likely just emit a new keyframe each time. You could go with jpegenc and rtpjpegpay as alternatives.
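As a rough sketch of what that could look like (the v4l2src source, host, and port here are illustrative assumptions, not from your pipeline):

gst-launch-1.0 v4l2src ! videorate drop-only=true ! video/x-raw,framerate=1/10 ! jpegenc ! rtpjpegpay ! udpsink host=192.168.1.10 port=5000

Each JPEG frame is self-contained, so there's no inter-frame state for the encoder to maintain between the ten-second gaps.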

If you're re-encoding the content, you'll definitely see a CPU spike every ten seconds. It's just not avoidable.

If you want to keep CPU usage on the machine doing the transformation as low as possible, you could go to the work of parsing the incoming h264 data, pulling out the keyframes (IDR frames), and passing those along to the secondary destination. That assumes the original transmitter sends keyframes, though (no intra refresh). It would not be easy.

You may want to form a more general question about what you're trying to do. What is the role of the machine doing the transformation? Does it have to use the data at all itself? What type of machine is receiving the frames every ten seconds and what is its role?

Upvotes: 4

mpr

Reputation: 3378

videorate is tricky, and you need to consider it in conjunction with every other element in the pipeline. You also need to be aware of how much CPU time is actually available to cut. For example, if you're decoding a 60fps file and displaying it at 1fps, you'll still be eating a lot of CPU. You can output to fakesink with sync=true to see how much CPU you could actually save.
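For example, to measure the decode-only cost of your original pipeline (same test.mp4, but with the display sink swapped for fakesink):

gst-launch-1.0 filesrc location=test.mp4 ! qtdemux ! h264parse ! avdec_h264 ! videorate drop-only=true ! video/x-raw,framerate=1/10 ! fakesink sync=true

Whatever CPU this still burns is the decode cost you can't avoid by lowering the output framerate; only the difference between this and the full pipeline is actually up for grabs.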

I recommend adding a bit of debug info to better understand videorate's behavior.

export GST_DEBUG=2,videorate:7

Then you can grep for "pushing buffer" to see when it pushes frames:

gst-launch-1.0 [PIPELINE] 2>&1 | grep "pushing buffer"

...and for "storing buffer" to see when it receives data:

gst-launch-1.0 [PIPELINE] 2>&1 | grep "storing buffer"

In the case of decoding from a filesrc, you're going to see bursts of CPU activity: the decoder will run through, say, 60 frames, realize the pipeline is full, pause, wait until a need-buffers event comes in, then burst to 100% CPU to fill the pipeline again.

There are other factors too. For example, you may need queue elements between certain bottlenecks, with the correct max-size attributes set, or your sink or source elements could be behaving in unexpected ways.
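As a sketch, a bounded queue between the decoder and videorate might look like this (the max-size-buffers value of 5 is an illustrative guess, not a recommendation):

gst-launch-1.0 filesrc location=test.mp4 ! qtdemux ! h264parse ! avdec_h264 ! queue max-size-buffers=5 max-size-bytes=0 max-size-time=0 ! videorate drop-only=true ! video/x-raw,framerate=1/10 ! autovideosink

Capping the queue at a few buffers limits how far ahead the decoder can run, which smooths out the burst-then-idle pattern described above.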

To get the best possible answer to your question, I'd suggest posting the exact pipeline you intend to use, with and without the videorate element. If you have something like autovideosink, change it to the element it actually resolves to on your system.

Here are a few pipelines I tested with:

gst-launch-1.0 videotestsrc pattern=snow ! video/x-raw,width=320,height=180,framerate=60/1 ! videorate ! videoscale method=lanczos ! video/x-raw,width=1920,height=1080,framerate=60/1 ! ximagesink

30% CPU in htop

gst-launch-1.0 videotestsrc pattern=snow ! video/x-raw,width=320,height=180,framerate=60/1 ! videorate ! videoscale method=lanczos ! video/x-raw,width=1920,height=1080,framerate=1/10 ! ximagesink

0% with 10% spikes in htop

Upvotes: 1
