I'm using YOLO to track a face in a video frame by frame, and then using OpenCV to crop each frame (with some padding) around the speaker's face, which creates a motion-tracking effect. (Short example here: https://www.instagram.com/reel/DCIFkFEObkE/?hl=en)
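For concreteness, the per-frame crop step could be sketched roughly as below. This is a minimal illustration, not the exact code in use: the function name `padded_crop`, the `(x1, y1, x2, y2)` box layout, and the `pad` fraction are all assumptions made for the example.

```python
def padded_crop(box, frame_w, frame_h, pad=0.25):
    """Return an integer crop rectangle (left, top, right, bottom).

    box is assumed to be (x1, y1, x2, y2) in pixels, as a detector might
    report it. pad is a hypothetical fraction of the box size added on
    each side. The result is clamped to the frame bounds.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    left = max(0, int(round(x1 - pad * w)))
    top = max(0, int(round(y1 - pad * h)))
    right = min(frame_w, int(round(x2 + pad * w)))
    bottom = min(frame_h, int(round(y2 + pad * h)))
    return left, top, right, bottom


# With an OpenCV frame (a NumPy array), the crop itself would be:
#   left, top, right, bottom = padded_crop(box, frame.shape[1], frame.shape[0])
#   cropped = frame[top:bottom, left:right]
```

Note that in this sketch both the position and the size of the crop follow the detected box directly, so any frame-to-frame change in the box carries straight through to the crop.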
The problem I'm running into is that my final crop looks "shaky", as if it's flickering a bit. I think this is because YOLO updates the box coordinates on every frame, so x and y change slightly each time, and those small changes produce the shakiness/flickering. Here's what mine looks like compared to the example: https://drive.google.com/file/d/1MKHWK-5EH5abSq6i32r71GWld76tN1GP/view?usp=sharing
You can see the shakiness in the clip. It's especially apparent when the speaker is relatively still.
I've tried doing an exponential moving average and a few other averaging techniques but none have fixed it.
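To make clear what kind of smoothing is meant, an exponential moving average over the box center could look roughly like the sketch below. The class name `EmaSmoother` and the `alpha` value are illustrative assumptions, not the exact code that was tried.

```python
class EmaSmoother:
    """Exponential moving average over a tuple of coordinates.

    alpha is a hypothetical smoothing factor in (0, 1]: smaller values
    follow the raw detections more slowly and smooth more heavily.
    """

    def __init__(self, alpha=0.2):
        self.alpha = alpha
        self.state = None  # no estimate until the first detection arrives

    def update(self, point):
        if self.state is None:
            # Initialise with the first observation to avoid a start-up drift.
            self.state = tuple(float(v) for v in point)
        else:
            a = self.alpha
            self.state = tuple(
                a * v + (1 - a) * s for v, s in zip(point, self.state)
            )
        return self.state


# Per frame: feed the raw box center, crop around the smoothed one.
smoother = EmaSmoother(alpha=0.2)
for raw_center in [(320, 240), (322, 239), (319, 241)]:
    cx, cy = smoother.update(raw_center)
```

The smoothed center is a float, so it still has to be rounded to whole pixels before slicing the frame.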
Is my fundamental approach of cropping frame by frame wrong? Or is there a smoothing algorithm I should be using, or some dumb mistake I may be making? Any advice is mega appreciated!