Reputation: 1
I am working on a ROS project for Tello drones and I use this driver. When I just subscribe to the CompressedImage messages from the drone camera and display the images on screen, I have no problems; everything works fine.
But as soon as I try to use face detection with cv2.CascadeClassifier, the frames are delayed by about 30 seconds, so the images only appear on screen about 30 seconds after they were captured. Does anyone have an idea how this delay can be minimized to get good real-time results?
Here is the code so far:
#!/usr/bin/env python
import rospy
from sensor_msgs.msg import CompressedImage
import av
import cv2
import numpy
import threading
import traceback


class StandaloneVideoStream(object):
    def __init__(self):
        self.cond = threading.Condition()
        self.queue = []
        self.closed = False

    def read(self, size):
        self.cond.acquire()
        try:
            if len(self.queue) == 0 and not self.closed:
                self.cond.wait(2.0)
            data = bytes()
            while 0 < len(self.queue) and len(data) + len(self.queue[0]) < size:
                data = data + self.queue[0]
                del self.queue[0]
        finally:
            self.cond.release()
        return data

    def seek(self, offset, whence):
        return -1

    def close(self):
        self.cond.acquire()
        self.queue = []
        self.closed = True
        self.cond.notifyAll()
        self.cond.release()

    def add_frame(self, buf):
        self.cond.acquire()
        self.queue.append(buf)
        self.cond.notifyAll()
        self.cond.release()


stream = StandaloneVideoStream()


def callback(msg):
    stream.add_frame(msg.data)


def main():
    rospy.init_node('face_detection')
    rospy.Subscriber('/tello/image_raw/h264', CompressedImage, callback)
    container = av.open(stream)
    for frame in container.decode(video=0):
        image_msg = cv2.cvtColor(numpy.array(frame.to_image()), cv2.COLOR_RGB2BGR)
        stop_data = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
        found = stop_data.detectMultiScale(image_msg, minSize=(20, 20))
        amount_found = len(found)
        if amount_found != 0:
            for (x, y, width, height) in found:
                cv2.rectangle(image_msg, (x, y), (x + height, y + width), (0, 255, 0), 5)
        cv2.imshow('Frame', image_msg)
        cv2.waitKey(1)


if __name__ == '__main__':
    try:
        main()
    except BaseException:
        traceback.print_exc()
    finally:
        stream.close()
        cv2.destroyAllWindows()
EDIT:
When I print out the shape of the images (image_msg), I get (720, 960, 3): height, width and 3 channels.
This shows the size of the stream in bytes:
...
Tello: 15:54:16.106: Info: video data 599118 bytes 290.2KB/sec
Tello: 15:54:18.106: Info: video data 502212 bytes 245.2KB/sec
Tello: 15:54:20.108: Info: video data 503748 bytes 245.7KB/sec
Tello: 15:54:22.109: Info: video data 503182 bytes 245.6KB/sec
Tello: 15:54:22.446: Info: video recv: 1460 bytes 1b00 +103
Tello: 15:54:22.813: Info: video recv: 1460 bytes 2400 +173
Tello: 15:54:23.190: Info: video recv: 1460 bytes 2f00 +177
Tello: 15:54:23.554: Info: video recv: 1460 bytes 3a00 +178
Tello: 15:54:23.918: Info: video recv: 1460 bytes 4500 +176
Tello: 15:54:24.268: Info: video recv: 1460 bytes 5000 +160
Tello: 15:54:24.268: Info: video data 502157 bytes 227.1KB/sec
Tello: 15:54:24.585: Info: video recv: 1460 bytes 5c00 +140
Tello: 15:54:24.917: Info: video recv: 1460 bytes 6600 +142
Tello: 15:54:25.266: Info: video recv: 1460 bytes 7000 +157
Tello: 15:54:25.545: Info: video recv: 1460 bytes 7a00 +102
Tello: 15:54:25.878: Info: video recv: 1460 bytes 8201 +140
Tello: 15:54:26.178: Info: video recv: 1460 bytes 8d00 +102
Tello: 15:54:26.271: Info: video data 534194 bytes 260.5KB/sec
...
Upvotes: 0
Views: 197
Reputation: 1211
If this code as posted has a 30 s delay, and it is significantly faster with the "stop_data =" and "found =" lines commented out, then those lines are the bottleneck. You have three options, in increasing order of severity: 1) change the parameters, 2) change the input data, 3) change the algorithm. I'm assuming you've already tried (1) changing the parameters and you don't want to (3) change the algorithm, so your remaining choice is to (2) change the input data.
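To check where the time actually goes before changing anything, a small timing helper can wrap each suspect call. This is only a sketch, assuming Python 3 (for time.perf_counter); the name timed is made up for illustration and is not part of OpenCV or ROS.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Hypothetical use inside the frame loop from the question:
#   found, dt = timed(stop_data.detectMultiScale, image_msg, minSize=(20, 20))
#   print('detectMultiScale took %.3f s' % dt)
```

If the time per frame is longer than the interval between incoming frames, the frames pile up in the queue and the displayed image falls further and further behind.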
Try downsampling your image before running the detector. It should be decently faster. You can use cv2.pyrDown(), a Gaussian-smoothing downsample that halves the width and height on each call (so a 960x720 frame becomes 480x360, then 240x180) and keeps the image smoother than a simple pick-every-nth-pixel downsample.
Upvotes: 0