InfinityBit

Reputation: 57

How to send and receive webcam stream using tcp sockets in Python?

I am trying to recreate this project. What I have is a server (my computer) and a client (my Raspberry Pi). What I am doing differently from the original project is that I am trying to use a simple webcam instead of a Raspberry Pi camera to stream images from my RPi to the server. I know that I must:

  1. Get opencv image frames from the camera.
  2. Convert a frame (which is a numpy array) to bytes.
  3. Transfer the bytes from the client to the server.
  4. Convert the bytes back into frames and view.

Examples would be appreciated.

self_driver.py

import SocketServer
import threading
import numpy as np
import cv2
import sys


ultrasonic_data = None

#BaseRequestHandler is used to process incoming requests
class UltrasonicHandler(SocketServer.BaseRequestHandler):

    data = " "

    def handle(self):

        while self.data:
            self.data = self.request.recv(1024)
            ultrasonic_data = float(self.data.split('.')[0])
            print(ultrasonic_data)


#VideoStreamHandler uses streams which are file-like objects for communication
class VideoStreamHandler(SocketServer.StreamRequestHandler):

    def handle(self):
        stream_bytes = b''

        try:
            stream_bytes += self.rfile.read(1024)
            image = np.frombuffer(stream_bytes, dtype="B")
            print(image.shape)
            cv2.imshow('F', image)
            cv2.waitKey(0)

        finally:
            cv2.destroyAllWindows()
            sys.exit()


class Self_Driver_Server:

    def __init__(self, host, portUS, portCam):
        self.host = host
        self.portUS = portUS
        self.portCam = portCam

    def startUltrasonicServer(self):
        # Create the Ultrasonic server, binding to localhost on port 50001
        server = SocketServer.TCPServer((self.host, self.portUS), UltrasonicHandler)
        server.serve_forever()

    def startVideoServer(self):
        # Create the video server, binding to localhost on port 50002
        server = SocketServer.TCPServer((self.host, self.portCam), VideoStreamHandler)
        server.serve_forever()

    def start(self):
        ultrasonic_thread = threading.Thread(target=self.startUltrasonicServer)
        ultrasonic_thread.daemon = True
        ultrasonic_thread.start()
        self.startVideoServer()


if __name__ == "__main__":

    #From SocketServer documentation
    HOST, PORTUS, PORTCAM = '192.168.0.18', 50001, 50002
    sdc = Self_Driver_Server(HOST, PORTUS, PORTCAM)

    sdc.start()

video_client.py

import socket
import time
import cv2


client_sock = socket.socket()
client_sock.connect(('192.168.0.18', 50002))
#We are going to 'write' to a file in 'binary' mode
conn = client_sock.makefile('wb')

try:
    cap = cv2.VideoCapture(0)
    cap.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH,320)
    cap.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT,240)

    start = time.time()

    while(cap.isOpened()):
        conn.flush()
        ret, frame = cap.read()
        byteImage = frame.tobytes()
        conn.write(byteImage)


finally:
    finish = time.time()
    cap.release()
    client_sock.close()
    conn.close()

Upvotes: 3

Views: 7842

Answers (1)

abarnert

Reputation: 366103

You can't just display every received buffer of 1-1024 bytes as an image; you have to concatenate them and only display an image once your buffer contains a complete frame.

If you know, out of band, that your images are going to be a fixed number of bytes, you can do something like this:

IMAGE_SIZE = 320*240*3

def handle(self):
    stream_bytes = b''

    try:
        while True:
            chunk = self.rfile.read(1024)
            if not chunk:
                # Client closed the connection.
                break
            stream_bytes += chunk
            # Only display once a complete frame's worth of bytes has arrived.
            while len(stream_bytes) >= IMAGE_SIZE:
                image = np.frombuffer(stream_bytes[:IMAGE_SIZE], dtype="B")
                stream_bytes = stream_bytes[IMAGE_SIZE:]
                print(image.shape)
                cv2.imshow('F', image)
                cv2.waitKey(0)
    finally:
        cv2.destroyAllWindows()
        sys.exit()

If you don't know that, you have to add some kind of framing protocol, like sending the frame size as a uint32 before each frame, so the server knows how many bytes to receive for each frame.
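For example, here's a minimal sketch of such a length-prefix protocol using struct (the helper names send_frame, recv_frame, and recv_exact are made up for illustration, not part of your code):

import struct

def send_frame(conn, frame_bytes):
    # Prefix each frame with its length as a 4-byte big-endian uint32.
    conn.write(struct.pack('>I', len(frame_bytes)))
    conn.write(frame_bytes)

def recv_exact(rfile, n):
    # A read on a socket file can return fewer than n bytes, so loop until done.
    buf = b''
    while len(buf) < n:
        chunk = rfile.read(n - len(buf))
        if not chunk:
            raise EOFError('socket closed mid-frame')
        buf += chunk
    return buf

def recv_frame(rfile):
    # Read the 4-byte length header, then exactly that many payload bytes.
    (length,) = struct.unpack('>I', recv_exact(rfile, 4))
    return recv_exact(rfile, length)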


Next, if you're just sending the raw bytes, without any dtype, shape, or order information, the server has to have that information built in. If you know the frames are supposed to be, say, bytes in C order with a particular shape, you can hard-code that:

image = np.frombuffer(stream_bytes, dtype="B").reshape(240, 320, 3)  # height, width, channels

… but if not, you have to send that information as part of your framing protocol as well.

Alternatively, you could send a pickle.dumps of the buffer and pickle.loads it on the other side, or np.save to a BytesIO and np.load the result. Either way, that includes the dtype, shape, order, and stride information as well as the raw bytes, so you don't have to worry about it.
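For instance, a sketch of the np.save/np.load version (serialize_frame and deserialize_frame are hypothetical helpers; you'd still want the length-prefix framing above so the receiver knows where each serialized frame ends):

import io
import numpy as np

def serialize_frame(frame):
    # np.save writes a small header recording dtype, shape, and order,
    # followed by the raw array data.
    buf = io.BytesIO()
    np.save(buf, frame)
    return buf.getvalue()

def deserialize_frame(data):
    # np.load parses that header back and reconstructs the original array.
    return np.load(io.BytesIO(data))

The client would then call send_frame(conn, serialize_frame(frame)), and the server would reconstruct each frame with deserialize_frame(recv_frame(self.rfile)).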


The next problem is that you're exiting as soon as you display one image. Is that really what you want? If not… just don't do that.


But that just raises another problem. Do you really want to block the whole server with that cv2.waitKey? Your client is capturing images and sending them as fast as it can; surely you either want to make the server display them as soon as they arrive, or change the design so the client only sends frames on demand. Otherwise, you're just going to get a bunch of near-identical frames, then a many-seconds-long gap while the client is blocked waiting for the server to drain the buffer, then repeat.
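A sketch of the on-demand design (an assumption about how you might restructure things, not part of the original project; it reuses the hypothetical helpers from above): the server sends a one-byte request whenever it's ready, and the client captures and sends a single frame in reply.

# Server side, inside the handler loop: ask for one frame, then show it briefly.
self.request.sendall(b'R')                         # request the next frame
frame = deserialize_frame(recv_frame(self.rfile))
cv2.imshow('F', frame)
cv2.waitKey(1)                                     # wait 1 ms instead of blocking forever

# Client side: only capture and send a frame when the server asks for one.
while cap.isOpened():
    if not client_sock.recv(1):                    # empty read means the server closed
        break
    ret, frame = cap.read()
    if not ret:
        break
    send_frame(conn, serialize_frame(frame))
    conn.flush()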

Upvotes: 4
