Beta

Reputation: 1746

Face Detection in Video using Google Cloud API

I'm trying to do face detection in a video using Google Vision API. I'm using the following code:

import argparse
import cv2
from google.cloud import vision
from PIL import Image, ImageDraw


def detect_face(face_file, max_results=4):
    """Uses the Vision API to detect faces in the given file.
    Args:
        face_file: A file-like object containing an image with faces.
    Returns:
        An array of Face objects with information about the picture.
    """
    content = face_file.read()
    # [START get_vision_service]
    image = vision.Client().image(content=content)
    # [END get_vision_service]

    return image.detect_faces()


def highlight_faces(frame, faces, output_filename):
    """Draws a polygon around the faces, then saves to output_filename.
    Args:
      image: a file containing the image with the faces.
      faces: a list of faces found in the file. This should be in the format
          returned by the Vision API.
      output_filename: the name of the image file to be created, where the
          faces have polygons drawn around them.
    """
    im = Image.open(frame)
    draw = ImageDraw.Draw(im)

    for face in faces:
        box = [(bound.x_coordinate, bound.y_coordinate)
               for bound in face.bounds.vertices]
        draw.line(box + [box[0]], width=5, fill='#00ff00')

    #im.save(output_filename)


def main(input_filename, max_results):

    video_capture = cv2.VideoCapture(input_filename)


    while True:
        # Capture frame-by-frame
        ret, frame = video_capture.read()
        faces = detect_face(frame, max_results)
        highlight_faces(frame, faces)
        cv2.imshow('Video', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Detects faces in the given image.')
    parser.add_argument(
        'input_image', help='the image you\'d like to detect faces in.')
    parser.add_argument(
        '--max-results', dest='max_results', default=4,
        help='the max results of face detection.')
    args = parser.parse_args()

    main(args.input_image, args.max_results)

But I'm getting the error:

content = face_file.read() AttributeError: 'numpy.ndarray' object has no attribute 'read'

The frames are read as NumPy arrays, which have no read() method, and I don't know how to work around this.

Can anyone please help me?

Upvotes: 0

Views: 825

Answers (1)

dizcology

Reputation: 170

The detect_face function is expecting a file-like object to read the data from. One possible way to do this is to convert frame (of type numpy.ndarray) into an image, and put it into a buffer, which can then be read like a file.

For example, try making the following changes to your code:

## Add some imports.
import io
import numpy as np
...

def main(input_filename, max_results):
    ...
    while True:
        # Capture frame-by-frame
        ret, frame = video_capture.read()

        ## Convert to an image, then write to a buffer.
        image_from_frame = Image.fromarray(np.uint8(frame))
        buffer = io.BytesIO()
        image_from_frame.save(buffer, format='PNG')
        buffer.seek(0)

        ## Use the buffer like a file.
        faces = detect_face(buffer, max_results)

        ...

Note: Passing image_from_frame.tobytes() directly as the image content did not work for me. tobytes() returns raw pixel data rather than an encoded image file such as PNG or JPEG, which is the likely reason.
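
If it helps, the conversion step can be pulled out into a small helper and checked without calling the Vision API at all. This is only a sketch: frame_to_png_buffer is a hypothetical name, the all-black array stands in for a real frame from video_capture.read(), and the channel reversal assumes the frame is in OpenCV's default BGR order. It also shows that tobytes() yields bare pixels with no file header.

```python
import io

import numpy as np
from PIL import Image

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def frame_to_png_buffer(frame):
    """Encode an H x W x 3 uint8 frame as PNG in an in-memory buffer.

    Hypothetical helper, not part of the original code. Assumes the frame
    uses OpenCV's default BGR channel order.
    """
    # Pillow expects RGB, so reverse the channel axis of the BGR frame.
    rgb = np.ascontiguousarray(frame[:, :, ::-1], dtype=np.uint8)
    buffer = io.BytesIO()
    Image.fromarray(rgb).save(buffer, format="PNG")
    buffer.seek(0)  # rewind so the next read() starts at the first byte
    return buffer


# Stand-in for a frame from video_capture.read(): 4 rows x 6 columns, all black.
frame = np.zeros((4, 6, 3), dtype=np.uint8)

encoded = frame_to_png_buffer(frame).read()  # bytes of a complete PNG file
raw = Image.fromarray(frame).tobytes()       # raw pixel bytes, no header

print(encoded.startswith(PNG_SIGNATURE))
print(len(raw) == 4 * 6 * 3)
```

In the loop above, the returned buffer would be passed to detect_face in place of the frame, since it supports read() like an open file.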

Upvotes: 1
