Make42

Reputation: 13088

Memory issues when loading videos into frames

I have a folder with 160 FLV videos, each having 120 frames of size 152 x 360 with RGB colors (3 channels), that I would like to load into the numpy array frames. I do this with the code:

import numpy as np
import cv2
import os

directory = "data/"
# frames = []
frames = np.empty(shape=(160 * 120, 152, 360, 3), dtype=np.float32)
nr_frame = 0  # index of the next free slot in frames

for file in os.listdir(directory):
    if file.endswith(".flv"):

        # Create a VideoCapture object and read from input file
        cap = cv2.VideoCapture(os.path.join(directory, file))
        nb_frames_in_file = 0  # number of frames read from the current file

        # Read until video is completed
        while cap.isOpened():
            # Capture frame-by-frame
            ret, frame = cap.read()
            if ret:
                # frames.append(frame.astype('float32') / 255.)
                frames[nr_frame, :, :, :] = frame.astype('float32') / 255.
                nr_frame = nr_frame + 1
                nb_frames_in_file = nb_frames_in_file + 1
            else:
                break

        # When everything done, release the video capture object
        cap.release()

# frames = np.array(frames)

Originally I tried to use a list frames (see the commented lines) instead of the preallocated numpy array, but it seemed this took too much memory. I have no idea why, though.

However, it seems this did not help much: the code is still very memory hungry (many GB), even though my videos are just a few KB each. I think it is because the resources of the cap objects (the cv2.VideoCapture objects) might not be freed despite me calling cap.release(). Is that correct? What can I do to make my code memory-efficient?

Upvotes: 2

Views: 2025

Answers (3)

kevinkayaks

Reputation: 2726

I recommend using pims (Python Image Sequence) for this task. It's a very nice library I've been using lately. You can load a video into an object that reads frames lazily, as they're needed.

For example, if you had a video:

import pims
V = pims.Video('filename.avi')

You can then access frames of the video with numpy-like indexing/slicing

im = V[100]

And they are only held in memory when you convert them to numpy arrays

import numpy as np 
im = np.array(im)

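You can use pipelines to preprocess whole videos without loading them into memory. The decorated function is written for a single frame and is applied lazily, frame by frame, when called on the video:

@pims.pipeline
def grayscale(frame):
    return np.array(frame)[..., 0].astype('float') / 255  # float grayscale

gray = grayscale(V)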

Upvotes: 5

Andy Chang

Reputation: 79

I'm not sure why you use float32. The frames returned by cap.read() have dtype uint8, so storing them as uint8 needs only a quarter of the memory:

frames = np.empty(shape=(1000 * 120, 300, 400, 3), dtype=np.uint8)

Upvotes: 0

Dan Mašek

Reputation: 19041

You seem to be severely underestimating the amount of memory required. The compressed video files may be small, but you're storing the raw, uncompressed data.

Let's recap:

  • 1000 videos
  • each video contains 120 frames
  • each frame contains 300 rows and 400 columns of pixels = 120000 pixels
  • each pixel contains 3 values (RGB)
  • each value requires 4 bytes (float32)

Hence, total memory = 1000 x 120 x 300 x 400 x 3 x 4 = 1.728 × 10^11 bytes (which is roughly 161 GiB)
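The arithmetic above can be checked with a few lines of plain Python. This is only a sketch: the helper name raw_size_bytes is made up for illustration, and the second case uses the dimensions quoted in the question text (160 videos, 152 x 360 frames), which differ from the figures in the recap.

```python
def raw_size_bytes(n_videos, frames_per_video, height, width,
                   channels, bytes_per_value):
    """Size in bytes of the uncompressed frame data held in one array."""
    return (n_videos * frames_per_video * height * width
            * channels * bytes_per_value)


GIB = 2 ** 30  # bytes per GiB

# Figures from the recap above, stored as float32 (4 bytes per value).
answer_case = raw_size_bytes(1000, 120, 300, 400, 3, 4)

# Dimensions quoted in the question text, also stored as float32.
question_case = raw_size_bytes(160, 120, 152, 360, 3, 4)

# Same as the recap, but stored as uint8 (1 byte per value).
uint8_case = raw_size_bytes(1000, 120, 300, 400, 3, 1)

for label, size in [("recap, float32", answer_case),
                    ("question text, float32", question_case),
                    ("recap, uint8", uint8_case)]:
    print(f"{label}: {size} bytes = {size / GIB:.1f} GiB")
```

Either way the raw data is far larger than the compressed files, and switching from float32 to uint8 only divides the total by four.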

If you want to reduce your memory requirements, then you will need to redesign your algorithm, such that it doesn't need to have everything in memory at once (i.e. batch processing).
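One possible shape for such batch processing is sketched below, assuming your downstream code can consume the frames a chunk at a time. The names in_batches, iter_frames and process are hypothetical; only cv2.VideoCapture, read() and release() come from the question's code. Frames are yielded one by one from a generator, so at most one batch is held in memory at any moment.

```python
from itertools import islice


def in_batches(iterable, batch_size):
    """Yield lists of up to batch_size items, without materialising everything."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def iter_frames(paths):
    """Yield the frames of each video file in paths, one at a time."""
    import cv2  # imported here so in_batches can be used without OpenCV

    for path in paths:
        cap = cv2.VideoCapture(path)
        try:
            while True:
                ret, frame = cap.read()
                if not ret:
                    break
                yield frame  # uint8 frame, exactly as returned by read()
        finally:
            cap.release()  # release the capture even if the consumer stops early


# Usage sketch (process is a placeholder for your own per-batch code):
# for batch in in_batches(iter_frames(["data/a.flv", "data/b.flv"]), 120):
#     process(batch)
```

With a batch size of 120, each batch holds about as many frames as one of the videos described above. Converting to float32 inside the per-batch step, rather than on load, keeps the stored frames at one byte per value.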

Upvotes: 2
