Reputation: 13088
I have a folder with 160 FLV videos, each having 120 frames of size 152 x 360 with RGB colors (3 channels), that I would like to load into the numpy array frames. I do this with the code:
import numpy as np
import cv2
import os
directory = "data/"
# frames = []
frames = np.empty(shape=(160 * 120, 152, 360, 3), dtype=np.float32)
nr_frame = 0  # global frame counter (must be initialized before the loop)
for file in os.listdir(directory):
    if file.endswith(".flv"):
        # Create a VideoCapture object and read from input file
        cap = cv2.VideoCapture(os.path.join(directory, file))
        nb_frames_in_file = 0  # per-file frame counter
        # Read until video is completed
        while cap.isOpened():
            # Capture frame-by-frame
            ret, frame = cap.read()
            if ret:
                # frames.append(frame.astype('float32') / 255.)
                frames[nr_frame, :, :, :] = frame.astype('float32') / 255.
                nr_frame = nr_frame + 1
                nb_frames_in_file = nb_frames_in_file + 1
            else:
                break
        # When everything is done, release the video capture object
        cap.release()
# frames = np.array(frames)
Originally I tried to use a list frames (see the commented lines) instead of the preallocated numpy array, but it seemed this took too much memory, though I have no idea why.
However, it seems this did not help much: the code is still very memory-hungry (many GB), even though my videos are only a few KB each. I think it is because the resources of the cap objects (the cv2.VideoCapture objects) might not be freed despite my calling cap.release() - is that correct? What can I do to make my code memory-efficient?
Upvotes: 2
Views: 2025
Reputation: 2726
I recommend using pims for this task. It's a very nice library (short for Python Image Sequence) I've been using lately. You can load the frames of a video into an object that reads them only as they're needed.
For example, if you had a video:
import pims
V = pims.Video('filename.avi')
You can then access frames of the video with numpy-like indexing/slicing:
im = V[100]
And they are only held in memory when you convert them to numpy arrays:
import numpy as np
im = np.array(im)
You can use pipelines to preprocess whole videos without loading them into memory:
@pims.pipeline
def grayscale(frame):
    # take one channel as float "grayscale" in [0, 1]
    return np.array(frame)[..., 0].astype('float') / 255
gray = grayscale(V)  # applied lazily, frame by frame
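Applied to the setup in the question, a minimal sketch might look like this (it assumes pims can decode the .flv files through its PyAV or ImageIO backend; the processing step inside the loop is left as a placeholder):
import os
import numpy as np
import pims

directory = "data/"

@pims.pipeline
def normalize(frame):
    # scale a frame to float32 in [0, 1]; runs only when the frame is actually read
    return np.array(frame).astype('float32') / 255.

for name in sorted(os.listdir(directory)):
    if not name.endswith(".flv"):
        continue
    video = normalize(pims.Video(os.path.join(directory, name)))
    for frame in video:
        # ... process one frame at a time here ...
        pass
This way only the frame currently being processed is decoded and held in memory.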
Upvotes: 5
Reputation: 79
I'm not sure why you use float32, but the frame dtype should be uint8:
frames = np.empty(shape=(160 * 120, 152, 360, 3), dtype=np.uint8)
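For a rough sense of the difference, a small sketch using the dimensions from the question that only computes the sizes without allocating anything:
import numpy as np

shape = (160 * 120, 152, 360, 3)
n_values = np.prod(shape)
print(n_values * np.dtype(np.uint8).itemsize / 2**30)    # ~2.9 GiB as uint8
print(n_values * np.dtype(np.float32).itemsize / 2**30)  # ~11.7 GiB as float32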
Upvotes: 0
Reputation: 19041
You seem to be severely underestimating the amount of memory required. The compressed video files may be small, but you're storing the raw, uncompressed data.
Let's recap:
- 160 videos x 120 frames each = 19,200 frames
- each frame is 152 x 360 pixels with 3 channels
- each value is stored as float32, i.e. 4 bytes
Hence, total memory = 160 x 120 x 152 x 360 x 3 x 4 ≈ 1.26 × 10^10 Bytes (which is roughly 11.7 GiB)
If you want to reduce your memory requirements, then you will need to redesign your algorithm, such that it doesn't need to have everything in memory at once (i.e. batch processing).
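A minimal sketch of what that could look like with the OpenCV code from the question (the batch size of 32 and the processing step are placeholders, not something from the original post):
import os
import numpy as np
import cv2

directory = "data/"
batch_size = 32  # number of frames held in memory at once (placeholder value)

def iter_batches():
    # Yield small float32 batches instead of building one huge array.
    batch = []
    for name in os.listdir(directory):
        if not name.endswith(".flv"):
            continue
        cap = cv2.VideoCapture(os.path.join(directory, name))
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break
            batch.append(frame.astype('float32') / 255.)
            if len(batch) == batch_size:
                yield np.stack(batch)
                batch = []  # drop references so the memory can be reused
        cap.release()
    if batch:
        yield np.stack(batch)

for batch in iter_batches():
    pass  # ... process each batch here (e.g. feed it to a model) ...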
Upvotes: 2