chase

Reputation: 3782

Most efficient method for reading/writing image files as numpy array

I am wondering what the most efficient methods are for reading/writing image and PDF files as numpy arrays for processing.

So far I have tried scipy.ndimage.imread and the combination of PIL and numpy, which yield the following results:

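import glob
import timeit

import numpy as np
from PIL import Image
from scipy.ndimage import imread  # needs an older SciPy; imread was removed in SciPy 1.2

iters = 2  # number of timed passes over the directory


def scipy_fun():
    # Read every JPEG in the current directory with scipy
    for x in glob.glob("*.jpg"):
        px = imread(x)


def PIL_fun():
    # Open every JPEG with PIL, then convert it to a numpy array
    for x in glob.glob("*.jpg"):
        with Image.open(x) as im:
            px = np.array(im)


print(timeit.Timer(scipy_fun).timeit(number=iters))
print(timeit.Timer(PIL_fun).timeit(number=iters))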

Running the script shows similar results, with scipy marginally faster:

2.8794324089019234
3.0174482765699095

Are there any faster ways to do this?

Upvotes: 3

Views: 1983

Answers (1)

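First, install pdf2image (it relies on Poppler being installed on the system):

pip install pdf2image

Then read the PDF and convert a page to a numpy array: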

import numpy as np
from pdf2image import convert_from_path

# convert_from_path returns a list of PIL images, one per page
pages = convert_from_path('path to the pdf file')
# First page of the PDF as a numpy array, ready to use with OpenCV or PIL
img = np.asarray(pages[0])
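
The snippet above only covers reading. For the writing direction asked about in the question, a minimal sketch with Pillow could look like the following. The random array and the in-memory buffer are stand-ins I made up so the example runs without any files on disk; in practice you would pass a real path such as "out.png" to save() and Image.open(). PNG is used here because it is lossless; JPEG would not return identical pixels.

```python
import io

import numpy as np
from PIL import Image

# Hypothetical test data: a small random RGB image stands in for real pixels.
rng = np.random.default_rng(0)
arr = rng.integers(0, 256, size=(64, 48, 3), dtype=np.uint8)

# Write: wrap the array in a PIL image and save it.
# buf is an in-memory stand-in for a file path (an assumption for this sketch).
buf = io.BytesIO()
Image.fromarray(arr).save(buf, format="PNG")

# Read: decode the saved image back into a numpy array.
buf.seek(0)
with Image.open(buf) as im:
    restored = np.asarray(im)

# Shape, dtype, and whether the round trip preserved every pixel.
print(restored.shape, restored.dtype, np.array_equal(arr, restored))
```

Whether this is faster than the alternatives depends on the image format and size, so it is worth timing it on your own files.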

Upvotes: 1
