Reputation: 1
I am a beginner in deep learning, and I am confused about how to read an image dataset in Google Colab. The dataset consists of 2 folders of train and test images and 2 CSV files of train and test labels. I need to identify the dance patterns in the images, so I first need to read the data and then split it.
I tried to read the dataset using the code below:
from zipfile import ZipFile
zip_path = '/content/0664343c9a8f11ea.zip'
with ZipFile(zip_path) as z:
    data = z.namelist()
This code ran and read the data, but only in the form of a list. I won't be able to split that into train and test sets later for building neural networks. Also, each image is a different size, so how should I deal with this?
Please help with this. It would be appreciated.
Thanks, Prachi
Upvotes: 0
Views: 6961
Reputation: 115
There are many ways to read images to feed into a model. One basic method is to convert the images into NumPy arrays. Given a zip file of images, you can take the following steps to obtain NumPy arrays of the images. This will work on any Python kernel, whether it is Google Colab or your local kernel.
import zipfile                         # unzipping
import glob                            # finding image paths
import numpy as np                     # creating numpy arrays
from skimage.io import imread          # reading images
from skimage.transform import resize   # resizing images

# 1. Unzip images
path = 'your zip file path'
with zipfile.ZipFile(path, 'r') as zip_ref:
    zip_ref.extractall('path for extracted images')

# 2. Obtain paths of images (.png used for example)
img_list = sorted(glob.glob('path for extracted images/*.png'))

# 3. Read images & convert to numpy arrays
## create placeholder numpy array
IMG_SIZE = 256  # image resolution of 256 x 256 used for example
x_data = np.empty((len(img_list), IMG_SIZE, IMG_SIZE, 1), dtype=np.float32)

## read and convert to arrays
for i, img_path in enumerate(img_list):
    # read image
    img = imread(img_path)
    # resize image (1 channel used for example; 1 for gray-scale, 3 for RGB-scale;
    # the last dimension of x_data must match)
    img = resize(img, output_shape=(IMG_SIZE, IMG_SIZE, 1), preserve_range=True)
    # save to numpy array
    x_data[i] = img
In the end, you have a NumPy array, x_data, containing your images, all resized to the same shape. This array can then be used to train or test your model.
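As for turning the list from namelist() into train and test sets with labels, here is a rough sketch using only the standard library. It is based on guesses about your archive, not on its actual layout: the folder names (train, test), the CSV name (train.csv), and the column names (Image, target) used in the docstring are all hypothetical, and the helper functions are my own, so adjust them to whatever your files really contain.

```python
import csv
import io
import random
import zipfile


def load_labels(zf, csv_name, key_col, label_col):
    """Return {image file name: label} from a CSV stored inside the zip."""
    with zf.open(csv_name) as fh:
        text = io.TextIOWrapper(fh, encoding="utf-8", newline="")
        return {row[key_col]: row[label_col] for row in csv.DictReader(text)}


def list_images(zf, folder, exts=(".png", ".jpg", ".jpeg")):
    """Return the sorted archive member names of the images under `folder`."""
    prefix = folder.rstrip("/") + "/"
    return sorted(
        name for name in zf.namelist()
        if name.startswith(prefix) and name.lower().endswith(exts)
    )


def labelled_images(zf, folder, csv_name, key_col, label_col):
    """Pair each image under `folder` with its label, matched on base file name.

    Hypothetical call, assuming a train/ folder and a train.csv with
    columns Image and target:
        labelled_images(z, "train", "train.csv", "Image", "target")
    """
    labels = load_labels(zf, csv_name, key_col, label_col)
    pairs = []
    for name in list_images(zf, folder):
        base = name.rsplit("/", 1)[-1]
        if base in labels:  # skip images that have no row in the CSV
            pairs.append((name, labels[base]))
    return pairs


def split_holdout(items, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and split off a hold-out (validation) part."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_holdout = int(len(items) * holdout_fraction)
    return items[n_holdout:], items[:n_holdout]
```

Since the test images already sit in their own folder, list_images covers the train/test separation by itself. split_holdout is only needed if you also want to keep part of the labelled training images aside for validation. The resulting file names can then go through the same read-and-resize loop as above, after extraction.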
Hope this helps.
Upvotes: 1