How to load custom image dataset to X_train

Question

I have a GAN model trained on the MNIST dataset, and I need to train it on my own image dataset instead. MNIST is loaded like this:

(X_train, _), _ = keras.datasets.mnist.load_data()

How do I do the same for images saved at a file path? Here is the Kaggle notebook I'm working on: https://www.kaggle.com/qbasit/vr-final

Sarim Sikander · Accepted Answer

Let's keep this general enough for any image dataset. Suppose your image directory is laid out like this:

Data...
       Images...
                 Apple class folder
                 Orange class folder
                 Mango class folder

That is, each folder is named after a class and contains the images of that class. In that case you can use the code below to load the images into X and their class labels into y.

First, define a dictionary that maps your class names to integer labels:

classes = {'Apple':0,'Orange':1,'Mango':2}
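
If there are many classes, the same dictionary can be derived from the folder names instead of being typed by hand. This is a minimal sketch, assuming every subfolder of the image root is one class; `build_class_index` and the path in the usage comment are hypothetical names, not part of the original answer:

```python
import os

def build_class_index(root_dir):
    """Map each subfolder name under root_dir to an integer label."""
    names = sorted(
        name for name in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, name))
    )
    return {name: index for index, name in enumerate(names)}

# Usage (hypothetical path):
# classes = build_class_index('Images/fruits')
```

Because the names are sorted alphabetically, the labels may differ from a hand-written dictionary (here Mango would get 1 and Orange 2), so use one mapping consistently.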

Then store the path to each class folder like this:

APPLE_DIR='Images/fruits/Apple'
ORANGE_DIR='Images/fruits/Orange'
MANGO_DIR='Images/fruits/Mango'

After this use this code to make data:

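# Target size for every image. 28x28 matches MNIST; this is an example value,
# so adjust it to whatever your model expects.
IMAGE_WIDTH = 28
IMAGE_HEIGHT = 28

def assign_label(img, fruit_type):
    # The label is simply the class index that was passed in.
    return fruit_type

X = []
y = []

def make_data(fruit_type, DIR):
    for img in tqdm(os.listdir(DIR)):
        label = assign_label(img, fruit_type)
        path = os.path.join(DIR, img)
        img = cv2.imread(path, cv2.IMREAD_COLOR)  # BGR array, or None if the file is unreadable
        img = cv2.resize(img, (IMAGE_WIDTH, IMAGE_HEIGHT))  # size is given as (width, height)

        X.append(np.array(img))
        y.append(str(label))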

If the function above raises an error when you run the calls below (for example because a folder contains a file that is not a readable image), wrap the loading in try-except and skip that file:

def make_data(fruit_type, DIR):
    for img in tqdm(os.listdir(DIR)):
        try:
            label = assign_label(img, fruit_type)
            path = os.path.join(DIR, img)
            img = cv2.imread(path, cv2.IMREAD_COLOR)
            img = cv2.resize(img, (IMAGE_WIDTH, IMAGE_HEIGHT))
        except Exception:
            # Skip the file instead of appending a bad entry to X and y.
            continue
        X.append(np.array(img))
        y.append(str(label))
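
A common cause of that error is a non-image file (a notes file, a thumbnail cache) sitting in a class folder. As an alternative to catching the exception, the file list can be filtered before loading. This is a rough sketch using only the standard library; `list_image_files` and the extension tuple are assumptions for illustration:

```python
import os

# Extensions treated as images; an assumption, extend it to match your data.
IMAGE_EXTENSIONS = ('.png', '.jpg', '.jpeg', '.bmp')

def list_image_files(directory):
    """Return sorted full paths of files in directory that look like images."""
    paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Keep regular files only, matching the extension case-insensitively.
        if os.path.isfile(path) and name.lower().endswith(IMAGE_EXTENSIONS):
            paths.append(path)
    return paths
```

Filtering by extension does not catch corrupted images, so a check on the loaded result is still worth keeping.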

Then to run the function use this code:

make_data(classes.get('Apple'), APPLE_DIR)
make_data(classes.get('Orange'), ORANGE_DIR)
make_data(classes.get('Mango'), MANGO_DIR)

All the images and their class labels are now stored in X and y. You can convert X to an array and check its shape as follows:

len(X)
X = np.array(X)
X = X/255
X.shape  # returns the shape (numberOfImages, HEIGHT, WIDTH, CHANNELS)
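
Since the question is about a GAN written for MNIST, note that `mnist.load_data()` returns grayscale images of shape (N, 28, 28), whereas the array above has three color channels. Here is a minimal NumPy sketch of collapsing the channels, assuming the images were read by OpenCV in BGR order and using the common 0.299/0.587/0.114 luminance weights; `bgr_to_gray` is a hypothetical helper, and the random array only stands in for real data:

```python
import numpy as np

def bgr_to_gray(X):
    """Collapse (N, H, W, 3) BGR images to (N, H, W) grayscale."""
    weights = np.array([0.114, 0.587, 0.299])  # B, G, R order
    return X @ weights

X = np.random.rand(4, 28, 28, 3)  # placeholder for the loaded, scaled images
X_train = bgr_to_gray(X)
print(X_train.shape)  # (4, 28, 28)
```

Reading the files with `cv2.IMREAD_GRAYSCALE` instead of `cv2.IMREAD_COLOR` would avoid this step altogether.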

The libraries used are:

import os
from tqdm import tqdm
from random import shuffle  
from zipfile import ZipFile
from PIL import Image
import cv2
import numpy as np
import pandas as pd
This approach should work for any custom image dataset that is organized into one folder per class.
