pmaster3321

Reputation: 21

Issue with numpy array shape when creating custom image dataset

I am trying to create a custom dataset for a deep learning project from JPG images, and I need to read them all into a single batch. With the code below, the resulting array has shape (100, 1, 224, 224, 3) instead of (100, 224, 224, 3). Any suggestions?

path = '/content/drive/My Drive/Dataset/Training'
X=[]
for img in os.listdir(path):
    pic = cv2.imread(os.path.join(path,img))
    pic = cv2.cvtColor(pic,cv2.COLOR_BGR2RGB)
    pic = cv2.resize(pic,(224,224))
    X.append([pic])
X=np.array(X)
print(X.shape)
(100, 1, 224, 224, 3)

Upvotes: 2

Views: 67

Answers (1)

Alexander Korovin

Reputation: 1475

In general, you can use NumPy's squeeze to remove singleton dimensions (axes of length 1) from an array.

For example:

print(np.squeeze(X).shape)

gives you:

(100, 224, 224, 3)
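
One caveat worth sketching: called without an axis argument, squeeze drops every axis of length 1, so a batch holding a single image would lose its batch dimension as well. Passing axis=1 removes only the extra axis. The zero-filled arrays below are stand-ins for the real images, not the actual dataset.

```python
import numpy as np

# Stand-in for the array from the question: 100 images wrapped in an extra axis.
X = np.zeros((100, 1, 224, 224, 3), dtype=np.uint8)

# Remove only axis 1; this raises ValueError if that axis is not of length 1.
X_fixed = np.squeeze(X, axis=1)
print(X_fixed.shape)

# With a single image, a bare squeeze would also drop the batch axis.
single = np.zeros((1, 1, 224, 224, 3), dtype=np.uint8)
bare_shape = np.squeeze(single).shape
safe_shape = np.squeeze(single, axis=1).shape
```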

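In your case, though, it is probably enough to write X.append(pic) instead of X.append([pic]) on line 7 of your code: wrapping pic in a list is what introduces the extra dimension (try it to check).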
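
To see the effect of the brackets, here is a small sketch with random arrays standing in for the decoded images. The helper name fake_image and the image count of 5 are made up for illustration. Appending [pic] puts each image inside a one-element list, which np.array turns into an axis of length 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_image():
    # Hypothetical stand-in for cv2.imread + cv2.resize: a 224x224 RGB image.
    return rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)

wrapped, plain = [], []
for _ in range(5):
    pic = fake_image()
    wrapped.append([pic])  # as in the question: adds a length-1 axis
    plain.append(pic)      # suggested fix

wrapped_shape = np.array(wrapped).shape
plain_shape = np.array(plain).shape
print(wrapped_shape, plain_shape)
```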

Tip: try to avoid Python lists when working with NumPy. Regarding @hpaulj's comment, you can use NumPy's concatenate function instead of a list:

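# initialization, like X = []: an empty batch with the same shape and dtype as one image
X = np.zeros((0, *pic.shape), dtype=pic.dtype)
...
# append: give pic a leading axis of length 1, then join along the batch axis
X = np.concatenate((X, pic[np.newaxis]), axis=0)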
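
Filled in with stand-in data, the concatenate pattern might look like the sketch below. The constant-valued images and the count of 4 are assumptions for illustration. dtype=np.uint8 is passed explicitly because np.zeros defaults to float64, which would otherwise upcast the whole batch. np.stack on a list is included only for comparison; it is not part of the suggestion above.

```python
import numpy as np

shape = (224, 224, 3)  # assumed image shape after resizing

# Start with an empty batch: zero images of the given shape.
X = np.zeros((0, *shape), dtype=np.uint8)

# Stand-in images, each filled with a different constant value.
pics = [np.full(shape, i, dtype=np.uint8) for i in range(4)]
for pic in pics:
    # Add a leading axis to pic, then append it along the batch axis.
    X = np.concatenate((X, pic[np.newaxis]), axis=0)

print(X.shape, X.dtype)

# For comparison: stack the list of images in one call.
X_stacked = np.stack(pics)
```

Note that each concatenate call copies the whole accumulated array, so for a large number of images it may be worth measuring this against building a list and converting once.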

Upvotes: 2
