Sarvagya Gupta

Reputation: 907

ImageNet dataset images not loading properly

I see that we can no longer download the ImageNet dataset directly through PyTorch. I get this error:

RuntimeError: The dataset is no longer publicly accessible. You need to download the archives externally and place them in the root directory.

So I went to the website and downloaded the 32x32 images (why is it so slow to download?). The training data came in batches, and when I loaded one of them to see what the images look like, I got this: [image]

Here's how I loaded the image:

import numpy as np
import matplotlib.pyplot as plt

file_1 = np.load("imagenet/Imagenet32_train_npz/train_data_batch_1.npz")
img = file_1['data'][0]
img = np.reshape(img, (32,32,3))
plt.imshow(img)
plt.show()

Am I doing something wrong, or has ImageNet itself changed? Let me know.

Upvotes: 0

Views: 1073

Answers (2)

aigrrrrl

Reputation: 1

The database was taken down for a while. There were some "issues" with the images of people in the data. The problem was generally kept quiet, but it has hopefully been repaired in later versions of ImageNet.

See:

https://www.artsy.net/article/artsy-editorial-online-image-database-will-remove-600-000-pictures-art-project-revealed-systems-racist-bias

https://paglen.studio/2020/04/29/imagenet-roulette/

Upvotes: 0

mohammadmahdi

Reputation: 21

I faced the same problem and found that the ImageNet data is stored channel-first. Instead of reshaping it into (32, 32, 3), you should reshape it into (3, 32, 32) and then transpose it so that the channel axis comes last. The full code would look like this:

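import numpy as np
file_1 = np.load("yourpath", allow_pickle=True)
# Each row is stored channel-first, so recover (N, 3, 32, 32) first
images = file_1["data"].reshape(-1, 3, 32, 32)
# Move the channel axis last: (N, 32, 32, 3), the layout plt.imshow expects
images = images.transpose(0, 2, 3, 1)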
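To see why the direct reshape scrambles the picture, here is a minimal sketch on synthetic data. It assumes, as described above, that each row holds all red values first, then all green, then all blue. The helper name rows_to_images and the fake "pure red" row are made up for illustration and are not part of the dataset or any library.

```python
import numpy as np

def rows_to_images(flat_rows, size=32):
    """Convert (N, 3*size*size) channel-first rows to (N, size, size, 3)."""
    flat_rows = np.asarray(flat_rows)
    images = flat_rows.reshape(-1, 3, size, size)  # (N, C, H, W)
    return images.transpose(0, 2, 3, 1)            # (N, H, W, C)

# Synthetic stand-in for one row of file_1["data"]: a pure red image,
# laid out as 1024 red values, then 1024 green, then 1024 blue.
red = np.full(32 * 32, 255, dtype=np.uint8)
zeros = np.zeros(32 * 32, dtype=np.uint8)
row = np.concatenate([red, zeros, zeros])

wrong = row.reshape(32, 32, 3)           # treats neighbouring values as R, G, B
right = rows_to_images(row[None, :])[0]  # reshape to channel-first, then transpose

print("wrong top-left pixel:", wrong[0, 0])  # mixes three red values into one pixel
print("right top-left pixel:", right[0, 0])  # red channel only
```

With the real file, passing file_1["data"] to the helper and calling plt.imshow on any element of the result should show a recognisable image, provided the layout assumption holds for your download.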

Upvotes: 1
