Reputation: 907
I see that we can no longer download the ImageNet dataset directly from PyTorch. I get this error:
RuntimeError: The dataset is no longer publicly accessible. You need to download the archives externally and place them in the root directory.
So I went to the website and downloaded the 32x32 images (why is it so slow to download?). The training data came in batches, and when I loaded one of them to see what the images look like, I get this:
Here's how I loaded the image:
import numpy as np
import matplotlib.pyplot as plt

file_1 = np.load("imagenet/Imagenet32_train_npz/train_data_batch_1.npz")
img = file_1['data'][0]
img = np.reshape(img, (32,32,3))
plt.imshow(img)
plt.show()
Am I doing something wrong, or has ImageNet itself changed? Let me know.
Upvotes: 0
Views: 1073
Reputation: 1
The database was taken down for a bit. There were some "issues" with the data related to humans. The problem was generally kept quiet, but it's been repaired, hopefully, in later versions of ImageNet.
See:
https://paglen.studio/2020/04/29/imagenet-roulette/
Upvotes: 0
Reputation: 21
I faced the same problem and found that the ImageNet data is stored channel-first. That means that instead of reshaping into (32, 32, 3), you should reshape into (3, 32, 32) and then transpose so the channel axis comes last. The full code would look like this:
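import numpy as np
file_1 = np.load("yourpath", allow_pickle=True)
# Rows are flat channel-first images: reshape to (N, 3, 32, 32) ...
images = file_1["data"].reshape(-1, 3, 32, 32)
# ... then move channels last: (N, 32, 32, 3), as plt.imshow expects.
images = images.transpose(0, 2, 3, 1)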
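To see why the direct reshape scrambles the picture, here is a minimal sketch on synthetic data rather than the real archive. It assumes, as described above, that each row holds 3072 values laid out as all red values, then all green, then all blue. The names `flat`, `wrong` and `right` are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for one row of file_1["data"]: a pure red 32x32 image
# stored channel-first (1024 red values, then 1024 green, then 1024 blue).
# This layout is an assumption taken from the answer, not read from a file.
red = np.full((32, 32), 255, dtype=np.uint8)
green = np.zeros((32, 32), dtype=np.uint8)
blue = np.zeros((32, 32), dtype=np.uint8)
flat = np.concatenate([red.ravel(), green.ravel(), blue.ravel()])

# Reshaping straight to (32, 32, 3) treats every three consecutive values
# as one pixel's R, G, B, so red-channel values leak into all three slots.
wrong = flat.reshape(32, 32, 3)

# Reshape to (3, 32, 32) first, then move the channel axis to the end.
right = flat.reshape(3, 32, 32).transpose(1, 2, 0)

print(wrong[0, 0])  # [255 255 255] -> first pixel comes out white
print(right[0, 0])  # [255   0   0] -> first pixel is red, as intended
```

Passing `right` to plt.imshow should give a solid red square, while `wrong` gives a white band on top of black, which is the same kind of scrambling seen in the question.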
Upvotes: 1