Reputation: 749
I am trying to run a tutorial based on MNIST data on a cluster. The node where the training script runs doesn't have internet access, so I manually placed the MNIST dataset in the desired directory, but I get a "Dataset not found" error.
I have tried this answer, but it doesn't resolve my problem.
Below are my code modifications:
import horovod.torch as hvd
train_dataset = \
    datasets.MNIST('/scratch/netra/MNIST/processed/training.pt-%d' % hvd.rank(), train=True, download=True,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))
test_dataset = \
    datasets.MNIST('/scratch/netra/MNIST/processed/test.pt-%d' % hvd.rank(), train=False,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))
How can I resolve this?
Upvotes: 0
Views: 843
Reputation: 1779
If the above does not work, try putting those .pt files in a folder called data (that is, ./data) in your current working directory:
import os
CURR_DIR = os.getcwd()
print(CURR_DIR)
train = datasets.MNIST(root='./data', download=False, train=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))]))
# works
train = datasets.MNIST(root=CURR_DIR + '\\data',
                       download=False, train=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))]))
# works
# same files also in this folder
train = datasets.MNIST(root=CURR_DIR + '\\processed', download=False, train=True,
                       transform=transforms.Compose([
                           transforms.ToTensor(),
                           transforms.Normalize((0.1307,), (0.3081,))
                       ]))
# Dataset not found
Interestingly, the folder in the last example is precisely where the torchvision MNIST dataset class places the .pt files it generates.
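The backslash separators in the snippets above are Windows-specific, so on a Linux cluster node the same root could be built portably with os.path.join. A minimal sketch (the helper name data_root is hypothetical and not part of torchvision):

```python
import os


def data_root(base_dir, folder="data"):
    """Join base_dir and folder using the path separator of the current OS."""
    return os.path.join(base_dir, folder)


# Build the root relative to the current working directory.
root = data_root(os.getcwd())
print(root)
```

The resulting string can then be passed as the root argument of datasets.MNIST in place of CURR_DIR + '\\data'.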
Upvotes: 1
Reputation: 8981
You have to specify a root folder, not a full path to the processed file:
root (string): Root directory of dataset where MNIST/processed/training.pt and MNIST/processed/test.pt exist.
In your case:
root is /scratch/netra
Thus,
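# root is the folder that contains MNIST/processed/. No per-rank suffix is
# used, because the files were placed by hand in one shared location.
# download=False because the node has no internet access.
train_dataset = \
    datasets.MNIST('/scratch/netra', train=True, download=False,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))
test_dataset = \
    datasets.MNIST('/scratch/netra', train=False, download=False,
                   transform=transforms.Compose([
                       transforms.ToTensor(),
                       transforms.Normalize((0.1307,), (0.3081,))
                   ]))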
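Before constructing the dataset, it may help to confirm that the files sit where the root argument implies. The following is a rough sketch using only the standard library. It assumes the MNIST/processed layout quoted from the docstring above (other torchvision versions may expect a different layout), and the helper name missing_mnist_files is hypothetical:

```python
from pathlib import Path

# Assumption: the layout quoted in the docstring above, relative to root.
EXPECTED_FILES = (
    "MNIST/processed/training.pt",
    "MNIST/processed/test.pt",
)


def missing_mnist_files(root):
    """Return the expected files that are not present under root."""
    root = Path(root)
    return [str(root / rel) for rel in EXPECTED_FILES
            if not (root / rel).is_file()]


# Report anything missing under the root used in this answer.
missing = missing_mnist_files("/scratch/netra")
if missing:
    print("Missing under root:", missing)
else:
    print("All expected MNIST files found.")
```

If the list is empty, the root points one level above the MNIST folder, as the docstring requires.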
Upvotes: 1