Reputation: 43
I am running some existing examples from the PyTorch tutorial website. I am working on a CPU-only machine (no GPU).
When I run the program, the error below is raised. Is it caused by running on the CPU, or is it a setup issue? How can I solve it?
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 15876, 2756) exited unexpectedly
import torch
import torch.nn.functional as F
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import matplotlib.pyplot as plt
import numpy as np
from torch.utils.tensorboard import SummaryWriter
from torch.utils.data import DataLoader
from torchvision import datasets
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(device)
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5,), (0.5,))]
)
# Store separate training and validation splits in data
training_set = datasets.FashionMNIST(
    root='data',
    train=True,
    download=True,
    transform=transform
)
validation_set = datasets.FashionMNIST(
    root='data',
    train=False,
    download=True,
    transform=transform
)
training_loader = DataLoader(training_set, batch_size=4, shuffle=True, num_workers=2)
validation_loader = DataLoader(validation_set, batch_size=4, shuffle=False, num_workers=2)
classes = ('T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
           'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle Boot')
def matplotlib_imshow(img, one_channel=False):
    if one_channel:
        img = img.mean(dim=0)
    img = img / 2 + 0.5  # unnormalize
    npimg = img.numpy()
    if one_channel:
        plt.imshow(npimg, cmap="Greys")
    else:
        plt.imshow(np.transpose(npimg, (1, 2, 0)))
dataiter = iter(training_loader)
images, labels = next(dataiter)  # dataiter.next() is not available in recent PyTorch versions
img_grid = torchvision.utils.make_grid(images)
matplotlib_imshow(img_grid, one_channel=True)
Upvotes: 4
Views: 12207
Reputation: 1
I had the same problem (RuntimeError: DataLoader worker (pid(s) 78192) exited unexpectedly) when I ran the PyTorch test code. The Windows 10 cmd window told me that the current disk space was not enough. You can increase the virtual disk space on the drive where the test code is located. I hope this is useful for you.
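To check whether low disk space is a plausible cause before changing any system settings, something like the following might help. It is only a sketch: the helper name free_gib is made up for illustration, and it reports free space on the drive holding a given path, which may not be exactly what the Windows message was referring to.

```python
import shutil


def free_gib(path="."):
    """Return free space, in GiB, on the drive that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


if __name__ == "__main__":
    # Check the drive that holds the current working directory.
    print(f"Free space: {free_gib('.'):.1f} GiB")
```

If the number is very small, freeing space on that drive is worth trying first.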
Upvotes: 0
Reputation: 124
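Set num_workers=0. On Windows, multiprocessing starts worker processes with spawn rather than fork, so setting num_workers > 0 often leads to this error, typically when the script's entry point is not protected by an if __name__ == '__main__': guard. This is expected.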
There is an issue on Github too:
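A minimal sketch of the num_workers=0 approach, together with the entry-point guard that Windows needs if you later raise num_workers. The helper names make_loader and main are made up for illustration, and a small synthetic TensorDataset stands in for FashionMNIST so nothing has to be downloaded.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def make_loader(num_workers=0):
    # Synthetic stand-in for FashionMNIST: 16 grayscale 28x28 images, labels 0-9.
    images = torch.zeros(16, 1, 28, 28)
    labels = torch.arange(16) % 10
    dataset = TensorDataset(images, labels)
    return DataLoader(dataset, batch_size=4, shuffle=False, num_workers=num_workers)


def main():
    # num_workers=0 loads batches in the main process, so no worker process is started.
    loader = make_loader(num_workers=0)
    images, labels = next(iter(loader))
    print(images.shape, labels.tolist())


# With spawn (the Windows start method), each worker re-imports this module,
# so code that iterates a DataLoader with num_workers > 0 belongs inside this guard.
if __name__ == '__main__':
    main()
```

In the question's script, the same idea would mean passing num_workers=0 to both DataLoader calls, or moving the top-level code into a function called from under the guard.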
Upvotes: 5
Reputation: 1674
You first need to figure out why the DataLoader worker crashed. A common reason is running out of memory. On Linux, you can check this by running dmesg -T after your script crashes and seeing whether the system killed any python process.
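If you would rather not read the whole kernel log by eye, a small filter along these lines could help. This is a sketch under assumptions: the function names are made up, the keyword list is a guess at what OOM-killer messages typically contain, and dmesg is Linux-only and may need elevated privileges.

```python
import subprocess


def find_oom_lines(log_text):
    """Return kernel log lines that look like OOM-killer messages."""
    # Assumed keywords; adjust them if your kernel words its messages differently.
    keywords = ("out of memory", "oom-kill", "killed process")
    return [line for line in log_text.splitlines()
            if any(keyword in line.lower() for keyword in keywords)]


def read_kernel_log():
    """Run `dmesg -T` and return its output, or '' if it cannot be run."""
    try:
        result = subprocess.run(["dmesg", "-T"], capture_output=True,
                                text=True, check=False)
    except OSError:
        # dmesg is missing (e.g. on Windows) or could not be executed.
        return ""
    return result.stdout


if __name__ == "__main__":
    for line in find_oom_lines(read_kernel_log()):
        print(line)
```

If a matching line mentions one of the pids from the RuntimeError, memory is the likely culprit, and a smaller batch_size or fewer workers is the first thing to try.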
Upvotes: 2