Reputation: 71
Below is the code:
import torch
import torchvision
from torchvision import transforms, datasets
#Establishing the batch size
Batch_size = 10
#Downloading the train data
train_mnist = datasets.MNIST(root="./data", train=True, download=True,
                             transform=transforms.Compose([transforms.ToTensor()]))
#Passing the train data into a Dataloader
train_set = torch.utils.data.DataLoader(train_mnist, batch_size=Batch_size, shuffle=True)
#Downloading the test data
test_mnist = datasets.MNIST(root="./data", train=True, download=True,
                            transform=transforms.Compose([transforms.ToTensor()]))
#Passing the test data into a Dataloader
test_set = torch.utils.data.DataLoader(test_mnist, batch_size=Batch_size, shuffle=True)
#Building the network
import torch.nn as nn
import torch.nn.functional as F
class Netwk(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 64)
        self.fc2 = nn.Linear(64, 64)
        self.fc3 = nn.Linear(64, 64)
        self.fc4 = nn.Linear(64, 10)
    #nn.Module runs forward() when the network is called as net(x)
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)
net = Netwk()
print(net)
#Creating optimization for the loss
import torch.optim as optim
optimizer = optim.Adam(net.parameters(), lr=0.001)
EPOCHS = 3
for epoch in range(EPOCHS):
    for data in train_set:
        X, y = data
        net.zero_grad()
        output = net(X.view(28*28))
        loss = F.nll_loss(output, y)
        loss.backward()
        optimizer.step()
    print(loss)
The error I get:
RuntimeError Traceback (most recent call last)
<ipython-input-58-b92f3c4f7059> in <module>()
10 X, y = data
11 net.zero_grad()
---> 12 output = net(X.view(28*28))
13 loss = F.nll_loss(output, y)
14 loss.backward()
RuntimeError: shape '[784]' is invalid for input of size 7840
I have tried to get around it, but I can't work out what is wrong. From what I found by googling, it seems my dimensions are the problem, but I don't know how to get the right dimensions, if that is the issue at all.
Upvotes: 0
Views: 1494
Reputation: 785
You are forgetting about the batch dimension here. In the for loop, the line X, y = data unpacks one item from the DataLoader. You may have assumed that X is a single image from train_set, but X is actually a batch of images (10 of them, since you set Batch_size to 10), and y is likewise a batch of labels. That means X has a shape of 10x1x28x28 (batch, channels, height, width). The call net(X.view(28*28)) therefore tries to squash the whole batch of 10 images, 7840 values in total, into a single vector of 28*28 = 784 elements, which is why the error says that shape [784] is invalid for an input of size 7840. What you want instead is net(X.view(-1, 28*28)). The -1 tells PyTorch to infer that dimension, so the batch size is preserved whatever it happens to be, and the shape of the input becomes 10x784, which is what the first linear layer expects.
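A minimal sketch of the reshape, assuming a random tensor as a stand-in for one batch from the DataLoader (the random data and the standalone fc1 layer are only for illustration, not part of the original code):

```python
import torch
import torch.nn as nn

# Stand-in for one batch from the DataLoader: 10 grayscale 28x28 images.
X = torch.rand(10, 1, 28, 28)

# Asking for a flat 784-element view fails: the batch holds 10 * 784 = 7840 values.
try:
    X.view(28 * 28)
except RuntimeError as err:
    print("flat view failed:", err)

# -1 lets PyTorch infer the batch dimension from the total number of elements.
flat = X.view(-1, 28 * 28)
print(flat.shape)  # torch.Size([10, 784])

# A layer with 784 input features now accepts the whole batch at once.
fc1 = nn.Linear(28 * 28, 64)
out = fc1(flat)
print(out.shape)  # torch.Size([10, 64])
```

In the training loop, the same change means calling net(X.view(-1, 28*28)) in place of net(X.view(28*28)).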
Another thing I noticed, possibly a typo: your test_set and train_set are actually built from the same data. To load the proper test split, pass train=False when creating test_mnist.
Upvotes: 1