Reputation: 4479
I have the following code:
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import scipy.io

folder = 'small/'
mat = scipy.io.loadmat(folder+'INISTATE.mat');
ini_state = np.float32(mat['ini_state']);
ini_state = torch.from_numpy(ini_state);
ini_state = ini_state.cuda();

mat = scipy.io.loadmat(folder+'TARGET.mat');
target = np.float32(mat['target']);
target = torch.from_numpy(target);
target = target.cuda();

class MLPNet(nn.Module):
    def __init__(self):
        super(MLPNet, self).__init__()
        self.fc1 = nn.Linear(3, 64)
        self.fc2 = nn.Linear(64, 128)
        self.fc3 = nn.Linear(128, 128)
        self.fc4 = nn.Linear(128, 41)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return x

    def name(self):
        return "MLP"

model = MLPNet();
model = model.cuda();

criterion = nn.MSELoss();
criterion = criterion.cuda();

learning_rate = 0.001;
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

batch_size = 20
iter_size = int(target.size(0)/batch_size)
print(iter_size)

for epoch in range(50):
    for i in range(iter_size):
        start = i*batch_size;
        end = (i+1)*batch_size-1;
        samples = ini_state[start:end,:];
        labels = target[start:end,:];

        optimizer.zero_grad()  # zero the gradient buffer
        outputs = model(samples)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        if (i+1) % 500 == 0:
            print("Epoch %s, batch %s, loss %s" % (epoch, i, loss))

    if (epoch+1) % 7 == 0:
        for g in optimizer.param_groups:
            g['lr'] = g['lr']*0.1;
But when I train this simple MLP, the CPU usage is around 100% while the GPU usage is only around 10%. What is preventing the training from using the GPU?
Upvotes: 1
Views: 1640
Reputation: 306
Your model actually does run on the GPU rather than the CPU. The reason for the low GPU utilization is that both your model and your batch size are small, so the computational cost per step is low. You may try increasing the batch size to around 1000, and the GPU utilization should go up. In fact, PyTorch prevents operations that mix CPU and GPU data, e.g., you can't multiply a GPU tensor by a CPU tensor. So it is unlikely that part of your network runs on the CPU and the rest on the GPU, unless you deliberately design it that way.
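As a quick sanity check, you can confirm the placement and try a larger batch along these lines (a minimal sketch using the variable names from your code; the batch size of 1000 is just an illustrative value):

# Confirm where the parameters and data actually live.
print(next(model.parameters()).is_cuda)   # expected: True
print(ini_state.is_cuda, target.is_cuda)  # expected: True True

# A larger batch keeps the GPU busier per forward/backward pass.
batch_size = 1000
iter_size = int(target.size(0) / batch_size)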
By the way, data shuffling is necessary for neural networks. As your are using mini-batch training, in each iteration you are hoping that the mini batch approximates the whole dataset. Without data shuffling, it is likely that samples in a mini batch are highly correlated, which leads to biased estimation of parameter update. The data loader provided by PyTorch can help you do the data shuffling.
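For example, a minimal sketch of shuffled mini-batching with your tensors (TensorDataset and DataLoader come from torch.utils.data; since ini_state and target already live on the GPU, keep the default num_workers=0):

from torch.utils.data import TensorDataset, DataLoader

# Wrap the existing GPU tensors and let the loader reshuffle them every epoch.
dataset = TensorDataset(ini_state, target)
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

for epoch in range(50):
    for samples, labels in loader:
        optimizer.zero_grad()
        outputs = model(samples)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()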
Upvotes: 1