Reputation: 143
I have some NumPy arrays that I would like to wrap in PyTorch's TensorDataset so they can be passed to a DataLoader for training a neural network. These are the shapes of my train and test features and targets:
Feature train shape:
(2338834, 21)
Target train shape:
(2338834, 3)
Feature test shape:
(662343, 21)
Target test shape:
(662343, 3)
I am trying to perform this command:
train = TensorDataset(input_train, output_train)
However, I get this error:
assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors), "Size mismatch between tensors"
TypeError: 'int' object is not callable
I am fairly sure the first dimensions of the NumPy arrays match, for both the train and test sets. Here is the code I am trying to run:
# Passing numpy arrays to DataLoader
train = TensorDataset(input_train, output_train)
test = TensorDataset(input_test, output_test)
train_loader = DataLoader(dataset = train, batch_size = batch_size, shuffle = True)
test_loader = DataLoader(dataset = test, batch_size = batch_size, shuffle = True)
Upvotes: 1
Views: 2794
Reputation: 143
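I was able to fix this by converting the NumPy arrays to tensors first. TensorDataset calls tensor.size(0) on each argument; on a NumPy array, size is an integer attribute rather than a method, which is why the call fails with "'int' object is not callable" instead of reporting a real size mismatch: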
import torch
from torch.utils.data import DataLoader, TensorDataset

# Convert the NumPy arrays to tensors (torch.tensor copies the data)
features_train_tensor = torch.tensor(input_train)
target_train_tensor = torch.tensor(output_train)
features_test_tensor = torch.tensor(input_test)
target_test_tensor = torch.tensor(output_test)
# Wrap the tensors in datasets and pass them to DataLoader (batch_size is defined elsewhere)
train = TensorDataset(features_train_tensor, target_train_tensor)
test = TensorDataset(features_test_tensor, target_test_tensor)
train_loader = DataLoader(dataset=train, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(dataset=test, batch_size=batch_size, shuffle=True)
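For anyone who wants to reproduce this on a small scale, here is a minimal sketch of the same conversion. The tiny random arrays and the batch size of 4 are stand-ins I made up, not the real data. It uses torch.from_numpy, which shares memory with the array rather than copying it, and then casts to float32 on the assumption that the model uses PyTorch's default float32 weights; skip the cast if your data already has the dtype you need.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in data; the real arrays are (2338834, 21) and (2338834, 3).
input_train = np.random.rand(8, 21)
output_train = np.random.rand(8, 3)

# On a NumPy array, .size is an int attribute (total element count), not a method,
# so the tensor.size(0) call inside TensorDataset raises the TypeError.
assert isinstance(input_train.size, int)

# from_numpy avoids a copy; .float() then converts float64 to float32 (this does copy).
features_train_tensor = torch.from_numpy(input_train).float()
target_train_tensor = torch.from_numpy(output_train).float()

train = TensorDataset(features_train_tensor, target_train_tensor)
train_loader = DataLoader(dataset=train, batch_size=4, shuffle=True)

# Pull one batch to confirm features and targets stay paired.
x_batch, y_batch = next(iter(train_loader))
print(x_batch.shape, y_batch.shape)
```

With arrays as large as the ones in the question, avoiding the extra copy made by torch.tensor may matter for memory, but that depends on the dtype of the original arrays.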
Upvotes: 5