user10335564

Reputation: 143

TensorDataset error with dimensions and 'int' object is not callable?

I have some numpy arrays that I would like to pass into TensorDataset from PyTorch, so they can be passed into a DataLoader for training a neural network. These are the dimensions of my train and test features and targets:

Feature train shape:
(2338834, 21)
Target train shape:
(2338834, 3)
Feature test shape:
(662343, 21)
Target test shape:
(662343, 3)

I am trying to perform this command:

train = TensorDataset(input_train, output_train)

However, I get this error:

assert all(tensors[0].size(0) == tensor.size(0) for tensor in tensors), "Size mismatch between tensors"
TypeError: 'int' object is not callable

However, I am pretty sure the first dimensions of the numpy arrays match for both train and test. Here is the code I am trying to run:

    # Passing numpy arrays to the DataLoader
    train = TensorDataset(input_train, output_train)
    test = TensorDataset(input_test, output_test)
    train_loader = DataLoader(dataset = train, batch_size = batch_size, shuffle = True)
    test_loader = DataLoader(dataset = test, batch_size = batch_size, shuffle = True)

Upvotes: 1

Views: 2794

Answers (1)

user10335564

Reputation: 143

TensorDataset expects torch tensors; a numpy array's .size is an int attribute rather than a method, which is why the size check fails. I was able to get past this by converting to tensors first:

    features_train_tensor = torch.tensor(input_train)
    target_train_tensor = torch.tensor(output_train)
    features_test_tensor = torch.tensor(input_test)
    target_test_tensor = torch.tensor(output_test)

    # Passing the converted tensors to the DataLoader
    train = TensorDataset(features_train_tensor, target_train_tensor)
    test = TensorDataset(features_test_tensor, target_test_tensor)
    train_loader = DataLoader(dataset = train, batch_size = batch_size, shuffle = True)
    test_loader = DataLoader(dataset = test, batch_size = batch_size, shuffle = True)
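For context on the error itself: a numpy array's .size is an int (the total element count), not a method, so TensorDataset's internal tensors[0].size(0) check raises TypeError: 'int' object is not callable when it is handed raw numpy arrays. Here is a minimal sketch, using small hypothetical arrays in place of input_train / output_train, that shows the difference and an alternative conversion via torch.from_numpy:

    import numpy as np
    import torch
    from torch.utils.data import TensorDataset, DataLoader

    # Small hypothetical arrays standing in for input_train / output_train
    input_train = np.random.rand(10, 21).astype(np.float32)
    output_train = np.random.rand(10, 3).astype(np.float32)

    # numpy: .size is an int attribute, so input_train.size(0) would fail
    print(input_train.size)                    # 210
    # torch: .size(0) is a method returning the first dimension
    print(torch.tensor(input_train).size(0))   # 10

    # torch.from_numpy shares memory with the array; torch.tensor copies it
    train = TensorDataset(torch.from_numpy(input_train),
                          torch.from_numpy(output_train))
    train_loader = DataLoader(dataset=train, batch_size=4, shuffle=True)

Either conversion works; torch.from_numpy is just cheaper for large arrays since no copy is made.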

Upvotes: 5
