Mattpats

Reputation: 534

RNN - RuntimeError: input must have 3 dimensions, got 2

I’m getting the following error:

RuntimeError: input must have 3 dimensions, got 2

I have a single feature column that I am trying to feed into a GRU neural net.

Below are my data loader and neural net. I have also included the output of my data loader when I retrieve a batch of data.

What am I doing wrong?

import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def batch_data(feature1, sequence_length, batch_size):
    """
    Batch the neural network data using DataLoader
    :param feature1: the single feature column
    :param sequence_length: The sequence length of each batch
    :param batch_size: The size of each batch; the number of sequences in a batch
    :return: DataLoader with batched data
    """
    # total number of batches we can make
    n_batches = len(feature1) // batch_size

    # Keep only enough values to make full batches
    feature1 = feature1[:n_batches * batch_size]

    y_len = len(feature1) - sequence_length

    x, y = [], []
    for idx in range(0, y_len):
        idx_end = sequence_length + idx
        x_batch = feature1[idx:idx_end]
        x.append(x_batch)
        # only making predictions after the last item in the sequence
        batch_y = feature1[idx_end]
        y.append(batch_y)

    # create tensor datasets
    data = TensorDataset(torch.from_numpy(np.asarray(x)), torch.from_numpy(np.asarray(y)))

    data_loader = DataLoader(data, shuffle=False, batch_size=batch_size)

    # return a dataloader
    return data_loader



# test dataloader on subset of actual data

test_text = data_subset_b
t_loader = batch_data(test_text, sequence_length=5, batch_size=10)
 
data_iter = iter(t_loader)
sample_x, sample_y = next(data_iter)
 
print(sample_x.shape)
print(sample_x)
print()
print(sample_y.shape)
print(sample_y)

When I pass in data, the following batch is generated:

torch.Size([10, 5])
tensor([[ 0.0045, 0.0040, -0.0008, 0.0005, -0.0012],
        [ 0.0040, -0.0008, 0.0005, -0.0012, 0.0000],
        [-0.0008, 0.0005, -0.0012, 0.0000, -0.0015],
        [ 0.0005, -0.0012, 0.0000, -0.0015, 0.0008],
        [-0.0012, 0.0000, -0.0015, 0.0008, 0.0000],
        [ 0.0000, -0.0015, 0.0008, 0.0000, 0.0000],
        [-0.0015, 0.0008, 0.0000, 0.0000, -0.0008],
        [ 0.0008, 0.0000, 0.0000, -0.0008, -0.0039],
        [ 0.0000, 0.0000, -0.0008, -0.0039, -0.0026],
        [ 0.0000, -0.0008, -0.0039, -0.0026, -0.0082]], dtype=torch.float64)

torch.Size([10])
tensor([ 0.0000, -0.0015, 0.0008, 0.0000, 0.0000, -0.0008, -0.0039, -0.0026,
        -0.0082, 0.0078], dtype=torch.float64)

Upvotes: 1

Views: 4043

Answers (2)

Adam Oudad

Reputation: 347

As the error suggests, a GRU expects a three-dimensional input tensor. Its shape is (batch_size, seq_len, input_size) when the layer is created with batch_first=True; with the default batch_first=False it is (seq_len, batch_size, input_size).

But you are feeding a tensor of shape (10, 5). Since your input has a single feature, you should add a trailing dimension of size 1 for input_size. This can be done like this:

sample_x.unsqueeze(-1)
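
To make the fix concrete, here is a minimal sketch. The question does not show the model, so the GRU below (one input feature, hidden_size=8, batch_first=True) is an assumption, and the zero tensor only stands in for a real batch. The batch is also cast to float32, because the loader yields float64 while nn.GRU weights are float32 by default.

```python
import torch
import torch.nn as nn

# Stand-in for one batch from the loader: 10 sequences of 5 time steps, float64
sample_x = torch.zeros(10, 5, dtype=torch.float64)

# Assumed model (not shown in the question): one feature, arbitrary hidden size
gru = nn.GRU(input_size=1, hidden_size=8, batch_first=True)

# Add the trailing feature dimension and match the GRU's float32 weights
gru_input = sample_x.unsqueeze(-1).float()

output, hidden = gru(gru_input)
print(gru_input.shape)  # torch.Size([10, 5, 1])
print(output.shape)     # torch.Size([10, 5, 8])
print(hidden.shape)     # torch.Size([1, 10, 8])
```

Note that unsqueeze returns a new tensor rather than modifying sample_x in place, so the result has to be assigned or passed on directly.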

Upvotes: 4

berkay

Reputation: 134

The error message itself points to the problem. GRU, like the other recurrent layers built on RNNBase, expects a three-dimensional input; with batch_first=True the shape is:

 (#batch,#number_of_timesteps,#number_of_features)

In your case you have 1 feature and 5 timesteps, so in your data loader you need to expand x to (#batch, 5, 1).
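
One way to do the expansion inside the data loader is sketched below. The feature values are a made-up stand-in for the real column, and the variable names are illustrative rather than taken from the question; the only essential step is adding the feature axis before building the TensorDataset.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for the real single-feature column
feature1 = np.linspace(0.0, 1.0, 30)
sequence_length, batch_size = 5, 10

# Sliding windows, plus the value that follows each window as the target
n = len(feature1) - sequence_length
x = np.stack([feature1[i:i + sequence_length] for i in range(n)])
y = feature1[sequence_length:]

# Add the feature axis, (n, 5) -> (n, 5, 1), and convert to float32
x_t = torch.from_numpy(x[..., np.newaxis]).float()
y_t = torch.from_numpy(y).float()

loader = DataLoader(TensorDataset(x_t, y_t), shuffle=False, batch_size=batch_size)
sample_x, sample_y = next(iter(loader))
print(sample_x.shape)  # torch.Size([10, 5, 1])
print(sample_y.shape)  # torch.Size([10])
```

Batches in this layout match a GRU created with batch_first=True; with the default layout the first two axes would need to be swapped.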

Upvotes: 0
