Nicolas Gervais

Reputation: 36624

Tensorflow 2 LSTM: InvalidArgumentError: Shapes of all inputs must match

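Calling my subclassed bidirectional LSTM on an element of a tf.data.Dataset raises the InvalidArgumentError shown at the end of the code below. I think it might be because of the time steps.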

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import pandas_datareader.data as web
import datetime as dt
import numpy as np
from tensorflow.keras import Model
import tensorflow as tf
from tensorflow.keras.layers import LSTM, Bidirectional, Dense
from tensorflow.keras.activations import relu

start = dt.datetime(2018,1,1)
end = dt.datetime(2019,1,1)

df = web.DataReader(name=['IBM', 'MSFT', 'NKE'],
                    data_source='yahoo',
                    start=start,
                    end=end).reset_index()['Close']

values = df.values

average_3_day = df.NKE.rolling(3).mean().values
previous_1_day = df.NKE.shift(-1).values

naive_3_day = tf.keras.metrics.mean_absolute_error(df['NKE'].values[2:], average_3_day[2:]).numpy()
naive_1_day = tf.keras.metrics.mean_absolute_error(df['NKE'].values[:-1], previous_1_day[:-1]).numpy()
print('The benchmark score of 3 day moving average is {:.4f}.'.format(naive_3_day))
print('The benchmark score of the previous day is {:.4f}.'.format(naive_1_day))

for val, fut in zip(df['NKE'].values[:10], previous_1_day[:10]):
    print(f'Value: {val:>6.3f} Future: {fut:>6.3f}')

MEAN = np.mean(values[:200, :], axis=0)
STD = np.std(values[:200, :], axis=0)

data = (values - MEAN)/STD


def multivariate_data(dataset, target, start_index, end_index, history_size,
                      target_size, step, single_step=False):
  data, labels = [], []
  start_index = start_index + history_size
  if end_index is None:
    end_index = len(dataset) - target_size
  for i in range(start_index, end_index):
    indices = range(i-history_size, i, step)
    data.append(dataset[indices])
    if single_step:
      labels.append(target[i+target_size])
    else:
      labels.append(target[i:i+target_size])
  return np.array(data), np.array(labels)



PAST_HISTORY = 5
FUTURE_TARGET = 3
STEP = 5

x_train, y_train = multivariate_data(dataset=data,
                                     target=data[:, -1],
                                     start_index=0,
                                     end_index=200,
                                     history_size=PAST_HISTORY,
                                     target_size=FUTURE_TARGET,
                                     step=STEP)

x_test, y_test = multivariate_data(dataset=data,
                                   target=data[:, -1],
                                   start_index=200,
                                   end_index=None,
                                   history_size=PAST_HISTORY,
                                   target_size=FUTURE_TARGET,
                                   step=STEP)

train_data = tf.data.Dataset.from_tensors((x_train, y_train)).shuffle(len(x_train)).take(-1)
test_data = tf.data.Dataset.from_tensors((x_test, y_test)).shuffle(len(x_test)).take(-1)
print(next(iter(train_data))[0].shape)
print(next(iter(train_data))[1].shape)

class BiDirectionalLSTM(Model):
    def __init__(self):
        super(BiDirectionalLSTM, self).__init__()
        self.bidr = Bidirectional(LSTM(32, activation=None, return_sequences=True))
        self.dense = Dense(3)

    def call(self, inputs, training=None, mask=None):
        x = self.bidr(relu(inputs, alpha=2e-1))
        x = self.dense(x)
        return x


bidirec = BiDirectionalLSTM()

bidirec(next(iter(train_data)))

tensorflow.python.framework.errors_impl.InvalidArgumentError: Shapes of all inputs must match: values[0].shape = [1,192,2,3] != values[1].shape = [1,192,3] [Op:Pack] name: feat

Upvotes: 1

Views: 2035

Answers (1)

Zabir Al Nazi Nabil

Reputation: 11198

First of all, your x_train.shape is (195, 1, 3) and your y_train.shape is (195, 3).
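To see where those shapes come from, here is a minimal NumPy-only sketch of the windowing step. It assumes random stand-in data (250 rows, 3 columns) instead of the Yahoo download, and make_windows is a hypothetical, simplified version of multivariate_data that uses slicing in place of the range index.

```python
import numpy as np

def make_windows(dataset, target, start, end, history_size, target_size, step):
    """Slice `dataset` into input windows and `target` into label windows."""
    xs, ys = [], []
    for i in range(start + history_size, end):
        # Rows i-history_size .. i-1, keeping every `step`-th row.
        xs.append(dataset[i - history_size:i:step])
        # The next `target_size` values of the target column.
        ys.append(target[i:i + target_size])
    return np.array(xs), np.array(ys)

# Stand-in for the normalized price table (assumed sizes, random values).
rng = np.random.default_rng(0)
data = rng.normal(size=(250, 3))

# Same settings as the question: history of 5, step of 5, 3 future values.
x_train, y_train = make_windows(data, data[:, -1], 0, 200,
                                history_size=5, target_size=3, step=5)
print(x_train.shape, y_train.shape)   # (195, 1, 3) (195, 3)

# With step=1 every row of the 5-day history is kept.
x_dense, _ = make_windows(data, data[:, -1], 0, 200,
                          history_size=5, target_size=3, step=1)
print(x_dense.shape)                  # (195, 5, 3)
```

With a history of 5 and a step of 5, each window keeps only one row, which is why the time-step axis has length 1.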

So your target is 2-D, but you set return_sequences=True in the BiLSTM layer, which makes the layer return one vector per time step, i.e. a 3-D output.

ref: https://keras.io/layers/recurrent/
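The effect of return_sequences on the output rank can be checked on a dummy batch. This is a small sketch assuming TensorFlow 2 is installed; the batch of 4 samples is made up, and only the last two axes mirror x_train.

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Bidirectional

# Dummy batch shaped like x_train: 4 samples, 1 time step, 3 features.
x = np.random.rand(4, 1, 3).astype("float32")

# One output vector per time step: rank-3 result.
per_step = Bidirectional(LSTM(32, return_sequences=True))(x)
# Only the final output vector: rank-2 result.
last_only = Bidirectional(LSTM(32, return_sequences=False))(x)

print(per_step.shape)    # (4, 1, 64)
print(last_only.shape)   # (4, 64)
```

The last axis is 64 because Bidirectional concatenates the forward and backward outputs by default. A Dense(3) layer on top only changes that last axis, so only the return_sequences=False variant ends up with the same rank as the (195, 3) labels.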

So fix this first by setting return_sequences=False:

class BiDirectionalLSTM(Model):
    def __init__(self):
        super(BiDirectionalLSTM, self).__init__()
        self.bidr = Bidirectional(LSTM(32, activation=None, return_sequences=False))
        self.dense = Dense(3)

    def call(self, inputs, training=None, mask=None):
        x = self.bidr(relu(inputs, alpha=2e-1))
        x = self.dense(x)
        return x

Secondly, you're passing next(iter(train_data)) to the model, but the Model object doesn't expect that.

train_data yields a tuple of two elements, x_train and y_train (the labels), while the model's call takes only the inputs. Calling bidirec(x_train) runs fine.

print(next(iter(train_data))[0].shape)
print(next(iter(train_data))[1].shape)

As these two prints show, the elements have different shapes. Passing only the first one runs fine:

bidirec(next(iter(train_data))[0]) # only the training input data, not labels

This call returns the model's predictions.
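The Op:Pack in the error message most likely comes from TensorFlow trying to pack the (inputs, labels) tuple into a single tensor. NumPy's np.stack refuses mismatched shapes in a similar way, so the failure can be illustrated without TensorFlow. This is an analogy under that assumption, not the actual TensorFlow code path, and the zero-filled arrays are placeholders.

```python
import numpy as np

features = np.zeros((195, 1, 3))   # shaped like x_train
labels = np.zeros((195, 3))        # shaped like y_train

try:
    # Analogous to packing the whole tuple into one array.
    np.stack((features, labels))
    stacked = True
except ValueError as err:
    stacked = False
    print("stack failed:", err)

# Unpacking the pair first avoids the problem: only `inputs` would go to the model.
inputs, targets = (features, labels)
print(inputs.shape, targets.shape)
```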

To train your model, compile it and call fit on the arrays:

bidirec.compile('adam', 'mse')
bidirec.fit(x_train, y_train)
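Putting the two fixes together, a self-contained sketch of the training step could look like the following. It assumes TensorFlow 2 and uses random arrays in place of the real prices; BiLSTMRegressor is a hypothetical name, and tf.nn.leaky_relu is swapped in for the relu(inputs, alpha=...) call because the alpha keyword is not accepted by every Keras version.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import LSTM, Bidirectional, Dense

class BiLSTMRegressor(Model):   # hypothetical name for this sketch
    def __init__(self):
        super().__init__()
        # return_sequences=False: one output vector per sample.
        self.bidr = Bidirectional(LSTM(32, activation=None, return_sequences=False))
        self.dense = Dense(3)

    def call(self, inputs, training=None, mask=None):
        # Leaky ReLU with slope 0.2 on the negative side (swapped-in call).
        x = tf.nn.leaky_relu(inputs, alpha=0.2)
        return self.dense(self.bidr(x))

# Random placeholders with the shapes from the answer.
x_train = np.random.rand(195, 1, 3).astype("float32")
y_train = np.random.rand(195, 3).astype("float32")

model = BiLSTMRegressor()
model.compile(optimizer="adam", loss="mse")
history = model.fit(x_train, y_train, epochs=2, batch_size=32, verbose=0)

# Predictions have the same shape as the labels.
preds = model.predict(x_train, verbose=0)
print(preds.shape)   # (195, 3)
```

Since fit receives the inputs and labels as separate arguments, the model itself never sees the labels.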

Upvotes: 3
