KCK

Reputation: 2033

Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (0, 1) only on epoch>1 and at a specific dataset split

This question is different from ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (10, 1), as the answers there do not cover my case.


I am following the tutorial here: https://www.altumintelligence.com/articles/a/Time-Series-Prediction-Using-LSTM-Deep-Neural-Networks. Here is the model summary:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_1 (LSTM)                (None, 49, 100)           41200     
_________________________________________________________________
dropout_1 (Dropout)          (None, 49, 100)           0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 49, 100)           80400     
_________________________________________________________________
lstm_3 (LSTM)                (None, 100)               80400     
_________________________________________________________________
dropout_2 (Dropout)          (None, 100)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 101       
=================================================================
Total params: 202,101
Trainable params: 202,101
Non-trainable params: 0
_________________________________________________________________
None

The size of my data set is 79005 rows, the sequence length is 50, and the batch size is 32. The problem is that when I set the number of epochs to 1, everything works perfectly. But when I change it to 2, I get the error below right at the start of the second epoch:

ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (0, 1)

I just want to understand why this is not a problem with 1 epoch and why it only appears with 2 (or more). EDIT: setting the train-test split to 0.80 instead of 0.85 actually removed the error! I would still like to know the reason for this, as I don't understand it.

Below is my data loading code:

import math
import numpy as np
import pandas as pd

class DataLoader():
    """A class for loading and transforming data for the lstm model"""

    def __init__(self, filename, split, cols):
        dataframe = pd.read_csv(filename)
        print("data shape:",dataframe.shape)
        i_split = int(len(dataframe) * split)
        self.data_train = dataframe.get(cols).values[:i_split]
        self.data_test  = dataframe.get(cols).values[i_split:]
        self.len_train  = len(self.data_train)
        self.len_test   = len(self.data_test)
        self.len_train_windows = None

    def get_test_data(self, seq_len, normalise):
        '''
        Create x, y test data windows
        Warning: batch method, not generative, make sure you have enough memory to
        load data, otherwise reduce size of the training split.
        '''
        data_windows = []
        #data_x=[]
        #data_y=[]
        for i in range(self.len_test - seq_len):
            data_windows.append(self.data_test[i:i+seq_len])

        data_windows = np.array(data_windows).astype(float)
        data_windows = self.normalise_windows(data_windows, single_window=False) if normalise else data_windows

        x = data_windows[:, :-1]
        y = data_windows[:, -1, [0]]
            #x,y=self._next_window(i,seq_len,normalise,train=False)
            #data_x.append(x)
            #data_y.append(y)
        #return np.array(data_x),np.array(data_y)
        return x,y

    def get_train_data(self, seq_len, normalise):
        '''
        Create x, y train data windows
        Warning: batch method, not generative, make sure you have enough memory to
        load data, otherwise use generate_training_window() method.
        '''
        data_x = []
        data_y = []
        for i in range(self.len_train - seq_len):
            x, y = self._next_window(i, seq_len, normalise)
            data_x.append(x)
            data_y.append(y)
        return np.array(data_x), np.array(data_y)

    def generate_train_batch(self, seq_len, batch_size, normalise, epochs):
        '''Yield a generator of training data from filename on given list of cols split for train/test'''
        i = 0
        print("train length:",self.len_train)
        #while epoch < epochs:
        while i < ((self.len_train - seq_len)*(epochs+1)):
            print("i:",i)
            x_batch = []
            y_batch = []
            for b in range(batch_size):
                if i >= (self.len_train - seq_len):
                    # stop-condition for a smaller final batch if data doesn't divide evenly
                    yield np.array(x_batch), np.array(y_batch)
                    i = 0
                    print("i set to 0")
                x, y = self._next_window(i, seq_len, normalise)
                x_batch.append(x)
                y_batch.append(y)
                i += 1

            print ("x:",np.array(x_batch).shape)
            print ("y:",np.array(y_batch).shape)
            yield np.array(x_batch), np.array(y_batch)
        #epoch += 1

    def _next_window(self, i, seq_len, normalise):
        '''Generates the next data window from the given index location i'''
        window = self.data_train[i:i+seq_len]
        #if train:
        #    window = self.data_train[i:i+seq_len]
        #else:
        #    window = self.data_test[i:i+seq_len]
        window = self.normalise_windows(window, single_window=True)[0] if normalise else window
        x = window[:-1]
        y = window[-1, [0]]
        return x, y

    def normalise_windows(self, window_data, single_window=False):
        '''Normalise window with a base value of zero'''
        eps=0.00001
        normalised_data = []
        window_data = [window_data] if single_window else window_data
        for window in window_data:
            normalised_window = []
            for col_i in range(window.shape[1]):
                normalised_col = [((float(p) / (float(window[0, col_i])+eps)) - 1) for p in window[:, col_i]]
                normalised_window.append(normalised_col)
            normalised_window = np.array(normalised_window).T # reshape and transpose array back into original multidimensional format
            normalised_data.append(normalised_window)
        return np.array(normalised_data)

Below is the Model building code:

import os
import datetime as dt

from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, LSTM
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.utils import plot_model
# Timer is the small stopwatch helper class used throughout the tutorial's code base

class Model():
    """A class for building and running inference with an LSTM model"""

    def __init__(self):
        self.model = Sequential()

    def load_model(self, filepath):
        print('[Model] Loading model from file %s' % filepath)
        self.model = load_model(filepath)

    def build_model(self, configs):
        timer = Timer()
        timer.start()

        for layer in configs['model']['layers']:
            neurons = layer['neurons'] if 'neurons' in layer else None
            dropout_rate = layer['rate'] if 'rate' in layer else None
            activation = layer['activation'] if 'activation' in layer else None
            return_seq = layer['return_seq'] if 'return_seq' in layer else None
            input_timesteps = layer['input_timesteps'] if 'input_timesteps' in layer else None
            input_dim = layer['input_dim'] if 'input_dim' in layer else None

            if layer['type'] == 'dense':
                self.model.add(Dense(neurons, activation=activation))
            if layer['type'] == 'lstm':
                self.model.add(LSTM(neurons, input_shape=(input_timesteps, input_dim), return_sequences=return_seq))
            if layer['type'] == 'dropout':
                self.model.add(Dropout(dropout_rate))

        self.model.compile(loss=configs['model']['loss'], optimizer=configs['model']['optimizer'],metrics=['mean_squared_error'])
        print(self.model.summary())
        plot_model(self.model, to_file='model.png')

        print('[Model] Model Compiled')
        timer.stop()
        return self.model

    def train(self, x, y, epochs, batch_size, save_dir=""):
        timer = Timer()
        timer.start()
        print('X shape:', (x.shape))
        print('[Model] Training Started')
        print('[Model] %s epochs, %s batch size' % (epochs, batch_size))

        save_fname = os.path.join(save_dir, '%s-e%s.h5' % (dt.datetime.now().strftime('%d%m%Y-%H%M%S'), str(epochs)))
        callbacks = [
            EarlyStopping(monitor='val_loss', patience=2),
            ModelCheckpoint(filepath=save_fname, monitor='val_loss', save_best_only=True)
        ]
        modelhistory=self.model.fit(
            x,
            y,
            epochs=epochs,
            batch_size=batch_size,
            callbacks=callbacks
        )
        self.model.save(save_fname)
        print('[Model] Training Completed. Model saved as %s' % save_fname)
        timer.stop()
        return modelhistory

    def train_generator(self, data_gen, epochs, batch_size, steps_per_epoch, save_dir=""):
        timer = Timer()
        timer.start()
        #print('X shape:', (x.shape))
        print('[Model] Training Started')
        print('[Model] %s epochs, %s batch size, %s batches per epoch' % (epochs, batch_size, steps_per_epoch))

        save_fname = os.path.join(save_dir, '%s-e%s.h5' % (dt.datetime.now().strftime('%d%m%Y-%H%M%S'), str(epochs)))
        callbacks = [
            ModelCheckpoint(filepath=save_fname, monitor='loss', save_best_only=True)
        ]
        modelhistory=self.model.fit_generator(
            data_gen,
            steps_per_epoch=steps_per_epoch,
            epochs=epochs,
            callbacks=callbacks,
            workers=1
        )

        print('[Model] Training Completed. Model saved as %s' % save_fname)
        timer.stop()
        return modelhistory

Below is the model configuration where all the parameters are defined:

configJson={
    "data": {
        "filename": "C:/projects!/Experiments/2015-02-02-To-2019-5-19-5-Min.csv",
        "columns": [
            "Close","Volume"
        ],
        "sequence_length": 50,
        "train_test_split": 0.85,
        "normalise": True
    },
    "training": {
        "epochs": 2,
        "batch_size": 32
    },
    "model": {
        "loss": "mse",
        "optimizer": "adam",
        "layers": [
            {
                "type": "lstm",
                "neurons": 100,
                "input_timesteps": 49,
                "input_dim": 2,
                "return_seq": True
            },
            {
                "type": "dropout",
                "rate": 0.2
            },
            {
                "type": "lstm",
                "neurons": 100,
                "return_seq": True
            },
            {
                "type": "lstm",
                "neurons": 100,
                "return_seq": False
            },
            {
                "type": "dropout",
                "rate": 0.2
            },
            {
                "type": "dense",
                "neurons": 1,
                "activation": "linear"
            }
        ]
    }
}

Below is how I am building my model:

data = DataLoader(
    os.path.join(configs['data']['filename']),
    configs['data']['train_test_split'],
    configs['data']['columns']
)

model = Model()
model.build_model(configs)

# out-of-memory generative training (data fed via generator)
steps_per_epoch = math.ceil((data.len_train - configs['data']['sequence_length']) / configs['training']['batch_size'])
modelhistory=model.train_generator(
    data_gen = data.generate_train_batch(
        seq_len = configs['data']['sequence_length'],
        batch_size = configs['training']['batch_size'],
        normalise = configs['data']['normalise'],
        epochs = configs['training']['epochs']
    ),
    epochs = configs['training']['epochs'],
    batch_size = configs['training']['batch_size'],
    steps_per_epoch = steps_per_epoch
)

Please help.

Upvotes: 2

Views: 281

Answers (1)

Akaisteph7

Reputation: 6436

Ok, I think I know what your issue is (or was).

So, in generate_train_batch, this line:

if i >= (self.len_train - seq_len):

checks when to reset the counter to 0, but the yield inside that if-block emits whatever is currently in x_batch and y_batch without adding anything to them first. When you run with 2 epochs and a 0.85 split, it just so happens that this condition is hit right at the beginning of the second epoch, on the very first iteration of the for loop for that batch (i.e. i = 67104, so if i >= (self.len_train - seq_len): is satisfied as soon as the for loop starts). Nothing has been appended yet, so the generator yields an empty batch, and that empty array is what shows up as shape (0, 1) in the error.
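For illustration, here is a quick sanity check of that boundary arithmetic, reusing the 79005-row dataset, sequence length of 50 and batch size of 32 from the question (the split truncation mirrors DataLoader.__init__; this snippet is only a check, not part of the original code):

# Sanity check of the batch-boundary arithmetic (assumes the 79005-row
# dataset, seq_len=50 and batch_size=32 from the question's config).
n_rows, seq_len, batch_size = 79005, 50, 32

for split in (0.85, 0.80):
    len_train = int(n_rows * split)     # same truncation as DataLoader.__init__
    n_windows = len_train - seq_len     # the value i is compared against
    print(split, n_windows, n_windows % batch_size)

# split=0.85 -> n_windows = 67104, and 67104 % 32 == 0: the reset condition fires
#               exactly at the start of a batch, so an empty batch is yielded.
# split=0.80 -> n_windows = 63154, and 63154 % 32 == 18: the reset happens mid-batch,
#               the yielded batch is non-empty, and no error occurs.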

For the other configurations this error just does not happen because the reset never lands exactly at the start of a batch. On your side, to make sure this cannot happen, I would recommend just removing the yield np.array(x_batch), np.array(y_batch) line inside the if-block mentioned above. It's okay if your last batch reuses some of the first elements. I think that would be the easiest way to solve this issue here.
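To make that concrete, a minimal sketch of generate_train_batch with that yield removed might look like this (my reading of the suggestion, not verified against the original data):

def generate_train_batch(self, seq_len, batch_size, normalise, epochs):
    '''Yield training batches, wrapping to the start of the data instead of
    ever emitting a partial (possibly empty) batch.'''
    i = 0
    while i < ((self.len_train - seq_len) * (epochs + 1)):
        x_batch = []
        y_batch = []
        for b in range(batch_size):
            if i >= (self.len_train - seq_len):
                # Wrap around; the final batch reuses a few windows from the
                # beginning rather than yielding an empty batch here.
                i = 0
            x, y = self._next_window(i, seq_len, normalise)
            x_batch.append(x)
            y_batch.append(y)
            i += 1
        yield np.array(x_batch), np.array(y_batch)

With that change every yielded batch always contains exactly batch_size windows, so the input shapes Keras sees stay consistent across epochs.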

Upvotes: 1
