ingrid

Reputation: 107

Cannot properly define the input of LSTM in order to model many-to-one scenario

I am new to Keras and recurrent layers, e.g. LSTM.

I need to solve the following task: given sequences of events, predict the class of each sequence.

More details: I have historical data of some events. Each sequence consists of N events, where N is not fixed. For each sequence, I want to predict a category (0, 1 or 2). I have many short sequences for training.

To complete this task, I am building a many-to-one LSTM with a softmax layer for multi-class classification.

For example, let's imagine that I have these data (batches of N events):

1, 17 => 0
1, 18
0, 18

0, 18 => 1
1, 19
0, 19
0, 20

…

0, 11 => 1
1, 11

The order of events in a sequence matters a lot. If the order changes, the corresponding category can change too. For example, if the second and third rows of the first sequence shown above are swapped, the category can change from 0 to 1:

1, 17 => 1
0, 18
1, 18

I want to use a many-to-one LSTM because, if I understand it correctly, it takes the order of events into account when predicting the class.

This is my starting code:

import pandas as pd
from sklearn import model_selection

events = {
            'batch_id': [0,0,0,1,1,2,2,2,2,2],
            'phase': [1,0,1,1,0,0,1,0,0,1],
            'hour': [16,16,17,17,17,18,18,19,20,20],
            'event_category': [1,1,1,2,2,0,0,0,0,0]
        }

columns = ['batch_id', 'phase', 'hour', 'event_category']

df = pd.DataFrame(events, columns=columns)

# Keep only the feature columns (the positional axis argument to drop() is deprecated)
X = df.drop(columns=['event_category', 'batch_id'])
y = df['event_category']
output_classes = y.nunique()

My biggest problem is that I don't know how to model sequences of varying size. I introduced a batch_id column to mark which sequence each event belongs to. The example above contains 3 sequences of sizes 3, 2 and 5.
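As a quick check of those sizes, here is a minimal sketch that rebuilds the same toy data so it runs on its own. The names `seq_lengths` and `seq_labels` are illustrative only and are not used anywhere else in my code. It also assumes that every event in a sequence carries the same category, which holds for this example.

```python
import pandas as pd

# Same toy data as in the starting code above.
df = pd.DataFrame({
    'batch_id': [0, 0, 0, 1, 1, 2, 2, 2, 2, 2],
    'phase': [1, 0, 1, 1, 0, 0, 1, 0, 0, 1],
    'hour': [16, 16, 17, 17, 17, 18, 18, 19, 20, 20],
    'event_category': [1, 1, 1, 2, 2, 0, 0, 0, 0, 0],
})

# Number of events per sequence (one entry per batch_id).
seq_lengths = df.groupby('batch_id').size()

# One label per sequence, assuming the label is constant within a sequence.
seq_labels = df.groupby('batch_id')['event_category'].first()

print(seq_lengths.tolist())  # [3, 2, 5]
print(seq_labels.tolist())   # [1, 2, 0]
```

So for a many-to-one model there are only 3 training examples here, one per batch_id, rather than 10.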

How can I feed this data into a deep network? Does the sequence length always have to be fixed?

This is my draft setup of the model. In this setup, I lack the definition of input data as sequences. Maybe I should change the format of events in some way?

from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM

model = Sequential()
model.add(LSTM(
                units=1,
                return_sequences=False,  # many-to-one: keep only the last output
                input_shape=(None, X.shape[1])  # (timesteps, features); None = any length
              )
         )

model.add(Dropout(0.2))

model.add(Dense(activation='softmax', units=output_classes))

# Set the loss and optimizer (categorical_crossentropy expects one-hot labels)
model.compile(loss="categorical_crossentropy",
              optimizer='adadelta')

Upvotes: 1

Views: 70

Answers (1)

Mikhail Berlinkov

Reputation: 1624

In Keras, every batch you pass to the model must have a fixed shape. If your sequences have different lengths, you have the following options:

  • pad sequences to the same length (e.g. with 0-vectors)
  • bucket sequences by length if the lengths vary a lot (re-creating the model for each bucket's shape and loading the same weights into it)
  • use PyTorch or any other dynamic graph NN library
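The first option can be sketched with plain NumPy. This is a minimal sketch under assumptions, not a definitive implementation: the toy arrays are copied from the question, `PAD_VALUE = -1.0` is an arbitrary sentinel picked because it never occurs in this data (an all-zero row could be a real event with phase 0 at hour 0), and names such as `sequences`, `X_padded` and `lengths` are illustrative.

```python
import numpy as np

# Toy data copied from the question.
batch_id = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 2])
phase = np.array([1, 0, 1, 1, 0, 0, 1, 0, 0, 1])
hour = np.array([16, 16, 17, 17, 17, 18, 18, 19, 20, 20])
event_category = np.array([1, 1, 1, 2, 2, 0, 0, 0, 0, 0])

# Assumed sentinel for padded timesteps; it must not appear in real events.
PAD_VALUE = -1.0

# One row per event, one column per feature: shape (10, 2).
features = np.stack([phase, hour], axis=1).astype("float32")

# Split the rows into one array per sequence and take one label per sequence
# (assuming the label is constant within a sequence, as in the toy data).
ids = np.unique(batch_id)
sequences = [features[batch_id == i] for i in ids]
labels = np.array([event_category[batch_id == i][0] for i in ids])
lengths = np.array([len(seq) for seq in sequences])

# Pad every sequence at the end up to the longest length.
max_len = int(lengths.max())
X_padded = np.full((len(sequences), max_len, features.shape[1]),
                   PAD_VALUE, dtype="float32")
for row, seq in enumerate(sequences):
    X_padded[row, :len(seq)] = seq

print(X_padded.shape)  # (3, 5, 2) = (sequences, timesteps, features)
print(labels)          # [1 2 0]
```

The resulting `X_padded` has the (samples, timesteps, features) layout that a Keras LSTM expects. A `Masking(mask_value=PAD_VALUE)` layer placed before the LSTM should then let it skip the padded timesteps, since that layer masks any timestep whose features all equal `mask_value`.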

Upvotes: 1
