A.Gharbi

Reputation: 759

Classification of time series of variable lengths using 1D CNN in tensorflow

I have a dataset consisting of time series of different lengths. For instance, consider these two:

ts1 = np.random.rand(230, 4)
ts2 = np.random.rand(12309, 4)

I have 200 sequences in the form of a list of arrays:

input_x = [ts1, ts2, ..., ts200]

Each time series is labeled 1 if good and 0 if not, so my labels look something like

labels = [0, 0, 1, 0, 1, ....] 

I am building a keras model as follows:

model = keras.Sequential([
    keras.layers.Conv1D(64, 3, activation='relu', input_shape=(None, 4)),
    keras.layers.MaxPool1D(3),
    keras.layers.Conv1D(160, 10, activation='relu'),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(2, activation='softmax')
])

The 4 in the input shape of the first convolution layer is the number of columns in each time series, which is constant (think of it as 4 sensors returning measurements for different operations). The objective is to classify a time series as good or bad (1 or 0). However, I am unable to figure out how to train this using Keras.

Running this line

model.fit(input_x, labels, epochs=5, batch_size=1)

Returns an error

Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 200 arrays

Even using np.array(input_x) gives an error. How can I train this model with sequences of variable lengths? I know padding is an option but that's not what I am looking for. Also, I don't want to use an RNN with a sliding window. I am really looking into a solution with 1D CNN that works with sequences of variable lengths. Any help would be so much appreciated!

Upvotes: 4

Views: 1877

Answers (1)

Pedro Marques

Reputation: 2682

When working with time series, you want to define the input to the NN as (batch_size, sequence_length, features).

In your case, this corresponds to input_shape=(sequence_length, 4). You will have to decide on a maximum sequence length to process, both for training and for generating predictions.
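One way to read this advice is to bring every series to a common length before stacking them into a single array. The sketch below does that with plain NumPy; the helper name pad_or_truncate, the cap MAX_LEN = 500, and the zero pad value are assumptions made for illustration, not something the question or answer specifies.

```python
import numpy as np

def pad_or_truncate(series, max_len, pad_value=0.0):
    """Return a (max_len, n_features) array built from a (length, n_features) one.

    Hypothetical helper: shorter series are padded at the end with pad_value,
    longer series are cut off after max_len time steps.
    """
    series = np.asarray(series, dtype=np.float32)
    out = np.full((max_len, series.shape[1]), pad_value, dtype=np.float32)
    n = min(len(series), max_len)
    out[:n] = series[:n]
    return out

# Toy stand-ins for the 200 sequences in the question.
rng = np.random.default_rng(0)
input_x = [rng.random((230, 4)), rng.random((1200, 4)), rng.random((57, 4))]
labels = np.array([0, 1, 0])

MAX_LEN = 500  # assumed maximum sequence length; choose it from your own data

# Stack into one array of shape (batch_size, sequence_length, features).
x = np.stack([pad_or_truncate(ts, MAX_LEN) for ts in input_x])
print(x.shape)  # (3, 500, 4)
```

An array like x, together with labels, has the regular shape that model.fit expects; the model would then be declared with input_shape=(MAX_LEN, 4).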

The arrays you pass to the NN for training and prediction also need to have the shape (batch_size, sequence_length, features).
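As a small illustration of that shape requirement, the sketch below adds the batch axis to a single series with NumPy. The variable names mirror the question; the length chosen for ts2 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
ts1 = rng.random((230, 4))   # one series: (sequence_length, features)
ts2 = rng.random((1000, 4))  # a second series with a different length

# A single series has no batch axis yet; add one to get (1, 230, 4).
batch = np.expand_dims(ts1, axis=0)
print(ts1.shape, batch.shape)  # (230, 4) (1, 230, 4)

# Series with different lengths cannot share one regular 3-D array,
# which is why a plain list of them is not accepted as a single input.
same_length = ts1.shape == ts2.shape
print(same_length)  # False
```

Whether batches of different lengths can be fed one after another depends on how the model is declared; with a fixed sequence_length in input_shape, every batch has to match it.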

Upvotes: 1
