Lukas Hestermeyer

Reputation: 1037

Tensorflow dataset API for time series classification

I am getting used to the new Dataset API and am trying to do some time series classification. I have a dataset stored as TFRecords with the shape (time_steps x features). I also have a label for each time step: (time_steps x 1).

What I want to do is reformat the dataset into rolling windows of time steps, like this: (n x window_size x features), with n = time_steps - window_size + 1 (if I use a stride of 1 for the rolling window).

The labels are supposed to be (n x 1), meaning that we take the label of the last time step in each window.

I already know that I can use tf.sliding_window_batch() to create the sliding windows for the features. However, the labels get windowed in the same way, ending up as (n x window_size x 1), and I do not know how to reduce them to just the last label of each window.

How do I do this using the tensorflow dataset API? https://www.tensorflow.org/programmers_guide/datasets
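To make the target shapes concrete, here is a small numpy sketch (the array sizes are made up for illustration):

```python
import numpy as np

time_steps, features, window_size = 10, 3, 4
data = np.arange(time_steps * features).reshape(time_steps, features)
labels = np.arange(time_steps).reshape(time_steps, 1)

# with stride 1, n = time_steps - window_size + 1 windows
n = time_steps - window_size + 1
windows = np.stack([data[i:i + window_size] for i in range(n)])
# one label per window: the label of the last time step in that window
window_labels = np.stack([labels[i + window_size - 1] for i in range(n)])

print(windows.shape)        # (7, 4, 3) -> (n, window_size, features)
print(window_labels.shape)  # (7, 1)    -> (n, 1)
```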

Thanks for your help!

Upvotes: 0

Views: 609

Answers (2)

Napoléon

Reputation: 321

I have a slow solution with TF 1.13.

    WIN_SIZE = 5000

    dataset_input = tf.data.Dataset.from_tensor_slices(data1) \
        .window(size=WIN_SIZE, shift=WIN_SIZE, drop_remainder=False) \
        .flat_map(lambda x: x.batch(WIN_SIZE))

    dataset_label = tf.data.Dataset.from_tensor_slices(data2) \
        .window(size=WIN_SIZE, shift=WIN_SIZE, drop_remainder=False) \
        .flat_map(lambda x: x.batch(WIN_SIZE)) \
        .map(lambda x: x[-1])

    dataset = tf.data.Dataset.zip((dataset_input, dataset_label))
    dataset = dataset.repeat(1)
    data_iter = dataset.make_one_shot_iterator()  # create the iterator
    next_sample = data_iter.get_next()

    with tf.Session() as sess:
        i = 0
        while True:
            try:
                r_ = sess.run(next_sample)
                i += 1
                print(i)
                print(r_)
                print(r_[0].shape)
                print(r_[1].shape)
            except tf.errors.OutOfRangeError:
                print('end')
                break

The reason I call it a 'slow solution' is that the snippet below can probably be optimized, which I have not finished yet:

    dataset_label = tf.data.Dataset.from_tensor_slices(data2) \
        .window(size=WIN_SIZE, shift=WIN_SIZE, drop_remainder=False) \
        .flat_map(lambda x: x.batch(WIN_SIZE)) \
        .map(lambda x: x[-1])

A more promising solution might use some 'skip' operation to skip the unneeded values in dataset_label, rather than the 'window' operation used now.
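One way to realize that 'skip' idea (a sketch, not benchmarked): with shift equal to WIN_SIZE, the labels we keep sit at indices WIN_SIZE-1, 2*WIN_SIZE-1, and so on. The index logic is shown with a plain numpy slice; in tf.data the same selection could be written as `tf.data.Dataset.from_tensor_slices(data2).skip(WIN_SIZE - 1).shard(WIN_SIZE, 0)`. Note this matches full windows only; with drop_remainder=False a trailing partial window would additionally contribute its own last element, which this selection does not cover.

```python
import numpy as np

WIN_SIZE = 5           # small size for illustration
data2 = np.arange(23)  # 23 dummy labels

# label of the last step of each *full* window (shift == WIN_SIZE):
# indices WIN_SIZE-1, 2*WIN_SIZE-1, ...
wanted = data2[WIN_SIZE - 1::WIN_SIZE]
print(wanted)  # [ 4  9 14 19]
```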

Upvotes: 1

Lukas Hestermeyer

Reputation: 1037

I couldn't figure out how to do this, but I figured I might as well do it using numpy.

I found this great answer and applied it to my case.

Afterwards it was just a matter of using numpy, like so:

    train_df2 = window_nd(train_df, 50, steps=1, axis=0)
    train_features = train_df2[:, :, :-1]
    train_labels = train_df2[:, :, -1:].squeeze()[:, -1:]
    train_labels.shape

My label was the last column, so you might have to adjust this a bit.
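Since the linked answer is not reproduced here, `window_nd` above is a helper taken from it. As a rough idea of what such a rolling-window helper can look like (a sketch for axis 0 only, assuming numpy >= 1.20 for `sliding_window_view`, and assuming it returns shape (n, window, columns)):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_nd_sketch(a, window, steps=1):
    """Rolling windows of `window` rows along axis 0, taking every `steps`-th window."""
    v = sliding_window_view(a, window, axis=0)  # (n, columns, window)
    v = np.moveaxis(v, -1, 1)                   # (n, window, columns)
    return v[::steps]

# dummy frame: 12 time steps, 4 feature columns + 1 label column
train_df = np.arange(60.).reshape(12, 5)
train_df2 = window_nd_sketch(train_df, 4)
train_features = train_df2[:, :, :-1]
train_labels = train_df2[:, :, -1:].squeeze()[:, -1:]

print(train_df2.shape)       # (9, 4, 5)
print(train_features.shape)  # (9, 4, 4)
print(train_labels.shape)    # (9, 1)
```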

Upvotes: 1
