Ketchup

Reputation: 298

Regression: how to handle multiple multivariate time series?

I am trying to develop a machine learning model that reproduces a biological behavior. My goal is a regression on time series, i.e. predicting multiple outputs from multiple inputs at each time step, not forecasting.

For this, I have :

To feed my data to an ML/DL algorithm, I can either keep it 3D:

[nb_instances, time_steps, features]

or flatten it to 2D:

[nb_instances * time_steps, features]

With the 3D data, I have a hard time feeding it to a classical ML algorithm (for example, sklearn models). I know that I could use a DL algorithm, but I would like to test a "low resource" solution first. I am not considering dimensionality reduction because of constraints.

Is there a way to feed 3D data to a classical ML algorithm from sklearn or another Python library?

If I choose the second option (removing the nb_instances dimension), I will lose some information (like the execution cycle) but I will be able to use both ML and DL.

Which option is better? Is there another way to look at the problem?

Upvotes: 0

Views: 128

Answers (1)

Rahul

Reputation: 1579

I would go with the most common option: dropping the nb_instances dimension and concatenating the data into a 2D format.
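As a rough sketch of that flattening, assuming NumPy arrays with the shapes from the question (the sizes and random data below are made up for illustration, and a plain least-squares fit stands in for whichever 2D estimator you end up using):

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
nb_instances, time_steps, n_features, n_targets = 4, 10, 3, 2

rng = np.random.default_rng(0)
X3d = rng.normal(size=(nb_instances, time_steps, n_features))
Y3d = rng.normal(size=(nb_instances, time_steps, n_targets))

# Stack the instances: each row is one time step of one instance.
X2d = X3d.reshape(nb_instances * time_steps, n_features)
Y2d = Y3d.reshape(nb_instances * time_steps, n_targets)

# Stand-in model: multi-output linear least squares with an intercept column.
X_design = np.hstack([X2d, np.ones((X2d.shape[0], 1))])
coef, *_ = np.linalg.lstsq(X_design, Y2d, rcond=None)
pred2d = X_design @ coef

# The row-major reshape is reversible, so predictions map back to 3D.
pred3d = pred2d.reshape(nb_instances, time_steps, n_targets)
```

Because the reshape keeps rows in instance order, row `i * time_steps + t` of the 2D array is always time step `t` of instance `i`, so nothing is shuffled by the conversion itself.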

As you mentioned, there is a risk of losing information, but preprocessing techniques can help compensate for it.
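One possible preprocessing step, sketched here under my own assumptions rather than as a fixed recipe: keep the position inside the cycle as an extra feature, and keep the instance id on the side so that train/test splits do not mix rows from the same instance. The sizes and the names `instance_id`, `time_index` and `X_aug` are illustrative only.

```python
import numpy as np

# Hypothetical sizes and dummy data, for illustration only.
nb_instances, time_steps, n_features = 3, 5, 2
X3d = np.arange(nb_instances * time_steps * n_features, dtype=float)
X3d = X3d.reshape(nb_instances, time_steps, n_features)

# One instance id and one time index per flattened row.
instance_id = np.repeat(np.arange(nb_instances), time_steps)
time_index = np.tile(np.arange(time_steps), nb_instances)

# Flatten, then append the normalised position in the cycle as a feature.
X2d = X3d.reshape(-1, n_features)
X_aug = np.column_stack([X2d, time_index / (time_steps - 1)])
```

The `instance_id` array is not used as a feature here; it could be passed as `groups` to a group-aware splitter such as scikit-learn's `GroupKFold` so that each instance ends up entirely in either the training or the validation fold.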

For 3D data, you can use a sliding window over the time steps to transform 3D into 2D, so that each row covers several consecutive steps. I would experiment as much as possible.
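A minimal sketch of such a sliding window, assuming the target at time t is predicted from the last `window` input steps of the same instance. The helper name `sliding_windows` and the sizes are hypothetical, not from any library.

```python
import numpy as np

def sliding_windows(X3d, Y3d, window):
    """Turn [instances, steps, features] into 2D rows of `window` stacked steps."""
    n_inst, n_steps, _ = X3d.shape
    rows, targets = [], []
    for i in range(n_inst):
        # Windows are built per instance, so they never cross instance boundaries.
        for t in range(window - 1, n_steps):
            rows.append(X3d[i, t - window + 1 : t + 1].ravel())
            targets.append(Y3d[i, t])
    return np.asarray(rows), np.asarray(targets)

# Hypothetical data: 2 instances, 6 time steps, 3 features, 2 targets.
rng = np.random.default_rng(0)
X3d = rng.normal(size=(2, 6, 3))
Y3d = rng.normal(size=(2, 6, 2))

Xw, Yw = sliding_windows(X3d, Y3d, window=3)
```

With these sizes, `Xw` has one row per (instance, end step) pair and `window * features` columns, so it can go straight into a 2D estimator. The first `window - 1` steps of each instance get no row, which is the price of this approach; the window length is a parameter worth tuning.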

I would also consider an LSTM here, or experiment with gated recurrent units (GRUs).

Good luck.

Upvotes: 1
