Reputation: 298
I am trying to develop a machine learning model that reproduces a biological behavior. My goal is time-series regression, i.e. at each time step, predict multiple outputs from multiple inputs (not forecasting).
For this, I have time-series data. To feed it to an ML/DL algorithm, I can either:
1. keep the 3D shape: [nb_instances, time_steps, features]
2. flatten it to 2D: [nb_instances * time_steps, features]
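For illustration, here is a minimal NumPy sketch of the two layouts (the dimensions are arbitrary):

    import numpy as np

    # Toy dimensions, just to illustrate the two layouts
    nb_instances, time_steps, features = 10, 200, 5

    X_3d = np.random.rand(nb_instances, time_steps, features)  # [nb_instances, time_steps, features]
    X_2d = X_3d.reshape(-1, features)                           # [nb_instances * time_steps, features]

    print(X_3d.shape)  # (10, 200, 5)
    print(X_2d.shape)  # (2000, 5)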
With the 3D data, I have a hard time feeding it into a classical ML algorithm (for example, sklearn models). I know that I could use a DL algorithm, but I would like to have/test a "low-resource" solution first. I am not considering dimensionality reduction because of constraints.
Is there a way to feed 3D data to a classic ML algo from sklearn or another python library?
If I choose the second option (collapsing the nb_instances dimension into the sample axis), I will lose some information (like which execution cycle each sample belongs to), but I will be able to use both ML and DL.
Which option is better? Is there another way to look at the problem?
Upvotes: 0
Views: 128
Reputation: 1579
I would go with the most common option: removing the instance dimension and concatenating the data into a 2D format.
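Roughly, something like this (all shapes below are placeholders, and I am assuming your targets are also per-time-step, i.e. y is [nb_instances, time_steps, n_outputs]); RandomForestRegressor handles multiple outputs natively:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    # Placeholder data following the question's notation
    nb_instances, time_steps, n_features, n_outputs = 50, 100, 8, 3
    X = np.random.rand(nb_instances, time_steps, n_features)
    y = np.random.rand(nb_instances, time_steps, n_outputs)

    # Split by instance first, so time steps from the same recording
    # do not leak between train and test, then flatten each split
    n_train = int(0.8 * nb_instances)
    X_train = X[:n_train].reshape(-1, n_features)
    y_train = y[:n_train].reshape(-1, n_outputs)
    X_test = X[n_train:].reshape(-1, n_features)
    y_test = y[n_train:].reshape(-1, n_outputs)

    # Random forests support multi-output regression out of the box
    model = RandomForestRegressor(n_estimators=100)
    model.fit(X_train, y_train)
    print(model.score(X_test, y_test))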
As you mention, there is a risk of losing information, but you can mitigate it with preprocessing techniques.
For the 3D data, use a sliding window to transform 3D into 2D, for example as one step of a multi-step preprocessing pipeline. I would experiment as much as possible.
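A minimal sketch of that idea (the window length and the helper name are mine, not a library function): each 2D row is the current time step plus the previous window - 1 steps, flattened, so some temporal context survives the reshape:

    import numpy as np

    def sliding_window_features(X_3d, window):
        """[nb_instances, time_steps, features] ->
        [nb_instances * (time_steps - window + 1), window * features]."""
        nb_instances, time_steps, n_features = X_3d.shape
        rows = []
        for instance in X_3d:                      # never mix windows across instances
            for t in range(window - 1, time_steps):
                rows.append(instance[t - window + 1 : t + 1].ravel())
        return np.asarray(rows)

    X = np.random.rand(4, 50, 6)
    X_win = sliding_window_features(X, window=5)
    print(X_win.shape)  # (4 * 46, 5 * 6) = (184, 30)

The matching targets would be the per-step outputs from time step window - 1 onward, flattened the same way.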
I would also consider an LSTM here, or experiment with gated recurrent units (GRUs).
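If you do go the DL route, per-time-step (sequence-to-sequence) regression only needs return_sequences=True; a minimal Keras sketch (layer size and shapes are arbitrary, and you can swap layers.LSTM for layers.GRU):

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    nb_instances, time_steps, n_features, n_outputs = 50, 100, 8, 3
    X = np.random.rand(nb_instances, time_steps, n_features)
    y = np.random.rand(nb_instances, time_steps, n_outputs)

    model = keras.Sequential([
        layers.Input(shape=(time_steps, n_features)),
        layers.LSTM(64, return_sequences=True),           # one hidden vector per time step
        layers.TimeDistributed(layers.Dense(n_outputs)),  # multiple outputs per time step
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=10, batch_size=8)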
Good luck.
Upvotes: 1