Reputation: 290
I have a CSV file with data in the form of:
Timestamp,Signal_1,Signal_2,Signal_3,Signal_4,Signal_5
2021-04-13 11:03:13+02:00,3,3,3,12,12
2021-04-13 11:03:14+02:00,3,3,3,12,12
Now I want to create a neural network to do time series forecasting. To do that, I want to turn the content into a numpy array so I can assign training and test sets. The input as well as the output should be 5-dimensional (all signal groups should be predicted). Currently my code looks like this:
import pandas
from matplotlib import pyplot
from sklearn.model_selection import train_test_split
from numpy import genfromtxt
filename = 'test.csv'
data = pandas.read_csv(filename , header=0, index_col=0)
my_data = genfromtxt('test.csv', delimiter=',')
print(data.shape)
print(type(my_data))
v, w, x, y, z = my_data
I am aware that the actual assignment of the test and training parts is missing, but even at this stage I get the error: ValueError: too many values to unpack (expected 5)
Upvotes: 0
Views: 225
Reputation: 670
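Not sure exactly which bit you would like to unpack (it looks like you imported the data twice, once with pandas and once with numpy), but the error comes from the shape of my_data. For the sample you posted, my_data.shape = (3, 6), because np.genfromtxt cannot interpret the header row or the timestamp column as numbers and fills them with nan. Unpacking with v, w, x, y, z = my_data iterates over the rows, not the columns, so as soon as the file has more than five rows you get the "too many values to unpack" error. For the sample, my_data looks like this: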
array([[nan, nan, nan, nan, nan, nan],
[nan, 3., 3., 3., 12., 12.],
[nan, 3., 3., 3., 12., 12.]])
For the numpy my_data array, you could index to remove the first row and first column, then transpose so that each signal becomes one row:
v, w, x, y, z = my_data[1:, 1:].T
Which will give you your 1D arrays:
>> v
array([3., 3.])
>> w
array([3., 3.])
>> x
array([3., 3.])
>> y
array([12., 12.])
>> z
array([12., 12.])
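As a rough sketch of an alternative, you could also tell np.genfromtxt to skip the header and the timestamp column up front, so no nan values need to be sliced off afterwards. The csv_text string below is just a stand-in for test.csv built from the sample rows in the question, and the names raw and signals are made up for illustration:

```python
import io

import numpy as np

# Stand-in for test.csv, built from the sample rows in the question.
csv_text = (
    "Timestamp,Signal_1,Signal_2,Signal_3,Signal_4,Signal_5\n"
    "2021-04-13 11:03:13+02:00,3,3,3,12,12\n"
    "2021-04-13 11:03:14+02:00,3,3,3,12,12\n"
)

# Default call: header row and timestamp column become nan.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
print(raw.shape)  # (rows including header, 6 columns)

# Skip the header line and read only the five signal columns.
signals = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    skip_header=1,
    usecols=(1, 2, 3, 4, 5),
)
print(signals.shape)  # (samples, 5 signals)

# Transpose so that unpacking yields one 1D array per signal.
v, w, x, y, z = signals.T
```

For a forecasting model you would probably keep the 2D signals array (one row per time step) and only unpack when you need a single signal.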
N.B. Just as an aside, if you try to do the same thing with your pandas dataframe data, i.e. v, w, x, y, z = data, you'll actually get the column header strings assigned, not the columns themselves. In this case, you want:
v, w, x, y, z = data.values.T
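To make the difference concrete, here is a small sketch using the sample rows from the question in place of test.csv (csv_text and the names_* variables are only illustrative). Iterating over a DataFrame yields its column labels, whereas the transposed values give one array per column:

```python
import io

import pandas as pd

# Stand-in for test.csv, built from the sample rows in the question.
csv_text = (
    "Timestamp,Signal_1,Signal_2,Signal_3,Signal_4,Signal_5\n"
    "2021-04-13 11:03:13+02:00,3,3,3,12,12\n"
    "2021-04-13 11:03:14+02:00,3,3,3,12,12\n"
)

# Same read_csv call as in the question: the timestamp becomes the index.
data = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0)

# Unpacking the DataFrame itself iterates over the column labels.
names_v, names_w, names_x, names_y, names_z = data
print(names_v)  # a column name, not the column's data

# Unpacking the transposed values gives one 1D array per signal.
v, w, x, y, z = data.values.T
print(v)
```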
If you want the timestamp too, it's probably easier to use the pandas import, as it handles mixed data more easily. Just reset the index or remove index_col from your read_csv call:
data = pandas.read_csv(filename, header=0)
u, v, w, x, y, z = data.values.T
That will give you your timestamps in u.
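One thing to keep in mind with this approach: because the frame now mixes strings and numbers, data.values is an object array, so the signal arrays are no longer plain floats. A possible way around that, sketched here with the question's sample rows standing in for test.csv (csv_text, timestamps and signals are illustrative names), is to handle the timestamp column separately:

```python
import io

import pandas as pd

# Stand-in for test.csv, built from the sample rows in the question.
csv_text = (
    "Timestamp,Signal_1,Signal_2,Signal_3,Signal_4,Signal_5\n"
    "2021-04-13 11:03:13+02:00,3,3,3,12,12\n"
    "2021-04-13 11:03:14+02:00,3,3,3,12,12\n"
)

# No index_col, so the timestamp stays a regular column.
data = pd.read_csv(io.StringIO(csv_text), header=0)

# Mixed columns: every unpacked array has dtype object.
u, v, w, x, y, z = data.values.T
print(u[0], v.dtype)

# Alternative: parse the timestamps and keep the signals as a float array.
timestamps = pd.to_datetime(data["Timestamp"])
signals = data.drop(columns="Timestamp").to_numpy(dtype=float)
print(signals.shape)  # (samples, 5 signals)
```

The signals array is then in a shape that can be sliced into training and test sets, with timestamps kept alongside for reference.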
Upvotes: 1