Marios Nikolaou

Reputation: 1336

Timeseries dataset split data to chunks of equal size

I have a time series dataset for predicting stock market prices. Each row is a date between 2015 and 2019, and the columns are time steps t1 to t300 holding float values.

Date         t1     t2    t3    t4  ... t300
01-01-2019   -0.34  0.40  0.50  1.2
02-01-2019   0.45   0.56  0.34  0.45
 ...

I want to split each row into equal chunks of 50 time steps and append the chunks to an array.

Expected array, where t1-t50 means the values of t1, t2, t3, t4, and so on up to t50:

[
   [[t1-t50],[t51-t100],[t101-t150],[t151-t200],[t201-t250],[t251-t300] ],
   [[t1-t50],[t51-t100],[t101-t150],[t151-t200],[t201-t250],[t251-t300] ],
   ...
]

Thanks in advance.

Upvotes: 1

Views: 3007

Answers (1)

anky

Reputation: 75080

IIUC, you need to split the df along axis=1 with np.split() (this assumes Date is the index, so only the t1-t300 columns get split):

# // keeps the number of sections an integer
np.split(df, df.shape[1] // 50, axis=1)
# or np.split(df.values, df.shape[1] // 50, axis=1)
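
For a frame shaped like the one in the question, the call could look roughly like the sketch below. The random data, the variable names, and the choice to keep the dates in the index are all assumptions made for illustration, not part of the original data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: 4 dates, columns t1..t300.
rng = np.random.default_rng(0)
dates = pd.date_range("2019-01-01", periods=4, freq="D")
cols = [f"t{i}" for i in range(1, 301)]
df = pd.DataFrame(rng.normal(size=(4, 300)), index=dates, columns=cols)

chunk = 50
# Number of equal-width column blocks (300 // 50 == 6).
n_chunks = df.shape[1] // chunk

# Split the underlying NumPy array column-wise into n_chunks blocks.
parts = np.split(df.to_numpy(), n_chunks, axis=1)

print(len(parts), parts[0].shape)
```

Each element of parts holds one 50-step block for all rows. np.split raises a ValueError if the number of columns is not evenly divisible by the number of sections, so a Date column left among the data columns would have to be dropped or moved to the index first.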

Adding an example:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(0, 30).reshape(5, 6))
print(df)

    0   1   2   3   4   5
0   0   1   2   3   4   5
1   6   7   8   9  10  11
2  12  13  14  15  16  17
3  18  19  20  21  22  23
4  24  25  26  27  28  29

Based on the above df, if I want to split each row into chunks of 3 values:

np.split(df.values, df.shape[1] // 3, axis=1)

[array([[ 0,  1,  2],
        [ 6,  7,  8],
        [12, 13, 14],
        [18, 19, 20],
        [24, 25, 26]]),
 array([[ 3,  4,  5],
        [ 9, 10, 11],
        [15, 16, 17],
        [21, 22, 23],
        [27, 28, 29]])]
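
Note that np.split returns one array per chunk, each covering all rows, whereas the question asks for one list of chunks per row. One possible way to get that layout, sketched on the same toy frame, is to reshape the values (or, equivalently, to stack the split blocks along a new middle axis). The names per_row and stacked are made up for this sketch.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(0, 30).reshape(5, 6))
chunk = 3

# Reshape each row of 6 values into 2 consecutive chunks of 3:
# result has shape (n_rows, n_chunks, chunk).
per_row = df.to_numpy().reshape(len(df), -1, chunk)

# Same result built from np.split: stack the column blocks on axis 1.
blocks = np.split(df.to_numpy(), df.shape[1] // chunk, axis=1)
stacked = np.stack(blocks, axis=1)

print(per_row.shape)  # (5, 2, 3)
print(per_row[0])     # the two chunks of the first row
```

With the 300-column data and chunk = 50, the same reshape would give an array of shape (n_rows, 6, 50), assuming the number of columns is a multiple of the chunk size.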

Upvotes: 4
