Ayan Mitra

Reputation: 525

Create a time series DataFrame from a 2D array

I have a DF like the following:

t1,t2,t3,...,t500, a1,a2,a3,...,a500,Label1
t1,t2,t3,...,t500, b1,b2,b3,...,b500,Label2
t1,t2,t3,...,t500, c1,c2,c3,...,c500,Label3
.
.
.
t1,t2,t3,...,t500, x1,x2,x3,...,x500,LabelX 

The time array (its values) is the same for all candidates (t1,t2,...,tn).

How can I transform this DF into time series data? The first (easiest) step is to take the first half of the first row, which becomes the time column (it is the same for all candidates). Then I can split the DF vertically down the middle, take the second (values) part, transpose it, and concatenate it with the transposed time column. But then how can I preserve the labels?

I expect output something like:

t1,a1,b1,c1,...,x1
t2,a2,b2,c2,...,x2
.
.
.
tn,an,bn,cn,...,xn

But I am at a loss as to how to preserve the label for each candidate's values while transforming them into time series data.

Upvotes: 0

Views: 505

Answers (2)

Jose Gibson

Reputation: 1

If you are unsure where to fit the labels, note that LabelX is effectively x501: it fits naturally at the end of the time series. So after the process you described in the question, add a time step t501 and treat the labels like any other data point. See the example:

t1,t2,t3,...,t500, a1,a2,a3,...,a500,Label1
t1,t2,t3,...,t500, b1,b2,b3,...,b500,Label2
t1,t2,t3,...,t500, c1,c2,c3,...,c500,Label3
.
.
.
t1,t2,t3,...,t500, x1,x2,x3,...,x500,LabelX 

will become

t1  ,a1,    b1,    c1,    ...,x1
t2  ,a2,    b2,    c2,    ...,x2
t3  ,a3,    b3,    c3,    ...,x3
.
.
t500,a500,  b500,  c500,  ...,x500
t501,Label1,Label2,Label3,...,LabelX 
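
The layout above could be produced along these lines. This is only a minimal sketch under assumptions: the toy frame `wide` has 3 time steps instead of 500, and the names `wide`, `ts`, and the `"label"` marker in the time column are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame: 3 time steps (instead of 500), two candidates.
# Layout per row: time values, then candidate values, then the label.
wide = pd.DataFrame(
    [[0.0, 0.1, 0.2, 1.5, 1.6, 1.7, "Label1"],
     [0.0, 0.1, 0.2, 2.5, 2.6, 2.7, "Label2"]]
)

n = (wide.shape[1] - 1) // 2        # number of time steps
times = wide.iloc[0, :n].tolist()   # time values are identical in every row
values = wide.iloc[:, n:2 * n]      # second half: the candidates' values
labels = wide.iloc[:, -1]           # last column: one label per candidate

# Transpose so that rows are time steps and columns are candidates.
ts = values.T.reset_index(drop=True)
ts.insert(0, "time", times)

# Append the labels as one extra row, as in the t501 line above.
label_row = pd.DataFrame([["label"] + labels.tolist()], columns=ts.columns)
ts_with_labels = pd.concat([ts, label_row], ignore_index=True)
print(ts_with_labels)
```

One thing to keep in mind with this layout: once string labels share a column with numeric values, pandas stores that column as `object` dtype, so the label row has to be dropped again before doing numeric work on the series.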

Upvotes: 0

jezrael

Reputation: 862641

IIUC, select the value columns by position (in the sample data, from column position 4 onward, excluding the label column), rename them using the time values from the first row (they are the same in each row), and finally transpose:

print (df)
    0   1   2     3   4   5   6     7       8
0  t1  t2  t3  t500  a1  a2  a3  a500  Label1
1  t1  t2  t3  t500  b1  b2  b3  b500  Label2
2  t1  t2  t3  t500  c1  c2  c3  c500  Label3
3  t1  t2  t3  t500  x1  x2  x3  x500  LabelX



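# after set_index the label column is no longer among the columns,
# so slice from position 4 to the end to keep the last value column
df = df.set_index([8]).iloc[:, 4:].rename(columns=dict(zip(df.columns[4:-1], df.iloc[0]))).T
print (df)
8    Label1 Label2 Label3 LabelX
t1       a1     b1     c1     x1
t2       a2     b2     c2     x2
t3       a3     b3     c3     x3
t500   a500   b500   c500   x500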
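
For the real data the split position would not be hardcoded. A rough sketch of the same idea, assuming the layout from the question (n time columns, then n value columns, then one label column); the frame `wide` and the variable names here are illustrative only:

```python
import pandas as pd

# Hypothetical toy frame with the question's layout, 3 time steps instead of 500.
wide = pd.DataFrame(
    [["t1", "t2", "t3", "a1", "a2", "a3", "Label1"],
     ["t1", "t2", "t3", "b1", "b2", "b3", "Label2"],
     ["t1", "t2", "t3", "c1", "c2", "c3", "Label3"]]
)

n = (wide.shape[1] - 1) // 2    # number of time steps, derived from the width
times = wide.iloc[0, :n]        # time values, taken from the first row
labels = wide.iloc[:, -1]       # one label per candidate

# Rows become time steps; each column is named after its candidate's label,
# so the labels are preserved as column names.
ts = pd.DataFrame(
    wide.iloc[:, n:2 * n].to_numpy().T,
    index=pd.Index(times, name="time"),
    columns=labels.tolist(),
)
print(ts)
```

With this shape, `ts["Label2"]` returns one candidate's series indexed by time.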

Upvotes: 1
