Reputation: 525
I have a DF like the following:
t1,t2,t3,...,t500, a1,a2,a3,...,a500,Label1
t1,t2,t3,...,t500, b1,b2,b3,...,b500,Label2
t1,t2,t3,...,t500, c1,c2,c3,...,c500,Label3
.
.
.
t1,t2,t3,...,t500, x1,x2,x3,...,x500,LabelX
The time array (values) is the same for all candidates (t1,t2,...,tn).
How can I transform this DF into time series data? The first (and easiest) step is to take the first half of the first row, which becomes the time column (it is the same for all candidates). Then I can split the DF vertically down the middle, take the second (values) part, transpose it, and concatenate it with the transposed time column. But then how can I preserve the labels?
I expect output something like:
t1,a1,b1,c1,...,x1
t2,a2,b2,c2,...,x2
.
.
.
tn,an,bn,cn,...,xn
But I am at a loss as to how to preserve the label for each candidate's time series values when transforming them into time series data.
Upvotes: 0
Views: 505
Reputation: 1
If you are confused about where to fit the labels, realize that LabelX is basically x501. It fits naturally at the end of the time series.
So after the process you described in the question, add a time step t501 and treat the labels like any other data point. See the example:
t1,t2,t3,...,t500, a1,a2,a3,...,a500,Label1
t1,t2,t3,...,t500, b1,b2,b3,...,b500,Label2
t1,t2,t3,...,t500, c1,c2,c3,...,c500,Label3
.
.
.
t1,t2,t3,...,t500, x1,x2,x3,...,x500,LabelX
will become
t1 ,a1, b1, c1, ...,x1
t2 ,a2, b2, c2, ...,x2
t3 ,a3, b3, c3, ...,x3
.
.
t500,a500, b500, c500, ...,x500
t501,Label1,Label2,Label3,...,LabelX
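The idea above can be sketched in pandas roughly as follows. This is only an illustration on a toy frame with 3 time steps instead of 500; the names `wide`, `long`, `candidate_i`, and the row key `"label"` (standing in for `t501`) are assumptions made for the example, not part of the original data.

```python
import pandas as pd

# Toy stand-in for the wide frame: n time columns, n value columns, one label column.
n = 3  # the question uses 500
rows = [
    [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, "Label1"],
    [0.0, 0.1, 0.2, 2.0, 2.1, 2.2, "Label2"],
]
wide = pd.DataFrame(rows)

# Shared time axis, taken from the first row (identical in every row).
times = wide.iloc[0, :n].tolist()

# Value block, transposed so that each candidate becomes one column.
values = wide.iloc[:, n:2 * n].T
values.index = times
values.columns = [f"candidate_{i}" for i in range(len(wide))]  # hypothetical names

# Append the labels as one extra row at the end of the series.
label_row = pd.DataFrame(
    [wide.iloc[:, -1].tolist()], index=["label"], columns=values.columns
)
long = pd.concat([values, label_row])
print(long)
```

One trade-off to keep in mind: mixing string labels into otherwise numeric columns turns those columns into `object` dtype, so the label row would have to be dropped again before doing any arithmetic on the values.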
Upvotes: 0
Reputation: 862641
IIUC, select the value columns by position (in the sample data, from position 4 to the end, excluding the last column, which holds the labels), rename the columns using the time values from the first row (they are the same in every row), and finally transpose:
print (df)
0 1 2 3 4 5 6 7 8
0 t1 t2 t3 t500 a1 a2 a3 a500 Label1
1 t1 t2 t3 t500 b1 b2 b3 b500 Label2
2 t1 t2 t3 t500 c1 c2 c3 c500 Label3
3 t1 t2 t3 t500 x1 x2 x3 x500 LabelX
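# After set_index the label column is no longer among the columns, so slice with 4: (4:-1 would also drop the last value column)
df = df.set_index([8]).iloc[:, 4:].rename(columns=dict(zip(df.columns[4:-1], df.iloc[0]))).T
print (df)
8 Label1 Label2 Label3 LabelX
t1 a1 b1 c1 x1
t2 a2 b2 c2 x2
t3 a3 b3 c3 x3
t500 a500 b500 c500 x500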
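For a frame with an arbitrary number of time steps, the same approach might be wrapped in a small helper. This is a sketch under the assumption that the layout is exactly `n_steps` time columns, then `n_steps` value columns, then one label column; the function name `wide_to_time_series` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

def wide_to_time_series(wide: pd.DataFrame, n_steps: int) -> pd.DataFrame:
    """Return a frame indexed by time with one column per label.

    Assumes the layout: n_steps time columns, n_steps value columns, 1 label column.
    """
    # The time axis is the same in every row, so the first row is enough.
    times = wide.iloc[0, :n_steps].astype(float).to_numpy()
    # Value block: one row per candidate, one column per time step.
    values = wide.iloc[:, n_steps:2 * n_steps].to_numpy()
    # Labels become the column names, so each series keeps its label.
    labels = wide.iloc[:, -1].to_numpy()
    return pd.DataFrame(values.T, index=pd.Index(times, name="time"), columns=labels)

# Toy data with 4 time steps instead of 500.
n_steps = 4
t = np.linspace(0.0, 1.5, n_steps)
wide = pd.DataFrame([
    [*t, *(t * 2), "Label1"],
    [*t, *(t * 3), "Label2"],
])
ts = wide_to_time_series(wide, n_steps)
print(ts)
```

Keeping the labels as column names, rather than as an extra row, leaves the value columns numeric, so a single candidate can be selected with `ts["Label1"]`.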
Upvotes: 1