Reputation: 350
I have a long DataFrame with a time-series index, like this:
datetime number
2015-07-06 00:00:00 12
2015-07-06 00:10:00 55
2015-07-06 00:20:00 129
2015-07-06 00:30:00 5
2015-07-06 00:40:00 3017
2015-07-06 00:50:00 150
2015-07-06 01:00:00 347
2015-07-06 01:10:00 8
2015-07-06 01:20:00 19
... ...
I would like to reshape this so that every n consecutive rows of the column become a single row in a new table.
For example, n=3 would create:
datetime #0 #1 #2
2015-07-06 00:00:00 12 55 129
2015-07-06 00:30:00 5 3017 150
2015-07-06 01:00:00 347 8 19
... ... ... ...
I could do this with a for loop, but I am wondering whether there is a more efficient way native to pandas.
Upvotes: 2
Views: 261
Reputation: 38415
Here is one solution, assuming df has a default integer index and datetime is a regular column:
n = 3
new_df = df.groupby(df.index//n).agg({'datetime': 'first', 'number': lambda x: x.tolist()})
new_df.assign(**(new_df.number.apply(pd.Series).add_prefix('#')))
datetime number #0 #1 #2
0 2015-07-06 00:00:00 [12, 55, 129] 12 55 129
1 2015-07-06 00:30:00 [5, 3017, 150] 5 3017 150
2 2015-07-06 01:00:00 [347, 8, 19] 347 8 19
You can then drop the number column.
Edit: As @coldspeed suggested, you can combine the last two steps.
new_df = df.groupby(df.index//n).agg({'datetime': 'first', 'number': lambda x: x.tolist()})
new_df.assign(**(new_df.pop('number').apply(pd.Series).add_prefix('#')))
datetime #0 #1 #2
0 2015-07-06 00:00:00 12 55 129
1 2015-07-06 00:30:00 5 3017 150
2 2015-07-06 01:00:00 347 8 19
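For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the rows shown in the question, and it assumes a default RangeIndex with datetime as an ordinary column; the names numbers and result are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: default RangeIndex,
# datetime stored as a regular column rather than as the index).
df = pd.DataFrame({
    "datetime": pd.date_range("2015-07-06 00:00:00", periods=9, freq="10min"),
    "number": [12, 55, 129, 5, 3017, 150, 347, 8, 19],
})

n = 3

# Integer-dividing the positional index by n gives every block of n
# consecutive rows the same group key (0, 0, 0, 1, 1, 1, ...).
new_df = df.groupby(df.index // n).agg(
    {"datetime": "first", "number": lambda x: x.tolist()}
)

# Take the list column out of new_df, then expand each list into its own
# columns named #0, #1, ... and attach them to the remaining frame.
numbers = new_df.pop("number")
result = new_df.assign(**numbers.apply(pd.Series).add_prefix("#"))

print(result)
```

Because the grouping is by row position, this does not depend on the timestamps being evenly spaced. If the number of rows is not a multiple of n, the last group is shorter and its missing cells should come out as NaN.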
Upvotes: 1
Reputation: 402814
You can use groupby and apply/agg with list:
u = df.groupby(pd.Grouper(key='datetime', freq='30min'))['number'].agg(list)
pd.DataFrame(u.tolist(), index=u.index)
0 1 2
datetime
2015-07-06 00:00:00 12 55 129
2015-07-06 00:30:00 5 3017 150
2015-07-06 01:00:00 347 8 19
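A runnable sketch of the above, with the sample frame rebuilt from the question (assuming datetime is a regular column of timestamps; the name wide is only illustrative):

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: datetime is a column
# of timestamps spaced exactly 10 minutes apart).
df = pd.DataFrame({
    "datetime": pd.date_range("2015-07-06 00:00:00", periods=9, freq="10min"),
    "number": [12, 55, 129, 5, 3017, 150, 347, 8, 19],
})

# Bucket the rows into 30-minute windows; with 10-minute spacing each
# window holds 3 rows. Collect each window's values into a list.
u = df.groupby(pd.Grouper(key="datetime", freq="30min"))["number"].agg(list)

# Turn the Series of lists into a frame with one column per list position.
wide = pd.DataFrame(u.tolist(), index=u.index)

print(wide)
```

Note that this groups by time window rather than by row count, so it only matches "every n rows" when the sampling interval is regular. With gaps in the data, some windows would hold fewer values than others.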
Upvotes: 3