Reputation: 1553
I have a dataframe with about 50 columns, some of them are period_start_time, id, speed_throughput, etc. dataframe sample:
id period_start_time speed_througput ...
0 1 2017-06-14 20:00:00 6
1 1 2017-06-14 20:00:00 10
2 1 2017-06-14 21:00:00 2
3 1 2017-06-14 21:00:00 5
4 2 2017-06-14 20:00:00 8
5 2 2017-06-14 20:00:00 12
...
I have tried to go create two new columns by grouping two columns(id and period_start_time) and find avg and min of speed_trhoughput. The code I've tried:
df['Throughput_avg']=df.sort_values(['period_start_time'],ascending=False).groupby(['period_start_time','id'])[['speed_trhoughput']].max()
df['Throughput_min'] = df.groupby(['period_start_time', 'id'])[['speed_trhoughput']].min()
As you can see, there are two ways I've tried, but nothing works. The error message I received for both attempts:
TypeError:incompatible index of inserted column with frame index
I suppose you know what my output needs to be, so there is no need to post it.
Upvotes: 1
Views: 1088
Reputation: 294488
Option 1
Use agg
in a groupby
and join
to attach to main dataframe
df.join(
df.groupby(['id', 'period_start_time']).speed_througput.agg(
['mean', 'min']
).rename(columns={'mean': 'avg'}).add_prefix('Throughput_'),
on=['id', 'period_start_time']
)
id period_start_time speed_througput Throughput_avg Throughput_min
0 1 2017-06-14 20:00:00 6 8.0 6
1 1 2017-06-14 20:00:00 10 8.0 6
2 1 2017-06-14 21:00:00 2 3.5 2
3 1 2017-06-14 21:00:00 5 3.5 2
4 2 2017-06-14 20:00:00 8 10.0 8
5 2 2017-06-14 20:00:00 12 10.0 8
Option 2
Use transform
in a groupby
context and use assign
to add the new columns
g = df.groupby(['id', 'period_start_time']).speed_througput.transform
df.assign(Throughput_avg=g('mean'), Throughput_min=g('min'))
id period_start_time speed_througput Throughput_avg Throughput_min
0 1 2017-06-14 20:00:00 6 8.0 6
1 1 2017-06-14 20:00:00 10 8.0 6
2 1 2017-06-14 21:00:00 2 3.5 2
3 1 2017-06-14 21:00:00 5 3.5 2
4 2 2017-06-14 20:00:00 8 10.0 8
5 2 2017-06-14 20:00:00 12 10.0 8
Upvotes: 1