Reputation: 2153
I have a dataframe ('df') something like (simplified for this example):
index | timestamp | value
================================
001 | 2020-09-20 07:00 | 1.4
002 | 2020-09-20 07:00 | 1.5
001 | 2020-09-20 09:00 | 1.6
002 | 2020-09-20 09:00 | 1.4
001 | 2020-09-20 11:00 | 1.23
002 | 2020-09-20 11:00 | 1.46
If I do this:
grouped = df.groupby('timestamp')
I now have a groupby with three groups. I now need to add a 'date_time_trigger' column whose value is based on the position of each group:
index | timestamp | value | date_time_trigger
================================================
001 | 2020-09-20 07:00 | 1.4 | triggergroup1
002 | 2020-09-20 07:00 | 1.5 | triggergroup1
001 | 2020-09-20 09:00 | 1.6 | triggergroup2
002 | 2020-09-20 09:00 | 1.4 | triggergroup2
001 | 2020-09-20 11:00 | 1.23 | triggergroup3
002 | 2020-09-20 11:00 | 1.46 | triggergroup3
I then need to combine the groups back into the original dataframe. I've tried simply adding a column to the original dataframe, then changing its value inside a group iterator:
idx=0
df['date_time_trigger']='foo'
grouped = df.groupby('timestamp')
for name, group in grouped:
    idx = idx + 1
    group['date_time_trigger'] = 'triggergroup' + str(idx)
And as far as I can tell, the value of date_time_trigger is being set inside each group, but now I need to recombine the groups into the original dataframe df to continue with my process. The only way I can find in the docs is to apply some type of aggregation, like mean or avg, but I just needed the groupby to add the labels to each group instance. How do I get my dataframe back?
Upvotes: 1
Views: 120
Reputation: 323226
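You don't need to iterate over the groups at all. factorize assigns an integer code to each distinct timestamp, in order of first appearance, and the result can be written straight into df: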
df['group'] = df['timestamp'].factorize()[0]+1
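The line above gives plain integers (1, 2, 3). To get the 'triggergroupN' labels from the question, the codes can be turned into strings. A minimal sketch, assuming the sample data from the question (the frame is rebuilt by hand here purely for illustration):

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative only).
df = pd.DataFrame({
    "index": ["001", "002", "001", "002", "001", "002"],
    "timestamp": pd.to_datetime([
        "2020-09-20 07:00", "2020-09-20 07:00",
        "2020-09-20 09:00", "2020-09-20 09:00",
        "2020-09-20 11:00", "2020-09-20 11:00",
    ]),
    "value": [1.4, 1.5, 1.6, 1.4, 1.23, 1.46],
})

# factorize() returns (codes, uniques); codes start at 0 and follow
# the order in which each timestamp first appears.
codes, uniques = pd.factorize(df["timestamp"])

# Build the label column directly on df, so nothing needs recombining.
df["date_time_trigger"] = ["triggergroup" + str(code + 1) for code in codes]
```

Because the column is assigned on df itself, the original dataframe is never split up in the first place.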
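Method two: groupby().ngroup() numbers the groups in sorted key order (the same numbering as factorize here, because the timestamps are already sorted) and returns one value per row, aligned with df's index: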
df.groupby('timestamp').ngroup().add(1).astype(str).radd('triggergroup')
0 triggergroup1
1 triggergroup1
2 triggergroup2
3 triggergroup2
4 triggergroup3
5 triggergroup3
dtype: object
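The expression above only returns the labels; it still has to be assigned back to df. A short sketch of that step, again assuming the question's sample data (rebuilt by hand, with the 'index' column left out for brevity):

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative only).
df = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2020-09-20 07:00", "2020-09-20 07:00",
        "2020-09-20 09:00", "2020-09-20 09:00",
        "2020-09-20 11:00", "2020-09-20 11:00",
    ]),
    "value": [1.4, 1.5, 1.6, 1.4, 1.23, 1.46],
})

# ngroup() yields one 0-based group number per row, indexed like df.
group_number = df.groupby("timestamp").ngroup()

# Shift to 1-based, convert to text, prefix, and store as a new column.
df["date_time_trigger"] = "triggergroup" + (group_number + 1).astype(str)
```

Since the result shares df's index, a plain column assignment is enough and no aggregation or recombining of groups is involved.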
Upvotes: 1