Reputation: 1983
I want to know how to create a new column after groupby and apply sum calculation.
I have a data frame like this.
> df
tour_id time condA condB condC
1 10 True True True
1 20 True True True
1 30 False False False
1 40 False False False
2 15 True True True
2 25 False False False
2 30 False False False
2 45 False False False
2 50 True True True
I want to
groupby tour_id
and take a sum of time
column where condA
and condB
and condC
are all False
. Like below.
tour_id sum
1 70
2 100
Merge the result of 1 to the original data frame df
, filling with the same value in a group, and name this new column as driving
.
so the result should look like:
tour_id time condA condB condC driving
1 10 True True True 70
1 20 True True True 70
1 30 False False False 70
1 40 False False False 70
2 15 True True True 100
2 25 False False False 100
2 30 False False False 100
2 45 False False False 100
2 50 True True True 100
My attempt:
temp = df[(df.condA == True)&(df.condB == True) &(df.condC == True)]
df2 = temp.groupby('tour_id').time.sum().reset_index()
But I don't know how to merge df2
to the original df
.
Upvotes: 0
Views: 57
Reputation: 9019
df['driving'] = df['tour_id'].map(df[~df[['condA','condB','condC']].all(1)].groupby('tour_id')['time'].sum())
Yields:
tour_id time condA condB condC driving
0 1 10 True True True 70
1 1 20 True True True 70
2 1 30 False False False 70
3 1 40 False False False 70
4 2 15 True True True 100
5 2 25 False False False 100
6 2 30 False False False 100
7 2 45 False False False 100
8 2 50 True True True 100
Upvotes: 1
Reputation: 323226
Using all
to filter the data frame
df['driving']=df.tour_id.map(df[(df.iloc[:,-3:]).all(1)].groupby('tour_id').time.sum())
Upvotes: 1
Reputation: 3855
You can turn your groupby dataframe into a dict, and then map the tour_id column through it:
df['driving'] = df.tour_id.map(temp.groupby('tour_id').time.sum().to_dict())
Upvotes: 0