Ruslan Pylypyuk
Ruslan Pylypyuk

Reputation: 57

How to group by with sum transformation

I want to transform dataframe so all ltv in identical (date, country_code, media_source ) would sum up

Here is dataframe that I have:

date        country_code    media_source    ltv      cost
2020-07-01  AE               Facebook Ads   22.5    59.30
2020-07-01  AE               Facebook Ads   0.00000 59.30
2020-07-01  AE               Facebook Ads   11.2    59.30
2020-07-01  AT               Facebook Ads   11.2    7.53
2020-07-01  AU               Facebook Ads   11.2    73.72
2020-07-01  AU               Facebook Ads   0.00000 73.72
2020-07-01  AU               Facebook Ads   56.3    73.72
2020-07-01  AU               Facebook Ads   11.2    73.72
2020-07-01  AU               Facebook Ads   0.00000 73.72
2020-07-01  AU               Facebook Ads   22.5    73.72
2020-07-01  BE               Facebook Ads   33.7    13.26
2020-07-01  BM               Facebook Ads   56.3    0.92
2020-07-01  BN               Facebook Ads   11.2    1.22
2020-07-01  BR               Facebook Ads   11.2    249.99
2020-07-01  BR               Facebook Ads   0.00000 249.99
2020-07-01  BR               Facebook Ads   11.2    249.99
2020-07-01  BR               Facebook Ads   11.2    249.99
2020-07-01  BR               Facebook Ads   0.00000 249.99
2020-07-01  BR               Facebook Ads   11.2    249.99

and here is expected dataframe:

date       country_code media_source      ltv(sum)   cost
2020-07-01    AE        Facebook Ads        33.7    59.30
2020-07-01    AT        Facebook Ads        11.2    7.53
2020-07-01    AU        Facebook Ads        101.2   73.72
2020-07-01    BE        Facebook Ads        33.7    0.92
2020-07-01    BN        Facebook Ads        11.2    1.22
2020-07-01    BR        Facebook Ads        44.8    249.99

How could i transform it into this dataframe

Upvotes: 0

Views: 23

Answers (1)

Ynjxsjmh
Ynjxsjmh

Reputation: 30052

You can try

out = (df.groupby(['date', 'country_code', 'media_source'], as_index=False)
       .agg(**{'ltv(sum)': ('ltv', 'sum'), 'cost': ('cost', 'first')}))
print(out)

         date country_code  media_source  ltv(sum)    cost
0  2020-07-01           AE  Facebook Ads      33.7   59.30
1  2020-07-01           AT  Facebook Ads      11.2    7.53
2  2020-07-01           AU  Facebook Ads     101.2   73.72
3  2020-07-01           BE  Facebook Ads      33.7   13.26
4  2020-07-01           BM  Facebook Ads      56.3    0.92
5  2020-07-01           BN  Facebook Ads      11.2    1.22
6  2020-07-01           BR  Facebook Ads      44.8  249.99

Upvotes: 1

Related Questions