Reputation: 4842
I have a df
id qty location
1 1 Paris
1 10 Rome
1 50 London
2 100 Paris
2 40 Oslo
3 2 Paris
3 9 Washington
3 10 London
I am trying to get the total sum of qty
and sum for Paris, Rome and other location
so that the df
would look like this:
id qty_total qty_Paris qty_Rome qty_other
1 61 1 10 50
2 140 100 0 40
3 21 2 0 19
What I tried:
df_temp = df.groupby('id')['qty'].sum()
df_temp['qty_Paris'] = df[df.groupby(['id', 'location'])[['qty']]
.sum().reset_index()['location']=='Paris']
But I get an error:
Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).
How should I approach this problem so that I would get the df
I desire?
Upvotes: 1
Views: 54
Reputation: 14949
One way:
df.loc[~df.location.isin(['Paris', 'Rome']), 'location'] = 'Others'
df1 = df.pivot_table(index='id', columns='location', values='qty',
aggfunc=sum, fill_value=0,).add_prefix('qty_').rename_axis(columns=None)
df1['qty_total'] = df1.sum(1)
qty_Others qty_Paris qty_Rome qty_total
id
1 50 1 10 61
2 40 100 0 140
3 19 2 0 21
Complete Example:
df = pd.DataFrame({'id': {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 3},
'qty': {0: 1, 1: 10, 2: 50, 3: 100, 4: 40, 5: 2, 6: 9, 7: 10},
'location': {0: 'Paris',
1: 'Rome',
2: 'London',
3: 'Paris',
4: 'Oslo',
5: 'Paris',
6: 'Washington',
7: 'London'}})
df.loc[~df.location.isin(['Paris', 'Rome']), 'location'] = 'Others'
df1 = df.pivot_table(index='id', columns='location', values='qty',
aggfunc=sum, fill_value=0,).add_prefix('qty_').rename_axis(columns=None)
df1['qty_total'] = df1.sum(1)
Upvotes: 4