Reputation: 51
I have 2 dataframes:
purchases = pd.DataFrame([['Alice', 'sweeties', 4],
['Bob', 'chocolate', 5],
['Alice', 'chocolate', 3],
['Claudia', 'juice', 2]],
columns=['client', 'item', 'quantity'])
goods = pd.DataFrame([['sweeties', 15],
['chocolate', 7],
['juice', 8],
['lemons', 3]], columns=['good', 'price'])
and I want to transform purchases with cols and indexes alike at this photo:
My first thought was to use pivot:
purchases.pivot(columns="item", values="quantity")
Output:
The problem is: I also need the lemons column in the pivot result because it's present in the goods dataframe (just filled with None values).
How can I accomplish that?
Upvotes: 2
Views: 96
Reputation: 34086
You can use df.merge
with df.pivot
:
In [3626]: x = goods.merge(purchases, left_on='good', right_on='item', how='left')
In [3628]: x['total'] = x.price * x.quantity # you can tweak this calculation
In [3634]: res = x[['good', 'client', 'total']].pivot('client', 'good', 'total').dropna(how='all').fillna(0)
In [3635]: res
Out[3635]:
good chocolate juice lemons sweeties
client
Alice 21.0 0.0 0.0 60.0
Bob 35.0 0.0 0.0 0.0
Claudia 0.0 16.0 0.0 0.0
Upvotes: 1
Reputation: 150785
You can chain with reindex
:
purchases.pivot(columns="item", values="quantity").reindex(goods['good'], axis=1)
Output:
good sweeties chocolate juice lemons
0 4.0 NaN NaN NaN
1 NaN 5.0 NaN NaN
2 NaN 3.0 NaN NaN
3 NaN NaN 2.0 NaN
Upvotes: 1