Reputation: 3267
I have the following df qty
which has date with repetitive items with qty.
DF qty
| date | qty | item | ccy |
+------------+------+----------+-----+
| 2015-01-01 | 700 | CB04 box | USD |
| 2015-01-01 | 500 | CB04 box | USD |
| 2015-01-01 | 1500 | AB01 box | USD |
| 2015-01-02 | 550 | CB04 box | USD |
| 2015-01-02 | 750 | AB01 box | USD |
| 2015-01-02 | 500 | AB01 box | USD |
| 2015-01-03 | 100 | AB01 box | USD |
| 2015-01-03 | 200 | AB01 box | USD |
| 2015-01-03 | 300 | AB01 box | USD |
I need some suggestion on, how can I make this DF qty
be like the following:
DF qty
| date | qty | item | ccy |
+------------+------+----------+-----+
| 2015-01-01 | 1200 | CB04 box | USD |
| 2015-01-01 | 1500 | AB01 box | USD |
| 2015-01-02 | 550 | CB04 box | USD |
| 2015-01-02 | 1250 | AB01 box | USD |
| 2015-01-03 | 600 | AB01 box | USD |
what I was doing was, making a subset df
for each date
and then group them by qty
.
qty = qty[qty[date] == '2015-01-01']
then
## Sum items lots
qty = qty.groupby('item').agg({'date': 'first',
'ccy':'first',
'qty' : 'sum'}).reset_index()
This process is very repetitive, I want some thing that will do this for all dates
and sum up the item
for qty
Upvotes: 0
Views: 304
Reputation: 59519
You can groupby
multiple columns at the same time. In this case it looks like you want find the total sum for a given item on the same day in the same currency. This means you want to groupby ['date', 'item', 'ccy']
df.groupby(['date', 'item', 'ccy']).sum().reset_index()
Outputs:
date item ccy qty
0 2015-01-01 AB01_box USD 1500
1 2015-01-01 CB04_box USD 1200
2 2015-01-02 AB01_box USD 1250
3 2015-01-02 CB04_box USD 550
4 2015-01-03 AB01_box USD 600
Upvotes: 1