Reputation: 2615
I have a data frame df1 with data that looks like this:
Item Store Sales Dept
0 1 1 5 A
1 1 2 3 A
2 1 3 4 A
3 2 1 3 A
4 2 2 3 A
I then want to use group by to see the total sales by item:
df2 = df1.groupby(['Item']).agg({'Item':'first','Sales':'sum'})
Which gives me:
Item Sales
0 1 12
1 2 6
And then I add a column with the rank of the item in terms of number of sales:
df2['Item Rank'] = df2['Sales'].rank(ascending=False,method='min').astype(int)
So that I get:
Item Sales Item Rank
0 1 12 1
1 2 6 2
I now want to add the Dept column to df2, so that I have
Item Sales Item Rank Dept
0 1 12 1 A
1 2 6 2 A
But everything I have tried has failed. I either get an empty column, when I try to add the column in from the beginning, or a df with the wrong size if I try to concatenate the new df with the column from the original df.
Upvotes: 0
Views: 179
Reputation: 323316
df.groupby(['Item']).agg({'Item':'first','Sales':'sum','Dept': 'first'}).\
assign(Itemrank=df.Sales.rank(ascending=False,method='min').astype(int) )
Out[64]:
Item Dept Sales Itemrank
Item
1 1 A 12 3
2 2 A 6 2
Upvotes: 2
Reputation: 5609
This is unusual but if you can add the Dept
column when you're doing the groupby itself:
A simple option is just to hard code the value if you already know what it needs to be:
df2 = df1.groupby(['Item']).agg({'Item':'first',
'Sales':'sum',
'Dept': lambda x: 'A'})
Or you could take it from the dataframe itself:
df2 = df1.groupby(['Item']).agg({'Item':'first',
'Sales':'sum',
'Dept': lambda x: df1['Dept'].iloc[0]})
Upvotes: 1