Alex Kinman
Alex Kinman

Reputation: 2615

Adding a column from the original data frame to a groupby data frame?

I have a data frame df1 with data that looks like this:

      Item Store Sales Dept 
   0  1     1     5     A
   1  1     2     3     A
   2  1     3     4     A
   3  2     1     3     A
   4  2     2     3     A  

I then want to use group by to see the total sales by item:

df2 = df1.groupby(['Item']).agg({'Item':'first','Sales':'sum'})

Which gives me:

      Item Sales  
   0  1     12     
   1  2     6     

And then I add a column with the rank of the item in terms of number of sales:

 df2['Item Rank'] = df2['Sales'].rank(ascending=False,method='min').astype(int) 

So that I get:

      Item Sales Item Rank 
   0  1     12       1
   1  2     6        2 

I now want to add the Dept column to df2, so that I have

      Item Sales Item Rank Dept 
   0  1     12       1      A
   1  2     6        2      A

But everything I have tried has failed. I either get an empty column, when I try to add the column in from the beginning, or a df with the wrong size if I try to concatenate the new df with the column from the original df.

Upvotes: 0

Views: 179

Answers (2)

BENY
BENY

Reputation: 323316

df.groupby(['Item']).agg({'Item':'first','Sales':'sum','Dept': 'first'}).\
   assign(Itemrank=df.Sales.rank(ascending=False,method='min').astype(int) )
Out[64]: 
      Item Dept  Sales  Itemrank
Item                            
1        1    A     12         3
2        2    A      6         2

Upvotes: 2

nitred
nitred

Reputation: 5609

This is unusual but if you can add the Dept column when you're doing the groupby itself:

A simple option is just to hard code the value if you already know what it needs to be:

df2 = df1.groupby(['Item']).agg({'Item':'first',
                                 'Sales':'sum',
                                 'Dept': lambda x: 'A'})

Or you could take it from the dataframe itself:

df2 = df1.groupby(['Item']).agg({'Item':'first',
                                 'Sales':'sum',
                                 'Dept': lambda x: df1['Dept'].iloc[0]})

Upvotes: 1

Related Questions