user3709511

Reputation: 221

Pandas group multiple columns as set and sort on additional columns

I have a data frame with the data formatted as follows:

  Exchange Ticker                        Name  Year  Dividend_Cover_Ratio Dividend Net_Return        
0     NYSE     VZ  VERIZON COMMUNICATIONS INC  2013                  1.93     5.2%     41.69%             
1     NYSE     VZ  VERIZON COMMUNICATIONS INC  2014                  1.13    5.38%     14.79%             
2     NYSE     VZ  VERIZON COMMUNICATIONS INC  2015                  1.59    6.62%     24.74%             
3     NYSE     VZ  VERIZON COMMUNICATIONS INC  2016                  1.42    4.51%      28.7%            
4     NYSE     VZ  VERIZON COMMUNICATIONS INC  2017                  3.18    4.43%     -1.81%  
50    NYSE    MCD              MCDONALDS CORP  2013                  1.79    3.66%     33.83%             
51    NYSE    MCD              MCDONALDS CORP  2014                  1.48    3.85%     14.03%            
52    NYSE    MCD              MCDONALDS CORP  2015                  1.40     3.1%     51.36%            
53    NYSE    MCD              MCDONALDS CORP  2016                  1.52    3.06%     11.34%            
54    NYSE    MCD              MCDONALDS CORP  2017                  1.68    2.24%     39.44%    

I'd like to treat Exchange, Ticker, and Year as a set and rank these sets from highest to lowest dividend (based on the Dividend column).

I think I might have to average the Dividend column per set, rank (index?) the rows by that average, and then drop the average column, since I don't want to see it in the output.

Can someone suggest some code that would achieve this? I've looked at other posts but nothing worked for me (due to the grouping of multiple columns I suspect).

Upvotes: 1

Views: 32

Answers (1)

jpp

Reputation: 164673

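First add an average dividend series by group. The Dividend column appears to hold strings such as '5.2%', which cannot be averaged directly, so convert it to a number first:

div = df['Dividend'].str.rstrip('%').astype(float)
df['Div_Grp_Avg'] = div.groupby([df['Exchange'], df['Ticker'], df['Year']]).transform('mean')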

Then sort by this new series:

df = df.sort_values('Div_Grp_Avg', ascending=False)

Finally, drop the helper column:

df = df.drop(columns='Div_Grp_Avg')
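
The three steps above can be put together as a minimal, self-contained sketch. It assumes that Dividend is stored as text with a trailing percent sign (as the printed frame suggests) and uses a four-row sample copied from the question; the names dividend_pct, group_keys, and result are only illustrative.

```python
import pandas as pd

# Small sample shaped like the question's data (values copied from the post).
df = pd.DataFrame({
    "Exchange": ["NYSE"] * 4,
    "Ticker": ["VZ", "VZ", "MCD", "MCD"],
    "Year": [2013, 2014, 2013, 2014],
    "Dividend": ["5.2%", "5.38%", "3.66%", "3.85%"],
})

# Assumption: Dividend is text like "5.2%", so parse it to float before averaging.
dividend_pct = df["Dividend"].str.rstrip("%").astype(float)

# Per-group mean, broadcast back to every row of its group.
group_keys = [df["Exchange"], df["Ticker"], df["Year"]]
df["Div_Grp_Avg"] = dividend_pct.groupby(group_keys).transform("mean")

# Sort by the helper column, then drop it so it never appears in the result.
result = (
    df.sort_values("Div_Grp_Avg", ascending=False)
      .drop(columns="Div_Grp_Avg")
      .reset_index(drop=True)
)
print(result)
```

In the sample shown in the question, each (Exchange, Ticker, Year) combination has exactly one row, so the group mean equals that row's own dividend. If the goal is instead to rank whole tickers by their multi-year average, removing df["Year"] from group_keys should give that ordering.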

Upvotes: 1
