Dusty

Reputation: 53

Python pandas: mean and weighted average

I am new to pandas. Any help would be much appreciated.

This is my raw data:

           Feed  Close    Sector  Market_Cap
Date
2015-09-18    A   5.60  Property          50
2015-09-21    A   5.60  Property          20
2015-09-23    A   5.60  Property          30
2015-09-18  ABC   0.67  Property          50
2015-09-21  ABC   0.66  Property          80
2015-09-18   DA   0.67    Mining          65
2015-09-21   KK   1.66    Mining          80

What I would like to do is this:

1 Create a new column called Mean that holds the average Market_Cap for each Feed.

2 Compute a weighted average per Sector: each feed's Mean divided by the sum of the feed means in its Sector.

This is what I want:
           Feed  Close    Sector  Market_Cap   Mean  Sector_WeightedAvg
Date
2015-09-18    A   5.60  Property          50  33.33    33.33/(33.33+65)
2015-09-21    A   5.60  Property          20  33.33    33.33/(33.33+65)
2015-09-23    A   5.60  Property          30  33.33    33.33/(33.33+65)
2015-09-18  ABC   0.67  Property          50  65       65/(33.33+65)
2015-09-21  ABC   0.66  Property          80  65       65/(33.33+65)
2015-09-18   DA   0.67    Mining          65  65       65/(65+80)
2015-09-21   KK   1.66    Mining          80  80       80/(65+80)

This is my current code for the mean, which gives me NaN:

df3= pd.DataFrame(df3)
df3['Mean'] = df3.groupby(by=['Sector'])['Market_Cap'].mean()

           Feed  Close    Sector  Market_Cap  Mean
Date
2015-09-18    A   5.60  Property          50   NaN
2015-09-21    A   5.60  Property          20   NaN
2015-09-23    A   5.60  Property          30   NaN
2015-09-18  ABC   0.67  Property          50   NaN

And this is my code for the weighted average:

df2['WeightedAverage'] = df3['Market_Cap'].value / df3['Mean'].value

I got the error:

AttributeError: 'Series' object has no attribute 'value'

Upvotes: 1

Views: 1044

Answers (2)

jezrael

Reputation: 862481

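IIUC, you can use groupby with transform('mean'). Unlike mean(), transform returns one value per original row, so the result can be assigned straight back to df3.

The weighted average is the Mean column divided by the sum of the unique Mean values within each Sector group.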

print(df3)
          Feed  Close    Sector  Market_Cap
Date                                        
2015-09-18    A   5.60  Property          50
2015-09-21    A   5.60  Property          20
2015-09-23    A   5.60  Property          30
2015-09-18  ABC   0.67  Property          50
2015-09-21  ABC   0.66  Property          80
2015-09-18   DA   0.67    Mining          65
2015-09-21   KK   1.66    Mining          80

df3['Mean'] = df3.groupby(by=['Feed'])['Market_Cap'].transform('mean')
df3['WeightedAverage'] = df3['Mean'] / df3.groupby(by=['Sector'])['Mean'].transform(lambda x: sum(x.unique()))
print(df3)
           Feed  Close    Sector  Market_Cap       Mean  WeightedAverage
Date                                                                    
2015-09-18    A   5.60  Property          50  33.333333         0.338983
2015-09-21    A   5.60  Property          20  33.333333         0.338983
2015-09-23    A   5.60  Property          30  33.333333         0.338983
2015-09-18  ABC   0.67  Property          50  65.000000         0.661017
2015-09-21  ABC   0.66  Property          80  65.000000         0.661017
2015-09-18   DA   0.67    Mining          65  65.000000         0.448276
2015-09-21   KK   1.66    Mining          80  80.000000         0.551724
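
For anyone who wants to run this end to end, here is a minimal Python 3 sketch of the same approach. The DataFrame construction is an assumption based on the table in the question, and the helper name per_feed is only for illustration. It also suggests why the original attempt produced NaN: groupby(...).mean() returns one row per group, indexed by the group key rather than by Date, so assigning it to a column finds no matching index labels.

```python
import pandas as pd

# Rebuild the sample data from the question (construction is assumed).
df3 = pd.DataFrame(
    {
        "Feed": ["A", "A", "A", "ABC", "ABC", "DA", "KK"],
        "Close": [5.60, 5.60, 5.60, 0.67, 0.66, 0.67, 1.66],
        "Sector": ["Property"] * 5 + ["Mining"] * 2,
        "Market_Cap": [50, 20, 30, 50, 80, 65, 80],
    },
    index=pd.to_datetime(
        ["2015-09-18", "2015-09-21", "2015-09-23", "2015-09-18",
         "2015-09-21", "2015-09-18", "2015-09-21"]
    ),
)
df3.index.name = "Date"

# mean() collapses to one row per group, indexed by Feed, not by Date.
per_feed = df3.groupby("Feed")["Market_Cap"].mean()
print(per_feed.index.tolist())

# transform('mean') keeps the original row index, so assignment lines up.
df3["Mean"] = df3.groupby("Feed")["Market_Cap"].transform("mean")

# Denominator: sum of the distinct feed means within each sector.
sector_total = df3.groupby("Sector")["Mean"].transform(lambda s: s.unique().sum())
df3["WeightedAverage"] = df3["Mean"] / sector_total
print(df3)
```

One caveat with summing unique values: if two feeds in the same sector happened to have exactly the same mean, unique() would count that value only once.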

Upvotes: 1

Sandeep S

Reputation: 218

Try a combination of mean() and transform('sum'): compute the mean per (Sector, Feed) pair, then sum those means within each Sector.

In [5]: df
Out[5]: 
   Close Feed  Market_Cap    Sector
0   5.60    A          50  Property
1   5.60    A          20  Property
2   5.60    A          30  Property
3   0.67  ABC          50  Property
4   0.66  ABC          80  Property
5   0.67   DA          65    Mining
6   1.66   KK          80    Mining

In [6]: g = df.groupby(['Sector', 'Feed'])

..

In [7]: c = g.Market_Cap.mean()

In [8]: c
Out[8]: 
Sector    Feed
Mining    DA      65.000000
          KK      80.000000
Property  A       33.333333
          ABC     65.000000
Name: Market_Cap, dtype: float64

In [9]: d = c.groupby(level=0).transform('sum')

In [10]: d
Out[10]: 
Sector    Feed
Mining    DA      145.000000
          KK      145.000000
Property  A        98.333333
          ABC      98.333333
dtype: float64

..

In [11]: df['Mean'] = df.apply(lambda x: c[x.Sector, x.Feed], axis=1)

In [12]: df['Weighted_Avg'] = df.apply(lambda x: c[x.Sector, x.Feed] / d[x.Sector, x.Feed], axis=1)

In [13]: df
Out[13]: 
   Close Feed  Market_Cap    Sector       Mean  Weighted_Avg
0   5.60    A          50  Property  33.333333      0.338983
1   5.60    A          20  Property  33.333333      0.338983
2   5.60    A          30  Property  33.333333      0.338983
3   0.67  ABC          50  Property  65.000000      0.661017
4   0.66  ABC          80  Property  65.000000      0.661017
5   0.67   DA          65    Mining  65.000000      0.448276
6   1.66   KK          80    Mining  80.000000      0.551724
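
The row-wise apply calls above can be slow on large frames. As a possible alternative, here is a sketch that computes the same two columns on the small per-feed table and merges them back. The names feed_mean and result are illustrative, and the DataFrame construction is assumed from the output shown in this answer.

```python
import pandas as pd

# Sample data as shown in this answer (construction is assumed).
df = pd.DataFrame(
    {
        "Close": [5.60, 5.60, 5.60, 0.67, 0.66, 0.67, 1.66],
        "Feed": ["A", "A", "A", "ABC", "ABC", "DA", "KK"],
        "Market_Cap": [50, 20, 30, 50, 80, 65, 80],
        "Sector": ["Property"] * 5 + ["Mining"] * 2,
    }
)

# One row per (Sector, Feed) pair holding that feed's mean market cap.
feed_mean = (
    df.groupby(["Sector", "Feed"])["Market_Cap"]
    .mean()
    .rename("Mean")
    .reset_index()
)

# Each feed's share of the summed feed means in its sector.
feed_mean["Weighted_Avg"] = (
    feed_mean["Mean"] / feed_mean.groupby("Sector")["Mean"].transform("sum")
)

# A left merge keeps the row order of df and attaches both new columns.
result = df.merge(feed_mean, on=["Sector", "Feed"], how="left")
print(result)
```

Because the sum runs over one row per feed, two feeds with identical means are both counted in the sector total.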

Upvotes: 0
