navy_green
navy_green

Reputation: 111

getting weighted average then grouping in pandas

I have the following dataframe.

      weight       x     value
0          5     -8.7        2
1          9     -8.7        3
2         12    -21.4       10
3         32    -21.4       15

I need to get weighted average of the value and grouped on x. Result will be:

-8.7: (5/(5+9) * 2) + ((9/14) * 3) = 2.64

-21.4: ((12/44) * 10) + ((32/44) * 15) = 13.63

         x     weighted_value
0     -8.7               2.64
1    -21.4              13.63

Upvotes: 1

Views: 68

Answers (2)

Divakar
Divakar

Reputation: 221514

Here's a vectorized approach using NumPy tools -

# Get weighted averages and corresponding unique x's
unq,ids = np.unique(df.x,return_inverse=True)
weight_avg = np.bincount(ids,df.weight*df.value)/np.bincount(ids,df.weight)

# Store into a dataframe
df_out = pd.DataFrame(np.column_stack((unq,weight_avg)),columns=['x','wghts'])

Sample run -

In [97]: df
Out[97]: 
   weight     x  value
0       5  -8.7      2
1       9  -8.7      3
2      12 -21.4     10
3      32 -21.4     15

In [98]: df_out
Out[98]: 
      x      wghts
0 -21.4  13.636364
1  -8.7   2.642857

Upvotes: 0

Alicia Garcia-Raboso
Alicia Garcia-Raboso

Reputation: 13913

numpy.average admits a weights argument:

import io
import numpy as np
import pandas as pd

data = io.StringIO('''\
      weight       x     value
0          5     -8.7        2
1          9     -8.7        3
2         12    -21.4       10
3         32    -21.4       15
''')
df = pd.read_csv(data, delim_whitespace=True)

df.groupby('x').apply(lambda g: np.average(g['value'], weights=g['weight']))

Output:

x
-21.4    13.636364
-8.7      2.642857
dtype: float64

Upvotes: 1

Related Questions