TmSmth

Reputation: 452

Groupby mean ignoring zero

I have a pivot table with this structure:

        correl
        stock_b      AAAA    CCCC    DDDD
stock_a date            
AAAA    2018-01-02   0.00    0.00    0.00
        2018-01-10   0.00    0.00    0.00
        2018-01-15   0.00   69.45   69.14
        2018-01-31   0.00    0.00    0.00
BBBB    2018-01-02   0.00   43.29   67.38
        2018-01-14   0.00    0.00    0.00
        2018-01-15  50.16   42.74   67.47
        2018-01-31   0.00   42.74   67.47

I would like to group by stock_a and take the mean of each column, ignoring zeros. The result should look like this:

         correl
stock_b  AAAA    CCCC    DDDD
stock_a         
AAAA     0.00   69.45   69.14
BBBB    50.16   42.92   67.44

df.groupby('stock_a').mean() takes the zeros into account, so the result is not what I want.

Upvotes: 0

Views: 1038

Answers (2)

Henry Yik

Reputation: 22523

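Mask the zeros as NaN (mean skips NaN by default), then groupby and mean. The final fillna(0) puts 0 back for groups that have no non-zero values: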

print(df[df.ne(0)].groupby("stock_a").mean().fillna(0))
   
          AAAA       CCCC   DDDD
stock_a                         
AAAA      0.00  69.450000  69.14
BBBB     50.16  42.923333  67.44
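
To try the masking approach end to end, here is a minimal self-contained sketch. It rebuilds the data from the question with flat column names (an assumption: the real pivot table has an extra "correl" column level, which is dropped here for brevity) and groups on the stock_a index level.

```python
import pandas as pd

# Rebuild the question's data; flat columns are a simplifying assumption.
index = pd.MultiIndex.from_tuples(
    [
        ("AAAA", "2018-01-02"), ("AAAA", "2018-01-10"),
        ("AAAA", "2018-01-15"), ("AAAA", "2018-01-31"),
        ("BBBB", "2018-01-02"), ("BBBB", "2018-01-14"),
        ("BBBB", "2018-01-15"), ("BBBB", "2018-01-31"),
    ],
    names=["stock_a", "date"],
)
df = pd.DataFrame(
    {
        "AAAA": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 50.16, 0.0],
        "CCCC": [0.0, 0.0, 69.45, 0.0, 43.29, 0.0, 42.74, 42.74],
        "DDDD": [0.0, 0.0, 69.14, 0.0, 67.38, 0.0, 67.47, 67.47],
    },
    index=index,
)

# Indexing with the boolean frame df.ne(0) turns every zero into NaN.
masked = df[df.ne(0)]

# mean() skips NaN, so only non-zero values are averaged.
# Groups with no non-zero value come out as NaN and are reset to 0.
result = masked.groupby(level="stock_a").mean().fillna(0)
print(result.round(2))
```

Note that fillna(0) makes an all-zero group indistinguishable from a group whose non-zero values happen to average to 0; drop it if you would rather see NaN there.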

Upvotes: 1

SimonR

Reputation: 1824

You can use a 'custom' aggregate function for your groupby using agg. This can be a lambda function, but for readability, I prefer a proper function like this:

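def mean_excluding_zero(values):
    # Keep only the non-zero entries of the group.
    nonzero = [elem for elem in values if elem != 0]
    # An all-zero group has nothing to average: return 0 instead of dividing by zero.
    return sum(nonzero) / len(nonzero) if nonzero else 0

df.groupby('stock_a').agg(mean_excluding_zero)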
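
To check how a custom aggregate like this behaves, including on a group that contains only zeros, here is a small self-contained sketch. The toy frame and its columns x and y are made up for illustration. The function is a variant that works on the Series pandas passes in for each group and column, and returns 0.0 for an all-zero group, matching the 0.00 in the desired output.

```python
import pandas as pd

def mean_excluding_zero(values):
    """Mean of the non-zero entries; 0.0 when every entry is zero."""
    nonzero = values[values != 0]
    return nonzero.mean() if len(nonzero) else 0.0

# Hypothetical toy data: group "AAAA" has only zeros in column "x".
df = pd.DataFrame(
    {
        "stock_a": ["AAAA", "AAAA", "BBBB", "BBBB"],
        "x": [0.0, 0.0, 50.0, 0.0],
        "y": [0.0, 60.0, 40.0, 20.0],
    }
)

plain = df.groupby("stock_a").mean()                      # zeros count
custom = df.groupby("stock_a").agg(mean_excluding_zero)   # zeros ignored

print(plain)
print(custom)
```

A Python-level function passed to agg is called once per group and column, so on large frames it will usually be slower than masking the zeros and calling the built-in mean.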

Upvotes: 2
