Reputation: 452
I have a pivot table with this structure:
                    correl
stock_b               AAAA   CCCC   DDDD
stock_a date
AAAA    2018-01-02    0.00   0.00   0.00
        2018-01-10    0.00   0.00   0.00
        2018-01-15    0.00  69.45  69.14
        2018-01-31    0.00   0.00   0.00
BBBB    2018-01-02    0.00  43.29  67.38
        2018-01-14    0.00   0.00   0.00
        2018-01-15   50.16  42.74  67.47
        2018-01-31    0.00  42.74  67.47
I would like to group by stock_a and take the mean of each column, ignoring zeros. The result would look like:
         correl
stock_b    AAAA   CCCC   DDDD
stock_a
AAAA       0.00  69.45  69.14
BBBB      50.16  42.92  67.44
df.groupby('stock_a').mean() takes the zeros into account, so the values are not what I would like to have.
Upvotes: 0
Views: 1038
Reputation: 22523
Mask the zeros to NaN first, then just groupby and mean (mean skips NaN by default):
print (df[df.ne(0)].groupby("stock_a").mean().fillna(0))
          AAAA       CCCC   DDDD
stock_a
AAAA      0.00  69.450000  69.14
BBBB     50.16  42.923333  67.44
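For anyone who wants to reproduce this, here is a minimal sketch. It assumes the data is a plain DataFrame indexed by (stock_a, date) with single-level columns; the construction below is my reconstruction of the question's table, not the asker's actual code.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data (dates kept as strings).
index = pd.MultiIndex.from_tuples(
    [
        ("AAAA", "2018-01-02"), ("AAAA", "2018-01-10"),
        ("AAAA", "2018-01-15"), ("AAAA", "2018-01-31"),
        ("BBBB", "2018-01-02"), ("BBBB", "2018-01-14"),
        ("BBBB", "2018-01-15"), ("BBBB", "2018-01-31"),
    ],
    names=["stock_a", "date"],
)
df = pd.DataFrame(
    [
        [0.00, 0.00, 0.00],
        [0.00, 0.00, 0.00],
        [0.00, 69.45, 69.14],
        [0.00, 0.00, 0.00],
        [0.00, 43.29, 67.38],
        [0.00, 0.00, 0.00],
        [50.16, 42.74, 67.47],
        [0.00, 42.74, 67.47],
    ],
    index=index,
    columns=["AAAA", "CCCC", "DDDD"],
)

# df.ne(0) is True where a value is non-zero; indexing with it turns zeros into NaN.
masked = df[df.ne(0)]

# mean() skips NaN, so only non-zero values are averaged.
# A group with no non-zero values yields NaN, which fillna(0) turns back into 0.
result = masked.groupby("stock_a").mean().fillna(0)
print(result)
```

The fillna(0) at the end only matters for groups that are entirely zero, such as column AAAA for stock AAAA here.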
Upvotes: 1
Reputation: 1824
You can pass a custom aggregate function to your groupby using agg. This can be a lambda function, but for readability I prefer a proper named function like this:
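def mean_excluding_zero(values):
    # Keep only the non-zero entries of the group.
    nonzero = [elem for elem in values if elem != 0]
    # Return 0 when every value is zero, instead of dividing by zero.
    return sum(nonzero) / len(nonzero) if nonzero else 0

df.groupby('stock_a').agg(mean_excluding_zero)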
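One thing to watch with a custom aggregate is a group whose values are all zero, such as column AAAA for stock AAAA in the question. Here is a rough sketch of a variant that handles that case with Series operations; the small frame and its column names x and y are made up for illustration.

```python
import pandas as pd

# Small made-up frame for illustration; "x" and "y" are hypothetical columns.
df = pd.DataFrame(
    {
        "stock_a": ["AAAA", "AAAA", "BBBB", "BBBB"],
        "x": [0.0, 0.0, 0.0, 50.16],
        "y": [0.0, 69.45, 43.29, 42.74],
    }
).set_index("stock_a")


def mean_excluding_zero(s):
    # agg calls this once per column within each group, passing a Series.
    nonzero = s[s != 0]
    # Fall back to 0.0 when the group has no non-zero values.
    return nonzero.mean() if len(nonzero) else 0.0


result = df.groupby("stock_a").agg(mean_excluding_zero)
print(result)
```

Returning 0.0 for an all-zero group matches the output the question asks for; returning NaN instead would be just as reasonable if you want to tell "no data" apart from a real zero.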
Upvotes: 2