Reputation: 499
I need to replace each run of consecutive zeros in a column, together with the value that follows the run, with the average of those zeros and that following value. Is there an elegant way to do that without iterating over the rows one by one?
import datetime as dt
import pandas as pd
dates = pd.date_range(start=dt.date(2019, 1, 1), end=dt.date(2019, 1, 18), freq='1D')
data = [3, 1, 3.3, 0, 0, 6, 2.7, 3, 4, 2.1, 0, 0, 0, 0, 0, 12, 8, 7]
df = pd.DataFrame(data=data, index=dates, columns=['how it is now'])
# desired result: every run of zeros and the value right after it become their common mean
df['how it should be'] = [3, 1, 3.3, 2, 2, 2, 2.7, 3, 4, 2.1, 2, 2, 2, 2, 2, 2, 8, 7]
Thanks.
Upvotes: 1
Views: 45
Reputation: 61910
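You could use simple boolean indexing and numpy.where. Note that this fills every flagged row with one mean taken over all flagged rows, so it matches the expected output here only because both runs of zeros happen to average 2.0:
import numpy as np
df = df[['how it is now']].rename(columns={'how it is now': 'current'})  # short column name used below
mask = df.current.eq(0)  # rows that are zero
shift_mask = mask.shift(fill_value=False)  # rows that come right after a zero
to_fill = mask | shift_mask
# where to_fill is True use the mean of all flagged rows, otherwise keep the old value (df.current)
df['result'] = np.where(to_fill, df.current[to_fill].mean(), df.current)
print(df)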
Output
current result
2019-01-01 3.0 3.0
2019-01-02 1.0 1.0
2019-01-03 3.3 3.3
2019-01-04 0.0 2.0
2019-01-05 0.0 2.0
2019-01-06 6.0 2.0
2019-01-07 2.7 2.7
2019-01-08 3.0 3.0
2019-01-09 4.0 4.0
2019-01-10 2.1 2.1
2019-01-11 0.0 2.0
2019-01-12 0.0 2.0
2019-01-13 0.0 2.0
2019-01-14 0.0 2.0
2019-01-15 0.0 2.0
2019-01-16 12.0 2.0
2019-01-17 8.0 8.0
2019-01-18 7.0 7.0
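If the runs do not all average to the same value, a single mean is no longer correct. A minimal sketch of a per-run variant, assuming made-up data where the two runs differ (the series `s` and the name `run_id` are illustrative and not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first run (0, 0, 6) averages 2.0, the second (0, 6) averages 3.0.
s = pd.Series([3, 0, 0, 6, 1, 0, 6, 5], dtype=float)

mask = s.eq(0)                                  # rows that are zero
to_fill = mask | mask.shift(fill_value=False)   # zeros plus the row right after a zero

# A new run starts right after every nonzero value, so a running count of
# "previous row was nonzero" gives each run (zeros + following value) its own id.
run_id = s.ne(0).shift(fill_value=False).cumsum()

# One mean shared by all flagged rows (the approach above).
global_fill = np.where(to_fill, s[to_fill].mean(), s)
# One mean per run.
per_run_fill = np.where(to_fill, s.groupby(run_id).transform('mean'), s)

print(global_fill.tolist())
print(per_run_fill.tolist())
```

With this data the shared mean puts 2.4 into both runs, while the per-run version gives 2.0 for the first run and 3.0 for the second. A run of zeros at the very end of the column has no following value, so in this sketch it stays at 0.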
Upvotes: 1
Reputation: 323276
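Here you go. This groups each nonzero value with the zeros directly before it and gives every row its group's mean: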
df.groupby(df['how it is now'].ne(0).iloc[::-1].cumsum())['how it is now'].transform('mean')
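To see why the reversed cumulative sum works, here is a sketch that splits the one-liner into steps on the question's data (the intermediate names `nonzero`, `group_id`, and `filled` are illustrative):

```python
import pandas as pd

data = [3, 1, 3.3, 0, 0, 6, 2.7, 3, 4, 2.1, 0, 0, 0, 0, 0, 12, 8, 7]
dates = pd.date_range(start='2019-01-01', periods=len(data), freq='D')
df = pd.DataFrame({'how it is now': data}, index=dates)

col = df['how it is now']
nonzero = col.ne(0)  # True wherever the value is not zero

# Counting nonzero values from the bottom up gives each nonzero row a new id,
# and the zeros directly above it inherit that same id.
group_id = nonzero.iloc[::-1].cumsum()

# groupby aligns group_id on the index, so its reversed order does not matter.
filled = col.groupby(group_id).transform('mean')

print(pd.DataFrame({'value': col, 'group_id': group_id, 'filled': filled}))
```

Rows that are not next to a zero end up in a group of their own, so their mean is just the original value. Zeros at the very end of the column would form a group with no nonzero member and stay at 0.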
Upvotes: 1