iris

Reputation: 381

Pandas DataFrame replace negative values with latest preceding positive value

Consider a DataFrame such as

import pandas as pd

df = pd.DataFrame({'a': [1, -2, 0, 3, -1, 2],
                   'b': [-1, -2, -5, -7, -1, -1],
                   'c': [-1, -2, -5, 4, 5, 3]})

For each column, how can I replace every negative value with the last positive value, or with zero? "Last" here means the nearest preceding value, reading each column from top to bottom. The closest solution I have found so far is df[df < 0] = 0.

The expected result would be a DataFrame such as

df_res = pd.DataFrame({'a': [1,1,0,3,3,2], 
                       'b': [0,0,0,0,0,0], 
                       'c': [0,0,0,4,5,3]})

Upvotes: 5

Views: 2090

Answers (3)

Alexander S

Reputation: 71

The expected result can be obtained with the following steps:

import numpy as np

mask = df >= 0  # boolean mask of the non-negative values
df_res = (df.where(mask, np.nan)  # replace negative values with NaN
          .ffill()                # forward-fill the NaN values down each column
          .fillna(0))             # fill any remaining leading NaNs with zeros
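
For reference, here is a self-contained sketch of the steps above, run against the sample frame from the question. The helper name fill_negatives_forward is hypothetical and only used for illustration, and the final cast assumes the input columns hold integers.

```python
import pandas as pd


def fill_negatives_forward(df: pd.DataFrame) -> pd.DataFrame:
    """Replace each negative value with the nearest preceding non-negative
    value in the same column, or 0 if there is none (hypothetical helper)."""
    masked = df.where(df >= 0)  # negatives become NaN
    filled = masked.ffill()     # carry the last kept value down each column
    # Leading NaNs (no earlier non-negative value) become 0.
    # The cast back to int assumes integer input columns.
    return filled.fillna(0).astype(int)


df = pd.DataFrame({'a': [1, -2, 0, 3, -1, 2],
                   'b': [-1, -2, -5, -7, -1, -1],
                   'c': [-1, -2, -5, 4, 5, 3]})
df_res = fill_negatives_forward(df)
print(df_res)
```

On the sample data this should reproduce the df_res frame given in the question.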

Upvotes: 3

wwnde

Reputation: 26676

Use DataFrame.where:

df.where(df.ge(0)).ffill().fillna(0).astype(int)

   a  b  c
0  1  0  0
1  1  0  0
2  0  0  0
3  3  0  4
4  3  0  5
5  2  0  3

Upvotes: 3

Erfan

Reputation: 42906

You can use DataFrame.mask to convert all values < 0 to NaN, then use ffill and fillna:

df = df.mask(df.lt(0)).ffill().fillna(0).convert_dtypes()

   a  b  c
0  1  0  0
1  1  0  0
2  0  0  0
3  3  0  4
4  3  0  5
5  2  0  3
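
Whether an existing 0 survives depends on the comparison used for the mask. As a small sketch on column a from the question (the variable names keep_zero and drop_zero are only illustrative): masking values < 0 keeps the zero, while keeping only values > 0 treats the zero as missing and overwrites it with the value above.

```python
import pandas as pd

s = pd.Series([1, -2, 0, 3, -1, 2])  # column 'a' from the question

# Mask only the negatives: the 0 at index 2 is kept.
keep_zero = s.mask(s.lt(0)).ffill().fillna(0).astype(int)

# Keep only strictly positive values: the 0 is replaced by the 1 above it.
drop_zero = s.where(s.gt(0)).ffill().fillna(0).astype(int)

print(keep_zero.tolist())  # [1, 1, 0, 3, 3, 2]
print(drop_zero.tolist())  # [1, 1, 1, 3, 3, 2]
```

Only the first variant matches column a of the expected df_res in the question.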

Upvotes: 6
