J Doe
J Doe

Reputation: 79

In pandas replace consecutive 0s with NaN

I want to clean some data by replacing only CONSECUTIVE 0s in a data frame

Given:

import pandas as pd
import numpy as np
d = [[1,np.NaN,3,4],[2,0,0,np.NaN],[3,np.NaN,0,0],[4,np.NaN,0,0]]
df = pd.DataFrame(d, columns=['a', 'b', 'c', 'd'])
df
   a    b  c    d
0  1  NaN  3  4.0
1  2  0.0  0  NaN
2  3  NaN  0  0.0
3  4  NaN  0  0.0

The desired result should be:

   a    b  c    d
0  1  NaN  3   4.0
1  2  0.0  NaN NaN
2  3  NaN  NaN NaN
3  4  NaN  NaN NaN

where column c & d are affected but column b is NOT affected as it only has 1 zero (and not consecutive 0s).

I have experimented with this answer: Replacing more than n consecutive values in Pandas DataFrame column

which is along the right lines but the solution keeps the first 0 in a given column which is not desired in my case.

Upvotes: 3

Views: 256

Answers (1)

BENY
BENY

Reputation: 323356

Let us do shift with mask

df=df.mask((df.shift().eq(df)|df.eq(df.shift(-1)))&(df==0))
Out[469]: 
   a    b    c    d
0  1  NaN  3.0  4.0
1  2  0.0  NaN  NaN
2  3  NaN  NaN  NaN
3  4  NaN  NaN  NaN

Upvotes: 2

Related Questions