Reputation: 7227
The goal is to maintain the relationship between two columns: wherever one column contains NaN, the corresponding value in the other column should be set to NaN as well.
Given the following data frame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})
     a   b
0  NaN  11
1  2.0  12
2  NaN  13
3  4.0  14
Maintaining the relationship from column a to column b, where all NaN values are propagated, results in:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
One way to achieve the desired behaviour is:
df.b.where(~df.a.isnull(), np.nan)
Is there any other way to maintain such a relationship?
Upvotes: 10
Views: 2666
Reputation: 2167
Using np.where():
df['b'] = np.where(df.a.isnull(), df.a, df.b)
How it works: np.where(condition, a, b) returns elements chosen from a where condition is True and from b where it is False.
Output:
>>> df
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
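As a minimal, self-contained sketch of the np.where approach above (same column names and data as the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})

# Where a is NaN, take the (NaN) value from a; otherwise keep b.
# np.where returns a plain ndarray, so b is upcast to float64.
df['b'] = np.where(df.a.isnull(), df.a, df.b)
print(df)
```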
Upvotes: 1
Reputation: 164643
Use pd.Series.notnull to avoid having to negate your Boolean series:
df.b.where(df.a.notnull(), np.nan)
But, really, there's nothing wrong with your existing solution.
Upvotes: 3
Reputation: 27869
Another one would be:
df.loc[df.a.isnull(), 'b'] = df.a
It isn't shorter, but it does the job.
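A runnable sketch of this .loc assignment; unlike the where-based answers, it mutates df in place. Here b is built as float (an assumption, to avoid the int-to-float upcast warning newer pandas versions emit when assigning NaN into an integer column):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4],
                   'b': [11.0, 12.0, 13.0, 14.0]})

# Boolean .loc indexing assigns in place: rows where a is NaN
# get b overwritten with a's values (which are NaN there).
df.loc[df.a.isnull(), 'b'] = df.a
print(df)
```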
Upvotes: 2
Reputation: 323226
Using dropna with reindex: dropping the rows that contain NaN and then reindexing back to the original index reintroduces those rows as all-NaN.
df.dropna().reindex(df.index)
Out[151]:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
Upvotes: 1
Reputation: 76917
You could use mask on the NaN rows.
In [366]: df.mask(df.a.isnull())
Out[366]:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
To mask rows where any column contains NaN, use df.mask(df.isnull().any(axis=1)).
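A minimal runnable sketch of masking every row that contains a NaN in any column (using the keyword form axis=1, which current pandas requires):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})

# The per-row Boolean Series broadcasts across columns, so every
# value in a row containing a NaN is replaced with NaN.
masked = df.mask(df.isnull().any(axis=1))
print(masked)
```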
Upvotes: 9