Reputation: 7227
The goal is to maintain the relationship between two columns: wherever one column contains NaN, the corresponding value in the other column should be set to NaN as well.
Given the following data frame:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})
     a   b
0  NaN  11
1  2.0  12
2  NaN  13
3  4.0  14
Maintaining the relationship from column a to column b, where all NaN values are propagated, results in:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
One way to achieve the desired behaviour is:
df.b.where(~df.a.isnull(), np.nan)
Is there any other way to maintain such a relationship?
Upvotes: 10
Views: 2666
Reputation: 2167
Using np.where():
df['b'] = np.where(df.a.isnull(), df.a, df.b)
How it works: np.where(condition, a, b) returns elements chosen from a where condition is True and from b where it is False.
Output:
>>> df
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
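As a minimal, self-contained sketch of the np.where approach above (same column names and data as the question):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})

# Where a is NaN, take the (NaN) value from a; otherwise keep b.
# np.where returns a plain ndarray, so b is upcast to float64.
df['b'] = np.where(df.a.isnull(), df.a, df.b)
print(df)
```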
Upvotes: 1
Reputation: 164643
Use pd.Series.notnull to avoid having to negate your Boolean series:
df.b.where(df.a.notnull(), np.nan)
But, really, there's nothing wrong with your existing solution.
Upvotes: 3
Reputation: 27869
Another one would be:
df.loc[df.a.isnull(), 'b'] = df.a
It isn't shorter, but it does the job.
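A runnable sketch of this .loc assignment; unlike the where-based answers, it mutates df in place. Here b is built as float (an assumption, to avoid the int-to-float upcast warning newer pandas versions emit when assigning NaN into an integer column):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4],
                   'b': [11.0, 12.0, 13.0, 14.0]})

# Boolean .loc indexing assigns in place: rows where a is NaN
# get b overwritten with a's values (which are NaN there).
df.loc[df.a.isnull(), 'b'] = df.a
print(df)
```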
Upvotes: 2
Reputation: 323226
Using dropna with reindex: dropping the rows that contain NaN and then reindexing back to the original index reintroduces those rows as all-NaN.
df.dropna().reindex(df.index)
Out[151]:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
Upvotes: 1
Reputation: 76917
You could use mask on the NaN rows.
In [366]: df.mask(df.a.isnull())
Out[366]:
     a     b
0  NaN   NaN
1  2.0  12.0
2  NaN   NaN
3  4.0  14.0
To mask rows where any column contains NaN, use df.mask(df.isnull().any(axis=1)).
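A minimal runnable sketch of masking every row that contains a NaN in any column (using the keyword form axis=1, which current pandas requires):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})

# The per-row Boolean Series broadcasts across columns, so every
# value in a row containing a NaN is replaced with NaN.
masked = df.mask(df.isnull().any(axis=1))
print(masked)
```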
Upvotes: 9