Reputation: 2085
I need to replace NaN values only in specific rows of my data frame. My original looks like this:
import pandas as pd
import numpy as np
pd.DataFrame.from_dict({'col1': 'A B A D E B C D E A'.split(),
'col2': [np.nan, 0, 1, np.nan, np.nan, np.nan, 1, np.nan, np.nan, 0]})
If col1
is equal to A
or B
and col2
is NaN, then I want to replace it with 0. For other values of col1
, the NaNs should remain unchanged.
My first idea was to use pd.DataFrame.replace
and write something like this:
data.loc[data['col1'].isin(['A', 'B']), 'col2'].replace({np.nan: 0})
but it doesn't seem to work as intended and fills all NaNs with 0
.
Upvotes: 2
Views: 61
Reputation: 109546
I believe you intended to use a mask.
mask = data['col1'].isin({'A', 'B'})
data.loc[mask, 'col2'] = data.loc[mask, 'col2'].replace({np.nan, 0})
>>> data
col1 col2
0 A NaN
1 B 0.0
2 A 1.0
3 D NaN
4 E NaN
5 B NaN
6 C 1.0
7 D NaN
8 E NaN
9 A 0.0
Upvotes: 2
Reputation: 392
data:
col1 col2
0 A NaN
1 B 0.0
2 A 1.0
3 D NaN
4 E NaN
5 B NaN
6 C 1.0
7 D NaN
8 E NaN
9 A 0.0
script:
mask1 = data['col1'].isin(['A', 'B'])
mask2 = data['col2'].isna()
mask = mask1 & mask2
data.loc[mask, 'col2'] = 0
output:
col1 col2
0 A 0.0
1 B 0.0
2 A 1.0
3 D NaN
4 E NaN
5 B 0.0
6 C 1.0
7 D NaN
8 E NaN
9 A 0.0
Upvotes: 0
Reputation: 153460
You can use fillna
instead of replace
:
df['col2'] = df.loc[df['col1'].isin(['A','B']), 'col2'].fillna(0)
df
Output:
| | col1 | col2 |
|---:|:-------|-------:|
| 0 | A | 0 |
| 1 | B | 0 |
| 2 | A | 1 |
| 3 | D | nan |
| 4 | E | nan |
| 5 | B | 0 |
| 6 | C | nan |
| 7 | D | nan |
| 8 | E | nan |
| 9 | A | 0 |
Upvotes: 1
Reputation: 150735
You can just do a loc
:
df.loc[df.col1.isin(['A','B']) & df.col2.isna(), 'col2'] = 0
Upvotes: 2