Reputation: 368
I have a column in a df that contains numerous strings. I want to replace some of the strings with NaN, but there are too many to list individually. I have a separate column that does contain NaN values, which may be used to achieve this.
I want to replace specific strings in Value. Imagine this column contains 1000 different strings and I want to replace 500 of them with NaN. It would be inefficient to build a list of these unwanted strings and use it for the replacement.
There is a separate column (X) that contains NaN values, and these can be used to decide which rows in Value to replace. So where X is NaN, replace the row in Value with NaN.
Is there an easier way to do this?
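import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Value': ['B', 'A', 'X', 'Y', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'X': ['A', 'A', 'A', 'A', np.nan, 'A', 'A', 'A', np.nan, 'A', 'A'],
})

# My attempt, which does not give the intended output:
df = df.loc[df['X'].eq(np.nan), df['Value']] = np.nan
print(df)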
Intended Output:
   Value    X
0      B    A
1      A    A
2      X    A
3      Y    A
4    NaN  NaN
5      D    A
6      E    A
7      F    A
8    NaN  NaN
9      H    A
10     I    A
Upvotes: 1
Views: 294
Reputation: 30920
You want DataFrame.mask with Series.isna:
df = df.mask(df['X'].isna())
print(df)
   Value    X
0      B    A
1      A    A
2      X    A
3      Y    A
4    NaN  NaN
5      D    A
6      E    A
7      F    A
8    NaN  NaN
9      H    A
10     I    A
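As a rough illustration of why Series.isna is the right test here, the sketch below compares it with an equality check against np.nan, then masks only the Value column. The names eq_hits, na_rows and out are made up for this example, and masking a single column with Series.mask is a variation on the call above, not the exact call from this answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Value': ['B', 'A', 'X', 'Y', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'X': ['A', 'A', 'A', 'A', np.nan, 'A', 'A', 'A', np.nan, 'A', 'A'],
})

# NaN never compares equal to anything, including itself,
# so an equality test against np.nan matches no rows.
eq_hits = int(df['X'].eq(np.nan).sum())

# isna() is the dedicated missing-value test and finds the two rows.
na_rows = df.index[df['X'].isna()].tolist()

# Variation (assumption, not from the answer): mask the 'Value' column only.
out = df.copy()
out['Value'] = out['Value'].mask(out['X'].isna())

print(eq_hits, na_rows)
print(out)
```

With the sample data the equality test should find nothing, while isna() picks out index 4 and 8.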
Also you can use DataFrame.where with Series.notna:
df = df.where(df['X'].notna())
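One thing worth knowing is that both calls blank the whole row, not just Value. That makes no difference for the two-column frame in the question, but it would with more columns. The sketch below assumes a hypothetical third column named Other (not in the question) to show the effect, and uses a .loc assignment as one possible way to touch Value only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Value': ['B', 'A', 'X', 'Y', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'X': ['A', 'A', 'A', 'A', np.nan, 'A', 'A', 'A', np.nan, 'A', 'A'],
    'Other': list(range(11)),  # hypothetical extra column, not in the question
})

# where() keeps entries where the condition is True, mask() replaces them,
# so complementary conditions should give the same frame.
via_where = df.where(df['X'].notna())
via_mask = df.mask(df['X'].isna())
same = via_where.equals(via_mask)

# Both blank every column in the affected rows, including 'Other'.
lost_other = int(via_where['Other'].isna().sum())

# Label-based assignment changes 'Value' only and leaves 'Other' intact.
only_value = df.copy()
only_value.loc[only_value['X'].isna(), 'Value'] = np.nan

print(same, lost_other)
print(only_value)
```

If other columns need to survive, the .loc form (or Series.mask on the single column) is probably the safer choice.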
Upvotes: 2
Reputation: 323226
We can do dropna + reindex:
df = df.dropna().reindex(df.index)
   Value    X
0      B    A
1      A    A
2      X    A
3      Y    A
4    NaN  NaN
5      D    A
6      E    A
7      F    A
8    NaN  NaN
9      H    A
10     I    A
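A caveat with this approach: dropna() with no arguments drops a row when any column is missing, so a NaN in an unrelated column would also wipe that row. The sketch below assumes a hypothetical Other column with one missing value (not in the question) and passes subset=['X'] to limit the check to X. It also relies on the index being unique, since reindex raises on duplicate labels.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Value': ['B', 'A', 'X', 'Y', 'C', 'D', 'E', 'F', 'G', 'H', 'I'],
    'X': ['A', 'A', 'A', 'A', np.nan, 'A', 'A', 'A', np.nan, 'A', 'A'],
    # hypothetical extra column with a NaN in row 1, not in the question
    'Other': [0, np.nan, 2, 3, 4, 5, 6, 7, 8, 9, 10],
})

# Plain dropna() also removes row 1 because 'Other' is missing there,
# and reindex() brings it back as an all-NaN row.
plain = df.dropna().reindex(df.index)
plain_blank = plain.index[plain['Value'].isna()].tolist()

# Restricting the check to 'X' only drops the rows where X is NaN.
scoped = df.dropna(subset=['X']).reindex(df.index)
scoped_blank = scoped.index[scoped['Value'].isna()].tolist()

print(plain_blank, scoped_blank)
```

Even with subset, the restored rows are blank in every column, so this fits best when the whole row should become NaN.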
Upvotes: 1