Reputation: 2007
I have a pandas DataFrame with a column 'Country'; its head() is shown below:
0 tmp
1 Environmental Indicators: Energy
2 tmp
3 Energy Supply and Renewable Electricity Produc...
4 NaN
5 NaN
6 NaN
7 Choose a country from the following drop-down ...
8 NaN
9 Country
When I use this line:
energy['Country'] = energy['Country'].str.replace(r'[...]', 'a')
There is no change. But when I use this line instead:
energy['Country'] = energy['Country'].str.replace(r'[...]', np.nan)
All values are NaN.
Why does only the second line change the output? My goal is to change only the values that contain a triple dot.
Upvotes: 0
Views: 1605
Reputation: 914
Is this what you want when you say "I need change whole values, not just the triple dots"?
mask = df.Country.str.contains(r'\.\.\.', na=False)
df.loc[mask, 'Country'] = 'a'
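To make the idea above concrete, here is a small self-contained sketch. The sample values are made up for illustration (they only mimic the head() in the question), and it assumes the strings really do contain three literal dots rather than being truncated by the display:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, loosely modelled on the head() in the question.
energy = pd.DataFrame({
    "Country": [
        "tmp",
        "Energy Supply and Renewable Electricity Produc...",
        np.nan,
        "Country",
    ]
})

# True only where the string contains three literal dots; NaN rows become False.
mask = energy["Country"].str.contains(r"\.\.\.", na=False)

# Replace the whole value in the matching rows, leaving the others untouched.
energy.loc[mask, "Country"] = "a"

result = energy["Country"].tolist()
```

Using .loc with the mask writes into the original frame directly, which avoids the chained-assignment pitfalls of indexing the column first.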
Upvotes: 1
Reputation: 57033
.replace(r'[...]', 'a') treats the first parameter as a regular expression, in which [...] is a character class that matches any single dot. You want to match three literal dots, so you need .replace(r'\.\.\.', 'a').
As for your actual question, .str.replace requires a string as the second parameter. It attempts to convert np.nan to a string (which is not possible) and fails. For a reason unknown to me, instead of raising a TypeError it returns np.nan for each row.
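A quick sketch of the difference between the two patterns, on made-up sample strings. It passes the regex argument explicitly, since the default of .str.replace has differed between pandas versions. The last line is one possible way (an assumption about what the asker is after, not something stated in the question) to turn the matching rows into NaN without handing np.nan to .str.replace:

```python
import pandas as pd

# Hypothetical sample strings for illustration only.
s = pd.Series(["Produc...", "a.b", "tmp"])

# Character class: every single dot is replaced, one at a time.
char_class = s.str.replace(r"[...]", "a", regex=True).tolist()

# Escaped dots: only a run of three literal dots is replaced.
escaped = s.str.replace(r"\.\.\.", "a", regex=True).tolist()

# Same effect with a plain (non-regex) pattern.
literal = s.str.replace("...", "a", regex=False).tolist()

# Blank out whole values containing three dots: mask() puts NaN where the condition is True.
blanked = s.mask(s.str.contains(r"\.\.\.", na=False))
```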
Upvotes: 0