Reputation: 31
I have a problem on replacing this value "..." by NaN. Here is my code
import pandas as pd
import numpy as np
energy = pd.read_excel('Energy Indicators.xls')
del energy['Unnamed: 0']
del energy['Unnamed: 1']
energy.columns = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']
energy.replace("...", np.NaN)
energy['Energy Supply']
They always appear :
Out[46]:
17 321
18 102
19 1959
20 ...
21 9
...
241 344
242 400
243 480
244 NaN
245 NaN
Name: Energy Supply, Length: 229, dtype: objec
Anyone can help me solve this ?
Thank you
Upvotes: 0
Views: 44
Reputation: 944
Your problem is that replace returns a DataFrame by defaulft (see doc). To solve this you can either
energy = energy.replace("...", np.NaN)
or
energy.replace("...", np.NaN, inplace=True)
You can even avoid this problem altogether by specifying that "..." should be interpreted as NaN at read time:
energy = pd.read_excel("Energy Indicators.xls", na_values="...")
See doc.
Upvotes: 1
Reputation: 150735
That's because the ...
goes with leading/trailing spaces. And also, you should check if they are one three-dot character or ...
. If your supply is numerical as shown, you can do:
energy['Energy Supply'] = pd.to_numeric(energy['Energy Supply'], errors='coerce')
Upvotes: 0