Removing some characters in a DataFrame

Question

I am having trouble replacing the value "..." with NaN. Here is my code:

    import pandas as pd
    import numpy as np
    energy = pd.read_excel('Energy Indicators.xls')
    del energy['Unnamed: 0']
    del energy['Unnamed: 1']
    energy.columns = ['Country', 'Energy Supply', 'Energy Supply per Capita', '% Renewable']
    energy.replace("...", np.NaN)  
    energy['Energy Supply']

The "..." values still appear in the output:

    Out[46]:
    17      321
    18      102
    19     1959
    20      ...
    21        9
           ... 
    241     344
    242     400
    243     480
    244     NaN
    245     NaN
    Name: Energy Supply, Length: 229, dtype: object

Can anyone help me solve this?

Thank you

Big Bro · Accepted Answer

Your problem is that replace returns a new DataFrame by default rather than modifying energy in place (see doc), so the result of your call is discarded. To solve this you can either

    energy = energy.replace("...", np.NaN)

or

    energy.replace("...", np.NaN, inplace=True)
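To see the difference on a small scale, here is a minimal sketch. The four-row DataFrame is a made-up stand-in for the real spreadsheet, not data from the question, and it uses np.nan because the np.NaN alias was removed in NumPy 2.0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 'Energy Supply' column from the question.
energy = pd.DataFrame({"Energy Supply": [321, 102, "...", 9]})

# replace() returns a new DataFrame; without an assignment the result is lost.
energy.replace("...", np.nan)
missing_before = int(energy["Energy Supply"].isna().sum())

# Assigning the result back keeps the replacement.
energy = energy.replace("...", np.nan)
missing_after = int(energy["Energy Supply"].isna().sum())

print(missing_before, missing_after)
```

The first count shows that the original frame was left untouched by the bare call; the second shows the replacement once the result is assigned back.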

You can even avoid this problem altogether by specifying that "..." should be interpreted as NaN at read time:

    energy = pd.read_excel("Energy Indicators.xls", na_values="...")

See the read_excel doc for the na_values parameter.
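The effect of na_values can be tried without the Excel file. The sketch below swaps in pd.read_csv on an in-memory string, which is an assumption made purely so the example is self-contained; read_csv accepts the same na_values parameter. The three rows are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data; the real file in the question is an .xls workbook.
csv_text = "Country,Energy Supply\nA,321\nB,...\nC,9\n"

# Treat "..." as missing while parsing, so no replace() call is needed later.
energy = pd.read_csv(io.StringIO(csv_text), na_values="...")

missing = int(energy["Energy Supply"].isna().sum())
is_numeric = pd.api.types.is_numeric_dtype(energy["Energy Supply"])

print(energy)
print(missing, is_numeric)
```

Because the placeholder is converted while parsing, the column here comes out numeric instead of object, which a later replace does not give you automatically.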
