Alissa

Reputation: 99

pd.to_numeric converts entire series to NaN

I'm trying to convert a column using pd.to_numeric, but for some reason it turns all values (except one) into NaN:

In[]: pd.to_numeric(portfolio["Principal Remaining"],errors="coerce")
Out[]: 
1           NaN
2           NaN
3           NaN
4           NaN
5           NaN
6           NaN
7           NaN
8           NaN
9           NaN
10          NaN
11          NaN
12          NaN
13          NaN
14          NaN
15          NaN
16          NaN
17          NaN
18       836.61
19          NaN
20          NaN
      ...  
Name: Principal Remaining, Length: 32314, dtype: float64

Thoughts on why this is happening? The original data looks like this:

1         18,052.02
2         27,759.85
3         54,061.75
4         89,363.61
5         46,954.46
6         64,295.64
7        100,000.00
8         27,905.98
9         13,821.48
10        16,937.89
        ...    
Name: Principal Remaining, Length: 32314, dtype: object

Upvotes: 2

Views: 3266

Answers (1)

cs95

Reputation: 402323

read_csv with thousands=','

df = pd.read_csv('file.csv', thousands=',')

This tells pandas to treat the comma as a thousands separator while reading the data, so the column is parsed as numeric from the start.
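
A minimal sketch of this option, assuming the data comes from a CSV file. The in-memory CSV text below is made up to mirror the values in the question (it is not the asker's actual file), and it assumes that values containing commas are quoted in the file.

```python
import io
import pandas as pd

# Hypothetical CSV contents mirroring the question.
# Values that contain a comma are quoted so they stay in one field.
csv_text = (
    'Principal Remaining\n'
    '"18,052.02"\n'
    '"27,759.85"\n'
    '836.61\n'
)

# thousands=',' makes the parser drop the separators before converting.
df = pd.read_csv(io.StringIO(csv_text), thousands=',')

print(df['Principal Remaining'].dtype)
print(df['Principal Remaining'])
```

With this approach no separate conversion step should be needed after loading.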


replace and to_numeric

df['Principal Remaining'] = pd.to_numeric(
    df['Principal Remaining'].str.replace(',', ''), errors='coerce')

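If the first option isn't available (for example, the data is already loaded), remove the commas with str.replace first, then call pd.to_numeric as shown above. pd.to_numeric cannot parse strings that contain thousands separators, so with errors='coerce' they become NaN. The value 836.61 survived only because it has no comma.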
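
To see the difference, here is a small sketch on a made-up Series that mimics the column from the question (the sample values are illustrative only). It passes regex=False so the comma is treated as a literal string, which is an addition to the snippet above rather than part of it.

```python
import pandas as pd

# Illustrative sample: two values with thousands separators, one without.
s = pd.Series(['18,052.02', '27,759.85', '836.61'],
              name='Principal Remaining')

# Without cleaning, strings containing commas cannot be parsed
# and are coerced to NaN.
raw = pd.to_numeric(s, errors='coerce')

# Strip the commas first, then convert.
cleaned = pd.to_numeric(s.str.replace(',', '', regex=False),
                        errors='coerce')

print(raw)
print(cleaned)
```

Keeping errors='coerce' still turns any genuinely non-numeric entries into NaN, so it can be worth checking cleaned.isna().sum() afterwards.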

Upvotes: 11
