Minsky

Reputation: 513

Trouble converting string to float in python

I am fairly new to Python so forgive me this simple question. I'm trying to convert string to float. Here is a sample of the data:

0     10.65%
1      7.90%

When I try:

 df['int_rate'] = df['int_rate'].astype('float')

I get:

ValueError: could not convert string to float: '13.75%'

When I try:

df['int_rate'] = df['int_rate'].replace("%","", inplace=True) 

And check my data, I get:

0     None
1     None

Any ideas what I'm doing wrong? Many thanks!

Upvotes: 8

Views: 14097

Answers (3)

Guillaume

Reputation: 6029

As you guessed, ValueError: could not convert string to float: '13.75%' indicates that the % character blocks the conversion.

Now when you try to remove it:

df['int_rate'] = df['int_rate'].replace("%","", inplace=True) 

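You set inplace=True in your replacement, which, as the name suggests, changes the data in place, so the replace() call returns None. You then store that None in df['int_rate'] and end up with a column containing only None values. Either assign the result without inplace=True, or call replace() in place without assigning. In both cases regex=True is needed, because without it replace() only matches whole cell values and leaves '10.65%' unchanged. You should either do:

df['int_rate'] = df['int_rate'].replace("%","", regex=True)

or

df['int_rate'].replace("%","", regex=True, inplace=True)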
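To make the point concrete, here is a minimal sketch using the two sample values from the question. The variable name result is only for illustration, and the None return value is the behaviour described above for the pandas version in the question; it may differ in other versions.

```python
import pandas as pd

df = pd.DataFrame({"int_rate": ["10.65%", "7.90%"]})

# With inplace=True the call modifies the Series it is called on;
# its return value is what ended up stored in the column in the question.
# A copy is used here so that df itself is left untouched.
result = df["int_rate"].copy().replace("%", "", regex=True, inplace=True)
print(result)

# Assigning the returned Series (no inplace) keeps the data.
# regex=True makes "%" match as a substring rather than a whole cell value.
df["int_rate"] = df["int_rate"].replace("%", "", regex=True).astype(float)
print(df["int_rate"].tolist())
```

The assignment form is probably the safer habit, since it does not depend on whether an in-place change to a selected column propagates back to df.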

Upvotes: 5

jezrael

Reputation: 863541

You can use Series.replace with the parameter regex=True to replace substrings:

df = pd.DataFrame({'int_rate':['10.65%','7.90%']})
df['int_rate'] = df['int_rate'].replace("%","", regex=True).astype(float)
print (df)
   int_rate
0     10.65
1      7.90

Or Series.str.replace:

df['int_rate'] = df['int_rate'].str.replace("%","").astype(float)
print (df)
   int_rate
0     10.65
1      7.90

Or Series.str.rstrip:

df['int_rate'] = df['int_rate'].str.rstrip("%").astype(float)
print (df)
   int_rate
0     10.65
1      7.90

See the difference without regex=True (only whole-value matches are replaced):

df = pd.DataFrame({'int_rate':['10.65%','7.90%', '%']})

df['int_rate_subs'] = df['int_rate'].replace("%","", regex=True)
df['int_rate_val'] = df['int_rate'].replace("%","")
print (df)
  int_rate int_rate_subs int_rate_val
0   10.65%         10.65       10.65%
1    7.90%          7.90        7.90%
2        %                           

Upvotes: 6

Tyler

Reputation: 129

Since each value is a string, you could drop its last character and convert the result to float using

df['int_rate'].str[:-1].astype(float)

This reads each string from the first character up to, but not including, the last one, giving 10.65 instead of 10.65%.
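A minimal sketch of the slicing idea, assuming every value ends in exactly one % character. Note that the slice has to be applied to each string (via the .str accessor or a plain loop), not to the column itself, where [:-1] would drop the last row instead. The names sliced and plain are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"int_rate": ["10.65%", "7.90%"]})

# .str[:-1] slices each string, removing its final character,
# and astype(float) then converts the remaining text.
sliced = df["int_rate"].str[:-1].astype(float)

# The same idea in plain Python, one string at a time.
plain = [float(s[:-1]) for s in df["int_rate"]]

print(sliced.tolist())
print(plain)
```

Slicing removes the last character whatever it is, so a value without a trailing % would silently lose a digit; str.rstrip("%") avoids that.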

Upvotes: 2
