Reputation:
I am new to Python and working on string manipulation.
I have a dataframe:
df['Installs']
Out[22]:
0 10,000+
1 500,000+
2 5,000,000+
3 50,000,000+
4 100,000+
5 50,000+
How do I remove the "+" and convert the string in the df to float?
My input:
df['Installs'] = df['Installs'].str.replace('+','',regex=True).astype(float)
However I get an error:
ValueError: could not convert string to float: '10,000'
How can I edit my code so that I get 10,000.0 as my output, and likewise for the other values, instead of 10,000+?
Upvotes: 2
Views: 439
Reputation: 863291
Use Series.str.replace to replace both "," and "+" with an empty string:
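# regex=True is needed so '[,+]' is read as a character class (str.replace defaults to regex=False since pandas 2.0)
df['Installs'] = df['Installs'].str.replace('[,+]', '', regex=True).astype(float)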
#alternative
#df['Installs'] = df['Installs'].replace('[,+]','', regex=True).astype(float)
print (df)
Installs
0 10000.0
1 500000.0
2 5000000.0
3 50000000.0
4 100000.0
5 50000.0
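For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the column from the values shown in the question; the names `cleaned` and `formatted` are just illustrative, and the `pd.to_numeric(..., errors="coerce")` step is an optional extra on my part (not needed for this data) that turns any leftover non-numeric string into NaN instead of raising.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"Installs": ["10,000+", "500,000+", "5,000,000+",
                                "50,000,000+", "100,000+", "50,000+"]})

# regex=True makes '[,+]' a character class matching either ',' or '+'
cleaned = df["Installs"].str.replace("[,+]", "", regex=True)

# Optional safety net: values that are still not numeric become NaN
df["Installs"] = pd.to_numeric(cleaned, errors="coerce").astype(float)

# A float cannot store thousands separators; if you want them for display,
# format the numbers back into strings in a separate Series
formatted = df["Installs"].map("{:,.1f}".format)

print(df)
print(formatted)
```

Note that the float column itself holds 10000.0, not 10,000.0. The comma only exists in the formatted strings, so keep the numeric column for calculations and use the formatted one purely for presentation.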
Upvotes: 1