Reputation: 6564
I have a column of strings, each containing letters and a number. The number sometimes has a decimal point and sometimes does not. I want to convert the number to a float. Example dataframe:
df = pd.DataFrame({'colA': ['q7.8', 'g5.3', '4.5r', 'john7']})
The updated column should contain: 7.8, 5.3, 4.5, 7.0.
There are no systematic rules for the number of letters or their position.
Thanks
Upvotes: 1
Views: 269
Reputation: 35626
Assuming only one number is in each cell, we can use str.extract and then astype to convert to float:
df['colA'] = df['colA'].str.extract(r'(\d+(?:\.\d+)?)').astype('float')
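As a rough, self-contained check of the approach above, the sketch below runs the same pattern on the question's sample values. The extra row 'abc' is a hypothetical addition (not in the question) to show that rows without any digits come back as NaN, and expand=False is used here only so that the result is a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Sample values from the question, plus one hypothetical row ('abc') with no number
s = pd.Series(['q7.8', 'g5.3', '4.5r', 'john7', 'abc'])

# Single capturing group: digits, optionally followed by a decimal part.
# expand=False returns a Series instead of a one-column DataFrame.
numbers = s.str.extract(r'(\d+(?:\.\d+)?)', expand=False).astype('float')

print(numbers.tolist())
```

Rows with no match stay as NaN after the conversion, so they can be dropped or filled afterwards if needed.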
Many other regexes are available at "How to extract a floating number from a string" if additional considerations such as exponentiation or positive/negative signs are needed, for example:
df['colA'] = df['colA'].str.extract(
r'([-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?)'
).astype('float')
*Note: with this approach, extract needs exactly one capturing group.
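To illustrate the point about capturing groups, here is a small sketch using hypothetical inputs (they are not from the question). With one capturing group, extract returns a single column that can be converted directly; with two groups it returns two columns, which could not be assigned back to a single column as-is. Note also that the signed pattern treats a hyphen directly before the digits as a minus sign, which may or may not be what the data means.

```python
import pandas as pd

# Hypothetical inputs: a signed value with an exponent, a leading-dot decimal, a plain integer
s = pd.Series(['x-1.5e3y', 'a.5b', 'john7'])

# One capturing group wrapping the whole number; inner groups are non-capturing (?:...)
pattern = r'([-+]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?)'
one_group = s.str.extract(pattern)           # DataFrame with a single column named 0
values = one_group[0].astype('float').tolist()

# Two capturing groups produce two columns (integer part and decimal part)
two_groups = s.str.extract(r'(\d+)(\.\d+)?')

print(one_group.shape, two_groups.shape)
print(values)
```

Keeping every inner group non-capturing is what lets the longer pattern still produce exactly one column.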
df:
colA
0 7.8
1 5.3
2 4.5
3 7.0
Upvotes: 3