Hello there

Reputation: 29

How to extract multiple float values from a string in Python

I need to extract numeric values from a string inside a pandas DataFrame.

Let's say the DataFrame cell is as follows (stored as a string):

[1.234,2.345]

I can get the first value with the following:

print(df['column_name'].str.extract(r'(\d+\.\d+)').astype('float'))

Output:

1.234

My idea for getting both values was the following:

print(df['column_name'].str.extract(r'(\d+\.\d+),(\d+\.\d+)').astype('float'))

but the output is then as follows:

NaN NaN

Expected output:

1.234 2.345

Upvotes: 1

Views: 912

Answers (3)

nomansland008

Reputation: 142

You can use Series.str.split. Strip the surrounding brackets first, otherwise the conversion to float fails on a piece such as "[1.234". To spread the parts across separate columns, set expand=True. The result might look like:

your_dataframe['your_column_name'].str.strip('[]').str.split(',', expand=True).astype(float)
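
As a minimal sketch of the split approach, assuming every cell holds exactly two comma-separated numbers wrapped in square brackets (the sample data and the column labels "first" and "second" are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data in the format described in the question.
df = pd.DataFrame({"column_name": ["[1.234,2.345]", "[3.5, 4.25]"]})

parts = (
    df["column_name"]
    .str.strip("[]")               # drop the leading "[" and trailing "]"
    .str.split(",", expand=True)   # one column per comma-separated piece
    .astype(float)                 # float() tolerates surrounding spaces
)
parts.columns = ["first", "second"]  # illustrative labels, not from the question
print(parts)
```

Stripping the brackets before splitting matters here, because a piece such as "[1.234" cannot be converted to float.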

Upvotes: 0

U13-Forward

Reputation: 71570

Why not just use pd.eval:

>>> df['Float'] = pd.eval(df['String'])
>>> df
           String           Float
0  [1.234, 2.345]  [1.234, 2.345]
1  [1.234, 2.345]  [1.234, 2.345]

Upvotes: 1

Corralien

Reputation: 120391

If you want to use a regex to extract floats, you can use str.findall:

>>> df['String'].str.findall(r'-?\d+(?:\.\d+)?').str.join(' ')
0    1.234 2.345
Name: String, dtype: object
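
If you would rather keep str.extract and get one float column per value, here is a rough sketch. It assumes the two-group pattern in the question returns NaN because the real cells have a space after the comma (the other answers show the cell as "[1.234, 2.345]"), so the pattern allows optional whitespace around the comma. The sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the first row has a space after the comma.
df = pd.DataFrame({"column_name": ["[1.234, 2.345]", "[-3.5,4.25]"]})

# One capture group per number; the inner (?:...) group is non-capturing,
# so str.extract still returns exactly two columns.
number = r"(-?\d+(?:\.\d+)?)"
pattern = number + r"\s*,\s*" + number  # optional whitespace around the comma

values = df["column_name"].str.extract(pattern).astype(float)
print(values)
```

Rows that do not match the pattern come back as NaN in both columns, which is what the question's output shows.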

Old answer: Use ast.literal_eval:

import ast
import pandas as pd

df = pd.DataFrame({'String': ['[1.234, 2.345]']})

df['Float'] = df['String'].apply(ast.literal_eval)

Output:

>>> df
           String           Float
0  [1.234, 2.345]  [1.234, 2.345]

>>> type(df.at[0, 'String'][0])
str

>>> type(df.at[0, 'Float'][0])
float
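
To get from the parsed lists to separate float columns, as in the expected output of the question, something like the following sketch could work. It assumes every list has exactly two elements. The second sample row and the labels "first" and "second" are made up for illustration.

```python
import ast

import pandas as pd

# Hypothetical sample data in the same format as above.
df = pd.DataFrame({"String": ["[1.234, 2.345]", "[3.5, 4.25]"]})

# Parse each cell into a real Python list of floats.
parsed = df["String"].apply(ast.literal_eval)

# Build one column per list element, keeping the original index.
floats = pd.DataFrame(parsed.tolist(), index=df.index, columns=["first", "second"])
print(floats)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for strings that come from outside your own code.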

Upvotes: 0
