The Dan

Reputation: 1680

Pandas Extract Number with decimals from String

I am trying to extract all numbers, including decimals, dots and commas, from a string column using pandas.

This is my DataFrame:

       rate_number    
0      92 rate
0      33 rate
0      9.25 rate
0    (4,396 total
0    (2,620 total

I tried using df['rate_number'].str.extract(r'(\d+)', expand=False), but the results were not correct: the pattern stops at the first non-digit character, so 9.25 becomes 9 and 4,396 becomes 4.

The result I need is the following:

    rate_number    
0      92 
0      33 
0      9.25 
0    4,396 
0    2,620 

Upvotes: 0

Views: 1297

Answers (3)

eduardofc

Reputation: 51

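There is a small error with the asterisk's position in the pattern ([0-9][,.]*[0-9]*): it allows only a single digit before the separator. Putting the quantifier on the leading digits, and making the separator optional so that plain integers such as 92 still match, fixes this:

df['rate_number_2'] = df['rate_number'].str.extract(r'([0-9]+[,.]?[0-9]*)')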
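
A quick way to compare quantifier placements is to run each candidate pattern over the question's sample values. This is a minimal sketch: the extra value "120.5 rate" and the dictionary keys are made up for illustration, and the third pattern is one possible variant rather than a definitive fix.

```python
import pandas as pd

# Sample values from the question, plus one hypothetical multi-digit decimal.
s = pd.Series([
    "92 rate", "33 rate", "9.25 rate",
    "(4,396 total", "(2,620 total", "120.5 rate",
])

# Hypothetical labels mapped to the patterns being compared.
patterns = {
    "one_digit_first": r"([0-9][,.]*[0-9]*)",
    "separator_required": r"([0-9]*[,.][0-9]*)",
    "digits_then_optional_separator": r"([0-9]+[,.]?[0-9]*)",
}

# One column per pattern; rows without a match become NaN.
result = pd.DataFrame(
    {name: s.str.extract(pattern, expand=False) for name, pattern in patterns.items()}
)
print(result)
```

In this sketch the first pattern truncates "120.5" to "120", the second returns NaN for values with no dot or comma (92 and 33), and the third keeps all six numbers intact.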

Upvotes: 1

mLstudent33

Reputation: 1175

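Dan's comment above is easy to miss, but it worked for me:

new_df_arr = []
for df in df_arr:  # df_arr: list of DataFrames to clean
    df_copy = df.astype(str)  # astype returns a new DataFrame, so the original is untouched
    # Skip the first column and keep only the leading number in every other column.
    # Note: this pattern stops at a comma, so '4,396' yields '4'.
    for col in df_copy.columns[1:]:
        df_copy[col] = df_copy[col].str.extract(r'(\d+[.]?\d*)', expand=False)
    new_df_arr.append(df_copy)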

Upvotes: 0

NYC Coder

Reputation: 7594

You can strip the opening parenthesis and the letters with a regex replace:

df['rate_number'] = df['rate_number'].replace(r'\(|[a-zA-Z]+', '', regex=True)

A better approach is to extract the number directly:

df['rate_number_2'] = df['rate_number'].str.extract('([0-9][,.]*[0-9]*)')

Output:

  rate_number rate_number_2
0         92             92
1         33             33
2       9.25           9.25
3      4,396          4,396
4      2,620          2,620
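
If a value can contain more than one separator, the patterns in this thread stop after the first separator group. A minimal sketch of one way to handle that, assuming hypothetical inputs such as "1,234.56 total" and "rate 5." that are not in the question; the pattern below is an illustration, not something taken from the answers:

```python
import pandas as pd

# Two values from the question plus two hypothetical edge cases.
s = pd.Series(["92 rate", "(4,396 total", "1,234.56 total", "rate 5."])

# One or more digits, then any number of (separator + digits) groups.
# The inner group is non-capturing, so extract still returns a single column.
pattern = r"(\d+(?:[.,]\d+)*)"

extracted = s.str.extract(pattern, expand=False)
print(extracted.tolist())
```

Because every separator must be followed by at least one digit, trailing punctuation such as the dot in "rate 5." is left out of the match.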

Upvotes: 2
