Shay Agarwal

Reputation: 5

Convert string in pandas dataframe to integer

I have a dataframe with a column for the number of reviews. The values in that column are in this format:

816 ratings
1,139 ratings
5 ratings
22,3456 ratings

I'd like to convert this to an integer so I can sort the dataframe. My output should be:

816
1139
5
223456

I tried

df=df['num_reviews'].str.extract('(\d+)').astype(float)
df

However, this kept only the digits before the comma (e.g. 22,3456 returns 22.0), and using .astype(int) raised errors because some fields are NaN.

Upvotes: 0

Views: 1691

Answers (1)

mujjiga

Reputation: 16916

df['num_reviews'].str.replace(r'\D+', '', regex=True).replace('','0').astype(float)

Test case:

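import pandas as pd

df = pd.DataFrame({
    'num_reviews': ["816 ratings", "1,139 ratings", 
                    "5 ratings", "no ratings", "22,3456 ratings"]
})
# regex=True is needed on recent pandas, where str.replace defaults to a literal match
print(df['num_reviews'].str.replace(r'\D+', '', regex=True).replace('','0').astype(float))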

Output:

0       816.0
1      1139.0
2         5.0
3         0.0
4    223456.0
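
Since the question asks for integers rather than floats, here is a minimal sketch of one way to get there. It assumes a pandas version with the nullable "Int64" dtype; the column name num_reviews_int and the variable names digits and df_sorted are made up for illustration. Rows with no digits (like "no ratings") become missing values instead of 0:

```python
import pandas as pd

df = pd.DataFrame({
    "num_reviews": ["816 ratings", "1,139 ratings",
                    "5 ratings", "no ratings", "22,3456 ratings"]
})

# Strip every non-digit character; rows without digits become empty strings.
digits = df["num_reviews"].str.replace(r"\D+", "", regex=True)

# errors="coerce" turns anything unparseable into NaN, and the nullable
# "Int64" dtype (capital I) can hold those missing values as <NA>.
df["num_reviews_int"] = pd.to_numeric(digits, errors="coerce").astype("Int64")

# Sort by the integer column; missing values go last by default.
df_sorted = df.sort_values("num_reviews_int", ascending=False)
print(df_sorted)
```

If you would rather treat "no ratings" as zero, as in the one-liner above, you could add .fillna(0) before the cast and use plain .astype(int) instead.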

Upvotes: 1
