Hagrid_
Hagrid_

Reputation: 27

cleaning numeric columns in pandas

I have some difficulties to exploit csv scraping file in pandas. I have several columns, one of them contain prices as '1 800 €'

After to import csv as dataframe, I can not convert my columns in Integrer

I deleted euro symbol without problem

data['prix']= data['prix'].str.strip('€')

I tried to delete space with the same approach, but the space still remaied

data['prix']= data['prix'].str.strip()
or
data['prix']= data['prix'].str.strip(' ')
or
data['prix']= data['prix'].str.replace(' ', '')

I tried to force the conversion in Int

data['prix']= pd.to_numeric(data['prix'], errors='coerce')

My column was fill by Nan value

I tried to convert before operation of replace space in string

data = data.convert_dtypes(convert_string=True)

But same result : impossible to achieve my aim the spaces are always present and I can not convert in integer

I looked with Excel into dataset, I can not identify special problem in the data

I tried also to change encoding standard in read_csv ... ditto

In this same dataset I had the same problem for the kilometrage as 15 256 km And I had no problem to retreat and convert to int ...

I would like to test through REGEX to copy only numbers of the field et create new column with

How to proceed ? I am also interested by other ideas

Thank you

Upvotes: 2

Views: 1062

Answers (2)

gerda die gandalfziege
gerda die gandalfziege

Reputation: 762

To delete the white space use this line:

data['prix']= data['prix'].str.replace(" ","")

and to convert the string into a int use this line:

data['prix'] = [int(i) for i in data['prix']]

Upvotes: 0

Corralien
Corralien

Reputation: 120409

Use str.findall:

I would like to test through REGEX to copy only numbers of the field et create new column with

data['prix2'] = data['prix'].str.findall(r'\d+').str.join('').astype(int)

# Or if it raises an exception

data['prix2'] = pd.to_numeric(data['prix'].str.findall('(\d+)').str.join(''), errors='coerce')

Upvotes: 1

Related Questions