Simd

Reputation: 21343

How to strip \xa3 from all the numbers in a column?

I have a column of a dataframe which is read in from a csv file using pd.read_csv. When I look at the numbers they all look like:

df['Amount'][0]
Out[4]: '\xa3128.23'

That is, they have \xa3 prepended to them and are therefore not interpreted as floats.

How can I strip off the \xa3 and convert them to floats?

Upvotes: 2

Views: 153

Answers (2)

PoweredBy90sAi

Reputation: 1233

As @jezrael and I discussed under his answer, it may be better to handle the encoding on import with pandas rather than stripping the character from the result afterwards. The extra pass over the column won't scale well with larger data sets and may lead to slow runtimes.

df = pd.read_csv("your_data_set_path", encoding='utf-8')  # use the appropriate encoding
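To make the encoding point concrete, here is a minimal sketch. It assumes the file is stored as Latin-1, where the pound sign is the single byte 0xA3 (a lone 0xA3 byte is not valid UTF-8). The in-memory bytes and the sample values are made up for illustration, so substitute whatever encoding your file actually uses.

```python
import io

import pandas as pd

# Hypothetical file contents: one column whose values start with a pound sign,
# encoded as Latin-1 so that the pound sign is the single byte 0xA3.
raw_bytes = "Amount\n£128.23\n£5.00\n".encode("latin-1")

# Assumption: the file is Latin-1. Passing the matching encoding lets pandas
# decode 0xA3 as the pound sign instead of leaving an escaped byte behind.
df = pd.read_csv(io.BytesIO(raw_bytes), encoding="latin-1")

print(df["Amount"].tolist())
```

The values are still strings at this point, so the currency symbol has to be removed before converting the column to float.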

Upvotes: 1

jezrael

Reputation: 863301

I think you need replace:

df['Amount'].str.replace('\xa3', '').astype(float)

Or use lstrip to remove the leading £:

df['Amount'].str.lstrip('£').astype(float)

As @csevier pointed out, there seems to be an encoding problem. The solution is:

df = pd.read_csv("your_data_set_path", encoding='utf-8') 

And then:

df['Amount'] = df['Amount'].str.lstrip('£').astype(float)
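Putting the steps together, a small self-contained sketch might look like the following. The sample data and the helper name to_amount are made up for illustration. Removing thousands separators is an extra assumption that is not part of the question, so drop that step if your amounts never contain commas.

```python
import pandas as pd

# Hypothetical sample data standing in for the column read from the CSV.
df = pd.DataFrame({"Amount": ["£128.23", "£5.00", "£1,024.50"]})


def to_amount(series):
    """Strip a leading pound sign and convert the values to float.

    Assumption: amounts may contain ',' as a thousands separator,
    which is removed before conversion.
    """
    cleaned = series.str.lstrip("£").str.replace(",", "", regex=False)
    return cleaned.astype(float)


df["Amount"] = to_amount(df["Amount"])
print(df["Amount"].dtype)
```

Both string operations are vectorised over the whole column, so no explicit Python loop is needed.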

Upvotes: 4
