Mensa 23

Reputation: 73

Formatting a string containing currency and commas

Does anyone know how to convert these strings (a column in a pandas DataFrame) to floats so that I can sort by the column?

£880,000
£88,500
£850,000
£845,000

i.e. I want this to become

88,500
845,000
850,000
880,000

Thanks in advance!

Upvotes: 1

Views: 115

Answers (1)

mozway

Reputation: 262164

Assuming 'col' is the column name.

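If you just want to sort and keep the values as strings, you can use a natural-sort key from natsort (the key argument of sort_values needs pandas 1.1+). Natural sorting compares the digit groups between the commas one at a time, so it gives the right order for this sample but can misorder amounts with a different number of groups (e.g. £1,000,000 would sort before £880,000):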

from natsort import natsort_key
df.sort_values(by='col', key=natsort_key)

# OR

from natsort import natsort_keygen
df.sort_values(by='col', key=natsort_keygen())

output:

        col
1   £88,500
3  £845,000
2  £850,000
0  £880,000
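If the amounts can have different numbers of comma groups, a numeric sort key is a safer way to sort while leaving the strings untouched. The following is a minimal sketch, assuming the sample frame from the question; the helper name currency_to_number is hypothetical and only used for illustration:

```python
import pandas as pd

# Sample data reconstructed from the question (assumption).
df = pd.DataFrame({"col": ["£880,000", "£88,500", "£850,000", "£845,000"]})

def currency_to_number(s: pd.Series) -> pd.Series:
    """Strip everything except digits and dots, then parse as numbers."""
    return pd.to_numeric(s.str.replace(r"[^\d.]", "", regex=True))

# The key is only used for ordering; the column itself stays as strings.
sorted_df = df.sort_values(by="col", key=currency_to_number)
print(sorted_df)
```

Because the key function is applied to the whole Series, the original column is not modified.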

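If you want to convert to numbers (pd.to_numeric yields integers for these values, since none has a decimal part; append .astype(float) if you specifically need floats):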

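import pandas as pd

# remove every character that is not a digit or a dot, then parse as numbers
# (raw string so that \d is not treated as an invalid escape sequence)
df['col'] = pd.to_numeric(df['col'].str.replace(r'[^\d.]', '', regex=True))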

df.sort_values(by='col')

output:

      col
1   88500
3  845000
2  850000
0  880000
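If an explicit float column is required, as the question asks, the converted values can be cast afterwards. This is a small sketch under the assumption that the frame looks like the sample in the question; the variable names numbers and as_float are illustrative only:

```python
import pandas as pd

# Sample data reconstructed from the question (assumption).
df = pd.DataFrame({"col": ["£880,000", "£88,500", "£850,000", "£845,000"]})

# Strip the currency sign and thousands separators, then parse.
numbers = pd.to_numeric(df["col"].str.replace(r"[^\d.]", "", regex=True))

# Cast explicitly so the dtype is float64 regardless of what to_numeric inferred.
as_float = numbers.astype(float)
print(as_float.sort_values())
```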

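If you only want to drop the £ sign and keep the values as strings (as in your expected output), you can use str.lstrip. Note that this alone does not sort the column: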

df['col'] = df['col'].str.lstrip('£')

output:

       col
0  880,000
1   88,500
2  850,000
3  845,000
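To get exactly the output shown in the question (comma-formatted strings without the £ sign, in ascending order), the two ideas can be combined: strip the sign, then sort with a numeric key. A minimal sketch, again assuming the question's sample data; the name out is illustrative:

```python
import pandas as pd

# Sample data reconstructed from the question (assumption).
df = pd.DataFrame({"col": ["£880,000", "£88,500", "£850,000", "£845,000"]})

out = (
    df.assign(col=df["col"].str.lstrip("£"))  # keep strings, drop the sign
      .sort_values(
          by="col",
          # order by numeric value; the commas are removed only inside the key
          key=lambda s: pd.to_numeric(s.str.replace(",", "", regex=False)),
      )
)
print(out)
```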

Upvotes: 1
