Reputation: 4636
I have this dataframe:
df:
CNPJ last_revenues
0 01.637.895/0001-32 R$ 12.696.658
1 02.916.265/0001-60 R$ 162.914.526,04
2 02.932.074/0001-91 R$ 1.928.312
3 03.853.896/0001-40 R$ 19.333.453,92
The values in the column "last_revenues" are strings of currency values in latin format. I need to turn them into floats, it would have to look like this:
df:
CNPJ last_revenues
0 01.637.895/0001-32 12696658.0000
1 02.916.265/0001-60 162914526.0400
2 02.932.074/0001-91 1928312.0000
3 03.853.896/0001-40 19333453.9200
Actually I noticed that there are a few nan's in the middle of it. So my df looks like this:
df:
CNPJ last_revenues
0 01.637.895/0001-32 R$ 12.696.658
1 02.916.265/0001-60 R$ 162.914.526,04
2 02.932.074/0001-91 nan
3 03.853.896/0001-40 R$ 19.333.453,92
And I want to make it look like this:
df:
CNPJ last_revenues
0 01.637.895/0001-32 12696658.0000
1 02.916.265/0001-60 162914526.0400
2 02.932.074/0001-91 nan
3 03.853.896/0001-40 19333453.9200
Upvotes: 1
Views: 432
Reputation: 41
You can also try:
df.last_revenues.replace('R\$ ','',inplace=True,regex=True)
df.last_revenues.replace('\.','',inplace=True,regex=True)
df.last_revenues.replace(',','.',inplace=True,regex=True)
Upvotes: 1
Reputation: 177
There are 2 things required:
rstr.split(' ')[1]
)Convert the currency string to int
df['last_revenues'] = df['last_revenues'].apply(lambda rstr: '{:5.4f}'.format(float(rstr.split(' ')[1].replace('.','').replace(',', '.'))) if rstr != 'nan' else 'nan')
Upvotes: 1
Reputation: 6891
One way to do it, is to apply
a conversion function to each element in last_revenues
. This can be done in many ways, but as a lambda
one liner it can be written as:
df['last_revenues'] = df['last_revenues'].dropna().apply(lambda rstr: float(rstr.replace('.','').replace(',','.')[3:]))
Here I first I drop all nan values, thereafter I replace all dots with nothing, then the comma with a dot. Finally I remove the initial currency characters and convert the resulting string to a float.
Upvotes: 2