Python - pandas - is it possible to read csv with multiple decimal marks

Question

I'm trying to read a CSV file with hundreds of float columns. Half of them use '.' as the decimal mark and the other half use ','; none of them has a thousands separator. It would be helpful if the decimal parameter of pd.read_csv could accept both ',' and '.', but it seems that only length-1 decimal markers are supported. As a result, only half of my columns are imported with float dtype. The other half come in as object dtype and have to be converted to float separately.

>>> import pandas as pd
>>> df0 = pd.read_csv('example.csv')
>>> df0.head()
    col1   col2
0  123,2  12.02
1  22,15   1.50
>>> df0.dtypes
col1     object
col2    float64
dtype: object
>>> df1 = pd.read_csv('example.csv', decimal=',')
>>> df1.head()
     col1   col2
0  123.20  12.02
1   22.15    1.5
>>> df1.dtypes
col1    float64
col2     object
dtype: object

==> Is there a Pythonic way to import all columns as float, treating both '.' and ',' as decimal marks?

Alperen · Accepted Answer

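Before you read the file with pandas, normalize the decimal marks in the file itself. This assumes the fields are quoted and separated by "," and that '###' does not occur in the data. Note that it overwrites example.csv: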

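with open("example.csv") as f:
    content = f.read()

# Protect the "," sequences that separate quoted fields
content = content.replace('","', '###')
# Turn every remaining comma (the decimal marks) into a dot
content = content.replace(',', '.')
# Restore the field separators
content = content.replace('###', '","')

# Overwrite the file with the normalized text
with open("example.csv", "w") as f:
    f.write(content)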
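
If you would rather not rewrite the file on disk, the same normalization can be done after loading. The following is only a minimal sketch, not part of the original answer: read_csv_mixed_decimals is a hypothetical helper name, and it assumes that every column is numeric, that there are no thousands separators (as stated in the question), and that values containing ',' are quoted (or the file uses another delimiter) so pandas can split the fields correctly.

```python
import io

import pandas as pd


def read_csv_mixed_decimals(source, **kwargs):
    """Hypothetical helper: read a CSV whose columns mix '.' and ',' decimal marks."""
    # Read every column as text so pandas does not guess dtypes.
    raw = pd.read_csv(source, dtype=str, **kwargs)
    # Replace ',' with '.' in each column (literal replacement, no regex).
    normalized = raw.apply(lambda col: col.str.replace(",", ".", regex=False))
    # Convert each column to numbers; unparseable values become NaN.
    return normalized.apply(pd.to_numeric, errors="coerce")


# In-memory stand-in for example.csv (assumed layout: comma-valued fields quoted).
sample = io.StringIO('col1,col2\n"123,2",12.02\n"22,15",1.50\n')
df = read_csv_mixed_decimals(sample)
print(df.dtypes)
```

Because errors="coerce" silently turns anything non-numeric into NaN, it may be worth checking df.isna().sum() afterwards if the file could contain genuine text columns.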
