KcFnMi

Reputation: 6191

Convert decimal separator

I'm loading a CSV where the decimal separator is , and I would like to replace it with . in order to proceed with the analysis.

I see the converters option in pandas.read_csv, but to use it I would need to provide a dict covering every column I want to convert, which might not be a good idea since there are lots of columns.

What I have in mind is to look at each cell in every column and replace the separator:

ii = len(df.columns)  # number of columns
print(ii)
jj = len(df.index)    # number of rows
print(jj)
# walk every cell, replacing ',' with '.' (iloc instead of the deprecated ix)
for i in range(jj):
    for j in range(ii):
        df.iloc[i, j] = str(df.iloc[i, j]).replace(',', '.')

Is there a better approach?

Upvotes: 1

Views: 371

Answers (2)

root

Reputation: 33843

You can use the decimal parameter of read_csv:

df = pd.read_csv('file.csv', decimal=',')
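
For example, with a small in-memory sample (the data and the ';' field separator are made up for illustration), both columns come back as floats:

import io
import pandas as pd

# Made-up sample where ',' is the decimal separator and ';' separates fields
csv_data = io.StringIO("a;b\n1,5;2,75\n3,0;4,25")

df = pd.read_csv(csv_data, sep=';', decimal=',')
print(df.dtypes)  # both columns parsed as float64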

Upvotes: 3

gokul_uf

Reputation: 760

You don't have to provide all the column names to converters; give only the columns you want to convert:

converters = {'col_name': lambda x: str(x).replace(',', '.')}
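
A minimal runnable sketch, assuming a made-up 'price' column; the float() cast is an addition here so the parsed column is numeric rather than string:

import io
import pandas as pd

# Made-up sample: only the 'price' column uses ',' decimals
csv_data = io.StringIO("price;qty\n1,5;2\n3,25;4")

df = pd.read_csv(
    csv_data, sep=';',
    converters={'price': lambda x: float(str(x).replace(',', '.'))},
)
print(df['price'])  # numeric thanks to the added float() cast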

EDIT, after the question was reworded:

Is this the best way to do it?

I would say yes. The OP mentioned that there are a large number of columns he/she wants to convert and feels that a dict would get out of hand. IMO, it will not, for two reasons.

The first reason is that even though you have a large number of columns, I assume there is some pattern to them (e.g. every second column needs to be converted). You could run a loop or a dict comprehension to generate this dict and pass it to converters, as sketched below. Another advantage is that converters accepts both column labels and column indices as keys, so you don't even have to mention the column labels.
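
For instance, a dict comprehension over column indices could build the converters dict; the "every second column" pattern and the file name here are assumptions for illustration:

import pandas as pd

# Read just the header row to find out how many columns there are
ncols = len(pd.read_csv('file.csv', nrows=0).columns)

# Assumed pattern: every second column (0, 2, 4, ...) needs converting;
# keys are column indices, so no labels are required
converters = {i: lambda x: str(x).replace(',', '.')
              for i in range(0, ncols, 2)}

df = pd.read_csv('file.csv', converters=converters)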

Second, a dict is implemented as a hash table, so look-ups take constant time on average. You don't have to worry about slow runtimes when the dictionary holds a large number of entries.

Though your method is correct, IMO it is reinventing the wheel.

Upvotes: 0
