Martin Thoma

Reputation: 136379

How can I make sure Pandas does not interpret a numeric string as a number?

I have code that reads a CSV like this:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import pandas as pd

csv_path = 'test.csv'
df = pd.read_csv(csv_path, delimiter=';', quotechar='"',
                 decimal=',', encoding="ISO-8859-1", dtype={'FOO': str})
df.FOO = df.FOO.map(lambda n: n.zfill(6))

and I get

AttributeError: 'float' object has no attribute 'zfill'

so apparently Pandas interpreted the column FOO as a number. The values look numeric, but I don't want them to be interpreted as numbers.

(I know that df.FOO = df.FOO.map(lambda n: str(n).zfill(6)) makes the problem go away, but I would like to know why this problem occurs in the first place.)

I use pandas 0.20.3.

Example CSV

FOO;BAR
01,23;4,56
1,23;45,6
;987

Upvotes: 4

Views: 2171

Answers (1)

Martin Thoma

Reputation: 136379

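The problem is the empty cell. Pandas reads it as NaN, which is a float, even when the column is declared as str via dtype; the non-empty cells are read as strings as requested.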

The line

df.FOO = df.FOO.fillna(value="")

gives the desired behavior, but it seems like a rather dirty solution.

I'm not sure whether this is a bug or desired behavior: https://github.com/pandas-dev/pandas/issues/17810
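
For illustration, here is a minimal sketch of what happens, plus two possible alternatives to fillna. It reads the example CSV from an in-memory string (io.StringIO stands in for test.csv, and the encoding argument is dropped because the text is already decoded; both are assumptions made for this example). Variable names such as padded and df_no_na are made up here. I have not checked this against pandas 0.20.3 specifically.

```python
import io

import pandas as pd

# Same content as the example CSV in the question.
csv_text = "FOO;BAR\n01,23;4,56\n1,23;45,6\n;987\n"

df = pd.read_csv(io.StringIO(csv_text), delimiter=';', quotechar='"',
                 decimal=',', dtype={'FOO': str})

# Non-empty cells are strings, as requested by dtype ...
first_value = df.FOO.iloc[0]
# ... but the empty cell is a missing value (NaN), which has no zfill.
empty_cell = df.FOO.iloc[2]

# Alternative 1: the .str accessor skips missing values and leaves them as NaN.
padded = df.FOO.str.zfill(6)

# Alternative 2: tell read_csv not to turn empty cells into NaN at all,
# so the empty cell stays an empty string.
df_no_na = pd.read_csv(io.StringIO(csv_text), delimiter=';', quotechar='"',
                       decimal=',', dtype={'FOO': str},
                       keep_default_na=False)
padded_no_na = df_no_na.FOO.map(lambda n: n.zfill(6))
```

Note that keep_default_na=False applies to the whole file, so empty cells in other columns would no longer be turned into NaN either. If only FOO should be affected, the fillna line above or the .str accessor is probably the safer choice.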

Upvotes: 6
