Python pandas read_table converts zero to NaN

Question

Say I have the following file test.txt:

Aaa Bbb
Foo 0
Bar 1
Baz NULL

(The separator is actually a tab character, which I can't seem to input here.) I then try to read it using pandas (0.10.0):

In [523]: pd.read_table("test.txt")
Out[523]:
   Aaa  Bbb
0  Foo  NaN
1  Bar    1
2  Baz  NaN

Note that the zero value in the second column (Bbb) has suddenly turned into NaN! I was expecting a DataFrame like this:

   Aaa   Bbb
0  Foo     0
1  Bar     1
2  Baz   NaN

What do I need to change to obtain the latter? I suppose I could use pd.read_table("test.txt", na_filter=False) and subsequently replace 'NULL' values with NaN and change the column dtype. Is there a more straightforward solution?

DSM · Accepted Answer

I think this is issue #2599, "read_csv treats zeroes as nan if column contains any nan", which is now closed. I can't reproduce in my development version:

In [27]: with open("test.txt") as fp:
   ....:     for line in fp:
   ....:         print repr(line)
   ....:         
'Aaa\tBbb\n'
'Foo\t0\n'
'Bar\t1\n'
'Baz\tNULL\n'

In [28]: pd.read_table("test.txt")
Out[28]: 
   Aaa  Bbb
0  Foo    0
1  Bar    1
2  Baz  NaN

In [29]: pd.__version__
Out[29]: '0.10.1.dev-f7f7e13'
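
If you are stuck on an affected version, the workaround mentioned in the question can be sketched roughly as below. This is only a minimal illustration, not an official recipe: the in-memory `data` string stands in for test.txt, the variable names `fixed` and `workaround` are made up for the example, and it assumes a reasonably recent pandas where `pd.to_numeric` is available. Note that `errors="coerce"` turns every non-numeric string into NaN, not just 'NULL'.

```python
import io

import pandas as pd

# Stand-in for test.txt, with real tab separators (assumed content).
data = "Aaa\tBbb\nFoo\t0\nBar\t1\nBaz\tNULL\n"

# On a version that includes the fix, default parsing keeps the zero
# and only treats 'NULL' as missing.
fixed = pd.read_table(io.StringIO(data))

# Workaround: switch off NA detection so nothing is converted on read,
# then convert the column by hand. 'NULL' cannot be parsed as a number,
# so errors="coerce" maps it to NaN.
workaround = pd.read_table(io.StringIO(data), na_filter=False)
workaround["Bbb"] = pd.to_numeric(workaround["Bbb"], errors="coerce")

print(fixed)
print(workaround)
```

Both frames should end up with 0 and 1 preserved in Bbb and a missing value in the last row. Upgrading pandas is the simpler route if that is an option.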
