Pandas - how to drop rows containing fewer fields than header

Question

Pandas correctly rejects rows that contain more fields than the header in a CSV file. However, it pads rows that contain fewer fields with NaN, even when there is no trailing , to indicate an empty field.

My csv:

id,name,pin,city
1,abc,123,SJ
2,xyz,789
3,pqr,456,AL
4,qwe,345,

When I try to read this via pandas:

>>> import pandas
>>> a = pandas.read_csv('test.csv', error_bad_lines=False)
>>> a
   id name  pin city
0   1  abc  123   SJ
1   2  xyz  789  NaN
2   3  pqr  456   AL
3   4  qwe  345  NaN
>>>

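Here the row with id 4 is read with NaN as its city value, which is correct since the trailing , indicates an empty field. But the row with id 2 has only three fields and no trailing , so it should either raise an error or be left out of the dataframe. Is there any way to achieve this?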
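
To make the desired behaviour concrete, here is a minimal sketch of one possible workaround, not a built-in pandas option: filter the text with the standard csv module before pandas sees it, keeping only rows with at least as many fields as the header. The names drop_short_rows and read_csv_strict are hypothetical helpers invented for this illustration. Rows with more fields than the header are deliberately left in so that pandas still handles them as before (error_bad_lines in older releases, on_bad_lines in newer ones).

```python
import csv
import io


def drop_short_rows(csv_text):
    """Return csv_text without rows that have fewer fields than the header.

    Hypothetical helper: the first row is assumed to be the header.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    n_fields = len(rows[0])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # A trailing comma yields an empty last field, so "4,qwe,345,"
        # has 4 fields and is kept, while "2,xyz,789" has 3 and is dropped.
        if len(row) >= n_fields:
            writer.writerow(row)
    return out.getvalue()


def read_csv_strict(path, **kwargs):
    """Hypothetical wrapper: read path with pandas after dropping short rows."""
    import pandas  # imported here so the filter above needs only the stdlib

    with open(path, newline="") as f:
        filtered = drop_short_rows(f.read())
    return pandas.read_csv(io.StringIO(filtered), **kwargs)
```

With the test.csv shown above, read_csv_strict('test.csv') should return the rows with ids 1, 3 and 4 and leave out the row with id 2. The whole file is held in memory here, which is fine for small inputs but may not suit very large ones.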
