arelangi

Reputation: 325

fread doesn't like lines with fewer fields than other lines

I'm using fread to read a 179 MB CSV file with 16 columns and 637501 rows. fread does not read the first 29 lines of the file, and it misses the header on the first line as well. I have tried:

fread("filename.csv", sep = ",")
fread("filename.csv", sep = ",", skip = 0L)
fread("filename.csv", sep = ",", skip = 1L)
fread("filename.csv", sep = ",", autostart = 1L)

When I set header = TRUE, row 30 is used as the header, but fread still fails to recognize the first 29 rows. I am able to read the same file with read.csv without any issues (it just takes a lot longer).

Is this a bug or am I missing something?

Link to a sample CSV that produces the same bug (20 KB): https://dl.dropboxusercontent.com/u/17747104/example.csv

Here's the link to the 179 MB file: https://dl.dropboxusercontent.com/u/17747104/read.csv

Upvotes: 2

Views: 3727

Answers (1)

Matt Dowle

Reputation: 59612

As you've now realised by looking at row 30, it has 16 columns whereas the other rows have 36 columns. It seems chopped off, like a data error.
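One way to spot rows like this is to count the fields on every line before reading. A minimal sketch using base R's count.fields; the small file below is made up for illustration (it is not the CSV from the question), and the names tmp, n_fields, usual and odd_lines are just placeholders:

```r
# Made-up miniature file: two rows are shorter than the header.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2", "3,4", "5,6,7", "8,9,10"), tmp)

# Number of comma-separated fields on each line (count.fields is in utils).
n_fields <- count.fields(tmp, sep = ",", quote = "\"")

# Tally the line lengths, then list the lines that differ from the most common length.
print(table(n_fields))
usual <- as.integer(names(which.max(table(n_fields))))
odd_lines <- which(n_fields != usual)
print(odd_lines)
```

On the real file you would pass its path instead of tmp; the lines reported in odd_lines are the ones worth inspecting by hand.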

Edit: fread gained fill=TRUE in v1.9.8 (on CRAN, Nov 2016); see the release notes. That should resolve it.
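For illustration, a rough sketch of what fill=TRUE does on a small made-up file (again not the CSV from the question; tmp and dt are placeholder names). It assumes data.table v1.9.8 or later is installed:

```r
library(data.table)

# Made-up miniature file: the first two data rows are missing their last field.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2", "3,4", "5,6,7", "8,9,10"), tmp)

# fill = TRUE pads short rows with NA instead of treating them as non-data.
dt <- fread(tmp, sep = ",", header = TRUE, fill = TRUE)
print(dt)
```

The short rows should come through with NA in the missing column, so it is still worth checking afterwards whether those rows are genuinely incomplete or a sign of a data error upstream.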

Upvotes: 4
