Vasily A

Reputation: 8646

Any way to force fread() of data.table not to stop on empty lines?

(This question is no longer relevant as of the data.table release of 25-NOV-2016; see the accepted answer below.)

I have a table with some empty lines in the middle. When I try to read it with fread, it stops with the warning "Stopped reading at empty line 10006, but text exists afterwards (discarded)". Is there any way to avoid this without changing the data file?

Upvotes: 20

Views: 9289

Answers (3)

bg49ag

Reputation: 143

If anyone else is having a similar problem: I've noticed that data.table 1.10.4 (the current 2017 release I'm using) seems to produce empty-line warnings with some files unless you explicitly set:

'strip.white = FALSE'

I was looking at what were obviously line errors in ~350 files I was trying to import. Some lines were broken across two rows in the originals and, since the two halves contained different kinds of information, fread was warning of class coercion issues for some of the columns. At the same time I was getting 'empty line' warnings for almost every file, each on a different line. I checked those lines manually in Notepad++, many times: there were no empty lines, and plenty of lines remained after the reported position. I then worked through the import arguments, and specifically disabling strip.white removed the empty line warnings.
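
As a minimal sketch of that workaround, assuming a small hypothetical CSV written to a temporary file (the file contents and column names below are invented for illustration, and this tiny file will not by itself reproduce the spurious warnings):

```r
library(data.table)

# Hypothetical sample file: the second data row has padded whitespace
# around an unquoted field.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,alice", "2,  bob  ", "3,carol"), tmp)

# Turn off whitespace stripping explicitly, as suggested above.
# With strip.white = FALSE, fread leaves the whitespace around
# unquoted fields in place instead of trimming it.
dt <- fread(tmp, strip.white = FALSE)
print(dt)
```

Note that keeping the whitespace may mean trimming individual character columns afterwards, for example with trimws().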

Upvotes: 2

dnlbrky

Reputation: 9825

Version 1.9.8 of data.table, released 25-NOV-2016, has a new blank.lines.skip option to skip blank lines.

text <- "1,a\n\n2,b\n3,c\n4,a\n\n5,b\n\n6,c"

library(data.table)
fread(text)
##    V1 V2
## 1:  2  b
## 2:  3  c
## 3:  4  a
## Warning message:
## In fread("1,a\n\n2,b\n3,c\n4,a\n\n5,b\n\n6,c") :
##   Stopped reading at empty line 6 but text exists afterwards (discarded): 5,b

fread(text, blank.lines.skip=TRUE)
##    V1 V2
## 1:  1  a
## 2:  2  b
## 3:  3  c
## 4:  4  a
## 5:  5  b
## 6:  6  c
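
The same option should apply when reading from a file rather than an inline string. The following is a sketch under that assumption; the temporary file and the explicit version check are illustrative additions, not part of the release notes:

```r
library(data.table)

# blank.lines.skip was introduced in data.table 1.9.8, so fail early
# with a clear message on older installations.
stopifnot(packageVersion("data.table") >= "1.9.8")

# Hypothetical data file with empty lines in the middle.
tmp_file <- tempfile(fileext = ".txt")
writeLines(c("1,a", "", "2,b", "3,c", "4,a", "", "5,b", "", "6,c"), tmp_file)

# Skip the empty lines instead of stopping at the first one.
dt_file <- fread(tmp_file, blank.lines.skip = TRUE)
print(dt_file)
```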

Upvotes: 11

Bram Visser

Reputation: 573

You can use the Windows findstr command to get rid of empty lines.

Example file "Data.txt".

1,a

2,b
3,c
4,a


5,b

6,c

Reading it with fread reproduces your warning.

> dt <- fread("Data.txt")
Warning message:
In fread("Data.txt") :
Stopped reading at empty line 6 of file, but text exists afterwards (discarded): 5,b

But it works when the file is filtered through Windows findstr directly in the fread call.

> require(data.table)
> dt <- fread('findstr "." Data.txt')

# > dt
#    V1 V2
# 1:  1  a
# 2:  2  b
# 3:  3  c
# 4:  4  a
# 5:  5  b
# 6:  6  c
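
On systems without findstr, a similar filter can be done in R itself. This is a different technique from the one above, sketched here as a platform-independent alternative: read the raw lines, drop the blank ones, and hand the remaining text to fread. The temporary file stands in for "Data.txt".

```r
library(data.table)

# Hypothetical stand-in for Data.txt, including the empty lines.
data_path <- tempfile(fileext = ".txt")
writeLines(c("1,a", "", "2,b", "3,c", "4,a", "", "", "5,b", "", "6,c"),
           data_path)

# Read the raw lines and keep only those containing at least one
# non-whitespace character.
raw_lines <- readLines(data_path)
kept_lines <- raw_lines[grepl("\\S", raw_lines)]

# A single string containing newlines is treated by fread as the data
# itself rather than as a file name.
dt_clean <- fread(paste(kept_lines, collapse = "\n"))
print(dt_clean)
```

This reads the whole file into memory twice, so for very large files the findstr approach or blank.lines.skip is likely the better choice.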

Upvotes: 5
