Reputation: 8646
(This question is no longer relevant as of the data.table release of 25-NOV-2016; see the accepted answer below.)
So, I have a table with some empty lines in the middle. When I try to open it with fread, it stops, saying "Stopped reading at empty line 10006, but text exists afterwards (discarded)". Is there any way to avoid this without changing the data file?
Upvotes: 20
Views: 9289
Reputation: 143
If anyone else is having a similar problem: I've noticed that data.table 1.10.4 (the current 2017 release, which I'm using) seems to produce empty-line warnings with some files unless you explicitly set strip.white = FALSE.
I was looking at what were obviously line errors in ~350 files I was trying to import. Some lines were broken across two rows in the originals and, since the two parts contained different kinds of information, fread was warning of class coercion issues for some of the columns. But I was simultaneously getting 'empty line' warnings for almost every file, on different lines. I manually checked those lines in Notepad++, many times. There were no empty lines, and there were plenty of lines remaining after the reported position. Working through the import arguments, I found that disabling strip.white specifically removed the empty-line warnings.
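As a minimal sketch of the setting described above, the call could look like the following. The temporary files and their contents are hypothetical stand-ins for the ~350 real files, and they do not reproduce the warning; the block only shows where strip.white = FALSE goes when importing many files in one pass.

```r
library(data.table)

# Hypothetical stand-in files, created only so the sketch runs on its own
paths <- c(tempfile(fileext = ".csv"), tempfile(fileext = ".csv"))
writeLines(c("id,label", "1,a", "2,b"), paths[1])
writeLines(c("id,label", "3,c", "4,d"), paths[2])

# Pass strip.white = FALSE explicitly to every fread call
tables <- lapply(paths, fread, strip.white = FALSE)

# Stack the per-file tables into one data.table
dt <- rbindlist(tables)
print(dt)
```

Whether this removes the warning for a given file likely depends on the file and on the data.table version, so it is worth comparing row counts against the source files afterwards.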
Upvotes: 2
Reputation: 9825
Version 1.9.8 of data.table, released 25-NOV-2016, has a new blank.lines.skip option to skip blank lines.
text <- "1,a\n\n2,b\n3,c\n4,a\n\n5,b\n\n6,c"
library(data.table)
fread(text)
## V1 V2
## 1: 2 b
## 2: 3 c
## 3: 4 a
## Warning message:
## In fread("1,a\n\n2,b\n3,c\n4,a\n\n5,b\n\n6,c") :
## Stopped reading at empty line 6 but text exists afterwards (discarded): 5,b
fread(text, blank.lines.skip=TRUE)
## V1 V2
## 1: 1 a
## 2: 2 b
## 3: 3 c
## 4: 4 a
## 5: 5 b
## 6: 6 c
Upvotes: 11
Reputation: 573
You can use the Windows findstr command to get rid of empty lines.
Example file "Data.txt", which contains empty lines between some of the rows:
1,a
2,b
3,c
4,a
5,b
6,c
Reproduces your error.
> dt <- fread("Data.txt")
Warning message:
In fread("Data.txt") :
Stopped reading at empty line 6 of file, but text exists afterwards (discarded): 5,b
But it works when using the Windows findstr command directly in fread.
> require(data.table)
> dt <- fread('findstr "." Data.txt')
# > dt
# V1 V2
# 1: 1 a
# 2: 2 b
# 3: 3 c
# 4: 4 a
# 5: 5 b
# 6: 6 c
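Since findstr is only available on Windows, a platform-independent variant of the same idea is to drop the empty lines in R before handing the text to fread. The following is a sketch, not part of the original answer: the file is recreated in a temporary location, and the exact positions of the empty lines are an assumption.

```r
library(data.table)

# Recreate a file like Data.txt; where the empty lines sit is assumed
path <- tempfile(fileext = ".txt")
writeLines(c("1,a", "", "2,b", "3,c", "4,a", "", "5,b", "", "6,c"), path)

# Read the raw lines and keep only those with non-whitespace content
lines <- readLines(path)
lines <- lines[nzchar(trimws(lines))]

# A single string containing "\n" is treated by fread as data, not a file name
dt <- fread(paste(lines, collapse = "\n"), header = FALSE)
print(dt)
```

Unlike findstr ".", which keeps any line containing at least one character, this filter also drops lines made up only of whitespace. It reads the whole file into memory as text first, so for very large files the blank.lines.skip option is probably the better choice.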
Upvotes: 5