Reputation: 77096
I'm using the recently introduced fread function from data.table to read data files.
When I wrapped my code in a knitr (Rmd) document, I noticed some strange output, namely lines like:
##
0%
even though the verbose option of fread was set to FALSE. I've used sink to hide this output, but I'd like to report the exact problem to the package author(s). Here's a minimal example:
library(knitr)
test = "```{r}
require(data.table)
fread('1 2 3\n')
```"
knit2html(text=test, output="test.html")
browseURL("test.html")
What is the 0% output?
Upvotes: 14
Views: 2035
Reputation: 637
There is a parameter called showProgress in fread. If you set it to FALSE, you will not see the progress output. (This is useful when writing R Markdown documents.)
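As a minimal sketch of this, assuming an installed data.table version whose fread accepts the showProgress argument (the temporary file and its contents are made up for illustration):

```r
library(data.table)

# Write a tiny example file so the snippet is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3", "4,5,6"), tmp)

# showProgress = FALSE asks fread not to print its progress meter,
# so no stray "0%" lines end up in knitr output or log files.
DT <- fread(tmp, showProgress = FALSE)
print(DT)
```

The same call can be placed unchanged inside an Rmd chunk.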
Upvotes: 1
Reputation: 59602
It's a % progress counter. For me it prints 0%, 5%, 10%, ... 95%, 100% (for example) with a \r at the end to make it appear on one line just underneath the call to fread when typed at the prompt.
But when called from functions, batches and knitr this is undesirable. This has now been removed. From NEWS for v1.8.9 (rev 851):
- % progress console meter has been removed. The output was inconvenient in batch mode, log files and reports which don't handle \r. It was too difficult to detect where fread is being called from, plus, removing it speeds up fread a little by saving code inside the C for loop (which is why it wasn't made optional instead). Use your operating system's system monitor to confirm fread is progressing. Thanks to Baptiste for highlighting:
Strange output from fread when called from knitr
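To illustrate why a \r-based meter looks fine at the prompt but leaks into captured output, here is a rough sketch in plain R. The function show_progress is hypothetical and is not how fread implemented its meter (that lived in C code); it only mimics the printing pattern described above.

```r
# Hypothetical illustration of a self-overwriting percentage meter.
show_progress <- function(n_steps = 20) {
  for (i in 0:n_steps) {
    pct <- round(100 * i / n_steps)
    # "\r" returns the cursor to the start of the line, so on a terminal
    # each update overwrites the previous one.
    cat(sprintf("\r%3d%%", pct))
    flush.console()
  }
  cat("\n")
  invisible(n_steps)
}

# At an interactive prompt this appears as a single updating line.
show_progress()

# When output is captured instead of drawn on a terminal (roughly what
# happens in reports and log files), every update is recorded as text.
captured <- capture.output(show_progress(4))
print(captured)
```

Anything that records output rather than rendering it on a terminal will store those intermediate percentages, which is the kind of residue seen in the knitr document.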
Just a quick reminder for completeness. From the top of ?fread:
This function is still under development. For example, dates are read as character (they can be converted afterwards using the excellent fasttime package or standard base functions) and embedded quotes ("\"" and """") have problems. There are other known issues that haven't been fixed and features not yet implemented. But, you may find it works in many cases. Please report problems to datatable-help or Stack Overflow's data.table tag.
Not for production use yet. Not because it's unstable in the sense that it crashes or is buggy (your testing will show whether it is stable in your cases or not) but because fread's arguments and behaviour is likely to change in future; i.e., we expect to make (hopefully minor) non-backwards-compatible changes. Why has it been released to CRAN then? Because a maintenance release was asked for by CRAN maintainers to comply with new stricter tests in R-devel, and a few Bioconductor packages depend on data.table and Bioconductor requires packages to pass R-devel checks. It was quicker to leave fread in and write these paragraphs, than take fread out.
Upvotes: 15
Reputation: 115392
It isn't a problem to be reported.
As stated by Matthew Dowle, this is a progress counter from fread.
You can set results = 'hide' to avoid these results being included:
library(knitr)
test = "```{r, results = 'hide'}
require(data.table)
fread('1 2 3\n')
```"
knit2html(text=test, output="test.html")
browseURL("test.html")
Look, no progress bar.
At a practical level, I think it would be sensible to have results = 'hide' or even include = FALSE for a step like this.
You will not want to repeat this kind of data-reading step. In practice, you only ever want to read the raw data in once and then serialize it (using save, saveRDS or similar), so that you can load the serialized copy next time, which would be faster.
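A minimal sketch of that read-once-then-serialize pattern follows. The helper read_cached and the ".rds" cache naming are assumptions made for illustration and are not part of data.table.

```r
library(data.table)

# Hypothetical helper: parse `path` with fread the first time, then reuse
# a serialized copy stored at `cache` on later calls.
read_cached <- function(path, cache = paste0(path, ".rds")) {
  if (file.exists(cache)) {
    return(readRDS(cache))   # fast path: load the stored object
  }
  DT <- fread(path)          # slow path: parse the text file once
  saveRDS(DT, cache)         # store it for next time
  DT
}

# Tiny made-up file so the example is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2,3"), tmp)

first  <- read_cached(tmp)   # parses the file and writes the cache
second <- read_cached(tmp)   # loads the cached copy instead
```

Note that this sketch never invalidates the cache, so the .rds file has to be deleted by hand if the source file changes.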
Edit in light of the comment
I would split the processing up into a number of smaller chunks. You could then exclude the chunk that reads the data from the output, and add a dummy chunk that shows the same code without evaluating it (so the code is visible but its results are not):
```{r libraries}
require(data.table)
```
```{r loaddata, include = FALSE}
DT <- fread('yourfile')
```
```{r loaddummy, ref.label = 'loaddata', eval = FALSE, echo = TRUE}
```
```{r dostuff}
# doing other stuff
```
Upvotes: 12