Reputation: 17631
I'm creating an R package with several files in /data. The usual way to locate a file inside an R package is system.file():
system.file(..., package = "base", lib.loc = NULL, mustWork = FALSE)
The file in /data that I would like to load into an R data.table has the extension *.txt.gz, e.g. my_file.txt.gz. How do I load this into a data.table via read.table() or fread()?
Within the R script, I tried :
#' @import data.table
#' @export
my_function = function(){
  my_table = read.table(system.file("data", "my_file.txt.gz", package = "FusionVizR"), header=TRUE)
}
This leads to an error when run via devtools document():
Error in read.table(system.file("data", "my_file.txt.gz", package = "FusionVizR"), header = TRUE) (from script1.R#7) :
no lines available in input
In addition: Warning message:
In file(file, "rt") :
file("") only supports open = "w+" and open = "w+b": using the former
I appear to get the same issue via fread():
#' @import data.table
#' @export
my_function = function(){
  my_table = fread(system.file("data", "my_file.txt.gz", package = "FusionVizR"), header=TRUE)
}
This outputs the error:
Input is either empty or fully whitespace after the skip or autostart. Run again with verbose=TRUE.
So, it appears that system.file() doesn't return a usable path to the file, so I have nothing to load into an R data.table. How do I do this?
Upvotes: 0
Views: 4362
Reputation: 368201
Do yourself a HUGE favour and study fread() closely: it is one of the very best features in data.table. I have examples (at work) of reading from a pipe of other commands, of reading compressed data and more.
Here is a simple mock example:
R> write.csv(iris, file="/tmp/demo.csv")
R> system("gzip /tmp/demo.csv") # to be very plain
R> fread("zcat /tmp/demo.csv.gz")
V1 Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1: 1 5.1 3.5 1.4 0.2 setosa
2: 2 4.9 3.0 1.4 0.2 setosa
3: 3 4.7 3.2 1.3 0.2 setosa
4: 4 4.6 3.1 1.5 0.2 setosa
5: 5 5.0 3.6 1.4 0.2 setosa
---
146: 146 6.7 3.0 5.2 2.3 virginica
147: 147 6.3 2.5 5.0 1.9 virginica
148: 148 6.5 3.0 5.2 2.0 virginica
149: 149 6.2 3.4 5.4 2.3 virginica
150: 150 5.9 3.0 5.1 1.8 virginica
R>
Seems in my haste I wrote one column too many (the row names), but you get the idea.
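For readers without gzip and zcat on the command line, the same round trip can be sketched in base R alone. This is only an illustration under a few assumptions: the file goes to a temporary path rather than /tmp/demo.csv, the name demo is made up for this sketch, and row.names = FALSE is used so the extra leading column does not appear.

```r
# Write iris to a gzip-compressed CSV at a temporary path.
# row.names = FALSE drops the row-name column that showed up as V1 above.
path <- tempfile(fileext = ".csv.gz")
con <- gzfile(path, open = "wt")
write.csv(iris, con, row.names = FALSE)
close(con)

# read.csv() opens the path with file(), which detects gzip compression
# when reading, so no explicit decompression step is needed.
demo <- read.csv(path)
dim(demo)
```

If a data.table is wanted, the resulting data.frame can be passed to data.table::as.data.table(), assuming data.table is installed.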
Now, you don't even need fread (but it is still more powerful than the alternatives):
R> head(read.csv(file="/tmp/demo.csv.gz"))
X Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1 1 5.1 3.5 1.4 0.2 setosa
2 2 4.9 3.0 1.4 0.2 setosa
3 3 4.7 3.2 1.3 0.2 setosa
4 4 4.6 3.1 1.5 0.2 setosa
5 5 5.0 3.6 1.4 0.2 setosa
6 6 5.4 3.9 1.7 0.4 setosa
R>
R figured out by itself that it needed to decompress the file.
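As for the errors in the question: the warning about file("") suggests that the reader was handed an empty string rather than a path. system.file() returns "" when it finds no matching file, and with the default mustWork = FALSE it does so silently. The sketch below uses the base package only so that it runs anywhere; the file name no-such-file.txt is made up for illustration.

```r
# A file that exists in the installed base package: a full path comes back.
found <- system.file("DESCRIPTION", package = "base")
nzchar(found)

# A file that does not exist: system.file() quietly returns "".
missing <- system.file("no-such-file.txt", package = "base")
identical(missing, "")

# With mustWork = TRUE the lookup fails loudly instead, which is easier
# to debug than a later "no lines available in input" from the reader.
err <- tryCatch(
  system.file("no-such-file.txt", package = "base", mustWork = TRUE),
  error = function(e) e
)
inherits(err, "error")
```

Checking nzchar() on the result, or passing mustWork = TRUE, before calling read.table() or fread() would surface the problem at the lookup rather than at the read.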
Edit: I was editing this question earlier when it was deleted under me, which is about as de-motivating as it gets. In a nutshell:
system.file() works, e.g. file <- system.file("rawdata", "population.csv", package="gunsales") does contain the complete path as the file exists: "/usr/local/lib/R/site-library/gunsales/rawdata/population.csv". But this is easy to mess up. (Needless to say I do have the package and the file.)
Look into the data/ directory and what Writing R Extensions says about it. It is a good mechanism.
Upvotes: 3