Reputation: 587
What is the fastest way to write a vector to a file? I have a character vector with ~2 million elements, each rather long (200 characters). I am currently doing:
write(myVector, "myFile.txt")
But this is extremely slow. I have searched around for solutions, but the fast writing functions (such as fwrite) only take a data frame/matrix as input. Thanks!
Upvotes: 18
Views: 30763
Reputation: 1778
You could use data.table's fwrite:
library(data.table) # install if not installed already
fwrite(list(myVector), file = "myFile.csv")
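For a character vector like the one in the question, a minimal sketch might look like the following. The sample vector and the temporary file are made up for illustration, and passing quote = FALSE and col.names = FALSE explicitly is an assumption about the desired output (one raw string per line, no header), not something stated in the answer above.

```r
library(data.table)

# Hypothetical stand-in for the real 2-million-element character vector
my_vector <- c("first line", "second, with a comma", "third line")
out_file <- tempfile(fileext = ".txt")

# list(my_vector) is a one-column list, so each element becomes one row.
# quote = FALSE keeps strings containing the separator unquoted;
# col.names = FALSE makes sure no header line is written.
fwrite(list(my_vector), file = out_file, quote = FALSE, col.names = FALSE)

# Read the file back to check that every element landed on its own line
round_trip <- readLines(out_file)
identical(round_trip, my_vector)

unlink(out_file)
```

If the strings can contain embedded newlines, this one-element-per-line layout no longer holds, so that case would need separate handling.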
Upvotes: 2
Reputation: 76575
After trying several options, I found the fastest to be data.table::fwrite. As @Gregor says in his first comment, it is faster by an order of magnitude, which makes loading the extra package worthwhile. It is also one of the two methods that produce the biggest files. (The other is readr::write_lines. Thanks to Calum You for the comment, I had forgotten this one.)
library(data.table)
library(readr)
set.seed(1) # make the results reproducible
n <- 1e6
x <- rnorm(n)
t1 <- system.time({
  sink(file = "test_sink.txt")
  cat(x, "\n")
  sink()
})
t2 <- system.time({
  cat(x, "\n", file = "test_cat.txt")
})
t3 <- system.time({
  write(x, file = "test_write.txt")
})
t4 <- system.time({
  fwrite(list(x), file = "test_fwrite.txt")
})
t5 <- system.time({
  write_lines(x, "test_write_lines.txt")
})
rbind(sink = t1[1:3], cat = t2[1:3],
      write = t3[1:3], fwrite = t4[1:3],
      readr = t5[1:3])
#        user.self sys.self elapsed
# sink        4.18    11.64   15.96
# cat         3.70     4.80    8.57
# write       3.71     4.87    8.64
# fwrite      0.42     0.02    0.51
# readr       2.37     0.03    6.66
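One caveat when reading these timings: the methods do not all produce the same file layout. With a numeric vector, cat(x, "\n") puts every value on a single line, and write() defaults to five values per line. A small base R sketch, using a made-up ten-element vector and temporary files, makes this visible. writeLines is not part of the benchmark above and is included here only as a one-value-per-line reference.

```r
# Hypothetical small vector standing in for the 1e6 values in the benchmark
x_small <- c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5)

f_cat   <- tempfile()
f_write <- tempfile()
f_lines <- tempfile()

cat(x_small, "\n", file = f_cat)            # all values, space separated, one line
write(x_small, file = f_write)              # ncolumns defaults to 5 for non-character input
writeLines(as.character(x_small), f_lines)  # one value per line

# Count the lines each method produced
n_lines <- c(cat        = length(readLines(f_cat)),
             write      = length(readLines(f_write)),
             writeLines = length(readLines(f_lines)))
n_lines
#        cat      write writeLines
#          1          2         10

unlink(c(f_cat, f_write, f_lines))
```

For a character vector, as in the question, write() defaults to one element per line, so the layout difference mainly affects the numeric benchmark.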
In his second comment, Gregor notes that as.list and list behave differently. The difference is important: the former writes the vector as one row with many columns, the latter writes one column with many rows.
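The reason is visible in base R before any file is written. A minimal sketch with a made-up three-element vector: as.list splits the vector into one list element per value, while list wraps the whole vector in a single element, and fwrite treats each list element as a column.

```r
# Hypothetical small vector for illustration
x_small <- c(0.1, 0.2, 0.3)

wide <- as.list(x_small)  # one list element per value
tall <- list(x_small)     # a single list element holding the whole vector

length(wide)   # 3 elements, so three columns and one row
length(tall)   # 1 element, so one column
lengths(tall)  # that element has length 3, so three rows
```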
The speed difference is also noticeable:
fw1 <- system.time({
  fwrite(as.list(x), file = "test_fwrite.txt")
})
fw2 <- system.time({
  fwrite(list(x), file = "test_fwrite2.txt")
})
rbind(as.list = fw1[1:3], list = fw2[1:3])
#         user.self sys.self elapsed
# as.list      0.67     0.00    0.75
# list         0.19     0.03    0.11
Final clean up.
unlink(c("test_sink.txt", "test_cat.txt", "test_write.txt",
"test_fwrite.txt", "test_fwrite2.txt", "test_write_lines.txt"))
Upvotes: 22
Reputation: 263421
I found writeBin to be twice as fast as fwrite. Try this:
zz <- file("myFile.txt", "wb")
writeBin( paste(myVector, collapse="\n"), zz )
close(zz)
Using the same timing approach offered by Rui I get (older box):
            user.self sys.self elapsed
sink            9.650    7.900  17.418
cat             6.507    7.870  14.254
write           6.436    7.849  14.171
fwrite          0.500    0.051   0.593
write_lines     4.337    0.150   4.451
writeBin        0.238    0.006   0.242
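One thing worth checking with this approach: writeBin writes a character string as a C-style string, so the file ends with one extra nul byte. A minimal sketch, assuming that byte matters for whatever reads the file, compares it with writeChar(..., eos = NULL), which writes the string without a terminator. The sample vector and temporary files are made up for illustration, and this variant was not part of the timings above.

```r
# Hypothetical small vector standing in for myVector
my_vector <- c("alpha", "beta", "gamma")
txt <- paste(my_vector, collapse = "\n")

# writeBin: the string is written followed by a nul terminator
f_bin <- tempfile(fileext = ".txt")
con <- file(f_bin, "wb")
writeBin(txt, con)
close(con)

# writeChar with eos = NULL: the string is written with no terminator
f_chr <- tempfile(fileext = ".txt")
con <- file(f_chr, "wb")
writeChar(txt, con, eos = NULL)
close(con)

# The two files differ by exactly the trailing nul byte
size_bin <- file.size(f_bin)
size_chr <- file.size(f_chr)
size_bin - size_chr

unlink(c(f_bin, f_chr))
```

Neither variant adds a final newline after the last element, so pasting "\n" onto the end of the string may be needed if the reader expects complete lines.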
Upvotes: 8