Reputation: 65
There are quite a few R posts on a similar topic, but they don't provide what I'm looking for.
Pseudocode (this is NOT meant to be R) for what I want is as follows:
fileConn <- file("foo.txt")
for (i in 1:hiLimit) {
  # extract elements from each nested and variable json element in an R list.
  # paste elements into a comma separated list
  write(pastedStuff,fileConn)
}
close(fileConn)
Now if I skip the 'file' and 'close' and just use 'cat' with a filename and 'append=TRUE' as follows:
cat(paste(cve,vndr,pnm,vnmbr,vaffct,sep=","),file="outfile.txt",append=TRUE,sep="\n")
I get what I want. But presumably (this is an assumption on my part) this opens and closes the file on every call. Avoiding that should make it faster.
What I have not been able to work out is how to achieve the same result with the method in the pseudocode, which opens and closes the file only once. When I use 'cat' or 'writeLines' that way, the file ends up containing only the last line.
By way of explanation, the problem I'm working on involves building a dataframe from scratch row by row. My timings (see below) indicate that by far the fastest way I can do this is to write a csv to disk and then read it back in to create the dataframe. This is crazy but that's the way it's panning out.
## Just the loop without any attempt to collect parsed data into a dataframe
system.time(tmp <- affectsDetails(CVEbase,Affect))
user system elapsed
0.30 0.00 0.29
## Using rbind as in rslt <- rbind(rslt, c(stuff)) to build dataframe in the loop.
system.time(tmp <- affectsDetails(CVEbase,Affect))
user system elapsed
990.46 2.94 994.01
# Preallocate and insert list as per
# https://stackoverflow.com/questions/3642535/creating-an-r-dataframe-row-by-row
system.time(tmp <- affectsDetails(CVEbase,Affect))
user system elapsed
1451.42 0.04 1452.37
# Write to a file with cat and read back the csv.
system.time(tmp <- affectsDetails(CVEbase,Affect))
user system elapsed
10.70 29.00 45.42
Any suggestions appreciated!
Upvotes: 2
Views: 975
Reputation: 6441
I'm not sure this covers everything you need, but you can open a connection once and keep it open until all writing is finished.
testcon <- file(description = "C:/test.txt", open = "a")
isOpen(testcon)
[1] TRUE
start <- Sys.time()
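for (i in 1:10000) {
  # 'append' only applies when 'file' is a file name, so it is dropped here;
  # the connection was already opened in append mode ("a") above.
  cat(paste0("hallo", i), file = testcon, sep = "\n")
}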
end <- Sys.time()
end-start
Time difference of 0.2017999 secs
close(testcon)
This seems to be considerably faster than passing the file name on every call:
start <- Sys.time()
for (i in 1:10000) {
  cat(paste0("hallo", i), file = "C:/test.txt", append = TRUE, sep = "\n")
}
end <- Sys.time()
end-start
Time difference of 3.382569 secs
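As for why the approach in the question leaves only the last line: file("foo.txt") without an 'open' argument creates a connection that is not yet open. When 'cat' or 'writeLines' is handed an unopened connection, it opens it in "wt" mode for that one call and closes it again, and "wt" truncates the file each time. Below is a minimal sketch of the open-once pattern applied to the CSV round trip described in the question. The rows, the column names, and the temporary file path are hypothetical placeholders, not the asker's data, and the sketch assumes no field contains a comma or a quote.

```r
# Hypothetical rows standing in for the fields parsed from each JSON element.
rows <- list(
  c("id-1", "vendorA", "productA", "1.0", "yes"),
  c("id-2", "vendorB", "productB", "2.1", "no"),
  c("id-3", "vendorC", "productC", "3.5", "yes")
)

# Placeholder output path; substitute a real file name as needed.
out_path <- tempfile(fileext = ".csv")

# open = "w" opens the connection right away, and it stays open until close().
con <- file(out_path, open = "w")

# Header line (column names are assumptions borrowed from the variable names
# in the question's cat() call).
writeLines("cve,vndr,pnm,vnmbr,vaffct", con)

for (r in rows) {
  # Each call writes one line to the already-open connection.
  writeLines(paste(r, collapse = ","), con)
}

close(con)

# Read the file back in one go to build the data frame.
result <- read.csv(out_path, stringsAsFactors = FALSE)
```

'writeLines' is used here because it terminates every element with a newline, so each row ends up on its own line. If fields can contain commas or quotes, the lines would need proper quoting before being written.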
Upvotes: 1