Reputation: 103
Help me figure out what I am doing wrong!
I have about 20 .csv files (product feeds) online. I used to be able to fetch them all, but now R crashes if I fetch more than one or two. Each file is about 50K rows by 30 columns.
I guess it's a memory issue, but I've tried on a different computer with exactly the same result. Could some formatting in the files make R use too much memory? Or what else could it be?
If I run one of these, everything is fine. Two sometimes works. Three almost certainly crashes:
a <- read.csv("URL1")
b <- read.csv("URL2")
c <- read.csv("URL3")
I have tried specifying all sorts of options, like:
d <- read.csv("URL4",skipNul=TRUE,sep=",",stringsAsFactors=FALSE,header=TRUE)
I keep getting this message:
R session aborted. R encountered a fatal error. The session was terminated.
We have some commercial software that can fetch the same files without issues, so the files themselves should be fine. And my script ran twice daily for several months without problems.
R version 3.6.1
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Upvotes: 1
Views: 1971
Reputation: 11
I have had this issue as well, but with read_csv(). I haven't figured out the exact cause yet; my best guess is that reading a file and assigning it to a variable at the same time is too much for memory or the CPU to handle.
Based on that guess, I tried this method, and it has worked perfectly for me:
library(dplyr)
a <- read.csv("URL1") %>% as_tibble()
# you can use other data types instead of tibble; this is just my example
The whole idea is to separate the reading step from the assignment step with a pipe, which ensures that one must finish before the next can start.
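As a further sketch of the same idea (the file paths here are hypothetical stand-ins; with real feeds you would `download.file()` each URL to a temporary file first and then read the local copy, so no URL connection stays open while parsing):

```r
library(dplyr)

# Hypothetical stand-ins for the online feeds: two small local CSVs.
paths <- replicate(2, tempfile(fileext = ".csv"))
write.csv(data.frame(id = 1:3, price = c(9.5, 7.2, 3.1)),
          paths[1], row.names = FALSE)
write.csv(data.frame(id = 4:6, price = c(1.0, 2.0, 3.0)),
          paths[2], row.names = FALSE)

# Read each file in its own step, piping the result into as_tibble()
# as in the snippet above, and collect them in a list.
feeds <- lapply(paths, function(p) {
  read.csv(p, stringsAsFactors = FALSE) %>% as_tibble()
})

# Drop any stray connections between batches.
closeAllConnections()

nrow(feeds[[1]])  # 3
```

Reading the files one at a time inside `lapply()` also means a failure on one feed doesn't take the others down with it, and you can wrap the body in `tryCatch()` if you want to skip broken feeds.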
Upvotes: 1