Reputation: 15
I'm currently trying to write an R script to import a variety of files I've created related to a dataset. This involves reading a lot of .txt files using several nested for loops based on how I've organized the directories and names of the files.
I can run the innermost loop fine (albeit a little slowly). However, trying to run the second loop or any further loops produces the following error:
Error: vector memory exhausted (limit reached?)
I believe this may be related to how R handles memory. I'm running R from RStudio. I've also tried the solution posted here, with no luck.
R version 3.5.1 (2018-07-02) -- "Feather Spray"
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Code below:
subjects <- 72
loop1_names <- as.character(list('a','b','c'))
loop2_names <- as.character(list('one','two','three'))
loop3_names <- as.character(list('N1','N2'))
loop4_names<- as.character(list('choice1','choice2','choice3'))
i<-1;j<-1;
loop3.subset <- data.frame()#Data frame for handling each set of loop 3 values
for(k in 1:length(loop3_names)){
loop4.subset<- data.frame()#Data frame for handling each set of loop 4 values
for(l in 1:length(loop4_names)){
#Code for extracting the variables for each measure
measures.path <- file.path(results_fldr, 'amp_measures',loop1_names[i],loop2_names[j],'mont',loop3_names[k])
measures.data <- read.table(file.path(measures.path, paste(paste(loop1_names[i],loop2_names[j],loop3_names[k],loop4_names[l],sep = '_'),'.txt',sep = '')), header = T, nrows = subjects)
#Get rid of the IDs, we'll add those back in later
col_idx_ID <- grep('ID', names(measures.data))
measures.data <- as.data.frame(measures.data[,-col_idx_ID])# make sure when trimming to keep the measures as a data frame
names(measures.data) <- c(paste(loop1_names[i],loop2_names[j],loop3_names[k],loop4_names[l],sep = '_'))#Add a label to the data
#Now combine this data with the other data in the loop4 subset data frame
if(l == 1){
loop4.subset <- measures.data
} else {
loop4.subset <- merge(loop4.subset,measures.data)
}
}#End l/loop 4
if(k == 1){
loop3.subset <- loop4.subset
} else {
loop3.subset <- merge(loop3.subset,loop4.subset)
}
}#End k/loop 3
Upvotes: 0
Views: 457
Reputation: 1234
Generally I would suggest reading only part of the data into memory, then writing each partial merge to disk. In the example below (which of course I can't run, because I don't have your files) I write to disk after each i, j loop, which leaves 9 files once that is done. You then merge those 9 files in another loop. If you still have memory problems, split this into one more stage: first do the "j" merge and write the result for each "i" to its own file, giving 3 files. If you then can't merge those files, you have a fundamental problem with lack of memory on your machine.
subjects <- 72
loop1_names <- as.character(list('a','b','c'))
loop2_names <- as.character(list('one','two','three'))
loop3_names <- as.character(list('N1','N2'))
loop4_names <- as.character(list('choice1','choice2','choice3'))
## results_fldr is assumed to be defined already, as in the question
for(i in seq_along(loop1_names)) {
  for(j in seq_along(loop2_names)) {
    loop3.subset <- data.frame()
    for(k in seq_along(loop3_names)) {
      loop4.subset <- data.frame()
      for(l in seq_along(loop4_names)) {
        ## Label shared by the file name and the column name
        measure.label <- paste(loop1_names[i],
                               loop2_names[j],
                               loop3_names[k],
                               loop4_names[l],
                               sep = '_')
        ## Code for extracting the variables for each measure
        measures.path <- file.path(results_fldr,
                                   'amp_measures',
                                   loop1_names[i],
                                   loop2_names[j],
                                   'mont',
                                   loop3_names[k])
        measures.data <- read.table(file.path(measures.path,
                                              paste0(measure.label, '.txt')),
                                    header = TRUE, nrows = subjects)
        ## Get rid of the IDs, we'll add those back in later
        col_idx_ID <- grep('ID', names(measures.data))
        measures.data <- as.data.frame(measures.data[,-col_idx_ID])
        names(measures.data) <- measure.label
        ## Now combine this data with the other data in the loop4 subset data frame
        if(l == 1){
          loop4.subset <- measures.data
        } else {
          loop4.subset <- merge(loop4.subset, measures.data)
        }
      }#End l/loop 4
      if(k == 1){
        loop3.subset <- loop4.subset
      } else {
        loop3.subset <- merge(loop3.subset, loop4.subset)
      }
    }#End k/loop 3
    ## One file per (i, j) combination, e.g. "1_1.txt"
    write.table(loop3.subset, paste0(i, "_", j, ".txt"))
  }
}
## Now you have 9 files to read in and merge.
## Something like this:
df <- NULL
for(i in seq_along(loop1_names)) {
  for(j in seq_along(loop2_names)) {
    df1 <- read.table(paste0(i, "_", j, ".txt"))
    ## Start from the first file; merge(NULL, df1) would not return df1
    df <- if (is.null(df)) df1 else merge(df, df1)
  }
}
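A minimal sketch of the same staging idea on toy data, in case it helps to test it before touching the real files. Everything here is an assumption for illustration: `make_measure` is a hypothetical helper that fabricates a 72-row measure with an `ID` column, the files go to a temporary directory, and the `ID` column is kept rather than dropped. Keeping a shared key matters because `merge()` falls back to the Cartesian product when its inputs have no column names in common, so row counts multiply at every step. That may be part of what exhausts memory in the original loops, though I can't confirm it without the files.

```r
# Toy stand-ins for the real measure files (hypothetical data, not the asker's).
set.seed(1)
subjects <- 72
make_measure <- function(label) {
  out <- data.frame(ID = seq_len(subjects), value = rnorm(subjects))
  names(out)[2] <- label
  out
}

a <- make_measure("a_one_N1_choice1")
b <- make_measure("a_one_N1_choice2")

# No shared column names: merge() returns every pairing of rows.
no_key <- merge(a["a_one_N1_choice1"], b["a_one_N1_choice2"])
nrow(no_key)  # subjects^2 rows

# Shared ID column: rows are matched one-to-one.
keyed <- merge(a, b, by = "ID")
nrow(keyed)   # subjects rows

# Staged version: write one keyed partial per (i, j), then fold them together.
stage_dir <- tempfile("stage_")
dir.create(stage_dir)
loop1_names <- c("a", "b", "c")
loop2_names <- c("one", "two", "three")
for (i in loop1_names) {
  for (j in loop2_names) {
    partial <- make_measure(paste(i, j, sep = "_"))
    write.table(partial,
                file.path(stage_dir, paste0(i, "_", j, ".txt")),
                row.names = FALSE)
  }
}

stage_files <- list.files(stage_dir, full.names = TRUE)
partials <- lapply(stage_files, read.table, header = TRUE)
combined <- Reduce(function(x, y) merge(x, y, by = "ID"), partials)
dim(combined)  # one row per subject, ID plus one column per partial
```

If the keyed merge keeps the row count at `subjects`, the same `nrow()` check can be dropped into the real loops after each merge to see where the size starts to grow.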
Upvotes: 1