Reputation: 8806
I have a 64-bit Windows 7 machine with 8GB RAM. memory.limit() shows 8135. I ran into a memory issue even though what I'm trying to do does not seem humongous at all (compared to other memory-related questions on SO).
Basically I'm matching each firm's ID with its industry. ref.table is the data frame where I store ID and industry for reference.
matchid <- function(id) {
  firm.industry <- ref.table$industry[ref.table$id==id]
  firm.industry <- as.character(firm.industry[1]) # Sometimes same ID has multiple industries. I just pick one.
  resid <<- c(resid, firm.industry)
}
resid <- c()
invisible( lapply(unmatched.id, matchid) ) # unmatched.id is the vector of firms' ID to be matched
The unmatched.id vector is about 60,000 elements long. Still I got the error "Cannot allocate vector of 41.8kb size" (only 41.8kb!). Windows Task Manager shows full RAM usage at all times.
Is it because my function is too unwieldy somehow? I can't imagine it's the vector size causing problems.
(PS: I do gc() and rm() frequently)
Upvotes: 0
Views: 338
Reputation: 46856
I think you're looking up a vector of unmatched IDs in ref.table$id and finding the corresponding index:
## first match, one for each unmatched.id, NA if no match
idx <- match(unmatched.id, ref.table$id)
## matching industries
resid <- ref.table$industry[idx]
This is 'vectorized', so it is much more efficient than an lapply that scans ref.table$id once per ID and grows resid with c() on every call.
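To make the match() approach concrete, here is a minimal sketch on toy data. The contents of ref.table and unmatched.id below are made up for illustration (the real ones are not shown in the question), and the as.character() call is only there to mirror the conversion in the question's function.

```r
# Hypothetical toy data standing in for the real ref.table and unmatched.id
ref.table <- data.frame(
  id = c(1, 2, 2, 3),
  industry = c("Mining", "Retail", "Banking", "Software"),
  stringsAsFactors = TRUE
)
unmatched.id <- c(2, 4, 1)

# Position of the FIRST row whose id equals each unmatched.id; NA if absent
idx <- match(unmatched.id, ref.table$id)

# One industry per requested ID, in the same order as unmatched.id
resid <- as.character(ref.table$industry[idx])
print(resid)
```

Because match() only ever returns the first hit, an ID with several industries (ID 2 here) still yields exactly one value, and an ID missing from ref.table (ID 4 here) yields NA, so resid always has the same length as unmatched.id.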
Upvotes: 2
Reputation: 55340
Try the following to see if it stops giving you memory complaints:
lapply(unmatched.id, function(id) as.character(ref.table$industry[ref.table$id==id]))
If the above works, then wrap it in unlist(..., use.names=FALSE).
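One caveat with the unlist() route: an ID with several industries contributes several elements and an ID with none contributes zero, so the result may not line up with unmatched.id. A possible variant, sketched here with vapply() as my own substitution rather than anything from the question, keeps exactly one value per ID and avoids growing a global vector with <<-. The toy data and the helper name first.industry are hypothetical.

```r
# Hypothetical toy data standing in for the real ref.table and unmatched.id
ref.table <- data.frame(
  id = c(1, 2, 2, 3),
  industry = c("Mining", "Retail", "Banking", "Software"),
  stringsAsFactors = TRUE
)
unmatched.id <- c(2, 4, 1)

# Return the first matching industry for one ID, or NA when the ID is absent
first.industry <- function(id) {
  as.character(ref.table$industry[ref.table$id == id][1])
}

# vapply() builds the whole result vector itself, so nothing is
# re-copied with c() on every iteration
resid <- vapply(unmatched.id, first.industry, character(1), USE.NAMES = FALSE)
print(resid)
```

This still scans ref.table$id once per ID, so it is slower than a single vectorized lookup, but the result always has one entry per element of unmatched.id.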
Alternatively, use a keyed data.table lookup:
library(data.table)
ref.table <- data.table(ref.table, key="id")
ref.table[.(unmatched.id), as.character(industry)]
Upvotes: 3