Heisenberg

Reputation: 8806

Memory error despite having ample RAM

I have a 64-bit Windows 7 machine with 8 GB of RAM; memory.limit() shows 8135. I ran into a memory issue even though what I'm trying to do does not seem humongous at all (compared to other memory-related questions on SO).

Basically, I'm matching firms' IDs with their industries. ref.table is the data frame where I store the ID and industry for reference.

matchid <- function(id) {
  firm.industry <- ref.table$industry[ref.table$id==id]
  firm.industry <- as.character(firm.industry[1]) # Sometimes the same ID has multiple industries; I just pick one.
  resid <<- c(resid, firm.industry)
}
resid <- c()
invisible( lapply(unmatched.id, matchid) ) # unmatched.id is the vector of firm IDs to be matched

The unmatched.id vector is about 60,000 elements long. Still, I got the error "cannot allocate vector of size 41.8 Kb" (only 41.8 Kb!). Windows Task Manager shows full RAM usage at all times.

Is it because my function is too unwieldy somehow? I can't imagine it's the vector size causing problems.

(PS: I call gc() and rm() frequently.)

Upvotes: 0

Views: 338

Answers (2)

Martin Morgan

Reputation: 46856

I think you're looking up a vector of unmatched ids in ref.table$id and finding the corresponding indices:

## first match, one for each unmatched.id, NA if no match
idx <- match(unmatched.id, ref.table$id)
## matching industries
resid <- ref.table$industry[idx]

This is 'vectorized', so it is much more efficient than an lapply.
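For instance, on made-up data (the names and values here are just for illustration):

## toy reference data; the same id can map to more than one industry
ref.table <- data.frame(id       = c("A001", "A002", "A002", "B003"),
                        industry = c("Mining", "Retail", "Finance", "Retail"))
unmatched.id <- c("A002", "B003", "Z999")

idx <- match(unmatched.id, ref.table$id)        ## c(2, 4, NA) -- first match only
resid <- as.character(ref.table$industry[idx])  ## "Retail", "Retail", NA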

Upvotes: 2

Ricardo Saporta

Reputation: 55340

Try the following to see if it stops giving you memory complaints:

 lapply(unmatched.id, function(id) as.character(ref.table$industry[ref.table$id==id]))

If the above works, then wrap it in unlist(..., use.names=FALSE).
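Roughly like so (note that an id with several industries contributes several entries, and an id with no match contributes none):

resid <- unlist(
  lapply(unmatched.id, function(id) as.character(ref.table$industry[ref.table$id == id])),
  use.names = FALSE
)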

Or try using data.table:

library(data.table)
ref.table <- data.table(ref.table, key="id") 
ref.table[.(unmatched.id), as.character(industry)]
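For what it's worth, here is a sketch of how that join behaves on made-up data (an id with several industries comes back once per match, and an id with no match comes back as NA):

library(data.table)

# made-up reference data; key = "id" sorts by id and enables the keyed join
ref.table <- data.table(id       = c("A001", "A002", "A002", "B003"),
                        industry = c("Mining", "Retail", "Finance", "Retail"),
                        key      = "id")
unmatched.id <- c("A002", "B003", "Z999")

ref.table[.(unmatched.id), as.character(industry)]
# returns "Retail", "Finance", "Retail", NA
# add mult = "first" to keep only the first industry per id,
# mirroring the "I just pick one" step in the question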

Upvotes: 3
