Reputation: 87
I'm using RStudio and want to identify the variables using the most memory, then clear them out before saving the project session.
Based on various sources, I managed to do it as follows:
# One row per workspace object: name, size in bytes, human-readable size
env <- data.frame(
  "var" = ls(),
  "size" = sapply(ls(), function(x) object.size(get(x))),
  "sizef" = sapply(ls(), function(x) format(object.size(get(x)), units = "auto"))
)
I was able to do that with base subsetting. In this case, why does order(-env$size) work but order(-size) throw an error?
head(env[order(-env$size),],10)
I also made a first attempt with dplyr.
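library(dplyr)
env %>%
  filter(size >= 1e8) %>%   # keep objects of at least 1e8 bytes (about 100 MB)
  arrange(desc(size)) %>%   # largest first
  head(10)                  # top_n(10) without wt would rank by the last column, sizef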
As I often find at the start of my R journey, I can't tell which of these methods, if any, I should be using. In terms of clarity, speed, flexibility, ease of use, and how quick it is to code, what is best practice?
Upvotes: 2
Views: 1146
Reputation: 5201
A shorter way to get all variable names would be to access the global environment directly:
sort(sapply(.GlobalEnv, object.size)) # a sorted, named, numeric vector
To get the largest n objects, you can then use tail:
tail(sort(sapply(.GlobalEnv, object.size)), n)
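As a rough sketch of how this could feed into the cleanup step from the question, the names of that vector can be passed to rm(). The environment demo_env and the objects in it are hypothetical stand-ins for .GlobalEnv, used here so the example does not touch a real workspace, and n = 2 is an arbitrary choice:

```r
# Hypothetical stand-in for .GlobalEnv so nothing real gets deleted
demo_env <- new.env()
assign("small",  1:10,         envir = demo_env)
assign("medium", numeric(1e4), envir = demo_env)
assign("large",  numeric(1e6), envir = demo_env)

# Named numeric vector of object sizes in bytes, smallest first
sizes <- sort(sapply(demo_env, object.size))

# Names of the n largest objects (n = 2 is an arbitrary choice)
n <- 2
largest <- names(tail(sizes, n))

# Remove those objects from the environment and list what is left
rm(list = largest, envir = demo_env)
remaining <- ls(demo_env)
```

With .GlobalEnv in place of demo_env the same calls would act on the actual session, so it is probably worth inspecting largest before running rm().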
If you want it as a data.frame:
data.frame(size = sort(sapply(.GlobalEnv, object.size))) # object name is the name of each row
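On the side question of why order(-size) fails: size only exists as a column of the data frame, not as a variable in the workspace, so base R cannot find it unless the data frame is named explicitly or supplied through with(). dplyr verbs such as arrange() look column names up inside the data frame, which is why the bare name works there. A small sketch, using a hypothetical data frame env_demo in place of the real env:

```r
# Hypothetical data frame mimicking the structure of env in the question
env_demo <- data.frame(
  var  = c("a", "b", "c"),
  size = c(200, 5000, 10)
)

# Explicit column reference: works
by_dollar <- env_demo[order(-env_demo$size), ]

# with() evaluates the expression inside the data frame, so the bare
# column name is found
by_with <- env_demo[with(env_demo, order(-size)), ]

# Without either, R looks for a variable called size in the calling
# environment; tryCatch captures the resulting error message, if any
bare <- tryCatch(order(-size), error = function(e) conditionMessage(e))
```

Both by_dollar and by_with hold the same rows sorted from largest to smallest, while bare holds the error message as long as no variable named size exists in the workspace.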
Upvotes: 2