kavmeister

Reputation: 87

Global environment variable memory use as a table

I'm using RStudio and looking to identify the variables using the most memory so I can clear them out before saving the project session.

  1. What's the best way to create a data frame with variable names and sizes?

Based on various sources, I managed to do it as follows:

env <- data.frame(
  var   = ls(),
  size  = sapply(ls(), function(x) object.size(get(x))),
  sizef = sapply(ls(), function(x) format(object.size(get(x)), units = "auto"))
)
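One caveat I'm assuming matters here: ls() and get() look at the calling environment, so the snippet above only captures the global workspace when run at the top level. Wrapping it in a (hypothetical) helper with an explicit environment argument avoids that:

# Hypothetical helper; envir made explicit so it still sees the global
# environment when called from inside a function
env_sizes <- function(envir = globalenv()) {
  vars <- ls(envir = envir)
  data.frame(
    var   = vars,
    size  = sapply(vars, function(x) object.size(get(x, envir = envir))),
    sizef = sapply(vars, function(x) format(object.size(get(x, envir = envir)), units = "auto"))
  )
}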
  2. What's the best way to sort the list by size and output top results?

I was able to do that with base subsetting. In this case, why does order(-env$size) work but order(-size) throws an error?

head(env[order(-env$size),],10)

I also made my first use of dplyr.

library(dplyr)
env %>%
  arrange(desc(size)) %>%
  filter(size >= 1e8) %>%
  top_n(10, size)  # name the size column; otherwise top_n() selects by the last column (sizef)
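My guess (unverified) at the order()/arrange() difference: dplyr evaluates size inside the data frame, while base order() looks for size in the global environment, where no such variable exists. with() seems to give base R the same behaviour:

head(env[with(env, order(-size)), ], 10)  # evaluate order(-size) inside env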

As I'm often finding at the start of my R journey, I can't tell which of these methods, if any, I should be using. In terms of clarity, speed, flexibility, ease of use, quickest to code, etc., what is best practice?

Upvotes: 2

Views: 1146

Answers (1)

C. Braun

Reputation: 5201

A shorter way to get the size of every variable is to access the global environment directly:

sort(sapply(.GlobalEnv, object.size)) # a sorted, named, numeric vector

To get the largest n objects, you can then use tail:

tail(sort(sapply(.GlobalEnv, object.size)), n)

If you want it as a data.frame:

data.frame(size = sort(sapply(.GlobalEnv, object.size))) # object name is the name of each row
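If you also want a human-readable column like sizef in the question, one possible sketch uses eapply(), which iterates directly over an environment and keeps the object_size class, so format() can be applied per object:

sizes <- eapply(.GlobalEnv, object.size)              # named list of object_size values
out <- data.frame(
  var   = names(sizes),
  size  = unlist(sizes),                              # bytes, as numeric
  sizef = vapply(sizes, format, character(1), units = "auto")
)
out[order(-out$size), ]                               # largest objects first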

Upvotes: 2
