Anton von Schantz

Reputation: 165

How to make saving ggplot2 objects more efficient?

I am trying to produce weighted density plots with R using the ggplot2 package and save them as .png files. In my code I am producing 100-1000 of these plots, with different geographical coordinates.

The problem is that even when my data set has only 1,500 points, the ggsave function becomes really slow: it takes approximately 100 s to save a single plot. As far as I understand, the inefficiency comes from the fact that the ggplot2 objects I'm plotting are grids and ggsave has to print them before saving them.

So, is there any way to make saving these ggplot2 objects more efficient? I mean any way other than lowering the resolution of the kde2d density estimate, which would indeed make the data frame to be plotted smaller.

I have provided a minimal working example that produces one of the .png files. If you wrap the ggsave call in system.time(), you will see that it takes around 100 s to run.

library(MASS)
library(ggplot2)
library(grid)


x <- runif(1550, 0, 100)
y <- runif(1550, 0, 100)
wg <- runif(1550, 0, 1)

data <- data.frame(x, y, wg)


source("C:/Users/cpt2avo/Documents/R/kde2dweighted.r")
dens <- kde2d.weighted(data$x, data$y, data$wg)
dfdens <- data.frame(expand.grid(x=dens$x, y=dens$y), z=as.vector(dens$z))

p <- ggplot(data, aes(x = x, y = y)) +
  stat_contour(data = dfdens, geom = "polygon", bins = 20, alpha = 0.2,
               aes(x = x, y = y, z = z, fill = ..level..)) +
  scale_fill_continuous(low = "green", high = "red") +
  scale_alpha(range = c(0,1), limits = c(0.5, 1), na.value = 0) +
  labs(x = NULL, y = NULL) +
  theme(axis.title = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank(),
        axis.line = element_blank(),
        plot.margin = unit(c(0,0,-0.5,-0.5), "line"),
        panel.border = element_blank(),
        panel.grid = element_blank(),
        panel.margin = unit(c(0,0,0,0), "mm"),
        legend.position = "none",
        plot.background = element_rect(fill = "transparent", colour = NA),
        panel.background = element_blank())

system.time(ggsave(p, file = "C:/Users/cpt2avo/Documents/R/example.png", width = 2, height = 2, units = "in", dpi = 128))

kde2d.weighted is a function for computing 2D weighted kernel density estimates:

kde2d.weighted <- function(x, y, w, h, n = 25, lims = c(range(x), range(y))) {
  nx <- length(x)
  if (length(y) != nx)
    stop("data vectors must be the same length")
  # Default to equal weights; this has to happen before length(w) is checked
  if (missing(w))
    w <- numeric(nx) + 1
  if (length(w) != nx && length(w) != 1)
    stop("weight vectors must be 1 or length of data")
  gx <- seq(lims[1], lims[2], length.out = n) # gridpoints x
  gy <- seq(lims[3], lims[4], length.out = n) # gridpoints y
  if (missing(h))
    h <- c(bandwidth.nrd(x), bandwidth.nrd(y))
  h <- h / 4
  ax <- outer(gx, x, "-") / h[1] # distance of each point to each grid point in x-direction
  ay <- outer(gy, y, "-") / h[2] # distance of each point to each grid point in y-direction
  # z is the density
  z <- (matrix(rep(w, n), nrow = n, ncol = nx, byrow = TRUE) * matrix(dnorm(ax), n, nx)) %*%
    t(matrix(dnorm(ay), n, nx)) / (sum(w) * h[1] * h[2])
  list(x = gx, y = gy, z = z)
}
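To check where the time actually goes, the build and render stages can be timed separately instead of timing ggsave as a whole. The following is only a minimal sketch under some assumptions: it uses the unweighted MASS::kde2d as a stand-in for kde2d.weighted so that it runs on its own, writes to a temporary file, and uses after_stat(level), which needs ggplot2 3.3.0 or later. The timings will differ from machine to machine.

```r
library(MASS)
library(ggplot2)
library(grid)

set.seed(1)
x <- runif(1550, 0, 100)
y <- runif(1550, 0, 100)

# Assumption: unweighted MASS::kde2d stands in for kde2d.weighted (same 25 x 25 grid)
dens <- kde2d(x, y, n = 25)
dfdens <- data.frame(expand.grid(x = dens$x, y = dens$y), z = as.vector(dens$z))

p <- ggplot(dfdens, aes(x = x, y = y, z = z)) +
  stat_contour(geom = "polygon", bins = 20, alpha = 0.2,
               aes(fill = after_stat(level)))

out <- tempfile(fileext = ".png")  # temporary output file, not the path from the question

# Stage 1: compute the statistics and scales
t_build <- system.time(built <- ggplot_build(p))
# Stage 2: convert the built plot into grid graphical objects
t_gtable <- system.time(gt <- ggplot_gtable(built))
# Stage 3: draw the grobs on a png device
t_draw <- system.time({
  png(out, width = 2, height = 2, units = "in", res = 128)
  grid.draw(gt)
  dev.off()
})

# Elapsed seconds per stage
print(c(build = t_build[["elapsed"]],
        gtable = t_gtable[["elapsed"]],
        draw = t_draw[["elapsed"]]))
```

If most of the time turns out to be spent in the first two stages, switching the output device is unlikely to help much; if it is spent in the drawing stage, trying a different device is a reasonable next step.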

Upvotes: 9

Views: 3897

Answers (1)

MajkWu

Reputation: 46

@AntonvSchantz I ran into the same problem and had a very similar experience. Indeed, it is exporting to high-resolution png via ggsave() that makes this process slow. My solution was to export to pdf instead, by doing something like:

Above your plot creation:

pdf(paste("plots/my_filename", rn, ".pdf", sep = ""), width = 11, height = 8)

Below your plot creation:

dev.off()
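Put together, the pattern might look like the sketch below. It is only an illustration: rn is assumed here to be a loop index, the scatter plot is a placeholder for the real density plot, and the output directory is a hypothetical temporary folder. Note that inside a loop the plot has to be printed explicitly, otherwise nothing is drawn on the pdf device.

```r
library(ggplot2)

# Hypothetical output directory; the answer writes to "plots/" instead
out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE)

for (rn in 1:3) {  # assumption: rn is the index of the current plot
  # Placeholder data and plot standing in for the real density plot
  df <- data.frame(x = runif(50), y = runif(50))
  p <- ggplot(df, aes(x = x, y = y)) + geom_point()

  # Open the pdf device above the plot creation ...
  pdf(file.path(out_dir, paste("my_filename", rn, ".pdf", sep = "")),
      width = 11, height = 8)
  print(p)  # ggplot objects are not auto-printed inside a loop
  # ... and close it below
  dev.off()
}

pdf_files <- list.files(out_dir, pattern = "^my_filename[0-9]+\\.pdf$")
```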

Upvotes: 3
