totallyuneekname

Reputation: 2020

R runs out of memory plotting data frame with ggplot2

I'm running R on Fedora 31 on a Dell XPS laptop with 8 GB of RAM. I'm attempting to plot this GeoTIFF using ggplot2, so that I can overlay other data using code I've already written with ggplot2. I've been roughly following this lesson on working with raster data in R. After converting the TIFF into a RasterLayer and then into a data frame, the R program fails when plotting the data frame with ggplot2, simply outputting "Killed" and exiting.

Here is a minimal code sample that produces this error:

library(tidyverse)
library(raster)
library(rgdal)

afg_pop <- raster("afg_ppp_2020.tif")
pop_df <- as.data.frame(afg_pop, xy = TRUE)

ggplot() +
    # This is the line that triggers the failure: R prints "Killed" and exits
    geom_raster(data = pop_df , aes(x = x, y = y, fill = afg_ppp_2020))

Running dmesg reveals that R ran out of memory:

 [20563.603882] Out of memory: Killed process 42316 (R) total-vm:11845908kB, anon-rss:6878420kB, file-rss:4kB, shmem-rss:0kB, UID:1000 pgtables:19984kB oom_score_adj:0

It's hard for me to believe that, even with a data file this large, R runs out of memory while handling it. Why does R need so much memory to perform this task, and more importantly, what other method can I use to plot this data, preferably using ggplot2?

I'm relatively new to R, so please forgive me if I'm ignoring something obvious here. Any help would be appreciated!

Upvotes: 1

Views: 1354

Answers (1)

Robert Hijmans

Reputation: 47051

I cannot speak to the memory requirements of ggplot2, but the spatial resolution of the data is very high (~90 m). There is no point in asking ggplot2 to draw 10955 (rows) * 17267 (columns) = 189,159,985 pixels, as you won't be able to see them (unless, perhaps, you are printing a billboard). A simple workaround is to take a regular sample, or to aggregate:

f <- "ftp://ftp.worldpop.org.uk/GIS/Population/Global_2000_2020/2020/AFG/afg_ppp_2020.tif"
if (!file.exists(basename(f))) download.file(f, basename(f), mode="wb")

library(raster)
afg_pop <- raster("afg_ppp_2020.tif")
pop_df <- data.frame(sampleRegular(afg_pop, 10000, xy=TRUE))

library(ggplot2)
ggplot() + geom_raster(data = pop_df , aes(x = x, y = y, fill = afg_ppp_2020))

A better alternative, which takes a little longer, is to aggregate the cells:

afg_pop2 <- aggregate(afg_pop, 10) # this takes some time
pop_df2 <- as.data.frame(afg_pop2, xy=TRUE)
ggplot() + geom_raster(data = pop_df2 , aes(x = x, y = y, fill = afg_ppp_2020))
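One thing to keep in mind: `aggregate` uses `fun = mean` by default. If the cell values are counts per cell (the `ppp` in the file name suggests this, but that is an assumption here), `fun = sum` would preserve the totals. A minimal sketch on a small synthetic raster, not the real file:

```r
library(raster)

# Small synthetic raster standing in for the population grid.
# Assumption: each cell holds a count (here, 1 per cell).
r <- raster(nrows = 100, ncols = 100)
values(r) <- 1

# Default (fun = mean): each new cell is the average of a 10 x 10 block
a_mean <- aggregate(r, fact = 10)

# fun = sum: each new cell is the total of a 10 x 10 block,
# so the grand total over the raster is unchanged
a_sum <- aggregate(r, fact = 10, fun = sum)

# Compare the grand totals before and after aggregation
cellStats(r, "sum")
cellStats(a_mean, "sum")
cellStats(a_sum, "sum")
```

Whether mean or sum is the right choice depends on what the map should show: a mean keeps the per-cell scale of the original legend, while a sum keeps the total.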

The maps are not very nice; there are better options in other R packages for making maps.
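As a rough check on why the full data frame is a problem on an 8 GB machine, the size of `pop_df` can be estimated from the dimensions above. This is only a back-of-the-envelope sketch: it assumes three columns (x, y, and the value) stored as 8-byte doubles, and it ignores any copies ggplot2 may make internally.

```r
# Raster dimensions quoted in the answer
n_rows <- 10955
n_cols <- 17267
n_cells <- n_rows * n_cols  # doubles in R, so no integer overflow

# Assumption: 3 columns (x, y, value), each an 8-byte double
n_columns <- 3
bytes_per_value <- 8
df_bytes <- n_cells * n_columns * bytes_per_value

cat(sprintf("cells: %s\n", format(n_cells, big.mark = ",")))
cat(sprintf("estimated data frame size: %.1f GiB\n", df_bytes / 1024^3))
```

Under these assumptions the data frame alone takes more than half of the available RAM before any plotting starts, so a single extra copy would be enough to exhaust it.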

Upvotes: 2
