vestland

Reputation: 61174

ggplot2: How to inspect every element of a plot using ggplot_build()?

Is there a way to search the entire output from ggplot_build() (or any other function), almost like searching the complete content of every subdirectory of a folder?


The details:

I was looking for a solution to Retrieve values for axis labels in ggplot2_3.0.0, and one of the early answers revealed that, depending on the ggplot2 version, the correct answer most likely would contain the parts $layout and / or $x.labels in the output from ggplot_build(g). So I started checking the ggplot_build() output each step of the way. One of the steps looks like the output below.

Snippet 1:

ggplot_build(g)$layout

Output 1:

<ggproto object: Class Layout, gg>
    coord: <ggproto object: Class CoordCartesian, Coord, gg>
        aspect: function
        clip: on

        [...]

    map_position: function
    panel_params: list
    panel_scales_x: list
    panel_scales_y: list
    render: function

        [...]

    ylabel: function
    super:  <ggproto object: Class Layout, gg>
>

And deep down there, under panel_params, x.labels can be found along with lots of other useful information, like this:

Snippet 2:

ggplot_build(g)$layout$panel_params

Output 2:

[[1]]
[[1]]$`x.range`
[1]  7.7 36.3

[[1]]$x.labels
[1] "10" "15" "20" "25" "30" "35"

[[1]]$x.major
[1] 0.08041958 0.25524476 0.43006993 0.60489510 0.77972028 0.95454545

And it can be referenced directly like this:

Snippet 3:

ggplot_build(g)$layout$panel_params[[1]]$x.labels

Output 3:

[1] "10" "15" "20" "25" "30" "35"

My attempt for a more elegant approach:

I was certain I could do this with capture.output() like you can with str() as described here, but as far as I can tell, you won't find x.labels there either. I'm not going to flood the question with that output since it's about 300 lines long.

Thank you for any suggestions!


Upvotes: 8

Views: 1650

Answers (1)

Stibu

Reputation: 15917

Solution for ggplot2 3.0.0 until (presumably) 3.2.1

This old solution worked for ggplot2 version 3.0.0 and presumably up to 3.2.1, but I have not checked this. A solution for newer versions is below.

This function goes through a nested list structure and finds the paths through that structure that contain a given character string:

find_name <- function(obj, name) {
  
  # get all named paths through obj
  find_paths <- function(obj, path) {
    
    if ((!is.list(obj) && is.null(names(obj))) || identical(obj, .GlobalEnv)) {
      return (path)
    } else {
      if (is.null(names(obj))) {
        return(c(path,
                 lapply(seq_along(obj), function(x) find_paths(obj[[x]], paste0(path, "[[", x, "]]")))
              ))
      } else {
        return(c(path,
                 lapply(names(obj), function(x) find_paths(obj[[x]], paste(path, x, sep = "$")))
              ))
      }  
    }
    
  }
  
  # get all the paths through the nested structure
  all_paths <- unlist(find_paths(obj, deparse(substitute(obj))))
  
  # find the requested name
  path_to_name <- grep(paste0("\\$", name, "$"), all_paths, value = TRUE)
  
  return (path_to_name)
}

Here is an example of using this function with a ggplot_built object:

library(ggplot2)
p <- ggplot(mtcars) + geom_point(aes(x = disp, y = mpg, col = as.factor(cyl)))
gb <- ggplot_build(p)
find_name(gb, "x.labels")
## [1] "gb$layout$panel_params[[1]]$x.labels"

You can also directly get the contents of x.labels:

eval(parse(text = find_name(gb, "x.labels")))
## [1] "100" "200" "300" "400"

A few remarks on how this works:

  • The function find_paths() goes through the nested structure and returns all "paths" through the structure in a form similar to "gb$layout$panel_params[[1]]$x.labels".
  • The nested structure can contain named lists, unnamed lists, named "lists" that have another class (and thus return FALSE for is.list()), and environments. One has to take care of all these situations.
  • A particular caveat is that a ggplot_built object contains a reference to the global environment (gb$layout$facet_params$plot_env), which leads to an infinite loop if it is not treated properly.
  • The result of find_paths() is a nested list again, but the structure can easily be simplified with unlist().
  • The last step is to extract those paths that contain the name one is looking for. The regular expression I use ensures that only elements that exactly match the given name are returned. As an example, find_name(gb, "x") will not return "gb$layout$panel_params[[1]]$x.labels".
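
The path collection, the flattening with unlist(), and the anchored pattern can be tried on a small plain list without ggplot2. This is only a minimal sketch: walk_paths() and toy are illustrative names that do not appear in the function above, and the walker assumes plain lists only (no environments, no functions).

```r
# Simplified walker for plain lists only (no environments, no functions).
# `walk_paths` and `toy` are illustrative names, not part of the answer.
walk_paths <- function(obj, path) {
  if (!is.list(obj)) {
    return(path)  # leaf reached: the path ends here
  }
  # named lists are addressed with $, unnamed ones with [[i]]
  keys <- if (is.null(names(obj))) seq_along(obj) else names(obj)
  children <- lapply(keys, function(k) {
    child_path <- if (is.numeric(k)) {
      paste0(path, "[[", k, "]]")
    } else {
      paste(path, k, sep = "$")
    }
    walk_paths(obj[[k]], child_path)
  })
  c(path, children)  # nested list of paths, flattened later with unlist()
}

toy <- list(layout = list(panel_params = list(
  list(x.labels = c("10", "15"), x = 1:2)
)))

all_paths <- unlist(walk_paths(toy, "toy"))

# anchored pattern: "x" must be the complete last component of the path
exact_x <- grep(paste0("\\$", "x", "$"), all_paths, value = TRUE)
exact_x
## [1] "toy$layout$panel_params[[1]]$x"

# the path found for "x.labels" can be evaluated to get the contents
hit <- grep(paste0("\\$", "x.labels", "$"), all_paths, value = TRUE)
eval(parse(text = hit))
## [1] "10" "15"
```

Note that the name is pasted into the pattern unescaped, so a "." in the name matches any character; for names like x.labels this rarely matters in practice.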

I have tested the function with the ggplot_built object from my example and with a nested list. I cannot guarantee that it works for all situations.

Solution for ggplot2 3.3.0 and newer

According to this answer, the structure of the object returned by ggplot_build() has changed in version 3.3.0. The above function then fails because of an infinite loop. The changes that are relevant here are:

  • The object contains back references to parent environments called .__enclos_env__. These cause the infinite loop and must be caught explicitly.
  • The object contains functions, which must be handled explicitly. This may already have been the case in earlier versions, but it did not matter for finding x.labels, so I did not notice. According to (again) this answer, there is now a function get_labels() that returns the labels, so we need to catch those functions as well.
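
Both points can be reproduced without ggplot2 using two environments that reference each other. The sketch below is not the approach taken by find_name(): instead of stopping at a path that ends in .__enclos_env__, it keeps a list of environments already visited, which is a more general (and here purely illustrative) guard. The helper walk_env() and the objects outer_env and inner_env are made up for this example.

```r
# Two environments that reference each other, mimicking a back reference.
outer_env <- new.env()
inner_env <- new.env()
outer_env$inner <- inner_env
inner_env$back <- outer_env             # cycle: inner points back to outer
inner_env$value <- 42
inner_env$get_labels <- function() c("a", "b")

# `walk_env` is a hypothetical helper; unnamed lists are ignored for brevity.
walk_env <- function(obj, path, seen = list()) {
  if (is.function(obj)) {
    return(paste0(path, "()"))          # mark functions, do not descend
  }
  if (is.environment(obj)) {
    already_seen <- any(vapply(seen, identical, logical(1), obj))
    if (already_seen) {
      return(path)                      # stop here instead of looping forever
    }
    seen <- c(seen, list(obj))
    obj <- as.list(obj, all.names = TRUE)
  }
  if (!is.list(obj) || is.null(names(obj))) {
    return(path)
  }
  children <- lapply(names(obj), function(k) {
    walk_env(obj[[k]], paste(path, k, sep = "$"), seen)
  })
  c(path, children)
}

env_paths <- unlist(walk_env(outer_env, "outer_env"))
sort(env_paths)
```

The walk terminates with five paths. It stops at outer_env$inner$back because that environment has already been visited, and the function shows up as outer_env$inner$get_labels().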

This is an adapted version of the function find_name() that works with ggplot2 3.3.0 and newer (but has only been checked with ggplot2 3.4.4):

find_name <- function(obj, name) {

  # get all named paths through obj
  find_paths <- function(obj, path) {
    
    if (is.function(obj)) {
      return(paste0(path, "()"))
    } else if ((!is.list(obj) && is.null(names(obj))) || identical(obj, .GlobalEnv) || grepl("\\.__enclos_env__$", path)) {
      return(path)
    } else {
      if (is.null(names(obj))) {
        return(c(path,
                 lapply(seq_along(obj), function(x) find_paths(obj[[x]], paste0(path, "[[", x, "]]")))
              ))
      } else {
        return(c(path,
                 lapply(names(as.list(obj)), function(x) find_paths(obj[[x]], paste(path, x, sep = "$")))
              ))
      } 
    }
  }

  # get all the paths through the nested structure
  all_paths <- unlist(find_paths(obj, deparse(substitute(obj))))

  # find the requested name
  # the "()" suffix is optional, so plain elements and functions both match
  path_to_name <- grep(paste0("\\$", name, "(\\(\\))?$"), all_paths, value = TRUE)

  return (path_to_name)
}
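
How the "()" group at the end of the search pattern behaves can be checked on a couple of made-up path strings. In this minimal sketch, a group that must match "()" only ever hits function entries, while marking the group as optional with ? lets the same pattern hit plain elements as well. The paths below are invented for illustration and are not taken from a real ggplot_build() result.

```r
# Invented example paths: one plain element and one function entry.
paths <- c("obj$params[[1]]$x.range",
           "obj$params[[1]]$x$get_labels()")

required_suffix <- "(\\(\\))$"    # "()" must be present
optional_suffix <- "(\\(\\))?$"   # "()" may be present

fun_required <- grep(paste0("\\$", "get_labels", required_suffix), paths, value = TRUE)
el_required  <- grep(paste0("\\$", "x.range", required_suffix), paths, value = TRUE)
fun_optional <- grep(paste0("\\$", "get_labels", optional_suffix), paths, value = TRUE)
el_optional  <- grep(paste0("\\$", "x.range", optional_suffix), paths, value = TRUE)

fun_required
## [1] "obj$params[[1]]$x$get_labels()"
el_required
## character(0)
el_optional
## [1] "obj$params[[1]]$x.range"
```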

The function does not crash, but x.labels cannot be found:

library(ggplot2)
p <- ggplot(mtcars) + geom_point(aes(x = disp, y = mpg, col = as.factor(cyl)))
gb <- ggplot_build(p)
find_name(gb, "x.labels")
## character(0)

But it can find all the functions called get_labels() (note that you must omit the parentheses in the name of the function):

find_name(gb, "get_labels")
##  [1] "gb$layout$panel_scales_x[[1]]$get_labels()"           "gb$layout$panel_scales_y[[1]]$get_labels()"          
##  [3] "gb$layout$panel_params[[1]]$x$scale$get_labels()"     "gb$layout$panel_params[[1]]$x$get_labels()"          
##  [5] "gb$layout$panel_params[[1]]$x.sec$scale$get_labels()" "gb$layout$panel_params[[1]]$x.sec$get_labels()"      
##  [7] "gb$layout$panel_params[[1]]$y$scale$get_labels()"     "gb$layout$panel_params[[1]]$y$get_labels()"          
##  [9] "gb$layout$panel_params[[1]]$y.sec$scale$get_labels()" "gb$layout$panel_params[[1]]$y.sec$get_labels()"      
## [11] "gb$plot$scales$scales[[1]]$get_labels()"              "gb$plot$scales$scales[[2]]$get_labels()"             
## [13] "gb$plot$scales$scales[[3]]$get_labels()"  
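
Because each returned string ends in "()", evaluating it calls the function instead of just returning it. A minimal sketch with a made-up stand-in object (scale_like is hypothetical, and its return values are invented):

```r
# `scale_like` is a made-up stand-in for an object that carries a function.
scale_like <- list(get_labels = function() c("100", "200", "300", "400"))

found <- "scale_like$get_labels()"   # same shape as the strings listed above
label_values <- eval(parse(text = found))
label_values
## [1] "100" "200" "300" "400"
```

With a real plot, something like eval(parse(text = find_name(gb, "get_labels")[1])) should behave the same way, but what it returns depends on the scale and on the ggplot2 version.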

Upvotes: 6
