Richard J. Acton

Reputation: 915

dplyr do multiple plots in an anonymous function

I would like to use do to make multiple ggplots based on a grouped data frame but make an alteration to the plot, namely reversing the y-axis if a column contains a particular value.

I modelled my approach after Hadley's answer to this question: dplyr::do() requires named function?

The problem I'm having is getting the gg object into the data frame so that I can return it. How do I manually do what do() did automatically in my working example below, and 'wrap' the gg object in something that can be placed into a data frame?

library(dplyr)
library(ggplot2)

df <-   data.frame( position=rep(seq(0,99),2),
                    strand=c(rep("-",100),rep("+",100)),
                    score=rnorm(200),
                    gene=c(rep("alpha",100),rep("beta",100))
        )

This works fine:

plots <- df %>% 
    group_by(gene) %>%
    do(plot=
        ggplot(.,aes(position,score)) +
            geom_point()
    )
plots   

Result:

# A tibble: 2 x 2
  gene  plot    
* <fct> <list>  
1 alpha <S3: gg>
2 beta  <S3: gg>

This does not:

plots <- df %>% 
    group_by(gene) %>%
    do({
        plot <- ggplot(.,aes(position,score)) +
            geom_point()

        if (all(.$strand=="-")) {
            plot <- plot + scale_y_reverse()
        }
        data.frame(., plot) ##!! <<< how to get the ggplot object into a data frame
    })
plots

Fails with the error:

Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) : 
  cannot coerce class "c("gg", "ggplot")" to a data.frame

Upvotes: 2

Views: 1324

Answers (2)

acylam

Reputation: 18681

We can use a nested data frame instead of do:

library(ggplot2)
library(tidyverse)

plots <- df %>%
  group_by(gene) %>%
  nest() %>%
  mutate(plots = data %>% map(~{
    plot <- ggplot(.,aes(position,score)) +
      geom_point()

    if (all(.$strand=="-")) {
      plot <- plot + scale_y_reverse()
    }
    return(plot)
  })) %>%
  select(-data) 

Output:

# A tibble: 2 x 2
  gene  plots   
  <fct> <list>  
1 alpha <S3: gg>
2 beta  <S3: gg>


Upvotes: 2

r2evans

Reputation: 160437

I don't think you need the return value to be a frame. Try this:

plots <- df %>% 
    group_by(gene) %>%
    do(plot= {
        p <- ggplot(.,aes(position,score)) +
            geom_point()
        if (all(.$strand == "-")) p <- p + scale_y_reverse()
        p
    })
plots
# Source: local data frame [2 x 2]
# Groups: <by row>
# # A tibble: 2 x 2
#   gene  plot    
# * <fct> <list>  
# 1 alpha <S3: gg>
# 2 beta  <S3: gg>

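Your conditional logic is fine; I think the issue is that you did not name the block within do(...). A named argument to do() stores whatever the block returns in a list column, whereas an unnamed block must itself return a data frame.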
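If you would rather keep the unnamed do({...}) form from the question, one option (a sketch, not the only way) is to have the block return a one-row tibble whose plot column is a list holding the gg object. The name plots_wrapped is only for illustration, and tibble() here is the function that dplyr re-exports from the tibble package.

```r
library(dplyr)
library(ggplot2)

# Same example data as in the question
df <- data.frame(
    position = rep(seq(0, 99), 2),
    strand   = c(rep("-", 100), rep("+", 100)),
    score    = rnorm(200),
    gene     = c(rep("alpha", 100), rep("beta", 100))
)

plots_wrapped <- df %>%
    group_by(gene) %>%
    do({
        p <- ggplot(., aes(position, score)) +
            geom_point()
        if (all(.$strand == "-")) p <- p + scale_y_reverse()
        # Unnamed block: the value must be a data frame, so wrap the
        # plot in list() to get a one-row list column
        tibble(plot = list(p))
    })
plots_wrapped
```

This should give the same shape as the named version: one row per gene, with the plot stored in a list column.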

You can view one of them with:

plots$plot[[1]]


If you want to dump all plots (e.g., in a markdown document), just evaluate plots$plot and each plot will be drawn in turn, one straight after another (which is less useful on the console).
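As a rough sketch of dumping the plots explicitly: inside a loop nothing is auto-printed, so print() has to be called on each element. Writing one file per gene with ggsave() is an extra I am assuming might be useful; the file names and the use of tempdir() are made up for the example.

```r
library(dplyr)
library(ggplot2)

# Same example data as in the question
df <- data.frame(
    position = rep(seq(0, 99), 2),
    strand   = c(rep("-", 100), rep("+", 100)),
    score    = rnorm(200),
    gene     = c(rep("alpha", 100), rep("beta", 100))
)

plots <- df %>%
    group_by(gene) %>%
    do(plot = {
        p <- ggplot(., aes(position, score)) +
            geom_point()
        if (all(.$strand == "-")) p <- p + scale_y_reverse()
        p
    })

# Draw every plot on the current graphics device, one after another
for (p in plots$plot) print(p)

# Hypothetical output paths: one PDF per gene in a temporary directory
out_files <- file.path(tempdir(), paste0(plots$gene, ".pdf"))
for (i in seq_along(out_files)) {
    ggsave(out_files[i], plot = plots$plot[[i]], width = 6, height = 4)
}
```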

Upvotes: 3
