EML

Reputation: 671

Plotting ggplot with for loop. How should variable be referenced?

I am trying to create scatterplots in which a single figure shows the relationship between each predictor and one outcome, with one facet per predictor. However, the outcome is not plotted correctly. I assume this is because ggplot does not recognize the outcome as a column name when it is passed as a string. How should I refer to the outcome inside aes()?

library(tidyverse)

# data
data <- data.frame(o1=rnorm(100, 3, sd=1.2),
                   o2=rnorm(100, 3.5, sd=1.4),
                   p1=rnorm(100, 2, sd=1.9),
                   p2=rnorm(100, 1, sd=1.2),
                   p3=rnorm(100, 7, sd=1.6)
)

func <- function(data, outcomes, predictors) {

for(i in seq_along(outcomes)){
  print(data %>% select(outcomes[[i]], predictors[[i]]) %>% 
          gather(var, value, -outcomes[[i]]) %>% 
          ggplot(aes(x=value, y=outcomes[[i]])) + geom_point() + facet_wrap(~var))
   
}
}

func(data, outcomes=c("o1", "o2"), predictors=list(c("p1", "p2"), c("p2","p3")))

Upvotes: 0

Views: 84

Answers (2)

Ramiro Reyes

Reputation: 535

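You could do this without the loop by pivoting both the predictors and the outcomes to long format and faceting on both. Note that this plots every outcome against every predictor:

library(tidyverse)

data <- data.frame(o1=rnorm(100, 3, sd=1.2),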
                   o2=rnorm(100, 3.5, sd=1.4),
                   p1=rnorm(100, 2, sd=1.9),
                   p2=rnorm(100, 1, sd=1.2),
                   p3=rnorm(100, 7, sd=1.6)
                   )

tidy_data <- data %>% 
  pivot_longer(c(p1:p3), names_to = "predictor", values_to = "x") %>% 
  pivot_longer(c(o1:o2), names_to = "outcome", values_to = "y")


ggplot(tidy_data) +
  geom_point(aes(x = x, y = y)) +
  facet_grid(outcome ~ predictor)
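
The double pivot above yields one row per observation, predictor and outcome, so every outcome is paired with every predictor. If only the pairs from the question are wanted (o1 with p1 and p2, o2 with p2 and p3), one possible sketch is to filter the long data against a small lookup table before plotting. The names `wanted` and `paired` and the `set.seed()` call are my own additions for illustration, not part of the original code:

```r
library(tidyverse)

set.seed(1)  # assumption: added only so the example is reproducible
data <- data.frame(o1 = rnorm(100, 3, sd = 1.2),
                   o2 = rnorm(100, 3.5, sd = 1.4),
                   p1 = rnorm(100, 2, sd = 1.9),
                   p2 = rnorm(100, 1, sd = 1.2),
                   p3 = rnorm(100, 7, sd = 1.6))

tidy_data <- data %>%
  pivot_longer(c(p1:p3), names_to = "predictor", values_to = "x") %>%
  pivot_longer(c(o1:o2), names_to = "outcome", values_to = "y")

# 100 observations x 3 predictors x 2 outcomes
dim(tidy_data)  # 600 rows, 4 columns

# Hypothetical lookup table of the outcome/predictor pairs to keep
wanted <- tribble(
  ~outcome, ~predictor,
  "o1",     "p1",
  "o1",     "p2",
  "o2",     "p2",
  "o2",     "p3"
)

# Keep only rows whose (outcome, predictor) pair appears in `wanted`
paired <- semi_join(tidy_data, wanted, by = c("outcome", "predictor"))

# facet_wrap avoids the empty panels facet_grid would draw for dropped pairs
ggplot(paired, aes(x = x, y = y)) +
  geom_point() +
  facet_wrap(~ outcome + predictor)
```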


Upvotes: 3

Ronak Shah

Reputation: 388797

In the function you can:

  • create a list to hold all the plots.
  • replace gather with pivot_longer, since gather is superseded.
  • use the .data pronoun so that the y aesthetic refers to the column whose name is stored in a variable.

library(tidyverse)

func <- function(data, outcomes, predictors) {
  plot_list <- vector('list', length(outcomes))
  
  for(i in seq_along(outcomes)){
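    plot_list[[i]] <- data %>% 
      # all_of() makes it explicit that these are character vectors of column names
      select(all_of(c(outcomes[i], predictors[[i]]))) %>% 
      pivot_longer(cols = -all_of(outcomes[i])) %>%
      # aes() is evaluated lazily, when the plot is drawn. Without !!, every
      # stored plot would look up outcomes[i] using the final value of i.
      # !! injects the current column name at the time the plot is created.
      ggplot(aes(x = value, y = .data[[!!outcomes[i]]])) + 
      geom_point() + facet_wrap(~name)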
  }
  return(plot_list)
}
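
To see why the .data pronoun matters, here is a minimal sketch on a toy data frame (`df`, `col`, `p_const` and `p_col` are made-up names for illustration). A bare string inside aes() is treated as a constant value, not as a column name, which is why the outcome did not plot correctly in the question:

```r
library(ggplot2)

# Toy data, invented for this illustration
df <- data.frame(value = c(1, 2, 3), o1 = c(10, 20, 30))
col <- "o1"

# y is mapped to the constant string "o1", so every point gets the same y
p_const <- ggplot(df, aes(x = value, y = col)) + geom_point()

# y is mapped to the column of df whose name is stored in `col`
p_col <- ggplot(df, aes(x = value, y = .data[[col]])) + geom_point()

# layer_data() returns the data ggplot2 actually draws
layer_data(p_const)$y  # one repeated value
layer_data(p_col)$y    # 10 20 30
```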

and call it as:

result <- func(data, outcomes=c("o1", "o2"), 
                     predictors=list(c("p1", "p2"), c("p2","p3")))

where result is a list of plots, and each individual plot can be accessed as result[[1]], result[[2]] and so on.
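
Once the plots are in a list, naming the elements makes them easier to pick out, and a short loop can draw or save all of them. A rough sketch follows. The list here is a stand-in built from mtcars so the example runs on its own, and the file names and the use of tempdir() are assumptions for illustration:

```r
library(ggplot2)

# Stand-in for the list returned by func(); any list of ggplot objects works
result <- list(
  ggplot(mtcars, aes(wt, mpg)) + geom_point(),
  ggplot(mtcars, aes(hp, mpg)) + geom_point()
)

# Name each element after its outcome so it can be accessed by name
names(result) <- c("o1", "o2")
result[["o1"]]

# Draw every plot in turn
for (p in result) print(p)

# Save each plot to a file named after its outcome (hypothetical file names)
out_files <- file.path(tempdir(), paste0("scatter_", names(result), ".png"))
for (k in seq_along(result)) {
  ggsave(out_files[k], plot = result[[k]], width = 6, height = 4)
}
```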

Upvotes: 1
