Erin Giles

Reputation: 95

R function (Loop?) to make a new graph for each column in a dataset

I'm trying to write code that makes a new graph for each column of my dataset, rather than having to write out a new value for y each time.

I have a dataset where each row is a person and each column is a measurement in the blood (e.g., insulin, glucose, etc.). I have a few extra columns with descriptive categories that I'm using for my groups (e.g., lean, obese). I'd like to make a graph for each of those measurement columns (i.e., one graph for insulin, another for glucose, etc.). I have 90 different variables to cycle through.

I've figured out how to make a boxplot for each of these, but can't figure out how to have the code "loop" so that I don't have to rewrite the code for each variable.

Using the mtcars dataset as an example, I have it making a graph where the y is disp, and then another graph where y = hp, and then y = drat.

library(ggplot2)
data("mtcars")

#boxplot with individual points - first y variable
ggplot(data = mtcars, aes(x = cyl, y = disp)) +
  geom_boxplot()+
  geom_point()

#boxplot with individual points - 2nd y variable
ggplot(data = mtcars, aes(x = cyl, y = hp)) +
  geom_boxplot()+
  geom_point()

#boxplot with individual points - 3rd y variable
ggplot(data = mtcars, aes(x = cyl, y = drat)) +
  geom_boxplot()+
  geom_point()

How do I set this up so my code will automatically cycle through all of the variables in the dataset (I have 90 of them)?

Upvotes: 2

Views: 1419

Answers (2)

Dave2e

Reputation: 24079

Here is a slightly different version using a for loop, with !!sym() turning each variable name (a text string) into a column reference inside aes():

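library(ggplot2)
library(rlang)

variables <- c("disp", "hp", "drat")

for (var in variables) {
  # sym() turns the string into a symbol; !! unquotes it inside aes()
  p <- ggplot(data = mtcars, aes(x = cyl, y = !!sym(var), group = cyl)) +
    geom_boxplot() +
    geom_point()
  print(p)  # inside a loop, plots are only drawn when printed explicitly
}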
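
If the plots need to be kept rather than only drawn, the same loop can store them in a named list. The following is a minimal sketch, not part of the answer above: it swaps !!sym(var) for ggplot2's .data pronoun (another way to refer to a column by a string), converts cyl to a factor so each cylinder count gets its own box, and uses the illustrative name plots for the list.

```r
library(ggplot2)

variables <- c("disp", "hp", "drat")

# Pre-allocate a named list so each plot can be looked up by variable name
# ("plots" is an illustrative name, not from the original answer)
plots <- vector("list", length(variables))
names(plots) <- variables

for (var in variables) {
  # .data[[var]] looks up the column whose name is stored in var
  plots[[var]] <- ggplot(mtcars, aes(x = factor(cyl), y = .data[[var]])) +
    geom_boxplot() +
    geom_point() +
    labs(x = "cyl", y = var)
}

# Draw a single plot on demand
print(plots[["hp"]])
```

Keeping the plots in a list makes it possible to draw or save them later, one at a time, instead of regenerating all of them.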

Upvotes: 0

Matt

Reputation: 7385

Here's a basic solution, where you would populate vector_of_yvals with your 90 variables to loop through:

library(tidyverse)

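plot_func <- function(yval){
  # yval is a column name passed as a string, so look it up with .data[[ ]];
  # a bare y = yval would map the literal string rather than the column
  p <- ggplot(data = mtcars, aes(x = cyl, y = .data[[yval]], group = cyl)) +
    geom_boxplot()+
    geom_point()
  p
}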


vector_of_yvals <- c("disp", "hp", "drat")

list_of_plots <- map(vector_of_yvals, plot_func)

You can populate vector_of_yvals with all of the variables in your dataframe by doing:

vector_of_yvals <- colnames(mtcars)

This will give you a vector:

[1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear" "carb"

If you don't want to include cyl in your vector, you can filter it out like so:

vector_of_yvals <- vector_of_yvals %>% .[. != "cyl"]
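
For a dataset with many measurement columns plus a few grouping columns, the column selection and the plotting can be combined. This is a rough sketch under some assumptions: cyl stands in for the grouping columns (lean/obese in the real data), every other numeric column is treated as a measurement, and group_cols, measure_cols and plot_one are hypothetical names introduced here for illustration.

```r
library(ggplot2)

# Assumption: "cyl" plays the role of the grouping column(s);
# all remaining numeric columns are treated as measurements
group_cols <- c("cyl")
is_num <- vapply(mtcars, is.numeric, logical(1))
measure_cols <- setdiff(names(mtcars)[is_num], group_cols)

# Hypothetical helper: one boxplot per measurement, with both the
# measurement and the grouping column given as strings
plot_one <- function(yval, data = mtcars, group = "cyl") {
  ggplot(data, aes(x = factor(.data[[group]]), y = .data[[yval]])) +
    geom_boxplot() +
    geom_point() +
    labs(x = group, y = yval)
}

# Build every plot and name each list element after its column
list_of_plots <- lapply(measure_cols, plot_one)
names(list_of_plots) <- measure_cols
```

An individual plot can then be pulled out by name, for example list_of_plots[["disp"]].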

Upvotes: 2
