NLM09

Reputation: 25

Creating a loop in R for multiple box plots

I have designed an experiment to see how serum markers change with time since eating a meal. I have a data frame called BreakfastM consisting of 72 observations and 230 variables.

Of these, 229 variables are serum markers and 1 is the timepoint. The observations are different samples.

I am looking for trends in how the serum markers (e.g. cholesterol) change with the timepoint. I have created a boxplot that nicely shows the trend for a particular serum marker in relation to timepoint.

This is the code I used:

boxplot(BreakfastM$Variable ~ BreakfastM$Timepoint)

Is there a quick way to test all the variables in the data frame against the timepoint by writing a loop in R?

Upvotes: 2

Views: 13135

Answers (2)

Joe

Reputation: 8611

You can also use a loop to write many plots to image files in your working directory. Let's make a 10-column matrix representing 10 measured variables, with the 15 rows split across 3 factor levels:

data <- matrix(rnorm(150), nrow=15)
grps <- factor(c(rep("group1", 5), rep("group2", 5), rep("group3", 5)))

The loop writes each boxplot to its own file (var_1.png, var_2.png, etc.), putting 10 PNGs in your working directory.

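for (i in seq_len(ncol(data))) {
  # Open a PNG device named var_<i>.png in the working directory
  png(filename = paste0("var_", i, ".png"))
  boxplot(data[, i] ~ grps)
  dev.off()  # close the device so the file is written
}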

The files are very small and you can flick through them quickly with a simple image viewer.
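Applied to a data frame shaped like the one in the question, the same loop might look like the sketch below. It assumes the grouping column is named Timepoint and that every other column is a numeric serum marker. The simulated BreakfastM (three made-up marker columns, four made-up timepoints) and the file name serum_boxplots.pdf are placeholders, not the asker's real data. Writing to one multi-page PDF rather than one PNG per variable is a substitution here, chosen so that 229 plots end up in a single file.

```r
# Simulated stand-in for BreakfastM: 72 samples, 3 hypothetical markers
# plus a Timepoint column (the real data has 229 markers).
set.seed(1)
BreakfastM <- data.frame(
  Timepoint   = factor(rep(c("0h", "1h", "2h", "4h"), each = 18)),
  Cholesterol = rnorm(72),
  Glucose     = rnorm(72),
  Insulin     = rnorm(72)
)

# Every column except Timepoint is treated as a marker (assumption).
markers <- setdiff(names(BreakfastM), "Timepoint")

# One multi-page PDF: each boxplot() call starts a new page.
# tempdir() is used here only to keep the sketch tidy; swap in any path.
out_file <- file.path(tempdir(), "serum_boxplots.pdf")
pdf(out_file)
for (m in markers) {
  boxplot(BreakfastM[[m]] ~ BreakfastM$Timepoint,
          main = m, xlab = "Timepoint", ylab = m)
}
dev.off()
```

Looping over column names instead of column indices lets each page carry the marker name as its title, which helps when paging through a couple of hundred plots.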


Upvotes: 3

Mark Peterson

Reputation: 9570

If you are just looking to plot, converting to long form with tidyr (and dplyr) and then plotting with ggplot2 is probably the best starting point.

If you have only a small number of variables, you could just use facet_wrap to split the boxplots by measure. Because you didn't provide reproducible data, I am using the mtcars data, substituting "gear" for your time point, and limiting the comparison to the numeric columns. select picks the columns I want to use, then gather converts them to long format before they are passed to ggplot.

library(dplyr)
library(tidyr)
library(ggplot2)

mtcars %>%
  select(gear, mpg, disp:qsec) %>%
  gather(Measure, Value, -gear) %>%
  ggplot(aes(x = factor(gear)
             , y = Value)) +
  geom_boxplot() +
  facet_wrap(~Measure
             , scales = "free_y")


Now, with 229 variables, that is not going to be a readable plot. Instead, you may want to look at facet_multiple from ggplus which spreads facets over multiple pages. Here, I am using it to put one per "page" which you can either view in the viewer, or save, depending on your needs.

First, save the base plot (with no faceting):

basePlot <-
  mtcars %>%
  select(gear, mpg, disp:qsec) %>%
  gather(Measure, Value, -gear) %>%
  ggplot(aes(x = factor(gear)
             , y = Value)) +
  geom_boxplot()

Then, use it as an argument to facet_multiple:

facet_multiple(basePlot, "Measure"
               , nrow = 1
               , ncol = 1
               , scales = "free_y")

This will generate the same panels as above, but with one per page (increasing nrow and ncol raises the number of facets shown per page).
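If installing ggplus is not an option, a similar one-plot-per-page result can be sketched with ggplot2 alone by splitting the long data by Measure and building one plot per piece. This is an alternative to facet_multiple, not what it does internally. The names longData, plotList and boxplots_by_measure.pdf are made up for the illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Long format, as above: one row per (gear, Measure, Value).
longData <- mtcars %>%
  select(gear, mpg, disp:qsec) %>%
  gather(Measure, Value, -gear)

# One ggplot object per measure, stored in a named list.
plotList <- lapply(split(longData, longData$Measure), function(d) {
  ggplot(d, aes(x = factor(gear), y = Value)) +
    geom_boxplot() +
    labs(title = d$Measure[1], x = "gear")
})

# Print each plot to its own page of a PDF (file name is a placeholder).
outFile <- file.path(tempdir(), "boxplots_by_measure.pdf")
pdf(outFile)
for (p in plotList) print(p)
dev.off()
```

Because each measure gets its own plot, every page has an independent y-axis, which plays the same role as scales = "free_y" in the faceted version.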

Upvotes: 5
