n8sty

Reputation: 1438

In R, how to use for loop to create series of graphs

I have a wide-form table that looks like this:

ID  Test_11 LVL11  Score_X_11 Score_Y_11  Test_12 LV12  Score_X_12  Score_Y_12
1   A       I      100        NA          NA      NA    100         100
2   A       II     90         100         B       II    90          85 
3   NA      NA     NA         NA          B       II    90          NA
4   A       III    100        80          A       III   75          75
5   B       I      NA         90          NA      NA    60          50
6   B       I      70         100         NA      NA    NA          NA
7   B       II     85         NA          A       I     60          60

And a second table, used for sorting, that looks like this:

Test_11   A
Test_11   B
Test_12   A
Test_12   B

This second table says that Test_11 comes in two versions, A and B, and so does Test_12.

I am trying to create a series of boxplots, one for every combination of test (Test_11, Test_12) and version (A, B). For Test_11=="A", for example, the boxplot would have three groups (I, II, III) and be drawn from the subset of rows where Test_11=="A". The same goes for Test_11=="B", Test_12=="A", and Test_12=="B". In this example that makes 4 graphs in total.

What I have in R is:

z <- subset(df, df$Test_11=="A")
plot(z$LVL11, z$Score_X_11, varwidth = TRUE, notch = TRUE, xlab = 'LVL', 
     ylab = 'score')

What I would like, and haven't been able to figure out how to do, is to write a for loop that does the subsetting for me so that I could automate this for my actual data set which has a few dozen of these combinations.

Thanks for any help and guidance.

Upvotes: 0

Views: 3254

Answers (2)

Juan

Reputation: 1421

The "straightforward" way

Maybe you should save all your logical vectors in a data.frame or matrix before the loop:

selections <- matrix(nrow = nrow(df), ncol = 4)
selections[,1] <- df$Test_11 == "A"
selections[,2] <- df$Test_11 == "B"
selections[,3] <- df$Test_12 == "A"
selections[,4] <- df$Test_12 == "B"
# etc...
par(mfcol = c(2, 2)) # here you should customize at will...
for (i in 1:4) {
  z <- subset(df, selections[,i])
  plot(z$LVL11, z$Score_X_11, varwidth = TRUE, 
       notch = TRUE, xlab = 'LVL', 
       ylab = 'score')
}

Note that the loop above always plots LVL11 against Score_X_11. To plot other columns, replace z$Score_X_11 with z[, string], where string holds a column name built with paste (or another string-manipulation function). For example:

v <- c("X", "Y")
n <- c("11", "12")
for (i in 1:2) {
  for (j in 1:2) {
    string <- paste("Score", v[i], n[j], sep = "_")  # n[j], so every X/Y and 11/12 pair is produced
    print(string)
  }
}

The same reasoning applies to the z$LVLXX columns, so you should be able to adapt the code for them. Note that in the sample data the second level column is spelled LV12, not LVL12.
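Putting the two ideas together, a minimal sketch of the loop might look like the following. It assumes the data frame is called df as in the question, and that the sorting table has been read into a data frame called combos with columns test and version (both names are made up for this illustration). The notch option is left out because the sample has very few points per group.

```r
# Sample data from the question, assumed here to live in `df`.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID  Test_11 LVL11  Score_X_11 Score_Y_11  Test_12 LV12  Score_X_12  Score_Y_12
1   A       I      100        NA          NA      NA    100         100
2   A       II     90         100         B       II    90          85
3   NA      NA     NA         NA          B       II    90          NA
4   A       III    100        80          A       III   75          75
5   B       I      NA         90          NA      NA    60          50
6   B       I      70         100         NA      NA    NA          NA
7   B       II     85         NA          A       I     60          60
")

# The sorting table; `combos`, `test` and `version` are hypothetical names.
combos <- read.table(header = FALSE, stringsAsFactors = FALSE,
                     col.names = c("test", "version"), text = "
Test_11   A
Test_11   B
Test_12   A
Test_12   B
")

# The level columns are spelled differently in the sample header,
# so they are looked up by suffix instead of being pasted together.
lvl_cols <- c("11" = "LVL11", "12" = "LV12")
subsets  <- list()  # keeps each subset so it can be inspected afterwards

par(mfrow = c(2, 2))
for (k in seq_len(nrow(combos))) {
  test    <- combos$test[k]
  version <- combos$version[k]
  suffix  <- sub("^Test_", "", test)                  # "11" or "12"
  score   <- paste("Score", "X", suffix, sep = "_")   # e.g. "Score_X_11"
  lvl     <- lvl_cols[[suffix]]

  # which() drops the rows where the test column is NA
  z <- df[which(df[[test]] == version), ]
  subsets[[paste(test, version, sep = ".")]] <- z

  if (all(is.na(z[[score]]))) next  # nothing to draw for this combination
  boxplot(split(z[[score]], z[[lvl]]), varwidth = TRUE,
          xlab = "LVL", ylab = "score", main = paste(test, "==", version))
}
```

Adding a row to combos is then enough to get another plot, as long as the matching Score_X_ and level columns exist in df.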

Alternative way, with ggplot2 & reshape2

I'm not very experienced with trellis graphics (as used in the other answer), but I know a little ggplot2, so I decided to take up the challenge and give it a try. It is not great, but at least it works:

# df <- read.table("data.txt", header = TRUE, na.strings = "NA")
require(reshape2)
require(ggplot2)

# Melt your data.frame, using the scores as the "values":
mdf <- melt(df[,-1], id = c("LVL11", "LV12", "Test_11", "Test_12"))

# loop through level types:
for (lvl in c("LVL11", "LV12")) {
  # looping through values of test11
  for (test11 in c("A", "B")) {
    # Note the use of subset before ggplot
    p <- ggplot(subset(mdf, Test_11 == test11), aes_string(x=lvl, y="value"))
    # I added the geom_jitter as in the example given there were only a few points
    g <- p + geom_boxplot(aes(fill = variable)) + geom_jitter(aes(shape = variable))
    print(g) # ggplot objects must be printed explicitly inside a loop
    # Finally, save each plot with a relevant name:
    savePlot(paste0(lvl, "-t11", test11, ".png")) 
    # (savePlot only works with some screen devices and has problems with RStudio iirc;
    #  ggsave() is a device-independent alternative)

  }
  # Same as before, but with test_12
  for (test12 in c("A", "B")) {
    p <- ggplot(subset(mdf, Test_12 == test12), aes_string(x=lvl, y="value"))
    g <- p + geom_boxplot(aes(fill = variable)) + geom_jitter(aes(shape = variable))
    print(g) 
    savePlot(paste0(lvl, "-t12", test12, ".png"))
  }
}

If anyone knows how to use trellis graphics, or maybe facet_grid, in this case so that I can put all the graphics in one image, I would love to hear how.
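One possible route to a single image, offered as a sketch under assumptions rather than a recipe checked against the full data set: stack the wide table into a long one with one row per score, then facet on test and version. The column names test, version, level, score_type and score are made up for this illustration, the sample data is re-entered so the block runs on its own, and the plot is only drawn if ggplot2 is installed.

```r
# Sample data from the question, assumed here to live in `df`.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
ID  Test_11 LVL11  Score_X_11 Score_Y_11  Test_12 LV12  Score_X_12  Score_Y_12
1   A       I      100        NA          NA      NA    100         100
2   A       II     90         100         B       II    90          85
3   NA      NA     NA         NA          B       II    90          NA
4   A       III    100        80          A       III   75          75
5   B       I      NA         90          NA      NA    60          50
6   B       I      70         100         NA      NA    NA          NA
7   B       II     85         NA          A       I     60          60
")

# Build one block of rows per (suffix, score type) pair and stack them.
long <- do.call(rbind, lapply(c("11", "12"), function(suffix) {
  lvl <- if (suffix == "11") "LVL11" else "LV12"  # spelling follows the sample header
  do.call(rbind, lapply(c("X", "Y"), function(v) {
    data.frame(
      test       = paste0("Test_", suffix),
      version    = df[[paste0("Test_", suffix)]],
      level      = df[[lvl]],
      score_type = v,
      score      = df[[paste("Score", v, suffix, sep = "_")]],
      stringsAsFactors = FALSE
    )
  }))
}))

# Drop rows with a missing version, level or score.
long <- long[complete.cases(long), ]

# Draw all test/version combinations as panels of a single figure.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = level, y = score, fill = score_type)) +
    geom_boxplot() +
    facet_grid(test ~ version)
  print(p)
}
```

With the plot held in p, a single ggsave() call would write the whole grid to one file.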

cheers.

Upvotes: 1

Ramnath

Reputation: 55695

Classic plyr solution (HT to @hadleywickham)

require(plyr); require(lattice); require(gridExtra)
bplots <- dlply(dat, .(Test_11, Test_12), function(df){
  bwplot(Score_X_11 ~ LVL11, data = df)
})
do.call('grid.arrange', bplots)


Upvotes: 1
