Reputation: 928
I have a dataset organised into subcategories and sub-subcategories, along the lines of nested bullet points:
-1
-1a
-1ai
-1aii
-1b
-1bi
...and so on.
I want to use ggplot2 to make a dotplot which shows all data for 1 followed by data for 1a only, followed by data for 1ai only, and so on.
Example dataset:
x <- data.frame(cat=1, subA=letters[rep(1:5,each=10)],
subB=as.character(as.roman(rep(1:5,5,each=2))),value=rnorm(50,20,7))
> head(x)
cat subA subB value
1 1 a I 26.75573
2 1 a I 12.52218
3 1 a II 24.53499
4 1 a II 23.21012
5 1 a III 11.18173
6 1 a III 25.01914
I want to end up with a chart that looks something like this:
I was able to make this plot by doing lots of subsetting and rbinding to make a massively redundant derivative data frame, but this seems like a clear example of Doing It Wrong.
x2 <- with(x,rbind(cbind(key="1",x),
cbind(key="1 a",x[paste(cat,subA) == "1 a",]),
cbind(key="1 a I",x[paste(cat,subA,subB) == "1 a I",]),
cbind(key="1 a II",x[paste(cat,subA,subB) == "1 a II",])))
library(ggplot2)
library(plyr) # for desc()
# Each "+" must end a line; a line that starts with "+" is parsed as a new statement.
ggplot(x2,aes(x=reorder(key,desc(key)),y=value)) +
geom_point(position=position_jitter(width=0.1,height=0)) +
coord_flip() + scale_x_discrete("Category")
Is there a better way of doing this? A related problem is that it would be nice if each value always had the same amount of jitter added to it, whether it was plotted against "1" or "1 a" or "1 a II", but there I'm not even sure where to start.
Upvotes: 2
Views: 772
Reputation: 118799
I can't think of a way other than reconstructing your data with separate groups as shown below:
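# Level 1: every row, keyed by cat alone
x.m1 <- x[c("cat", "value")]
# Level 2: one group per cat/subA combination, keyed e.g. "1a"
x.m2 <- do.call(rbind, lapply(split(x, interaction(x[, 1:2])), function(y) {
y$cat <- do.call(paste0, y[, 1:2])
y[c("cat", "value")]
}))
# Level 3: one group per cat/subA/subB combination, keyed e.g. "1aI"
x.m3 <- do.call(rbind, lapply(split(x, interaction(x[, 1:3])), function(y) {
y$cat <- do.call(paste0, y[, 1:3])
y[c("cat", "value")]
}))
# Stack all three levels; cat now holds the group label for each point
y <- rbind(x.m1, x.m2, x.m3)
ggplot(data = y, aes(x = value, y = cat)) + geom_point()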
Note: You should reorder the levels of the cat column in y to order the y-axis the way you want. I'll leave that to you.
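As a minimal sketch of that reordering: ggplot2 draws a discrete axis in factor-level order, with the first level at the bottom of the y-axis. The keys vector below is an illustrative stand-in for the cat column of y, not the real data. Plain alphabetical sorting happens to put each parent right before its children for these labels; that is an assumption about this labelling scheme and would break for roman numerals above VIII.

```r
# Stand-in for the `cat` column of `y` (illustrative values only).
keys <- c("1bII", "1", "1aII", "1b", "1a", "1aI", "1bI")

# Alphabetical order puts each parent directly before its children here.
# method = "radix" sorts in the C locale, so the result does not depend
# on the session's locale settings.
lev <- sort(unique(keys), method = "radix")

# The first level is drawn at the bottom of a discrete y-axis,
# so reverse the levels to get "1" at the top.
cat_f <- factor(keys, levels = rev(lev))
levels(cat_f)
```

With the real data, the same idea would be y$cat <- factor(y$cat, levels = rev(sort(unique(as.character(y$cat)), method = "radix"))) before calling ggplot.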
Edit: Following @Justin's suggestion, you could do something like this:
x.m1 <- x
x.m1$grp <- x$cat
x.m2 <- do.call(rbind, lapply(split(x, interaction(x[, 1:2])), function(y) {
y$grp <- do.call(paste0, y[, 1:2])
y
}))
x.m3 <- do.call(rbind, lapply(split(x, interaction(x[, 1:3])), function(y) {
y$grp <- do.call(paste0, y[, 1:3])
y
}))
y <- rbind(x.m1, x.m2, x.m3)
ggplot(data = y, aes(x = value, y = grp)) + geom_point(aes(colour=subA, shape=subB))
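On the jitter part of the question, one possible approach (a sketch under assumptions, not something tested against the original plot) is to draw a random offset once per original row, before stacking, so every copy of a row carries the same offset. The groups are then placed on a continuous axis at integer slots and labelled by hand. The helper stack_level and the columns off and pos are hypothetical names introduced here, and the seed is only there to make the sketch reproducible.

```r
library(ggplot2)

set.seed(42)  # assumption: fixed seed, only for reproducibility
x <- data.frame(cat = 1, subA = letters[rep(1:5, each = 10)],
                subB = as.character(as.roman(rep(1:5, 5, each = 2))),
                value = rnorm(50, 20, 7), stringsAsFactors = FALSE)

# One offset per original observation, drawn before any stacking.
x$off <- runif(nrow(x), -0.1, 0.1)

# Hypothetical helper: label every row with the pasted key for `cols`.
stack_level <- function(d, cols) {
  d$grp <- do.call(paste0, unname(as.list(d[cols])))  # "1", "1a", "1aI", ...
  d
}

# Stack the three levels; each copy of a row keeps its own `off`.
y <- rbind(stack_level(x, "cat"),
           stack_level(x, c("cat", "subA")),
           stack_level(x, c("cat", "subA", "subB")))

# Integer slot per group (reversed so "1" ends up at the top),
# plus the row's fixed offset.
lev <- rev(sort(unique(y$grp), method = "radix"))
y$pos <- match(y$grp, lev) + y$off

p <- ggplot(y, aes(x = value, y = pos)) +
  geom_point() +
  scale_y_continuous("Category", breaks = seq_along(lev), labels = lev)
p
```

Because the offset lives in the data rather than in position_jitter(), a given value sits at the same distance from its group's line whether it is plotted against "1", "1a" or "1aII".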
Upvotes: 2