Reputation: 846
I'm having an issue when trying to make side by side boxplots by factors. I've read several examples, but for some reason my plots are not displaying correctly. I think it's trying to plot a boxplot for each value, even though I specified it as a factor.
I'm using the following code:
samp.norm = rnorm(1000,0,1)
samp.exp = rexp(1000,1)
samp.unif = runif(1000)
samp = c(samp.norm,samp.exp,samp.unif)
dist = c( rep("norm",1000), rep("exp",1000), rep("unif",1000) )
DATA = as.data.frame(cbind(samp,dist))
DATA$dist= as.factor(DATA$dist)
p = ggplot(DATA, aes(x=factor(DATA$dist), y = DATA$samp)) + geom_boxplot()
p
Upvotes: 1
Views: 1517
Reputation: 5894
The problem is your use of cbind(), which coerces the resulting object so that DATA$samp is no longer numeric (it becomes a factor, or a character vector in R 4.0 and later). cbind() returns a matrix, and all the columns of a matrix must have the same class, so they fall back to the lowest common denominator class, in this case "character". This is exactly what data frames were invented to solve.
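A quick way to see the coercion for yourself is a small sketch like the one below. The short vectors and the names m, d_bad and d_good are made up for illustration only:

```r
samp <- c(0.5, 1.2, -0.3)
dist <- c("norm", "exp", "unif")

# cbind() on a numeric and a character vector builds a character matrix
m <- cbind(samp, dist)
is.character(m)          # TRUE

# as.data.frame() on that matrix cannot recover the numeric column
d_bad <- as.data.frame(m)
is.numeric(d_bad$samp)   # FALSE

# data.frame() keeps each column's own type
d_good <- data.frame(samp, dist)
is.numeric(d_good$samp)  # TRUE
```

Running str(d_bad) and str(d_good) side by side shows the difference in column types directly.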
Try
DATA=data.frame(samp,dist)
instead of the more complicated line you've got and it all should work.
As an aside, you should also use the much simpler
p=ggplot(DATA, aes(x=dist, y = samp)) + geom_boxplot()
rather than your second-last line. Once you have told ggplot() that you are using DATA, you don't need to tell it where to find dist and samp, i.e. there is no need for DATA$dist, just dist. Also, as dist is already a factor, you don't need to wrap it in factor().
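Putting both changes together, a complete version of the question's script might look like the sketch below. It assumes ggplot2 is installed. The set.seed() call and the rep(..., each = 1000) shortcut are my additions, not part of the original code:

```r
library(ggplot2)  # assumes the ggplot2 package is installed

set.seed(1)  # only to make the sketch reproducible; not in the original
samp <- c(rnorm(1000, 0, 1), rexp(1000, 1), runif(1000))
dist <- rep(c("norm", "exp", "unif"), each = 1000)

# data.frame() keeps samp numeric; dist is converted to a factor explicitly
DATA <- data.frame(samp, dist = factor(dist))

# bare column names inside aes(); ggplot() looks them up in DATA
p <- ggplot(DATA, aes(x = dist, y = samp)) + geom_boxplot()
p
```

This should give three boxes, one per distribution, instead of one box per distinct value.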
Upvotes: 3
Reputation: 11893
+1 to @PeterEllis. Note that you can also get even simpler than his suggestion with:
boxplot(samp~dist)
Upvotes: 0