Reputation: 311
I have two continuous variables that I am trying to plot against each other in ggplot2, but I want to show the group means and standard errors using geom_crossbar(). To do this I need to plot the x-axis as a factor, which is fine except that I cannot get the spacing I want on the x-axis. Does anyone know of a way to make the x variable space itself as a continuous variable even though it is plotted as discrete?
Some code...
library(ggplot2)
# assemble data, calculate means and standard errors
x <- c(rep(15, 10), rep(30, 10), rep(41, 10), rep(42, 10), rep(45, 10))
y <- c(rnorm(10, 47, 15), rnorm(10, 35, 11), rnorm(10, 31, 12), rnorm(10, 37, 13), rnorm(10, 30, 10))
dat <- data.frame(x,y)
y.mean <- aggregate(dat$y, by=list(x=dat$x), mean)
names(y.mean) <- c('x', 'mean')
dat <- merge(dat, y.mean, by=c('x'))
se <- function(x) sqrt(var(x) / length(x))
y.se <- aggregate(dat$y, by=list(x=dat$x), se)
names(y.se) <- c('x','se')
dat <- merge(dat, y.se, by=c('x'))
g <- ggplot(dat, aes(x=factor(x), y=mean, ymin= mean - se, ymax= mean + se))
g + geom_crossbar(width=0.5) + geom_jitter(mapping=aes(x=factor(x), y=y), position=position_jitter(width=0.2))
As you can see, the x-variable is placed as a discrete variable just like it should be. I don't actually want that; rather, I'd like to see it spaced as the continuous variable that it is. However, I have to plot x as a factor to keep the crossbars, or else the crossbars start to go wonky on me. I would just use geom_boxplot(), but I want standard errors instead of interquartile range.
Thanks for any help, Paul
Upvotes: 0
Views: 853
Reputation: 16026
Without knowing your data it's hard to know, but it sounds like there are some shaky visualisation issues here... Regardless, I think this will be much more straightforward if you have different data sources - one for your points, and one for the boxes. Without addressing any of the other issues, here's how I would modify your approach:
# x and y are the vectors defined in the question
dat <- data.frame(x,y)
y.mean <- aggregate(dat$y, by=list(x=dat$x), mean)
names(y.mean) <- c('x', 'mean')
se <- function(x) sqrt(var(x) / length(x))
y.se <- aggregate(dat$y, by=list(x=dat$x), se)
names(y.se) <- c('x','se')
dat.mean <- merge(y.mean, y.se, by=c('x'))
library(ggplot2)
g <- ggplot(dat, aes(x, y)) + geom_point()
g + geom_crossbar(data = dat.mean, aes(y = mean,
ymin = mean - se, ymax = mean + se, group = x))
If you want the x axis labels to reflect your 'levels' (is x a factor? I think this is an important issue to work out), you can add:
scale_x_continuous(breaks = dat.mean$x)
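Putting the pieces together, here is a minimal self-contained sketch of the two-data-source approach. It rebuilds data shaped like the question's; the `set.seed()` call, the jitter width, and the crossbar `width = 0.8` are my own assumptions for illustration, not values from the question. Because x is continuous here, `width` is in the units of x, so you may need to tune it when two x values (41 and 42) sit close together.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is reproducible

# Same design as the question: five x values, ten observations each
x <- rep(c(15, 30, 41, 42, 45), each = 10)
y <- rnorm(50,
           mean = rep(c(47, 35, 31, 37, 30), each = 10),
           sd   = rep(c(15, 11, 12, 13, 10), each = 10))
dat <- data.frame(x, y)

# Summary data: one row per x value with its mean and standard error
se <- function(v) sqrt(var(v) / length(v))
dat.mean <- merge(
  aggregate(y ~ x, data = dat, FUN = mean),
  aggregate(y ~ x, data = dat, FUN = se),
  by = "x", suffixes = c(".mean", ".se")
)
names(dat.mean) <- c("x", "mean", "se")

# Raw points come from dat, crossbars from dat.mean; x stays numeric,
# so the axis keeps its continuous spacing
p <- ggplot(dat, aes(x, y)) +
  geom_jitter(width = 0.2, height = 0) +
  geom_crossbar(data = dat.mean,
                aes(y = mean, ymin = mean - se, ymax = mean + se, group = x),
                width = 0.8) +  # assumption: width in x units, tune to taste
  scale_x_continuous(breaks = dat.mean$x)
print(p)
```

Setting `height = 0` in `geom_jitter()` keeps the points at their true y values, so only the horizontal position is perturbed.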
Upvotes: 1