logicForPresident

Reputation: 311

Plot x variable as a factor while still retaining continuous placement on x-axis

I have two continuous variables that I am trying to plot against each other in ggplot2, but I want to show the data means and standard errors using geom_crossbar(). In order to do this I need to plot the x-axis as a factor, which is fine except that I cannot get the type of spacing that I want on the x-axis. Does anyone know of a way to make the x variable space itself as a continuous variable even when it is treated as discrete?

Some code...

 library(ggplot2)

 # assemble data, calculate means and standard errors
 x <- c(rep(15, 10), rep(30, 10), rep(41, 10), rep(42, 10), rep(45, 10))
 y <- c(rnorm(10, 47, 15), rnorm(10, 35, 11), rnorm(10, 31, 12), rnorm(10, 37, 13), rnorm(10, 30, 10))

 dat <- data.frame(x,y)
 y.mean <- aggregate(dat$y, by=list(x=dat$x), mean)
 names(y.mean) <- c('x', 'mean')
 dat <- merge(dat, y.mean, by=c('x'))

 se <- function(x) sqrt(var(x) / length(x))
 y.se <- aggregate(dat$y, by=list(x=dat$x), se)
 names(y.se) <- c('x','se')
 dat <- merge(dat, y.se, by=c('x'))

 g <- ggplot(dat, aes(x=factor(x), y=mean, ymin= mean - se, ymax= mean + se))
 g + geom_crossbar(width=0.5) + geom_jitter(mapping=aes(x=factor(x), y=y), position=position_jitter(width=0.2))

As you can see, the x-variable is placed as a discrete variable just like it should be. I don't actually want that; rather, I'd like to see it spaced as the continuous variable that it is. However, I have to plot x as a factor to keep the crossbars, or else the crossbars start to go wonky on me. I would just use geom_boxplot(), but I want standard errors instead of interquartile range.

Thanks for any help, Paul

Upvotes: 0

Views: 853

Answers (1)

alexwhan

Reputation: 16026

Without knowing your data it's hard to say, but it sounds like there may be some shaky visualisation choices here... Regardless, I think this will be much more straightforward if you use separate data frames: one for your points, and one for the crossbars. Leaving the other issues aside, here's how I would modify your approach:

dat <- data.frame(x,y)
y.mean <- aggregate(dat$y, by=list(x=dat$x), mean)
names(y.mean) <- c('x', 'mean')

se <- function(x) sqrt(var(x) / length(x))
y.se <- aggregate(dat$y, by=list(x=dat$x), se)
names(y.se) <- c('x','se')
dat.mean <- merge(y.mean, y.se, by=c('x'))

library(ggplot2)
g <- ggplot(dat, aes(x, y)) + geom_point()
g + geom_crossbar(data = dat.mean, aes(y = mean, 
  ymin = mean - se, ymax = mean + se, group = x))
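The code above relies on the `x` and `y` vectors from the question. For anyone who wants to run it in a fresh session, here is a self-contained sketch of the same two-data-frame idea. The fixed seed, the `tapply()` based summary, the object name `p`, and `width = 0.8` are my own assumptions for illustration and are not part of the original code. Because x is continuous here, the crossbar width is in x data units, so 0.8 keeps the bars at 41 and 42 from overlapping.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed only so the example is reproducible

# same design as in the question: 10 observations at each of 5 x values
x <- rep(c(15, 30, 41, 42, 45), each = 10)
y <- rnorm(length(x),
           mean = rep(c(47, 35, 31, 37, 30), each = 10),
           sd   = rep(c(15, 11, 12, 13, 10), each = 10))
dat <- data.frame(x, y)

# standard error of the mean, as defined in the question
se <- function(v) sqrt(var(v) / length(v))

# one row per x value: this is the data source for the crossbars
dat.mean <- data.frame(
  x    = sort(unique(dat$x)),
  mean = as.vector(tapply(dat$y, dat$x, mean)),
  se   = as.vector(tapply(dat$y, dat$x, se))
)

# raw points from dat, crossbars from dat.mean, x stays continuous
p <- ggplot(dat, aes(x, y)) +
  geom_point() +
  geom_crossbar(data = dat.mean,
                aes(y = mean, ymin = mean - se, ymax = mean + se, group = x),
                width = 0.8)  # assumption: width chosen in x data units
```

Printing `p` draws the plot.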


If you want the x axis labels to reflect your 'levels' (is x a factor? I think this is an important issue to work out), you can add:

scale_x_continuous(breaks = dat.mean$x)
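As a possible alternative (a sketch, not what the code above does), ggplot2's `stat_summary()` with the `mean_se` helper can compute the mean plus or minus one standard error per x value inside the plot, so no separate summary data frame is needed. The placeholder data, the seed, the object name `p2`, and `width = 0.8` are assumptions made for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed only so the example is reproducible

# assumption: placeholder data with the same x values as in the question
dat <- data.frame(
  x = rep(c(15, 30, 41, 42, 45), each = 10),
  y = rnorm(50, mean = 35, sd = 12)
)

p2 <- ggplot(dat, aes(x, y)) +
  geom_point() +
  # mean_se() returns y (mean), ymin and ymax (mean -/+ 1 standard error)
  stat_summary(fun.data = mean_se, geom = "crossbar", width = 0.8) +
  # put the axis breaks at the observed x values
  scale_x_continuous(breaks = unique(dat$x))
```

Whether this is preferable depends on whether you need the means and standard errors elsewhere; if you do, keeping an explicit summary data frame is probably simpler.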


Upvotes: 1
