Reputation: 479
I'm creating some box plots with geom_boxplot and geom_jitter in ggplot2. For the most part, my data points are clustered around the boxes, but a few aren't, and I'm not removing them as outliers. When the plot is rendered, the y axis is scaled evenly to fit those points at the top, which squashes the boxes. What I'd like to do is still show the points, but have the y-axis distance between 1 and 3 be the same as between 0 and 1 (approximately, anyway). If the values were larger, I would use a log or square-root transformation, but they're small numbers. Is there a way I can make this plot?
Here's some code:
library(ggplot2)

dat <- data.frame(cat = "A", result = rnorm(87, 0.26, 0.19))
ggplot(dat, aes(x = cat, y = result)) +
  geom_boxplot() +
  geom_jitter()
Which produces
Now add in some data points further away:
new_values <- data.frame(cat = "A", result = c(3.4, 3.2))
dat <- rbind(dat, new_values)
ggplot(dat, aes(x = cat, y = result)) +
  geom_boxplot() +
  geom_jitter()
which produces
What I'd like to do is adjust the scale of the y axis so that the box plot isn't compressed but it still shows the other two data points. Something like this.
Any suggestions welcome. Thanks in advance
Upvotes: 0
Views: 45
Reputation: 125373
In general, you can apply any transformation to a scale via the trans= argument (renamed to transform= in ggplot2 3.5.0). When you have specific needs and it's worth the effort, you can create a custom transformation. However, as a first step you might consider using one of the built-in transformations; e.g. scales::transform_modulus (a generalization of the Box-Cox transformation) seems to come close to what you have in mind:
library(ggplot2)
library(scales)

set.seed(123)

dat <- data.frame(cat = "A", result = rnorm(87, 0.26, 0.19))
new_values <- data.frame(cat = "A", result = c(3.4, 3.2))
dat <- rbind(dat, new_values)

ggplot(dat, aes(x = cat, y = result)) +
  # Hide the boxplot's own outlier points; geom_jitter() already draws every observation
  geom_boxplot(outliers = FALSE) +
  geom_jitter() +
  scale_y_continuous(
    # transform= replaces the deprecated trans= as of ggplot2 3.5.0
    transform = scales::transform_modulus(-1),
    breaks = c(0, .5, 1.75, 3.5)
  )
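If the built-in transformation doesn't give quite the spacing you want, the custom-transformation route mentioned above could look roughly like the sketch below. It is only an illustration, not a definitive implementation: squish_above, threshold and factor are hypothetical names I made up. It assumes scales >= 1.3.0 (for new_transform) and ggplot2 >= 3.5.0 (for transform= and outliers=). The idea is a piecewise-linear scale that leaves values up to the threshold untouched and compresses everything above it, so with threshold = 1 and factor = 0.5 the stretch from 1 to 3 takes up the same space as 0 to 1.

```r
library(ggplot2)
library(scales)

# Hypothetical helper (not part of scales): identity up to `threshold`,
# then distances above it are multiplied by `factor` (< 1 compresses them).
squish_above <- function(threshold = 1, factor = 0.5) {
  new_transform(
    name = "squish_above",
    transform = function(x) {
      pmin(x, threshold) + pmax(x - threshold, 0) * factor
    },
    inverse = function(y) {
      pmin(y, threshold) + pmax(y - threshold, 0) / factor
    }
  )
}

tr <- squish_above(threshold = 1, factor = 0.5)
tr$transform(c(0, 1, 3))  # 0 1 2: the 1-to-3 gap now equals the 0-to-1 gap

set.seed(123)
dat <- data.frame(cat = "A", result = rnorm(87, 0.26, 0.19))
dat <- rbind(dat, data.frame(cat = "A", result = c(3.4, 3.2)))

p <- ggplot(dat, aes(x = cat, y = result)) +
  geom_boxplot(outliers = FALSE) +
  geom_jitter() +
  scale_y_continuous(
    transform = tr,  # use trans = tr on ggplot2 < 3.5.0
    breaks = c(0, 0.5, 1, 2, 3)
  )
p
```

Keep in mind that with any non-linear scale transformation the axis is no longer uniform, so it's worth labelling the breaks explicitly (as above) so readers can see where the compression starts.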
Upvotes: 2