Reputation: 1007
I have a ggplot where some of the points are overlapping with a few others. I was wondering if there is a way to put the points one above the other. In my case, there are 2 points at most overlapping.
library(ggplot2)
x=c(1,1,2,3,4,4)
y=c('a1','a1','a2','a3','a4','a4')
type = c('A','B','C','A','B','C')
data = as.data.frame(cbind(x,y,type))
ggplot() + geom_point(data = data, aes(x=x,y=y, color = type, fill = type), size = 2, shape = 25)
Here we see that for the point x=1 and y=a1, type A is sitting beneath type B, but ideally I want type B to be shifted vertically by a bit.
If I use jitter, everything gets displaced, including the points that don't have an overlap.
Upvotes: 4
Views: 960
Reputation: 23231
We can use duplicated or any similar function to detect the overlap, then we can use R indexing with jitter to apply jitter selectively.
I wrote it as a function:
selective_jitter <- function(x, # x = x co-ordinate
                             y, # y = y co-ordinate
                             g  # g = group
){
  x <- as.numeric(as.character(x)) # also safe if x arrives as a factor or character
  y <- as.numeric(as.factor(y))    # category labels -> integer codes
  a <- cbind(x, y)
  dup <- duplicated(a)             # rows that repeat an earlier (x, y) pair
  a[dup, ] <- jitter(a[dup, ], amount = .15) # amount could be made a parameter
  final <- data.frame(a, g = g)    # data.frame() keeps g as labels; cbind() would coerce it
  return(final)
}
data <- selective_jitter(data$x, data$y, data$type)
ggplot() + geom_point(data = data, aes(x=x,y=y, color = g, fill = g), size = 2, shape = 25)
There are a lot of ways to write this differently or to tweak it. For instance, I think a very nice tweak would be to add an optional argument for the amount option of jitter().
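As a rough sketch of that tweak: the name selective_jitter2 and the default of 0.15 below are just illustrative choices, and the small data set is copied from the question. The amount argument is simply passed through to jitter():

```r
# Hypothetical variant of the function above with `amount` exposed as an argument.
selective_jitter2 <- function(x, y, g, amount = 0.15) {
  x <- as.numeric(as.character(x))  # works for numeric, character, or factor x
  y <- as.numeric(as.factor(y))     # category labels become integer codes
  a <- cbind(x, y)
  dup <- duplicated(a)              # rows repeating an earlier (x, y) pair
  a[dup, ] <- jitter(a[dup, ], amount = amount)  # only repeated rows are moved
  data.frame(a, g = g)
}

x <- c(1, 1, 2, 3, 4, 4)
y <- c("a1", "a1", "a2", "a3", "a4", "a4")
type <- c("A", "B", "C", "A", "B", "C")

set.seed(1)  # jitter() is random; fixing the seed makes the result repeatable
res <- selective_jitter2(x, y, type, amount = 0.3)
```

Only rows 2 and 6 (the repeats) are displaced, each by at most amount in x and in y; the other rows keep their original coordinates.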
Another potential improvement would be to use a caliper to look for (near-) duplicates as well as the exact duplicates (whereas duplicated will just find exact dupes).
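One possible way to sketch the caliper idea is shown here. The helper name near_duplicated and the default caliper of 0.05 are made up for illustration, and it assumes x and y are already numeric and on comparable scales. It flags every point that lies within the caliper (Euclidean distance) of any earlier point, so it could stand in for duplicated() when choosing which rows to jitter:

```r
# Hypothetical helper: TRUE for each point within `caliper` of an earlier point.
near_duplicated <- function(x, y, caliper = 0.05) {
  n <- length(x)
  flagged <- logical(n)              # the first point is never flagged
  for (i in seq_len(n)[-1]) {
    earlier <- seq_len(i - 1)
    # Euclidean distance from point i to every earlier point
    d <- sqrt((x[earlier] - x[i])^2 + (y[earlier] - y[i])^2)
    flagged[i] <- any(d <= caliper)
  }
  flagged
}

flags <- near_duplicated(c(1, 1.01, 2, 3), c(1, 1, 2, 3))
```

This is a simple pairwise loop, so it is fine for a few hundred points but would need something smarter for large data.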
Final note: sometimes when I do this I like to use semi-transparent colors rather than jitter. This variation works well only if the number of series (type) is small, so that you can do things like have 1 series in yellow and 1 in blue, and then their overlap would be green. There are existing solutions on Stack Overflow that demonstrate this if you're interested.
Upvotes: 6
Reputation: 17299
Just another way with transformed y values. The basic idea is similar to that of Hack-R:
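library(data.table)
setDT(data)
# rowid(x, y) counts repeats of the same (x, y) position: 1, 2, ...
# so only the second and later points at a position are shifted up
data[, y2 := as.numeric(y) + 0.2 * (rowid(x, y) - 1)]
ggplot() +
  geom_point(data = data,
             aes(x=x,y=y2, color = type, fill = type),
             size = 2, shape = 25) +
  scale_y_continuous(breaks = seq_len(uniqueN(data$y)), labels = levels(data$y))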
Note: I assume y is a factor as in your example. Otherwise you can convert y from character to factor with data$y <- factor(data$y).
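If you would rather not pull in data.table, roughly the same offset can be computed in base R. This is only a sketch using the data from the question; the variable names k and y2 are arbitrary, and ave() with seq_along is used here as a stand-in for a per-position running count:

```r
x <- c(1, 1, 2, 3, 4, 4)
y <- factor(c("a1", "a1", "a2", "a3", "a4", "a4"))

# Running count within each (x, y) position: 1 for the first point, 2 for the next, ...
k <- ave(seq_along(y), x, y, FUN = seq_along)

# Shift each repeat up by 0.2 per earlier point at the same position.
y2 <- as.numeric(y) + 0.2 * (k - 1)
```

y2 can then be mapped to the y aesthetic exactly as above, with the factor levels supplied as axis labels.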
Upvotes: 2