Reputation: 1800
I am not able to get position_dodge to work in ggplot2 (version 3.3.0) for my data set, even though I can make it work for a toy data set (based on the discussion in a very early version here).
First, what works:
library(ggplot2)
dat <- data.frame(x=1:2, y=1:12, g=LETTERS[1:3])
dat
is
> dat
x y g
1 1 1 A
2 2 2 B
3 1 3 C
4 2 4 A
5 1 5 B
6 2 6 C
7 1 7 A
8 2 8 B
9 1 9 C
10 2 10 A
11 1 11 B
12 2 12 C
# plotting
ggplot(dat, aes(x=x, group=g)) +
geom_point(aes(y=y), position=position_dodge(width = 0.2))
which gives,
What does not work (my data set being dat1):
dat1 <- structure(list(GPVAR = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L), .Label = c("1",
"2", "3", "4"), class = "factor"), TIME = c(12.33, 24.33, 48.33,
72.33, 96.33, 12.33, 24.33, 48.33, 72.33, 96.33, 12.33, 24.33,
48.33, 72.33, 96.33, 12.33, 24.33, 48.33, 72.33, 96.33), PERC = c(69.4232142857143,
90.450496031746, 102.25248015873, 100.341482142857, 104.310987301587,
25.6843253968254, 49.9654761904762, 66.2337301587302, 71.6874007936508,
73.5505277777778, 42.4852380952381, 53.3393261904762, 62.0385523809524,
62.9715285714286, 65.5977922619048, 14.635119047619, 27.3870238095238,
41.2321428571429, 50.3591904761905, 56.0338928571429)), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))
dat1
# A tibble: 20 x 3
GPVAR TIME PERC
<fct> <dbl> <dbl>
1 2 12.3 69.4
2 2 24.3 90.5
3 2 48.3 102.
4 2 72.3 100.
5 2 96.3 104.
6 3 12.3 25.7
7 3 24.3 50.0
8 3 48.3 66.2
9 3 72.3 71.7
10 3 96.3 73.6
11 1 12.3 42.5
12 1 24.3 53.3
13 1 48.3 62.0
14 1 72.3 63.0
15 1 96.3 65.6
16 4 12.3 14.6
17 4 24.3 27.4
18 4 48.3 41.2
19 4 72.3 50.4
20 4 96.3 56.0
## plotting
ggplot(dat1, aes(x = TIME, group = GPVAR)) +
geom_point(aes(y = PERC), position = position_dodge(width = 0.2))
which gives a plot without horizontal dodging,
Running str() on both dat and dat1 shows that they are quite similar, so I am not sure what is going on.
str(dat)
'data.frame': 12 obs. of 3 variables:
$ x: int 1 2 1 2 1 2 1 2 1 2 ...
$ y: int 1 2 3 4 5 6 7 8 9 10 ...
$ g: Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3 1 ...
> str(dat1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 20 obs. of 3 variables:
$ GPVAR: Factor w/ 4 levels "1","2","3","4": 2 2 2 2 2 3 3 3 3 3 ...
$ TIME : num 12.3 24.3 48.3 72.3 96.3 ...
$ PERC : num 69.4 90.5 102.3 100.3 104.3 ...
Any help or explanation here would be very useful, thanks!
Upvotes: 1
Views: 217
Reputation: 591
It may not be obvious, but your plot does show some horizontal movement (dodging). The width = ... argument of position_dodge() controls how far the groups are spread apart in the x direction, measured in the units of the x axis, rather than adding random noise. Note that the range of x values differs considerably between the two plots.
Two fixes could help adjust the movement of the points. First, increase the width argument inside position_dodge() to a value greater than 0.2. Second (preferred), simply set position = "jitter" inside geom_point().
ggplot(dat1, aes(x = TIME, y = PERC, group = GPVAR)) +
geom_point(position = "jitter")
Note, you could also replace geom_point() with geom_jitter(). If you omit the width = ... argument, the jitter defaults to 40 percent of the resolution of the data. Try resolution(dat1$TIME) to see how this distance is calculated. Since jitter is added in both positive and negative directions, the jitter values will occupy 80% of the implied bins. For more information, please refer to the documentation.
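As a rough illustration of the default described above, the following sketch works out the jitter amount for the TIME values in the question. It uses a base R stand-in for what resolution() reports (the smallest gap between distinct values), so it only approximates the ggplot2 function. The names time_resolution, default_jitter and plot_jittered are hypothetical and exist only for this example.

```r
# TIME values from dat1 in the question (five time points, four groups)
time <- rep(c(12.33, 24.33, 48.33, 72.33, 96.33), times = 4)

# Base R stand-in for the idea behind resolution():
# the smallest gap between distinct x values (about 12 here)
time_resolution <- min(diff(sort(unique(time))))

# Default jitter: 40% of the resolution, applied to the left and to the right
default_jitter <- 0.4 * time_resolution

# Hypothetical helper: jitter only horizontally (height = 0 leaves PERC
# untouched) and fix the seed so the plot is reproducible.
# Assumes the data has TIME, PERC and GPVAR columns, as dat1 does.
plot_jittered <- function(data, width, seed = 1) {
  ggplot2::ggplot(data, ggplot2::aes(x = TIME, y = PERC, group = GPVAR)) +
    ggplot2::geom_point(
      position = ggplot2::position_jitter(width = width, height = 0, seed = seed)
    )
}
```

With dat1 from the question, plot_jittered(dat1, width = default_jitter) should reproduce the default horizontal spread, and a smaller width keeps the points closer to their true time. Setting height = 0 is a deliberate choice here: position = "jitter" on its own also perturbs the y values.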
Typically, these techniques are used when there is considerable over-plotting. You only have twenty values, so you don't have to overdo the degree of jitter.
I hope this helps.
Upvotes: 1
Reputation: 1228
Your code runs fine and as intended. There is a dodge in the second plot; it is just barely perceptible because the width in position = position_dodge(width = 0.2) is too small. It works in the first plot because the x axis is at a scale where that width makes a difference, but the second plot is at a different scale: the TIME values are 12 to 24 units apart. If you increase that parameter, you'll see your code works fine.
ggplot(dat1, aes(x = TIME, group = GPVAR)) +
geom_point(aes(y = PERC), position = position_dodge(width = 5))
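To make the scale argument concrete, here is a small base R sketch that compares the dodge width with the spacing of the x values in both data sets. The helper min_gap and the variables share_toy, share_time and scaled_width are hypothetical names introduced for this illustration; they are not part of ggplot2.

```r
# x values from the two data sets in the question
x_toy  <- rep(1:2, length.out = 12)                            # dat$x
x_time <- rep(c(12.33, 24.33, 48.33, 72.33, 96.33), times = 4) # dat1$TIME

# Hypothetical helper: smallest gap between distinct x values
min_gap <- function(x) min(diff(sort(unique(x))))

dodge_width <- 0.2

# Fraction of the smallest gap that the dodge width covers
share_toy  <- dodge_width / min_gap(x_toy)   # gap of 1, so the share is 0.2
share_time <- dodge_width / min_gap(x_time)  # gap of about 12, so under 2%

# Width that would give dat1 the same relative dodge as the toy data
scaled_width <- share_toy * min_gap(x_time)
```

A width of 0.2 covers a fifth of the gap in the toy data but less than 2% of the smallest gap in dat1, which is why the offset is hard to see. Passing scaled_width (about 2.4) to position_dodge(width = ...) would be one way to choose a value instead of guessing.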
An alternative here is also to use geom_jitter in place of geom_point:
ggplot(dat1, aes(x = TIME, group = GPVAR)) +
geom_jitter(aes(y = PERC))
Looking at this a bit more, it seems that because your x-axis is a continuous variable, the dodge width is applied as an absolute amount in data units. However, if I make your x-axis discrete, the dodging looks to be relative to the spacing between categories.
ggplot(dat1, aes(x = factor(TIME), group = GPVAR)) +
geom_point(aes(y = PERC), position = position_dodge(width = 0.2))
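One likely reason the discrete version behaves differently is sketched below in base R: converting TIME to a factor places the levels at consecutive integer positions, so the same width of 0.2 is now measured against a spacing of 1. The variable names are hypothetical and only serve this illustration.

```r
# Distinct TIME values from dat1 in the question
time <- c(12.33, 24.33, 48.33, 72.33, 96.33)

# On a discrete axis each factor level sits at an integer position
discrete_pos <- as.integer(factor(time))  # positions 1 to 5

# Spacing between neighbouring time points on each kind of axis
gaps_continuous <- diff(time)          # roughly 12, 24, 24, 24 (data units)
gaps_discrete   <- diff(discrete_pos)  # always 1
```

The trade-off is that the discrete axis draws the time points evenly spaced, so the shorter gap between the first two measurements is no longer visible. If the true time spacing matters, keeping TIME continuous and using a larger dodge width may be the better choice.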
Upvotes: 1