Satya

Reputation: 1800

Dodging points in ggplot2 package not working

I cannot get position_dodge to work in ggplot2 (version 3.3.0) for my data set, even though it works for a toy dataset (based on the discussion in a very early version here).

First, what works:

library(ggplot2)
dat <- data.frame(x=1:2, y=1:12, g=LETTERS[1:3])

dat is

dat
   x  y g
1  1  1 A
2  2  2 B
3  1  3 C
4  2  4 A
5  1  5 B
6  2  6 C
7  1  7 A
8  2  8 B
9  1  9 C
10 2 10 A
11 1 11 B
12 2 12 C

# plotting
ggplot(dat, aes(x=x, group=g)) + 
    geom_point(aes(y=y), position=position_dodge(width = 0.2))

which gives,

position-dodge-working

What does not work (my data set being dat1):

dat1 <- structure(list(GPVAR = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 
3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L), .Label = c("1", 
"2", "3", "4"), class = "factor"), TIME = c(12.33, 24.33, 48.33, 
72.33, 96.33, 12.33, 24.33, 48.33, 72.33, 96.33, 12.33, 24.33, 
48.33, 72.33, 96.33, 12.33, 24.33, 48.33, 72.33, 96.33), PERC = c(69.4232142857143, 
90.450496031746, 102.25248015873, 100.341482142857, 104.310987301587, 
25.6843253968254, 49.9654761904762, 66.2337301587302, 71.6874007936508, 
73.5505277777778, 42.4852380952381, 53.3393261904762, 62.0385523809524, 
62.9715285714286, 65.5977922619048, 14.635119047619, 27.3870238095238, 
41.2321428571429, 50.3591904761905, 56.0338928571429)), row.names = c(NA, 
-20L), class = c("tbl_df", "tbl", "data.frame"))


dat1
# A tibble: 20 x 3
   GPVAR  TIME  PERC
   <fct> <dbl> <dbl>
 1 2      12.3  69.4
 2 2      24.3  90.5
 3 2      48.3 102. 
 4 2      72.3 100. 
 5 2      96.3 104. 
 6 3      12.3  25.7
 7 3      24.3  50.0
 8 3      48.3  66.2
 9 3      72.3  71.7
10 3      96.3  73.6
11 1      12.3  42.5
12 1      24.3  53.3
13 1      48.3  62.0
14 1      72.3  63.0
15 1      96.3  65.6
16 4      12.3  14.6
17 4      24.3  27.4
18 4      48.3  41.2
19 4      72.3  50.4
20 4      96.3  56.0

## plotting
ggplot(dat1, aes(x = TIME, group = GPVAR)) +
    geom_point(aes(y = PERC), position = position_dodge(width = 0.2))

which gives a plot without horizontal dodging,

position-dodge-not-working

Running str() on both dat and dat1 shows that they are quite similar, so I am not sure what is going on.

str(dat)
'data.frame':   12 obs. of  3 variables:
 $ x: int  1 2 1 2 1 2 1 2 1 2 ...
 $ y: int  1 2 3 4 5 6 7 8 9 10 ...
 $ g: Factor w/ 3 levels "A","B","C": 1 2 3 1 2 3 1 2 3 1 ...
str(dat1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   20 obs. of  3 variables:
 $ GPVAR: Factor w/ 4 levels "1","2","3","4": 2 2 2 2 2 3 3 3 3 3 ...
 $ TIME : num  12.3 24.3 48.3 72.3 96.3 ...
 $ PERC : num  69.4 90.5 102.3 100.3 104.3 ...

Any help or explanation here would be very useful, thanks!

Upvotes: 1

Views: 217

Answers (2)

Thomas Bilach

Reputation: 591

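It may not be obvious, but your plot does show some horizontal movement. In position_dodge(), the width = ... argument sets how far the groups are spread apart in the x direction, in the units of the x axis (a fixed offset per group, not random noise). Note that the range of x values differs considerably between the two plots: roughly 1 to 2 in the first and 12 to 96 in the second, so a width of 0.2 is barely visible in the second.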

Two fixes could help adjust the movement of the points. First, increase the width argument inside position_dodge() to a value greater than 0.2. Second (preferred), simply set position = "jitter" inside geom_point(), which adds random noise instead of a fixed offset.

ggplot(dat1, aes(x = TIME, y = PERC, group = GPVAR)) +
  geom_point(position = "jitter")

Note that you could also replace geom_point() with geom_jitter(). When the width = ... argument is omitted, the jitter defaults to 40 percent of the resolution of the data. Try resolution(dat1$TIME) to see how this distance is calculated. Since jitter is added in both positive and negative directions, the jittered values will occupy 80% of the implied bins. For more information, please refer to the documentation.
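As a rough check of the default jitter described above, the sketch below works out the numbers for the TIME column in base R. The `res` line is only an approximation of what resolution() does (an assumption for illustration: the real function also treats integer-like input specially and, by default, includes zero among the values), and the variable names are made up for this example.

```r
# TIME values from dat1: five time points, repeated for the four groups
time <- rep(c(12.33, 24.33, 48.33, 72.33, 96.33), times = 4)

# Approximate the resolution: smallest gap between distinct values
res <- min(diff(sort(unique(time))))

# Default jitter amount on each side of a point: 40% of the resolution
jitter_width <- 0.4 * res

print(res)
print(jitter_width)
```

For this data the smallest gap is about 12, so the default jitter moves points by up to roughly 4.8 units in either direction, far more than the 0.2 used for dodging in the question.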

Typically, these techniques are used when there is considerable over-plotting. You only have twenty values, so you don't have to overdo the degree of jitter.

I hope this helps.

Upvotes: 1

dshkol

Reputation: 1228

Your code runs fine and as intended. There is a dodge in the second plot; it is just barely perceptible because the width in position = position_dodge(width = 0.2) is too small for that axis. It works in the first plot because the x-axis there spans only 1 to 2, a scale at which 0.2 makes a visible difference, whereas the x-axis in the second plot spans roughly 12 to 96. If you increase that parameter, you'll see your code works fine.

ggplot(dat1, aes(x = TIME, group = GPVAR)) + 
  geom_point(aes(y = PERC), position = position_dodge(width = 5))
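To put numbers on why 0.2 is invisible and 5 is not, here is a small base-R sketch. `dodge_offsets` is a hypothetical helper written for this illustration, not a ggplot2 function; it assumes that points in n groups are spread evenly inside a band of the given total width, centred on the original x value, which should approximate what position_dodge() does for points.

```r
# Hypothetical helper (not part of ggplot2): offsets for n groups sharing
# one x value, spread evenly inside a band of total `width` centred on x.
# Assumption: this approximates how position_dodge() places points.
dodge_offsets <- function(n_groups, width) {
  idx <- seq_len(n_groups)
  width * ((idx - 0.5) / n_groups - 0.5)
}

# Span of TIME in dat1 (12.33 to 96.33)
x_range <- diff(range(c(12.33, 96.33)))

small <- dodge_offsets(4, width = 0.2)  # offsets with the original width
large <- dodge_offsets(4, width = 5)    # offsets with the larger width

# Total spread of the four groups as a share of the x range
print(diff(range(small)) / x_range)
print(diff(range(large)) / x_range)
```

Under these assumptions the four groups span well under 1% of the axis with width = 0.2, but around 4% with width = 5, which is consistent with the dodge only becoming visible in the second case.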


An alternative here is also to use geom_jitter in place of geom_point:

ggplot(dat1, aes(x = TIME, group = GPVAR)) + 
  geom_jitter(aes(y = PERC))

Looking at this a bit more, it seems that because your x-axis is a continuous variable, the dodge width is applied as an absolute amount in data units. If the x-axis is made discrete, however, the dodging appears relative to the spacing between categories.

ggplot(dat1, aes(x = factor(TIME), group = GPVAR)) + 
  geom_point(aes(y = PERC), position = position_dodge(width = 0.2))
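One way to read the difference: a discrete scale places the factor levels at consecutive integers, so neighbouring categories are 1 unit apart and a width of 0.2 is 20% of that gap. The sketch below (base R only, with variable names invented for this example) finds the width that would keep the same ratio on the continuous axis, using the smallest gap between time points as the reference.

```r
# TIME values from dat1: five time points, repeated for the four groups
time <- rep(c(12.33, 24.33, 48.33, 72.33, 96.33), times = 4)

# Discrete axis: factor levels sit at integer positions 1, 2, 3, ...
discrete_pos <- as.integer(factor(time))
discrete_gap <- min(diff(sort(unique(discrete_pos))))

# Continuous axis: smallest gap between distinct time points
continuous_gap <- min(diff(sort(unique(time))))

# Width that keeps the same dodge-to-gap ratio on the continuous axis
matched_width <- 0.2 / discrete_gap * continuous_gap
print(matched_width)
```

The result (about 2.4) could be passed as position_dodge(width = matched_width) with x = TIME. Because the time points are not evenly spaced, the dodge will still look tighter at the later, more widely separated times than it does on the discrete axis.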


Upvotes: 1
