dylanjm

Reputation: 2101

Apply Geom Layer Conditionally - Separate Points & Lines

I have a data set similar to the one below where I have a lot of data for certain groups and then only single observations for other groups. I would like my single observations to show up as points but the other groups with multiple observations to show up as lines (no points). My code is below:

EDIT: I'm trying to find a way to do this without passing separate datasets to the geom_* calls, because of the issues that causes with the legend. A since-deleted answer handled the legend but didn't remove the points from the lines. Ideally I'd like a single legend where a point appears only for groups with a single observation.

library(tidyverse)

dat <- tibble(x = runif(10, 0, 5),
              y = runif(10, 0, 20),
              group = c(rep("Group1", 4),
                        rep("Group2", 4),
                        "Single Point 1",
                        "Single Point 2")
              )

dat %>% 
  ggplot(aes(x = x, y = y, color = group)) + 
  geom_point() +
  geom_line()

Created on 2019-04-02 by the reprex package (v0.2.1)

Upvotes: 1

Views: 306

Answers (1)

aosmith

Reputation: 36114

Only plot the data with 1 point in geom_point() and the data with >1 point in geom_line(). These can be precalculated in mutate().

# Add the per-group observation count as a column, then drop the grouping
dat = dat %>%
     group_by(group) %>%
     mutate(n = n() ) %>%
     ungroup()

dat %>% 
     ggplot(aes(x = x, y = y, color = group)) + 
     geom_point(data = filter(dat, n == 1) ) +
     geom_line(data = filter(dat, n > 1) ) 

Having the legend match this is trickier. This is the sort of thing that the override.aes argument in guide_legend() can be useful for.

In your case I would first calculate the number of observations in each group separately, since that is what determines whether a group is drawn as a line or a point.

sumdat = dat %>%
     group_by(group) %>%
     summarise(n = n() )

The result has one row per group, in the same order as the groups appear in the legend (sorted group names here, or factor levels if group were a factor), which is why this works.
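The ordering assumption can be checked directly. The sketch below is not part of the original answer: it rebuilds a small stand-in for dat with fixed coordinates (an assumption, used only for reproducibility) and compares the row order of sumdat with sort(unique(group)), which is, as far as I know, how ggplot2 orders a character variable in a discrete legend.

```r
library(dplyr)

# Stand-in data with fixed coordinates (assumption: any x/y values would do)
dat <- data.frame(
  x = 1:10,
  y = 10:1,
  group = c(rep("Group1", 4), rep("Group2", 4),
            "Single Point 1", "Single Point 2"),
  stringsAsFactors = FALSE
)

# One row per group with its observation count
sumdat <- dat %>%
  group_by(group) %>%
  summarise(n = n())

# Expected legend order for a character column: sorted unique values
legend_order <- sort(unique(dat$group))

# TRUE when the rows of sumdat line up with the legend keys
order_matches <- identical(sumdat$group, legend_order)
order_matches
```

If group were a factor with custom levels, comparing against levels(dat$group) instead would be the analogous check.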

Now we need to remove lines and keep points in the legend whenever the group has only a single observation. A linetype of 0 is a blank line and a shape of NA means no shape. I use an ifelse() statement for both linetype and shape in override.aes, based on the number of observations per group.

dat %>% 
     ggplot(aes(x = x, y = y, color = group)) + 
     geom_point(data = filter(dat, n == 1) ) +
     geom_line(data = filter(dat, n > 1) ) +
     guides(color = guide_legend(override.aes = list(linetype = ifelse(sumdat$n == 1, 0, 1),
                                                     shape = ifelse(sumdat$n == 1, 19, NA) ) ) )
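It can help to build the two override vectors on their own and inspect them before passing them to guide_legend(). The following is a minimal sketch of that variation, not the answer's exact code: the names leg_linetype and leg_shape and the set.seed() call are my own additions for illustration.

```r
library(dplyr)
library(ggplot2)

set.seed(42)  # assumption: fixed seed so the example is reproducible
dat <- data.frame(
  x = runif(10, 0, 5),
  y = runif(10, 0, 20),
  group = c(rep("Group1", 4), rep("Group2", 4),
            "Single Point 1", "Single Point 2"),
  stringsAsFactors = FALSE
)

# Per-row group size, used to route rows to geom_point() or geom_line()
dat <- dat %>%
  group_by(group) %>%
  mutate(n = n()) %>%
  ungroup()

# One row per group, in legend order
sumdat <- dat %>%
  group_by(group) %>%
  summarise(n = n())

# Legend overrides, one entry per legend key
leg_linetype <- ifelse(sumdat$n == 1, 0, 1)   # 0 = blank line, 1 = solid
leg_shape    <- ifelse(sumdat$n == 1, 19, NA) # 19 = filled circle, NA = none

p <- ggplot(dat, aes(x = x, y = y, color = group)) +
  geom_point(data = filter(dat, n == 1)) +
  geom_line(data = filter(dat, n > 1)) +
  guides(color = guide_legend(
    override.aes = list(linetype = leg_linetype, shape = leg_shape)
  ))
```

Printing leg_linetype and leg_shape next to sumdat$group makes it easy to see which legend key gets a line and which gets a point; printing p draws the plot.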


Upvotes: 4
