MLEN

Reputation: 2561

Create a line plot using categorical data and not connecting the lines

I'm trying to create a line plot where both x and y are factors, but I don't want the points to be connected across a gap (a missing value). How can I achieve this?

library(ggplot2)

df <- data.frame(x = c('a', 'b', 'c', 'd', 'e'), y = c('a', 'a', NA, 'a', 'a'))

ggplot(df, aes(x = x, y = y, group = y)) +
  geom_point() + 
  geom_line()

I don't want the NA to appear in the plot, and there shouldn't be a line between b and d.

Upvotes: 4

Views: 358

Answers (2)

tjebo

Reputation: 23737

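Another way is to convert y to a factor and plot its integer level codes instead. Use group = 1 so that all points belong to a single line; geom_line() then breaks the line at the NA. You can restore the original label with scale_y_continuous().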

library(ggplot2)
df <- data.frame(x = c('a', 'b', 'c', 'd', 'e'), 
                 y = c('a', 'a', NA, 'a', 'a'))

ggplot(df, aes(x = x, y = as.numeric(as.factor(y)), group = 1)) +
  geom_point() + 
  geom_line() +
  scale_y_continuous(breaks = 1, labels = 'a') +
  labs(y = 'y')
#> Warning: Removed 1 rows containing missing values (geom_point).

Created on 2020-03-04 by the reprex package (v0.3.0)
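
To see why this leaves a gap, it may help to look at the values that actually reach ggplot. This is a minimal sketch using the same y as above; the name y_codes is only for illustration. The NA stays NA after the conversion, and geom_line() does not draw through a missing value in the middle of a line.

```r
# Same y values as in the example data
y <- c('a', 'a', NA, 'a', 'a')

# Factor levels become integer codes; NA has no level, so it stays NA
y_codes <- as.numeric(as.factor(y))
print(y_codes)
#> [1]  1  1 NA  1  1
```

With more than one distinct y value, the breaks and labels passed to scale_y_continuous() would have to cover every level, not just 1 and 'a'.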

Upvotes: 0

lroha

Reputation: 34441

This may need extra work with your full dataset, but one approach is to create a grouping variable and map it to the group aesthetic in ggplot, so that unwanted connections aren't drawn.

library(dplyr)    # for %>%, mutate() and mutate_all()
library(ggplot2)

df <- data.frame(x = c('a', 'b', 'c', 'd', 'e'), y = c('a', 'a', NA, 'a', 'a'), stringsAsFactors = FALSE)

df %>% 
  mutate(grp = with(rle(y), rep(seq_along(lengths), lengths))) %>%  # y can't be a factor
  mutate_all(as.factor) %>%
  na.omit() %>%                              # Drop NA cases so they're not plotted
  ggplot(aes(x = x, y = y, group = grp)) +
  geom_point() + 
  geom_line() +
  scale_x_discrete(drop = FALSE)             # Preserve empty factor levels in the plot
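
As a rough illustration of what the rle() step produces, the sketch below computes the grouping vector on its own (the names runs and grp are just for this example). rle() treats the NA as its own run, so the values before and after it end up with different group ids and are never joined by a line.

```r
# Same y values as in the example data; must be character, not factor
y <- c('a', 'a', NA, 'a', 'a')

# Run-length encoding: runs are 'a' x2, NA x1, 'a' x2
runs <- rle(y)

# Give every element the index of the run it belongs to
grp <- rep(seq_along(runs$lengths), runs$lengths)
print(grp)
#> [1] 1 1 2 3 3
```

After na.omit() drops the third row, only groups 1 and 3 remain, which gives two separate line segments.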


Upvotes: 2
