strugglebus

Reputation: 45

How do I put a categorical variable on the x-axis of a ggplot geom_line with multiple groups?

I'm starting with:

> library(reshape2)
> df57 <- data.frame(cellType = c("4.57", "4.57", "8.57", "8.57", "8.28.57", "8.28.57"),
                     ORR = c("PD", "nonPD"),
                     BL = rep(0, each=6),
                     Treated = c(10, -5, 8, -4, 15, -2))
> df57melt <- melt(df57)
> df57melt

   cellType   ORR variable value
1      4.57    PD       BL     0
2      4.57 nonPD       BL     0
3      8.57    PD       BL     0
4      8.57 nonPD       BL     0
5   8.28.57    PD       BL     0
6   8.28.57 nonPD       BL     0
7      4.57    PD  Treated    10
8      4.57 nonPD  Treated    -5
9      8.57    PD  Treated     8
10     8.57 nonPD  Treated    -4
11  8.28.57    PD  Treated    15
12  8.28.57 nonPD  Treated    -2

I want a line plot with treatment (BL, Treated) on the x-axis and value (continuous) on the y-axis. There should be one line for each combination of cell type (4.57, 8.57, and 8.28.57), coded by line color, and response (PD and nonPD), coded by line style.

I plot what I think should work:

>ggplot(data=df57melt, aes(x=variable, y = value)) + 
  geom_line(aes(linetype = ORR, color = cellType))

geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

#so I add group info
>ggplot(data=df57melt, aes(x=variable, y = value, group = cellType)) + 
  geom_line(aes(linetype = ORR, color = cellType))

Error: geom_path: If you are using dotted or dashed lines, colour, size and linetype must be constant over the line

#but if I change from categorical x to continuous x...
>ggplot(data=df57melt, aes(x=as.numeric(variable), y = value)) + 
  geom_line(aes(linetype = ORR, color = cellType))

That gives me something close to what I want, but not quite:

[Image: line plot]

How do I get ggplot to treat my x variable as categorical?

Upvotes: 2

Views: 10541

Answers (2)

eipi10

Reputation: 93871

Use the group aesthetic to tell ggplot which combinations of columns to treat as separate groups. ggplot will draw lines between all points within a given group.

By default, ggplot groups by the interaction of all discrete variables mapped in the plot, so with a categorical x-axis each x value ends up in a separate group. In this case, the data end up grouped by variable, ORR, and cellType, leaving only one value per group, which is why no lines can be drawn. We can override that by setting the group aesthetic. We want a separate line for each unique combination of ORR and cellType, so we use interaction(ORR, cellType) to group by each combination of these two variables.
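To see the difference between the two groupings, here is a small base-R check. It is only a sketch: the data frame is rebuilt from the question so the snippet runs on its own, and the names `long57`, `default_groups`, and `line_groups` are made up for illustration.

```r
# Rebuild the question's data (base R only, no packages needed).
df57 <- data.frame(
  cellType = c("4.57", "4.57", "8.57", "8.57", "8.28.57", "8.28.57"),
  ORR      = c("PD", "nonPD"),   # recycled to length 6
  BL       = 0,                  # recycled to length 6
  Treated  = c(10, -5, 8, -4, 15, -2)
)

# Long format: one row per cellType / ORR / time point.
long57 <- rbind(
  data.frame(df57[c("cellType", "ORR")], key = "BL",      value = df57$BL),
  data.frame(df57[c("cellType", "ORR")], key = "Treated", value = df57$Treated)
)

# Grouping when the discrete x variable takes part as well:
# every group holds a single row, so there is nothing to connect.
default_groups <- interaction(long57$key, long57$ORR, long57$cellType)
table(default_groups)

# Grouping by ORR and cellType only: each group holds a BL row
# and a Treated row, i.e. the two end points of one line.
line_groups <- interaction(long57$ORR, long57$cellType)
table(line_groups)
```

The second table should show two rows in each of the six groups, which is what geom_line needs to draw six separate lines.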

In the code below, I've also used gather from the tidyr package to convert the data frame to long format, since reshape2 is an older package that is no longer under active development.

library(tidyverse)
theme_set(theme_classic())

df57 %>% 
  gather(key, value, BL:Treated) %>% 
  ggplot(aes(x=key, y=value)) + 
    geom_line(aes(linetype = ORR, color = cellType, 
                  group=interaction(ORR, cellType))) +
    scale_x_discrete(expand=c(0.05, 0.05))

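As a side note, gather() still works, but the tidyr documentation now marks it as superseded by pivot_longer(). A rough equivalent of the first plot with pivot_longer() might look like the sketch below; it assumes tidyr 1.0.0 or later, and the names `long57` and `p` are chosen here for illustration.

```r
library(tidyr)
library(ggplot2)

# Same data as in the question.
df57 <- data.frame(
  cellType = c("4.57", "4.57", "8.57", "8.57", "8.28.57", "8.28.57"),
  ORR      = c("PD", "nonPD"),
  BL       = rep(0, 6),
  Treated  = c(10, -5, 8, -4, 15, -2)
)

# Reshape BL and Treated into key/value pairs (long format).
long57 <- pivot_longer(df57, cols = c(BL, Treated),
                       names_to = "key", values_to = "value")

# One line per ORR x cellType combination, with a categorical x-axis.
p <- ggplot(long57, aes(x = key, y = value)) +
  geom_line(aes(linetype = ORR, colour = cellType,
                group = interaction(ORR, cellType)))
p
```

The group aesthetic is the same as in the gather() version; only the reshaping step differs.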

What about something like this as an alternative:

df57 %>% 
  ggplot(aes(x=cellType, Treated, colour=ORR)) + 
    geom_hline(yintercept=0, colour="grey50", size=0.5) +
    geom_text(aes(label=sprintf("%1.1f", Treated))) +
    geom_text(data=. %>% 
                arrange(cellType) %>%
                group_by(ORR) %>% 
                slice(1) %>% 
                mutate(Treated=0.5*Treated),
              aes(label=gsub("nP", "n-P", ORR), x=0.65), hjust=1, fontface="bold") +
    geom_segment(aes(xend=cellType, yend=BL), linetype="11", size=0.3) +
    labs(x="Cell Type", y="Treatment Effect") +
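    scale_x_discrete(expand=expansion(add=c(1,0.25))) +  # expansion() replaces the deprecated expand_scale() (ggplot2 >= 3.3.0)
    guides(colour="none") +                              # "none" replaces the deprecated FALSE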
    theme_classic(base_size=15)


If PD and nonPD can both be positive (or both negative) then you could do something like this:

df57 <- data.frame(cellType = c("4.57", "4.57", "8.57", "8.57", "8.28.57", "8.28.57"), 
                   ORR = c("PD", "nonPD"),
                   BL = rep(0, each=6), 
                   Treated = c(10, 5, 8, 4, 15, -2))

pd=position_dodge(0.5)
df57 %>% 
  ggplot(aes(x=cellType, Treated, colour=ORR)) + 
    geom_hline(yintercept=0, colour="grey50", size=0.5) +
    geom_text(aes(label=sprintf("%1.1f", Treated)), 
              position=pd, show.legend=FALSE) +
    geom_linerange(aes(ymin=BL, ymax=Treated), 
                 linetype="11", size=0.3, position=pd) +
    labs(x="Cell Type", y="Treatment Effect") +
    theme_classic(base_size=15) +
    theme(legend.position="bottom",
          legend.margin=margin(t=-5)) +
    scale_x_discrete(expand=expansion(add=c(0.3,0.3))) +
    guides(colour=guide_legend(override.aes=list(linetype="solid", size=4)))


And you can also of course reverse the roles of ORR and cellType:

df57 %>% 
  ggplot(aes(x=ORR, Treated, colour=cellType)) + 
    geom_hline(yintercept=0, colour="grey50", size=0.5) +
    geom_text(aes(label=sprintf("%1.1f", Treated)), 
              position=pd, show.legend=FALSE) +
    geom_linerange(aes(ymin=BL, ymax=Treated), 
                   linetype="11", size=0.3, position=pd) +
    labs(x="ORR", y="Treatment Effect") +
    theme_classic(base_size=15) +
    theme(legend.position="bottom",
          legend.margin=margin(t=-5)) +
    scale_x_discrete(expand=expansion(add=c(0.3,0.3))) +
    guides(colour=guide_legend(override.aes=list(linetype="solid", size=4)))


Upvotes: 3

denis

Reputation: 5673

I know this doesn't directly answer the question, but I think you are forcing a line plot onto data that doesn't suit one. I would suggest a bar chart instead:

ggplot(data=df57melt) + 
  geom_col(aes(x=as.factor(variable), 
               y = value,fill = as.factor(cellType),
               color = ORR,
               group=cellType),
           position = position_dodge(),size = 2)+
  scale_fill_grey()


Upvotes: 1
