Y.Coch

Reputation: 331

ggplot2 in R: geom_segment displays a different line than geom_line

Say I have this data frame:

library(data.table)
library(ggplot2)

treatment <- c(rep("A",6),rep("B",6),rep("C",6),rep("D",6),rep("E",6),rep("F",6))
year <- as.numeric(c(1999:2004,1999:2004,2005:2010,2005:2010,2005:2010,2005:2010))
variable <- c(runif(6,4,5),runif(6,5,6),runif(6,3,4),runif(6,4,5),runif(6,5,6),runif(6,6,7))
se <- c(runif(6,0.2,0.5),runif(6,0.2,0.5),runif(6,0.2,0.5),runif(6,0.2,0.5),runif(6,0.2,0.5),runif(6,0.2,0.5))
id <- 1:36
# cbind() builds a character matrix, so the numeric columns are converted back below
df1 <- as.data.table(cbind(id,treatment,year,variable,se))

df1$year <- as.numeric(df1$year)
df1$variable <- as.numeric(df1$variable)
df1$se <- as.numeric(df1$se)

As I mentioned in a previous question (draw two lines with the same origin using ggplot2 in R), I wanted to use ggplot2 to display my data in a specific way.

I managed to do so using the following script:

y1 <- df1[df1$treatment=='A'&df1$year==2004,]$variable
y2 <- df1[df1$treatment=='B'&df1$year==2004,]$variable
y3 <- df1[df1$treatment=='C'&df1$year==2005,]$variable
y4 <- df1[df1$treatment=='D'&df1$year==2005,]$variable
y5 <- df1[df1$treatment=='E'&df1$year==2005,]$variable
y6 <- df1[df1$treatment=='F'&df1$year==2005,]$variable

p <- ggplot(df1,aes(x=year,y=variable,group=treatment,color=treatment))+
geom_line(aes(y = variable, group = treatment, linetype = treatment, color = treatment),size=1.5,lineend = "round") +
scale_linetype_manual(values=c('solid','solid','solid','dashed','solid','dashed')) +
geom_point(aes(colour=factor(treatment)),size=4)+
geom_errorbar(aes(ymin=variable-se,ymax=variable+se),width=0.2,size=1.5)+
guides(colour = guide_legend(override.aes = list(shape=NA,linetype = c("solid", "solid",'solid','dashed','solid','dashed'))))

p+labs(title="Title", x="years", y = "Variable 1")+
  theme_classic() +
scale_x_continuous(breaks=c(1998:2010), labels=c(1998:2010),limits=c(1998.5,2010.5))+
  geom_segment(aes(x=2004, y=y1, xend=2005, yend=y3),colour='blue1',size=1.5,linetype='solid')+
  geom_segment(aes(x=2004, y=y1, xend=2005, yend=y4),colour='blue1',size=1.5,linetype='dashed')+
  geom_segment(aes(x=2004, y=y2, xend=2005, yend=y5),colour='red3',size=1.5,linetype='solid')+
  geom_segment(aes(x=2004, y=y2, xend=2005, yend=y6),colour='red3',size=1.5,linetype='dashed')+
  scale_color_manual(values=c('blue1','red3','blue1','blue1','red3','red3'))+
  theme(text = element_text(size=12))

As you can see I used both geom_line and geom_segment to display the lines for my graph.

[Figure: plot produced by the script above]

It's almost perfect, but if you look closely, the segments drawn between 2004 and 2005 do not have the same line size as the other lines, even though I used the same argument values in the script (i.e. size=1.5 and linetype='solid' or 'dashed').

Of course I could manually change the size of the segments to get similar lines, but when I do that, the segments are not as smooth as the lines drawn with geom_line. I also get the same problem (different line shapes) when I put the size or linetype arguments inside aes().

Do you have any idea what causes this difference, and how I can get exactly the same shapes for both my segments and my lines?

Upvotes: 2

Views: 3633

Answers (1)

Dave Gruenewald

Reputation: 5689

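It seems to be an anti-aliasing issue with geom_segment, but drawing the connecting lines with separate geom_segment calls seems like a somewhat cumbersome approach to begin with. I think I have resolved your issue by duplicating the A and B treatments in the original data frame, so that geom_line draws every line, including the 2004 to 2005 connections.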

# First we are going to duplicate and rename the 'shared' treatments
library(dplyr)
library(ggplot2)

df1 %>% 
  filter(treatment %in% c("A", "B")) %>% 
  mutate(treatment = ifelse(treatment == "A",
                            "AA", "BB")) %>% 
  bind_rows(df1) %>% # This rejoins with the original data
  # Now we create `treatment_group` and `line_type` variables
  mutate(treatment_group = ifelse(treatment %in% c("A", "C", "D", "AA"),
                                  "treatment1",
                                  "treatment2"), # This variable will denote color
         line_type = ifelse(treatment %in% c("AA", "BB", "D", "F"),
                            "type1",
                            "type2")) %>% # And this variable denotes the line type

# Now pipe into ggplot
  ggplot(aes(x = year, y = variable,
             group = interaction(treatment_group, line_type), # grouping by both linetype and color
             color = treatment_group)) +
  geom_line(aes(x = year, y = variable, linetype = line_type), 
            size = 1.5, lineend = "round") +
  geom_point(size=4) +
  # The rest here is more or less the same as what you had
  geom_errorbar(aes(ymin = variable-se, ymax = variable+se), 
                width = 0.2, size = 1.5) +
  scale_color_manual(values=c('blue1','red3')) +
  scale_linetype_manual(values = c('dashed', 'solid')) +
  labs(title = "Title", x = "Years", y = "Variable 1") +
  scale_x_continuous(breaks = c(1998:2010), 
                     limits = c(1998.5, 2010.5))+
  theme_classic() +
  theme(text = element_text(size=12))

Which will give you the following plot:

[image]

My numbers are different since they were randomly generated.
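
If you would rather keep the geom_segment approach, one likely contributor to the heavier, rougher look is overplotting: constants placed inside aes() are recycled to every row of the inherited data, so each segment is drawn once per row (36 times with df1) and the anti-aliased edges stack up. The segments in the question also omit lineend = "round", which geom_line was given. The following is a minimal sketch, not a confirmed diagnosis of your exact plot. It uses a small made-up data frame (df, y_from and y_to are illustrative names) and assumes ggplot2 3.4.0 or later for linewidth (use size on older versions). annotate() draws the segment only once.

```r
library(ggplot2)

set.seed(1)  # fixed seed so the example is repeatable
df <- data.frame(
  treatment = rep(c("A", "C"), each = 6),
  year      = c(1999:2004, 2005:2010),
  variable  = c(runif(6, 4, 5), runif(6, 3, 4))
)
y_from <- df$variable[df$treatment == "A" & df$year == 2004]
y_to   <- df$variable[df$treatment == "C" & df$year == 2005]

base <- ggplot(df, aes(x = year, y = variable, group = treatment)) +
  geom_line(linewidth = 1.5, lineend = "round")

# Constants inside aes() are recycled to every row of the inherited data,
# so this layer draws the same segment nrow(df) times on top of itself.
p_overplotted <- base +
  geom_segment(aes(x = 2004, y = y_from, xend = 2005, yend = y_to),
               linewidth = 1.5, lineend = "round")

# annotate() does not inherit the plot data, so the segment is drawn once.
p_single <- base +
  annotate("segment", x = 2004, y = y_from, xend = 2005, yend = y_to,
           linewidth = 1.5, lineend = "round")

# Number of rows each segment layer actually draws
n_overplotted <- nrow(layer_data(p_overplotted, 2))
n_single      <- nrow(layer_data(p_single, 2))
```

layer_data() makes the difference visible: the geom_segment layer holds one row per row of df, while the annotate layer holds a single row.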

You can then modify the legend to your liking, but my recommendation is to label the lines directly with something like geom_text, and be sure to set check_overlap = TRUE (geom_label does not support that argument).
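
As a rough sketch of the direct-labelling idea, the snippet below places a text label at the last point of each line. check_overlap is an argument of geom_text, so that is the geom used here. The data frame, the line_id column and the label strings are made up for illustration, and linewidth assumes ggplot2 3.4.0 or later (use size on older versions).

```r
library(ggplot2)

set.seed(1)  # fixed seed so the example is repeatable
df <- data.frame(
  line_id  = rep(c("solid blue", "dashed blue"), each = 6),  # hypothetical labels
  year     = rep(2005:2010, times = 2),
  variable = c(runif(6, 3, 4), runif(6, 4, 5))
)

# One row per line: the last year of each series, used to place its label
label_df <- df[df$year == max(df$year), ]

p_labels <- ggplot(df, aes(x = year, y = variable, group = line_id)) +
  geom_line(aes(linetype = line_id), linewidth = 1.5, lineend = "round") +
  # check_overlap = TRUE skips any label that would overlap an earlier one
  geom_text(data = label_df, aes(label = line_id),
            hjust = 0, nudge_x = 0.15, check_overlap = TRUE) +
  # extra room on the right so the labels are not clipped
  scale_x_continuous(breaks = 2005:2010, limits = c(2004.5, 2012)) +
  theme_classic() +
  theme(legend.position = "none")
```

Printing p_labels shows the plot; with labels at the line ends the legend can be dropped, as done here with legend.position = "none".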

Hope this helps!

Upvotes: 2
