r: Automatically stagger overlapping labels in ggplot slopegraph

Question

While creating a slopegraph with ggplot2, as below, I find that many of my labels overlap when their data points are close together. How can I change the labelling to automatically stagger my labels if there is overlap?

library(ggplot2)
library(scales)
install.packages("Lock5Data", repos = "http://cran.us.r-project.org")  # you might need this
library(Lock5Data)
data("NBAStandings1e")
data("NBAStandings2016")


colnames(NBAStandings1e)[4] <- "year1"    # 2010-2011
colnames(NBAStandings2016)[4] <- "year2"  # 2015-2016
nba_df <- merge(NBAStandings1e[,c('Team','year1')], NBAStandings2016[,c('Team','year2')])
scale <- dim(nba_df)[1] 

a<-nba_df
p<-ggplot(nba_df) + geom_segment(aes(x=0,xend=scale,y=year1,yend=year2),size=.75)

# clear junk
p<-p + theme(panel.background = element_blank())
p<-p + theme(panel.grid=element_blank())
p<-p + theme(axis.ticks=element_blank())
# p<-p + theme(axis.text=element_blank())
p<-p + theme(panel.border=element_blank())
# p<-p + theme(panel.grid.major = element_line(linetype = "dashed", fill = NA))
p<-p + theme(panel.grid.major = element_line(linetype = "dashed",color = "grey80"))
p<-p + theme(panel.grid.major.x = element_blank())
p<-p + theme(axis.text.x = element_blank())


# annotate
p<-p + xlab("") + ylab("Percentage Wins")
p<-p + xlim((-5),(scale+12))
p<-p + geom_text(label="2010-2011 Season", x=0,     y=(1.1*(max(a$year2,a$year1))),hjust= 1.2,size=3)
p<-p + geom_text(label="2015-2016 Season", x=scale, y=(1.1*(max(a$year2,a$year1))),hjust=-0.1,size=3)
p<-p + geom_text(label=nba_df$Team, y=nba_df$year2, x=rep.int(scale,dim(a)[1]),hjust=-0.2,size=2)
p<-p + geom_text(label=nba_df$Team, y=nba_df$year1, x=rep.int( 0,dim(a)[1]),hjust=1.2,size=2)
p

eipi10 · Accepted Answer

Since the teams that overlap have the same winning percentage, you can deal with overlap more simply by combining the labels for teams with the same winning percentage. I've also made a few other changes to your code intended to streamline the process.

library(Lock5Data)
library(tidyverse)
library(scales)

data("NBAStandings1e")
data("NBAStandings2016")
colnames(NBAStandings1e)[4] <- "2010-11"    # 2010-2011
colnames(NBAStandings2016)[4] <- "2015-16"  # 2015-2016
nba_df <- merge(NBAStandings1e[,c('Team','2010-11')], NBAStandings2016[,c('Team','2015-16')])

# Convert data to long format
dat = gather(nba_df, Season, value, -Team) 

# Combine labels for teams with same winning percentage (see footnote * below)
dat_lab = dat %>% group_by(Season, value) %>% 
  summarise(Team = paste(Team, collapse="\U2014"))  # \U2014 is the emdash character

ggplot(dat, aes(Season, value, group=Team)) +
  geom_line() +
  theme_minimal() + theme(panel.grid.minor=element_blank()) +
  labs(y="Winning Percentage") +
  scale_y_continuous(limits=c(0,1), labels=percent) +
  geom_text(data=subset(dat_lab, Season=="2010-11"), aes(label=Team, x=0.98), hjust=1, size=2) +
  geom_text(data=subset(dat_lab, Season=="2015-16"), aes(label=Team, x=2.02), hjust=0, size=2)

[Figure: closeup of the combined labels on the slopegraph]
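To see what the label-combining step does without loading the NBA data, here is a minimal base-R sketch on made-up values. The `toy` data frame and its team names are hypothetical, and `aggregate()` stands in for the `group_by()` / `summarise()` pair used above. It should give an equivalent result, but treat it as an illustration rather than a drop-in replacement.

```r
# Toy data (made up for illustration; not the real standings)
toy <- data.frame(
  Team   = c("A", "B", "C", "D"),
  Season = c("2010-11", "2010-11", "2010-11", "2015-16"),
  value  = c(0.500, 0.500, 0.625, 0.500),
  stringsAsFactors = FALSE
)

# One row per (Season, value) pair; teams tied within a season
# are pasted into a single label separated by an em dash (\u2014)
toy_lab <- aggregate(Team ~ Season + value, data = toy,
                     FUN = paste, collapse = "\u2014")
toy_lab
```

Teams A and B share a value in the same season, so they collapse into one label. D has the same value but in a different season, so it keeps its own row.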

* If there are teams that overlap due to having very close, but unequal, winning percentages, you can still group them by rounding. For example, if you wanted to group teams with winning percentages that are the same when rounded to the nearest 2 percent, you could do:

dat_lab = dat %>% group_by(Season, group=round(value/0.02)*0.02) %>% 
  summarise(Team = paste(Team, collapse="\U2014"),
            value = mean(value))

This would result in the labels being placed at the mean value for their group.
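As a rough check of how the rounding behaves, here is a small base-R sketch on made-up values. The `toy` data frame, the `bin` column, and the team names are all hypothetical. It applies the same `round(value/0.02)*0.02` expression and then builds one label per bin at the bin's mean value.

```r
# Toy data (made up for illustration)
toy <- data.frame(
  Team  = c("A", "B", "C", "D"),
  value = c(0.500, 0.507, 0.561, 0.563),
  stringsAsFactors = FALSE
)

# Snap each value to the nearest multiple of 0.02
toy$bin <- round(toy$value / 0.02) * 0.02

# One label per bin: team names joined by an em dash (\u2014),
# positioned at the mean value of the teams in that bin
labs <- do.call(rbind, lapply(split(toy, toy$bin), function(g) {
  data.frame(Team  = paste(g$Team, collapse = "\u2014"),
             value = mean(g$value),
             stringsAsFactors = FALSE)
}))
labs
```

One thing to keep in mind: binning by rounding only merges values that fall in the same bin. Two teams at 0.509 and 0.511 are very close, yet they round to different multiples of 0.02 and would still get separate, overlapping labels. If that happens in your data, a wider bin or a different grouping rule may work better.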
