Reputation: 93
I'm currently trying to make my own graphical timeline like the one at the bottom of this page. I scraped the table from that link using the rvest package and cleaned it up.
Here is my code:
library(tidyverse)
library(rvest)
library(ggthemes)
library(lubridate)
URL <- "https://en.wikipedia.org/wiki/List_of_Justices_of_the_Supreme_Court_of_the_United_States"
justices <- URL %>%
read_html %>%
html_node("table.wikitable") %>%
html_table(fill = TRUE) %>%
data.frame()
# Removes weird row at bottom of the table
n <- nrow(justices)
justices <- justices[1:(n - 1), ]
# Separating the information I want
justices <- justices %>%
separate(Justice.2, into = c("name","year"), sep = "\\(") %>%
separate(Tenure, into = c("start", "end"), sep = "\n–") %>%
separate(end, into = c("end", "reason"), sep = "\\(") %>%
select(name, start, end)
# Removes wikipedia tags in start column
justices$start <- gsub('\\[e\\]$|\\[m\\]|\\[j\\]$', '', justices$start)
justices$start <- mdy(justices$start)
# This will replace incumbencies with NA
justices$end <- mdy(justices$end)
# Incumbent judges are still around!
justices[is.na(justices)] <- today()
# start and end are already Date objects after mdy(), so these calls leave them unchanged
justices$start <- as.Date(justices$start, format = "%m/%d/%Y")
justices$end <- as.Date(justices$end, format = "%m/%d/%Y")
justices %>%
ggplot(aes(reorder(x = name, X = start))) +
geom_segment(aes(xend = name,
yend = start,
y = end)) +
coord_flip() +
scale_y_date(date_breaks = "20 years", date_labels = "%Y") +
theme(axis.title = element_blank()) +
theme_fivethirtyeight() +
NULL
This is the output from ggplot (I'm not worried about aesthetics yet; I know it looks terrible!):
The goal for this plot is to order the judges chronologically by start date, so the judge with the earliest start date should be at the bottom and the judge with the most recent should be at the top. As you can see, there are multiple instances where this rule is broken.
Instead of sorting chronologically, it simply lists the judges in the order they appear in the data frame, which is also the order Wikipedia uses. If the sort worked, a line segment above another segment would always start further right than the one below it.
My understanding of reorder is that it will take the X = start from geom_segment and sort that and list the names in that order.
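One detail that may explain the out-of-order rows: base R's reorder() does not sort X directly. It summarises X once per level of x, using FUN = mean by default. If a name appears more than once (for example, a justice with two separate start dates), that name is ranked by the average of its dates. A minimal sketch with made-up toy data (the names and dates below are hypothetical, not taken from the scraped table):

```r
# Hypothetical toy data: "B" has two separate start dates
name  <- c("A", "B", "C", "B")
start <- as.Date(c("1800-01-01", "1790-01-01", "1795-01-01", "1830-01-01"))

# Default FUN = mean: "B" is ranked by the average of 1790 and 1830
by_mean <- levels(reorder(name, as.numeric(start)))

# FUN = min: "B" is ranked by its earliest start date
by_min <- levels(reorder(name, as.numeric(start), FUN = min))

print(by_mean)
print(by_min)
```

With the default, "B" ends up last even though it has the earliest start date; with FUN = min it comes first.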
The only help I could find for this problem suggests converting the dates to a factor and ordering them that way; however, I get the error:
Error: Invalid input: date_trans works with objects of class Date only.
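The error message suggests that the date column itself was turned into a factor, so scale_y_date no longer received a Date object. A sketch of one alternative, assuming the same kind of made-up toy data as a stand-in for the real table: leave the dates alone and make the names a factor whose levels follow the start dates.

```r
# Hypothetical toy data standing in for the scraped table
name  <- c("A", "B", "C", "B")
start <- as.Date(c("1800-01-01", "1790-01-01", "1795-01-01", "1830-01-01"))

# Converting the dates themselves to a factor drops the Date class
start_factor <- factor(start)
print(inherits(start_factor, "Date"))

# Order the names by start date instead; start remains a Date
name_ordered <- factor(name, levels = unique(name[order(start)]))
print(levels(name_ordered))
```

Here unique() keeps the first occurrence of each name after sorting, so a repeated name is placed by its earliest start date.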
Thank you for your help!
Upvotes: 1
Views: 2075
Reputation: 16871
You can make the name column a factor and use forcats::fct_reorder to reorder names based on start date. fct_reorder can take a function that's used for ordering start; you can use min() to order by the earliest start date for each justice. That way, judges with multiple start dates will be sorted according to the earliest one. Only a two-line change: add a mutate at the beginning of the pipe, and remove the reorder inside aes.
justices %>%
mutate(name = as.factor(name) %>% fct_reorder(start, min)) %>%
ggplot(aes(x = name)) +
geom_segment(aes(xend = name,
yend = start,
y = end)) +
coord_flip() +
scale_y_date(date_breaks = "20 years", date_labels = "%Y") +
theme(axis.title = element_blank()) +
theme_fivethirtyeight()
Created on 2018-06-29 by the reprex package (v0.2.0).
Upvotes: 1
Reputation: 23231
I would make this a comment, but I couldn't fit it.
This was an attempt I gave up on. It looks like it actually does fix the problem, but it broke several other aspects of the formatting and I've run out of time to fix it back.
justices <- justices[order(justices$start, decreasing = TRUE),]
any(diff(justices$start) > 0) # FALSE, i.e. it works
justices$id <- nrow(justices):1
ggplot(data = justices, mapping = aes(x = start, y = id)) +
scale_x_date(date_breaks = "20 years", date_labels = "%Y") +
scale_y_discrete(breaks=justices$id, labels = justices$name) +
geom_segment(aes(xend = end, yend = id), size = 5) +
theme(axis.title = element_blank()) +
theme_fivethirtyeight()
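One possible cause of the broken formatting, offered as a guess rather than a diagnosis: id is numeric, while scale_y_discrete expects a discrete variable. A minimal sketch that keeps id numeric and attaches the names through scale_y_continuous instead. The toy data frame and its values are hypothetical stand-ins for justices:

```r
library(ggplot2)

# Hypothetical toy data standing in for the scraped justices table
toy <- data.frame(
  name  = c("A", "B", "C"),
  start = as.Date(c("1990-01-01", "1970-01-01", "1980-01-01")),
  end   = as.Date(c("2000-01-01", "1985-01-01", "1995-01-01"))
)

# Sort by start date; id 1 is the earliest start and is drawn at the bottom
toy <- toy[order(toy$start), ]
toy$id <- seq_len(nrow(toy))

p <- ggplot(toy, aes(x = start, y = id)) +
  geom_segment(aes(xend = end, yend = id)) +
  scale_x_date(date_labels = "%Y") +
  # Numeric positions on the axis, labelled with the matching names
  scale_y_continuous(breaks = toy$id, labels = toy$name)
```

Printing p draws one segment per row, with the earliest start date at the bottom.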
Please also refer to this thread. GL!
Upvotes: 1