retrofuture

Reputation: 53

Visualize rank-change using alluvial in R ggalluvial

I have a pretty basic df in which I have calculated the rank-change of values between two timestamps:

  value rank_A rank_B group
1     A      1      1     A
2     B      2      3     A
3     C      3      2     B
4     D      4      4     B
5     E      5      8     A
6     F      6      5     C
7     G      7      6     C
8     H      8      7     A

What makes it a bit tricky (for me) is plotting the values on the Y-axis.

ggplot(df_alluvial, aes(y = value, axis1 = rank_A, axis2 = rank_B))+
  geom_alluvium(aes(fill = group), width = 1/12)+
  ...

As of now, I can plot the rank-change and the groups successfully, but they are not linked to my value-names - there are no axis names and I don't know how to add them.

In the end it should look similar to this: https://www.reddit.com/r/GraphicalExcellence/comments/4imh5f/alluvial_diagram_population_size_and_rank_of_uk/

Thanks for your advice!

Upvotes: 1

Views: 1133

Answers (1)

Steen Harsted

Reputation: 1932

Your update made the question clearer to me.

The y aesthetic should be numeric, and the data should be in 'long' format, with one row per city and year. I'm not sure how to change your data to fulfill these requirements, so I create some new data for this example. I have tried to make it similar to the data in the plot that you linked to.

Both the label and stratum aesthetics are mapped to the city names, so you can use geom_text(stat = "stratum") to label the strata.

# Load libraries
library(tidyverse)
library(ggalluvial)


# Create some data
df_alluvial <- tibble(
  city = rep(c("London", "Birmingham", "Manchester"), 4),
  year = rep(c(1901, 1911, 1921, 1931), each = 3),
  size = c(0, 10, 100, 10, 15, 100, 15, 20, 100, 30, 25, 100))

# Notice the data is in long-format
df_alluvial
#> # A tibble: 12 x 3
#>    city        year  size
#>    <chr>      <dbl> <dbl>
#>  1 London      1901     0
#>  2 Birmingham  1901    10
#>  3 Manchester  1901   100
#>  4 London      1911    10
#>  5 Birmingham  1911    15
#>  6 Manchester  1911   100
#>  7 London      1921    15
#>  8 Birmingham  1921    20
#>  9 Manchester  1921   100
#> 10 London      1931    30
#> 11 Birmingham  1931    25
#> 12 Manchester  1931   100

ggplot(df_alluvial,
       aes(x = as.factor(year), stratum = city, alluvium = city, 
           y = size,
           fill = city, label = city))+
  geom_stratum(alpha = .5)+
  geom_alluvium()+
  geom_text(stat = "stratum", size = 3)

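If you want to sort the cities by their size at each year, you can add decreasing = TRUE to every layer in the plot. It has to be set on all of them so that the strata, the alluvia, and the labels are ordered the same way.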

ggplot(df_alluvial,
       aes(x = as.factor(year), stratum = city, alluvium = city, 
           y = size,
           fill = city, label = city))+
  geom_stratum(alpha = .5, decreasing = TRUE)+
  geom_alluvium(decreasing = TRUE)+
  geom_text(stat = "stratum", size = 3, decreasing = TRUE)


Created on 2019-11-08 by the reprex package (v0.3.0)
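
As a possible starting point for the data in the question, tidyr::pivot_longer() can turn the two rank columns into the long format described above. This is only a minimal sketch: the names df_wide and df_long and the new column names timestamp and rank are my own choices, and the table is retyped by hand from the question.

```r
library(tibble)
library(tidyr)

# The wide table from the question, retyped by hand
# (df_wide is a hypothetical name, not from the original post)
df_wide <- tribble(
  ~value, ~rank_A, ~rank_B, ~group,
  "A",    1,       1,       "A",
  "B",    2,       3,       "A",
  "C",    3,       2,       "B",
  "D",    4,       4,       "B",
  "E",    5,       8,       "A",
  "F",    6,       5,       "C",
  "G",    7,       6,       "C",
  "H",    8,       7,       "A"
)

# One row per value and timestamp: rank_A / rank_B become the
# rows "A" / "B" in a new timestamp column, with the rank alongside
df_long <- pivot_longer(
  df_wide,
  cols = c(rank_A, rank_B),
  names_to = "timestamp",
  names_prefix = "rank_",
  values_to = "rank"
)

df_long
```

From there, timestamp would be the natural candidate for x and value for alluvium, in the same way year and city are used above, but I have not checked how the resulting plot compares with the one you linked to.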

Upvotes: 4
