David Russell

Reputation: 13

Can you track and label a single alluvium using the ggalluvial package for ggplot2 in R?

I'm working on an alluvial diagram in R, based on the "student curricula" example in the ggalluvial vignette. I want to track a single cohort/alluvium (in the majors dataset, a single student) across the whole diagram by labeling the alluvium at each axis. I've found, however, that the labels only line up with the flows when lode.guidance (in geom_flow) is set to "zigzag"; no other setting works.

Using the vignette example, you can label the alluvia with the student ID number as follows. The only changes I made from the vignette example are flagged with comments:

    library(ggplot2)
    library(ggalluvial)

    data(majors)
    majors$curriculum <- as.factor(majors$curriculum)
    ggplot(majors,
           aes(x = semester, stratum = curriculum, alluvium = student,
               fill = curriculum, label = student)) +  # changed from label = alluvium
      scale_fill_brewer(type = "qual", palette = "Set2") +
      geom_flow(stat = "alluvium", lode.guidance = "frontback",
                color = "darkgray") +  # can change lode.guidance parameter here in geom_flow
      geom_stratum() +
      geom_text(stat = "alluvium", size = 3)  # added this geom_text to get the label

Which produces the following alluvial diagram:

(image of inconsistent flows using frontback)

There are some inconsistencies in showing the movement of an alluvium (a student) from axis to axis. Some students are "shuffled" in their shift from one axis to the next. For example, in the flow from CURR3 to CURR5, student 10 becomes student 2. In the same shift, student 6 becomes student 10, etc.

The same problem occurs with all other lode.guidance settings (forward, rightward, backward, leftward, frontback, rightleft, backfront, leftright), except for "zigzag", which shows it perfectly.

(image of correct flows using zigzag)

My question is this: is tracking a single alluvium from axis to axis using ggalluvial supposed to be possible using all lode.guidance settings, or is this a bug in the package? Or is "zigzag" the only lode.guidance parameter that is meant for tracking an alluvium?

Any help with this is much appreciated! Of course, using "zigzag" works for my graph, but I wanted to let everyone know this issue is out there and to see if anyone could clear up my confusion.

Upvotes: 1

Views: 923

Answers (1)

Cory Brunson

Reputation: 718

Certainly each alluvium should correspond to a single case, whatever the parameter settings. The issue here is that the alluvium stat (statistical transformation) is being used to produce two layers in the plot under different parameter settings: the flow layer with lode.guidance set to "frontback" and the text layer with lode.guidance defaulting to "zigzag". Each layer therefore orders the lodes within the strata independently, so a label can land on a different flow than the one it belongs to. This is briefly discussed in a recent package vignette but it's otherwise not well documented.

One solution is to make sure that every use of a given stat in a plot is passed the same parameters. Another is to set a global option that controls the default settings for each stat. Both approaches are shown below and produce the plot I think you have in mind.

    library(ggalluvial)
    #> Loading required package: ggplot2
    data(majors)
    majors$curriculum <- as.factor(majors$curriculum)
    # apply the same parameter setting to every instance of the alluvium stat
    ggplot(majors,
           aes(x = semester, stratum = curriculum, alluvium = student,
               fill = curriculum, label = student)) +
      scale_fill_brewer(type = "qual", palette = "Set2") +
      geom_flow(stat = "alluvium", lode.guidance = "frontback",
                color = "darkgray") +
      geom_stratum() +
      geom_text(stat = "alluvium", lode.guidance = "frontback", size = 3)

    # alternatively, set a package-specific global option
    options(ggalluvial.lode.guidance = "frontback")
    ggplot(majors,
           aes(x = semester, stratum = curriculum, alluvium = student,
               fill = curriculum, label = student)) +
      scale_fill_brewer(type = "qual", palette = "Set2") +
      geom_flow(stat = "alluvium", color = "darkgray") +
      geom_stratum() +
      geom_text(stat = "alluvium", size = 3)

Created on 2020-02-05 by the reprex package (v0.3.0)
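
If you want to compare several lode.guidance settings, one way to keep the flow and text layers in sync is to wrap the plot in a small function, so that a single argument feeds both uses of the alluvium stat. This is only a sketch of the first approach above: the function name plot_majors and its guidance argument are made up for illustration and are not part of ggalluvial.

```r
library(ggplot2)
library(ggalluvial)

data(majors)
majors$curriculum <- as.factor(majors$curriculum)

# hypothetical helper (not part of ggalluvial): the single `guidance`
# argument is passed to both layers that use the alluvium stat
plot_majors <- function(guidance = "zigzag") {
  ggplot(majors,
         aes(x = semester, stratum = curriculum, alluvium = student,
             fill = curriculum, label = student)) +
    scale_fill_brewer(type = "qual", palette = "Set2") +
    geom_flow(stat = "alluvium", lode.guidance = guidance,
              color = "darkgray") +
    geom_stratum() +
    geom_text(stat = "alluvium", lode.guidance = guidance, size = 3)
}

# the same setting now reaches the flows and the labels
p <- plot_majors("frontback")
p
```

If you go the global-option route instead, base R's options() returns the previous values, so old <- options(ggalluvial.lode.guidance = "frontback") followed later by options(old) should put the setting back once you're done.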

Upvotes: 2
