user10806160

R - How to make an alluvial diagram

I want to make an Alluvial diagram using library(alluvial)

My dataframe looks like this:

  > id   Diagnose 1      Diagnose 2     Diagnose 3   
    1    Cancer          cancer           cancer            
    2    Headache        Breastcancer     Breastcancer             
    3    Breastcancer    Breastcancer     cancer   
    4    Cancer          cancer           cancer            
    5    Cancer          Breastcancer     Breastcancer             
    6    Cancer          Breastcancer     cancer            

etc.

The dataframe shows the name of each diagnosis given by the doctor (these are just examples, not real diagnoses).

So for patient id 1, the first diagnosis is cancer, the second is also cancer and the last one is also cancer. For patient number 2, the first diagnosis is headache, then the patient is given the diagnosis Breastcancer and so on.

I want to make an alluvial diagram that shows how the diagnosis of each patient develops, and that groups together all patients who have "cancer" as their first diagnosis, and so on. How can I make an alluvial diagram like this one? [image]

Upvotes: 1

Views: 2247

Answers (1)

s__

Reputation: 9525

You should first aggregate your data into one row per combination of diagnoses with its frequency, then pass the result to the alluvial function:

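library(dplyr)      # to manipulate data
library(alluvial)   # provides alluvial()

# count how many patients follow each Diagnose1 -> Diagnose2 -> Diagnose3 path
allu <- data %>%
        group_by(Diagnose1, Diagnose2, Diagnose3) %>%   # grouping
        summarise(Freq = n()) %>%                       # adding frequencies
        ungroup()                                       # drop the leftover grouping

# here the plot: one axis per diagnosis column, ribbon width given by Freq
alluvial(allu[, 1:3], freq = allu$Freq)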



using this data (I removed the spaces from the column names):

data <- read.table(text = "id   Diagnose1      Diagnose2     Diagnose3        
    1    Cancer          cancer           cancer            
    2    Headache        Breastcancer     Breastcancer             
    3    Breastcancer    Breastcancer     cancer   
    4    Cancer          cancer           cancer            
    5    Cancer          Breastcancer     Breastcancer             
    6    Cancer          Breastcancer     cancer      ",header = T)
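
If you would rather not depend on dplyr for the counting step, the same frequency table can be built in base R. This is only a minimal sketch under the assumptions of this answer: the column names Diagnose1 to Diagnose3 and the six example rows above. The plotting call is guarded so the snippet still runs when the alluvial package is not installed.

```r
# Example data from the answer (column names without spaces)
data <- read.table(text = "id   Diagnose1      Diagnose2     Diagnose3
    1    Cancer          cancer           cancer
    2    Headache        Breastcancer     Breastcancer
    3    Breastcancer    Breastcancer     cancer
    4    Cancer          cancer           cancer
    5    Cancer          Breastcancer     Breastcancer
    6    Cancer          Breastcancer     cancer", header = TRUE, stringsAsFactors = FALSE)

# Give every patient a weight of 1, then sum the weights per diagnosis path.
# The result has the columns Diagnose1, Diagnose2, Diagnose3, Freq.
allu <- aggregate(Freq ~ Diagnose1 + Diagnose2 + Diagnose3,
                  data = transform(data, Freq = 1),
                  FUN = sum)
print(allu)

# Draw the diagram only if the alluvial package is available
if (requireNamespace("alluvial", quietly = TRUE)) {
  alluvial::alluvial(allu[, 1:3], freq = allu$Freq)
}
```

With the six example patients this gives five distinct paths, because patients 1 and 4 share the path Cancer, cancer, cancer. Note that "Cancer" and "cancer" are different strings, so they are treated as separate categories.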

EDIT

If you have NAs, you can try to replace them in this way:

# first, read the data with stringsAsFactors = FALSE so the diagnosis columns are character vectors, in my case
data <- read.table(text = "id   Diagnose1      Diagnose2     Diagnose3
    1    Cancer          cancer           cancer
    2    Headache        Breastcancer     Breastcancer
    3    Breastcancer    Breastcancer     cancer
    4    Cancer          NA               cancer
    5    Cancer          Breastcancer     Breastcancer
    6    Cancer          Breastcancer     cancer      ", header = TRUE, stringsAsFactors = FALSE)

# second, replace them with something you like:
data[is.na(data)] <- 'nothing'

Finally, you can draw the plot as before, and the word chosen to replace the NAs will appear in it as its own category.
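
To make the NA handling concrete, here is a small self-contained sketch in base R. It assumes the same hypothetical column names (Diagnose1 to Diagnose3) and the placeholder word 'nothing' from above, and it restricts the replacement to the diagnosis columns so that the id column is left alone. The frequency table is built with aggregate() here instead of dplyr, which is a substitution on my part and not part of the original answer.

```r
# Example data with one missing diagnosis (patient 4, second diagnosis)
data <- read.table(text = "id   Diagnose1      Diagnose2     Diagnose3
    1    Cancer          cancer           cancer
    2    Headache        Breastcancer     Breastcancer
    3    Breastcancer    Breastcancer     cancer
    4    Cancer          NA               cancer
    5    Cancer          Breastcancer     Breastcancer
    6    Cancer          Breastcancer     cancer", header = TRUE, stringsAsFactors = FALSE)

# Replace NAs only in the diagnosis columns
diag_cols <- c("Diagnose1", "Diagnose2", "Diagnose3")
data[diag_cols][is.na(data[diag_cols])] <- "nothing"

# Count patients per diagnosis path; "nothing" is now an ordinary category
allu <- aggregate(Freq ~ Diagnose1 + Diagnose2 + Diagnose3,
                  data = transform(data, Freq = 1),
                  FUN = sum)
print(allu)

# Draw the diagram only if the alluvial package is available
if (requireNamespace("alluvial", quietly = TRUE)) {
  alluvial::alluvial(allu[, 1:3], freq = allu$Freq)
}
```

Replacing the NAs before counting matters in this sketch because the formula interface of aggregate() drops rows that contain NA by default, so patient 4 would otherwise be missing from the diagram.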

Upvotes: 4
