Reputation:
I want to make an alluvial diagram using the alluvial package (library(alluvial)).
My data frame looks like this:
> id Diagnose 1 Diagnose 2 Diagnose 3
1 Cancer cancer cancer
2 Headache Breastcancer Breastcancer
3 Breastcancer Breastcancer cancer
4 Cancer cancer cancer
5 Cancer Breastcancer Breastcancer
6 Cancer Breastcancer cancer
etc.
The data frame shows the name of each diagnosis given by the doctor (just examples, not real diagnoses).
So for patient id 1, the first diagnosis is cancer, the second is also cancer, and the last one is also cancer. For patient id 2, the first diagnosis is headache, then the patient is given the diagnosis Breastcancer, and so on.
I want to make an alluvial diagram that shows how each patient's diagnosis develops over time, and that groups together all patients who have "cancer" as their first diagnosis, and so on. How can I make an alluvial diagram that looks like this: [![enter image description here][1]][1]
Upvotes: 1
Views: 2247
Reputation: 9525
You should first aggregate your data so that each combination of diagnoses has one row with its frequency, then pass the result to the alluvial() function:
library(dplyr)     # to manipulate the data
library(alluvial)

allu <- data %>%
  group_by(Diagnose1, Diagnose2, Diagnose3) %>%  # one group per combination of diagnoses
  summarise(Freq = n())                          # number of patients in each group

# here is the plot
alluvial(allu[, 1:3], freq = allu$Freq)
with this data (I removed the spaces from the column names):
data <- read.table(text = "id Diagnose1 Diagnose2 Diagnose3
1 Cancer cancer cancer
2 Headache Breastcancer Breastcancer
3 Breastcancer Breastcancer cancer
4 Cancer cancer cancer
5 Cancer Breastcancer Breastcancer
6 Cancer Breastcancer cancer", header = TRUE)
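If you want to check the aggregation step on its own, here is a minimal sketch under a few assumptions of mine: it uses base R's `aggregate()` as a stand-in for `group_by()`/`summarise()`, it adds a helper column `Freq` filled with 1 before summing, and it only draws the plot if the alluvial package is installed.

```r
# Same example data as above
data <- read.table(text = "id Diagnose1 Diagnose2 Diagnose3
1 Cancer cancer cancer
2 Headache Breastcancer Breastcancer
3 Breastcancer Breastcancer cancer
4 Cancer cancer cancer
5 Cancer Breastcancer Breastcancer
6 Cancer Breastcancer cancer", header = TRUE, stringsAsFactors = FALSE)

# Helper column (an assumption for this sketch): every patient counts once
data$Freq <- 1

# One row per distinct combination of the three diagnoses, with its patient count
allu <- aggregate(Freq ~ Diagnose1 + Diagnose2 + Diagnose3, data = data, FUN = sum)
print(allu)

# Draw the diagram only when the alluvial package is available
if (requireNamespace("alluvial", quietly = TRUE)) {
  alluvial::alluvial(allu[, 1:3], freq = allu$Freq)
}
```

Patients 1 and 4 share the same path, so the six patients collapse into five rows, and the `Freq` column still sums to six. Keep in mind that the formula interface of `aggregate()` drops rows containing NA by default, so replace NAs first if you have any.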
EDIT
If you have NAs, you can try to replace them in this way:
# first, read the text columns as character, not factor, with stringsAsFactors = FALSE; in my case:
data <- read.table(text = "id Diagnose1 Diagnose2 Diagnose3
1 Cancer cancer cancer
2 Headache Breastcancer Breastcancer
3 Breastcancer Breastcancer cancer
4 Cancer NA cancer
5 Cancer Breastcancer Breastcancer
6 Cancer Breastcancer cancer", header = TRUE, stringsAsFactors = FALSE)
# second, replace them with something you like:
data[is.na(data)] <- 'nothing'
Finally, plot as before; the word you chose to replace the NAs will appear in the diagram as its own category.
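If you would rather not touch the id column, a possible variant is to replace NAs only in the diagnosis columns. This is just a sketch: the vector name `diag_cols` and the placeholder `"nothing"` are my own choices, and it assumes the columns were read as character.

```r
# Example data with one missing diagnosis (patient 4, second diagnosis)
data <- read.table(text = "id Diagnose1 Diagnose2 Diagnose3
1 Cancer cancer cancer
2 Headache Breastcancer Breastcancer
3 Breastcancer Breastcancer cancer
4 Cancer NA cancer
5 Cancer Breastcancer Breastcancer
6 Cancer Breastcancer cancer", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: the columns that hold diagnoses
diag_cols <- c("Diagnose1", "Diagnose2", "Diagnose3")

# Replace NA with a placeholder label, column by column
data[diag_cols] <- lapply(data[diag_cols], function(x) {
  x[is.na(x)] <- "nothing"
  x
})

print(data)
```

The id column keeps its integer type this way, and the aggregation and plotting steps work unchanged on the cleaned data.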
Upvotes: 4