emudrak

Reputation: 1016

Set sector width of chord diagram with circlize

I have a data set of 100 people and their diagnoses for 5 medical conditions. Any combination of conditions can occur, but I've set it up so that the probability of condition D depends on condition A, and the probability of E depends on B.

set.seed(14)
numpeople <- 100
diagnoses <- data.frame(A=rbinom(numpeople, 1, .15), 
                        B=rbinom(numpeople, 1, .1),
                        C=rbinom(numpeople, 1, .2)
                        )
# Probability of diagnosis for D is .2, increasing by .4 (to .6) if patient has A
diagnoses$D <- sapply(diagnoses$A, function(x) rbinom(1, 1, .4*x+.2))
# Probability of diagnosis for E is .1 (rare), increasing by .7 (to .8) if patient has B
diagnoses$E <- sapply(diagnoses$B, function(x) rbinom(1, 1, .7*x+.1))
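As a sanity check on the dependence structure, the probability rules used in the simulation can be written out as small functions. This is only a sketch: `p_D` and `p_E` are hypothetical helper names, and the separate seed and sample size are chosen for illustration rather than to reproduce the data above.

```r
# Hypothetical helpers: probability of D given A, and of E given B,
# using the same linear rules as the simulation code.
p_D <- function(a) .4 * a + .2
p_E <- function(b) .7 * b + .1

p_D(c(0, 1))  # 0.2 without A, 0.6 with A
p_E(c(0, 1))  # 0.1 without B, 0.8 with B

# rbinom() accepts a vector of probabilities, so the dependent
# diagnosis can also be drawn without sapply().
set.seed(1)                       # illustrative seed, not the one used above
A <- rbinom(1000, 1, .15)
D <- rbinom(length(A), 1, p_D(A))
tapply(D, A, mean)                # empirical rate of D, split by A
```

With a large enough sample, the two group means should land near the theoretical values 0.2 and 0.6.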

To make a co-occurrence matrix, where each cell is the number of people with both of the diagnoses in the row and column, I use matrix algebra:

diagnoses.dist <- t(as.matrix(diagnoses))%*%as.matrix(diagnoses)
diag(diagnoses.dist) <- 0
> diagnoses.dist
   A B C  D E
A  0 1 1 11 3
B  1 0 0  1 7
C  1 0 0  5 4
D 11 1 5  0 4
E  3 7 4  4 0
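To see what the matrix product is doing, here is a minimal sketch on a made-up toy data set (4 people, 3 conditions; not the simulated data above). It uses `crossprod()`, which computes the same `t(m) %*% m` product, and checks one cell against a direct count. Before the diagonal is zeroed, it holds the number of people with each diagnosis.

```r
# Toy data (hypothetical): one row per person, one 0/1 column per condition
toy <- data.frame(A = c(1, 1, 0, 1),
                  B = c(1, 0, 0, 1),
                  C = c(0, 1, 1, 1))
m <- as.matrix(toy)

cooc <- crossprod(m)          # equivalent to t(m) %*% m
people <- diag(cooc)          # diagonal = number of people per condition
diag(cooc) <- 0               # keep only the co-occurrence counts

# Direct count of people who have both A and B, for comparison
direct_AB <- sum(toy$A == 1 & toy$B == 1)

cooc
people
direct_AB
```

Each off-diagonal cell is a sum over people of the product of two 0/1 indicators, which is 1 only when a person has both conditions.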

Then I'd like to use a chord diagram to show the proportion of co-diagnoses for each diagnosis.

circos.clear()
circos.par(gap.after=10)
chordDiagram(diagnoses.dist, symmetric=TRUE)

[Figure: example chord diagram with 5 groups]

By default, the size of the sector (pie slice) allocated to each group is proportional to its number of links.

> colSums(diagnoses.dist) #Number of links related to each diagnosis
 A  B  C  D  E 
16  9 10 21 18 

Is it possible to set the sector width to illustrate the number of people with each diagnosis?

> colSums(diagnoses) #Number of people with each diagnosis
 A  B  C  D  E 
16  8 20 29 18 

This problem seems somewhat related to section 14.5 of the circlize book, but I'm not sure how to work out the math for the gap.after argument.

Based on section 2.3 of the circlize book, I tried setting the sector size using circos.initialize, but I think the chordDiagram function overrides this, because the scale on the outside is exactly the same.

circos.clear()
circos.par(gap.after=10)
circos.initialize(factors=names(diagnoses), x=colSums(diagnoses)/sum(diagnoses), xlim=c(0,1))
chordDiagram(diagnoses.dist, symmetric=TRUE)


I see a lot of options to fine-tune tracks in chordDiagram but not much for sectors. Is there a way this can be done?

Upvotes: 2

Views: 1923

Answers (1)

Zuguang Gu

Reputation: 1321

In your case, the number of people in a category can sometimes be smaller than the total number of co-occurrences with other categories. For example, category B has 9 co-occurrences in total, but only 8 people.
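The mismatch can be checked directly from the two sets of column sums printed in the question. The sketch below hard-codes those numbers; `links`, `people` and `slack` are just illustrative names. A negative value means the links alone already need more room than the number of people would give the sector, which can happen because a person with three or more diagnoses is counted in several links of the same category.

```r
# Values copied from the question's output
links  <- c(A = 16, B = 9, C = 10, D = 21, E = 18)  # colSums(diagnoses.dist)
people <- c(A = 16, B = 8, C = 20, D = 29, E = 18)  # colSums(diagnoses)

# Room left in each sector if its width were the number of people
slack <- people - links
slack

# Categories where the links exceed the number of people
names(slack)[slack < 0]   # "B"
```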

If this is not a problem, you can put values on the diagonal of the matrix that correspond to the number of people who belong to only one category. In the following example code, I just add random numbers to the diagonal to illustrate the idea:

diagnoses.dist <- t(as.matrix(diagnoses))%*%as.matrix(diagnoses)
diag(diagnoses.dist) = sample(10, 5)

# Since the matrix is symmetric, we set the upper triangle to zero.
# We don't use `symmetric = TRUE` here because it would ignore the values
# on the diagonal, which we still want to use.
diagnoses.dist[upper.tri(diagnoses.dist)] = 0

par(mfrow = c(1, 2))
# here you can remove `self.link = 1` to see the difference
chordDiagram(diagnoses.dist, grid.col = 2:6, self.link = 1)

# If you don't want to see the "mountains"
visible = matrix(TRUE, nrow = nrow(diagnoses.dist), ncol = ncol(diagnoses.dist))
diag(visible) = FALSE
chordDiagram(diagnoses.dist, grid.col = 2:6, self.link = 1, link.visible = visible)


PS: the link.visible option is only available in recent versions of circlize.
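Instead of random numbers, the diagonal could be filled with the actual count of people who have exactly one diagnosis. The following is a minimal sketch on made-up toy data (5 people, 3 conditions), with `single` as an illustrative name. The `chordDiagram()` call is left commented out so the block runs without circlize installed.

```r
# Toy data (hypothetical): one row per person, one 0/1 column per condition
toy <- data.frame(A = c(1, 1, 0, 1, 1),
                  B = c(1, 0, 0, 1, 0),
                  C = c(0, 1, 1, 1, 0))

# People with exactly one diagnosis, counted per condition
single <- colSums(toy[rowSums(toy) == 1, , drop = FALSE])

# Co-occurrence counts, with the single-diagnosis counts on the diagonal
mat <- crossprod(as.matrix(toy))
diag(mat) <- single

# Keep the lower triangle and the diagonal only, as in the code above
mat[upper.tri(mat)] <- 0
mat

# library(circlize)
# chordDiagram(mat, self.link = 1)
```

Note that people with three or more diagnoses still contribute to more than one link in a sector, so the resulting widths match the head count only when nobody has more than two diagnoses.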

Upvotes: 3
