Reputation: 63
I am trying to create an Euler diagram with the R package eulerr. I am using the following code:
vd <- euler(c(A = 54, B = 22, C = 53, D = 26,
              "A&B" = 20, "A&C" = 29, "A&D" = 10, "B&C" = 16, "B&D" = 5, "C&D" = 7,
              "A&B&C" = 14, "A&B&D" = 5, "A&C&D" = 4, "B&C&D" = 3,
              "A&B&C&D" = 3),
            input = c("union"), shape = "ellipse")
plot(vd, labels = c("A", "B", "C", "D"), main = "Databases", Count = TRUE, quantities = TRUE)
I am getting the following result:
But the resulting Euler-plot is wrong:
How can I fix this or is this a package error?
The error_plot shows the following:
Region error:
Residuals:
Unfortunately the Residual-plot doesn't show the residuals.
Nonetheless the missing cases are shown in the "normal" residual statistic below.
        original fitted residuals regionError
A             15     15         0       0.004
B              0      0         0       0.000
C             19     19         0       0.005
D             13     13         0       0.003
A&B            4      4         0       0.001
A&C           14     14         0       0.003
A&D            4      4         0       0.001
B&C            2      0         2       0.022
B&D            0      0         0       0.000
C&D            3      3         0       0.001
A&B&C         11     11         0       0.003
A&B&D          2      2         0       0.000
A&C&D          1      1         0       0.000
B&C&D          0      0         0       0.000
A&B&C&D        3      3         0       0.001
diagError: 0.022
stress: 0.004
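The "original" column above holds disjoint region counts, while the call to euler() supplied the numbers as unions (input = "union"). As a sanity check, the disjoint counts can be recovered from the union counts by inclusion-exclusion. The base-R sketch below is only an illustration: the helper sets_of and the variable names are made up for this example and are not part of eulerr.

```r
# Union-style counts from the question: each value counts everything
# that lies inside that intersection, including deeper overlaps.
union_counts <- c(A = 54, B = 22, C = 53, D = 26,
                  "A&B" = 20, "A&C" = 29, "A&D" = 10,
                  "B&C" = 16, "B&D" = 5, "C&D" = 7,
                  "A&B&C" = 14, "A&B&D" = 5, "A&C&D" = 4, "B&C&D" = 3,
                  "A&B&C&D" = 3)

# Hypothetical helper: split "A&B&C" into c("A", "B", "C").
sets_of <- function(name) strsplit(name, "&", fixed = TRUE)[[1]]

# Inclusion-exclusion: the disjoint count of region R is the signed sum of
# the union counts of every combination S that contains all sets of R.
disjoint <- sapply(names(union_counts), function(r) {
  r_sets <- sets_of(r)
  total <- 0
  for (s in names(union_counts)) {
    s_sets <- sets_of(s)
    if (all(r_sets %in% s_sets)) {
      sign <- (-1)^(length(s_sets) - length(r_sets))
      total <- total + sign * union_counts[[s]]
    }
  }
  total
})

print(disjoint)
```

The result matches the "original" column (for example A = 15 and B&C = 2), so the input was interpreted as intended and the discrepancy comes from the fit, not from the data.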
Upvotes: 2
Views: 4279
Reputation: 1
eulerr can go wrong in a number of cases, for instance:
vd <- euler(c(A=23578,B=30492,C=63610,"A&B"=563,"A&C"=624,"B&C"=1600,"A&B&C"=308))
plot(vd, labels = c("1", "2", "3"), main = "overlap", cex=2)
displays a diagram with NO overlapping regions for the three categories.
I think this is simply an inaccurate tool to use.
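Before blaming the tool, it may help to check how large the overlaps are relative to the whole diagram. The sketch below assumes the counts are read as disjoint regions (which, as far as the eulerr documentation goes, is the default input for euler()); the variable names are made up for this example.

```r
# Counts from the example above, assumed to be disjoint region sizes.
counts <- c(A = 23578, B = 30492, C = 63610,
            "A&B" = 563, "A&C" = 624, "B&C" = 1600, "A&B&C" = 308)

# Regions whose name contains "&" are overlaps between sets.
is_overlap <- grepl("&", names(counts), fixed = TRUE)

# Share of the total area taken by each overlap, in percent.
overlap_percent <- 100 * counts[is_overlap] / sum(counts)
print(round(overlap_percent, 2))

# Share of the total area taken by all overlaps together.
overlap_share <- sum(counts[is_overlap]) / sum(counts)
print(round(overlap_share, 4))
```

Under that assumption all overlaps together make up roughly 2.6% of the total area, so in an area-proportional diagram they would be thin slivers that may be hard to see at normal plot sizes.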
Upvotes: 0
Reputation: 2628
Regarding how to fix the issue, it depends on the level of precision you want. Based on the nVenn algorithm, I authored the nVennR package to create quasi-proportional Euler diagrams. With the caveats mentioned in the link, you can represent larger numbers of sets and show the relative size of each region. In your example,
library(nVennR)
myV <- createVennObj(nSets = 4, sNames = c('A', 'B', 'C', 'D'), sSizes = c(0, 26, 53, 7, 22, 5, 16, 3, 54, 10, 29, 4, 20, 5, 14, 3))
myV <- plotVenn(nVennObj = myV)
Depending on your requirements, this may not be satisfactory. The proportionality is in the area of the circles, not the regions (you can see that the region 1, 2, 3, 4, i.e. A&B&C&D, has empty space). However, this strategy overcomes the limitations of regular shapes in these representations mentioned by Johan Larsson. If you are interested, there are more details in the vignette.
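The sSizes vector passed to createVennObj above appears to follow a binary ordering: position i (counting from 0) encodes set membership as bits, with A as the most significant bit and D as the least significant, and position 0 is the region outside all sets. A base-R sketch of building that vector from the named counts in the question, under that assumption (the variable names are made up for this example):

```r
# Named counts exactly as given in the question.
counts <- c(A = 54, B = 22, C = 53, D = 26,
            "A&B" = 20, "A&C" = 29, "A&D" = 10,
            "B&C" = 16, "B&D" = 5, "C&D" = 7,
            "A&B&C" = 14, "A&B&D" = 5, "A&C&D" = 4, "B&C&D" = 3,
            "A&B&C&D" = 3)

set_names <- c("A", "B", "C", "D")
n <- length(set_names)

# One slot per region; slot 1 is region 0 (in no set) and stays 0.
sizes <- numeric(2^n)

for (region in names(counts)) {
  members <- strsplit(region, "&", fixed = TRUE)[[1]]
  # Assumed bit weights: A -> 8, B -> 4, C -> 2, D -> 1.
  bits <- 2^(n - match(members, set_names))
  sizes[sum(bits) + 1] <- counts[[region]]
}

print(sizes)
```

This reproduces the vector used in the call above. Whether each entry should be the union-style count from the question or the disjoint count of that region is worth checking against the nVennR documentation.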
Upvotes: 5
Reputation: 3694
The reason why some areas are left out is simple: the diagram is inexact and is missing some areas. The fitted diagram has no region that belongs to B and C only, so there is no place to put the label for B&C, which is why that intersection is missing its 2 units. There likely isn't any way (or at least eulerr cannot find it) to perfectly represent your combination with an Euler diagram using ellipses. You either have to accept that it is inexact or try another solution.
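To judge how inexact the fit is, the regionError and diagError values from the question's table can be reproduced by hand. The sketch below assumes regionError is the absolute difference between each region's share of the fitted total and its share of the original total, and that diagError is the largest of those values; with the numbers from the table this gives the same results. The variable names are made up for this example.

```r
# "original" column from the question's table (disjoint counts).
original <- c(A = 15, B = 0, C = 19, D = 13,
              "A&B" = 4, "A&C" = 14, "A&D" = 4,
              "B&C" = 2, "B&D" = 0, "C&D" = 3,
              "A&B&C" = 11, "A&B&D" = 2, "A&C&D" = 1, "B&C&D" = 0,
              "A&B&C&D" = 3)

# "fitted" column: identical except that B&C could not be drawn.
fitted <- original
fitted[["B&C"]] <- 0

# Assumed definition: difference between fitted and original proportions.
region_error <- abs(fitted / sum(fitted) - original / sum(original))

# Assumed definition: the worst region determines the diagram error.
diag_error <- max(region_error)

print(round(region_error, 3))
print(round(diag_error, 3))
```

Because one missing region shrinks the fitted total, every other region's share shifts slightly, which is why regions with a residual of 0 still show a small nonzero regionError.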
Similarly, the residual plot cannot show the missing residuals graphically because there is no area representing them. I am, by the way, the author of this package and I do have something better in mind for the residual plot which would display missing areas as well, but I haven't had time to implement it yet.
Upvotes: 4