Reputation: 143
I am making an upset diagram for the following data in percentages. This is a dummy example for my more complicated data.
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35)
upset(fromExpression(x), order.by = "freq")
I want these percentages to appear as decimal numbers, and I want all the bars to be visible, even one as small as 0.1%. All the data is important in this plot.
Upvotes: 3
Views: 1756
Reputation: 993
Two facts stand in the way of a quick and easy solution to this problem:
1. UpSetR is very strongly oriented toward discrete sets of countable objects. A potential solution would be to use fractional objects instead of whole objects, but the first thing upset() does is check which columns of your data frame have "0" and "1" as their only levels. This is hardcoded. If this check fails, the startend object becomes NULL and there is no way the function will be able to do anything.
2. UpSetR does not give very good access to the plots it creates. Once the plots are made, you are left with no return value from upset(). This means you cannot modify the plot objects themselves or change the way they are plotted beyond the arguments that upset() accepts.
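To make the first point concrete, here is a minimal base R sketch of the kind of check described above. The helper name is_binary_column is hypothetical and this is not UpSetR's actual code; it only illustrates why a column of fractional memberships would not be picked up as a set.

```r
# Hypothetical helper, not part of UpSetR:
# TRUE when a column contains nothing but 0s and 1s.
is_binary_column <- function(col) {
  is.numeric(col) && all(col %in% c(0, 1))
}

# Whole-object membership versus (made-up) fractional membership.
whole      <- data.frame(a = c(1, 1, 0), b = c(0, 1, 1))
fractional <- data.frame(a = c(0.8, 0.2, 0), b = c(0, 0.1, 1))

set_cols_whole      <- names(whole)[sapply(whole, is_binary_column)]
set_cols_fractional <- names(fractional)[sapply(fractional, is_binary_column)]

print(set_cols_whole)       # both columns qualify as sets
print(set_cols_fractional)  # no column qualifies
```

If no column qualifies, there is nothing for the plotting code to treat as a set, which matches the failure described above.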
So, what can you do?
1. Multiply the percentages by 100 so that they become whole counts:
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35)
upset(fromExpression(x*100), order.by = "freq")
and then edit the plot in Inkscape/Illustrator. (BAD)
2. Fork UpSetR and hijack the scale.intersections and scale.sets parameters. In the Make_main_bar() function you would just change the way it handles a "percent" argument to scale_intersections, and change the way Make_size_plot() handles the same argument to scale_sets. The call would then become:
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35)
upset(fromExpression(x*100), order.by = "freq",
      scale.intersections="percent", scale.sets="percent")
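Both options rely on turning the percentages into whole counts first. The base R sketch below shows one way to do that scaling and to recover the percentage labels afterwards. The names scale_factor, counts and labels are illustrative, and the use of round() is my own precaution against floating-point error, not something the original call does.

```r
x <- c(a = 80, b = 9.9, c = 5, "a&b" = 0.1, "a&c" = 1.65, "c&b" = 3.35)

scale_factor <- 100                 # assumption: two decimal places are enough
counts <- round(x * scale_factor)   # round() guards against values like 164.99999

# Check that scaling back reproduces the original percentages.
stopifnot(isTRUE(all.equal(unname(counts / scale_factor), unname(x))))

# Percentage labels that could be put back on the bars by hand.
labels <- sprintf("%.2f%%", counts / scale_factor)
names(labels) <- names(x)

print(counts)
print(labels)
```

The counts vector could then be passed to fromExpression() in place of x*100.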
I have personally forked UpSetR for other purposes, but the package in general needs a major refactoring before it can be applied to additional use cases. The authors may have wanted to prevent uses of the concept outside of their intended one.
Upvotes: 2
Reputation: 3175
library(UpSetR)
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35)
upset(fromExpression(x), order.by = "freq", show.numbers = 'yes')
So you want two things:
1. percentages to appear as decimal numbers
2. bars visible even if they are only 0.1%
You start by converting your vector of percentages to (integer) counts with fromExpression, so the input to upset is then a data frame:
library(UpSetR)
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35)
str(fromExpression(x))
#> 'data.frame': 98 obs. of 3 variables:
#> $ a: num 1 1 1 1 1 1 1 1 1 1 ...
#> $ b: num 0 0 0 0 0 0 0 0 0 0 ...
#> $ c: num 0 0 0 0 0 0 0 0 0 0 ...
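The 98 rows reported above, rather than 100, are consistent with fractional parts being dropped when rows are replicated. A small base R sketch of that effect, assuming fromExpression relies on rep() for the replication (an assumption about its internals, not something confirmed here):

```r
x <- c(a = 80, b = 9.9, c = 5, "a&b" = 0.1, "a&c" = 1.65, "c&b" = 3.35)

# rep() truncates a non-integer `times` towards zero,
# so 9.9 yields 9 copies and 0.1 yields none.
rows_per_group <- sapply(x, function(n) length(rep(1, times = n)))

print(rows_per_group)
print(sum(rows_per_group))
```

Under that assumption the 0.1% 'a&b' intersection contributes no rows at all, so it cannot show up as a bar unless the values are scaled up first.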
upset then gets its labels internally from this data, so the link to your original percentages is no longer present inside upset.
Having percentages, or some other custom labels, as bar labels does not seem to be a supported option of the upset function from the UpSetR package at the moment.
There is the show.numbers argument, but it only allows you to show the absolute frequencies on top of the bars (show.numbers = "yes" or show.numbers = "Yes") or to hide them (any other value for show.numbers). So I think you would need to change the piece of code involved, i.e., the geom_text and aes_string call, to use a different aesthetic mapping (your relative frequencies). So maybe ask the developer to do it?
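As an illustration of what a different aesthetic mapping could look like, here is a standalone ggplot2 sketch. It is not UpSetR's code: the data frame bars and its columns are made up for the example, and it only shows a bar chart whose geom_text labels come from a percentage column instead of the bar heights.

```r
library(ggplot2)

# Hypothetical data: integer counts (the scaled values) plus the
# original percentages we would like to display instead.
bars <- data.frame(
  intersection = c("a", "b", "c", "c&b", "a&c", "a&b"),
  freq         = c(8000, 990, 500, 335, 165, 10),
  pct          = c(80, 9.9, 5, 3.35, 1.65, 0.1)
)
bars$label <- sprintf("%.2f%%", bars$pct)

p <- ggplot(bars, aes(x = reorder(intersection, -freq), y = freq)) +
  geom_col() +
  geom_text(aes(label = label), vjust = -0.5) +  # label mapped to percentages
  labs(x = "Intersection", y = "Intersection size")

# print(p)  # uncomment to draw the chart
```

A fork of the package would need an equivalent change inside its own plotting functions.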
As for keeping every bar visible: this ultimately depends on your y-axis dynamic range and the size of your plot, i.e., if the tallest bar is a lot greater than the shortest, then it might be impossible to see both in the same chart (unless you make the y-axis discontinuous).
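A quick back-of-the-envelope calculation in base R shows the scale of the problem for the example data. The panel height of 400 pixels is an assumed figure chosen only for illustration.

```r
pct <- c(a = 80, b = 9.9, c = 5, "a&b" = 0.1, "a&c" = 1.65, "c&b" = 3.35)

panel_height_px <- 400  # assumed height of the bar panel in pixels

# Height of each bar on a linear axis whose maximum is the tallest bar.
bar_height_px <- pct / max(pct) * panel_height_px

print(round(bar_height_px, 2))
print(max(pct) / min(pct))  # ratio between the tallest and shortest bar
```

With these assumed numbers the 0.1% bar comes out at roughly half a pixel, which is why it effectively disappears on a linear axis.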
I understand this is not really a solution to your problem but it is an answer that hopefully points you in the direction of the solution to your problem.
Upvotes: 2