XtrimA

Reputation: 143

In UpSetR, how to show decimal number on the intersection bar

I am making an upset diagram for the following data in percentages. This is a dummy example for my more complicated data.

x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35)
upset(fromExpression(x), order.by = "freq")

I want these percentages to appear as decimal numbers, and I want every bar to be visible even when its value is only 0.1%. All of the data in this plot is important.

Upvotes: 3

Views: 1756

Answers (2)

Bob Zimmermann

Reputation: 993

Two facts are standing in the way of a quick and easy solution to this problem:

  1. UpSetR is very strongly oriented toward discrete sets of countable objects.

A potential solution would be to use fractional objects instead of whole objects, but the first thing upset() does is check which columns of your data frame have "0" and "1" as their only levels. This is hardcoded. If the check fails, the startend object becomes NULL and the function cannot do anything.

  2. UpSetR does not give very good access to the plots it creates.

Once the plots are made, you are left with no return value from upset(). This means you cannot modify the plot objects themselves or change the way they are plotted beyond the arguments that upset() accepts.

So, what can you do?

  • Depending on how complicated your real plot is (and how often you have to replot it), you might just do this:
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35) 
upset(fromExpression(x*100), order.by = "freq")

and then edit the labels by hand in Inkscape or Illustrator. (This is a bad option.)

  • Fork UpSetR and hijack the scale.intersections and scale.sets parameters. In the Make_main_bar() function you would change how it handles scale_intersections so that it accepts a new "percent" value, and do the same for scale_sets in Make_size_plot(). The call would then become:
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35) 
upset(fromExpression(x*100), order.by = "freq",
      scale.intersections="percent", scale.sets="percent")

I have forked UpSetR myself for other purposes, but the package in general needs a major refactoring before it can be applied to additional use cases. The authors may have wanted to prevent uses of the plot outside of their original concept.
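For the scaling workaround in the first bullet, it can help to check which counts fromExpression() would receive before plotting. The sketch below uses only base R. The name scaled_counts is introduced here for illustration, and the round() step is an added precaution that is not part of the original answer: it guards against a product such as 9.9 * 100 landing slightly off a whole number.

```r
# Percentages from the question, keyed by intersection.
x <- c(a = 80, b = 9.9, c = 5, 'a&b' = 0.1, 'a&c' = 1.65, 'c&b' = 3.35)

# Scale to hundredths of a percent and round, so each value is a whole count.
# (round() is an added safeguard, not part of the original answer.)
scaled_counts <- round(x * 100)
print(scaled_counts)

# Every intersection now has at least one row to draw a bar from.
stopifnot(all(scaled_counts >= 1))

# Dividing the plotted counts by 100 recovers the original percentages.
stopifnot(isTRUE(all.equal(unname(scaled_counts / 100), unname(x))))
```

Passing scaled_counts to fromExpression() in place of x*100 should give the same plot, with the numbers on the bars read as hundredths of a percent.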

Upvotes: 2

Ramiro Magno

Reputation: 3175

The upset'ting plot

library(UpSetR)
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35) 
upset(fromExpression(x), order.by = "freq", show.numbers = 'yes')

Your question

So you want two things:

  • percentages to appear as decimal numbers

  • bars visible even if it is 0.1%

Percentages to appear as decimal numbers

You start by converting your vector of percentages to counts (integer) with fromExpression. So the input to upset is then a dataframe:

library(UpSetR)
x <- c(a=80, b=9.9, c=5, 'a&b'=0.1, 'a&c'=1.65, 'c&b'=3.35) 
str(fromExpression(x))
#> 'data.frame':    98 obs. of  3 variables:
#>  $ a: num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ b: num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ c: num  0 0 0 0 0 0 0 0 0 0 ...

upset then derives the bar labels from this data frame internally, so the link to your original percentages is lost once the data has been converted.
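The 98 observations in the str() output above are consistent with each percentage being truncated to a whole number of rows. The base-R sketch below reproduces that arithmetic. It assumes that fromExpression() expands each entry by repetition in the way rep() does, which is an assumption about the package internals rather than something stated in this thread; the variable names are illustrative.

```r
# Percentages from the question, keyed by intersection.
x <- c(a = 80, b = 9.9, c = 5, 'a&b' = 0.1, 'a&c' = 1.65, 'c&b' = 3.35)

# rep() drops the fractional part of a non-integer `times`.
nine_point_nine <- length(rep(1, times = 9.9))  # repetitions for b
zero_point_one  <- length(rep(1, times = 0.1))  # repetitions for a&b

# Apply the same truncation to every percentage.
rows_per_intersection <- trunc(x)
print(rows_per_intersection)

# Total number of rows under this assumption.
total_rows <- sum(rows_per_intersection)
print(total_rows)
```

Under this assumption the a&b intersection contributes no rows at all, which would explain why its bar is missing rather than merely small.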

Having labels as percentages, or some other custom labels, does not seem to be a supported option for the function upset from the UpSetR package at the moment.

There is a show.numbers argument, but it only lets you show the absolute frequencies on top of the bars (show.numbers = "yes" or show.numbers = "Yes") or hide them (any other value for show.numbers). Here is the code involved:

https://github.com/hms-dbmi/UpSetR/blob/fe2812c8cbe87af18c063dcee9941391c836e7b2/R/MainBar.R#L130-L132

So I think you need to change that piece of code, i.e., the geom_text and aes_string, to use a different aesthetic mapping (your relative frequencies). So maybe ask the developer to do it?
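To illustrate the kind of aesthetic mapping meant here, the sketch below builds a plain ggplot2 bar chart whose labels come from the original percentages. This is not UpSetR's code and does not draw the set matrix; it is a standalone sketch, and the names pct_df, label and p are made up for the example. The plot is only built if ggplot2 is installed.

```r
# Percentages from the question, in a data frame (names are illustrative).
x <- c(a = 80, b = 9.9, c = 5, 'a&b' = 0.1, 'a&c' = 1.65, 'c&b' = 3.35)
pct_df <- data.frame(intersection = names(x), pct = unname(x))

# Decimal labels built from the original values, not from integer counts.
pct_df$label <- sprintf("%.2f", pct_df$pct)

# Order the bars from largest to smallest, as order.by = "freq" would.
pct_df$intersection <- factor(
  pct_df$intersection,
  levels = pct_df$intersection[order(-pct_df$pct)]
)

# Build the bar chart with the labels mapped through geom_text().
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(pct_df, ggplot2::aes(x = intersection, y = pct)) +
    ggplot2::geom_col() +
    ggplot2::geom_text(ggplot2::aes(label = label), vjust = -0.5) +
    ggplot2::labs(x = "Intersection", y = "Percentage")
}
```

Calling print(p) draws the chart. A change inside UpSetR would need the same idea applied to the label mapping in the linked lines.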

Bars visible even if it is 0.1%

Well, this ultimately depends on the dynamic range of your y-axis and the size of your plot: if the tallest bar is much taller than the shortest, then it might be impossible to see both in the same chart (unless you make the y-axis discontinuous).
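A quick calculation shows how wide the dynamic range is for the example data. The sketch below is base R only; the 100 mm bar height is a hypothetical figure chosen for illustration, and the log10 comparison is an addition here (applied to the percentages scaled by 100), not something proposed in the answer.

```r
# Percentages from the question, keyed by intersection.
x <- c(a = 80, b = 9.9, c = 5, 'a&b' = 0.1, 'a&c' = 1.65, 'c&b' = 3.35)

# On a linear axis, bar height is proportional to the value.
linear_ratio <- max(x) / min(x)

# Hypothetical plot: if the tallest bar is 100 mm, how tall is the shortest?
tallest_mm <- 100
shortest_mm <- tallest_mm * min(x) / max(x)

# On a log10 axis of the scaled counts (x * 100), the heights are much closer.
scaled <- round(x * 100)
log_ratio <- log10(max(scaled)) / log10(min(scaled))

print(c(linear_ratio = linear_ratio,
        shortest_mm  = shortest_mm,
        log_ratio    = log_ratio))
```

On a linear axis the shortest bar ends up a small fraction of a millimetre tall in this hypothetical plot, which is why a discontinuous or transformed axis may be needed. The upset() documentation lists a "log10" option for scale.intersections; whether it renders well for your data is worth checking.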

Conclusion

I understand this is not really a solution to your problem but it is an answer that hopefully points you in the direction of the solution to your problem.

Upvotes: 2
