Borexino

Reputation: 842

Removing Isolated Variables in UpSetR Plot

This is my situation:

library(UpSetR)

movies <- read.csv(system.file("extdata", "movies.csv", package = "UpSetR"), header = TRUE, sep = ";")

upset(movies, sets = c("Action", "Adventure", "Comedy", "Drama", "Mystery",  "Thriller", "Romance", "War", "Western"), 
      order.by = "freq")

I would like to improve the plot by removing variables (genres) that are displayed alone, without any intersections with other variables.

How can I modify the code to remove these isolated variables?

Upvotes: 1

Views: 328

Answers (1)

MrFlick

Reputation: 206207

You can filter those rows out of the data before you draw the plot. For example:

sets <- c("Action", "Adventure", "Comedy", "Drama", "Mystery",  "Thriller", "Romance", "War", "Western")

# keep only rows that belong to more than one set
reduced_data <- movies[rowSums(movies[, sets]) > 1, ]
# or with dplyr (pick() needs dplyr >= 1.1.0)...
# library(dplyr)
# reduced_data <- movies %>% filter(rowSums(pick(all_of(sets))) > 1)

upset(reduced_data, sets = sets, 
      order.by = "freq")

This gives you an UpSet plot with no single-set groups.
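To see what the `rowSums()` filter does without loading the movies data, here is a minimal base R sketch on a made-up table. The data frame `toy`, its columns, and the names `toy_sets`, `n_sets`, and `toy_reduced` are hypothetical and only serve as illustration; the one assumption is that each set column is coded 0/1, as in `movies`.

```r
# Toy membership table (made-up data): one row per item, one 0/1 column per set.
toy <- data.frame(
  name   = c("a", "b", "c", "d"),
  Action = c(1, 0, 1, 0),
  Comedy = c(0, 1, 1, 0),
  Drama  = c(0, 0, 1, 1)
)
toy_sets <- c("Action", "Comedy", "Drama")

# Count how many sets each row belongs to.
n_sets <- rowSums(toy[, toy_sets])

# Keep only rows that belong to at least two sets,
# i.e. rows that contribute to a real intersection.
toy_reduced <- toy[n_sets > 1, ]
print(toy_reduced)
```

Because the plot is drawn from the filtered data frame, the set-size bars should also be based on the reduced rows rather than on the full data set, which is worth keeping in mind when reading the totals.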

Upvotes: 2
