mariodrumblue

Reputation: 171

Bar plot of categorical variable (string) with multiple answers

This might be trivial but I can't figure it out and can't find it online. Let's say I survey people asking the reason they did something. Two options: reason 1 and reason 2. They can also pick both options.

data <- data.frame('reason'=c(rep('R1', 5),rep('R2', 3),rep('R1,R2', 4)))
data
   reason
1      R1
2      R1
3      R1
4      R1
5      R1
6      R2
7      R2
8      R2
9   R1,R2
10  R1,R2
11  R1,R2
12  R1,R2

I want to plot the answers, counting only R1 and R2. That is, if someone answered both R1 and R2, each reason should get one count. The command

ggplot(data = data, aes(x = reason)) + geom_bar() +  coord_flip()

would plot the multiple-answer cases as a separate category.

What I want instead is R1 to have a count of 5+4=9 and R2 to have a count of 3+4=7, and no R1,R2 category.

I am interested in this because I have real data from a Qualtrics survey.

Upvotes: 0

Views: 122

Answers (2)

Edward

Reputation: 19199

You need to do some data management first. Something like:

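library(dplyr)
library(tidyr)
library(ggplot2)

mutate(data,
       R1 = grepl('R1', reason),   # TRUE if the answer includes R1
       R2 = grepl('R2', reason)) %>%
  select(-reason) %>%
  pivot_longer(everything(), names_to = "reason") %>%
  filter(value) %>%                # keep only the reasons that were selected
  count(reason) %>%
  print() %>%                      # show the counts before plotting
  ggplot(aes(x = reason, y = n)) +
  geom_col() +
  coord_flip()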
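One caveat with substring matching, in case the real survey has more options: grepl('R1', ...) also matches any longer code that starts with R1. The sketch below uses a hypothetical code 'R10' (not in the question's data) to show the effect, and one possible way around it with word boundaries.

```r
# Hypothetical answers; 'R10' is an assumed code used only for illustration
reason <- c("R1", "R10", "R1,R10")

# Plain substring match: 'R10' is wrongly counted as containing 'R1'
plain <- grepl("R1", reason)

# Word-boundary match: 'R1' must stand alone between delimiters
bounded <- grepl("\\bR1\\b", reason)

plain    # TRUE TRUE TRUE
bounded  # TRUE FALSE TRUE
```

With only R1 and R2, as in the question, the plain pattern is fine.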


Upvotes: 0

stefan

Reputation: 125228

Using tidyr::separate_rows, you could split your reason column into multiple rows, one row per selected reason:

data <- data.frame('reason'=c(rep('R1', 5),rep('R2', 3),rep('R1,R2', 4)))

library(tidyr)
library(ggplot2)

data_sep <- data |> 
  separate_rows(reason)

ggplot(data = data_sep, aes(y = reason)) + 
  geom_bar()
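As a quick sanity check on the expected counts (R1 = 9, R2 = 7), here is a base R sketch that splits and tabulates the answers without any packages. It assumes the answers are separated by a bare comma with no surrounding spaces, as in the example data; real Qualtrics exports may need a different delimiter.

```r
data <- data.frame('reason' = c(rep('R1', 5), rep('R2', 3), rep('R1,R2', 4)))

# Split each answer on commas, flatten to one vector, and tabulate.
# as.character() guards against the column being a factor in older R versions.
counts <- table(unlist(strsplit(as.character(data$reason), ",", fixed = TRUE)))
counts
```

This should report 9 for R1 and 7 for R2, matching what the bar plot displays.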

Upvotes: 2
