Reputation: 1141
I have survey data structured into several item variables that denote whether something was mentioned (1) or not mentioned (2) by a survey respondent. In short, each row is a different survey respondent, and a respondent can choose all options a through c (as is the case for the first respondent in the data below), none of them, or just some.
Let this be the dataset:
testdat<-data.frame(option_a=c(1,2,2,1,2),
option_b=c(1,1,2,1,2),
option_c=c(1,1,2,1,1))
What would be the easiest and fastest way to plot just the relative frequencies of how often each option was chosen? The outcome should be a geom_bar plot with three bars representing the different options (a: 40%, b: 60%, c: 80%). Put differently, I would like a plot from which I can read that a given option was chosen by x% of the respondents.
Is there a way to do this directly in ggplot without having to restructure the dataset or replace 2s by 0s, etc.? I guess this should be fairly easy, but I just can't see it right now.
Upvotes: 2
Views: 2141
Reputation: 6020
For a barplot you need to reshape your data into long format. You cannot do that within the ggplot function itself. You can recode the values within ggplot, but you will also need to rename the fill legend.
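testdat <- data.frame(option_a = c(1, 2, 2, 1, 2),
                      option_b = c(1, 1, 2, 1, 2),
                      option_c = c(1, 1, 2, 1, 1))
library(tidyverse)  # attaches ggplot2, tidyr and dplyr
testdat %>%
  gather(option, value) %>%  # long format: one row per respondent and option
  # (value - 2) * -1 recodes 1 -> 1 (mentioned) and 2 -> 0 (not mentioned)
  ggplot(aes(x = factor(option), fill = factor((value - 2) * -1))) +
  geom_bar() +
  labs(x = "option", fill = "mentioned")  # rename the axis and fill legend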
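For reference, here is a minimal sketch of what the long format looks like before it is handed to ggplot. It uses pivot_longer(), which newer versions of tidyr recommend over gather() (gather() still works). The column names option and value are simply the ones used in this answer, and the object name long is my own.

```r
library(tidyr)

# Same toy data as in the question (1 = mentioned, 2 = not mentioned)
testdat <- data.frame(option_a = c(1, 2, 2, 1, 2),
                      option_b = c(1, 1, 2, 1, 2),
                      option_c = c(1, 1, 2, 1, 1))

# One row per respondent and option, instead of one column per option
long <- pivot_longer(testdat,
                     cols = everything(),
                     names_to = "option",
                     values_to = "value")

print(head(long))
```

Each of the 5 respondents now contributes 3 rows, which is the shape geom_bar needs to count values per option.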
To get percentages instead of counts (n), you can summarise the data before plotting, like so:
testdat %>%
  gather(option, value) %>%
  group_by(option, value) %>%
  summarise(n = n()) %>%
  group_by(option) %>%
  mutate(percentage = n / sum(n) * 100) %>%
  ggplot(aes(x = factor(option), y = percentage, fill = factor((value - 2) * -1))) +
  geom_bar(stat = "identity")
EDIT:
To show only the relative frequency of one of the values (here 1, i.e. mentioned), filter before plotting:
testdat %>%
  gather(option, value) %>%
  group_by(option, value) %>%
  summarise(n = n()) %>%
  group_by(option) %>%
  mutate(percentage = n / sum(n) * 100) %>%
  filter(value == 1) %>%
  ggplot(aes(x = factor(option), y = percentage, fill = factor((value - 2) * -1))) +
  geom_bar(stat = "identity")
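If only the share of 1s per option is needed, a shorter route might be to compute that share directly and plot the small summary table. This is a sketch under the assumption that the values are coded exactly 1 and 2 with no missing values. The names shares and plotdat are my own, and it still builds a helper data frame rather than plotting testdat directly.

```r
# Same toy data as in the question (1 = mentioned, 2 = not mentioned)
testdat <- data.frame(option_a = c(1, 2, 2, 1, 2),
                      option_b = c(1, 1, 2, 1, 2),
                      option_c = c(1, 1, 2, 1, 1))

# testdat == 1 is a logical matrix; its column means are the shares of 1s
shares <- colMeans(testdat == 1) * 100

# Small summary table with one row per option
plotdat <- data.frame(option = names(shares), percentage = unname(shares))
print(plotdat)

# Draw the bars only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(plotdat, aes(x = option, y = percentage)) +
    geom_col()  # geom_col() is geom_bar(stat = "identity")
  print(p)
}
```

This avoids the recoding of 2 to 0, because the comparison testdat == 1 already yields TRUE/FALSE per cell.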
Upvotes: 2