KaC

Reputation: 287

Plot 2 variables in ggplot to show proportion/percentage, not sum

I want to plot responses to a survey question by state. I have the following data frame:

DF <- data.frame(V1 = factor(c("Option 1", "Option 1", "Option 1", "Option 2", "Option 1", "Option 2", "Option 1", "Option 1", "Option 2", NA, "Option 2", "Option 1")),
                  Location = factor(c("California", "Georgia", "Texas", "Texas", "Georgia", "Georgia", "California", "Georgia", "Texas", "Texas", "California", "Georgia")))

Because Georgia is overrepresented in the sample, the plot can be difficult to interpret:

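library(dplyr)
library(ggplot2)  # needed for ggplot(); tidyr is not used here
DF %>%
  filter(!is.na(V1)) %>%                 # drop the missing response
  ggplot(aes(V1, after_stat(count))) +   # after_stat(count) replaces the deprecated ..count..
  geom_bar(aes(fill = Location), position = "dodge") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))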

Is there a way to display the proportion/percentage of responses instead, so that each state's responses sum to 1 (or 100%)?

Upvotes: 0

Views: 518

Answers (1)

Kara Woo

Reputation: 3615

I'm not 100% sure I'm following, but here's one option that shows relative proportions of the different options rather than counts:

DF <- data.frame(
  V1 = factor(c("Option 1", "Option 1", "Option 1", "Option 2", "Option 1", "Option 2", "Option 1", "Option 1", "Option 2", NA, "Option 2", "Option 1")),
  Location = factor(c("California", "Georgia", "Texas", "Texas", "Georgia", "Georgia", "California", "Georgia", "Texas", "Texas", "California", "Georgia"))
)

library("tidyverse")
DF <- filter(DF, !is.na(V1))
ggplot(DF, aes(Location, fill = V1)) +
  geom_bar(position = "fill")

(you'd probably then want to relabel the y axis, which still reads "count" by default, to "proportion" or similar, e.g. with labs(y = "Proportion"))
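If you'd rather keep the dodged layout from the question, one possible approach is to compute the within-state proportions yourself and plot those. This is only a sketch: the names `props` and `prop` are made up for illustration, and it assumes the scales package is installed (it normally comes along with ggplot2).

```r
library(dplyr)
library(ggplot2)

DF <- data.frame(
  V1 = factor(c("Option 1", "Option 1", "Option 1", "Option 2", "Option 1", "Option 2", "Option 1", "Option 1", "Option 2", NA, "Option 2", "Option 1")),
  Location = factor(c("California", "Georgia", "Texas", "Texas", "Georgia", "Georgia", "California", "Georgia", "Texas", "Texas", "California", "Georgia"))
)

# Count responses per state and option, then divide by each state's total
# so that the proportions within a state sum to 1.
# `props` and `prop` are illustrative names, not from the original post.
props <- DF %>%
  filter(!is.na(V1)) %>%
  count(Location, V1) %>%
  group_by(Location) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# geom_col() plots the precomputed values as-is; the percent labels
# come from the scales package.
p <- ggplot(props, aes(V1, prop, fill = Location)) +
  geom_col(position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = NULL, y = "Share of responses within state")
p
```

With the sample data, Georgia's five non-missing responses come out as 80% Option 1 and 20% Option 2, so its larger sample no longer dominates the plot.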

Upvotes: 1
