Testing and density plot across multiple columns

Question

I have a list of restaurants and their star rating:

Restaurant     Question               1.star  2.stars ...etc

McDonalds      How was the food?      5       6       ...
McDonalds      How were the drinks?   3       4
McDonalds      How were the workers?  2       7
Burger_King    How was the food?      4       11
Burger_King    How were the drinks?   9       3
Burger_King    How were the workers?  12      1

1. How do I perform a t-test to determine whether people only use the 1-star and 5-star ratings?

2. How do I graph a density distribution of the star ratings?

3. In general, how do you graph across multiple columns, e.g. when col_1 holds the value and col_2 holds its frequency?

tribble for convenience:

library(tidyverse)

df <- tribble(
  ~restaurant, ~question,  ~one_star, ~two_star, ~three_star, ~four_star, ~five_star, ~average,

  "McDonalds", "How was the food?",  5, 6, 8, 2, 9, (5*1 + 6*2 + 8*3 + 2*4 + 5*9)/(5 + 6 + 8 + 2 + 9),
  "McDonalds", "How were the drinks?",  9, 8, 7, 5, 1, (9*1 + 8*2 + 7*3 + 5*4 + 5*1)/(9 + 8 + 7 + 5 + 1),
  "McDonalds", "How were the drinks?",  9, 8, 7, 5, 1, (9*1 + 8*2 + 7*3 + 5*4 + 5*1)/(9 + 8 + 7 + 5 + 1),
  "BurgerKing", "How was the food?",  2, 6, 8, 2, 9, (2*1 + 6*2 + 8*3 + 2*4 + 5*9)/(2 + 6 + 8 + 2 + 9),
  "BurgerKing", "How were the drinks?",  1, 4, 8, 5, 1, (1*1 + 4*2 + 8*3 + 5*4 + 5*1)/(1 + 4 + 8 + 5 + 1),
  "BurgerKing", "How were the drinks?",  4, 7, 2, 5, 1, (4*1 + 7*2 + 2*3 + 5*4 + 5*1)/(4 + 7 + 2 + 5 + 1)
)
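
The average column is the count-weighted mean of the star values. A minimal sketch of computing it from the count columns instead of typing it by hand, using only the first row of the tribble (ratings_wide and n_ratings are placeholder names of my own):

```r
library(dplyr)
library(tibble)

# First row of the tribble above, counts only
ratings_wide <- tribble(
  ~one_star, ~two_star, ~three_star, ~four_star, ~five_star,
  5, 6, 8, 2, 9
)

ratings_wide <- ratings_wide %>%
  mutate(
    # total number of ratings in the row
    n_ratings = one_star + two_star + three_star + four_star + five_star,
    # each star value weighted by how often it was given
    average = (1 * one_star + 2 * two_star + 3 * three_star +
                 4 * four_star + 5 * five_star) / n_ratings
  )
```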

Edit: As requested, here is my attempt:

# Note: this only works because select() drops the other columns of the data frame. I am unaware of alternatives.
# Step 1: Transform from wide to long
ratingdf <-  
  df %>%
  select(one_star:five_star) %>%
  pivot_longer(one_star:five_star, names_to = "rating")

# Step 2: Collapse values into a total frequency per rating
ratingdf <-
  ratingdf %>%
  group_by(rating) %>%
  summarize(total = sum(value))

# Step 3: Graph using ggplot; geom_col() draws bars whose heights are the totals
ratingdf %>%
  ggplot(aes(x = rating, y = total)) +
  geom_col()

When I tried geom_density() on this, it showed nothing, because the data give the frequency of each rating rather than the individual ratings.
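
To illustrate what I mean: as far as I can tell, geom_density() expects one row per individual rating, whereas my summary has one row per rating with a count. Below is a minimal sketch of that one-row-per-rating form, assuming tidyr::uncount() is an acceptable way to expand the counts. The names counts, stars, and ratings are placeholders of my own, and the totals are the column sums of the tribble above.

```r
library(tidyr)
library(ggplot2)

# Column totals of the tribble above, with the rating stored as a number
counts <- data.frame(
  stars = 1:5,
  n     = c(30, 39, 40, 24, 22)
)

# Expand to one row per individual rating: 30 rows of 1, 39 rows of 2, ...
ratings <- uncount(counts, weights = n)

# geom_density() now has individual observations to estimate from
p <- ggplot(ratings, aes(x = stars)) +
  geom_density()
```

Since the ratings only take five distinct values, the density curve is a smoothed picture of discrete data, so a bar chart of the counts may still be the more faithful plot.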
