freeazabird

Reputation: 391

How to create dataset using prop.table?

I would like to create a dataset made up of the frequencies produced by prop.table. How would I go about doing this? The dataset is here: https://gofile.io/d/QuqKh5

(prop.table(table(sample$day12))*100),
(prop.table(table(sample$day13))*100),
(prop.table(table(sample$day14))*100),
(prop.table(table(sample$day15))*100),
(prop.table(table(sample$day16))*100)

Here is a sample of my data:

structure(list(day12 = c("5 = Very High", "5 = Very High", "5 = Very High", 
"4 = High", "5 = Very High", "5 = Very High", "4 = High", "4 = High", 
"5 = Very High", "4 = High"), day13 = c("5 = Very High", "5 = Very High", 
"5 = Very High", "4 = High", "5 = Very High", "4 = High", "4 = High", 
"4 = High", "5 = Very High", "4 = High"), day14 = c("4 = High", 
"5 = Very High", "5 = Very High", "5 = Very High", NA, "3 = Medium", 
"4 = High", "3 = Medium", "4 = High", "4 = High")), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

Ultimately I need to produce a graph that shows the combined percentage of "4 = High" and "5 = Very High" for each day.

Upvotes: 0

Views: 110

Answers (2)

Gregor Thomas

Reputation: 145775

This should get you well on your way. First we convert the data to long format, then summarize it, and finally plot it. Note that missing values are counted as not-high ratings here, because grepl() returns FALSE for NA; you may want to handle them differently.

library(tidyr)
library(dplyr)
library(ggplot2)

## with dplyr and tidyr
sample_summarized = sample %>%
  pivot_longer(everything(), names_to = "day", values_to = "rating") %>%
  group_by(day) %>%
  summarize(high_proportion = mean(grepl("High", rating)))

## with base
prop_high = sapply(sample, function(x) mean(grepl("High", x)))
sample_summarized = data.frame(high_proportion = prop_high, day = names(prop_high))

ggplot(sample_summarized) +
  aes(x = high_proportion, y = day) +
  geom_col(fill = "#104E8B") +
  geom_text(
    aes(x = high_proportion / 2, label = scales::percent(high_proportion, accuracy = 1)),
    color = "white"
  ) +
  scale_x_continuous(labels = scales::percent_format()) +
  theme_minimal()
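If you would rather leave missing values out of the denominator (which is what prop.table(table(x)) does by default), here is a minimal sketch of one way to do it. The name `ratings` and the helper `high_share` are my own, chosen for illustration and to avoid masking base::sample; the data is the sample from the question.

```r
# Sample data from the question, stored as `ratings` (a hypothetical name)
ratings <- data.frame(
  day12 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "5 = Very High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day13 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "4 = High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day14 = c("4 = High", "5 = Very High", "5 = Very High", "5 = Very High",
            NA, "3 = Medium", "4 = High", "3 = Medium", "4 = High", "4 = High"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: share of "High" ratings among non-missing answers only
high_share <- function(x) {
  x <- x[!is.na(x)]
  mean(grepl("High", x))
}

prop_high_no_na <- sapply(ratings, high_share)

sample_summarized_no_na <- data.frame(
  day = names(prop_high_no_na),
  high_proportion = unname(prop_high_no_na)
)
```

With this sample, only day14 changes: 7 of the 9 non-missing answers are high, rather than 7 of 10. The resulting data frame has the same two columns as above, so it should drop into the same ggplot call.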


Upvotes: 2

Ronak Shah

Reputation: 388962

In base R, you could do the following, where df is your data frame:

barplot(colMeans(sapply(df, grepl, pattern = 'High')) * 100)
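If you want the full prop.table output as a dataset, as the question title asks, one possible sketch is below. It assumes every column is converted to a factor with the same levels, so that each table has the same length and sapply() can bind them into a matrix. The names `ratings`, `levels_all`, `pct`, `pct_df` and `high_pct` are mine, and the level list covers only the values visible in the sample data, so extend it for the real dataset.

```r
# Sample data from the question, stored as `ratings` (a hypothetical name)
ratings <- data.frame(
  day12 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "5 = Very High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day13 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "4 = High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day14 = c("4 = High", "5 = Very High", "5 = Very High", "5 = Very High",
            NA, "3 = Medium", "4 = High", "3 = Medium", "4 = High", "4 = High"),
  stringsAsFactors = FALSE
)

# Assumption: only the levels that appear in the sample; add the others as needed
levels_all <- c("3 = Medium", "4 = High", "5 = Very High")

# One column of percentages per day; table() drops NA by default
pct <- sapply(ratings, function(x) {
  prop.table(table(factor(x, levels = levels_all))) * 100
})

# Turn the matrix into a data frame with the rating level as a column
pct_df <- as.data.frame(pct)
pct_df$rating <- rownames(pct)
rownames(pct_df) <- NULL

# Combined percentage of "4 = High" and "5 = Very High" per day
high_pct <- colSums(pct[grepl("High", rownames(pct)), , drop = FALSE])
```

Unlike the grepl() approach above, the percentages here are relative to the non-missing answers, so barplot(high_pct) would show day14 at 7/9 rather than 7/10.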


Upvotes: 0
