Reputation: 391
I would like to create a dataset made up of the frequencies computed with prop.table, as in the calls below. How would I go about doing this? The dataset is here: https://gofile.io/d/QuqKh5
(prop.table(table(sample$day12))*100),
(prop.table(table(sample$day13))*100),
(prop.table(table(sample$day14))*100),
(prop.table(table(sample$day15))*100),
(prop.table(table(sample$day16))*100)
Here is a sample of my data:
structure(list(day12 = c("5 = Very High", "5 = Very High", "5 = Very High",
"4 = High", "5 = Very High", "5 = Very High", "4 = High", "4 = High",
"5 = Very High", "4 = High"), day13 = c("5 = Very High", "5 = Very High",
"5 = Very High", "4 = High", "5 = Very High", "4 = High", "4 = High",
"4 = High", "5 = Very High", "4 = High"), day14 = c("4 = High",
"5 = Very High", "5 = Very High", "5 = Very High", NA, "3 = Medium",
"4 = High", "3 = Medium", "4 = High", "4 = High")), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
Ultimately, I need to produce a graph like this, which shows the combined percentage of "4 = High" and "5 = Very High" ratings.
Upvotes: 0
Views: 110
Reputation: 145775
This should get you well on your way. First we convert the data to long format, then summarize it, and finally plot it. Because grepl() returns FALSE for NA, missing values are counted as not-high ratings; you may want to handle them differently.
library(tidyr)
library(dplyr)
library(ggplot2)
## with dplyr and tidyr
sample_summarized = sample %>%
  pivot_longer(everything(), names_to = "day", values_to = "rating") %>%
  group_by(day) %>%
  # grepl("High", ...) matches both "4 = High" and "5 = Very High"
  summarize(high_proportion = mean(grepl("High", rating)))

## with base (alternative to the dplyr/tidyr version above)
prop_high = sapply(sample, function(x) mean(grepl("High", x)))
sample_summarized = data.frame(high_proportion = prop_high, day = names(prop_high))

## plot: one horizontal bar per day, labelled with the percentage
ggplot(sample_summarized) +
  aes(x = high_proportion, y = day) +
  geom_col(fill = "#104E8B") +
  geom_text(
    aes(x = high_proportion / 2, label = scales::percent(high_proportion, accuracy = 1)),
    color = "white"
  ) +
  scale_x_continuous(labels = scales::percent_format()) +
  theme_minimal()
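If you would rather leave the missing values out of the denominator, here is a minimal base R sketch of that variation. It rebuilds the sample from the question as a plain data.frame so that it runs on its own; the names prop_high_all and prop_high_non_missing are just illustrative, and whether dropping NA is the right choice depends on your survey.

```r
# Sample data copied from the question's dput() output (day14 has one NA)
sample <- data.frame(
  day12 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "5 = Very High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day13 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "4 = High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day14 = c("4 = High", "5 = Very High", "5 = Very High", "5 = Very High",
            NA, "3 = Medium", "4 = High", "3 = Medium", "4 = High", "4 = High"),
  stringsAsFactors = FALSE
)

# NA counted as "not high": grepl() returns FALSE for NA
prop_high_all <- sapply(sample, function(x) mean(grepl("High", x)))

# NA dropped before averaging, so the denominator is the number of answers given
prop_high_non_missing <- sapply(sample, function(x) mean(grepl("High", x[!is.na(x)])))

# Side-by-side comparison, one row per day
comparison <- data.frame(
  day = names(prop_high_all),
  high_incl_na = prop_high_all,
  high_excl_na = prop_high_non_missing
)
print(comparison)
```

With this sample only day14 differs between the two columns, because it is the only column containing an NA.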
Upvotes: 2
Reputation: 388962
In base R, you could do the following, where df is your data frame (called sample in the question):
barplot(colMeans(sapply(df, grepl, pattern = 'High')) * 100)
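If you also want the full frequency table from the question (one prop.table per day, collected into a single data frame), something along these lines might work. This is only a sketch: it assumes every column uses the same rating labels, rebuilds the question's sample so it runs standalone, and the names rating_levels, freq_pct and freq_df are made up for illustration. Note that table() drops NA by default, so these percentages are out of the non-missing answers.

```r
# Sample data copied from the question's dput() output (day14 has one NA)
sample <- data.frame(
  day12 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "5 = Very High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day13 = c("5 = Very High", "5 = Very High", "5 = Very High", "4 = High",
            "5 = Very High", "4 = High", "4 = High", "4 = High",
            "5 = Very High", "4 = High"),
  day14 = c("4 = High", "5 = Very High", "5 = Very High", "5 = Very High",
            NA, "3 = Medium", "4 = High", "3 = Medium", "4 = High", "4 = High"),
  stringsAsFactors = FALSE
)

# All rating labels that occur anywhere (sort() drops the NA)
rating_levels <- sort(unique(unlist(sample)))

# One prop.table per column; a shared factor level set keeps the rows aligned
# even when a rating never occurs on a given day
freq_pct <- sapply(sample, function(x) {
  prop.table(table(factor(x, levels = rating_levels))) * 100
})

# Matrix -> data frame with the rating label as an ordinary column
freq_df <- data.frame(rating = rownames(freq_pct), freq_pct, row.names = NULL)
print(freq_df)

# Combined "High" + "Very High" percentage per day
high_rows <- grepl("High", rownames(freq_pct))
high_pct <- colSums(freq_pct[high_rows, , drop = FALSE])
barplot(high_pct, ylab = "% High or Very High")
```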
Upvotes: 0