Reputation: 45
I have a dataset of protocols that participants followed and they chose how frequently they followed the protocol - never, sometimes, always. This is coded as 1, 2 and 3.
Sample df
Protocol 1 | Protocol 2 | Protocol 3 |
---|---|---|
1 | 2 | 3 |
3 | 2 | 3 |
2 | 1 | 2 |
2 | 3 | 3 |
1 | 3 | 3 |
3 | 3 | 3 |
Each row corresponds to one participant saying how frequently they followed each protocol (never = 1, sometimes = 2, always = 3)
I want to transpose the df and add three columns counting the frequency of each option of each protocol and percentage in brackets
So, it would look something like this:
Never | Sometimes | Always | |||||||
---|---|---|---|---|---|---|---|---|---|
Protocol 1 | 1 | 3 | 2 | 2 | 1 | 3 | 2(33.33%) | 2(33.33%) | 2(33.33%) |
Protocol 2 | 2 | 2 | 1 | 3 | 3 | 3 | 1(16.67%) | 2(33.33%) | 3(50%) |
Protocol 3 | 3 | 3 | 2 | 3 | 3 | 3 | 0(0%) | 1(16.67%) | 5(83.33%) |
I tried multiple solutions from stackoverflow and others but nothing seems to work. Even if I can't show percentages and just the frequency, that would work too.
Upvotes: 2
Views: 51
Reputation: 1474
Here is another approach.
library(tidyverse)
library(glue)
dat1 <- dat %>%
pivot_longer(everything()) %>%
group_by(name) %>%
summarise(
Never = glue("{sum(value == 1)} ({round(mean(value == 1) * 100, 2)}%)"),
Sometiimes = glue("{sum(value == 2)} ({round(mean(value == 2) * 100, 2)}%)"),
Always = glue("{sum(value == 3)} ({round(mean(value == 3) * 100, 2)}%)"),
)
dat2 <- dat %>%
pivot_longer(everything()) %>%
group_by(name) %>%
mutate(n = row_number()) %>%
ungroup() %>%
pivot_wider(
id_cols = name,
names_from = n,
values_from = value
) %>%
left_join(dat1, by = "name")
dat2
# # A tibble: 3 x 10
# name `1` `2` `3` `4` `5` `6` Never Sometiimes Always
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <glue> <glue> <glue>
# 1 Protocol 1 1 3 2 2 1 3 2 (33.33%) 2 (33.33%) 2 (33.33%)
# 2 Protocol 2 2 2 1 3 3 3 1 (16.67%) 2 (33.33%) 3 (50%)
# 3 Protocol 3 3 3 2 3 3 3 0 (0%) 1 (16.67%) 5 (83.33%)
Data:
dat <- tibble(
`Protocol 1` = c(1, 3, 2, 2, 1, 3),
`Protocol 2` = c(2, 2, 1, 3, 3, 3),
`Protocol 3` = c(3, 3, 2, 3, 3, 3)
)
Upvotes: 0
Reputation: 389235
Here is one tidyverse
approach -
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = everything()) %>%
mutate(value = factor(value)) %>%
count(name, value, .drop = FALSE) %>%
group_by(name) %>%
mutate(n = sprintf('%d (%.2f %%)', n, prop.table(n) * 100),
value = recode(value,`1`='never',`2` = 'sometimes',`3` = 'always')) %>%
ungroup %>%
pivot_wider(names_from = value, values_from = n)
# name never sometimes always
# <chr> <chr> <chr> <chr>
#1 Protocol1 2 (33.33 %) 2 (33.33 %) 2 (33.33 %)
#2 Protocol2 1 (16.67 %) 2 (33.33 %) 3 (50.00 %)
#3 Protocol3 0 (0.00 %) 1 (16.67 %) 5 (83.33 %)
data
df <- structure(list(Protocol1 = c(1L, 3L, 2L, 2L, 1L, 3L), Protocol2 = c(2L,
2L, 1L, 3L, 3L, 3L), Protocol3 = c(3L, 3L, 2L, 3L, 3L, 3L)),
class = "data.frame", row.names = c(NA, -6L))
Upvotes: 2