Reputation: 353
I have the following data:
data <- structure(list(class = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 3, 3, 1, 1, 2, 2, 2, 3, 3, 3, 3, 5, 5, 5, 5, 5, 5, 5,
5), ID = c(700, 700, 800, 800, 800, 300, 300, 300, 300, 555,
555, 555, 555, 555, 555, 555, 555, 700, 700, 900, 900, 800, 300,
300, 300, 300, 555, 555, 555, 555, 555, 555, 555, 555), type = c(1,
1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2,
3, 3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1), date = structure(c(1610668800,
1610668800, 1610668800, 1610668800, 1610668800, 1610668800, 1610668800,
1610668800, 1610668800, 1610668800, 1610668800, 1610668800, 1610668800,
1610668800, 1610668800, 1610668800, 1610668800, 1610841600, 1610841600,
1610841600, 1610841600, 1610841600, 1610841600, 1610841600, 1610841600,
1610841600, 1610841600, 1610841600, 1610841600, 1610841600, 1610841600,
1610841600, 1610841600, 1610841600), class = c("POSIXct", "POSIXt"
), tzone = "UTC")), row.names = c(NA, -34L), class = c("tbl_df",
"tbl", "data.frame"))
What I would like to do is to calculate the count / unique times of the ID column per date and per class and then calculate each type of 1,2 and 3. So for example, although the ID 700 appears 2 times on the 2021-01-15 I would like to contribute one time in the percentage.
I have tried the following with different variations with no success:
data_perc <- data %>%
tabyl(class, type)
So my results should look something like the following:
class date type1 type2 type3
1 2021-01-15 30% 30% 40%
1 2021-01-17 33% 33% 34%
2 2021-01-15 60% 20% 20%
Thank you in advance :)
Upvotes: 0
Views: 540
Reputation: 2924
Not sure if I'm misunderstanding the question or the expected results isn't from the sample provided but anyway this is a solution to what I think you want.
data %>%
group_by(class, date, type) %>%
summarise(n = n_distinct(ID)) %>%
group_by(class, date) %>%
mutate(
n = n/sum(n),
type = paste0("type", type),
n = scales::percent(n) #optional row if want formatted as percent string
) %>%
pivot_wider(
names_from = type,
values_from = n,
values_fill = NA
)
Upvotes: 1