Reputation: 1169
I have a data set as shown below:
data <- tribble(
~top_1, ~top_2, ~top_3,
"A", "B", "C",
"B", "B", "B",
"C", "B", "C",
"A", "B", "B",
"A", "A", "A",
"B", "B", "A",
"C", "A", "C",
"A", "A", "A",
"A", "C", "B",
"B", "B", "C",
)
And now, I want to count distinct the rows for each column and have a new data set something like this:
new_data <- tribble(
~product, ~top_1, ~top_2, ~top_3,
"A", .50, .30, .30,
"B", .30, .60, .30,
"C", .20, .10, .40,
)
Could you please help me to be able to create this data?
Upvotes: 2
Views: 99
Reputation: 886938
Here is one option where we gather
into 'long' format, get the count
and reshape to 'wide' format with pivot_wider
library(dplyr)
library(tidyr)
data %>%
gather %>%
group_by_all %>%
count %>%
group_by(key) %>%
mutate(n = n/sum(n)) %>%
pivot_wider( names_from = key, values_from = n)
# A tibble: 3 x 4
# Groups: value [3]
# value top_1 top_2 top_3
# <chr> <dbl> <dbl> <dbl>
#1 A 0.5 0.3 0.3
#2 B 0.3 0.6 0.3
#3 C 0.2 0.1 0.4
Upvotes: 2
Reputation: 39858
One base R
option could be:
table(stack(data))/nrow(data)
values top_1 top_2 top_3
A 0.5 0.3 0.3
B 0.3 0.6 0.3
C 0.2 0.1 0.4
And if you want it as a data.frame
:
as.data.frame.matrix(table(stack(data))/nrow(data))
Upvotes: 3
Reputation: 32538
lvl = unique(unlist(data))
sapply(data, function(x) prop.table(table(factor(x, lvl))))
# top_1 top_2 top_3
#A 0.5 0.3 0.3
#B 0.3 0.6 0.3
#C 0.2 0.1 0.4
Upvotes: 4