Reputation: 7725
What is the dplyr way to tabulate several variables that share the same "levels" to produce the following output?
df <- data.frame(v1 = c("sometimes", "sometimes", "rarely", "never", "often",
                        "often"),
                 v2 = c("often", "sometimes", "rarely", "never", "rarely",
                        "often"))
tab <- data.frame(cbind(table(df$v1), table(df$v2)))
names(tab) <- names(df)
tab
#           v1 v2
# never      1  1
# often      2  2
# rarely     1  2
# sometimes  2  1
Upvotes: 2
Views: 511
Reputation: 61154
You can use this approach:
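library(dplyr)
library(tidyr)  # gather() and spread() come from tidyr

df %>%
  gather(var) %>%        # long format: columns `var` and `value`
  group_by(var) %>%
  count(value) %>%       # occurrences of each value within each variable
  spread(var, n)         # back to wide: one column per variable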
# A tibble: 4 x 3
  value        v1    v2
  <chr>     <int> <int>
1 never         1     1
2 often         2     2
3 rarely        1     2
4 sometimes     2     1
As pointed out by @Frank, you can skip group_by: count() groups by whatever columns you pass it, as follows:
df %>%
  gather() %>%
  count(key, value) %>%
  spread(key, n)
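For readers on a recent tidyr: gather() and spread() are superseded there by pivot_longer() and pivot_wider(). The following is a rough equivalent of the pipeline above, not part of the original answer, assuming tidyr 1.0.0 or later. The column names "key" and "value" are chosen only to mirror gather()'s defaults.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(
  v1 = c("sometimes", "sometimes", "rarely", "never", "often", "often"),
  v2 = c("often", "sometimes", "rarely", "never", "rarely", "often")
)

tab <- df %>%
  # long format: one row per (variable, answer) pair
  pivot_longer(everything(), names_to = "key", values_to = "value") %>%
  # count each answer within each variable
  count(key, value) %>%
  # back to wide: one column of counts per variable
  pivot_wider(names_from = key, values_from = n)

tab
```

This should give the same four rows as the gather()/spread() version, with columns value, v1 and v2.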
Upvotes: 3
Reputation: 11957
One approach is to convert the data to "long" format, which makes it easier to count the occurrences of your labels, and then spread them into the desired format.
df.count <- df %>%
  gather(variable, value) %>%
  group_by(variable, value) %>%
  count() %>%
  spread(variable, n)
  value        v1    v2
  <chr>     <int> <int>
1 never         1     1
2 often         2     2
3 rarely        1     2
4 sometimes     2     1
Of course, dplyr isn't strictly necessary:
df2 <- sapply(df, table)
This produces a named matrix, as opposed to a data frame:
          v1 v2
never      1  1
often      2  2
rarely     1  2
sometimes  2  1
And with a little more work you can turn it into a data frame:
library(tibble)  # rownames_to_column(); %>% still needs magrittr or dplyr loaded
df2 <- sapply(df, table) %>%
  as.data.frame() %>%
  rownames_to_column(var = 'level')
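One caveat with the sapply() approach: it simplifies to a matrix only when every column contains the same set of values. If a level is missing from one column, the tables have different lengths and sapply() returns a list instead. A minimal sketch of one way around this, assuming you want zero counts for absent levels, is to convert each column to a factor with a shared set of levels first. The modified v2 below (no "never") and the helper name lvls are made up for illustration.

```r
# Hypothetical data: v2 never takes the value "never"
df <- data.frame(
  v1 = c("sometimes", "sometimes", "rarely", "never", "often", "often"),
  v2 = c("often", "sometimes", "rarely", "rarely", "rarely", "often")
)

# Collect every value seen in any column (lvls is an illustrative name)
lvls <- sort(unique(unlist(lapply(df, as.character))))

# Tabulate each column against the shared levels, so absent levels count as 0
tab <- sapply(df, function(x) table(factor(x, levels = lvls)))

tab
```

Because every table now has one entry per shared level, sapply() can bind them into a matrix with one row per level and one column per variable.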
Upvotes: 1