firefly2442

Reputation: 557

Get counts of categorical factors across multiple variables/columns in R

I have the following minimal example in R:

testing = data.frame(c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"), c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely"))
colnames(testing) = c("one", "two")
testing

           one          two
1  Once a week Once a month
2  Once a week Once a month
3       Rarely  Once a week
4 Once a month       Rarely
5 Once a month       Rarely

I want the end result to be a data frame with all the possible categories in one column, followed by one column of counts for each original column/variable, like this:

categories    one    two
Rarely        1      2
Once a month  2      2
Once a week   2      1

I have no restrictions on R libraries, so whatever is easiest would be fine (maybe plyr/dplyr?).

Thanks.

Upvotes: 2

Views: 6483

Answers (3)

Pierre L

Reputation: 28441

table() does this with no need for outside packages:

sapply(testing, table)
#             one two
#Once a month   2   2
#Once a week    2   1
#Rarely         1   2
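
One caveat, offered as a sketch rather than as part of the original answer: sapply(testing, table) only lines up cleanly when every column contains the same set of categories. Otherwise it can return a list or rows that do not correspond. Assuming the full set of categories is known up front (the lvls vector below is an assumption, ordered as in the question's desired output), converting each column to a factor with shared levels keeps the counts aligned. It also makes it easy to add the categories column the question asked for.

```r
testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

# Assumed full set of categories, in the order used in the question
lvls <- c("Rarely", "Once a month", "Once a week")

# Count each column against the same levels so every column yields
# the same rows, including zero counts for absent categories
counts <- sapply(testing, function(col) table(factor(col, levels = lvls)))

# Move the row names into an ordinary column to get a plain data frame
result <- data.frame(categories = rownames(counts), counts, row.names = NULL)
result
```

A category that never appears in a column then shows up with a count of 0 instead of breaking the simplification.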

Upvotes: 7

JasonAizkalns

Reputation: 20463

Here's another way utilizing tidyr::gather, tidyr::spread, and dplyr::count:

library(dplyr)
library(tidyr)

testing %>%
  gather(measure, value) %>%
  count(measure, value) %>%
  spread(measure, n)

# Source: local data frame [3 x 3]
# 
#          value   one   two
#          (chr) (int) (int)
# 1 Once a month     2     2
# 2  Once a week     2     1
# 3       Rarely     1     2

Also, see this fantastic gist on this topic.
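
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer() and pivot_wider(). A rough equivalent of the pipeline above, as a sketch assuming a recent tidyr and dplyr, might look like this (the column names measure and categories are illustrative choices, not anything required by the packages):

```r
library(dplyr)
library(tidyr)

testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

counts <- testing %>%
  # Stack all columns into two: the source column name and its value
  pivot_longer(everything(), names_to = "measure", values_to = "categories") %>%
  # Count each (column, category) pair
  count(measure, categories) %>%
  # Spread the counts back out, one column per original variable;
  # categories missing from a column get 0 instead of NA
  pivot_wider(names_from = measure, values_from = n, values_fill = list(n = 0L))

counts
```

The values_fill argument only matters when a category is absent from one of the columns, which does not happen in this example.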

Upvotes: 2

cderv

Reputation: 6542

You could tidy your data with the tidyr and dplyr packages and count the categories with the base table() function:

testing = data.frame(c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"), c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely"))
colnames(testing) = c("one", "two")
testing
#>            one          two
#> 1  Once a week Once a month
#> 2  Once a week Once a month
#> 3       Rarely  Once a week
#> 4 Once a month       Rarely
#> 5 Once a month       Rarely

library(tidyr)
library(dplyr)

testing %>%
  gather("type", "categories") %>%
  table()
#>      categories
#> type  Once a month Once a week Rarely
#>   one            2           2      1
#>   two            2           1      2

# or reorder the columns before calling table()
testing %>%
  gather("type", "categories") %>%
  select(categories, type) %>%
  table()
#>               type
#> categories     one two
#>   Once a month   2   2
#>   Once a week    2   1
#>   Rarely         1   2
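
Note that table() returns a table object, not the data frame the question asks for. One possible way to convert it, shown here as a base R sketch (the names long, tab and result are illustrative), is as.data.frame.matrix(), which keeps the wide layout:

```r
testing <- data.frame(
  one = c("Once a week", "Once a week", "Rarely", "Once a month", "Once a month"),
  two = c("Once a month", "Once a month", "Once a week", "Rarely", "Rarely")
)

# Long format: one row per observation, tagged with its source column
long <- data.frame(
  categories = unlist(testing, use.names = FALSE),
  type = rep(names(testing), each = nrow(testing))
)

# Two-way contingency table: categories in rows, source columns in columns
tab <- table(long)

# Keep the wide layout and move the row names into a regular column
result <- data.frame(
  categories = rownames(tab),
  as.data.frame.matrix(tab),
  row.names = NULL
)
result
```

Calling plain as.data.frame() on the table would instead give a long result with one row per category/column pair and a Freq column.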

Upvotes: 2
