u31889
u31889

Reputation: 351

Count number of occurrence of string per selected columns

I have a dataframe df and particular columns I am interested in myCol.

df <- data.frame("Col1"=c("yes", "yes", "no"), "Col2"=c("no", "no", "yes"), "Col3"=c("no", "yes", "no"), "Col4"=c("yes", "yes", "yes"))

df
  Col1 Col2 Col3 Col4
1  yes   no   no  yes
2  yes   no  yes  yes
3   no  yes   no  yes

myCol <- c("Col1", "Col2", "Col4")

I am trying to count (column wise) the number of no and yes for myCol columns only.

I tried with summarise, count, table but it either returns errors or nested counts.

Thanks for your help !

EDIT

For clarification, my output could look something like this:

    Col1  Col2  Col4
no     1     2     0
yes    2     1     3

Upvotes: 1

Views: 53

Answers (2)

tmfmnk
tmfmnk

Reputation: 39858

One dplyr and purrr option could be:

map_dfc(.x = c("yes", "no"),
        ~ df %>%
         transmute(!!.x := rowSums(select(., one_of(myCol)) == .x)))

  yes no
1   2  1
2   2  1
3   2  1

If you actually need it as a sum per columns:

map_dfc(.x = c("yes", "no"),
        ~ df %>%
         summarise(!!.x := sum(rowSums(select(., one_of(myCol)) == .x))))

  yes no
1   6  3

To match the OPs output, using dplyr and tidyr:

df %>%
 select(one_of(myCol)) %>%
 pivot_longer(everything()) %>%
 count(name, value) %>%
 pivot_wider(names_from = "name", values_from = "n", values_fill = list(n = 0))

  value  Col1  Col2  Col4
  <fct> <int> <int> <int>
1 no        1     2     0
2 yes       2     1     3

Upvotes: 1

Allan Cameron
Allan Cameron

Reputation: 173793

A simple base R solution if you just want to count the total number of yes and no in those columns is:

table(sapply(myCol, function(x) df[[x]]))

#> no yes 
#>  3   6 

Or for individual columns:

sapply(myCol, function(x) table(factor(df[[x]], levels = c("no", "yes"))))
#>     Col1 Col2 Col4
#> no     1    2    0
#> yes    2    1    3

Upvotes: 0

Related Questions