Shahin

Reputation: 1316

Getting a summary of every column of a df (dplyr::count)

I have the following data frame:

tbl <- structure(list(col1 = c("a", NA, "b", NA, "c", "c"), col2 = c("n", 
"n", "b", "a", NA, "c"), col3 = c("z", "a", "z", "b", "1", "c"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
# A tibble: 6 x 3
  col1  col2  col3 
  <chr> <chr> <chr>
1 a     n     z    
2 NA    n     a    
3 b     b     z    
4 NA    a     b    
5 c     NA    1    
6 c     c     c

Is it possible to apply the dplyr::count function to every column, or is there some other function that returns the unique entries of every column and, ideally, the number of times each unique value appears?

Upvotes: 5

Views: 665

Answers (2)

akrun

Reputation: 887851

We can loop over the column names with map and apply count to each column:

library(dplyr)
library(purrr)
# all_of() makes it explicit that .x holds a column name stored in a character vector
map(names(tbl), ~ tbl %>% 
                     select(all_of(.x)) %>% 
                     count(!! rlang::sym(.x)))

Or we can apply table with summarise_all, which returns a list column:

tbl %>%
    summarise_all( ~ list(table(.)))

Or, for the number of distinct elements per column:

tbl %>%
    summarise_all(n_distinct)
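
With dplyr 1.0.0 or later, the two summaries above can also be written with across(), which supersedes summarise_all. This is a minimal sketch, assuming the tbl from the question (rebuilt here so the snippet runs on its own); the names distinct_counts and tables are only illustrative:

```r
library(dplyr)

# Same data as in the question
tbl <- tibble(
  col1 = c("a", NA, "b", NA, "c", "c"),
  col2 = c("n", "n", "b", "a", NA, "c"),
  col3 = c("z", "a", "z", "b", "1", "c")
)

# One row, one column per input column: number of distinct values
# (NA is counted as a value here)
distinct_counts <- tbl %>%
  summarise(across(everything(), n_distinct))

# One row of list columns, each cell holding the table() of that column
tables <- tbl %>%
  summarise(across(everything(), ~ list(table(.x))))
```

Each cell of tables can then be pulled out with, for example, tables$col1[[1]].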

Or in base R:

lapply(tbl, function(x) as.data.frame(table(x)))
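
If a single data frame is preferred over a list, one possible approach is to reshape to long format first and then call count once. This is only a sketch: it assumes tidyr (version 1.0.0 or later, for pivot_longer) is available in addition to dplyr, and the column names "column" and "value" and the object name long_counts are made up for illustration:

```r
library(dplyr)
library(tidyr)

# Same data as in the question
tbl <- tibble(
  col1 = c("a", NA, "b", NA, "c", "c"),
  col2 = c("n", "n", "b", "a", NA, "c"),
  col3 = c("z", "a", "z", "b", "1", "c")
)

# Stack all columns into (column, value) pairs, then count each pair.
# This works directly here because every column is character;
# columns of mixed types would need converting to a common type first.
long_counts <- tbl %>%
  pivot_longer(everything(), names_to = "column", values_to = "value") %>%
  count(column, value)
```

Unlike table, count keeps NA as its own row, so missing values show up in long_counts as well.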

Upvotes: 5

tmfmnk

Reputation: 40171

One solution using dplyr and purrr could be, for the number of distinct values:

map(tbl, n_distinct)

$col1
[1] 4

$col2
[1] 5

$col3
[1] 5

For the counts:

map(tbl, table)

$col1

a b c 
1 1 2 

$col2

a b c n 
1 1 1 2 

$col3

1 a b c z 
1 1 1 1 2 
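
Note that the two results above treat NA differently: n_distinct counts NA as a value (hence 4 for col1), while table drops it by default (only a, b and c are listed for col1). A small sketch of how to make both consistent, using the na.rm argument of n_distinct and the useNA argument of table; the object names are illustrative and tbl is rebuilt so the snippet runs on its own:

```r
library(dplyr)

# Same data as in the question
tbl <- tibble(
  col1 = c("a", NA, "b", NA, "c", "c"),
  col2 = c("n", "n", "b", "a", NA, "c"),
  col3 = c("z", "a", "z", "b", "1", "c")
)

# Distinct values per column, ignoring NA
distinct_non_na <- sapply(tbl, n_distinct, na.rm = TRUE)

# Counts per column, with NA reported as its own category when present
counts_with_na <- lapply(tbl, table, useNA = "ifany")
```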

Upvotes: 2
