Reputation: 47

How to count() each variable automatically

I am cleaning some data and like to use the count() function in dplyr to look at unique values of every variable.
Is there a way to do this automatically? Right now I am using this method:

df %>% count(variable1)
df %>% count(variable2)
df %>% count(variable3)
...

I would like something that returns all of them without me having to repeat the line of code and type in each variable. I thought about trying to have R recognize all the column names and automatically fill them in but I'm not sure where to start. If I just add variables together, say

df %>% count(variable1, variable2)

I get counts by both of those variables when I want individual tables for each variable.

Upvotes: 3

Answers (3)

Darren Tsai

Reputation: 35554

Suppose you want to count am, gear, and carb from mtcars. You can apply table() to each variable with map(), which returns a list with one frequency table per variable.

library(dplyr)
library(purrr)

mtcars %>%
  select(am, gear, carb) %>%
  map(table)

# $am
#  0  1 
# 19 13 
# 
# $gear
#  3  4  5 
# 15 12  5 
# 
# $carb
#  1  2  3  4  6  8 
#  7 10  3 10  1  1

Base R version:

lapply(mtcars[c("am", "gear", "carb")], table)

In addition, you can use summary(), which reports counts per level once the variables have been converted to factors.

mtcars %>%
  select(am, gear, carb) %>%
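  # Name the columns explicitly; relying on the default .cols and the partially matched .fn is deprecated in recent dplyr
  mutate(across(everything(), as.factor)) %>%
  summary()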

#  am     gear   carb  
#  0:19   3:15   1: 7  
#  1:13   4:12   2:10  
#         5: 5   3: 3  
#                4:10  
#                6: 1  
#                8: 1
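
If you would rather keep the data-frame output of count() itself, here is a minimal sketch along the same lines. It assumes dplyr 1.0.0 or later (for across() and all_of()); the names vars and counts are only placeholders for illustration, not anything from the question.

```r
library(dplyr)

# Columns to tabulate; use names(mtcars) to cover every column
vars <- c("am", "gear", "carb")

# Run count() once per column name and keep the results in a named list
counts <- lapply(
  setNames(vars, vars),
  function(v) count(mtcars, across(all_of(v)))
)

counts$am
```

Each element is an ordinary data frame with the variable and an n column, so counts$gear can be filtered or joined like any other table.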

Upvotes: 2

Hasan Bhagat

Reputation: 405

A simple solution would be to use sapply() or lapply() with table():

sapply(df,table)

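This returns a count table for each column of df. Note that sapply() may simplify the result to a matrix when every table has the same length, whereas lapply() always returns a list. You can always pass in a subsetted data frame to get the counts for just your variables of interest.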
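
To illustrate the difference between the two, here is a small sketch using mtcars as a stand-in for df (the question did not include sample data, so the choice of columns is only an assumption for demonstration):

```r
# am and vs both take exactly two values (0 and 1)
cols <- c("am", "vs")

# lapply() always gives a list with one table per column
tabs <- lapply(mtcars[cols], table)

# sapply() simplifies equal-length tables into a matrix
simplified <- sapply(mtcars[cols], table)

is.list(tabs)          # TRUE
is.matrix(simplified)  # TRUE
```

If you need a predictable return type regardless of the data, lapply() is probably the safer choice.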

Upvotes: 1

Duck

Reputation: 39595

You can solve this with a tidyverse approach. You want the counts for each variable in your dataset (next time, please add a sample of df). You can get something close to what you want by reshaping the data to long format. I will show an example with the mtcars data, choosing variables that take only a few distinct values so that they can be summarised with counts. Here is the code:

library(tidyverse)
#Data
data("mtcars")

The following code selects some categorical variables and reshapes them to long format. It then uses group_by() with summarise() and n() (which counts rows) to compute the counts:

#Code
mtcars %>% select(cyl,vs,am,gear,carb) %>%
  #Format to long
  pivot_longer(cols = everything()) %>%
  #Group and summarise
  group_by(name,value) %>%
  summarise(N=n())

Output:

# A tibble: 16 x 3
# Groups:   name [5]
   name  value     N
   <chr> <dbl> <int>
 1 am        0    19
 2 am        1    13
 3 carb      1     7
 4 carb      2    10
 5 carb      3     3
 6 carb      4    10
 7 carb      6     1
 8 carb      8     1
 9 cyl       4    11
10 cyl       6     7
11 cyl       8    14
12 gear      3    15
13 gear      4    12
14 gear      5     5
15 vs        0    18
16 vs        1    14

As you can see, every variable is shown with its respective values and counts.
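
As a possible shortcut, the group_by() and summarise() steps can be collapsed into a single count() call, which also returns an ungrouped result. The sketch below additionally shows one way to handle columns of different types, which pivot_longer() cannot combine into one value column without a conversion. The data frame mixed is a made-up example, not data from the question.

```r
library(dplyr)
library(tidyr)

# Same result as group_by() + summarise(N = n()), but ungrouped
long_counts <- mtcars %>%
  select(cyl, vs, am, gear, carb) %>%
  pivot_longer(cols = everything()) %>%
  count(name, value, name = "N")

# Hypothetical data with one character and one numeric column
mixed <- data.frame(grp = c("a", "b", "a"), flag = c(1, 0, 1))

# Convert everything to character first so the values can share a column
mixed_counts <- mixed %>%
  mutate(across(everything(), as.character)) %>%
  pivot_longer(cols = everything()) %>%
  count(name, value, name = "N")
```

The trade-off of converting to character is that value no longer sorts numerically, so this is mainly useful for a quick look at the unique values.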

Upvotes: 2
