Prakhar Singh Dhaila

Reputation: 153

How to make a frequency table for all categorical variables in a data frame?

I tried to use a for loop to get the counts for each column in a data frame. I created a data frame named freq that contains only the categorical variables.

n <- names(freq)
for (var in n) {
  count(freq, var)
}

I am getting the following error:

Error in grouped_df_impl(data, unname(vars), drop) : Column var is unknown

Upvotes: 0

Views: 2771

Answers (2)

datasci-iopsy

Reputation: 345

The table function in base R is really helpful for counting the levels of categorical variables; however, its output is an object of class table, so most downstream functions in R that would otherwise be useful (ggplot, kable, etc.) do not recognize it.

Here's a snippet that builds a list holding the count of each level of every factor, with each table converted to a data frame.

# df should be a data structure containing the factors of interest
freqList = lapply(df, function(x) {
  # Count each level, then convert the table object to a data frame
  my_lst = data.frame(table(x))
  names(my_lst) = c("level", "n")
  return(my_lst)
})
freqList

Calling freqList prints the full list. Each column/variable gets its own data frame, stored under the variable's name.
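
If a single table is easier to work with than a list, the per-variable data frames can be stacked. The following is only a sketch in base R, assuming iris stands in for the data and that only its factor columns are kept; the names freqLong and variable are illustrative choices, not part of the snippet above.

```r
# Assumption: iris is a stand-in; keep only its factor columns
df <- iris[sapply(iris, is.factor)]

# Same idea as above: one data frame of level counts per factor
freqList <- lapply(df, function(x) {
  out <- as.data.frame(table(x), stringsAsFactors = FALSE)
  names(out) <- c("level", "n")
  out
})

# Stack the list into one long data frame, tagging each row
# with the name of the variable it came from
freqLong <- do.call(
  rbind,
  Map(function(d, v) cbind(variable = v, d), freqList, names(freqList))
)
rownames(freqLong) <- NULL
freqLong
```

Individual results remain available by name, for example freqList$Species or freqList[["Species"]].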

Upvotes: 0

Rui Barradas

Reputation: 76673

You are passing var as a character string, but dplyr::count needs a variable (an unquoted column name). To get the variable from the string, use get.
In this example, the data frame freq is the built-in dataset iris.

freq <- iris

n <- names(freq)
n <- n[sapply(n, function(var) is.factor(freq[[var]]))]

for(var in n){
  cnt <- dplyr::count(freq, get(var))
  print(cnt)
}
## A tibble: 3 x 2
#  `get(var)`     n
#  <fct>      <int>
#1 setosa        50
#2 versicolor    50
#3 virginica     50
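
Note that with get(var) the first column of the result is labelled `get(var)` rather than with the variable's own name. One possible alternative, sketched here under the assumption that the dplyr and rlang packages are installed, is to turn the string into a symbol with rlang::sym() and unquote it with !!, so the result keeps the original column name. The list counts is a hypothetical container added for illustration.

```r
library(dplyr)

freq <- iris

# Names of the factor columns only
n <- names(freq)[sapply(freq, is.factor)]

# Hypothetical container: one count table per factor, keyed by column name
counts <- list()
for (var in n) {
  # sym() converts the string to a symbol; !! unquotes it inside count()
  counts[[var]] <- count(freq, !!rlang::sym(var))
  print(counts[[var]])
}
```

Storing the results in a list rather than only printing them makes it possible to reuse them later, for example counts$Species.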

Upvotes: 2
