Ajit Tramp

Reputation: 11

How to find only the categorical variables?

This code snippet loops over the columns of the data set train and tries to print a message for each variable that is a category (factor). However, is.factor(varname[i]) is always FALSE. How can I fix it?

find_Cat <- function(){
  varname<-NULL
  for(i in  1 : length(names(train))){   
    varname[i]<-paste('train$',names(train)[i],sep='')
    if(is.factor(varname[i])) 
    print("This is Category variable ")    
  }
}

Upvotes: 1

Views: 68

Answers (3)

LyzandeR

Reputation: 37879

You could also use Filter for this if you like:

Using the built-in iris data set:

names(Filter(is.factor, iris))
[1] "Species"

Filter will extract the columns that are factors and names returns the names of the factor columns.
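
One thing to watch for: is.factor is FALSE for plain character columns, and since R 4.0 data.frame() no longer converts strings to factors by default. Below is a minimal sketch on a made-up data frame (the name train and its columns are only assumptions standing in for your data) that shows the difference and one way to catch both kinds of column:

```r
# Hypothetical stand-in for the asker's `train`; column names are invented
train <- data.frame(
  age    = c(23, 35, 41),
  city   = c("Pune", "Delhi", "Pune"),   # stays character
  gender = factor(c("m", "f", "m")),     # explicit factor
  stringsAsFactors = FALSE
)

# Only true factor columns
factor_cols <- names(Filter(is.factor, train))

# Factor or character columns, if both should count as categorical
categorical_cols <- names(
  Filter(function(x) is.factor(x) || is.character(x), train)
)

print(factor_cols)
print(categorical_cols)
```

Whether character columns should count as categorical depends on your data, so treat the second predicate as one possible choice.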


Based on your comment, if you want to find the mode (the most frequent value) of every categorical column, you could do:

df <- data.frame(a=runif(5), b=rep(c('a','b'), c(2,3)), c=rep(c('a','d'), c(2,3)) )

#> df
#           a b c
#1 0.29489199 a a
#2 0.08649974 a a
#3 0.65941729 b d
#4 0.49732569 b d
#5 0.62138883 b d


#the below will find the mode of the column whether it is factor or character 
#assuming you have numeric, integer, character and factor columns
lapply(Filter(Negate(is.numeric), df),
       function(x) names(sort(table(x), decreasing=TRUE)[1]))

$b
[1] "b"

$c
[1] "d"

Upvotes: 1

Benjamin

Reputation: 17279

Or if you want a character vector in return, you can slightly alter akrun's code.

factors <- vapply(train, 
            function(x) if (is.factor(x)) 
                         "This is a Category Variable" 
                         else "", 
            character(1))
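
If a logical vector is more convenient than a vector of messages, the same vapply pattern works with logical(1). This is only a sketch on an invented data frame (train, id and colour are assumptions, not the asker's data):

```r
# Hypothetical stand-in for `train`
train <- data.frame(
  id     = 1:3,
  colour = factor(c("red", "blue", "red"))
)

# Named logical vector: TRUE where the column is a factor
is_cat <- vapply(train, is.factor, logical(1))

# Names of the factor columns
cat_names <- names(is_cat)[is_cat]

# Data frame restricted to the factor columns
cat_data <- train[is_cat]

print(cat_names)
```

The logical vector can then be reused for subsetting, as in train[is_cat] above.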

Upvotes: 1

akrun

Reputation: 887158

We can use lapply

lapply(train, function(x) if(is.factor(x))
                   'This is Category variable'
                   else NULL)
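
As for why the original loop always gives FALSE: paste() builds the text "train$colname", and is.factor() is then asked about that character string rather than about the column itself. Here is a rough sketch of the loop with the column looked up via [[ ]] instead; the data frame and the function name find_cat are made up for illustration:

```r
# Hypothetical stand-in for `train`
train <- data.frame(
  x   = c(1.5, 2.5),
  grp = factor(c("a", "b"))
)

# The pasted value is just a string, so this is FALSE for every column
string_check <- is.factor(paste0("train$", "grp"))

# Look the column up by name and test the column itself
find_cat <- function(data) {
  found <- character(0)
  for (name in names(data)) {
    if (is.factor(data[[name]])) {
      message(name, " is a category variable")
      found <- c(found, name)
    }
  }
  found
}

cat_vars <- find_cat(train)
```

Passing the data frame as an argument, rather than reading train from the global environment, also lets the same function be reused on other data sets.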

Upvotes: 2
