kittygirl

Reputation: 2443

Different result in the case of `factor` or `as.factor`?

My system is R 3.5.3 with RStudio 1.1.463.

I create a data frame as below:

df <- data.frame(
    cola = c('a','b','c','d','e','e','1',NA,'c','d'),
    colb = c("A",NA,"C","D",'a','b','c','d','c','d'),
    stringsAsFactors = FALSE)
cats <- c('a','b','c','d','e','f','1')

Then I run `df['cola'] <- lapply(df['cola'], function(x) factor(x,levels=cats,exclude = NULL,ordered = FALSE,nmax=6))` and get the expected result.

If I change `factor` to `as.factor` based on this post and run `df['cola'] <- lapply(df['cola'], function(x) as.factor(x,levels=cats,exclude = NULL,ordered = FALSE,nmax=6))`, I get the error below:

Error in as.factor(x, levels = cats, exclude = NULL, ordered = FALSE,  : 
  unused arguments (levels = cats, exclude = NULL, ordered = FALSE, nmax = 6)

What's the problem?

Upvotes: 2

Views: 1894

Answers (1)

Ronak Shah

Reputation: 388817

The problem is exactly what the error message states: you are passing arguments that `as.factor` does not accept. If you read `?as.factor`, you will see that its only parameter is `x`. `levels`, `exclude`, `ordered` and `nmax` are arguments of `factor`, not of `as.factor`, so R reports them as unused arguments.

If you remove those arguments, the call runs without any error message.
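As a quick check from the console, the formal arguments of the two functions can be compared with `formals()`. This is only a minimal sketch using base R; it prints the argument names rather than assuming them.

```r
# Formal arguments of as.factor(): it takes x and nothing else
names(formals(as.factor))

# Formal arguments of factor(): these include levels, exclude, ordered and nmax
names(formals(factor))
```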

lapply(df['cola'], function(x) as.factor(x))
#$cola
# [1] a    b    c    d    e    e    1    <NA> c    d   
#Levels: 1 a b c d e

Or just

lapply(df['cola'], as.factor)

And if you have just one column, there is no need for `lapply`:

as.factor(df$cola)
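Note that `as.factor` derives the levels from the data, so the custom `cats` ordering from the question is not applied. If that ordering matters, `factor` with `levels` is still the call to use. A minimal sketch comparing the two, rebuilding only the `cola` column from the question (`f1` and `f2` are illustrative names, not from the original post):

```r
# Same cola column and level set as in the question
df <- data.frame(
    cola = c('a','b','c','d','e','e','1',NA,'c','d'),
    stringsAsFactors = FALSE)
cats <- c('a','b','c','d','e','f','1')

# as.factor() only sees x, so the levels are the sorted unique values in the data
f1 <- as.factor(df$cola)
levels(f1)

# factor() accepts levels, so the order in cats and the unused level 'f' are kept
f2 <- factor(df$cola, levels = cats)
levels(f2)
```

Here `levels(f2)` is exactly `cats`, while `levels(f1)` contains only the six values that actually occur in the column.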

Upvotes: 4
