Anthony Taravella

Reputation: 3

How can you order factor levels for all columns at once instead of separately?

I'm working on an analysis of a survey in which most of the questions (105 out of 167) have a rank between 1 and 10, with 99999 recorded when a question was not filled in. I loaded the data set into R and made a data frame with these 105 questions. When I did this I saw that the data types were not right: they were all dbl. So I first changed the data type with (data set = survey):

survey <- data.frame(lapply(survey, as.character), stringsAsFactors = FALSE)
survey[survey == 99999] <- "No answer"

to be able to change the 99999 to "No answer", and then I used:

survey[] <- lapply(survey,factor)

to change the columns to factors. But the problem now is that the order of the factor levels (the ranks) changed as soon as I converted the columns to character. I think the reason is that for some questions no one ranked 1, and when you change it to character it puts rank 10 in the first position when you run, for example:

survey %>% group_by(v2_a) %>% summarize(count = n())

I know a way to reorder the levels separately, for example:

survey$v2_a <- factor(survey$v2_a, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
survey$v2_b <- factor(survey$v2_b, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
survey$v2_c <- factor(survey$v2_c, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))
...

But this requires a lot of work if you have to do it for 105 different questions. Does someone know a shorter way? I tried something like:

survey <- factor(survey, levels = c("1","2", "3", "4","5","6","7","8","9","10","No answer"))

But this definitely doesn't work.

Upvotes: 0

Views: 57

Answers (1)

Ben Bolker

Reputation: 226182

Any additional arguments given to lapply are passed on to the function it applies, so something like this

survey[] <- lapply(survey, factor, levels = c(1:10, "No answer"))

would probably work.

If you wanted to be more explicit about it you could do:

ffun <- function(x) return(factor(x, levels = c(1:10, "No answer")))
survey[] <- lapply(survey,ffun)
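
As a minimal sketch on made-up data (the toy values are hypothetical; only the column names v2_a and v2_b are borrowed from the question), this shows the levels argument being forwarded through lapply to factor. The level label has to match the recoded string exactly, including case, otherwise those entries become NA. The levels come out in numeric order even for ranks that never occur in a column:

```r
# Toy stand-in for the survey data; the values are invented for illustration.
survey <- data.frame(
  v2_a = c(10, 2, 99999, 3),
  v2_b = c(5, 99999, 10, 1)
)

# Convert to character and recode the missing-answer code, as in the question.
survey[] <- lapply(survey, as.character)
survey[survey == "99999"] <- "No answer"

# lapply() forwards the `levels` argument to factor() for every column.
rank_levels <- c(1:10, "No answer")
survey[] <- lapply(survey, factor, levels = rank_levels)

levels(survey$v2_a)  # all eleven levels, in the order given above
table(survey$v2_a)   # counts per level, including empty ones
```

Because every column gets the same explicit level set, grouped summaries and tables should list the ranks in the same order for all questions.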

You could also try reading in your data with na.strings="99999" (or whatever your missing-value code is) in the first place, so that your no-answer cases get automatically converted to NA.
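
A rough sketch of that approach, assuming the data can be read with base R's read.csv (the inline CSV text below is a hypothetical stand-in for the real file):

```r
# Hypothetical CSV content standing in for the real survey file.
csv_text <- "v2_a,v2_b
10,5
2,99999
99999,10
3,1"

# na.strings turns the 99999 code into NA while reading,
# so the columns stay numeric instead of needing a character detour.
survey <- read.csv(text = csv_text, na.strings = "99999")

# With numeric input, the levels 1..10 are in numeric order;
# NA is not a level unless you ask for it.
survey[] <- lapply(survey, factor, levels = 1:10)

table(survey$v2_a, useNA = "ifany")
```

If the file is read with readr::read_csv instead, the corresponding argument is na rather than na.strings. The trade-off is that missing answers are then NA rather than a named "No answer" level, so they are left out of tables unless you request them (for example with useNA).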

Upvotes: 2
