Reputation: 635
I have a data.frame of survey results. The responses are in Portuguese and I need to replace some of them, for instance "Não sabe" and "Não respondeu", with "Ns/Nr". Each column of the data frame is of factor class. Can I generalize this procedure to any set of labels? I tried something like this:
mydata[mydata %in% c("Não sabe", "Não respondeu")] <- "Ns/Nr"
But it doesn't work. There is no error, but when I run
freq(mydata$Q_9)
there is no "Ns/Nr" label and the other frequencies remain the same. And when I run
mydata[mydata == "Não respondeu"]<- "Ns/Nr"
I get warnings like:
In `[<-.factor`(`*tmp*`, thisvar, value = "Ns/Nr") :
invalid factor level, NA generated
In this case, freq(mydata$Q_9) shows a frequency of zero for "Não respondeu", and NA takes over its old frequency.
Upvotes: 1
Views: 55
Reputation: 887291
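Assigning a value that is not an existing level of a factor produces NA, which is what the warning is telling you. You could convert the 'factor' columns to 'character' before substituting, i.e.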
mydata[] <- lapply(mydata, as.character)
mydata[] <- lapply(mydata, function(x) {
  x[x %in% c("Não sabe", "Não respondeu")] <- 'Ns/Nr'
  x
})
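To make this easier to try out, here is a minimal sketch on made-up data. The toy data frame, its values, and the `old_labels` name are assumptions for illustration only. Unlike the code above, it converts each column back to a factor at the end, which may or may not be what you want.

```r
# Hypothetical toy data standing in for the real survey results.
mydata <- data.frame(
  Q_1 = factor(c("Sim", "Não sabe", "Não", "Não respondeu")),
  Q_9 = factor(c("Não respondeu", "Sim", "Sim", "Não sabe"))
)

# Labels to collapse; change this vector to generalize to other labels.
old_labels <- c("Não sabe", "Não respondeu")

# Per column: go through character, replace, then rebuild the factor so
# that only the labels still in use remain as levels.
mydata[] <- lapply(mydata, function(x) {
  x <- as.character(x)
  x[x %in% old_labels] <- "Ns/Nr"
  factor(x)
})

# Frequency table for one column after the replacement.
table(mydata$Q_9)
```

Because the replacement is done on character vectors, no "invalid factor level" warning is raised.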
Or, without converting to 'character' class, we can use recode from car. The advantage is that the new level 'Ns/Nr' is added to the 'factor' column while the replaced levels are dropped.
library(car)
mydata[] <- lapply(mydata, function(x)
  recode(x, "c('Não sabe', 'Não respondeu')='Ns/Nr'"))
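If you would rather not depend on car, one possible base R alternative (not part of the approach above, just a sketch) is to rename the levels directly: giving several levels the same name merges them. The `survey` data and the `relabel_levels()` helper below are hypothetical names made up for illustration.

```r
# Hypothetical toy data standing in for the real survey results.
survey <- data.frame(
  Q_1 = factor(c("Sim", "Não sabe", "Não", "Não respondeu")),
  Q_9 = factor(c("Não respondeu", "Sim", "Sim", "Não sabe"))
)

# Hypothetical helper: rename every level found in `old` to `new`.
# Levels that end up with the same name are merged into one.
relabel_levels <- function(x, old, new) {
  if (!is.factor(x)) return(x)  # leave non-factor columns untouched
  levels(x)[levels(x) %in% old] <- new
  x
}

survey[] <- lapply(survey, relabel_levels,
                   old = c("Não sabe", "Não respondeu"), new = "Ns/Nr")

# The old labels are gone and "Ns/Nr" is now a level.
levels(survey$Q_9)
```

Since `old` and `new` are arguments, the same helper can be reused for any other set of labels.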
Upvotes: 1