Reputation: 1295
I have a set of factor variables holding the strings "n/a", "False" and "True", and I am trying to convert them to binary (0/1) values by writing the following function and calling it with apply():
a <- as.factor(c("n/a", "False", "False", "True"))
b <- as.factor(c("n/a", "True", "False", "True"))
y <- data.frame(a,b)
conv <- function(x){
  levels(x)[which(levels(x)=="n/a")] <- NA
  levels(x)[which(levels(x)=="False")] <- 0
  levels(x)[which(levels(x)=="True")] <- 1
  x <- as.numeric(levels(x))[x]
  return(x)
}
apply(y,2, conv)
However, when I do this, the output is all NAs. If I instead call the function on each column directly, it works:
conv(y[,1])
conv(y[,2])
The expected output should be:
y:
NA NA
0 1
0 0
1 1
Any thoughts on why this is happening? Thanks.
Upvotes: 0
Views: 39
Reputation: 887691
In R, logical values are TRUE/FALSE, not the strings "True" and "False". In addition, NA is the missing value.
y[] <- NA^(is.na(replace(as.matrix(y), y=="n/a", NA)))*+(y=='True')
y
# a b
#1 NA NA
#2 0 1
#3 0 0
#4 1 1
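The one-liner above packs several steps together. Unpacked, it might look like the sketch below. The intermediate names (m, is_missing, is_true, na_mask, result) are introduced here only for illustration and are not part of the original answer.

```r
# Rebuild the example data from the question.
a <- as.factor(c("n/a", "False", "False", "True"))
b <- as.factor(c("n/a", "True", "False", "True"))
y <- data.frame(a, b)

m <- as.matrix(y)           # character matrix of the original strings
is_missing <- m == "n/a"    # TRUE where the value should become NA
is_true    <- m == "True"   # TRUE where the value should become 1

# NA^TRUE is NA and NA^FALSE is 1, so this is NA for "n/a" cells and 1 elsewhere.
na_mask <- NA^is_missing

# Unary plus turns the logical matrix into 0/1; multiplying by NA keeps the NA.
result <- na_mask * +is_true

# Assigning into y[] keeps y a data frame with its original column names.
y[] <- result
y
```

The NA^ trick relies on R defining any value raised to the power 0 as 1, which is why the cells that are not "n/a" pass through unchanged.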
Upvotes: 1
Reputation: 51592
A simple ifelse can take care of the NA requirement. grepl can then be used to convert to 0/1, i.e.
y[] <- lapply(y[], function(i) ifelse(i == 'n/a', NA, grepl('True', i)*1))
y
# a b
#1 NA NA
#2 0 1
#3 0 0
#4 1 1
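Note that grepl('True', i) matches any string containing "True", not only the exact value. If exact matching is preferred, a variant along these lines could be used. This is only a sketch: to_binary is a hypothetical helper name, and it assumes every value other than "n/a" and "True" should map to 0.

```r
# Rebuild the example data from the question.
a <- as.factor(c("n/a", "False", "False", "True"))
b <- as.factor(c("n/a", "True", "False", "True"))
y <- data.frame(a, b)

# Hypothetical helper: compares whole strings instead of pattern-matching.
to_binary <- function(x) {
  x <- as.character(x)  # accepts factor or character input
  # "n/a" becomes NA, "True" becomes 1, anything else becomes 0.
  ifelse(x == "n/a", NA_real_, as.numeric(x == "True"))
}

y[] <- lapply(y, to_binary)
y
```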
Upvotes: 1
Reputation: 2076
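Your function is fine; you just need to use lapply instead of apply. apply() first coerces the data frame to a character matrix, so conv() receives character vectors that have no levels and returns NAs, whereas lapply() passes each column to conv() as a factor.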
conv <- function(x){
  levels(x)[which(levels(x)=="n/a")] <- NA
  levels(x)[which(levels(x)=="False")] <- 0
  levels(x)[which(levels(x)=="True")] <- 1
  x <- as.numeric(levels(x))[x]
  return(x)
}
lapply(y,conv)
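Also, if the order of the levels is the same for all the variables (the code below assumes "False", "n/a", "True"), you could just do this. Note that it returns a factor with levels 0 and 1 rather than a numeric vector.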
conv <- function(x){
  levels(x) <- c(0, NA, 1)
  return(x)
}
lapply(y, conv)
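To see why apply() and lapply() behave differently here, the following sketch inspects what each one hands to conv(). It reuses the question's data and function; the only addition is the temporary name m. It also assigns into y[] because lapply() on its own returns a list, and the question's expected output is a data frame.

```r
# Rebuild the example data and function from the question.
a <- as.factor(c("n/a", "False", "False", "True"))
b <- as.factor(c("n/a", "True", "False", "True"))
y <- data.frame(a, b)

conv <- function(x) {
  levels(x)[which(levels(x) == "n/a")]   <- NA
  levels(x)[which(levels(x) == "False")] <- 0
  levels(x)[which(levels(x) == "True")]  <- 1
  as.numeric(levels(x))[x]
}

# apply() works on as.matrix(y), which is a character matrix.
m <- as.matrix(y)
is.character(m)   # the factor structure is gone at this point
levels(m[, 1])    # NULL, so conv() has no levels to recode

# lapply() passes each column as the factor it is;
# assigning into y[] keeps the data frame shape.
y[] <- lapply(y, conv)
y
```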
Upvotes: 1