pyll

Reputation: 1764

Convert multiple columns to binary in R

Hi, I have a dataset with multiple columns that are populated with either NA or "Y". I wish to recode these values as 0 and 1, respectively.

I am fairly new to R, and trying to determine the best way to loop through these variables and recode them.

STATE<-c(NA, "WA", "NY", NA, NA)  
x<-c(NA,"Y",NA,NA,"Y")
y<-c(NA,NA,"Y",NA,"Y")
z<-c("Y","Y",NA, NA, NA)
mydata<-data.frame(STATE,x,y,z)

I have a large dataset, and many of these variables. However, some of them (such as STATE), I wish to leave alone. Any help would be greatly appreciated. Thanks.

Upvotes: 0

Views: 2820

Answers (3)

David Pinto

Reputation: 141

I think the best way is to use the mutate_each() function from the dplyr package:

library(dplyr)

STATE  <- c(NA, "WA", "NY", NA, NA)  
x      <- c(NA, "Y", NA, NA, "Y")
y      <- c(NA, NA, "Y", NA, "Y")
z      <- c("Y", "Y", NA, NA, NA)
mydata <- data.frame(x, y, z, STATE)

mydata <- mutate_each(mydata, funs(ifelse(is.na(.), 0, 1)), -STATE)

This applies the function specified inside funs() to each column. The dot (.) stands for the column currently being processed. To skip one or more columns, list their names with a - in front of them: -var1, -var2, ...
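In more recent dplyr releases, mutate_each() and funs() are deprecated in favour of across(). A rough equivalent of the call above might look like the following sketch, which assumes dplyr 1.0.0 or later; the name recoded is only illustrative and not part of the original answer.

```r
library(dplyr)

STATE  <- c(NA, "WA", "NY", NA, NA)
x      <- c(NA, "Y", NA, NA, "Y")
y      <- c(NA, NA, "Y", NA, "Y")
z      <- c("Y", "Y", NA, NA, NA)
mydata <- data.frame(x, y, z, STATE, stringsAsFactors = FALSE)

# Recode every column except STATE: NA becomes 0, anything else becomes 1.
# `recoded` is a hypothetical name chosen for this sketch.
recoded <- mutate(mydata, across(-STATE, ~ ifelse(is.na(.x), 0, 1)))
```

As in the original call, any non-NA value is mapped to 1, so this relies on the flag columns containing only NA or "Y".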

Upvotes: 1

someguyinafloppyhat

Reputation: 421

First, you need to make sure the character vectors are not coded as factors:

mydata <- data.frame(x, y, z, stringsAsFactors = FALSE)

Then:

mydata[mydata=="Y"] <- 1
mydata[is.na(mydata)] <- 0
mydata
  x y z
1 0 0 1
2 1 0 1
3 0 1 0
4 0 0 0
5 1 1 0
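One caveat: because the columns start out as character vectors, the assignments above leave them as character ("0" and "1") rather than numeric. If numeric 0/1 values are wanted and a column such as STATE should stay untouched, one possible base R sketch is to recode only a named set of columns. The vector flag_cols is a hypothetical helper introduced here, not something from the question.

```r
STATE <- c(NA, "WA", "NY", NA, NA)
x     <- c(NA, "Y", NA, NA, "Y")
y     <- c(NA, NA, "Y", NA, "Y")
z     <- c("Y", "Y", NA, NA, NA)
mydata <- data.frame(STATE, x, y, z, stringsAsFactors = FALSE)

# Hypothetical list of the columns that hold NA/"Y" flags.
flag_cols <- c("x", "y", "z")

# For each flag column: 1L where the value is "Y", 0L otherwise (including NA).
mydata[flag_cols] <- lapply(
  mydata[flag_cols],
  function(col) as.integer(!is.na(col) & col == "Y")
)
```

Columns not named in flag_cols, such as STATE, are left exactly as they were.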

Upvotes: 0

xraynaud

Reputation: 2136

You can use ifelse:

ifelse(is.na(mydata),0,ifelse(mydata=="Y",1,mydata))

This replaces elements of mydata with 0 if they are NA, with 1 if they are "Y", and keeps them unchanged if they are anything else. Note that the result is a matrix rather than a data frame.

You added the binary tag. R has a logical type with the values TRUE and FALSE, so if you want binary values, you should use

ifelse(is.na(mydata),FALSE,ifelse(mydata=="Y",TRUE,mydata))

instead.
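For a large dataset with many such columns, listing them by hand may be impractical. A minimal sketch, assuming every flag column contains only NA or "Y", is to detect those columns first and then convert just them to logical values. The name is_flag is hypothetical, and a column that is entirely NA would also be picked up by this check.

```r
STATE <- c(NA, "WA", "NY", NA, NA)
x     <- c(NA, "Y", NA, NA, "Y")
y     <- c(NA, NA, "Y", NA, "Y")
z     <- c("Y", "Y", NA, NA, NA)
mydata <- data.frame(STATE, x, y, z, stringsAsFactors = FALSE)

# TRUE for columns whose values are all either NA or "Y" (hypothetical helper).
is_flag <- vapply(
  mydata,
  function(col) all(is.na(col) | col == "Y"),
  logical(1)
)

# Convert only the detected flag columns: "Y" becomes TRUE, NA becomes FALSE.
mydata[is_flag] <- lapply(mydata[is_flag], function(col) !is.na(col))
```

Wrapping the last step in as.integer() would give 0/1 instead of FALSE/TRUE.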

Upvotes: 2
