Converting binary categorical variable to 0's and 1's

Question

I have a dataset where the outcome variable is a binary categorical variable "diagnosis", which is the type of tumour: "benign" or "malignant".

When converting the variable to numeric ("benign"=0 and "malignant"=1) I use the code:

tumor.df <- fread("df.csv", stringsAsFactors = T)
tumor.df$diagnosis = as.numeric(tumor.df$diagnosis, levels=c('benign', 'malignant'), labels=c(0, 1))

However, instead of diagnosis converting to 0's and 1's, it converts to 1's and 2's. Why is this happening?

Ben Bolker · Accepted Answer

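Because R stores factors as an underlying set of integer codes (starting from 1) and a set of associated labels. as.numeric() on a factor returns those codes, so a two-level factor comes back as 1's and 2's. The levels= and labels= arguments belong to factor(), not as.numeric().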

I would say you should go ahead and subtract one from the value that you got. There are lots of other ways to do the conversion that vary in efficiency and readability. One other option would be as.numeric(tumor.df$diagnosis=="malignant") (R converts FALSE to 0 and TRUE to 1).
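
Both options can be sketched on a small stand-in vector. The values below are made up for illustration and are not from the asker's df.csv. The sketch assumes the factor levels are "benign" then "malignant", which is the alphabetical default.

```r
# Hypothetical stand-in for tumor.df$diagnosis (made-up values)
diagnosis <- factor(c("benign", "malignant", "malignant", "benign"),
                    levels = c("benign", "malignant"))

# as.numeric() on a factor returns the internal integer codes
codes <- as.numeric(diagnosis)                      # 1 2 2 1

# Option 1: subtract one from the codes
y_minus <- codes - 1                                # 0 1 1 0

# Option 2: compare against the level that should map to 1
y_logical <- as.numeric(diagnosis == "malignant")   # 0 1 1 0
```

The comparison does not depend on the order of the levels, so it is probably the safer choice if the levels might ever be reordered.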
