Reputation: 11
I am sure this is a simple question, but I have found nothing online that clarifies it. I am working with a CSV file in R that has a column labeled Gender with two levels, "M" and "F". I am trying to recode the values so that F = 1 and M = 0, both stored as a numeric type. What code do I need to recode Gender?
I have tried using gsub, the replace function, and code with this format:
Test[Test$Gender == "F",]$Gender = 1
When I type in the code above it returns the error message:
Error in `[<-.data.frame`(`*tmp*`, Test$Gender == "F", , value = list( : missing values are not allowed in subscripted assignments of data frames
What do I need to do in order to properly replace M and F with 0 and 1?
Upvotes: 1
Views: 1253
Reputation: 345
One possible way to do it is by manipulating the levels of Gender:
# dummy data:
Test = data.frame(Gender = factor(sample(c('M','F'), replace=TRUE, size=10)))
# solution: list 'M' first so that M -> 0 and F -> 1
Test$Gender = as.integer(factor(Test$Gender, levels=c('M','F')))-1
You can use the levels argument to choose which level (M or F) gets the first value, which becomes 0 after subtracting 1.
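For context, the error in the question typically appears when the Gender column contains NA values (for example, blank cells in the CSV), because the logical index Test$Gender == "F" then contains NA as well. Here is a minimal sketch, assuming that is the cause; the data is made up for illustration:

```r
# Hypothetical data with one missing value, as might come from a blank CSV cell
Test <- data.frame(Gender = c("F", "M", NA, "F"), stringsAsFactors = FALSE)

# The comparison propagates NA, which row-indexed assignment on a data frame rejects
Test$Gender == "F"

# Comparing and converting in one step avoids the indexed assignment; NA stays NA
Test$Gender <- as.integer(Test$Gender == "F")   # F -> 1, M -> 0
str(Test)
```

Note that with this shortcut any non-missing value other than "F" becomes 0, so it is only safe if the column really contains nothing but "M" and "F".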
Upvotes: 3
Reputation: 1498
Using a tidyverse approach:
library(tidyverse)
Test <- data.frame(Gender=c("F","M","F","M"))
Test %>%
  mutate(Gender_mod=case_when(
    Gender=="F" ~ 1,
    Gender=="M" ~ 0
  ))
This creates a new variable, Gender_mod, that encodes the original one with the desired values:
  Gender Gender_mod
1      F          1
2      M          0
3      F          1
4      M          0
Or you may decide to replace the values in the original variable:
Test %>%
  # a named vector maps each pattern to its replacement, regardless of row order
  mutate(Gender=as.numeric(str_replace_all(string=Gender,pattern=c("F"="1","M"="0"))))
and this is the output:
Gender
1 1
2 0
3 1
4 0
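One thing to be aware of with case_when: values that match none of the conditions become NA. Below is a small sketch, assuming dplyr is installed and using made-up messy data (a lowercase "f" and a missing value), with a quick cross-tabulation to check the recode:

```r
library(dplyr)

# Hypothetical data containing an unexpected spelling and a missing value
Test <- data.frame(Gender = c("F", "M", "f", NA))

Test <- Test %>%
  mutate(Gender_mod = case_when(
    Gender == "F" ~ 1,
    Gender == "M" ~ 0
  ))

# Cross-tabulate old against new values; unmatched entries show up as NA
table(Test$Gender, Test$Gender_mod, useNA = "ifany")
```

Running a check like this after recoding makes stray values in the CSV visible before they silently turn into NA.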
Upvotes: 2