Joey Evans

Reputation: 11

How do I replace multiple character strings in a column with numbers?

I am sure this is a simple question, but I have found nothing online that clarifies it. I am working with a CSV file in R that has a column labeled Gender with two levels, "M" and "F". I am trying to recode the values so that F=1 and M=0, both stored as a numeric type. What code do I need to recode Gender?

I have tried using gsub, the replace function, and code with this format:

Test[Test$Gender == "F",]$Gender = 1

When I type in the code above it returns the error message:

Error in [<-.data.frame(*tmp*, Test$Gender == "F", , value = list( : missing values are not allowed in subscripted assignments of data frames

What do I need to do in order to properly replace M and F with 0 and 1?

Upvotes: 1

Views: 1253

Answers (2)

IBrum

Reputation: 345

One possible way to do it, through manipulation of the levels of Gender:

# dummy data:
Test = data.frame(Gender = factor(sample(c('M','F'), replace=TRUE, size=10)))
# solution: levels are numbered 1, 2 in the order given, so subtracting 1 gives M = 0, F = 1
Test$Gender = as.integer(factor(Test$Gender, levels=c('M','F')))-1

The order passed to levels controls the coding: the first level listed becomes 0 and the second becomes 1, so swap them to reverse the mapping.
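As a side note, the error quoted in the question is typically raised when the logical index contains NA, which suggests the Gender column has missing values. A minimal base R sketch that avoids subscripted assignment altogether is shown below; the sample data and the Gender_num column name are assumptions for illustration, not taken from the original file.

```r
# Hypothetical sample data with one missing value
Test <- data.frame(Gender = c("F", "M", NA, "F"), stringsAsFactors = TRUE)

# The comparison yields TRUE / FALSE / NA, and as.integer() maps these to 1 / 0 / NA,
# so missing genders stay missing instead of triggering an error
Test$Gender_num <- as.integer(Test$Gender == "F")
```

This only works cleanly for two categories; with more than two, every value other than "F" would be coded 0.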

Upvotes: 3

Scipione Sarlo

Reputation: 1498

Using a tidyverse approach:

library(tidyverse)
Test <- data.frame(Gender=c("F","M","F","M"))
Test %>% 
    mutate(Gender_mod=case_when(
        Gender=="F" ~ 1,
        Gender=="M" ~ 0
    ))

This creates a new variable, Gender_mod, that encodes the original one with the desired values:

  Gender Gender_mod
1      F          1
2      M          0
3      F          1
4      M          0

Or you may decide to replace the values in the original variable:

Test %>% 
   # a named vector applies every pattern to every element, regardless of row order
   mutate(Gender=as.numeric(str_replace_all(string=Gender,pattern=c("F"="1","M"="0"))))

and this is the output:

  Gender
1      1
2      0
3      1
4      0
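If you would rather not depend on the tidyverse, the same replacement can be sketched in base R with a named lookup vector, which also extends to more than two categories. The data and the lookup name below are assumptions for illustration.

```r
# Hypothetical sample data, deliberately not alternating, with one missing value
Test <- data.frame(Gender = c("F", "F", "M", NA))

# Named vector acting as a lookup table: names are old values, elements are new codes
lookup <- c(F = 1, M = 0)

# as.character() guards against factor columns, which would otherwise be
# indexed by their integer codes instead of by their labels
Test$Gender <- unname(lookup[as.character(Test$Gender)])
```

Values that are missing or absent from the lookup come back as NA, so unexpected categories are easy to spot afterwards with is.na().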

Upvotes: 2
