Joost

Reputation: 101

How can you make a dummy variable based on part of a character string?

I want to make dummy variables based on whether a specific word is present in a column. Here is an example to clarify:

source/medium           qr_dummy

Amsterdam/qr_code          0 
Rotterdam/offline          0
Utrecht/online             0

I want a 1 whenever qr_code is present in the source/medium column. I tried the code below, but because == only matches when the whole value is exactly "qr_code", it won't give a 1.

df$qr_code_dummy[df$sourceMedium == "qr_code"] <- 1

So the wanted outcome looks as follows:

source/medium           qr_dummy

Amsterdam/qr_code          1 
Rotterdam/offline          0
Utrecht/online             0

Upvotes: 1

Views: 640

Answers (3)

user10917479


Using stringr may be a little more readable. Here it is in a dplyr pipeline, but you can use str_detect() without dplyr.

library(dplyr)
library(stringr)

df %>% 
  mutate(qr_code_dummy = as.integer(str_detect(sourceMedium, "qr_code")))
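
For the case without dplyr, a minimal sketch might look like the following. The data frame here is a hypothetical copy of the question's example, and wrapping the pattern in stringr's fixed() is my own assumption that a literal match (rather than a regular expression) is wanted.

```r
library(stringr)

# Hypothetical example data mirroring the question
df <- data.frame(
  sourceMedium = c("Amsterdam/qr_code", "Rotterdam/offline", "Utrecht/online"),
  stringsAsFactors = FALSE
)

# str_detect() returns TRUE/FALSE per row; as.integer() turns that into 1/0.
# fixed() asks for a literal substring match instead of a regex.
df$qr_code_dummy <- as.integer(str_detect(df$sourceMedium, fixed("qr_code")))
df
```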

Upvotes: 1

Andrew Chisholm

Reputation: 6567

As mentioned, grepl is a good choice. Here's an example using dplyr, with ifelse converting the logical result to 0 and 1.

library(dplyr)
df <- data.frame(sourceMedium = c('Amsterdam/qr_code','Rotterdam/offline','Utrecht/online'))
summary <- df %>% mutate(qr_code_dummy = ifelse(grepl('qr_code', sourceMedium), 1, 0))
summary

#       sourceMedium qr_code_dummy
# 1 Amsterdam/qr_code            1
# 2 Rotterdam/offline            0
# 3    Utrecht/online            0
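
To see why the original == attempt never matched, it may help to compare the two tests side by side. This is only an illustrative sketch: the vector x and the names exact and partial are hypothetical, and fixed = TRUE is an assumption that the search term should be treated literally.

```r
# Hypothetical values taken from the question's example
x <- c("Amsterdam/qr_code", "Rotterdam/offline", "Utrecht/online")

# == compares each whole string with "qr_code", so nothing matches
exact <- x == "qr_code"

# grepl() tests whether "qr_code" occurs anywhere inside each string
partial <- grepl("qr_code", x, fixed = TRUE)

exact
partial
```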

Upvotes: 1

Daniel O

Reputation: 4358

As @duckmayr recommended in the comments:

df$qr_code_dummy[grepl("qr_code",df$sourceMedium)] <- 1

       sourceMedium qr_code_dummy
1 Amsterdam/qr_code             1
2 Rotterdam/offline             0
3    Utrecht/online             0

data:

df <- structure(list(sourceMedium = structure(1:3, .Label = c("Amsterdam/qr_code", 
"Rotterdam/offline", "Utrecht/online"), class = "factor"), qr_code_dummy = c(1, 
0, 0)), row.names = c(NA, -3L), class = "data.frame")
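
Note that the indexed assignment above only touches the matching rows, so the other rows are 0 only if the column already holds zeros (as it does in the data here). If the column does not exist yet, a sketch like the following fills every row in one step. The data frame is a hypothetical variant of the example without a pre-existing dummy column, and fixed = TRUE is my assumption that a literal match is intended.

```r
# Hypothetical data with no qr_code_dummy column yet
df <- data.frame(
  sourceMedium = c("Rotterdam/offline", "Amsterdam/qr_code", "Utrecht/online"),
  stringsAsFactors = FALSE
)

# grepl() gives one logical per row; as.integer() maps TRUE/FALSE to 1/0,
# so non-matching rows receive an explicit 0
df$qr_code_dummy <- as.integer(grepl("qr_code", df$sourceMedium, fixed = TRUE))
df
```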

Upvotes: 2
