Milhouse

Reputation: 177

Creating Dummy Variable in R

I'm pretty new to R and I'm trying to create some new variables. Basically my dataset has individuals with a variable for mother ID (i.e. if two individuals have the same mother the value of this variable will be the same).

Keeping it simple to begin with, let's say I want to create a dummy variable that equals 1 if an individual has a sibling in the dataset. I tried using:

    dummy <- as.numeric(duplicated(Identifiers_age$MPUBID) = TRUE)

but the resulting vector equals 1 for only one of the siblings. What should I be doing?

Thanks

Upvotes: 0

Views: 855

Answers (2)

lmo

Reputation: 38500

If your goal is to return a vector of 0s and 1s that is 1 whenever the observational unit has a sibling, then you want to include a second duplicated call with fromLast=TRUE.

The first duplicated call returns TRUE for every sibling within an MPUBID after the first one, and the second call, which scans from the end of the vector, picks up the first sibling.

    hasSiblings <- as.integer(duplicated(Identifiers_age$MPUBID) |
                              duplicated(Identifiers_age$MPUBID, fromLast = TRUE))

The | is the element-wise (vectorized) logical OR operator. Note that duplicated already returns a logical vector, so you don't have to include the = TRUE after it as you did in your question.
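To see why both directions are needed, here is a minimal sketch on made-up data. The vector mother_id and its values are hypothetical stand-ins for Identifiers_age$MPUBID, not taken from the question.

```r
# Hypothetical toy data: mother IDs for six individuals
mother_id <- c(101, 102, 101, 103, 102, 101)

# Scanning from the start flags every repeat after the first occurrence
forward <- duplicated(mother_id)

# Scanning from the end flags every repeat before the last occurrence
backward <- duplicated(mother_id, fromLast = TRUE)

# An individual has a sibling if either scan flags them
has_siblings <- as.integer(forward | backward)

print(data.frame(mother_id, forward, backward, has_siblings))
```

The individual with mother 103 appears only once, so neither scan flags that row and it should be the only 0.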

Upvotes: 3

Axeman

Reputation: 35262

A dplyr answer:

    library(dplyr)

    Identifiers_age %>%
      group_by(MPUBID) %>%
      mutate(hasSiblings = as.integer(n() > 1))
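If you would rather not load a package, the same "group size greater than 1" idea can be sketched in base R with ave. This is only an illustration on made-up data: mother_id is a hypothetical stand-in for Identifiers_age$MPUBID.

```r
# Hypothetical toy data: mother IDs for six individuals
mother_id <- c(101, 102, 101, 103, 102, 101)

# For each individual, count how many rows share their mother ID
family_size <- ave(seq_along(mother_id), mother_id, FUN = length)

# More than one row per mother means the individual has a sibling
has_siblings <- as.integer(family_size > 1)

print(data.frame(mother_id, family_size, has_siblings))
```

Keeping family_size around may also be handy if you later want the number of siblings rather than just a 0/1 flag.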

Upvotes: 0
