Reputation: 177
I'm pretty new to R and I'm trying to create some new variables. Basically my dataset has individuals with a variable for mother ID (i.e. if two individuals have the same mother the value of this variable will be the same).
Keeping it simple to begin with, let's say I want to create a dummy variable that equals 1 if two individuals are siblings. I tried using:
dummy <- as.numeric(duplicated(Identifiers_age$MPUBID) = TRUE)
but the vector I get is only 1 for one of the siblings. What should I be doing?
Thanks
Upvotes: 0
Views: 855
Reputation: 38500
If your goal is to return a vector of 0s and 1s that is 1 whenever the observational unit has a sibling, then you want to include a second duplicated call with fromLast=TRUE.
The first duplicated call returns TRUE for every sibling within an MPUBID after the first one. The second call, with fromLast=TRUE, returns TRUE for every sibling before the last one, so it picks up the first sibling.
hasSiblings <- as.integer(duplicated(Identifiers_age$MPUBID) |
                          duplicated(Identifiers_age$MPUBID, fromLast=TRUE))
The | is the vectorized logical OR operator. Note that duplicated returns a logical vector, so you don't have to include the = TRUE after it as you did in your question.
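To see how the two duplicated calls combine, here is a small sketch on a made-up vector of mother IDs. The name mpubid and its values are assumptions for illustration only, not taken from your data.

```r
# Toy mother IDs (hypothetical values, not from Identifiers_age)
mpubid <- c(101, 102, 101, 103, 102, 101)

# TRUE for every occurrence of an ID after its first one
after_first <- duplicated(mpubid)
# FALSE FALSE  TRUE FALSE  TRUE  TRUE

# TRUE for every occurrence of an ID before its last one
before_last <- duplicated(mpubid, fromLast = TRUE)
#  TRUE  TRUE  TRUE FALSE FALSE FALSE

# An individual has a sibling if either check flags them
has_siblings <- as.integer(after_first | before_last)
has_siblings
# 1 1 1 0 1 1
```

Only the individual with mother 103 gets a 0, since that ID appears once.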
Upvotes: 3
Reputation: 35262
A dplyr answer:
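library(dplyr)
# Flag every individual whose mother ID appears more than once
Identifiers_age %>%
  group_by(MPUBID) %>%
  mutate(hasSiblings = as.integer(n() > 1)) %>%
  ungroup()  # drop the grouping so later steps are not done per MPUBID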
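If you would rather not load a package, the same count-per-mother idea can be sketched in base R with ave. This is an alternative to the dplyr code, not what it does internally, and the mpubid vector below is made up for illustration.

```r
# Toy mother IDs (hypothetical values, not from Identifiers_age)
mpubid <- c(101, 102, 101, 103, 102, 101)

# For each individual, count how many rows share their mother ID
family_size <- ave(seq_along(mpubid), mpubid, FUN = length)

# More than one row per mother means the individual has a sibling
has_siblings <- as.integer(family_size > 1)
has_siblings
# 1 1 1 0 1 1
```

Keeping family_size around can be handy if you later want the number of siblings rather than just a 0/1 flag.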
Upvotes: 0