Reputation: 169
I'm trying to flag groups of subjects in a database with a binary (1/0) variable, depending on whether at least one subject in the group fulfills two conditions.
My database DF is composed of families, with the sex and age of each member (and a family ID, family). I would like to create a new binary variable NoMan which is 0 if at least one of the males in a family (sx equal to 1) is older than 16 years, and 1 otherwise. Note that I want NoMan to be identical for all members of the same family.
family <- factor(rep(c("001","002","003"), c(10,8,15)),
levels=c("001","002","003"), labels=c("001","002","003"), ordered=TRUE)
ag <- c(22,8,4,2,55,9,44,65,1,7,32,2,2,1,6,9,18,99,73,1,2,3,4,5,6,7,8,9,10,18,11,22,33)
sx <- c(1,2,2,2,1,2,2,2,1,1,2,1,2,1,2,1,2,2,2,2,1,2,1,2,1,2,1,2,1,2,1,2,2)
DF <- data.frame(family, ag, sx)
DF
I have tried to use ddply combined with ifelse, but this was not successful:
DF <- ddply(DF,.(family), transform, NoMan=ifelse(sx==1 & ag>16, 1, 0))
DF
It seems that, among other possible limitations, the functions in this script are applied to individuals instead of families (I would actually like the same result to be assigned to all members of the same family).
I feel I'm on the right track, but maybe someone has a good solution to this problem?
PS: I just edited DF because in this example I wanted all members of family 003 to be tagged as NoMan==1.
Upvotes: 0
Views: 64
Reputation: 288
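# Families containing at least one male (sx == 1) older than 16 get 0, all others get 1
DF$NoMan = as.integer(!DF$family %in% unique(DF$family[DF$sx == 1 & DF$ag > 16]))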
Upvotes: 1
Reputation: 32558
# Obtain the unique families
family = unique(as.character(DF$family))
NoMan = numeric(length(family))
for (i in seq_along(family)){
  # Ages of the male members (sx == 1) of the i-th family
  male_ages = DF$ag[DF$family == family[i] & DF$sx == 1]
  # NoMan is 0 if any male is older than 16, otherwise 1
  if (any(male_ages > 16)){
    NoMan[i] = 0
  } else {
    NoMan[i] = 1
  }
}
# Join unique family and NoMan into a lookup data frame
DF2 = data.frame(family, NoMan, stringsAsFactors = FALSE)
# Use the lookup function from the qdapTools package
library(qdapTools)
DF$NoMan = lookup(DF$family, DF2)
Upvotes: 1
Reputation: 887711
We can use dplyr
library(dplyr)
DF %>%
group_by(family) %>%
  mutate(NoMan = as.integer(!any(sx == 1 & ag > 16)))
Or using ave from base R
DF$NoMan <- with(DF, as.integer(!ave(sx == 1 & ag > 16, family, FUN = any)))
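To sanity-check any group-wise approach against the question's data, one option is to compute a single value per family with tapply and compare it with the NoMan column. This is only a sketch in base R; the names has_adult_male and expected are made up for illustration and do not appear in the question.

```r
# Rebuild the example data from the question
family <- factor(rep(c("001", "002", "003"), c(10, 8, 15)))
ag <- c(22,8,4,2,55,9,44,65,1,7,32,2,2,1,6,9,18,99,73,1,2,3,4,5,6,7,8,9,10,18,11,22,33)
sx <- c(1,2,2,2,1,2,2,2,1,1,2,1,2,1,2,1,2,2,2,2,1,2,1,2,1,2,1,2,1,2,1,2,2)
DF <- data.frame(family, ag, sx)

# One value per family: TRUE if any male (sx == 1) is older than 16
# (has_adult_male is a hypothetical helper name)
has_adult_male <- tapply(DF$sx == 1 & DF$ag > 16, DF$family, any)

# NoMan is the negation of that flag, expressed as 0/1
expected <- as.integer(!has_adult_male)
names(expected) <- names(has_adult_male)
expected
```

With the data as posted, only family 001 contains a male older than 16, so it should get 0 while families 002 and 003 get 1.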
Upvotes: 1
Reputation: 215117
ifelse returns one result per row, regardless of the group; you can use any to aggregate the results per group:
library(plyr)
ddply(DF, .(family), transform, NoMan = as.integer(!any(sx == 1 & ag > 16)))
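The difference between the two functions can be seen on a tiny, made-up family (the values below are hypothetical and not taken from the question). This is just a minimal base R sketch:

```r
# Hypothetical two-person family: one adult male and one young girl
sx <- c(1, 2)
ag <- c(40, 5)

# ifelse() is vectorised: it returns one value per person
per_person <- ifelse(sx == 1 & ag > 16, 1, 0)

# any() collapses the whole group to a single TRUE/FALSE,
# which can then be recycled to every member of the family
per_family <- as.integer(any(sx == 1 & ag > 16))

per_person
per_family
```

Here per_person differs between the two members, while per_family is a single value for the group. NoMan as defined in the question would be the negation of that group-level flag.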
Upvotes: 1