den

Reputation: 169

Identifying groups of individuals when a condition occurs in at least one member

I'm trying to identify (using a binary 1/0 variable) groups of subjects from a database, whenever at least one subject from this group fulfills two conditions.

My database DF is composed of families, with the sex (sx) and age (ag) of each member and a family ID (family). I would like to create a new binary variable NoMan that is 0 if at least one male in the family (sx equal to 1) is older than 16, and 1 otherwise. Note that I want NoMan to be identical for all members of the same family.

family <- factor(rep(c("001","002","003"), c(10,8,15)),
                 levels=c("001","002","003"), labels=c("001","002","003"), ordered=TRUE)
ag <- c(22,8,4,2,55,9,44,65,1,7,32,2,2,1,6,9,18,99,73,1,2,3,4,5,6,7,8,9,10,18,11,22,33)
sx <- c(1,2,2,2,1,2,2,2,1,1,2,1,2,1,2,1,2,2,2,2,1,2,1,2,1,2,1,2,1,2,1,2,2)
DF <- data.frame(family, ag, sx)
DF

I have tried to use ddply combined with ifelse, but this was not successful:

DF <- ddply(DF,.(family), transform, NoMan=ifelse(sx==1 & ag>16, 1, 0))
DF

It seems that, among other possible limitations, the functions in this script are applied to individuals instead of families (I would actually like the same result to be assigned to all members of the same family).

I feel I'm on the right track, but maybe someone has a good solution to this problem?

PS: just edited DF because in this example I wanted all members from family 003 to be tagged as NoMan==1

Upvotes: 0

Views: 64

Answers (4)

S van Balen

Reputation: 288

    DF$NoMan = as.integer(! DF$family %in% unique(DF[DF$sx == 1 & DF$ag > 16,1]))
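
The same membership-test idea can be unrolled step by step. This is only a sketch, written against the rule stated in the question (a family gets 0 when it contains a male older than 16). The names adult_male and fams_with_man are illustrative, and DF is rebuilt here so the snippet runs on its own:

```r
# Rebuild the example data from the question
family <- factor(rep(c("001", "002", "003"), c(10, 8, 15)))
ag <- c(22,8,4,2,55,9,44,65,1,7,32,2,2,1,6,9,18,99,73,1,2,3,4,5,6,7,8,9,10,18,11,22,33)
sx <- c(1,2,2,2,1,2,2,2,1,1,2,1,2,1,2,1,2,2,2,2,1,2,1,2,1,2,1,2,1,2,1,2,2)
DF <- data.frame(family, ag, sx)

# Row-level flag: is this individual a male older than 16?
adult_male <- DF$sx == 1 & DF$ag > 16

# Families that contain at least one such individual
fams_with_man <- unique(DF$family[adult_male])

# Members of every other family get NoMan = 1
DF$NoMan <- as.integer(!DF$family %in% fams_with_man)

# One value per family, to inspect the result
tapply(DF$NoMan, DF$family, unique)
```

Because %in% is evaluated for every row of DF, all members of a family automatically receive the same value.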

Upvotes: 1

d.b

Reputation: 32558

#Obtain unique families
fams  = unique(as.character(DF$family))
NoMan = numeric(length(fams))

for (i in seq_along(fams)){
#Ages of the male members of family i
male_ages = DF$ag[DF$family == fams[i] & DF$sx == 1]
#NoMan is 0 if any male in the family is older than 16, otherwise 1
if (any(male_ages > 16)){
NoMan[i] = 0
} else {
NoMan[i] = 1
}
}

#Join unique family and NoMan into new dataframe
DF2 = data.frame(family = fams, NoMan, stringsAsFactors = FALSE)

#Use lookup command of qdapTools package
library(qdapTools)
DF$NoMan = lookup(as.character(DF$family), DF2)

Upvotes: 1

akrun

Reputation: 887711

We can use dplyr

library(dplyr)
DF %>%
   group_by(family) %>% 
   mutate(NoMan = as.integer(!any(sx == 1 & ag > 16)))

Or using ave from base R

DF$NoMan <- with(DF, as.integer(!ave(sx == 1 & ag > 16, family, FUN = any)))

Upvotes: 1

akuiper

Reputation: 215117

ifelse works element-wise and returns one result per row, regardless of the group; you can use any to aggregate the results per group:

library(plyr)
ddply(DF, .(family), transform, NoMan = +!any(sx == 1 & ag > 16))
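
To make the difference concrete, here is a minimal sketch on a toy family. The vectors are made up for illustration and are not taken from DF, and the condition follows the question's wording (a male older than 16):

```r
# Hypothetical three-person family: an adult man, a girl and a boy
sx <- c(1, 2, 1)
ag <- c(40, 10, 5)

# Element-wise: one result per individual
per_person <- ifelse(sx == 1 & ag > 16, 1, 0)
per_person

# Aggregated: a single logical for the whole family
has_man <- any(sx == 1 & ag > 16)
has_man

# The question's coding: 0 when such a man exists, 1 otherwise,
# repeated for every member of the family
NoMan <- rep(as.integer(!has_man), length(sx))
NoMan
```

Inside transform the single value is recycled to the length of the group, which is what makes it identical for all members of a family.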

Upvotes: 1
