Anup

Reputation: 31

Generating conditional dummy ids

I have a dataset that looks like the following, and I'm using R to work with it. The first three columns (year, id and var) are part of the raw data. I need to create the new variable ans as follows.

For each year, I need a dummy variable ans that takes the value 1 for every row whose id has at least one record with var=1 in that year, and 0 otherwise. Sample data with the expected output (ans) is shown below.

 year     id     var     ans
 2010      1      1       1
 2010      2      0       0
 2010      1      0       1
 2010      1      0       1
 2011      2      1       1
 2011      2      0       1
 2011      1      0       0
 2011      1      0       0

Any help on how to achieve this is much appreciated.

Thanks, Anup

Upvotes: 2

Views: 196

Answers (1)

Roland

Reputation: 132864

Use ddply with transform and any:

DF <- read.table(text=" year     id     var     ans
 2010      1      1       1
 2010      2      0       0
 2010      1      0       1
 2010      1      0       1
 2011      2      1       1
 2011      2      0       1
 2011      1      0       0
 2011      1      0       0", header=TRUE)

library(plyr)
ddply(DF, .(year, id), transform, ans2 = as.numeric(any(var == 1)))

#   year id var ans ans2
# 1 2010  1   1   1    1
# 2 2010  1   0   1    1
# 3 2010  1   0   1    1
# 4 2010  2   0   0    0
# 5 2011  1   0   0    0
# 6 2011  1   0   0    0
# 7 2011  2   1   1    1
# 8 2011  2   0   1    1

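Note that ddply returns the rows sorted by the grouping variables (year, id), so the original row order is not preserved. This is by design.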
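If the original row order needs to be kept, one possible alternative is base R's ave, which computes a value per group and returns it aligned with the input rows. This is only a minimal sketch on the sample data from the question, not something the approach above depends on; the grouping by year and id mirrors the ddply call.

```r
# Sample data from the question, without the precomputed ans column
DF <- read.table(text = "year id var
2010 1 1
2010 2 0
2010 1 0
2010 1 0
2011 2 1
2011 2 0
2011 1 0
2011 1 0", header = TRUE)

# ave() applies FUN within each year/id group and places the result
# back at the positions of the original rows, so nothing is reordered
DF$ans <- ave(DF$var, DF$year, DF$id,
              FUN = function(v) as.numeric(any(v == 1)))
DF
```

On the sample data this should give the same flags as the expected ans column, in the same row order as the input.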

Upvotes: 1
