Reputation: 933
I have a large dataset and want to add a new column, df1$Occurance, with binary values (0 and 1). A row should get 1 if either of the following holds:
df1$seg.mean >= 0.5 and df1$id == "gain", or
df1$seg.mean <= -0.5 and df1$id == "loss".
Rows that do not satisfy either criterion should get df1$Occurance = 0.
df1 <-
Chr start end num.mark seg.mean id
1 68580000 68640000 8430 0.7 gain
1 115900000 116260000 8430 0.0039 loss
1 173500000 173680000 5 -1.7738 loss
1 173500000 173680000 12 0.011 loss
1 173840000 174010000 6 -1.6121 loss
Desired output:
Chr start end num.mark seg.mean id Occurance
1 68580000 68640000 8430 0.7 gain 1
1 115900000 116260000 8430 0.0039 loss 0
1 173500000 173680000 5 -1.7738 loss 1
1 173500000 173680000 12 0.011 loss 0
1 173840000 174010000 6 -1.6121 loss 1
Upvotes: 2
Views: 117
Reputation: 21497
Try using ifelse
df1$Occurance <- ifelse((df1$seg.mean >= 0.5 & df1$id == "gain") |
(df1$seg.mean <= -0.5 & df1$id == "loss"), 1, 0)
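For anyone who wants to run this end to end, here is a minimal sketch. It assumes the sample rows from the question are typed in by hand as a data frame, and the intermediate name hit is only illustrative.

```r
# Sample data rebuilt by hand from the table in the question
df1 <- data.frame(
  Chr      = c(1, 1, 1, 1, 1),
  start    = c(68580000, 115900000, 173500000, 173500000, 173840000),
  end      = c(68640000, 116260000, 173680000, 173680000, 174010000),
  num.mark = c(8430, 8430, 5, 12, 6),
  seg.mean = c(0.7, 0.0039, -1.7738, 0.011, -1.6121),
  id       = c("gain", "loss", "loss", "loss", "loss"),
  stringsAsFactors = FALSE
)

# TRUE for a strong gain or a strong loss (hypothetical helper name)
hit <- (df1$seg.mean >= 0.5 & df1$id == "gain") |
       (df1$seg.mean <= -0.5 & df1$id == "loss")

# Map TRUE/FALSE to 1/0
df1$Occurance <- ifelse(hit, 1, 0)
print(df1$Occurance)
# [1] 1 0 1 0 1
```

Storing the condition in its own variable first makes it easier to inspect which rows matched before writing the column.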
Edit: To avoid ifelse, and to avoid typing df1$ every time, you can use transform (assign the result back to df1 if you want to keep the new column):
transform(df1, Occurance = as.numeric((seg.mean >= 0.5 & id == "gain") |
(seg.mean <= -0.5 & id == "loss")))
Comment: If you also accept TRUE/FALSE instead of 1/0, you can skip the as.numeric.
Edit #2: If you want multiple outcomes such as -1, 0, and 1, you can do the following:
df1$Occurance = 0
within(df1, {Occurance[seg.mean >= 0.5 & id == "gain"] <- 1;
Occurance[seg.mean <= -0.5 & id == "loss"] <- -1})
which results in
Chr start end num.mark seg.mean id Occurance
1 1 68580000 68640000 8430 0.7000 gain 1
2 1 115900000 116260000 8430 0.0039 loss 0
3 1 173500000 173680000 5 -1.7738 loss -1
4 1 173500000 173680000 12 0.0110 loss 0
5 1 173840000 174010000 6 -1.6121 loss -1
Upvotes: 4
Reputation: 1678
You can also do:
df1$Occurance[with(df1, (seg.mean >= .5 & id == "gain") | (seg.mean <= -.5 & id == "loss"))] <- 1
df1$Occurance[is.na(df1$Occurance)] <- 0
Upvotes: -1
Reputation: 5951
Try this:
df1$Occurance <- ((df1$seg.mean >= 0.5 & df1$id == "gain") |
(df1$seg.mean <= -0.5 & df1$id == "loss")) * 1
# TRUE*1 = 1
# FALSE*1 = 0
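To illustrate the coercion this relies on, here is a small sketch using a made-up logical vector (cond is hypothetical, not taken from the data). Multiplying by 1 and calling as.numeric should give the same numeric result, while as.integer gives integer storage instead.

```r
# Hypothetical logical vector standing in for the combined condition
cond <- c(TRUE, FALSE, TRUE)

by_mult <- cond * 1          # numeric: 1 0 1
by_num  <- as.numeric(cond)  # numeric: 1 0 1
by_int  <- as.integer(cond)  # integer: 1 0 1
```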
Upvotes: 2