Reputation: 666
I have a data.frame with 6 columns. The first identifies subjects, the second identifies blocks in an experiment, and columns 3, 4 and 5 hold the values I need to calculate a binary score (0 or 1). I want to store that score in the sixth column, which is why it is currently full of 0s.
head(kfdblock3to9)
subject time gr ugr sdugr IL
40002.3 40002 3 0.4475618 0.3706000 0.02994533 0
40002.4 40002 4 0.4361786 0.3901111 0.01846110 0
40002.5 40002 5 0.4279880 0.4550000 0.02811839 0
40002.6 40002 6 0.4313647 0.4134444 0.04352974 0
40002.7 40002 7 0.4420889 0.4394286 0.02883143 0
40002.8 40002 8 0.4325227 0.3960000 0.06559222 0
I'm trying to do this with a for loop, but I'm a beginner in R and I'm having difficulties. The scoring rule I'm trying to implement is: if the value in column 3 ($gr) is less than the value in column 4 ($ugr) minus .35 times the value in column 5 ($sdugr), the subject receives a 1, otherwise a 0.
What I've tried so far is:
for (i in kfdblock3to9$subject) {
if (kfdblock3to9$gr<(kfdblock3to9$ugr-(.35*kfdblock3to9$sdugr)))
kfdblock3to9$IL=1
else kfdblock3to9$IL=0
}
This gives me 50 warnings, all saying: "the condition has length > 1 and only the first element will be used"
I suppose I'm doing something wrong with the indexes then, but I haven't been able to figure it out. Any help is much appreciated.
Upvotes: 1
Views: 165
Reputation: 4180
You shouldn't use a loop in this case. Whenever you do use a loop in the future, you need to index each row with the loop variable:
for (i in seq_len(nrow(kfdblock3to9))) {
if (kfdblock3to9[i,"gr"] < (kfdblock3to9[i, "ugr"] - .35 * kfdblock3to9[i, "sdugr"]))
kfdblock3to9[i,"IL"]=1
else kfdblock3to9[i,"IL"]=0
}
kfdblock3to9
subject time gr ugr sdugr IL
40002.3 40002 3 0.4475618 0.3706000 0.02994533 0
40002.4 40002 4 0.4361786 0.3901111 0.01846110 0
40002.5 40002 5 0.4279880 0.4550000 0.02811839 1
40002.6 40002 6 0.4313647 0.4134444 0.04352974 0
40002.7 40002 7 0.4420889 0.4394286 0.02883143 0
40002.8 40002 8 0.4325227 0.3960000 0.06559222 0
Upvotes: 0
Reputation: 327
What you want is a logical test. You can thus avoid the loop, and even ifelse(), and simply do:
kfdblock3to9$IL <- with(kfdblock3to9, gr < (ugr-0.35*sdugr))
The IL column will contain TRUE or FALSE instead of 1 or 0. If you prefer integers, you can do:
kfdblock3to9$IL <- as.integer(with(kfdblock3to9, gr < (ugr-0.35*sdugr)))
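As a quick check, the six rows printed in the question can be rebuilt by hand and scored with the same expression. This is only a sketch: the name d is made up for illustration, and the values are copied from the head() output in the question rather than from the full data set.

```r
# Rebuild the six rows shown in the question (values copied from its head() output)
d <- data.frame(
  subject = rep(40002, 6),
  time    = 3:8,
  gr      = c(0.4475618, 0.4361786, 0.4279880, 0.4313647, 0.4420889, 0.4325227),
  ugr     = c(0.3706000, 0.3901111, 0.4550000, 0.4134444, 0.4394286, 0.3960000),
  sdugr   = c(0.02994533, 0.01846110, 0.02811839, 0.04352974, 0.02883143, 0.06559222)
)

# The comparison is evaluated for every row at once, then converted to 0/1
d$IL <- as.integer(with(d, gr < (ugr - 0.35 * sdugr)))
d$IL
```

For these six rows only the third one (time 5) satisfies the condition, so it is the only row that receives a 1.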
Upvotes: 2
Reputation: 25736
To solve your problem I would suggest something like this:
kfdblock3to9[, "IL"] <- ifelse(kfdblock3to9$gr < (kfdblock3to9$ugr - (0.35 * kfdblock3to9$sdugr)), 1, 0)
(A vectorized approach is usually faster than a loop.)
Your loop is wrong because you never use your index i. You have to use i to access the row inside the loop:
# Iterate over row numbers; seq(along = df) would count columns, not rows
for (i in seq_len(nrow(kfdblock3to9))) {
cat("row:", i, kfdblock3to9[i, "subject"], "\n")
}
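One thing to watch when building the index: a data frame is a list of columns, so length() and seq_along() count columns, while nrow() counts rows. A minimal sketch with a made-up data frame (d and its values are hypothetical) shows the difference:

```r
# Hypothetical data frame with 3 rows and 2 columns
d <- data.frame(subject = c(1, 2, 3), gr = c(0.1, 0.2, 0.3))

cols <- seq_along(d)        # one index per column
rows <- seq_len(nrow(d))    # one index per row

length(cols)  # equals ncol(d)
length(rows)  # equals nrow(d)

# Loop over rows, using i to pick out the current row
for (i in rows) {
  cat("row:", i, d[i, "subject"], "\n")
}
```

seq_len(nrow(d)) also behaves sensibly for an empty data frame, where 1:nrow(d) would count down from 1 to 0.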
Upvotes: 2
Reputation: 108523
Take a look at within() and ifelse():
kfdblock3to9 <-
within(kfdblock3to9,
IL <- ifelse(gr < ugr - 0.35 * sdugr, 1, 0)
)
within() isn't strictly necessary, but it makes your code a whole lot more readable and easier to understand.
Why does it go wrong? Because your condition is vectorized. Try:
kfdblock3to9$gr<(kfdblock3to9$ugr-(.35*kfdblock3to9$sdugr))
and you will see that it returns a logical vector. An if() clause can only deal with a single logical value at a time. If you have a vectorized result, you need a vectorized solution, and that is ifelse().
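The difference can be sketched on a small made-up vector (x and the 0.6 cutoff are hypothetical, chosen only for illustration). ifelse() handles the whole logical vector in one call, whereas if() has to be given one element at a time inside a loop:

```r
x <- c(0.2, 0.5, 0.9)    # made-up values
cond <- x < 0.6          # logical vector with one entry per element of x

# Vectorized: ifelse() picks 1 or 0 for each element of cond
il <- ifelse(cond, 1, 0)

# Loop version: if() receives a single TRUE/FALSE because of the [i] index
il_loop <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] < 0.6) {
    il_loop[i] <- 1
  } else {
    il_loop[i] <- 0
  }
}

identical(il, il_loop)   # both approaches give the same scores
```

Note that passing the whole vector cond to if() is what triggers the "condition has length > 1" message; from R 4.2.0 onward this is an error rather than a warning.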
Upvotes: 2