Reputation: 113
I have a data frame:
DF
Chset Choices X1 X2 utility
1 1 8 1 1 2
2 1 2 0 1 3
3 1 1 1 0 -1
4 2 1 1 1 2
5 2 5 0 1 5
6 2 1 1 0 -1
7 2 2 0 0 0
8 3 1 1 1 2
9 3 2 0 1 6
10 3 5 1 0 -1
11 4 6 1 1 2
12 4 1 0 1 14
13 4 1 1 0 -1
14 4 1 0 0 0
I want to create a column "predict" that is 1 where utility is the maximum within its Chset and 0 otherwise. For example, the 3 rows with Chset=1 have utilities (2, 3, -1), so "predict" should be (0, 1, 0): row 2 gets 1 because it has the maximum utility in Chset=1. The same applies to the other sets:
Chset Choices X1 X2 utility predict
1 1 8 1 1 2 0
2 1 2 0 1 3 1
3 1 1 1 0 -1 0
4 2 1 1 1 2 0
5 2 5 0 1 5 1
6 2 1 1 0 -1 0
7 2 2 0 0 0 0
8 3 1 1 1 2 0
9 3 2 0 1 6 1
10 3 5 1 0 -1 0
11 4 6 1 1 2 0
12 4 1 0 1 14 1
13 4 1 1 0 -1 0
14 4 1 0 0 0 0
After that, I want to check whether the prediction is right. The prediction is correct if predict=1 and the value in column "Choices" is the maximum in its "Chset". For example, in Chset=1, "predict"=1 for the 2nd row, whereas the maximum "Choices" in Chset=1 is in the 1st row (it equals 8), so the prediction is incorrect. By contrast, in Chset=2, "predict" equals 1 for the 5th row, and this row also has the maximum value of "Choices" within Chset=2, so here the prediction is correct. To check all cases, I want to create a column "check" which equals 1 if the prediction is correct and 0 otherwise. Finally, I should get:
Chset Choices X1 X2 utility predict check
1 1 8 1 1 2 0 0
2 1 2 0 1 3 1 0
3 1 1 1 0 -1 0 0
4 2 1 1 1 2 0 0
5 2 5 0 1 5 1 1
6 2 1 1 0 -1 0 0
7 2 2 0 0 0 0 0
8 3 1 1 1 2 0 0
9 3 2 0 1 6 1 0
10 3 5 1 0 -1 0 0
11 4 6 1 1 2 0 0
12 4 1 0 1 14 1 0
13 4 1 1 0 -1 0 0
14 4 1 0 0 0 0 0
How can I do that?
Waiting for your help
Upvotes: 0
Views: 109
Reputation: 7839
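This should do it:
DF <-
  unsplit(lapply(split(DF, DF$Chset),   # one sub-data-frame per Chset
                 function(x) within(x, {
                   # 1 for the row(s) with the highest utility in this Chset
                   predict <- as.numeric(utility == max(utility))
                   # 1 only if that row also has the highest Choices in this Chset
                   check <- as.numeric(Choices == max(Choices) & predict == 1)
                 })),
          DF$Chset)                     # put the rows back in their original order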
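As an alternative sketch (not part of the approach above), the same two columns can probably be built without splitting the data frame, using base R's ave() to compute the per-Chset maximum for every row. The data.frame() call below just retypes the example data from the question so the snippet runs on its own; the logic is otherwise the same comparison against the group maximum.

```r
# Example data retyped from the question so the snippet is self-contained
DF <- data.frame(
  Chset   = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4),
  Choices = c(8, 2, 1, 1, 5, 1, 2, 1, 2, 5, 6, 1, 1, 1),
  X1      = c(1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0),
  X2      = c(1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0),
  utility = c(2, 3, -1, 2, 5, -1, 0, 2, 6, -1, 2, 14, -1, 0)
)

# ave() returns, for each row, FUN applied to that row's group (here: Chset)
max_utility <- ave(DF$utility, DF$Chset, FUN = max)
max_choices <- ave(DF$Choices, DF$Chset, FUN = max)

# 1 where the row has the highest utility in its Chset
DF$predict <- as.numeric(DF$utility == max_utility)

# 1 where the predicted row also has the highest Choices in its Chset
DF$check <- as.numeric(DF$predict == 1 & DF$Choices == max_choices)

print(DF)
```

Note that with either approach, if two rows in a Chset tie for the maximum utility, both get predict = 1.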
Upvotes: 1