Reputation: 33
This is what my data looks like:
ID XYZ N_ID1 N_ID2
1 10 3 4
2 20 8 2
3 50 6 5
4 100 2 6
5 70 7 10
6 25 1 3
7 30 2 4
8 35 6 9
. . . .
. . . .
. . . .
So, I have two variables here called 'N_ID1' and 'N_ID2' which are the 2 neighbors of the 'ID' variable.
I want to create a new variable based on: 'XYZ', 'N_ID1', 'N_ID2', such that the new variable is an average of 'XYZ' for the values at ID location of 'N_ID1' and 'N_ID2'.
So, if we look at the first row where ID=1, there, 'N_ID1' = 3, 'N_ID2'= 4. Now, my new variable should be an average of 'XYZ' value at ID=3 and 'XYZ' value at ID=4. Similarly for other rows as well..
This is what my final result should look like:
ID XYZ N_ID1 N_ID2 New_Variable
1 10 3 4 (50+100)/2 = 75
2 20 8 2 (35+20)/2 = 27.5
3 50 6 5 (25+70)/2 = 47.5
4 100 2 6 .
5 70 7 10 .
6 25 1 3 .
7 30 2 4 .
8 35 6 9 .
. . . . .
. . . . .
. . . . .
So, as you can see above, the first value in the 'New_Variable' = 75 which is the average of ID#3 and ID#4 for 'XYZ'
Can anyone please tell me how to do this in R?
Upvotes: 3
Views: 70
Reputation: 93833
match
each N_IDx
to ID
, subset XYZ
, add +
, and divide.
Reduce(`+`,
lapply(dat[c("N_ID1","N_ID2")], function(x) dat$XYZ[match(x,dat$ID)] )
) / 2
#[1] 75.0 27.5 47.5 22.5 NA 30.0 60.0 NA
Without the functional approach, it would just be:
with(dat, (XYZ[match(N_ID1, ID)] + XYZ[match(N_ID2, ID)]) / 2 )
But this will become painful if you have many variables to sum.
Upvotes: 4