loop calculation as it go in r

Question

I am having difficulty executing a calculation that is defined iteratively. The following data serves as an example (actual data set much larger):

## DATA ##
# Columns
   Individual<-c("A","B","C","D","E","F","G","H1","H2","H3","H4","H5","K1","K2","K3","K4","K5")
   P1<-c(0,0,"A",0,"C","C",0, rep("E",5),"H1","H2","H3","H4","H5")
   P2<-c(0,0,"B",0,"D", "E",0,rep("G",5),"H1","H2","H3","H4","H5")
# Dataframe
   myd<-data.frame(Individual,P1,P2,stringsAsFactors=FALSE)


   Individual P1 P2
1           A  0  0
2           B  0  0
3           C  A  B
4           D  0  0
5           E  C  D
6           F  C  E
7           G  0  0
8          H1  E  G
9          H2  E  G
10         H3  E  G
11         H4  E  G
12         H5  E  G
13         K1 H1 H1
14         K2 H2 H2
15         K3 H3 H3
16         K4 H4 H4
17         K5 H5 H5

The data represents the relationship between and Individual and the two parents, P1, P2.

The calculation required, labeled relationA, represents how much each individual is related to A.

By definition, the relationship between A and A is given a value of 1. The values for all other individuals need to be computed based on the information in the table, as follows:

The value of relationA for an individual should be equal to 
   1/2 (the value of relationA of P1 of the individual)  
 + 1/2 (the value of relationA of P2 of the individual)

FOR EXAMPLE

  Individual P1 P2      relationA
1           A  0  0       1
2           B  0  0       0
3           C  A  B       (A = 1 + B = 0)/2 = 0.5
4           D  0  0       0
5           E  C  D       (C= 0.5 + D = 0)/2 = 0.25
6           F  C  E       (C = 0.5 + E = 0.25)/2 = 0.375

The expected output is as the following:

 Individual P1 P2  relationA
1           A  0  0   1
2           B  0  0   0
3           C  A  B   0.5
4           D  0  0   0
5           E  C  D   0.25
6           F  C  E   0.375
7           G  0  0   0 
8          H1  E  G   0.125
9          H2  E  G   0.125
10         H3  E  G   0.125
11         H4  E  G   0.125
12         H5  E  G   0.125
13         K1 H1 H1   0.125
14         K2 H2 H2   0.125
15         K3 H3 H3   0.125
16         K4 H4 H4   0.125
17         K5 H5 H5   0.125

My difficulty is in expressing this in a proper manner in R. Any help would be appreciated.

Ricardo Saporta · Accepted Answer

Edit:

more succinctly, you could use sapply and rowSums to do away with the for-loop into a single line of code:

# Initialize values of relationA
myd$relationA <- 0
myd$relationA[myd$Individual=="A"] <- 1

# Calculate relationA
myd$relationA <-   myd$relationA + rowSums(sapply(myd$Individual, function(indiv) 
     myd$relationA[myd$Individual==indiv]/2 * ((myd$P1==indiv) + (myd$P2==indiv))))

Is what you are looking for something like this?

# Initialize values of relationA
myd$relationA <- 0
myd$relationA[myd$Individual=="A"] <- 1


# Iterate over all Individuals
for (indiv in myd$Individual) {

  indiVal <- myd$relationA[myd$Individual==indiv]

  # all columns handled at once, thanks to vectorization;  no need for myd$P1[i]
  myd$relationA <- myd$relationA  + 
                 indiVal/2 * ((myd$P1==indiv) + (myd$P2==indiv))
}

Output

myd
   Individual P1 P2 relationA
1           A  0  0     1.000
2           B  0  0     0.000
3           C  A  B     0.500
4           D  0  0     0.000
5           E  C  D     0.250
6           F  C  E     0.375
7           G  0  0     0.000
8          H1  E  G     0.125
9          H2  E  G     0.125
...

loop calculation as it go in r

Answers (2)

Edit:

Related Questions