R - Construct conditional probability matrix

Question

I have a data frame with four columns: the first two identify a user and a product, and the last two are conditional probabilities. My final data frame looks like this:

     id1  id2  p(id2|id1)  p(id1|id2)
1    1    1    0.1111111   4.290376e-04
2    1    2    0.22222222  8.286866e-03
3    1    3    0.22222222  2.639876e-04
4    1    4    0.44444444  2.850284e-03
5    2    1    0.09090909  1.644470e-03
6    2    5    0.2727273   3.286420e-04
7    2    6    0.4545455   1.002740e-03
8    2    3    0.1818182   1.738019e-05

with many more users following. As you can see, the same id1 can have several different values of id2. I want to find the probability of getting a certain id2, given that a user already has some other id2, i.e. I am interested in finding

p(id2 = x | id2 = y) = sum_id1 ( p(id2 = x | id1 ) * p(id1 | id2 = y) )

and constructing it as a matrix for all x (rows) and y (columns). In this case there are 6 distinct values of id2, so the resulting matrix should look something like this:

     1             2              3             4       5      6
1    NA            0.0009207628   3.091197e-05  ....    ....   ....
2    9.534169e-05  NA             ...
3    0.0003943363  ...
4    ...
5    ...
6    ...

We get the element (1,2) as

p(id2=1 | id2 = 2) = p(id2 = 1 | id1 = 1) * p(id1 = 1 | id2 = 2) 
= 0.1111111*8.286866e-03 = 0.0009207628.

For element (1,3) we get

p(id2 = 1 | id2 = 3) = p(id2 = 1 | id1 = 1) * p(id1 = 1 | id2 = 3)
+ p(id2 = 1 | id1 = 2) * p(id1 = 2 | id2 = 3) 
= 0.1111111 * 2.639876e-04 + 0.09090909 * 1.738019e-05 = 3.091197e-05
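
As a quick sanity check, the two hand-computed elements above can be reproduced directly in R. The variable names `e12` and `e13` are only illustrative, and the numbers are copied from the example rows shown earlier:

```r
# Element (1,2): in the rows shown, only id1 = 1 has both id2 = 1 and id2 = 2.
e12 <- 0.1111111 * 8.286866e-03

# Element (1,3): both id1 = 1 and id1 = 2 have id2 = 1 and id2 = 3,
# so the sum over id1 has two terms.
e13 <- 0.1111111 * 2.639876e-04 + 0.09090909 * 1.738019e-05

print(signif(c(e12, e13), 7))
```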

I hope it is clear what I want to accomplish. Does anyone have any idea how I can construct this matrix in R?

Thanks in advance
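
For reference, one possible approach is sketched here, not a definitive solution. The sum over id1 in the formula has the form of a matrix product, so one option is to spread each probability column into a matrix and multiply. This assumes each (id1, id2) pair appears at most once, and that pairs absent from the data have probability 0. The column names `p_id2_given_id1` and `p_id1_given_id2` are made-up stand-ins for `p(id2|id1)` and `p(id1|id2)`:

```r
# Example rows copied from the question (only the first two users).
df <- data.frame(
  id1 = c(1, 1, 1, 1, 2, 2, 2, 2),
  id2 = c(1, 2, 3, 4, 1, 5, 6, 3),
  p_id2_given_id1 = c(0.1111111, 0.22222222, 0.22222222, 0.44444444,
                      0.09090909, 0.2727273, 0.4545455, 0.1818182),
  p_id1_given_id2 = c(4.290376e-04, 8.286866e-03, 2.639876e-04, 2.850284e-03,
                      1.644470e-03, 3.286420e-04, 1.002740e-03, 1.738019e-05)
)

# A[x, i] = p(id2 = x | id1 = i); (id1, id2) pairs not in df become 0.
A <- xtabs(p_id2_given_id1 ~ id2 + id1, data = df)

# B[i, y] = p(id1 = i | id2 = y); same zero-fill for missing pairs.
B <- xtabs(p_id1_given_id2 ~ id1 + id2, data = df)

# M[x, y] = sum over i of A[x, i] * B[i, y], i.e. the formula in the question.
M <- unclass(A) %*% unclass(B)

# The question leaves the diagonal (x == y) undefined.
diag(M) <- NA

print(signif(M, 7))
```

With the example rows, `M["1", "2"]` and `M["1", "3"]` should agree with the hand-computed elements (1,2) and (1,3). For a large number of users, the dense matrices built by `xtabs` may become too big, in which case a sparse representation (for example via the Matrix package) might be worth considering.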
