Natalia

Reputation: 399

compute weighted transition matrix in r

I found this question interesting: Transition matrix

Following the setup in that question, suppose I add a weight (xt2) to each row:

df <- data.frame(cusip = paste("A", 1:10, sep = ""), xt = c(1,2,3,2,3,5,2,4,5,5), xt1 = c(1,4,2,1,1,4,2,2,2,2), xt2 = 1:10)
df
   cusip xt xt1 xt2
1     A1  1   1   1
2     A2  2   4   2
3     A3  3   2   3
4     A4  2   1   4
5     A5  3   1   5
6     A6  5   4   6
7     A7  2   2   7
8     A8  4   2   8
9     A9  5   2   9
10   A10  5   2  10

Using the answer in that post, we get the transition matrix:

res <- with(df, table(xt, xt1))
res
    xt1
 xt  1 2 4
   1 1 0 0
   2 1 1 1
   3 1 1 0
   4 0 1 0
   5 0 2 1
result <- res/rowSums(res)
result
        xt1
 xt          1         2         4
   1 1.0000000 0.0000000 0.0000000
   2 0.3333333 0.3333333 0.3333333
   3 0.5000000 0.5000000 0.0000000
   4 0.0000000 1.0000000 0.0000000
   5 0.0000000 0.6666667 0.3333333

But what if I want to compute the transition matrix weighted by the xt2 column? That is, when building res, instead of counting how often each change of state occurs, I want to sum the weights of the rows that fall into each cell. For example, res[2,1] should be 4, and res[5,2] should be 9 + 10 = 19. The res I want would therefore look like this:

    xt1
 xt   1  2  4
   1  1  0  0
   2  4  7  2
   3  5  3  0
   4  0  8  0
   5  0 19  6

Then result can be calculated with the same code as above. How can I produce that res? Thank you.

P.S. Is there any other way to "weight" the transition matrix?

Upvotes: 1

Views: 599

Answers (1)

akrun

Reputation: 887691

We can use xtabs. With the formula method, the cross-classifying variables go on the right-hand side of ~ and the vector of counts (here the weights, xt2) on the left-hand side. By default, it sums the left-hand-side values within each cell.

xtabs(xt2~xt+xt1, df)
#    xt1
#xt   1  2  4
#  1  1  0  0
#  2  4  7  2
#  3  5  3  0
#  4  0  8  0
#  5  0 19  6
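
To get from these weighted sums to the weighted transition matrix the question asks for, each row can be divided by its row total, just like res/rowSums(res) in the question. A minimal sketch, assuming the df from the question; the names w and w_trans are illustrative only:

```r
# Data from the question
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 5),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 2),
                 xt2 = 1:10)

# Weighted counts: sum of xt2 for each (xt, xt1) pair
w <- xtabs(xt2 ~ xt + xt1, data = df)

# Row-normalise so that each row sums to 1
# (equivalent to w / rowSums(w))
w_trans <- prop.table(w, margin = 1)
w_trans
```

If some state had a total weight of 0, its row would come out as NaN and would need separate handling; that does not happen with this data.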

Or with tapply, we group by 'xt' and 'xt1' and specify FUN as sum. Combinations that do not occur in the data come back as NA, which can be replaced with 0 if necessary.

with(df, tapply(xt2, list(xt, xt1), FUN=sum))
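
The NA cells can be set to 0 by logical indexing on the resulting matrix. A short sketch, again assuming the df from the question, with w as an illustrative name:

```r
# Data from the question
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 5),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 2),
                 xt2 = 1:10)

# Sum of xt2 per (xt, xt1) pair; unobserved pairs are NA
w <- with(df, tapply(xt2, list(xt, xt1), FUN = sum))

# Replace the NA cells with 0 so the matrix can be row-normalised
w[is.na(w)] <- 0
w
```

After this step, w / rowSums(w) gives the same row-normalised matrix as the xtabs approach.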

Or we can use acast from reshape2. We reshape from 'long' to 'wide' by specifying the formula and the value.var column.

library(reshape2)
acast(df, xt~xt1, value.var='xt2', fun.aggregate=sum)

Upvotes: 2
