Reputation: 8848
Consider the following dataframe:
df = data.frame(cusip = paste("A", 1:10, sep = ""), xt = c(1,2,3,2,3,5,2,4,5,1), xt1 = c(1,4,2,1,1,4,2,2,2,5))
The data is divided into five states, which are quantiles in reality: 1, 2, 3, 4, 5. The column xt holds the state at time t, and the column xt1 holds the state at time t+1 (the first column, cusip, is just an identifier).
I would like to compute a sort of transition matrix for the five states, with the states at time t as rows and the states at time t+1 as columns.
I am really not sure how to do this in an efficient way. I have the feeling the answer is trivial, but I just can't get my head around it.
Could anyone please help?
Upvotes: 5
Views: 683
Reputation: 23129
If you want all the states (1..5) to appear as columns of the transition matrix, including any that never occur at time t+1 (state 3 in this data), you can try this:
levs <- sort(union(df$xt, df$xt1))
tbl <- table(factor(df$xt, levs), factor(df$xt1, levs))
tbl / rowSums(tbl)
#             1         2         3         4         5
#   1 0.5000000 0.0000000 0.0000000 0.0000000 0.5000000
#   2 0.3333333 0.3333333 0.0000000 0.3333333 0.0000000
#   3 0.5000000 0.5000000 0.0000000 0.0000000 0.0000000
#   4 0.0000000 1.0000000 0.0000000 0.0000000 0.0000000
#   5 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000
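As a possible alternative to dividing by rowSums(), base R's prop.table() with margin = 1 should give the same row-normalised table. The sketch below rebuilds df from the question so it runs on its own; the name trans is only introduced here for illustration. One caveat, stated as an assumption about other data rather than this one: if a state in levs never occurred in xt, its row count would be zero and that row would come out as NaN.

```r
# Data from the question
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 1),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 5))

# Use the same levels for both columns so the table is square (5 x 5)
levs <- sort(union(df$xt, df$xt1))
tbl  <- table(factor(df$xt, levs), factor(df$xt1, levs))

# margin = 1 normalises each row by its own total
trans <- prop.table(tbl, margin = 1)

# Sanity check: every row of a transition matrix should sum to 1
rowSums(trans)
```

For example, state 2 occurs three times in xt and moves once each to states 1, 2 and 4, so trans["2", "4"] is 1/3.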
Upvotes: 0
Reputation: 162431
res <- with(df, table(xt, xt1)) ## table() to form transition matrix
res/rowSums(res) ## /rowSums() to normalize by row
#    xt1
# xt          1         2         4         5
#   1 0.5000000 0.0000000 0.0000000 0.5000000
#   2 0.3333333 0.3333333 0.3333333 0.0000000
#   3 0.5000000 0.5000000 0.0000000 0.0000000
#   4 0.0000000 1.0000000 0.0000000 0.0000000
#   5 0.0000000 0.5000000 0.5000000 0.0000000
## As an alternative to 2nd line above, use sweep(), which won't rely on
## implicit recycling of vector returned by rowSums(res)
sweep(res, MARGIN = 1, STATS = rowSums(res), FUN = `/`)
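To illustrate the recycling remark: R stores matrices column by column, so a vector with one entry per row lines up with the rows when it is recycled down each column. The small sketch below uses a made-up 2 x 2 matrix (m, by_recycling and by_sweep are illustrative names, not from the answer) to show that both forms agree.

```r
# Toy matrix with rows (1, 1) and (3, 1); row sums are 2 and 4
m <- matrix(c(1, 3, 1, 1), nrow = 2)

# Implicit recycling: rowSums(m) is reused down each column,
# so element [i, j] is divided by the sum of row i
by_recycling <- m / rowSums(m)

# Explicit version: MARGIN = 1 says the statistics apply to rows
by_sweep <- sweep(m, MARGIN = 1, STATS = rowSums(m), FUN = `/`)

by_recycling
#      [,1] [,2]
# [1,] 0.50 0.50
# [2,] 0.75 0.25

all.equal(by_recycling, by_sweep)
```

The recycling form only works this way because the vector length equals the number of rows; sweep() states the intended margin explicitly, which is arguably easier to read back later.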
Upvotes: 5