Reputation: 562
Assume a data.frame:
df <- data.frame(name = c("a","b","c","d","e"),rank = c(1,1,4,3,2))
name rank
a 1
b 1
c 4
d 3
e 2
Based on the above data.frame, I want to create a new one that holds the count of transitions from one rank to another. So the output would be something like this:
name 1to1 1to2 1to3 1to4 2to1 2to2 2to3 2to4 3to1 3to2 3to3 3to4 4to1 4to2 4to3 4to4
1 b 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
2 c NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA
3 d NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA
4 e NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA
One way to do this would be to run a for loop and then use ifs, but I am pretty sure there should be a more efficient way of doing this.
For example, if item d has a rank of 3 and item c is ranked as 4, then the code should increase the count of the 4to3 column under d's row (as per example above). Please let me know if this is unclear and I appreciate all the help.
P.S. colnames are not that important.
Upvotes: 1
Views: 59
Reputation: 73552
You could use Map to create sequences for extracting the transitions and collapse them into the desired form using paste.
tmp <- sapply(Map(seq, 1:(nrow(df1)-1), 2:nrow(df1)), function(i) df1$rank[i])
v <- apply(tmp, 2, function(x) paste(x, collapse="to"))
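As a cross-check on the step above, the same transition labels can probably be built without Map by pasting each rank to the one that follows it. This is only a sketch of an alternative, not part of the approach shown here; ranks and v2 are names introduced purely for illustration.

```r
# Sample data, rebuilt here so the snippet runs on its own
df1 <- data.frame(name = c("a", "b", "c", "d", "e"),
                  rank = c(1, 1, 4, 3, 2))

ranks <- df1$rank  # illustrative helper name, not from the original code

# Pair every rank (except the last) with its successor (every rank except the first)
v2 <- paste(head(ranks, -1), tail(ranks, -1), sep = "to")
v2
# [1] "1to1" "1to4" "4to3" "3to2"
```

The result should match v, one label per row from b onwards.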
Then create a grid with all possible pairs of ranks (every from/to combination):
to <- apply(expand.grid(1:4, 1:4), 1, function(x) paste(x, collapse="to"))
Compare them with the actual transitions to get the resulting binary structure, and create a data frame out of it:
res <- data.frame(name=df1$name[-1], t(sapply(v, function(i) setNames(+(i == to), to))))
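To see what the comparison does for a single transition, here is a minimal sketch using the label "4to3" (the transition for row d). The unary + turns the logical vector into 0/1, and setNames attaches the labels; one_row is a name made up for this illustration.

```r
# All 16 from/to labels, built the same way as above
to <- apply(expand.grid(1:4, 1:4), 1, function(x) paste(x, collapse = "to"))

# Compare one observed transition against every label; + coerces TRUE/FALSE to 1/0
one_row <- setNames(+("4to3" == to), to)

# Only the matching label is flagged
names(one_row)[one_row == 1]
# [1] "4to3"
```

sapply repeats this for every element of v, and t() turns the resulting columns into rows.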
Afterwards, you may convert the zeroes to NA using:
res[res == 0] <- NA
res
# name X1to1 X2to1 X3to1 X4to1 X1to2 X2to2 X3to2 X4to2 X1to3 X2to3 X3to3 X4to3 X1to4 X2to4 X3to4 X4to4
# 1to1 b 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
# 1to4 c NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA
# 4to3 d NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA
# 3to2 e NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA
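If a more direct construction is preferred, the same table can probably be filled by matrix indexing, starting from an all-NA matrix so that no zero-to-NA conversion is needed. This is a sketch under the assumption that ranks run from 1 to 4; labels, m, idx and res2 are illustrative names, not part of the code above.

```r
# Sample data, rebuilt here so the snippet runs on its own
df1 <- data.frame(name = c("a", "b", "c", "d", "e"),
                  rank = c(1, 1, 4, 3, 2))

n <- nrow(df1)

# All from/to labels, in the same order as the grid above ("1to1", "2to1", ...)
labels <- as.vector(outer(1:4, 1:4, paste, sep = "to"))

# Observed transitions: previous row's rank to the current row's rank
idx <- match(paste(df1$rank[-n], df1$rank[-1], sep = "to"), labels)

# Start with NA everywhere and flag one cell per row via (row, column) indexing
m <- matrix(NA_integer_, nrow = n - 1, ncol = length(labels),
            dimnames = list(NULL, labels))
m[cbind(seq_len(n - 1), idx)] <- 1L

# check.names = FALSE keeps the labels as-is instead of prefixing them with X
res2 <- data.frame(name = df1$name[-1], m, check.names = FALSE)
```

Here res2 has one row per item from b to e, with a single 1 in the column for that item's transition.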
Data
df1 <- structure(list(name = structure(1:5, .Label = c("a", "b", "c",
"d", "e"), class = "factor"), rank = c(1, 1, 4, 3, 2)), class = "data.frame", row.names = c(NA,
-5L))
Upvotes: 2