Reputation: 1028
I have a state column in a dataframe and I want to create two new columns: one that looks ahead to the next state change and one that looks back to the previous state change. The resulting dataframe should look like this:
state coming previous
a a-b NA
a a-b NA
a a-b NA
a a-b NA
b b-c a-b
b b-c a-b
b b-c a-b
c c-a b-c
c c-a b-c
c c-a b-c
a NA c-a
a NA c-a
Or, maybe even better, create just two transition columns:
state trans1 trans2
a a-b NA
a a-b NA
a a-b NA
a a-b NA
b a-b b-c
b a-b b-c
b a-b b-c
c c-a b-c
c c-a b-c
c c-a b-c
a c-a NA
a c-a NA
[Edit] changed states named "1" to "c" because it was confusing
Upvotes: 0
Views: 113
Reputation: 1028
Thanks to DWin's answer I found the answer to the second part of my question myself. Here's the complete code to create a dataframe with a transitions column:
state = rep(c('a','b','c','a'), c(4,3,3,2))
inp=data.frame(state, vals=rnorm(12))
runinps=rle(as.character(inp$state)) # rle() needs a plain vector; as.character() is required when state is a factor
(rs <- runinps$values)
(ls=runinps$lengths)
(inp$coming <- rep( c( paste( rs[-length(rs)], rs[-1], sep="-"), NA), ls ))
(inp$previous <-rep( c( NA, paste(rs[-length(rs)], rs[-1], sep="-")), ls ))
# Create the first transitions column
(reps=rep(1:(length(ls)/2),each=2))
(ls2=as.vector(tapply(ls , reps, sum)))
seqRs=seq(from=1,to=length(rs),by=2)
(inp$trans <- rep(paste( rs[seqRs], rs[seqRs+1], sep="-"), ls2 ))
# Create the second transitions column
reps=c(reps[-1], max(reps)+1)
(ls2=as.vector(tapply(ls , reps, sum)))
seqRs=seq(from=2,to=length(rs)-1,by=2)
(inp$trans2 <- rep(c(NA, paste( rs[seqRs], rs[seqRs+1], sep="-"), NA), ls2 ))
# Finally, combine both into a single trans column: rows that have a second transition are duplicated
inp2=subset(inp,!is.na(inp$trans2))
inp2$trans=inp2$trans2
inp=rbind(inp,inp2)
inp$trans2<-NULL
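The code above divides the number of runs by two, so it assumes an even number of runs. As a hedged alternative, here is a minimal sketch that builds the trans1 and trans2 columns from the question directly from the per-run coming and previous labels: odd-numbered runs put the coming transition in trans1, even-numbered runs put the previous one there. The variable names (runs, trans, odd) are my own, and the example data assume the states a, b, c, a from the question.

```r
# Example input matching the question (states a, b, c, a)
state <- rep(c("a", "b", "c", "a"), times = c(4, 3, 3, 2))
inp <- data.frame(state = state, stringsAsFactors = FALSE)

runs <- rle(as.character(inp$state))
n <- length(runs$values)

# One label per boundary between consecutive runs: "a-b", "b-c", "c-a"
trans <- paste(runs$values[-n], runs$values[-1], sep = "-")

coming   <- c(trans, NA_character_)  # per run: transition that ends the run
previous <- c(NA_character_, trans)  # per run: transition that started the run

# Alternate the two labels between the columns so that each transition
# fills one contiguous block of rows in a single column
odd <- seq_len(n) %% 2 == 1
inp$trans1 <- rep(ifelse(odd, coming, previous), times = runs$lengths)
inp$trans2 <- rep(ifelse(odd, previous, coming), times = runs$lengths)
```

Because nothing is divided by two, this should also work when the number of runs is odd; the last run then simply gets NA in trans1.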
Upvotes: 0
Reputation: 263451
Let's give that dataframe a name, say 'inp'. Use the rle function to construct the sequence of "states":
> rle(inp$state)
Run Length Encoding
lengths: int [1:4] 4 3 3 2
values : chr [1:4] "a" "b" "1" "a"
runinp <- rle(inp$state)$values
paste( runinp[-length(runinp)], runinp[-1], sep="-")
# [1] "a-b" "b-1" "1-a"
inp$coming <- rep( c( paste( runinp[-length(runinp)], runinp[-1], sep="-"), NA),
rle(inp$state)$lengths )
inp$coming
# [1] "a-b" "a-b" "a-b" "a-b" "b-1" "b-1" "b-1" "1-a" "1-a" "1-a" NA NA
inp$previous <-
rep( c( NA_character_, paste(runinp[-length(runinp)], runinp[-1], sep="-")),
rle(inp$state)$lengths )
inp$previous
# [1] NA NA NA NA "a-b" "a-b" "a-b" "b-1" "b-1" "b-1" "1-a" "1-a"
(I was able to overcome my difficulty with understanding your first request, but had persistent difficulty with the second part.)
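For reuse, the rle steps for the first request could be wrapped in a small function. This is only a sketch: add_transitions is a hypothetical name, not something from base R, and it labels previous in the same from-to order as the table in the question. It calls rle once and converts the column with as.character, so a factor column is accepted too.

```r
# add_transitions() is a hypothetical helper, not part of base R
add_transitions <- function(df, col = "state") {
  # as.character() so that factor columns are accepted by rle()
  runs <- rle(as.character(df[[col]]))
  n <- length(runs$values)
  # Label each boundary between consecutive runs as "from-to"
  trans <- paste(runs$values[-n], runs$values[-1], sep = "-")
  # Expand the per-run labels back to one value per row
  df$coming   <- rep(c(trans, NA_character_), times = runs$lengths)
  df$previous <- rep(c(NA_character_, trans), times = runs$lengths)
  df
}

# Usage with the states from the question, stored as a factor
inp <- data.frame(state = factor(rep(c("a", "b", "c", "a"), times = c(4, 3, 3, 2))))
out <- add_transitions(inp)
```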
Upvotes: 1