Reputation: 4229
This is a trivial question, but I can't seem to find a neat solution for it (one that doesn't exclude the NAs first and then add them back again). So I'm looking for ideas that don't require excluding the NAs.
I would like to label the start of each run of 0s with 2 and the start of each run of 1s with 1, and replace the NAs, as well as the remaining elements of each run of 0s and 1s, with 0.
Is the rle function useful here? A base R solution would be welcome.
Example:
x <- c(rep(NA,10),rep(1,5),rep(NA,5),rep(1,10),rep(NA,3),rep(0,7),rep(NA,15),rep(1,9))
r <- c(0,diff(x)); r[r %in% -1] <- 2
From this sample data:
x
[1] NA NA NA NA NA NA NA NA NA NA 1 1 1 1 1 NA NA NA NA NA 1 1 1 1 1 1 1 1 1 1 NA NA NA 0 0 0 0 0 0 0 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 1 1 1 1 1 1 1 1
Desired output:
[1] 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
Upvotes: 1
Views: 49
Reputation: 887213
We could use rle to create a grouping variable ('gr') to split 'x' into a list. Replace the first element of each group that is 0 or 1 with 2 or 1 using match, concatenate with 0s, unlist, and then replace the NA elements with 0.
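xN <- x
xN[is.na(xN)] <- -999  # sentinel so that consecutive NAs form a single run in rle()
v1 <- rle(xN)$lengths
gr <- rep(seq_along(v1), v1)  # run id for every element of x
# per run: first element becomes match(value, 1:0) (1 -> 1, 0 -> 2, NA -> NA), the rest 0
x1 <- unlist(lapply(split(x, gr), function(run)
  c(match(run[1], 1:0), rep(0, length(run) - 1))), use.names = FALSE)
x1[is.na(x1)] <- 0  # NA runs start with NA; set those to 0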
x1
#[1] 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0
#[39] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
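As a variation on the same rle idea, here is a minimal base R sketch that skips split and writes only to the first position of each run. The names r, starts and out are made up for illustration. It relies on rle treating every NA as its own run, and on match(..., nomatch = 0) turning those NA values into 0.

```r
x <- c(rep(NA, 10), rep(1, 5), rep(NA, 5), rep(1, 10), rep(NA, 3),
       rep(0, 7), rep(NA, 15), rep(1, 9))

r <- rle(x)                                   # each NA ends up as its own run
starts <- cumsum(c(1, head(r$lengths, -1)))   # index of the first element of every run
out <- numeric(length(x))                     # everything defaults to 0
# run value 1 -> 1, run value 0 -> 2, NA -> 0 (via nomatch)
out[starts] <- match(r$values, 1:0, nomatch = 0)

which(out != 0)
# [1] 11 21 34 56
```

Because only the run starts are assigned, no separate NA replacement step is needed afterwards.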
Or instead of split, we can use which and diff to replace the values.
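x1 <- (!x) + 2 * (!is.na(x)) - 1  # 1 -> 1, 0 -> 2, NA -> NA
ind <- which(!is.na(x))  # positions of the non-NA elements
# zero out every non-NA element that directly follows another non-NA element, plus all NAs
x1[c(ind[c(FALSE, diff(ind) == 1)], which(is.na(x)))] <- 0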
x1
#[1] 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0
#[39] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
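One thing that may be worth checking with the which/diff version: it detects a run start only from a gap in the non-NA positions, so it appears to assume that runs of 0s and 1s are always separated by NAs, as they are in the sample data. The sketch below uses a hypothetical vector y in which a 0 run directly follows a 1 run, and adds a value-change check (the same_run name is made up for illustration) so that the start of the 0 run is still labelled.

```r
y <- c(1, 1, 0, 0, NA)               # hypothetical input: 0s follow 1s with no NA between
ind <- which(!is.na(y))
y1 <- (!y) + 2 * (!is.na(y)) - 1     # 1 -> 1, 0 -> 2, NA -> NA

# an element continues a run only if it is adjacent to AND equal to the previous non-NA
same_run <- c(FALSE, diff(ind) == 1 & diff(y[ind]) == 0)
y1[c(ind[same_run], which(is.na(y)))] <- 0
y1
# [1] 1 0 2 0 0
```

With only the diff(ind) == 1 condition, position 3 would be zeroed as well and the result would be 1 0 0 0 0.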
Or we can use rleid from the devel version of data.table as the grouping variable. Replace the first element of each run of 0's and 1's with 2 and 1 using match, and change the NA values to 0.
library(data.table)  # v1.9.5+
DT <- setDT(list(x))
DT[, c(match(V1[1], 1:0), rep(0, .N - 1)), rleid(V1)][is.na(V1), V1 := 0]$V1
#[1] 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0
#[39] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
Upvotes: 1