John Montague

Reputation: 2010

Find # of rows between events in R

I have a series of true/false data (for example, something that could be generated with rbinom(n, 1, .1)). I want a column that gives the number of rows since the last true. The resulting data would look like this:

true/false  gap
    0        0
    0        0
    1        0
    0        1
    0        2
    1        0
    1        0
    0        1

What is an efficient way to go from true/false to gap? (In practice this will be done on a large dataset with many different ids.)

Upvotes: 1

Views: 76

Answers (2)

Roland

Reputation: 132626

DF <- read.table(text="true/false  gap
    0        0
    0        0
    1        0
    0        1
    0        2
    1        0
    1        0
    0        1", header=TRUE)



DF$gap2 <- sequence(rle(DF$true.false)$lengths) * #create a sequence for each run length
            (1 - DF$true.false) * #multiply with 0 for all 1s
             (cumsum(DF$true.false) != 0L) #multiply with zero for the leading zeros

#  true.false gap gap2
#1          0   0    0
#2          0   0    0
#3          1   0    0
#4          0   1    1
#5          0   2    2
#6          1   0    0
#7          1   0    0
#8          0   1    1
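To see why the product works, it can help to look at the three factors separately. This is only an illustrative sketch on the example column from the question; the intermediate names (run_pos, is_zero, after_1) are made up for the illustration.

```r
# Assumed example vector: the true/false column from the question.
x <- c(0, 0, 1, 0, 0, 1, 1, 0)

run_pos <- sequence(rle(x)$lengths)  # position within each run: 1 2 1 1 2 1 2 1
is_zero <- 1 - x                     # 1 where x is 0:           1 1 0 1 1 0 0 1
after_1 <- cumsum(x) != 0            # FALSE for leading zeros, TRUE once a 1 has occurred

gap <- run_pos * is_zero * after_1
print(gap)  # 0 0 0 1 2 0 0 1
```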

The cumsum part might not be the most efficient for large vectors. If that factor is dropped, something like

if (DF$true.false[1] == 0) DF$gap2[seq_len(rle(DF$true.false)$lengths[1])] <- 0

applied afterwards might be an alternative (and of course the rle result could be stored temporarily to avoid calculating it twice).
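A minimal sketch of that variant, with the rle result stored once, and applied per id with ave() since the question mentions many ids. The helper name gap_since_true, the id column, and the sample data are assumptions made for the example, not part of the question's data.

```r
# gap_since_true() is a hypothetical helper name used for this sketch.
gap_since_true <- function(x) {
  r <- rle(x)                           # computed once and reused
  gap <- sequence(r$lengths) * (1 - x)  # count within each run, zero out the 1s
  if (length(x) > 0 && x[1] == 0) {
    gap[seq_len(r$lengths[1])] <- 0     # leading zeros have no preceding 1
  }
  gap
}

# Assumed layout: an id column plus a 0/1 column named true.false.
DF2 <- data.frame(id = c(1, 1, 1, 1, 2, 2, 2, 2),
                  true.false = c(0, 1, 0, 0, 1, 0, 1, 0))
DF2$gap <- ave(DF2$true.false, DF2$id, FUN = gap_since_true)
print(DF2$gap)  # 0 0 1 2 0 1 0 1
```

ave() keeps the original row order, so the rows do not need to be sorted by id first.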

Upvotes: 4

Inox

Reputation: 2275

OK, let me put this in an answer.

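1) No-brainer method

data[['gap']] <- 0
seen <- data[1, 'true/false'] == 1  # has a 1 occurred yet?
for (i in seq_len(nrow(data))[-1]) {
    if (data[i, 'true/false'] == 1) {
        seen <- TRUE  # gap stays 0 on a 1
    } else if (seen) {
        # leading zeros (before the first 1) keep gap 0
        data[i, 'gap'] <- data[i - 1, 'gap'] + 1
    }
}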

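2) No if check

data[['gap']] <- 0
seen <- data[1, 'true/false']  # becomes 1 once a 1 has occurred
for (i in seq_len(nrow(data))[-1]) {
    x <- data[i, 'true/false']
    seen <- max(seen, x)
    # (1 - x) resets the gap on a 1; seen keeps leading zeros at 0
    data[i, 'gap'] <- (data[i - 1, 'gap'] + 1) * (1 - x) * seen
}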

I really don't know which is faster: both do roughly the same number of reads from data, but (1) has an if statement, and I don't know how fast that is compared to a single multiplication.
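One way to find out is simply to time both. The following is a rough sketch, not a careful benchmark: it wraps the two loops as functions (with leading zeros left at 0), runs them on simulated data like that described in the question, and checks that they agree. The column name x, the seed, the size n = 2000, and the function names are all assumptions made for the example.

```r
set.seed(1)  # assumed seed, only for reproducibility
data <- data.frame(x = rbinom(2000, 1, 0.1))  # 'x' is an assumed column name

# Loop with an if check (hypothetical wrapper name).
gap_if <- function(data) {
  data$gap <- 0
  seen <- data[1, 'x'] == 1
  for (i in seq_len(nrow(data))[-1]) {
    if (data[i, 'x'] == 1) {
      seen <- TRUE
    } else if (seen) {
      data[i, 'gap'] <- data[i - 1, 'gap'] + 1
    }
  }
  data$gap
}

# Loop using multiplication instead of an if check (hypothetical wrapper name).
gap_mult <- function(data) {
  data$gap <- 0
  seen <- data[1, 'x']
  for (i in seq_len(nrow(data))[-1]) {
    x <- data[i, 'x']
    seen <- max(seen, x)
    data[i, 'gap'] <- (data[i - 1, 'gap'] + 1) * (1 - x) * seen
  }
  data$gap
}

t_if   <- system.time(res_if   <- gap_if(data))
t_mult <- system.time(res_mult <- gap_mult(data))
cat("if:", t_if[["elapsed"]], "s; multiply:", t_mult[["elapsed"]], "s\n")

# Both versions should produce the same gaps.
stopifnot(isTRUE(all.equal(res_if, res_mult)))
```

Timings will vary by machine and data size, so it is worth running this several times on data of realistic size before drawing conclusions.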

Upvotes: 0
