Santiago Alvarez
Santiago Alvarez

Reputation: 3

Get a vector with the count of consecutive specific values in R

I have a dataframe that looks like this:

    (TimeDate,Movement,Booked)

    1   2017-01-02 09:00:13 FALSE   TRUE
    2   2017-01-02 09:01:13 FALSE   TRUE
    3   2017-01-02 09:02:15 TRUE    TRUE
    4   2017-01-02 09:03:13 TRUE    TRUE
    5   2017-01-02 09:04:13 FALSE   FALSE
    6   2017-01-02 09:05:13 FALSE   FALSE
    7   2017-01-02 09:06:13 FALSE   TRUE
    8   2017-01-02 09:07:13 FALSE   TRUE
    9   2017-01-02 09:08:14 TRUE    TRUE
    10  2017-01-02 09:09:13 FALSE   TRUE

It has a minute by minute log with two booleans corresponding to movement detected and room booked. What I want to obtain is a vector that counts the consecutive occurrences of (FALSE,TRUE) for Movement and Booked comumns, so in this case I would want a resulting vector like this one:

    c(2,2,1)

Where the first element is the count of lines 1 and 2, the second for lines 7 and 8 and the last one for line 10.

Honestly I don't even know how to start with this. Thanks!

Upvotes: 0

Views: 104

Answers (3)

infinitefactors
infinitefactors

Reputation: 154

Edit: Did not see Roman's identical solution when posting.

We would like something like:

tmp <- rle(with(df, !Movement & Booked))
tmp$lengths[tmp$values]

The indexing by tmp$values ensures you only get the rows corresponding to the pattern you've specified.

Hope this helps!

Upvotes: 0

akrun
akrun

Reputation: 887911

Here is one option with rleid where we create a grouping column based on the FALSE in 'Movement' and Booked TRUE, get the count of the column

library(dplyr)
library(data.table)
df1 %>%
     mutate(rl = rleid(!Movement, Booked)) %>%
     filter(!(! Movement & Booked)) %>% 
     count(rl) %>%
     pull(n)
#[1] 2 2 1

Or with rle from base R

with(df1, with(rle(paste(Movement, Booked)), lengths[values == 'FALSE TRUE']))
#[1] 2 2 1

or using a compact way

with(df1, table(rleid(Movement, Booked)[!Movement & Booked]))

#1 4 6 
#2 2 1 

data

df1 <- structure(list(TimeDate = c("2017-01-02 09:00:13", "2017-01-02 09:01:13", 
"2017-01-02 09:02:15", "2017-01-02 09:03:13", "2017-01-02 09:04:13", 
"2017-01-02 09:05:13", "2017-01-02 09:06:13", "2017-01-02 09:07:13", 
"2017-01-02 09:08:14", "2017-01-02 09:09:13"), Movement = c(FALSE, 
FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE, FALSE), 
    Booked = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, 
    TRUE, TRUE)), class = "data.frame", row.names = c("1", "2", 
"3", "4", "5", "6", "7", "8", "9", "10"))

Upvotes: 0

Roman Luštrik
Roman Luštrik

Reputation: 70653

How about this? For each line check that you get the correct sequence and then use rle.

xy <- read.table(text = "FALSE   TRUE
FALSE   TRUE
TRUE    TRUE
TRUE    TRUE
FALSE   FALSE
FALSE   FALSE
FALSE   TRUE
FALSE   TRUE
TRUE    TRUE
FALSE   TRUE", header = FALSE)
colnames(xy) <- c("Movement", "Booked")

out <- rle(apply(xy, MARGIN = 1, FUN = function(x) !isTRUE(x[1]) & isTRUE(x[2])))
out$lengths[out$values == TRUE]

[1] 2 2 1

Another more simple option would be

out <- rle(xy$Movement == FALSE & xy$Booked == TRUE)
out$lengths[out$values == TRUE]

Upvotes: 3

Related Questions