dataabaass
dataabaass

Reputation: 13

R find consecutive months

I'd like to find consecutive month by client. I thought this is easy but still can't find solutions..

My goal is to find months' consecutive purchases for each client. Any

My data

Client Month consecutive
A      1       1
A      1       2
A      2       3
A      5       1
A      6       2
A      8       1
B      8       1

Upvotes: 1

Views: 472

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 388982

In base R, we can use ave

df$consecutive <- with(df, ave(Month, Client, cumsum(c(TRUE, diff(Month) > 1)),
                       FUN = seq_along))
df

#  Client Month consecutive
#1      A     1           1
#2      A     1           2
#3      A     2           3
#4      A     5           1
#5      A     6           2
#6      A     8           1
#7      B     8           1

In dplyr, we can create a new group with lag to compare the current month with the previous month and assign row_number() in each group.

library(dplyr)

df %>%
  group_by(Client,group=cumsum(Month-lag(Month, default = first(Month)) > 1)) %>%
  mutate(consecutive = row_number()) %>%
  ungroup %>%
  select(-group)

Upvotes: 1

akrun
akrun

Reputation: 887118

We can create a grouping variable based on the difference in adjacent 'Month' for each 'Client' and use that to create the sequence

library(dplyr)
df1 %>%
  group_by(Client) %>%
  group_by(grp =cumsum(c(TRUE, diff(Month) > 1)), add = TRUE) %>% 
  mutate(consec = row_number()) %>%
  ungroup %>%
  select(-grp)
# A tibble: 7 x 4
#  Client Month consecutive consec
#  <chr>  <int>       <int>  <int>
#1 A          1           1      1
#2 A          1           2      2
#3 A          2           3      3
#4 A          5           1      1
#5 A          6           2      2
#6 A          8           1      1
#7 B          8           1      1

Or using data.table

library(data.table)
setDT(df1)[, grp := cumsum(c(TRUE, diff(Month) > 1)), Client
       ][, consec := seq_len(.N), .(Client, grp)
        ][, grp := NULL][]

data

df1 <- structure(list(Client = c("A", "A", "A", "A", "A", "A", "B"), 
    Month = c(1L, 1L, 2L, 5L, 6L, 8L, 8L), consecutive = c(1L, 
    2L, 3L, 1L, 2L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-7L))

Upvotes: 1

Related Questions