user14250906
user14250906

Reputation: 197

How to create a subset of groups that have full consecutive months

I am trying to create a subset of groups that have full consecutive months in R.

For instance, if there is data that look like:

structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2), Month = c(3, 
4, 7, 8, 1, 2, 3, 4)), class = "data.frame", row.names = c(NA, 
-8L), codepage = 65001L)

In a table, this looks like:

╔═══════╦═══════╗
║ Group ║ Month ║
╠═══════╬═══════╣
║ 1     ║ 3     ║
╠═══════╬═══════╣
║ 1     ║ 4     ║
╠═══════╬═══════╣
║ 1     ║ 7     ║
╠═══════╬═══════╣
║ 1     ║ 8     ║
╠═══════╬═══════╣
║ 2     ║ 1     ║
╠═══════╬═══════╣
║ 2     ║ 2     ║
╠═══════╬═══════╣
║ 2     ║ 3     ║
╠═══════╬═══════╣
║ 2     ║ 4     ║
╚═══════╩═══════╝

I would like Group 1 to be dropped because there is a "juncture" in the consecutive months (there is no Month 5, 6).

Upvotes: 2

Views: 78

Answers (4)

Duck
Duck

Reputation: 39613

It is also valid to contrast with the numbers of observations across all groups and check if all differences are equal to one:

library(tidyverse)
#Code
df %>% group_by(Group) %>%
  mutate(Diff=c(1,diff(Month)),
         Value=n()==sum(Diff==1)) %>%
  filter(Value) %>% ungroup() %>% select(-c(Value,Diff))

Output:

# A tibble: 4 x 2
  Group Month
  <dbl> <dbl>
1     2     1
2     2     2
3     2     3
4     2     4

Some data used:

#Data
df <- structure(list(Group = c(1, 1, 1, 1, 2, 2, 2, 2), Month = c(3, 
4, 7, 8, 1, 2, 3, 4)), class = "data.frame", row.names = c(NA, 
-8L), codepage = 65001L)

Upvotes: 0

ThomasIsCoding
ThomasIsCoding

Reputation: 102559

Here is a base R option using subset + ave

> subset(df,as.logical(ave(Month,Group, FUN = function(x) all(diff(x)==1))))
  Group Month
5     2     1
6     2     2
7     2     3
8     2     4

Upvotes: 0

Sotos
Sotos

Reputation: 51582

A base R solution can be to use ave, i.e.

df[!!with(df, ave(Month, Group, FUN = function(i)all(diff(i) == 1))),]

#  Group Month
#5     2     1
#6     2     2
#7     2     3
#8     2     4

Upvotes: 1

tmfmnk
tmfmnk

Reputation: 40171

One dplyr option could be:

df %>%
 group_by(Group) %>%
 filter(all(diff(Month) == 1))

  Group Month
  <dbl> <dbl>
1     2     1
2     2     2
3     2     3
4     2     4

Upvotes: 0

Related Questions