lokheart

Reputation: 24665

Using R to filter out non-consecutive numbers from a data frame column

Suppose I have a dataframe like this:

DAYS   VALUE
1      A
2      A
3      A
5      A
7      A
9      A
10     A
12     A
13     A
14     A
15     A

I am trying to write a function that keeps only the rows whose DAYS values belong to a run of consecutive numbers (at least 3 in a row), like this:

DAYS   VALUE
1      A
2      A
3      A
12     A
13     A
14     A
15     A

Are there any functions in existing packages that can do this?

Thanks!

Upvotes: 2

Views: 1723

Answers (2)

VitoshKa

Reputation: 8523

A simple for loop will do as well:

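d <- as.integer(DATA$DAYS)
consec <- rep.int(FALSE, length(d))

# Slide a window of three over d; seq_len() avoids the 1:0 trap when d has fewer than 3 elements
for (i in seq_len(max(length(d) - 2L, 0L))) {
    # Mark all three positions when d[i], d[i + 1], d[i + 2] are consecutive integers
    if (identical(d[i] + 1:2, d[i + 1:2])) {
        consec[i + 0:2] <- TRUE
    }
}

DATA[consec, ]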
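
If the minimum run length should not be hard-coded to 3, the same sliding-window idea can be wrapped in a function. This is only a sketch: the name flag_consecutive and the min_run argument are made up for illustration, and it assumes DAYS is sorted and holds whole numbers.

```r
# Hypothetical helper (not from the answer above): flag the elements that
# belong to a run of at least `min_run` consecutive integers.
flag_consecutive <- function(days, min_run = 3L) {
  days <- as.integer(days)
  keep <- rep(FALSE, length(days))
  offsets <- seq_len(min_run) - 1L            # 0, 1, ..., min_run - 1
  n_windows <- length(days) - min_run + 1L    # number of window positions
  for (i in seq_len(max(n_windows, 0L))) {
    window <- days[i + offsets]
    # Every step inside the window must be exactly +1
    if (all(diff(window) == 1L)) {
      keep[i + offsets] <- TRUE
    }
  }
  keep
}

# The data from the question
DATA <- data.frame(DAYS = c(1, 2, 3, 5, 7, 9, 10, 12, 13, 14, 15), VALUE = "A")
DATA[flag_consecutive(DATA$DAYS), ]
```

On the question's data this keeps the rows with DAYS 1, 2, 3 and 12 to 15; the pair 9, 10 is dropped because it is shorter than three.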

Upvotes: 0

kohske

Reputation: 66842

There must be a simpler way, but as a one-liner (with d as the data frame):

d[(1+(s<-c(0,cumsum(1-(diff(d$DAYS)==1)))))%in%which(table(s)>=3),]

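Step by step:

d1 <- c(FALSE, diff(d$DAYS) != 1)  # TRUE where a row does not follow its predecessor by exactly 1
d2 <- cumsum(d1) + 1               # run id for each row (1, 2, 3, ...)
d3 <- table(d2)                    # number of rows in each run
d4 <- which(d3 >= 3)               # positions in d3 (equal to the run ids) with at least 3 rows
d[d2 %in% d4, ]                    # keep only the rows belonging to those runs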
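
The same run-id idea can also be written with base R's rle() instead of table() and which(). The following is a minimal sketch, not part of the answer above: keep_runs, col and min_run are names invented here, and the column is assumed to be sorted in increasing order.

```r
# Hypothetical wrapper: keep rows whose value in `col` lies in a run of
# at least `min_run` consecutive numbers.
keep_runs <- function(df, col = "DAYS", min_run = 3L) {
  if (nrow(df) == 0L) return(df)              # nothing to filter
  # TRUE at the first row and wherever the step from the previous row is not 1
  new_run <- c(TRUE, diff(df[[col]]) != 1)
  run_id <- cumsum(new_run)                   # 1, 1, 1, 2, 3, ...
  run_len <- rle(run_id)$lengths              # length of each run, in order
  # Expand the run lengths back to one value per row and filter on them
  len_per_row <- rep(run_len, times = run_len)
  df[len_per_row >= min_run, , drop = FALSE]
}

# The data from the question
d <- data.frame(DAYS = c(1, 2, 3, 5, 7, 9, 10, 12, 13, 14, 15), VALUE = "A")
keep_runs(d)
```

Because each row carries the length of its own run, changing the threshold is just a matter of passing a different min_run, for example keep_runs(d, min_run = 4) to keep only 12 to 15.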

Upvotes: 12
