Find the length of consecutive numbers in R

Question

I have a data.frame that looks like this:

data=data.frame(position=c(1,2,3,1,1,4,5,6,7,8,2,2),
             name=c("A","B","C","A","A","D","E","F","G","H","B","B"))

 position name
     1    A
     2    B
     3    C
     1    A
     1    A
     4    D
     5    E
     6    F
     7    G
     8    H
     2    B
     2    B

I would like to be able to identify in the column "position" all the consecutive intervals and then paste into a new column the length of each interval.

I would like my data to look somehow like this.

position  length
  1 - 3     3  
  4 - 8     5

Any help and comment are highly appreciated

Paul · Accepted Answer

Here is a base R solution.

Create a column, sequence, which indicates which rows are contiguous.

data$sequence <- c(NA, head(data$position, -1)) + 1 == data$position
data$sequence[[1]] <- data$sequence[[2]]

data
#>    position name sequence
#> 1         1    A     TRUE
#> 2         2    B     TRUE
#> 3         3    C     TRUE
#> 4         1    A    FALSE
#> 5         1    A    FALSE
#> 6         4    D    FALSE
#> 7         5    E     TRUE
#> 8         6    F     TRUE
#> 9         7    G     TRUE
#> 10        8    H     TRUE
#> 11        2    B    FALSE
#> 12        2    B    FALSE

Use rle to construct the run lengths.

run_lengths <- rle(data$sequence)

i_ends <- cumsum(run_lengths$lengths)[run_lengths$values]
i_starts <- c(1, head(i_ends, -1))

data.frame(
  position = paste0(data$position[i_starts], " - ", data$position[i_ends]),
  length = i_ends - i_starts
)
#>   position length
#> 1    1 - 3      2
#> 2    3 - 8      7

Find the length of consecutive numbers in R

Answers (2)

Related Questions