Reputation: 559

Count consecutive duplicates in a column

I would like to find out how many consecutive rows a given number occupies in a column, and store the length of each such run in a matrix.

For instance, given this input, I would like to find every run of consecutive -1 values:

df$V1

0
1
0
-1
-1
0
1
-1
-1
-1
1

The expected output is the length of each run of consecutive -1s:

output
2
3

All I could think of was to iterate over the rows, check whether both the current row and the row above are -1, and increment a counter if so. Is there a faster way to do it?

Upvotes: 2

Answers (3)

989

Reputation: 12937

You could do this in base R:

x <- df$V1
r <- x == -1
# pad with 0 and sum(r) so that a run at the very start or end of x is counted too
diff(unique(c(0L, cumsum(r)[!r], sum(r))))
#[1] 2 3
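
To see why the cumulative-sum approach works, it may help to look at the intermediate values for the vector from the question. This is only an illustrative walk-through; the names cs, kept and steps are introduced here and are not part of the answer.

```r
x <- c(0L, 1L, 0L, -1L, -1L, 0L, 1L, -1L, -1L, -1L, 1L)

r    <- x == -1     # TRUE wherever the value is -1
cs   <- cumsum(r)   # running count of -1s: 0 0 0 1 2 2 2 3 4 5 5
kept <- cs[!r]      # the count seen at each non -1 position: 0 0 0 2 2 5

# kept only changes after a run of -1s has passed, so the jumps
# between its distinct values are the run lengths
steps <- unique(kept)   # 0 2 5
diff(steps)             # 2 3
```

Each jump in steps equals the number of -1s crossed since the previous non -1 element, which is exactly the length of one run.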

Upvotes: 0

d.b

Reputation: 32548

Use rle, which returns the length and the value of every run:

x = c(0L, 1L, 0L, -1L, -1L, 0L, 1L, -1L, -1L, -1L, 1L)
with(rle(x), lengths[values == -1])
#[1] 2 3
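
As a small sketch of what rle gives back for this vector, and of how the result could be placed in a matrix as the question asks. The names runs and m, and the column name "n", are chosen here for illustration only.

```r
x <- c(0L, 1L, 0L, -1L, -1L, 0L, 1L, -1L, -1L, -1L, 1L)

runs <- rle(x)
runs$values    # value of each run:  0 1 0 -1 0 1 -1 1
runs$lengths   # length of each run: 1 1 1  2 1 1  3 1

# keep the lengths of the -1 runs and store them as a one-column matrix
m <- matrix(runs$lengths[runs$values == -1], ncol = 1,
            dimnames = list(NULL, "n"))
m
```

Here m has one row per run of -1, so it holds 2 and 3 for the example data.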

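For all unique elements of x (lapply always returns a list, whereas sapply would simplify the result to a matrix whenever every value has the same number of runs):

with(rle(x), setNames(lapply(unique(values), function(v)
                lengths[values == v]), nm = unique(values)))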
#$`0`
#[1] 1 1 1

#$`1`
#[1] 1 1 1

#$`-1`
#[1] 2 3

Upvotes: 2

akrun

Reputation: 886948

For all values, we can do this with rleid from data.table, which assigns one group id per run:

library(data.table)
# setDT() converts df to a data.table by reference
res <- setDT(df)[, .(value = V1[1L], n = .N), .(grp = rleid(V1))]
res
#   grp value n
#1:   1     0 1
#2:   2     1 1
#3:   3     0 1
#4:   4    -1 2
#5:   5     0 1
#6:   6     1 1
#7:   7    -1 3
#8:   8     1 1

From this, we can subset the rows where value is -1:

res[value == -1][, grp := NULL][]
#   value n
#1:    -1 2
#2:    -1 3
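
For readers without data.table, the grouping idea can be approximated in base R. This is a rough sketch that rebuilds run ids from rle rather than calling rleid; the names grp, n, value and base_res are illustrative and not part of the answer.

```r
x <- c(0L, 1L, 0L, -1L, -1L, 0L, 1L, -1L, -1L, -1L, 1L)

# one id per run, repeated for every element of that run
grp <- with(rle(x), rep(seq_along(lengths), lengths))
grp   # 1 2 3 4 4 5 6 7 7 7 8

n     <- tabulate(grp)        # number of rows in each run
value <- x[!duplicated(grp)]  # first element of each run

base_res <- data.frame(grp = seq_along(n), value = value, n = n)
base_res[base_res$value == -1, c("value", "n")]
```

The final line keeps only the runs of -1, giving n = 2 and n = 3 for the example data.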

data

df <- structure(list(V1 = c(0L, 1L, 0L, -1L, -1L, 0L, 1L, -1L, -1L, 
-1L, 1L)), .Names = "V1", row.names = c(NA, -11L), class = "data.frame")

Upvotes: 0
