Removing rows whose value repeats in two consecutive rows

Question

I have a data frame

dat <- data.frame(time = c(24.83,25.24,25.46,25.71,25.78,26.11), key = c("z","f","x","f","f","x"))

which looks like this:

time    key
24.83   z
25.24   f
25.46   x
25.71   f
25.78   f
26.11   x

I want to find all instances where the value of 'key' is the same in two consecutive rows (like 'f' in rows 4 and 5 here) and remove the second row of each such pair.

I looked at ?duplicated and ?unique but still have no idea how to apply them for this purpose.

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer

duplicated and unique might not be the best choices here. By default they drop every later occurrence of a value anywhere in the column, not only the ones that directly follow an identical row.
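
To make that concrete, here is a small sketch of what duplicated flags on the example data. The stringsAsFactors = FALSE argument is my addition, only so the result does not depend on the R version.

```r
dat <- data.frame(time = c(24.83, 25.24, 25.46, 25.71, 25.78, 26.11),
                  key = c("z", "f", "x", "f", "f", "x"),
                  stringsAsFactors = FALSE)

# duplicated() flags every value that already appeared anywhere earlier,
# whether or not the earlier occurrence is in the adjacent row
dup <- duplicated(dat$key)
print(dup)

# Dropping the flagged rows keeps only the first z, the first f and the first x
kept <- dat[!dup, ]
print(kept)
```

Rows 4 and 6 are dropped along with row 5, even though neither repeats the row directly above it.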

Instead, you can use rle, like this:

> dat[sequence(rle(as.character(dat$key))$lengths) == 1, ]
   time key
1 24.83   z
2 25.24   f
3 25.46   x
4 25.71   f
6 26.11   x

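rle returns a list with two items: lengths (the length of each run of identical consecutive values) and values (the value of each run). sequence turns those lengths into a counter that restarts at 1 for every run, so keeping only the rows where the counter equals 1 keeps the first row of each run. The as.character call is there because rle does not accept a factor, and key may be one depending on how the data frame was built.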
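
If it helps, the one-liner can be unpacked into its intermediate steps. This is only an illustration on the example keys; the variable names runs, pos and keep are mine.

```r
key <- c("z", "f", "x", "f", "f", "x")

# Run-length encoding: one entry per run of identical consecutive values
runs <- rle(key)
print(runs$lengths)  # length of each run
print(runs$values)   # value of each run

# Position of each element within its run (restarts at 1 for every run)
pos <- sequence(runs$lengths)
print(pos)

# TRUE for the first element of each run, FALSE for repeats
keep <- pos == 1
print(keep)
```

Here keep is FALSE only for the fifth element, the second 'f' of the run, so dat[keep, ] gives the same result as the one-liner above.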
