Reputation: 115
I'm trying to build a logical vector that checks whether each element is equal to the previous element.
vector <- c(1, 1, 2, 2, 2, 3, 3)
I'd like to check whether each element is equal to the previous one, so the result should be:
FALSE TRUE FALSE TRUE TRUE FALSE TRUE
I know I could write a loop, but it's not efficient (I have a 16-million-row df). This
is not ideal, but it's what I could manage:
for(i in 2:length(vector)) {print(vector[i] == vector[i-1])}
That would take forever. Is there a vectorized way to do this?
Upvotes: 0
Views: 3285
Reputation: 1890
Here's a data.table answer. Note that the first item is really an NA, because the first element has no previous element to compare against. You can manually edit that one if desired.
library("data.table")
vector <- c(1, 1, 2, 2, 2, 3, 3)
df <- data.frame(original=vector)
setDT(df)
df[, prev_eq := original == shift(original, 1)]
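If you'd rather have FALSE than NA in the first row, one possible way to make that manual edit is sketched below. It assumes the same column names as above (`original`, `prev_eq`); the object name `dt` is just for illustration.

```r
library(data.table)

# Same toy data as above; `dt` is a hypothetical name for this sketch.
dt <- data.table(original = c(1, 1, 2, 2, 2, 3, 3))

# shift() lags the column by one position and pads the first slot with NA.
dt[, prev_eq := original == shift(original, 1)]

# Row 1 has no predecessor, so its comparison is NA; overwrite it by reference.
dt[1, prev_eq := FALSE]
```

Targeting row 1 directly, rather than every NA, leaves any NA that comes from missing values in the data itself untouched.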
Upvotes: 3
Reputation: 73325
We can use the following (best suited to an integer vector):
c(FALSE, diff(x) == 0)
Example
x <- c(1L, 1L, 2L, 2L, 2L, 3L, 3L)
c(FALSE, diff(x) == 0)
#[1] FALSE TRUE FALSE TRUE TRUE FALSE TRUE
If your vector contains floating point numbers, this is more robust:
c(FALSE, abs(diff(x)) < .Machine$double.eps ^ 0.5)
but it will use about three times more memory and may be about three times slower than the above for a really huge vector.
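To see why the tolerance matters, here is a small sketch with made-up values (the names `exact` and `tolerant` are only for illustration). It relies on the fact that 0.1 + 0.2 is not stored as exactly 0.3 in double precision:

```r
# The first element is computed, the other two are typed literally.
x <- c(0.1 + 0.2, 0.3, 0.3)

# Exact comparison: the tiny rounding difference makes elements 1 and 2 look different.
exact <- c(FALSE, diff(x) == 0)

# Tolerance-based comparison: differences below sqrt(machine epsilon) count as equal.
tolerant <- c(FALSE, abs(diff(x)) < .Machine$double.eps ^ 0.5)

print(exact)     # FALSE FALSE TRUE
print(tolerant)  # FALSE TRUE TRUE
```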
If you have a character vector, we can use
c(FALSE, x[-1] == x[-length(x)])
It is always safe to compare strings using "==".
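As a quick illustration, the shifted-comparison idiom can be wrapped in a small helper. The function name `same_as_prev` is made up for this sketch and is not part of base R; the empty-vector guard is an added assumption about what you'd want in that case.

```r
# Hypothetical helper: TRUE where an element equals the one before it.
same_as_prev <- function(x) {
  n <- length(x)
  # Guard: an empty input should give an empty result, not a lone FALSE.
  if (n == 0L) return(logical(0))
  # Compare elements 2..n against elements 1..(n-1); the first has no predecessor.
  c(FALSE, x[-1] == x[-n])
}

s <- c("a", "a", "b", "b", "c")
res <- same_as_prev(s)
print(res)  # FALSE TRUE FALSE TRUE FALSE
```

The same helper also works on integer vectors, since it only relies on `==` and negative indexing.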
Upvotes: 7