Reputation: 6649
I have the following data frame:
x <- data.frame(id = 1:6,
                a = c('a', 'b', 'b', 'a', 'a', 'c'),
                b = rep(2, 6),
                c = c(5, 4, 4, 5, 5, 2))
> x
  id a b c
1  1 a 2 5
2  2 b 2 4
3  3 b 2 4
4  4 a 2 5
5  5 a 2 5
6  6 c 2 2
I want to end up with:
  id a b c
1  1 a 2 5
2  2 b 2 4
4  4 a 2 5
6  6 c 2 2
The requirement is that a row should be removed if it is identical to the previous row in every column except id. If it matches a row further up but not the immediately preceding one, I do not want to remove it. For example, the row with id 4 is the same as the row with id 1, but it is kept because the id 1 row is not immediately above it.
Any help would be appreciated.
Upvotes: 3
Views: 915
Reputation: 887721
We can use base R:
x[!c(FALSE, !rowSums(x[-1, -1] != x[-nrow(x), -1])),]
#  id a b c
#1  1 a 2 5
#2  2 b 2 4
#4  4 a 2 5
#6  6 c 2 2
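To unpack the one-liner, here is the same logic spelled out step by step. The intermediate names (vals, differs, same_as_prev, keep) are my own, introduced only for illustration, and the sketch assumes there are no NA values in the compared columns, since != returns NA for them.

```r
x <- data.frame(id = 1:6,
                a = c('a', 'b', 'b', 'a', 'a', 'c'),
                b = rep(2, 6),
                c = c(5, 4, 4, 5, 5, 2))

# Drop the id column so that only a, b and c are compared
vals <- x[, -1]

# Compare rows 2..n with rows 1..(n-1), cell by cell
differs <- vals[-1, ] != vals[-nrow(vals), ]

# A row repeats its predecessor when no cell differs
same_as_prev <- rowSums(differs) == 0

# The first row has no predecessor, so it is always kept
keep <- c(TRUE, !same_as_prev)

x[keep, ]
```

Here keep plays the role of the !c(FALSE, ...) vector in the one-liner: it is TRUE for the first row and for every row that differs from the row directly above it.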
Upvotes: 3
Reputation: 2434
Here is a way using the lag function in dplyr. The idea is to create a key column from the non-id columns and check whether it is the same as the key of the previous row.
library(dplyr)
x %>%
  mutate(key = paste(a, b, c, sep = "|")) %>%
  filter(key != lag(key, default = "0")) %>%
  select(-key)
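If pasting the columns together feels fragile (values that themselves contain the separator could, in principle, produce the same key for different rows), the same idea can be written by comparing each column with its own lag. This is only a sketch of a variation, not part of the approach above; it assumes there are no NA values in a, b or c, and the name result is just for illustration.

```r
library(dplyr)

x <- data.frame(id = 1:6,
                a = c('a', 'b', 'b', 'a', 'a', 'c'),
                b = rep(2, 6),
                c = c(5, 4, 4, 5, 5, 2))

result <- x %>%
  filter(row_number() == 1 |   # the first row has no predecessor, keep it
           a != lag(a) |       # otherwise keep the row if any column
           b != lag(b) |       # differs from the row directly above
           c != lag(c))

result
```

The trade-off is that every compared column has to be listed explicitly, whereas the key approach only needs them once inside paste.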
Upvotes: 2