chenchenmomo

Reputation: 233

Remove repeated values in one column using R

I want to remove every row whose status is the same as the previous row's, keeping only the first row of each run of identical statuses.

 test[1:100,]
                         time status
    1   2014-07-31 17:00:01.0 0
    2   2014-07-31 17:00:01.2 0
    3   2014-07-31 17:00:01.4 1
    4   2014-07-31 17:00:01.5 1
    5   2014-07-31 17:00:02.2 1
    6   2014-07-31 17:00:02.5 1
    7   2014-07-31 17:00:03.0 1
    8   2014-07-31 17:00:04.0 1
    9   2014-07-31 17:00:04.0 1
    10  2014-07-31 17:00:05.0 1
    11  2014-07-31 17:00:09.0 1
    12  2014-07-31 17:00:10.0 1
    13  2014-07-31 17:00:10.2 1
    14  2014-07-31 17:00:11.2 1
    15  2014-07-31 17:00:11.5 1
    16  2014-07-31 17:00:12.0 1
    17  2014-07-31 17:00:12.5 1
    18  2014-07-31 17:00:12.5 0
    19  2014-07-31 17:00:12.9 1
    20  2014-07-31 17:00:13.4 0

What I want is a result in which adjacent statuses are always different:

                     time status
1   2014-07-31 17:00:01.0 0
3   2014-07-31 17:00:01.4 1
18  2014-07-31 17:00:12.5 0
19  2014-07-31 17:00:12.9 1
20  2014-07-31 17:00:13.4 0

I tried unitest <- subset(test, !duplicated(test[,2])), but it removed every row whose status had already appeared anywhere in the column, leaving only two rows. What function can I use instead?

Upvotes: 0

Views: 136

Answers (1)

Jota

Reputation: 17611

# Keep row 1, plus every row whose status differs from the row before it
test[c(1, which(diff(test$status) != 0)+1), ]

#                    time status
#1  2014-07-31 17:00:01.0      0
#3  2014-07-31 17:00:01.4      1
#18 2014-07-31 17:00:12.5      0
#19 2014-07-31 17:00:12.9      1
#20 2014-07-31 17:00:13.4      0
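To see why this works, here is a minimal sketch on a small made-up data frame (the shortened time strings and the names keep, mask, result and result2 are illustrative assumptions, not part of the original data). duplicated() flags any value seen earlier anywhere in the column, so with only two distinct statuses it can keep at most two rows, whereas diff() only compares each row with its immediate neighbour. The second form does the same neighbour comparison with != and so should also work when status is a character column, where diff() would not apply.

```r
# Small stand-in for the question's data (values are illustrative)
test <- data.frame(
  time   = c("17:00:01.0", "17:00:01.2", "17:00:01.4", "17:00:01.5",
             "17:00:12.5", "17:00:12.9", "17:00:13.4"),
  status = c(0, 0, 1, 1, 0, 1, 0)
)

# diff() returns status[i + 1] - status[i]; a nonzero value at position i
# means row i + 1 starts a new run. Row 1 always starts the first run.
keep <- c(1, which(diff(test$status) != 0) + 1)
result <- test[keep, ]

# Equivalent logical mask: compare each status with the one just before it
mask <- c(TRUE, test$status[-1] != test$status[-nrow(test)])
result2 <- test[mask, ]

print(result)
```

Both forms assume the data frame has at least one row and that status contains no NA values; either case would need separate handling.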

Upvotes: 4
