Reputation: 729
First, I am new to R, so I am not completely familiar with the syntax of the language. I have a list of data that, for example, looks like this:
1,1,1,1,1,2,2,2,3,3,3,2,2,3,3,3,4,4,4,4,4,4,4,4,4,5,5,5,5,6,6,5,6,5,7,7,7,7
What I want to do is create a new list with only one entry per group of identical data, so:
1,2,3,2,3,4,5,6,5,6,5,7 (approximately).
I am not quite sure how to go about this. Note that the values may not be integers. Also, if anyone has ideas for doing the same thing with strings or timestamps, suggestions would be appreciated! So far I have been trying to think about it in terms of indexing, but I am having trouble working it out.
Upvotes: 0
Views: 90
Reputation: 186
Looks like you need the function rle. If x is your vector of values, then rle(x)$values will give you what you want.
values <- c(1,1,1,1,1,2,2,2,3,3,3,2,2,3,3,3,4,4,4,4,4,4,4,4,4,5,5,5,5,6,6,5,6,5,7,7,7,7)
rle(values)$values
## [1] 1 2 3 2 3 4 5 6 5 6 5 7
values <- as.character(values)
rle(values)$values
## [1] "1" "2" "3" "2" "3" "4" "5" "6" "5" "6" "5" "7"
ts <- Sys.time()
stamps <- sort(rep(c(ts, ts + 1, ts + 2, ts + 3), 5))
stamps
## [1] "2014-09-25 10:55:29 EDT" "2014-09-25 10:55:29 EDT" "2014-09-25 10:55:29 EDT"
## [4] "2014-09-25 10:55:29 EDT" "2014-09-25 10:55:29 EDT" "2014-09-25 10:55:30 EDT"
## [7] "2014-09-25 10:55:30 EDT" "2014-09-25 10:55:30 EDT" "2014-09-25 10:55:30 EDT"
## [10] "2014-09-25 10:55:30 EDT" "2014-09-25 10:55:31 EDT" "2014-09-25 10:55:31 EDT"
## [13] "2014-09-25 10:55:31 EDT" "2014-09-25 10:55:31 EDT" "2014-09-25 10:55:31 EDT"
## [16] "2014-09-25 10:55:32 EDT" "2014-09-25 10:55:32 EDT" "2014-09-25 10:55:32 EDT"
## [19] "2014-09-25 10:55:32 EDT" "2014-09-25 10:55:32 EDT"
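# rle() only accepts a plain atomic vector, so convert the timestamps to numeric first and back to POSIXct afterwards
as.POSIXct(rle(as.numeric(stamps))$values, origin = '1970-01-01')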
## [1] "2014-09-25 10:55:29 EDT" "2014-09-25 10:55:30 EDT" "2014-09-25 10:55:31 EDT"
## [4] "2014-09-25 10:55:32 EDT"
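Since the question mentions thinking about this in terms of indexing, here is a rough sketch of that route as an alternative to rle: compare each element with the one before it and keep only the positions where the value changes. The helper name collapse_runs is made up for this example and is not part of base R, and the sketch assumes the input contains no NA values.

```r
# Hypothetical helper (not a base R function): keep the first element of each run.
# Assumes x has no NA values; an NA would propagate into the logical index.
collapse_runs <- function(x) {
  n <- length(x)
  if (n == 0L) return(x)
  # TRUE for the first element, then TRUE wherever an element differs from its predecessor
  keep <- c(TRUE, x[-1L] != x[-n])
  x[keep]
}

values <- c(1, 1, 1, 2, 2, 3, 3, 2, 2)
collapse_runs(values)
## [1] 1 2 3 2

# Subsetting keeps the class, so timestamps need no round trip through as.numeric
stamps <- as.POSIXct("2014-09-25 10:55:29", tz = "UTC") + rep(0:3, each = 5)
collapse_runs(stamps)
```

Both this and rle compare with exact equality, so non-integer values that differ only by floating-point rounding error are treated as different runs.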
Upvotes: 4