Reputation: 3788
In R, you can define an arbitrary integer sequence using :
, e.g.
a = c(1:3, 12:14)
print(a)
## 1 2 3 12 13 14
I'm looking for a way to do the inverse operation, e.g. given a vector of integers I want to produce a character (or character vector) that collapses the integer sequence(s) to the equivalent expressions using :
, e.g.
some_function (a)
## "1:3" "12:14"
Bonus if the stride can be detected, e.g. c(1, 3, 5)
becomes "1:2:5"
or something like that.
Motivation: generate an integer sequence in R based on some data manipulation to identify database row selection, and pass the most concise representation of that sequence to an external program in the proper format.
Upvotes: 1
Views: 80
Reputation: 79338
We can be able to take into consideration the rle
of the differences and paste the range
together taking into consideration the sequence distance.
fun=function(s){
m=c(0,diff(s))
b=rle(m)
b$values[b$lengths==1&b$values!=1]=0
l=cumsum(!inverse.rle(b))
d=function(x)paste0(range(x[,1]),
collapse = paste0(":",unique(x[-1,-1]),":"))
f=c(by(cbind(s,m),l,d))
sub("::.*","",sub(":1:",":",f))
}
fun(c(1,1:3,12:14,c(1,3,5)))
1 2 3 4
"1" "1:3" "12:14" "1:2:5"
fun(c(1, 3, 5, 8:10, 14, 17, 20))
1 2 3
"1:2:5" "8:10" "14:3:20"
fun(1)
1
"1"
Upvotes: 1
Reputation:
Ah, nerd heaven. Here's a first shot. You could even use this for encoding within R.
Needs testing; code always prints the stride out.
encode_ranges <- function (x) {
rle_diff <- list(
start = x[1],
rled = rle(diff(x))
)
class(rle_diff) <- "rle_diff"
rle_diff
}
decode_ranges <- function (x) {
stopifnot(inherits(x, "rle_diff"))
cumsum(c(x$start, inverse.rle(x$rled)))
}
format.rle_diff <- function (x, ...) {
stopifnot(inherits(x, "rle_diff"))
output <- character(length(x$rled$values))
start <- x$start
for (j in seq_along(x$rled$values)) {
stride <- x$rled$values[j]
len <- x$rled$lengths[j]
if (len == 1L) {
start <- end + stride
next
}
end <- start + stride * x$rled$lengths[j]
output[j] <- paste(start, end, stride, sep = ":")
}
output <- output[nchar(output) > 0]
paste(output, collapse = ", ")
}
print.rle_diff <- function (x, ...) cat(format(x, ...))
encode_ranges(c(1:3, 12:14))
encode_ranges(c(1, 3, 5, 8:10, 14, 17, 20))
Upvotes: 1
Reputation: 887891
We create a grouping variable with diff
and cumsum
, then use on the group by functions to paste
the range
of values
f1 <- function(vec) {
unname(tapply(vec, cumsum(c(TRUE, diff(vec) != 1)),
FUN = function(x) paste(range(x), collapse=":")))
}
f1(a)
#[1] "1:3" "12:14"
For the second case
b <- c(1, 3, 5)
un1 <- unique(diff(c(1, 3, 5)))
paste(b[1], un1, b[length(b)], sep=":")
#[1] "1:2:5"
Upvotes: 0