Reputation: 781
Given a generic vector x
(which could be numeric, character, factor, etc.), I need to be able to count the number of sequential occurrences of value
, including singletons.
x <- c(0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1)
In this case, the following would be returned if value == 0
.
[1] 1 1 2 3
This code below works, but it is very slow when x
gets large. Does anyone know of a way to speed this up? I imagine there must be a clever vectorized way of doing this.
getSequential <- function(x, value) {
counts <- c()
last <- FALSE
for (i in 1:length(x)) {
if (last & x[i] == value) {
counts[length(counts)] <- counts[length(counts)] + 1
} else if (x[i] == value) {
counts <- c(counts, 1)
last <- TRUE
} else {
last <- FALSE
}
}
return(counts)
}
Upvotes: 0
Views: 166
Reputation: 1006
Well, the code is quite good. I doubt using rle
and with
or which
together would increase speed of the algorithm (much).
My proposition:
counting(int[] input, int value) {
int[] len = int[size(input)](); \\assume worst case scenario
len[0] = 0;
int j = 0;
for (i = 0; i < size(input); i++) { \\2 unit operations
if (input[i] != value && len[j] == 0) \\followed relatively often, 3-4 unit operations (1-2 unit operations on this step)
continue; \\5 unit operations
else if (input[1] == value) \\also followed relatively often, 4 unit operations (on '&&' if first is false, second is not checked)
len[j]++; \\5 unit operations
else /*if (input[i] != value && len[j] != 0)*/ { \\4 unit operations (commented operation is not needed (only remaining possible outcome)
j++; \\5 unit operations
len[j] = 0; \\7 unit operations (access and substitution)
continue; \\8 unit operations
}
}
}
As you can see there are at most 8 unit operations, and at best 4-5. Worst case scenario is that there are n/2 paths that go through 8 operations, but most of the cases I suppose it would follow one of the 5 step path.
Well, perhaps rle
and those other functions are better optimized, but question is, is it optimized for your problem? I recommend checking.
Upvotes: 0
Reputation: 14192
you can use rle
rle(x)$lengths[which(rle(x)$values==0)]
# 1 1 2 3
for speed, you could only run rle
once:
x1 <- rle(x)
x1$lengths[which(x1$values==0)]
Upvotes: 2