Reputation: 5413
A periodic sequence is a sequence that repeats itself after n terms. For example, the following is a periodic sequence:
1, 2, 3, 1, 2, 3, 1, 2, 3, ...
We define the period of such a sequence to be the number of terms in the repeating subsequence (here the subsequence is 1, 2, 3), so the period of the sequence above is 3.
In R, I can define the above sequence (albeit not to infinity), using:
sequence <- rep(c(1,2,3),n) #n is a predefined variable
So if n = 50, sequence will be the sequence 1, 2, 3, 1, 2, 3, ... , 1, 2, 3, where each number appears 50 times, in the obvious way.
I am looking to build a function that calculates the periodicity of sequence. Pseudocode is as follows:
period <- function(sequence){
  subsequence <- subsequence(sequence)    #identify the subsequence
  len.subsequence <- length(subsequence)  #calculate its length
  return(len.subsequence)                 #return it
}
How would I identify the subsequence? This is sort of a reversal of the rep function: I pass in a sequence and it returns the length of the initial vector.
Upvotes: 5
Views: 5033
Reputation: 193647
Building on the lead from @DWin, you can probably make a function something like this:
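subsequence <- function(data) {
  ii <- 0
  while (TRUE) {
    ii <- ii + 1
    # LAG is minus the number of positions where data differs from itself shifted by ii
    LAG <- sum((diff(data, lag = ii) == 0) - 1)
    # LAG == 0 means every element matches the one ii places earlier
    if (LAG == 0) { break }
  }
  list(Period = ii,
       Sequence = data[1:ii],
       Reps = length(data)/ii)
}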
Note: This was my first time using while(), so I'm not sure if there's a better way to implement it.
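One possible alternative to the while() loop, as a minimal sketch rather than a definitive implementation: only lags that divide length(data) evenly are tried, so the number of repetitions is always whole. The name period_divisors is hypothetical, and the sketch assumes data is non-empty, contains no NA values, and consists of complete repetitions of the pattern.

```r
# Hypothetical helper: smallest divisor of length(data) whose
# leading block, repeated, rebuilds the whole vector
period_divisors <- function(data) {
  n <- length(data)
  # Candidate periods: divisors of n, in increasing order
  candidates <- Filter(function(k) n %% k == 0, seq_len(n))
  # k is a period if repeating the first k terms n / k times gives data back
  is_period <- function(k) all(rep(data[seq_len(k)], n / k) == data)
  # Find() returns the first candidate for which is_period() is TRUE
  Find(is_period, candidates)
}

period_divisors(rep(c(1, 2, 3), 50))
period_divisors(c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4))
```

Because length(data) always divides itself, a vector with no shorter repeating block simply gets its own length back as the period.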
Here is some data; s3 is non-monotonic:
s1 <- rep(c(1,2,3), 3)
s2 <- rep(c(1,2,3), 50)
s3 <- c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4)
Here are the results of the subsequence() function.
subsequence(s1)
# $Period
# [1] 3
#
# $Sequence
# [1] 1 2 3
#
# $Reps
# [1] 3
subsequence(s2)
# $Period
# [1] 3
#
# $Sequence
# [1] 1 2 3
#
# $Reps
# [1] 50
subsequence(s3)
# $Period
# [1] 7
#
# $Sequence
# [1] 1 2 3 4 2 3 4
#
# $Reps
# [1] 2
Upvotes: 1
Reputation: 263451
It's fairly easy with that sequence, although I would avoid using the name 'sequence', since it is the name of an R function. The following identifies the periodicity of any sequence that increases monotonically within each cycle, so it's a bit more general, but it would not identify a sequence like 1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4, ...
> which(diff(seQ) < 0)
[1] 3 6 9 12 15 18 21 24 27
> diff(which(diff(seQ) < 0) )
[1] 3 3 3 3 3 3 3 3
You could test the equality of the intervals, or use either of those results to index the original vector. You should test your answers with c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4) to see if they pass the test of identifying a non-monotonic repetition. So far none of them does, since none reports a period of 7.
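The interval test described above could be wrapped up roughly as follows. This is only a sketch: seQ is assumed to be rep(c(1, 2, 3), 10), which is consistent with the console output shown but is not stated in the answer, and period_by_drops is a hypothetical name.

```r
# Assumed example data, chosen to match the console output above
seQ <- rep(c(1, 2, 3), 10)

# Hypothetical helper: period of a sequence that increases within each cycle
period_by_drops <- function(x) {
  drops <- which(diff(x) < 0)   # positions where the sequence resets
  gaps <- diff(drops)           # distances between consecutive resets
  # Report a period only when every gap is identical
  if (length(gaps) > 0 && all(gaps == gaps[1])) gaps[1] else NA
}

p <- period_by_drops(seQ)
p
seQ[seq_len(p)]   # index the original vector to recover the repeating block

# The non-monotonic test vector has unequal gaps between resets
period_by_drops(c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4))
```

For seQ this gives a period of 3, and for the non-monotonic test vector it gives NA, in line with the limitation noted above. The check looks only at the spacing of the resets, not at whether the values themselves repeat, so it is a heuristic rather than a proof of periodicity.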
Upvotes: 1
Reputation: 21532
If the period is always the same, i.e. the sequence never changes, then you could use a loop over lag to see when a match occurs.
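A loop over lag might look something like the sketch below. The name period_by_lag is hypothetical, and the sketch assumes the vector has no NA values and contains at least two full cycles; otherwise it returns NA.

```r
# Hypothetical helper: smallest lag at which x matches a shifted copy of itself
period_by_lag <- function(x) {
  n <- length(x)
  for (lag in seq_len(n %/% 2)) {   # lags up to n/2, i.e. at least two cycles
    # Compare the first n - lag elements with the last n - lag elements
    if (all(head(x, n - lag) == tail(x, n - lag))) return(lag)
  }
  NA                                # no lag matched
}

period_by_lag(rep(c(1, 2, 3), 50))
period_by_lag(c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4))
period_by_lag(c(1, 2, 3, 1, 2, 3, 1))   # final cycle is incomplete
```

Comparing the head and tail of the vector, rather than rebuilding it with rep, means a truncated final cycle does not prevent a match.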
With total bias, I would also recommend using seqle (guess who wrote that function :-) ), which is like rle but finds sequences. See also "detect intervals of the consequent integer sequences"; I'm not the only person to edit the source for "rle" that way.
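To illustrate the idea of an rle-like function that finds sequences, here is a rough base R sketch. It is not the actual seqle code: consecutive_runs is a hypothetical name, the return format is only modelled loosely on rle, and the input is assumed to be a non-empty numeric vector without NA values.

```r
# Base R: rle() collapses runs of equal values
rle(c(1, 1, 1, 2, 2, 5))

# Hypothetical helper (not the actual seqle): runs of consecutive integers
consecutive_runs <- function(x) {
  # A new run starts wherever the step from the previous value is not exactly 1
  run_id <- cumsum(c(TRUE, diff(x) != 1))
  starts <- as.vector(tapply(x, run_id, function(v) v[1]))
  lengths <- as.vector(tapply(x, run_id, length))
  list(lengths = lengths, values = starts)
}

consecutive_runs(c(1, 2, 3, 1, 2, 3, 1, 2, 3))
```

For a sequence like the one in the question, every run starts at the same value and has the same length, and that common length is the period.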
Upvotes: 5