dplanet

Reputation: 5413

Measure the periodicity of a sequence of numbers [R]

A periodic sequence is a sequence that repeats itself after n terms, for example, the following is a periodic sequence:

1, 2, 3, 1, 2, 3, 1, 2, 3, ...

We define the period of that sequence to be the number of terms in each repeating subsequence (here the subsequence is 1, 2, 3), so the period of the sequence above is 3.

In R, I can define the above sequence (albeit not to infinity), using:

sequence <- rep(c(1,2,3),n) #n is a predefined variable

So if n = 50, sequence will be the sequence 1, 2, 3, 1, 2, 3, ... , 1, 2, 3, where each number has appeared 50 times, in the obvious way.

I am looking to build a function that calculates the periodicity of sequence. Pseudocode is as follows:

period <- function(sequence){
    subsequence <- subsequence(sequence) #identify the subsequence
    len.subsequence <- length(subsequence) #calculate its length
    return(len.subsequence) #return it
}

How would I identify the subsequence? This is sort of the reverse of the rep function: I pass in a sequence and it returns the length of the initial vector.

Upvotes: 5

Views: 5033

Answers (3)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193647

Building on the lead from @DWin, you can probably make a function something like this:

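subsequence <- function(data) {
  ii <- 0
  while (TRUE) {
    ii <- ii + 1
    # Count the positions where data differs from itself shifted by ii
    mismatches <- sum(diff(data, lag = ii) != 0)
    # No mismatches: ii is the smallest lag at which the data repeats
    if (mismatches == 0) { break }
  }
  list(Period = ii,
       Sequence = data[1:ii],
       Reps = length(data)/ii)
}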

Note: This was my first time using while(), so I'm not sure if there's a better way to implement it.

Here is some data; s3 is non-monotonic:

s1 <- rep(c(1,2,3), 3)
s2 <- rep(c(1,2,3), 50)
s3 <- c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4)

Here are the results of the subsequence() function.

subsequence(s1)
# $Period
# [1] 3
# 
# $Sequence
# [1] 1 2 3
# 
# $Reps
# [1] 3

subsequence(s2)
# $Period
# [1] 3
# 
# $Sequence
# [1] 1 2 3
# 
# $Reps
# [1] 50

subsequence(s3)
# $Period
# [1] 7
# 
# $Sequence
# [1] 1 2 3 4 2 3 4
# 
# $Reps
# [1] 2
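As a possible alternative to the while() loop above, here is a minimal sketch that tests every candidate lag in one pass and picks the first one at which the data repeats. The name first_period is made up for illustration, and the sketch assumes a numeric vector without missing values.

```r
# Hypothetical helper: smallest lag at which `data` matches itself.
first_period <- function(data) {
  lags <- seq_along(data)
  # For each lag k, TRUE when every lag-k difference is zero.
  # At k == length(data), diff() returns an empty vector and all() is TRUE,
  # so a non-empty input always yields some answer.
  repeats <- vapply(lags, function(k) all(diff(data, lag = k) == 0), logical(1))
  lags[which(repeats)[1]]
}

first_period(rep(c(1, 2, 3), 50))                               # 3
first_period(c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4))       # 7
```

Unlike the loop, this checks all lags even after a match is found, so it trades some speed for brevity on long vectors.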

Upvotes: 1

IRTFM

Reputation: 263451

It's fairly easy with that sequence, although I would avoid using the name 'sequence' since it is an R function name. This would identify the periodicity of any monotonic sequence, so it's a bit more general, but it would not identify a sequence like: 1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4, ...

> seQ <- rep(c(1, 2, 3), 10)  # assumed definition; it reproduces the output below
> which(diff(seQ) < 0)
[1]  3  6  9 12 15 18 21 24 27
> diff(which(diff(seQ) < 0) )
[1] 3 3 3 3 3 3 3 3

You could test the equality of intervals or use either of those results to index the original vector. You should test your answers with c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4) to see if they pass the test of identifying a non-monotonic repetition. So far none of them do, since none reports a period of 7.
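A rough sketch of the "test the equality of intervals" idea, plus indexing the original vector with the first drop position. The helper name monotonic_period and the definition of seQ are assumptions for illustration; the last call shows that this approach does not handle the non-monotonic test vector.

```r
# Hypothetical helper: period of a sequence built from repeated increasing runs.
monotonic_period <- function(x) {
  drops <- which(diff(x) < 0)  # positions where the value falls, i.e. a run ends
  gaps <- diff(drops)          # distances between consecutive run ends
  # Report a period only when every gap is the same length
  if (length(gaps) > 0 && all(gaps == gaps[1])) gaps[1] else NA_integer_
}

seQ <- rep(c(1, 2, 3), 10)              # assumed test data
monotonic_period(seQ)                   # 3
seQ[seq_len(which(diff(seQ) < 0)[1])]   # 1 2 3, the repeating run

# The non-monotonic test vector has unequal gaps (3 and 4), so no period is reported
monotonic_period(c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4))  # NA
```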

Upvotes: 1

Carl Witthoft

Reputation: 21532

If the period is always the same, i.e. the sequence never changes, then you could use a loop over lag to see when a match occurs.
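One way to sketch such a loop, assuming the sequence has no missing values: for each candidate period p, recycle the first p terms with rep_len() and check whether that reproduces the whole vector. The function name period_by_lag is made up for this example.

```r
# Hypothetical helper: smallest p such that recycling x[1:p] rebuilds x.
period_by_lag <- function(x) {
  n <- length(x)
  for (p in seq_len(n)) {
    # Rebuild a length-n vector from the first p terms and compare with x
    if (isTRUE(all(x == rep_len(x[seq_len(p)], n)))) return(p)
  }
  NA_integer_  # reached only for empty input or when x contains NA
}

period_by_lag(rep(c(1, 2, 3), 50))                           # 3
period_by_lag(c(1, 2, 3, 4, 2, 3, 4, 1, 2, 3, 4, 2, 3, 4))   # 7
```

A sequence that never repeats gets a period equal to its own length, because the p == n comparison always matches.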

With total bias, I would also recommend using seqle (guess who wrote that function :-) ), which is like rle but finds sequences. See "detect intervals of the consequent integer sequences"; I'm not the only person to edit the source for "rle" that way.

Upvotes: 5
