Bethan Huish

Reputation: 43

How to predict the next number in a sequence in R

I wish to create a function to predict the next number in geometric sequences such as these, or those with any other common ratio:

1 2 4 8 16 32 64
2 4 8 16 32 64 128
3 6 12 24 48 96 192

1 3 9 27 81 243 729
2 6 18 54 162 486 1458
3 9 27 81 243 729 2187

I've tried using this method (How to get next number in sequence in R); however, it only seems to work with linear sequences. Also, how could an if statement be implemented to check whether the sequence is geometric and not some other type, e.g. linear?

Upvotes: 2

Views: 1281

Answers (2)

G. Grothendieck

Reputation: 269371

For a geometric series the ratio of successive values is constant so multiplying that ratio by the current value gives the next value.

To check whether the series is geometric we can take the ratio of each successive pair of values in the series; if those ratios are all equal, the series is geometric. Since that is equivalent to checking whether their variance is zero, we can do it easily using var. Since floating point arithmetic is not exact, we check whether the variance is less than eps.

Note that is.geo returns NA for a series of length 1 or 2 and nextValue returns NA if is.geo does not return TRUE.

nextValue <- function(x) {
  if (!isTRUE(is.geo(x))) NA
  else {
    y <- tail(x, 2)
    y[2]^2 / y[1]
  }
}

is.geo <- function(x, eps = 1e-5) var(x[-1] / x[-length(x)]) < eps

Test

Using m defined in the Note at the end we can append the next value to it as a new column:

cbind(m, apply(m, 1, nextValue))

giving:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    1    2    4    8   16   32   64  128
[2,]    2    4    8   16   32   64  128  256
[3,]    3    6   12   24   48   96  192  384
[4,]    1    3    9   27   81  243  729 2187
[5,]    2    6   18   54  162  486 1458 4374
[6,]    3    9   27   81  243  729 2187 6561

We can also test each row of m to check whether it is geometric:

apply(m, 1, is.geo)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE

is.geo(c(1, 2, 4, 12))
## [1] FALSE
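To address the if-statement part of the question, the same variance test can be applied to successive differences as well as ratios. The following is only a sketch: seq_type is a hypothetical helper, not part of the answer above, and it assumes a sequence of at least three values.

```r
# Hypothetical helper: label a sequence using the variance of its
# successive ratios (geometric) or successive differences (arithmetic).
seq_type <- function(x, eps = 1e-5) {
  if (length(x) < 3) return(NA_character_)  # too short to decide
  r <- x[-1] / x[-length(x)]                # successive ratios
  d <- diff(x)                              # successive differences
  if (isTRUE(var(r) < eps)) {
    "geometric"
  } else if (isTRUE(var(d) < eps)) {
    "arithmetic"
  } else {
    "other"
  }
}

seq_type(c(1, 2, 4, 8))    # "geometric"
seq_type(c(1, 3, 5, 7))    # "arithmetic"
seq_type(c(1, 2, 4, 12))   # "other"
```

A constant sequence satisfies both tests; this sketch reports it as geometric because the ratio test runs first.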

Using lm

If the method in the link shown in the question means using lm, then we can use lm when the series is strictly positive by noting that the log of such a geometric series is arithmetic, so we can regress the log of the series on 1, 2, 3, ... . If the residuals are all zero, which occurs exactly when the deviance is zero, then the series is geometric.

fit <- function(x) {
    ix <- seq_along(x)
    lm(log(x) ~ ix)
}

nextValue2 <- function(x) {
  if (!isTRUE(is.geo2(x))) NA
  else exp( predict(fit(x), list(ix = length(x) + 1)) )
}

is.geo2 <- function(x, eps = 1.e-5) {
  if (length(x) <= 2) NA
  else deviance(fit(x)) < eps
}
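As a quick illustration of why the log-linear fit works, the slope of the fitted line on the log scale is the log of the common ratio. The sequence and the variable names below (x, fm, ratio, nxt) are made up for this sketch.

```r
# Example data (assumed): a geometric sequence with ratio 2.
x <- c(3, 6, 12, 24, 48)
ix <- seq_along(x)

# log(x) is linear in the index, so a straight-line fit is exact here.
fm <- lm(log(x) ~ ix)

# The slope on the log scale is log(ratio).
ratio <- exp(coef(fm)[["ix"]])

# Predict at the next index and transform back from the log scale.
nxt <- exp(predict(fm, list(ix = length(x) + 1)))

ratio   # approximately 2
nxt     # approximately 96
```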

Note

m <- matrix(c(1L, 2L, 3L, 1L, 2L, 3L, 2L, 4L, 6L, 3L, 6L, 9L, 4L, 
8L, 12L, 9L, 18L, 27L, 8L, 16L, 24L, 27L, 54L, 81L, 16L, 32L, 
48L, 81L, 162L, 243L, 32L, 64L, 96L, 243L, 486L, 729L, 64L, 128L, 
192L, 729L, 1458L, 2187L), 6)

Upvotes: 2

OmG

Reputation: 18838

If it is just a geometric sequence, you can find the factor by factor <- seq[2]/seq[1]. If you don't know the type of sequence, finding the formula in the general case is not possible.

However, if you know the general form of the sequence, its unknown parameters can be computed from a few terms of the sequence. For example, for a geometric sequence we know that a_n = factor * a_{n-1}. Substituting known terms into this gives an equation in one variable, so factor = a_n / a_{n-1}.
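Once the factor is known, the closed form a_n = a_1 * factor^(n-1) gives any later term, not just the next one. A minimal sketch with made-up data; the names a, ratio and future are illustrative only:

```r
# Example data (assumed): geometric sequence with first term 2.
a <- c(2, 6, 18, 54)

# One-variable equation: ratio = a_n / a_{n-1}, here from the first two terms.
ratio <- a[2] / a[1]

# Closed form a_n = a_1 * ratio^(n - 1), evaluated for terms 5 to 7.
n <- 5:7
future <- a[1] * ratio^(n - 1)

future   # 162 486 1458
```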

For another example, suppose we know that the sequence has the form a_n = alpha * a_{n-1} + beta * a_{n-2}. Now we can find alpha and beta using four terms of the sequence (a_1, a_2, a_3, and a_4).
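Four terms give two linear equations in alpha and beta, which base R's solve can handle. This is a sketch on made-up Fibonacci-style data; note that solve will fail when the system is singular, which happens for instance when the sequence is purely geometric.

```r
# Example data (assumed): first four terms of a Fibonacci-like sequence.
a <- c(1, 1, 2, 3)

# a_3 = alpha * a_2 + beta * a_1
# a_4 = alpha * a_3 + beta * a_2
A <- rbind(c(a[2], a[1]),
           c(a[3], a[2]))
b <- c(a[3], a[4])

coefs <- solve(A, b)   # c(alpha, beta)
alpha <- coefs[1]
beta  <- coefs[2]

# Apply the recurrence once more to get a_5.
next_term <- alpha * a[4] + beta * a[3]

c(alpha = alpha, beta = beta)   # both approximately 1
next_term                       # approximately 5
```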

For the final case, you may have a general form of the sequence with no unknown parameters, such as a_n = a_{n-1} + n. If so, you can predict the next term directly from the preceding terms that the formula requires.
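For a recurrence such as a_n = a_{n-1} + n, only the last term and its position are needed. A small sketch, where next_term is a hypothetical helper and the data assume the sequence starts at a_1:

```r
# Hypothetical helper for the recurrence a_n = a_{n-1} + n,
# assuming the vector holds a_1, ..., a_{n-1} in order.
next_term <- function(a) {
  n <- length(a) + 1   # index of the term being predicted
  tail(a, 1) + n
}

# Example data (assumed): a_1 = 1, then 1 + 2, 3 + 3, 6 + 4.
a <- c(1, 3, 6, 10)
next_term(a)   # 15
```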

Upvotes: 0
