tsWIDE

Reputation: 25

Conditional interpolation of time series data in R

I have time series data containing N/As. The data will end up in an animated scatterplot:

Week    X   Y
 1      1   105
 2      3   110
 3      5   N/A
 4      7   130
 8     15   160
12     23   180
16     30   N/A
20     37   200

For a smooth animation, the data will be supplemented by calculated, additional values/rows. For the X values this is simply arithmetical. No problem so far.

Week    X   Y
 1      1   105
        2
 2      3   110
        4
 3      5   N/A
        6
 4      7   130
        8
        9
       10
       11
       12
       13
       14
 8     15   160
       16
       17
       18
       19
       20
       21
       22
12     23   180
       24
       25
       26
       27
       28
       29
16     30   N/A
       31
       32
       33
       34
       35
       36
20     37   200

The Y values should be interpolated, with one additional requirement: interpolation should only happen between two consecutive known values, not between values that have an N/A between them.

Week    X   Value
 1      1   105
        2   interpolated value
 2      3   110
        4
 3      5   N/A
        6
 4      7   130
        8   interpolated value
        9   interpolated value
       10   interpolated value
       11   interpolated value
       12   interpolated value
       13   interpolated value
       14   interpolated value
 8     15   160
       16   interpolated value
       17   interpolated value
       18   interpolated value
       19   interpolated value
       20   interpolated value
       21   interpolated value
       22   interpolated value
12     23   180
       24
       25
       26
       27
       28
       29
16     30   N/A
       31
       32
       33
       34
       35
       36
20     37   200

I have already experimented with approx, converted the "original" N/As to placeholder values, and tried the zoo package with na.approx etc., but I can't work out how to express the right condition for this kind of "conditional approximation" or "conditional gap filling". Any hint is welcome and much appreciated.

Thanks in advance

Upvotes: 0

Views: 286

Answers (1)

G. Grothendieck

Reputation: 270045

Replace the NAs with Inf, interpolate and then revert infinite values to NA.

library(zoo)

DF2 <- DF
DF2$Y[is.na(DF2$Y)] <- Inf

w <- merge(DF2, data.frame(Week = min(DF2$Week):max(DF2$Week)), by = 1, all.y = TRUE)
w$Value <- na.approx(w$Y)
w$Value[!is.finite(w$Value)] <- NA

giving the following, where Week has been expanded to all weeks. In Y, the original NAs are shown as Inf and the NAs inserted by the merge as NA. Value is the interpolated Y.

> w
   Week  X   Y Value
1     1  1 105 105.0
2     2  3 110 110.0
3     3  5 Inf    NA
4     4  7 130 130.0
5     5 NA  NA 137.5
6     6 NA  NA 145.0
7     7 NA  NA 152.5
8     8 15 160 160.0
9     9 NA  NA 165.0
10   10 NA  NA 170.0
11   11 NA  NA 175.0
12   12 23 180 180.0
13   13 NA  NA    NA
14   14 NA  NA    NA
15   15 NA  NA    NA
16   16 30 Inf    NA
17   17 NA  NA    NA
18   18 NA  NA    NA
19   19 NA  NA    NA
20   20 37 200 200.0
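
Why the Inf marker works can be seen from the usual two-point linear interpolation formula: any segment that touches an Inf endpoint yields Inf or NaN, and both are caught by the final `!is.finite` step. A minimal sketch, where `lerp` is a hypothetical helper written only for illustration (it is not part of zoo):

```r
# Linear interpolation at fraction t between endpoints y0 and y1
# (lerp is a made-up name for this illustration)
lerp <- function(y0, y1, t) y0 + (y1 - y0) * t

a <- lerp(130, 160, 0.25)  # both endpoints finite: ordinary interpolation
b <- lerp(180, Inf, 0.25)  # right endpoint is Inf: result is Inf
d <- lerp(Inf, 200, 0.25)  # left endpoint is Inf: Inf + (-Inf) is NaN

# Neither Inf nor NaN is finite, so both get reset to NA afterwards
is.finite(c(a, b, d))
```

Only the first case survives, which is why weeks 13 to 19 stay NA in the output above.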

Note: Input DF in reproducible form:

Lines <- "
Week    X   Y
 1      1   105
 2      3   110
 3      5   N/A
 4      7   130
 8     15   160
12     23   180
16     30   N/A
20     37   200"
DF <- read.table(text = Lines, header = TRUE, na.strings = "N/A")
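
The output above is indexed by Week, whereas the target table in the question has one row per X value. As a sketch under that assumption, the same Inf trick could be applied along X instead. The grid `min(X):max(X)`, the use of base R `approx()` in place of `na.approx`, and the name `res` are choices made here for illustration, not part of the original answer:

```r
# Same data as DF above, written out so the snippet runs on its own
DF <- data.frame(
  Week = c(1, 2, 3, 4, 8, 12, 16, 20),
  X    = c(1, 3, 5, 7, 15, 23, 30, 37),
  Y    = c(105, 110, NA, 130, 160, 180, NA, 200)
)

y <- DF$Y
y[is.na(y)] <- Inf                        # original N/As become Inf knots

xout  <- min(DF$X):max(DF$X)              # one row per X value
value <- approx(DF$X, y, xout = xout)$y   # linear interpolation along X
value[!is.finite(value)] <- NA            # turn Inf and NaN back into NA

# Week is filled only on the original rows, NA on the inserted ones
res <- data.frame(Week = DF$Week[match(xout, DF$X)], X = xout, Value = value)
```

With this, X = 2 and X = 8 to 22 should receive interpolated values, while X = 4 to 6 and X = 24 to 36 stay NA because each of those segments touches an original N/A.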

Upvotes: 1
