sqwish

Reputation: 79

data interpolation conditional on gap length

I would like to interpolate a time series using spline methods, with a 'gap tolerance': if there are more than x consecutive days of NA, those values should remain NA and not be interpolated. In my example, let's say that if there are more than three consecutive days of NAs, I will not interpolate. Example data:

x <- seq(as.Date("2016-01-01"), as.Date("2016-01-31"), by = "day")
y <- c(0.45062130, 0.51136174, NA, NA, 0.29481738, NA, 0.27713756, 0.62638512,
       0.23547530, 0.29253901, 0.75899501, 0.67779756, 0.51831742, 0.08050147,
       0.71183739, NA, 0.79406706, NA, 0.03434758, 0.59573892, 0.22102821,
       0.13154414, NA, NA, NA, NA, 0.23692593, 0.95215104, 0.38810846,
       0.17970580, 0.05176054)

df <- data.frame(x, y)

> df
            x          y
 2016-01-01 0.45062130
 2016-01-02 0.51136174
 2016-01-03         NA
 2016-01-04         NA
 2016-01-05 0.29481738
 2016-01-06         NA
 2016-01-07 0.27713756
 2016-01-08 0.62638512
 2016-01-09 0.23547530
 2016-01-10 0.29253901
 2016-01-11 0.75899501
 2016-01-12 0.67779756
 2016-01-13 0.51831742
 2016-01-14 0.08050147
 2016-01-15 0.71183739
 2016-01-16         NA
 2016-01-17 0.79406706
 2016-01-18         NA
 2016-01-19 0.03434758
 2016-01-20 0.59573892
 2016-01-21 0.22102821
 2016-01-22 0.13154414
 2016-01-23         NA
 2016-01-24         NA
 2016-01-25         NA
 2016-01-26         NA
 2016-01-27 0.23692593
 2016-01-28 0.95215104
 2016-01-29 0.38810846
 2016-01-30 0.17970580
 2016-01-31 0.05176054

One thought I had was to create two new data frames: the first completely interpolated, the second with the NAs within the gap tolerance removed, and then merge the two. Is there a better way to do this?
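For reference, the "interpolate everything, then put the long gaps back" idea can be sketched in base R without building two data frames. This is only a rough sketch: `fill_short_gaps` is a hypothetical helper name, the toy vector is made up for illustration, and it assumes `spline()` with its default method is an acceptable interpolator. Note that `spline()` will also extrapolate leading or trailing NAs if the run is within the tolerance.

```r
# Hypothetical helper (not from any package): spline-fill NAs, but keep
# any run of more than `maxgap` consecutive NAs as NA.
fill_short_gaps <- function(x, y, maxgap = 3) {
  xn <- as.numeric(x)              # Dates become day counts
  ok <- !is.na(y)

  # Step 1: interpolate every NA from the observed points
  filled <- y
  filled[!ok] <- spline(xn[ok], y[ok], xout = xn[!ok])$y

  # Step 2: for each element, the length of the run it belongs to
  runs <- rle(is.na(y))
  run_len <- rep(runs$lengths, runs$lengths)

  # Step 3: restore NA wherever the NA run exceeds the tolerance
  filled[is.na(y) & run_len > maxgap] <- NA
  filled
}

# Toy data (made up): one gap of length 1 and one gap of length 4
x_toy <- seq(as.Date("2016-01-01"), by = "day", length.out = 10)
y_toy <- c(1, NA, 3, 4, NA, NA, NA, NA, 9, 10)

filled_toy <- fill_short_gaps(x_toy, y_toy, maxgap = 3)
```

With the data frame from the question, the call would be `df$y <- fill_short_gaps(df$x, df$y, maxgap = 3)`.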

My desired data set would look like this:

> df
            x          y
 2016-01-01 0.45062130
 2016-01-02 0.51136174
 2016-01-03 0.35684617
 2016-01-04 0.30481738
 2016-01-05 0.29481738
 2016-01-06 0.28481738
 2016-01-07 0.27713756
 2016-01-08 0.62638512
 2016-01-09 0.23547530
 2016-01-10 0.29253901
 2016-01-11 0.75899501
 2016-01-12 0.67779756
 2016-01-13 0.51831742
 2016-01-14 0.08050147
 2016-01-15 0.71183739
 2016-01-16 0.75158886
 2016-01-17 0.79406706
 2016-01-18 0.21584455
 2016-01-19 0.03434758
 2016-01-20 0.59573892
 2016-01-21 0.22102821
 2016-01-22 0.13154414
 2016-01-23         NA
 2016-01-24         NA
 2016-01-25         NA
 2016-01-26         NA
 2016-01-27 0.23692593
 2016-01-28 0.95215104
 2016-01-29 0.38810846
 2016-01-30 0.17970580
 2016-01-31 0.05176054

Upvotes: 1

Views: 162

Answers (1)

G. Grothendieck

Reputation: 270045

Try na.spline in the zoo package. Its maxgap argument sets the maximum number of consecutive NAs to fill; longer runs are left as NA. (fortify.zoo(z) will convert z back to a data frame, although you may prefer to keep it in zoo form to take advantage of the other facilities there.) Also check out the other na.* functions in zoo.

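library(zoo)
# maxgap = 3 matches the question's tolerance: runs of up to three NAs are
# filled, longer runs stay NA (this data has no run of exactly three)
z <- na.spline(zoo(y, x), maxgap = 3)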

giving:

> z
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05 2016-01-06 2016-01-07 
0.45062130 0.51136174 0.50365727 0.43252778 0.29481738 0.14613360 0.27713756 
2016-01-08 2016-01-09 2016-01-10 2016-01-11 2016-01-12 2016-01-13 2016-01-14 
0.62638512 0.23547530 0.29253901 0.75899501 0.67779756 0.51831742 0.08050147 
2016-01-15 2016-01-16 2016-01-17 2016-01-18 2016-01-19 2016-01-20 2016-01-21 
0.71183739 1.06652092 0.79406706 0.17526465 0.03434758 0.59573892 0.22102821 
2016-01-22 2016-01-23 2016-01-24 2016-01-25 2016-01-26 2016-01-27 2016-01-28 
0.13154414         NA         NA         NA         NA 0.23692593 0.95215104 
2016-01-29 2016-01-30 2016-01-31 
0.38810846 0.17970580 0.05176054 
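If a data frame is needed afterwards, the fortify.zoo step mentioned above might look like the following. This is a sketch on a small made-up series rather than the question's data, and the column names `x` and `y` are just chosen to match the original data frame.

```r
library(zoo)

# Toy data (made up): one gap of length 1 and one gap of length 4
x_toy <- seq(as.Date("2016-01-01"), by = "day", length.out = 10)
y_toy <- c(1, NA, 3, 4, NA, NA, NA, NA, 9, 10)

# Fill runs of up to three NAs; the four-day gap is left as NA
z_toy <- na.spline(zoo(y_toy, x_toy), maxgap = 3)

# fortify.zoo() returns a data frame: an index column plus the series
df_filled <- fortify.zoo(z_toy)
names(df_filled) <- c("x", "y")   # rename to match the original layout
```

Keeping the result as a zoo object instead leaves the other na.* functions and zoo's merging tools available for later steps.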

Upvotes: 1
