Reputation: 47
I have looked at the imputeTS and zoo packages, but they do not seem to work. My current data is:
group/timeseries(character)
1 2017-05-17 04:00:00
1 2017-05-17 04:01:00
1 NA
1 NA
1 2017-05-17 05:00:00
1 2017-05-17 06:00:00
2 NA
2 2017-05-17 04:31:00
2 NA
2 NA
2 NA
2 2017-05-17 05:31:00
I would like to fill in the NAs by interpolating the time series, so that each missing time is the midpoint of the rows before and after it. I should also point out that each time series belongs to a group, meaning the time resets for each group.
I will provide a picture of the actual data to make this clearer.
Thanks for the help in advance!
Upvotes: 1
Views: 1412
Reputation: 7730
imputeTS and zoo do not take characters or timestamps as input for their interpolation functions (interpolating characters usually does not make sense).
You can, however, pass characters to the na.locf function of zoo, which carries the last observation forward.
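As a small illustration of that point, here is a sketch of na.locf applied to a character vector. The vector x is made up for the example; note that this only repeats the previous value and does not interpolate.

```r
library(zoo)

# Hypothetical character timestamps with a gap in the middle
x <- c("2017-05-17 04:00:00", NA, NA, "2017-05-17 05:00:00")

# Carry the last observation forward; na.rm = FALSE keeps the length
# unchanged even if the vector were to start with NA
filled_locf <- na.locf(x, na.rm = FALSE)
filled_locf
```

Both missing entries become "2017-05-17 04:00:00", which is why this is not what you want for a midpoint-style fill.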
The best solution for your task should be the following (I am assuming your dates are stored as POSIXct):
# Perform the imputation on numeric input (seconds since 1970-01-01 UTC)
temp <- imputeTS::na_interpolation(as.numeric(input))
# Transform the numeric values back to dates
as.POSIXct(temp, origin = "1970-01-01", tz = "UTC")
Here "input" in the first line is your vector of POSIXct timestamps. as.numeric() on a POSIXct vector returns seconds since 1970-01-01 UTC, so the origin in the second line has to be "1970-01-01"; set tz (timezone) to match your timestamps.
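Since the question says the column is stored as character and the series resets per group, here is a minimal base R sketch of the same numeric round trip done group by group. It swaps in stats::approx for the imputeTS call so that it runs without extra packages; the helper name interp and the column timeseries_filled are illustrative, and tz = "UTC" is an assumption about your data.

```r
# Example data mirroring the question (timestamps stored as character)
df <- data.frame(
  group = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  timeseries = c("2017-05-17 04:00:00", "2017-05-17 04:01:00", NA, NA,
                 "2017-05-17 05:00:00", "2017-05-17 06:00:00",
                 NA, "2017-05-17 04:31:00", NA, NA, NA,
                 "2017-05-17 05:31:00"),
  stringsAsFactors = FALSE
)

# character -> POSIXct -> seconds since 1970-01-01 UTC
secs <- as.numeric(as.POSIXct(df$timeseries, tz = "UTC"))

# Linear interpolation over row position within one group.
# Leading or trailing NAs are left as NA (approx default, rule = 1).
interp <- function(v) {
  ok <- !is.na(v)
  if (sum(ok) < 2) return(v)  # nothing to interpolate between
  approx(x = which(ok), y = v[ok], xout = seq_along(v))$y
}

# ave() applies interp to each group and keeps the original row order
filled <- ave(secs, df$group, FUN = interp)

# Back to timestamps
df$timeseries_filled <- as.POSIXct(filled, origin = "1970-01-01", tz = "UTC")
df
```

With this data the two gaps in group 1 come out as 04:20:40 and 04:40:20, i.e. evenly spaced between 04:01:00 and 05:00:00. For a single missing row this is exactly the midpoint of its neighbours. The first row of group 2 stays NA because there is no earlier value to interpolate from.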
Upvotes: 1
Reputation: 2636
na.approx in the zoo package can do this, and the grouping can be handled without loops using either tapply in base R or a group operation in data.table.
For your data set
df <- read.table(text=c("
group timeseries
1 '2017-05-17 04:00:00'
1 '2017-05-17 04:01:00'
1 NA
1 NA
1 '2017-05-17 05:00:00'
1 '2017-05-17 06:00:00'
2 NA
2 '2017-05-17 04:31:00'
2 NA
2 NA
2 NA
2 '2017-05-17 05:31:00'
"),
colClasses = c("integer", "POSIXct"),
header = TRUE)
Write a function that coerces the vector to a zoo object, interpolates the NAs, and extracts the result
library(zoo)
foo <- function(x) coredata(na.approx(zoo(x), na.rm = FALSE))
Example using tapply in base R to apply foo to each group
df2 <- df #make a copy
df2$timeseries <- do.call(c, tapply(df2$timeseries, INDEX = df2$group, foo))
Example using group by in data.table to apply foo to each group
library(data.table)
DT <- data.table(df)
DT[, timeseries := foo(timeseries), by = "group"]
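One thing to watch for with this data: group 2 starts with an NA, and linear interpolation has nothing to the left of it, so with na.rm = FALSE that row stays NA. The sketch below shows the behaviour on a plain numeric vector (the values are made up), together with rule = 2, which na.approx passes through to approx and which extends the nearest observed value outward. Whether that kind of constant fill is acceptable for your timestamps is a judgment call.

```r
library(zoo)

# Hypothetical values with a leading NA, like group 2 in the question
v <- c(NA, 10, NA, NA, NA, 50)

# Interior gaps are interpolated; the leading NA is kept
kept <- na.approx(v, na.rm = FALSE)

# rule = 2 also fills the leading NA with the nearest observed value
extended <- na.approx(v, na.rm = FALSE, rule = 2)

kept      # NA 10 20 30 40 50
extended  # 10 10 20 30 40 50
```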
Upvotes: 1