the_chimp

Reputation: 255

R Interpolate values by group

I have a data frame of European states in which each state occurs 10 times (once per day for 10 days). I want to interpolate the NA values in multiple columns, which I can do for the whole data frame using

library("imputeTS")
na_interpolation(dataframe)

But I want to interpolate the NA values separately for each state. How can that be done? I have already tried many different solutions, but none of them worked for me.

As pseudo-code I would like to have something like

na_interpolation(dataframe, groupby=state)

Anything that could work?

Unfortunately, this attempt did not work for me:

interpolation <- dataframe %>% 
  group_by(state-name) %>% 
  na_interpolation(dataframe)

Upvotes: 1

Views: 600

Answers (3)

akrun

Reputation: 887058

An option with data.table

library(data.table)
setDT(dataframe)[, value := imputeTS::na_interpolation(value), by = state]
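
The question mentions several columns, so here is a minimal sketch of the same idea for more than one column at once. The example data and the column names in `cols` are assumptions for illustration; substitute the real ones.

```r
library(data.table)

# Hypothetical example data with two numeric columns
dataframe <- data.frame(
  state  = rep(c("A", "B"), each = 4),
  value  = c(1, NA, 3, 4, 10, NA, NA, 40),
  value2 = c(2, 4, NA, 8, 1, NA, 3, 4)
)

cols <- c("value", "value2")  # columns to interpolate (assumed names)

# .SD holds only the columns listed in .SDcols, one subset per state;
# each column is interpolated within its state and assigned back by reference
setDT(dataframe)[, (cols) := lapply(.SD, imputeTS::na_interpolation),
                 by = state, .SDcols = cols]
```

Note that `setDT()` converts `dataframe` in place, so no reassignment is needed.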

Upvotes: 0

Ronak Shah

Reputation: 388962

You should be able to apply na_interpolation by group. Try:

library(dplyr)

interpolation  <- dataframe %>%
                    group_by(state) %>%
                    mutate(value = imputeTS::na_interpolation(value))
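
To cover multiple columns, as the question asks, one possible sketch uses `across()` (this assumes dplyr 1.0.0 or later; the example data and the `value2` column are made up for illustration):

```r
library(dplyr)

# Hypothetical example data: two numeric columns, three states
dataframe <- data.frame(
  state  = rep(c("A", "B", "C"), each = 4),
  value  = c(1, NA, 3, 4,  10, NA, NA, 40,  5, 6, NA, 8),
  value2 = c(NA, 2, 3, 4,  1, 2, NA, 4,  7, NA, 9, 10)
)

interpolation <- dataframe %>%
  group_by(state) %>%
  # interpolate every numeric column separately within each state
  mutate(across(where(is.numeric), ~ imputeTS::na_interpolation(.x))) %>%
  ungroup()
```

As far as I know, `na_interpolation` stops with an error when a series has fewer than two non-NA values, so a state whose column is almost entirely NA would need separate handling.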

Upvotes: 1

Allan Cameron

Reputation: 173793

You could use the split-apply-bind method:

do.call(rbind, lapply(split(dataframe, dataframe$state), na_interpolation))

As a worked example, take the following dummy data:

set.seed(3)

dataframe <- data.frame(state = rep(c("A", "B", "C"), each = 5),
                        value = rnorm(15))

dataframe$value[sample(15, 4)] <- NA

dataframe
#>    state       value
#> 1      A -0.96193342
#> 2      A          NA
#> 3      A  0.25878822
#> 4      A -1.15213189
#> 5      A  0.19578283
#> 6      B  0.03012394
#> 7      B  0.08541773
#> 8      B          NA
#> 9      B          NA
#> 10     B  1.26736872
#> 11     C -0.74478160
#> 12     C          NA
#> 13     C -0.71635849
#> 14     C  0.25265237
#> 15     C  0.15204571

Then we can do:

library(imputeTS)

do.call(rbind, lapply(split(dataframe, dataframe$state), na_interpolation))
#>      state       value
#> A.1      A -0.96193342
#> A.2      A -0.35157260
#> A.3      A  0.25878822
#> A.4      A -1.15213189
#> A.5      A  0.19578283
#> B.6      B  0.03012394
#> B.7      B  0.08541773
#> B.8      B  0.47940140
#> B.9      B  0.87338506
#> B.10     B  1.26736872
#> C.11     C -0.74478160
#> C.12     C -0.73057004
#> C.13     C -0.71635849
#> C.14     C  0.25265237
#> C.15     C  0.15204571

Created on 2020-12-12 by the reprex package (v0.3.0)
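
If an extra package is not wanted, the by-group step can also be sketched in base R with `ave()` and `approx()`. This is a swapped-in technique, not imputeTS itself: it only does linear interpolation, and the helper name `interpolate_linear` and the example data are hypothetical.

```r
# Linear interpolation of one numeric vector.
# With rule = 2, leading and trailing NAs take the nearest observed value.
interpolate_linear <- function(x) {
  ok <- !is.na(x)
  if (sum(ok) < 2) return(x)  # too few points to interpolate; leave unchanged
  approx(which(ok), x[ok], xout = seq_along(x), rule = 2)$y
}

# Hypothetical example data
dataframe <- data.frame(
  state = rep(c("A", "B"), each = 4),
  value = c(1, NA, 3, 4, NA, 20, NA, 40)
)

# ave() applies the function within each state and keeps the original row order
dataframe$value <- ave(dataframe$value, dataframe$state, FUN = interpolate_linear)
```

Unlike the `rbind` approach, this keeps the original row names, at the cost of handling one column per call.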

Upvotes: 0
