mydata <- data.frame(group = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), score = c(10, NA, NA, 20, 30, 5, NA, NA, NA, 40))
From 'mydata' I am trying to use dplyr to interpolate 'score' using na.approx, but only when there are fewer than 3 consecutive NAs between the closest non-NA entries in 'score'. The interpolated values are stored in 'score_approx'.
Without the condition on the number of consecutive NAs in 'score', I use this code:
library(dplyr)  # needed for %>%, group_by, mutate, coalesce
library(zoo)    # provides na.approx
mydata <- mydata %>%
  group_by(group) %>%
  mutate(score_approx = na.approx(score)) %>%
  mutate(score_approx = coalesce(score_approx, score))
mydata
# A tibble: 10 x 3
# Groups:   group [2]
   group score score_approx
   <dbl> <dbl>        <dbl>
 1     1    10         10
 2     1    NA         13.3
 3     1    NA         16.7
 4     1    20         20
 5     1    30         30
 6     2     5          5
 7     2    NA         13.8
 8     2    NA         22.5
 9     2    NA         31.2
10     2    40         40
However, the desired data frame is:
# A tibble: 10 x 3
# Groups:   group [2]
   group score score_approx
   <dbl> <dbl>        <dbl>
 1     1    10         10
 2     1    NA         13.3
 3     1    NA         16.7
 4     1    20         20
 5     1    30         30
 6     2     5          5
 7     2    NA         NA
 8     2    NA         NA
 9     2    NA         NA
10     2    40         40
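You can use the maxgap argument of na.approx. With maxgap = 2, runs of up to two consecutive NAs are interpolated and longer runs are left as NA: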
library(dplyr)
library(zoo)
mydata %>%
  group_by(group) %>%
  mutate(score_approx = na.approx(score, maxgap = 2)) %>%
  ungroup()
#    group score score_approx
#    <dbl> <dbl>        <dbl>
#  1     1    10         10
#  2     1    NA         13.3
#  3     1    NA         16.7
#  4     1    20         20
#  5     1    30         30
#  6     2     5          5
#  7     2    NA         NA
#  8     2    NA         NA
#  9     2    NA         NA
# 10     2    40         40
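To see what maxgap does in isolation, here is a minimal sketch on plain vectors. The vectors x and y and the result names are made up for illustration; x just mirrors group 2 from the question. It also shows na.rm = FALSE, which is probably worth adding inside mutate if a group can start or end with NA: with the default na.rm = TRUE, leading and trailing NAs are dropped, so the result can end up shorter than the group.

```r
library(zoo)

# Hypothetical vector mirroring group 2 of the question: a gap of three NAs
x <- c(5, NA, NA, NA, 40)

# Gap length 3 exceeds maxgap = 2, so the NAs are left untouched
short_gap_only <- na.approx(x, maxgap = 2)

# Raising maxgap to 3 lets the same gap be interpolated linearly
filled <- na.approx(x, maxgap = 3)

# Hypothetical vector with a trailing NA: na.rm = FALSE keeps the
# original length and leaves the trailing NA in place
y <- c(10, NA, 20, NA)
kept <- na.approx(y, na.rm = FALSE)

print(short_gap_only)
print(filled)
print(kept)
```

In the grouped pipeline this would read mutate(score_approx = na.approx(score, maxgap = 2, na.rm = FALSE)), assuming you want unfillable edge values to stay NA rather than be extrapolated.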