Reputation: 157
I'm sourcing some data where I sometimes get runs of consecutive 0 values, often 5 or more in a row (the run length varies). For my purposes, it would be great if I could replace these values with the nearest non-0 value.
n = n %>% mutate(impliedearningsmov = if_else(impliedearningsmov==0,
  if_else(lead(impliedearningsmov)==0,
    if_else(lead(impliedearningsmov,2)==0,
      if_else(lead(impliedearningsmov,3)==0,
        if_else(lead(impliedearningsmov,4)==0,
          if_else(lead(impliedearningsmov,5)==0,
            if_else(lead(impliedearningsmov,6)==0,
              lead(impliedearningsmov,7),
              lead(impliedearningsmov,6)),
            lead(impliedearningsmov,5)),
          lead(impliedearningsmov,4)),
        lead(impliedearningsmov,3)),
      lead(impliedearningsmov,2)),
    lead(impliedearningsmov)),
  impliedearningsmov))
My solution seems very redundant. It doesn't matter whether I use the lead or the lagged value. I know how to fill with the mean or median, but I need the nearest non-0 value. Thanks!
An example of the data would just be a numeric vector.
3.84
3.54
3.94
3.72
0.00
0.00
0.00
5.88
3.26
4.37
Upvotes: 1
Views: 417
Reputation: 388982
Replace 0 with NA, then fill the missing values with either tidyr::fill
library(dplyr)
library(tidyr)
df %>%
  mutate(a = replace(a, a == 0, NA)) %>%
  fill(a)
Or zoo::na.locf
df %>%
  mutate(a = replace(a, a == 0, NA),
         # na.rm = FALSE keeps any leading NA, so the length still matches the data frame
         a = zoo::na.locf(a, na.rm = FALSE))
# a
# <dbl>
# 1 1
# 2 2
# 3 3
# 4 4
# 5 5
# 6 5
# 7 5
# 8 5
# 9 1
#10 2
# … with 12 more rows
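For a self-contained check, here is a minimal sketch that applies the same idea to the sample values from the question. The data frame name `df` and the column name `a` are assumptions for illustration. `.direction = "downup"` is an addition to the call above, so that a run of zeros at the very start would be filled from the next non-zero value.

```r
library(dplyr)
library(tidyr)

# Sample values from the question, stored in a hypothetical column `a`
df <- tibble(a = c(3.84, 3.54, 3.94, 3.72, 0, 0, 0, 5.88, 3.26, 4.37))

filled <- df %>%
  mutate(a = replace(a, a == 0, NA)) %>%   # treat 0 as missing
  fill(a, .direction = "downup")           # carry forward, then backfill any leading gap

filled$a
# 3.84 3.54 3.94 3.72 3.72 3.72 3.72 5.88 3.26 4.37
```

Note that this fills each run with the previous value, not with whichever neighbour is closer.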
Upvotes: 2
Reputation: 2141
You could first define a function that replaces each stretch of zeros with the previous non-zero value:
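zero_replacer <- function(x) {
  # Positions of the zero values
  is_zero <- which(x == 0)
  # Label each stretch of consecutive zeros: a new stretch starts
  # wherever the previous zero position is not directly adjacent
  zero_stretches <- cumsum(!dplyr::lag(is_zero, default = is_zero[1] - 1) + 1 == is_zero) + 1
  # Replace each stretch with the value just before it
  for (i in unique(zero_stretches)) {
    idx_stretch <- zero_stretches == i
    idx_zero <- is_zero[idx_stretch]
    idx_val <- idx_zero[1] - 1
    # A stretch at the very start has no previous value, so leave it unchanged
    if (idx_val >= 1) x[idx_zero] <- x[idx_val]
  }
  x
}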
Then you can use this function in mutate like this:
library(dplyr)  # provides %>%
tibble::tibble(# Generate some dummy data
  a = c(1:5, 0, 0, 0, 1:4, rep(0, times = 10))
) %>%
  dplyr::mutate(
    # Replace zero with prev non-zero
    a = zero_replacer(a)
  )
Output:
# A tibble: 22 x 1
a
<dbl>
1 1
2 2
3 3
4 4
5 5
6 5
7 5
8 5
9 1
10 2
11 3
12 4
13 4
14 4
15 4
16 4
17 4
18 4
19 4
20 4
21 4
22 4
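If you need the value that is nearest by position rather than the previous one, a minimal base R sketch could look like the following. The helper name `nearest_nonzero` is hypothetical, and resolving ties in favour of the earlier value is an assumption; the question does not say which neighbour should win when both are equally far away.

```r
# Hypothetical helper: replace each 0 with the non-zero value closest by position.
# Ties (equally distant neighbours) go to the earlier value.
nearest_nonzero <- function(x) {
  nz <- which(x != 0)                 # positions of the non-zero values
  if (length(nz) == 0) return(x)      # nothing to copy from
  for (i in which(x == 0)) {
    # which.min returns the first minimum, so the earlier neighbour wins ties
    x[i] <- x[nz[which.min(abs(nz - i))]]
  }
  x
}

result <- nearest_nonzero(c(3.84, 3.54, 3.94, 3.72, 0, 0, 0, 5.88, 3.26, 4.37))
result
# 3.84 3.54 3.94 3.72 3.72 3.72 5.88 5.88 3.26 4.37
```

The non-zero positions are computed once before the loop, so values that have already been filled in are never used as a source.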
Upvotes: 1