R diff() handling NA

Question

I would like to calculate the first difference of a variable even when either the current value or the lagged value is missing, treating the missing value as 0. The result should be NA only when both values are missing. R's diff() function returns NA if either value is missing. Can this behavior be changed?

data <- c(5, NA, NA, 10, 25)

diff_i_want <- c(-5, NA, 10, 15)

diff_i_get <- diff(data)

identical(diff_i_want, diff_i_get)

R. Schifini · Accepted Answer

Here is a way:

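data <- c(5, NA, NA, 10, 25)
data2 <- data
data2[is.na(data2)] <- 0    # treat missing values as 0
diffData2 <- diff(data2)
# restore NA where both the current and the lagged value are missing
diffData2[diff(is.na(data)) == 0 & is.na(data[-1])] <- NA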

> diffData2
[1] -5 NA 10 15

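First copy data to data2, set all NAs to 0, and take the diff. In the last step, put NA back into the result wherever both the current and the lagged value were missing: diff(is.na(data)) == 0 is TRUE when two neighbouring elements have the same missing status, and is.na(data[-1]) is TRUE when the current element is missing, so together they flag the pairs where both are NA.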
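If you need this in more than one place, the same steps can be wrapped in a small function. This is only a sketch: the name diff_na_as_zero is made up for illustration (it is not part of base R), and it assumes a plain numeric vector with a lag of 1. It checks the two neighbours directly instead of using diff(is.na(data)), which should give the same mask.

```r
# Hypothetical helper (the name is an assumption, not a base R function):
# first difference that treats a single missing neighbour as 0 and
# returns NA only when both neighbours are missing.
diff_na_as_zero <- function(x) {
  filled <- x
  filled[is.na(filled)] <- 0          # replace NA with 0 before differencing
  d <- diff(filled)
  # TRUE where the current value and the lagged value are both NA
  both_missing <- is.na(x[-1]) & is.na(x[-length(x)])
  d[both_missing] <- NA
  d
}

data <- c(5, NA, NA, 10, 25)
result <- diff_na_as_zero(data)
print(result)
# [1] -5 NA 10 15
```

Keep in mind that treating NA as 0 only makes sense if a missing value really means "nothing there" in your data; otherwise the differences next to a gap can be misleading.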
