Using diff() in R with NA and negative numbers

Question

I have an R data frame df with the following content:

Serial N         year         current
   B              10            14
   B              10            16
   B              11            10
   B              11            NA
   B              11            15
   C              12            11
   C              12             9
   C              12           -13
   C              12            17
   .              .              .

I would like to find the difference between each consecutive pair of current values within the same Serial N. This is the code I wrote, but I am getting some strange results:

library(data.table)
setDT(df)[,mydiff:=diff(df$current),by=Serial N]   
print(length(df$current))

The output I get for that column is quite strange:

2 6  NA NA NA 2 6  NA NA NA

What I would like to have actually is :

Serial N         year         current      mydiff
   B              10            14         
   B              10            16         16-14=2
   B              11            10         10-16=-6
   B              11            NA            NA
   B              11            15         15-10=5
   C              12            11
   C              12             9         9-11=-2    
   C              12           -13        -13-9=-22
   C              12            17         17-(-13)=30
   .              .              .

Is diff the right tool for this? If not, how can I tackle it (especially without using loops)?

Wyldsoul · Accepted Answer

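This may work for you. You can carry the last non-NA value forward with na.locf from the zoo package, so the difference after an NA is taken against the last observed value. The ifelse condition only populates my.diff where current is not NA, and the c(" ", ...) pads the first row of each group, which has no previous value to subtract.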

library(data.table)
library(zoo)
df <- read.table(textConnection("
                         'Serial N'         year         current
                            B              10            14
                            B              10            16
                            B              11            10
                            B              11            NA
                            B              11            15
                            C              12            11
                            C              12             9
                            C              12            -13
                            C              12            17"),header=TRUE)

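setDT(df)            # convert df to a data.table by reference
setkey(df,Serial.N)  # read.table renamed the 'Serial N' column to Serial.N
# Per serial: fill NAs forward, take differences, pad the first row, keep NA where current is NA
df[,my.diff := ifelse(!is.na(current), c(" ",diff(na.locf(current))), NA),by=Serial.N]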

#        Serial.N year current my.diff
# 1:        B   10      14        
# 2:        B   10      16       2
# 3:        B   11      10      -6
# 4:        B   11      NA      NA
# 5:        B   11      15       5
# 6:        C   12      11        
# 7:        C   12       9      -2
# 8:        C   12     -13     -22
# 9:        C   12      17      30
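
One thing to be aware of: padding with " " turns my.diff into a character column, and, as far as I know, na.locf drops leading NAs by default (na.rm = TRUE), which could change the vector length if a group starts with NA. If a numeric column is preferred, the same idea can be sketched in base R without zoo. This is only an illustration under those assumptions, not the answer's original method: the helper names locf and diff_locf and the column name serial are made up for this example, and the first row of each group gets NA instead of a blank.

```r
# Hypothetical helper: carry the last non-NA value forward (leading NAs stay NA).
locf <- function(x) {
  # Index of the most recent non-NA position at or before each element (0 if none yet).
  idx <- cummax(seq_along(x) * !is.na(x))
  idx[idx == 0] <- NA
  x[idx]
}

# Hypothetical helper: difference from the last observed value, NA where x itself is NA.
diff_locf <- function(x) {
  filled <- locf(x)
  d <- filled - c(NA_real_, head(filled, -1))  # first element has no predecessor
  d[is.na(x)] <- NA_real_
  d
}

df <- data.frame(
  serial  = c("B", "B", "B", "B", "B", "C", "C", "C", "C"),
  year    = c(10, 10, 11, 11, 11, 12, 12, 12, 12),
  current = c(14, 16, 10, NA, 15, 11, 9, -13, 17)
)

# ave() applies the function within each serial and returns a vector aligned with df.
df$my_diff <- ave(df$current, df$serial, FUN = diff_locf)
print(df)
# my_diff: NA 2 -6 NA 5 NA -2 -22 30
```

With data.table the same helper could be used as df[, my_diff := diff_locf(current), by = serial], which keeps the column numeric.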
