Splash1199

Reputation: 379

Creating a new column using ddply in the R package "plyr"

I am working on an animal tracking data set, and I need to calculate the time difference between consecutive GPS time stamps for each individual. For simplicity, my data looks like this (let's ignore the other variables for now):

ID  Time
B1  6:57
B1  6:59
B1  7:03
B1  7:10
B2  6:34
B2  6:45
B2  6:47
B2  6:48
B3  6:23
B3  6:35
B3  6:46
B3  6:47

I tried to calculate the time difference using the following:

ddply(df, "ID",transform,timediff=diff(Time))

However I get this error message:

Error in data.frame(list(ID = c(1L, 1L, 1L, 1L), Time = 8:11):
arguments imply differing number of rows: 4, 3

I assume the problem is that there is no value for the first row of each animal. Is there a way around this? Any help is much appreciated.

Upvotes: 0

Views: 196

Answers (2)

akrun

Reputation: 887691

We can use ave from base R:

df1$timediff <- with(df1, ave(as.numeric(Time), ID, FUN = function(x) c(NA, diff(x))))

This assumes that 'Time' is of a date-time class (e.g. POSIXct), so that as.numeric() returns seconds and the differences are in seconds.
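If 'Time' is stored as character or factor (e.g. "6:57"), it needs to be parsed before differencing. Here is a minimal sketch, assuming all fixes are on the same day, the format is H:MM, and the data frame is called df1 (the sample data from the question is re-created for illustration; tm is a hypothetical helper name):

```r
# Sample data from the question (re-created here for illustration)
df1 <- data.frame(
  ID   = rep(c("B1", "B2", "B3"), each = 4),
  Time = c("6:57", "6:59", "7:03", "7:10",
           "6:34", "6:45", "6:47", "6:48",
           "6:23", "6:35", "6:46", "6:47"),
  stringsAsFactors = FALSE
)

# Parse "H:MM" into date-times. The date part defaults to today, which is
# fine only if all fixes fall on the same day (an assumption).
tm <- as.POSIXct(as.character(df1$Time), format = "%H:%M", tz = "UTC")

# Difference from the previous fix within each ID, converted from seconds
# to minutes. The first row of each ID has no previous fix, so it gets NA.
df1$timediff <- ave(as.numeric(tm), df1$ID,
                    FUN = function(x) c(NA, diff(x))) / 60

df1$timediff
# B1: NA 2 4 7; B2: NA 11 2 1; B3: NA 12 11 1
```

The original ddply call failed because diff() returns one value fewer than the group has rows, so padding with c(NA, diff(...)) inside transform should resolve that error in the same way.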

Upvotes: 0

rafa.pereira

Reputation: 13817

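You could use data.table:

library(data.table)

# create a lagged copy of Time within each ID (the previous row's value;
# NA for the first row of each group). Note this is a lag, not yet a difference.
setDT(data)[, timediff:=c(NA, Time[-.N]), by=ID]

data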
#>     ID Time timediff
#>  1: B1 6:57       NA
#>  2: B1 6:59        8
#>  3: B1 7:03        9
#>  4: B1 7:10       10
#>  5: B2 6:34       NA
#>  6: B2 6:45        2
#>  7: B2 6:47        4
#>  8: B2 6:48        6
#>  9: B3 6:23       NA
#> 10: B3 6:35        1
#> 11: B3 6:46        3
#> 12: B3 6:47        5
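
The timediff column above holds the previous row's Time (shown as integer factor codes), not an elapsed time. Here is a minimal sketch of an actual per-ID difference in minutes, assuming same-day "H:MM" times; tm is a hypothetical helper column and the sample data is re-created for illustration:

```r
library(data.table)

# Sample data from the question (re-created here for illustration)
dt <- data.table(
  ID   = rep(c("B1", "B2", "B3"), each = 4),
  Time = c("6:57", "6:59", "7:03", "7:10",
           "6:34", "6:45", "6:47", "6:48",
           "6:23", "6:35", "6:46", "6:47")
)

# Parse "H:MM" and store it as seconds (tm is a hypothetical helper column).
# The date part defaults to today, so same-day fixes are assumed.
dt[, tm := as.numeric(as.POSIXct(Time, format = "%H:%M", tz = "UTC"))]

# Minutes since the previous fix within each ID. shift() lags tm by one
# row and fills the first row of each group with NA.
dt[, timediff := (tm - shift(tm)) / 60, by = ID]

dt[, .(ID, Time, timediff)]
```

Grouping with by = ID keeps the first fix of each animal from being differenced against the last fix of the previous animal.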

Upvotes: 0
