New dataframe with difference between first and last values of repeated measurements?

Question

I am working with time series data and want to calculate, for each individual, the difference between the first and final measurements (in both value and time), and put these numbers into a new, simpler dataframe. For example, for this dataframe

test <- structure(list(time = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), indv = c(1L, 
1L, 1L, 1L, 2L, 2L, 2L, 2L), value = c(1L, 3L, 5L, 8L, 3L, 4L, 
7L, 8L)), .Names = c("time", "indv", "value"), class = "data.frame", row.names = c(NA, 
-8L))

or

time    indv    value
1   1   1
2   1   3
3   1   5
4   1   8
1   2   3
2   2   4
3   2   7
4   2   8

I can use this code

library(plyr)
ddply(test, .(indv), transform, value_change = (value[length(value)] - value[1]), time_change = (time[length(time)] - time[1]))

to give

time indv value value_change time_change
1    1     1            7           3
2    1     3            7           3
3    1     5            7           3
4    1     8            7           3
1    2     3            5           3
2    2     4            5           3
3    2     7            5           3
4    2     8            5           3

However, I would like to eliminate the redundant rows and make a new and simpler dataframe like this

indv    time_change value_change
1   3   7
2   3   5

Does anyone have any clever way to do this?

Thanks!

flodel · Accepted Answer

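Just replace transform with summarize: transform adds the new columns to every row of each group, whereas summarize returns one row per group containing only the computed columns. You can also make your code a little prettier by using head and tail: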

ddply(test, .(indv), summarize,
      value_change = tail(value, 1) - head(value, 1),
      time_change  = tail(time,  1) - head(time,  1))

For maximum readability, write a function:

change <- function(x) tail(x, 1) - head(x, 1)
ddply(test, .(indv), summarize, value_change = change(value),
                                time_change  = change(time))
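
One caveat: head and tail pick the first and last rows as stored, so the approach above relies on each individual's rows already being sorted by time. As a rough sketch only (not part of the original answer), here is the same idea in base R with an explicit sort first. It swaps ddply for aggregate, and the names test_sorted and result are made up for illustration:

```r
# Rebuild the example data from the question
test <- data.frame(
  time  = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
  indv  = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  value = c(1L, 3L, 5L, 8L, 3L, 4L, 7L, 8L)
)

# Sort by individual, then by time, so that the first and last rows of
# each group really are the earliest and latest measurements
test_sorted <- test[order(test$indv, test$time), ]

# Same helper as in the answer: last element minus first element
change <- function(x) tail(x, 1) - head(x, 1)

# Base-R stand-in for ddply + summarize (an assumption, not the answer's
# method): apply change() to time and value within each indv
result <- aggregate(cbind(time_change = time, value_change = value) ~ indv,
                    data = test_sorted, FUN = change)
print(result)
```

With the example data this should give one row per individual, with a time change of 3 for both and value changes of 7 and 5, matching the table requested in the question.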
