Reputation: 1520
I have a dataframe and need to calculate the difference between successive entries within each ID, but would like to do this without having to create individual dataframes for each ID and then join back together (my current solution). Here is an example using a similar structure to the dataframes.
df <- data.frame(
  ID = sample(c("A", "B", "C"), 20, replace = TRUE),
  number = rnorm(20, mean = 5)
)
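I can easily calculate the difference between successive rows using this function (rollapply comes from the zoo package):
library(zoo)  # provides rollapply
roll.dif <- function(x) {
  # difference between each value and the one before it; the first element is NA
  rollapply(x, width = 2, FUN = diff, fill = NA, align = "right")
}
df$dif <- roll.dif(df$number)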
However, I would like to do this within each ID. I have tried using with and tapply, based on the answer to "Apply function conditionally":
with(df, tapply(number, ID, FUN = roll.dif))
I have also tried using by:
by(df$number,df$ID,FUN = roll.dif)
Both give me the values I am looking for, but I cannot figure out how to get them back into the dataframe. I would like the output to look like this:
   ID   number       dif
1   A 3.967251        NA
2   B 3.771882        NA
3   A 5.920705  1.953454
4   A 7.517528  1.596823
5   B 5.252357  1.480475
6   B 4.811998 -0.440359
7   B 3.388951 -1.423047
8   A 5.284527 -2.233001
9   C 6.070546        NA
10  A 5.319934  0.035407
11  A 5.517615  0.197681
12  B 5.454738  2.065787
13  C 6.402359  0.331813
14  C 5.617123 -0.785236
15  A 5.692807  0.175192
16  C 4.902007 -0.715116
17  B 4.975184 -0.479554
18  A  6.05282  0.360013
19  C 3.677114 -1.224893
20  C 4.883414    1.2063
Upvotes: 0
Views: 78
Reputation: 886968
We can use data.table, which adds the column by reference within each ID:
library(data.table)
setDT(df)[, dif := roll.dif(number), by = ID]
Or a base R option is ave:
df$dif <- with(df, ave(number, ID, FUN = roll.dif))
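As a rough illustration of why ave fits here: it applies the function to each ID's values and writes the results back in the original row order. The sketch below uses a small hand-made data frame (hypothetical values, not the question's data) and a helper named lag_dif, which is an assumption of this example and swaps roll.dif for base R's diff so that zoo is not needed.

```r
# Small hand-made data set (hypothetical values, chosen so the result is easy to check)
df <- data.frame(
  ID     = c("A", "B", "A", "A", "B", "C"),
  number = c(1, 10, 4, 9, 15, 7)
)

# Hypothetical helper: difference from the previous value,
# with NA for the first element of the vector
lag_dif <- function(x) c(NA, diff(x))

# ave() splits number by ID, applies lag_dif to each piece,
# and returns the results in the original row order
df$dif <- ave(df$number, df$ID, FUN = lag_dif)

print(df)
```

The first row of every ID gets NA, including an ID that occurs only once, because diff on a single value returns a zero-length vector.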
Upvotes: 1
Reputation: 3587
You can use the dplyr package like this:
library(dplyr)
df <- df %>% group_by(ID) %>% mutate(dif = roll.dif(number))
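If you would rather not depend on rollapply, a possible variant of the same grouped mutate uses dplyr's lag. This is only a sketch on a small hand-made data frame (hypothetical values, not the question's data), and the name result is an assumption of this example.

```r
library(dplyr)

# Small hand-made data set (hypothetical values, chosen so the result is easy to check)
df <- data.frame(
  ID     = c("A", "B", "A", "A", "B", "C"),
  number = c(1, 10, 4, 9, 15, 7)
)

result <- df %>%
  group_by(ID) %>%
  # lag() is evaluated within each ID, so the first row of each ID gets NA
  mutate(dif = number - lag(number)) %>%
  ungroup()

print(result)
```

Calling ungroup() at the end is optional, but it avoids later operations being applied per ID by accident.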
Upvotes: 2