Kim
Kim

Reputation: 90

Calculate difference between values in rows by two grouping variables

I want to use the ave solution from this question: subtract value from previous row by group with two groups.

#Reproduceable example db:
FU <- c(5,10,20,2,7,20,6,14,8,20,15,17)
Studynr <- c(1,1,2,2,3,3,4,4,4,5,6,6)
Fugroup <- c(1,0,0,1,0,1,1,0,2,0,1,0)
db <- data.frame(Studynr,Fugroup,FU) 

# code to calculate the difference consequetively
db$FUdiff <- ave(db$FU, db$Studynr, FUN=function(x) c(NA,diff(x)))

This is the head of the table this code provides:

FU Studynr Fugroup FUdiff
5 1 1 NA
10 1 0 5
20 2 0 NA
2 2 1 -18
7 3 0 NA
20 3 1 13

But what I want is (switching 5 and NA, because FU group is ordered as 1 to 0)

FU Studynr Fugroup FUdiff
5 1 1 5
10 1 0 NA
20 2 0 NA
2 2 1 -18
7 3 0 NA
20 3 1 13

I think the ave code is not working for me as the data per group isn't consecutively sorted in the database. Therefore, instead of using one grouping variable (db$Studynr), I would like to use a second grouping variable (db$Fugroup). Any ideas? Thanks!

(adjusted post)

Upvotes: 0

Views: 1071

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388807

You can order the data first and apply the ave code :

db <- db[with(db, order(Studynr, Fugroup)), ]
db$FUdiff <- ave(db$FU, db$Studynr, FUN=function(x) c(NA,diff(x)))

You can implement the same logic in dplyr and data.table :

#dplyr
library(dplyr)

db %>%
  arrange(Studynr, Fugroup) %>%
  group_by(Studynr) %>%
  mutate(FUdiff = c(NA, diff(FU))) %>%
  ungroup -> db

#data.table
library(data.table)
setDT(db)[order(Studynr, Fugroup), FUdiff := c(NA, diff(FU)), Studynr]

Upvotes: 1

Related Questions