Ben

Reputation: 43

Matching the previous row in a specific column and performing a calculation in R

I currently have a data file that resembles this:

R ID A B
1 A1 0 0
2 A1 2 4
3 A1 4 8
4 A2 0 0
5 A2 3 3
6 A2 6 6

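I would like to write a script that computes, for each row, the change in "B" divided by the change in "A" relative to the previous row, but only if the previous row has the same "ID". For example, for column "C" in row 3, the previous row's "ID" is also A1, so C = (8-4)/(4-2) = 2. If the "ID" of the previous row differs (as in row 4, the first A2 row), or there is no previous row, the output should be 0.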

I would like the output to be like this:

R ID A B C
1 A1 0 0 0
2 A1 2 4 2
3 A1 4 8 2
4 A2 0 0 0
5 A2 3 3 1
6 A2 6 6 1

Hopefully I explained this correctly in a non-confusing manner.

Upvotes: 4

Views: 147

Answers (2)

akrun

Reputation: 887891

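We can also use lag within each ID group. lag returns NA for the first row of a group, so coalesce is used to turn that first-row result into 0.

library(dplyr)
df %>% 
   group_by(ID) %>% 
   # the first row of each ID has no previous row: NA / NA is NA, replaced by 0
   mutate(C = coalesce((B - lag(B)) / (A - lag(A)), 0))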

data

df <- structure(list(R = 1:6, ID = structure(c(1L, 1L, 1L, 2L, 2L, 
2L), .Label = c("A1", "A2"), class = "factor"), A = c(0L, 2L, 
4L, 0L, 3L, 6L), B = c(0L, 4L, 8L, 0L, 3L, 6L)), class = "data.frame", 
row.names = c(NA, -6L))
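
For readers without dplyr, the same previous-row idea could be sketched in base R with ave. This is only an illustration under assumptions: prev is a hypothetical helper introduced here (not part of any package), and the data frame is rebuilt from the table in the question so the block runs on its own.

```r
# Example data matching the table in the question
df <- data.frame(
  R  = 1:6,
  ID = c("A1", "A1", "A1", "A2", "A2", "A2"),
  A  = c(0, 2, 4, 0, 3, 6),
  B  = c(0, 4, 8, 0, 3, 6)
)

# Hypothetical helper: shift a vector down by one position, padding with NA
prev <- function(v) c(NA, head(v, -1))

# Previous-row values, computed separately within each ID
prev_A <- ave(df$A, df$ID, FUN = prev)
prev_B <- ave(df$B, df$ID, FUN = prev)

# Ratio of differences; the first row of each ID has no previous row, so it is NA
df$C <- (df$B - prev_B) / (df$A - prev_A)

# Replace the missing first-row results with 0 (is.na() is also TRUE for NaN)
df$C[is.na(df$C)] <- 0

print(df)
```

Because is.na() also catches NaN, a 0/0 result inside a group would be set to 0 as well; whether that is desirable depends on the data.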

Upvotes: 1

Ronak Shah

Reputation: 389275

We could group_by ID, use diff to compute the row-to-row differences in B and A, divide them, and prepend 0 for the first row of each group.

library(dplyr)
df %>% group_by(ID) %>% mutate(C = c(0, diff(B)/diff(A)))

#      R ID        A     B     C
#  <int> <fct> <int> <int> <dbl>
#1     1 A1        0     0     0
#2     2 A1        2     4     2
#3     3 A1        4     8     2
#4     4 A2        0     0     0
#5     5 A2        3     3     1
#6     6 A2        6     6     1

and similarly using data.table

library(data.table)
setDT(df)[, C := c(0, diff(B)/diff(A)), by = ID]

data

df <- structure(list(R = 1:6, ID = structure(c(1L, 1L, 1L, 2L, 2L, 
2L), .Label = c("A1", "A2"), class = "factor"), A = c(0L, 2L, 
4L, 0L, 3L, 6L), B = c(0L, 4L, 8L, 0L, 3L, 6L)), class = "data.frame", 
row.names = c(NA, -6L))
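
The question's literal wording (compare each row's ID with the row above it) could also be sketched in base R without any grouping step. This assumes the rows of each ID are contiguous and already in the desired order, as in the example data; the names same_id and ratio are introduced here for illustration only.

```r
# Example data matching the table in the question
df <- data.frame(
  R  = 1:6,
  ID = c("A1", "A1", "A1", "A2", "A2", "A2"),
  A  = c(0, 2, 4, 0, 3, 6),
  B  = c(0, 4, 8, 0, 3, 6)
)

# TRUE where a row has the same ID as the row directly above it
# (the first row has no row above it, so it is FALSE)
same_id <- c(FALSE, df$ID[-1] == df$ID[-nrow(df)])

# Row-to-row ratio over the whole table, ignoring ID boundaries for now
ratio <- c(0, diff(df$B) / diff(df$A))

# Keep the ratio only where the previous row belongs to the same ID
df$C <- ifelse(same_id, ratio, 0)

print(df)
```

Unlike the grouped versions, this compares adjacent rows only, so it would give different results if rows of the same ID were not stored next to each other.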

Upvotes: 3
