Subtracting rows nested in a data.frame based on column value correspondence

Question

For each unique value of study, how can I subtract the yi of the rows with group == "C" from the yi of the corresponding rows with group != "C", matching on interval_id?

For example, in study == 1, yi == .4 for group == "C" on interval_id == 0 should be subtracted from yi == .1 for group == "T1" on interval_id == 0.

Similarly, in study == 1, yi == .5 for group == "C" on interval_id == 1 should be subtracted from yi == .3 for group == "T1" on interval_id == 1.

The final output should be a data.frame with the group == "C" rows deleted (shown below).

m = "
study group  yi  vi interval_id obs
1      T1    .1  1  0           1
1      T1    .3  2  1           2
1      C     .4  3  0           3
1      C     .5  4  1           4
2      T2    .6  5  0           5
2      C     .9  6  1           6
"

data <- read.table(text = m, header = TRUE)

# DESIRED OUTPUT:
"
study group  yi  vi interval_id obs
1      T1    -.3  .  0           1
1      T1    -.2  .  1           2
2      T2    -.3  .  0           5
2      C      .9  .  1           6
"

Ronak Shah · Accepted Answer

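Within each study, we can subtract the yi values where group == 'C' from the yi values where group != 'C', then drop the rows where group == 'C'. Note that this pairs rows by position rather than by interval_id, so it assumes each study has the same number of 'C' and non-'C' rows in the same order, as in the example data.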

library(dplyr)

data %>%
  group_by(study) %>%
  mutate(yi = rep(yi[group != 'C'] - yi[group == 'C'], 2)) %>%
  ungroup() %>%
  filter(group != 'C')

#  study group    yi    vi interval_id   obs
#1     1 T1     -0.3     1           0     1
#2     1 T1     -0.2     2           1     2
#3     2 T2     -0.3     5           0     5
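
If rows should instead be paired explicitly on study and interval_id, a base R sketch along the following lines may help. It is not part of the accepted answer: the helper `key` and the names `ctrl`, `treat` and `idx` are illustrative only. Under strict matching, study 2 has no "C" row at interval_id 0, so its T2 row gets NA rather than the -0.3 shown above, and every "C" row is dropped; both choices are assumptions that may need adjusting.

```r
m <- "
study group  yi  vi interval_id obs
1      T1    .1  1  0           1
1      T1    .3  2  1           2
1      C     .4  3  0           3
1      C     .5  4  1           4
2      T2    .6  5  0           5
2      C     .9  6  1           6
"
data <- read.table(text = m, header = TRUE)

# Split into control rows and all other (treatment) rows
ctrl  <- data[data$group == "C", ]
treat <- data[data$group != "C", ]

# Illustrative helper: one key per row built from study and interval_id
key <- function(d) paste(d$study, d$interval_id)

# For each treatment row, position of its matching control row (NA if none)
idx <- match(key(treat), key(ctrl))

# Subtract the matched control yi; unmatched rows become NA
treat$yi <- treat$yi - ctrl$yi[idx]
treat
# yi is now -0.3, -0.2, NA for obs 1, 2, 5
```

Because `match()` returns only the first hit, this assumes at most one "C" row per study and interval_id combination.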
