How to pair rows in a data frame in R with dplyr?

Question

I have a data frame containing observations from a control group and an experimental group, with replicates for each subject. Here is an example of my data frame:

subject  group    replicate value
  A     control      1       10
  A     control      2       15
  A     experim      1       40
  A     experim      2       45
  B     control      1       5
  B     experim      1       30
  C     control      1       50
  C     experim      1       NA

I'd like to pair each control observation with its corresponding experimental one in order to calculate the ratio between the paired values. The desired output:

subject  replicate  control   experim  ratio
  A         1         10        40       4
  A         2         15        45       3
  B         1          5        30       6
  C         1         50        NA       NA

Please note that the number of replicates per subject can vary (A has two replicates, B has only one, and C has one with a missing value). Ideally, I'd like to see this implemented with dplyr and pipes.

akrun · Accepted Answer

We can use dcast from data.table to convert to 'wide' format, then create the 'ratio' column by dividing 'experim' by 'control'. Note that setDT converts df1 to a data.table by reference.

library(data.table)
dcast(setDT(df1), subject+replicate~group, value.var="value")[,
            ratio:= experim/control][]
#     subject replicate control experim ratio
#1:       A         1      10      40     4
#2:       A         2      15      45     3
#3:       B         1       5      30     6
#4:       C         1      50      NA    NA

Or using spread from tidyr to convert to 'wide' format and then create the 'ratio' with mutate.

library(dplyr)
library(tidyr)
spread(df1, group, value) %>% 
        mutate(ratio = experim/control)
#    subject replicate control experim ratio
#1       A         1      10      40     4
#2       A         2      15      45     3
#3       B         1       5      30     6
#4       C         1      50      NA    NA
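In tidyr 1.0.0 and later, spread is superseded by pivot_wider, so the same reshaping can be sketched as below. This is a minimal example, assuming df1 looks like the data frame in the question (it is rebuilt here so the snippet runs on its own); the name res is only illustrative.

```r
library(dplyr)
library(tidyr)

# Example data rebuilt from the question (assumed structure of df1)
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  group     = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  value     = c(10, 15, 40, 45, 5, 30, 50, NA),
  stringsAsFactors = FALSE
)

# One column per group, one row per subject/replicate pair,
# then the ratio of the paired values
res <- df1 %>%
  pivot_wider(names_from = group, values_from = value) %>%
  mutate(ratio = experim / control)

res
```

The columns not named in names_from or values_from (here subject and replicate) identify each output row, which is what pairs the control and experimental observations.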

Or using reshape from base R

transform(reshape(df1, idvar = c("subject", "replicate"), 
   timevar="group", direction="wide"), ratio = value.experim/value.control)
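If reshaping is not wanted, the pairing can also be written with dplyr alone as a join of the control rows to the experimental rows. This is a different technique from the ones above, shown only as a sketch; it assumes the same df1 as in the question, and the names control, experim and paired are illustrative.

```r
library(dplyr)

# Example data rebuilt from the question (assumed structure of df1)
df1 <- data.frame(
  subject   = c("A", "A", "A", "A", "B", "B", "C", "C"),
  group     = c("control", "control", "experim", "experim",
                "control", "experim", "control", "experim"),
  replicate = c(1, 2, 1, 2, 1, 1, 1, 1),
  value     = c(10, 15, 40, 45, 5, 30, 50, NA),
  stringsAsFactors = FALSE
)

# Split by group and rename 'value' after the group it came from
control <- df1 %>%
  filter(group == "control") %>%
  select(subject, replicate, control = value)

experim <- df1 %>%
  filter(group == "experim") %>%
  select(subject, replicate, experim = value)

# Pair on subject and replicate; left_join keeps every control row
# even if no experimental row matches
paired <- control %>%
  left_join(experim, by = c("subject", "replicate")) %>%
  mutate(ratio = experim / control)

paired
```

Using left_join means a control observation with no experimental counterpart would still appear, with NA in 'experim' and 'ratio'.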
