Subtract data table by a subset of the data

Question

I hope it's not a duplicate, but I searched hard and didn't find the answer.

So, I have a big data.table (>50000 observations), here's the head:

   measure condition subject channel     score
1:     LZs      dark      03       1 0.5589379
2:     LZs      dark      03       2 0.5225509
3:     LZs      dark      03       3 0.5988951
4:     LZs      dark      03       4 0.5475331
5:     LZs      dark      03       5 0.5468930
6:     LZs      dark      03       6 0.5431141

I want to create a new column such as

data$diff = data$score - data$score[data$condition%in%"dark"]

I have 9 different measures, 5 conditions, 18 subjects and 64 channels, so I can't check line by line whether I get the expected result. Still, a random spot check in the data showed that the result was not what I expected.

How to be SURE that this simple operation is done using the score of the right measure, subject and channel each time?

Of course, I could do several for loops, but that's not nice R code. I assume it could be done using dplyr, but I'm not familiar with it and a simple mutate() didn't work better.

akrun · Accepted Answer

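Assuming that we need the difference within each 'measure' and 'subject', specify 'measure' and 'subject' in by and subtract from 'score' the elements where 'condition' is 'dark'. Within each group the 'dark' scores are recycled across the other conditions, so this assumes that every condition contributes the same number of rows, in the same channel order.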

library(data.table)
data[, Diff := score - score[condition == "dark"], by = .(measure, subject)]
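If the row order can't be trusted, one possible variant is to add 'channel' to by as well, so that each group holds a single 'dark' score and nothing depends on ordering. The sketch below is only an illustration under assumptions: the toy table, the extra condition names ("light", "sound"), the second measure ("LZc") and the helper objects dark and check are all made up here, and it assumes exactly one 'dark' row per measure/subject/channel combination. It also cross-checks the result against a join, which is one way to be sure the right scores were used.

```r
library(data.table)

# Hypothetical toy data: 2 measures x 3 conditions x 2 subjects x 4 channels
set.seed(1)
data <- CJ(measure   = c("LZs", "LZc"),
           condition = c("dark", "light", "sound"),
           subject   = c("03", "04"),
           channel   = 1:4)
data[, score := runif(.N)]
data <- data[sample(.N)]  # shuffle rows so nothing relies on row order

# Assumption: exactly one 'dark' row per measure/subject/channel combination
n_dark <- data[condition == "dark", .N, by = .(measure, subject, channel)]
stopifnot(nrow(n_dark[N != 1]) == 0)

# Each group now contains a single 'dark' score, subtracted from every row
data[, Diff := score - score[condition == "dark"],
     by = .(measure, subject, channel)]

# Independent check: join the 'dark' scores back on and recompute the difference
dark  <- data[condition == "dark",
              .(measure, subject, channel, dark_score = score)]
check <- merge(data, dark, by = c("measure", "subject", "channel"))
stopifnot(isTRUE(all.equal(check$Diff, check$score - check$dark_score)))
```

Grouping by all three columns gives more groups than grouping by 'measure' and 'subject' alone, but the result no longer depends on how the table happens to be sorted.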
