Reputation: 111
I am trying to figure out how to create a summary statistic in dplyr that uses information from different rows:
Subject BinLab mean.RT
s001 Deviant_RT 533.8115
s001 Standard_RT 508.2450
s002 Deviant_RT 465.5538
s002 Standard_RT 425.0351
Basically, I want to create a data frame that groups by Subject and gives me the difference between the mean.RT for Deviant_RT and the mean.RT for Standard_RT.
This is what I have tried:
RTDataDifferenceWave <- RTData %>%
  group_by(Subject) %>%
  summarise(DiffRT = Deviant_RT-StandardRT)
I'm stuck on how to create this new dependent variable "DiffRT" which, again, is the difference between Deviant_RT and Standard_RT. I would prefer an answer in dplyr but am open to other solutions.
Upvotes: 1
Views: 175
Reputation: 1622
Take into account that Deviant_RT and Standard_RT are not columns; they are values of BinLab. In this case you can set the sign of mean.RT in each row from the value of BinLab, and then sum the signed values within each subject, like so:
RTDataDifferenceWave <- RTData %>%
  mutate(mean.RT_signed = mean.RT * ifelse(BinLab == 'Deviant_RT', 1, -1)) %>%
  group_by(Subject) %>%
  summarise(DiffRT = sum(mean.RT_signed))
Notice this assumes that BinLab can only be one of Deviant_RT or Standard_RT. If it can take other values, you could change the mutate call to this:
mutate(mean.RT_signed = mean.RT * ifelse(BinLab == 'Deviant_RT', 1, ifelse(BinLab == 'Standard_RT', -1, 0)))
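For anyone who wants to run this end to end, here is a minimal sketch on the four rows from the question. The data frame is rebuilt by hand from the posted table, and `case_when()` is swapped in for the nested `ifelse()` purely for readability; that substitution is my choice here and should behave the same on this data.

```r
library(dplyr)

# Sample data rebuilt by hand from the table in the question
RTData <- data.frame(
  Subject = c("s001", "s001", "s002", "s002"),
  BinLab  = c("Deviant_RT", "Standard_RT", "Deviant_RT", "Standard_RT"),
  mean.RT = c(533.8115, 508.2450, 465.5538, 425.0351)
)

RTDataDifferenceWave <- RTData %>%
  # +1 for Deviant_RT rows, -1 for Standard_RT rows, 0 for any other label
  mutate(mean.RT_signed = mean.RT * case_when(
    BinLab == "Deviant_RT"  ~  1,
    BinLab == "Standard_RT" ~ -1,
    TRUE                    ~  0
  )) %>%
  group_by(Subject) %>%
  # Summing the signed values gives Deviant minus Standard per subject
  summarise(DiffRT = sum(mean.RT_signed))

print(RTDataDifferenceWave)
```

With one Deviant_RT row and one Standard_RT row per subject, the sum is exactly the difference the question asks for. If a subject had several rows with the same label, the sum would add them up first, which may or may not be what you want.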
Upvotes: 0
Reputation: 7592
One way is to switch to a wide-data format:
RTDataDifferenceWave <- RTData %>%
  group_by(Subject) %>%
  tidyr::spread(BinLab, mean.RT) %>%
  mutate(DiffRT = Deviant_RT-Standard_RT)
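In newer versions of tidyr (1.0.0 and later), `spread()` is superseded by `pivot_wider()`, so the same idea can be sketched as below. The sample data frame is rebuilt by hand from the question's table, and this version assumes each Subject has exactly one row per BinLab value.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt by hand from the table in the question
RTData <- data.frame(
  Subject = c("s001", "s001", "s002", "s002"),
  BinLab  = c("Deviant_RT", "Standard_RT", "Deviant_RT", "Standard_RT"),
  mean.RT = c(533.8115, 508.2450, 465.5538, 425.0351)
)

RTDataDifferenceWave <- RTData %>%
  # One row per Subject, one column per BinLab value
  pivot_wider(names_from = BinLab, values_from = mean.RT) %>%
  # Deviant_RT and Standard_RT now exist as real columns
  mutate(DiffRT = Deviant_RT - Standard_RT)

print(RTDataDifferenceWave)
```

The result keeps the Deviant_RT and Standard_RT columns next to DiffRT, which can be handy for checking the subtraction; add `select(Subject, DiffRT)` if only the difference is needed.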
Upvotes: 4