Reputation: 318
Hypothetical data:
hypo <- data.frame('X1' = c('a','b','a','b','a','b','a','b'),
'X2' = c('x','x','y','y','x','x','y','y'),
'X3' = c('m','m','m','m','n','n','n','n'),
'X4' = c(1,6,4,9,10,7,8,3))
Output:
X1 X2 X3 X4
1 a x m 1
2 b x m 6
3 a y m 4
4 b y m 9
5 a x n 10
6 b x n 7
7 a y n 8
8 b y n 3
I want to find the difference between X4 values for rows where X1 and X2 are the same and X3 differs. For a single combination, this can be done with subset():
value <- (subset(hypo, X1 == 'a' & X2 == 'x' & X3 == 'm')$X4
- subset(hypo, X1 == 'a' & X2 == 'x' & X3 == 'n')$X4)
# -9
How can we do this so that the difference between X4 values is calculated for every combination where X1 and X2 are the same and X3 differs?
Ideal output:
X1 X2 m-n
1 a x -9
2 b x -1
3 a y -4
4 b y 6
Any help would be greatly appreciated.
Upvotes: 2
Views: 67
Reputation: 6740
This one is explicit that it should compute m-n rather than n-m.
library(dplyr)
hypo %>% group_by(X1, X2) %>%
summarize(`m-n` = X4[X3=="m"] - X4[X3=="n"])
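For comparison, the same pairing can be sketched in base R without dplyr. This is only an illustration, assuming each X1/X2 combination has exactly one m row and one n row; the names m_rows, n_rows, paired and result are made up for the example.

```r
# Rebuild the example data from the question
hypo <- data.frame(
  X1 = c('a','b','a','b','a','b','a','b'),
  X2 = c('x','x','y','y','x','x','y','y'),
  X3 = c('m','m','m','m','n','n','n','n'),
  X4 = c(1,6,4,9,10,7,8,3)
)

# Split into the m rows and the n rows, keeping only the keys and the value
m_rows <- subset(hypo, X3 == "m", select = c(X1, X2, X4))
n_rows <- subset(hypo, X3 == "n", select = c(X1, X2, X4))

# Join on X1 and X2; the clashing X4 columns become X4.m and X4.n
paired <- merge(m_rows, n_rows, by = c("X1", "X2"), suffixes = c(".m", ".n"))

# Subtract explicitly as m - n, so row order in hypo does not matter
paired[["m-n"]] <- paired$X4.m - paired$X4.n
result <- paired[, c("X1", "X2", "m-n")]
print(result)
```

Because the subtraction names the m and n columns explicitly, the result does not depend on how the rows of hypo happen to be ordered.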
Upvotes: 2
Reputation: 43334
This is really easy with dplyr. Just group_by the two variables you want the same, and then summarise with diff to subtract the two. diff subtracts each value from the one after it, so with the m rows listed before the n rows it gives n-m; negate it to get m-n:
> library(dplyr)
> hypo %>% group_by(X1, X2) %>% summarise(-diff(X4))
Source: local data frame [4 x 3]
Groups: X1 [?]
X1 X2 -diff(X4)
(fctr) (fctr) (dbl)
1 a x -9
2 a y -4
3 b x -1
4 b y 6
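One caveat with the diff approach is that it depends on row order within each group. The sketch below illustrates this under two assumptions: a reasonably recent dplyr (the .groups argument needs dplyr 1.0 or later), and exactly one m row and one n row per X1/X2 pair. The names shuffled, unsafe and safe are hypothetical.

```r
library(dplyr)

hypo <- data.frame(
  X1 = c('a','b','a','b','a','b','a','b'),
  X2 = c('x','x','y','y','x','x','y','y'),
  X3 = c('m','m','m','m','n','n','n','n'),
  X4 = c(1,6,4,9,10,7,8,3)
)

# Same data, but with the n rows placed before the m rows
shuffled <- hypo[c(5:8, 1:4), ]

# Without sorting, -diff() now yields n - m, i.e. the sign flips
unsafe <- shuffled %>%
  group_by(X1, X2) %>%
  summarise(`m-n` = -diff(X4), .groups = "drop")

# Sorting by X3 first puts m before n in every group, restoring m - n
safe <- shuffled %>%
  arrange(X3) %>%
  group_by(X1, X2) %>%
  summarise(`m-n` = -diff(X4), .groups = "drop")

print(unsafe)
print(safe)
```

If the data is not guaranteed to arrive with m before n, sorting first (or indexing by X3 explicitly) avoids a silently flipped sign.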
Upvotes: 2