Reputation: 7725
I'd like to use dplyr to calculate differences in value between people nested in pair, by session.
dat <- data.frame(person=c(rep(1, 10),
                           rep(2, 10),
                           rep(3, 10),
                           rep(4, 10),
                           rep(5, 10),
                           rep(6, 10),
                           rep(7, 10),
                           rep(8, 10)),
                  pair=c(rep(1, 20),
                         rep(2, 20),
                         rep(3, 20),
                         rep(4, 20)),
                  condition=c(rep("NEW", 10),
                              rep("OLD", 10),
                              rep("NEW", 10),
                              rep("OLD", 10),
                              rep("NEW", 10),
                              rep("OLD", 10),
                              rep("NEW", 10),
                              rep("OLD", 10)),
                  session=rep(seq(from=1, to=10, by=1), 8),
                  value=c(0, 2, 4, 8, 16, 16, 18, 20, 20, 20,
                          0, 1, 1, 2, 4, 5, 8, 12, 15, 15,
                          0, 2, 8, 10, 15, 16, 18, 20, 20, 20,
                          0, 4, 4, 6, 6, 8, 10, 12, 12, 18,
                          0, 6, 8, 10, 16, 16, 18, 20, 20, 20,
                          0, 2, 2, 3, 4, 8, 8, 8, 10, 12,
                          0, 10, 12, 16, 18, 18, 18, 20, 20, 20,
                          0, 2, 2, 8, 10, 10, 11, 12, 15, 20)
)
For instance, persons 1 and 2 make a pair (pair==1):
person==1 & session==2: 2
person==2 & session==2: 1
Difference (NEW-OLD) is 2-1=1.
Here's what I have tried so far. I think I need to group_by() first and then summarise(), but I have not cracked this nut.
dat %>%
mutate(session = factor(session)) %>%
group_by(condition, pair, session) %>%
summarise(pairDiff = value-first(value))
Desired output:
Upvotes: 1
Views: 481
Reputation: 66834
Your output can be obtained by:
library(dplyr)
dat %>%
  group_by(pair, session) %>%
  arrange(condition) %>%
  summarise(diff = -diff(value))
Source: local data frame [40 x 3]
Groups: pair [?]
# A tibble: 40 x 3
    pair session  diff
   <dbl>   <dbl> <dbl>
 1     1       1     0
 2     1       2     1
 3     1       3     3
 4     1       4     6
 5     1       5    12
 6     1       6    11
 7     1       7    10
 8     1       8     8
 9     1       9     5
10     1      10     5
# ... with 30 more rows
The arrange() ensures that NEW and OLD are in the correct positions, but the solution does depend on there being exactly 2 values for each combination of pair and session.
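If relying on row order feels fragile, one possible variant is to pick each value by its condition label instead of calling arrange() and diff(). This is only a sketch: `small` and `res` are names made up here, and `small` is a cut-down copy of the question's data (pairs 1 and 2, sessions 1 to 3) so the example runs on its own.

```r
library(dplyr)

# Hypothetical subset of the question's data: pairs 1 and 2, sessions 1 to 3
small <- data.frame(
  person    = rep(1:4, each = 3),
  pair      = rep(1:2, each = 6),
  condition = rep(rep(c("NEW", "OLD"), each = 3), times = 2),
  session   = rep(1:3, times = 4),
  value     = c(0, 2, 4,   # person 1 (NEW)
                0, 1, 1,   # person 2 (OLD)
                0, 2, 8,   # person 3 (NEW)
                0, 4, 4)   # person 4 (OLD)
)

res <- small %>%
  group_by(pair, session) %>%
  # Select each value by its condition label rather than by row position
  summarise(diff = value[condition == "NEW"] - value[condition == "OLD"]) %>%
  ungroup()

res
```

This still assumes one NEW and one OLD row per pair and session; it just no longer cares which of the two comes first.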
Upvotes: 3
Reputation: 214927
You can spread condition to headers and then do the subtraction NEW - OLD:
library(dplyr); library(tidyr)
dat %>%
  select(-person) %>%
  spread(condition, value) %>%
  mutate(diff = NEW - OLD) %>%
  select(session, pair, diff)
# A tibble: 40 x 3
#    session  pair  diff
#      <dbl> <dbl> <dbl>
# 1        1     1     0
# 2        2     1     1
# 3        3     1     3
# 4        4     1     6
# 5        5     1    12
# 6        6     1    11
# 7        7     1    10
# 8        8     1     8
# 9        9     1     5
#10       10     1     5
# ... with 30 more rows
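On tidyr 1.0.0 or later, spread() is superseded by pivot_wider(), so the same idea could probably be written as below. This is a sketch rather than a drop-in replacement: `small` and `res_wide` are names made up here, and `small` is a cut-down copy of the question's data (pairs 1 and 2, sessions 1 to 3) so the example runs on its own.

```r
library(dplyr)
library(tidyr)

# Hypothetical subset of the question's data: pairs 1 and 2, sessions 1 to 3
small <- data.frame(
  person    = rep(1:4, each = 3),
  pair      = rep(1:2, each = 6),
  condition = rep(rep(c("NEW", "OLD"), each = 3), times = 2),
  session   = rep(1:3, times = 4),
  value     = c(0, 2, 4,   # person 1 (NEW)
                0, 1, 1,   # person 2 (OLD)
                0, 2, 8,   # person 3 (NEW)
                0, 4, 4)   # person 4 (OLD)
)

res_wide <- small %>%
  # Drop person first so that pair and session alone identify a row
  select(-person) %>%
  # One column per condition: NEW and OLD
  pivot_wider(names_from = condition, values_from = value) %>%
  mutate(diff = NEW - OLD) %>%
  select(session, pair, diff)

res_wide
```

Dropping person matters in both versions: if it is left in, NEW and OLD land on separate rows and the subtraction gives NA.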
Upvotes: 2