I have a data frame like this (with more observations and more values of code than in this example):
code tmp wek sbd
<chr> <chr> <dbl> <dbl>
1 abc01 T1 1 7.83
2 abc01 T1 1 7.83
3 abc01 T1 2 8.5
4 abc01 T1 2 8.5
5 abc01 T1 1 7.83
6 abc01 T1 1 7.83
7 abc01 T1 1 7.83
8 abc01 T1 1 7.83
9 abc01 T1 1 7.83
10 abc01 T2 1 7.56
11 abc01 T2 1 7.56
12 abc01 T2 2 7.22
13 abc01 T2 2 7.22
14 abc01 T2 1 7.56
15 abc01 T2 1 7.56
16 abc01 T2 1 7.56
17 abc01 T2 1 7.56
18 abc01 T2 1 7.56
Now I want to calculate a new variable that gives the difference in sbd between wek = 1 and wek = 2, within each combination of code and tmp.
So far I have only found functions that give the difference between consecutive rows, which does not fit my case.
Upvotes: 0
Using distinct may work:
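library(dplyr)

df %>%
  group_by(code, tmp) %>%
  distinct() %>%               # keep one row per code/tmp/wek/sbd combination
  summarise(diff = diff(sbd))  # second remaining row minus first (here wek 2 minus wek 1)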
code tmp diff
<chr> <chr> <dbl>
1 abc01 T1 0.67
2 abc01 T2 -0.34
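Note that diff() subtracts in row order, so the result above is wek 2 minus wek 1 only because the wek = 1 rows come first in each group. A minimal sketch that sorts by wek first, assuming each code/tmp group has exactly one sbd value per wek (df_small and result are made-up names for illustration):

```r
library(dplyr)

# Small stand-in for the question's data; in T1 the wek = 2 row comes first
df_small <- data.frame(
  code = "abc01",
  tmp  = c("T1", "T1", "T1", "T2", "T2", "T2"),
  wek  = c(2, 1, 1, 1, 2, 1),
  sbd  = c(8.5, 7.83, 7.83, 7.56, 7.22, 7.56)
)

result <- df_small %>%
  group_by(code, tmp) %>%
  distinct() %>%        # one row per code/tmp/wek/sbd combination
  arrange(wek) %>%      # make sure wek 1 precedes wek 2 within each group
  summarise(diff = diff(sbd), .groups = "drop")  # wek 2 minus wek 1
```

If a group could have more than two distinct rows, diff() would return several values per group, so this only fits data shaped like the example.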
Upvotes: 2
You can use match to get the corresponding sbd value at wek 1 and 2.
library(dplyr)
df %>%
group_by(code, tmp) %>%
summarise(diff = sbd[match(1, wek)] - sbd[match(2, wek)])
# code tmp diff
# <chr> <chr> <dbl>
#1 abc01 T1 -0.67
#2 abc01 T2 0.34
If you want to add a new column to the data frame while keeping all rows, use mutate instead of summarise.
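As a rough sketch of the mutate variant, run on a small made-up data frame with the same columns (df_small and result are illustrative names, not from the question):

```r
library(dplyr)

# Small stand-in for the question's data (same columns, fewer rows)
df_small <- data.frame(
  code = "abc01",
  tmp  = c("T1", "T1", "T1", "T2", "T2", "T2"),
  wek  = c(1, 2, 1, 1, 2, 1),
  sbd  = c(7.83, 8.5, 7.83, 7.56, 7.22, 7.56)
)

result <- df_small %>%
  group_by(code, tmp) %>%
  # match(1, wek) is the position of the first row in the group with wek == 1
  mutate(diff = sbd[match(1, wek)] - sbd[match(2, wek)]) %>%
  ungroup()
```

Every row keeps its original values and gets the group's difference (wek 1 minus wek 2) in the new diff column. If a group has no row for one of the two weeks, match returns NA and so does diff.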
data
It is easier to help if you provide data in a reproducible format, for example:
df <- structure(list(code = c("abc01", "abc01", "abc01", "abc01", "abc01",
"abc01", "abc01", "abc01", "abc01", "abc01", "abc01", "abc01",
"abc01", "abc01", "abc01", "abc01", "abc01", "abc01"), tmp = c("T1",
"T1", "T1", "T1", "T1", "T1", "T1", "T1", "T1", "T2", "T2", "T2",
"T2", "T2", "T2", "T2", "T2", "T2"), wek = c(1L, 1L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), sbd = c(7.83,
7.83, 8.5, 8.5, 7.83, 7.83, 7.83, 7.83, 7.83, 7.56, 7.56, 7.22,
7.22, 7.56, 7.56, 7.56, 7.56, 7.56)),
class = "data.frame", row.names = c(NA, -18L))
Upvotes: 4