Reputation: 1315
My question is similar to this OP and this OP, with one minor difference that turns out to be surprisingly tricky.
Example of my data:
ind_id wt date
1002 25 1987-07-27
1002 15 1988-05-05
2340 30 1987-03-18
2340 52 1989-08-15
I am calculating the difference between wt values after group_by(ind_id).
To do this:
df <- df %>%
  group_by(ind_id) %>%
  mutate(mass_diff = wt - lag(wt))
This gives me this output (ignoring the first row of each group, where mass_diff is NA):
ind_id wt date mass_diff
1002 15 1988-05-05 -10
2340 52 1989-08-15 22
But the output I want should keep the first wt record, not the last.
Desired output:
ind_id wt date mass_diff
1002 25 1988-05-05 -10
2340 30 1989-08-15 22
Note that the wt column is the only one I'd like to have maintained from the first row. (Keep in mind that this example is overly simplified and I am actually working with 18 rows.)
Any suggestions (using dplyr) would be appreciated!
Upvotes: 2
Views: 720
Reputation: 25323
A possible solution:
library(tidyverse)
df <- structure(list(ind_id = c(1002, 1002, 2340, 2340), wt = c(25,
15, 30, 52), date = structure(c(6416, 6699, 6285, 7166), class = "Date")), row.names = c(NA,
-4L), class = "data.frame")
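df %>%
  group_by(ind_id) %>%
  # difference from the previous record within each individual
  mutate(mass_diff = wt - lag(wt)) %>%
  # overwrite wt with the group's first value, after the difference is computed
  mutate(wt = first(wt)) %>%
  # keep only the last row of each group
  slice_tail(n = 1) %>%
  ungroup()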
#> # A tibble: 2 × 4
#> ind_id wt date mass_diff
#> <dbl> <dbl> <date> <dbl>
#> 1 1002 25 1988-05-05 -10
#> 2 2340 30 1989-08-15 22
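The same result can also be sketched with a single summarise() call, which may be easier to follow when each individual has more than two records. This is only an illustration under two assumptions that the question does not confirm: that rows should be ordered by date within each ind_id, and that mass_diff means the last weight minus the one just before it (which is what the lag() approach returns on the last row). The name res is made up for the example.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  ind_id = c(1002, 1002, 2340, 2340),
  wt     = c(25, 15, 30, 52),
  date   = as.Date(c("1987-07-27", "1988-05-05", "1987-03-18", "1989-08-15"))
)

res <- df %>%
  arrange(ind_id, date) %>%                # assumption: records are ordered by date
  group_by(ind_id) %>%
  summarise(
    mass_diff = last(wt) - nth(wt, -2L),   # last weight minus the previous one
    date      = last(date),                # date of the last record
    wt        = first(wt),                 # set wt last so the lines above still see the original values
    .groups   = "drop"
  ) %>%
  select(ind_id, wt, date, mass_diff)      # restore the column order from the question

print(res)
```

The order of the arguments inside summarise() matters here: expressions are evaluated in sequence, so wt is replaced by its first value only after mass_diff has been computed from the original column.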
Upvotes: 2