R: assignment using values from several rows

Question

Say I have measured some value (value, encoded as H, L or I) in five individuals (id) at two time points (time). Sometimes NAs may occur in value:

require(stringr)
require(dplyr)    
set.seed(8)

df1 <- data.frame(
  time=rep(c(1,2), 5),
  id=rep(c("a", "b", "c", "d", "e"),2),
  value=sample(c("H","L","I", NA), replace=TRUE, 10))

How can I make a factor variable (preferably using dplyr::mutate()) that indicates, for each id, the transition of value from time 1 to time 2 (e.g. "HL" if H at time 1 and L at time 2)?

df1 %>%
  group_by(id) %>%
  arrange(time)

Gives:

   time id value
1     1  a     L
2     2  a     I
3     1  b     L
4     2  b     H
5     1  c    NA
6     2  c    NA
7     1  d    NA
8     2  d     I
9     1  e     L
10    2  e     I

And I would need a fourth column indicating time transition, like (made-up):

   time id value transition
1     1  a     L         L-I
2     2  a     I         L-I
3     1  b     L         L-H
4     2  b     H         L-H
5     1  c    NA         NA-NA
6     2  c    NA         NA-NA
7     1  d    NA         NA-I
8     2  d     I         NA-I
9     1  e     L         L-I
10    2  e     I         L-I

Something like (if only the str_c() command could do it):

df1 <- 
  df1 %>%
  group_by(id) %>%
  arrange(time) %>%
  mutate(transition=str_c(value, sep="-"))

davechilders · Accepted Answer

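df1 %>%
  # sort so that, within each id, the time 1 row comes before the time 2 row
  arrange(id, time) %>% 
  group_by(id) %>%
  # value[1] is the time 1 value and value[2] the time 2 value of the group;
  # the pasted string has length 1 and is recycled to both rows of the id
  mutate(transition = paste0(value[1],"-",value[2]))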
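A variant that may be a little more robust is sketched below. It selects the two values by time point instead of by row position, so it does not depend on the rows being sorted first, and it converts the result to a factor, as the question asks. The data frame `df` and the result `res` are hypothetical names with hand-written values (not the sampled data from the question), and the sketch assumes every id has exactly one row for time 1 and one for time 2.

```r
library(dplyr)

# Hypothetical hand-written data: three ids, two time points each,
# with one missing value (id "c" at time 1).
df <- data.frame(
  time  = c(1, 2, 1, 2, 1, 2),
  id    = c("a", "a", "b", "b", "c", "c"),
  value = c("L", "I", "L", "H", NA, "I"),
  stringsAsFactors = FALSE
)

res <- df %>%
  group_by(id) %>%
  # Pick the values by time point rather than by position within the group.
  # Assumes exactly one row per id for each of time 1 and time 2.
  mutate(transition = paste0(value[time == 1], "-", value[time == 2])) %>%
  ungroup() %>%
  # The question asks for a factor, so convert the character column.
  mutate(transition = factor(transition))

print(res)
```

Note that paste0() turns a missing value into the literal string "NA", which is what produces labels such as "NA-I" in the desired output.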
