juanli

Reputation: 613

Adding first row by group for longitudinal data frame?

Here is what my data frame looks like:

subject_ID  cd4_time0  other_time  cd4_other_time
1           12         0.5         462
1           12         1.0         140
1           12         3.0         789
1           12         6.0         100
2           4          0.5         230
2           4          1.0         350
2           4          1.9         450
2           4          3.2         550
3         
3
..
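For anyone who wants to reproduce this, the rows shown above for subjects 1 and 2 can be built as a small data frame. This is only a sketch: the name df and the integer/double column types are assumptions taken from the answer's printed output, and the elided rows for subject 3 onward are left out.

```r
# Sample data for the first two subjects only (later subjects are elided above).
df <- data.frame(
  subject_ID     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  cd4_time0      = c(12L, 12L, 12L, 12L, 4L, 4L, 4L, 4L),   # baseline CD4, repeated per row
  other_time     = c(0.5, 1.0, 3.0, 6.0, 0.5, 1.0, 1.9, 3.2), # follow-up time
  cd4_other_time = c(462L, 140L, 789L, 100L, 230L, 350L, 450L, 550L) # follow-up CD4
)
```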

A brief introduction to my data: more than 2,000 patients were followed up over several years. One column holds each patient's baseline CD4 value (cd4_time0), and another holds the repeated CD4 measurements taken at follow-up (cd4_other_time). I would like to combine these two into a single column for my analysis by adding, for each subject_ID, a first row whose cd4_other_time is the baseline value. The output should look like this:

subject_ID  cd4_time0  other_time  cd4_other_time
1           12         0.5         12
1           12         0.5         462
1           12         1.0         140
1           12         3.0         789
1           12         6.0         100
2           4          0.5         4
2           4          0.5         230
2           4          1.0         350
2           4          1.9         450
2           4          3.2         550
3         
3
..

Any R-based solution is welcome. Thanks in advance.

Upvotes: 1

Views: 211

Answers (1)

akuiper

Reputation: 214967

One option is to use group_by %>% do to build the data frame for each group dynamically:

library(dplyr)

df %>% group_by(subject_ID) %>% do({
      # extract the group's first row and overwrite its
      # cd4_other_time with the baseline value
      firstRow <- .[1, ]
      firstRow['cd4_other_time'] <- firstRow['cd4_time0']

      # inside do(), `.` is the sub data frame for one subject_ID;
      # put the modified first row on top of it
      bind_rows(firstRow, .)
})

#Source: local data frame [10 x 4]
#Groups: subject_ID [2]

#   subject_ID cd4_time0 other_time cd4_other_time
#        <int>     <int>      <dbl>          <int>
#1           1        12        0.5             12
#2           1        12        0.5            462
#3           1        12        1.0            140
#4           1        12        3.0            789
#5           1        12        6.0            100
#6           2         4        0.5              4
#7           2         4        0.5            230
#8           2         4        1.0            350
#9           2         4        1.9            450
#10          2         4        3.2            550
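Since do() is marked as superseded in recent dplyr releases, here is an alternative sketch in base R that should give the same rows without grouping. It is not part of the approach above, just one possible variant: it takes the first row of each subject, overwrites cd4_other_time with the baseline value, stacks those rows on top of the original data, and then sorts by subject_ID. It relies on order() leaving ties in their original order, so each baseline row stays ahead of that subject's follow-up rows. The names baseline and combined are made up for illustration, and df is rebuilt here from the question's sample rows.

```r
# Sample data from the question (subjects 1 and 2 only).
df <- data.frame(
  subject_ID     = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
  cd4_time0      = c(12L, 12L, 12L, 12L, 4L, 4L, 4L, 4L),
  other_time     = c(0.5, 1.0, 3.0, 6.0, 0.5, 1.0, 1.9, 3.2),
  cd4_other_time = c(462L, 140L, 789L, 100L, 230L, 350L, 450L, 550L)
)

# First row of each subject, with the baseline CD4 copied into cd4_other_time.
baseline <- df[!duplicated(df$subject_ID), ]
baseline$cd4_other_time <- baseline$cd4_time0

# Stack the baseline rows on top, then sort by subject. Ties keep their
# stacking order, so each baseline row comes first within its subject.
combined <- rbind(baseline, df)
combined <- combined[order(combined$subject_ID), ]
rownames(combined) <- NULL

print(combined)
```

As in the desired output, the added row keeps the other_time of the subject's first follow-up (0.5 here); if the baseline should sit at time 0 instead, baseline$other_time could be set to 0 before stacking.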

Upvotes: 2
