Reputation: 613
Here is what my data frame looks like:
subject_ID cd4_time0 other_time cd4_other_time
1 12 0.5 462
1 12 1.0 140
1 12 3.0 789
1 12 6.0 100
2 4 0.5 230
2 4 1.0 350
2 4 1.9 450
2 4 3.2 550
3
3
..
A brief introduction to my data frame: it covers more than 2,000 patients followed up over several years. One column (cd4_time0) holds the baseline CD4 value, and another (cd4_other_time) holds the repeated CD4 measurements for each patient. I would like to combine these two types of CD4 data into one column, by subject_ID, for my analysis. The output should look like this:
subject_ID cd4_time0 other_time cd4_other_time
1 12 0.5 12
1 12 0.5 462
1 12 1.0 140
1 12 3.0 789
1 12 6.0 100
2 4 0.5 4
2 4 0.5 230
2 4 1.0 350
2 4 1.9 450
2 4 3.2 550
3
3
..
Any R-based solution is welcome. Thanks in advance.
Upvotes: 1
Reputation: 214967
One option is to use group_by %>% do to build the data frame for each group dynamically:
library(dplyr)
df %>% group_by(subject_ID) %>% do({
    # inside do(), `.` is the sub data frame for a single subject_ID
    # extract the first row and overwrite its measurement with the baseline value
    firstRow <- .[1, ]
    firstRow['cd4_other_time'] <- firstRow['cd4_time0']
    # put the modified first row on top of the subject's original rows
    bind_rows(firstRow, .)
})
#Source: local data frame [10 x 4]
#Groups: subject_ID [2]
# subject_ID cd4_time0 other_time cd4_other_time
# <int> <int> <dbl> <int>
#1 1 12 0.5 12
#2 1 12 0.5 462
#3 1 12 1.0 140
#4 1 12 3.0 789
#5 1 12 6.0 100
#6 2 4 0.5 4
#7 2 4 0.5 230
#8 2 4 1.0 350
#9 2 4 1.9 450
#10 2 4 3.2 550
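If you would rather not use do(), the same result can probably be reached by building the baseline rows separately and stacking them on top of the original rows. The following is only a sketch under some assumptions: the sample df is reconstructed from the first two subjects in the question, and the names followup, baseline, result and the helper column row_order are hypothetical, introduced here for illustration. As in the output above, the baseline row keeps the other_time of the subject's first measurement.

```r
library(dplyr)

# Sample data reconstructed from the first two subjects in the question
df <- data.frame(
  subject_ID     = c(1, 1, 1, 1, 2, 2, 2, 2),
  cd4_time0      = c(12, 12, 12, 12, 4, 4, 4, 4),
  other_time     = c(0.5, 1.0, 3.0, 6.0, 0.5, 1.0, 1.9, 3.2),
  cd4_other_time = c(462, 140, 789, 100, 230, 350, 450, 550)
)

# Remember the original row order so the follow-up rows keep their sequence
followup <- df %>% mutate(row_order = row_number())

# One row per subject (the first one), with the measurement replaced by the
# baseline value; row_order = 0 sorts it ahead of that subject's other rows
baseline <- followup %>%
  distinct(subject_ID, .keep_all = TRUE) %>%
  mutate(cd4_other_time = cd4_time0, row_order = 0L)

# Stack both pieces, sort within subject, and drop the helper column
result <- bind_rows(baseline, followup) %>%
  arrange(subject_ID, row_order) %>%
  select(-row_order)

print(result)
```

This works on the whole data frame at once instead of once per group, which may matter with more than 2,000 subjects, but I have not timed it against the do() version.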
Upvotes: 2