Reputation: 11
I have data organized by unique id and sorted by date of visit. Some people have multiple visits. The data is in long format, sorted by visit. I only want to replicate the row for the last visit of each person. How does one replicate only specific rows in a data frame?
id visit glucose
1 12 Jan 2015 12
1 3 Feb 2015 8
2 1 Feb 2015 13
3 12 Jan 2015 7
3 4 Feb 2015 13
3 1 March 2015 8
Upvotes: 0
Views: 76
Reputation: 887048
If we need to duplicate the last row based on 'visit' for each 'id', we can use data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)), order by 'id' and 'visit' (converted to Date so that it sorts chronologically), and then, grouped by 'id', replicate the last row by appending its index (.N) to the row sequence.
library(data.table)
setDT(df1)[order(id, as.Date(visit, "%d %b %Y")), .SD[c(seq_len(.N), .N)], by = id]
# id visit glucose
#1: 1 12 Jan 2015 12
#2: 1 3 Feb 2015 8
#3: 1 3 Feb 2015 8
#4: 2 1 Feb 2015 13
#5: 2 1 Feb 2015 13
#6: 3 12 Jan 2015 7
#7: 3 4 Feb 2015 13
#8: 3 1 March 2015 8
#9: 3 1 March 2015 8
If we want only the last row for each 'id':
setDT(df1)[order(id, as.Date(visit, "%d %b %Y")), .SD[.N], by = id]
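To try both snippets end to end, here is a minimal self-contained sketch. It rebuilds the question's example as df1 (the column types are my assumption, since the question shows only printed output) and stores the parsed date in a helper column, visit_date, which is a name introduced here for illustration. It also assumes an English locale, because %b month parsing is locale dependent.

```r
library(data.table)

# Example data from the question; column types are assumed
df1 <- data.frame(
  id      = c(1, 1, 2, 3, 3, 3),
  visit   = c("12 Jan 2015", "3 Feb 2015", "1 Feb 2015",
              "12 Jan 2015", "4 Feb 2015", "1 March 2015"),
  glucose = c(12, 8, 13, 7, 13, 8),
  stringsAsFactors = FALSE
)

# Work on a copy so df1 itself is left untouched
dt <- as.data.table(df1)

# Parse the visit date once (English month names assumed);
# on input, %b accepts both abbreviated and full month names
dt[, visit_date := as.Date(visit, format = "%d %b %Y")]

# Sort chronologically within each id
setorder(dt, id, visit_date)

# All rows of each id, followed by one extra copy of its last row
dup <- dt[, .SD[c(seq_len(.N), .N)], by = id]

# Only the last row of each id
last <- dt[, .SD[.N], by = id]

print(dup)
print(last)
```

Keeping the parsed date as a column avoids re-parsing the strings in every call, and setorder sorts the table by reference instead of creating a reordered copy.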
Upvotes: 1