Paul Rheeder

Reputation: 11

duplicating/replicating only specific rows in a data frame

I have data organised by unique id and sorted by date of visit. Some people have multiple visits. The data is in long format. I want to replicate only the row for each person's last visit. How does one replicate only specific rows in a data frame?

id  visit          glucose
1   12 Jan 2015    12
1   3 Feb 2015     8
2   1 Feb 2015     13
3   12 Jan 2015    7
3   4 Feb 2015     13
3   1 March 2015   8

Upvotes: 0

Views: 76

Answers (1)

akrun

Reputation: 887048

To duplicate the last visit row for each 'id', we can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1), order by 'id' and the parsed 'visit' date, and then, grouped by 'id', select every row plus the last row (.N) a second time.

library(data.table)
setDT(df1)[order(id, as.Date(visit, "%d %b %Y")), .SD[c(seq_len(.N), .N)], by = id]
#    id         visit glucose
#1:  1   12 Jan 2015      12
#2:  1    3 Feb 2015       8
#3:  1    3 Feb 2015       8
#4:  2    1 Feb 2015      13
#5:  2    1 Feb 2015      13
#6:  3   12 Jan 2015       7
#7:  3    4 Feb 2015      13
#8:  3 1  March 2015       8
#9:  3 1  March 2015       8
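For anyone who wants to run this, here is a minimal reproducible sketch. It assumes the question's table is stored in a data frame called df1 with 'visit' as a character column, and that the session uses an English locale so the month names parse. The result name 'res' and the NA check are additions for illustration only.

```r
library(data.table)

# Sample data copied from the question; 'visit' is kept as plain text.
df1 <- data.frame(
  id      = c(1, 1, 2, 3, 3, 3),
  visit   = c("12 Jan 2015", "3 Feb 2015", "1 Feb 2015",
              "12 Jan 2015", "4 Feb 2015", "1 March 2015"),
  glucose = c(12, 8, 13, 7, 13, 8),
  stringsAsFactors = FALSE
)

# Guard against unparsed dates: on input, "%b" accepts both abbreviated
# and full month names, but only in the current locale.
stopifnot(!anyNA(as.Date(df1$visit, "%d %b %Y")))

# Order by id and visit date, then within each id take all rows
# (seq_len(.N)) followed by the last row (.N) once more.
res <- setDT(df1)[order(id, as.Date(visit, "%d %b %Y")),
                  .SD[c(seq_len(.N), .N)], by = id]
print(res)
```

Each id gains exactly one extra row, so the six input rows become nine.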

If we want only the last row for each 'id':

setDT(df1)[order(id, as.Date(visit, "%d %b %Y")), .SD[.N], by = id]
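If data.table is not available, the same idea can be sketched in base R. This is an alternative approach, not part of the answer above: it sorts the data, flags the last row of each id with duplicated(..., fromLast = TRUE), and repeats the flagged rows with rep(). The names 'is_last', 'dup' and 'last_only' are made up for this example, and an English locale is again assumed for the month names.

```r
# Sample data copied from the question.
df1 <- data.frame(
  id      = c(1, 1, 2, 3, 3, 3),
  visit   = c("12 Jan 2015", "3 Feb 2015", "1 Feb 2015",
              "12 Jan 2015", "4 Feb 2015", "1 March 2015"),
  glucose = c(12, 8, 13, 7, 13, 8),
  stringsAsFactors = FALSE
)

# Sort by id, then by the parsed visit date.
df1 <- df1[order(df1$id, as.Date(df1$visit, "%d %b %Y")), ]

# TRUE for the last row of each id in the sorted data.
is_last <- !duplicated(df1$id, fromLast = TRUE)

# Repeat each row once, and the last row of each id twice.
dup <- df1[rep(seq_len(nrow(df1)), times = 1 + is_last), ]
rownames(dup) <- NULL

# Only the last visit per id.
last_only <- df1[is_last, ]
```

The rep() call works on row positions, so any other logical condition can stand in for 'is_last' to replicate a different set of rows.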

Upvotes: 1
