Reputation: 551
I have a dataframe with three columns: Alertdate, AppointmentDate and ID. The ID is not unique, and there are multiple occurrences of alerts and appointments for each ID. I want to calculate the difference between Alertdate and AppointmentDate, but only for the first occurrence of each ID. Example:
ID Alertdate AppointmentDate NBD
*1 01/01/2000 04/01/2000 3*
1 02/01/2000 04/01/2000 2
*2 01/01/2000 04/01/2000 3*
2 01/01/2000 05/01/2000 4
For the above sample data, I only need rows 1 and 3 (marked with asterisks) in my output.
Upvotes: 1
Views: 55
Reputation: 10375
In case you need to calculate NBD first:
dat <- read.table(text = "
ID Alertdate AppointmentDate NBD
1 01/01/2000 04/01/2000 3
1 02/01/2000 04/01/2000 2
2 01/01/2000 04/01/2000 3
2 01/01/2000 05/01/2000 4", header = TRUE)
# Parse the day/month/year strings as Date objects
dat$Alertdate <- as.Date(dat$Alertdate, format = "%d/%m/%Y")
dat$AppointmentDate <- as.Date(dat$AppointmentDate, format = "%d/%m/%Y")
# NBD is the number of days between the alert and the appointment
dat$NBD <- as.numeric(dat$AppointmentDate - dat$Alertdate)
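As a quick sanity check on the date parsing, here is a minimal sketch. The variable names `alert`, `appt` and `nbd` are only for illustration and are not part of the question's data. With `format = "%d/%m/%Y"`, the string `04/01/2000` is read as 4 January rather than 1 April, so the gap to 1 January should come out as 3 days.

```r
# Hypothetical single-row check of the day-first date format
alert <- as.Date("01/01/2000", format = "%d/%m/%Y")
appt  <- as.Date("04/01/2000", format = "%d/%m/%Y")

# difftime() makes the unit explicit; as.numeric() drops the difftime class
nbd <- as.numeric(difftime(appt, alert, units = "days"))
print(nbd)
```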
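In case the table is not sorted, order it by ID and Alertdate so that the first row for each ID is its earliest alert:
dat <- dat[order(dat$ID, dat$Alertdate), ]
Finally, keep only the first row of each ID group:
do.call(rbind, by(dat, list(dat$ID), function(x) x[1, ]))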
ID Alertdate AppointmentDate NBD
1 1 01/01/2000 04/01/2000 3
2 2 01/01/2000 04/01/2000 3
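A shorter base R alternative, offered as a sketch rather than as part of the original answer: once the rows are sorted, `!duplicated(dat$ID)` is TRUE only for the first row of each ID, so it can be used directly as a row filter. The data frame below is rebuilt from the question's sample so the snippet runs on its own; the name `first_rows` is hypothetical.

```r
# Rebuild the sample data from the question (dates are day/month/year)
dat <- data.frame(
  ID = c(1, 1, 2, 2),
  Alertdate = as.Date(c("01/01/2000", "02/01/2000", "01/01/2000", "01/01/2000"),
                      format = "%d/%m/%Y"),
  AppointmentDate = as.Date(c("04/01/2000", "04/01/2000", "04/01/2000", "05/01/2000"),
                            format = "%d/%m/%Y")
)
dat$NBD <- as.numeric(dat$AppointmentDate - dat$Alertdate)

# Sort so the earliest alert comes first within each ID
dat <- dat[order(dat$ID, dat$Alertdate), ]

# duplicated() flags every repeat of an ID; negating it keeps the first row per ID
first_rows <- dat[!duplicated(dat$ID), ]
print(first_rows)
```

This avoids splitting the data frame into groups and binding it back together, which may matter on larger tables.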
Upvotes: 3