R dplyr summarise date gaps

Question

I have data on a set of students and the semesters they were enrolled in courses.

ID = c(1,1,1,
   2,2,
   3,3,3,3,3,
   4)

The semester variable "Date" is coded as the year followed by 20 for spring, 30 for summer, and 40 for fall, so the Date value 201430 is the summer semester of 2014.

Date = c(201220,201240,201330,
     201340,201420,
     201120,201340,201420,201440,201540,
     201640)

Enrolled<-data.frame(ID,Date)

I'm using dplyr to group the data by ID and to summarise various aspects of a given student's enrollment history:

library(dplyr)  # provides %>%, group_by(), summarise(), n_distinct()

Enrollment.History <- dplyr::select(Enrolled, ID, Date) %>% group_by(ID) %>%
  summarise(Total.Semesters = n_distinct(Date), First.Semester = min(Date))

I'm trying to get a measure of the number of enrollment gaps each student has, as well as the size of the largest enrollment gap. The data frame should end up looking like this:

Enrollment.History$Gaps<-c(2,0,3,0)
Enrollment.History$Biggest.Gap<-c(1,0,7,0)
print(Enrollment.History)

I'm just trying to figure out the best way to code those gap variables. Is it better to turn the Date variable into an ordered factor? I hope there is a simple solution.
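
For reference, here is a minimal sketch of one way the gap variables could be computed without an ordered factor. It assumes every term (spring, summer, fall) counts as a semester a student could have enrolled in, which is what the expected values above imply. The idea is to map each Date code to a consecutive integer so that adjacent semesters differ by exactly 1. The names semester_index, gap_sizes and Gap.Summary are made up for illustration.

```r
library(dplyr)

ID <- c(1,1,1, 2,2, 3,3,3,3,3, 4)
Date <- c(201220,201240,201330, 201340,201420,
          201120,201340,201420,201440,201540, 201640)
Enrolled <- data.frame(ID, Date)

# Hypothetical helper: convert a YYYYTT code into a running semester count.
# Terms 20/30/40 become 0/1/2, so consecutive semesters differ by 1.
semester_index <- function(date) {
  year <- date %/% 100
  term <- date %% 100
  year * 3 + (term / 10 - 2)
}

# Hypothetical helper: number of missed semesters between each pair of
# consecutive enrolled semesters (0 means no gap between that pair).
gap_sizes <- function(date) {
  diff(sort(unique(semester_index(date)))) - 1
}

Gap.Summary <- Enrolled %>%
  group_by(ID) %>%
  summarise(Total.Semesters = n_distinct(Date),
            First.Semester = min(Date),
            Gaps = sum(gap_sizes(Date) > 0),
            # max(0, ...) keeps students with a single semester at 0
            Biggest.Gap = max(0, gap_sizes(Date)))

print(Gap.Summary)
```

If summer should not count as a missed semester, the mapping inside semester_index would need to change (for example, two terms per year instead of three), and the expected values above would change with it.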
