L Xandor

Reputation: 1841

Time Series model for predicting online student grades?

I have a dataset with daily activities for online students (time spent, videos watched, etc.). Based on this data I want to predict whether each student will pass or not. Until now I have been treating it as a classification problem, training a separate model for each week on the student activity to date and the final outcomes.

This model works pretty well, but it ignores behavior over time. I am interested in doing some kind of time series analysis where the model takes into account all datapoints for each student over time to make the final prediction.

The time series models I've been looking at aim to forecast a specific metric for a population (demand, revenue, etc.) at future time steps. In my case I am less interested in aggregated per-timestep metrics and more interested in the final outcome for each individual.

In other words, mine is more of a classification or regression problem, but I am hoping to leverage each individual student's usage patterns over time. Is there a way to combine the two? Basically, I want to build a better classifier that understands patterns over time.

Upvotes: 0

Views: 383

Answers (2)

brandoldperson

Reputation: 87

Broadly speaking, you have two options:

  1. Create time-binned aggregates of your features (for example, weekly totals per student) to help your classifier capture time dependencies. You could also use something like tsfresh to automatically generate features from your time series.

  2. Use a multivariate time-series model. You could try an RNN or VAR (example here)
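As a rough sketch of option 1, here is one way the time-binned aggregates could be built with pandas. The column names (`student_id`, `day`, `minutes`, `videos`) and the toy data are assumptions for illustration, not the asker's actual schema.

```python
import pandas as pd

# Hypothetical daily activity log: one row per student per active day.
log = pd.DataFrame({
    "student_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "day":        [1, 3, 8, 10, 2, 5, 9, 13],
    "minutes":    [30, 45, 20, 10, 15, 25, 40, 60],
    "videos":     [2, 3, 1, 1, 1, 2, 3, 4],
})

# Bin days into course weeks (days 1-7 -> week 1, days 8-14 -> week 2).
log["week"] = (log["day"] - 1) // 7 + 1

# Sum activity per student and week, then pivot so every
# (metric, week) pair becomes its own feature column.
weekly = log.groupby(["student_id", "week"])[["minutes", "videos"]].sum()
wide = weekly.unstack("week", fill_value=0)
wide.columns = [f"{metric}_week{week}" for metric, week in wide.columns]

# A simple "behavior over time" feature: change between the two weeks.
wide["minutes_change"] = wide["minutes_week2"] - wide["minutes_week1"]

print(wide)
```

The resulting table has one row per student, so it can be joined to the pass/fail labels and fed to whatever classifier you already use.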

Upvotes: 1

KWx

Reputation: 310

Look at the fbprophet module (published as prophet in newer releases). It can separate a time series into components such as trend, seasonality and noise. The module was originally developed for web traffic.

You can incorporate this into your regression model in a number of ways by constructing additional variables, for example:

  • Ratio of trend at start of term to end of term
  • The magnitude of the weekly seasonal pattern
  • The variance of the white noise series.
  • etc.
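To make the variables above concrete, here is a minimal sketch that computes them for one student's daily series. Note that it does not use fbprophet: it swaps in a crude hand-rolled decomposition (weekly mean as trend, average weekday deviation as seasonality) with NumPy. The function name `decomposition_features` and the toy series are hypothetical.

```python
import numpy as np

def decomposition_features(daily, period=7):
    """Crude trend/seasonal/residual split of one student's daily series."""
    daily = np.asarray(daily, dtype=float)
    n_weeks = len(daily) // period
    daily = daily[: n_weeks * period]        # keep whole weeks only
    weeks = daily.reshape(n_weeks, period)

    # Trend: each week's mean level, repeated across that week's days.
    weekly_level = weeks.mean(axis=1)
    trend = np.repeat(weekly_level, period)

    # Seasonality: average deviation from the weekly level per weekday.
    seasonal_profile = (weeks - weekly_level[:, None]).mean(axis=0)
    seasonal = np.tile(seasonal_profile, n_weeks)

    # Whatever is left over is treated as noise.
    residual = daily - trend - seasonal

    # Ratio is undefined if the student was inactive in the last week.
    last = weekly_level[-1]
    trend_ratio = weekly_level[0] / last if last > 0 else float("nan")

    return {
        "trend_ratio": trend_ratio,
        "weekly_amplitude": seasonal_profile.max() - seasonal_profile.min(),
        "residual_variance": residual.var(),
    }

# Toy example: two weeks of minutes per day, activity halving in week 2.
minutes = [10, 20, 30, 40, 50, 60, 70, 5, 10, 15, 20, 25, 30, 35]
features = decomposition_features(minutes)
print(features)
```

Running this per student gives three extra columns for the regression or classification model. A proper decomposition library would produce smoother components, but the features would be used the same way.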

That is not to say any of these constructed variables will be significant in your model, but they are the kind of thing I would try. You could feasibly construct some of them without fitting any complex time series model at all. For instance, the ratio of time spent watching videos at the start of the course vs the end of the course could be calculated in Excel.
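For what it's worth, the start-vs-end ratio is also only a few lines in pandas. This is just a sketch: the column names, the 28-day course length and the 7-day window are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical log of minutes spent watching videos per student per day.
log = pd.DataFrame({
    "student_id":    [1, 1, 1, 1, 2, 2, 2, 3],
    "day":           [1, 5, 25, 27, 2, 14, 26, 3],
    "video_minutes": [60, 40, 20, 30, 10, 50, 40, 30],
})

course_days = 28   # assumed course length
window = 7         # compare the first and last 7 days

students = log["student_id"].unique()

# Total minutes in the first and last window, with 0 for inactive students.
first = (log[log["day"] <= window]
         .groupby("student_id")["video_minutes"].sum()
         .reindex(students, fill_value=0))
last = (log[log["day"] > course_days - window]
        .groupby("student_id")["video_minutes"].sum()
        .reindex(students, fill_value=0))

# Ratio of early to late viewing; NaN when there was no late activity.
ratio = first / last.where(last > 0)
print(ratio)
```

A ratio well above 1 flags students whose engagement dropped off towards the end. How to treat the NaN cases (students with no late activity at all) is a modelling decision and may be informative in its own right.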

Upvotes: 1
