Ankit Kumar

Reputation: 1281

Training neural network for updated data

I have a neural network which has been trained over some dataset. Say the dataset had 10k data points initially and another 100 data points are now added. Is there a way for my neural network to learn this entire (updated) dataset without training from scratch? Further, is catastrophic interference applicable here? I know catastrophic interference is applicable when the NN tries to learn "new information", but I wasn't sure if "updated (due to insertions) information" counts as "new information".

Upvotes: 0

Views: 166

Answers (2)

iacob

Reputation: 24181

Online Learning refers to models which adapt to incrementally available / continual streams of input data.

Catastrophic interference may indeed be an issue, depending on your model, data, and problem.

If you assume that:

  1. your new data D2 is i.i.d., sampled from the same distribution as the original dataset D1,
  2. your original model was trained using 'mini-batches' of the dataset, and
  3. the size of D2 is at least the size of those mini-batches,

you can split D2 into new mini-batches and continue training where you left off.
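A minimal sketch of this, using NumPy and a linear model as a stand-in for your network (the data shapes, learning rate, and "ground truth" weights here are all hypothetical; in practice you would restore your trained weights and optimizer state from a checkpoint rather than initialise them inline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for a model already trained on D1 (assumption: in practice you
# would load the trained weights and optimizer state from a checkpoint).
w = rng.normal(size=(4, 1))
b = np.zeros(1)
lr = 0.01

# New data D2: 100 points with the same feature shape as D1, assumed to be
# sampled i.i.d. from the same distribution (hypothetical ground truth).
true_w = np.array([[1.0], [-2.0], [0.5], [3.0]])
X2 = rng.normal(size=(100, 4))
y2 = X2 @ true_w + 0.1 * rng.normal(size=(100, 1))

mse_before = float(np.mean((X2 @ w + b - y2) ** 2))

# Split D2 into mini-batches and continue gradient descent where we left off.
batch_size = 32
perm = rng.permutation(len(X2))
for start in range(0, len(X2), batch_size):
    idx = perm[start:start + batch_size]
    xb, yb = X2[idx], y2[idx]
    err = xb @ w + b - yb              # prediction error on the mini-batch
    w -= lr * xb.T @ err / len(xb)     # gradient step on the MSE loss
    b -= lr * err.mean(axis=0)

mse_after = float(np.mean((X2 @ w + b - y2) ** 2))
```

The point is that nothing special is needed: the new points simply become additional mini-batches in the same training loop.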

But if this is not the case, it would indeed likely be susceptible to Catastrophic Forgetting, since the task is nominally the same but the domain (underlying distribution of the data) is changing. In this instance, if retraining on the entire (updated) dataset is not feasible, you will need to investigate Continual Learning methods, which are specifically designed to mitigate this.
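For illustration, one of the simplest continual-learning mitigations is rehearsal (experience replay): keep a small buffer of examples from D1 and mix them into every mini-batch of D2, so the model keeps seeing the old distribution. A minimal NumPy sketch, again with a linear stand-in model and hypothetical data shapes:

```python
import numpy as np

rng = np.random.default_rng(0)

w = rng.normal(size=(4, 1))   # stand-in for weights already trained on D1
b = np.zeros(1)
lr = 0.01

# Hypothetical ground truth shared by old and new data.
true_w = np.array([[1.0], [-2.0], [0.5], [3.0]])

# Replay buffer: a small sample retained from the original dataset D1.
X_old = rng.normal(size=(200, 4))
y_old = X_old @ true_w
# New data D2.
X_new = rng.normal(size=(100, 4))
y_new = X_new @ true_w

batch_size = 16
perm = rng.permutation(len(X_new))
for start in range(0, len(X_new), batch_size):
    idx = perm[start:start + batch_size]
    xb, yb = X_new[idx], y_new[idx]
    # Mix in an equal-sized sample from the old-data buffer so each update
    # sees both distributions, reducing forgetting of D1.
    j = rng.integers(0, len(X_old), size=len(xb))
    xb = np.vstack([xb, X_old[j]])
    yb = np.vstack([yb, y_old[j]])
    err = xb @ w + b - yb
    w -= lr * xb.T @ err / len(xb)
    b -= lr * err.mean(axis=0)
```

Rehearsal is only one option; regularisation-based methods (e.g. Elastic Weight Consolidation) avoid storing old data by instead penalising changes to weights that were important for the original task.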

Upvotes: 1

Ach113

Reputation: 1825

Unfortunately, catastrophic interference (or forgetting) does indeed apply to your case. However, there is a branch of deep learning, called Continual Learning, that focuses on exactly this problem.

Upvotes: 1
