hitechnet

Reputation: 45

Creating supervised model in machine learning

I have recently learned how supervised learning works: it learns from a labeled dataset and then predicts labels for unlabeled data.

But I have a question: is it fine to train the model further on the data it has predicted itself, then predict the remaining unlabeled data again, and repeat the process?

For example, model M is trained on a labeled dataset D of 10 examples, and then M predicts a label for datum A. Datum A, with its predicted label, is added to dataset D and model M is retrained. The process is repeated for the remaining unlabeled data.

Upvotes: 3

Views: 218

Answers (2)

lejlot

Reputation: 66825

What you are describing here is a well-known technique called (among other names) "self-training" or "semi-supervised self-training". See for example these slides: https://www.cs.utah.edu/~piyush/teaching/8-11-print.pdf. There are hundreds of modifications built around this idea. Unfortunately, in general it is hard to prove that it should help, so while it will help for some datasets it will hurt others. The main criterion here is the quality of the very first model, since self-training rests on the assumption that your original model is really good, so you can trust it enough to label new examples. It might help with slow concept drift given a strong model, but it will fail miserably with weak models.
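The loop the answer describes can be sketched in plain Python. This is a minimal, self-contained illustration (using a toy nearest-centroid classifier and a hypothetical confidence margin, not any particular library's API): fit on the labeled data, pseudo-label only the unlabeled points the model is confident about, and refit.

```python
import math

def centroid_fit(X, y):
    """Fit a toy nearest-centroid classifier: mean feature vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def centroid_predict(model, x):
    """Return (label, margin); a larger margin means a more confident prediction."""
    dists = {c: math.dist(x, m) for c, m in model.items()}
    best = min(dists, key=dists.get)
    others = [d for c, d in dists.items() if c != best]
    if not others:                       # only one class seen so far
        return best, float("inf")
    return best, min(others) - dists[best]

def self_train(X_lab, y_lab, X_unlab, threshold=1.0, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident points and refit.

    `threshold` controls how much we trust the current model -- exactly the
    point the answer makes: with a weak initial model, confident-looking
    pseudo-labels can still be wrong and the errors compound.
    """
    X_lab, y_lab, X_unlab = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        model = centroid_fit(X_lab, y_lab)
        keep, added = [], False
        for x in X_unlab:
            label, margin = centroid_predict(model, x)
            if margin >= threshold:      # trust only confident predictions
                X_lab.append(x)
                y_lab.append(label)
                added = True
            else:
                keep.append(x)
        X_unlab = keep
        if not added or not X_unlab:
            break
    return centroid_fit(X_lab, y_lab)
```

A usage sketch: `self_train([[0,0],[0,1],[4,4],[4,5]], [0,0,1,1], [[0.5,0.5],[4.5,4.5]])` absorbs both unlabeled points into their nearby clusters before the final refit.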

Upvotes: 2

Atilla Ozgur

Reputation: 14721

What you describe is called online machine learning, incremental supervised learning, or updateable classification. There are a bunch of algorithms that accomplish this behavior. See for example the Weka toolbox's Updateable Classifiers; I suggest looking at the following ones:

  • HoeffdingTree
  • IBk
  • NaiveBayesUpdateable
  • SGD
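The common trait of the classifiers listed above is that they expose a single-example update step rather than requiring a full retrain. A minimal sketch of that idea, assuming a plain binary perceptron (this is an illustration of the updateable-classifier pattern, not Weka's Java API):

```python
class OnlinePerceptron:
    """Binary perceptron that can be updated one example at a time,
    in the spirit of an updateable/online classifier interface."""

    def __init__(self, n_features, lr=1.0):
        self.w = [0.0] * n_features   # weight vector
        self.b = 0.0                  # bias term
        self.lr = lr                  # learning rate

    def predict(self, x):
        """Return the predicted class in {-1, +1}."""
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score >= 0 else -1

    def update(self, x, y):
        """Incremental step for one labeled example (y in {-1, +1}).

        The model changes only on a mistake, so it can consume a
        stream of examples without ever retraining from scratch.
        """
        if self.predict(x) != y:
            for i, xi in enumerate(x):
                self.w[i] += self.lr * y * xi
            self.b += self.lr * y
```

Usage: feed examples as they arrive, e.g. `clf.update([2, 2], 1)` for each new labeled point in the stream. Note this is a slightly different setting from the question: online learning updates on *ground-truth* labels arriving over time, whereas the question asks about retraining on the model's *own predictions*.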

Upvotes: -1
