Reputation: 57
I have a model that needs to be trained on real-world data that I acquire daily. Every 3 or 4 days I can prepare around 500 images for training, so I need to start training and checking the model as soon as I have the first 500 images. Meanwhile I will acquire another 500 images, and so on. Is it OK to train on the first 500-image data set, save the model weights, and then continue training on the latest 500-image data set starting from the saved weights?
Upvotes: 0
Views: 396
Reputation: 86
You have two options: effectively do transfer learning (as mentioned in the other answer), or, if you really believe old data + new data is the best possible data set to train on, retrain from scratch on the complete data set (old data + new data). The latter gives all data, new and old, an equally fair shake, which is not necessarily true of transfer learning, since continued training on only the newest images can pull the model towards them at the expense of the older ones. That said, I have to question your need to do this every 3 or 4 days. If your problem is well formulated and your model design is good, at some point you should have enough data that the model generalizes well, and giving it more data will no longer improve performance significantly. Also, if the model performs significantly better when trained on 2000 images than on 500, why not just wait a couple more weeks until you have 2000 images before releasing it into the real world? Obviously this depends on your task and industry, so you may well have a good reason that I'm not aware of, but it's worth thinking about.
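One way to check whether more data is still helping is to keep a fixed held-out validation set and record the validation error each time you retrain on the accumulated data. Below is a minimal sketch of that idea, assuming a toy 1-D linear model in place of your real image model; `fit`, `mse` and `make_batch` are hypothetical helpers and the data is synthetic.

```python
import random


def fit(data, epochs=20, lr=0.05):
    """Train a toy 1-D linear model y = w*x + b from scratch with SGD."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def mse(model, data):
    """Mean squared error of the model on a data set."""
    w, b = model
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def make_batch(n, rng):
    """Hypothetical stand-in for one batch of labelled samples (synthetic)."""
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.1))
            for x in (rng.uniform(-1, 1) for _ in range(n))]


rng = random.Random(0)
val_set = make_batch(200, rng)   # fixed held-out set, never trained on
all_data, history = [], []

for round_no in range(1, 5):
    all_data += make_batch(500, rng)   # old data + new data
    model = fit(all_data)              # retrain from scratch on everything
    history.append(mse(model, val_set))
    print(f"round {round_no}: {len(all_data)} samples, "
          f"val MSE {history[-1]:.4f}")
```

If the validation error stops dropping from one round to the next, further batches are unlikely to buy you much, and retraining every few days may not be worth it.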
Upvotes: 1
Reputation: 2112
This is basically like transfer learning: you take a pre-trained model and fine-tune it on your new data. You will have to save the model and its weights, then load them back and train on the new data as you normally would. This is common practice.
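The save / load / continue-training loop can be sketched roughly as follows. This is only an illustration under assumptions: the question does not say which framework is used, so a toy 1-D linear model with weights stored as JSON stands in for the real network, and `train`, `save_weights`, `load_weights` and `make_batch` are hypothetical helpers. With a real framework you would use its own checkpoint functions instead.

```python
import json
import os
import random
import tempfile


def train(weights, data, epochs=5, lr=0.05):
    """Continue SGD on a toy model y = w*x + b, starting from `weights`."""
    w, b = weights["w"], weights["b"]
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return {"w": w, "b": b}


def save_weights(weights, path):
    with open(path, "w") as f:
        json.dump(weights, f)


def load_weights(path):
    with open(path) as f:
        return json.load(f)


def make_batch(n, rng):
    """Hypothetical stand-in for one batch of ~500 labelled samples."""
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.01))
            for x in (rng.uniform(-1, 1) for _ in range(n))]


rng = random.Random(0)
ckpt = os.path.join(tempfile.mkdtemp(), "weights.json")

# Round 1: train from scratch on the first batch, then save.
weights = train({"w": 0.0, "b": 0.0}, make_batch(500, rng))
save_weights(weights, ckpt)

# Round 2 (a few days later): reload the saved weights and
# keep training on the newly prepared batch only.
weights = load_weights(ckpt)
weights = train(weights, make_batch(500, rng))
save_weights(weights, ckpt)
```

The key point is that round 2 starts from the loaded weights rather than from a fresh initialisation, so what was learned from the first batch carries over.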
Upvotes: 1