iSeeDeadPeople

Reputation: 59

TensorFlow Object Detection API - Adding more training data halfway through, is it OK?

I currently have a total of 300 images in the training dataset, and the latest saved checkpoint is 8054. The model has trained for 10 hours and reaches an accuracy of around 50%. It only needs to detect a single object class.

I want to improve the accuracy, so I am wondering: what if I add more training images? I would need to create new XML files, CSV files and, of course, new TF records as well. Do I need to start the training all over again, or can I just add the images, generate the new XML, CSV and TF records, and continue training from checkpoint 8054?

I am using the SSD MobileNet COCO model with a batch size of 5. Besides simply increasing the size of the training dataset, which parameters or factors should I change or tune in order to increase the accuracy?

Upvotes: 3

Views: 1291

Answers (2)

Phoenix666

Reputation: 182

Correct me if I am wrong, but I don't believe the accepted answer is completely accurate. Every time you call train.py, your box predictors are re-initialized from random weights and the global step is reset to zero, so it is almost as if you were training from scratch: only the feature extractor weights are actually recovered from the checkpoint. If you don't care about the global_step, you can use the following quick and dirty hack to recover your bounding-box predictor weights as well:

In the file meta_architectures/ssd_meta_arch.py, change line #694 from

if variable.op.name.startswith(self._extract_features_scope):

to this:

if variable.op.name.startswith(self._extract_features_scope) or (from_detection_checkpoint and variable.op.name.startswith("BoxPredictor_")):
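
If you want to sanity-check what the stock code restores, you can list the variable names in your checkpoint; everything under BoxPredictor_ is what normally gets discarded. A small sketch (TF 1.x), where the checkpoint path is just a placeholder for your own training directory:

import tensorflow as tf

# Print the feature-extractor and box-predictor variables stored in the
# checkpoint; only the former are restored without the patch above.
for name, shape in tf.train.list_variables('training/model.ckpt-8054'):
    if name.startswith('FeatureExtractor') or name.startswith('BoxPredictor_'):
        print(name, shape)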

Upvotes: 1

Long Hoang Nguyen

Reputation: 116

To answer your first question: yes, you can add more data (as a separate TFRecord file, for example) to the Object Detection API. Just edit the pipeline.config file for your model; the input_path parameter actually accepts a whole list of .record files.

It would look roughly like this:

train_input_reader: {
  tf_record_input_reader {
    input_path: ["path/to/first.record", "path/to/second.record"]
  }
  label_map_path: "..."
}

If you don't change any other parameters, training should simply continue from the checkpoint and it will load the new data.
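
For the new images themselves, the extra record file is built the same way as your first one. A rough sketch (TF 1.x) using the dataset_util helpers that ship with the API; the paths, the single-box assumption and the argument names are just placeholders for whatever is in your csv:

import tensorflow as tf
from object_detection.utils import dataset_util

def create_tf_example(image_path, width, height, box, label_text, label_id):
    # box = (xmin, ymin, xmax, ymax), normalized to [0, 1]
    with tf.gfile.GFile(image_path, 'rb') as f:
        encoded_jpg = f.read()
    filename = image_path.encode('utf8')
    return tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(b'jpg'),
        'image/object/bbox/xmin': dataset_util.float_list_feature([box[0]]),
        'image/object/bbox/ymin': dataset_util.float_list_feature([box[1]]),
        'image/object/bbox/xmax': dataset_util.float_list_feature([box[2]]),
        'image/object/bbox/ymax': dataset_util.float_list_feature([box[3]]),
        'image/object/class/text': dataset_util.bytes_list_feature([label_text.encode('utf8')]),
        'image/object/class/label': dataset_util.int64_list_feature([label_id]),
    }))

writer = tf.python_io.TFRecordWriter('path/to/second.record')
# loop over the rows of your new csv here, one example per image:
# writer.write(create_tf_example(...).SerializeToString())
writer.close()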

To answer the second question: for SSD, a simple thing to look at is the learning rate; it might be too high, causing the loss to jump around a lot. You would have to analyze the loss yourself using TensorBoard, though. As for other hyperparameters, I haven't had much success yet, so I hope others will chime in on that one.
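
If you prefer to dump the loss curve to the console instead of eyeballing it in TensorBoard, something like this works. A sketch (TF 1.x); the event-file location and the exact loss tag depend on your training directory and API version:

import glob
import tensorflow as tf

# Scan the newest event file and print every loss-like scalar with its step.
event_file = sorted(glob.glob('training/events.out.tfevents.*'))[-1]
for event in tf.train.summary_iterator(event_file):
    for value in event.summary.value:
        if 'loss' in value.tag.lower():
            print(event.step, value.tag, value.simple_value)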

Upvotes: 2
