Reputation: 457
I'm trying to train a model to check images, identify specified objects, and tell me their coordinates (I don't even need to see a square around the object).
For this I'm using TensorFlow's object detection, and most of what I did came from following this tutorial:
But some things have changed, probably because of updates, so I had to work some things out on my own. I can actually train the model (I guess), but I don't understand the evaluation results. I'm used to seeing the loss and the current step, but this output is unusual to me. Also, I don't think the training is being saved.
The command-line flags I'm using are:
--logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_coco.config
model {
  faster_rcnn {
    num_classes: 9
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 600
        max_dimension: 1024
      }
    }
    feature_extractor {
      type: 'faster_rcnn_inception_v2'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.7
    first_stage_max_proposals: 300
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}
train_config: {
  batch_size: 5
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  fine_tune_checkpoint: "faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt"
  from_detection_checkpoint: true
  num_steps: 50000
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
}
train_input_reader: {
  tf_record_input_reader {
    input_path: "C:/tensorflow1/models/research/object_detection/images/train.record"
  }
  label_map_path: "C:/tensorflow1/models/research/object_detection/training/object-detection.pbtxt"
}
eval_config: {
  num_examples: 67
  max_evals: 10
}
eval_input_reader: {
  tf_record_input_reader {
    input_path: "C:/tensorflow1/models/research/object_detection/images/test.record"
  }
  label_map_path: "C:/tensorflow1/models/research/object_detection/training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}
2019-03-16 01:05:23.842424: I tensorflow/core/common_runtime/gpu/] Adding visible gpu devices: 0
2019-03-16 01:05:23.842528: I tensorflow/core/common_runtime/gpu/] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-16 01:05:23.845561: I tensorflow/core/common_runtime/gpu/] 0
2019-03-16 01:05:23.845777: I tensorflow/core/common_runtime/gpu/] 0: N
2019-03-16 01:05:23.847854: I tensorflow/core/common_runtime/gpu/] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6390 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
creating index...
index created!
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.05s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.681
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.670
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.542
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.825
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.682
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.689
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.556
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.825
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Also, the model files inside faster_rcnn_inception_v2_coco_2018_01_28 have not been changed since Jan 2018, which probably means that even if it's training, it's not saving the progress.
My questions are:
Upvotes: 2
Views: 2574
Reputation: 4071
Wow, a lot of questions to answer here.
1. I think your config file is correct. The fields that usually need to be configured carefully are:
- num_classes: the number of classes in your dataset.
- fine_tune_checkpoint: the checkpoint to start training from if you use transfer learning; it should be provided if from_detection_checkpoint is set to true.
- label_map_path: the path to your label file; the number of classes in it should equal num_classes.
- input_path in both train_input_reader and eval_input_reader.
- num_examples in eval_config: this is your validation dataset size, i.e. the number of examples in your validation dataset.
- num_steps: the total number of training steps to reach before the model stops training.
2. Yes, your training progress is being saved. It is saved in train_dir (if you are using the older version of the API, or model_dir if you are using the latest version); the official description is here. You can use TensorBoard to visualize your training process.
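To check point 2 yourself, you can look for checkpoint files in the training directory rather than in the pretrained model folder. Below is a minimal sketch, assuming checkpoints are written with the usual TensorFlow naming of model.ckpt-<step>.index; the function name latest_checkpoint_step is my own, not part of the API.

```python
import re
from pathlib import Path


def latest_checkpoint_step(train_dir):
    """Return the highest step among model.ckpt-<step>.index files, or None.

    Assumes the common TensorFlow checkpoint naming; adjust the pattern
    if your files are named differently.
    """
    pattern = re.compile(r"model\.ckpt-(\d+)\.index$")
    steps = []
    for path in Path(train_dir).glob("model.ckpt-*.index"):
        match = pattern.search(path.name)
        if match:
            steps.append(int(match.group(1)))
    # No matching files means nothing has been saved there (yet).
    return max(steps) if steps else None
```

Calling latest_checkpoint_step("training/") with the --train_dir value from the question should return a step number that grows as training goes on, or None if nothing has been written there.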
3. The output is in COCO evaluation format, as this is the default evaluation metric option. You can try other evaluation metrics by setting metrics_set in eval_config in the config file; other options are available here. For the COCO metrics specifically:
- IoU is Intersection over Union; it defines how much your detection bounding box overlaps with your ground-truth box. This answer provides more details on how the precision is calculated at different IoUs.
- maxDets is the threshold on the maximum number of detections per image (see here for a better discussion).
- area: there are three categories of area, depending on the number of pixels the object takes up; small, medium and large are all defined here.
4. Training will stop once the total number of training steps reaches num_steps as set in your config file. In your case, an evaluation session is performed every 15 minutes. How often each evaluation is performed can also be configured in the config file.
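To make the IoU term from point 3 concrete, here is a small sketch of how the overlap between a detection and a ground-truth box can be computed. The (xmin, ymin, xmax, ymax) box layout and the function name iou are assumptions for illustration; this is not the evaluation code the API itself runs.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle, if there is one.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)
detection = (2, 0, 12, 10)
# Intersection is 8 * 10 = 80, union is 100 + 100 - 80 = 120, so IoU = 2/3.
print(iou(ground_truth, detection))
```

A detection like this one counts as a match at IoU=0.50 but not at IoU=0.75. The IoU=0.50:0.95 rows average over thresholds from 0.50 to 0.95 in steps of 0.05, which would be one way to end up with an AP of 1.000 at IoU=0.50 alongside lower values at stricter thresholds.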
5. Although you followed the tutorial above, I suggest following the official API documentation.
PS: Indeed, I can confirm that the negative precision score (-1.000) is due to the absence of the corresponding category (here, large objects). See the reference in the cocoapi.
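As a rough illustration of the area categories and of where the -1.000 comes from, the sketch below buckets boxes by pixel area using the COCO limits of 32x32 and 96x96 pixels, and returns -1 when a bucket has no valid entries. The function names are hypothetical and the boundary handling is simplified; it mirrors the behaviour described in the cocoapi but is not its actual code.

```python
SMALL_MAX_AREA = 32 ** 2   # below this, an object counts as "small"
MEDIUM_MAX_AREA = 96 ** 2  # below this, "medium"; otherwise "large"


def area_category(width, height):
    """Bucket a box into small / medium / large by its pixel area."""
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def summarize(values):
    """Mean of the valid entries, or -1.0 when there are none.

    Entries of -1 mark "nothing to evaluate" and are skipped.
    """
    valid = [v for v in values if v > -1]
    return sum(valid) / len(valid) if valid else -1.0


# Hypothetical ground-truth box sizes (width, height) in pixels.
boxes = [(20, 20), (50, 60), (30, 25)]
print([area_category(w, h) for w, h in boxes])  # no "large" boxes here
print(summarize([]))  # an empty bucket is reported as -1.0
```

So if your test set has no objects that fall into the large bucket, the area=large rows have nothing to average and show -1.000 instead of a real score.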
Upvotes: 8