Reputation: 73
I have fine-tuned a faster_rcnn_resnet101 model from the Model Zoo to detect my custom objects. I split the data into a train and an eval set and used them in the config file during training. Now that training has completed, I want to test the model on unseen data (which I call the test data). I tried a couple of approaches but cannot figure out which code from TensorFlow's API to use to evaluate performance on the test dataset. Below is what I tried:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.459
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.601
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.543
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.459
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.543
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.627
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.628
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.628
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Now, I know that mAP and AR can't be negative, so something is wrong. Why do I see negative values when I run the offline evaluation on the test dataset?
The commands I used to run this pipeline are:
SPLIT=test
echo "
label_map_path: '/training_demo/annotations/label_map.pbtxt'
tf_record_input_reader: { input_path: '/training_demo/Predictions/test.record' }
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_input_config.pbtxt
echo "
metrics_set: 'coco_detection_metrics'
" > /training_demo/${SPLIT}_eval_metrics/${SPLIT}_eval_config.pbtxt
python object_detection/metrics/offline_eval_map_corloc.py \
--eval_dir='/training_demo/test_eval_metrics' \
--eval_config_path='/training_demo/test_eval_metrics/test_eval_config.pbtxt' \
--input_config_path='/training_demo/test_eval_metrics/test_input_config.pbtxt'
This produced:
DetectionBoxes_Recall/AR@100 (medium): -1.0
DetectionBoxes_Recall/AR@100 (small): -1.0
DetectionBoxes_Precision/[email protected]: -1.0
DetectionBoxes_Precision/mAP (medium): -1.0
etc.
I also ran this pipeline:
python eval.py \
    --logtostderr \
    --checkpoint_dir=trained-inference-graphs/output_inference_graph/ \
    --eval_dir=test_eval_metrics \
    --pipeline_config_path=training/faster_rcnn_resnet101_coco-Copy1.config
The eval_input_reader in faster_rcnn_resnet101_coco-Copy1.config points to the test TFRecord, which contains the ground truth and detection information.
I would appreciate any help on this.
Upvotes: 5
Views: 4805
Reputation: 38
python eval.py \
    --logtostderr \
    --pipeline_config_path=path/to/pipeline.config \
    --checkpoint_dir=path/to/checkpoint_dir \
    --eval_dir=eval/
You can find eval.py in the legacy folder.
Upvotes: 0
Reputation: 600
For me, I just ran model_main.py once, with the eval_input_reader in pipeline.config changed to point to the test dataset. I am not sure if this is how it should be done, though.
python model_main.py \
--alsologtostderr \
--run_once \
--checkpoint_dir=$path_to_model \
--model_dir=$path_to_eval \
--pipeline_config_path=$path_to_config
pipeline.config
eval_config: {
  metrics_set: "coco_detection_metrics"
  num_examples: 721       # number of test images
  num_visualizations: 10  # number of visualizations for TensorBoard
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "/path/to/test-data.record"
  }
  label_map_path: "/path/to/label_map.pbtxt"
  shuffle: true
  num_readers: 1
}
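If you want to read the resulting metrics back programmatically instead of from the console output, here is a minimal sketch using TF 1.x's summary iterator (matching the eval.py/model_main.py era of the API; the eval directory path below is a hypothetical stand-in for the --model_dir passed above):

# Sketch: print the COCO detection metrics from the eval event files.
import glob
import os
import tensorflow as tf

eval_dir = '/path/to/eval'  # hypothetical; use your --model_dir
for events_file in glob.glob(os.path.join(eval_dir, 'events.out.tfevents.*')):
    for event in tf.train.summary_iterator(events_file):
        for value in event.summary.value:
            if value.tag.startswith('DetectionBoxes_'):
                print(value.tag, value.simple_value)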
Also, for me there was no difference in mAP between the validation and the test dataset, so I am not sure whether a split into training, validation, and test data is actually necessary.
Upvotes: 0
Reputation: 4071
The evaluation metrics are in COCO format, so you can refer to the COCO API for the meaning of these values.
As specified in the COCO API code, -1
is the default value when a category is empty. In your case, all detected objects belong only to the 'small' area category. The 'small', 'medium', and 'large' area categories depend on how many pixels the object's area covers, as specified here.
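For reference, a minimal sketch of those thresholds (taken from pycocotools' COCOeval defaults; the helper function name is mine):

# COCO's default area ranges, in squared pixels:
# small < 32^2, medium 32^2 to 96^2, large > 96^2.
# COCO uses the annotation's 'area' field, which is w * h for plain boxes.
def coco_area_category(box_width, box_height):
    area = box_width * box_height
    if area < 32 ** 2:
        return 'small'
    if area < 96 ** 2:
        return 'medium'
    return 'large'

# A 30x25 ground-truth box falls in 'small'; if every object in the test
# set does, 'medium' and 'large' have no instances and report -1.0.
print(coco_area_category(30, 25))  # -> small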
Upvotes: 1