Reputation: 733
I've been using TensorflowSharp with Faster RCNN successfully for a while now; however, I recently trained a Retinanet model, verified it works in python, and have created a frozen pb file for use with Tensorflow. For FRCNN, there is an example in the TensorflowSharp GitHub repo that shows how to run/fetch this model. For Retinanet, I tried modifying the code but nothing seems to work. I have a model summary for Retinanet that I've tried to work from, but it's not obvious to me what should be used.
For FRCNN, the graph is run in this way:
var runner = m_session.GetRunner();
runner
    .AddInput(m_graph["image_tensor"][0], tensor)
    .Fetch(
        m_graph["detection_boxes"][0],
        m_graph["detection_scores"][0],
        m_graph["detection_classes"][0],
        m_graph["num_detections"][0]);
var output = runner.Run();
var boxes = (float[,,])output[0].GetValue(jagged: false);
var scores = (float[,])output[1].GetValue(jagged: false);
var classes = (float[,])output[2].GetValue(jagged: false);
var num = (float[])output[3].GetValue(jagged: false);
From the FRCNN model summary, the input ("image_tensor") and outputs ("detection_boxes", "detection_scores", "detection_classes", and "num_detections") are obvious. Those names are not the same for Retinanet (I've tried them), and I can't figure out what they should be. The "Fetch" part of the code above is causing a crash, and I'm guessing it's because I'm not getting the node names right.
I won't paste the entire Retinanet summary here, but here are the first few nodes:
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) (None, None, None, 3 0
__________________________________________________________________________________________________
padding_conv1 (ZeroPadding2D) (None, None, None, 3 0 input_1[0][0]
__________________________________________________________________________________________________
conv1 (Conv2D) (None, None, None, 6 9408 padding_conv1[0][0]
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization) (None, None, None, 6 256 conv1[0][0]
__________________________________________________________________________________________________
conv1_relu (Activation) (None, None, None, 6 0 bn_conv1[0][0]
__________________________________________________________________________________________________
And here are the last several nodes:
__________________________________________________________________________________________________
anchors_0 (Anchors) (None, None, 4) 0 P3[0][0]
__________________________________________________________________________________________________
anchors_1 (Anchors) (None, None, 4) 0 P4[0][0]
__________________________________________________________________________________________________
anchors_2 (Anchors) (None, None, 4) 0 P5[0][0]
__________________________________________________________________________________________________
anchors_3 (Anchors) (None, None, 4) 0 P6[0][0]
__________________________________________________________________________________________________
anchors_4 (Anchors) (None, None, 4) 0 P7[0][0]
__________________________________________________________________________________________________
regression_submodel (Model) (None, None, 4) 2443300 P3[0][0]
P4[0][0]
P5[0][0]
P6[0][0]
P7[0][0]
__________________________________________________________________________________________________
anchors (Concatenate) (None, None, 4) 0 anchors_0[0][0]
anchors_1[0][0]
anchors_2[0][0]
anchors_3[0][0]
anchors_4[0][0]
__________________________________________________________________________________________________
regression (Concatenate) (None, None, 4) 0 regression_submodel[1][0]
regression_submodel[2][0]
regression_submodel[3][0]
regression_submodel[4][0]
regression_submodel[5][0]
__________________________________________________________________________________________________
boxes (RegressBoxes) (None, None, 4) 0 anchors[0][0]
regression[0][0]
__________________________________________________________________________________________________
classification_submodel (Model) (None, None, 1) 2381065 P3[0][0]
P4[0][0]
P5[0][0]
P6[0][0]
P7[0][0]
__________________________________________________________________________________________________
clipped_boxes (ClipBoxes) (None, None, 4) 0 input_1[0][0]
boxes[0][0]
__________________________________________________________________________________________________
classification (Concatenate) (None, None, 1) 0 classification_submodel[1][0]
classification_submodel[2][0]
classification_submodel[3][0]
classification_submodel[4][0]
classification_submodel[5][0]
__________________________________________________________________________________________________
filtered_detections (FilterDete [(None, 300, 4), (No 0 clipped_boxes[0][0]
classification[0][0]
==================================================================================================
Total params: 36,382,957
Trainable params: 36,276,717
Non-trainable params: 106,240
Any help figuring out how to fix the "Fetch" part of this would be greatly appreciated.
EDIT:
To dig a little further into this, I found a python function to print the operation names from a .pb file. When doing this for the FRCNN .pb file, it clearly gave the output node names, as can be seen below (only posting the last several lines from the output of the python function).
import/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/TensorArrayStack_4/TensorArrayGatherV3
import/SecondStagePostprocessor/ToFloat_1
import/add/y
import/add
import/detection_boxes
import/detection_scores
import/detection_classes
import/num_detections
If I do the same thing for the Retinanet .pb file, it's not obvious what the outputs are. Here are the last several lines from the python function.
import/filtered_detections/map/while/NextIteration_4
import/filtered_detections/map/while/Exit_2
import/filtered_detections/map/while/Exit_3
import/filtered_detections/map/while/Exit_4
import/filtered_detections/map/TensorArrayStack/TensorArraySizeV3
import/filtered_detections/map/TensorArrayStack/range/start
import/filtered_detections/map/TensorArrayStack/range/delta
import/filtered_detections/map/TensorArrayStack/range
import/filtered_detections/map/TensorArrayStack/TensorArrayGatherV3
import/filtered_detections/map/TensorArrayStack_1/TensorArraySizeV3
import/filtered_detections/map/TensorArrayStack_1/range/start
import/filtered_detections/map/TensorArrayStack_1/range/delta
import/filtered_detections/map/TensorArrayStack_1/range
import/filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3
import/filtered_detections/map/TensorArrayStack_2/TensorArraySizeV3
import/filtered_detections/map/TensorArrayStack_2/range/start
import/filtered_detections/map/TensorArrayStack_2/range/delta
import/filtered_detections/map/TensorArrayStack_2/range
import/filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3
For reference, here's the python function that I used:
import tensorflow as tf

def printTensors(pb_file):
    # read the frozen .pb into a GraphDef
    with tf.gfile.GFile(pb_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    # import the GraphDef into a fresh graph
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def)
    # print all operation names
    for op in graph.get_operations():
        print(op.name)
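Since the raw op list doesn't flag outputs explicitly, one common heuristic (a sketch, not part of the function above) is to list the ops that no other op consumes; in a frozen inference graph, the true outputs are usually among them. The toy `ops` list below is hypothetical stand-in data, but the same walk works over `graph.get_operations()` by comparing each `op.name` against every other op's input tensor names:

```python
# Toy stand-in for graph.get_operations(): (op_name, consumed_input_names).
# In a real graph, build this from op.name and [t.name for t in op.inputs].
ops = [
    ("input_1", []),
    ("conv1", ["input_1:0"]),
    ("filtered_detections/boxes", ["conv1:0"]),
    ("filtered_detections/scores", ["conv1:0"]),
]

def find_output_ops(ops):
    """Return op names whose outputs are never consumed by another op."""
    consumed = {name.split(":")[0] for _, inputs in ops for name in inputs}
    return [name for name, _ in ops if name not in consumed]

print(find_output_ops(ops))
# ['filtered_detections/boxes', 'filtered_detections/scores']
```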
Hope this helps.
Upvotes: 0
Views: 488
Reputation: 6234
I am not sure exactly what problem you are facing, but you can get the outputs from the TF Serving signature. The retinanet IPython/Jupyter notebook mentions the output format as well.
Querying the saved model gives
""" The given SavedModel SignatureDef contains the following output(s):
outputs['filtered_detections/map/TensorArrayStack/TensorArrayGatherV3:0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 300, 4)
name: filtered_detections/map/TensorArrayStack/TensorArrayGatherV3:0
outputs['filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3:0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 300)
name: filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3:0
outputs['filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3:0'] tensor_info:
dtype: DT_INT32
shape: (-1, 300)
name: filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3:0
Method name is: tensorflow/serving/predict
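Note that a serving name like `filtered_detections/map/TensorArrayStack/TensorArrayGatherV3:0` is an operation name plus an output index; in TensorflowSharp the indexer form `m_graph["op_name"][index]` supplies that index separately, so the `:0` suffix is dropped from the string. A small pure-Python sketch of the naming convention:

```python
def split_tensor_name(tensor_name):
    """Split 'op_name:index' into (op_name, output_index); index defaults to 0."""
    if ":" in tensor_name:
        op_name, index = tensor_name.rsplit(":", 1)
        return op_name, int(index)
    return tensor_name, 0

name = "filtered_detections/map/TensorArrayStack/TensorArrayGatherV3:0"
print(split_tensor_name(name))
# ('filtered_detections/map/TensorArrayStack/TensorArrayGatherV3', 0)
```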
---
From retina-net
In general, inference of the network works as follows:
boxes, scores, labels = model.predict_on_batch(inputs)
Where `boxes` is shaped `(None, None, 4)` (for `(x1, y1, x2, y2)`), `scores` is shaped `(None, None)` (classification score), and `labels` is shaped `(None, None)` (label corresponding to the score). In all three outputs, the first dimension is the batch (number of images) and the second dimension indexes the list of detections.
"""
Upvotes: 0