Reputation: 2445
I am trying to connect the layer conv4_3 of the VGG16 network, instead of conv5_3, to the RPN of Faster R-CNN. Here is the Python code of the VGG16 network. I changed these lines:
def _image_to_head(self, is_training, reuse=False):
    with tf.variable_scope(self._scope, self._scope, reuse=reuse):
        net = slim.repeat(self._image, 2, slim.conv2d, 64, [3, 3],
                          trainable=False, scope='conv1')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3],
                          trainable=False, scope='conv2')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3],
                          trainable=is_training, scope='conv3')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
                          trainable=is_training, scope='conv4')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
                          trainable=is_training, scope='conv5')
    self._act_summaries.append(net)
    self._layers['head'] = net
    return net
to:
def _image_to_head(self, is_training, reuse=False):
    with tf.variable_scope(self._scope, self._scope, reuse=reuse):
        net = slim.repeat(self._image, 2, slim.conv2d, 64, [3, 3],
                          trainable=False, scope='conv1')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3],
                          trainable=False, scope='conv2')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3],
                          trainable=is_training, scope='conv3')
        net = slim.max_pool2d(net, [2, 2], padding='SAME', scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3],
                          trainable=is_training, scope='conv4')
    self._act_summaries.append(net)
    self._layers['head'] = net
    return net
As seen above, I removed the pool4 and conv5 layers, because my objects are small and I hoped to get better results; however, the results got worse. Do I need to add a deconv layer after conv4, or is there another way?
Thanks.
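To make the deconv idea concrete, here is a minimal NumPy sketch of a stride-2 transposed convolution, the operation behind `slim.conv2d_transpose`. The kernel values and shapes are illustrative only, not taken from the network above; in the real network this would be a learnable layer appended after conv4.

```python
import numpy as np

def conv2d_transpose(x, kernel, stride=2):
    """Naive 2-D transposed convolution: each input pixel 'stamps' a
    scaled copy of the kernel into a larger output grid."""
    h, w = x.shape
    kh, kw = kernel.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += x[i, j] * kernel
    return out

# A 4x4 feature map upsampled to 8x8 with a 2x2 kernel and stride 2.
feat = np.ones((4, 4))
up = conv2d_transpose(feat, np.ones((2, 2)), stride=2)
print(up.shape)  # (8, 8): spatial resolution doubled
```

In TF-slim this would look something like `slim.conv2d_transpose(net, 512, [2, 2], stride=2, scope='upconv4')` after the conv4 block (the scope name `upconv4` is hypothetical).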
Upvotes: 0
Views: 292
Reputation: 827
There are other methods for reducing the spatial size of the bottleneck features. Instead of adding a deconv layer, consider how the pooling layers behave:
Average pooling returns the average of each window: a (2, 2) window with values [3, 2, 4, 3] reduces to the single value 3.
Max pooling returns the maximum of each window: a (2, 2) window with values [3, 2, 4, 3] reduces to the single value 4.
Check out pooling layers here
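As a runnable check of the two window examples above, here is a small NumPy sketch of non-overlapping 2x2 pooling (illustrative only; in the network these would be `slim.avg_pool2d` / `slim.max_pool2d`):

```python
import numpy as np

def pool_2x2(x, op):
    """Non-overlapping 2x2 pooling via reshape; op is np.mean or np.max."""
    h, w = x.shape
    return op(x.reshape(h // 2, 2, w // 2, 2), axis=(1, 3))

# The single window [3, 2, 4, 3] from the examples above:
window = np.array([[3, 2],
                   [4, 3]], dtype=float)
print(pool_2x2(window, np.mean))  # [[3.]] -> average of the window
print(pool_2x2(window, np.max))   # [[4.]] -> maximum of the window
```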
Upvotes: 1