Reputation: 1523
I am trying different activation functions in my simple neural network.
It does not matter whether I use tf.nn.relu, tf.nn.sigmoid, ...: the network does what it should do.
But if I use tf.nn.crelu, I get a dimension error.
It returns something like [max, min], and the output dimension is twice as big.
What do I have to do? Fit the following weights and biases to the output of crelu?
Upvotes: 1
Views: 1236
Reputation: 53758
You're right: if you're building the network manually, you need to adjust the dimensions of the following layer to match the tf.nn.crelu output. In this sense, tf.nn.crelu is not interchangeable with tf.nn.relu, tf.nn.elu, etc.
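For example, with a hand-built fully connected layer, the weight matrix that follows a crelu activation has to expect twice as many inputs. A minimal TF 1.x sketch (the layer sizes and variable names here are just illustrative):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 128])

# First layer: 128 -> 64 units before the activation.
W1 = tf.Variable(tf.truncated_normal([128, 64], stddev=0.1))
b1 = tf.Variable(tf.zeros([64]))
pre1 = tf.matmul(x, W1) + b1

# tf.nn.crelu concatenates relu(pre1) and relu(-pre1) along the last axis,
# so the activation has 2 * 64 = 128 features, not 64.
h1 = tf.nn.crelu(pre1)            # shape: [None, 128]

# The next layer's weights must therefore expect 2 * 64 inputs.
W2 = tf.Variable(tf.truncated_normal([2 * 64, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(h1, W2) + b2   # shape: [None, 10]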
The situation is simpler if you use a high-level API, e.g. TensorFlow Slim. In this case, the layer functions take care of matching dimensions, so you can easily replace tf.nn.relu with tf.nn.crelu in code. However, keep in mind that the network silently becomes twice as big.
Here's an example:
import tensorflow as tf
slim = tf.contrib.slim

# activation_fn=tf.nn.crelu is applied to every conv/fc layer in the scope;
# slim sizes the weights of each following layer automatically.
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.crelu,
                    normalizer_fn=slim.batch_norm,
                    normalizer_params={'is_training': is_training, 'decay': 0.95}):
    conv1 = slim.conv2d(x_image, 16, [5, 5], scope='conv1')    # 16 filters -> 32 channels after crelu
    pool1 = slim.max_pool2d(conv1, [2, 2], scope='pool1')
    conv2 = slim.conv2d(pool1, 32, [5, 5], scope='conv2')      # 32 filters -> 64 channels after crelu
    pool2 = slim.max_pool2d(conv2, [2, 2], scope='pool2')
    flatten = slim.flatten(pool2)
    fc = slim.fully_connected(flatten, 1024, scope='fc1')
    drop = slim.dropout(fc, keep_prob=keep_prob)
    logits = slim.fully_connected(drop, 10, activation_fn=None, scope='logits')
slim.arg_scope simply applies all provided arguments to the underlying layers, in particular activation_fn. Also note activation_fn=None in the last layer to fix the output dimension. Complete code can be found here.
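As a quick sanity check (illustrative; it assumes x_image is an MNIST-style batch of shape [None, 28, 28, 1]), printing the tensor shapes shows the doubling:

print(conv1.get_shape())  # (?, 28, 28, 32) -- 16 filters, doubled by crelu
print(conv2.get_shape())  # (?, 14, 14, 64) -- 32 filters, doubled by crelu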
Upvotes: 1