H.Radmard

Reputation: 11

How to have a variable number of hidden layers in Tensorflow?

Suppose that we want to try different numbers of hidden layers and different layer sizes. How can we do that in Tensorflow?

Consider the following example to make it clear:

# Create a  Neural Network Layer

def fc_layer(input, size_in, size_out):
        w = tf.Variable(tf.truncated_normal([None, size_in, size_out]), name="W")
        b = tf.Variable(tf.constant(0.1, shape=[size_out]))
        act = tf.matmul(input, w) + b
        return act
n_hiddenlayers=3 #number of hidden layers
hidden_layer=tf.placeholder(tf.float32,[n_hiddenlayers, None, None])
#considering 4 as size of inputs and outputs of all layers
sizeInpOut=4
for i in range(n_hiddenlayers):
    hidden_layer(i,:,:)= tf.nn.sigmoid(fc_layer(X, sizeInpOut, sizeInpOut))

It results in an error on the line hidden_layer(i,:,:) = ... In other words, I need a tensor of tensors.

Upvotes: 3

Views: 4812

Answers (2)

tea_pea

Reputation: 1542

I did this just using a list to hold the different layers, as follows; it seemed to work fine.

    # inputs
    x_size=2 # first layer nodes
    y_size=1 # final layer nodes
    h_size=[3,4,3] # variable length list of hidden layer nodes

    # set up input and output
    X = tf.placeholder(tf.float32, [None,x_size])
    y_true = tf.placeholder(tf.float32, [None,y_size])

    # set up parameters
    W = []
    b = []
    layer = []

    # first layer
    W.append(tf.Variable(tf.random_normal([x_size, h_size[0]], stddev=0.1)))
    b.append(tf.Variable(tf.zeros([h_size[0]])))

    # add hidden layers (variable number)
    for i in range(1,len(h_size)):
        W.append(tf.Variable(tf.random_normal([h_size[i-1], h_size[i]], stddev=0.1)))
        b.append(tf.Variable(tf.zeros([h_size[i]])))

    # add final layer
    W.append(tf.Variable(tf.random_normal([h_size[-1], y_size], stddev=0.1)))
    b.append(tf.Variable(tf.zeros([y_size])))

    # define model
    layer.append(tf.nn.relu(tf.matmul(X, W[0]) + b[0]))

    for i in range(1,len(h_size)):
        layer.append(tf.nn.relu(tf.matmul(layer[i-1], W[i]) + b[i]))

    # self.type_in presumably comes from the class this snippet was taken from;
    # when running it standalone, replace it with a plain variable such as type_in = "classification"
    if self.type_in == "classification":
        y_pred = tf.nn.sigmoid(tf.matmul(layer[-1], W[-1]) + b[-1])
        loss = tf.reduce_mean(-1. * ((y_true * tf.log(y_pred)) + ((1.-y_true) * tf.log(1.-y_pred))))
        correct_prediction = tf.equal(tf.round(y_pred), tf.round(y_true))
        metric = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        metric_name = "accuracy"
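
For completeness, here is a minimal sketch of how such a graph could then be trained; it is not part of the original snippet and assumes the loss defined above plus numpy arrays X_train and y_train:

    # train the graph defined above (X_train and y_train are assumed numpy arrays)
    train_step = tf.train.AdamOptimizer(0.01).minimize(loss)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(1000):
            sess.run(train_step, feed_dict={X: X_train, y_true: y_train})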

Upvotes: 1

toto2

Reputation: 5326

Not a direct answer, but you could consider using tensorflow-slim. It's one of the many APIs distributed as part of tensorflow. It is lightweight and compatible with defining all the variables by hand as you are doing. If you look at the webpage I linked, slim.repeat and slim.stack allow you to create multiple layers, of the same or of different widths, in one line. To make things more complicated: I think part of slim is now the module called layers in tensorflow.
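
For instance, a minimal sketch assuming tf.contrib.slim on TensorFlow 1.x (the placeholder shape and layer widths are only illustrative):

import tensorflow as tf
import tensorflow.contrib.slim as slim

x = tf.placeholder(tf.float32, [None, 10])

# slim.repeat: three fully connected layers, all with the same width (4 units)
net_same = slim.repeat(x, 3, slim.fully_connected, 4, scope='fc_same')

# slim.stack: three fully connected layers with different widths, in one line
net_diff = slim.stack(x, slim.fully_connected, [7, 6, 5], scope='fc_diff')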

But maybe you just want to play directly with tf variables to understand how it works and not use a higher level API until later.

In the code you posted, since you want to create three layers, fc_layer has to be called three times; your loop does this, but each call is fed X instead of the previous layer's output. By the way, this implies that w and b will be created three different times, as different variables with different internal tf names. And that is what you want.

You should have some for-loop or while-loop which iterates three times. Note that the output tensor of one iteration becomes the input tensor of the next iteration. The initial input is the true input and the very last output is the true output.

Another issue with your code is that the non-linearity (the sigmoid) should be at the end of fc_layer. You want a non-linear operation between all layers.


EDIT: some code of what would usually be done:

import tensorflow as tf

input_size = 10
output_size = 4
layer_sizes  = [7, 6, 5]

def fc_layer(input, size, layer_name):
    in_size = input.shape.as_list()[1]
    w = tf.Variable(tf.truncated_normal([in_size, size]),
                    name="W" + layer_name)
    b = tf.Variable(tf.constant(0.1, shape=[size]),
                    name="b" + layer_name)
    act = tf.nn.sigmoid(tf.matmul(input, w) + b)
    return act

input = tf.placeholder(tf.float32, [None, input_size])
# output will be the intermediate activations successively and in the end the
# final activations (output).
output = input
for i, size in enumerate(layer_sizes + [output_size]):
    output = fc_layer(output, size, layer_name=str(i + 1))

print("final output var: " + str(output))
print("All vars in the tensorflow graph:")
for var in tf.global_variables():
    print(var)

With output:

final output var: Tensor("Sigmoid_3:0", shape=(?, 4), dtype=float32)

All vars in the tensorflow graph:
<tf.Variable 'W1:0' shape=(10, 7) dtype=float32_ref>
<tf.Variable 'b1:0' shape=(7,) dtype=float32_ref>
<tf.Variable 'W2:0' shape=(7, 6) dtype=float32_ref>
<tf.Variable 'b2:0' shape=(6,) dtype=float32_ref>
<tf.Variable 'W3:0' shape=(6, 5) dtype=float32_ref>
<tf.Variable 'b3:0' shape=(5,) dtype=float32_ref>
<tf.Variable 'W4:0' shape=(5, 4) dtype=float32_ref>
<tf.Variable 'b4:0' shape=(4,) dtype=float32_ref>

In your code you were using the same name for w in every layer, so several different variables would be requested with the same name. I fixed it in my code, but even if you use the same name, tensorflow is intelligent enough to rename each variable to a unique name by adding an underscore and a number.
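
For example, a small sketch of that renaming behaviour (TF 1.x):

import tensorflow as tf

# both variables ask for the name "W"; the second is renamed automatically
w1 = tf.Variable(tf.zeros([2, 2]), name="W")
w2 = tf.Variable(tf.zeros([2, 2]), name="W")
print(w1.name)  # W:0
print(w2.name)  # W_1:0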


EDIT: here is what I think you wanted to do:

import tensorflow as tf

hidden_size = 4
input_size = hidden_size  # equality required!
output_size = hidden_size # equality required!
n_hidden = 3

meta_tensor = tf.Variable(tf.truncated_normal([n_hidden, hidden_size, hidden_size]),
                    name="meta")

def fc_layer(input, i_layer):
    w = meta_tensor[i_layer]
    # more verbose: w = tf.slice(meta_tensor, begin=[i_layer, 0, 0], size=[1, hidden_size, hidden_size])[0]

    b = tf.Variable(tf.constant(0.1, shape=[hidden_size]),
                    name="b" + str(i_layer))
    act = tf.nn.sigmoid(tf.matmul(input, w) + b)
    return act

input = tf.placeholder(tf.float32, [None, input_size])
# output will be the intermediate activations successively and in the end the
# final activations (output).
output = input
for i_layer in range(0, n_hidden):
    output = fc_layer(output, i_layer)

print("final output var: " + str(output))
print("All vars in the tensorflow graph:")
for var in tf.global_variables():
    print(var)

With output:

final output var: Tensor("Sigmoid_2:0", shape=(?, 4), dtype=float32)

All vars in the tensorflow graph:
<tf.Variable 'meta:0' shape=(3, 4, 4) dtype=float32_ref>
<tf.Variable 'b0:0' shape=(4,) dtype=float32_ref>
<tf.Variable 'b1:0' shape=(4,) dtype=float32_ref>
<tf.Variable 'b2:0' shape=(4,) dtype=float32_ref>

As I said, this is not standard. While coding it I also realized that it is quite limiting, since all hidden layers must have the same size: a meta-tensor can store many matrices, but they must all have the same dimensions. So you could not do what I did in the first example above, where the first hidden layer has size 7, the next size 6 and the last size 5, before an output of size 4.

Upvotes: 0
