Before starting: I am using Python 3.4.6 and TensorFlow 1.4.0.
I am aiming to compare the exact same NN architecture trained in TensorFlow under different input pipelines. For this purpose, I first want to ensure that the same NN graph gets the same initial weights regardless of the input pipeline defined before it.
I have seen that there are two seeding levels in TensorFlow: the graph-level seed and the operation-level seed. I am currently calling tf.set_random_seed(777) to set the graph-level seed.
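For reference, here is how the two levels interact in the TF 1.x API (a minimal sketch of mine, not one of the snippets below):

import tensorflow as tf

with tf.Graph().as_default():
    tf.set_random_seed(777)              # graph-level seed
    a = tf.random_uniform([1])           # operation seed picked by the system
    b = tf.random_uniform([1], seed=42)  # explicit operation-level seed
    with tf.Session() as sess:
        # `b` is reproducible across graph variations; `a` only for this exact graph
        print(sess.run([a, b]))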
To keep it simple, I post below two snippets that differ only in the input pipeline, together with a subset of the resulting weights:
Code 1:
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Set graph seed
    tf.set_random_seed(777)
    # Input placeholders
    input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
    labels = tf.placeholder(tf.int64, shape=(None, 1))
    # Dummy placeholder
    # dummy = tf.placeholder(tf.int32)
    # Example Model
    conv1_1 = tf.layers.conv2d(input_x, 8, 3, activation=tf.nn.relu, name='conv1_1')
    conv1_2 = tf.layers.conv2d(conv1_1, 8, 3, activation=tf.nn.relu, name='conv1_2')
    pool1 = tf.layers.max_pooling2d(conv1_2, 2, 2, name="pool1")

    session_conf = tf.ConfigProto(log_device_placement=False)
    with tf.Session(config=session_conf) as sess:
        sess.run([tf.local_variables_initializer(), tf.global_variables_initializer()])
        conv1_1_kernels = [v for v in tf.trainable_variables() if v.name == "conv1_1/kernel:0"][0]
        print(sess.run(conv1_1_kernels)[:, :, 0, 0])
Output of Code 1:
[[ 0.03720146 0.0177983 -0.18485998]
[ 0.22072873 -0.14565685 0.21660429]
[-0.15442888 0.12140495 -0.05090818]]
Code 2:
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Set graph seed
    tf.set_random_seed(777)
    # Input placeholders
    input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
    labels = tf.placeholder(tf.int64, shape=(None, 1))
    # Dummy placeholder
    dummy = tf.placeholder(tf.int32)
    # Example Model
    conv1_1 = tf.layers.conv2d(input_x, 8, 3, activation=tf.nn.relu, name='conv1_1')
    conv1_2 = tf.layers.conv2d(conv1_1, 8, 3, activation=tf.nn.relu, name='conv1_2')
    pool1 = tf.layers.max_pooling2d(conv1_2, 2, 2, name="pool1")

    session_conf = tf.ConfigProto(log_device_placement=False)
    with tf.Session(config=session_conf) as sess:
        sess.run([tf.local_variables_initializer(), tf.global_variables_initializer()])
        conv1_1_kernels = [v for v in tf.trainable_variables() if v.name == "conv1_1/kernel:0"][0]
        print(sess.run(conv1_1_kernels)[:, :, 0, 0])
Output of Code 2:
[[-0.20316723 0.01109874 -0.16709594]
[ 0.22850838 -0.10679846 -0.22449632]
[-0.13468848 0.12664327 0.2225503 ]]
These snippets define my input pipeline through placeholders and build a simple NN graph. Then we start a tf.Session as sess to evaluate the current weights of the first channel of the first kernel in the first layer, conv1_1.
The main difference between Code 1 and Code 2 is whether the dummy placeholder is commented out. I have rerun both snippets independently several times, and each one's weights are consistent across its own runs.
So, does anybody know how to get the same weights in my NN layers regardless of whether I define the dummy placeholder? Why does TensorFlow's PRNG depend on a previously defined placeholder, which does not even use the PRNG? Any help will be appreciated!
Answer:
As the documentation says:
If the graph-level seed is set, but the operation seed is not: The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence.
It looks like this "deterministic pick" depends on the graph contents: in TF 1.x the operation seed appears to be derived from the op's position (its id) in the graph, so creating the dummy placeholder shifts the seed assigned to every op created after it.
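You can observe this directly (a sketch of mine, relying on the fact that TF 1.x random kernels store the seeds they were assigned in their seed/seed2 op attributes):

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    tf.set_random_seed(777)
    # dummy = tf.placeholder(tf.int32)  # uncommenting this changes seed2 below
    r = tf.random_uniform([1])          # no operation-level seed passed
    # Locate the underlying RandomUniform kernel and inspect its seed attributes
    ru = [op for op in g.get_operations() if op.type == 'RandomUniform'][0]
    print(ru.get_attr('seed'), ru.get_attr('seed2'))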
So to get reproducible results you also need to pass an operation-level seed:
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Set graph seed
    tf.set_random_seed(777)
    # Input placeholders
    input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
    labels = tf.placeholder(tf.int64, shape=(None, 1))
    # Dummy placeholder
    dummy = tf.placeholder(tf.int32)
    # Example Model
    conv1_1 = tf.layers.conv2d(
        input_x, 8, 3,
        activation=tf.nn.relu,
        name='conv1_1',
        kernel_initializer=tf.glorot_uniform_initializer(seed=1)
    )
    conv1_2 = tf.layers.conv2d(
        conv1_1, 8, 3,
        activation=tf.nn.relu,
        name='conv1_2',
        kernel_initializer=tf.glorot_uniform_initializer(seed=2)
    )
    pool1 = tf.layers.max_pooling2d(conv1_2, 2, 2, name="pool1")

    session_conf = tf.ConfigProto(log_device_placement=False)
    with tf.Session(config=session_conf) as sess:
        sess.run([tf.local_variables_initializer(), tf.global_variables_initializer()])
        conv1_1_kernels = [v for v in tf.trainable_variables() if v.name == "conv1_1/kernel:0"][0]
        print(sess.run(conv1_1_kernels)[:, :, 0, 0])
I just added the kernel_initializer parameters; they should be added to both scripts so that every randomly initialized layer gets an explicit operation-level seed.
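As a sanity check, here is a sketch of mine (not part of the original scripts) that builds the graph twice, with and without the dummy placeholder, and verifies that the seeded initializer now yields identical kernels:

import numpy as np
import tensorflow as tf

def first_kernel(with_dummy):
    g = tf.Graph()
    with g.as_default():
        tf.set_random_seed(777)
        input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
        if with_dummy:
            dummy = tf.placeholder(tf.int32)
        conv1_1 = tf.layers.conv2d(
            input_x, 8, 3, activation=tf.nn.relu, name='conv1_1',
            kernel_initializer=tf.glorot_uniform_initializer(seed=1))
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            kernel = [v for v in tf.trainable_variables()
                      if v.name == 'conv1_1/kernel:0'][0]
            return sess.run(kernel)

print(np.array_equal(first_kernel(True), first_kernel(False)))  # expect: True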