Before starting: I am using Python 3.4.6 and TensorFlow 1.4.0.
I am aiming to compare the exact same NN architecture trained in TensorFlow under different input pipelines. For this purpose, I first want to ensure that the same NN graph gets the same initial weights regardless of the input pipeline defined before it.
I have seen that there are two seeding levels in TensorFlow: the graph-level seed and the operation-level seed. I am currently calling tf.set_random_seed(777) to set the graph-level seed.
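For reference, here is how the two levels interact in the TF 1.x API (a minimal sketch of mine, not one of the snippets below):

import tensorflow as tf

with tf.Graph().as_default():
    tf.set_random_seed(777)              # graph-level seed
    a = tf.random_uniform([1])           # operation seed picked by the system
    b = tf.random_uniform([1], seed=42)  # explicit operation-level seed
    with tf.Session() as sess:
        # `b` is reproducible across graph variations; `a` only for this exact graph
        print(sess.run([a, b]))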
To keep it simple, I post below two snippets that differ only in the input pipeline, together with a subset of the resulting weights:
Code 1:
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Set graph seed
    tf.set_random_seed(777)
    # Input placeholders
    input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
    labels = tf.placeholder(tf.int64, shape=(None, 1))
    # Dummy placeholder
    # dummy = tf.placeholder(tf.int32)
    # Example Model
    conv1_1 = tf.layers.conv2d(input_x, 8, 3, activation=tf.nn.relu, name='conv1_1')
    conv1_2 = tf.layers.conv2d(conv1_1, 8, 3, activation=tf.nn.relu, name='conv1_2')
    pool1 = tf.layers.max_pooling2d(conv1_2, 2, 2, name="pool1")

    session_conf = tf.ConfigProto(log_device_placement=False)
    with tf.Session(config=session_conf) as sess:
        sess.run([tf.local_variables_initializer(), tf.global_variables_initializer()])
        conv1_1_kernels = [v for v in tf.trainable_variables() if v.name == "conv1_1/kernel:0"][0]
        print(sess.run(conv1_1_kernels)[:, :, 0, 0])
Output of Code 1:
[[ 0.03720146 0.0177983 -0.18485998]
[ 0.22072873 -0.14565685 0.21660429]
[-0.15442888 0.12140495 -0.05090818]]
Code 2:
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Set graph seed
    tf.set_random_seed(777)
    # Input placeholders
    input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
    labels = tf.placeholder(tf.int64, shape=(None, 1))
    # Dummy placeholder
    dummy = tf.placeholder(tf.int32)
    # Example Model
    conv1_1 = tf.layers.conv2d(input_x, 8, 3, activation=tf.nn.relu, name='conv1_1')
    conv1_2 = tf.layers.conv2d(conv1_1, 8, 3, activation=tf.nn.relu, name='conv1_2')
    pool1 = tf.layers.max_pooling2d(conv1_2, 2, 2, name="pool1")

    session_conf = tf.ConfigProto(log_device_placement=False)
    with tf.Session(config=session_conf) as sess:
        sess.run([tf.local_variables_initializer(), tf.global_variables_initializer()])
        conv1_1_kernels = [v for v in tf.trainable_variables() if v.name == "conv1_1/kernel:0"][0]
        print(sess.run(conv1_1_kernels)[:, :, 0, 0])
Output of Code 2:
[[-0.20316723 0.01109874 -0.16709594]
[ 0.22850838 -0.10679846 -0.22449632]
[-0.13468848 0.12664327 0.2225503 ]]
These snippets define my input pipeline through placeholders and build a simple NN graph. Then we start a tf.Session as sess to evaluate the current weights of the first channel of the first kernel in the first layer, conv1_1.
The main difference between Code 1 and Code 2 is whether the dummy placeholder is commented out. I have rerun both snippets independently several times, and each one's weights are consistent across its own runs.
So, does anybody know how to get the same weights in my NN layers regardless of whether I define the dummy placeholder? Why does TensorFlow's PRNG depend on a previously defined placeholder, which does not even use the PRNG? Any help will be appreciated!
Answer:
As the documentation says:
If the graph-level seed is set, but the operation seed is not: The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence.
It looks like this "deterministic pick" depends on the graph contents: in TF 1.x the operation seed appears to be derived from the op's position (its id) in the graph, so creating the dummy placeholder shifts the seed assigned to every op created after it.
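You can observe this directly (a sketch of mine, relying on the fact that TF 1.x random kernels store the seeds they were assigned in their seed/seed2 op attributes):

import tensorflow as tf

g = tf.Graph()
with g.as_default():
    tf.set_random_seed(777)
    # dummy = tf.placeholder(tf.int32)  # uncommenting this changes seed2 below
    r = tf.random_uniform([1])          # no operation-level seed passed
    # Locate the underlying RandomUniform kernel and inspect its seed attributes
    ru = [op for op in g.get_operations() if op.type == 'RandomUniform'][0]
    print(ru.get_attr('seed'), ru.get_attr('seed2'))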
So to get reproducible results you also need to pass an operation-level seed:
import tensorflow as tf

g = tf.Graph()
with g.as_default():
    # Set graph seed
    tf.set_random_seed(777)
    # Input placeholders
    input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
    labels = tf.placeholder(tf.int64, shape=(None, 1))
    # Dummy placeholder
    dummy = tf.placeholder(tf.int32)
    # Example Model
    conv1_1 = tf.layers.conv2d(
        input_x, 8, 3,
        activation=tf.nn.relu,
        name='conv1_1',
        kernel_initializer=tf.glorot_uniform_initializer(seed=1)
    )
    conv1_2 = tf.layers.conv2d(
        conv1_1, 8, 3,
        activation=tf.nn.relu,
        name='conv1_2',
        kernel_initializer=tf.glorot_uniform_initializer(seed=2)
    )
    pool1 = tf.layers.max_pooling2d(conv1_2, 2, 2, name="pool1")

    session_conf = tf.ConfigProto(log_device_placement=False)
    with tf.Session(config=session_conf) as sess:
        sess.run([tf.local_variables_initializer(), tf.global_variables_initializer()])
        conv1_1_kernels = [v for v in tf.trainable_variables() if v.name == "conv1_1/kernel:0"][0]
        print(sess.run(conv1_1_kernels)[:, :, 0, 0])
I just added the kernel_initializer parameters; they should be added to both scripts so that every randomly initialized layer gets an explicit operation-level seed.
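As a sanity check, here is a sketch of mine (not part of the original scripts) that builds the graph twice, with and without the dummy placeholder, and verifies that the seeded initializer now yields identical kernels:

import numpy as np
import tensorflow as tf

def first_kernel(with_dummy):
    g = tf.Graph()
    with g.as_default():
        tf.set_random_seed(777)
        input_x = tf.placeholder(tf.float32, shape=(None, 512, 512, 3))
        if with_dummy:
            dummy = tf.placeholder(tf.int32)
        conv1_1 = tf.layers.conv2d(
            input_x, 8, 3, activation=tf.nn.relu, name='conv1_1',
            kernel_initializer=tf.glorot_uniform_initializer(seed=1))
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            kernel = [v for v in tf.trainable_variables()
                      if v.name == 'conv1_1/kernel:0'][0]
            return sess.run(kernel)

print(np.array_equal(first_kernel(True), first_kernel(False)))  # expect: True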