Reputation: 775
I am trying to run a forward pass on a convolutional neural network consisting of a convolutional layer, followed by a pooling layer, and finally a rectified linear unit (ReLU) activation layer. The details of the input data and the convolutional layer filters are as follows:
X: 4-dimensional input data with shape [N, H, W, C], where N = 60000 is the batch size, H = 32 is the height of an input image, W = 32 is the width of an input image, and C = 1 is the number of channels in an input image.
W: 4-dimensional convolutional filter with shape [F, F, C, Cout], where F = 3 is the height and width of the filter, C = 1 is the number of channels in the input image, and Cout = 6 is the number of channels in the output image.
There are three approaches to do this.
Approach 1: Without using tf.constant() or tf.placeholder()
import numpy as np
import tensorflow as tf
X = np.random.random([60000, 32, 32, 1])
W = np.random.random([3, 3, 1, 6])
C = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
A = tf.nn.relu(P)
with tf.Session() as sess:
    result = sess.run(A)  # Takes 14.98 seconds
Approach 2: Using tf.constant()
import numpy as np
import tensorflow as tf
X = tf.constant(np.random.random([60000, 32, 32, 1]), dtype=tf.float64)
W = tf.constant(np.random.random([3, 3, 1, 6]), dtype=tf.float64)
C = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
A = tf.nn.relu(P)
with tf.Session() as sess:
    result = sess.run(A)  # Takes 14.73 seconds
Approach 3: Using tf.placeholder()
import numpy as np
import tensorflow as tf
x = np.random.random([60000, 32, 32, 1])
w = np.random.random([3, 3, 1, 6])
X = tf.placeholder(dtype=tf.float64, shape=[None, 32, 32, 1])
W = tf.placeholder(dtype=tf.float64, shape=[3, 3, 1, 6])
C = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
A = tf.nn.relu(P)
with tf.Session() as sess:
    result = sess.run(A, feed_dict={X: x, W: w})  # Takes 3.21 seconds
Approach 3 (using tf.placeholder()) runs almost 4-5x faster than Approach 1 and Approach 2.
All these experiments were conducted on an NVIDIA GeForce GTX 1080 GPU.
The question is: why do we get an almost 4-5x speedup simply by using tf.placeholder() in Approach 3, compared to Approach 1 and Approach 2?
What is tf.placeholder() doing in its underlying implementation that allows it to perform so well?
Upvotes: 4
Views: 230
Reputation: 10474
Shoutout to @y.selivonchyk for the invaluable experiments; however, I feel the answer doesn't elaborate on why these results occur.
I believe this is not so much about 'placeholder' being "good", but rather about the other two methods being a bad idea.
I would presume that 1) and 2) are actually the same and that 1) converts the array to a constant under the hood -- at least this would explain the identical behavior.
The reason 1) and 2) take so long is that constants are embedded explicitly into the computational graph. Since they are quite large tensors, this explains why the graph takes so long to build. However, once the graph is built, subsequent runs are faster because everything is "contained" in there. You should generally try to avoid including large pieces of data in the graph itself -- it should ideally just be a set of instructions for computation (i.e. TensorFlow ops).
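To see this in action, here is a minimal sketch (assuming TensorFlow 1.x; the exact sizes will vary) that compares how large the serialized GraphDef becomes when the data is baked in as a constant versus referenced through a placeholder:
import numpy as np
import tensorflow as tf

# Graph with the input data embedded as a constant.
tf.reset_default_graph()
X = tf.constant(np.random.random([60000, 32, 32, 1]), dtype=tf.float64)
W = tf.constant(np.random.random([3, 3, 1, 6]), dtype=tf.float64)
A = tf.nn.relu(tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding="VALID"))
# The GraphDef now carries the full 60000x32x32x1 float64 tensor (roughly 490 MB).
print('constant graph: %d bytes' % tf.get_default_graph().as_graph_def().ByteSize())

# The same graph built with placeholders instead.
tf.reset_default_graph()
X = tf.placeholder(dtype=tf.float64, shape=[None, 32, 32, 1])
W = tf.placeholder(dtype=tf.float64, shape=[3, 3, 1, 6])
A = tf.nn.relu(tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding="VALID"))
# Only the op definitions are stored; this is just a few kilobytes.
print('placeholder graph: %d bytes' % tf.get_default_graph().as_graph_def().ByteSize())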
With 3), the graph is much faster to build because we do not embed the huge array in it, only a symbolic placeholder. However, execution is slower than 1) and 2) because the value needs to be fed into the placeholder on each call (which also means the data has to be transferred onto the GPU each time, in case you are running on one).
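If you want to avoid both problems -- not embedding the data in the graph, but also not re-feeding (and re-transferring) it on every run -- a common TensorFlow 1.x pattern is to feed a placeholder once to initialize a non-trainable variable that then lives in device memory. A minimal sketch of that idea (not part of the original experiments):
import numpy as np
import tensorflow as tf

x = np.random.random([60000, 32, 32, 1])

# The placeholder is used exactly once, to initialize the variable.
X_init = tf.placeholder(dtype=tf.float64, shape=[60000, 32, 32, 1])
# collections=[] keeps this bulky data holder out of the global-variables collection.
X = tf.Variable(X_init, trainable=False, collections=[])

W = tf.constant(np.random.random([3, 3, 1, 6]), dtype=tf.float64)
C = tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="VALID")
A = tf.nn.relu(P)

with tf.Session() as sess:
    # Pay the host-to-device transfer once, at initialization time.
    sess.run(X.initializer, feed_dict={X_init: x})
    # Later runs read X straight from device memory; no feed_dict needed.
    result = sess.run(A)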
Upvotes: 3
Reputation: 9910
I got 12 sec, 12 sec, and 1 sec respectively. But.
Your method does not account for set-up time: graph construction, memory allocation, graph optimization, etc. I took it upon myself to advance your experiments a little bit. Namely, I make 10 calls to session.run() for each method and measure not only the total time but also the time of each individual call. Below are the results of these experiments. The interesting part is the execution time of the first call.
%%time
import time
import numpy as np
import tensorflow as tf
X = np.random.random([60000, 32, 32, 1])
W = np.random.random([3, 3, 1, 6])
C = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
A = tf.nn.relu(P)
with tf.Session() as sess:
    for i in range(10):
        ts = time.time()
        result = sess.run(A)
        te = time.time()
        print('%2.2f sec' % (te-ts))
10.44 sec
0.24 sec
0.23 sec
0.23 sec
0.23 sec
0.24 sec
0.23 sec
0.23 sec
0.24 sec
0.23 sec
CPU times: user 17 s, sys: 7.56 s, total: 24.5 s
Wall time: 13.8 s
2:
%%time
import time
import numpy as np
import tensorflow as tf
X = tf.constant(np.random.random([60000, 32, 32, 1]), dtype=tf.float64)
W = tf.constant(np.random.random([3, 3, 1, 6]), dtype=tf.float64)
C = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
A = tf.nn.relu(P)
with tf.Session() as sess:
    for i in range(10):
        ts = time.time()
        result = sess.run(A)
        te = time.time()
        print('%2.2f sec' % (te-ts))
10.53 sec
0.23 sec
0.23 sec
0.24 sec
0.23 sec
0.23 sec
0.23 sec
0.23 sec
0.23 sec
0.26 sec
CPU times: user 17 s, sys: 7.77 s, total: 24.8 s
Wall time: 14.1 s
3:
%%time
import time
import numpy as np
import tensorflow as tf
x = np.random.random([60000, 32, 32, 1])
w = np.random.random([3, 3, 1, 6])
X = tf.placeholder(dtype=tf.float64, shape=[None, 32, 32, 1])
W = tf.placeholder(dtype=tf.float64, shape=[3, 3, 1, 6])
C = tf.nn.conv2d(X, W, strides=[1,1,1,1], padding="VALID")
P = tf.nn.avg_pool(C, ksize=[1,2,2,1], strides=[1,2,2,1], padding="VALID")
A = tf.nn.relu(P)
with tf.Session() as sess:
    for i in range(10):
        ts = time.time()
        result = sess.run(A, feed_dict={X: x, W: w})
        te = time.time()
        print('%2.2f sec' % (te-ts))
0.45 sec
0.45 sec
0.45 sec
0.45 sec
0.45 sec
0.45 sec
0.45 sec
0.45 sec
0.45 sec
0.45 sec
CPU times: user 2.81 s, sys: 2.31 s, total: 5.12 s
Wall time: 5.02 s
As you can see, for the first two methods the first call to sess.run indeed takes quite some time (about 10 sec), while method 3 always takes 0.45 sec. But the second and subsequent runs of the first two are twice as fast, at 0.23 sec, presumably because by then the constant data already resides in device memory, whereas method 3 has to feed (and transfer) it on every call.
Upvotes: 4