Reputation: 319
I know there are already questions with similar titles, but before you flag this as a duplicate, please let me say that the answers to those questions are all very ad hoc and don't apply to my problem.
I'm having trouble understanding why I can't take the matrix multiplication (well, technically the matrix-vector multiplication) of two Tensors in TensorFlow. I have a Tensor v with shape (1000, 1000) and another Tensor h_previous with shape (1000). Earlier in the program I do plenty of matrix multiplications with Tensors of exactly these shapes, but this one throws a cryptic error. Here are the critical parts of the graph:
# Variables
# Encoder input
X = tf.placeholder(tf.float32, shape=[k, None])
we = tf.Variable(tf.truncated_normal([500, k], -0.1, 0.1))
# Encoder update gate
wz = tf.Variable(tf.truncated_normal([1000, 500], -0.1, 0.1))
uz = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder reset gate
wr = tf.Variable(tf.truncated_normal([1000, 500], -0.1, 0.1))
ur = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder h~ [find name]
w = tf.Variable(tf.truncated_normal([1000, 500], -0.1, 0.1))
u = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder representation weight
v = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder
h_previous = tf.zeros([1000])
for t in range(N):
    # Current vector and its embedding
    xt = tf.reshape(tf.slice(X, [t, 0], [1, k]), [k])
    e = tf.matmul(we, xt)
    # Reset calculation
    r = tf.sigmoid(tf.matmul(wr, e) + tf.matmul(ur, h_previous))
    # Update calculation
    z = tf.sigmoid(tf.matmul(wz, e) + tf.matmul(uz, h_previous))
    # Hidden-tilde calculation
    h_tilde = tf.tanh(tf.matmul(w, e) + tf.matmul(u, r * h_previous))
    # Hidden calculation
    one = tf.ones([1000])
    h = z * h_previous + (one - z) * h_tilde
    h_previous = h
c = tf.tanh(tf.matmul(v, h_previous))
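To isolate the problem, here is a minimal snippet with the same shapes (everything else stripped out) that reproduces the error for me:
import tensorflow as tf

v = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
h_previous = tf.zeros([1000])          # rank-1 tensor of shape (1000,)
c = tf.tanh(tf.matmul(v, h_previous))  # fails at graph construction with a rank/shape error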
I'm stumped. Does anyone have any clue? Thanks in advance. :)
Upvotes: 0
Views: 148
Reputation: 554
I have fixed your code in a couple of places and now it works (see below). Generally, both inputs to tf.matmul must be 2-D matrices (see the tf.matmul docs), but you passed a 2-D tensor (shape 1000x1000) and a 1-D tensor (shape (1000,)). If you reshape the 1-D tensor to a 1000x1 column, matmul will work.
import tensorflow as tf

k = 77   # example values for the input dimension and sequence length
N = 17
# Variables
# Encoder input
X = tf.placeholder(tf.float32, shape=[k, None])
we = tf.Variable(tf.truncated_normal([500, k], -0.1, 0.1))
# Encoder update gate
wz = tf.Variable(tf.truncated_normal([1000, 500], -0.1, 0.1))
uz = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder reset gate
wr = tf.Variable(tf.truncated_normal([1000, 500], -0.1, 0.1))
ur = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder h~ [find name]
w = tf.Variable(tf.truncated_normal([1000, 500], -0.1, 0.1))
u = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder representation weight
v = tf.Variable(tf.truncated_normal([1000, 1000], -0.1, 0.1))
# Encoder
h_previous = tf.zeros([1000, 1])
for t in range(N):
    # Current vector and its embedding
    xt = tf.reshape(tf.slice(X, [t, 0], [1, k]), [k, 1])
    e = tf.matmul(we, xt)
    # Reset calculation
    r = tf.sigmoid(tf.matmul(wr, e) + tf.matmul(ur, h_previous))
    # Update calculation
    z = tf.sigmoid(tf.matmul(wz, e) + tf.matmul(uz, h_previous))
    # Hidden-tilde calculation
    h_tilde = tf.tanh(tf.matmul(w, e) + tf.matmul(u, r * h_previous))
    # Hidden calculation
    one = tf.ones([1000, 1])  # column shape so broadcasting keeps h at [1000, 1]
    h = z * h_previous + (one - z) * h_tilde
    h_previous = h
c = tf.tanh(tf.matmul(v, h_previous))
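As a side note, tf.expand_dims does the same job as the explicit reshape if you prefer it; either way both arguments to matmul end up rank 2 (toy shapes below):
h_vec = tf.zeros([1000])              # rank-1 vector
h_col = tf.reshape(h_vec, [1000, 1])  # rank-2 column via reshape
h_col_alt = tf.expand_dims(h_vec, 1)  # equivalent column via expand_dims
c = tf.tanh(tf.matmul(v, h_col))      # matmul now gets two rank-2 tensors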
Upvotes: 2