Reputation: 53
I am starting out with tensorflow and I have a huge problem when it comes to the ranks of tensors and how they interact with each other.
I have the following code with me:
w = tf.Variable(tf.constant([0.2,0.6]))
x = tf.placeholder(tf.float32)
y = w * x
As you can see, it is an incredibly simple setup.
However, when I execute print(w), the output is Tensor("Variable_13/read:0", shape=(2,), dtype=float32).
What is the meaning of shape=(2,)? What does the comma indicate?
Further, here are the other sore points after sess = tf.Session()
and initialising the variables:
print(sess.run(y,{x:[1,2]}))
[ 0.2 1.20000005]
print(sess.run(y,{x:[1]}))
[ 0.2 0.60000002]
print(sess.run(y,{x:[[1],[2]]}))
[[ 0.2 0.60000002]
[ 0.40000001 1.20000005]]
Why am I getting such a variety of behaviour? How is tensorflow determining a single data point? I realise now that specifying shape while declaring the placeholder is probably better than getting myself stuck like this.
I understand the last two cases as they were taught in class, but I am at a loss to explain the behaviour of the first case.
Upvotes: 2
Views: 335
Reputation: 27050
shape=(2,) indicates the shape of the tensor. In particular, the comma at the end indicates that the shape is a one-element tuple (not that the tensor itself is a tuple).
You can check this by simply running type((2)), which returns int, whilst type((2,)) returns tuple.
You just discovered broadcasting.
In short, in the first case you're multiplying the two input tensors element by element.
In the second case, you're multiplying a tensor by a scalar.
In the third case, instead, you're multiplying each element of w by each element of x. This is because x has shape (2, 1) in this example: the 1 in the last dimension triggers a broadcasting rule that makes the operation behave this way.
You should read the broadcasting rules here for a better understanding: https://docs.scipy.org/doc/numpy-1.12.0/user/basics.broadcasting.html#general-broadcasting-rules
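For illustration, here is a minimal numpy sketch of the same broadcasting (TensorFlow follows the same rules; the values mirror the ones in your question):
import numpy as np

w = np.array([0.2, 0.6])        # shape (2,)
x = np.array([[1.0], [2.0]])    # shape (2, 1)

# The trailing 1 in x's shape is stretched to match w's length-2 axis,
# so the result has shape (2, 2): every row of x is multiplied by all of w.
print((w * x).shape)   # (2, 2)
print(w * x)           # [[0.2 0.6]
                       #  [0.4 1.2]]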
Upvotes: 1
Reputation: 402852
Your first question is a simple one. shape=(2,)
refers to the dimensions of w
. In numpy
, shape is always represented by a tuple of integers, like this:
>>> x = np.random.randn(50)
>>> x.shape
(50,)
This is a 1D array, and only one integer is specified in the shape
. Now,...
>>> x = np.random.randn(50, 50)
>>> x.shape
(50, 50)
This is a 2D array. As you can see, shape
specifies the size of x
along 2 dimensions.
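If it helps, you can check the same thing with the values of w from your question; this is just a sketch using numpy:
import numpy as np

w = np.array([0.2, 0.6])   # same values as the tf.Variable in the question
print(w.shape)             # (2,)  -> one dimension of size 2
print(w.ndim)              # 1     -> one entry in the shape tuple per dimension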
To answer your second question, x is a placeholder, meaning it can take on any value you feed it. That is precisely what the following lines do: {x:[1,2]}
, {x:[1]}
and {x:[[1],[2]]}
In the first case, x is assigned a 1D array of 2 elements [1, 2]
. In the second case, a 1D array with 1 element [1]
and so on.
Now, the operation w * x
above specifies that w
should be multiplied with x
. So, when doing sess.run(y,{x:[1,2]})
, w
is multiplied by x
with the values passed to it. And the output you see changes depending on the value you pass to x
.
In the first case, [0.2, 0.6] * [1, 2]
just multiplies each element at their corresponding indices and the result is [0.2 * 1, 0.6 * 2]
.
The second case broadcasts the single element of [1] across w, so the result is [0.2 * 1, 0.6 * 1].
In the third case, we have x with dimensions (2, 1). So each row of x
is in turn multiplied with w
to get a separate row, giving [[ 0.2, 0.60000002], [ 0.40000001, 1.20000005]]
as your output.
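Putting it all together, here is a minimal sketch that reproduces all three cases, assuming the TensorFlow 1.x API used in your question:
import tensorflow as tf   # assuming the 1.x API from the question

w = tf.Variable(tf.constant([0.2, 0.6]))   # shape (2,)
x = tf.placeholder(tf.float32)             # shape left unspecified
y = w * x

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, {x: [1, 2]}))       # elementwise: [0.2, 1.2]
    print(sess.run(y, {x: [1]}))          # [1] broadcast over w: [0.2, 0.6]
    print(sess.run(y, {x: [[1], [2]]}))   # (2, 1) against (2,) -> (2, 2) result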
Upvotes: 2