Reputation: 93
I am just starting to learn CNTK, and I have a basic question that is holding me back. I have the following test, which passes:
import numpy as np
from cntk import input_variable, plus
def test_simple(self):
    x_input = np.asarray([[1, 2, 2]], dtype=np.int64)
    assert (1, 3) == x_input.shape
    y_input = np.asarray([[5, 3, 3]], dtype=np.int64)
    assert (1, 3) == y_input.shape
    x = input_variable(x_input.shape[1])
    assert (3, ) == x.shape
    y = input_variable(y_input.shape[1])
    assert (3, ) == y.shape
    x_plus_y = plus(x, y)
    assert (3, ) == x_plus_y.shape
    res = x_plus_y.eval({x: x_input, y: y_input})
    assert 6 == res[0, 0, 0]
    assert 5 == res[0, 0, 1]
    assert 5 == res[0, 0, 2]
I understand that the shape of the output is (1, 1, 3) because the first and second axes are the batch axis and the default dynamic axis, respectively.
However, why do I need to set the shape of the input variables to (3,) instead of (1, 3)? Using (1, 3) fails.
Why is there an inconsistency between the shape of the input node in the graph and the numpy data used as input to that node?
Thank you, Paddy
Upvotes: 1
Views: 134
Reputation: 2050
This is explained a little in the description of "arguments" for Function.forward. Another description is here. The reason for your confusion is probably that CNTK does some "helpful" conversions.
If you declare your input as (1,3), you need to provide a list of (1,3) arrays for a minibatch without a sequence axis, or a list of (x,1,3) arrays for a minibatch with a sequence axis (where x may be different for each sequence in the minibatch). Similarly, if you declare an input as (3,), you need to provide either a list of (3,) vectors or a list of (x,3) arrays.
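The confusion probably arises from the case where a list is not provided. In that case CNTK iterates over the leading axis of the provided tensor and creates a list out of those elements; e.g., a (5,1,3) tensor becomes a batch of 5 elements, each having a shape of (1,3). By the same rule, your (1,3) array becomes a batch of one (3,) vector, which matches an input declared as (3,) but not one declared as (1,3).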
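To make the rule concrete, here is a minimal sketch that works on plain shape tuples. Note that `read_dense_batch` is a hypothetical helper I made up for illustration, not a CNTK function. It only mimics the rule described above and assumes nothing else about how CNTK validates input.

```python
# Hypothetical helper, NOT part of CNTK: it only mimics the shape rule
# described in this answer, using plain shape tuples.
def read_dense_batch(data_shape, declared_shape):
    """Describe how a dense tensor of data_shape would be read for an
    input declared with declared_shape, or return None on a mismatch."""
    data_shape = tuple(data_shape)
    declared_shape = tuple(declared_shape)
    if not data_shape:
        return None  # a scalar has no leading axis to iterate over

    # The leading axis is iterated over, so each batch element loses it.
    batch_size = data_shape[0]
    element_shape = data_shape[1:]

    # Each element matches the declared shape: no sequence axis.
    if element_shape == declared_shape:
        return ("samples", batch_size)

    # One extra leading axis per element is read as the sequence axis.
    if (len(element_shape) == len(declared_shape) + 1
            and element_shape[1:] == declared_shape):
        return ("sequences", batch_size, element_shape[0])

    return None


# The case from the question: a (1, 3) array fed to a (3,) input.
print(read_dense_batch((1, 3), (3,)))     # one sample of shape (3,)
# The failing case: a (1, 3) array fed to a (1, 3) input.
print(read_dense_batch((1, 3), (1, 3)))   # mismatch
# The example from this answer: a (5, 1, 3) tensor for a (1, 3) input.
print(read_dense_batch((5, 1, 3), (1, 3)))
```

Under this reading, the (1, 3) input declaration would need data of shape (1, 1, 3), i.e. one extra leading axis for the batch.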
Upvotes: 2