Reputation: 1991
I am trying to train a neural network with a single hidden layer for a text-based multi-label classification problem.
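# Imports assume TensorFlow's bundled Keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(20, input_dim=400, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(x_train, y_train, verbose=0, epochs=100)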
I get the following error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
x_train is text data vectorized with 300-dimensional word2vec embeddings, with each instance padded to a length of 400. It contains 462 records.
Observations on the training data are below:
print('#### Shape of input numpy array #####')
print(x_train.shape)
print('#### Shape of each element in the array #####')
print(x_train[0].shape)
print('#### Object type for input data #####')
print(type(x_train))
print('##### Object type for first element of input data ####')
print(type(x_train[0]))
#### Shape of input numpy array #####
(462,)
#### Shape of each element in the array #####
(400, 300)
#### Object type for input data #####
<class 'numpy.ndarray'>
##### Object type for first element of input data ####
<class 'numpy.ndarray'>
Upvotes: 3
Views: 19866
Reputation: 651
There are three problems.
Problem 1
This is your main problem, and it directly causes the error.
Something is wrong with how x_train is built or converted (possibly a bug, or an unusual way of constructing the data): x_train is an array of arrays, that is, a 1-D array whose 462 elements are themselves (400, 300) arrays, rather than a single multi-dimensional array. TensorFlow therefore sees a 1-D array of objects, as the shape (462,) shows, and cannot convert it to a tensor.
The solution is to rebuild the array before passing it to fit():
x_train = np.array([np.array(val) for val in x_train])
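To illustrate the difference, here is a minimal NumPy sketch. The data is a hypothetical stand-in (5 all-zero samples instead of the 462 real ones), and the way the nested array is built here is only an assumption about how such an array can arise. It shows the (n,) object array turning into a proper 3-D array after the rebuild:

```python
import numpy as np

# Hypothetical stand-in for the real data: 5 samples instead of 462,
# each a (400, 300) matrix of word vectors.
samples = [np.zeros((400, 300), dtype=np.float32) for _ in range(5)]

# Build an "array of arrays": a 1-D object array whose elements are ndarrays.
nested = np.empty(len(samples), dtype=object)
for i, sample in enumerate(samples):
    nested[i] = sample

print(nested.shape, nested.dtype)    # (5,) object

# Rebuild it as one dense 3-D array, as in the line above.
rebuilt = np.array([np.array(val) for val in nested])
print(rebuilt.shape, rebuilt.dtype)  # (5, 400, 300) float32

# np.stack gives the same result and raises if element shapes differ.
stacked = np.stack(nested)
```

Note that the rebuild only yields a 3-D array if every element has the same shape; np.stack makes that requirement explicit by raising an error otherwise.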
Problem 2
A Dense layer expects input of shape (batch_size, ..., input_dim), so the last dimension of x_train must equal input_dim. Yours is 300, but input_dim is set to 400.
According to your description, the input dimension is the word-vector dimension, 300, so you should change input_dim to 300:
model.add(Dense(20, input_dim=300, kernel_initializer='he_uniform', activation='relu'))
or provide the full per-sample shape through input_shape instead:
model.add(Dense(20, input_shape=(400, 300), kernel_initializer='he_uniform', activation='relu'))
Problem 3
A dense (linear) layer is meant for flat input: it expects each sample to be a one-dimensional vector, so the input usually has shape (batch_size, vector_length). When Dense receives input with more than 2 dimensions (you have 3), it applies the dense operation along the last dimension only. Quoting the official TensorFlow documentation:
Note: If the input to the layer has a rank greater than 2, then Dense computes the dot product between the inputs and the kernel along the last axis of the inputs and axis 1 of the kernel (using tf.tensordot). For example, if input has dimensions (batch_size, d0, d1), then we create a kernel with shape (d1, units), and the kernel operates along axis 2 of the input, on every sub-tensor of shape (1, 1, d1) (there are batch_size * d0 such sub-tensors). The output in this case will have shape (batch_size, d0, units).
This means your y would need to have shape (462, 400, 9), which is most likely not what you want. (If it is what you want, the changes in problems 1 and 2 should already solve your problem.)
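The shape behaviour described in the quote can be checked without TensorFlow. The following is a NumPy-only sketch, not Keras itself: the random inputs, the zero bias, and the batch of 4 (instead of 462) are assumptions for illustration. It contracts the last axis of the input with the kernel axis of length d1, as a dense layer without activation would:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, d0, d1, units = 4, 400, 300, 9  # batch of 4 instead of 462

inputs = rng.standard_normal((batch_size, d0, d1))
kernel = rng.standard_normal((d1, units))   # shape (d1, units), as in the quote
bias = np.zeros(units)

# Contract the last axis of the inputs with the kernel axis of length d1.
outputs = np.tensordot(inputs, kernel, axes=([2], [0])) + bias
print(outputs.shape)  # (4, 400, 9)

# Each of the batch_size * d0 row vectors is transformed independently.
single_row = inputs[1, 7] @ kernel + bias
print(np.allclose(outputs[1, 7], single_row))  # True
```

Every one of the 400 word positions gets its own 9-value output, which is why the targets would need a matching (batch_size, 400, 9) shape.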
If you want to apply Dense to the whole 400x300 matrix, you need to flatten each sample into a one-dimensional vector first, like this:
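# Imports assume TensorFlow's bundled Keras
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

x_train = np.array([np.array(val) for val in x_train])  # rebuild as a (462, 400, 300) array
model = Sequential()
model.add(Flatten(input_shape=(400, 300)))  # (400, 300) -> (120000,) per sample
model.add(Dense(20, kernel_initializer='he_uniform', activation='relu'))
model.add(Dense(9, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')
model.fit(x_train, y_train, verbose=0, epochs=100)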
Now the output will have shape (462, 9).
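As a rough check of the shapes involved, here is a NumPy-only sketch of the same forward pass. It is not the Keras model: the weights are random placeholders, the batch has 4 samples instead of 462, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size = 4  # stand-in for the 462 records
x = rng.standard_normal((batch_size, 400, 300)).astype(np.float32)

# Flatten: every (400, 300) sample becomes one vector of length 120000.
flat = x.reshape(batch_size, -1)
print(flat.shape)  # (4, 120000)

# Hidden dense layer with 20 units and ReLU (placeholder weights).
w1 = (rng.standard_normal((flat.shape[1], 20)) * 0.01).astype(np.float32)
b1 = np.zeros(20, dtype=np.float32)
hidden = np.maximum(flat @ w1 + b1, 0.0)

# Output dense layer with 9 units and sigmoid (placeholder weights).
w2 = (rng.standard_normal((20, 9)) * 0.01).astype(np.float32)
b2 = np.zeros(9, dtype=np.float32)
out = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))
print(out.shape)  # (4, 9)

# Number of trainable values in this sketch.
n_params = w1.size + b1.size + w2.size + b2.size
print(n_params)  # 2400209
```

One thing the sketch makes visible is that flattening puts 120000 inputs into the first dense layer, so that layer alone holds 2,400,020 weights and biases.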
Upvotes: 7