Reputation: 3427
I am struggling to pass a Pandas column (or NumPy array) with shape (2946, 1)
to a text-embedding input layer in Keras with TensorFlow 2. The Pandas DataFrame is just one text column with 2946 observations.
According to the TensorFlow Hub documentation for this pre-trained word embedding:
The module takes a batch of sentences in a 1-D tensor of strings as input.
The network and input layer are defined as follows:
import tensorflow_hub as hub
import tensorflow as tf
from tensorflow import keras
hub_layer = hub.KerasLayer("https://tfhub.dev/google/Wiki-words-500-with-normalization/2",
input_shape=[], dtype=tf.string)
model = keras.Sequential()
model.add(hub_layer)
model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='Adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(X_train.values, y_train, epochs=10, validation_split=0.20)
I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-36-cf8b37d02f89> in <module>()
1 model.fit(X_train.values, y_train,
2 epochs=10,
----> 3 validation_split=0.20)
8 frames
/tensorflow-2.1.0/python3.6/tensorflow_core/python/keras/engine/training_utils.py in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
571 ': expected ' + names[i] + ' to have ' +
572 str(len(shape)) + ' dimensions, but got array '
--> 573 'with shape ' + str(data_shape))
574 if not check_batch_axis:
575 data_shape = data_shape[1:]
ValueError: Error when checking input: expected keras_layer_input to have 1 dimensions, but got array with shape (2946, 1)
How can I pass a pandas column or numpy array as a batch of sentences in a 1-D tensor of strings that the input layer expects?
Upvotes: 2
Views: 614
Reputation: 7591
Try flattening X_train.values
from shape (2946, 1)
to (2946,)
. If X_train.values
is a np.ndarray
you can use X_train.values.ravel()
or several other options. If it's not, just convert it to a NumPy array first.
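A minimal sketch of the fix, assuming X_train is a one-column DataFrame of strings (the column name "text" and the sample sentences here are illustrative):

```python
import pandas as pd

# Stand-in for the asker's one-column DataFrame of sentences.
X_train = pd.DataFrame({"text": ["first sentence", "second sentence", "third sentence"]})

batch = X_train.values   # shape (3, 1) -- 2-D, which the hub layer rejects
flat = batch.ravel()     # shape (3,)  -- the 1-D batch of strings the layer expects

print(batch.shape)  # (3, 1)
print(flat.shape)   # (3,)
```

Selecting the column as a Series, e.g. X_train["text"].to_numpy(), would give the same 1-D result directly.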
Upvotes: 1