Reputation: 1291
I have this code:
EMBEDDING_DIM = 100
MAXLEN = 16
TRUNCATING = 'post'
PADDING = 'post'
OOV_TOKEN = "<OOV>"
MAX_EXAMPLES = 160000
TRAINING_SPLIT = 0.9
# Initialize an empty numpy array with the appropriate size
EMBEDDINGS_MATRIX = np.zeros((VOCAB_SIZE+1, EMBEDDING_DIM))
# Iterate all of the words in the vocabulary and if the vector representation for
# each word exists within GloVe's representations, save it in the EMBEDDINGS_MATRIX array
for word, i in word_index.items():
    embedding_vector = GLOVE_EMBEDDINGS.get(word)
    if embedding_vector is not None:
        EMBEDDINGS_MATRIX[i] = embedding_vector
# Define the model
def create_model(vocab_size, embedding_dim, maxlen, embeddings_matrix):
    model = tf.keras.Sequential([
        # Set the Embedding layer when using pre-trained embeddings
        tf.keras.layers.Embedding(vocab_size+1, embedding_dim, input_length=maxlen, weights=[embeddings_matrix], trainable=False),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Conv1D(64, 6, activation='relu'),
        # tf.keras.layers.AveragePooling1D(pool_size=4),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(8, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ])
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    return model.summary()
model = create_model(VOCAB_SIZE, EMBEDDING_DIM, MAXLEN, EMBEDDINGS_MATRIX)
# Train the model
history = model.fit(train_pad_trunc_seq, train_labels, epochs=20, validation_data=(val_pad_trunc_seq, val_labels))
which gives me this error:
ValueError Traceback (most recent call last)
Input In [26], in <cell line: 2>()
1 # Create your untrained model
----> 2 model = create_model(VOCAB_SIZE, EMBEDDING_DIM, MAXLEN, EMBEDDINGS_MATRIX)
4 # Train the model and save the training history
5 history = model.fit(train_pad_trunc_seq, train_labels, epochs=20, validation_data=(val_pad_trunc_seq, val_labels))
Input In [25], in create_model(vocab_size, embedding_dim, maxlen, embeddings_matrix)
4 def create_model(vocab_size, embedding_dim, maxlen, embeddings_matrix):
5
6 ### START CODE HERE
----> 8 model = tf.keras.Sequential([
9 # This is how you need to set the Embedding layer when using pre-trained embeddings
10 tf.keras.layers.Embedding(vocab_size+1, embedding_dim, input_length=maxlen, weights=[embeddings_matrix], trainable=False),
11 tf.keras.layers.Dropout(0.2),
12 tf.keras.layers.Conv1D(64, 6, activation='relu'),
13 # tf.keras.layers.AveragePooling1D(pool_size=4),
14 tf.keras.layers.GlobalAveragePooling1D(),
15 tf.keras.layers.LSTM(64),
16 tf.keras.layers.Dense(8, activation='relu'),
17 tf.keras.layers.Dense(1, activation='sigmoid')
18 ])
20 model.compile(loss='binary_crossentropy',
21 optimizer='adam',
22 metrics=['accuracy'])
24 ### END CODE HERE
File ~\.conda\envs\tf-gpu\lib\site-packages\tensorflow\python\training\tracking\base.py:629, in no_automatic_dependency_tracking.<locals>._method_wrapper(self, *args, **kwargs)
627 self._self_setattr_tracking = False # pylint: disable=protected-access
628 try:
--> 629 result = method(self, *args, **kwargs)
630 finally:
631 self._self_setattr_tracking = previous_value # pylint: disable=protected-access
File ~\.conda\envs\tf-gpu\lib\site-packages\keras\utils\traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb
File ~\.conda\envs\tf-gpu\lib\site-packages\keras\engine\input_spec.py:214, in assert_input_compatibility(input_spec, inputs, layer_name)
212 ndim = shape.rank
213 if ndim != spec.ndim:
--> 214 raise ValueError(f'Input {input_index} of layer "{layer_name}" '
215 'is incompatible with the layer: '
216 f'expected ndim={spec.ndim}, found ndim={ndim}. '
217 f'Full shape received: {tuple(shape)}')
218 if spec.max_ndim is not None:
219 ndim = x.shape.rank
ValueError: Input 0 of layer "lstm_2" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 64)
So my LSTM layer needs a 3-D input of shape (?, ?, ?) instead of (None, 64), but what should that shape be?
I have also tried replacing GlobalAveragePooling1D() with AveragePooling1D(pool_size=4). That prints the summary but then raises a different error:
_________________________________________________________________
Layer (type)                            Output Shape      Param #
=================================================================
embedding_3 (Embedding)                 (None, 16, 100)   12829400
dropout_3 (Dropout)                     (None, 16, 100)   0
conv1d_3 (Conv1D)                       (None, 11, 64)    38464
average_pooling1d_1 (AveragePooling1D)  (None, 2, 64)     0
lstm_3 (LSTM)                           (None, 64)        33024
dense_6 (Dense)                         (None, 8)         520
dense_7 (Dense)                         (None, 1)         9
=================================================================
Total params: 12,901,417
Trainable params: 72,017
Non-trainable params: 12,829,400
AttributeError Traceback (most recent call last)
Input In [24], in <cell line: 5>()
2 model = create_model(VOCAB_SIZE, EMBEDDING_DIM, MAXLEN, EMBEDDINGS_MATRIX)
4 # Train the model and save the training history
----> 5 history = model.fit(train_pad_trunc_seq, train_labels, epochs=20, validation_data=(val_pad_trunc_seq, val_labels))
AttributeError: 'NoneType' object has no attribute 'fit'
Please help?
Upvotes: 0
Views: 1167
Reputation: 114
I haven't used it before, but refer to https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalAveragePooling1D. GlobalAveragePooling1D averages over the entire steps axis and drops that axis, so a 3-D input of shape (batch, steps, features) becomes a 2-D output of shape (batch, features). If the output shape after Conv1D is (None, 11, 64), then the output of GlobalAveragePooling1D is (None, 64). That is 2-D, not the 3-D input that LSTM expects, so the first attempt results in an error.
Things are different for AveragePooling1D. It does local average pooling over windows along the steps axis, so the output has the same number of dimensions as the input; only the steps axis gets shorter. See https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling1D
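The shape arithmetic for the two pooling layers can be sketched in plain Python. The helper functions below are hypothetical (they are not part of TensorFlow); they only mirror the shape rules for the default 'valid' padding, with the pooling stride defaulting to pool_size, and they use the layer sizes from the question.

```python
# Hypothetical helpers that mimic Keras output-shape rules for 'valid' padding.
# They are illustrative only and are not TensorFlow functions.

def conv1d_shape(shape, filters, kernel_size):
    """Conv1D with stride 1: steps shrink by kernel_size - 1."""
    batch, steps, _ = shape
    return (batch, steps - kernel_size + 1, filters)


def average_pooling1d_shape(shape, pool_size):
    """AveragePooling1D: stride defaults to pool_size, the steps axis is kept."""
    batch, steps, features = shape
    return (batch, (steps - pool_size) // pool_size + 1, features)


def global_average_pooling1d_shape(shape):
    """GlobalAveragePooling1D: the steps axis is averaged away entirely."""
    batch, _, features = shape
    return (batch, features)


embedded = (None, 16, 100)  # (batch, MAXLEN, EMBEDDING_DIM)
conv = conv1d_shape(embedded, filters=64, kernel_size=6)
global_pooled = global_average_pooling1d_shape(conv)
local_pooled = average_pooling1d_shape(conv, pool_size=4)

print(conv)           # (None, 11, 64)
print(global_pooled)  # (None, 64)    -> 2-D, rejected by LSTM
print(local_pooled)   # (None, 2, 64) -> 3-D, accepted by LSTM
```

These values match the summary in the question: the Conv1D output is (None, 11, 64) and the AveragePooling1D output is (None, 2, 64).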
For the second question, refer to https://www.tensorflow.org/api_docs/python/tf/keras/Model#summary. model.summary() just prints a string summary of the network and returns None. Because create_model ends with return model.summary(), the variable model outside the function is None, which is why calling fit on it fails. You should return the model itself with return model, because the Model class is what has the fit method.
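Putting both fixes together, a minimal sketch of create_model could look like the one below. It keeps AveragePooling1D(pool_size=4) so that LSTM still receives a 3-D tensor, and it returns the model rather than the result of summary(). Two things here are my own substitutions rather than part of the question: the explicit tf.keras.Input layer and the Constant initializer stand in for input_length and weights=[...], on the assumption that this form is accepted by more Keras versions. The sizes in the demo call are placeholders, not the real vocabulary.

```python
import numpy as np
import tensorflow as tf


def create_model(vocab_size, embedding_dim, maxlen, embeddings_matrix):
    model = tf.keras.Sequential([
        # Explicit input shape (assumed replacement for input_length=maxlen)
        tf.keras.Input(shape=(maxlen,)),
        # Frozen pre-trained embeddings (assumed replacement for weights=[...])
        tf.keras.layers.Embedding(
            vocab_size + 1,
            embedding_dim,
            embeddings_initializer=tf.keras.initializers.Constant(embeddings_matrix),
            trainable=False,
        ),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Conv1D(64, 6, activation='relu'),
        # Local pooling keeps the steps axis, so the LSTM gets 3-D input
        tf.keras.layers.AveragePooling1D(pool_size=4),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(8, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid'),
    ])
    model.compile(loss='binary_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    # Return the model itself; summary() only prints and returns None
    return model


# Placeholder sizes and an all-zero matrix, only to show the call pattern
demo_model = create_model(vocab_size=50, embedding_dim=8, maxlen=16,
                          embeddings_matrix=np.zeros((51, 8)))
demo_model.summary()
```

With the real data, the call from the question stays the same, and model.fit(...) then works because model is a Model instance. Another option is to drop the pooling layer before the LSTM entirely and pass the Conv1D output straight in.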
Upvotes: 1