CharlieArreola
CharlieArreola

Reputation: 51

ValueError: Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 50), found shape=(None, 1, 512)

Learning to use bert-base-cased and a classification model... the code for the model is the following:

def mao_func(input_ids, masks, labels):
return {'input_ids':input_ids, 'attention_mask':masks}, labels

dataset = dataset.map(mao_func)

BATCH_SIZE = 32
dataset = dataset.shuffle(100000).batch(BATCH_SIZE)

split = .8
ds_len = len(list(dataset))

train = dataset.take(round(ds_len * split))
val = dataset.skip(round(ds_len * split))

from transformers import TFAutoModel
bert = TFAutoModel.from_pretrained('bert-base-cased')

Model: "tf_bert_model"


Layer (type) Output Shape Param #

bert (TFBertMainLayer) multiple 108310272

================================================================= Total params: 108,310,272 Trainable params: 108,310,272 Non-trainable params: 0

then the NN builduing:

input_ids = tf.keras.layers.Input(shape=(50,), name='input_ids', dtype='int32')
mask = tf.keras.layers.Input(shape=(50,), name='attention_mask', dtype='int32')

embeddings = bert(input_ids, attention_mask=mask)[0]

X = tf.keras.layers.GlobalMaxPool1D()(embeddings)
X = tf.keras.layers.BatchNormalization()(X)
X = tf.keras.layers.Dense(128, activation='relu')(X)
X = tf.keras.layers.Dropout(0.1)(X)
X = tf.keras.layers.Dense(32, activation='relu')(X)
y = tf.keras.layers.Dense(3, activation='softmax',name='outputs')(X)

model = tf.keras.Model(inputs=[input_ids, mask], outputs=y)

model.layers[2].trainable = False

the model.summary is:

 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_ids (InputLayer)         [(None, 50)]         0           []                               
                                                                                                  
 attention_mask (InputLayer)    [(None, 50)]         0           []                               
                                                                                                  
 tf_bert_model (TFBertModel)    TFBaseModelOutputWi  108310272   ['input_ids[0][0]',              
                                thPoolingAndCrossAt               'attention_mask[0][0]']         
                                tentions(last_hidde                                               
                                n_state=(None, 50,                                                
                                768),                                                             
                                 pooler_output=(Non                                               
                                e, 768),                                                          
                                 past_key_values=No                                               
                                ne, hidden_states=N                                               
                                one, attentions=Non                                               
                                e, cross_attentions                                               
                                =None)                                                            
                                                                                                  
 global_max_pooling1d (GlobalMa  (None, 768)         0           ['tf_bert_model[0][0]']          
 xPooling1D)                                                                                      
                                                                                                  
 batch_normalization (BatchNorm  (None, 768)         3072        ['global_max_pooling1d[0][0]']   
 alization)                                                                                       
                                                                                                  
 dense (Dense)                  (None, 128)          98432       ['batch_normalization[0][0]']    
                                                                                                  
 dropout_37 (Dropout)           (None, 128)          0           ['dense[0][0]']                  
                                                                                                  
 dense_1 (Dense)                (None, 32)           4128        ['dropout_37[0][0]']             
                                                                                                  
 outputs (Dense)                (None, 3)            99          ['dense_1[0][0]']                
                                                                                                  
==================================================================================================
Total params: 108,416,003
Trainable params: 104,195
Non-trainable params: 108,311,808
__________________________________________________________________________________________________

finally the model fitting is

optimizer = tf.keras.optimizers.Adam(0.01)
loss = tf.keras.losses.CategoricalCrossentropy()
acc = tf.keras.metrics.CategoricalAccuracy('accuracy')

model.compile(optimizer,loss=loss, metrics=[acc])

history = model.fit(
    train,
    validation_data = val,
     epochs=140
)

with execution error in line 7 -> the model.fit(...):

ValueError: Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 50), found shape=(None, 1, 512)

Can any one be so kind of helping me on what I did wrong and why... thanks:)

update: here is the git with the codes https://github.com/CharlieArreola/OnlinePosts

Upvotes: 4

Views: 10188

Answers (1)

Fabian
Fabian

Reputation: 802

It seems, that your shape of the train data doen't match the expected input shape of your input layer. You can check your shape of the train data with train.shape()

You input layer Input_ids = tf.keras.layers.Input(shape=(50,), name='input_ids', dtype='int32') expects train data with 50 columns, but you most likely have 512 if we look at your error. So to fix this, you could simply change your input shape.

Input_ids = tf.keras.layers.Input(shape=(512,), name='input_ids', dtype='int32')

If you split your x and y in your dataset you can make it more flexible with:

Input_ids = tf.keras.layers.Input(shape=(train_x.shape[0],), name='input_ids', dtype='int32')

Also don't forget, that you have to do this change to all of your input layers!

Upvotes: 4

Related Questions