Anmol Kumar

Reputation: 87

InvalidArgumentError: Graph execution error in TensorFlow while creating an LSTM model

I am getting multiple errors while running my code and can't tell whether the cause is the dataset or an architecture issue.

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.layers import Dense,concatenate,Activation,Dropout,Input,LSTM,Flatten,Embedding
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras import regularizers,initializers,optimizers,Model

tokenizeressay = Tokenizer(oov_token="<OOV>")
# generate word indexes
tokenizeressay.fit_on_texts(X_train.essay)
# generate sequences
tokenisedessayTrain  = tokenizeressay.texts_to_sequences(X_train.essay)
tokenisedessayTest  = tokenizeressay.texts_to_sequences(X_test.essay)
tokenisedessayCV  = tokenizeressay.texts_to_sequences(X_cv.essay)

tokenisedTrainessay = pad_sequences(tokenisedessayTrain,maxlen=350)
tokenisedTestessay = pad_sequences(tokenisedessayTest,maxlen=350)
tokenizedCvessay = pad_sequences(tokenisedessayCV,maxlen=350)

import pickle
with open("C:\\Users\\Administrator\\Downloads\\glove_vectors", 'rb') as f:
    glove = pickle.load(f)

size_glove = 300     # glove vectors are 300 dims
size_vocab = len(list(tokenizeressay.word_counts.keys()))
word_Weights = np.zeros((size_vocab+1, size_glove))
for word, i in tokenizeressay.word_index.items():
    # words not found in the GloVe index are left all-zeros
    embedding_vector = glove.get(word)
    if embedding_vector is not None:
        word_Weights[i-1] = embedding_vector
        
print("Max Length on Essay")
print(X_train.essay.apply(lambda x : len(x.split(' '))).max())
print("Word Item Length:")
print(len(tokenizeressay.word_index.items()))
print("Max Length we are taken will be 350")

Max essay length:
339
Word index length:
49330
Max length we will take: 350

# LSTM layer: take the LSTM output and flatten it ----> according to the assignment
# I have chosen 128 units

essayInput = Input(shape=(350,),dtype='int32',name='essayInput')
embeddedEssay = Embedding(input_dim=(len(tokenizeressay.word_index.items())),output_dim=300,name='embeddedEssay',weights=[word_Weights],trainable=False)(essayInput)
essayLSTM = LSTM(units=128, return_sequences=True)(embeddedEssay)
essayOut = Flatten()(essayLSTM)

from tensorflow.keras.layers import concatenate
concat_layer = concatenate(inputs=[essayOut],name="concat_layer")
from tensorflow.keras.layers import Dropout,BatchNormalization

AfterConcatLayer = Dense(256,activation='relu',kernel_regularizer=regularizers.l2(0.001),kernel_initializer=initializers.he_normal())(concat_layer)
AfterConcatLayer = Dropout(0.5)(AfterConcatLayer)

AfterConcatLayer = Dense(128,activation='relu',kernel_regularizer=regularizers.l2(0.001),kernel_initializer=initializers.he_normal())(AfterConcatLayer)
AfterConcatLayer = Dropout(0.5)(AfterConcatLayer)
AfterConcatLayer = BatchNormalization()(AfterConcatLayer)
AfterConcatLayer = Dense(64,activation='relu',kernel_regularizer=regularizers.l2(0.001),kernel_initializer=initializers.he_normal())(AfterConcatLayer)
AfterConcatLayer = Dropout(0.5)(AfterConcatLayer)
SoftmaxOutput = Dense(2, activation = 'softmax')(AfterConcatLayer)
model1 = Model([essayInput], SoftmaxOutput)
# `auc` is the AUC metric used for monitoring; it is defined elsewhere in the notebook
model1.compile(loss='categorical_crossentropy', optimizer=optimizers.Adam(lr=0.0006,decay = 1e-4),metrics=[auc])
print(model1.summary())


checkpoint_path ="C:\\Windows_Old\\Learnings\\MachineLearning\\CheckPoints.hdf5"

cp_callback = ModelCheckpoint(filepath=checkpoint_path,save_best_only=True, save_weights_only=True, verbose=1,monitor='val_auc')

import os
from tensorflow.keras.utils import to_categorical

if os.path.isfile(checkpoint_path):
    model1.load_weights(checkpoint_path)
#essayInput,SchoolStateInput,TeacherPrefixInput,CleanCategoriesInput,CleanSubCategoriesInput,ProjectGradeInput,PriceProjectNumberInput

model1.fit([tokenisedTrainessay],to_categorical(y_train), epochs=50,verbose=2,batch_size=512,validation_split=0.3,callbacks = [cp_callback])

I am using the DonorsChoose dataset. You can download it from this Link.

The error I am getting is:

InvalidArgumentError                      Traceback (most recent call last)
Input In [23], in <cell line: 10>()
      7     model1.load_weights(checkpoint_path)
      8 #essayInput,SchoolStateInput,TeacherPrefixInput,CleanCategoriesInput,CleanSubCategoriesInput,ProjectGradeInput,PriceProjectNumberInput
---> 10 model1.fit([tokenisedTrainessay],to_categorical(y_train), epochs=50,verbose=2,batch_size=512,validation_split=0.3,callbacks = [cp_callback])

File C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\execute.py:54, in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     52 try:
     53   ctx.ensure_initialized()
---> 54   tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
     55                                       inputs, attrs, num_outputs)
     56 except core._NotOkStatusException as e:
     57   if name is not None:

Followed by the InvalidArgumentError itself:

InvalidArgumentError: Graph execution error:

Detected at node 'model/embeddedEssay/embedding_lookup' defined at (most recent call last):
    File "C:\ProgramData\Anaconda3\lib\runpy.py", line 197, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "C:\ProgramData\Anaconda3\lib\runpy.py", line 87, in _run_code
      exec(code, run_globals)
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
      app.launch_new_instance()
    File "C:\ProgramData\Anaconda3\lib\site-packages\traitlets\config\application.py", line 846, in launch_instance
      app.start()
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelapp.py", line 677, in start
      self.io_loop.start()
    File "C:\ProgramData\Anaconda3\lib\site-packages\tornado\platform\asyncio.py", line 199, in start
      self.asyncio_loop.run_forever()
    File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 601, in run_forever
      self._run_once()
    File "C:\ProgramData\Anaconda3\lib\asyncio\base_events.py", line 1905, in _run_once
      handle._run()
    File "C:\ProgramData\Anaconda3\lib\asyncio\events.py", line 80, in _run
      self._context.run(self._callback, *self._args)
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 471, in dispatch_queue
      await self.process_one()
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 460, in process_one
      await dispatch(*args)
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 367, in dispatch_shell
      await result
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\kernelbase.py", line 662, in execute_request
      reply_content = await reply_content
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\ipkernel.py", line 360, in do_execute
      res = shell.run_cell(code, store_history=store_history, silent=silent)
    File "C:\ProgramData\Anaconda3\lib\site-packages\ipykernel\zmqshell.py", line 532, in run_cell
      return super().run_cell(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2863, in run_cell
      result = self._run_cell(
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2909, in _run_cell
      return runner(coro)
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3106, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3309, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3369, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "C:\Users\Administrator\AppData\Local\Temp\2\ipykernel_4140\2952872782.py", line 10, in <cell line: 10>
      model1.fit([tokenisedTrainessay],to_categorical(y_train), epochs=50,verbose=2,batch_size=512,validation_split=0.3,callbacks = [cp_callback])
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 1606, in fit
      val_logs = self.evaluate(
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 1947, in evaluate
      tmp_logs = self.test_function(iterator)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 1727, in test_function
      return step_function(self, iterator)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 1713, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 1701, in run_step
      outputs = model.test_step(data)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 1665, in test_step
      y_pred = self(x, training=False)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\training.py", line 557, in __call__
      return super().__call__(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\functional.py", line 510, in call
      return self._run_internal_graph(inputs, training=training, mask=mask)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\functional.py", line 667, in _run_internal_graph
      outputs = node.layer(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 65, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py", line 1097, in __call__
      outputs = call_fn(inputs, *args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py", line 96, in error_handler
      return fn(*args, **kwargs)
    File "C:\ProgramData\Anaconda3\lib\site-packages\keras\layers\core\embedding.py", line 208, in call
      out = tf.nn.embedding_lookup(self.embeddings, inputs)
Node: 'model/embeddedEssay/embedding_lookup'
indices[413,323] = 49330 is not in [0, 49330)
     [[{{node model/embeddedEssay/embedding_lookup}}]] [Op:__inference_test_function_5617]

I have been stuck on this for the last three days.

I have tried increasing the dimension by 1, but that failed as well. Code snippet attached:

essayInput = Input(shape=(350,),dtype='int32',name='essayInput')
embeddedEssay = Embedding(input_dim=(len(tokenizeressay.word_index.items())+1),output_dim=300,name='embeddedEssay',weights=[word_Weights],trainable=False)(essayInput)
essayLSTM = LSTM(units=128, return_sequences=True)(embeddedEssay)
essayOut = Flatten()(essayLSTM)

The error I get when doing this:

ValueError                                Traceback (most recent call last)
Input In [25], in <cell line: 5>()
      1 #LSTM and get the LSTM output and Flatten that output.  ----> Accoridng to the assignment
      2 # I have choose 128 Units
      4 essayInput = Input(shape=(350,),dtype='int32',name='essayInput')
----> 5 embeddedEssay = Embedding(input_dim=(len(tokenizeressay.word_index.items())+1),output_dim=300,name='embeddedEssay',weights=[word_Weights],trainable=False)(essayInput)
      6 essayLSTM = LSTM(units=128, return_sequences=True)(embeddedEssay)
      7 essayOut = Flatten()(essayLSTM)

File C:\ProgramData\Anaconda3\lib\site-packages\keras\utils\traceback_utils.py:70, in filter_traceback.<locals>.error_handler(*args, **kwargs)
     67     filtered_tb = _process_traceback_frames(e.__traceback__)
     68     # To get the full stack trace, call:
     69     # `tf.debugging.disable_traceback_filtering()`
---> 70     raise e.with_traceback(filtered_tb) from None
     71 finally:
     72     del filtered_tb

File C:\ProgramData\Anaconda3\lib\site-packages\keras\engine\base_layer.py:1772, in Layer.set_weights(self, weights)
   1770 ref_shape = param.shape
   1771 if not ref_shape.is_compatible_with(weight_shape):
-> 1772     raise ValueError(
   1773         f"Layer {self.name} weight shape {ref_shape} "
   1774         "is not compatible with provided weight "
   1775         f"shape {weight_shape}."
   1776     )
   1777 weight_value_tuples.append((param, weight))
   1778 weight_index += 1

ValueError: Layer embeddedEssay weight shape (49331, 300) is not compatible with provided weight shape (49330, 300).

Upvotes: 1

Views: 1120

Answers (1)

user11530462

As stated in the error,

ValueError: Layer embeddedEssay weight shape (49331, 300) is not compatible with provided weight shape (49330, 300).

The error occurred due to a shape mismatch: Keras Tokenizer word indices start at 1 (index 0 is reserved for padding), so the largest index equals len(word_index) and the Embedding layer needs len(word_index) + 1 rows. Hence, increase the input dimension of the Embedding layer like below.

embeddedEssay = Embedding(input_dim=(len(tokenizeressay.word_index.items()))+1,output_dim=300,name='embeddedEssay',weights=[word_Weights],trainable=False)(essayInput)
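
To see why the +1 is needed: the Tokenizer assigns 1-based indices (index 0 is implicitly reserved for padding), so the largest index equals len(word_index). A standalone sketch with toy data (not the question's dataset) demonstrates this:

from tensorflow.keras.preprocessing.text import Tokenizer

t = Tokenizer(oov_token="<OOV>")
t.fit_on_texts(["the cat sat", "the dog ran"])

print(t.word_index)  # {'<OOV>': 1, 'the': 2, 'cat': 3, ...} -- indices start at 1
print(max(t.word_index.values()) == len(t.word_index))  # True: max index == vocab size
# An Embedding layer therefore needs input_dim = len(word_index) + 1 to cover
# both the padding index 0 and the largest word index.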

Also increase the vocabulary size as follows, so that word_Weights (created as np.zeros((size_vocab+1, size_glove))) ends up with the matching (49331, 300) shape:

size_vocab = len(list(tokenizeressay.word_counts.keys()))+1
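
Putting both changes together, here is a minimal sketch of the corrected embedding setup (it reuses the question's variable names, computes the vocabulary size directly from word_index, which gives the same 49331, and additionally stores each GloVe vector at row i rather than i-1 so that rows stay aligned with token indices):

import numpy as np

vocab_size = len(tokenizeressay.word_index) + 1   # +1 so padding index 0 and the max word index both fit
word_Weights = np.zeros((vocab_size, 300))        # (49331, 300) for this vocabulary
for word, i in tokenizeressay.word_index.items():
    embedding_vector = glove.get(word)            # words missing from GloVe stay all-zeros
    if embedding_vector is not None:
        word_Weights[i] = embedding_vector        # row i aligned with token index i

essayInput = Input(shape=(350,), dtype='int32', name='essayInput')
embeddedEssay = Embedding(input_dim=vocab_size, output_dim=300, name='embeddedEssay',
                          weights=[word_Weights], trainable=False)(essayInput)

# sanity check: every index the layer will see must be strictly below vocab_size
assert int(np.max(tokenisedTrainessay)) < vocab_size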

Kindly refer to this gist for the complete code and this example for more information on the error. Thank you!

Upvotes: 1
