
Reputation: 7

Issue with Word2Vec embedding matrix

I am currently working on an ML project, and I am stuck building the embedding matrix for my Word2Vec model.

The code snippet is below:

vocab_size = len(tokenizer.word_index)+1
embedding_matrix = np.zeros((vocab_size, embedding_vector_size))
# +1 is done because i starts from 1 instead of 0, and goes till len(vocab)
for word, i in tokenizer.word_index.items():
    embedding_vector = model_1.wv[word] #error
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector

The error I get is this message:

raise KeyError(f"Key '{key}' not present")
KeyError: "Key 'https' not present"

What would be a way to fix this issue?

Upvotes: 0

Views: 165

Answers (1)

wwwslinger

Reputation: 986

It looks like the loop is trying to look up the word "https" in your word vectors, where it doesn't exist. Try adding a membership check:

...
for word, i in tokenizer.word_index.items():
    if word in model_1.wv:
        embedding_matrix[i] = model_1.wv[word]
...

You may also want to handle the case where the word isn't found, so you can add an else branch (for example, leaving the zero row that np.zeros already gives you, or assigning a dedicated out-of-vocabulary vector).

Typically it's better to use the embedding lookup methods than to search for particular words manually, but I don't know your use case, so this should at least catch the error.
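To make the pattern concrete, here is a minimal, self-contained sketch of the same loop. A plain dict stands in for `model_1.wv` and toy 4-dimensional vectors stand in for real Word2Vec embeddings, so the names `wv`, `word_index`, and the values are illustrative, not from your model:

```python
import numpy as np

# Toy stand-in for model_1.wv (in real code this is gensim's KeyedVectors).
wv = {
    "hello": np.array([0.1, 0.2, 0.3, 0.4]),
    "world": np.array([0.5, 0.6, 0.7, 0.8]),
}
embedding_vector_size = 4

# Toy stand-in for tokenizer.word_index: word -> index, starting at 1.
word_index = {"hello": 1, "https": 2, "world": 3}

vocab_size = len(word_index) + 1  # +1 because indices start at 1
embedding_matrix = np.zeros((vocab_size, embedding_vector_size))

for word, i in word_index.items():
    if word in wv:  # membership check avoids the KeyError
        embedding_matrix[i] = wv[word]
    # else: out-of-vocabulary words ("https" here) keep the zero row;
    # you could instead assign a random or learned OOV vector.

print(embedding_matrix[2])  # row for "https" stays all zeros
```

Here "https" is skipped instead of raising `KeyError`, and its row in the matrix remains zeros, which is the usual convention for out-of-vocabulary tokens fed to an embedding layer.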

Upvotes: 1
