Reputation: 7
I am currently working on an ML project and am stuck on building the embedding matrix for my Word2Vec model.
The code snippet is below:
vocab_size = len(tokenizer.word_index) + 1
embedding_matrix = np.zeros((vocab_size, embedding_vector_size))
# +1 is done because i starts from 1 instead of 0, and goes till len(vocab)
for word, i in tokenizer.word_index.items():
    embedding_vector = model_1.wv[word]  # error
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector
The error I get is this message:
    raise KeyError(f"Key '{key}' not present")
KeyError: "Key 'https' not present"
What would be a way to fix this issue?
Upvotes: 0
Views: 165
Reputation: 986
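It looks like the loop is looking up the word "https" in your word vectors, but that word is not in the model's vocabulary. Indexing model_1.wv with a missing key raises a KeyError rather than returning None, so the "is not None" test is never reached. Try adding a membership check before the lookup: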
...
for word, i in tokenizer.word_index.items():
    if word in model_1.wv:
        embedding_vector = model_1.wv[word]  # safe: the word is in the vocabulary
...
You may want to do something when the word isn't found, in which case you can add an else branch.
Typically it's better to use the embedding methods than to manually search for a particular word, but I don't know your use case so this should catch the error.
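For illustration, here is a minimal sketch of the whole loop with an else branch. The names build_embedding_matrix and missing are made up for this example, and a plain dict of NumPy arrays stands in for model_1.wv so the snippet runs on its own. It assumes the vectors object supports "in" and [] lookups the way model_1.wv does in the check above.

```python
import numpy as np


def build_embedding_matrix(word_index, vectors, vector_size):
    """Fill one row per tokenizer index; unknown words keep a zero row.

    word_index: dict mapping word -> integer index starting at 1
    vectors: any mapping that supports `word in vectors` and `vectors[word]`
             (hypothetical stand-in for model_1.wv)
    """
    # +1 because tokenizer indices start at 1, so row 0 stays unused
    matrix = np.zeros((len(word_index) + 1, vector_size))
    missing = []  # words that had no trained vector
    for word, i in word_index.items():
        if word in vectors:
            matrix[i] = vectors[word]
        else:
            # Row i is left as zeros; record the word for inspection
            missing.append(word)
    return matrix, missing


# Toy data standing in for tokenizer.word_index and model_1.wv
word_index = {"cat": 1, "dog": 2, "https": 3}
vectors = {"cat": np.array([0.1, 0.2]), "dog": np.array([0.3, 0.4])}

embedding_matrix, missing = build_embedding_matrix(word_index, vectors, 2)
print(missing)
```

With your own objects the call would look like build_embedding_matrix(tokenizer.word_index, model_1.wv, embedding_vector_size). Leaving a zero row for missing words is only one option. Checking the missing list can also show why words such as "https" are absent, for example if the tokenizer and the Word2Vec training data were preprocessed differently.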
Upvotes: 1