mabergerx

Reputation: 1213

Semantically weighted mean of word embeddings

Given a list of word embedding vectors, I'm trying to calculate an average word embedding in which some words carry more meaning than others. In other words, I want to calculate a semantically weighted word embedding.

Everything I have found so far covers either the plain mean vector (which is trivial to compute), representing the average meaning of the list, or some kind of weighted average of words for document representation. Neither is what I want.

For example, given word vectors for ['sunglasses', 'jeans', 'hats'], I would like to compute a vector that represents the semantics of those words, but with 'sunglasses' having a bigger semantic impact. So, when comparing similarity, the word 'glasses' should be more similar to the list than 'pants' is.

I hope the question is clear and thank you very much in advance!

Upvotes: 2

Views: 1360

Answers (1)

Poorna Prudhvi

Reputation: 731

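Word vectors can be averaged in two ways:

  1. A plain mean of the word vectors, without tf-idf weights.

  2. A weighted mean, where each word vector is multiplied by its tf-idf weight.

The second option addresses your problem of word importance: words with larger weights pull the resulting vector toward their own meaning.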
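
As a rough illustration of the weighted option, here is a minimal sketch using NumPy. The 3-dimensional vectors and the weights `[3.0, 1.0, 1.0]` are made up for the example; in practice the vectors would come from your embedding model and the weights from tf-idf scores or any other importance measure you choose. The helper names `weighted_mean` and `cosine` are hypothetical.

```python
import numpy as np

# Hypothetical toy embeddings, invented for illustration only.
# Real vectors would come from an embedding model such as word2vec or GloVe.
embeddings = {
    "sunglasses": np.array([0.9, 0.1, 0.0]),
    "jeans":      np.array([0.1, 0.9, 0.1]),
    "hats":       np.array([0.2, 0.2, 0.9]),
    "glasses":    np.array([0.8, 0.2, 0.1]),
    "pants":      np.array([0.1, 0.8, 0.2]),
}


def weighted_mean(words, weights):
    """Average the vectors of `words`, scaling each by its weight."""
    vectors = np.stack([embeddings[w] for w in words])
    # np.average divides by the sum of the weights, so they need not sum to 1.
    return np.average(vectors, axis=0, weights=weights)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


words = ["sunglasses", "jeans", "hats"]

# Equal weights reproduce the plain mean.
plain = weighted_mean(words, [1.0, 1.0, 1.0])

# Assumed weights: 'sunglasses' counts three times as much as the others.
boosted = weighted_mean(words, [3.0, 1.0, 1.0])

for label, vec in [("plain", plain), ("boosted", boosted)]:
    sim_glasses = cosine(vec, embeddings["glasses"])
    sim_pants = cosine(vec, embeddings["pants"])
    print(f"{label}: glasses={sim_glasses:.3f} pants={sim_pants:.3f}")
```

With these toy numbers, raising the weight of 'sunglasses' moves the list vector closer to 'glasses' and further from 'pants'. How large the weight should be is a tuning choice; tf-idf is one way to derive it, but hand-picked weights work the same way.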

Upvotes: 1
