Reputation: 41
I computed document similarity on my corpus using Doc2Vec, but the similarities it outputs are not very good. Could I build a topic model on top of what Doc2Vec gives me to improve the accuracy of my model and get better similarities?
Upvotes: 2
Views: 574
Reputation: 54233
If you want a topic model, you should train a new model (like LDA) on the original corpus, not on Doc2Vec's output.
If the native similarities given by the Doc2Vec process aren't very good, maybe you can improve them by tuning your process.
But if that doesn't work, then Doc2Vec hasn't distilled useful info from your data, and downstream calculations built on those (bad) raw numbers aren't likely to get magically better.
Upvotes: 2