Reputation: 31
I have a question about two t-SNE plots I made. I have a set of 850 articles and wanted to check which of them are similar to each other. I pre-processed the articles, built a tf-idf matrix for the whole set, and made two t-SNE plots from it: one with cosine distances and one with Euclidean distances.
However, the two plots look very similar; it looks as if only the axes have been swapped. Is there a logical explanation for this?
The colors are the labels an article got from a simple sentiment analysis.
Above: the cosine distances
Above: the Euclidean distances
Thanks for any help in advance!
Upvotes: 1
Views: 268
Reputation: 173
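The result suggests that, for this specific type of data, Euclidean distance and cosine distance are essentially the same distance function (up to a scaling factor). This is what you would expect if the tf-idf vectors are L2-normalized: for unit-length vectors the squared Euclidean distance equals twice the cosine distance, so both metrics rank neighbors identically. You could verify this by comparing heatmaps of the two distance matrices.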
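A numeric check can stand in for the heatmap comparison. The following is only a sketch under stated assumptions: the vectors are random non-negative stand-ins for tf-idf rows, not the 850 articles, and they are explicitly L2-normalized (the question does not say whether the real tf-idf matrix was). The helper names are made up for illustration.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """1 minus the cosine similarity of a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Ordinary straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


random.seed(0)

# Hypothetical stand-in for tf-idf rows: 20 non-negative vectors of length 50,
# each scaled to unit length (assumption, not taken from the question).
docs = [l2_normalize([random.random() for _ in range(50)]) for _ in range(20)]

# For unit vectors: ||a - b||^2 = 2 - 2 * (a . b) = 2 * cosine_distance(a, b).
# Track the largest deviation from that identity over all pairs.
max_gap = 0.0
for i in range(len(docs)):
    for j in range(i + 1, len(docs)):
        d_cos = cosine_distance(docs[i], docs[j])
        d_euc = euclidean_distance(docs[i], docs[j])
        max_gap = max(max_gap, abs(d_euc ** 2 - 2.0 * d_cos))

print("largest deviation from d_euc^2 == 2 * d_cos:", max_gap)
```

If the deviation is at floating-point noise level on your real vectors as well, the two metrics induce the same neighbor ordering, which would explain why the two t-SNE plots differ only by something like a rotation or reflection. If it is not, the vectors are probably not normalized and the similarity of the plots needs a different explanation.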
Upvotes: 1