Reputation: 317
I have a word co-occurrence matrix, like the one below. I'd like to use MDS to reduce its dimensionality and plot it. In sklearn the model is created with
model = MDS(n_components=2, dissimilarity='precomputed', random_state=1)
and applied with
output = model.fit_transform(input)
My understanding is that the input should be a dissimilarity matrix rather than the similarity matrix I have. Is that correct? Is there a function I could use to convert this co-occurrence matrix into a dissimilarity matrix? I'm quite new to this. Many thanks for your help.
Co-occurrence matrix:
       word1  word2  word3  ...
word1      0      1      3
word2      1      0      5
word3      3      5      1
...
Upvotes: 3
Views: 564
Reputation: 93
It might be too late, but I have an answer to propose.
I used a similarity matrix (with 1s all along the diagonal, which is not your case) and found a simple formula to transform it into a dissimilarity matrix: (1 - cell). However, my supervisor found another formula (I can't find the reference again) which seems to handle a diagonal with different values. I put some details in this thread, but my AWK program can't be applied to your data, as I simplified the formula for my case where the diagonal contains only 1s.
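As a rough illustration of the simple (1 - cell) conversion, here is a minimal sketch in Python. It assumes the co-occurrence counts are first scaled into [0, 1] by dividing by the largest value; that scaling step, the forced zero diagonal, and the helper name similarity_to_dissimilarity are my own assumptions for the example, not part of the formula above.

```python
import numpy as np

def similarity_to_dissimilarity(sim):
    """Hypothetical helper: scale similarities to [0, 1], then apply 1 - cell."""
    sim = np.asarray(sim, dtype=float)
    scaled = sim / sim.max()       # assumption: max-scaling so values lie in [0, 1]
    dis = 1.0 - scaled             # the simple (1 - cell) formula
    np.fill_diagonal(dis, 0.0)     # assumption: each word is at distance 0 from itself
    return dis

# The 3x3 co-occurrence matrix from the question.
cooc = np.array([[0, 1, 3],
                 [1, 0, 5],
                 [3, 5, 1]])

dis = similarity_to_dissimilarity(cooc)
print(dis)
```

Other scalings are possible (row normalisation, for instance), and which one is appropriate depends on what the counts mean in your data.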
The formula that could work for you is:
In my case, where the diagonal contains only 1s, I simplified it to:
I hope this helps! :) But maybe I'm wrong. If that's the case, I'd be interested to know the details.
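For completeness, here is a small sketch of passing a dissimilarity matrix to the MDS call from the question. The matrix values are made up purely for illustration, and I'm assuming the matrix is symmetric with zeros on the diagonal, which is what dissimilarity='precomputed' expects.

```python
import numpy as np
from sklearn.manifold import MDS

# Made-up symmetric dissimilarity matrix with a zero diagonal (illustration only).
dis = np.array([[0.0, 0.8, 0.4],
                [0.8, 0.0, 0.2],
                [0.4, 0.2, 0.0]])

# Same call as in the question; 'precomputed' means dis is used as-is.
model = MDS(n_components=2, dissimilarity='precomputed', random_state=1)
coords = model.fit_transform(dis)   # one 2-D point per word

print(coords.shape)
```

Each row of coords can then be plotted as a point and labelled with the corresponding word.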
Upvotes: 0