Shokouh Dareshiri

Reputation: 936

Cosine Similarity in Java

I want to calculate the cosine similarity between the rows of a matrix such as D, but the results are not correct. What is wrong with my code? To calculate the similarity between the rows of matrix U, I did the following. As the results show, every similarity comes out as exactly 1.0 or -1.0, which I think is wrong.

    {

    public void run(String[] args) throws Exception {

        Matrix A = new Matrix(array);

        for (int i = 0; i < A.getRowDimension(); i++)
            System.out.println("similar is : " + cosineSimilarity(i, A));

    }


    private ArrayList cosineSimilarity(int rowIndex, Matrix D) {

        double dotProduct = 0.0, firstNorm = 0.0, secondNorm = 0.0;
        double cosinSimilarity;
        ArrayList<Double> similarRows = new ArrayList<>();

        for (int row = 0; row < D.getRowDimension(); row++) {
            for (int column = 0; column < D.getColumnDimension(); column++) {
                dotProduct = + (D.get(rowIndex, column) * D.get(row, column));
                firstNorm =  + pow(D.get(rowIndex, column), 2);
                secondNorm = + pow(D.get(row, column), 2);
                // Matrix f = D.getMatrix(row, column);
            }
            cosinSimilarity = (dotProduct / (sqrt(firstNorm) * sqrt(secondNorm)));
            similarRows.add(row, cosinSimilarity);
        }
        return similarRows;
    }

    }

The results are :

A is :    
 0.067174 -0.862994 -0.435024 0.123151 -0.214891 0.011754
 0.502582 -0.205973 0.093513 0.031561 0.821020 0.145506
 0.406919 -0.032555 0.413105 0.623333 -0.246395 -0.462002
 0.394209 0.218539 -0.497640 -0.386091 -0.002859 -0.632551
 0.571882 0.300883 -0.279673 0.132980 -0.354327 0.600810
 0.308004 -0.271047 0.552712 -0.654632 -0.305748 0.064427

similar is : [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
similar is : [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
similar is : [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
similar is : [-1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
similar is : [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
similar is : [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]

Upvotes: 0

Views: 2521

Answers (1)

laune

Reputation: 31290

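You want to compute the similarity between the given row and every row in the matrix. Hence the inner product and both norms must be computed from scratch getRowDimension() times, once per row.

But the initializations are in the wrong place: move them inside the loop over all rows, so the sums are reset for each row.

And you want to use += and not = +. Writing x = + y assigns +y (unary plus) to x instead of adding to it, so each variable ends up holding only the last column's term. The quotient then collapses to the sign of the product of the two last-column entries, which is why every result is 1.0 or -1.0.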
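
To make the difference between = + and += concrete, here is a minimal standalone sketch. The class and method names (PlusAssignDemo, lastOnly, sum, lastTermCosine) are made up for this illustration and are not part of the question's code. The last method mimics what the buggy loop effectively computes once only the final column's values survive.

```java
public class PlusAssignDemo {

    // Buggy pattern: "total = +v" is a plain assignment of (unary plus) v,
    // so after the loop total holds only the last element.
    static double lastOnly(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total = +v;
        }
        return total;
    }

    // Intended pattern: "total += v" accumulates every element.
    static double sum(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    // What the cosine formula reduces to when only one term per sum is kept:
    // (a * b) / (|a| * |b|), i.e. the sign of a * b for non-zero a and b.
    static double lastTermCosine(double a, double b) {
        return (a * b) / (Math.sqrt(a * a) * Math.sqrt(b * b));
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0};
        System.out.println(lastOnly(values)); // prints 3.0
        System.out.println(sum(values));      // prints 6.0

        // Last-column entries of the first and third rows from the question.
        System.out.println(lastTermCosine(0.011754, -0.462002));
    }
}
```

The signs of the last column in the question's matrix are +, +, -, -, +, +, which matches the pattern of 1.0 and -1.0 in the printed results.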

    private ArrayList<Double> cosineSimilarity(int rowIndex, Matrix D) {
        ArrayList<Double> similarRows = new ArrayList<>();

        for (int row = 0; row < D.getRowDimension(); row++) {
            // Reset the accumulators for every row that is compared.
            double dotProduct = 0.0, firstNorm = 0.0, secondNorm = 0.0;
            for (int column = 0; column < D.getColumnDimension(); column++) {
                dotProduct += D.get(rowIndex, column) * D.get(row, column);
                firstNorm += pow(D.get(rowIndex, column), 2);
                secondNorm += pow(D.get(row, column), 2);
            }
            double cosinSimilarity = dotProduct / (sqrt(firstNorm) * sqrt(secondNorm));
            similarRows.add(row, cosinSimilarity);
        }
        return similarRows;
    }
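
If you want to try the same logic without the Matrix class, the following is a rough self-contained sketch that works on a plain double[][]. The class name RowCosineSimilarity and the small sample array are assumptions made for this example. It also assumes that no row is all zeros, since a zero norm would make the division produce NaN.

```java
import java.util.ArrayList;
import java.util.List;

public class RowCosineSimilarity {

    // Cosine similarity between row rowIndex and every row of d (including itself).
    static List<Double> cosineSimilarity(int rowIndex, double[][] d) {
        List<Double> similarRows = new ArrayList<>();
        for (int row = 0; row < d.length; row++) {
            // Accumulators are reset for each compared row.
            double dotProduct = 0.0, firstNorm = 0.0, secondNorm = 0.0;
            for (int column = 0; column < d[row].length; column++) {
                dotProduct += d[rowIndex][column] * d[row][column];
                firstNorm += d[rowIndex][column] * d[rowIndex][column];
                secondNorm += d[row][column] * d[row][column];
            }
            similarRows.add(dotProduct / (Math.sqrt(firstNorm) * Math.sqrt(secondNorm)));
        }
        return similarRows;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: row 0 is orthogonal to row 1
        // and at 45 degrees to row 2.
        double[][] d = {
            {1.0, 0.0},
            {0.0, 2.0},
            {3.0, 3.0}
        };
        for (int i = 0; i < d.length; i++) {
            System.out.println("similar is : " + cosineSimilarity(i, d));
        }
    }
}
```

With this sample, the similarity of row 0 to itself is 1.0, to row 1 it is 0.0, and to row 2 it is about 0.7071, so the values are no longer restricted to 1.0 and -1.0.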

Upvotes: 4
