Srikanth Pragallapati

Reputation: 58

Algorithm for word match percentage between two text files

I have two strings, each containing many words.

My task is to find the percentage of words that match between the two strings. Can someone suggest an existing algorithm that gives a precise percentage or count of matched words?

Example:

1. Mason natural fish oil 1000 mg omega-3 softgels - 200 ea
2. Mason Vitamins Omega 3 Fish Oil, 1000mg. Softgels, Bonus Size 200-Count Bottle

The **output** should be 8, the number of words matched between the two strings.

Upvotes: 1

Views: 2318

Answers (1)

Coder

Reputation: 2044

You can use the method below. I have added inline comments describing each step. Note that this example splits the words on whitespace. If you have any concerns, leave a comment.

Note that I matched the words ignoring case, because otherwise there would be no way to get 8 matching words in your example.

import java.util.HashSet;
import java.util.Set;

public static int matchStrings(String firstString, String secondString) {

    int matchingCount = 0;

    // Split the first string into words on runs of whitespace.
    String[] allWords = firstString.split("\\s+");
    Set<String> firstInputSet = new HashSet<String>();

    // Collect the unique lower-cased words; skip the empty token split() yields for leading whitespace.
    for (String word : allWords) {
        if (!word.isEmpty()) {
            firstInputSet.add(word.toLowerCase());
        }
    }

    // Lower-case the second string once instead of on every iteration.
    String secondLower = secondString.toLowerCase();

    // Count the unique words that occur in the second string (case-insensitive substring check).
    for (String word : firstInputSet) {
        if (secondLower.contains(word)) {
            matchingCount++;
        }
    }
    return matchingCount;
}
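
The method above checks for substrings, so a short token such as "mg" also matches inside "1000mg", and it returns a count rather than the percentage the question asks for. As a rough alternative, here is a minimal sketch that compares whole tokens and reports a percentage. The class name `WordOverlap`, its method names, the tokenization rules (lower-casing, splitting on anything that is not a letter or digit, and separating digits from letters) and the choice of Jaccard similarity (common words divided by all distinct words) are my own assumptions, not something the question specifies.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper class; names and tokenization rules are illustrative assumptions.
final class WordOverlap {

    // Splits text into a set of unique lower-case tokens.
    static Set<String> tokenize(String text) {
        String lower = text.toLowerCase(Locale.ROOT)
                // Insert a space at digit/letter boundaries so "1000mg" becomes "1000 mg".
                .replaceAll("(?<=\\p{N})(?=\\p{L})|(?<=\\p{L})(?=\\p{N})", " ");
        Set<String> tokens = new HashSet<>();
        // Any run of characters that are neither letters nor digits acts as a separator.
        for (String token : lower.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // Number of distinct tokens that appear in both strings.
    static int commonWords(String first, String second) {
        Set<String> common = tokenize(first);
        common.retainAll(tokenize(second));
        return common.size();
    }

    // Jaccard similarity as a percentage: |A intersect B| / |A union B| * 100.
    static double matchPercentage(String first, String second) {
        Set<String> a = tokenize(first);
        Set<String> b = tokenize(second);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            return 0.0; // Two strings without any tokens: treat as no match.
        }
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return 100.0 * common.size() / union.size();
    }
}
```

With the two example strings from the question, `WordOverlap.commonWords(...)` gives 9 rather than 8 under these rules, because "omega-3" is split into "omega" and "3" and both halves match; `WordOverlap.matchPercentage(...)` gives 56.25 (9 common tokens out of 16 distinct ones). If you want "omega-3" and "Omega 3" to count as a single word, or the percentage to be relative to the shorter string instead, the tokenizer and the denominator would need to be adjusted.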

Upvotes: 3
