124697

Reputation: 21893

Compare Lists of Pairs to find similars

Movie1{{'hello',5},{'foo',3}}
Movie2{{'hi',2},{'foo',2}}

I am testing with 2 movies, each with around 20 unique words grouped into pairs of word and frequency.

public ArrayList<Pair<String, Integer>> getWordsAndFrequency() {

        String[] keys = description.split(" ");
        String[] uniqueKeys;
        int count = 0;
        uniqueKeys = getUniqueKeys(keys);

        for (String key : uniqueKeys) {
            if (null == key) {
                break;
            }

            for (String s : keys) {
                if (key.equals(s)) {
                    count++;
                }
            }
            words.add(Pair.of(key, count));
            count = 0;
        }
        sortWords(words);

        return words;
    }

Upvotes: 2

Views: 87

Answers (2)

nitegazer2003

Reputation: 1193

The bug is that your getWordsAndFrequency() method adds more entries to words every time it is called, so the word list gets longer and longer. To fix this, calculate the words and frequencies once, add those Pairs to the list, and have getWordsAndFrequency() simply return the list rather than recalculating it on every call.
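A minimal sketch of the "calculate once, then just return the list" idea. The Movie class below is a hypothetical stand-in, not your actual class: it uses Map.Entry in place of your Pair type so that it compiles with the JDK alone, it keeps words in first-seen order, and it leaves out your sortWords step.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the asker's class; only the idea of a
// description string split on spaces is taken from the question.
class Movie {
    // Built exactly once in the constructor, so calling the getter
    // repeatedly can never make the list grow.
    private final List<Map.Entry<String, Integer>> words;

    Movie(String description) {
        this.words = Collections.unmodifiableList(countWords(description));
    }

    // Counts each distinct word in a single pass over the text.
    private static List<Map.Entry<String, Integer>> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.split(" ")) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            // Copy into immutable entries so callers cannot change the counts.
            pairs.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return pairs;
    }

    // Returns the precomputed list; no counting happens here.
    List<Map.Entry<String, Integer>> getWordsAndFrequency() {
        return words;
    }
}
```

With this shape, calling getWordsAndFrequency() any number of times returns the same list of the same size.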

Upvotes: 1

nullPointer

Reputation: 133

Can you put the data (currently stored in an ArrayList of pairs) in a HashMap? You can then compute the intersection of the keyword sets of the two movies and add their scores.

For example:

// Assumes getWordsAndFrequency() is changed to return a Map<String, Integer>
Map<String, Integer> keyWordsMovie1 = movie1.getWordsAndFrequency();
Map<String, Integer> keyWordsMovie2 = movie2.getWordsAndFrequency();
// Start with a copy of all keywords in movie1...
Set<String> intersection = new HashSet<String>(keyWordsMovie1.keySet());
// ...and keep only those that also appear in movie2
intersection.retainAll(keyWordsMovie2.keySet());

for (String keyWord : intersection) {
    int freq1 = keyWordsMovie1.get(keyWord);
    int freq2 = keyWordsMovie2.get(keyWord);
    // you now have the frequencies of the keyword in both movies
}
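For reference, a self-contained sketch of the same idea that runs on its own. KeywordOverlap, commonKeywords and combinedScore are hypothetical names, and summing both frequencies is only one possible reading of "add their scores"; swap in whatever similarity measure you need.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of the question's code.
class KeywordOverlap {

    // Keywords that appear in both maps.
    static Set<String> commonKeywords(Map<String, Integer> a, Map<String, Integer> b) {
        // Copy first: retainAll on a.keySet() itself would remove entries from a.
        Set<String> common = new HashSet<>(a.keySet());
        common.retainAll(b.keySet());
        return common;
    }

    // Assumed scoring rule: sum of both frequencies over the shared keywords.
    static int combinedScore(Map<String, Integer> a, Map<String, Integer> b) {
        int score = 0;
        for (String keyword : commonKeywords(a, b)) {
            score += a.get(keyword) + b.get(keyword);
        }
        return score;
    }
}
```

With the two movies from the question ({hello=5, foo=3} and {hi=2, foo=2}), the only common keyword is foo, and this scoring rule gives 3 + 2 = 5.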

Upvotes: 0
