Dostrelith
Dostrelith

Reputation: 962

How to optimize my anagram search function?

I'm working on a problem from HackerRank - I have a working anagram search function, however, it is too slow when it comes to large input arrays/strings.

The dictionary and query inputs are both lists of words, the function should look for the number of anagrams for each word in query and return a list of anagram counts corresponding to each word.

dictionary = ["abc", "bca"]
query = ["abc", "xyz"]

# return [2, 0]

I have tried 2 approaches to reduce the run time but with no success (the max. time allowed limit is 10 seconds for the hidden tests) -

  1. to break out of loops as soon as I know there will not be any more matches
  2. to create a sub dictionary of only the words with equal length of the search word
def stringAnagram(dictionary, query):
    result = []
    for i in range(len(dictionary)):
            dictionary[i] = "".join(sorted(dictionary[i]))
    dictionary.sort()
    dictionary.sort(key=len)
    
    for word in query:
        i = 0
        sortedWord = "".join(sorted(word))
        subDictionary = [entry for entry in dictionary if len(entry) == len(sortedWord)]
        
        for entry in subDictionary:
            if sortedWord == entry:
                i += 1
        result.append(i)
        
    return result

Can anyone point out where the bottleneck is?

Upvotes: 0

Views: 1666

Answers (2)

vcosk
vcosk

Reputation: 2934

def stringAnagram(dictionary, query):
    sorted_dictionary = {''.join(sorted(word)) for word in words}
    result = [word for word in query if ''.join(sorted(word)) in sorted_dictionary]
    return result

Upvotes: 0

Uttam Rabari
Uttam Rabari

Reputation: 26

Instead of creating permutations of a string of particular length and comparing it to see if there is any match, there is a simple observation, and that is the number of any character would be the same in both strings if they are anagram.

Upvotes: 1

Related Questions