I'm working on a problem from HackerRank - I have a working anagram search function, however, it is too slow when it comes to large input arrays/strings. The dictionary and query inputs are both lists of words, the function should look for the number of anagrams for each word in query and return a list of anagram counts corresponding to each word. dictionary = ["abc", "bca"] query = ["abc", "xyz"] # return [2, 0] I have tried 2 approaches to reduce the run time but with no success (the max. time allowed limit is 10 seconds for the hidden tests) - to break out of loops as soon as I know there will not be any more matches to create a sub dictionary of only the words with equal length of the search word def stringAnagram(dictionary, query): result = [] for i in range(len(dictionary)): dictionary[i] = "".join(sorted(dictionary[i])) dictionary.sort() dictionary.sort(key=len) for word in query: i = 0 sortedWord = "".join(sorted(word)) subDictionary = [entry for entry in dictionary if len(entry) == len(sortedWord)] for entry in subDictionary: if sortedWord == entry: i += 1 result.append(i) return result Can anyone point out where the bottleneck is?

Reputation: 962

How to optimize my anagram search function?

I'm working on a problem from HackerRank - I have a working anagram search function, however, it is too slow when it comes to large input arrays/strings.

The dictionary and query inputs are both lists of words, the function should look for the number of anagrams for each word in query and return a list of anagram counts corresponding to each word.

dictionary = ["abc", "bca"]
query = ["abc", "xyz"]

# return [2, 0]

I have tried 2 approaches to reduce the run time but with no success (the max. time allowed limit is 10 seconds for the hidden tests) -

to break out of loops as soon as I know there will not be any more matches
to create a sub dictionary of only the words with equal length of the search word

def stringAnagram(dictionary, query):
    result = []
    for i in range(len(dictionary)):
            dictionary[i] = "".join(sorted(dictionary[i]))
    dictionary.sort()
    dictionary.sort(key=len)
    
    for word in query:
        i = 0
        sortedWord = "".join(sorted(word))
        subDictionary = [entry for entry in dictionary if len(entry) == len(sortedWord)]
        
        for entry in subDictionary:
            if sortedWord == entry:
                i += 1
        result.append(i)
        
    return result

Can anyone point out where the bottleneck is?

Upvotes: 0

Answers (2)

vcosk

Reputation: 2934

def stringAnagram(dictionary, query):
    sorted_dictionary = {''.join(sorted(word)) for word in words}
    result = [word for word in query if ''.join(sorted(word)) in sorted_dictionary]
    return result

Upvotes: 0

Uttam Rabari

Reputation: 26

Instead of creating permutations of a string of particular length and comparing it to see if there is any match, there is a simple observation, and that is the number of any character would be the same in both strings if they are anagram.

Upvotes: 1

How to optimize my anagram search function?

Answers (2)

Related Questions