Dan
Dan

Reputation: 701

What is the best complexity to check if an Anagrams word exist in a list?

Which of these two algorithms has a better complexity, if we consider O(n) and not O(1) for sort, slice, join, split? I think the second one is better in terms of performance.

And, do you know if we can write it down in a better complexity or performance.

This is the questions:

Anagrams: 
a re-arrangement of letters that spells another word, 
for example: spot, tops, stop are anagrams. 

boolean containsAnagram(List<String> dictionary, String word)

assertTrue(containsAnagram(List.of("stop", "bus"), "pots"));
assertFalse(containsAnagram(List.of("stop", "bus"), "apple"));

And, I have written my code in javascript as follow:

const containsAnagram = (dic, str) => {
      console.log(dic, str);
      for(let i=0; i< str.length; i++){ 
          const letter = str[i];
          for(let j=0; j < dic.length ; j++ ){
              const word = dic[j];
              for(let k=0; k < word.length ; k++ ){
                  if(letter === word[k]){
                      dic[j] = word.slice(0, k) + word.slice(k+1, word.length);
                      // dic[j] is empty string and it is the end of the str , return true
                      if( dic[j] === '' && str.length === i+1 ){
                          return true;
                      }
                      break;
                  }
              }
          }
      }
      return false;
    };
console.log(containsAnagram(["stop", "bus"], "pots"));
console.log(containsAnagram(["stop", "bus"], "apple"));

Which basically given O(n^3) if consider that slice or substr is o(1). While the actual case is that slice or substr has the complexity of o(n).

So, I rewrite it using the sort function, as follow:

const containsAnagram = (dic, str) => {
    console.log(dic, str);
    str = str.split('').sort().join(''); // n
    for(let j=0; j < dic.length ; j++ ){
        const word = dic[j].split('').sort().join(''); // o (n*n*n*n)
        dic[j] = word ;
        if( word === str ){
            return true;
        }
    }
    return false;
};  
console.log(containsAnagram(["s", "s1"], "sp")); // false
console.log(containsAnagram(["s", "psso1"], "spos1"));  // true

Upvotes: 1

Views: 115

Answers (1)

CertainPerformance
CertainPerformance

Reputation: 370799

Both are unnecessarily complex. .sort is O(n log n), not O(n), so while it's an improvement, it's not as good as it could be.

To reduce computational complexity and iterate over each character in the dic and str exactly once, make a function that turns a string into an object that counts up the number of occurrences of each character, then compare the object's values:

const groupStr = str => {
  const obj = {};
  for (const char of str) {
    obj[char] = (obj[char] || 0) + 1;
  }
  return obj;
};
const objsEqual = (obj1, obj2) => {
  const [keys1, keys2] = [obj1, obj2].map(Object.keys);
  if (keys1.length !== keys2.length) return false;
  return keys1.every(key1 => obj1[key1] === obj2[key1]);
};
const containsAnagram = (targets, input) => {
  const inputGrouped = groupStr(input);
  return targets.some(
    target => objsEqual(groupStr(target), inputGrouped)
  );
};

console.log(containsAnagram(["stop", "bus"], "pots"));
console.log(containsAnagram(["stop", "bus"], "apple"));

That said, in the real world, worrying about computational complexity for this sort of thing likely won't be something worth spending time on, unless you're iterating over multiple huge dictionaries or something. Computational complexity only matters with inputs large enough to strain reasonably modern computers.

Upvotes: 1

Related Questions