Manuelarte
Manuelarte

Reputation: 1810

Performance, big O notation, of a recursive an a non-recursive algorithm

Imagine that I have to check whether all the letters of one string is in another one. I'd like to compare two implementations, one tail-recursive and another using a hashMap. Here are the two implementations:

private boolean isPossible(final String left, final String right) {
    boolean toReturn = false;
    if (left.isEmpty()) {
        toReturn = true;
    } else {
        char charAt = left.charAt(0);
        final int index = right.indexOf(charAt);
        toReturn = index != -1 ? isPossible(left.substring(1), stringWithoutIndex(right, index)) : false;
    }
    return toReturn;
}

And the hashMap solution:

public boolean isPossible(String left, String right) {
    HashMap<Character, Integer> occurrencesMap = createOccurrenceMapFor(left);
    HashMap<Character, Integer> withTheLettersInRightRemoved = removeLettersFoundIn(right, occurrencesMap);
    return checkThatWeCanWriteTheMessage(withTheLettersInRightRemoved);
}

private HashMap<Character, Integer> removeLettersFoundIn(final String string, final HashMap<Character, Integer> occurrencesMap) {
    HashMap<Character, Integer> lettersRemoved = new HashMap<>(occurrencesMap);
    for (char c : string.toCharArray()) {
        if (lettersRemoved.containsKey(c)) 
            lettersRemoved.put(c, lettersRemoved.get(c).intValue() - 1); 
    }
    return lettersRemoved;
}

private HashMap<Character, Integer> createOccurrenceMapFor(String string) {
    HashMap<Character, Integer> occurrencesMap = new HashMap<>();
    for (char c : string.toCharArray()) {

        if (occurrencesMap.containsKey(c)) 
            occurrencesMap.put(c, occurrencesMap.get(c).intValue() + 1); 
        else 
            occurrencesMap.put(c, 1);
    }
    return occurrencesMap;
}

private boolean checkThatWeCanWriteTheMessage(HashMap<Character, Integer> occurrencesMap) {
    for (char c : occurrencesMap.keySet()){
        if (withTheLettersInMagazineRemoved.get(c) > 0) {
            return false;
        }
    }
    return true;
}

I think that both solutions have O(n) performance, since none of them has a for loop, etc on them. But, once I compare the times, I get that the hashMap solutions is much more faster than the recursive one. Of course, it makes sense, but I'd like to know why since, in theory, both have O(n). Am I right?

Upvotes: 2

Views: 919

Answers (3)

Ely
Ely

Reputation: 11152

First algorithm: O(n*m)

Second algorithm: O(n) - You'll see that you can abstract from the size of m here.

See below for feedback.


First program

The recursive function is not tail recursive. There is no tail call. But I think you can transform it into a tail recursive function by separating the return statements. E.g. (I shortened your code, but please verify though)

private boolean isPossible(final String left, final String right) {
    if (left.isEmpty()) return true;
    char firstChar = left.charAt(0);
    int index = right.indexOf(charAt);
    if (index == -1) return false;
    return isPossible(left.substring(1), stringWithoutIndex(right, index));
}

Not sure what stringWithoutIndex(right, index) exactly does, but would be good to know for the time complexity. I assume it simply returns the right string without the character at the given index, which is O(m) (m is the length of right string).

I denote the lines of code inside the function by numbers (1, 2, 3, 4 and 5). I assume that left string has length n and right string has length m.

  1. O(1)
  2. O(1)
  3. O(m), because in a worst case scenario you'd find the character at the last index.
  4. O(1)
  5. O(m), the removal of a character of the right string

Add those together and you'd get O(m) for one iteration. And since you iterate as many times as you have characters in the left string, the time complexity of that algorithm is O(m*n).

  • If m < n, then it is smaller than O(n^2).
  • If m = n, then it is O(n^2).
  • If m > n, then it is O(n * m)

Finally the complexity class of this algorithm is O(n*m).

Side note on tail recursion

It has nothing to do with time complexity, but on an implementation specific level you can have performance boosts. Theoretically speaking, this merely improves the constant of an algorithm's complexity class.

Tail recursive function (as you might maybe already know) can be optimized and yield a performance boost. You might be interested in turning it into a tail recursive function if speed is important to you. See Wikipedia for information. In general you can expect an iterative function to be faster than a recursive function. However, a recursive function with memoization can beat an iterative function. See here for example with Fibonacci numbers.

Second program

Read through and consider remark at the end of this explanation (Will discuss complexity of hashmap for analysis) and note that this is a purely theoretical, but rigorous, approach. Similar line numbering as in previous example:

isPossible(...):

  1. O(n * k') - depends on createOccurrenceMapFor(...)
  2. O(m * k + n) - depends on removeLettersFoundIn(...)
  3. O((n-m) * l) - depends on checkThatWeCanWriteTheMessage(...)

removeLettersFoundIn(...):

  1. O(n) - creation of the hash map based on left string
  2. O(m) - loop over number of characters in right string
    1. O(k) - worst case for insertion into hashmap in theory is O(k)

This function has complexity O(m*k + n) as you can see.

createOccurrenceMapFor(...):

  1. O(n) - loop over number of characters in left string
    1. O(k') - worst case for insertion into hashmap in theory is O(k')

This function has complexity O(n*k')

checkThatWeCanWriteTheMessage(...):

Not sure what withTheLettersInMagazineRemoved exactly is, but I assume it is a hashmap of size k''.

  1. O(n-m) - loop over number of leftover characters in left string
    1. O(k'') - worst case for retrieval in hashmap is O(k'')

Note this function does not make sense in the context of your algorithm. At least I can't really figure that out.

Finally, the algorithm has a complexity of the sum of the the sub functions:

O(n*k' + m*k+n + (n-m)*l)

Consequences of using the hashmap and expected worst case

Excerpt from Java documentation:

This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

Your keys are characters and thus I think it is not difficult to built a constant time hash function. And so let us assume that operations of your hashmap (put, get and containsKey) run in constant time O(1) instead of O(k), O(k') or O(k'').

Finally, the complexity class of that algorithm would be O(n + m+n + (n-m)) = O(3*n) = O(n), assuming the hash function runs in O(1), which you can assume in good conscience.

Upvotes: 1

tucuxi
tucuxi

Reputation: 17945

aString.indexOf(aChar) is implemented using a loop over the characters in the source string; it is therefore O(aString.length). And stringWithoutIndex(aString, anIndex) cannot be implemented in better than O(aString.length). In the worst case (when all the characters in left appear in right), you will be performing O(left.length * right.length) operations.

Your first code fragment is equivalent to:

private boolean isPossible(String left, String right) {
  // loop will repeat left.length times at most
  while (true) {
    if (left.isEmpty()) {
      return true;
    } else {
      char first = left.charAt(0);
      left = left.substring(1);

      // indexOf: O(right.length)
      int index = -1;
      for (int i=0; i<right.length; i++) {
        if (right.charAt(i) == first) { 
          index = i; 
          break; 
        }
      }

      if (index >= 0) {
        // stringWithoutIndex: concatenating strings is O(size of result)
        right = right.substring(0, index) + right.substring(index+1);
      } else {
        return false;
      }
    }
  }
}

I have only transformed recursion into iteration, and expanded indexOf and stringWithoutIndex - which makes complexity easier to calculate, as the loops are readily visible.

Note that mechanical conversion from recursive to iterative (or vice-versa), does not change complexity classes; although, when the code can be written as tail-recursive (as in this case isPossible), iterative code can be somewhat faster, and cannot run out of stack (since it does not use it). Because of this, many compilers convert tail recursion into iteration behind the scenes.

A similar argument can be made for inlining functions (inlined is generally faster, but will remain in the same big-o complexity class), although there is an additional size tradeoff: inlining makes your compiled program larger if the code was being used in many places.

Upvotes: 2

serhiyb
serhiyb

Reputation: 4833

First solution goes over each character in first string which is O(N) but for each character it searches for matching character in second string which gives another internal/nested O(N) and O(N^2) in total.

Second solution iterates over first string O(N), after that it iterates over second string O(N) and finally it iterates over hashmap which contains only limited range of characters (some constant). The total is O(N)+O(N)+C=O(N)

Upvotes: 3

Related Questions