Geobits
Geobits

Reputation: 22342

Time complexity of this algorithm

I've been doing some reading on time complexity, and I've got the basics covered. To reinforce the concept, I took a look at an answer I recently gave here on SO. The question is now closed, for reason, but that's not the point. I can't figure out what the complexity of my answer is, and the question was closed before I could get any useful feedback.

The task was to find the first unique character in a string. My answer was something simple:

public String firstLonelyChar(String input)
{
    while(input.length() > 0)
    {
        int curLength = input.length();
        String first = String.valueOf(input.charAt(0));
        input = input.replaceAll(first, "");
        if(input.length() == curLength - 1)
            return first;
    }
    return null;
}

Link to an ideone example

My first thought was that since it looks at each character, then looks at each again during replaceAll(), it would be O(n^2).

However, it got me to thinking about the way it actually works. For each character examined, it then removes all instances of that character in the string. So n is constantly shrinking. How does that factor into it? Does that make it O(log n), or is there something I'm not seeing?

What I'm asking:

What is the time complexity of the algorithm as written, and why?

What I'm not asking:

I'm not looking for suggestions to improve it. I get that there are probably better ways to do this. I'm trying to understand the concept of time complexity better, not find the best solution.

Upvotes: 3

Views: 496

Answers (4)

Ivaylo Strandjev
Ivaylo Strandjev

Reputation: 71009

The worst time complexity you will have is for the string aabb... and so on each character repeated exactly twice. Now this depends on the size of your alphabet let's say that is S. Let's also annotate the length of your initial string with L. So for each letter you will have to iterate over the whole String. However first time you do that the String will be of size L, second time L-2 and so on. Overall you will have to perform in the order of L + (L-2) + ... + (L- S*2) operations and that is L*S - 2*S*(S+1), assuming L is more than 2*S.

By the way if the size of your alphabet is constant, and I suppose it is, the complexity of your code is O(L)(though with a big constant).

Upvotes: 4

zenpoy
zenpoy

Reputation: 20136

Let m be the size of your alphabet, and let n be the length of your string. The worse case would be to uniformly distribute your string's characters between the alphabet letters, meaning you'll have n / m characters for each letter in your alphabet, let's mark this quantity with q. For example, the string aabbccddeeffgghh is the uniformly distribution of 16 characters between the letters a-h, so here n=16 and m=8 and you have q=2 characters for each letter.

Now, your algorithm is actually going over the letters of the alphabet (it just uses the order which they appear in the string), and for each iteration it has to go over the length of the string (n) and shrink it by q (n -= q). So over all the number of operation you do in the worst case are:

s = n + n-(1*q) + ... + n-((m-1)*q)

You can see that s is the sum of the first m elements of the arithmetic series:

s = (n + n-((m-1)*q) * m / 2 =  
    (n + n-((m-1)*(n/m)) * m / 2 ~ n * m / 2

Upvotes: 1

Wasafa1
Wasafa1

Reputation: 805

Let's mark:

m = number of unique letters in the word
n = input length

This is the complexity calculation:
The main loop goes at most m times, because there are m different letters,
the .Replaceall checks at most O(n) comparisons in each cycle.

the total is: O(m*n)

an example for O(m*n) cycle is: input = aabbccdd,

m=4, n=8

the algorithm stages:

1. input = aabbccdd, complex - 8
2. input = bbccdd, complex = 6
3. input = ccdd, complex = 4
4. input = dd, complex = 2

total = 8+6+4+2 = 20 = O(m*n)

Upvotes: 1

recursive
recursive

Reputation: 86154

The worst case is O(n^2) where n is the length of the input string. Imagine the case where every character is doubled except the last one, like "aabbccddeeffg". Then there are n/2 loop iterations, and each call to replaceAll has to scan the entire remaining string, which is also proportional to n.

Edit: As Ivaylo points out, if the size of your alphabet is constant, it's technically O(n) since you never consider any character more than once.

Upvotes: 1

Related Questions