Reputation: 22342
I've been doing some reading on time complexity, and I've got the basics covered. To reinforce the concept, I took a look at an answer I recently gave here on SO. The question is now closed, for reason, but that's not the point. I can't figure out what the complexity of my answer is, and the question was closed before I could get any useful feedback.
The task was to find the first unique character in a string. My answer was something simple:
public String firstLonelyChar(String input)
{
while(input.length() > 0)
{
int curLength = input.length();
String first = String.valueOf(input.charAt(0));
input = input.replaceAll(first, "");
if(input.length() == curLength - 1)
return first;
}
return null;
}
My first thought was that since it looks at each character, then looks at each again during replaceAll()
, it would be O(n^2).
However, it got me to thinking about the way it actually works. For each character examined, it then removes all instances of that character in the string. So n
is constantly shrinking. How does that factor into it? Does that make it O(log n), or is there something I'm not seeing?
What I'm asking:
What is the time complexity of the algorithm as written, and why?
What I'm not asking:
I'm not looking for suggestions to improve it. I get that there are probably better ways to do this. I'm trying to understand the concept of time complexity better, not find the best solution.
Upvotes: 3
Views: 496
Reputation: 71009
The worst time complexity you will have is for the string aabb...
and so on each character repeated exactly twice. Now this depends on the size of your alphabet let's say that is S
. Let's also annotate the length of your initial string with L
. So for each letter you will have to iterate over the whole String. However first time you do that the String will be of size L
, second time L-2
and so on. Overall you will have to perform in the order of L + (L-2) + ... + (L- S*2)
operations and that is L*S - 2*S*(S+1)
, assuming L is more than 2*S
.
By the way if the size of your alphabet is constant, and I suppose it is, the complexity of your code is O(L)
(though with a big constant).
Upvotes: 4
Reputation: 20136
Let m
be the size of your alphabet, and let n
be the length of your string. The worse case would be to uniformly distribute your string's characters between the alphabet letters, meaning you'll have n / m
characters for each letter in your alphabet, let's mark this quantity with q
. For example, the string aabbccddeeffgghh
is the uniformly distribution of 16
characters between the letters a-h
, so here n=16
and m=8
and you have q=2
characters for each letter.
Now, your algorithm is actually going over the letters of the alphabet (it just uses the order which they appear in the string), and for each iteration it has to go over the length of the string (n
) and shrink it by q
(n -= q
). So over all the number of operation you do in the worst case are:
s = n + n-(1*q) + ... + n-((m-1)*q)
You can see that s
is the sum of the first m
elements of the arithmetic series:
s = (n + n-((m-1)*q) * m / 2 =
(n + n-((m-1)*(n/m)) * m / 2 ~ n * m / 2
Upvotes: 1
Reputation: 805
Let's mark:
m = number of unique letters in the word
n = input length
This is the complexity calculation:
The main loop goes at most m times, because there are m different letters,
the .Replaceall checks at most O(n) comparisons in each cycle.
the total is: O(m*n)
an example for O(m*n) cycle is: input = aabbccdd,
m=4, n=8
the algorithm stages:
1. input = aabbccdd, complex - 8
2. input = bbccdd, complex = 6
3. input = ccdd, complex = 4
4. input = dd, complex = 2
total = 8+6+4+2 = 20 = O(m*n)
Upvotes: 1
Reputation: 86154
The worst case is O(n^2)
where n is the length of the input string. Imagine the case where every character is doubled except the last one, like "aabbccddeeffg". Then there are n/2 loop iterations, and each call to replaceAll
has to scan the entire remaining string, which is also proportional to n
.
Edit: As Ivaylo points out, if the size of your alphabet is constant, it's technically O(n)
since you never consider any character more than once.
Upvotes: 1