Reputation:
Given a string, find the length of the longest substring without repeating characters.
Examples:
Given "abcabcbb", the answer is "abc", which the length is 3.
int lengthOfLongestSubstring(char* s) {
if (s == NULL) {
return 0;
}
myHash hash;
int start = 0;
int end = 0;
int maxLen = 0;
while (s[end] != '\0') {
char c = s[end];
if (hash.find(c) == hash.end()) {
hash[c] = end;
end++;
}else {
if (end - start > maxLen) {
maxLen = end - start;
}
int index = hash[c];
while(start <= index) {
hash.erase(s[start]);
start++;
}
}
}
if (end - start > maxLen) {
maxLen = end - start;
}
return maxLen;
}
Some one said this algorithm complexity is O(n) but I think it didn't consider the second loop, it's less than O(n^2) but should be bigger than O(n).
How should we analyze the worst case complexity?
Upvotes: -1
Views: 208
Reputation: 180103
Asymptotic complexity analysis is trickier than just counting loops.
Before we go any further, however, do note that if an algorithm's asymptotic complexity is O(n), then it is also O(n2), as the latter expresses a weaker bound. Presumably you're looking for the tightest bound.
So let's now look at sum of all the operations performed in all the iterations of the outer loop. On each iteration, hash.find(c) == hash.end()
is computed. We assume each such evaluation costs O(1). Then there are two alternatives:
the current character, s[end]
, is distinct from all others in the current substring, as judged by the hash lookup. In this case, the current character is added to the current substring, and end
is incremented.
the current character duplicates a previous character in the current substring. In this case, the previous appearance of the current character and each preceding character in the current substring is removed from the substring.
Each of the n
characters in the original string is added to the current string once, and removed from it at most once, and the actions involved in each individual character's addition or removal cost O(1). The remaining operations performed in the outer loop cost O(1) per iteration. Every iteration of the loop performs either one addition or at least one removal, so there cannot be more than 2*n iterations altogether. That gives the loop nest a worst-case cost of n * O(1)
[for additions] + n * O(1)
[for removals] + 2 * n * O(1)
[for other operations], for a total cost bounded by 4 * n * O(1) = O(4n) = O(n)
.
Upvotes: 1
Reputation: 726489
The complexity of this algorithm is indeed O(n).
Proof: Consider the values of start
and end
. Each iteration of the outer loop performs one of two actions:
end
by 1, orstart
in a loop one or more times.The number of times that end
is advanced is equal to n
, the length of the input. The total number of times that start
is advanced cannot exceed n
, because index
is never set to a number above n
, and start
is never decremented.
Hence, the total number of times the outer loop runs is n
, and the total number of times the inner loop runs is n
, yielding time complexity of O(n).
Note that the inner loop does not do all n
iterations at once. Its iterations are "distributed" among the iterations of the outer loop. However, if you total up the number of iterations of the inner loop, during all iterations of the outer loop, they are not going to exceed n
.
Upvotes: 1
Reputation: 845
If I'm understanding your algorithm and data structure used correctly, consider the worse-case scenario: we get a string of the same character over and over again, and have to erase every time. In that case, we've performed just about 2n
operations for the string. This is still linear growth, granted with a constant in front of it. So, for the purposes of understanding how it grows, we're still linear growth - but, as you noted, in between n
and n^2
, since we're about 2n
worst-case.
Upvotes: 0