Reputation: 740
I ran some tests to determine whether == for strings is O(len(string)) or O(1).
My tests:
import timeit
x = 'ab' * 500000000
y = 'ab' * 500000000
%timeit x == y
> 163 ms ± 4.62 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
x = 'ab' * 5000
y = 'ab' * 5000
%timeit x == y
> 630 ns ± 23.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Looking at the above results, I understand that string comparison is linear, O(N), and not O(1).
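For completeness, here is a rough sketch that sweeps a few sizes (the sizes and repeat count are just my own picks); for equal strings the cost should grow roughly linearly:
import timeit

for n in (10_000, 100_000, 1_000_000, 10_000_000):
    x = 'ab' * n
    y = 'ab' * n                      # equal to x, but a distinct object
    t = timeit.timeit('x == y', globals={'x': x, 'y': y}, number=100)
    print(f'{2 * n:>10,} chars: {t / 100 * 1e6:10.2f} µs per comparison')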
However, I was reading this document: Complexities of Python Operations
The part:
Finally, when comparing two lists for equality, the complexity class above shows as O(N), but in reality we would need to multiply this complexity class by O==(...) where O==(...) is the complexity class for checking whether two values in the list are ==. If they are ints, O==(...) would be O(1); if they are strings, O==(...) in the worst case it would be O(len(string)). This issue applies any time an == check is done. We mostly will assume == checking on values in lists is O(1): e.g., checking ints and small/fixed-length strings.
This says the worst case for strings would be O(len(string)). My question is: why worst case? Shouldn't the best/average case be O(len(string))?
Upvotes: 3
Views: 5165
Reputation: 42786
The algorithm is simple: you compare the strings character by character (see the sketch after this list), so:
Hello == Hello => They are equal, so this is actually the worst case, because you have to check every character of both strings.
Hello != Hella => Still the worst case: you only discover they differ at the very last character.
hello != Hello => Best case: the first characters (h != H) already differ, so you can stop comparing right there.
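As a rough illustration, here is a minimal pure-Python sketch of that character-by-character comparison (the name naive_str_eq is mine; CPython's real implementation is in C and, among other shortcuts, can reject strings of different lengths immediately):
def naive_str_eq(a: str, b: str) -> bool:
    # Strings of different lengths can never be equal: O(1) rejection.
    if len(a) != len(b):
        return False
    # Otherwise compare character by character, stopping at the first mismatch.
    for ca, cb in zip(a, b):
        if ca != cb:
            return False   # best case: mismatch found early
    return True            # worst case: every character inspected

print(naive_str_eq('Hello', 'Hello'))  # True  -> worst case, all chars checked
print(naive_str_eq('Hello', 'Hella'))  # False -> worst case, differ at the last char
print(naive_str_eq('hello', 'Hello'))  # False -> best case, differ at the first char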
Upvotes: 4