Reputation: 10566
There is a problem: "Find the element repeated more than n/2 times."
Can you please help me estimate the time complexity of a solution that uses randomness? The idea: pick a random element, scan the array to check whether it occurs more than n/2 times, and repeat until that check succeeds.
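Roughly this, as a minimal Python sketch (illustrative only; it assumes a majority element actually exists, otherwise it never terminates):

    import random

    def find_majority(a):
        # Pick a uniformly random position, then verify the candidate with
        # a full O(n) counting scan; repeat until the check succeeds.
        n = len(a)
        while True:
            candidate = a[random.randrange(n)]
            if sum(1 for x in a if x == candidate) > n // 2:
                return candidate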
What would be the worst case for this method, if I use a perfect random generator that gives uniform random numbers? Is it O(N²)?
My intuition says that on average it should give the answer in about two tries, but that is only the average case. How can I prove it? I'm not quite sure how to estimate the running time of randomized algorithms.
Upvotes: 0
Views: 822
Reputation: 241671
There is no bound for the worst case running time of this algorithm.
The "perfect" random number generator's output cannot be contingent on the previous history; otherwise it would be imperfect (real-world pseudo-rngs are imperfect in this way, so you might be able to construct a real bound for a specific PRNG).
Consequently, it could take an arbitrary number of guesses before the RNG guesses one of the majority element's positions.
If you're allowed to rearrange the array, you could swap the wrong guess to the beginning of the (still unguessed) part of the array, and then restrict subsequent guesses to as-yet-unguessed positions. That would limit the number of iterations to n/2−1, so the worst-case running time for the algorithm would be O(n²).
Although it has no impact on the big-O runtime, you can almost always stop the count scan early, either because you've already found n/2+1 elements or because there are not enough unscanned elements left to bring the count up to that threshold. Even with that optimization, the worst-case (alternating elements) time for a single scan is still n and the expected scan is still O(n).
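Here is a Python sketch of that variant, combining the swap trick with the early scan exit (the function and variable names are mine, purely illustrative, not a definitive implementation):

    import random

    def find_majority(a):
        # Rearranging variant: a wrong guess is swapped to the front of the
        # array and never guessed again, so at most ~n/2 guesses can fail.
        n = len(a)
        threshold = n // 2            # majority means count > threshold
        start = 0                     # a[:start] = positions already ruled out
        while start < n:
            i = random.randrange(start, n)   # guess an as-yet-unguessed position
            candidate = a[i]
            count = 0
            for j in range(n):
                if a[j] == candidate:
                    count += 1
                    if count > threshold:    # early exit: majority confirmed
                        return candidate
                if count + (n - 1 - j) <= threshold:
                    break                    # early exit: threshold unreachable
            a[start], a[i] = a[i], a[start]  # retire the wrong guess
            start += 1
        return None                          # no majority element exists

Note that it rearranges the array in place, which this variant explicitly assumes is allowed.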
Upvotes: 2
Reputation: 14977
If we talk about worst-case complexity, we mean a worst case for the input, i.e. an input that forces the algorithm into its worst possible running time.
This is the same for randomized algorithms: we calculate the expected complexity for a worst-case input.
In your example, a worst input would be an array of length n that contains a number a only ⌊n/2⌋+1 times (the bare minimum for a majority).
The complexity is then O(n)·E[X], where X is the number of tries it takes to randomly pick a number from the array until you pick a.
If a appears m times in the array, then E[X] = n/m (X is geometrically distributed with success probability m/n). So for our worst-case input we get E[X] = n/(⌊n/2⌋+1) < n/(n/2) = 2.
So this randomized algorithm has an expected worst-case complexity of O(n).
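If you want to convince yourself of E[X] = n/m empirically, here is a small Python simulation (the concrete n is arbitrary; w.l.o.g. it assumes a occupies the first m positions):

    import random

    def tries_until_hit(n, m):
        # Number of uniform picks from n positions until one of the
        # m positions holding the value a is chosen.
        tries = 1
        while random.randrange(n) >= m:
            tries += 1
        return tries

    n = 1001
    m = n // 2 + 1                    # worst case: a appears floor(n/2)+1 times
    trials = 100_000
    print(sum(tries_until_hit(n, m) for _ in range(trials)) / trials)
    # prints roughly n/m = 1001/501 ≈ 1.998, i.e. just under 2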
Upvotes: 0
Reputation: 2982
For randomized algorithms, the expected running time characterizes their performance better than the worst case does. For the algorithm you described, the expected running time is at most
S = n * 1/2 + 2n * 1/2^2 + 3n * 1/2^3 + ... up to infinity
where a run that needs k picks costs k*n work and happens with probability at most 1/2^k. We can solve this as follows:
S = n/2 + 2n/2^2 + 3n/2^3 + ... up to infinity
2S = n + 2n/2 + 3n/2^2 + 4n/2^3 + ... up to infinity
Subtracting the first line from the second:
S = n + n/2 + n/4 + n/8 + ... up to infinity
  = 2n
So the expected runtime is O(n).
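As a quick sanity check of that series, a couple of lines of Python (truncated at 60 terms, which is already at the limit of double precision):

    n = 1000
    # Partial sum of S = sum over k >= 1 of k*n / 2**k; converges to 2n.
    print(sum(k * n / 2**k for k in range(1, 60)))   # ~2000.0, i.e. 2n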
Upvotes: 1
Reputation: 372674
Assuming that there is actually an element that appears more than n / 2 times, the expected running time is O(n). You can think about it this way: each time you choose an element, you need to do O(n) work to check whether it's the majority element. The question then is how many elements, on expectation, you're going to have to pick.
Each time you choose an element randomly, you have at least 1/2 probability that you do pick something that's a majority element. On expectation, that means that you'll need to pick two elements before you find a majority element, so the runtime will be O(n).
If you're curious why, notice that the probability that you find what you're looking for after exactly k probes (k > 0) is at most 2^(-k), since you need to have the first k - 1 probes not succeed and then have the kth probe succeed. You can then compute the expected value as
0 * 2^(-0) + 1 * 2^(-1) + 2 * 2^(-2) + ...
= 2
(This summation is known to work out to exactly two, though proving it is a bit messy.)
In the worst case, though, every time you pick an element you'll choose something that isn't a majority element. This is extraordinarily unlikely: the probability that you don't find a majority element after k rounds is at most 2^(-k). For k = 300, this number is less than one over the number of atoms in the universe. Therefore, even though you can't bound the worst-case runtime, it's so astronomically unlikely that it's something you can safely ignore.
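To put concrete numbers on "astronomically unlikely", a tiny Python check of the 2^(-k) failure bound:

    # Upper bound on the probability of *still* not having found the
    # majority element after k random probes.
    for k in (10, 100, 300):
        print(k, 2.0 ** -k)
    # 10  -> ~9.8e-04
    # 100 -> ~7.9e-31
    # 300 -> ~4.9e-91  (atoms in the observable universe: ~1e80)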
Hope this helps!
Upvotes: 3