Reputation:
I am studying the Boyer-Moore Algorithm (from here) and I had a quick question - what is the need of the second pass (which essentially just 'confirms' by finding the frequency of that element). Doesn't the first pass itself guarantee that the element found is the majority one? I considered a few examples and felt that a single pass is enough. Could you kindly provide some examples to counter my feeling?
The code (if required) is as below:
int majorityElement(vector<int>& nums) {
int candidate=0, count=0;
for(auto value: nums) {
//update the candidate if the count == 0
if(count==0)
candidate=value;
//if the value == candidate then increment count
if(value==candidate)
count++;
else
//decrement count
count--;
}
//return candidate
return candidate;
}
Edit: If I understand correctly, the algorithm is only applicable when the frequency of the majority element is indeed greater than (vector size())/2
. So, is the second pass really required? Whenever we code, we do some trivial sanity checks (like checking if the input vector is empty), so in this case, why do we have a 'sanity check' as a part of an algorithm? Or is there something more to it?
Upvotes: 3
Views: 715
Reputation: 1
I think a second pass is required when you have to compare your element with the upper limit if you only want to find the element which repeated more than n/2 than first pass will do the work
Upvotes: 0
Reputation: 372814
I think the following intuition for the Boyer-Moore algorithm might shed some light on why two passes are necessary.
The algorithm is based on the following idea. Imagine each element of your array is a person in a room holding a card with a number on it. Each person in the room wanders around until they meet someone else. If the two people are holding different numbers, they each sit down. Otherwise, they keep moving around until they encounter someone else. Eventually, some number of people will be left standing.
If there is a true majority element, the group of the last people standing will definitely have the majority element because no matter how people pair off there’s too many people in the majority for all of them to have been eliminated. But if there isn’t a majority, there could still be someone left standing at the end holding a non-majority element. For example, maybe they just happened to have not encountered anyone with a different value while everyone else sat down.
The reason for the second pass is to differentiate between these two cases. If there is a majority at all, it has to end up as the final candidate. If there isn’t, something might still end up as the final candidate and you need to rule that case out.
Upvotes: 3
Reputation: 349
I guess you are confused about what majority element is. The candidate only qualifies if its frequency is more than half of the total list i.e.
if frequency(majority_element) > total_size_of_list / 2: return True
First pass only gets the possible candidate for Majority Voting. Second pass confirms if it is really majority element or not.
For example:- In following list
[1, 2, 2, 3, 3, 4, 4, 5, 5, 5]
Possible candidate for majority element is 5. But Frequency of 5 in the list is 3 only which is less than half of list size and thus it is not a majority element and thus the test fails but you won't even know it if you don't make the second pass.
I hope it helps!
Upvotes: 0