Zebrafish
Zebrafish

Reputation: 14434

Why must std::sort compare function return false when arguments are equal?

In std::sort you can supply a third argument which is the basis for how it sorts a list. If you want the first argument to come first, then you return true. If you want the second argument to come first you return false. I have come across the problem that my predicate function supposedly is an "invalid comparator", and I have narrowed it down to the fact that it does not fulfil the following requirement:

if arg1 == arg2, compare function MUST return false.

There have been some terms I have seen such as that std::sort requires "strict weak ordering". Apart from 2 places, all the other pages I get about these topics seem to be technical papers, and I can't understand it. What I can understand of it is that:

In weak ordering some elements "may be tied" with each other.

But to me this is also the meaning of a "partially ordered set", which is:

"there may be pairs of elements for which neither element precedes the other"

Furthermore, I can't understand what the "strict" implies in either of them.

Leaving aside my confusion about order theory terminology, my question is if in the compare function argument 1 and argument 2 are equal, and in which case I don't care which comes before the other (either one coming before would make me happy), why can't I return true to have argument 1 come first?

I was also going to ask how my program actually knows it's an invalid comparator but then I thought it probably just checks to see if arg1 and arg2 are equal when the compare function returns true.

Upvotes: 22

Views: 13325

Answers (6)

KeyC0de
KeyC0de

Reputation: 5287

A 2-way less-than comparison operator< function checks if you want to return the first number before the second number. So it should return true if yes, or false if no.

When the first and second arguments are equal then false should be returned because, again, there's no need to sort equal elements.

Upvotes: 0

Terry
Terry

Reputation: 750

1. From a code perspective

When elements count is greater than _S_threshold (in STL, default value is 16), std::sort() will use quicksort.

The following code is __unguarded_partition function in quicksort.

  /// This is a helper function...
  template<typename _RandomAccessIterator, typename _Tp, typename _Compare>
    _RandomAccessIterator
    __unguarded_partition(_RandomAccessIterator __first,
             _RandomAccessIterator __last,
             const _Tp& __pivot, _Compare __comp) 
   {
      while (true)
      {
        while (__comp(*__first, __pivot))
          ++__first;
        --__last;
        while (__comp(__pivot, *__last))
          --__last;
        if (!(__first < __last))
          return __first;
        std::iter_swap(__first, __last);
        ++__first;
      }
    }

if arg1 == arg2, compare function return true, then __first iterator will keep moving to the right, and the program will go out of range and cause coredump.

while (__comp(*__first, __pivot))
    ++__first;

so, if arg1 == arg2, compare function MUST return false.

2. From a algorithm logic perspective

If comparison function uses operator<=, then it returns true for equal values. If test to see whether 10B is equivalent to 10A, it naturally uses the comparison function. In this example, that's operator<=. Tt checks to see whether this expression is true:

!(10A<= 10B)&&!(10B<= 10A) //test 10Aand 10B for equivalence

Well, 10A and 10B are both 10, so it's clearly true that 10A <= 10B. Equally clearly, 10B <= 10A. The above expression thus simplifies to

!(true)&&!(true)

and that simplifies to

false && false

which is simply false. That is, the test concludes that 10A and 10B are not equivalent, hence not the same. Furthermore, any comparison function where equal values return true will do the same thing. Equal values are, by definition, not equivalent! Isn't that cool?

You also can see << Effective STL>>, Item21: Always have comparison functions return false for equal values.

Upvotes: 9

rcgldr
rcgldr

Reputation: 28941

For an explanation to Benjamin Lindley's answer, consider the typical case where std::sort uses quicksort with Hoare type partition scheme. The left side is scanned for values < pivot using compare(value,pivot) to do the compare, while the right side is scanned for values > pivot using compare(pivot, value). Note that quicksort partition may rely on the fact that a left or right scan is stopped when it encounters a value == pivot and doesn't continue scanning past the pivot on that scan. If the user supplied compare function returns true on such a compare (true when value == pivot), the scan could continue beyond boundaries of the array or vector being sorted, which is apparently what happened in Benjamin Lindley's test case.

Upvotes: 3

Benjamin Lindley
Benjamin Lindley

Reputation: 103751

The algorithm which std::sort uses is unspecified. By using a comparison function which does not meet the requirements set by the standard, you break the algorithm's assumptions, and cause undefined behavior.

Look what happens in the output of this noisy comparison function which uses <= (which is not a valid comparator) instead of < (which is valid).

http://coliru.stacked-crooked.com/a/34e4cef3280f3728

In the output, I am printing an incrementing variable (for reference to point out when the algorithm goes haywire), followed by the value of the first argument and its position in the vector, and then the second argument and its position in the vector. An example looks like this:

124: 2@12 <= 4@7

Which means this is the 124th invocation of the comparison function, and it is comparing the value 2 at index 12, with the value 4 at index 7.

Things go crazy starting at line 37

37: 0@0 <= 0@-1
38: 0@0 <= 144@-2
39: 0@0 <= 0@-3
40: 0@0 <= 192@-4

It is comparing values that I didn't even insert into the vector (144, 192, etc...) and at indexes outside the range of the vector (negative indexes, in this case).

Upvotes: 6

Serge
Serge

Reputation: 12384

Without going in the depth of the math, 2 variables can be compared using just '<' operator (or '>' if you wish). However '<' is used commonly in explanation of the 'strict weak ordering` and in implementation of sorters.

The idea basically is that in practical programming if a < b == false and b < a == false then a == b.

Upvotes: 1

Oktalist
Oktalist

Reputation: 14734

The compare function simply models a "less than" operator. Consider how the < operator works for primitive types like int:

int a = 1, b = 2;     a < b == true      a is less than b
int a = 2, b = 1;     a < b == false     a is not less than b, because a is greater than b
int a = 1, b = 1;     a < b == false     a is not less than b, because a equals b

Returning true means you want a to be ordered before b. So return false if that is not the case, either because you want b to be ordered before a, or because their order doesn't matter.

If you return true when the arguments are equal, then you are saying that you want a to come before b and you want b to come before a, which is a contradiction.

Upvotes: 19

Related Questions