Dia Abdulkarim

Reputation: 13

What is the meaning of the comparison operators for strings?

I am teaching a C++ class, and our current topic is operator overloading. The textbook I am using, "C++ How to Program" by Deitel, has this example, which compares two strings in a way that does not seem to make logical sense to me.

My question is about the following example. How come s2 is smaller than s1 in the comparison (printed using boolalpha)?

#include <iostream>
#include <string> 
using namespace std;

int main() {
    string s1{ "happy" };
    string s2{ " birthday" };
    string s3; // creates an empty string

    // test overloaded equality and relational operators
    cout << "s1 is \"" << s1 << "\"; s2 is \"" << s2
        << "\"; s3 is \"" << s3 << '\"'
        << "\n\nThe results of comparing s2 and s1:" 
        << "\ns2 == s1 yields " << (s2 == s1)
        << "\ns2 != s1 yields " << (s2 != s1)
        << "\ns2 >  s1 yields " << (s2 > s1)
        << "\ns2 <  s1 yields " << (s2 < s1)
        << "\ns2 >= s1 yields " << (s2 >= s1)
        << "\ns2 <= s1 yields " << (s2 <= s1)
        << "\nSize of string s1 is: " << s1.length()
        << "\nSize of string s2 is: " << s2.length() << '\n';

}

Upvotes: 0

Views: 105

Answers (3)

Tom Blodget

Reputation: 20782

Although string is intended for text, and all text has an encoding, the comparison operators treat a string as a sequence of bytes, that is, as numeric values. So no matter how the bytes got into the string or, if they are text, which encoding is used, the comparison is a lexicographical one over numeric sequences: element by element, up to the first difference or the end of one or both sequences. At the first difference, the string with the smaller element is the lesser one; if one string runs out first, the shorter string is the lesser one.

C++ String class

Note that this class handles bytes independently of the encoding used: If used to handle sequences of multi-byte or variable-length characters (such as UTF-8), all members of this class (such as length or size), as well as its iterators, will still operate in terms of bytes (not actual encoded characters).

Upvotes: 0

Ben S.

Reputation: 1143

The space character at the beginning of s2 comes before the h at the beginning of s1 in the ASCII chart, so s2 gets sorted first. Fortunately, if that isn't the behavior you want, the standard library includes ways of using a different ordering. For example, some overloads of std::lexicographical_compare take a parameter of type Compare that can be any function that takes two elements of the same type and returns true if and only if the first should come before the second. (You can also use a "function-like object", i.e., a class that implements an operator() that takes two elements and returns a bool.) Similarly, containers like std::set and std::map that keep their elements in sorted order take a template type argument Compare that can be used to sort their elements however you like.

Now, that sounds like a lot of work compared to just doing s1 < s2, huh? Well, if you like, you can create your own string class with its own operator< implemented however you like. One way to do that is like this:

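#include <algorithm>  // std::lexicographical_compare
#include <string>

// Placeholder: define this with whatever element ordering you need.
bool customComparison(char a, char b);

class myString: public std::string
{
    // ...
};

bool operator<(const myString &ms1, const myString &ms2)
{
    return std::lexicographical_compare(ms1.begin(), ms1.end(), ms2.begin(), ms2.end(), customComparison);
}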

Now, you do have to be a bit careful depending on how you do this - in this case, you need to be careful not to hand myString objects to something that's expecting a std::string, since it would compare them like std::string objects instead - but there are various techniques for mitigating that if you decide it's something you want to do.

Upvotes: 2

Izuka

Reputation: 2612

What is being used here is lexicographical comparison. Put simply, when two string values are compared this way, the one that would appear first in a dictionary is considered "less than" the other. More precisely, the characters of the two strings are compared one by one until a pair differs, and the string whose character has the lower ASCII code is the lesser one. (Unlike a real dictionary, this puts every uppercase letter before every lowercase one.)

In your particular case, the space at the start of your " birthday" string has a lower ASCII value (32) than the h at the start of your "happy" string (104), so s2 is sorted first.

Upvotes: 2
