user542687
user542687

Reputation:

Best way to compare std::strings

What is the best way to compare std::strings? The obvious way would be with if/else:

std::string input;
std::cin >> input;

if ( input == "blahblahblah" )
{
    // do something.
}

else if ( input == "blahblah" )
{
    // do something else.
}

else if ( input == "blah" )
{
    // do something else yet.
}

// etc. etc. etc.

Another possibility is to use an std::map and a switch/case. What is the best way when doing lots (like 8, 10, 12+) of these comparisons?

Upvotes: 15

Views: 36801

Answers (7)

DaveSawyer
DaveSawyer

Reputation: 361

You could make a switch using a compile-time hash and with a 64 bit hash not have to worry about collisions. A nice side effect of this is your compiled program will not have the string constants in it, potentially being smaller.

// compile-time FNV-1a hash
// see (https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function)
static constexpr std::uint64_t basis = 14695981039346656037ULL;
static constexpr std::uint64_t prime = 1099511628211ULL;
constexpr std::uint64_t c_hash(const char *str) {
    std::uint64_t hash{basis};
    while (*str != 0) {
        hash ^= str[0];
        hash *= prime;
        ++str;
    }
    return hash;
}
switch (c_hash(input)) {
    case c_hash("blah"): cout << "first"; break;
    case c_hash("blahblah"): cout << "second"; break;
    case c_hash("blahblahblah"): cout << "third"; break;
    default: cout < "what?"; break;
}

Upvotes: 0

Ben
Ben

Reputation: 1318

Here's an example using std::map.

#include <map>
#include <string>
#include <iostream>
#include <utility>

void first()
{
  std::cout << "first\n";
}

void second()
{
  std::cout << "second\n";
}

void third()
{
  std::cout << "third\n";
}


int main()
{
  typedef void(*StringFunc)();
  std::map<std::string, StringFunc> stringToFuncMap;

  stringToFuncMap.insert(std::make_pair("blah", &first));
  stringToFuncMap.insert(std::make_pair("blahblah", &second));
  stringToFuncMap.insert(std::make_pair("blahblahblah", &third));

  stringToFuncMap["blahblah"]();
  stringToFuncMap["blahblahblah"]();
  stringToFuncMap["blah"]();
}

Output is:

second
third
first

The benefits of this approach are:

  • It's easily extensible.
  • It forces you to break out the string-handling routines into separate functions (programming by intention).
  • Function lookup is O(log n), whereas your example is O(n)

Look into using boost::function to make the syntax a bit nicer, especially with class member functions.

Upvotes: 27

Haozhun
Haozhun

Reputation: 6521

If you mean "most efficient" by "the best", read ahead.

I'm suggesting using the following method if there really is a lot.
String in Switch is actually something will be in Java 7. (As part of Project Coin)

And according to the proposal, this is the way Java language will implement it.
First, hash value of each of the strings is calculated. This problem is then a "switch int" problem, which is available in most currently language, and is efficient. In each of the case statement, you then check if this is really the string (in very rare cases different strings could hash to the same int).
I personally don't do the last step in practice sometimes as it's necessity depends on the situation you specific program is in, i.e. whether the strings possible are under the programmer's control and how robust the program need to be.

A sample pseudocode the corresponds

String s = ...
switch(s) {
 case "quux":
    processQuux(s);
    // fall-through

  case "foo":
  case "bar":
    processFooOrBar(s);
    break;

  case "baz":
     processBaz(s);
    // fall-through

  default:
    processDefault(s);
    break;
}

from the fore-mentioned proposal to help you understand.

// Advanced example
{  // new scope for synthetic variables
  boolean $take_default = false;
  boolean $fallthrough = false;
  $default_label: {
      switch(s.hashCode()) { // cause NPE if s is null
      case 3482567: // "quux".hashCode()
          if (!s.equals("quux")) {
              $take_default = true;
              break $default_label;
          }
          processQuux(s);
          $fallthrough = true;
                case 101574: // "foo".hashCode()
          if (!$fallthrough && !s.equals("foo")) {
              $take_default = true;
              break $default_label;
          }
          $fallthrough = true;
      case 97299:  // "bar".hashCode()
          if (!$fallthrough && !s.equals("bar")) {
              $take_default = true;
              break $default_label;
          }
          processFooOrBar(s);
          break;

      case 97307: // "baz".hashCode()
          if (!s.equals("baz")) {
              $take_default = true;
              break $default_label;
          }
          processBaz(s);
          $fallthrough = true;

      default:
          $take_default = true;
          break $default_label;
      }
  }
  if($take_default)
      processDefault(s);
}

Upvotes: 0

mike.dld
mike.dld

Reputation: 3049

With 8, 10 and even 12 comparisons you can still use if ... else if ... scheme, nothing bad. If you want 100 or something, I'd recommend writing a function which would calculate a hash of string (even by simple xoring all the characters, but some other good method would be preferable for better distribution) and then switching over its result as Evan proposed. If function returns unique numbers for all the possible input strings - that's even better and doesn't require additional comparisons.

Upvotes: 0

Evan Teran
Evan Teran

Reputation: 90563

using operator== is pretty good, but if performance is really critical, you can improve it depending on your use case. If the goal is to pick one of a few choices and perform a specific action, you can use a TRIE. Also if the strings are different enough, you could do something like this:

switch(s[0]) {
case 'a':
    // only compare to strings which start with an 'a'
    if(s == "abcd") {

    } else if (s == "abcde") {

    }
    break;
case 'b':
    // only compare to strings which start with an 'b'
    if(s == "bcd") {

    } else if (s == "bcde") {

    }
    break;
default:
    // we know right away it doesn't match any of the above 4 choices...
}

basically use a certain character in the string which good uniqueness (doesn't have to be the first if all strings are at least N in length any character before N will do!) to do a switch then do a series of compares on a subset of the strings which match that unique characteristic

Upvotes: 3

Edward Strange
Edward Strange

Reputation: 40895

The answer to this question is all too dependent upon the problem. You've named two examples. You could add to your options things like hash tables, regular expressions, etc...

Upvotes: 0

Etienne de Martel
Etienne de Martel

Reputation: 37015

"12" isn't a lot... but anyway.

You can only use a switch for integral types (char, int, etc.), so that's out of the question for std::string. Using a map would probably be more readable.

Then again, it all depends on how you define "best".

Upvotes: 1

Related Questions