yino
yino

Reputation: 1

Hash string change case

I have a constexpr hash string function that gets called at compile time.

I want to know, how can I convert the string to uppercase or lowercase before it is hashed?

constexpr unsigned long long int HashString(const char* str, unsigned long long int hash = 0)
{
    return (*str == 0) ? hash : 101 * HashString(str + 1) + *str;
}

Upvotes: 0

Views: 476

Answers (1)

cdhowie
cdhowie

Reputation: 169353

If I understand your goal right, this is possible but with a few caveats:

  • You can't actually build a new string literal in a constexpr function. That capability does not exist in C++, so "converting the string to lowercase" is simply not possible at compile time, at least not in that specific way.
  • The answer given here assumes the input is ASCII; with non-ASCII strings (UTF-8, for example) it will either not change the case of multi-byte characters, or it might even damage multi-byte characters such that a UTF-8 decoder will fail when decoding the string.

Since we can't return a new string literal out of a constexpr function, we need to be able to give the hash function a mapping function to apply to each character. For example, here's a constexpr mapping function to convert an ASCII character to lowercase:

constexpr char ascii_tolower(char v) {
    return v >= 'A' && v <= 'Z' ?
        v + ('a' - 'A') :
        v;
}

We can also provide a simple "identity" function that maps values to themselves (if you want to hash a string as-is with no case conversion).

template <typename T>
constexpr T identity(T v) {
    return v;
}

To be able to supply these functions at compile time, we need to give this to another constexpr function as a template parameter.1 We'll make identity the default so that hash_fn can be called as though it weren't a template function, if no mapping is desired.

template <constexpr char (*fn)(char) = identity>
constexpr int hash_fn(char const *str) {
    return *str ? fn(*str) ^ (hash_fn<fn>(str + 1) << 3) : 0;
}

Now we can use this function in constexpr context with arbitrary mapping functions, like so:

constexpr int hash_a = hash_fn<ascii_tolower>("FoObAr");
constexpr int hash_b = hash_fn<ascii_tolower>("fOoBaR");

(Demo)


1 Note that this hash function is pretty simplistic and will very easily generate collisions. It's not intended to be an example of a good hash function, just an example of the technique of using a constexpr function as a template parameter.

Upvotes: 1

Related Questions