Troy Harvey
Troy Harvey

Reputation: 1415

constexpr in C (or equivalent)

I am trying to get a string-based switch expression to work in C using a hash function. I've been able to get it to work with clean syntax using 'constexpr' with Clang/LLVM turned to C++, even though the code is C.

However, there are of course odd side effects of having it compile as C++, like lack of void* implicit casting which becomes really awkward.

So the question is how to solve this dilemma (without slapping the C11 committee upside their head for why this wasn't added to the C spec)

  1. Is there a way to get constexpr option turned on with C?
  2. Is there a way to get implicit void* casting turned on with C++?
  3. Is there another clean way to code this in C11/C99 that doesn't require recalculating hashes?

Here is my current example code:

constexpr uint64 cHash(char const* text, uint64 last_value = basis)
{
    return *str ? cHash(text+1, (*text ^ last_value) * prime) : last_value;
}

void SwitchFunction(char const* text)
{
    switch(Hash(text))
    {
        case cHash("first"):
            break;
        case cHash("second"):
            break;
        case cHash("third"):
            break;
        default:
            break;
    }
}

Upvotes: 25

Views: 32463

Answers (5)

Henri Menke
Henri Menke

Reputation: 10939

I'm a bit late to the party, but was recently facing the same problem.

For such a simple hash function, you can just implement it using the C preprocessor. The downside is that the preprocessor can not split strings into characters, so instead of hash("first") you will have to write HASH('f','i','r','s','t'). The HASH macro is implemented using __VA_ARGS__ and works for strings with up to eight characters.

I have also turned you hash function from a recursive one into an iterative one, which is a bit easier to read and doesn't require the optional argument. The generated assembly is pretty much the same (https://godbolt.org/z/1g8LPI).

#include <stdio.h>

typedef unsigned long uint64;

#define HASH_BASIS 17UL
#define HASH_PRIME 11UL

#define HASH_1(ARG1) ((ARG1 ^ HASH_BASIS) * HASH_PRIME)
#define HASH_2(ARG1, ARG2) ((ARG2 ^ HASH_1(ARG1)) * HASH_PRIME)
#define HASH_3(ARG1, ARG2, ARG3) ((ARG3 ^ HASH_2(ARG1, ARG2)) * HASH_PRIME)
#define HASH_4(ARG1, ARG2, ARG3, ARG4)                                         \
    ((ARG4 ^ HASH_3(ARG1, ARG2, ARG3)) * HASH_PRIME)
#define HASH_5(ARG1, ARG2, ARG3, ARG4, ARG5)                                   \
    ((ARG5 ^ HASH_4(ARG1, ARG2, ARG3, ARG4)) * HASH_PRIME)
#define HASH_6(ARG1, ARG2, ARG3, ARG4, ARG5, ARG6)                             \
    ((ARG6 ^ HASH_5(ARG1, ARG2, ARG3, ARG4, ARG5)) * HASH_PRIME)
#define HASH_7(ARG1, ARG2, ARG3, ARG4, ARG5, ARG6, ARG7)                       \
    ((ARG7 ^ HASH_6(ARG1, ARG2, ARG3, ARG4, ARG5, ARG6)) * HASH_PRIME)
#define HASH_8(ARG1, ARG2, ARG3, ARG4, ARG5, ARG6, ARG7, ARG8)                 \
    ((ARG8 ^ HASH_7(ARG1, ARG2, ARG3, ARG4, ARG5, ARG6, ARG7)) * HASH_PRIME)

#define HASH_COUNT(ARG1, ARG2, ARG3, ARG4, ARG5, ARG6, ARG7, ARG8, func, ...)  \
    func

#define HASH(...)                                                              \
    HASH_COUNT(__VA_ARGS__, HASH_8(__VA_ARGS__), HASH_7(__VA_ARGS__),          \
               HASH_6(__VA_ARGS__), HASH_5(__VA_ARGS__), HASH_4(__VA_ARGS__),  \
               HASH_3(__VA_ARGS__), HASH_2(__VA_ARGS__), HASH_1(__VA_ARGS__))

uint64 hash(const char *text) {
    uint64 h = HASH_BASIS;
    char c;
    while ((c = *text++) != '\0') {
        h = (c ^ h) * HASH_PRIME;
    }
    return h;
}

int main(int argc, char *argv[]) {
    const char *text = argc > 1 ? argv[1] : "";
    switch (hash(text)) {
    case HASH('f', 'i', 'r', 's', 't'):
        puts(text);
        break;
    case HASH('s', 'e', 'c', 'o', 'n', 'd'):
        puts(text);
        break;
    case HASH('t', 'h', 'i', 'r', 'd'):
        puts(text);
        break;
    default:
        puts("oops");
        break;
    }
}

Upvotes: 14

Alvae
Alvae

Reputation: 1294

If you use an inline function and compile your code with optimizations, a decent compiler should be able to apply constant propagation to your code. Here's a small example:

const int basis = 17;
inline const int hash(const char* text, int last_value) {
  return *text ? hash(text + 1, (*text ^ last_value) * 11) : last_value;
}

int main(int argc, const char** argv) {
  if (hash(argv[0], basis) == hash("hello", basis)) {
    return 0;
  } else {
    return 1;
  }
}

If invoked with the -O3 flag, clang will optimize away the call to hash("hello", basis) and replace it with a constant statically. You can see that optimization if you generate the LLVM byte code (clang -S -emit-llvm example.c):

; (...)
  %18 = icmp ne i32 %14, 20068367
  %19 = zext i1 %18 to i32
  br label %20
; (...)

Unfortunately, this doesn't mean you can use a call to hash as an actual constant expression within your code, as there's no way to tell the compiler hash is necessarily statically optimizable. So for instance, you can't use it as the value of a switch-case. For these particular use cases (no pun intended), you've got no choice but using pre-computed constants (i.e. Lundin's suggestion).

This might not be as hard as you think, depending on how complex are your constexprs. There are countless C parsers written in various scripting languages (e.g. pycparser for Python). Then all you need to do is to walk your C AST and apply whatever custom preprocessing pass(es) you see fit.

Upvotes: 9

mjs
mjs

Reputation: 3005

That isnt going to work in C. The values of the case labels must be constant.

What you could do is pre-calculate the output for cHash("first") etc, and then use the value in the case, eg:

#define CHASH_FIRST 0x831928 /* precalculated output for cHash ("first") */

switch (Hash(text))
{
   case CHASH_FIRST:
     break;

}

To extend this you could then build another binary that just calculates the values for your hashes, run this as part of your build process, and use the values generated as pre-processor defines on your compile line using.

Upvotes: 4

Lundin
Lundin

Reputation: 213358

Is there a way to get constexpr option turned on with C?

No, no such thing exists in C.

Is there a way to get implicit void* casting turned on with C++?

No, C++ has mandatory type safety of pointers.

Is there another clean way to code this in C11/C99 that doesn't require recalculating hashes?

The only way you can do it, is the traditional way with macros. In case you create a function-like macro with those parameters, and only use it on compile-time constants, then all computations will be done at compile-time. Unfortunately, the code will turn rather ugly, but there is no way to avoid that in C.

The best way might be to prepare all such compile-time parameters with an external script/program, then just store them as raw data tables in the C program.

Upvotes: 6

Joe
Joe

Reputation: 7798

If you know the values to be hashed ahead of time then you could use gperf and generate a perfect hash? C won't play well with constexpr.

Upvotes: 6

Related Questions