Reputation: 189
I tried to compute hashes of constant C strings at compile time using macros. Here is my example code:
#include <stddef.h>
#include <stdint.h>
typedef uint32_t hash_t;
#define hash_cstr(s) ({ \
typeof(sizeof(s)) i = 0; \
hash_t h = 5381; \
for (; i < sizeof(s) - 1; ) \
h = h * 33 + s[i++]; \
h; \
})
/* tests */
#include <stdio.h>
int main() {
#define test(s) printf("The djb2 hash of " #s " is a %u\n", hash_cstr(#s))
test(POST);
test(/path/to/file);
test(Content-Length);
}
Now I run GCC to produce an assembly listing:
arm-none-eabi-gcc-4.8 -S -O2 -funroll-loops -o hash_test.S hash_test.c
The result is as expected: all the strings were eliminated and replaced by their hashes. But I generally use -Os to compile code for embedded apps. When I do that, only strings shorter than four characters are hashed at compile time. I also tried setting the max-unroll-times parameter and using GCC 4.9:
arm-none-eabi-gcc-4.9 -S -Os -funroll-loops \
--param max-unroll-times=128 -o hash_test.S hash_test.c
I don't understand the reason for this behavior, or how I can raise this four-character limit.
Upvotes: 2
Views: 2839
Reputation: 189
It seems I have found a workaround, although it is limited by string length. It looks like a dirty hack, but it works as expected with any GCC toolchain.
#define _hash_cstr_4(s, o) \
for (; i < ((o + 4) < sizeof(s) - 1 ? \
(o + 4) : sizeof(s) - 1); ) \
h = h * 33 + s[i++]
#define _hash_cstr_16(s, o) \
_hash_cstr_4(s, o); \
_hash_cstr_4(s, o + 4); \
_hash_cstr_4(s, o + 8); \
_hash_cstr_4(s, o + 12)
#define _hash_cstr_64(s, o) \
_hash_cstr_16(s, o); \
_hash_cstr_16(s, o + 16); \
_hash_cstr_16(s, o + 32); \
_hash_cstr_16(s, o + 48)
#define _hash_cstr_256(s, o) \
_hash_cstr_64(s, o); \
_hash_cstr_64(s, o + 64); \
_hash_cstr_64(s, o + 128); \
_hash_cstr_64(s, o + 192)
#define hash_cstr(s) ({ \
typeof(sizeof(s)) i = 0; \
hash_t h = 5381; \
if (sizeof(s) - 1 < 256) { \
_hash_cstr_256(s, 0); \
} else \
for (; i < sizeof(s) - 1; ) \
h = h * 33 + s[i++]; \
h; \
})
When the hashed string is shorter than 256 characters, the hash is calculated at compile time; otherwise it is calculated at runtime.
This solution does not require any additional compiler tuning. It works with -Os and -O1 too.
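To make the macro expansion easier to follow, here is a minimal runtime model of it. The function names djb2_plain and djb2_chunked are hypothetical and used only for illustration. The sketch shows that splitting the work into 64 clamped chunks of at most four iterations gives the same value as the single loop; presumably the short fixed-bound loops are what lets the optimizer fold them under -Os, but that part is an assumption based on the four-character limit observed in the question.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t hash_t;

/* Reference: the single djb2 loop from the question. */
static hash_t djb2_plain(const char *s, size_t len) {
    hash_t h = 5381;
    for (size_t i = 0; i < len; )
        h = h * 33 + s[i++];
    return h;
}

/* Runtime model of the macro expansion for len < 256: 64 chunks of at most
   four iterations each, every chunk clamped to the string length. */
static hash_t djb2_chunked(const char *s, size_t len) {
    hash_t h = 5381;
    size_t i = 0;
    for (size_t o = 0; o < 256; o += 4) {
        /* Same bound as _hash_cstr_4: min(o + 4, len). */
        size_t end = (o + 4 < len) ? o + 4 : len;
        for (; i < end; )
            h = h * 33 + s[i++];
    }
    return h;
}
```

Once i reaches the string length, every remaining chunk has an empty loop, so the trailing chunks contribute nothing to the hash.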
Upvotes: 1
Reputation: 1
I suggest putting the relevant code in a separate file and compiling that file with -O2 (not with -Os). Alternatively, put a function-specific pragma like
#pragma GCC optimize ("-O2")
before the function, or use a function attribute like __attribute__((optimize("O2"))) (the pure attribute is probably also relevant).
You might also be interested in __builtin_constant_p.
I would make your hashing code a static inline function (perhaps with the always_inline function attribute), e.g.
static inline hash_t hashfun(const char*s) {
hash_t h = 5381;
for (const char* p = s; *p; p++)
h = h * 33 + *p;
return h;
}
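As a rough sketch of how the always_inline attribute and __builtin_constant_p could be combined, the variant below forces inlining and adds a small probe macro. The macro name hash_is_folded is hypothetical; whether it reports 1 for a given call depends on the GCC version and the optimization flags, so treat it as a diagnostic aid rather than a guarantee.

```c
#include <stdint.h>

typedef uint32_t hash_t;

/* djb2 over a NUL-terminated string. always_inline asks GCC to inline the
   function at every call site, so literal arguments stay visible to the
   optimizer. */
static inline __attribute__((always_inline)) hash_t hashfun(const char *s) {
    hash_t h = 5381;
    for (const char *p = s; *p; p++)
        h = h * 33 + *p;
    return h;
}

/* Hypothetical probe: evaluates to 1 if GCC can prove that hashfun(s) is a
   compile-time constant at this call site, and to 0 otherwise. */
#define hash_is_folded(s) __builtin_constant_p(hashfun(s))
```

Printing hash_is_folded("POST") from builds compiled with -Os and with -O2 would be one way to see which call sites the compiler managed to fold.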
A more portable (and less brittle) alternative is to change your build procedure to generate a C file (e.g. with a simple awk or python script, or even an ad-hoc C program) containing things like
const char str1[]="POST";
hash_t hash1=2089437419; // the hash code of str1
Don't forget that .c or .h files can be generated by something else (you'll just need to add some rules to your Makefile to generate them); if your boss feels uneasy about that, show him the metaprogramming wikipage.
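A minimal sketch of the "ad-hoc C program" variant of such a generator could look like the following. The names djb2 and emit_entry are hypothetical, and the sketch assumes the input strings contain no characters that need escaping in a C string literal (quotes, backslashes, newlines).

```c
#include <stdint.h>
#include <stdio.h>

typedef uint32_t hash_t;

/* Plain runtime djb2, the same recurrence as in the question. */
static hash_t djb2(const char *s) {
    hash_t h = 5381;
    while (*s)
        h = h * 33 + *s++;
    return h;
}

/* Hypothetical generator step: writes one string/hash pair as C source.
   Assumes s needs no escaping inside a C string literal.
   Returns the fprintf result (negative on output error). */
static int emit_entry(FILE *out, unsigned idx, const char *s) {
    return fprintf(out,
                   "const char str%u[]=\"%s\";\n"
                   "hash_t hash%u=%luu; // the hash code of str%u\n",
                   idx, s, idx, (unsigned long)djb2(s), idx);
}
```

A small main that calls emit_entry(stdout, 1, "POST") for each string, with its output redirected into a .c file by a Makefile rule, would produce declarations equivalent to the two lines shown above.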
Upvotes: 2
Reputation: 193
If C++ is allowed, give a template function a chance, something like:
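// Hash of the first I characters of str, in the same order as the djb2 loop.
template<int I>
inline hash_t hash_rec(const char* str) {
    return hash_rec<I-1>(str) * 33 + str[I-1];
}
// An explicit specialization ends the recursion; a plain if (I > 0) would not
// stop the compiler from instantiating hash_rec<-1>, hash_rec<-2>, ...
template<>
inline hash_t hash_rec<0>(const char*) {
    return 5381;
}
// sizeof(str) - 1 skips the terminating NUL of a string literal.
#define hash(str) hash_rec<sizeof(str) - 1>(str)
h = hash(str);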
Upvotes: 0