Reputation: 89053
So I'm optimizing some code by unrolling some loops (yes, I know I should rely on my compiler to do this for me, but I'm not working with my choice of compilers), and I want to do it gracefully so that, if my data size changes due to future edits, the code will degrade elegantly.
Something like:
typedef struct {
uint32_t alpha;
uint32_t two;
uint32_t iii;
} Entry;
/*...*/
uint8_t * bytes = (uint8_t *) entry;
#define PROCESS_ENTRY(i) bytes[i] ^= 1; /*...etc, etc, */
#if (sizeof(Entry) == 12)
PROCESS_ENTRY( 0);PROCESS_ENTRY( 1);PROCESS_ENTRY( 2);
PROCESS_ENTRY( 3);PROCESS_ENTRY( 4);PROCESS_ENTRY( 5);
PROCESS_ENTRY( 6);PROCESS_ENTRY( 7);PROCESS_ENTRY( 8);
PROCESS_ENTRY( 9);PROCESS_ENTRY(10);PROCESS_ENTRY(11);
#else
# warning Using non-optimized code
size_t i;
for (i = 0; i < sizeof(Entry); i++)
{
PROCESS_ENTRY(i);
}
#endif
#undef PROCESS_ENTRY
This doesn't work, of course, since sizeof isn't available to the preprocessor (at least, that's what this answer seemed to indicate).
Is there an easy workaround I can use to get the sizeof a data structure for use with a C macro, or am I just SOL?
Upvotes: 5
Views: 6684
Reputation: 101565
You cannot do it in the preprocessor, but you do not need to. Just generate a plain if in your macro:
#define PROCESS_ENTRY(i) bytes[i] ^= 1; /*...etc, etc, */
if (sizeof(Entry) == 12) {
PROCESS_ENTRY( 0);PROCESS_ENTRY( 1);PROCESS_ENTRY( 2);
PROCESS_ENTRY( 3);PROCESS_ENTRY( 4);PROCESS_ENTRY( 5);
PROCESS_ENTRY( 6);PROCESS_ENTRY( 7);PROCESS_ENTRY( 8);
PROCESS_ENTRY( 9);PROCESS_ENTRY(10);PROCESS_ENTRY(11);
} else {
size_t i;
for (i = 0; i < sizeof(Entry); i++) {
PROCESS_ENTRY(i);
}
}
sizeof is a constant expression, and comparing a constant against a constant is also constant. Any sane C compiler will optimize away the branch that is always false at compile time; constant folding is one of the most basic optimizations. You lose the #warning, though.
Upvotes: 17
Reputation: 35925
This probably won't help, but if you have the ability to do this in C++ you can use a template to cause the compiler to dispatch to the appropriate loop at compile time:
template <std::size_t SizeOfEntry>
void process_entry_loop(...)
{
// ... the nonoptimized version of the loop
}
template <>
void process_entry_loop<12>(...)
{
// ... the optimized version of the loop
}
// ...
process_entry_loop<sizeof(Entry)>(...);
Upvotes: 1
Reputation: 49311
Two other approaches spring to mind - either write a small app to write the unrolled loop, or use a variation on Duff's device with the expected size of the struct.
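To illustrate the second suggestion, here is a minimal sketch of a Duff's-device-style loop, assuming the per-byte work is the bytes[i] ^= 1 step from the question. The function name process_bytes and the unroll factor of 4 are hypothetical choices made for this example, not something from the original post.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: applies the question's "bytes[i] ^= 1" step to
 * `count` bytes, unrolled four at a time in the style of Duff's device. */
static void process_bytes(uint8_t *bytes, size_t count)
{
    size_t i = 0;
    size_t passes = (count + 3) / 4;   /* trips through the do/while body */

    if (count == 0)
        return;

    switch (count % 4) {               /* jump into the middle of the body */
    case 0: do { bytes[i++] ^= 1;
    case 3:      bytes[i++] ^= 1;
    case 2:      bytes[i++] ^= 1;
    case 1:      bytes[i++] ^= 1;
            } while (--passes > 0);
    }
}
```

Calling process_bytes((uint8_t *) entry, sizeof(Entry)) would then handle any struct size: the first pass absorbs the remainder, and every later pass runs the full four-step body.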
Upvotes: 1
Reputation: 279245
You're out of luck - the preprocessor doesn't even know what a struct is, let alone any way to work out its size.
In a case like this you could just #define a constant to what you happen to know the size of the struct is, then statically assert that it's actually equal to the size using the negative-sized array trick.
Also you could try just doing if (sizeof(Entry) == 12), and see whether your compiler is capable of evaluating the branch condition at compile time and removing dead code. It's not that big an ask.
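The #define-plus-assertion idea might look roughly like the sketch below. The names ENTRY_SIZE, entry_size_check and process_entry are hypothetical, and the per-byte work is again assumed to be the bytes[i] ^= 1 step from the question. (On a C11 compiler, _Static_assert could replace the array trick.)

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t alpha;
    uint32_t two;
    uint32_t iii;
} Entry;

/* Hand-maintained size (hypothetical name), visible to the preprocessor. */
#define ENTRY_SIZE 12

/* Negative-sized array trick: the array has 1 element when the sizes
 * match and -1 elements (a compile error) when they do not. */
typedef char entry_size_check[(sizeof(Entry) == ENTRY_SIZE) ? 1 : -1];

#define PROCESS_ENTRY(i) bytes[i] ^= 1

static void process_entry(Entry *entry)
{
    uint8_t *bytes = (uint8_t *) entry;
#if ENTRY_SIZE == 12
    /* Unrolled path, selected by the preprocessor. */
    PROCESS_ENTRY( 0); PROCESS_ENTRY( 1); PROCESS_ENTRY( 2);
    PROCESS_ENTRY( 3); PROCESS_ENTRY( 4); PROCESS_ENTRY( 5);
    PROCESS_ENTRY( 6); PROCESS_ENTRY( 7); PROCESS_ENTRY( 8);
    PROCESS_ENTRY( 9); PROCESS_ENTRY(10); PROCESS_ENTRY(11);
#else
# warning Using non-optimized code
    size_t i;
    for (i = 0; i < sizeof(Entry); i++) {
        PROCESS_ENTRY(i);
    }
#endif
}
```

If someone later changes Entry without updating ENTRY_SIZE, the typedef fails to compile, so the constant and the real size cannot silently drift apart. Changing both keeps the #warning fallback from the question.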
Upvotes: 5
Reputation: 10670
If you want the smallest possible size for the struct (or to align it to a 4-byte boundary, or whatever), you can use the packed or aligned attributes.
In Visual C++, you can use #pragma pack, and in GCC you can use __attribute__((packed)) and __attribute__((aligned(num-bytes))).
Upvotes: -2
Reputation: 120644
If you are using autoconf or another build configuration system, you could check the size of the data structures at configuration time and write out headers (like #define SIZEOF_Entry 12). Of course this gets more complicated when cross-compiling and such, but I am assuming your build and target architectures are the same.
Otherwise yes, you are out of luck.
Upvotes: 9