I'm writing some code for calculating with arbitrarily large unsigned integers. This is just for fun and training, otherwise I'd use libgmp. My representation uses an array of unsigned integers, and for choosing the "base type", I use a typedef:
#include <limits.h>
#include <stddef.h> /* for size_t */
#include <stdint.h>
typedef unsigned int hugeint_Uint;
typedef struct hugeint hugeint;
#define HUGEINT_ELEMENT_BITS (CHAR_BIT * sizeof(hugeint_Uint))
#define HUGEINT_INITIAL_ELEMENTS (256 / HUGEINT_ELEMENT_BITS)
struct hugeint
{
size_t s; // <- maximum number of elements
size_t n; // <- number of significant elements
hugeint_Uint e[]; // <- elements of the number starting with least significant
};
The code is working fine, so I only show the part relevant to my question here.
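For context, here is a minimal sketch of how such a struct might be allocated; hugeint_new is a hypothetical helper of my own, not part of the code above, and it assumes elements >= 1:
#include <stdlib.h>

/* Hypothetical allocator: reserves room for 'elements' array members
 * via the flexible array member (C99), initialized to the value zero. */
static hugeint *hugeint_new(size_t elements)
{
    hugeint *self = malloc(sizeof *self + elements * sizeof *self->e);
    if (!self) return NULL;
    self->s = elements; /* capacity */
    self->n = 1;        /* one significant element ... */
    self->e[0] = 0;     /* ... holding the value zero */
    return self;
}
A caller would then do something like hugeint_new(HUGEINT_INITIAL_ELEMENTS).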
I would like to pick a better "base type" than unsigned int, so that the calculations are as fast as possible on the target system (e.g. a 64-bit type when targeting x86_64, a 32-bit type when targeting i686, an 8-bit type when targeting avr_attiny, ...).
I thought that uint_fast8_t should do what I want. But I found out it doesn't; see, for example, the relevant part of stdint.h from MinGW:
/* 7.18.1.3 Fastest minimum-width integer types
* Not actually guaranteed to be fastest for all purposes
* Here we use the exact-width types for 8 and 16-bit ints.
*/
typedef signed char int_fast8_t;
typedef unsigned char uint_fast8_t;
The comment is interesting: for which purpose would an unsigned char be faster than an unsigned int on win32? Well, the important thing is: uint_fast8_t will not do what I want.
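As a quick sanity check, this tiny self-contained program (my own sketch, not from the MinGW headers) shows what uint_fast8_t resolves to on a given toolchain:
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* On MinGW, given the header excerpt above, this should print 1,
     * confirming uint_fast8_t is just unsigned char there. */
    printf("sizeof(uint_fast8_t) = %u\n", (unsigned)sizeof(uint_fast8_t));
    return 0;
}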
So is there some good and portable way to find the fastest unsigned integer type?
Upvotes: 2
Views: 204
Reputation: 85316
It's not quite that black and white; processors may have different or specialized registers for certain operations (like the AVX registers on x86_64), may operate most efficiently on half-sized registers, or may not have registers at all. The choice of the "fastest integer type" thus depends heavily on the actual calculation you need to perform.
Having said that, C99 defines uintmax_t, which is meant to represent the maximum-width unsigned integer type. But beware: it could be 64-bit simply because the compiler is able to emulate 64-bit math.
If you target commodity processors, size_t usually provides a good approximation of the "bitness" of the underlying hardware, because it is directly tied to the memory-addressing capability of the machine and as such is likely to be the optimal size for integer math.
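For example, a minimal sketch of that approach, reusing the question's typedef name (the _Static_assert requires C11 and is trivially true for size_t, but it documents the requirement if the typedef is ever changed):
#include <stddef.h>

/* Sketch: use size_t as the element type, assuming it matches the
 * machine word on commodity processors. */
typedef size_t hugeint_Uint;

/* Compile-time check that the element type stays unsigned (C11). */
_Static_assert((hugeint_Uint)-1 > 0, "hugeint_Uint must be unsigned");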
In any case you're going to have to test your solution on all hardware that you're planning to support.
Upvotes: 2
Reputation: 6063
It's a good idea to start your code with the largest integer type the platform has, uintmax_t. As has already been pointed out, this is not necessarily the fastest type, but it most probably is. There are exceptions where this is not the case, but as a default it is probably your best bet.
For speed, be very careful to build the size granularity into expressions that the compiler can resolve at compile time rather than at runtime.
It is most probably a good idea to define the base type as something like
#define LGINT_BASETYPE uintmax_t
#define LGINT_GRANUL sizeof(LGINT_BASETYPE)
This will allow you to change the base type in a single place and adapt to different platforms quickly. That results in code that is easily moved to a new platform, but can still be adapted to the exceptional cases where the largest integer type is not the most performant one (after you have proven that by measurement).
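For example, a hypothetical helper of my own built on those macros keeps the granularity a compile-time constant, so the compiler can fold the arithmetic (typically into a shift):
#include <limits.h>

#define LGINT_GRANUL_BITS (CHAR_BIT * LGINT_GRANUL)

/* Number of base-type elements needed to hold 'bits' bits; every
 * operand is a constant expression when 'bits' is, so the division
 * is resolved at compile time. */
#define LGINT_ELEMS(bits) \
    (((bits) + LGINT_GRANUL_BITS - 1) / LGINT_GRANUL_BITS)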
As always, it does not make a lot of sense to obsess over optimal performance when designing your code. Start with a reasonable balance of "designed for optimization" and "designed for maintainability"; you might easily find out that the choice of base type is not really the most CPU-eating part of your code. In my experience, I was nearly always in for some surprises when comparing my guesses on where CPU time is spent to my measurements. Don't fall into the premature optimization trap.
Upvotes: 0