user2371524
user2371524

Reputation:

Determine the fastest unsigned integer type for basic arithmetic calculations

I'm writing some code for calculating with arbitrarily large unsigned integers. This is just for fun and training, otherwise I'd use libgmp. My representation uses an array of unsigned integers and for chosing the "base type", I use a typedef:

#include <limits.h>
#include <stdint.h>

typedef unsigned int hugeint_Uint;

typedef struct hugeint hugeint;

#define HUGEINT_ELEMENT_BITS (CHAR_BIT * sizeof(hugeint_Uint))
#define HUGEINT_INITIAL_ELEMENTS (256 / HUGEINT_ELEMENT_BITS)

struct hugeint
{
    size_t s;         // <- maximum number of elements
    size_t n;         // <- number of significant elements
    hugeint_Uint e[]; // <- elements of the number starting with least significant
};

The code is working fine, so I only show the part relevant to my question here.

I would like to pick a better "base type" than unsigned int, so the calculations are the fastest possible on the target system (e.g. a 64bit type when targeting x86_64, a 32bit type when targeting i686, an 8bit type when targeting avr_attiny, ...)

I thought that uint_fast8_t should do what I want. But I found out it doesn't, see e.g. here the relevant part of stdint.h from MinGW:

/*  7.18.1.3  Fastest minimum-width integer types
 *  Not actually guaranteed to be fastest for all purposes
 *  Here we use the exact-width types for 8 and 16-bit ints.
 */
typedef signed char int_fast8_t;
typedef unsigned char uint_fast8_t;

The comment is interesting: for which purpose would an unsigned char be faster than an unsigned int on win32? Well, the important thing is: uint_fast8_t will not do what I want.

So is there some good and portable way to find the fastest unsigned integer type?

Upvotes: 2

Views: 204

Answers (2)

rustyx
rustyx

Reputation: 85316

It's not quite that black and white; processors may have different/specialized registers for certain operations, like AVX registers on x86_64, may operate most efficiently on half-sized registers or not have registers at all. The choice of the "fastest integer type" thus depends heavily on the actual calculation you need to perform.

Having said that, C99 defines uintmax_t which is meant to represent the maximum width unsigned integer type, but beware, it could be 64 bit simply because the compiler is able to emulate 64-bit math.

If you target commodity processors, size_t usually provides a good approximation for the "bitness" of the underlying hardware because it is directly tied to the memory addressing capability of the machine, and as such is most likely to be the most optimal size for integer math.

In any case you're going to have to test your solution on all hardware that you're planning to support.

Upvotes: 2

tofro
tofro

Reputation: 6063

It's a good idea to start your code with the largest integer type the platform has, uintmax_t. As has already been pointed out, this is not necessarily, but rather most probably the fastest. I'd rather say there are exceptions where this is not the case, but as a default, it is probably your best bet.

Be very careful to build the size granularity into expressions that the compiler can resolve at compile type, rather than runtime for speed.

It is most probably a good idea to define the base type as something like

#define LGINT_BASETYPE uintmax_t
#define LGINT_GRANUL   sizeof(LGINT_BASETYPE)

This will allow you to change the base type in one single place and adapt to different platforms quickly. That results in code that is easily moved to a new platform, but still can be adapted to the exception cases where the largest int type is not the most performant one (after you have proven that by measurement)

As always, it does not make a lot of sense to think about optimal performance when designing your code - Start with a reasonable balance of "designed for optimization" and "design for maintainability" - You might easily find out that the choice of base type is not really the most CPU-eating part of your code. In my experience, I was nearly always in for some surprises when comparing my guesses on where CPU is spent to my measurements. Don't fall into the premature optimization trap.

Upvotes: 0

Related Questions