Kevin Duarte

Reputation: 428

Why is there no optimization for uint8?

So I have been researching how the uint8 type works, and I have realized that it is actually not faster than int! In order to multiply, divide, add, or subtract, the program must first convert the uint8 values to int, which makes the operation about the same speed or slightly slower.

Why did C++ not implement multiplying, dividing, adding, or subtracting directly to uint8?

Upvotes: 2

Views: 398

Answers (2)

BlueWanderer

Reputation: 2691

I'm not sure whether a compiler will emit 8-bit arithmetic operations for uint8_t when appropriate (quite unlikely, since they are unlikely to be faster).

As @harold mentioned, what I said before is somewhat out of date. The partial register update problem is no longer so serious for 8-bit operations. So it is simply that most 8-bit operations are not faster. 8-bit division is a little faster, though, and I'm trying to figure out why MS's compiler won't use it. (Not so sure: since the partial update problem is only mostly reduced, not completely removed, and AMD has even kept it, the one-cycle benefit of 8-bit division may just not be worth exploiting.)

Original: On modern x86 processors, 8-bit operations face a problem called partial register update: you change only part of the full register, which creates a false dependency that seriously impacts performance.


And FYI, at the language level C++ has no arithmetic for integral types smaller than int. The integral promotions (applied as part of the usual arithmetic conversions) widen such operands to int before the operation is performed.
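
To make the promotion concrete, here is a minimal sketch. The helper names `add_promoted` and `add_wrapped` are made up for illustration, and it assumes a platform that provides `std::uint8_t` (the type is optional in the standard).

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The type of uint8_t + uint8_t is int: both operands undergo
// integral promotion before the addition is performed.
static_assert(std::is_same<decltype(std::uint8_t{1} + std::uint8_t{1}), int>::value,
              "uint8_t operands are promoted to int");

// Hypothetical helper: returns the sum exactly as the language computes it, in int.
int add_promoted(std::uint8_t a, std::uint8_t b) {
    return a + b;  // no wrap-around at 255, because the addition happens in int
}

// Hypothetical helper: narrows the int result back to 8 bits (modulo 256).
std::uint8_t add_wrapped(std::uint8_t a, std::uint8_t b) {
    const int sum = a + b;
    assert(sum >= 0 && sum <= 510);  // range of the promoted sum
    return static_cast<std::uint8_t>(sum);
}
```

With these definitions, `add_promoted(200, 100)` yields 300, while `add_wrapped(200, 100)` yields 44 (300 modulo 256). Because only the low 8 bits survive the cast in the second case, a compiler is free under the as-if rule to use an 8-bit instruction there if it judges that profitable.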

Upvotes: 1

πάντα ῥεῖ

Reputation: 1

Why did C++ not implement multiplying, dividing, adding, or subtracting directly to uint8?

Because the optimal way of doing that is platform-specific.

Most CPUs provide these operations as assembler instructions that work on integer values of a specific default size (e.g. 32 bits or 64 bits, like shown here for 16 bit instructions); they may or may not have such instructions for uint8 values.
The bit size is usually matched to the CPU's cache line mechanisms.

So the optimal implementation depends on the instructions available on the target CPU and cannot be prescribed by the C++ standard.
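
One place where the standard library does let the implementation make this platform-specific choice is `std::uint_fast8_t` from `<cstdint>`: the fastest unsigned integer type with a width of at least 8 bits, whatever that is on the target. The sketch below is only an illustration; `checksum` is a hypothetical helper, and it again assumes `std::uint8_t` exists on the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// uint_fast8_t may be wider than 8 bits if wider arithmetic is faster on the target.
static_assert(sizeof(std::uint_fast8_t) >= sizeof(std::uint8_t),
              "uint_fast8_t has a width of at least 8 bits");

// Hypothetical helper: sums n bytes modulo 256, accumulating in
// whatever width the implementation considers fastest.
std::uint8_t checksum(const std::uint8_t* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    std::uint_fast8_t acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // The addition itself happens in int (or wider); the cast narrows it again.
        acc = static_cast<std::uint_fast8_t>(acc + data[i]);
    }
    // Mask so the result is the same whether acc is 8 bits wide or wider.
    return static_cast<std::uint8_t>(acc & 0xFF);
}
```

For the bytes 200, 100, 1 this returns 45 (301 modulo 256) regardless of how wide `std::uint_fast8_t` turns out to be.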

Upvotes: 4
