Reputation: 54919
I have the following function:
char f1( int a, unsigned b ) { return abs(a) <= b; }
For execution speed, I want to rewrite it as follows:
char f2( int a, unsigned b ) { return (unsigned)(a+b) <= 2*b; } // redundant cast
Or alternatively with this signature that could have subtle implications even for non-negative b
:
char f3( int a, int b ) { return (unsigned)(a+b) <= 2*b; }
Both of these alternatives work under a simple test on one platform, but I need it to portable. Assuming non-negative b
and no risk of overflow, is this a valid optimization for typical hardware and C compilers? Is it also valid for C++?
Note: As C++ on gcc 4.8 x86_64 with -O3
, f1()
uses 6 machine instructions and f2()
uses 4. The instructions for f3()
are identical to those for f2()
. Also of interest: if b
is given as a literal, both functions compile to 3 instructions that directly map to the operations specified in f2()
.
Upvotes: 6
Views: 696
Reputation: 54919
Yes, this is portable to compliant platforms. The conversion from signed to unsigned is well defined:
The description in the C spec is a bit contrived:
if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
The C++ spec addresses the same conversion in a more sensible way:
In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern
In the question, f2()
and f3()
achieve the same results in a slightly different way.
f2()
the presence of the unsigned
operand causes a conversion of the signed
operand as required here for C++. The unsigned addition may-or-may-not then result in a wrap-around past zero, which is also well defined [citation needed].f3()
the addition occurs in signed representation with no trickiness, and then the result is (explicitly) converted to unsigned. So this is slightly simpler than f2()
(and also more clear).In both cases, the you end up with the same unsigned representation of the sum, which can then be compared (as unsigned) to 2*b
. And the trick of treating a signed value as an unsigned type allows you to check a two-sided range with only a single comparison. Note also that this is a bit more flexible than using the abs()
function since the trick doesn't require that the range be centered around zero.
I think this question demonstrated that using unsigned types is generally a bad idea. Look at the confusion it caused here.
It can be tempting to use unsigned
for documentation purposes (or to take advantage of the shifted value range), but due to the conversion rules, this may tend to be a mistake. In my opinion, the "usual arithmetic conversions" are not sensible if you assume that arithmetic is more likely to involve negative values than to overflow signed values.
I asked this followup question to clarify the point: mixed-sign integer math depends on variable size. One new thing that I have learned is that mixed-sign operations are not generally portable because the conversion type will depend on the size relative to that of int
.
In summary: Using type declarations or casts to perform unsigned operations is a low-level coding style that should be approached with the requisite caution.
Upvotes: 1
Reputation: 144780
To determine if the 2 expressions are equivalent for your purpose, you must study the domain of definition:
abs(a) <= b
is defined for all values of int a
and unsigned b
, with just one special case for a = INT_MIN;
. On 2s complement architectures, abs(INT_MIN)
is not defined but most likely evaluates to INT_MIN
, which converted to unsigned
as required for the <=
with an unsigned
value, yields the correct value.
(unsigned)(a+b) <= 2*b
may produce a different result for b > UINT_MAX/2
. For example, it will evaluate to false for a = 1
and b = UINT_MAX/2+1
. There might be more cases where you alternate formula gives an incorrect result.
EDIT: OK, the question was edited... and b
is now an int
.
Note that a+b
invokes undefined behavior in case of overflow and the same for 2*b
. So you make the assumption that neither a+b
nor 2*b
overflow. Furthermore, if b
is negative, you little trick does not work.
If a
is in the range -INT_MAX/2..INT_MAX/2
and b
in the range 0..INT_MAX/2
, it seems to function as expected. The behavior is identical in C and C++.
Whether it is an optimization depends completely on the compiler, command line options, hardware capabilities, surrounding code, inlining, etc. You already address this part and tell us that you shave one or two instructions... Just remember that this kind of micro-optimization is not absolute. Even counting instructions does not necessarily help find the best performance. Did you perform some benchmarks to measure if this optimization is worthwhile? Is the difference even measurable?
Micro-optimizing such a piece of code is self-defeating: it makes the code less readable and potentially incorrect. b
might not be negative in the current version, but if the next maintainer changes that, he/she might not see the potential implications.
Upvotes: 2
Reputation: 16156
Starting with the original code with signature
char f2( int a, unsigned b );
this contains the expression
a + b
Since one of these operands has a signed and the other an (corresponding) unsigned integer type (thus they have the same "integer conversion rank"), then - following the "Usual arithmetic conversions" (§ 6.3.1.8) - the operand with signed integer type is converted to the unsigned type of the other operand.
Conversion to an unsigned integer type is well defined, even if the value in question cannot be represented by the new type:
[..] if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. 60
§ 6.3.1.3/2
Footnote 60 just says that the described arithmetic works with the mathematical value, not the typed one.
Now, with the updated code
char f2_updated( int a, int b ); // called f3 in the question
things would look different. But since b
is assumed to be non-negative, and assuming that INT_MAX <= UINT_MAX
you can convert b
to an unsigned
without fearing it to have a different mathematical value afterwards. Thus you could write
char f2_updated( int a, int b ) {
return f2(a, (unsigned)b); // cast unnecessary but to make it clear
}
Looking again at f2
the expression 2*b
further limits the allowed range of b
to be not larger than UINT_MAX/2
(otherwise the mathematical result would be wrong).
So as long as you stay within these bounds, every thing is fine.
Note: Unsigned types do not overflow, they "wrap" according to modular arithmetic.
Quotes from N1570 (a C11 working draft)
A final remark:
IMO the only really reasonable choice to write this function is as
#include <stdbool.h>
#include <assert.h>
bool abs_bounded(int value, unsigned bound) {
assert(bound <= (UINT_MAX / 2));
/* NOTE: Casting to unsigned makes the implicit conversion that
otherwise would happen explicit. */
return ((unsigned)value + bound) <= (2 * bound);
}
Using a signed type for the bound
does not make much sense, because the absolute of a value cannot be less than a negative number. abs_bounded(value, something_negative)
would be always false. If there's the possibility of a negative bound, then I'd catch this outside of this function (otherwise it does "too much"), like:
int some_bound;
// ...
if ((some_bound >= 0) && abs_bounded(my_value, some_bound)) {
// yeeeha
}
Upvotes: 2
Reputation: 153547
As OP wants fast and portable code (and b
is positive), it first makes sense to code safely:
// return abs(a) <= b;
inline bool f1_safe(int a, unsigned b ) {
return (a >= 0 && a <= b) || (a < 0 && 0u - a <= b);
}
This works for all a,b
(assuming UINT_MAX > INT_MAX
). Next, compare alternatives using an optimized compile (let the compiler do what it does best).
The following slight variation on OP's code will work in C/C++ but risks portability issues unless "Assuming non-negative b and no risk of overflow" can be certain on all target machines.
bool f2(int a, unsigned b) { return a+b <= b*2; }
In the end, OP goal of fast and portable code may find code the works optimally for the select platform, but not with others - such is micro-optimization.
Upvotes: 2