Reputation: 582
Edit: I updated the example to be C. I am concerned specifically with C and not C++ (sorry for the confusion, see situation below).
I am looking for a safe way to convert a signed integer to an unsigned integer while always keeping exactly the same bit pattern across conversions. As I understand it, a plain cast has undefined or implementation-dependent behavior, so it is not safe to rely on (case A below). But what about bitwise operators like OR (case B below)? Can bitwise OR be used to safely convert signed to unsigned? What about the reverse?
Example:
#include <stdio.h>

int main() {
    // NOTE: assuming 32-bit ints
    // example bit pattern: 11111111110001110001001111011010
    // signed int value: -3730470
    // unsigned int value: 4291236826

    // example 1
    // signed -> unsigned
    int s1 = -3730470;
    unsigned int u1a = (unsigned int)s1;      // case A: plain cast
    unsigned int u1b = (unsigned int)0 | s1;  // case B: bitwise OR
    printf("%u\n%u\n", u1a, u1b);

    // example 2
    // unsigned -> signed
    unsigned int u2 = 4291236826;
    int s2a = (int)u2;      // case A: plain cast
    int s2b = (int)0 | u2;  // case B: bitwise OR
    printf("%i\n%i\n", s2a, s2b);
}
Situation: I am writing a PostgreSQL C-Language function/extension to add popcount functionality (my first attempt code here). PostgreSQL does not support unsigned types (ref). All the efficient methods of calculating popcount I found require unsigned data types to work correctly. Therefore, I must be able to convert the signed data types to an unsigned data type without changing the bit pattern.
Off topic: I do realize that an alternative solution would be to use the PostgreSQL bit string data types bit and varbit instead of the integer data types, but for my purposes the integer data types are much easier to use and manage.
Upvotes: 2
Views: 3535
Reputation: 13
What about ...
int s1 = -3730470;
unsigned int u1 = *(unsigned int*)&s1;
unsigned int u2 = 4291236826;
int s2a = *(int*)&u2;
Upvotes: 1
Reputation: 154218
a safe way to convert a signed integer to an unsigned integer while always maintaining the exact same bit pattern between conversions
A union will work as below even if int uses a rare non-2's complement representation. Only on very exceptional platforms (ticking away in a silicon graveyard) where INT_MAX == UINT_MAX will this be a problem.
union {
    int i;
    unsigned u;
} x = { some_int };
printf("%d\n", some_int);
printf("%u\n", x.u);
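The union approach can also be wrapped in two small helpers so that the round trip is easy to check. This is only a sketch: the names bits_of_int and int_of_bits are hypothetical, and the concrete values in the note after the code assume a 32-bit 2's complement int.

```c
/* Reinterpret the object representation of an int as unsigned.
   bits_of_int is a hypothetical helper name, not from the original post. */
unsigned bits_of_int(int i) {
    union {
        int i;
        unsigned u;
    } x;
    x.i = i;      /* store through the signed member */
    return x.u;   /* read the same bytes through the unsigned member */
}

/* Reverse direction: reinterpret unsigned bits as an int.
   int_of_bits is likewise a hypothetical helper name. */
int int_of_bits(unsigned u) {
    union {
        int i;
        unsigned u;
    } x;
    x.u = u;      /* store through the unsigned member */
    return x.i;   /* read the same bytes through the signed member */
}
```

Under those assumptions, bits_of_int(-3730470) should yield 4291236826, and int_of_bits(bits_of_int(v)) should give back v.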
Yet if one can limit oneself to the common 2's complement int, the below is sufficient.
unsigned u = (unsigned) some_int;
But what about bit-wise operators like OR (case B below)?
Can bit-wise OR be used to safely convert signed to unsigned?
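The following | is like a hidden cast. It comes from the integer promotions and the usual arithmetic conversions applied after them (§6.3.1.8), which convert the int operand to unsigned int before the OR is evaluated:
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. C11dr §6.3.1.1 3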
int s1 = -3730470;
unsigned int u1b = (unsigned int)0 | s1;
// is evaluated just like
//   (unsigned int)0 | (unsigned int)s1
// which has the same value as
//   (unsigned int)s1
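The equivalence above can be checked with a small C11 sketch. The TYPE_NAME macro and the two helper functions are hypothetical names introduced only for illustration. _Generic selects a string based on the type of the expression, which shows that the OR is carried out in unsigned int.

```c
/* Maps an expression to the name of its type (C11 _Generic).
   TYPE_NAME is a hypothetical macro used only for this illustration. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    unsigned int: "unsigned int", \
    default: "other")

/* Returns the name of the type of the expression `0u | s`. */
const char *type_of_or(int s) {
    return TYPE_NAME(0u | s);
}

/* Returns 1 when the OR form and the plain cast produce the same value. */
int or_matches_cast(int s) {
    return (0u | s) == (unsigned int)s;
}
```

Calling type_of_or with any int should return "unsigned int", and or_matches_cast should return 1, since the OR adds nothing beyond the conversion the cast already performs.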
What about the reverse?
Converting an unsigned int to a signed int is well defined if the value is representable in both types, that is, in [0...INT_MAX]. Converting an out-of-int-range unsigned to int is ...
either the result is implementation-defined or an implementation-defined signal is raised. §6.3.1.3 3
Best to use unsigned types for bit manipulations.
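As an illustration of doing the bit work in an unsigned type, here is a hedged sketch of a 32-bit popcount that takes a signed argument. The function name popcount32 is hypothetical, and the counting method is the common parallel bit-summing trick, not code from the question. The int32_t to uint32_t conversion reduces the value modulo 2^32, and because int32_t is required to be 2's complement, that keeps the bit pattern.

```c
#include <stdint.h>

/* Count the set bits of a signed 32-bit value.
   popcount32 is a hypothetical name chosen for this sketch. */
int popcount32(int32_t value) {
    /* Well-defined conversion: the result is value modulo 2^32. */
    uint32_t v = (uint32_t)value;

    v = v - ((v >> 1) & 0x55555555u);                  /* sums of 2-bit groups */
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);  /* sums of 4-bit groups */
    v = (v + (v >> 4)) & 0x0F0F0F0Fu;                  /* sums of 8-bit groups */

    /* Add the four byte sums; the total ends up in the top byte. */
    return (int)((uint32_t)(v * 0x01010101u) >> 24);
}
```

For the bit pattern used in the question, popcount32(-3730470) should return 21.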
The below code may often work as hoped, but should not be used for robust coding.
// NOTE: assuming 32bit ints, etc.
unsigned int u2 = 4291236826;
int s2a = (int)u2; // avoid this
Alternative
int s2a;
if (u2 > INT_MAX) {
    // Handle with some other code
} else {
    s2a = (int) u2; // OK
}
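One possible way to fill in the "handle with some other code" branch is sketched below. It assumes a 2's complement int with UINT_MAX == 2 * INT_MAX + 1, and the function name int_from_unsigned is hypothetical. It avoids the implementation-defined conversion by only ever converting values that fit in int.

```c
#include <limits.h>

/* Map an unsigned value to the int with the same 2's complement bit pattern.
   Assumes UINT_MAX == 2 * INT_MAX + 1. int_from_unsigned is a hypothetical name. */
int int_from_unsigned(unsigned int u) {
    if (u <= (unsigned int)INT_MAX) {
        return (int)u;  /* value fits, so the conversion is exact */
    }
    /* Here u > INT_MAX, so UINT_MAX - u lies in [0, INT_MAX] and fits in int.
       Negating it and subtracting 1 gives u - 2^N without overflow. */
    return -(int)(UINT_MAX - u) - 1;
}
```

With 32-bit ints, int_from_unsigned(4291236826u) should give -3730470, and int_from_unsigned(UINT_MAX) should give -1.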
BTW: better to append u to unsigned constants like 4291236826 to convey to the compiler that an unsigned constant is indeed intended, and not a long or long long, which is the type an unsuffixed 4291236826 gets when it does not fit in int.
unsigned int u2 = 4291236826u;
Upvotes: 3