Mike

Reputation: 582

C convert signed to unsigned maintaining exact bits

Edit: I updated the example to be C. I am concerned specifically with C and not C++ (sorry for the confusion, see situation below).

I am looking for a safe way to convert a signed integer to an unsigned integer while always preserving the exact bit pattern across the conversion. As I understand it, a plain cast has undefined or implementation-defined behavior, so it is not safe to rely on (case A below). But what about bitwise operators like OR (case B below)? Can bitwise OR be used to safely convert signed to unsigned? What about the reverse?

Example:

#include <stdio.h>

int main() {
    // NOTE: assuming 32bit ints
    // example bit pattern: 11111111110001110001001111011010
    //   signed int value: -3730470
    // unsigned int value: 4291236826

    // example 1
    // signed -> unsigned
    int s1 = -3730470; 
    unsigned int u1a = (unsigned int)s1;
    unsigned int u1b = (unsigned int)0 | s1;

    printf("%u\n%u\n", u1a, u1b);

    // example 2
    // unsigned -> signed
    unsigned int u2 = 4291236826;
    int s2a = (int)u2;
    int s2b = (int)0 | u2;

    printf("%i\n%i\n", s2a, s2b);
}

Situation: I am writing a PostgreSQL C-Language function/extension to add popcount functionality (my first attempt code here). PostgreSQL does not support unsigned types (ref). All the efficient methods of calculating popcount I found require unsigned data types to work correctly. Therefore, I must be able to convert the signed data types to an unsigned data type without changing the bit pattern.

Off topic: I do realize that an alternative solution would be to use the PostgreSQL bit string data types (bit and varbit) instead of the integer data types, but for my purposes the integer data types are much easier to use and manage.

Upvotes: 2

Views: 3535

Answers (2)

Alexander Grissik

Reputation: 13

What about ...

int s1 = -3730470; 
unsigned int u1 = *(unsigned int*)&s1;

unsigned int u2 = 4291236826;
int s2a = *(int*)&u2;
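
A commonly used alternative to the pointer cast is to copy the object representation with memcpy, which reinterprets the bytes without going through a differently typed pointer. The sketch below is only an illustration; the helper names bits_to_unsigned and bits_to_signed are hypothetical and not part of the original answer.

```c
#include <string.h>

/* Hypothetical helper: reinterpret the bytes of an int as unsigned int. */
static unsigned int bits_to_unsigned(int s)
{
    unsigned int u;
    memcpy(&u, &s, sizeof u);   /* byte-for-byte copy, bit pattern unchanged */
    return u;
}

/* Hypothetical helper: reinterpret the bytes of an unsigned int as int. */
static int bits_to_signed(unsigned int u)
{
    int s;
    memcpy(&s, &u, sizeof s);   /* byte-for-byte copy, bit pattern unchanged */
    return s;
}
```

Because both functions only copy bytes, a round trip such as bits_to_signed(bits_to_unsigned(x)) should give back x. Optimizing compilers typically reduce such a fixed-size memcpy to a plain register move.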

Upvotes: 1

chux

Reputation: 154218

a safe way to convert a signed integer to an unsigned integer while always maintaining the exact same bit pattern between conversions

A union will work as below, even if int uses a rare non-2's complement representation. Only on very exceptional platforms (ticking away in a silicon graveyard) where INT_MAX == UINT_MAX will this be a problem.

union {
  int i;
  unsigned u;
} x = { some_int };
printf("%d\n", some_int);
printf("%u\n", x.u);

Yet if one can limit oneself to common 2's complement int, the below is sufficient.

unsigned u = (unsigned) some_int;
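
Applied to the popcount use case from the question, the cast might be used as in the sketch below. It assumes the fixed-width types from <stdint.h> (int32_t is required to be 2's complement when it exists) and uses a simple clear-the-lowest-bit loop rather than any particular optimized method. The names popcount32 and popcount_int32 are hypothetical.

```c
#include <stdint.h>

/* Count set bits in an unsigned 32-bit value. */
static int popcount32(uint32_t v)
{
    int count = 0;
    while (v != 0) {
        v &= v - 1;   /* clears the lowest set bit */
        count++;
    }
    return count;
}

/* Hypothetical wrapper for a signed input: the cast to uint32_t keeps
   the 2's complement bit pattern, so the count matches the stored bits. */
static int popcount_int32(int32_t x)
{
    return popcount32((uint32_t)x);
}
```

All bit manipulation happens on the unsigned copy, so no shift or mask is ever applied to a negative value.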

But what about bit-wise operators like OR (case B below)?
Can bit-wise OR be used to safely convert signed to unsigned?

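The | below acts like a hidden cast. After the integer promotions quoted here, the usual arithmetic conversions (C11dr §6.3.1.8) convert the int operand to unsigned int: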

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. C11dr §6.3.1.1 3

int s1 = -3730470;
unsigned int u1b = (unsigned int)0 | s1;
// is equivalent to
//   u1b = (unsigned int)0 | (unsigned int)s1;
// which is simply
//   u1b = (unsigned int)s1;
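
To make the equivalence concrete, here is a small sketch comparing the cast, the OR form, and the value rule from C11dr §6.3.1.3 2 (a negative value is converted by adding UINT_MAX + 1). The function names are hypothetical and exist only for illustration.

```c
#include <limits.h>

/* Case A from the question: explicit cast. */
static unsigned int via_cast(int s)
{
    return (unsigned int)s;
}

/* Case B from the question: OR with an unsigned zero.
   The int operand is converted to unsigned int before the OR. */
static unsigned int via_or(int s)
{
    return 0u | s;
}

/* The same result spelled out arithmetically: s + (UINT_MAX + 1)
   for negative s, computed without overflowing int. */
static unsigned int via_arithmetic(int s)
{
    if (s >= 0) {
        return (unsigned int)s;
    }
    return UINT_MAX - (unsigned int)(-(s + 1));
}
```

With 32-bit int, all three should return 4291236826 for -3730470, since -3730470 + 4294967296 = 4291236826.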

What about the reverse?

Converting an unsigned int to a signed int is well defined if the value is representable in both types, i.e. it lies in [0...INT_MAX]. Converting an out-of-int-range unsigned to int is ...

either the result is implementation-defined or an implementation-defined signal is raised. §6.3.1.3 3

Best to use unsigned types for bit manipulations.
The below code may often work as hoped, but should not be used for robust coding.

// NOTE: assuming 32bit ints, etc.
unsigned int u2 = 4291236826;
int s2a = (int)u2;  // avoid this

Alternative

int s2a;
if (u2 > INT_MAX) {
  // Handle with some other code
} else {
  s2a = (int) u2; // OK
}
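
One possible way to fill in the "some other code" branch is sketched below. It assumes a 2's complement int with UINT_MAX == 2*INT_MAX + 1 and INT_MIN == -INT_MAX - 1, and the helper name unsigned_to_int is hypothetical. Every intermediate value stays in range, so no out-of-range conversion is performed.

```c
#include <limits.h>

/* Hypothetical helper: map an unsigned value to the int that has the same
   2's complement bit pattern.
   Assumes UINT_MAX == 2*INT_MAX + 1 and INT_MIN == -INT_MAX - 1. */
static int unsigned_to_int(unsigned int u)
{
    if (u <= INT_MAX) {
        return (int)u;                 /* value fits, conversion is exact */
    }
    /* Here UINT_MAX - u is in [0, INT_MAX], so the cast is in range,
       and the final result is in [INT_MIN, -1]. */
    return -(int)(UINT_MAX - u) - 1;
}
```

For the value from the question, UINT_MAX - 4291236826 is 3730469, so the function returns -3730469 - 1 = -3730470 when int is 32 bits wide.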

BTW: it is better to append u to unsigned constants like 4291236826. The suffix tells the compiler that an unsigned constant is intended; without it, a value that does not fit in int gets a wider signed type such as long or long long.

unsigned int u2 = 4291236826u;

Upvotes: 3
