chqrlie

Reputation: 144655

Multiple inconsistent behaviors of signed bit-fields

I have come across some strange behavior with signed bit-fields:

#include <stdio.h>

struct S {
    long long a31 : 31;
    long long a32 : 32;
    long long a33 : 33;
    long long : 0;
    unsigned long long b31 : 31;
    unsigned long long b32 : 32;
    unsigned long long b33 : 33;
};

long long f31(struct S *p) { return p->a31 + p->b31; }
long long f32(struct S *p) { return p->a32 + p->b32; }
long long f33(struct S *p) { return p->a33 + p->b33; }

int main() {
    struct S s = { -2, -2, -2, 1, 1, 1 };
    long long a32 = -2;
    unsigned long long b32 = 1;
    printf("f31(&s)       => %lld\n", f31(&s));
    printf("f32(&s)       => %lld\n", f32(&s));
    printf("f33(&s)       => %lld\n", f33(&s));
    printf("s.a31 + s.b31 => %lld\n", s.a31 + s.b31);
    printf("s.a32 + s.b32 => %lld\n", s.a32 + s.b32);
    printf("s.a33 + s.b33 => %lld\n", s.a33 + s.b33);
    printf("  a32 +   b32 => %lld\n",   a32 +   b32);
    return 0;
}

Using Clang on OS X, I get this output:

f31(&s)       => -1
f32(&s)       => 4294967295
f33(&s)       => -1
s.a31 + s.b31 => 4294967295
s.a32 + s.b32 => 4294967295
s.a33 + s.b33 => -1
  a32 +   b32 => -1

Using GCC on Linux, I get this:

f31(&s)       => -1
f32(&s)       => 4294967295
f33(&s)       => 8589934591
s.a31 + s.b31 => 4294967295
s.a32 + s.b32 => 4294967295
s.a33 + s.b33 => 8589934591
  a32 +   b32 => -1

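The above output shows 3 types of inconsistencies:

  • different behavior for different compilers (f33(&s) and s.a33 + s.b33 differ between Clang and GCC);
  • different behavior for different bit-field widths;
  • different behavior for the same expression evaluated inline and inside a function (f31(&s) versus s.a31 + s.b31).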

The C Standard has this language:

6.7.2 Type specifiers

...

Each of the comma-separated multisets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.

Bit-fields are notoriously broken in many older compilers...
Is the behavior of Clang and GCC conformant or are these inconsistencies the result of one or more bugs?
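
As a point of reference for reading the outputs above, the two large values are exactly what -1 (that is, -2 + 1) becomes when reduced modulo 2^32 and 2^33. The sketch below only reproduces that arithmetic. wrap_to_width is a hypothetical helper introduced for illustration; it is not a claim about where, or whether, either compiler performs such a reduction.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reduce a signed value modulo 2^width, 1 <= width <= 64. */
uint64_t wrap_to_width(int64_t value, unsigned width) {
    assert(width >= 1 && width <= 64);
    /* Converting to an unsigned type is well defined: it reduces modulo 2^64. */
    uint64_t bits = (uint64_t)value;
    if (width == 64) {
        return bits;
    }
    /* Keep only the low `width` bits. */
    return bits & ((UINT64_C(1) << width) - 1);
}
```

Under these assumptions, wrap_to_width(-2 + 1, 32) yields 4294967295 and wrap_to_width(-2 + 1, 33) yields 8589934591, the same numbers that appear in the listings.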

Upvotes: 6

Views: 287

Answers (2)

autistic

Reputation: 15632

Is the behavior of Clang and GCC conformant or are these inconsistencies the result of one or more bugs?

I think the fault most likely lies in your code, to be honest. According to 6.7.2.1p5:

A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.

There's no mention of long long here, so we can't necessarily treat this code as conformant to begin with. It seems that some compilers have documented support (for example, some ARM clang targets), whereas others are happy to let the behaviour be undefined (for example, gcc manuals don't appear to list long long in the category of "Allowable bit-field types other than _Bool, signed int, and unsigned int (C99 and C11 6.7.2.1)").

Furthermore, according to 6.3.1.1p2:

The following may be used in an expression wherever an int or unsigned int may be used:

  • An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
  • A bit-field of type _Bool, int, signed int, or unsigned int.

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int.

In other words, it isn't enough for the compiler merely to support these bit-field types; it also has to apply appropriate conversions so that expressions involving them end up with a well-defined type. Specifically, this code looks utterly terrifying, because %lld tells printf to expect a long long int, whereas I think you may only be passing an int (or perhaps an unsigned int):

printf("s.a31 + s.b31 => %lld\n", s.a31 + s.b31);
printf("s.a32 + s.b32 => %lld\n", s.a32 + s.b32);
printf("s.a33 + s.b33 => %lld\n", s.a33 + s.b33);
printf("  a32 +   b32 => %lld\n",   a32 +   b32);

I figured I'd sign off quoting my expected result of this hairy looking code above:

If a conversion specification is invalid, the behavior is undefined. If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

-- C11/7.21.6.1p9
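
To make the promotion rule in 6.3.1.1p2 concrete, here is a minimal sketch that uses only the standard bit-field types, so nothing in it relies on implementation-defined long long bit-fields. The struct Narrow and all function names are made up for illustration and do not come from the question. It uses C11 _Generic to observe the type of the sum, and casts each operand before adding so that the argument really is a long long when it reaches %lld.

```c
#include <assert.h>
#include <stdio.h>

/* Illustrative struct (not from the question): both fields are narrower than int. */
struct Narrow {
    unsigned int u7 : 7;
    signed int   s7 : 7;
};

/* Returns 1 if u7 + s7 has type int after the integer promotions, 0 otherwise. */
int narrow_sum_is_int(const struct Narrow *n) {
    return _Generic(n->u7 + n->s7, int: 1, default: 0);
}

/* Widen each operand explicitly, so the addition is carried out in long long. */
long long narrow_sum_widened(const struct Narrow *n) {
    return (long long)n->u7 + (long long)n->s7;
}

/* Prints the sum with a conversion specifier that matches the argument type. */
void print_narrow_sum(const struct Narrow *n) {
    assert(n != NULL);
    printf("u7 + s7 => %lld\n", narrow_sum_widened(n));
}
```

With struct Narrow n = { 1, -2 };, narrow_sum_is_int(&n) returns 1, because both 7-bit fields fit in an int, and print_narrow_sum(&n) prints u7 + s7 => -1.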

Upvotes: 2

sidcoder

Reputation: 460

Please have a look at the proposed code below, which works correctly and as expected.

For practical purposes, I would suggest simply making sure that

  • compatible types are added,
  • correct types are returned and
  • correct types are in the printf statement.

That's it.

For more information, see also references [1] and [2] below.

#include <stdio.h>

struct S {
    long long a31 : 31;
    long long a32 : 32;
    long long a33 : 33;
    
    unsigned long long b31 : 31;
    unsigned long long b32 : 32;
    unsigned long long b33 : 33;
};

long long f31(struct S *p) { return ((long long)p->a31 + (long long)p->b31); }
long long f32(struct S *p) { return ((long long)p->a32 + (long long)p->b32); }
long long f33(struct S *p) { return ((long long)p->a33 + (long long)p->b33); }

int main() {
    struct S s = { -2, -2, -2, 1, 1, 1 };
    long long a32 = -2;
    unsigned long long b32 = 1;
    
    printf("p->a31       => %lld\n", (long long)(s.a31));
    printf("p->a32       => %lld\n", (long long)(s.a32));
    printf("p->a33       => %lld\n", (long long)(s.a33));
    
    printf("p->b31       => %lld\n", (long long)(s.b31));
    printf("p->b32       => %lld\n", (long long)(s.b32));
    printf("p->b33       => %lld\n", (long long)(s.b33));
    
    
    printf("f31(&s)       => %lld\n", (long long)(f31(&s)));
    printf("f32(&s)       => %lld\n", (long long)(f32(&s)));
    printf("f33(&s)       => %lld\n", (long long)(f33(&s)));
    printf("s.a31 + s.b31 => %lld\n", ((long long)s.a31 + (long long)s.b31));
    printf("s.a32 + s.b32 => %lld\n", ((long long)s.a32 + (long long)s.b32));
    printf("s.a33 + s.b33 => %lld\n", ((long long)s.a33 + (long long)s.b33));
    printf("  a32 +   b32 => %lld\n", (long long) (a32 +   b32));
    return 0;
}

p->a31       => -2
p->a32       => -2
p->a33       => -2
p->b31       => 1
p->b32       => 1
p->b33       => 1
f31(&s)       => -1
f32(&s)       => -1
f33(&s)       => -1
s.a31 + s.b31 => -1
s.a32 + s.b32 => -1
s.a33 + s.b33 => -1
  a32 +   b32 => -1

References

[1] Signed to unsigned conversion in C - is it always safe?

[2] https://www.geeksforgeeks.org/bit-fields-c/ "We cannot have pointers to bit field members as they may not start at a byte boundary."

Upvotes: 0
