user1457712

Reputation: 77

C++ Bits in 64 bit integer

Hello, I have a struct that holds 7 bytes of data, and I'd like to write it into a 64-bit integer. Later, I'd like to extract the struct back out of that 64-bit integer.

Any ideas on this?

#include "stdafx.h"

struct myStruct
{
        unsigned char a;               
        unsigned char b;
        unsigned char c;
        unsigned int someNumber;
};

int _tmain(int argc, _TCHAR* argv[])
{
    myStruct * m = new myStruct();
    m->a = 11;
    m->b = 8;
    m->c = 12;
    m->someNumber = 30;

    printf("\n%s\t\t%i\t%i\t%i\t%i\n\n", "struct", m->a, m->b, m->c, m->someNumber);

    unsigned long long num = 0;   // unsigned long is only 32 bits with MSVC

    // todo: use bitwise operations from m into num (total of 7 bytes)

    printf("%s\t\t%llu\n\n", "ulong", num);

    m = new myStruct();

    // todo: use bitwise operations from num into m;

    printf("%s\t\t%i\t%i\t%i\t%i\n\n", "struct", m->a, m->b, m->c, m->someNumber);

    return 0;
}

Upvotes: 1

Views: 2723

Answers (3)

user1457712

Reputation: 77

Got it.

static unsigned long long compress(unsigned char a, unsigned char b, unsigned char c, unsigned int someNumber)
{
    // unsigned parameters avoid sign extension when OR-ing into x
    unsigned long long x = 0;
    x = x | a;
    x = x << 8;
    x = x | b;
    x = x << 8;
    x = x | c;
    x = x << 32;
    x = x | someNumber;
    return x;
}

myStruct * decompress(unsigned long long x)
{
    printBinary(x);                   // debug helper defined elsewhere (not shown)
    myStruct * m = new myStruct();
    m->someNumber = x & 0xFFFFFFFF;   // low 32 bits
    x = x >> 32;
    m->c = x & 0xFF;                  // next 8 bits
    x = x >> 8;
    m->b = x & 0xFF;
    x = x >> 8;
    m->a = x & 0xFF;
    return m;
}
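
One pitfall when OR-ing narrow values into a wider integer: a signed 8-bit value is promoted with sign extension before the OR, so a byte above 127 sets all the higher bits as well. A minimal sketch of the effect, where `pack_signed`, `pack_unsigned`, and `check_sign_extension` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: ORs a signed byte into a 64-bit value.
// The byte is promoted to int first, so negative values sign-extend.
inline std::uint64_t pack_signed(signed char byte)
{
    std::uint64_t x = 0;
    x |= byte;
    return x;
}

// Hypothetical helper: the same operation with an unsigned byte.
// The value is zero-extended, so only the low 8 bits can be set.
inline std::uint64_t pack_unsigned(unsigned char byte)
{
    std::uint64_t x = 0;
    x |= byte;
    return x;
}

// Self-check of the behaviour described above.
inline void check_sign_extension()
{
    // Bit pattern 0x80 as a signed byte is -128 and drags the upper bits along.
    assert(pack_signed(static_cast<signed char>(-128)) == 0xFFFFFFFFFFFFFF80ULL);
    // The same bit pattern as an unsigned byte stays confined to the low 8 bits.
    assert(pack_unsigned(0x80) == 0x80ULL);
}
```

Whether plain `char` is signed depends on the platform, so taking the bytes as `unsigned char` (or `uint8_t`) sidesteps the question.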

Upvotes: 0

user1084944

Reputation:

You should do something like this:

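#include <stddef.h>
#include <stdint.h>

class structured_uint64
{
    uint64_t data;
public:
    structured_uint64(uint64_t x = 0):data(x) {}
    operator uint64_t&() { return data; }
    // read byte n (0 = least significant)
    uint8_t low_byte(size_t n) const { return static_cast<uint8_t>(data >> (n * 8)); }
    // overwrite byte n with val, leaving the other bits untouched
    void low_byte(size_t n, uint8_t val) {
        uint64_t mask = static_cast<uint64_t>(0xff) << (8 * n);
        data = (data & ~mask) | (static_cast<uint64_t>(val) << (8 * n));
    }
    // the 32-bit field stored just above the three low bytes
    uint32_t hi_word() const { return static_cast<uint32_t>(data >> 24); }
    // et cetera
};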

(there is, of course, lots of room for variation on the details of the interface and where among the 64 bits the constituents are placed)

Using different types to alias the same portion of memory is generally a bad idea. The thing is, it's very valuable for the optimizer to be able to use reasoning like:

"Okay, I've read a uint64_t at the start of this block, and nowhere in the middle does the program write to any uint64_ts, therefore the value must be unchanged!"

which means it will get the wrong answer if you try to change the value of the uint64_t object through a uint32_t reference. And since this depends heavily on which optimizations are possible and actually performed, it is pretty easy to never run across the problem in test cases but then hit it in the real program you're trying to write, and you'll spend forever trying to find the bug because you convinced yourself it's not this problem.

So, you really should do the insertion/extraction of the fields with bit twiddling (or intrinsics, if profiling shows that this is a performance issue and there are useful ones available) rather than trying to set up a clever struct.
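
As a rough sketch of the bit-twiddling approach with fixed-width types: the layout below (a, b, c, then someNumber, from most to least significant) is just one possible choice, and the names `Fields`, `pack`, `unpack`, and `check_round_trip` are made up for this example rather than taken from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-width version of the struct from the question.
struct Fields
{
    std::uint8_t  a;
    std::uint8_t  b;
    std::uint8_t  c;
    std::uint32_t someNumber;
};

// Assumed layout: bits 55..48 = a, 47..40 = b, 39..32 = c, 31..0 = someNumber.
// The top byte (bits 63..56) is left as zero.
inline std::uint64_t pack(const Fields& f)
{
    return (static_cast<std::uint64_t>(f.a) << 48)
         | (static_cast<std::uint64_t>(f.b) << 40)
         | (static_cast<std::uint64_t>(f.c) << 32)
         |  static_cast<std::uint64_t>(f.someNumber);
}

// Reverse of pack(): shift each field down and mask off everything else.
inline Fields unpack(std::uint64_t x)
{
    Fields f;
    f.a          = static_cast<std::uint8_t>((x >> 48) & 0xFF);
    f.b          = static_cast<std::uint8_t>((x >> 40) & 0xFF);
    f.c          = static_cast<std::uint8_t>((x >> 32) & 0xFF);
    f.someNumber = static_cast<std::uint32_t>(x & 0xFFFFFFFFULL);
    return f;
}

// Self-check using the values from the question (11, 8, 12, 30).
inline void check_round_trip()
{
    const Fields in{11, 8, 12, 30};
    const Fields out = unpack(pack(in));
    assert(out.a == 11 && out.b == 8 && out.c == 12 && out.someNumber == 30);
}
```

Because only shifts and masks on a single `uint64_t` are involved, the result does not depend on struct padding, byte order, or aliasing rules.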

If you really know what you're doing, you can make the aliasing work, I believe. But it should only be done if you really know what you're doing, and that includes knowing relevant rules from the standard inside and out (which I don't, and so I can't advise you on how to make it work). And even then you probably shouldn't do it.

Also, if you intend your integral types to be a specific size, you should really use the correct types. For example, never use unsigned int for an integer that is supposed to be exactly 32 bits. Instead use uint32_t. Not only is it self-documenting, but you won't run into a nasty surprise when you try to build your program in an environment where unsigned int is not 32 bits.

Upvotes: 1

Mark Tolonen

Reputation: 177971

Use a union. All members of a union share the same storage. The struct is one member, the unsigned long long is another.

#include <stdio.h>

union data
{
    struct
    {
        unsigned char a;
        unsigned char b;
        unsigned char c;
        unsigned int d;
    } e;
    unsigned long long f;
};

int main()
{
    data dat;
    dat.f = 0xFFFFFFFFFFFFFFFF;
    dat.e.a = 1;
    dat.e.b = 2;
    dat.e.c = 3;
    dat.e.d = 4;
    printf("f=%016llX\n",dat.f);
    printf("%02X %02X %02X %08X\n",dat.e.a,dat.e.b,dat.e.c,dat.e.d);
    return 0;
}

The output is below. Note that one byte of the original unsigned long long remains: compilers like to align data such as 4-byte integers on addresses divisible by 4, so the three bytes are followed by a pad byte, which puts the integer at offset 4 and gives the struct a total size of 8.

f=00000004FF030201
01 02 03 00000004
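
The padding can also be inspected directly with `sizeof` and `offsetof`. A small sketch, assuming a typical platform where `unsigned int` is 4 bytes with 4-byte alignment; the names `Unpacked`, `padding_before_d`, and `check_default_layout` are invented here:

```cpp
#include <cassert>
#include <cstddef>

// Same members as the anonymous struct in the union, default packing.
struct Unpacked
{
    unsigned char a;
    unsigned char b;
    unsigned char c;
    unsigned int  d;
};

// Number of padding bytes the compiler inserted between c and d.
inline std::size_t padding_before_d()
{
    return offsetof(Unpacked, d) - (offsetof(Unpacked, c) + sizeof(unsigned char));
}

// These checks hold under the stated assumption (4-byte, 4-aligned unsigned int);
// other platforms may lay the struct out differently.
inline void check_default_layout()
{
    assert(offsetof(Unpacked, d) == 4);
    assert(sizeof(Unpacked) == 8);
    assert(padding_before_d() == 1);
}
```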

This can be controlled in a compiler-dependent fashion. Below is for Microsoft C++:

#include <stdio.h>

#pragma pack(push,1)
union data
{
    struct
    {
        unsigned char a;
        unsigned char b;
        unsigned char c;
        unsigned int d;
    } e;
    unsigned long long f;
};
#pragma pack(pop)

int main()
{
    data dat;
    dat.f = 0xFFFFFFFFFFFFFFFF;
    dat.e.a = 1;
    dat.e.b = 2;
    dat.e.c = 3;
    dat.e.d = 4;
    printf("f=%016llX\n",dat.f);
    printf("%02X %02X %02X %08X\n",dat.e.a,dat.e.b,dat.e.c,dat.e.d);
    return 0;
}

Note the struct occupies seven bytes now and the highest byte of the unsigned long long is now unchanged:

f=FF00000004030201
01 02 03 00000004
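
Reading a union member other than the one last written is not guaranteed to work in standard C++, even though common compilers accept it. A sketch of the same packed layout using `std::memcpy` instead, assuming a compiler that understands `#pragma pack` and a 4-byte `unsigned int`; `Packed`, `to_u64`, `from_u64`, and `check_packed_round_trip` are names invented for this example:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#pragma pack(push, 1)
struct Packed
{
    unsigned char a;
    unsigned char b;
    unsigned char c;
    unsigned int  d;
};
#pragma pack(pop)

static_assert(sizeof(Packed) <= sizeof(std::uint64_t), "struct must fit in 64 bits");

// Copy the struct bytes into the lowest-addressed bytes of a zeroed 64-bit integer.
inline std::uint64_t to_u64(const Packed& p)
{
    std::uint64_t x = 0;
    std::memcpy(&x, &p, sizeof(Packed));
    return x;
}

// Copy the same bytes back out into a struct.
inline Packed from_u64(std::uint64_t x)
{
    Packed p;
    std::memcpy(&p, &x, sizeof(Packed));
    return p;
}

// Self-check: the packed struct is 7 bytes here and survives a round trip.
inline void check_packed_round_trip()
{
    assert(sizeof(Packed) == 7);
    const Packed in{1, 2, 3, 4};
    const Packed out = from_u64(to_u64(in));
    assert(out.a == 1 && out.b == 2 && out.c == 3 && out.d == 4);
}
```

The numeric value of the integer still depends on byte order: on a little-endian machine the fields 1, 2, 3, 4 give 0x0000000004030201, with the unused top byte left at zero.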

Upvotes: 0
