hetelek

Reputation: 3886

Structs Being Weird - C++

I have been having a lot of trouble with this stupid struct. I don't see why it is doing this, and I am really not sure how to fix it. The only fix I know of is to remove the struct and do it some other way (which I don't want to do).

So I am reading data from a file, and I am reading it into a struct array all at once. It seems like the offset of my 'long long' gets messed up every time. Details are below.

So here is my struct:

struct Entry
{
    unsigned short type;
    unsigned long long identifier;
    unsigned int offset_specifier, length;
};

And here is my code for reading all the crap into the struct pointer/array:

Entry *entries = new Entry[SOME_DYNAMIC_AMOUNT];
fread(entries, sizeof(Entry), SOME_DYNAMIC_AMOUNT, openedFile);

As you can see, I read all of that into my struct array. Now I will show you the data I am reading (for the first struct in this example).

[Image: the bytes I am reading]

So this is the data that is going into the first element of 'entries'. The first item (the short, 'type') seems to be read fine. After that, when 'identifier' is read, it seems like the rest of the struct is shifted by some number of bytes. Here is a picture of the first element (after reversing the endianness):

[Image: the first element of 'entries']

And here is the data in memory (the red square is where it begins):

[Image: the data in memory]

I know that was a bit confusing, but I tried to explain it as well as possible. Thanks for any help, Hetelek. :)

Upvotes: 1

Views: 356

Answers (2)

James M

Reputation: 16718

Structures are padded with extra bytes so that each field is properly aligned, which makes the fields faster to access. You can prevent this with #pragma pack:

#pragma pack(push, 1)

struct Entry
{
    /* ... */
};

#pragma pack(pop)

Note that this might not be 100% portable (I know that at least GCC and MSVC support it for x86).
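As a rough illustration of what packing changes, the sketch below declares the same fields twice, once with the default layout and once under #pragma pack(push, 1). The names PaddedEntry and PackedEntry are made up for this example, and the fixed-width types from <cstdint> are an assumption about the file format (2, 8, 4, and 4 bytes), not something the question states. The static_assert lines only hold on compilers that honor #pragma pack.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width integer types

// Default layout: the compiler is free to insert padding after `type`
// so that `identifier` is suitably aligned.
struct PaddedEntry {
    std::uint16_t type;
    std::uint64_t identifier;
    std::uint32_t offset_specifier, length;
};

// Packed layout: members are placed back to back with no padding.
#pragma pack(push, 1)
struct PackedEntry {
    std::uint16_t type;
    std::uint64_t identifier;
    std::uint32_t offset_specifier, length;
};
#pragma pack(pop)

// Fail the build if the compiler did not pack the struct as expected.
static_assert(sizeof(PackedEntry) == 18, "PackedEntry should be 2 + 8 + 4 + 4 bytes");
static_assert(offsetof(PackedEntry, identifier) == 2, "identifier should directly follow type");
```

With the packed version, fread(entries, sizeof(PackedEntry), count, file) consumes exactly 18 bytes per record, though the fields still arrive in whatever byte order the file uses.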

Upvotes: 6

Keith Thompson

Reputation: 263267

Reading and writing structs to a file in binary is perilous.

The problem you're running into here is that the compiler inserts padding (needed for alignment) between the type and identifier members of your structure. Apparently whatever program wrote the data (which you haven't told us about) used a different layout than the program that's trying to read the data.

This could happen if the two systems (the one writing the data and the one reading it) have different alignment requirements, and therefore different layouts for the Entry type.
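One way to see the padding on a given system is to ask the compiler where it put each member. This is only a diagnostic sketch using the struct from the question; print_layout and padding_before_identifier are hypothetical helper names, and the numbers printed depend on the compiler and target.

```cpp
#include <cstddef>   // offsetof, std::size_t
#include <cstdio>    // std::printf

struct Entry {
    unsigned short type;
    unsigned long long identifier;
    unsigned int offset_specifier, length;
};

// Number of bytes the compiler inserted between `type` and `identifier`.
constexpr std::size_t padding_before_identifier() {
    return offsetof(Entry, identifier) - sizeof(unsigned short);
}

// Prints the total size and the offset of every member for this build.
void print_layout() {
    std::printf("sizeof(Entry)    = %zu\n", sizeof(Entry));
    std::printf("type             @ %zu\n", offsetof(Entry, type));
    std::printf("identifier       @ %zu\n", offsetof(Entry, identifier));
    std::printf("offset_specifier @ %zu\n", offsetof(Entry, offset_specifier));
    std::printf("length           @ %zu\n", offsetof(Entry, length));
    std::printf("padding before identifier = %zu\n", padding_before_identifier());
}
```

If the offsets printed here differ from the positions of the fields in the file, a direct fread into Entry will put bytes in the wrong members.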

Alignment is not the only potential problem, though; differences in endianness can also be a serious problem. Different systems might have differing sizes for the predefined integer types. You can't assume that struct Entry will have a consistent layout unless all the code that deals with it runs on a single system -- and ideally with the same version of the same compiler.

You might be able to use #pragma pack to work around this, but I don't recommend it. It's not portable, and it can be unsafe. At best, it will work around the problem of padding between members; there are still plenty of ways the layout can vary from one system to another.

It's impossible to give you a definitive solution without knowing where and how the data layout of the file you're reading is defined.

If we assume that the file layout for each record is, for example:

  • A 2-byte unsigned integer in network byte order (type)
  • An 8-byte integer in network byte order (identifier)
  • Two 4-byte integers in network byte order (offset_specifier, length)
  • with no padding between them

then you should either read the data into an unsigned char[] buffer, or into objects of type uint16_t, uint32_t, and uint64_t (defined in <cstdint> or <stdint.h>), and then translate it from network byte order to local byte order.

You can wrap this conversion in a function that reads from the file and converts the data, storing it in an Entry struct.
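Here is a minimal sketch of such a function, assuming the example layout above (18 bytes per record, big-endian, no padding). The real file format may differ. The helper names read_be, decode_entry, and read_entry, the constant kRecordSize, and the use of fixed-width types in Entry are all choices made for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct Entry {
    std::uint16_t type;
    std::uint64_t identifier;
    std::uint32_t offset_specifier, length;
};

// Bytes per record in the file under the assumed layout: 2 + 8 + 4 + 4.
const std::size_t kRecordSize = 18;

// Assembles an n-byte big-endian (network order) unsigned value, n <= 8.
std::uint64_t read_be(const unsigned char *p, std::size_t n) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < n; ++i)
        value = (value << 8) | p[i];
    return value;
}

// Decodes one record from its raw file bytes. The in-memory layout of
// Entry does not matter here because each member is assigned separately.
Entry decode_entry(const unsigned char *raw) {
    Entry e;
    e.type = static_cast<std::uint16_t>(read_be(raw, 2));
    e.identifier = read_be(raw + 2, 8);
    e.offset_specifier = static_cast<std::uint32_t>(read_be(raw + 10, 4));
    e.length = static_cast<std::uint32_t>(read_be(raw + 14, 4));
    return e;
}

// Reads one record from `file`; returns false on a short read or error.
bool read_entry(std::FILE *file, Entry &out) {
    assert(file != nullptr);
    unsigned char raw[kRecordSize];
    if (std::fread(raw, 1, kRecordSize, file) != kRecordSize)
        return false;
    out = decode_entry(raw);
    return true;
}
```

Calling read_entry in a loop fills the entries array one record at a time. Because the bytes are combined with shifts rather than copied into the struct, the result is the same on little-endian and big-endian hosts.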

If you're able to assume that the program will only run on a restricted set of systems, then you can bypass some of this. For example, you might be able to tweak the declaration of struct Entry so it matches the file format, and read and write it directly. Doing so will mean your code isn't portable to some systems. You'll have to decide which price you're willing to pay.

Upvotes: 3
