Reputation: 417

Why padding works based on datatype

I'm trying to understand the behavior of the program below. In C, the amount of padding seems to depend on the datatype of the adjacent member.

#include <stdio.h>
struct abc{
    char a1;
    int a2;
}X;
struct efg
{
    char b1;
    double b2;
}Y;
int main()
{
    printf("Size of X = %zu\n",sizeof(X));   /* sizeof yields size_t, so use %zu */
    printf("Size of Y = %zu\n",sizeof(Y));
    return 0;
}

Output of the program

root@root:~$./mem 
Size of X = 8
Size of Y = 16

In struct abc, 3 bytes of padding are added, whereas in struct efg, 7 bytes are added.

Is this how padding is designed to work?

Upvotes: 0

Answers (1)

Erik Nyquist

Reputation: 1317

Padding is added so that each member is properly aligned, meaning it does not cross a word boundary when it doesn't need to, as some have said in the comments. There is a nice explanation of it here:

http://www.geeksforgeeks.org/structure-member-alignment-padding-and-data-packing/

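The member with the strictest alignment does affect the layout, but not because every member is aligned to it. Each member is placed at an offset that is a multiple of its own alignment requirement (for scalar types this is usually the size of the type). The struct as a whole takes the alignment of its most strictly aligned member, and its total size is rounded up to a multiple of that alignment so that every element of an array of the struct stays aligned. That is why, on your platform, the char in struct abc is followed by 3 bytes of padding (the int must start at a multiple of 4) and the char in struct efg by 7 bytes (the double must start at a multiple of 8).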

Because of this, you can often save space by ordering your struct members by size, with the largest members declared first. Here's some code to illustrate that (I always find that looking at a dump of the actual memory helps with things like this, rather than just the size):

#include <stdio.h>

// Inefficient ordering: each char is followed by padding so that the
// next member (or the next array element) stays aligned.
struct X {
    unsigned long a;   // 8 bytes
    unsigned char b;   // 1 byte + 3 bytes padding
    unsigned int c;    // 4 bytes
    unsigned char d;   // 1 byte + 7 bytes tail padding
};

// By ordering the struct this way, we use the space more
// efficiently. The two chars get packed into a single word.
struct Y {
    unsigned long a;   // 8 bytes
    unsigned int c;    // 4 bytes
    unsigned char b;   // 1 byte
    unsigned char d;   // 1 byte + 2 bytes tail padding
};

struct X x = {.a = 1, .b = 2, .c = 3, .d = 4};
struct Y y = {.a = 1, .b = 2, .c = 3, .d = 4};

// Print out the data at some memory location, in hex
void print_mem (const void *ptr, size_t num)
{
    size_t i;   // same type as num, avoids a signed/unsigned comparison
    const unsigned char *bptr = ptr;

    for (i = 0; i < num; ++i) {
        printf("%.2X ", bptr[i]);
    }

    printf("\n");
}

int main (void)
{
    print_mem(&x, sizeof(struct X)); // This one will be larger
    print_mem(&y, sizeof(struct Y)); // This one will be smaller
    return 0;
}

And the output from running the above code:

01 00 00 00 00 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 00 00 00 00 
01 00 00 00 00 00 00 00 03 00 00 00 02 04 00 00

There are various subtleties to this, and the details differ between implementations. See http://www.catb.org/esr/structure-packing for a more in-depth treatment of struct ordering and packing.

Upvotes: 2
