Reputation: 93
#include <stdio.h>
#include <stddef.h>

struct struct_a {
    int x;
    float y;
    char a;
    double z;
};

int main(void) {
    /* sizeof and offsetof both yield size_t, so %zu is the matching format */
    printf("sizeof struct %zu\n", sizeof(struct struct_a));
    printf("Offset of x: %zu\n", offsetof(struct struct_a, x));
    printf("Offset of y: %zu\n", offsetof(struct struct_a, y));
    printf("Offset of a: %zu\n", offsetof(struct struct_a, a));
    printf("Offset of z: %zu\n", offsetof(struct struct_a, z));
    return 0;
}
Compiled for 64-bit (default or -m64 option):
sizeof struct 24
Offset of x: 0
Offset of y: 4
Offset of a: 8
Offset of z: 16
7 bytes of padding are inserted after a to align z at offset 16 (which is a multiple of 8).
Compiled for 32-bit (-m32 option):
sizeof struct 20
Offset of x: 0
Offset of y: 4
Offset of a: 8
Offset of z: 12
3 bytes of padding are inserted after a to align z at offset 12.
Why is double placed at offset 12 in the 32-bit version when it's an 8-byte type?
I expected double to be aligned to an 8-byte boundary, but in 32-bit mode it is placed at offset 12 instead. I understand that a 32-bit system processes data in 4-byte chunks per cycle. Does this influence the alignment of double?
How should I explain structure padding in a job interview?
I have checked related Stack Overflow discussions, but their examples are too complicated for me to follow as a beginner.
Edit: The answers provide a much clearer explanation, so I have confirmed them using the examples below.
I compiled with -m32 and -m64 on my Ubuntu system, with some code changes, and inspected the results in gdb.
// uses the same struct_a from the code above, with the char member named c
a.x = 0xFFFFFF; // top byte should show 0x00, the other three bytes 0xff
a.c = 'A';
*(uint32_t *)&a.y = 0xFFFFFFFF; // Force 0xff bytes for float
*(uint64_t *)&a.z = 0xFFFFFFFFFFFFFFFF; // Force 0xff bytes for double
32-bit (-m32) output:
$1 = {x = 16777215, y = -nan(0x7fffff), c = 65 'A', z = -nan(0xfffffffffffff)}
(gdb) x/20bx &a
0x56559008 <a>: 0xff 0xff 0xff 0x00 0xff 0xff 0xff 0xff
0x56559010 <a+8>: 0x41 0x00 0x00 0x00 0xff 0xff 0xff 0xff
0x56559018 <a+16>: 0xff 0xff 0xff 0xff
// 0x56559008 x, 0x5655900c y, 0x56559010 c, < 3 byte padding >, 0x56559014 z.
(gdb) p &a.x
$6 = (int *) 0x56559008 <a>
(gdb) p &a.y
$7 = (float *) 0x5655900c <a+4>
(gdb) p &a.c
$8 = 0x56559010 <a+8> "A"
(gdb) p &a.z
$9 = (double *) 0x56559014 <a+12>
I tried declaring a as an array of 2 elements:
&a[1].x is at offset 20 because, per the System V i386 ABI, a double (the largest member of the struct) is aligned according to its alignment requirement, which is 4 on the 32-bit architecture, so the struct size stays at 20 with no trailing padding.
(gdb) x/40bx &a
0x56559040 <a>: 0xff 0xff 0xff 0x00 0xff 0xff 0xff 0xff
0x56559048 <a+8>: 0x41 0x00 0x00 0x00 0xff 0xff 0xff 0xff
0x56559050 <a+16>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0x00
0x56559058 <a+24>: 0xff 0xff 0xff 0xff 0x41 0x00 0x00 0x00
0x56559060 <a+32>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
(gdb) p &a[0].z
$2 = (double *) 0x5655904c <a+12>
(gdb) p &a[1].x
$3 = (int *) 0x56559054 <a+20>
64-bit (-m64) output:
$1 = {x = 16777215, y = -nan(0x7fffff), c = 65 'A', z = -nan(0xfffffffffffff)}
(gdb) x/24bx &a
0x555555558010 <a>: 0xff 0xff 0xff 0x00 0xff 0xff 0xff 0xff
0x555555558018 <a+8>: 0x41 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x555555558020 <a+16>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
// 0x555555558010 x, 0x555555558014 y, 0x555555558018 c, < 7 byte padding >, 0x555555558020 z
(gdb) p &a.x
$2 = (int *) 0x555555558010 <a>
(gdb) p &a.y
$3 = (float *) 0x555555558014 <a+4>
(gdb) p &a.c
$4 = 0x555555558018 <a+8> "A"
(gdb) p &a.z
$5 = (double *) 0x555555558020 <a+16>
I then repeated the same experiment with an array, declaring a[2] and assigning the same values to both elements.
a[0].x = 0xFFFFFF; a[1].x = 0xFFFFFF;
a[0].c = 'A'; a[1].c = 'A';
*(uint32_t *)&a[0].y = 0xFFFFFFFF; // Force 0xff bytes for float
*(uint32_t *)&a[1].y = 0xFFFFFFFF; // Force 0xff bytes for float
*(uint64_t *)&a[0].z = 0xFFFFFFFFFFFFFFFF; // Force 0xff bytes for double
*(uint64_t *)&a[1].z = 0xFFFFFFFFFFFFFFFF; // Force 0xff bytes for double
a[0].z begins at 0x555555558050 <a+16>; it is the last member of array element 0.
a[1].x begins at 0x555555558058 <a+24>; it is the start of array element 1. The struct is aligned to its largest member's alignment requirement, which is 8 on the 64-bit architecture.
// GDB output
(gdb) p a
$1 = {{x = 16777215, y = -nan(0x7fffff), c = 65 'A', z = -nan(0xfffffffffffff)}, {x = 16777215, y = -nan(0x7fffff), c = 65 'A', z = -nan(0xfffffffffffff)}}
(gdb) x/48bx &a
0x555555558040 <a>: 0xff 0xff 0xff 0x00 0xff 0xff 0xff 0xff
0x555555558048 <a+8>: 0x41 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x555555558050 <a+16>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x555555558058 <a+24>: 0xff 0xff 0xff 0x00 0xff 0xff 0xff 0xff
0x555555558060 <a+32>: 0x41 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x555555558068 <a+40>: 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff
Upvotes: 4
Views: 120
Reputation: 145277
2. How should I explain structure padding in a job interview?
Given that compiler behavior may differ across platforms, what is the best way to frame my answer when asked about struct alignment and padding? Should I focus on general rules (like aligning to the largest type) or explain that alignment can vary depending on architecture and compiler optimizations?
You should show that you understand the alignment requirements for specific types larger than char that may lead to padding between structure members and/or after the last member, but never before the first member.
The alignment requirements are target specific, or more precisely ABI specific: the ABI fixes them to ensure interoperability among different compilers that lay out structures from the same C source definition. You can use as an example the compiler that produces an executable and the kernel that fills in a structure in kernel space that is then copied to user space via a system call (stat or ioctl calls). These can use different programming languages and compilers but must adhere to the same ABI.
Padding causes structures to use extra space in memory, which may result in early memory exhaustion or other performance reduction. When designing your own structures you might be able to reduce padding by reordering members, grouping small types together. Optimizing the structure layout for 32-bit and 64-bit ABIs is not very difficult, but explaining why and how will show your interviewer a good skill level in the C language, which can be seen from your detailed question. You did not know about the specifics of double alignment on 32-bit Intel targets; that's no big deal, and now you know. Keep learning, that's a very valuable skill.
Upvotes: 1
Reputation: 141698
Why is double placed at offset 12 in the 32-bit version when it's an 8-byte type?
Because it is specified that way.
The System V i386 ABI specification says the alignment of double is 4; see table 2.1 on page 8 of https://www.uclibc.org/docs/psABI-i386.pdf. The char a occupies offset 8, so the next multiple of 4 available for z is 12.
The System V AMD64 ABI, by contrast, specifies the alignment of double to be 8; see table 3.1 on page 12 of https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf.
The compiler lays out structures according to a specification. All compilers for a given platform follow the same specification, so the code they generate can interoperate.
does this influence the alignment of double?
There may be a rationale for the i386 choice, but I wasn't able to find one. According to "Alignment of a struct with two doubles is 4 even though double is aligned to 8 (32bit)", it would be better to align doubles to 8 on i386. But the i386 ABI is what it is: it is very old, and compilers have to keep following it so that their output stays compatible.
Upvotes: 3