Reputation: 21

How does the Linux kernel handle structure padding on the TCP/IP stack?

I'm somewhat familiar with the kernel's socket buffer system, and I searched a lot but I've been unable to find how the kernel handles the problem of struct padding. How does the kernel pack the bytes of the outgoing TCP/IP packet so that code running on a different platform can make sense of it?

When sending data from one machine to another, I know you can't just send your structs as is. Yet that is what looks to be happening with the code in the Linux kernel. What am I missing?

Upvotes: 2

Answers (1)

missimer

Reputation: 4079

Since you did not refer to a specific bit of code, I can only talk about things in general.

I searched a lot but I've been unable to find how the kernel handles the problem of struct padding.

GCC provides mechanisms to ensure there is no padding between struct members. One such mechanism is the packed attribute. This way you can define a struct and know exactly what the memory layout of the struct will be.
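As a minimal sketch of what the packed attribute does, the two structs below have the same members, but only the second is forced to have no padding. The struct and function names are made up for illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example structs, not kernel code. */

/* Default layout: the compiler may insert padding after tag so that
 * value lands on its natural alignment boundary. */
struct natural_layout {
    uint8_t  tag;
    uint32_t value;
};

/* Same members, but GCC/Clang are told to use no padding at all. */
struct packed_layout {
    uint8_t  tag;
    uint32_t value;
} __attribute__((packed));

/* Offset of value in the default layout (platform dependent). */
size_t natural_value_offset(void)
{
    return offsetof(struct natural_layout, value);
}

/* Offset of value in the packed layout (always right after tag). */
size_t packed_value_offset(void)
{
    return offsetof(struct packed_layout, value);
}
```

With the packed version, value always starts at byte 1 and the struct is 5 bytes long. The default layout depends on the target's alignment rules, which is exactly the uncertainty the attribute removes.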

How does the kernel pack the bytes of the outgoing TCP/IP packet so that code running on a different platform can make sense of it?

TCP/IP defines what the memory layout of the TCP and IP headers will be. You can find information on them here.

When sending data from one machine to another, I know you can't just send your structs as is.

Well, actually you can; you just have to be really cautious about how you do it, which is basically what the Linux kernel does. Just sending a struct through, say, a TCP socket to another program with the same struct definition is dangerous for a few reasons. Take the following struct:

#include <stdint.h>

struct my_struct {
    uint32_t foo;
    uint64_t bar;
};

One reason why people say you shouldn't just send a struct is that the memory layout of this struct could be different on different machines or with different compilers. For example, on a 32-bit machine there probably won't be any padding, while on a 64-bit machine there might be 32 bits of padding between foo and bar. I use words like probably and might because the C language does not fix the padding; it follows from the alignment rules of the target platform and the compiler, which usually align a 64-bit member on an 8-byte boundary where that is the natural alignment. Even if both machines are 64-bit, a different compiler or different compiler settings could give different results.

There is also the issue of endianness: if you are on a little-endian machine, you should convert multi-byte fields to big-endian, as that is what network byte order is specified to be.

Another issue to consider, which my example does not show, is that certain types have different sizes, again depending on the compiler and architecture. For example, size_t might be 32 bits on a 32-bit machine and 64 bits on a 64-bit machine, so the same code on a different machine will produce a struct of a different size. However, if you use types that have specific bit widths, as in my example, this is not an issue.
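One way to sidestep all three issues is to never send the struct's memory at all and instead write each field into a byte buffer explicitly. The following is only a sketch under that assumption: the function name my_struct_to_wire and the 12-byte wire format are invented here for illustration, and the struct is the example one, written with the <stdint.h> fixed-width types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct my_struct {
    uint32_t foo;
    uint64_t bar;
};

/* Hypothetical wire format: 4 bytes of foo followed by 8 bytes of bar,
 * both big-endian, with no padding in between. */
enum { MY_STRUCT_WIRE_SIZE = 12 };

/* Write s into out in network byte order, one byte at a time.
 * Shifting extracts the bytes by value, so the result is the same
 * regardless of host endianness or of any padding inside the struct.
 * Returns the number of bytes written. */
size_t my_struct_to_wire(const struct my_struct *s,
                         uint8_t out[MY_STRUCT_WIRE_SIZE])
{
    size_t n = 0;

    for (int shift = 24; shift >= 0; shift -= 8)
        out[n++] = (uint8_t)(s->foo >> shift);
    for (int shift = 56; shift >= 0; shift -= 8)
        out[n++] = (uint8_t)(s->bar >> shift);

    return n;
}
```

The receiver does the reverse, reading the 12 bytes back into its own struct, whatever layout its compiler chose for it.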

Now, if you take care of all of these issues, which the Linux kernel does, then you can just send a struct.
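To make that concrete, here is a rough user-space sketch of the approach: fixed-width fields, a packed layout checked at compile time, and every multi-byte field stored in network byte order. The struct wire_hdr and the function wire_hdr_fill are hypothetical, only loosely modeled on the first few fields of a TCP header, and are not the kernel's actual definitions. The sketch assumes GCC or Clang and a POSIX system for <arpa/inet.h>.

```c
#include <arpa/inet.h>  /* htons, htonl (POSIX) */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header; all fields are stored in network byte order. */
struct wire_hdr {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;
} __attribute__((packed));

/* Fail the build if the layout is not exactly what the wire expects. */
_Static_assert(sizeof(struct wire_hdr) == 8,
               "wire_hdr must be exactly 8 bytes");

/* Fill the header from host-order values, converting each field. */
void wire_hdr_fill(struct wire_hdr *h, uint16_t src, uint16_t dst,
                   uint32_t seq)
{
    h->src_port = htons(src);
    h->dst_port = htons(dst);
    h->seq      = htonl(seq);
}
```

Once filled in this way, the bytes of the struct are the wire format, so the struct's memory can be copied into the outgoing buffer as is. The receiver applies ntohs and ntohl to each field before using it.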

For more information about why sending a struct over TCP is generally a bad idea, this SO question might be useful. As its top answer currently states, there are three main reasons (the same ones I outlined here), but if you take care of them it is possible. While it is probably not good practice for a user-space program, at some point something has to do this, since things such as the TCP header have specific field requirements.

Upvotes: 2
