Reputation: 269
I've been looking at file formats, and information on byte alignment in files is hard to come by. I can find information on memory byte alignment ("Data Structure Alignment"), but that's a different matter.
In setting up a standard format, is there an optimal way to align bytes in a file that is good or even necessary for various systems? This is not for one data type, but for many. Is 2-byte alignment sufficient, or is it really even necessary? What about 4-byte alignment? How well will a 32-bit or 64-bit system handle this?
Upvotes: 5
Views: 1905
Reputation: 134125
When working with binary data, very often you'll just write memory directly to the file. In that case, data in the file is aligned exactly as it is in memory. This has the advantage of not requiring any intermediate steps when reading the information back into your memory data structures. It does use a bit more disk space than absolutely required if you were to eliminate the alignment, but typically not a lot of space.
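As a rough sketch of what "write memory directly to the file" looks like in C: the struct's bytes, padding included, go to disk in one fwrite call and come back in one fread call. The names record, save_record and load_record are made up for illustration, and this assumes the reader is built with the same compiler, settings and byte order as the writer.

```c
#include <stdio.h>

/* Hypothetical record type, for illustration only. */
struct record {
    int  id;
    char tag;
    int  value;
};

/* Write the struct's in-memory bytes (padding included) to a file.
   Returns 0 on success, -1 on failure. */
int save_record(const char *path, const struct record *r)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t n = fwrite(r, sizeof *r, 1, f);
    if (fclose(f) != 0) return -1;
    return n == 1 ? 0 : -1;
}

/* Read the bytes straight back into a struct of the same type.
   Returns 0 on success, -1 on failure. */
int load_record(const char *path, struct record *r)
{
    FILE *f = fopen(path, "rb");
    if (!f) return -1;
    size_t n = fread(r, sizeof *r, 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
}
```

No per-field conversion is needed on either side, which is the convenience being described; the cost is that the file layout is whatever the compiler chose for the struct.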
You have to be careful, though, if you'll be reading that data from other programs. They have to be written to take the padding bytes into account. For example, if you have this structure:
struct foo
{
    int a;
    char b;
    int c;
};
And you tell the compiler to align on 32-bit boundaries (with a 4-byte int), your memory (and therefore disk) layout will be:
4 bytes - a
1 byte - b
3 bytes - padding
4 bytes - c
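If you'd rather check the layout than take it on faith, offsetof from <stddef.h> reports where the compiler actually put each member. This is only a sketch: the helper names padding_after_b and print_foo_layout are made up here, and the numbers depend on your compiler and target.

```c
#include <stddef.h>
#include <stdio.h>

struct foo {
    int  a;
    char b;
    int  c;
};

/* Number of padding bytes the compiler inserted between b and c. */
size_t padding_after_b(void)
{
    return offsetof(struct foo, c) - (offsetof(struct foo, b) + sizeof(char));
}

/* Print the offsets and total size the compiler actually chose. */
void print_foo_layout(void)
{
    printf("a at %zu, b at %zu, c at %zu, sizeof %zu\n",
           offsetof(struct foo, a),
           offsetof(struct foo, b),
           offsetof(struct foo, c),
           sizeof(struct foo));
}
```

With a 4-byte int aligned on 4-byte boundaries, padding_after_b() should come out as 3, matching the layout listed above.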
If the other program isn't written to take that into account and instead assumes byte alignment, it'll try to read c from the four bytes immediately following b. The result, as you can imagine, wouldn't be good.
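To make the failure concrete, here's a small sketch with two readers working on the same bytes. read_c_aligned and read_c_packed are hypothetical names; the first uses the offset the writing compiler chose, the second assumes c starts right after b.

```c
#include <stddef.h>
#include <string.h>

struct foo {
    int  a;
    char b;
    int  c;
};

/* Correct reader: uses the offset the writing compiler chose for c. */
int read_c_aligned(const unsigned char *buf)
{
    int c;
    memcpy(&c, buf + offsetof(struct foo, c), sizeof c);
    return c;
}

/* Naive reader: assumes no padding, so c starts right after b.
   On a padded layout this picks up padding bytes plus part of c. */
int read_c_packed(const unsigned char *buf)
{
    int c;
    memcpy(&c, buf + sizeof(int) + sizeof(char), sizeof c);
    return c;
}
```

On a layout with three padding bytes after b, the naive reader picks up those three bytes plus the first byte of c, so it returns something other than the value that was written.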
When I'm working with binary data, I usually just write the data to the file, ignoring the typically small amount of "waste" that's due to data alignment.
Upvotes: 5