Reputation: 691
I can't wrap my head around how to use char arrays (the type of the first argument of std::ifstream::read()) to compare different types of data.
For example, to check the magic of a Windows PE file I am doing the following, but I feel there must be a better way, since as far as I can tell this requires me to define every expected value in the file as a std::array:
std::array<char, 2> magic;
in.read(magic.data(), magic.size());
std::array<char, 2> shouldBe = { 0x4d, 0x5a }; // MZ for dos header
if(magic == shouldBe) {
// magic correct
}
This gives me compiler warnings like invalid conversion from int to char. I also don't quite understand how I'd read the magic of other files, where the hex values don't correspond to ASCII characters at all. For example, every Java class file starts with the magic 0xCAFEBABE, yet when I read it in as 4 chars and cast each one to an int, I get unwanted padding on the left.
char* magic = new char[4];
in.read(magic, 4);
// how can I compare this array to 0xCAFEBABE?
Output when I loop through each part and then cast as int and use std::hex in the output stream:
ffffffca fffffffe ffffffba ffffffbe
What's the best way to parse lots of different types of values used in binary file formats like PE files and Java classes?
Upvotes: 1
Views: 2301
Reputation: 490148
You basically have two choices: you can either hard-code the values into the program, or you can store them externally. If you're storing them internally, it's probably easiest to start by structuring the data a bit:
struct magic {
std::string value;
int result;
};
std::vector<magic> values {
{ "\x7f" "ELF", 1 }, // 0x7f 'E' 'L' 'F'; the literal is split so the hex escape ends after 7f
{ "MZ", 2},
{ "\xca\xfe\xba\xbe", 3}, // 0xcafebabe
{ "etc", -1}};
Then you can (for example) step through values in a loop and compare each entry against the start of the file; when you get a match, the entry's result tells you (for example) how to process that kind of file.
If you store the values as strings as I've done here, it's probably easiest to do the comparisons as strings as well. One obvious way would be to read in a block (e.g., 2 kilobytes) from the beginning of the file, then create a string from the correct number of bytes from the file, then compare to the expected value.
Upvotes: 1
Reputation: 303097
The approach is perfectly fine. The only issue is this line:
std::array<char, 2> shouldBe = { 0x4d, 0x5a }; // MZ for dos header
Narrowing conversions are disallowed with list initialization, so you just have to do some explicit casting:
std::array<char, 2> shouldBe = { (char)0x4d, (char)0x5a };
Upvotes: 3