Reputation: 23134
I have been searching the internet for a way to read binary files in C++, and I have found two snippets that kind of work:
No.1:
#include <iostream>
#include <fstream>
int main(int argc, const char *argv[])
{
if (argc < 2) {
::std::cerr << "Usage: " << argv[0] << "<filename>\n";
return 1;
}
::std::ifstream in(argv[1], ::std::ios::binary);
while (in) {
char c;
in.get(c);
if (in) {
// ::std::cout << "Read a " << int(c) << "\n";
printf("%X ", c);
}
}
return 0;
}
Result:
6C 1B 1 FFFFFFDC F FFFFFFE7 F 6B 1
No.2:
#include <stdio.h>
#include <iostream>
using namespace std;
// An unsigned char can store 1 Bytes (8bits) of data (0-255)
typedef unsigned char BYTE;
// Get the size of a file
long getFileSize(FILE *file)
{
long lCurPos, lEndPos;
lCurPos = ftell(file);
fseek(file, 0, 2);
lEndPos = ftell(file);
fseek(file, lCurPos, 0);
return lEndPos;
}
int main()
{
const char *filePath = "/tmp/test.bed";
BYTE *fileBuf; // Pointer to our buffered data
FILE *file = NULL; // File pointer
// Open the file in binary mode using the "rb" format string
// This also checks if the file exists and/or can be opened for reading correctly
if ((file = fopen(filePath, "rb")) == NULL)
cout << "Could not open specified file" << endl;
else
cout << "File opened successfully" << endl;
// Get the size of the file in bytes
long fileSize = getFileSize(file);
// Allocate space in the buffer for the whole file
fileBuf = new BYTE[fileSize];
// Read the file in to the buffer
fread(fileBuf, fileSize, 1, file);
// Now that we have the entire file buffered, we can take a look at some binary infomation
// Lets take a look in hexadecimal
for (int i = 0; i < 100; i++)
printf("%X ", fileBuf[i]);
cin.get();
delete[]fileBuf;
fclose(file); // Almost forgot this
return 0;
}
Result:
6C 1B 1 DC F E7 F 6B 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 A1 D 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
The result of xxd /tmp/test.bed:
0000000: 6c1b 01dc 0fe7 0f6b 01 l......k.
The result of ls -l /tmp/test.bed:
-rw-rw-r-- 1 user user 9 Nov 3 16:37 test.bed
The second method gives the right hex codes at the beginning but seems to get the file size wrong; the first method messes up the bytes.
These methods look very different. Perhaps there are many ways to do the same thing in C++? Is there an idiom that pros adopt?
Upvotes: 1
Views: 2605
Reputation: 23134
While searching for why @Roland Illig's answer (now deleted) does not work, I found the following solution. I'm not sure if it's up to professional standards, but it has given the right results so far, and it allows checking the first n bytes of a file:
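#include <cstdio>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
int main(int argc, const char *argv[])
{
    if (argc < 3) {
        std::cerr << "usage: " << argv[0] << " <filename> <nbytes>\n";
        return 1;
    }
    // Number of leading bytes to dump; std::stoi throws on non-numeric input
    int nbytes = std::stoi(argv[2]);
    if (nbytes < 0) {
        std::cerr << "nbytes must not be negative\n";
        return 1;
    }
    // A vector instead of "char buffer[nbytes]": variable-length arrays are not standard C++
    std::vector<char> buffer(nbytes);
    std::ifstream readingFile(argv[1], std::ios::binary);
    readingFile.read(buffer.data(), nbytes);
    // gcount() may be smaller than nbytes if the file is shorter
    std::streamsize bytesread = readingFile.gcount();
    if (bytesread > 0) {
        for (std::streamsize i = 0; i < bytesread; i++) {
            // Convert to unsigned char first so bytes above 0x7f are not sign-extended
            unsigned char rawchar = static_cast<unsigned char>(buffer[i]);
            printf("%02x ", static_cast<unsigned int>(rawchar));
        }
        printf("\n");
    }
    return 0;
}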
Another answer I got from wibit.com :
#include <cstdio>
#include <iostream>
#include <fstream>
using namespace std;
int main(int argc, const char* argv[])
{
ifstream inBinaryFile;
inBinaryFile.open(argv[1], ios_base::binary);
int currentByte = inBinaryFile.get();
while(currentByte >= 0)
{
printf("%02x ", currentByte);
currentByte = inBinaryFile.get();
}
printf("\n");
inBinaryFile.close();
return 0;
}
Upvotes: 0
Reputation: 154025
You certainly want to convert the char objects to unsigned char before processing them as integer values! The problem is that char may be signed, in which case negative values get converted to negative ints when you cast them. Negative ints displayed as hex will have more than two hex digits, the leading ones probably all "f".
I didn't immediately spot why the second approach gets the size wrong. However, the C++ approach to read a binary file is simple:
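#include <fstream>
#include <iomanip>
#include <iostream>
#include <iterator>
#include <vector>
int main(int argc, const char *argv[])
{
    if (argc < 2) {
        std::cerr << "usage: " << argv[0] << " <filename>\n";
        return 1;
    }
    std::vector<unsigned char> bytes;
    {
        std::ifstream in(argv[1], std::ios_base::binary);
        // istreambuf_iterator reads raw characters and never skips whitespace
        bytes.assign(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
    }
    std::cout << std::hex << std::setfill('0');
    for (int v : bytes) {
        std::cout << std::setw(2) << v << ' ';
    }
    std::cout << '\n';
    return 0;
}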
Upvotes: 1
Reputation: 126927
Both your methods are some strange mix of C and C++ (well, actually the second is just plain C); still, the first method is mostly right, but you have to convert c to unsigned char before printing it (c itself has to stay a char, because in.get() takes a char&), otherwise any byte over 0x7f is printed as a negative value, which results in that wrong output. [1]
To do things correctly and in the "C++ way", you should have done:
std::cout<<std::hex<<std::setfill('0');
...
if (in)
    std::cout << std::setw(2) << int(static_cast<unsigned char>(c)) << "\n";
The second one gets the "signedness" correct, but it's mostly just C. A quick fix would be to fix the 100 in the for loop, replacing it with fileSize. But in general, loading the whole file in memory just to dump its content in hexadecimal is a botched idea; what you normally do is read the file a piece at a time into a fixed-size buffer and convert it as you go.
[1] get reads the byte into a char; if its value is bigger than 0x7f it does not fit into a signed char, and typically results in some negative value. Then when it is passed to printf it gets sign-extended (since any signed integer argument narrower than int passed to a vararg function is promoted to int) but interpreted as an unsigned int due to the %X conversion specifier. (All this assuming 2's complement arithmetic, non-signaling integer overflow and a signed char.)
Upvotes: 1
Reputation: 661
In the first case you're printing a char (which is signed on your platform), while in the second case you're printing an unsigned char. Passing either to printf promotes it to int, so a negative char is sign-extended before %X formats it, and that causes the difference.
Upvotes: 0