Reputation:
I'm writing a tiny program that computes the BLAKE2b hash of a file using libsodium, and I find myself staring at a weird bug.
Zeros are missing from my hexadecimal output. The hash procedure itself shouldn't be the cause, since my program and b2sum both use the same hash function truncated to 256 bits.
Both use BLAKE2b (optimized for x64).
I verified that the whole file was read. Even if it hadn't been, the output would be totally different, since with a hash function a single changed bit is enough to produce a different output.
I also tried both C-style printing and C++ streams to see whether the format specifiers had something to do with it, which suggested that wasn't the case.
My program outputs the following:
479b5e6da5eb90a19ae1777c8ccc614b5c8f695c9cffbfe78d38b89e40b865
The b2sum command-line tool gives:
b2sum /bin/ls -l 256
479b5e6da5eb90a19ae1777c8ccc614b**0**5c8f695c9cffbfe78d38b89**0**e40b865
#include<iostream>
#include<fstream>
#include<sstream>
#include<ios>
#include<vector>
#include<cstdio>
#include<sodium.h>
using namespace std;
int main(int argc, char** argv)
{
using buffer = vector<char>;
ifstream input(argv[1],ios::binary | ios::ate);
// get file size
streamsize filesize = input.tellg();
input.seekg(0,ios::beg);
// make a buffer with that filesize
buffer buf(filesize);
// read the file
input.read(buf.data(),buf.size());
input.close();
// show filesize
cout << "Filesize : " << filesize << endl;
// using the snippet from libsodium docs
// https://libsodium.gitbook.io/doc/hashing/generic_hashing
// Example 1
unsigned char hash[crypto_generichash_BYTES];
crypto_generichash(hash,sizeof(hash),(unsigned char*)buf.data(),buf.size(),NULL,0);
// Print the hash in hexadecimal
for(int i = 0; i < crypto_generichash_BYTES; i++)
{
printf("%x",hash[i]);
}
cout << endl;
// load the hash into a stringstream using hexadecimal
stringstream ss;
for(int i=0; i<crypto_generichash_BYTES;++i)
ss << std::hex << (int)hash[i];
std::string mystr = ss.str();
// output the stringstream
cout << mystr << endl;
cout << "hash length :" << mystr.length() << endl;
}
Upvotes: 0
Views: 449
Reputation: 595319
printf("%x",hash[i]);
does not output a leading zero for hex values < 0x10. You need to use
printf("%02x", hash[i]);
instead, which tells printf() to output a minimum of 2 hex digits, prepending a leading zero if needed.
Otherwise, use C++ stream output instead:
std::cout << std::hex << std::setw(2) << std::setfill('0') << (int)hash[i];
You need to do the same for your std::stringstream, as that code also omits leading zeros for hex values < 0x10. Note that std::setw and std::setfill are declared in <iomanip>.
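As a rough sketch of the stream approach applied to a whole buffer, something like the following could work. The helper name to_hex is made up for illustration and is not part of libsodium. The one detail worth noting is that std::setw only affects the next insertion, so it has to be repeated for every byte, while std::hex and std::setfill stay in effect.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not a libsodium function): formats `len` bytes
// as lowercase hex, always emitting two digits per byte.
std::string to_hex(const unsigned char* data, std::size_t len)
{
    std::ostringstream out;
    // std::hex and the fill character persist for the lifetime of the stream.
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < len; ++i) {
        // std::setw resets after each insertion, so set it for every byte.
        // The cast to int stops the byte from being printed as a character.
        out << std::setw(2) << static_cast<int>(data[i]);
    }
    return out.str();
}
```

With the question's code, this would be called as to_hex(hash, sizeof(hash)), and the resulting string should be exactly twice as long as the hash in bytes.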
Upvotes: 1
Reputation: 881093
You should be using something like:
printf("%02x",hash[i]);
to print out the bytes. This will correctly handle hex values less than 16 which, in your version, will simply output a single hex digit.
You can see that in the following program:
#include <cstdio>
#define FMT "%02x"
int main() {
printf(FMT, 0x4b);
printf(FMT, 0x05);
printf(FMT, 0xc8);
putchar('\n');
}
With FMT defined as above, you see the correct 4b05c8. With it defined (as you have) as "%x", you see the errant 4b5c8.
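If you want the hex digits in a string rather than printed directly, the same "%02x" specifier works with snprintf. This is only a minimal sketch, and hex_with_printf is a name invented for this example:

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: builds a hex string using the "%02x" specifier.
std::string hex_with_printf(const unsigned char* data, std::size_t len)
{
    std::string out;
    out.reserve(len * 2);
    char pair[3];  // two hex digits plus the terminating NUL
    for (std::size_t i = 0; i < len; ++i) {
        // "%02x" pads to a minimum width of 2 with leading zeros.
        // %x expects an unsigned int, hence the cast.
        std::snprintf(pair, sizeof pair, "%02x", static_cast<unsigned>(data[i]));
        out += pair;
    }
    return out;
}
```

Checking that the result has length 2 * len is a cheap way to catch the missing-zero problem from the question.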
And, just as an aside, you may want to consider ditching the C legacy stuff(a) like printf. I know it's in the standard but hardly anyone(b) uses it because of its limitations, despite the iostream equivalent being much more verbose.
Or do what we've done and just use the fmt library for much more succinct but still type-safe output, especially since it's currently being targeted toward C++20 (hence will almost certainly become part of the standard at some point).
(a) Nobody wants to be known as a C+ programmer, that strange breed who never quite embraced the full power of the language :-)
(b) Based on the sample of a moderate number of C++ developers I've worked with :-)
Upvotes: 1