Juzu

Reputation: 11

Is there a cryptographically secure hash algorithm/function that allows faster hashing when you append more data?

I need to calculate hash codes for a constantly growing dataset, and the hash needs to be cryptographically secure if possible. For example, hashing the series of natural numbers: first 1, then 12, then 123.

If cryptographically secure hashing always requires rehashing everything from the start for each added number, then I need some other kind of hashing: ideally the most cryptographically secure one available, or at least one for which finding other data with the same hash code takes as long as possible.

RAM usage can be massive.

Upvotes: 1

Views: 311

Answers (1)

bk2204

Reputation: 76409

What you're looking for is tree hashing, which is a form of Merkle tree. Essentially, each data block is hashed, then a hash is computed over each group of those block hashes, and so on up the tree until you reach a single root hash. Typically, there are dedicated designs for this that prevent the security problems with the immediate and naïve approach, such as an interior node's input being passed off as a data block.
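To make the structure concrete, here is a minimal sketch of a binary hash tree in Python. It is not BLAKE2's tree mode or any standardized design: the `LEAF`/`NODE` prefix bytes, the 32-byte digest size, the rule of promoting an unpaired hash unchanged, and the names `h` and `merkle_root` are all assumptions made for illustration. The prefixes are one common way to keep leaf hashes and interior hashes from being confused with each other.

```python
import hashlib
from typing import List

# Assumed domain-separation prefixes (illustrative, not from any BLAKE2 spec).
LEAF = b"\x00"
NODE = b"\x01"


def h(data: bytes) -> bytes:
    """Hash helper: BLAKE2b truncated to a 32-byte digest (an assumption)."""
    return hashlib.blake2b(data, digest_size=32).digest()


def merkle_root(blocks: List[bytes]) -> bytes:
    """Compute the root hash of a binary hash tree over the given blocks."""
    if not blocks:
        return h(b"")  # arbitrary convention for the empty dataset
    # Bottom level: one hash per data block.
    level = [h(LEAF + block) for block in blocks]
    # Repeatedly hash adjacent pairs until a single hash remains.
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(h(NODE + level[i] + level[i + 1]))
        if len(level) % 2:
            nxt.append(level[-1])  # unpaired hash moves up unchanged
        level = nxt
    return level[0]


# The question's example, with each number treated as one block.
print(merkle_root([b"1", b"2", b"3"]).hex())
```

Note that this function still recomputes the whole tree on every call; it only shows the shape of the computation.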

If you'd like to incrementally append more data, then you'll obviously need to store the hashes for the data blocks or at least some intermediate values so you can recompute the root hash efficiently. BLAKE2 has a special tree mode which can be used, although I'm not sure if any of the standard libraries support it, so you may need to take the reference code and configure it accordingly. BLAKE2 is cryptographically secure and extremely fast, even in plain C.
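As a rough illustration of "storing some intermediate values", the sketch below keeps only the roots of the completed subtrees (at most about log2(n) hashes for n blocks), so an append never rehashes old data. This is again a hand-rolled construction under stated assumptions, not BLAKE2's tree mode: the class name `IncrementalTreeHash`, the prefix bytes, and the digest size are hypothetical choices. For what it's worth, Python's `hashlib.blake2b` does accept tree-mode parameters (`fanout`, `depth`, `leaf_size`, `node_offset`, `node_depth`, `inner_size`, `last_node`), but they are not used here.

```python
import hashlib
from typing import List, Tuple

# Assumed domain-separation prefixes (illustrative only).
LEAF, NODE = b"\x00", b"\x01"


def h(data: bytes) -> bytes:
    """Hash helper: BLAKE2b truncated to a 32-byte digest (an assumption)."""
    return hashlib.blake2b(data, digest_size=32).digest()


class IncrementalTreeHash:
    """Append-only hash tree that stores one hash per completed subtree."""

    def __init__(self) -> None:
        # Stack of (height, subtree root), with heights strictly decreasing.
        self.peaks: List[Tuple[int, bytes]] = []

    def append(self, block: bytes) -> None:
        """Add one block; merges equal-height subtrees as they complete."""
        height, digest = 0, h(LEAF + block)
        while self.peaks and self.peaks[-1][0] == height:
            _, left = self.peaks.pop()
            digest = h(NODE + left + digest)
            height += 1
        self.peaks.append((height, digest))

    def root(self) -> bytes:
        """Combine the stored subtree roots, right to left, into one hash."""
        if not self.peaks:
            return h(b"")  # arbitrary convention for the empty dataset
        acc = self.peaks[-1][1]
        for _, left in reversed(self.peaks[:-1]):
            acc = h(NODE + left + acc)
        return acc


# The question's example: the dataset grows by one block at a time.
tree = IncrementalTreeHash()
for number in (b"1", b"2", b"3"):
    tree.append(number)
    print(tree.root().hex())
```

Each `append` hashes the new block plus a few 65-byte node inputs, and `root` does at most one small hash per stored subtree, so the cost per update does not grow with the amount of data already hashed, only (logarithmically) with the number of blocks.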

There's also the newer BLAKE3, which is supposed to be cryptographically secure and is even faster. It always runs in a tree hash mode. However, it has seen less cryptanalysis than BLAKE2, and so I would recommend BLAKE2 for almost all applications.

There are similar approaches for other hash functions, but these are the fastest cryptographically secure options.

Upvotes: 3
