VeryLazyBoy

Reputation: 420

Time efficiency of python sha1 hash calculation

With regard to time efficiency, are these two ways of calculating a sha1 hash the same?

(1) Split the string into small chunks and update the hash multiple times

import hashlib

...
...
sha1 = hashlib.sha1()
sha1.update(chunk1)
sha1.update(chunk2)
...

(2) Pass the complete string to the hash function and compute the hash only once

import hashlib
...
...
sha1 = hashlib.sha1()
sha1.update(the_complete_string)
...

Upvotes: 0

Views: 1141

Answers (1)

Sam Hartman

Reputation: 6489

There is additional overhead for each chunk:

  • You must split the string.
  • There is a Python call into hashlib for each chunk.
  • The hash library must set up to handle each chunk.

So the overhead scales with the number of chunks. With a small, fixed number of chunks, it probably doesn't matter. However, if you were to split a large string into one-character chunks and call update() on each of them, the chunked approach would be significantly slower than hashing the whole string in one call.
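If you want to measure this yourself, here is a rough sketch using timeit. It assumes Python 3, where update() takes bytes rather than str. The payload size, the chunk sizes, and the helper names hash_whole and hash_chunked are made up for illustration, and the actual timings will depend on your machine and Python build.

```python
import hashlib
import timeit

# Hypothetical test payload: 64 KiB of zero bytes (size chosen only for illustration).
data = bytes(64 * 1024)


def hash_whole(payload):
    """Hash the payload with a single update() call."""
    sha1 = hashlib.sha1()
    sha1.update(payload)
    return sha1.hexdigest()


def hash_chunked(payload, chunk_size):
    """Slice the payload into chunk_size pieces and call update() once per piece."""
    sha1 = hashlib.sha1()
    for start in range(0, len(payload), chunk_size):
        sha1.update(payload[start:start + chunk_size])
    return sha1.hexdigest()


# The digest is identical either way; only the running time differs.
assert hash_whole(data) == hash_chunked(data, 1)

if __name__ == "__main__":
    runs = 5
    whole = timeit.timeit(lambda: hash_whole(data), number=runs)
    large = timeit.timeit(lambda: hash_chunked(data, 4096), number=runs)
    tiny = timeit.timeit(lambda: hash_chunked(data, 1), number=runs)
    print(f"one update():        {whole:.4f} s")
    print(f"4096-byte chunks:    {large:.4f} s")
    print(f"1-byte chunks:       {tiny:.4f} s")
```

I would expect the 4096-byte case to land close to the single-call case and the 1-byte case to be far behind, but treat that as something to confirm with your own numbers.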

That said, combining chunks into a single string or bytes object has its own overhead. If what you already have are chunks, and the only reason to combine them is hash performance, joining them first probably will not save time.
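To illustrate the case where the data already arrives as chunks, here is a minimal sketch, again assuming Python 3 and bytes chunks. The function names sha1_of_chunks and sha1_of_joined are hypothetical. Both give the same digest; the second one just pays for building a temporary bytes object first.

```python
import hashlib


def sha1_of_chunks(chunks):
    """Feed each chunk to update() as-is, without building one big bytes object."""
    sha1 = hashlib.sha1()
    for chunk in chunks:
        sha1.update(chunk)
    return sha1.hexdigest()


def sha1_of_joined(chunks):
    """Join the chunks into a single bytes object first, then hash it in one call."""
    return hashlib.sha1(b"".join(chunks)).hexdigest()


# Hypothetical chunks, e.g. pieces read from a file or a socket.
chunks = [b"hello ", b"world"]

# Same bytes in the same order give the same digest, however they are split.
assert sha1_of_chunks(chunks) == sha1_of_joined(chunks)
```

The chunked version also keeps memory use flat, which matters more than the call overhead once the input is large.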

Upvotes: 1
