Dennis

Reputation: 1975

Is it possible to combine multiple SHA1 states to get a final state in Golang?

I have an upload server written in Go 1.13. It accepts two types of upload: Chunked and Chunked+Threaded.

With chunked uploads everything works as expected. I hash each chunk while it is being written to disk. The user can upload multiple chunks one by one, in order.

This means I can save the SHA1 state to disk after every chunk using BinaryMarshaler, then read the previous state back and continue hashing the next chunk until I reach the final hash. The final hash is the correct SHA1 of the whole file.

When the chunks arrive in order, I can append to the existing state. The problem starts with threaded (simultaneous) uploads.

    // Needs imports: crypto/sha1, encoding, io, io/ioutil.
    hashComplete := sha1.New()
    // Read the previous state from disk.
    state, err := ioutil.ReadFile(ctxPath)
    if err != nil {
        return err
    }

    // Restore the saved state, if any, so hashing continues where it stopped.
    if len(state) > 0 {
        unmarshaler, _ := hashComplete.(encoding.BinaryUnmarshaler)
        if err := unmarshaler.UnmarshalBinary(state); err != nil {
            return err
        }
    }

    // Write the chunk to disk and to the hash at the same time.
    // file is a simple file object; src is the source (io.Reader).
    writer := io.MultiWriter(file, hashComplete)
    if _, err := io.Copy(writer, src); err != nil {
        return err
    }

    marshaler, _ := hashComplete.(encoding.BinaryMarshaler)
    newState, err := marshaler.MarshalBinary()
    if err != nil {
        return err
    }

    // Save the latest state to disk.
    if _, err := shaCtxFile.Write(newState); err != nil {
        return err
    }

    // Later, after the upload finishes, I read this file and get the SHA1 hex from it. It is correct.

That is the chunked upload, where chunks arrive in a fixed order. The other upload method is Chunked+Threaded: the user can upload chunks simultaneously and then, in a final request, ask the server to concatenate them in a given order.

I already calculate each chunk's SHA1 and save it to disk.

My question: is it possible to combine those states to get the final hash, or do I need to rehash the file after concatenating the chunks?

Upvotes: 2

Views: 833

Answers (1)

Maarten Bodewes

Reputation: 94058

Assuming you mean the final hash over the whole file, then no, you cannot combine multiple SHA-1 hashes over partial data to create a hash over the whole file, as if it were calculated in one pass. The reason is that every independent chunk hash starts from the same fixed initial SHA-1 state, whereas in a hash over the whole file each chunk is processed starting from the state left by all the data before it. Furthermore, the final block is padded and a length is added (internal to the hash function) before the final hash value is calculated.

However, you can of course create a hash list or hash tree, where you define how big the blocks are. You then hash the chunk hashes together to create a topmost hash value. This gives you a different value than the plain SHA-1 over the file, but the hash is consistent with your definition and can be recalculated, even in a multi-threaded fashion. It is still specific to the data within the file (assuming, of course, that you feed in the chunk hashes in sequential order), so it can be used to validate the integrity of the file. And, as far as I know, that is the only way to use multi-threaded hash calculation with normal secure hash functions.

For more information, search for Merkle trees.


Of course, SHA-1 has been broken for collision resistance. Unfortunately, that's exactly what you are using it for. So please use SHA-256. If 256 bits is too much then using SHA-256 and taking the leftmost 160 bits is a more secure alternative.

Upvotes: 4
