stepanbujnak
stepanbujnak

Reputation: 651

How to prevent malformed uploads?

I have a fairly simple code for uploading files to Google Cloud Storage using Golang.

func upload(object *storage.ObjectHandle, b []byte) error {
    w := object.NewWriter(context.Background())

    if _, err = w.Write(b); err != nil {
        return err
    }
    return w.Close()
}

I have uploaded multitudes of files without any problems, but yesterday I noticed that one of the files was damaged. I'm fairly certain that the file was damaged during the upload as I name the files based on MD5 hash of its contents. I believe the Google Cloud Storage should've returned an error when calling the w.Close() but it didn't. What's the best way to make sure that the upload always fails when the transfer is interrupted/damaged?

Upvotes: 0

Views: 175

Answers (1)

Kenny Grant
Kenny Grant

Reputation: 9633

You could try the following checks before and after you upload the bytes:

  • store len(b) of bytes
  • store sha256 hash of bytes

Verify that both of these are the same when reading back from the cloud storage directly afterwards. This could impact performance of course but it would ensure that you are getting out what you put into GCS.

That isn't the only place you could see corruption though - if the client stopped transmitting or transmitted bad data to your server, this wouldn't detect it. If so checking for integrity in some other way before upload might be your best bet. If your files are of a known type you could also check for integrity that way by verifying that it really is a valid jpg file for example.

It might be best by trying to reproduce and finding out exactly where the corruption occurs first to verify your assumption that GCS should have returned an error and instead silently corrupted the data given to it.

Upvotes: 1

Related Questions