Chris Snow

Reputation: 24616

HTTP mechanisms to ensure downloaded data has not been corrupted during download?

I have a requirement to make legal documents available to mobile applications (e.g. Android, iPhone) via HTTP. Corruption can occur over HTTP (references: 1, 2). In my case it is imperative that the downloaded documents have not been corrupted during transmission.

One mechanism for ensuring integrity is to digitally sign the documents. This approach works well if the documents are XML; however, the public key used to verify the signature will need to be available to, and trusted by, the client.

Another mechanism is to create and store a checksum of the document (e.g. MD5). The client can download the document and the checksum, and then use the checksum to verify the document.
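The checksum approach could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names `compute_checksum` and `verify_download` are hypothetical, and it assumes the client has already fetched both the document bytes and the published hex digest. SHA-256 is used as the default here, but `"md5"` can be passed to match the example above.

```python
import hashlib
import hmac


def compute_checksum(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data (server side, stored next to the document)."""
    return hashlib.new(algorithm, data).hexdigest()


def verify_download(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the downloaded bytes match the published checksum."""
    actual = compute_checksum(data, algorithm)
    # Normalise whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Hypothetical usage: the server publishes the checksum, the client verifies it.
document = b"Terms and conditions, version 1"
published = compute_checksum(document)
is_intact = verify_download(document, published)
is_corrupt_detected = not verify_download(document[:-1] + b"X", published)
```

Note that a plain checksum only detects accidental corruption. If the checksum is fetched over the same channel as the document, it does not protect against deliberate tampering.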

Upvotes: 0

Views: 196

Answers (1)

Turnerj

Reputation: 4288

As far as I know, HTTP itself does not have any built-in checksum mechanism, so your suggestion would work for ensuring the data is valid. That said, HTTP is generally implemented on top of the Transmission Control Protocol (TCP), which provides reliable communication between hosts.

Specifically, TCP implements error detection (using a checksum) and uses sequence numbers to ensure the data is delivered in the order in which it was sent. If the sending host does not receive an acknowledgement that the receiving host got the data, it will resend it.
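To give a feel for the kind of checksum TCP uses, here is a simplified sketch of the 16-bit ones' complement "Internet checksum" described in RFC 1071. It is an illustration only: the real TCP checksum is also computed over a pseudo-header and the TCP header, which this sketch omits, and the function name `internet_checksum` is my own.

```python
def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit ones' complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Add each big-endian 16-bit word to the running sum.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carry bits back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example bytes taken from RFC 1071.
original = bytes.fromhex("0001f203f4f5f6f7")
# Same 16-bit words with the first two swapped.
reordered = bytes.fromhex("f2030001f4f5f6f7")

checksum_original = internet_checksum(original)
checksum_reordered = internet_checksum(reordered)
```

Because the checksum is a simple sum, some errors (such as two 16-bit words swapping places) leave it unchanged. That is one reason an application-level checksum can still be worthwhile when integrity is critical.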

If, however, the HTTP implementation on the device is actually running on top of the User Datagram Protocol (UDP), delivery isn't reliable. It is unlikely, though, that a device is using UDP for HTTP, or at least not the unreliable version (there is also a Reliable User Datagram Protocol).

I couldn't find statistics, or much information at all, regarding corruption of an HTTP request. Depending on how mission critical you deem this to be, treat it as something that can happen. There are mentions of downloaded files that end up being corrupt. While these mostly seem to relate to ZIP files, I don't think HTTP is the cause; more likely it is something in between, such as the downloading device itself corrupting the information.

Perhaps in your scenario it is best to add your own checksum if it is absolutely critical that your information arrives in one piece.

Upvotes: 2
