Oliver

Reputation: 11597

How fast are CRCs to generate?

I need to generate etags for image files on the web. One of the possible solutions I thought of would be to calculate CRCs for the image files, and then use those as the etag.

This would require a CRC to be calculated every time someone requests an image from the server, so it's very important that it can be done fast.

So, how fast are algorithms to generate CRCs? Or is this a stupid idea?

Upvotes: 2

Views: 829

Answers (4)

sll

Reputation: 62504

I would suggest calculating the hash once, when the image is added to the database, and then returning it in the same SELECT as the image itself.

If you are using SQL Server and the images are not very large (at most 8000 bytes, the HASHBYTES input limit before SQL Server 2016), you can leverage the HASHBYTES() function, which can generate SHA-1, MD5, ...
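As a rough illustration of "hash once on insert, read it back with the image", here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for SQL Server, and the table name `images`, its columns, and the helper functions are all hypothetical. The hash is computed in application code rather than with HASHBYTES(), so the 8000-byte limit does not come into play.

```python
import hashlib
import sqlite3


def create_schema(conn):
    # Hypothetical table: the ETag is stored next to the image bytes.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        " name TEXT PRIMARY KEY, data BLOB NOT NULL, etag TEXT NOT NULL)"
    )


def add_image(conn, name, data):
    # Hash once, at write time, so requests never have to re-hash the image.
    etag = hashlib.sha1(data).hexdigest()
    conn.execute(
        "INSERT OR REPLACE INTO images (name, data, etag) VALUES (?, ?, ?)",
        (name, data, etag),
    )
    conn.commit()
    return etag


def get_image(conn, name):
    # One SELECT returns both the image and its precomputed ETag.
    # Returns a (data, etag) tuple, or None if the name is unknown.
    return conn.execute(
        "SELECT data, etag FROM images WHERE name = ?", (name,)
    ).fetchone()
```

Because `add_image` replaces the row and its hash together, the stored ETag cannot drift out of sync with the image bytes.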

Upvotes: 1

Jon Hanna

Reputation: 113252

Depends on the method used, and the length. Generally pretty fast, but why not cache them?
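To get a feel for the cost, a CRC-32 over a file can be sketched with Python's standard `zlib.crc32`, reading in chunks so a large image is never held in memory at once. The function name `crc32_etag` and the 64 KiB chunk size are assumptions for illustration, not a recommendation.

```python
import zlib


def crc32_etag(path, chunk_size=64 * 1024):
    """Return a quoted ETag string built from the CRC-32 of the file at `path`."""
    crc = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            # Feed the running CRC back in so the result covers the whole file.
            crc = zlib.crc32(chunk, crc)
    # ETag values are quoted strings; format the 32-bit CRC as 8 hex digits.
    return '"%08x"' % (crc & 0xFFFFFFFF)
```

Timing this against your own images (for example with `timeit`) will tell you more than any general figure, since the disk read is part of every call.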

If the files won't change more often than the resolution of the timestamp kept by whatever stores them (file modification times for a filesystem, or SQL Server datetime if they are stored in a database), then why not just use the modification date at that resolution?

I know RFC 2616 advises against the use of timestamps, but this is only because HTTP timestamps have one-second resolution and changes can be more frequent than that. However:

  1. That's still fine if you don't change images more than once a second.
  2. It's also fine to base your e-tag on the time as long as the precision is great enough that two versions of the same resource won't end up with the same value.

With this approach you are guaranteed a unique e-tag (collisions are unlikely with a large CRC but certainly possible), which is what you want.

Of course, if you never change the image at a given URI, it's even easier, as you can just use a fixed string (I prefer the string "immutable").
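A timestamp-based e-tag along these lines might look like the sketch below, assuming the images live on a filesystem. The function name `mtime_etag` is hypothetical, and appending the file size is an extra guard I've added rather than something the approach requires. `st_mtime_ns` reports nanoseconds, but the real granularity depends on the filesystem.

```python
import os


def mtime_etag(path):
    """Build a quoted ETag from the file's modification time (and size)."""
    st = os.stat(path)
    # st_mtime_ns is an integer number of nanoseconds; how precise it really
    # is depends on the filesystem storing the file.
    # The size is added as a cheap extra guard (an assumption, not required).
    return '"%x-%x"' % (st.st_mtime_ns, st.st_size)
```

This only needs a `stat` call, so the image itself is never read to produce the e-tag.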

Upvotes: 1

Tim Rogers

Reputation: 21713

Most implementations, including Microsoft's own, use the last-modified date or other file headers as the ETag, and I suggest you use that method.

Upvotes: 2

Aliostad

Reputation: 81660

Instead, use a more robust hashing algorithm such as SHA-1.

Speed depends on the size of the image. Most of the time will be spent loading data from the disk rather than on CPU processing. You can cache the generated hashes.

I would also advise basing the ETag on the file's last update date, which is much quicker and does not require loading the whole file.

Remember, an ETag only has to be unique for a particular resource, so it is fine if two different images have the same last update time.

Upvotes: 5
