Reputation: 11597
I need to generate etags for image files on the web. One of the possible solutions I thought of would be to calculate CRCs for the image files, and then use those as the etag.
This would require CRCs to be calculated every time someone requests an image on the server, so it's very important that it can be done fast.
So, how fast are algorithms to generate CRCs? Or is this a stupid idea?
Upvotes: 2
Views: 829
Reputation: 62504
I would suggest calculating the hash once, when the image is added to the database, and then returning it in the same SELECT as the image itself.
If you are using SQL Server and the images are not very large (max 8000 bytes), you can use the HASHBYTES() function, which can generate SHA-1, MD5, ...
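To illustrate the "hash once at insert time" idea, here is a minimal sketch. It swaps in Python's built-in sqlite3 and hashlib rather than SQL Server and HASHBYTES(), and the `images` table, its columns, and both function names are hypothetical, so treat it as an outline of the pattern only.

```python
import hashlib
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real image store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (name TEXT PRIMARY KEY, data BLOB, etag TEXT)")

def store_image(conn, name, data):
    """Insert the image together with a hash computed once, at write time."""
    etag = hashlib.sha1(data).hexdigest()
    conn.execute(
        "INSERT INTO images (name, data, etag) VALUES (?, ?, ?)",
        (name, data, etag),
    )
    conn.commit()
    return etag

def load_image(conn, name):
    """Return (data, etag) from a single SELECT, or None if the name is unknown."""
    return conn.execute(
        "SELECT data, etag FROM images WHERE name = ?", (name,)
    ).fetchone()
```

With this layout, serving a request never re-hashes the image; the stored value is simply read back with the bytes.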
Upvotes: 1
Reputation: 113252
Depends on the method used, and the length. Generally pretty fast, but why not cache them?
If the files won't change more often than the resolution of the system used to store them (that is, the resolution of file modification times on the filesystem, or of SQL Server datetime if they are stored in a database), then why not just use the modification date, to that resolution?
I know RFC 2616 advises against the use of timestamps, but this is only because HTTP timestamps have 1-second resolution and changes can happen more frequently than that. However:
With this approach you are guaranteed a unique e-tag (collisions are unlikely with a large CRC but certainly possible), which is what you want.
Of course, if you don't ever change the image at a given URI, it's even easier as you can just use a fixed string (I prefer string "immutable").
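The modification-time approach can be sketched as follows, assuming the images live on a filesystem that Python's os.stat can read. The function name is hypothetical, and the nanosecond field is only as precise as the underlying filesystem.

```python
import os

def mtime_etag(path):
    """Build an ETag from the file's modification time at full stored resolution."""
    st = os.stat(path)
    # ETag values are quoted strings; hex just keeps the value compact.
    return f'"{st.st_mtime_ns:x}"'
```

Only file metadata is read here, so the cost does not grow with the size of the image.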
Upvotes: 1
Reputation: 21713
Most implementations, including Microsoft's own, use the last-modified date or other file headers as the ETag, and I suggest you use that method.
Upvotes: 2
Reputation: 81660
Use a more robust hashing algorithm instead, such as SHA1.
Speed depends on the size of the image. Most time will be spent on loading data from the disk, rather than in CPU processing. You can cache your generated hashes.
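For a sense of what the hashing step involves, here is a small sketch using Python's hashlib that reads the file in chunks, so memory use stays flat regardless of image size. The function name and chunk size are arbitrary choices of mine.

```python
import hashlib

def sha1_etag(path, chunk_size=64 * 1024):
    """Hash a file with SHA-1 in fixed-size chunks and return a quoted ETag value."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return f'"{h.hexdigest()}"'
```

Because the whole file still has to be read, the result is worth caching rather than recomputing on every request.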
But I would also advise creating the etag from the file's last update date, which is much quicker and does not require loading the whole file.
Remember, an etag only has to be unique for a particular resource, so it is fine if two different images have the same last update time.
Upvotes: 5