Reputation: 9636
I am building a photo site where users can upload photos and view them later. The photos are private, not public. I store the photos and their thumbnails in S3. In my current implementation, when a user opens a page I serve signed URLs for the thumbnails, which are then loaded directly from S3 (though I am also considering signed URLs from CloudFront).
The issues now are:
Please help me decide which approach to take. I want page loads to be fast while still addressing the security concern. I would also like to know whether serving from CloudFront can be faster than the browser cache (I read this somewhere), even when the signed URL is different every time. Feel free to be descriptive in your answer.
Upvotes: 2
Views: 995
Reputation: 13065
I don't think there is a perfect answer to what you want. Some random ideas/tradeoffs:
1) Switch to HTTPS. That way you can ignore people sniffing URLs. Be aware that some browsers have historically been conservative about caching HTTPS responses, so set explicit caching headers on the files.
2) If you are giving out signed URLs, don't set expires = "time + 10m"; use "time + 20m, rounded to a 10m boundary" instead. This way the URL stays the same within each 10-minute window and is still valid for at least 10 minutes, so the browser can cache it. (Be sure to also set the Expires or Cache-Control headers on the files in S3 so the browser knows they can be cached.)
3) You could proxy all the URLs. Have the browser request the photo from your server, then write a web proxy that forwards the request to the photo in S3. Along the way, you can check the user's auth, generate a signed URL for S3, and even cache the photo locally. This seems "less efficient" for you, but it lets the browser cache your URLs for as long as it wants. It's also convenient for your users, since they can bookmark a photo URL and it always works. Even if they move to a different computer, they hit your server, which can ask them to sign in before showing the photo.
Make sure to use an "evented" server like Python Twisted or Node.js. That way, you can proxy thousands of photos at the same time without using a lot of memory/CPU on your server. (You will use a lot of bandwidth, since all data goes through your server. But you can "scale out" easily by running multiple servers.)
4) CloudFront is a cache. It will be slower (by a few hundred ms) the first time a resource is requested from a CF server. But don't expect the second request to be cached! Each CF location has ~20 different servers, and you'll hit a random one each time. So requesting a photo 10 times can generate close to 10 cache misses (about 8 on average if the servers really are picked uniformly at random), and you still have only about a 40% chance (1 - 0.95^10) of a cache hit on the next request. CF is only useful for popular content that is going to be requested hundreds of times. CF is somewhat useful for foreign users because the private CF-to-S3 connection can be better than the normal internet.
I'm not sure exactly how you would have CF do your security checking for you. But if you pass through the S3 auth (not the default), then you could use the "mod 10 minutes" trick from point 2 to make URLs that can be cached for 10 minutes.
It is impossible for CF to be "faster than a browser cache". But if you are NOT using your browser cache, CF can be faster than S3, but mostly in foreign locations.
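The "mod 10 minutes" trick from point 2 can be sketched roughly as follows. This is only an illustration: `rounded_expiry` and `sign_url` are hypothetical helpers, the HMAC signature is a stand-in and not the real S3 or CloudFront signing scheme, and the sketch rounds down to a 10-minute boundary (one reading of "round to nearest 10m"). The point is just that every request inside the same 10-minute window produces the same expiry, and therefore the same URL, which the browser can cache.

```python
import hashlib
import hmac
import time

WINDOW_SECONDS = 600  # 10-minute window, as suggested in point 2


def rounded_expiry(now=None, window=WINDOW_SECONDS):
    """Return an expiry timestamp that is identical for a whole window.

    Adds two windows (20 minutes) and rounds down to a window boundary,
    so the result is always more than one window (10 minutes) away.
    """
    if now is None:
        now = time.time()
    return (int(now) + 2 * window) // window * window


def sign_url(path, expires, secret):
    """Hypothetical signer: an HMAC stand-in, NOT the actual S3 algorithm."""
    message = f"{path}?Expires={expires}".encode()
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{path}?Expires={expires}&Signature={signature}"


if __name__ == "__main__":
    secret = b"example-secret"  # placeholder key for the demo only
    path = "/thumbs/42.jpg"     # placeholder object path
    first = sign_url(path, rounded_expiry(999_600), secret)
    later = sign_url(path, rounded_expiry(1_000_199), secret)
    # Both timestamps fall in the same 10-minute window, so the URLs match.
    print(first == later)
```

Note that a real signer may also embed the signing time in the URL, in which case that value would need the same rounding for the URL to stay constant.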
Take a look at what other people do (e.g. SmugMug uses S3, I think).
Upvotes: 3