Henry

Reputation: 3457

Uploading files to ec2, first to ebs volume then moving to s3

http://farm8.staticflickr.com/7020/6702134377_cf70482470_z.jpg

OK, sorry for the terrible drawing, but it seemed a better way to organize my thoughts and convey them. I have been wrestling for a while with how to create an optimal, decoupled, easily scalable system for uploading files to a web app on AWS.

Uploading directly to S3 would work except that the files need to be instantly accessible to the uploader for manipulation; once manipulated, they can go to S3, where they will be served to all instances.

I played with the idea of creating a SAN with something like GlusterFS, then uploading directly to that and serving from it. I have not ruled it out, but from various sources I gather the reliability of this solution might be less than ideal (if anyone has better insight on this, I would love to hear it). In any case, I wanted to formulate a more "out of the box" (in the context of AWS) solution.

So, to elaborate on the diagram: I want the file to be uploaded to the local filesystem of whichever instance the request happens to hit, which is an EBS volume. The storage location of the file would not be served to the public (e.g. /tmp/uploads/), but it could still be accessed by the instance through a readfile() operation in PHP, so the user could see and manipulate it right after uploading. Once the user is finished manipulating the file, a message to move it to S3 could be queued in SQS.
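
To make this concrete, here is a rough sketch of the kind of upload handler I have in mind. The paths, queue URL, and AWS SDK for PHP usage are just placeholders, not something I've settled on:

```php
<?php
// Sketch of the upload handler on whatever instance the ELB routes the request to.
// Assumes the AWS SDK for PHP is installed via Composer; all names are placeholders.
require 'vendor/autoload.php';

use Aws\Sqs\SqsClient;

$uploadDir = '/tmp/uploads/';                      // EBS-backed, not publicly served
$fileName  = basename($_FILES['upload']['name']);
$localPath = $uploadDir . uniqid() . '-' . $fileName;

// 1. Store the upload on the instance's local (EBS) filesystem.
move_uploaded_file($_FILES['upload']['tmp_name'], $localPath);

// 2. The user can immediately view/manipulate it, e.g. streamed back with readfile($localPath).

// 3. When the user is done, queue a "move this file to S3" job in SQS.
$sqs = new SqsClient(['version' => 'latest', 'region' => 'us-east-1']);
$sqs->sendMessage([
    'QueueUrl'    => 'https://sqs.us-east-1.amazonaws.com/123456789012/move-to-s3', // placeholder
    'MessageBody' => json_encode(['path' => $localPath]),
]);
```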

My question, then: once I save the file "locally" on the instance (which could be any instance, due to the load balancer), how can I record which instance it is on (in the DB) so that subsequent PHP requests to read or move the file will find it?
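
The only part I can picture so far (and I'm not sure it's right) is grabbing the instance ID from the EC2 metadata service and storing it next to the file record; the table and column names below are made up:

```php
<?php
// Sketch: record which instance holds the local copy (schema and credentials are hypothetical).
$localPath  = '/tmp/uploads/abc123-photo.jpg';  // wherever the upload landed
$instanceId = file_get_contents('http://169.254.169.254/latest/meta-data/instance-id');

$db   = new PDO('mysql:host=mydb;dbname=app', 'user', 'pass');
$stmt = $db->prepare(
    'INSERT INTO uploads (local_path, instance_id, uploaded_at) VALUES (?, ?, NOW())'
);
$stmt->execute([$localPath, $instanceId]);
```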

If anyone with more experience in this has some insight I would be very grateful. Thanks.

Upvotes: 1

Views: 3530

Answers (1)

Stephen Harrison

Reputation: 699

I have a suggestion for a different design that might solve your problem.

Why not always write the file to S3 first? And then copy it to the local EBS file system on whichever node you're on while you're working on it (I'm not quite sure what manipulations you need to do, but I'm hoping it doesn't matter). When you're finished modifying the file, simply write it back to S3 and delete it from the local EBS volume.
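
A minimal sketch of that round trip, assuming the AWS SDK for PHP and made-up bucket, key, and path names:

```php
<?php
// Sketch of the S3-first flow: pull the object down, work on it locally, push it back, clean up.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3        = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);
$bucket    = 'my-app-uploads';              // placeholder
$key       = 'uploads/1234/photo.jpg';      // placeholder
$localPath = '/tmp/uploads/photo.jpg';

// 1. Copy the current version from S3 to the local EBS filesystem.
$s3->getObject(['Bucket' => $bucket, 'Key' => $key, 'SaveAs' => $localPath]);

// 2. ...manipulate $localPath however you need to...

// 3. Write the result back to S3 and delete the local copy.
$s3->putObject(['Bucket' => $bucket, 'Key' => $key, 'SourceFile' => $localPath]);
unlink($localPath);
```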

In this way, none of the nodes in your cluster need to know which of the others might have the file because the answer is it's always in S3. And by deleting the file locally, you get a fresh version of the file if another node updates it.

Another thing you might consider, if it's too expensive to copy the file from S3 every time (it's too big, or you don't like the latency): you could turn on session affinity in the load balancer (AWS calls this sticky sessions). This can be handled by your own cookie or by the ELB. Now subsequent requests from the same browser come to the same cluster node. Simply check the modified time of the file on the local EBS volume against the S3 copy, and replace the local copy if the S3 version is more recent. Then you get to take advantage of the local EBS filesystem while the file is being worked on.
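
With sticky sessions on, that freshness check could be as simple as comparing the local file's mtime to the S3 object's LastModified. Again just a sketch with placeholder names, assuming the AWS SDK for PHP:

```php
<?php
// Sketch: re-download from S3 only if the S3 copy is newer than the local EBS copy.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3        = new S3Client(['version' => 'latest', 'region' => 'us-east-1']);
$bucket    = 'my-app-uploads';              // placeholder
$key       = 'uploads/1234/photo.jpg';      // placeholder
$localPath = '/tmp/uploads/photo.jpg';

$head       = $s3->headObject(['Bucket' => $bucket, 'Key' => $key]);
$s3Modified = $head['LastModified']->getTimestamp();   // DateTime-like object from the SDK

// Replace the local copy only when S3 has a more recent version (or there is no local copy yet).
if (!file_exists($localPath) || filemtime($localPath) < $s3Modified) {
    $s3->getObject(['Bucket' => $bucket, 'Key' => $key, 'SaveAs' => $localPath]);
}
```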

Of course there are a bunch of things I don't get about your system. Apologies for that.

Upvotes: 4
