Reputation: 124
Context
I am developping a webapp that
I am planning to deploy this application on the AWS. More specifically, using EC2 and S3.
Challenge
I am trying to come up with a design that is both cost-effective and performant to offer this service.
Analysis
The following assumptions are used:
Consider the following application flow:
In terms of network performance, step 1 and 2 will be the bottlenecks as EC2 has limited downloading and uploading bandwidth. Step 4 should not be a problem since S3 is taking care of the bandwidth for transferring file to the end user.
In terms of costs, fixed costs are the EC2 instances, and the main variable cost is step 4, where AWS charges 0.09$/GB in data transfer. Since the files are removed after 24 hours, the storage fee is comparatively tiny.
Question
Have I correctly identified the performance bottlenecks in this application flow?
Is my cost analysis correct?
Is this the optimal flow in terms of costs? Is there any way to further reduce the cost?
Since step 1 and step 2 (downloading from Internet and uploading to S3) will be very bandwidth-consuming when simultaneously downloading multiple large files, will it significantly affect the responsiveness of my server to serve regular API requests? Should I use a dedicated EC2 instance just for handling API calls from the clients, and another dedicated EC2 instance just for downloading and uploading? This will slightly further complicate the design, as I will have to manage the communication between the 2 instances as well.
Upvotes: 3
Views: 185
Reputation: 11
How about allowing the client to upload files directly to S3?
Your application would generate a pre-signed url, so that you can control which users can upload files, but after that the client interacts directly with S3. This would remove the costly "download then upload" process in steps 1 & 2.
See this document http://docs.aws.amazon.com/AmazonS3/latest/dev/PresignedUrlUploadObject.html
Upvotes: 1
Reputation: 74
Can you use more AWS Services? Are you aware of AWS Lambda? https://aws.amazon.com/lambda/details/ It can perform actions in response to actions, e.g. it could delete a file from S3 shortly after it is downloaded. http://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html This alleviates the need to track downloads and delete them, once you get past the learning curve of AWS Lambda. It can also handle other processing, so you only have to upload to S3 from EC2.
Regarding cost, S3 has different quality levels, and the "reduced redundancy" might be sufficient for your needs, saving a little money.
Upvotes: 2