Marco

Reputation: 1172

AWS Architecture for product feed

As a quick summary, I need to build a product feed that will be constantly updated based on people's searches on the site. Important things to keep in mind:

With all of this in mind, here are the possible architectures I have thought of (I'm open to, and encourage, new ones):

In both, I get the product information by having people on the site make a request to API Gateway with the product as a parameter, which goes through a proxy integration to Lambda, where I parse all the data (a rough sketch of that step follows the list below). After that I can:

  1. Store the data by day on S3 and run a daily EC2 job that retrieves all records from the day before and crosses them against a query on a Redshift cluster. Once all rows that need to be updated are detected, update the Redshift table.
  2. Use ElastiCache and, in real time, evaluate whether the row (by id) needs to be updated (and update it) directly from Lambda.
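
For illustration, here is a minimal sketch of the shared ingestion step for option 1, assuming an API Gateway Lambda proxy integration; the `FEED_BUCKET` name and the S3 key layout are hypothetical, not something fixed in my design:

```python
import json
import os
import uuid
from datetime import datetime, timezone

import boto3

s3 = boto3.client("s3")

# Hypothetical bucket name; replace with your own.
BUCKET = os.environ.get("FEED_BUCKET", "my-product-feed-raw")


def handler(event, context):
    # With a Lambda proxy integration, API Gateway passes the query string
    # parameters through on the event unchanged.
    params = event.get("queryStringParameters") or {}
    product_id = params.get("product")
    if not product_id:
        return {"statusCode": 400,
                "body": json.dumps({"error": "missing product param"})}

    now = datetime.now(timezone.utc)
    record = {"product_id": product_id, "searched_at": now.isoformat()}

    # Option 1: stage the raw search event in S3, partitioned by day,
    # so a nightly job can pick up yesterday's prefix in one listing.
    key = f"searches/{now:%Y-%m-%d}/{uuid.uuid4()}.json"
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(record))

    return {"statusCode": 200, "body": json.dumps({"stored": key})}
```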

My biggest concern is being cost-efficient. Thoughts? Any other variables I should consider? Any other solutions I should look into?

Upvotes: 1

Views: 123

Answers (2)

Kfactor21

Reputation: 412

Some of the considerations you might want to evaluate:

- How soon does the processed data need to be made available?

- In what format does the processed data need to be made available?

- At what grain does the processed data need to be made available?

- What volume of data do you expect from the web layer per pull request?

- You've mentioned ElastiCache; what latency (in seconds) can your application withstand? Apart from the in-memory staging of data, is there any other reason for using ElastiCache? A NoSQL service like DynamoDB is a good choice in most situations.

- Does the solution require real-time writes to Redshift? (Frequent random inserts into Redshift are an antipattern!)

- Updates to Redshift work best when you flag the record to be updated as "old" and insert a new record (see the sketch after this list).

- Lambda (as you might already know) has a processing time cap of 300 seconds, so you might want to trial whether your Lambda transformation could hit that cap.

- Also, an AWS RDS engine like Aurora is cheaper than Redshift and can store up to 64 TB of data, so it could be a good data store offering the flexibility of an OLTP system.
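
To make the flag-old-and-insert pattern concrete, here is a rough Python sketch; the `product_feed` table, the `is_current` column, and the connection details are all hypothetical and would come from your own schema and configuration:

```python
import os

import psycopg2

# Hypothetical connection details; in practice pull these from
# environment variables or Secrets Manager.
conn = psycopg2.connect(
    host=os.environ["REDSHIFT_HOST"],
    port=5439,
    dbname="feed",
    user="feed_user",
    password=os.environ["REDSHIFT_PASSWORD"],
)


def upsert_product(product_id, new_payload):
    """Flag the current row as superseded, then append a fresh one,
    instead of issuing random UPDATEs against Redshift."""
    with conn, conn.cursor() as cur:
        # Mark any currently-active row for this product as old.
        cur.execute(
            "UPDATE product_feed SET is_current = FALSE "
            "WHERE product_id = %s AND is_current = TRUE",
            (product_id,),
        )
        # Append the new version; readers filter on is_current = TRUE.
        cur.execute(
            "INSERT INTO product_feed (product_id, payload, is_current) "
            "VALUES (%s, %s, TRUE)",
            (product_id, new_payload),
        )
```

In practice you would batch these statements (e.g. load the new rows via COPY and flip the flags in one set-based UPDATE) rather than run them per record.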

Upvotes: 1

Noel Llevares

Reputation: 16087

To reduce your costs, try the following:

  1. You can replace the daily EC2 task with a Lambda function triggered by a CloudWatch scheduled event (a sketch of this follows the list). It's free!

  2. Instead of ElastiCache, use DynamoDB. It's free.

  3. I don't know why you're using Redshift. If it can be replaced with RDS, Elasticsearch, or even DynamoDB, I think that would make it even cheaper.
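
Putting points 1 and 2 together, a rough sketch of a daily scheduled Lambda that replaces the EC2 job by folding the previous day's S3 objects into DynamoDB; the day-partitioned `searches/` prefix and the `product_feed` table name are assumptions, not fixed names from the question:

```python
import json
import os
from datetime import datetime, timedelta, timezone

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource("dynamodb")

# Hypothetical names; substitute your own bucket and table.
BUCKET = os.environ.get("FEED_BUCKET", "my-product-feed-raw")
TABLE = dynamodb.Table(os.environ.get("FEED_TABLE", "product_feed"))


def handler(event, context):
    """Runs once a day from a CloudWatch scheduled event and folds
    yesterday's staged search records into DynamoDB."""
    yesterday = datetime.now(timezone.utc) - timedelta(days=1)
    prefix = f"searches/{yesterday:%Y-%m-%d}/"

    paginator = s3.get_paginator("list_objects_v2")
    count = 0
    for page in paginator.paginate(Bucket=BUCKET, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            record = json.loads(body)
            # Upsert by product id; put_item overwrites any existing item,
            # which plays the role of the "update if needed" step.
            TABLE.put_item(Item={
                "product_id": record["product_id"],
                "last_searched_at": record["searched_at"],
            })
            count += 1

    return {"processed": count}
```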

Upvotes: 1
