Reputation: 3151
I would like to know if AWS SQS is the right service for doing browser polling.
For example :
1) User accesses the application through the browser and requests a large PDF to be generated
2) API responds back with "OK" to user and forwards the request to SQS
3) SQS queue is being read by a lambda which generates the PDF and stores it to S3.
Now, at some point between steps 2 and 3, the user's browser wants to know when the PDF is done (no email). It could do this by polling SQS for a specific message ID (is this even possible?), but I have some questions:
a) Is it "okay" for both the user and lambda to be reading the same message from SQS? And what about too many users overloading SQS with polling requests?
b) Can an SQS message be edited/updated? How would the user know that the lambda finished the PDF and get the download link? Can the lambda edit the message so that it contains the link to S3? If not, what would be the recommended way/AWS service for the user to know when the PDF is done without wasting too many resources?
And preferably without needing a database just for this... We really don't have too many users, but we're trying to do things right and make them future proof.
Tagging boto as I'm doing all this in Python... eventually.
Upvotes: 3
Views: 1702
Reputation: 1160
I would suggest WebSockets as a way to push the notification back to the browser, instead of letting the browser poll (i.e., periodically send a GetObject API call) for the PDF file in S3. This approach also lets you notify the browser if an error occurs during PDF generation.
For more details, please watch https://www.youtube.com/watch?v=3SCdzzD0PdQ (from 6:40).
At 10:27 you will find a diagram that matches what you are trying to achieve (replace the DynamoDB component with S3).
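As a rough illustration of the push step, here is a minimal sketch of what the PDF-generating Lambda could call once it is done. It assumes `ws_client` is a boto3 `apigatewaymanagementapi` client pointed at your WebSocket API's connection endpoint, and that you stored the browser's connection ID somewhere when it connected. The function name `notify_browser` and the payload fields are made up for this example.

```python
import json


def notify_browser(ws_client, connection_id, bucket, key, error=None):
    """Push the job result to a single WebSocket connection.

    ws_client is assumed to be a boto3 "apigatewaymanagementapi" client
    created with endpoint_url set to the WebSocket API's connection URL.
    The payload layout is hypothetical; use whatever your front end expects.
    """
    if error is None:
        payload = {"status": "done", "bucket": bucket, "key": key}
    else:
        # The same channel can report a failed PDF generation.
        payload = {"status": "failed", "error": str(error)}

    ws_client.post_to_connection(
        ConnectionId=connection_id,
        Data=json.dumps(payload).encode("utf-8"),
    )
    return payload
```

Passing the client in as an argument keeps the function easy to test without touching AWS.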
I also think the WebSocket-based approach is cheaper than the polling approach, based on comparing S3 request pricing [1] with WebSocket pricing [2]. However, you will need to run a test that reflects your production workload to validate this.
[1] https://aws.amazon.com/s3/pricing/#Request_pricing
[2] "WebSocket APIs" in https://aws.amazon.com/api-gateway/pricing/
Upvotes: 4
Reputation: 3107
You don't want to use SQS for this - you can read at most 10 messages per poll, and if your queue holds a lot of messages you may (will) see the same ones over and over as you keep polling, i.e. there is no guarantee that you will ever see all of them. Not to mention the hell you will enter with visibility timeouts when trying to make this work with multiple clients polling your queue.
Your output PDF is going to S3, so you could do the following: let the Lambda in step (2) construct a unique S3 key for the output PDF and send that key back to the client in the "OK" response. Then make the client poll the output bucket using that key. Of course, the Lambda that generates the PDF must write it under this same key, so the key has to travel with the request through SQS.
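A minimal sketch of that step (2) handler, assuming `sqs_client` is a boto3 SQS client. The function name `accept_pdf_request`, the `generated/` prefix and the message fields are invented for the example; only `send_message` is the real boto3 call.

```python
import json
import uuid


def accept_pdf_request(sqs_client, queue_url, request_params, prefix="generated/"):
    """Pick the output key up front, enqueue the job, and return the key.

    The key is sent inside the SQS message so the worker Lambda writes the
    PDF to exactly the location the client will poll.
    """
    key = f"{prefix}{uuid.uuid4()}.pdf"  # unique per request
    sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"output_key": key, "params": request_params}),
    )
    # This dict is what goes back to the browser as the "OK" response.
    return {"status": "OK", "key": key}
```

A random UUID keeps keys unguessable enough that one user is unlikely to stumble on another user's PDF, but it is not a substitute for proper access control on the bucket.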
To poll from the browser, use GetObject. You will need to configure CORS on the output bucket for this to work.
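In the browser you would write this in JavaScript, but the polling loop itself looks roughly like the Python sketch below. It assumes `s3_client` is a boto3 S3 client; `wait_for_pdf`, the interval and the timeout are made-up names and values. A missing object is detected from the error code on the raised exception, which is read via its `response` attribute so the sketch does not need to import botocore.

```python
import time


def wait_for_pdf(s3_client, bucket, key, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll GetObject until the PDF exists; return the response, or None on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return s3_client.get_object(Bucket=bucket, Key=key)
        except Exception as exc:
            # boto3 raises ClientError with a .response dict; a missing key
            # shows up as "NoSuchKey". Anything else is a real error.
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            if code not in ("NoSuchKey", "404"):
                raise
        if time.monotonic() >= deadline:
            return None  # give up; the PDF never appeared
        sleep(interval)
```

Note that if the caller lacks `s3:ListBucket` permission, S3 answers with an access-denied error instead of "NoSuchKey" for a missing object, so the permissions need to be set up with that in mind.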
Upvotes: 2