Reputation: 183
I'm trying to configure my spout(s) to read from an Amazon SQS queue, and I want to share the load across multiple spouts.
I understand it's possible to have multiple threads, but can I have two or more different spout instances/applications reading from the same queue and emitting to the same topology? For example, Spout A and Spout B both read from the SQS queue and then both emit to bolt C?
Upvotes: 2
Views: 2845
Reputation: 2135
SQS Queue -----> Spout (N executors).
This model will work perfectly fine. As soon as any executor instance picks up a message, that message becomes invisible to other consumers of the SQS queue for the duration of the visibility timeout.
Keep the message visibility timeout much higher than the message processing time within the Storm topology.
You can put the SQS message deletion logic inside the spout's ack method.
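To make the visibility-timeout behaviour concrete, here is a minimal in-memory sketch. It is not the AWS SDK and not Storm code: the class `VisibilityQueueSketch` and its methods are hypothetical names made up for illustration, and time is passed in explicitly so the behaviour is easy to follow. Real SQS deletes by receipt handle rather than by message id; the sketch simplifies that.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for an SQS-style queue (illustration only).
class VisibilityQueueSketch {
    // Message id -> body, kept in arrival order.
    private final Map<String, String> bodies = new LinkedHashMap<>();
    // Message id -> time (ms) at which the message becomes visible again.
    private final Map<String, Long> invisibleUntil = new HashMap<>();
    private final long visibilityTimeoutMs;

    VisibilityQueueSketch(long visibilityTimeoutMs) {
        this.visibilityTimeoutMs = visibilityTimeoutMs;
    }

    synchronized void send(String id, String body) {
        bodies.put(id, body);
    }

    // Hands out the first visible message id and hides it for the timeout.
    // This is roughly what each spout executor would do in nextTuple().
    synchronized Optional<String> receive(long nowMs) {
        for (String id : bodies.keySet()) {
            if (invisibleUntil.getOrDefault(id, 0L) <= nowMs) {
                invisibleUntil.put(id, nowMs + visibilityTimeoutMs);
                return Optional.of(id);
            }
        }
        return Optional.empty();
    }

    // Removes the message for good. This is what the spout's ack() would trigger.
    synchronized void delete(String id) {
        bodies.remove(id);
        invisibleUntil.remove(id);
    }

    synchronized int size() {
        return bodies.size();
    }

    public static void main(String[] args) {
        VisibilityQueueSketch queue = new VisibilityQueueSketch(30_000);
        queue.send("m1", "hello");
        System.out.println("executor A at t=0s: " + queue.receive(0));
        System.out.println("executor B at t=1s: " + queue.receive(1_000));
        queue.delete("m1"); // executor A's tuple was acked
        System.out.println("left in queue: " + queue.size());
    }
}
```

If the message is not deleted before the timeout expires, the next `receive` hands it out again, which is why the timeout should comfortably exceed the topology's processing time.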
Upvotes: 0
Reputation: 7193
Of course you can have multiple spouts, but you have to set them up so that the same element is not submitted twice (unless your topology accepts that by design). Processing the same element more than once leads to wrong counters, for instance.
As a start, look at Storm's concurrency model, with executors (threads) and tasks (instances) per spout/bolt, and choose the numbers you want.
In your code, you have to make sure you don't handle the same tuple twice or more. Either you do it before Storm (for instance, a queue that doesn't accept the same element twice and is drained by many spouts, or multiple queues, one per spout; beware of transactions), or you do it in Storm (one spout processes only messages with param x, another only those with param y, and a message cannot be both x and y).
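One way to do the "in Storm" variant is to remember recently seen message ids and drop repeats. The following is only a sketch under assumptions: `SeenIdsSketch` and `firstTime` are hypothetical names, the cache is bounded (so a very old id could be accepted again), and it only sees ids that reach the same JVM object. In a real topology you would likely need something like a fields grouping on the message id so that duplicates are routed to the same bolt task.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded "seen ids" cache for dropping duplicate messages.
class SeenIdsSketch {
    private final Map<String, Boolean> seen;

    SeenIdsSketch(final int capacity) {
        // Access-ordered LinkedHashMap that evicts the least recently used id
        // once the capacity is exceeded.
        this.seen = new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns true the first time an id is offered, false for a repeat
    // that is still remembered.
    synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        SeenIdsSketch seenIds = new SeenIdsSketch(1000);
        String[] incoming = {"m1", "m2", "m1"};
        for (String id : incoming) {
            System.out.println(id + (seenIds.firstTime(id) ? " -> process" : " -> skip duplicate"));
        }
    }
}
```

A bolt would call `firstTime(id)` before updating its counters and simply ack and skip the tuple when it returns false.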
Upvotes: 3