Reputation: 13140
Right now I have a Python script that uses Boto to insert a number of messages into SQS -- around 100,000 to 200,000. Simply iterating through the loop without creating SQS messages takes about 3 minutes. With SQS messages, it's dreadfully slow.
What's the best way to speed this up? Should I create a pool of SQS connections and thread the insertion of messages? Should I shard the list of messages to insert and spawn multiple processes each with its own share of the list?
What do experienced Boto users recommend?
Upvotes: 6
Views: 3731
Reputation: 45856
Concurrency is important, whether via threads, multiprocessing, or gevent; take your pick. Also, are you using send_message_batch? It lets you send up to 10 messages in a single call, which also helps a lot.
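Batching plus simple chunking can be sketched like this with classic boto. The `chunks` helper and the queue name are illustrative; `Queue.write_batch` takes `(id, body, delay_seconds)` tuples in classic boto, but check the version you have installed:

```python
import itertools

def chunks(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(itertools.islice(it, size))
        if not batch:
            return
        yield batch

def send_all(queue, bodies):
    """Send message bodies in batches of 10, the SQS batch maximum."""
    for n, batch in enumerate(chunks(bodies, 10)):
        # Each entry is (unique-id-within-batch, body, delay_seconds).
        queue.write_batch([(str(n * 10 + i), body, 0)
                           for i, body in enumerate(batch)])

# Usage (requires AWS credentials; region and queue name are assumptions):
# import boto.sqs
# conn = boto.sqs.connect_to_region("us-east-1")
# queue = conn.get_queue("my-queue")
# send_all(queue, ("message %d" % i for i in range(100000)))
```

Batching alone cuts the number of HTTP round trips by 10x, independent of any concurrency you add on top.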
Upvotes: 4
Reputation: 26882
You could try more concurrency by using eventlet with boto. Have a look at this SO answer: Fastest way to download 3 million objects from an S3 bucket. The same strategy should work with SQS as well.
However, you probably want to make sure there are no other, sillier problems first. Are you testing this from an EC2 instance? If not, spin up an instance in the same region as your SQS endpoint and test from there, to rule out network latency as the bottleneck. If that doesn't help, then try eventlet.
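If eventlet is not an option, the same idea works with a plain thread pool from the standard library. This is a sketch of the pattern, not boto-specific: `send_fn` is a placeholder for whatever sends one message (in real use each thread would wrap its own SQS connection, since boto connections are not guaranteed thread-safe):

```python
import threading
from queue import Queue, Empty  # Python 3; the module is `Queue` on Python 2

def parallel_send(bodies, send_fn, num_workers=20):
    """Drain `bodies` with a pool of worker threads, calling send_fn per item.

    Threads help here because sending to SQS is network-bound, so the GIL
    is released while each request is in flight.
    """
    work = Queue()
    for body in bodies:
        work.put(body)

    def worker():
        while True:
            try:
                body = work.get_nowait()
            except Empty:
                return  # queue fully drained, worker exits
            send_fn(body)
            work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Combining this with send_message_batch (feed the worker lists of 10 bodies instead of single ones) multiplies the two speedups.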
Upvotes: 3