Reputation: 15760
Folks, I'm trying to feed a log file into RabbitMQ with the pika client:
import time

import pika

credentials = pika.PlainCredentials('username', 'password')
parameters = pika.ConnectionParameters(credentials=credentials, host='ec2privateip', port=5672, virtual_host='/')
connection = pika.BlockingConnection(parameters)
channel = connection.channel()
channel.queue_declare(queue='blahqueue')

# Follow the log file like `tail -f`, publishing each new line.
f = open(r'apicalls.log', 'r')
while True:
    line = f.readline()
    if not line:
        time.sleep(1)  # no new data yet; wait before polling again
    else:
        # With the default exchange, the routing key must be the queue name.
        channel.basic_publish(exchange='', routing_key='blahqueue', body=line)
Performance-wise, one EC2 machine sends about 300 messages/second, and this does not change between m1.small and m1.large.
For better performance, should I invest time in rewriting the above in C, or should I look elsewhere?
Tests run locally on the RabbitMQ machine itself show exactly the same rate.
If I run the runjava.sh com.rabbitmq.examples.MulticastMain test locally, I see about 10K messages/second. This leads me to believe that either the Python client is slow or I am not testing the setup properly.
Upvotes: 2
Views: 1544
Reputation: 49085
You're probably not going to see an improvement from rewriting the above in C, as the file system and your time.sleep call are the bottleneck. I'm not entirely sure about Amazon EC2, but in general, when you upgrade to a faster machine you don't necessarily get faster file system I/O.
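One way to check whether file I/O is the limit is to time the read on its own, with no RabbitMQ involved. The following is only a rough sketch; the helper name `count_lines_per_second` is made up for illustration, and it reads an existing file once rather than tailing it.

```python
import time


def count_lines_per_second(path):
    """Read a file line by line and return (line_count, lines_per_second)."""
    start = time.perf_counter()
    lines = 0
    with open(path, 'r') as f:
        for _ in f:
            lines += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small files.
    rate = lines / elapsed if elapsed > 0 else float('inf')
    return lines, rate

# Hypothetical usage with the log file from the question:
# lines, rate = count_lines_per_second('apicalls.log')
# print(lines, rate)
```

If the read rate comes out far above 300 lines/second, the file system is unlikely to be what holds the publisher back.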
Also, there is a difference between publish speed and consumption speed. Make sure it's the publish speed that is the problem.
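To look at the publish side in isolation, you could time a batch of publishes from memory, with no file reads and no sleep. This is a minimal sketch: `measure_publish_rate` is a hypothetical helper that accepts any callable, so it can wrap `channel.basic_publish` or a stand-in.

```python
import time


def measure_publish_rate(publish, bodies):
    """Call publish(body) for each body; return (count, messages_per_second)."""
    start = time.perf_counter()
    count = 0
    for body in bodies:
        publish(body)
        count += 1
    elapsed = time.perf_counter() - start
    # Guard against a zero interval on very small batches.
    rate = count / elapsed if elapsed > 0 else float('inf')
    return count, rate

# Hypothetical usage, assuming `channel` is the open pika channel from the question:
# bodies = ['x' * 100] * 10000
# count, rate = measure_publish_rate(
#     lambda b: channel.basic_publish(exchange='', routing_key='blahqueue', body=b),
#     bodies)
```

Comparing this number with the rate you see when reading from the log file should show which side of the loop is slow.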
Upvotes: 1