Reputation: 152
I wrote a simple script to try to figure out how ZeroRPC performs in terms of message throughput. The server is a simple service that echoes a greeting. The client code is below; note that I'm trying to launch parallel tasks:
import zerorpc
import datetime
import time
import gevent
import threading

N_MSGS = 1000
N_TASKS = 10
N_TASK_STEP = 1

count = dict()

client = zerorpc.Client()
client.connect('tcp://192.168.144.142:80081')


def task(number):
    results = []
    for i in range(N_MSGS):
        # async=True returns a future-like object; .get() later waits for the reply
        # (newer zerorpc releases spell this keyword async_=True)
        results.append(client.hello('Mathias', async=True))
        gevent.sleep(0)
    count[number] = 0
    for r in results:
        if r.get() == 'Hello Mathias':
            count[number] += 1


format_header = '{:<6s} {:<8s} {:<20s} {:<20s} {:<20s}'
format_line = '{:>6d} {:>8d} {:>20s} {:>20d} {:>9.2f}'
print(format_header.format("#RUN", "#TASKS", "TOTAL TIME", "#MSGS", "MSG/SEC"))

run = 1
for i in range(1, N_TASKS, N_TASK_STEP):
    tasks = list()
    for j in range(i):
        tasks.append(gevent.spawn(task, j))
    start = datetime.datetime.now()
    gevent.joinall(tasks)
    end = datetime.datetime.now()
    total_time = end - start

    count_total = 0
    for _, v in count.items():
        count_total += v

    msg_per_sec = count_total / total_time.total_seconds()
    print(format_line.format(run, i, str(total_time),
                             count_total, msg_per_sec))
    run += 1
The server is quite simple:
import zerorpc

class Hello:
    def hello(self, name):
        return 'Hello {}'.format(name)

server = zerorpc.Server(Hello())
server.bind('ipc:///tmp/local')
server.run()
Note that I'm using the IPC socket here. I was expecting that as more tasks are added the throughput would increase up to a certain cap, but the numbers are almost always the same. Even when I switch to a real network (TCP) setup, the numbers do not differ much. Here are the numbers I got with the setup of the code above:
N_RUN  N_TASKS  TOTAL_TIME      N_MSGS  MSG/SEC
    1        1  0:00:01.543374    2000  1295.86
    2        2  0:00:02.454205    4000  1629.85
    3        3  0:00:03.583786    6000  1674.20
    4        4  0:00:04.903248    8000  1631.57
    5        5  0:00:06.133924   10000  1630.27
    6        6  0:00:07.299903   12000  1643.85
    7        7  0:00:10.096884   14000  1386.56
    8        8  0:00:10.437927   16000  1532.87
    9        9  0:00:11.384918   18000  1581.03
   10       10  0:00:12.628328   20000  1583.74
   11       11  0:00:13.691057   22000  1606.88
   12       12  0:00:15.430392   24000  1555.37
   13       13  0:00:16.775109   26000  1549.91
   14       14  0:00:18.021384   28000  1553.70
I also took some measurements with IPC, and I was expecting a bigger difference in performance compared to TCP.
I plotted a chart, see below:
I have the impression that something is limiting the throughput from increasing further.
Maybe I'm missing something in my code. I'm considering using ZeroRPC in an important project and would like to get a sense of its performance.
Just to be clear, since some information was missing: I also ran the benchmark with separate client processes. For the script above, when I start two client processes the throughput per process drops by half; roughly, a single client process gets 2000 msg/sec, and if I start a second one each gets about 1000 msg/sec. I'm not an expert on performance; I just decided to test it myself, since I see the framework as having good potential for a project I'm working on.
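A rough sketch of what I mean by separate client processes (the endpoint, message count, and process counts below are placeholders, not my real setup; each process creates its own zerorpc.Client, and mixing gevent-based libraries with multiprocessing may need some care around forking):

import multiprocessing
import time
import zerorpc

ENDPOINT = 'tcp://127.0.0.1:4242'   # placeholder - point this at the same hello server as above
N_MSGS = 1000                       # messages per process

def worker(n_msgs):
    # each worker process builds its own client, so requests go over independent connections
    client = zerorpc.Client()
    client.connect(ENDPOINT)
    ok = 0
    for _ in range(n_msgs):
        if client.hello('Mathias') == 'Hello Mathias':
            ok += 1
    return ok

if __name__ == '__main__':
    for n_procs in (1, 2, 4):
        start = time.time()
        with multiprocessing.Pool(n_procs) as pool:
            totals = pool.map(worker, [N_MSGS] * n_procs)
        elapsed = time.time() - start
        print('{} processes: {:.0f} msg/s'.format(n_procs, sum(totals) / elapsed))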
Upvotes: 0
Views: 657
Reputation: 1
A few facts first -

ZeroMQ is not the limit here, for several reasons:
- your posted message rate of under ~ 1k7 [MSG/s] is about 4 orders of magnitude below tested figures - even ~ [MB]-sized BLOBs will fly faster than the exemplified "Hello"-messages above
- for ~ 16 [B] payloads, one may tune a "ceiling" above ~ 6,200,000 [MSG/s] (a minimal raw-transport sketch follows this list)

A performance-motivated setup was not even attempted to be sketched, much less implemented.

There is a principal error in the concept of the test: "just"-[CONCURRENT] process scheduling (the more so if one has made just a daisy-chain of event-handlers for the RTT-duration test evaluated above) is by far not anywhere near the domain of true-[PARALLEL] process orchestration - this is an elementary fault in the presented concept (as noted by @bombela already above).

Performance tweaking for co-existing, yet different, transport-class instances requires, for many reasons, much harder work to reach their performance ceilings than polishing a single transport-class setup for ultimate performance.
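For illustration only, a minimal raw ZeroMQ PUSH/PULL probe (plain pyzmq, not zerorpc) shows how far the transport layer itself is from the ~ 1k7 [MSG/s] figure above. The port number and message count are arbitrary, and this measures one-way throughput only, not RPC round-trips:

import threading
import time
import zmq

N = 200000                      # arbitrary message count for the probe
ctx = zmq.Context()

def producer():
    push = ctx.socket(zmq.PUSH)
    push.connect('tcp://127.0.0.1:5555')   # arbitrary local port
    for _ in range(N):
        push.send(b'Hello Mathias')        # ~13 B payload, like the RPC reply above
    push.close()

pull = ctx.socket(zmq.PULL)
pull.bind('tcp://127.0.0.1:5555')

t = threading.Thread(target=producer)
t.start()

start = time.time()
for _ in range(N):
    pull.recv()
elapsed = time.time() - start

t.join()
ctx.term()
print('{:.0f} [MSG/s] one-way'.format(N / elapsed))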
Am I doing something wrong in my benchmark?
No, yet nothing was done there to achieve any better performance either.
Is this level of performance expected?
No - for details, refer to the data rates cited above.
Could there be any limit on the number of connections in the client?
No, but there may be one in practice.
If yes, could I increase the number?
Yes.
Upvotes: 0