Python: How to trigger multiple processes at the same instant

Question

I am trying to run a process that does an HTTP POST, which in turn sends an alert to a server (sending one alert takes on the order of nanoseconds). I want to test the server's capacity for handling alerts at millisecond granularity. According to its specification, the server can handle 6000 alerts/second.

I wrote some code using the multiprocessing module that sends 6000 alerts, but it uses a for loop, and the loop takes more than a second to execute. As a result, the 6000 processes are not all triggered at the SAME INSTANT.

Is there a way to trigger multiple (N) processes at the same instant?

This is my code: flowtesting.py, which is a library, followed by my test script.

import json
import httplib2

class flowTesting():
    def __init__(self, companyId, deviceIp):
        self.companyId = companyId
        self.deviceIp = deviceIp

    def generate_savedSearchName(self, randNum):
        self.randMsgId = randNum
        self.savedSearchName = "TEST %s risk31 more than 3" % self.randMsgId

    def def_request_body_dict(self):
        self.reqBody_dict = \
            { "Header" : {"agid" : "Agent1",
                          "mid": self.randMsgId,
                          "ts" : 1253125001
            },
              "mp":
                  {
                      "host" : self.deviceIp,
                      "index" : self.companyId,
                      "savedSearchName" : self.savedSearchName,
                  }
            }
        self.req_body = json.dumps(self.reqBody_dict)

    def get_default_hdrs(self):
        self.hdrs = {'Content-type': 'application/json',
                     'Accept-Language': 'en-US,en;q=0.8'}

    def send_request(self, sIp, method="POST"):
        self.sIp = sIp
        self.url = "http://%s:8080/agent/splunk/messages" % self.sIp

        http_cli = httplib2.Http(timeout=180, disable_ssl_certificate_validation=True)
        rsp, rsp_body = http_cli.request(uri=self.url, method=method, headers=self.hdrs, body=self.req_body)
        print "rsp: %s and rsp_body: %s" % (rsp, rsp_body)

# My testScript
from flowTesting import flowTesting
import random
import multiprocessing

deviceIp = "10.31.421.35"
companyId = "CPY0000909"
noMsgToBeSent = 1000
sIp = "10.31.44.235"
uniq_msg_id_list = random.sample(xrange(1,10000), noMsgToBeSent)

def runner(companyId, deviceIp, uniq_msg_id):
    proc = flowTesting(companyId, deviceIp)
    proc.generate_savedSearchName(uniq_msg_id)
    proc.def_request_body_dict()
    proc.get_default_hdrs()
    proc.send_request(sIp)

process_list = []
for uniq_msg_id in uniq_msg_id_list:
    savedSearchName = "TEST-1000 %s risk31 more than 3" % uniq_msg_id

    process = multiprocessing.Process(target=runner, args=(companyId,deviceIp,uniq_msg_id,))
    process.start()
    process.join()
    process_list.append(process)

print "Process list: %s" % process_list
print "Unique Message Id: %s" % uniq_msg_id_list

abarnert · Accepted Answer

Making them all happen in the same instant is obviously impossible—unless you have a 6000-core machine and an OS kernel whose scheduler is able to handle them all perfectly (which you don't), you can't get 6000 pieces of code running at once.

And, even if you did, what they're all trying to do is to send a message on a socket. Even if your kernel was that insanely parallel, unless you have 6000 separate NICs, they're going to end up serialized in the NIC buffer. That's the way IP works: one packet after another. And of course there are all the routers on the path, the server's NIC, the server's OS, etc. And even if IP doesn't get in the way, bytes take time to transfer over a cable. So the only way to do this at the same instant, even in theory, would be to have 6000 NICs on each side and wire them up directly to each other with identical fiber.

However, you don't really need them in the same instant, just closer to each other than they are. You didn't show us your code, but presumably you're just starting 6000 Processes that all immediately try to send a message. That means you're including the process startup time—which can be pretty slow (especially on Windows)—in the skew time.

You can reduce that by using threads instead of processes. That may seem counterintuitive, but Python is pretty good at handling I/O-bound threads, and every modern OS is very good at starting new threads.

But really, what you need is a Barrier on your threads or processes, to let all of them complete all the setup work (including process startup) before any of them try to do any work.

It still probably won't be tight enough, but it will be a lot tighter than you probably have right now.
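As a rough illustration of the barrier idea, here is a minimal sketch using threads. It assumes Python 3.2 or later (where `threading.Barrier` is available), whereas the question's code is Python 2. The names `run_with_barrier` and `fake_send` are hypothetical, and `fake_send` is only a stand-in for the real HTTP POST.

```python
import threading
import time

def run_with_barrier(n_workers, action):
    """Start n_workers threads, release them together, return their start timestamps."""
    barrier = threading.Barrier(n_workers)
    start_times = [None] * n_workers

    def worker(index):
        # Per-worker setup (building the request body, headers, etc.) would go here,
        # so that it happens before the barrier rather than after it.
        barrier.wait()  # blocks until all n_workers threads have arrived
        start_times[index] = time.perf_counter()
        action(index)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return start_times

def fake_send(index):
    # Placeholder for the real HTTP POST (hypothetical; sends no network traffic).
    pass

if __name__ == "__main__":
    times = run_with_barrier(50, fake_send)
    # Spread between the first and last worker getting past the barrier.
    print("skew: %.6f s" % (max(times) - min(times)))
```

The printed skew gives a rough idea of how tightly the workers are released on a given machine. `multiprocessing.Barrier` offers the same interface if processes are preferred, though the barrier object then has to be passed to each child process.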

The next limit you're going to face is context-switching time. Modern OSs are pretty good at scheduling, but not 6000-simultaneous-tasks good. So really, you want to reduce this to N processes, each one just spamming 6000/N connections sequentially as fast as possible. That will get them into the kernel/NIC much faster than trying to do 6000 at once and making the OS do the serialization for you. (In fact, on some platforms, depending on your hardware, you might actually be better off with one process doing 6000 in a row than N doing 6000/N. Test it both ways.)
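The "N workers, each sending 6000/N in a row" layout might look something like the sketch below. It assumes Python 3 and uses a thread pool for brevity; `concurrent.futures.ProcessPoolExecutor` has the same interface if processes are wanted. The helper names (`split_evenly`, `send_chunk`, `send_all`) are hypothetical, and the actual POST is left as a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

def split_evenly(items, n_chunks):
    """Split items into n_chunks lists whose lengths differ by at most one."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def send_chunk(message_ids):
    """Send one worker's share back to back; returns how many were sent."""
    sent = 0
    for message_id in message_ids:
        # Placeholder for the real HTTP POST for message_id (hypothetical).
        sent += 1
    return sent

def send_all(message_ids, n_workers):
    """Give each of n_workers one chunk and total up what they report sending."""
    chunks = split_evenly(message_ids, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sum(pool.map(send_chunk, chunks))

if __name__ == "__main__":
    print(send_all(list(range(6000)), 8))
```

Varying `n_workers` (including 1) and timing each run is one way to do the "test it both ways" comparison.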

There's still some overhead for the socket library itself. To get around that, you want to pre-craft all of the IP packets, then create a single raw socket and spam those packets. Send the first packet from each connection, then the second packet from each connection, etc.
