Reputation: 1415
I want to issue HTTP requests in parallel. Here is a skeleton of my code using Ray:
@ray.remote
def issue_request(user_id):
    r = requests.post(url, json=json, headers=headers)

ray.get([issue_request.remote(id_token[user_id]) for _ in range(500)])
This runs much slower than the following:
def issue_request(user_id):
    r = requests.post(url, json=json, headers=headers)

jobs = []
for i in range(500):
    # args must be a tuple, hence the trailing comma
    process = multiprocessing.Process(target=issue_request,
                                      args=(admin_id,))
    jobs.append(process)
for j in jobs:
    j.start()
# Ensure all of the processes have finished
for j in jobs:
    j.join()
The machine has two cores, and Ray seems to start only two processes to handle the 500 requests. How can I tell Ray to start one worker process per request?
Upvotes: 0
Views: 5496
Reputation: 3372
You can call ray.init(num_cpus=10) to tell Ray to schedule up to 10 tasks concurrently. There is more information about resources in Ray at https://ray.readthedocs.io/en/latest/resources.html.
By default, Ray infers the number of cores using something like os.cpu_count().
Starting 500 processes simultaneously would probably be excessive. In the multiprocessing case, each process exits as soon as it finishes, so you probably never have 500 running at once.
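To illustrate the idea of capping concurrency instead of starting one worker per request, here is a minimal sketch using only the standard library. Note that it uses a thread pool, not Ray and not multiprocessing, on the assumption that the work is I/O-bound (waiting on HTTP responses). The names fake_request, run_all, MAX_WORKERS and NUM_REQUESTS are hypothetical, and fake_request is only a stand-in for the real requests.post call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 10   # hypothetical cap, playing the same role as num_cpus=10
NUM_REQUESTS = 50  # kept small for the sketch; the question uses 500


def fake_request(i):
    """Hypothetical stand-in for requests.post(...): waits briefly, returns its input."""
    time.sleep(0.01)
    return i


def run_all(n=NUM_REQUESTS, max_workers=MAX_WORKERS):
    # At most max_workers calls are in flight at once; the rest wait in a queue.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(fake_request, range(n)))


results = run_all()
```

Raising MAX_WORKERS trades more simultaneous connections for more memory and scheduling overhead, which is the same trade-off as raising num_cpus in Ray.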
Upvotes: 4