user1274878

Reputation: 1415

Python ray code working slower as compared to python multi-processing

I want to issue HTTP requests in parallel. Here is what my code (skeleton) looks like when using Ray:

import ray
import requests

@ray.remote
def issue_request(user_id):
    # Issue one POST request for this user
    r = requests.post(url, json=json, headers=headers)

ray.get([issue_request.remote(id_token[user_id]) for user_id in range(500)])

This is running much slower as compared to the following:

import multiprocessing

def issue_request(user_id):
    r = requests.post(url, json=json, headers=headers)

jobs = []
for i in range(500):
    process = multiprocessing.Process(target=issue_request,
                                      args=(admin_id,))
    jobs.append(process)

for j in jobs:
    j.start()

# Ensure all of the processes have finished
for j in jobs:
    j.join()

The machine has two cores, and it seems that Ray only starts two processes to handle the 500 requests. Can someone please tell me how to make Ray start one worker/process per request?

Upvotes: 0

Views: 5496

Answers (1)

Robert Nishihara
Robert Nishihara

Reputation: 3372

You can do ray.init(num_cpus=10) to tell Ray to schedule up to 10 tasks concurrently. There is more information about resources in Ray at https://ray.readthedocs.io/en/latest/resources.html.

By default Ray will infer the number of cores using something like os.cpu_count().
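As a minimal sketch of what this might look like for the task in the question (the URL and payload here are placeholders; substitute the real url, json, and headers):

import ray
import requests

# Assumption: allow up to 10 concurrent tasks instead of the inferred core count
ray.init(num_cpus=10)

@ray.remote
def issue_request(user_id):
    # Placeholder endpoint and payload for illustration only
    r = requests.post("https://example.com/api", json={"user": user_id})
    return r.status_code

# Launch all 500 tasks; ray.get blocks until every task has finished
results = ray.get([issue_request.remote(user_id) for user_id in range(500)])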

Starting 500 processes simultaneously would probably be excessive. In the multiprocessing case, the processes exit once they finish, so you probably never have 500 around concurrently.

Upvotes: 4
