David Adrian

Reputation: 1109

gRPC Python thread_pool vs max_concurrent_rpcs

When launching a Python grpc.server, what is the difference between maximum_concurrent_rpcs and the max_workers used in the thread pool? If I want maximum_concurrent_rpcs=1, should I still provide more than one thread to the thread pool?

In other words, should I match maximum_concurrent_rpcs to my max_workers, or should I provide more workers than max concurrent RPCs?

import grpc
from concurrent import futures

server = grpc.server(
    thread_pool=futures.ThreadPoolExecutor(max_workers=1),
    maximum_concurrent_rpcs=1,
)

Upvotes: 15

Views: 8955

Answers (1)

Attila123

Reputation: 1042

If your server is already processing maximum_concurrent_rpcs requests concurrently and yet another request is received, that request will be rejected immediately.

If the ThreadPoolExecutor's max_workers is less than maximum_concurrent_rpcs, then once all the threads are busy processing requests, the next request is queued and processed when a thread finishes its work.
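As a minimal sketch of the two knobs together (the numbers here are illustrative, not from the answer): with this configuration the server accepts up to 4 RPCs at once, but only 2 threads actually run handlers, so up to 2 accepted RPCs wait in the executor's queue.

import grpc
from concurrent import futures

server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=2),  # threads that actually run handlers
    maximum_concurrent_rpcs=4,                  # RPCs accepted before new ones are rejected
)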

I had the same question. To answer it, I debugged a bit to see what happens with maximum_concurrent_rpcs. The debugging led me to py36/lib/python3.6/site-packages/grpc/_server.py in my virtualenv; search for concurrency_exceeded there. The bottom line is that if the server is already processing maximum_concurrent_rpcs requests and another one arrives, it is rejected:

# ...
elif concurrency_exceeded:
    return _reject_rpc(rpc_event, cygrpc.StatusCode.resource_exhausted,
                        b'Concurrent RPC limit exceeded!'), None
# ...
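On the client side, that rejection surfaces as a grpc.RpcError. A minimal sketch, assuming the quickstart's generated helloworld_pb2 / helloworld_pb2_grpc modules and a server on localhost:50051:

import grpc
import helloworld_pb2
import helloworld_pb2_grpc

with grpc.insecure_channel('localhost:50051') as channel:
    stub = helloworld_pb2_grpc.GreeterStub(channel)
    try:
        reply = stub.SayHello(helloworld_pb2.HelloRequest(name='you'))
        print(reply.message)
    except grpc.RpcError as err:
        # When the server is already at maximum_concurrent_rpcs, this prints
        # StatusCode.RESOURCE_EXHAUSTED and 'Concurrent RPC limit exceeded!'
        print(err.code(), err.details())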

I tried it with the gRPC Python Quickstart example:

In greeter_server.py I modified the SayHello() method:

# ...
def SayHello(self, request, context):
    print("Request arrived, sleeping a bit...")
    time.sleep(10)  # simulate a slow handler (requires "import time" at the top)
    return helloworld_pb2.HelloReply(message='Hello, %s!' % request.name)
# ...

and the serve() method:

def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10), maximum_concurrent_rpcs=2)
    # ...
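For context, a complete serve() along these lines would look roughly like this (a sketch based on the quickstart; the Greeter servicer, the generated helloworld_pb2_grpc module, and port 50051 come from the quickstart, not from this answer, and server.wait_for_termination() assumes a reasonably recent gRPC Python version):

def serve():
    server = grpc.server(
        futures.ThreadPoolExecutor(max_workers=10),
        maximum_concurrent_rpcs=2,
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port('[::]:50051')
    server.start()
    server.wait_for_termination()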

Then I opened 3 terminals and executed the client in them manually (as fast as I could) using python greeter_client.py.

As expected, processing of the first 2 clients' requests started immediately (as can be seen in the server's output), because there were plenty of threads available, but the 3rd client was rejected immediately with StatusCode.RESOURCE_EXHAUSTED, Concurrent RPC limit exceeded!.

Now, to test what happens when there are not enough threads given to the ThreadPoolExecutor, I changed max_workers to 1:

server = grpc.server(futures.ThreadPoolExecutor(max_workers=1), maximum_concurrent_rpcs=2)

I ran my 3 clients again at roughly the same time as before.

The result was that the first one was served immediately, the second one had to wait 10 seconds (while the first one was being served) and was then served, and the third one was rejected immediately.
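If you want to reproduce the 3-terminal experiment from a single script, one option (again a sketch assuming the quickstart's generated modules) is to fire three SayHello calls at roughly the same time from separate threads:

import threading
import grpc
import helloworld_pb2
import helloworld_pb2_grpc

def call(idx):
    with grpc.insecure_channel('localhost:50051') as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        try:
            reply = stub.SayHello(helloworld_pb2.HelloRequest(name=str(idx)))
            print(idx, reply.message)
        except grpc.RpcError as err:
            # Rejected calls report RESOURCE_EXHAUSTED here.
            print(idx, err.code(), err.details())

threads = [threading.Thread(target=call, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()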

Upvotes: 22
