Reputation: 12521
I have two gRPC services, and one calls the other through a normal (unary) gRPC method, with no streaming on either side. I'm using Istio as the service mesh and have the sidecar injected into the Kubernetes pods of both services.
The gRPC call works correctly under normal load, but under high-concurrency load the gRPC client side keeps throwing the following exception:
i.g.StatusRuntimeException: UNAVAILABLE: upstream connect error or disconnect/reset before headers
at io.grpc.Status.asRuntimeException(Status.java:526)
at i.g.s.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:434)
at i.g.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at i.g.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at i.g.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at i.g.i.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:678)
at i.g.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
at i.g.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
at i.g.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
at i.g.i.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397)
at i.g.i.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
at i.g.i.ClientCallImpl.access$300(ClientCallImpl.java:63)
at i.g.i.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
at i.g.i.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
at i.g.i.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
at i.g.i.ContextRunnable.run(ContextRunnable.java:37)
at i.g.i.SerializingExecutor.run(SerializingExecutor.java:123)
at j.u.c.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at j.u.c.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Meanwhile, there is no exception on the server side, and no error in the istio-proxy container of the client's pod either. But if I disable Istio sidecar injection so that the two services talk to each other directly, there are no such errors.
Could somebody kindly tell me why this happens, and how to resolve it?
Thanks a lot.
Upvotes: 2
Views: 3399
Reputation: 12521
Finally I found the reason: it's caused by the default circuitBreakers settings of the Envoy sidecar. By default, the options max_pending_requests and max_requests are set to 1024, and the default connect_timeout is 1s. So under high-concurrency load, when the server side has too many pending requests waiting to be served, the sidecar's circuit breaker kicks in and tells the client side that the upstream is UNAVAILABLE.
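For reference, this is roughly how those defaults look in the Envoy cluster configuration that the sidecar generates for the upstream service (a simplified sketch; the exact structure and values depend on your Istio/Envoy version):

```yaml
# Simplified excerpt of the sidecar's generated cluster config for the upstream
connect_timeout: 1s                # default connection timeout
circuit_breakers:
  thresholds:
  - max_pending_requests: 1024     # requests queued while waiting for a connection
    max_requests: 1024             # concurrent requests to the upstream (HTTP/2, so gRPC)
```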
To fix this problem you need to apply a DestinationRule for the target service with reasonable trafficPolicy settings.
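For example, a minimal DestinationRule along these lines raises those limits (the rule name, host, and numbers below are placeholders; tune them for your own service and load profile):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: grpc-server-circuit-breaker            # placeholder name
spec:
  host: grpc-server.default.svc.cluster.local  # replace with your target service
  trafficPolicy:
    connectionPool:
      tcp:
        connectTimeout: 5s                     # raise the 1s default connect timeout
      http:
        http1MaxPendingRequests: 4096          # raise the 1024 default pending-request limit
        http2MaxRequests: 4096                 # raise the 1024 default concurrent-request limit (gRPC uses HTTP/2)
```

After applying the rule, the circuit breaker thresholds in the client sidecar's generated cluster config should reflect the new values.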
Upvotes: 6