How to configure load balancing strategy for WCF http relay

Question

We are building a platform that lets a cloud API invoke an on-premise API. For this we use WCF relays, which suit us because we need to create relays dynamically: an onboarding API validates the customer license and, if the license is valid, creates the relay in Azure so that the cloud API and the on-premise API can communicate.

Our QA team is currently running load tests to find out how much traffic the platform supports. During a test with 50 concurrent requests from JMeter, they noticed some strange behavior.

In this scenario (50 concurrent requests), some of the requests fail with 502 Bad Gateway responses. These responses come from the relay itself, since their content type is XML.

The 502 Bad Gateway response says that the listener didn't accept the connection within the allowed interval. The strange part is that once we receive a 502 Bad Gateway response, no more requests reach the listener (the on-premise API). The only way to make communication work again is to close the listener and start it again.
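To reproduce the burst outside JMeter, something like the following could be used. It is only a minimal sketch: `http_status` and `run_burst` are hypothetical helper names, the relay URL is whatever address is discovered at runtime, and the relay access token the binding requires is not attached here and would have to be added to the request.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable
import urllib.error
import urllib.request


def http_status(url: str, timeout_s: float = 30.0) -> int:
    """Send one GET and return the HTTP status code, error codes included.

    Assumption: no authentication header is attached; a real relay call
    would need the relay access token.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (for example 502) arrive as HTTPError.
        return err.code


def run_burst(send: Callable[[], int], concurrency: int = 50) -> Counter:
    """Fire `concurrency` calls in parallel and tally the status codes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(send) for _ in range(concurrency)]
        return Counter(f.result() for f in futures)
```

Running `run_burst(lambda: http_status(relay_url))` twice in a row, where `relay_url` stands for the discovered relay address, would show whether the second burst still gets any successful responses after the first one produced 502s.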

As far as I know, WCF relays support load balancing and use a random strategy to choose the listener that processes each request. I found a thread in the GitHub repo of the WCF relay service that describes the load balancing algorithm as follows:

1. Get a local copy of the list of all the known load-balanced listeners for the address requested by the sender (this comes from a cache which is updated every 500ms).
2. If the list of listeners is empty and we haven't refreshed the list of listeners exactly once, force refresh the list of known load-balanced listeners for the endpoint.
3. If the list of listeners is empty return an exception to the sender and stop.
4. Pick a random index into the list of potential listeners.
5. Try to rendezvous with the selected listener.
6. If that rendezvous succeeds then stop.
7. If the rendezvous attempt with the selected listener doesn’t succeed within 10 seconds remove the selected listener from the list of listeners to try.
8. If more than 60 seconds have passed return an exception to the sender.
9. Go to step 2.
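The steps quoted above can be sketched as a small simulation. This is not the relay's actual code, only one reading of the description: `route_request`, `get_listeners` and `try_rendezvous` are hypothetical names, time is modeled by adding 10 seconds per failed attempt instead of being measured, and the removal in step 7 is assumed to act on the per-request copy taken in step 1.

```python
import random
from typing import Callable, List, Optional

RENDEZVOUS_TIMEOUT_S = 10  # per-listener wait, from the quoted description
OVERALL_TIMEOUT_S = 60     # total budget for one sender request


def route_request(
    get_listeners: Callable[[], List[str]],
    try_rendezvous: Callable[[str], bool],
    rng: Optional[random.Random] = None,
) -> str:
    """Return the listener that accepted the request, or raise."""
    rng = rng or random.Random()
    candidates = list(get_listeners())  # step 1: local copy of the cache
    refreshed = False
    elapsed = 0  # simulated seconds spent on failed rendezvous attempts
    while True:
        if not candidates and not refreshed:  # step 2: one forced refresh
            candidates = list(get_listeners())
            refreshed = True
        if not candidates:  # step 3: nobody left to try
            raise RuntimeError("no listener available")
        chosen = candidates[rng.randrange(len(candidates))]  # step 4
        if try_rendezvous(chosen):  # steps 5 and 6
            return chosen
        # Step 7: a failed attempt costs 10 s and drops the listener
        # from this request's local list only (assumption).
        elapsed += RENDEZVOUS_TIMEOUT_S
        candidates.remove(chosen)
        if elapsed > OVERALL_TIMEOUT_S:  # step 8
            raise TimeoutError("rendezvous budget exceeded")
        # Step 9: loop back to step 2.
```

Under this reading, a listener that misses one rendezvous is skipped only for the request in flight; whether the real service also drops it from the shared cache is exactly what the question is asking.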

If the algorithm works as described above, the default behaviour is very sensitive to DoS attacks: once a rendezvous attempt fails, the listener is removed from the list of listeners to try. This seems like a very bad idea, because the only way to make communication work again is to reconnect the listener. In our case that means a manual action by the user, since we host the WCF service in a Windows service (the user has to restart the Windows service after a failed rendezvous attempt).

The odd thing is that we apply the "ConnectionStatusBehaviour" to the endpoint to log any connection issue, and we don't see anything strange in the logs. The service apparently stays connected/online, and in Azure Monitor the listener stays connected as well.

Is there some way to configure a different behaviour when the rendezvous attempt doesn't succeed?

Our current WCF configuration is as follows:

Current Binding Configuration (WebHttpRelayBinding)

SecurityMode = Transport
RelayClientAuthenticationType = RelayAccessToken
IsDynamic = false
UseDefaultProxy = false
OpenTimeout = 10 seconds
CloseTimeout = 10 seconds
SendTimeout = 10 seconds
ReceiveTimeout = TimeSpan.MaxValue

Current ServiceThrottlingBehaviour

MaxConcurrentCalls = 160
MaxConcurrentInstances = 1000
MaxConcurrentSessions = 1160

Current WCF Service Behaviour

InstanceContextMode = PerCall
ConcurrencyMode = Single

Client-side

The WCF relay is invoked from an ASP.NET Core 2.2 application using HttpClient, because the relay address is discovered at runtime (we cannot use a WCF proxy in this case).

The ASP.NET Core application is hosted in Azure.

Listener

Deployed on a virtual machine in Azure. The system connectivity mode is set to ConnectivityMode.Https.
