Reputation: 33
We have configured the high availability for API end points as mentioned here https://apim.docs.wso2.com/en/3.2.0/learn/design-api/endpoints/high-availability-for-endpoints/#configuring-load-balancing-endpoints
In the Load Balance and Failover Configurations we chose "Load Balanced" as the "EndpointType". Requests are routed to the load-balanced endpoints successfully. However, when we stop any one of the endpoint nodes, 2 requests are still routed to the stopped node before the remaining requests are routed to the active node. This happens again and again as new requests arrive. The failed endpoint is never marked as inactive or down.
The error response is {"fault":{"code":101503,"type":"Status report","message":"Runtime Error","description":"Error connecting to the back end"}}
The relevant entries from the carbon logs are below:
TID: [-1234] [] [2022-12-14 09:51:23,307] WARN {org.wso2.carbon.apimgt.gateway.handlers.throttling.ThrottleHandler} - Error while getting throttling information for resource and http verb
TID: [-1] [] [2022-12-14 09:51:23,308] WARN {org.apache.synapse.transport.passthru.ConnectCallback} - Connection refused or failed for : /100.66.2.32:7010
TID: [-1234] [] [2022-12-14 09:51:23,309] WARN {org.apache.synapse.endpoints.EndpointContext} - Endpoint : NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint_1 with address http://100.66.2.32:7010/mcm-provider will be marked SUSPENDED as it failed
TID: [-1234] [] [2022-12-14 09:51:23,309] WARN {org.apache.synapse.endpoints.EndpointContext} - Suspending endpoint : NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint_1 with address http://100.66.2.32:7010/mcm-provider - current suspend duration is : 30000ms - Next retry after : Wed Dec 14 09:51:53 UTC 2022
TID: [-1234] [] [2022-12-14 09:51:23,310] WARN {org.apache.synapse.endpoints.LoadbalanceEndpoint} - Endpoint [NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint] Detect a Failure in a child endpoint : Endpoint [NewMCMInboundChannel-RESTAPIService--vv2_APIproductionEndpoint_1]
TID: [-1234] [] [2022-12-14 09:51:23,310] INFO {org.apache.synapse.mediators.builtin.LogMediator} - {api:admin--NewMCMInboundChannel-RESTAPIService:vv2} STATUS = Executing default 'fault' sequence, ERROR_CODE = 101503, ERROR_MESSAGE = Error connecting to the back end
Upvotes: 0
Views: 499
Reputation: 1031
What you are experiencing is the default behavior of endpoint suspension. Any endpoint created with API Manager can be in one of 3 states: active, timeout, or suspended.
Since you have configured two endpoints in a load-balanced manner, both endpoints start in the active state and share the load. Once endpoint 2 is stopped, the next request routed to it fails with an error code, and that failure moves the endpoint from the active state to the suspended state.
There are three settings you can configure for the suspended state: the initial suspension duration, the maximum duration, and the progression factor.
In the default configuration, the initial suspension duration is 30 seconds. This means the server lifts the suspended state 30 seconds after the failure and puts the endpoint back into the active state. That is why you see requests being routed to the stopped endpoint from time to time. This is expected: it is how the server checks whether the endpoint has come back.
You can increase the suspension time through these settings. The suspension time is calculated from them as follows:
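To make the periodic failures concrete, here is a minimal simulation sketch in Python. It is not Synapse's actual implementation: the `Endpoint` class, the `simulate` function, the round-robin selection, and the request times are all hypothetical, and it assumes a fixed 30-second suspension with no failover retry inside a single request.

```python
class Endpoint:
    """Hypothetical stand-in for a backend endpoint."""
    def __init__(self, name, up=True):
        self.name = name
        self.up = up                 # whether the backend node is running
        self.suspended_until = 0.0   # time (seconds) when suspension ends


def simulate(request_times, suspend_s=30.0):
    """Route one request per timestamp and record what happened."""
    endpoints = [Endpoint("ep1", up=True), Endpoint("ep2", up=False)]
    results = []
    turn = 0  # index of the endpoint that is next in round-robin order
    for now in request_times:
        n = len(endpoints)
        order = [endpoints[(turn + i) % n] for i in range(n)]
        # Skip endpoints whose suspension has not expired yet.
        available = [ep for ep in order if now >= ep.suspended_until]
        if not available:
            results.append((now, None, "failed"))
            continue
        ep = available[0]
        turn = (endpoints.index(ep) + 1) % n
        if ep.up:
            results.append((now, ep.name, "ok"))
        else:
            # A failed call suspends the endpoint for suspend_s seconds.
            ep.suspended_until = now + suspend_s
            results.append((now, ep.name, "failed"))
    return results


if __name__ == "__main__":
    for row in simulate([0, 10, 20, 30, 40, 50, 60, 70]):
        print(row)
```

In this simplified model, the stopped endpoint is tried again each time its suspension expires, so one request fails roughly every 30 seconds even though the node never comes back.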
Endpoint suspension time = Min(current suspension duration * progressionFactor, maximumDuration)
With each consecutive failed attempt, the suspension duration is multiplied by the progression factor until it reaches the maximum duration. It resets once the endpoint has served at least one successful request.
You can configure all of these in the Publisher UI, in the endpoints section.
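As a rough illustration of the formula above, the following Python sketch lists the suspension durations for a run of consecutive failures. The function names are hypothetical, the factor of 2 and the 120000 ms maximum are example values rather than product defaults, and it assumes the first suspension uses the initial duration unchanged.

```python
def next_suspend_duration(current_ms, factor, max_ms):
    """Apply: min(current suspension duration * progressionFactor, maximumDuration)."""
    return min(current_ms * factor, max_ms)


def suspension_schedule(initial_ms, factor, max_ms, failures):
    """Return the suspension duration used after each consecutive failure."""
    durations = []
    current = initial_ms
    for _ in range(failures):
        durations.append(current)
        current = next_suspend_duration(current, factor, max_ms)
    return durations


if __name__ == "__main__":
    # Example values only: 30 s initial, factor 2, 120 s maximum.
    print(suspension_schedule(30000, 2, 120000, 4))
    # -> [30000, 60000, 120000, 120000]
```

With a factor of 1 the duration never grows, so a stopped endpoint would be retried every 30 seconds indefinitely; a larger factor and maximum make the retries progressively rarer.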
More information on endpoint suspension can be found here [1].
[1] https://docs.wso2.com/display/EI660/Endpoint+Error+Handling
Upvotes: 0