Brãnicã

Reputation: 27

Azure API Management - 500 Internal Server Errors

Following the best practices, we've created a similar infrastructure based on this architecture.

Recently we imported several external APIs (exposed on public IPs) so they can be used within our organization. The APIs work, but we intermittently get 500 Internal Server Error responses.

The error message is: "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond xx.xx.xx.xx:443". It usually takes 20-45 seconds for this error to appear.

There are some articles on how to troubleshoot this kind of behavior, but this article from Microsoft suggests moving the backend into the same virtual network, which is not possible because it's not our API.

Have any of you experienced the same behavior? If so, how did you manage to troubleshoot it?

We've set up a new API Management instance that is not connected to an internal network and tried to reproduce this behavior, but all requests returned 200 OK, so I'm ruling out a problem with the backends.

We've tried a retry policy, but it makes for a frustrating experience for the end client and is not always successful.

Upvotes: 0

Views: 3230

Answers (2)

EboGit

Reputation: 11

We had a similar situation, with APIM intermittently returning 500 errors to the API client. After consulting Microsoft support, we implemented their recommendation: an automatic retry mechanism applied to all APIs:

<backend>
    <retry condition="@((context.Response.StatusCode == 500) || (context.LastError?.Reason == "BackendConnectionFailure"))" count="5" interval="2" first-fast-retry="true">
        <forward-request buffer-request-body="true" />
    </retry>
</backend>
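
The policy above retries on any 500, which also replays requests that the backend may already have processed. A narrower variant, as a rough sketch only: it retries just on connection failures and just for GET requests. The method check and the lower retry count are my assumptions, not part of Microsoft's recommendation, so adjust them to what your backends can safely receive twice.

```xml
<backend>
    <!-- Assumption: only connection failures are retried, since the request never reached the backend -->
    <!-- Assumption: restricted to GET so non-idempotent calls (e.g. POST) are never replayed -->
    <retry condition="@(context.LastError?.Reason == "BackendConnectionFailure" && context.Request.Method == "GET")" count="3" interval="2" first-fast-retry="true">
        <!-- buffer-request-body lets the same request be forwarded again on each attempt -->
        <forward-request buffer-request-body="true" />
    </retry>
</backend>
```

Note that each failed connection attempt can still take 20 seconds or more before the retry starts, so this reduces the number of 500s the client sees rather than the latency.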

Upvotes: 0

Philip

Reputation: 667

In the reference architecture, an Application Gateway is used.

By default, the Application Gateway has a "Request time-out" of 20 seconds in its backend settings. This could be the issue when long-running requests are sent to the APIM.

For more information on Application Gateway backend timeouts, see Backend server timeout and Request time-out.

Upvotes: 0
