Reputation: 468
I apologize for the possibly confusing title, but I have a dilemma that I would have known how to resolve as a Java software developer, and I need help with it now that I am working in Azure Data Factory.
I have a data source that I am accessing to load the data (into the SQL Database, but the destination is irrelevant for the question at hand).
There are two endpoints
http://<same-base-url>/abc-endPoint
and
http://<same-base-url>/xyz-endPoint
The data loads are independent of each other - just 2 separate data loads.
The source uses 3-legged authentication, which I externalized to a separate pipeline.
After run it will provide fresh Bearer token in
So, I have 2 ways of using the bearer token later. Either getting it from key vault, or running authorization activity and getting the output variable.
So, then I am running 2 pipelines, each having a loop (Until activity) that iterates through paginated REST responses. They are literally copies of each other; the only difference is "abc-endPoint" vs "xyz-endPoint" in the URL.
Here is the issue: because there are many pages and the run can take a long time, I run the authorization on every iteration of the loop. For the most part things work out fine; however, when the 2 pipelines run at the same time, the externalized authorization sometimes fails with the error
Operation on target Get Refresh token failed: {"error":"invalid_grant","error_description":"The refresh token is invalid or expired."}
And as a programmer, I understand that a pipeline authorizing might have invalidated the refresh token for the other pipeline that just authorized but did not complete yet.
Are there options in ADF to deal with this? Something like synchronizing access to a variable (to put it in programming terms)?
Upvotes: 0
Views: 136
Reputation: 2589
Previously issued tokens are not revoked by Microsoft. This applies to access tokens and refresh tokens.
If you're authenticating and obtaining an access token for every iteration of your loop then you don't need the refresh token. You obtained an access token by authenticating against the OAuth 2.0 endpoint at some point. So, simply authenticate using your service principal's credentials every iteration, and use the new access (bearer) token in each API request.
But the real clue to the root cause is the combination of your making an excessive number of requests to the OAuth endpoint and your mention that the failure sometimes occurs when the second pipeline runs.
It sounds like you are actually hitting throttle limits on Microsoft's Identity Server/OAuth API.
What I recommend is to properly observe your access token lifetimes. When you authenticate your service principal, the response you receive will contain an expires_in value. This is the number of seconds before the access token expires. This will be between 60 and 90 minutes. The refresh token expiry can be between 60 and 90 days, unless you have custom policies or conditional access in place.
Refactor your code. Set a local variable to the current system time plus the expires_in number of seconds, minus 5 minutes as a safety margin. Then, in your loop, check the system time against the expiry time in your variable. If the system time is greater, only then reauthenticate using the service principal credentials.
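The expiry check described above could look roughly like this. It is a minimal Python sketch of the logic, not ADF pipeline JSON. The names TokenCache, fetch_token and clock are hypothetical, and fetch_token stands in for whatever call performs the real authentication and returns the access token together with its expires_in value.

```python
import time
from typing import Callable, Optional, Tuple

SAFETY_MARGIN_SECONDS = 5 * 60  # reauthenticate 5 minutes before the real expiry


class TokenCache:
    """Caches an access token and reauthenticates only when it is about to expire."""

    def __init__(self,
                 fetch_token: Callable[[], Tuple[str, int]],
                 clock: Callable[[], float] = time.time) -> None:
        # fetch_token is a hypothetical callable returning
        # (access_token, expires_in_seconds); it is an assumption, not a real API.
        self._fetch_token = fetch_token
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Reauthenticate only if there is no token yet or the stored expiry has passed.
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch_token()
            self._token = token
            self._expires_at = now + expires_in - SAFETY_MARGIN_SECONDS
        return self._token
```

Each iteration of the paging loop would call get() instead of authenticating unconditionally, so the OAuth endpoint is contacted only about once per token lifetime. In ADF itself the same idea would presumably be expressed with a pipeline variable holding the expiry time and a condition evaluated inside the Until loop.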
If you insist on using the refresh token to obtain a new access token, remember to update it in KeyVault with the new refresh token that you get back along with the new access token. It will expire at some point. Otherwise, I don't see the need to store it in KeyVault, especially if it isn't retrieved by anything.
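If the refresh token route is kept, the update step could be sketched as below. This is only an illustration in Python under stated assumptions: store is any dictionary-like object standing in for the secret store, exchange is a hypothetical callable that trades a refresh token for a token response, and the secret name "refresh-token" is made up for the example.

```python
from typing import Callable, Dict, MutableMapping


def refresh_and_store(store: MutableMapping[str, str],
                      exchange: Callable[[str], Dict[str, str]],
                      secret_name: str = "refresh-token") -> str:
    """Exchange the stored refresh token and persist the replacement, if any."""
    current_refresh_token = store[secret_name]
    # exchange is a hypothetical stand-in for the real token request.
    response = exchange(current_refresh_token)
    new_refresh_token = response.get("refresh_token")
    if new_refresh_token:
        # Save the rotated refresh token so the next run does not reuse the old one.
        store[secret_name] = new_refresh_token
    return response["access_token"]
```

The point of the sketch is the ordering: the new refresh token is written back before the access token is used, so a later run never starts from a token that has already been replaced.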
Upvotes: 0