Reputation: 51
In response to ChadZ's answer, here are the Form Recognizer metrics I'm talking about: Form Recognizer Metrics. In our test we check a directory for files and analyze them sequentially: wait for each response, write the results, get the next file, and so on. No multithreading.
Have a look at the biggest spike, on April 14, with 15330 calls. If we assume that each call on April 14 took 10 seconds (which would be fast; normally it can take up to a minute), the analysis would have taken 153300 seconds, which is 2555 minutes or 42.58 hours. That does not fit into a single day. Even if each analysis took only 5 seconds, that would still be more than 20 hours of non-stop analysis.
Of course I could be wrong, but currently the most logical explanation is that GET requests are also tracked and billed.
I'm using a custom model with labels (created with the sample labeling tool) and getting the results with the "Python Form Recognizer Async Analyze" V2 SDK code from the bottom of this page. While the async approach in V2 is much slower than V1 (which I described here), it also seems much, much more expensive.
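The arithmetic above can be checked with a short sketch. The helper name `hours_needed` is only illustrative, and the per-call durations are the assumptions stated in the post, not measured values.

```python
def hours_needed(calls, seconds_per_call):
    """Total time in hours if every call ran one after another."""
    return calls * seconds_per_call / 3600


calls_on_april_14 = 15330  # figure read from the metrics graph

for seconds_per_call in (10, 5):  # assumed durations per analysis
    hours = hours_needed(calls_on_april_14, seconds_per_call)
    print(f"{seconds_per_call} s per call -> {hours:.2f} hours")
```

If all 15330 calls were sequential analyses, even the optimistic assumption would need most of a day of uninterrupted processing, which is why the count probably includes other requests.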
The original example code to get the result after a POST API call looks like this:
n_tries = 15
n_try = 0
wait_sec = 5
max_wait_sec = 60
while n_try < n_tries:
    try:
        resp = get(url = get_url, headers = {"Ocp-Apim-Subscription-Key": apim_key})
        resp_json = resp.json()
        if resp.status_code != 200:
            print("GET analyze results failed:\n%s" % json.dumps(resp_json))
            quit()
        status = resp_json["status"]
        if status == "succeeded":
            print("Analysis succeeded:\n%s" % json.dumps(resp_json))
            quit()
        if status == "failed":
            print("Analysis failed:\n%s" % json.dumps(resp_json))
            quit()
        # Analysis still running. Wait and retry.
        time.sleep(wait_sec)
        n_try += 1
        wait_sec = min(2*wait_sec, max_wait_sec)
    except Exception as e:
        msg = "GET analyze results failed:\n%s" % str(e)
        print(msg)
        quit()
print("Analyze operation did not complete within the allocated time.")
As you can see, the original example code polls for the result repeatedly: it waits 5 seconds after the first attempt, then doubles the wait each time up to a maximum of 60 seconds, for up to 15 attempts.
My problem: it seems to me that not only the API call for analyzing a document is billed, but also each and every GET request to fetch the results.
Our bill has increased tenfold or more since we started using V2. We are currently in a testing phase and usually have about 400-500 documents per month, which were tracked and billed correctly in V1. With V2 and the sample code above we now have 63690 (!) calls; each call is billed, and costs are exploding.
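To see how many GET requests a single analysis can generate with the sample's retry settings, here is a minimal sketch that replays only the wait schedule (no HTTP calls). The function name `polling_waits` is made up for illustration; the defaults are the values from the sample code.

```python
def polling_waits(n_tries=15, wait_sec=5, max_wait_sec=60):
    """Return the wait (in seconds) that follows each unsuccessful GET."""
    waits = []
    for _ in range(n_tries):
        waits.append(wait_sec)
        # Same back-off rule as the sample: double the wait, capped at the maximum.
        wait_sec = min(2 * wait_sec, max_wait_sec)
    return waits


waits = polling_waits()
print(waits)       # one entry per possible GET request
print(len(waits))  # upper bound on GET requests for a single document
print(sum(waits))  # total seconds spent waiting if the analysis never finishes
```

So one POST can be followed by up to 15 GET requests, and every one of them shows up as a call in the metrics.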
Can anybody confirm this behaviour?
Personally, I'd like to get back the sync operation, where the response of the API call also contains the result of the document analysis:
try:
    url = base_url + "/models/" + model_id + "/analyze"
    with open(filepath, "rb") as f:
        data_bytes = f.read()
    response = requests.post(url=url, data=data_bytes, headers=headers)
    return response.json()
except Exception as e:
    print(str(e))
    return None
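Unfortunately, this doesn't work anymore. In V2, the POST call only returns the location of the result, which then has to be fetched separately: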
try:
    response = requests.post(url=post_url, data=data_bytes, headers=headers)  # , params=params)
    if response.status_code != 202:
        return None
    # Success
    get_url = response.headers["operation-location"]
    return form_recognizerv2_getdata(get_url, subscription_key)
except Exception as e:
    print("POST analyze failed:\n%s" % str(e))
    return None
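The body of `form_recognizerv2_getdata` is not shown above. As a rough sketch of what such a polling helper could look like, the version below also counts the GET requests it makes, which can be compared against the call count in the metrics. The name `poll_analyze_result` and the `fetch` and `sleep` parameters are assumptions for illustration (they make the helper testable without a network) and are not part of the official sample.

```python
import time


def poll_analyze_result(get_url, subscription_key, fetch=None, sleep=time.sleep,
                        n_tries=15, wait_sec=5, max_wait_sec=60):
    """Poll get_url until the analysis ends; return (result or None, GET count)."""
    if fetch is None:
        import requests  # only needed when doing real HTTP calls

        def fetch(url, headers):
            resp = requests.get(url, headers=headers)
            return resp.status_code, resp.json()

    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    n_gets = 0
    for _ in range(n_tries):
        status_code, body = fetch(get_url, headers)
        n_gets += 1
        if status_code != 200:
            return None, n_gets
        status = body.get("status")
        if status == "succeeded":
            return body, n_gets
        if status == "failed":
            return None, n_gets
        # Analysis still running: wait, then back off as in the sample code.
        sleep(wait_sec)
        wait_sec = min(2 * wait_sec, max_wait_sec)
    return None, n_gets
```

Summing the returned GET count over all documents and adding one POST per document gives a number that can be compared directly with the calls shown in the metrics graph.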
Upvotes: 1
Views: 570
Reputation: 720
GetAnalyzeResults calls are not billed. Form Recognizer bills only for analyzed pages, not for transactions or requests. The "Form Recognizer Metrics" graph shows all your transactions and API calls, including the GetAnalyzeResults calls, but you are not billed for those. The billing for V1 and V2 is the same. Please contact customer service if you are experiencing a billing issue.
Neta-MSFT
Upvotes: 2
Reputation: 179
I can confirm that in Form Recognizer v2, GET calls are not billed, and the train call is free too. If there's a billing issue, please contact customer service.
Upvotes: 0