Reputation: 10383
I'm following these instructions to use the Layout API of the Azure Form Recognizer service, which include the following code:
########### Python Form Recognizer Async Layout #############
import json
import time
from requests import get, post
# Endpoint URL
endpoint = r"<Endpoint>"
apim_key = "<Subscription Key>"
post_url = endpoint + "/formrecognizer/v2.0-preview/Layout/analyze"
source = r"<path to your form>"
headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': apim_key,
}
with open(source, "rb") as f:
    data_bytes = f.read()
try:
    resp = post(url = post_url, data = data_bytes, headers = headers)
    if resp.status_code != 202:
        print("POST analyze failed:\n%s" % resp.text)
        quit()
    print("POST analyze succeeded:\n%s" % resp.headers)
    get_url = resp.headers["operation-location"]
except Exception as e:
    print("POST analyze failed:\n%s" % str(e))
    quit()
When I ran the code, I got the following output:
POST analyze failed:
{"error":{"code":"FailedToDownloadImage","message":"Failed to download image from input URL."}}
POST analyze succeeded:
{'Transfer-Encoding': 'chunked', 'Content-Type': 'application/json; charset=utf-8', 'x-envoy-upstream-service-time': '4', 'apim-request-id': '515e93ee-4db8-4174-92b1-63e5c415c056', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'x-content-type-options': 'nosniff', 'Date': 'Sat, 06 Jun 2020 20:47:28 GMT'}
POST analyze failed:
'operation-location'
The code I'm using is:
import json
import time
from requests import get, post
I read the PDF file before making the request and verify that it loaded into the variable:
source = r"data/Invoice_7.pdf"
with open(source, "rb") as f:
    data_bytes = f.read()
print (data_bytes[0:10])
Then the request details:
endpoint = r"https://xxxx.cognitiveservices.azure.com/"
apim_key = "xxxx"
post_url = endpoint + "/formrecognizer/v2.0-preview/Layout/analyze"
headers = {
    # Request headers
    'Content-Type': 'application/json',
    'Ocp-Apim-Subscription-Key': apim_key,
}
And finally making the request:
try:
    resp = post(url = post_url, data = data_bytes, headers = headers)
    print (1)
    if resp.status_code != 202:
        print("POST analyze failed:\n%s" % resp.text)
        #quit()
    print (2)
    print("POST analyze succeeded:\n%s" % resp.headers)
    print (3)
    get_url = resp.headers["operation-location"]
    print (4)
except Exception as e:
    print("POST analyze failed:\n%s" % str(e))
    #quit()
I'm printing a number at each step because I find it very weird that I get both failed and successful responses for the same request. This is the result:
1
POST analyze failed:
{"error":{"code":"FailedToDownloadImage","message":"Failed to download image from input URL."}}
2
POST analyze succeeded:
{'Transfer-Encoding': 'chunked', 'Content-Type': 'application/json; charset=utf-8', 'x-envoy-upstream-service-time': '1', 'apim-request-id': '93a2a162-d14f-496f-ba8a-077bcfd5d3c7', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'x-content-type-options': 'nosniff', 'Date': 'Sat, 06 Jun 2020 21:00:20 GMT'}
3
POST analyze failed:
'operation-location'
So the code fails at this line:
get_url = resp.headers["operation-location"]
The text in the response variable is:
'{"error":{"code":"FailedToDownloadImage","message":"Failed to download image from input URL."}}'
Upvotes: 2
Views: 2004
Reputation: 3024
As defined in the REST API documentation, you need to specify the Content-Type. When you set the Content-Type to application/json, you need to provide a publicly accessible source URL in a JSON body. In your case, you are sending the PDF bytes directly, so you need to set the Content-Type to application/pdf. If you want to make this dynamic, you could use the PyPI package filetype.
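Here is a rough sketch of both options, not the official sample. The helper names are hypothetical, the magic-byte table is a hand-rolled stand-in for what the filetype package would do for you, and the {"source": ...} body shape is my reading of the v2.0 REST reference, so please double-check it against the docs.

```python
import json

# Hypothetical stand-in for the filetype package: leading bytes of the
# binary formats that, as far as I know, the Layout endpoint accepts.
MAGIC_TO_CONTENT_TYPE = [
    (b"%PDF", "application/pdf"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
]


def guess_content_type(data_bytes):
    """Return the Content-Type matching the file's leading bytes."""
    for magic, content_type in MAGIC_TO_CONTENT_TYPE:
        if data_bytes.startswith(magic):
            return content_type
    raise ValueError("Unsupported or unrecognized file type")


def request_for_file(data_bytes, apim_key):
    """Headers and body for uploading the raw bytes of a local file."""
    headers = {
        "Content-Type": guess_content_type(data_bytes),
        "Ocp-Apim-Subscription-Key": apim_key,
    }
    return headers, data_bytes


def request_for_url(source_url, apim_key):
    """Headers and body for pointing the service at a publicly reachable URL."""
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": apim_key,
    }
    return headers, json.dumps({"source": source_url})


# Usage with the variables from the question (commented out because it
# needs a real endpoint and key):
# headers, body = request_for_file(data_bytes, apim_key)
# resp = post(url=post_url, data=body, headers=headers)
# if resp.status_code != 202:
#     raise RuntimeError("POST analyze failed:\n%s" % resp.text)
# get_url = resp.headers["operation-location"]
```

Raising on a non-202 status stops the code before it reaches the "succeeded" message. With quit() commented out, execution simply fell through, which is why you saw both messages for the same request.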
By the way, did you know that there is a (beta) Python SDK for Form Recognizer that you can use for your use case?
Upvotes: 1