Reputation: 15
I'm using the ABBYY OCR SDK from Python to convert images of text to XML. I want to retain the text formatting, so I've been trying to use the xml:writeFormatting parameter as follows:
from ABBYY import CloudOCR  # assumed import: the ABBYY client package from PyPI

ocr_engine = CloudOCR(application_id='', password='')
jpg = open('pic16.JPG', 'rb')
file = {jpg.name: jpg}
result = ocr_engine.process_and_download(
    file, exportFormat='xml&xml:writeFormatting=true', language='English')
# result maps each export format to a file-like object holding the output
for format, content in result.items():
    # the with block closes the output file automatically
    with open('converted.xml', 'wb') as output_file:
        output_file.write(content.read())
And the following error popped up:
HTTPError: 450 Client Error: Unknown format xmlwriteFormatting=true for url: http://cloud-eu.ocrsdk.com/processImage?exportFormat=xmlwriteFormatting%3Dtrue&language=English
Upvotes: 1
Views: 326
Reputation: 207
Judging by the sample, this is not the ABBYY OCR SDK but the ABBYY Cloud OCR SDK, an entirely different product with a similar purpose. The ABBYY SDK uses your own computer's CPU to recognize text, while the Cloud OCR SDK uses ABBYY's online service to do the same.
xml:writeFormatting should be "yes" or "no", not "true" or "false".
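Separately from the value itself, the error text suggests the whole string was received as the format name, which is what happens when an option is embedded in exportFormat and then URL-encoded. A minimal sketch of the difference, using only the standard library: the helper `build_query` is hypothetical, and whether the client library forwards extra keyword arguments this way is an assumption to check against its source.

```python
from urllib.parse import urlencode


def build_query(export_format, language, extra=None):
    """Build a query string where every option is its own parameter."""
    params = {"exportFormat": export_format, "language": language}
    params.update(extra or {})
    return urlencode(params)


# Value as written in the question; replace it with whichever value
# the processImage documentation lists for this option.
flag = "true"

# Option embedded in exportFormat: '&', ':' and '=' get percent-encoded,
# so the server sees a single, unknown format name.
embedded = build_query("xml&xml:writeFormatting=" + flag, "English")

# Option passed separately: exportFormat stays plain "xml".
separate = build_query("xml", "English", {"xml:writeFormatting": flag})

print(embedded)
print(separate)
```

If the client accepts arbitrary keyword arguments, a name containing a colon would have to be passed by unpacking a dict, e.g. `**{'xml:writeFormatting': flag}`, since it is not a valid Python identifier.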
Upvotes: 1