SANJAY TEJA

Reputation: 15

How to use xml:writeFormatting of ABBYY OCR SDK in python?

I'm working with ABBYY OCR SDK to convert images of text to XML in Python. My aim is to retain the formatting of the text, so I've been trying to use the xml:writeFormatting parameter as follows:

ocr_engine = CloudOCR(application_id='', password='')
jpg = open('pic16.JPG', 'rb')
file = {jpg.name: jpg}
result = ocr_engine.process_and_download(file, exportFormat='xml&xml:writeFormatting=true', language='English')
result

for format, content in result.items():
    with open('converted.xml', 'wb') as output_file:
        output_file.write(content.read())

And the following error popped up:

HTTPError: 450 Client Error: Unknown format xmlwriteFormatting=true for url: http://cloud-eu.ocrsdk.com/processImage?exportFormat=xmlwriteFormatting%3Dtrue&language=English

Upvotes: 1

Views: 326

Answers (1)

Nadia Solovyeva

Reputation: 207

Judging by the sample, this is not ABBYY OCR SDK but ABBYY Cloud OCR SDK, an entirely different product with a similar purpose. ABBYY OCR SDK uses your computer's CPU to OCR text, whereas Cloud OCR SDK uses ABBYY's online services to do the same.

xml:writeFormatting should be "yes" or "no", not "true" or "false".
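As a rough illustration, here is a minimal sketch of what the request URL might look like with the option sent as its own query parameter instead of being embedded in the exportFormat value. The error message suggests the whole string was read as the format name. The helper name build_process_image_url is hypothetical, the endpoint is copied from the error message in the question, and the "yes"/"no" values follow the suggestion above; none of this has been checked against the service.

```python
from urllib.parse import urlencode

# Endpoint copied from the error message in the question.
PROCESS_IMAGE_URL = "http://cloud-eu.ocrsdk.com/processImage"


def build_process_image_url(write_formatting, language="English"):
    """Hypothetical helper: build a processImage URL for XML export.

    xml:writeFormatting is sent as a separate query parameter rather
    than appended to the exportFormat value.
    """
    params = {
        "exportFormat": "xml",
        # "yes"/"no" as suggested in this answer (assumption, not verified).
        "xml:writeFormatting": "yes" if write_formatting else "no",
        "language": language,
    }
    # urlencode percent-encodes the colon in the parameter name.
    return PROCESS_IMAGE_URL + "?" + urlencode(params)


url = build_process_image_url(True)
print(url)
```

If the CloudOCR wrapper from the question forwards its keyword arguments as query parameters (an assumption about that library), the same dictionary could probably be passed with unpacking, for example process_and_download(file, **params), since a name containing a colon cannot be written as a normal keyword argument.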

Upvotes: 1
