Reputation: 127
Does Microsoft's OCR support Arabic and Hindi?
I have read in the documentation that it supports these languages, but when I send an image containing Arabic or Hindi text, the result is badly wrong, as you can see in the image.
Is there any way to set the language of the text manually in their official demo (https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision), or is there another website where I can fully test the Azure OCR service?
Upvotes: 1
Views: 1155
Reputation: 494
Hindi is not one of the supported languages listed under the Optical Character Recognition (OCR) section, but is listed under Image analysis with a green tick for 'tags' only.
An alternative OCR library that CAN read Hindi (and many other Indian languages and scripts such as Assamese, Devanagari, Gujarati, Gurmukhi, Kannada, Malayalam, Marathi, Nepali, Panjabi, Sanskrit, Sindhi, Sinhala, Tamil, Telugu) is IronOCR, a third-party library that can run in Azure and includes one-click support for 125 languages.
// PM> Install-Package IronOcr.Languages.Hindi
using IronOcr;

// Create the OCR engine and select the Hindi language pack.
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Hindi;
using (var Input = new OcrInput(@"images\Hindi.png"))
{
    // Run recognition and collect the extracted text.
    var Result = Ocr.Read(Input);
    var AllText = Result.Text;
}
Here's a full tutorial by Wade Gausden, "Azure OCR In .NET with IronOCR", which covers running OCR in Azure.
Upvotes: 1
Reputation: 65391
Hindi is not supported.
Arabic is supported via the OCR API.
See: https://learn.microsoft.com/en-us/azure/cognitive-services/computer-vision/language-support
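To test Arabic outside the demo page, the language can be passed explicitly when calling the OCR REST endpoint. Here is a minimal Python sketch, assuming the v3.2 OCR route (`/vision/v3.2/ocr`) and its `language` query parameter. The endpoint, key, and image URL are placeholders, and the helper names `build_ocr_request`, `extract_text`, and `run_ocr` are made up for illustration, so check the route and version against your own resource.

```python
import json
import urllib.parse
import urllib.request


def build_ocr_request(endpoint, key, image_url, language="ar"):
    """Build a POST request for the OCR endpoint with an explicit language.

    Assumes the v3.2 route; adjust the path if your resource uses another version.
    """
    query = urllib.parse.urlencode(
        {"language": language, "detectOrientation": "true"}
    )
    url = endpoint.rstrip("/") + "/vision/v3.2/ocr?" + query
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_text(result):
    """Join the recognized words into one string, one output line per OCR line."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(word["text"] for word in line.get("words", [])))
    return "\n".join(lines)


def run_ocr(endpoint, key, image_url, language="ar"):
    """Send the request and return the recognized text (needs a valid key)."""
    req = build_ocr_request(endpoint, key, image_url, language)
    with urllib.request.urlopen(req) as resp:
        return extract_text(json.load(resp))
```

Calling `run_ocr(endpoint, key, image_url, language="ar")` forces Arabic; passing `language="unk"` asks the service to detect the language itself, which may be what goes wrong in the demo.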
Upvotes: 0