Reputation: 5228
I need to develop my own WebAPI custom skill that makes use of the Read API. I will use it in my custom skillset. I can't use the built-in OCR skill from Azure Cognitive Search.
The output of my WebAPI skill looks like this:
// logic to get the Read API result...
// now building the output for the custom skill
var textUrlFileResults = results.AnalyzeResult.ReadResults;
foreach (ReadResult page in textUrlFileResults)
{
    var newValue = new
    {
        // "value" is the current input record from the skill request
        RecordId = value.RecordId,
        Data = new
        {
            text = string.Join(" ", page.Lines?.Select(x => x.Text) ?? Enumerable.Empty<string>())
        }
    };
    output.Values.Add(newValue);
}
return new OkObjectResult(output);
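For context, a custom skill is expected to return its response in the documented custom skill interface shape, so the anonymous objects above should serialize to roughly the following JSON (the recordId and text values here are illustrative, not from my actual run):

```json
{
    "values": [
        {
            "recordId": "record1",
            "data": {
                "text": "SHELL 1900 1904 1909 1930 1948 SHELL SHELL Shell Shell 1955 1961 1971 1995 1999"
            }
        }
    ]
}
```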
And here is my skillset definition:
"skills": [
    {
        "@odata.type": "#Microsoft.Skills.Text.MergeSkill",
        "name": "#1",
        "context": "/document",
        "insertPreTag": " ",
        "insertPostTag": " ",
        "inputs": [
            {
                "name": "text",
                "source": "/document/content"
            },
            {
                "name": "itemsToInsert",
                "source": "/document/normalized_images/*/text"
            },
            {
                "name": "offsets",
                "source": "/document/normalized_images/*/contentOffset"
            }
        ],
        "outputs": [
            {
                "name": "mergedText",
                "targetName": "merged_content"
            }
        ]
    },
    {
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "name": "#2",
        "description": null,
        "context": "/document/normalized_images/*",
        // i cut some info
        "inputs": [
            {
                "name": "image",
                "source": "/document/normalized_images/*"
            }
        ],
        "outputs": [
            {
                "name": "text",
                "targetName": "text"
            }
        ]
    }
],
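For comparison with the built-in pipeline: as far as I understand, each element of /document/normalized_images/* gets its contentOffset during image extraction (document cracking), and the OCR step only adds a text field to that same node. So the MergeSkill sees, per image, something roughly like this (the offset value here is illustrative):

```json
{
    "contentOffset": 38,
    "text": "SHELL 1900 1904 1909 1930 1948 SHELL SHELL Shell Shell 1955 1961 1971 1995 1999"
}
```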
I am trying to OCR a PDF document that looks like this (screenshot omitted). And in the index I get a document that looks like this:
{
    "@odata.context": " cutted ",
    "value": [
        {
            "@search.score": 1,
            "content": "\nText before shell\n\nText after shell\n\nText after bw\n\n\n\n\n\n\n\nAnd here second page\n\n\n",
            "merged_content": "\nText before shell\n\nText after shell\n\nText after bw\n\n SHELL 1900 1904 1909 1930 1948 SHELL SHELL Shell Shell 1955 1961 1971 1995 1999 \n\n B+W BLACK+WHITE PHOTOGRAPHY \n\n\n\nAnd here second page\n\n\n",
            "text": [
                "SHELL 1900 1904 1909 1930 1948 SHELL SHELL Shell Shell 1955 1961 1971 1995 1999",
                "B+W BLACK+WHITE PHOTOGRAPHY"
            ],
            "layoutText": [],
            "textFromOcr": "[\"SHELL 1900 1904 1909 1930 1948 SHELL SHELL Shell Shell 1955 1961 1971 1995 1999\",\"B+W BLACK+WHITE PHOTOGRAPHY\"]"
        }
    ]
}
My question is: why is the OCRed text not placed in the correct order relative to the standard text when I am using /document/normalized_images/*/contentOffset in the MergeSkill? To be honest, my skillset is copy-pasted from the MS docs, and it is not working as expected. I don't really understand what special output comes from the built-in OCR skill. I need to develop my own OCR skill; I can't use the OCR skill from Search out of the box, I need to write it on my own.
Upvotes: 0
Views: 230
Reputation: 466
Unfortunately, that is the behavior of the skill by design: it places the document text first and leaves the image text at the bottom. This is not something that can be changed with code inside the skill at this time, due to an implementation limitation. Changes to the OCR skill documentation have been made to reflect this; they will hopefully be published this week, to clarify and avoid confusion.
Upvotes: 1