DaveNOTDavid

Reputation: 1803

Using Google's Text Recognition API to detect horizontal lines instead of blocks in images

Is there a way to detect full-sized, horizontal lines (max width) instead of text blocks in images using Google's Text Recognition API? Say, if I wanted to retrieve the total due from a receipt image like this:

[image]

... because as of now, the API detects text in blocks, in an arbitrary order, like this:

[image]

... and no, TextBlock's getComponents() only retrieves the Lines within each TextBlock since TextBlock is at the top of the Text hierarchy (TextBlock contains Line) as mentioned in the docs here. If only this API could start off with Lines instead of TextBlocks for an image bitmap's frame...

I even tried resizing the text blocks' bounding box (rectangle) with hard-coded coordinates to hopefully detect the full line of text, "Chicken Bowl... 7.15", but to no avail as shown below:

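import android.util.Log
import com.google.android.gms.vision.Frame
import com.google.android.gms.vision.text.TextRecognizer

val textRecognizer = TextRecognizer.Builder(this).build()
if (textRecognizer.isOperational) {
    val imageFrame = Frame.Builder()
        .setBitmap(imageBitmap)
        .build()
    // detect() returns a SparseArray<TextBlock>
    val textBlocks = textRecognizer.detect(imageFrame)
    for (i in 0 until textBlocks.size()) {
        val textBlock = textBlocks.valueAt(i)
        // Attempt to widen the box to the full "Chicken Bowl... 7.15" line;
        // the recognized value below is unchanged by this
        textBlock.boundingBox.set(97, 1244, 1235, 1292)

        val textValue = textBlock.value
        Log.d(LOG_TAG, "textValue: $textValue")
    }
}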

Upvotes: 3

Views: 2072

Answers (1)

Arno

Reputation: 447

You are right: the API only gives you the coordinates of the text blocks and of the lines within each block. You therefore have to assemble the full-width lines yourself.

Before you start, rotate the coordinates so that the baselines are (more or less) horizontal. Be aware that the corner coordinates of some bounding boxes come back in the wrong order. Exclude these misleading boxes when you calculate the required rotation angle.
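The rotation step could be sketched roughly as follows. This is only an illustration, not the code referred to in this answer. It assumes that each box comes with four corner points ordered top-left, top-right, bottom-right, bottom-left (for example, copied out of getCornerPoints()), that the skew is small, and that taking the median angle is enough to ignore the occasional box with misordered corners. Pt, topEdgeAngle, estimateSkew and deskew are hypothetical names.

```kotlin
import kotlin.math.atan2
import kotlin.math.cos
import kotlin.math.sin

// Hypothetical plain point type standing in for android.graphics.Point.
data class Pt(val x: Double, val y: Double)

// Angle (radians) of the edge from a box's first corner to its second corner.
// Assumption: corners are ordered top-left, top-right, bottom-right, bottom-left.
fun topEdgeAngle(corners: List<Pt>): Double =
    atan2(corners[1].y - corners[0].y, corners[1].x - corners[0].x)

// Median of the per-box angles. A minority of boxes with misordered corners
// report a wildly different angle and end up at the ends of the sorted list.
fun estimateSkew(boxes: List<List<Pt>>): Double {
    val angles = boxes.map(::topEdgeAngle).sorted()
    return if (angles.isEmpty()) 0.0 else angles[angles.size / 2]
}

// Rotate a point about the origin by -skew so that baselines become horizontal.
fun deskew(p: Pt, skew: Double): Pt {
    val c = cos(-skew)
    val s = sin(-skew)
    return Pt(p.x * c - p.y * s, p.x * s + p.y * c)
}
```

Apply deskew to every corner of every box with the angle from estimateSkew, then derive axis-aligned boxes from the rotated corners.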

After you have rotated all the coordinates, you can match up the word bounding boxes and build the lines you need. In my code I did this by comparing the vertical centers of the boxes. Watch out for fragments whose height is very small or very large compared to the average height; they need special treatment.
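The matching step could look roughly like the sketch below. Again, this is an illustration under assumptions, not the code referred to above: Word is a hypothetical record holding a word's text and its axis-aligned box after rotation, the tolerance of half a text height is an arbitrary starting value, and the median height is used as the reference instead of the average so that unusually small or large fragments do not distort it. Special handling of those fragments is left out.

```kotlin
import kotlin.math.abs

// Hypothetical word record: text plus its axis-aligned box after deskewing.
data class Word(val text: String, val left: Double, val top: Double, val bottom: Double) {
    val centerY get() = (top + bottom) / 2
    val height get() = bottom - top
}

// Groups words whose vertical center lies within tolerance * medianHeight of
// the mean center of the current line, then orders each line left to right.
fun groupIntoLines(words: List<Word>, tolerance: Double = 0.5): List<String> {
    if (words.isEmpty()) return emptyList()
    val medianHeight = words.map { it.height }.sorted()[words.size / 2]
    val lines = mutableListOf<MutableList<Word>>()
    for (w in words.sortedBy { it.centerY }) {
        val current = lines.lastOrNull()
        val lineCenter = current?.map { it.centerY }?.average()
        if (current != null && lineCenter != null &&
            abs(w.centerY - lineCenter) <= tolerance * medianHeight
        ) {
            current.add(w)
        } else {
            lines.add(mutableListOf(w))
        }
    }
    return lines.map { line -> line.sortedBy { it.left }.joinToString(" ") { it.text } }
}

fun main() {
    // Made-up coordinates for illustration only.
    val words = listOf(
        Word("7.15", 1100.0, 1246.0, 1290.0),
        Word("Chicken Bowl", 97.0, 1244.0, 1292.0),
        Word("Total", 97.0, 1400.0, 1448.0),
    )
    groupIntoLines(words).forEach(::println)
}
```

For a receipt, the total due would then be the line whose text contains the label you are looking for.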

I can assure you that this works well with receipts like the one in your example.

Upvotes: 0
