DocumentAI: detect whether an image contains non-text visual elements

Question

Most of my target images contain only text, which is expected, since my main purpose is to extract text from them. However, some of them also contain non-text visual elements (actual pictures embedded in the document), and I'd like to know which ones.

Does DocumentAI have a way to do that?

I have tried to detect the image by checking the areas of the blocks of a page object in DocumentAI using Python:

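def has_visual_elements(page, min_area=10000):
    """Checks if a page likely contains non-text visual elements.

    Heuristic: flags the page if any block's bounding box is larger than
    min_area (in square pixels).
    """
    for block in page.blocks:
        if not block.layout:
            continue
        vertices = block.layout.bounding_poly.vertices
        # Skip blocks without a complete four-corner bounding polygon
        if len(vertices) < 4:
            continue
        # Calculate the area of the bounding box from opposite corners
        width = vertices[2].x - vertices[0].x
        height = vertices[2].y - vertices[0].y
        area = abs(width * height)

        if area > min_area:
            return True
    return False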

If the area is larger than a certain value, the page may contain non-text visual elements. However, some images that contain only text also produce blocks with a large area, so this approach doesn't work.
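To make the failure concrete, here is a minimal sketch that reproduces it without calling DocumentAI. The `make_block` helper and the block coordinates are hypothetical stand-ins I made up for illustration; they only mimic the `layout.bounding_poly.vertices` shape used above and are not real API output.

```python
from types import SimpleNamespace


def make_block(x0, y0, x1, y1):
    """Build a stand-in (not a real DocumentAI object) for a rectangular block."""
    vertices = [
        SimpleNamespace(x=x0, y=y0),  # top-left
        SimpleNamespace(x=x1, y=y0),  # top-right
        SimpleNamespace(x=x1, y=y1),  # bottom-right
        SimpleNamespace(x=x0, y=y1),  # bottom-left
    ]
    layout = SimpleNamespace(bounding_poly=SimpleNamespace(vertices=vertices))
    return SimpleNamespace(layout=layout)


def block_area(block):
    """Pixel area of a block, using the same opposite-corner arithmetic as above."""
    v = block.layout.bounding_poly.vertices
    return abs((v[2].x - v[0].x) * (v[2].y - v[0].y))


# Hypothetical text-only page: a single paragraph block, 800 px wide, 200 px tall.
paragraph = make_block(100, 100, 900, 300)
paragraph_area = block_area(paragraph)

# The paragraph alone is far above the 10000 threshold, so the check fires
# even though the page contains nothing but text.
exceeds_threshold = paragraph_area > 10000

print(paragraph_area, exceeds_threshold)
```

So an ordinary paragraph is enough to trip the check, which suggests block area by itself can't separate text from pictures, whatever threshold I pick.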

An image containing non-text visual elements: [image not included]
