Solo

Reputation: 21

Document AI doesn't recognize the parent label area correctly and only labels it on a per-line basis

I have an issue with Document AI. When I create a parent label with child labels in it, Document AI does not recognize the whole area of the parent label correctly. Instead, it recognizes the parent on a per-line basis and produces a separate label for each line.

Is there a way to make it recognize both lines as one parent label instead of splitting them?

Here is an example of how Document AI labeled it:

Here is an example of how I labeled the training/testing datasets:

Upvotes: 1

Views: 503

Answers (1)

Holt Skinner

Reputation: 2234

Based on the images provided of the training data, it looks like the labels are not very precise. The bounding boxes need to be tight around the entities in the document, and cover the full text required.
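To illustrate what "cover the full text" means for a parent entity that spans two lines, here is a minimal sketch in plain Python that computes the smallest box enclosing several line-level boxes. It does not call the Document AI API. The function name `enclosing_box`, the `(x, y)` tuple representation of normalized vertices, and the coordinates are all assumptions made up for illustration.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # hypothetical (x, y) pair, normalized to 0..1


def enclosing_box(boxes: Iterable[List[Point]]) -> List[Point]:
    """Return the smallest axis-aligned box that contains every input box.

    Each input box is a list of (x, y) vertices. The result is four
    vertices in clockwise order, starting from the top-left corner.
    """
    points = [p for box in boxes for p in box]
    if not points:
        raise ValueError("at least one box with vertices is required")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    left, right = min(xs), max(xs)
    top, bottom = min(ys), max(ys)
    return [(left, top), (right, top), (right, bottom), (left, bottom)]


# Made-up coordinates for two line-level boxes that belong to one parent.
line_1 = [(0.10, 0.20), (0.60, 0.20), (0.60, 0.23), (0.10, 0.23)]
line_2 = [(0.10, 0.24), (0.55, 0.24), (0.55, 0.27), (0.10, 0.27)]

parent_box = enclosing_box([line_1, line_2])
```

In this sketch, `parent_box` is a single region that spans both lines. A parent label drawn during labeling would ideally cover the same kind of region, with the child labels inside it.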

Are you uptraining an Invoice or Purchase Order processor, or are you using a Custom Document Extractor? Based on the document structure I can see, uptraining might make sense if your documents are similar to what an existing processor already handles.

It's also recommended to provide as much training data as possible to improve accuracy. If you're using the minimum amount required, your results could be improved by adding more labeled data.

You can try leveraging Auto-Labeling to create more examples quickly without having to manually label everything.

Upvotes: 0
