Filip Östermark

Reputation: 445

Improve Document AI generative AI accuracy?

I am creating a Document AI Custom Processor on Google Cloud Platform. I have been using the pre-trained foundation model to auto-label documents as I import them. However, it is not clear to me whether labeling more documents will improve the performance of a Generative AI (as opposed to custom) processor, or whether labeling documents only improves the performance of custom-trained models.

Upvotes: 0

Views: 585

Answers (1)

user23255231

Reputation: 11

The best way to improve the labeling is by uptraining the processor. I like to use the model-based method; just note that you need 20 instances of every label (10 test, 10 training) to uptrain with this method (Google recommends 50 each). If needed, you can upload duplicates to reach the minimum requirement, but I don't recommend this since it will lead to a lower F1 score. The more documents the better, so I like to uptrain as soon as I can, then continue to import more docs and uptrain again (about every 50 new docs). Uptrain by going to Build > Manage dataset > Train new version. If the documents come in different formats, try to have at least 20 documents per format.
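If it helps, here is a minimal sketch for checking whether every label has reached the 10 training / 10 test minimum mentioned above before you try to uptrain. It does not call the Document AI API. It assumes you have already exported your labels into plain Python lists, one list of label names per document. The function name `labels_below_minimum` and the label names `invoice_id` and `total` are made up for illustration.

```python
from collections import Counter

# Minimum instances per label in each split, as stated in the answer above.
MIN_PER_SPLIT = 10


def labels_below_minimum(train_docs, test_docs, minimum=MIN_PER_SPLIT):
    """Return {label: (train_count, test_count)} for every label that has
    fewer than `minimum` instances in the training or the test split.

    Each document is assumed to be a list of label names (hypothetical format).
    """
    train_counts = Counter(label for doc in train_docs for label in doc)
    test_counts = Counter(label for doc in test_docs for label in doc)

    short = {}
    for label in sorted(set(train_counts) | set(test_counts)):
        pair = (train_counts[label], test_counts[label])
        if min(pair) < minimum:
            short[label] = pair
    return short


# Hypothetical dataset: 12 training docs and 10 test docs.
train = [["invoice_id", "total"]] * 10 + [["invoice_id"]] * 2
test = [["invoice_id", "total"]] * 9 + [["invoice_id"]]

print(labels_below_minimum(train, test))  # → {'total': (10, 9)}
```

An empty result means every label meets the minimum in both splits. Pass `minimum=50` to check against the recommended count instead.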

Upvotes: 1
