I really appreciate anyone being willing to help. I've been following along with this AWS blog post:
and its GitHub repository: https://github.com/aws-samples/aws-document-classifier-and-splitter/
I am struggling to get workflow1 to work for my use case. When I give it an S3 location that contains more than a certain number of training documents, my state machine loops indefinitely between "check table status" -> "is table complete" -> "wait for object processing". It is able to create the endpoint when I pass far fewer multipage PDFs.
In function 4, the check to see if the table is full is:

    if rows_not_filled == 0:
        return True
    else:
        return False
Even when the table was full (I looked through the table and found no empty rows), the check still seemed to return False. So I think I am being throttled somewhere by DynamoDB, but I don't know where, and I don't know how to fix it. I believe I have tried both on-demand and provisioned capacity for the table, but I get the same problem whenever I feed too many documents to the application.

I am pleading for someone to help me troubleshoot where the error is. I am almost positive that the issue lies with DynamoDB, since Textract has no trouble writing all of the document text to the table. I simply cannot get the table check to return True, and since I am a novice with AWS, I do not know what I need to modify to fix this. I have been stuck for weeks, and AWS support has been no help to me.
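To make the check concrete, here is a minimal sketch (not the repository's actual code) of how a count like rows_not_filled could be computed so that every page of a DynamoDB scan is included. A single Scan call returns at most 1 MB of data, and any further pages have to be requested by passing the previous response's LastEvaluatedKey as ExclusiveStartKey, so a check that reads only the first page may not see the whole table. The function names count_unfilled_rows and is_table_complete, the attribute name "text", and the injected scan callable are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict


def count_unfilled_rows(scan: Callable[..., Dict[str, Any]],
                        attribute: str = "text") -> int:
    """Count items whose `attribute` is missing or empty, across all scan pages.

    `scan` is any callable that behaves like a DynamoDB scan: it accepts an
    optional ExclusiveStartKey keyword and returns a dict with "Items" and,
    when more pages remain, "LastEvaluatedKey".
    The attribute name "text" is a hypothetical placeholder.
    """
    rows_not_filled = 0
    start_key = None
    while True:
        # Only pass ExclusiveStartKey once we have a key from a previous page.
        kwargs = {"ExclusiveStartKey": start_key} if start_key else {}
        page = scan(**kwargs)
        for item in page.get("Items", []):
            if not item.get(attribute):
                rows_not_filled += 1
        # No LastEvaluatedKey in the response means this was the final page.
        start_key = page.get("LastEvaluatedKey")
        if not start_key:
            return rows_not_filled


def is_table_complete(scan: Callable[..., Dict[str, Any]],
                      attribute: str = "text") -> bool:
    """Same True/False result as the original check, over the full table."""
    return count_unfilled_rows(scan, attribute) == 0
```

With boto3, the scan argument could be something like boto3.resource("dynamodb").Table("my-table").scan, where "my-table" is a placeholder table name. Passing the scan in as a callable also lets the counting logic be exercised locally with a fake, without touching AWS.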