gokublack

Reputation: 1470

AWS Textract StartDocumentAnalysis function not publishing a message to the SNS Topic

I am working with AWS Textract and want to analyze a multipage document, so I have to use the asynchronous API. I first called the startDocumentAnalysis function and got a JobId back, but the job also needs to trigger a Lambda function that I have set up to run when the SNS topic receives a message.

These are my serverless file and handler file.

provider:
  name: aws
  runtime: nodejs8.10
  stage: dev
  region: us-east-1
  iamRoleStatements:
    - Effect: "Allow"
      Action:
       - "s3:*"
      Resource: { "Fn::Join": ["", ["arn:aws:s3:::${self:custom.secrets.IMAGE_BUCKET_NAME}", "/*" ] ] }
    - Effect: "Allow"
      Action:
        - "sts:AssumeRole"
        - "SNS:Publish"
        - "lambda:InvokeFunction"
        - "textract:DetectDocumentText"
        - "textract:AnalyzeDocument"
        - "textract:StartDocumentAnalysis"
        - "textract:GetDocumentAnalysis"
      Resource: "*"

custom:
  secrets: ${file(secrets.${opt:stage, self:provider.stage}.yml)}

functions:
  routes:
    handler: src/functions/routes/handler.run
    events:
      - s3:
          bucket: ${self:custom.secrets.IMAGE_BUCKET_NAME}
          event: s3:ObjectCreated:*

  textract:
    handler: src/functions/routes/handler.detectTextAnalysis
    events:
      - sns: "TextractTopic"

resources:
  Resources:
    TextractTopic:
        Type: AWS::SNS::Topic
        Properties:
          DisplayName: "Start Textract API Response"
          TopicName: TextractResponseTopic

Handler.js

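const AWS = require('aws-sdk');

// Textract client; the region is taken from the Lambda environment
const textract = new AWS.Textract();

module.exports.run = async (event) => {
  // Bucket and key of the object whose upload triggered this function
  const uploadedBucket = event.Records[0].s3.bucket.name;
  const uploadedObject = event.Records[0].s3.object.key;

  const params = {
    DocumentLocation: {
      S3Object: {
        Bucket: uploadedBucket,
        Name: uploadedObject
      }
    },
    FeatureTypes: [
      "TABLES",
      "FORMS"
    ],
    // Role that Textract assumes, and the topic it should publish to when the job finishes
    NotificationChannel: {
      RoleArn: 'arn:aws:iam::<account-id>:role/qvalia-ocr-solution-dev-us-east-1-lambdaRole',
      SNSTopicArn: 'arn:aws:sns:us-east-1:<account-id>:TextractTopic'
    }
  };

  // Start the asynchronous analysis; the response contains the JobId
  let textractOutput = await new Promise((resolve, reject) => {
    textract.startDocumentAnalysis(params, function(err, data) {
      if (err) reject(err);
      else resolve(data);
    });
  });
};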

I manually published an SNS message to the topic, and that does fire the textract Lambda, which currently contains this:

module.exports.detectTextAnalysis = async (event) => {
  console.log('SNS Topic isssss Generated');
  console.log(event.Records[0].Sns.Message);
};

What am I doing wrong, and why is Textract's startDocumentAnalysis not publishing a message to the topic and triggering the Lambda?

Note: I haven't called startDocumentTextDetection before calling startDocumentAnalysis, though calling it first is not necessary.

Upvotes: 12

Views: 4678

Answers (5)

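The SNS topic name must start with AmazonTextract, at least when the notification role relies on the AWS managed AmazonTextractServiceRole policy, which only allows publishing to topics with that prefix.

In the end, your ARN should look like this: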

arn:aws:sns:us-east-2:111111111111:AmazonTextract
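
As a quick sanity check before starting a job, the topic name can be validated in code. The following is only a minimal sketch: the helper name topicNameHasTextractPrefix is hypothetical, and it assumes the prefix requirement applies to your role (as it does when publishing is only permitted on topics named AmazonTextract*).

```javascript
// Hypothetical helper (not part of the AWS SDK): returns true when the
// topic name in an SNS topic ARN starts with "AmazonTextract".
function topicNameHasTextractPrefix(topicArn) {
  // An SNS topic ARN has the form arn:aws:sns:<region>:<account-id>:<topic-name>
  const parts = topicArn.split(':');
  if (parts.length !== 6 || parts[2] !== 'sns') {
    throw new Error(`Not an SNS topic ARN: ${topicArn}`);
  }
  return parts[5].startsWith('AmazonTextract');
}
```

With the ARNs from this thread, the example ARN above passes the check, while the question's arn:aws:sns:us-east-1:<account-id>:TextractTopic does not.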

Upvotes: 5

Matthew Pitts

Reputation: 859

For anyone using the CDK in TypeScript: add Lambda as a ServicePrincipal on the Lambda execution role as usual, then access the role's assumeRolePolicy and call its addStatements method.

First, create the basic execution role without any additional statements (those are added later)

  this.executionRole = new iam.Role(this, 'ExecutionRole', {
    assumedBy: new ServicePrincipal('lambda.amazonaws.com'),
  });

Next, add Textract as an additional ServicePrincipal

  this.executionRole.assumeRolePolicy?.addStatements(
    new PolicyStatement({
      principals: [
        new ServicePrincipal('textract.amazonaws.com'),
      ],
      actions: ['sts:AssumeRole']
    })
  );

Also, ensure the execution role has full permissions on the target SNS topic (note the topic is created already and accessed via fromTopicArn method)

 const stmtSNSOps = new PolicyStatement({
    effect: iam.Effect.ALLOW,
    actions: [
      "SNS:*"
    ],
    resources: [
      this.textractJobStatusTopic.topicArn
    ]
  });

Add the policy statement to a global policy (within the active stack)

 this.standardPolicy = new iam.Policy(this, 'Policy', {
    statements: [
      ...
      stmtSNSOps, 
      ...
    ]
  });

Finally, attach the policy to the execution role

  this.executionRole.attachInlinePolicy(this.standardPolicy);

Upvotes: 0

Christian Gossain

Reputation: 5972

I was able to get this working directly via the Serverless Framework by adding a Lambda execution role resource to my serverless.yml file:

resources:
  Resources:
    IamRoleLambdaExecution:
      Type: AWS::IAM::Role
      Properties:
        AssumeRolePolicyDocument:
          Version: "2012-10-17"
          Statement:
            - Effect: Allow
              Principal:
                Service:
                  - lambda.amazonaws.com
                  - textract.amazonaws.com
              Action: sts:AssumeRole

And then I just used the same role generated by Serverless (for the Lambda function) as the notification channel role parameter when starting the Textract document analysis.
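
Passing that role as the notification channel could look roughly like the sketch below. It is an illustration under stated assumptions, not the code from this answer: the helper buildStartAnalysisParams and the environment variable names TEXTRACT_ROLE_ARN and TEXTRACT_TOPIC_ARN are hypothetical, and the role ARN is assumed to be exposed to the function (for example through an environment variable set from the IamRoleLambdaExecution resource's Arn attribute).

```javascript
// Hypothetical helper: builds the StartDocumentAnalysis request so that
// Textract assumes the given role to publish to the given SNS topic.
function buildStartAnalysisParams(bucket, key, roleArn, topicArn) {
  if (!roleArn || !topicArn) {
    throw new Error('Both the role ARN and the SNS topic ARN are required');
  }
  return {
    DocumentLocation: { S3Object: { Bucket: bucket, Name: key } },
    FeatureTypes: ['TABLES', 'FORMS'],
    NotificationChannel: { RoleArn: roleArn, SNSTopicArn: topicArn },
  };
}

// Usage inside the S3-triggered handler (aws-sdk v2, as in the question).
// The environment variable names are assumptions:
//
//   const params = buildStartAnalysisParams(
//     uploadedBucket,
//     uploadedObject,
//     process.env.TEXTRACT_ROLE_ARN,
//     process.env.TEXTRACT_TOPIC_ARN
//   );
//   const { JobId } = await textract.startDocumentAnalysis(params).promise();
```

Reading the ARNs from configuration rather than hard-coding them keeps the role and topic in step with whatever the stack actually deploys.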

Thanks to this post for pointing me in the right direction!

Upvotes: 0

griff4594

Reputation: 502

Make sure the trust relationship of the role you are using includes the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "lambda.amazonaws.com",
          "textract.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
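
If you want to verify this programmatically, a small check over the trust policy document might look like the sketch below. The function name trustsService is hypothetical, and it assumes you already have the policy as a parsed object (note that the IAM GetRole API returns the document as URL-encoded JSON, so it would need decodeURIComponent and JSON.parse first).

```javascript
// Hypothetical helper: returns true when the trust policy lets the given
// service principal call sts:AssumeRole on the role.
function trustsService(trustPolicy, service) {
  // Statement, Action and Principal.Service may each be a single value or an array
  const statements = [].concat(trustPolicy.Statement || []);
  return statements.some((stmt) => {
    if (stmt.Effect !== 'Allow') return false;
    const actions = [].concat(stmt.Action || []);
    const services = [].concat((stmt.Principal && stmt.Principal.Service) || []);
    return actions.includes('sts:AssumeRole') && services.includes(service);
  });
}
```

Calling trustsService(policy, 'textract.amazonaws.com') on the policy above returns true; on a default Lambda execution role that only trusts lambda.amazonaws.com it returns false, which matches the symptom in the question where the job starts but no notification arrives.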

Upvotes: 10

Ruben J Garcia

Reputation: 344

If your bucket is encrypted, you must also grant KMS permissions; otherwise it won't work.

Upvotes: 0
