Reputation: 752
I am executing the following AWS Lambda function:
import json
import urllib.parse
import boto3

print('Loading function')

s3 = boto3.client('s3')
transcribe = boto3.client('transcribe')

#DOCUMENTATION : https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/transcribe.html#TranscribeService.Client.start_transcription_job
def lambda_handler(event, context):
    # 1 - Get the bucket name
    bucket = event['Records'][0]['s3']['bucket']['name']
    # 2 - Get the file/key name
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    media_uri = "s3://aws-support-ml-demo-bucket/SampleInboundCall2.mp3"
    try:
        response = transcribe.start_transcription_job(
            TranscriptionJobName='thisjobiscomming-from-lambda',
            LanguageCode='en-US',
            MediaSampleRateHertz=8000,
            MediaFormat='mp3',
            Media={
                'MediaFileUri': media_uri
            },
            OutputBucketName='aws-support-ml-demo-bucket-transcribe',
            # OutputEncryptionKMSKeyId='string',
            Settings={
                'ShowSpeakerLabels': True,
                'MaxSpeakerLabels': 3,
                'ChannelIdentification': False,
                'ShowAlternatives': False,
            },
            JobExecutionSettings={
                'AllowDeferredExecution': True,
                'DataAccessRoleArn': 'arn:aws:iam::026863910802:role/service-role/TEST-AWS-TEST'
            },
            ContentRedaction={
                'RedactionType': 'PII',
                'RedactionOutput': 'redacted'
            }
        )
        print(response)
    except Exception as e:
        print(e)
        raise e
It fails with the following error:
2020-06-18T11:18:17.628+03:00
An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The S3 URI that you provided can't be accessed. Make sure that you have read permission and try your request again.
2020-06-18T11:18:17.628+03:00
[ERROR] BadRequestException: An error occurred (BadRequestException) when calling the StartTranscriptionJob operation: The S3 URI that you provided can't be accessed. Make sure that you have read permission and try your request again.
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 63, in lambda_handler
raise e
File "/var/task/lambda_function.py", line 24, in lambda_handler
response = transcribe.start_transcription_job(
The bucket name is aws-support-ml-demo-bucket, and the file is directly inside the bucket.
My Lambda role also has full access to S3.
I do not have much experience with S3 URIs, but I suspect the URI might be the problem.
Regarding the IAM role, I use exactly the same role both as the Lambda execution role and as the DataAccessRoleArn passed to Transcribe:
'DataAccessRoleArn': 'arn:aws:iam::026863910802:role/service-role/TEST-AWS-TEST'
The role has the following permissions:
AmazonS3FullAccess (AWS managed policy)
AmazonTranscribeFullAccess (AWS managed policy)
IAMassumeRole (Managed policy)
AWSLambdaS3ExecutionRole-6fe39002-b20d-4255-a666-98fb5c889b2c (Managed policy)
AWSLambdaBasicExecutionRole-9da5b8ab-3601-4975-ad97-1206e6348784 (Managed policy)
Upvotes: 1
Views: 1245
Reputation: 629
You can try creating the S3 client with an explicit region (for example us-east-1). Also check whether your Lambda runs inside a VPC. If it does, you might need a route or a VPC endpoint to connect to S3.
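Both suggestions could be sketched roughly as follows. This is only an illustration: the helper names are made up for this example, and the function name you pass to `check_lambda_vpc` is whatever your Lambda is actually called. It assumes the response of `get_function_configuration` carries a `VpcConfig` entry with a `VpcId` when the function is attached to a VPC.

```python
def make_s3_client(region_name="us-east-1"):
    """Create an S3 client pinned to an explicit region."""
    import boto3  # imported lazily so runs_in_vpc() works without boto3 installed
    return boto3.client("s3", region_name=region_name)


def runs_in_vpc(function_config):
    """Return True if a Lambda configuration dict reports a non-empty VpcId.

    function_config is expected to look like the response of
    lambda_client.get_function_configuration(FunctionName=...).
    """
    vpc_config = function_config.get("VpcConfig") or {}
    return bool(vpc_config.get("VpcId"))


def check_lambda_vpc(function_name):
    """Fetch the configuration of the named Lambda and report VPC attachment."""
    import boto3
    lambda_client = boto3.client("lambda")
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    return runs_in_vpc(config)
```

If `check_lambda_vpc` returns True, the function's subnets need a route to S3 (for example through a VPC endpoint) before any S3 call can succeed.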
Upvotes: 0
Reputation: 31
Marcin's answer is actually quite correct. If you specify the AllowDeferredExecution parameter in your job options (even if it's set to false), you also need to specify DataAccessRoleArn.
However, using the same role for deferred execution as for the Lambda won't work, because the trust policy of that role likely has lambda.amazonaws.com in its Principal.Service field. As the AWS documentation says in https://docs.aws.amazon.com/transcribe/latest/dg/job-queuing.html, the role used for queued jobs needs a trust policy that allows Transcribe to assume it. That is, you need a role with the following trust policy:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "transcribe.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
So first you need to create a new role that has access to the S3 buckets you use for the jobs' input and output files. Then you need to add the correct trust policy to that role. When you create a new role in the console, you usually pick an AWS service as the base and then start adding policies. That initial service selection in the IAM roles menu becomes the Principal.Service in the role's trust policy. Unfortunately there is no Transcribe service to start from when creating a role, so you need to start with something else, like EC2, and edit the trust policy after the role has been created. You can do that by choosing the role in IAM and selecting the Trust Relationships tab.
After I created the new role and added the above-mentioned trust policy and S3 permissions to it, I could use that role in the job options for deferred execution.
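The same role can also be created programmatically, which avoids the detour through EC2 in the console. The following is a minimal sketch with boto3, not a definitive setup: the role name, the inline policy name and the exact list of S3 actions are assumptions made for illustration, and the bucket names are whatever you use for input and output.

```python
import json

# Trust policy from the answer above: lets Amazon Transcribe assume the role.
TRANSCRIBE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": ["transcribe.amazonaws.com"]},
            "Action": "sts:AssumeRole",
        }
    ],
}


def s3_access_policy(input_bucket, output_bucket):
    """Build an inline policy: read the input bucket, write the output bucket.

    The action list is an assumption for this sketch; adjust it to your needs.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{input_bucket}",
                    f"arn:aws:s3:::{input_bucket}/*",
                ],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{output_bucket}/*"],
            },
        ],
    }


def create_transcribe_role(role_name, input_bucket, output_bucket):
    """Create the role with the Transcribe trust policy and return its ARN."""
    import boto3  # imported lazily; the policy builders above need no AWS access
    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRANSCRIBE_TRUST_POLICY),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="transcribe-s3-access",  # hypothetical policy name
        PolicyDocument=json.dumps(s3_access_policy(input_bucket, output_bucket)),
    )
    return role["Role"]["Arn"]
```

The returned ARN is the value to pass as DataAccessRoleArn in the job's JobExecutionSettings.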
Upvotes: 3
Reputation: 238967
"My lambda role also has full access to S3."
I think the problem is not with the Lambda execution role, but rather with:
'DataAccessRoleArn': 'arn:aws:iam::026863910802:role/service-role/TEST-AWS-TEST'
This should be a role for job queuing, with a trust policy for transcribe.amazonaws.com.
It is this role, not the Lambda function's execution role, that needs permission to read your S3 object.
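One way to check whether an existing role is usable here is to inspect its trust policy. A small sketch, assuming boto3 and assuming that `get_role` returns the trust policy as an already-decoded dict under `AssumeRolePolicyDocument`; the helper names are invented for this example:

```python
def role_trusts_service(trust_policy, service):
    """Return True if the trust policy lets `service` call sts:AssumeRole."""
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. a bare "*" principal, not a service principal
        services = principal.get("Service", [])
        if isinstance(services, str):
            services = [services]
        if service in services:
            return True
    return False


def data_access_role_is_usable(role_name):
    """Look up the role and check that Transcribe is allowed to assume it."""
    import boto3  # imported lazily; role_trusts_service() needs no AWS access
    iam = boto3.client("iam")
    role = iam.get_role(RoleName=role_name)["Role"]
    # Assumption: boto3 hands back the policy document as a dict.
    return role_trusts_service(
        role["AssumeRolePolicyDocument"], "transcribe.amazonaws.com"
    )
```

A role created for Lambda would typically list only lambda.amazonaws.com as its service principal, in which case this check returns False.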
From docs:
DataAccessRoleArn: The Amazon Resource Name (ARN) of a role that has access to the S3 bucket that contains the input files. Amazon Transcribe assumes this role to read queued media files. If you have specified an output S3 bucket for the transcription results, this role should have access to the output bucket as well.
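Putting this together, the job arguments could be built from the S3 event with a dedicated data access role, roughly as below. This is a sketch under assumptions: `build_job_args` is a hypothetical helper, the role ARN in the usage note is a placeholder, and the speaker-label and redaction settings from the question are left out for brevity.

```python
import urllib.parse


def build_job_args(event, job_name, output_bucket, data_access_role_arn):
    """Build keyword arguments for start_transcription_job from an S3 event."""
    record = event["Records"][0]["s3"]
    bucket = record["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(record["object"]["key"], encoding="utf-8")
    return {
        "TranscriptionJobName": job_name,
        "LanguageCode": "en-US",
        "MediaFormat": "mp3",
        "Media": {"MediaFileUri": f"s3://{bucket}/{key}"},
        "OutputBucketName": output_bucket,
        "JobExecutionSettings": {
            "AllowDeferredExecution": True,
            # Must be a role that transcribe.amazonaws.com can assume,
            # not the Lambda execution role.
            "DataAccessRoleArn": data_access_role_arn,
        },
    }
```

Inside the handler this would be used as `transcribe.start_transcription_job(**build_job_args(event, job_name, 'aws-support-ml-demo-bucket-transcribe', role_arn))`, where `role_arn` is the ARN of the separate Transcribe role (for example a placeholder such as `arn:aws:iam::123456789012:role/transcribe-data-access`). Job names have to be unique within the account, so `job_name` should change on every invocation.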
Upvotes: 0