Saajid Kamran

Reputation: 1

AWS Transcribe - Bad request: error detected at 'transcriptionJobName' failed to satisfy constraint:

Hi, I start a Transcribe job from an AWS Lambda function, but I can't find any job being created, and I see this error in the CloudWatch log. Can anyone help me with this, please? I have had this problem for 3 days and couldn't find any solution.

This is the error message (CloudWatch log screenshot):

'use strict';
var aws = require('aws-sdk');
var s3 = new aws.S3();
var transcribeservice = new aws.TranscribeService();

exports.handler = (event, context, callback) => {
    console.log('Received event:', JSON.stringify(event, null, 2));
    var bucket = event.Records[0].s3.bucket.name;
    var key = event.Records[0].s3.object.key;    
    var newKey = key.split('.')[0];
    var str = newKey.lastIndexOf("/");
    newKey = newKey.substring(str+1);
    
    var inputaudiolocation = "https://s3.amazonaws.com/search-video/inputaudio/";
    var mp3URL = inputaudiolocation+newKey+".mp3";
    var outputbucket = "search-video";
    var params = {
        LanguageCode: "en-US", /* required */
        Media: { /* required */
          MediaFileUri: mp3URL
        },
        MediaFormat: "mp3", /* required */
        TranscriptionJobName: newKey, /* required */
        MediaSampleRateHertz: 44100,
        OutputBucketName: outputbucket
      };
      transcribeservice.startTranscriptionJob(params, function(err, data){
      if (err){
       console.log('Received event:Error = ',err);
      } else {
       console.log('Received event:Success = ',data);
      }
     });
};
Thanks in Advance !!!

Upvotes: 0

Views: 650

Answers (1)

AYHarano

Reputation: 71

Given that 10 months have passed, you have probably moved on from this issue. Anyhow, the log image shows why the transcription job failed: its name does not match the required regular expression (regex).

As stated in the TranscriptionJobName parameter description in the AWS Transcribe API Reference, the name must match the regex pattern ^[0-9a-zA-Z._-]+.

A naive way to solve the issue is to remove all the characters that are not allowed. Before the params declaration, but after the last newKey assignment, you can add:

NAIVE SUGGESTION:

newKey = newKey.replace(/[^0-9a-zA-Z._-]/g, '')

That should suffice for your issue. The code is explained further down, as part of another suggestion.

However, if we revisit the TranscriptionJobName parameter documentation, there are a few more points to watch out for.
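
As a minimal sketch of the naive fix, the snippet below runs the same replace call on a made-up object key. The key 'inputaudio/my interview (final).mp3' is an assumption for illustration only; it is not taken from the question.

```javascript
// Hypothetical S3 object key, chosen only to illustrate the failure.
const key = 'inputaudio/my interview (final).mp3';

// Same steps as in the question: drop the extension and the folder prefix.
let newKey = key.split('.')[0];
newKey = newKey.substring(newKey.lastIndexOf('/') + 1);

// Remove every character the TranscriptionJobName pattern does not allow.
const jobName = newKey.replace(/[^0-9a-zA-Z._-]/g, '');

console.log(jobName); // myinterviewfinal
```

The spaces and parentheses are dropped, so the resulting name satisfies the pattern.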

TranscriptionJobName

A unique name, chosen by you, for your transcription job. The name you specify is also used as the default name of your transcription output file. If you want to specify a different name for your transcription output, use the OutputKey parameter.

This name is case sensitive, cannot contain spaces, and must be unique within an AWS account. If you try to create a new job with the same name as an existing job, you get a ConflictException error.

Type: String

Length Constraints: Minimum length of 1. Maximum length of 200.

Pattern: ^[0-9a-zA-Z._-]+

Required: Yes

Considering the uniqueness constraint, the maximum length constraint, and the regex pattern constraint, here is a suggestion that takes all of them into account:

BETTER SUGGESTION:

const nowAsISOString = (new Date()).toISOString();
newKey = (nowAsISOString+newKey).replace(/[^0-9a-zA-Z._-]/g, '').slice(0, 200);

It prefixes the name with the timestamp of when the job name was set, obtained with Date.prototype.toISOString(). nowAsISOString will hold a string value such as 2011-10-05T14:48:00.000Z.

Notice that nowAsISOString contains characters that are invalid for the TranscriptionJobName parameter. That is not an issue, because the full string (timestamp plus the previously chosen name, nowAsISOString+newKey) has every character that does not fit the pattern replaced with an empty string (.replace(/[^0-9a-zA-Z._-]/g, '')).

When ^ is the first character inside a character class [], it indicates negation: the class matches every character that is not listed in it.

The MDN JavaScript Regular Expressions Groups and ranges page has a more detailed explanation:

Characters: [^xyz] [^a-c]

Meaning: A negated or complemented character class. That is, it matches anything that is not enclosed in the brackets. You can specify a range of characters by using a hyphen, but if the hyphen appears as the first or last character enclosed in the square brackets it is taken as a literal hyphen to be included in the character class as a normal character. For example, [^abc] is the same as [^a-c]. They initially match "o" in "bacon" and "h" in "chop". Note: The ^ character may also indicate the beginning of input.

The replace function matches a pattern and substitutes a replacement for it. In the suggestion, it matches every character that is not allowed and replaces it with an empty string; the g flag makes it do so for all matches rather than only the first. More details on the replace function are on MDN's String.prototype.replace() page.

In other words, it removes all the characters that don't fit the expected pattern for TranscriptionJobName. Revisiting the timestamp example, 2011-10-05T14:48:00.000Z becomes 2011-10-05T144800.000Z.

The last portion, .slice(0, 200), truncates the string to at most 200 characters. More details on the slice function are on MDN's String.prototype.slice() page.
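
The timestamp example can be checked directly. This is a small sketch with a fixed date, built with Date.UTC so the output does not depend on the local time zone; the variable names are illustrative only.

```javascript
// Fixed instant matching the example value: 2011-10-05T14:48:00.000Z
// (month index 9 is October).
const stamp = new Date(Date.UTC(2011, 9, 5, 14, 48, 0)).toISOString();

// Strip everything outside the allowed set. Only the two colons are removed,
// since digits, letters (T, Z), '-' and '.' are all permitted.
const cleanedStamp = stamp.replace(/[^0-9a-zA-Z._-]/g, '');

console.log(stamp);        // 2011-10-05T14:48:00.000Z
console.log(cleanedStamp); // 2011-10-05T144800.000Z
```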

Based on the better suggestion, you would have a TranscriptionJobName parameter that:

  • is unique, thanks to the timestamp taken when the name is defined,
  • has the invalid characters removed (the OP's initial issue), and
  • stays within the allowed number of characters.
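
For reuse, the better suggestion could be wrapped in a small function. This is only a sketch: buildTranscriptionJobName and its optional now parameter are hypothetical names introduced here so the result can be checked with a fixed date.

```javascript
// Hypothetical helper; the name and signature are illustrative only.
function buildTranscriptionJobName(baseName, now = new Date()) {
  const MAX_LENGTH = 200;                 // maximum length from the API reference
  const disallowed = /[^0-9a-zA-Z._-]/g;  // anything outside the allowed pattern
  return (now.toISOString() + baseName)
    .replace(disallowed, '')
    .slice(0, MAX_LENGTH);
}

// Fixed date so the output is reproducible.
const fixedDate = new Date(Date.UTC(2011, 9, 5, 14, 48, 0));
const exampleName = buildTranscriptionJobName('my interview (final)', fixedDate);

console.log(exampleName); // 2011-10-05T144800.000Zmyinterviewfinal
```

Because the timestamp comes first, truncation with slice only ever cuts the tail of the original name. Note that uniqueness still depends on two jobs with the same base name not being named within the same millisecond.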

Upvotes: 1
