Reputation: 1
I am using AWS Transcribe for speech recognition. I have created a custom vocabulary, but I cannot find any Boto3 code snippet that shows how to use it from Python. My current code is below.
import boto3

client_transcribe = boto3.client('transcribe')
client_transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    Media={'MediaFileUri': file_url},
    MediaFormat='mp4',
    LanguageCode='en-US',
    OutputBucketName=bucket
)
Upvotes: 0
Views: 484
Reputation: 770
Pass the vocabulary name as the VocabularyName key of the Settings dictionary, which is a parameter of the start_transcription_job method.
Example:
settings = {
    'VocabularyName': 'your-custom-vocabulary-name-goes-here'
}
client_transcribe.start_transcription_job(
    TranscriptionJobName=job_name,
    LanguageCode='your-language-code-goes-here',
    Settings=settings,
    MediaFormat='mp4',
    OutputBucketName=bucket,  # this comma was missing and caused a SyntaxError
    Media={
        'MediaFileUri': file_url
    })
If you need help determining the language code of your vocabulary, you can run the following AWS CLI command from your terminal (assuming the AWS CLI is installed):
aws transcribe get-vocabulary --vocabulary-name {your-custom-vocabulary-name}
It returns a response such as:
{
    "LastModifiedTime": 1573523589.419,
    "VocabularyName": "redacted",
    "DownloadUri": "redacted",
    "LanguageCode": "en-US",
    "VocabularyState": "READY"
}
For example, if the language code for your vocabulary is en-US, then use that language code when calling start_transcription_job.
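If you would rather do the lookup from Python than from the CLI, here is a minimal sketch that combines the two steps. It assumes the Boto3 Transcribe client's get_vocabulary call, which returns the same LanguageCode and VocabularyState fields shown in the CLI response. The helper name start_job_with_vocabulary and its parameters are hypothetical, made up for illustration, and the READY check is my own precaution rather than something the API requires you to write.

```python
def start_job_with_vocabulary(client, job_name, file_url, bucket,
                              vocabulary_name, media_format="mp4"):
    """Start a transcription job that uses a custom vocabulary.

    `client` is expected to be a Transcribe client, e.g. the result of
    boto3.client('transcribe'). This helper is a hypothetical sketch,
    not part of Boto3.
    """
    # Look up the vocabulary to learn its language code and state.
    vocab = client.get_vocabulary(VocabularyName=vocabulary_name)

    # Assumption: refuse to start a job while the vocabulary is still
    # being processed or has failed.
    if vocab["VocabularyState"] != "READY":
        raise RuntimeError(
            "Vocabulary %r is in state %s, not READY"
            % (vocabulary_name, vocab["VocabularyState"])
        )

    # Reuse the vocabulary's own language code for the job.
    return client.start_transcription_job(
        TranscriptionJobName=job_name,
        LanguageCode=vocab["LanguageCode"],
        MediaFormat=media_format,
        Media={"MediaFileUri": file_url},
        OutputBucketName=bucket,
        Settings={"VocabularyName": vocabulary_name},
    )

# Example call (needs AWS credentials, so it is left commented out):
#   import boto3
#   client_transcribe = boto3.client('transcribe')
#   start_job_with_vocabulary(client_transcribe, job_name, file_url,
#                             bucket, 'your-custom-vocabulary-name-goes-here')
```

Taking the client as an argument keeps the helper free of any AWS setup, so you can reuse the client_transcribe object from the question as is.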
Hope this helps!
Upvotes: 1