Creating a speech service from Azure Speech to Text REST API

I can see there are two versions of REST API endpoints for Speech to Text in the Microsoft documentation links.

https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/batch-transcription and https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-speech-to-text

One endpoint is [https://.api.cognitive.microsoft.com/sts/v1.0/issueToken] referring to version 1.0 and another one is [api/speechtotext/v2.0/transcriptions] referring to version 2.0. How can I create a speech-to-text service in Azure Portal for the latter one?

Whenever I create a service, in whichever region, it is always created for Speech to Text v1.0.

Any tips?

PS: I have a Visual Studio Enterprise account with a monthly allowance, and I am creating a paid subscription (S0) service rather than a free trial (F0) service.

Thanks, Ozgur

Upvotes: 1

Answers (2)

Nicolas R

Reputation: 14619

Any official Microsoft Speech resource created in the Azure Portal is valid for Microsoft Speech 2.0.

I understand that the v1.0 in the token URL is surprising, but this token API is not part of the Speech API.
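To make the distinction concrete, here is a minimal sketch of a call to the token endpoint, assuming the regional host pattern seen in the URLs in this thread and that the service returns the token as plain text in the response body. The function names `build_token_request` and `fetch_token` are hypothetical, chosen only for this illustration.

```python
import urllib.request


def build_token_request(region: str, subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the token endpoint.

    Assumption: the host follows the <region>.api.cognitive.microsoft.com
    pattern seen in the URLs quoted in this thread.
    """
    url = f"https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the key travels in the header
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="POST",
    )


def fetch_token(region: str, subscription_key: str) -> str:
    """Send the request and return the response body as text.

    Assumption: the body is the access token itself, as plain text.
    """
    request = build_token_request(region, subscription_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_token("eastus", "<your key>")` with a real key would contact the service. The "v1.0" here only versions the token service, so it says nothing about which Speech API version the resource supports.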

So go to Azure Portal, create a Speech resource, and you're done.

If you want to be sure, go to your created resource and copy your key. That is what you will use for authorization, in a header called Ocp-Apim-Subscription-Key, as explained in the documentation.

Demo:

Get your key on your created resource
Go to https://[REGION].cris.ai/swagger/ui/index (REGION being the region where you created your speech resource)
Click on Authorize: you will see both forms of Authorization

Paste your key in the 1st one (subscription_Key), validate
Close this window
Test one of the endpoints, for example the one listing the speech endpoints, by going to the GET operation on /api/speechtotext/v2.0/endpoints
Click 'Try it out' and you will get a 200 OK reply!
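The same check can be scripted instead of clicked through in the Swagger UI. This is only a sketch: it assumes the API is served from the same `[REGION].cris.ai` host as the Swagger page, and the function names `build_list_endpoints_request` and `list_endpoints` are hypothetical.

```python
import urllib.request


def build_list_endpoints_request(region: str, subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) the GET request used in the demo steps.

    Assumption: the API lives on the same host as the Swagger UI,
    i.e. https://<region>.cris.ai.
    """
    url = f"https://{region}.cris.ai/api/speechtotext/v2.0/endpoints"
    return urllib.request.Request(
        url,
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="GET",
    )


def list_endpoints(region: str, subscription_key: str) -> tuple[int, str]:
    """Send the request and return (HTTP status, raw response body)."""
    request = build_list_endpoints_request(region, subscription_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.read().decode("utf-8")
```

With a valid key and the region of your resource, `list_endpoints(region, key)` should return status 200 if the resource accepts v2.0 calls, which is the same result the Swagger test gives.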

Upvotes: 0

Jay Gong

Reputation: 23792

I understand your confusion, because the Microsoft documentation on this is ambiguous. Based on my research, let me clarify: two types of Speech-to-Text service exist, v1 and v2.

v1 can be found under the Cognitive Services category when you create the resource.

Based on statements in the Speech-to-text REST API document:

Before using the speech-to-text REST API, understand:

Requests that use the REST API and transmit audio directly can only contain up to 60 seconds of audio.
The speech-to-text REST API only returns final results. Partial results are not provided.

If sending longer audio is a requirement for your application, consider using the Speech SDK or a file-based REST API, like batch transcription.
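As an illustration of the 60-second limit quoted above, a client could check a WAV file's duration before deciding which API to call. This is a minimal sketch using only Python's standard `wave` module; the helper names and the `MAX_SECONDS` constant are hypothetical, and it only handles uncompressed WAV input.

```python
import io
import wave

MAX_SECONDS = 60  # limit quoted above for requests that transmit audio directly


def wav_duration_seconds(wav_file) -> float:
    """Return the duration of a WAV file (path or binary file object)."""
    with wave.open(wav_file, "rb") as reader:
        return reader.getnframes() / reader.getframerate()


def fits_short_audio_api(wav_file) -> bool:
    """True if the audio is short enough for the direct-audio REST API."""
    return wav_duration_seconds(wav_file) <= MAX_SECONDS


def make_silent_wav(seconds: float, rate: int = 16000) -> io.BytesIO:
    """Create an in-memory mono 16-bit silent WAV, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)  # rewind so the buffer can be read back
    return buffer
```

For example, `fits_short_audio_api(make_silent_wav(5))` is true, while a 61-second file fails the check and would need the Speech SDK or batch transcription instead.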

So v1 has some limitations on file format and audio size. If you have further requirements, please use the v2 API, Batch Transcription, hosted by Zoom Media. You can confirm this by reading this document from Zoom Media. You can create that Speech API in the Azure Marketplace:

That's the creation page for it:

You can also view the API documentation at the foot of the above page; it is the v2 API documentation.

Final tip:

The v1 endpoint looks like: https://eastus.api.cognitive.microsoft.com/sts/v1.0/issuetoken

The v2 endpoint looks like:

Upvotes: -1
