Reputation: 3373
I am trying to use Amazon's new streaming Transcribe API from Go 1.11. Amazon currently provides only a Java SDK, so I am trying the low-level way.
The only relevant piece of documentation is here, but it does not show the endpoint. I found in a Java example that it is https://transcribestreaming.<region>.amazonaws.com, and I am trying the Ireland region, i.e. https://transcribestreaming.eu-west-1.amazonaws.com. Here is my code to open an HTTP/2 bi-directional stream:
import (
	"crypto/tls"
	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/aws/external"
	"github.com/aws/aws-sdk-go-v2/aws/signer/v4"
	"golang.org/x/net/http2"
	"io"
	"io/ioutil"
	"log"
	"net/http"
	"os"
	"time"
)

const (
	HeaderKeyLanguageCode = "x-amzn-transcribe-language-code" // en-US
	HeaderKeyMediaEncoding = "x-amzn-transcribe-media-encoding" // pcm only
	HeaderKeySampleRate = "x-amzn-transcribe-sample-rate" // 8000, 16000 ... 48000
	HeaderKeySessionId = "x-amzn-transcribe-session-id" // For retrying a session. Pattern: [a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}
	HeaderKeyVocabularyName = "x-amzn-transcribe-vocabulary-name"
	HeaderKeyRequestId = "x-amzn-request-id"
)
...
region := "eu-west-1"
cfg, err := external.LoadDefaultAWSConfig(aws.Config{
	Region: region,
})
if err != nil {
	log.Printf("could not load default AWS config: %v", err)
	return
}
signer := v4.NewSigner(cfg.Credentials)
transport := &http2.Transport{
	TLSClientConfig: &tls.Config{
		// allow insecure just for debugging
		InsecureSkipVerify: true,
	},
}
client := &http.Client{
	Transport: transport,
}
signTime := time.Now()
header := http.Header{}
header.Set(HeaderKeyLanguageCode, "en-US")
header.Set(HeaderKeyMediaEncoding, "pcm")
header.Set(HeaderKeySampleRate, "16000")
header.Set("Content-type", "application/json")
// Bi-directional streaming via a pipe.
pr, pw := io.Pipe()
req, err := http.NewRequest(http.MethodPost, "https://transcribestreaming.eu-west-1.amazonaws.com/stream-transcription", ioutil.NopCloser(pr))
if err != nil {
	log.Printf("err: %+v", err)
	return
}
req.Header = header
_, err = signer.Sign(req, nil, "transcribe", region, signTime)
if err != nil {
	log.Printf("problem signing headers: %+v", err)
	return
}
// This freezes and ends after 5 minutes with "unexpected EOF".
res, err := client.Do(req)
...
The problem is that executing the request (client.Do(req)) freezes for five minutes and then ends with the "unexpected EOF" error.
Any ideas what I am doing wrong? Has anyone successfully used the new streaming transcribe API without the Java SDK?
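One possible reading of the freeze, offered as a guess rather than a diagnosis: over HTTP/2, Go's client.Do returns only once response headers arrive, and a server may hold those back until it has received request data, so nothing happens if the pipe is never written to. The sketch below shows the general pattern against a local test server, not against Amazon Transcribe. The names echoHandler and roundTrip are made up for illustration, and it assumes a Go version with httptest's EnableHTTP2 field (1.14 or later), which is newer than the Go 1.11 mentioned in the question.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// echoHandler (hypothetical) sends response headers right away, then echoes
// every request chunk back while the request body is still open.
func echoHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)
	flusher.Flush()
	buf := make([]byte, 1024)
	for {
		n, err := r.Body.Read(buf)
		if n > 0 {
			w.Write(buf[:n])
			flusher.Flush()
		}
		if err != nil {
			return // io.EOF once the client closes its side of the pipe
		}
	}
}

// roundTrip (hypothetical) streams chunks through an io.Pipe request body and
// returns the negotiated protocol and the echoed response body.
func roundTrip(chunks []string) (string, string, error) {
	srv := httptest.NewUnstartedServer(http.HandlerFunc(echoHandler))
	srv.EnableHTTP2 = true
	srv.StartTLS()
	defer srv.Close()

	pr, pw := io.Pipe()
	req, err := http.NewRequest(http.MethodPost, srv.URL, pr)
	if err != nil {
		return "", "", err
	}
	// Write from a separate goroutine: a pipe write blocks until the
	// transport reads it, and Do blocks until response headers arrive.
	go func() {
		for _, c := range chunks {
			if _, err := pw.Write([]byte(c)); err != nil {
				return
			}
		}
		pw.Close() // ends the request stream
	}()

	res, err := srv.Client().Do(req)
	if err != nil {
		pr.Close() // unblock the writer goroutine
		return "", "", err
	}
	defer res.Body.Close()
	body, err := io.ReadAll(res.Body)
	return res.Proto, string(body), err
}

func main() {
	proto, body, err := roundTrip([]string{"hello ", "world"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(proto, body)
}
```

The writer goroutine is the point of the sketch: if the writes happened on the same goroutine before or after Do, either the write or the request would wait on the other.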
EDIT (March 11, 2019):
I tested this again and now it does not time out but immediately returns a 200 OK response. There is an "exception" in the response body though: {"Output":{"__type":"com.amazon.coral.service#SerializationException"},"Version":"1.0"}
I tried opening the HTTP/2 stream with io.Pipe (like the code above) and also with a JSON body described in the documentation:
{
"AudioStream": {
"AudioEvent": {
"AudioChunk": ""
}
}
}
The result was the same.
EDIT (March 13, 2019):
As mentioned by @gpeng, removing the content-type from the headers fixes the SerializationException. But then there is an IAM exception, and you need to add the transcribe:StartStreamTranscription permission to your IAM user. That permission is nowhere in the AWS IAM console, though, and must be added manually as a custom JSON permission :/
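For reference, a custom policy of the kind described might look like the fragment below. This is a minimal sketch, assuming the action uses the transcribe service prefix and that a wildcard resource is acceptable for your account; tighten it to your own requirements.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "transcribe:StartStreamTranscription",
      "Resource": "*"
    }
  ]
}
```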
There is also another documentation page here, which shows an incorrect host and a new content-type (do not use that content-type; the request will return 404 with it).
After removing the content-type and adding the new permission, I now get an exception: {"Message":"A complete signal was sent without the preceding empty frame."}. Also, writing to the pipe blocks forever, so I am stuck again. The messages described in the new documentation are different from those in the old one, now finally binary, but I do not understand them. Any ideas how to send such HTTP/2 messages in Go?
EDIT (March 15, 2019):
If you get an HTTP 403 error about a signature mismatch, do not set the transfer-encoding and x-amz-content-sha256 HTTP headers. When I set them and sign the request with the AWS SDK's V4 signer, I receive HTTP 403 The request signature we calculated does not match the signature you provided.
Upvotes: 14
Views: 5235
Reputation: 3
I had a similar requirement: using the AWS Transcribe service through its WebSocket API in Node.js. Since the official package does not support this yet, I wrote a package called aws-transcribe, which can be found here. I hope that helps.
It provides a stream interface around the WebSocket and can be used like the example below:
import { AwsTranscribe, StreamingClient } from "aws-transcribe"

const client = new AwsTranscribe({
  // if these aren't provided, they will be taken from the environment
  accessKeyId: "ACCESS KEY HERE",
  secretAccessKey: "SECRET KEY HERE",
})

const transcribeStream = client
  .createStreamingClient({
    region: "eu-west-1",
    sampleRate,
    languageCode: "en-US",
  })
  // enums for returning the event names which the stream will emit
  .on(StreamingClient.EVENTS.OPEN, () => console.log(`transcribe connection opened`))
  .on(StreamingClient.EVENTS.ERROR, console.error)
  .on(StreamingClient.EVENTS.CLOSE, () => console.log(`transcribe connection closed`))
  .on(StreamingClient.EVENTS.DATA, (data) => {
    const results = data.Transcript.Results
    if (!results || results.length === 0) {
      return
    }
    const result = results[0]
    const final = !result.IsPartial
    const prefix = final ? "recognized" : "recognizing"
    const text = result.Alternatives[0].Transcript
    console.log(`${prefix} text: ${text}`)
  })

someStream.pipe(transcribeStream)
Upvotes: 0
Reputation: 101
I reached out to AWS support and they now recommend using WebSockets instead of HTTP/2 when possible (blog post here).
If this fits your use case, I would highly recommend checking out the new example repo at https://github.com/aws-samples/amazon-transcribe-websocket-static, which shows a browser-based solution in JS.
I've also noticed that the author of the demo has an Express example on his personal GitHub at https://github.com/brandonmwest/amazon-transcribe-websocket-express, but I haven't confirmed whether it works.
I appreciate these examples aren't in Python, but I think you'll have better luck using the WebSocket client as opposed to HTTP/2 (which, let's be honest, is still a bit terrifying :P)
Upvotes: 4
Reputation: 13078
I'm still fighting this thing with Node.js as well. What is not clear from the docs is that in one place they say the Content-Type should not be application/json, but in another place they make it look like the payload should be encoded as application/vnd.amazon.eventstream. It looks like the payload should be carefully formatted in a binary format instead of as a JSON object, as follows:
Amazon Transcribe uses a format called event stream encoding for streaming transcription. This format encodes binary data with header information that describes the contents of each event. You can use this information for applications that call the Amazon Transcribe endpoint without using the Amazon Transcribe SDK. Amazon Transcribe uses the HTTP/2 protocol for streaming transcriptions. The key components for a streaming request are:
A header frame. This contains the HTTP headers for the request, and a signature in the authorization header that Amazon Transcribe uses as a seed signature to sign the following data frames.
One or more message frames in event stream encoding. The frame contains metadata and the raw audio bytes.
An end frame. This is a signed message in event stream encoding with an empty body.
There is a sample function that shows how to implement all of that using Java, which might shed some light on how this encoding is to be done.
Upvotes: 0
Reputation: 154
Try not setting the content type header and see what response you get. I'm trying to do the same thing (but in Ruby) and that 'fixed' the SerializationException. Still can't get it to work, but I've now got a new error to think about :)
UPDATE: I have got it working now. My issue was with the signature. If both host and authority headers are passed, they are joined with , and treated as host on the server side when the signature is checked, so the signatures never match. That doesn't seem like correct behaviour on the AWS side, but it doesn't look like it's going to be an issue for you in Go.
Upvotes: 1