Reputation: 467
I'm attempting to write a Node.js application that transcribes audio from a microphone via AWS' streaming transcription service. What I have so far can be found in this repository (it's small).
Unfortunately, the above doesn't work. I believe there's a bug in taking the data provided by the microphone stream and transforming it before passing it to the writable transcriber stream. I say this because I have proven that the other two components of the app work.
As a side note, I'm not familiar with handling audio data and encoding (decoding?) it to PCM. I'm not even sure whether what mic-stream gives me is PCM, or whether I need to decode from or encode to PCM before providing it to the transcription service. All of this is to say, I'm pretty sure the byte-handling is the issue.
Any help getting this sorted would be greatly appreciated.
Thanks, Geoff
Upvotes: 1
Views: 1886
Reputation: 2026
The data frames sent to Amazon Transcribe streaming need to be encoded in a specific way, outlined here.
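As a rough illustration of what that framing starts from, here is a minimal sketch of the header/body structure for one audio chunk. It assumes the binary encoding itself is done by an event-stream marshaller (AWS's sample uses the `@aws-sdk/eventstream-marshaller` package for that step); `buildAudioEvent` is a hypothetical helper name, and the two headers shown follow what I recall of that sample rather than anything in your repository.

```javascript
// Hypothetical helper: wraps a chunk of raw PCM bytes in the header/body
// structure that an event-stream marshaller then turns into a binary frame.
// This function does NOT produce the binary frame itself.
function buildAudioEvent(pcmChunk) {
  return {
    headers: {
      ':message-type': { type: 'string', value: 'event' },
      ':event-type': { type: 'string', value: 'AudioEvent' },
    },
    body: pcmChunk,
  };
}

// Example: wrap four bytes of (made-up) PCM data.
const message = buildAudioEvent(Buffer.from([0x00, 0x01, 0xff, 0x7f]));
console.log(message.headers[':event-type'].value, message.body.length);
```

The resulting object would still need to be marshalled into bytes before being written to the WebSocket; sending the raw microphone buffer directly is a likely reason for the service rejecting or ignoring the data.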
Since you're using WebSocket streaming, AWS has a sample project in JavaScript which you may refer to/use: https://github.com/aws-samples/amazon-transcribe-websocket-static
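That sample also converts the audio to 16-bit PCM before framing it. If your microphone stream hands you 32-bit float samples (an assumption; some Node microphone libraries already emit 16-bit PCM, in which case this step is unnecessary), the conversion could look roughly like the sketch below. `floatTo16BitPcm` is a hypothetical name, not a function from the sample or from any library.

```javascript
// Hypothetical helper: converts float samples in the range [-1, 1] to
// 16-bit signed little-endian PCM. Assumes the input really is float audio.
function floatTo16BitPcm(samples) {
  const out = Buffer.alloc(samples.length * 2); // 2 bytes per sample
  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] so out-of-range samples do not overflow.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const v = s < 0 ? s * 0x8000 : s * 0x7fff;
    out.writeInt16LE(Math.round(v), i * 2);
  }
  return out;
}

// Example: three samples become six bytes of PCM.
const pcm = floatTo16BitPcm(new Float32Array([0, 1, -1]));
console.log(pcm.length);
```

It is also worth checking that the sample rate you declare when opening the stream matches the rate the microphone actually records at.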
Upvotes: 1