geofflittle

Reputation: 467

Streaming audio mic data to aws transcribe in node

I'm attempting to write a node application that transcribes audio from a microphone via AWS' streaming transcription service. What I have so far can be found in this repository (it's small).

Unfortunately, the above doesn't work. I believe the bug is in how I take the data provided by the microphone stream and transform it before passing it to the writable transcriber stream. I think this because I have verified that the other two components of the app work:

  1. I've written a piece of the app to pipe the mic to the speakers that proves that the mic stream works as expected.
  2. When sending requests over the WebSocket to the transcription service, it sends non-exceptional responses back, albeit empty, proving that the transcription service client works as expected.

As a side note, I'm not familiar with handling audio data and encoding (decoding?) it to PCM. I'm not even positive if what the mic-stream is giving me is PCM or not and if I need to decode from or encode to PCM before providing it to the transcription service. All of this is to say, I'm pretty sure the byte-handling is the issue.

Any help getting this sorted would be greatly appreciated.

Thanks, Geoff

Upvotes: 1

Views: 1886

Answers (1)

Paradigm

Reputation: 2026

The data frames sent to Amazon Transcribe streaming need to be encoded in a specific way, outlined here.

Since you're using WebSocket streaming, AWS has a sample project in JavaScript which you may refer to/use: https://github.com/aws-samples/amazon-transcribe-websocket-static
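As a rough sketch of the byte-handling step (not the code from your repository or from the sample project verbatim), the usual approach is to convert the raw samples to 16-bit signed little-endian PCM and then wrap each chunk in an audio event message before it is binary-encoded. The function names `pcmEncode` and `toAudioEventMessage` are hypothetical, and the sketch assumes your mic stream hands you 32-bit float samples in the range [-1, 1]. If it already emits 16-bit PCM, skip the conversion.

```javascript
// Assumption: `samples` is a Float32Array of mono audio in [-1, 1].
// Returns a Buffer of 16-bit signed little-endian PCM (2 bytes per sample).
function pcmEncode(samples) {
  const buffer = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range values so they don't overflow when scaled.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const scaled = s < 0 ? s * 0x8000 : s * 0x7fff;
    buffer.writeInt16LE(Math.round(scaled), i * 2);
  }
  return buffer;
}

// Wraps a PCM chunk in the header/body shape that an event stream
// marshaller expects. This is only the plain message object; the binary
// event stream encoding itself is left to a marshaller library.
function toAudioEventMessage(pcmChunk) {
  return {
    headers: {
      ":message-type": { type: "string", value: "event" },
      ":event-type": { type: "string", value: "AudioEvent" },
    },
    body: pcmChunk,
  };
}
```

The resulting message still has to go through an event stream marshaller before being sent over the WebSocket, and the sample rate you declare in the request has to match what the mic actually produces.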

Upvotes: 1
