Reputation: 101
I am trying to use Amazon Lex as the conversation engine in a home assistant via the Python SDK. The post_content method seems appropriate and I did get it to work on text-only test examples. However, I am unable to figure out how to interact directly using audio streaming.
import pyaudio
import boto3

pa = pyaudio.PyAudio()
audio_stream = pa.open(
    rate=16000,
    channels=1,
    format=pyaudio.paInt16,
    input=True,
    frames_per_buffer=1024,
)

lex_client = boto3.client("lex-runtime")
response = lex_client.post_content(
    botName="BOT_NAME",
    botAlias="BOT_ALIAS",
    userId="USER_ID",
    contentType="audio/l16; rate=16000; channels=1",
    inputStream=audio_stream,
)
print(response)
This raises the following error:
botocore.exceptions.HTTPClientError: An HTTP Client raised an unhandled exception: 'Stream' object is not iterable
Fair enough, so I tried inputStream=audio_stream.read(1024), which works without a problem, but doesn't recognize any spoken text (i.e. 'inputTranscript': '' in the response). I imagine this is because the chunk is simply too short to contain meaningful text.
I am fairly inexperienced with web development so I suspect I am missing something very obvious. Looking at how audio streaming is apparently handled in Amazon Transcribe, it seems like I should be using async and callback functions.
How should I properly handle this stream? If there are fundamental things I should be understanding better, I'd also really appreciate pointers to the right resources.
Upvotes: 0
Views: 591
Reputation: 9482
It is simple.
Let's start with the documentation:
inputStream (bytes or seekable file-like object) -- [REQUIRED]
User input in PCM or Opus audio format or text format as described in the Content-Type HTTP header.
You can stream audio data to Amazon Lex or you can create a local buffer that captures all of the audio data before sending. In general, you get better performance if you stream audio data rather than buffering the data locally.
Okay, what is a file-like object in Python? It looks like Stack Overflow already has an answer to this question:
This is the API for all file-like objects in Python (as of 3.10.5).
...
__iter__()
...
Okay, __iter__() means that a file-like object should be iterable.
No problem, let's check the Stream class. Stream's methods are:
__init__
get_input_latency
get_output_latency
get_time
get_cpu_load
start_stream
stop_stream
is_active
is_stopped
write
read
get_read_available
get_write_available
Looks like there is no __iter__ here :(
Is Stream a file-like object? Definitely not.
Why did we check Stream? Because pa.open returns a Stream.
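The check above can be reproduced without a microphone. The sketch below is only an illustration: FakeStream is a hypothetical stand-in that mimics the relevant part of PyAudio's Stream (it has read() but no __iter__()), and is_iterable is a helper name made up for this example.

```python
from io import BytesIO


class FakeStream:
    """Hypothetical stand-in for pyaudio's Stream: read() only, no __iter__()."""

    def read(self, num_frames):
        # Return silence: 2 bytes per frame for 16-bit mono PCM.
        return b"\x00\x00" * num_frames


def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


print(is_iterable(BytesIO(b"abc")))  # True
print(is_iterable(FakeStream()))     # False
```

An object that only offers read() fails this check, which is consistent with the "'Stream' object is not iterable" error in the question.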
Okay, so what do we have to do now?
Probably we have to start recording → stop recording → write the recorded audio to a file or to bytes (see BytesIO) and pass the BytesIO object to the AWS client, because BytesIO is a file-like object:
from io import BytesIO
# get methods of BytesIO
dir(BytesIO)
Output:
# I can see __iter__ here!
['__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', 'close', 'closed', 'detach', 'fileno', 'flush', 'getbuffer', 'getvalue', 'isatty', 'read', 'read1', 'readable', 'readinto', 'readinto1', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines']
Okay, and how do we save the Stream's audio into a BytesIO?
Looks like this answer is closer to our aim.
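As a rough sketch of the record-then-send approach, assuming a fixed recording length is acceptable: read chunks from the PyAudio stream in a loop, write them into a BytesIO, rewind it, and pass that buffer as inputStream. The names record_to_buffer and send_microphone_audio and the 4-second default are assumptions made for illustration; the bot name, alias and user ID are the placeholders from the question. For scale, a single read of 1024 frames at 16000 Hz is only about 64 ms of audio.

```python
from io import BytesIO

RATE = 16000   # samples per second, as in the question
CHUNK = 1024   # frames per read, as in frames_per_buffer


def record_to_buffer(read_chunk, seconds, rate=RATE, chunk=CHUNK):
    """Call read_chunk(chunk) repeatedly and collect roughly `seconds` of audio."""
    buffer = BytesIO()
    num_reads = int(rate / chunk * seconds)
    for _ in range(num_reads):
        buffer.write(read_chunk(chunk))
    buffer.seek(0)  # rewind so the consumer reads from the beginning
    return buffer


def send_microphone_audio(seconds=4):
    """Hypothetical helper: record from the default microphone, then call Lex."""
    # Imported here so record_to_buffer can be used without these packages.
    import boto3
    import pyaudio

    pa = pyaudio.PyAudio()
    stream = pa.open(
        rate=RATE,
        channels=1,
        format=pyaudio.paInt16,
        input=True,
        frames_per_buffer=CHUNK,
    )
    try:
        audio = record_to_buffer(stream.read, seconds)
    finally:
        stream.stop_stream()
        stream.close()
        pa.terminate()

    lex_client = boto3.client("lex-runtime")
    return lex_client.post_content(
        botName="BOT_NAME",      # placeholder from the question
        botAlias="BOT_ALIAS",    # placeholder from the question
        userId="USER_ID",        # placeholder from the question
        contentType="audio/l16; rate=16000; channels=1",
        inputStream=audio,
    )
```

Calling send_microphone_audio() needs a working microphone and AWS credentials. The recording loop is kept separate so it can be tried with any callable that returns bytes.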
--
You might protest: but the documentation says "You can stream audio data to Amazon Lex".
That is true. But PyAudio's Stream is just a class name, and it does not satisfy Python's conventions for stream (file-like) objects.
Upvotes: 1