Reputation: 2980
I have a solution that will reside on a user’s local mobile device, and I want it to post audio content to Lex using the AWS REST API. The problem is that the solution can’t stream audio (up or down) and has almost no audio manipulation capabilities locally, whereas Lex has very specific input requirements and streams its output.
So access will be via an API Gateway acting as a proxy, with a Lambda (Python 2.7) function to deal with the audio issues.
The output is all taken care of: the Lambda code saves the AudioStream into a file and sends that file as the response body, and this works fine. However, I can’t get the input to work.
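For reference, the output side described above might look roughly like the sketch below. It assumes an API Gateway proxy integration with binary media types enabled; the function name `build_audio_response` and the `audio/mpeg` content type are illustrative assumptions, not the code actually used.

```python
import base64


def build_audio_response(audio_bytes, content_type="audio/mpeg"):
    """Wrap raw audio bytes in an API Gateway proxy-integration response."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": content_type},
        # Proxy responses are JSON, so binary payloads travel as base64 text.
        "body": base64.b64encode(audio_bytes).decode("ascii"),
        "isBase64Encoded": True,
    }
```

With boto3, `audio_bytes` would typically come from reading the `audioStream` entry of the Lex `post_content` response.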
The input audio is an MP3 file sent as the body of a POST request and I need to get this into a format acceptable to Lex.
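As a starting point, getting the POSTed MP3 onto disk inside the Lambda could be sketched as follows. This assumes a proxy integration that delivers binary bodies base64-encoded (`isBase64Encoded` set to true); `save_request_audio` is a hypothetical helper name.

```python
import base64
import os
import tempfile


def save_request_audio(event, directory=None):
    """Write the POSTed audio from a proxy event to a temp file; return its path."""
    if not event.get("isBase64Encoded"):
        # A binary body that arrives as plain text has usually been corrupted.
        raise ValueError("expected a base64-encoded body; check binary media types")
    data = base64.b64decode(event.get("body") or "")
    # In Lambda the default temp directory is /tmp, the only writable location.
    fd, path = tempfile.mkstemp(suffix=".mp3", dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```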
I’ve investigated the following approaches:
Native AWS
Use S3 and Elastic Transcoder - when transcoding to PCM the lowest allowed sample rate is 22050 Hz, but Lex requires 16000 Hz. It also doesn’t seem to allow transcoding to Opus format
Use MediaConvert - couldn’t see a setting to convert to PCM or Opus
Native Python
Python doesn’t seem to have the ability to unpack MP3 natively. I’ve read that decoding it in pure Python would be very slow and not worth doing.
Import a library
Use something like ffmpeg-python or ffmpy - but this involves creating a deployment package or similar. I could go down this road, but it really seems overly complicated for what I want to do.
Use something other than Python
I chose Python as I’m more familiar with coding in it for Lambda, but perhaps C#, Node or Java 8 has something available that would make this easy in a Lambda function.
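To make the "import a library" option above concrete, here is a rough sketch that calls an ffmpeg binary directly through `subprocess` instead of a wrapper library. It assumes an ffmpeg executable has been shipped with the function (for example in the deployment package); the helper names and the `ffmpeg_bin` default are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src_path, dst_path, rate=16000, ffmpeg_bin="ffmpeg"):
    """Return an ffmpeg argument list that decodes src_path to 16-bit mono WAV."""
    return [
        ffmpeg_bin,
        "-y",                    # overwrite dst_path if it already exists
        "-i", src_path,          # input file, e.g. the uploaded MP3
        "-ac", "1",              # one audio channel
        "-ar", str(rate),        # output sample rate in Hz
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        dst_path,
    ]


def convert(src_path, dst_path, **kwargs):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.check_call(build_ffmpeg_command(src_path, dst_path, **kwargs))
```

Something like `convert("/tmp/in.mp3", "/tmp/out.wav")` would then produce the 16000 Hz file in a single step, at the cost of packaging the binary.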
At the moment I’m looking at doing the following
Of course there will be some latency issues here, but as long as they’re not too severe I’m willing to live with them. This does seem overly complex for what I thought would be a fairly simple task. It’s the best I’ve come up with so far, but even proving it out will take a number of hours’ work, and I’ve spent days on this already.
So the main question is whether Python’s wave library can be used in AWS Lambda to modify the sample rate in this way.
If not, is there a way of solving this by either creating a deployment package, using an AWS feature I haven’t investigated yet or a neater way of doing this in something other than Python?
The problem is that the Lex part of this app was supposed to be a nice-to-have. It’s not a main feature, and yet it has taken up the majority of the dev time. I’m pretty close to just ditching it, but thought I’d ask here first.
Upvotes: 2
Views: 1412
Reputation: 2980
So it took a while, but there is a way to do this.
The way I've solved it is to save the file to S3, then run it through Elastic Transcoder to get a WAV file (1 channel at a 22050 Hz sample rate).
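A minimal sketch of that S3 and Elastic Transcoder step, assuming boto3 clients are passed in. The function name, bucket, keys, pipeline ID and preset ID are all hypothetical; the preset would be a custom one producing mono 22050 Hz WAV, and the pipeline's input bucket is assumed to be `bucket`.

```python
def transcode_to_wav(s3_client, transcoder_client, audio_bytes, bucket,
                     pipeline_id, preset_id, in_key, out_key):
    """Upload the MP3 and run an Elastic Transcoder job; returns the job id."""
    s3_client.put_object(Bucket=bucket, Key=in_key, Body=audio_bytes)
    job = transcoder_client.create_job(
        PipelineId=pipeline_id,
        Input={"Key": in_key},
        Output={"Key": out_key, "PresetId": preset_id},
    )
    job_id = job["Job"]["Id"]
    # Block until the job finishes so the WAV exists before it is downloaded.
    transcoder_client.get_waiter("job_complete").wait(Id=job_id)
    return job_id
```

The clients would come from `boto3.client("s3")` and `boto3.client("elastictranscoder")`. Waiting on the job inside the Lambda is likely where a good part of the latency comes from.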
Then use the following var values
And this code should get it down to 16000 Hz:
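import audioop
import wave

# src, dst, inchannels, outchannels, inrate and outrate must be defined before this runs
s_read = wave.open(src, 'rb')
s_write = wave.open(dst, 'wb')
width = s_read.getsampwidth()  # bytes per sample (2 for 16-bit PCM)
n_frames = s_read.getnframes()
data = s_read.readframes(n_frames)
# ratecv needs the real sample width; it returns (converted_bytes, state)
converted = audioop.ratecv(data, width, inchannels, inrate, outrate, None)
# ratecv does not change the channel count, so outchannels must equal inchannels
s_write.setparams((outchannels, width, outrate, 0, 'NONE', 'Uncompressed'))
s_write.writeframes(converted[0])
s_read.close()
s_write.close()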
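Note that `audioop` was removed from the standard library in Python 3.13, so the code above only runs on older runtimes. As a rough stand-in, the sketch below resamples 16-bit mono PCM with plain linear interpolation. This is a different and simpler technique than `audioop.ratecv`, it applies no anti-aliasing filter, it targets Python 3, and the function name is hypothetical.

```python
import array
import sys


def resample_mono_16bit(pcm_bytes, in_rate, out_rate):
    """Resample 16-bit little-endian mono PCM by linear interpolation."""
    samples = array.array("h")
    samples.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return b""
    n_out = max(1, int(len(samples) * out_rate / in_rate))
    last = len(samples) - 1
    out = array.array("h")
    for i in range(n_out):
        pos = i * in_rate / out_rate  # position in the source, in samples
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        # Weighted average of the two neighbouring source samples.
        out.append(int(round(samples[left] * (1 - frac) + samples[right] * frac)))
    if sys.byteorder == "big":
        out.byteswap()
    return out.tobytes()
```

The bytes returned can be written with `wave` in the same way as `converted[0]` above, using a sample width of 2 and the new rate.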
The file is then accepted by Lex and gets a response as expected.
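For completeness, handing the converted audio to Lex could be sketched like this. It assumes a boto3 `lex-runtime` client is passed in and that the raw samples are sent as `audio/l16`; the helper name and the bot name, alias and user ID arguments are placeholders, and sending the frames rather than the whole file is a choice made for this sketch.

```python
import wave


def post_wav_to_lex(lex_client, wav_path, bot_name, bot_alias, user_id):
    """Send the samples of a 16 kHz, 16-bit, mono WAV file to Lex PostContent."""
    reader = wave.open(wav_path, "rb")
    try:
        fmt = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate())
        if fmt != (1, 2, 16000):
            raise ValueError("expected mono 16-bit 16000 Hz audio, got %r" % (fmt,))
        pcm = reader.readframes(reader.getnframes())
    finally:
        reader.close()
    return lex_client.post_content(
        botName=bot_name,
        botAlias=bot_alias,
        userId=user_id,
        contentType="audio/l16; rate=16000; channels=1",
        accept="audio/mpeg",  # ask Lex for an MP3 reply
        inputStream=pcm,
    )
```

The format check up front gives a clear local error when the rate conversion was skipped or used the wrong parameters.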
There's some noticeable latency with this method: processing usually takes about 7-10 seconds according to CloudWatch Logs. That's probably not acceptable for a production-level solution, but it's good enough for my needs.
Thanks to the following sources
Upvotes: 1