Reputation: 153
I'm currently using the MediaRecorder API in JavaScript to record chunks of audio and send them to a back end for processing. Right now, each time I call stop(), an event callback fires with the audio gathered since the last call to start(). My audio chunks are therefore contiguous and non-overlapping, meaning that if I want to send 2s of audio to my back end, I have to wait for 2 seconds of data to be gathered.
Conversely, what I would like to do is record my audio in overlapping chunks. So, for example, I want to record 2s of data, send it to the server, then wait for 0.1s, and send the latest 2s of data again (0.1s of fresh data + 1.9s from the previous frame), then wait for another 0.1s, and send 2s again, and so forth. In this case, my chunk (or frame, or window) size would be 2s, the overlap 1.9s (or 95%), and 0.1s is the "hop".
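To pin down the terminology, here is a minimal sketch of the window, hop, and overlap arithmetic expressed in samples. The helper name `framingParams` and the 16 kHz sample rate are assumptions made for illustration only; neither comes from the setup described above.

```javascript
// Hypothetical helper: converts window and hop durations into sample counts.
// The sample rate is an assumption; use whatever rate the audio is captured at.
function framingParams(sampleRate, windowSec, hopSec) {
  const windowSize = Math.round(windowSec * sampleRate); // samples per chunk
  const hopSize = Math.round(hopSec * sampleRate);       // fresh samples per chunk
  const overlapSize = windowSize - hopSize;              // samples reused from the previous chunk
  return {
    windowSize,
    hopSize,
    overlapSize,
    overlapRatio: overlapSize / windowSize,
    chunksPerSecond: sampleRate / hopSize,
  };
}

// 2s window with a 0.1s hop at an assumed 16 kHz sample rate.
const params = framingParams(16000, 2, 0.1);
console.log(params);
```

With these numbers each chunk holds 32000 samples, of which 1600 are new, so the back end would receive 10 chunks per second instead of one every 2 seconds.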
I couldn't find any good leads, sadly. Is there a way of achieving this with the MediaRecorder API (or any other JS-based media recording API)?
Thanks in advance! :)
Extra info: I'm running a prediction model in the back end, and I'd like to make my system more responsive. I could reduce the latency of my system by lowering the chunk size, but that's out of the question because the model only accepts 2s of data. However, I can increase the temporal resolution of my predictions (i.e. how many predictions per second I obtain) by lowering the hop size. With the current implementation, my hop size is effectively equal to the chunk length.
Upvotes: 1
Views: 855
Reputation: 108796
You can't create these overlapping sound clips with MediaRecorder: it generates a stream of compressed audio where, in principle, the data for each audio sample depends on the previous audio sample. So, overlapping clips just aren't a thing in that output.
You could record continuously and send each ondataavailable packet to your server with a POST or websocket in real time. But then your server would have to decompress the incoming sequence of packets and generate the overlapping clips. That is harder than it seems at first glance.
Better: You can use the Web Audio API's ScriptProcessorNode to get a JavaScript event every so often that lets you process a buffer of PCM (raw, not compressed) audio samples. In that event's handler you can create the overlapping clips and then use a POST request or a websocket to send them to your server. Note that these raw PCM clips take more bytes than compressed clips, and with overlapping windows most samples are sent many times over, so you'll be pushing much more data through the network between your user's browser and your server. The higher data volume may or may not cause operational headaches. (The good news is that AWS and Azure don't charge for incoming data, only outgoing.)
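As a rough starting point, the overlapping-clip part can be sketched with a ring buffer that keeps the most recent window of samples and emits a copy every hop. This is a minimal sketch under stated assumptions, not a complete solution: `OverlapFramer` and `attachToMicrophone` are hypothetical names, mono audio is assumed, and the 4096-sample processing block size is arbitrary. ScriptProcessorNode is used because it is the API named above; it is deprecated in favor of AudioWorkletNode, so treat the wiring as illustrative.

```javascript
// Hypothetical helper: keeps the latest `windowSize` samples in a ring buffer
// and calls `onFrame` with a copy of the full window every `hopSize` samples.
class OverlapFramer {
  constructor(windowSize, hopSize, onFrame) {
    this.windowSize = windowSize;
    this.hopSize = hopSize;
    this.onFrame = onFrame;
    this.ring = new Float32Array(windowSize);
    this.writePos = 0;       // next index to write in the ring
    this.filled = 0;         // samples stored so far, capped at windowSize
    this.sinceLastFrame = 0; // samples received since the last emitted frame
  }

  // Feed a block of PCM samples (Float32Array or plain array).
  push(block) {
    for (let i = 0; i < block.length; i++) {
      this.ring[this.writePos] = block[i];
      this.writePos = (this.writePos + 1) % this.windowSize;
      if (this.filled < this.windowSize) this.filled++;
      this.sinceLastFrame++;
      // Emit only once a full window exists, then once per hop.
      if (this.filled === this.windowSize && this.sinceLastFrame >= this.hopSize) {
        this.sinceLastFrame = 0;
        this.onFrame(this.snapshot());
      }
    }
  }

  // Copy the ring out in chronological order (oldest sample first).
  snapshot() {
    const frame = new Float32Array(this.windowSize);
    const oldest = this.ring.subarray(this.writePos);
    frame.set(oldest, 0);
    frame.set(this.ring.subarray(0, this.writePos), oldest.length);
    return frame;
  }
}

// Browser-only wiring (assumption: `stream` comes from getUserMedia).
// Defined but not called here, so the framer above can be tried on its own.
function attachToMicrophone(stream, framer) {
  const context = new AudioContext();
  const source = context.createMediaStreamSource(stream);
  const processor = context.createScriptProcessor(4096, 1, 1);
  processor.onaudioprocess = (event) => {
    framer.push(event.inputBuffer.getChannelData(0));
  };
  source.connect(processor);
  // Some browsers only fire onaudioprocess when the node is connected onward.
  processor.connect(context.destination);
  return context; // use context.sampleRate to convert seconds to samples
}
```

For the 2s window and 0.1s hop from the question, the sizes would be `2 * context.sampleRate` and `0.1 * context.sampleRate` samples (rounded), and the `onFrame` callback is where each clip would be sent, for example as `frame.buffer` over a websocket. Note that a block of 4096 samples spans more than one hop at common sample rates, so several frames may be emitted back to back within a single event.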
Explaining all this is beyond the scope of an SO answer.
Upvotes: 2