Reputation: 61
Here's a high level overview of my problem:
There will be a central computer at an art gallery, and three separate remote sites, say up to a mile away from central. Each site has a musician. The central computer sends a live backing track over the internet to each of the three musicians, who play along to it and are each recorded as a live stream. Each of the three streams is then played back at the gallery, in sync with the backing track and with the other musicians, as though all the musicians were playing live in the same room. The client has requested that the musicians appear to play PRECISELY in time with each other, i.e. no apparent latency between the musicians. The musicians cannot hear each other; they only hear the backing track.
Here's what I see as the technical solution:
Each backing track packet is sent out from the gallery with the current timestamp. As a musician plays and is recorded, the packet currently being recorded is marked with the timestamp of the current backing track packet. When the three audio streams are sent back, they are buffered. Each packet is then played, say, ten seconds after its timestamp, i.e. at 11:00:00 AM, all of the packets marked 10:59:50 AM are played.
Or to think of it another way, each incoming stream is delayed 10 seconds behind real time. This buffering should allow for any network blips. It is also acceptable since there is no apparent latency to the viewers at the gallery, and everything is being played "as-live." We are assuming there is a good quality internet connection to each remote site.
I'm ideally looking for a JavaScript solution to this, as it's what I'm most familiar with (but other solutions would be interesting to know about as well).
Does anyone know of any JavaScript libraries with built-in functionality to allow this sort of buffering?
Upvotes: 4
Views: 1180
Reputation: 163468
To be clear, it sounds like it doesn't matter that the musicians play back in time with each other... only that they play in time with the backing track, and within ~10 seconds of each other, correct? Assuming that's the case...
You can use WebRTC for this, but we'll only be using the data channel. No need for media streams. This is a performance that requires precise timing, and I'm assuming decent quality audio as well. Let's just leave it in PCM and send that over the WebRTC data channels.
Alternatively, you could have a server host Web Socket connections which relay data to the sites.
You can use the ScriptProcessorNode to play and record. This gives you raw access to the PCM stream. Just send/receive the bytes via your data channel. If you want to save some bandwidth, you can reduce the floating point samples down to 16-bit integers. Just crank them back up to floats on the receiving end.
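As a rough sketch of that float-to-16-bit step (the function names here are my own, not from any library), something like this should work. At 44.1 kHz mono it halves the payload from 4 bytes to 2 bytes per sample.

```javascript
// Convert Web Audio style float samples (-1..1) to 16-bit signed integers.
function floatTo16BitPCM(float32) {
  const out = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    // Clamp first: values outside [-1, 1] would otherwise wrap around.
    const s = Math.max(-1, Math.min(1, float32[i]));
    // Asymmetric scaling so -1 maps to -32768 and +1 maps to 32767.
    out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  }
  return out;
}

// Convert 16-bit signed integers back to floats on the receiving end.
function int16ToFloat(int16) {
  const out = new Float32Array(int16.length);
  for (let i = 0; i < int16.length; i++) {
    out[i] = int16[i] < 0 ? int16[i] / 0x8000 : int16[i] / 0x7fff;
  }
  return out;
}

const pcm = floatTo16BitPCM(new Float32Array([0, 0.5, -0.5, 1, -1]));
console.log(Array.from(pcm)); // [0, 16384, -16384, 32767, -32768]
```

The resulting `Int16Array` (or its `.buffer`) can be passed to the `send()` method of a data channel or Web Socket as binary data.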
The main synchronization needs to occur where playback of the backing track and recording occur at the same time. Immediately upon starting your playback, start recording. If you're using the ScriptProcessorNode as mentioned previously, you can actually do both in the same node, guaranteeing sample-accurate synchronization.
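Here is a minimal sketch of doing both in one callback. `createPlayRecordHandler` and `onBlock` are hypothetical names of mine, and the handler is written as a plain function so it can be tried outside a browser. Note that ScriptProcessorNode has since been deprecated in favor of AudioWorklet, where the same per-block logic would live in the processor's `process()` method.

```javascript
// Returns a block handler that plays `backing` (a Float32Array) and records
// the input in lockstep. `onBlock(offset, samples)` is a hypothetical
// callback, e.g. one that sends the recorded block over the network.
function createPlayRecordHandler(backing, onBlock) {
  let position = 0; // samples processed since the backing track started
  return function process(input, output) {
    for (let i = 0; i < output.length; i++) {
      const p = position + i;
      output[i] = p < backing.length ? backing[p] : 0; // silence after the end
    }
    // Copy the input, since the audio engine may reuse the underlying buffer.
    onBlock(position, Float32Array.from(input));
    position += output.length;
  };
}

// Browser wiring (not run here), assuming `node` is a ScriptProcessorNode
// and `process` came from createPlayRecordHandler:
// node.onaudioprocess = (e) =>
//   process(e.inputBuffer.getChannelData(0), e.outputBuffer.getChannelData(0));
```

Every recorded block carries the sample offset of the backing track block that was played in the same callback. The sound card's own input and output latency is not accounted for here, so you may still need a fixed per-site correction.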
On playback, simply buffer all your tracks until you have your desired buffer level, and then play them back simultaneously inside your ScriptProcessorNode. Again, this is sample-accurate.
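A sketch of that gallery-side buffering might look like the following. `PlayoutMixer` is a made-up name, and I'm assuming each site's samples arrive in order starting from sample 0. Trimming already-played samples and scaling the mix to avoid clipping are left out to keep it short.

```javascript
// Buffers one track per site and mixes them once every site has reached the
// target buffer level. All tracks share one read position, so sample N of
// every site is always mixed with sample N of the others.
class PlayoutMixer {
  constructor(siteCount, targetSamples) {
    this.tracks = Array.from({ length: siteCount }, () => []);
    this.targetSamples = targetSamples;
    this.readPos = 0;
    this.started = false;
  }

  // Append samples received from a site, in stream order.
  push(site, samples) {
    const track = this.tracks[site];
    for (let i = 0; i < samples.length; i++) track.push(samples[i]);
  }

  // Fill `output` with the next mixed block. Outputs silence until every
  // site has buffered `targetSamples`; returns whether playback has started.
  pull(output) {
    if (!this.started) {
      this.started = this.tracks.every((t) => t.length >= this.targetSamples);
    }
    for (let i = 0; i < output.length; i++) {
      let sum = 0;
      if (this.started) {
        for (const track of this.tracks) {
          // On an underrun the site contributes silence, and the shared
          // position still advances so the other sites stay aligned.
          if (this.readPos < track.length) sum += track[this.readPos];
        }
        this.readPos++;
      }
      output[i] = sum;
    }
    return this.started;
  }
}
```

For the 10 second delay from the question at 44.1 kHz, `targetSamples` would be 441000. In the browser, `pull()` would be called with the output channel data inside the audio callback.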
The only thing you might have to deal with now is clock drift. What's 44.1 kHz to you might actually be 44.099 kHz to me. This can add up over time, but is generally not something you need to concern yourself with as long as you reset all this once in a while. That is, as long as you're not recording for a whole day or more without stopping, it probably won't be an issue for you.
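To put rough numbers on that, here is the arithmetic for the example rates above. The function names are hypothetical, and I'm assuming the drift shows up as the playout buffer slowly filling or draining rather than anything more exotic.

```javascript
// Samples gained or lost after `seconds` when a device nominally running at
// `nominalRate` actually runs at `actualRate` (both in Hz).
function driftInSamples(nominalRate, actualRate, seconds) {
  return (nominalRate - actualRate) * seconds;
}

// Seconds until the accumulated drift equals a buffer of `bufferSeconds`.
function secondsUntilBufferConsumed(nominalRate, actualRate, bufferSeconds) {
  const driftPerSecond = Math.abs(nominalRate - actualRate);
  if (driftPerSecond === 0) return Infinity;
  return (bufferSeconds * nominalRate) / driftPerSecond;
}

const perHour = driftInSamples(44100, 44099, 3600);
console.log(perHour, 'samples per hour'); // 3600 samples per hour
console.log((perHour / 44100) * 1000, 'ms per hour'); // about 81.6
console.log(secondsUntilBufferConsumed(44100, 44099, 10) / 86400, 'days'); // about 5.1
```

With those example figures a 10 second buffer lasts several days, which is why a periodic reset is usually enough. Real hardware will drift by a different amount, so measure before relying on this.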
The recorded packets should be marked with the timestamp of the incoming backing track packet
No, this synchronization should not happen at the network layer. If you're using a reliable transport with WebRTC data channels or Web Socket, you don't have to do anything but start all your streams at byte 0. Don't use timestamps, use sample offsets.
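To illustrate what I mean by sample offsets, here is a small sketch. `SampleOffsetCounter` is a hypothetical helper, and I'm assuming mono 16-bit PCM over a reliable, ordered channel, so the position of a chunk follows from how many bytes came before it.

```javascript
// Derives the sample offset of each incoming chunk purely from the number of
// bytes received so far. No timestamps are sent over the wire.
class SampleOffsetCounter {
  constructor(bytesPerSample = 2) {
    this.bytesPerSample = bytesPerSample;
    this.bytesReceived = 0;
  }

  // Returns the offset of the first sample in `chunk` (an ArrayBuffer).
  accept(chunk) {
    if (chunk.byteLength % this.bytesPerSample !== 0) {
      throw new RangeError('chunk does not contain a whole number of samples');
    }
    const offset = this.bytesReceived / this.bytesPerSample;
    this.bytesReceived += chunk.byteLength;
    return offset;
  }
}

const counter = new SampleOffsetCounter();
console.log(counter.accept(new ArrayBuffer(8))); // 0
console.log(counter.accept(new ArrayBuffer(4))); // 4
```

Dividing an offset by the sample rate gives the position in seconds, if you ever need one for display.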
Does anyone know of any JavaScript libraries with built-in functionality to allow this sort of buffering?
I've actually built a project for doing similar things that allows for sample-accurate internet radio hand-offs from one site to another. It builds up a buffer over time, and then for the hand-off it basically re-syncs to a master clock from the new site. Since the new site is behind the old site, and since we can't bend space/time very easily, I drop out of the buffer a bit and pick up at the new site's master clock. (This is no different from handling a buffer underrun from a single site!) Anyway, I don't know of any other code that does this.
Upvotes: 3