Jay Bazuzi
Jay Bazuzi

Reputation: 46496

How to synchronize media playback over an unreliable network?

I wish I could play music or video on one computer, and have a second computer playing the same media, synchronized. As in, I can hear both computers' speakers at the same time, and it doesn't sound funny.

I want to do this over Wi-Fi, which is slightly unreliable.

Algorithmically, what's the best approach to this problem?

EDIT 1

Whether both computers "play" the same media, or one "plays" the media and streams it to the other, doesn't matter to me.

I am certain this is a tractable problem because I once saw a demo of Wi-Fi speakers. That was 5+ years ago, so I'm figure the technology should make it easier today.

Upvotes: 6

Views: 2581

Answers (2)

ninjagecko
ninjagecko

Reputation: 91094

(I myself was looking for an application which did this, hoping I wouldn't have to write one myself, when I stumbled upon this question.)

overview

You introduce a bit of buffer latency and use a network time-synchronization protocol to align the streams. That is, you split the stream up into packets, and timestamp each packet with "play later at time T", where T is for example 50-100ms in the future (or more if the network is glitchy). You send (or multicast) the packets on the local network, to all computers in the chorus. The computers will all play the sound at the same time because the application clock is synced.

Note that there may be other factors like OS/driver/soundcard latency which may have to be factored into the time-synchronization protocol. If you are not too discerning, the synchronization protocol may be as simple as one computer beeping every second -- plus you hitting a key on the other computer in beat. This has the advantage of accounting for any other source of lag at the OS/driver/soundcard layers, but has the disadvantage that manual intervention is needed if the clocks become desynchronized.


hybrid manual-network sync

One way to account for other sources of latency, without constant manual intervention, is to combine this approach with a standard network-clock synchronization protocol; the first time you run the protocol on new machines:

  1. synchronize the machines with manual beat-style intervention
  2. synchronize the machines with a network-clock sync protocol
  3. for each machine in the chorus, take the difference of the two synchronizations; this is the OS/driver/soundcard latency of each machine, which they each keep track of

Now whenever the network backbone changes, all one needs to do is resync using the network-clock sync protocol (#2), and subtract out the OS/driver/soundcard latencies, obviating the need for manual intervention (unless you change the OS/drivers/soundcards).


nature-mimicking firefly sync

If you are doing this in a quiet room and all machines have microphones, you do not even need manual intervention (#1), because you can have them all follow a "firefly-style" synchronizing algorithm. Many species of fireflies in nature will all blink in unison. http://tinkerlog.com/2007/05/11/synchronizing-fireflies/ describes the algorithm these fireflies use: "If a firefly receives a flash of a neighbour firefly, it flashes slightly earlier." Flashes correspond to beeps or buzzes (through the soundcard, not the mobo piezo buzzer!), and seeing corresponds to listening through the microphone.

This may be a bit awkward over very large room distances due to the speed of sound, but I doubt it'll be an issue (if so, decrease rate of beeping).

Upvotes: 7

John Ellinwood
John Ellinwood

Reputation: 14531

The synchronization is relative to the position of the listener relative to each speaker. I don't think the reliability of the network would have as much to do with this synchronization as it would the content of the audio stream. In order to synchronize you need to find the distance between each speaker and the listener. Find the difference between each of those values and the value for the farthest speaker. For each 1.1 feet of difference, delay each of the close speakers by 1ms. This will ensure that the audio stream reaches the listener at the same time. This all assumes an open area, as any in proximity to your scenario will generate reflections of the audio waves and create destructive interference. Objects within the area may also transmit sound at a slower speed resulting in delayed sound of their own.

Upvotes: -1

Related Questions