First, I do clock synchronization with a central server so that all clients can agree on a time reference.
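To make that concrete, here is a minimal sketch of the sync step in TypeScript: an NTP-style round trip over a WebSocket. The "sync-request" message and the { serverTime } response shape are assumptions for illustration, not the project's actual protocol.

    interface SyncSample { offset: number; rtt: number; }

    function measureOffset(ws: WebSocket): Promise<SyncSample> {
      return new Promise((resolve) => {
        const t0 = performance.now(); // client send time (ms)
        const handler = (ev: MessageEvent) => {
          const t1 = performance.now(); // client receive time (ms)
          const { serverTime } = JSON.parse(ev.data);
          ws.removeEventListener("message", handler);
          const rtt = t1 - t0;
          // Assume symmetric network legs: the server read its clock
          // roughly at the midpoint of the round trip.
          resolve({ offset: serverTime - (t0 + rtt / 2), rtt });
        };
        ws.addEventListener("message", handler);
        ws.send(JSON.stringify({ type: "sync-request" }));
      });
    }

    // Repeat a few times and keep the sample with the smallest round trip,
    // since it carries the tightest bound on the offset error.
    async function syncClock(ws: WebSocket, rounds = 8): Promise<number> {
      const samples: SyncSample[] = [];
      for (let i = 0; i < rounds; i++) samples.push(await measureOffset(ws));
      samples.sort((a, b) => a.rtt - b.rtt);
      return samples[0].offset; // serverTime ≈ performance.now() + offset
    }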
Then, instead of directly manipulating the hardware audio ring buffers (which browsers don't allow), I use the Web Audio API's scheduling system to play audio in the future at a specific start time, on all devices.
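The scheduling step might look like the sketch below, assuming the sync step above produced a clockOffset (serverTime - performance.now(), in ms). The message fields are hypothetical, but source.start(when, offset) is the real Web Audio API call that makes future-dated playback possible.

    function scheduleStart(
      ctx: AudioContext,
      buffer: AudioBuffer,
      serverStartTime: number, // target start, in server-time ms
      sampleOffset: number,    // sample position in the buffer to start from
      clockOffset: number      // from the clock-sync step
    ): void {
      const source = ctx.createBufferSource();
      source.buffer = buffer;
      source.connect(ctx.destination);

      // Map the server timestamp into this device's AudioContext time.
      // performance.now() and ctx.currentTime tick on different clocks;
      // AudioContext.getOutputTimestamp() gives a tighter mapping, but the
      // simple difference below is close enough for a sketch.
      const localStartMs = serverStartTime - clockOffset;
      const when = ctx.currentTime + (localStartMs - performance.now()) / 1000;

      // start(when, offset): begin at a future context time, from the agreed
      // position in the buffer (the offset argument is in seconds).
      source.start(Math.max(when, ctx.currentTime), sampleOffset / buffer.sampleRate);
    }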
So a central server relays messages between clients, telling each device when to start and which sample position in the buffer to start from.
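A sketch of that relay, assuming a Node server with the "ws" package: the message names and the 500 ms lead time are hypothetical, but the idea is that the server stamps a start time far enough in the future for every device to receive the command before it arrives.

    import { WebSocketServer, WebSocket } from "ws";

    const wss = new WebSocketServer({ port: 8080 });
    const LEAD_TIME_MS = 500; // headroom for network delivery (a guess)

    wss.on("connection", (socket) => {
      socket.on("message", (raw) => {
        const msg = JSON.parse(raw.toString());
        if (msg.type === "sync-request") {
          // Answer clock-sync probes with the server's current time.
          socket.send(JSON.stringify({ serverTime: Date.now() }));
        } else if (msg.type === "play") {
          // Relay one client's play command to everyone, with a shared
          // future start time and the agreed buffer position.
          const command = JSON.stringify({
            type: "start",
            serverStartTime: Date.now() + LEAD_TIME_MS,
            sampleOffset: msg.sampleOffset ?? 0,
          });
          for (const client of wss.clients) {
            if (client.readyState === WebSocket.OPEN) client.send(command);
          }
        }
      });
    });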
Interesting. Feels like this might still have some noticeable tens-of-milliseconds latency on Windows, where the default audio drivers still have high latency. The browser may intend to play the sound at time t, but when it calls Windows's audio API I'm guessing it doesn't apply a negative time offset to compensate for the driver's output delay?
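For what it's worth, the Web Audio spec does expose the output pipeline delay: AudioContext.outputLatency (and baseLatency) report, in seconds, roughly how long a scheduled sample takes to reach the speaker, so a client could start that much earlier. Whether this project does so is a guess on my part, and browser support for outputLatency is uneven, hence the fallback in this sketch.

    function compensatedWhen(ctx: AudioContext, when: number): number {
      // Subtract the reported pipeline delay so the sound exits the
      // speaker at "when" rather than entering the pipeline at "when".
      const pipelineDelay = (ctx.outputLatency ?? 0) + ctx.baseLatency;
      return when - pipelineDelay;
    }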
So it doesn't need to use the microphone? That's what I gather from the "works across the ocean" comment and this description. I would have thought you'd listen to the mic and sync based on surrounding audio somehow, but it's good to know that's not needed.