Reputation: 1865
We have some data that we are trying to synchronize between N machines and a centralized server, and I'm looking for a way to do this that is relatively efficient and robust.
Looking around, it appears that this is called the "set reconciliation problem". It's good to have a label for it, but searching on that turns up a lot of fairly academic work whose usefulness for our data is hard to gauge. Our data behaves like contact lists: objects (people) with multiple fields that do get updated, but not that often.
Our system involves a central server and machines connected to it. Ideally, the central server holds the 'good' copy. Another nice-to-have feature is the ability to force the machines to resend their data by tweaking something on the server.
So far, my thinking is along the lines of a UUID for each object plus something like a version or timestamp (per object and/or per collection of objects?) to tell which data needs synchronizing... but my thinking is still a bit fuzzy, and I thought asking would probably lead to a better solution than trying to invent this on my own.
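To make the idea a bit more concrete, here is a rough sketch in Python of what I have in mind. It assumes a per-object integer version, a set of changed UUIDs on each machine, and a server-side `epoch` value that can be bumped to force a full resend. The names `Server`, `Machine`, `epoch`, `dirty`, `edit` and `push` are all made up for illustration, and the sketch does not deal with two machines editing the same object beyond "highest version wins".

```python
import uuid


class Server:
    """Hypothetical central store holding the 'good' copy."""

    def __init__(self):
        self.epoch = 1      # bump this to make every machine resend everything
        self.records = {}   # uuid -> (version, fields)

    def receive(self, records):
        """Accept pushed records, keeping the higher version for each UUID."""
        for rec_id, (version, fields) in records.items():
            current = self.records.get(rec_id)
            if current is None or version > current[0]:
                self.records[rec_id] = (version, dict(fields))
        return self.epoch


class Machine:
    """Hypothetical client that pushes its local changes to the server."""

    def __init__(self):
        self.records = {}       # uuid -> (version, fields)
        self.dirty = set()      # UUIDs changed since the last push
        self.seen_epoch = None  # server epoch observed at the last push

    def edit(self, rec_id, fields):
        """Create or update a record locally and bump its version."""
        version = self.records.get(rec_id, (0, None))[0] + 1
        self.records[rec_id] = (version, dict(fields))
        self.dirty.add(rec_id)

    def push(self, server):
        """Send changed records, or all records if the server epoch moved."""
        if server.epoch != self.seen_epoch:
            to_send = set(self.records)   # epoch changed: resend everything
        else:
            to_send = set(self.dirty)     # normal case: only what changed
        server.receive({rec_id: self.records[rec_id] for rec_id in to_send})
        self.seen_epoch = server.epoch
        self.dirty.clear()
        return len(to_send)


if __name__ == "__main__":
    server, machine = Server(), Machine()
    ann = str(uuid.uuid4())
    machine.edit(ann, {"name": "Ann"})
    machine.push(server)   # first push sends everything
    server.epoch += 1      # the "tweak" on the server
    machine.push(server)   # machine notices the new epoch and resends all
```

In a real system the push would go over the network and would need to survive a failed or partial transfer before clearing `dirty`, which this sketch ignores.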
Upvotes: 0
Views: 729
Reputation: 1497
It is not easy, and the perfect solution is academic. So you are on the right track. You can craft a sync algorithm for your own problem by relaxing some of the requirements of the general solution.
I delivered a presentation on these topics at the last JsDay in Italy. Here are my slides: http://www.slideshare.net/matteocollina/operational-transformation-12962149
Let me know if they help you, or if you need some assistance.
Upvotes: 1