Needs Help

Reputation: 275

Zookeeper barrier implementation

I am trying to implement a barrier in ZooKeeper. My implementation always works when only a small number of nodes need to join to pass the barrier. However, when I test it with 100+ nodes joining the barrier, around 1% of the time one of the nodes seems to miss the last watcher event and never re-checks whether the number of children of the barrier node has changed.

I even synchronized the process method on the watcher, but that did not change anything. Below is the code for my process method and the logic that checks whether it needs to move forward.

Watcher process:

public BarrierWatcher(FastBarrier fastBarrier) {
  this.ofb = fastBarrier;
}

@Override
public synchronized void process(WatchedEvent event) {
  synchronized (ofb) {
    ofb.notify();
  }
}

Logic to control barrier mechanism:

BarrierWatcher bw = new BarrierWatcher(this);
List<String> memberList = zk.getChildren(barrierPath, bw);
synchronized(this) {
  while (memberList.size() < numOfMembers) {
    this.wait(1000);
    memberList = zk.getChildren(barrierPath, bw);
  }
}

Instead of just calling this.wait(), I had to add this.wait(1000) to handle the rare failure. With the 1000 ms timeout in place, it always passes the barrier once all nodes have joined. I was sure that synchronizing the process method would fix this, but it hasn't. Does anyone have experience with this, or any idea what I might be doing wrong?

Upvotes: 4

Views: 1701

Answers (1)

Mairbek Khadikov

Reputation: 8089

You can compare your implementation with netflix-curator, where a distributed barrier is already implemented.
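One plausible explanation for the hang, not confirmed in the question: the first zk.getChildren(barrierPath, bw) call runs before synchronized(this) is entered. A watch event that fires in that window calls notify() while nobody is waiting, and since ZooKeeper watches are one-shot, a plain wait() then never returns. A minimal sketch of one way to avoid losing such a notification is to record it in a flag under the same lock. ChangeSignal and its method names are hypothetical and not part of ZooKeeper or Curator:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper: remembers a notification that arrives before the waiter starts waiting.
final class ChangeSignal {
    private boolean changed = false; // guarded by the monitor of this object

    // Intended to be called from the watcher thread, e.g. inside Watcher.process().
    public synchronized void signal() {
        changed = true;
        notifyAll();
    }

    // Returns true if a signal was pending or arrived within the timeout, false otherwise.
    // The flag is consumed, so each signal releases at most one awaitChange call.
    public synchronized boolean awaitChange(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!changed) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Timed Object.wait on this monitor; the loop guards against spurious wakeups.
            TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
        }
        changed = false;
        return true;
    }
}
```

In the code from the question, process() would call signal() instead of notify(), and the loop would call awaitChange(...) instead of this.wait(1000) before re-reading the children. Moving the first getChildren call inside the synchronized block should close the same window without a helper class.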

Upvotes: 4
