Karol Dowbecki

Reputation: 44980

Efficient exclusive lock with ZooKeeper for infrequent operations

I have a micro-service deployed on multiple servers that has two main sources of data:

  1. Events received constantly (24/7/365) at high volume (100-1000 events/sec)

  2. A once-a-day operation which might take a moment to finish

I want to run this once-a-day processing in exclusive mode: pause the event processing, run the once-a-day task, and then resume the event processing. I already have a way to start the once-a-day operation correctly but I still have to implement locking between 1 and 2 to ensure exclusivity.

Most ZooKeeper recipes that I found require a write operation for every processed event, e.g. read lock acquisition with InterProcessReadWriteLock or a counter increment with DistributedAtomicLong. Since the once-a-day operation happens infrequently, this per-event overhead seems wasteful.

Is there a ZooKeeper/Curator recipe that's optimized for such a use case?

I thought about the following, but I'm not 100% sure it's the right approach (or how to implement point 2 below):

  1. When the once-a-day operation starts, create a new /exclusive path in ZooKeeper
  2. Wait for all in-flight events to finish
  3. Before processing an event, check if /exclusive exists. If it's there, stop processing until the /exclusive path is removed
  4. When the once-a-day operation ends, remove the /exclusive path
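To make points 2 and 3 concrete, here is a minimal single-JVM sketch of the gating logic, not a ZooKeeper recipe. The class EventGate and its method names are hypothetical. The boolean `exclusive` stands in for a locally cached answer to "does /exclusive exist?", which a watcher would presumably keep up to date. It only covers draining in-flight events inside one process; how each server reports back that it has drained is not addressed here.

```java
/** Hypothetical in-process gate; a local stand-in for the /exclusive check. */
final class EventGate {
    private boolean exclusive = false; // local mirror of "/exclusive exists"
    private int inFlight = 0;          // events currently being processed

    /** Point 3: called before each event; blocks while exclusive mode is on. */
    synchronized void enter() throws InterruptedException {
        while (exclusive) {
            wait();
        }
        inFlight++;
    }

    /** Non-blocking variant: returns false instead of waiting. */
    synchronized boolean tryEnter() {
        if (exclusive) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Called after an event has been fully processed. */
    synchronized void exit() {
        inFlight--;
        notifyAll(); // wake a caller waiting in beginExclusive()
    }

    /** Points 1 and 2: stop admitting events, then wait for in-flight ones. */
    synchronized void beginExclusive() throws InterruptedException {
        exclusive = true;
        while (inFlight > 0) {
            wait();
        }
    }

    /** Point 4: allow event processing to resume. */
    synchronized void endExclusive() {
        exclusive = false;
        notifyAll(); // wake event threads blocked in enter()
    }

    synchronized int inFlight() {
        return inFlight;
    }
}
```

An event thread would wrap each event in enter() and exit() (with exit() in a finally block), while whatever reacts to /exclusive appearing and disappearing would call beginExclusive() and endExclusive(). The per-event cost is then a local monitor acquisition rather than a ZooKeeper write.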

Upvotes: 1

Views: 1051

Answers (1)

Randgalt

Reputation: 2956

What about this?

Each event processor must:

  • Obtain a read lock via InterProcessReadWriteLock.
  • Use NodeCache to watch a "signal" node and listen for changes on this node. When the node exists it means it is daily processing time. When it doesn't exist event processing can proceed.
  • When the NodeCache shows that the signal node has been created, the event processors must release their locks and wait for the signal node to be deleted (again by listening with the NodeCache).
  • When the NodeCache shows that the signal node has been deleted, the event processors obtain the read locks again and continue processing events.

Once this is set up, there's no additional ZooKeeper activity while it's all running.

When the once-a-day operation is ready to run:

  • It creates the signal node (as an ephemeral node)
  • Acquires the write lock on the same path as the event processors use for their read locks
  • Does its periodic processing
  • Releases the write lock
  • Deletes the signal node
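The two lists above can be sketched as an in-process analogue. This is not Curator code: a java.util.concurrent ReentrantReadWriteLock stands in for InterProcessReadWriteLock, and an AtomicBoolean stands in for the signal node that a NodeCache listener would flip. The class and method names are hypothetical. The sketch is meant to show the shape of the protocol: the read lock is held across many events, and the only per-event cost is reading a local flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.BooleanSupplier;

/** Hypothetical in-process analogue of the signal-node protocol. */
final class SignalProtocolSketch {
    // Stand-in for the shared read/write lock path.
    final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Stand-in for "the signal node exists".
    final AtomicBoolean signal = new AtomicBoolean(false);
    // Counts handled events so the behaviour can be observed.
    final AtomicLong processed = new AtomicLong();

    /** Event processor: holds the read lock until the signal appears. */
    void runProcessor(BooleanSupplier keepRunning) throws InterruptedException {
        while (keepRunning.getAsBoolean()) {
            // Wait for the signal to be cleared before (re)acquiring.
            while (signal.get()) {
                Thread.sleep(1);
            }
            lock.readLock().lock();
            try {
                // Hot loop: one flag read per event, no lock traffic.
                while (keepRunning.getAsBoolean() && !signal.get()) {
                    processed.incrementAndGet(); // "process one event"
                }
            } finally {
                // Released by the same thread that acquired it.
                lock.readLock().unlock();
            }
        }
    }

    /** Daily operation: signal, take the write lock, run, release, clear. */
    void runDaily(Runnable task) {
        signal.set(true);        // "create the signal node"
        lock.writeLock().lock(); // blocks until all read locks are released
        try {
            task.run();
        } finally {
            lock.writeLock().unlock();
            signal.set(false);   // "delete the signal node"
        }
    }
}
```

One design point worth carrying over, as far as I understand Curator's locks: like the lock used here, they are owned by the acquiring thread, so the listener callback should only set a flag and let the processing thread itself release and re-acquire the read lock.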

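There's a huge caveat with this, however, and that's what can happen with JVM pauses. If an event processor's JVM stalls (for example during a long garbage collection) for long enough that its ZooKeeper session expires, it can lose its read lock without noticing and keep processing events while the daily operation runs. Please also read this Tech Note for important edge cases.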

Upvotes: 1
