Vineel

Reputation: 1788

Role of Zookeeper in Hadoop

Based on the slides, I understand that in the context of Hadoop, Zookeeper stores information about the master, the status of the different tasks, which worker is working on which partition, and the list of available workers.

Why is Zookeeper used for this metadata storage? Any data store could be used, right?

For instance, Celery can be configured with any result backend (Redis, MongoDB, etc.). So in practice Hadoop could use any storage backend, right? Why Zookeeper specifically?

This doc suggests that Redis, SQLite, MySQL, and PostgreSQL can be used for Celery task result storage:

https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/index.html

Upvotes: 0

Views: 178

Answers (1)

OneCricketeer

Reputation: 191681

Zookeeper, which is built on the ZAB (Zookeeper Atomic Broadcast) protocol, is used for leader election as well as distributed locks.

It is not simply a datastore, and no, not just any datastore can be used in its place.
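To make the leader election point concrete, here is a minimal in-memory sketch of the idea behind the usual election recipe: each candidate registers a node that gets a service-assigned sequence number and is tied to the candidate's session, and the lowest surviving number is the leader. `ToyCoordinator` and its method names are hypothetical and made up for illustration; this is not Zookeeper's client API, and it ignores replication and networking entirely.

```python
import itertools


class ToyCoordinator:
    """In-memory stand-in for a coordination service (hypothetical, not Zookeeper's API)."""

    def __init__(self):
        self._seq = itertools.count()
        self._nodes = {}  # sequence number -> (session id, worker name)

    def create_ephemeral_sequential(self, session_id, worker):
        """Register a candidate; the service assigns an increasing sequence number."""
        seq = next(self._seq)
        self._nodes[seq] = (session_id, worker)
        return seq

    def expire_session(self, session_id):
        """Simulate a crashed client: every node owned by its session disappears."""
        self._nodes = {s: v for s, v in self._nodes.items() if v[0] != session_id}

    def leader(self):
        """The candidate holding the lowest surviving sequence number is the leader."""
        if not self._nodes:
            return None
        return self._nodes[min(self._nodes)][1]


zk = ToyCoordinator()
zk.create_ephemeral_sequential("session-a", "master-1")
zk.create_ephemeral_sequential("session-b", "master-2")
first_leader = zk.leader()
zk.expire_session("session-a")  # master-1 crashes, its node goes away
second_leader = zk.leader()
print(first_leader, second_leader)
```

The sketch relies on two properties: nodes that vanish when their owner's session ends, and ordering assigned by the service rather than by the clients. Zookeeper provides both out of the box as ephemeral and sequential znodes.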

Celery isn't used within the Hadoop ecosystem, so I'm not sure how that's relevant to the question.

Upvotes: 1
