rocksteady

Reputation: 2560

Changing zookeeper cluster leadership when leader dies

The docker-compose.yml files are at the end of this post.

Prerequisites:

I start 3 zookeeper servers as a cluster using docker-compose (docker-compose.yml, 3 zookeepers), then I add a fourth one (another docker-compose.yml, 1 zookeeper) to the cluster. One of the first 3 zookeepers is the leader and the fourth one is a follower, as expected.

Problem:

When I stop the first three zookeepers (by means of docker-compose down), I "lose" the leader, and I expect the fourth zookeeper to take over the leadership.

The only thing that happens is that zookeeper shows errors, e.g.:

WARN Cannot open channel to 3 at election address localhost/127.0.0.1:43888
java.net.ConnectException: Connection refused

Running echo stat | nc localhost 52181 | grep Mode previously reported the mode follower for this last zookeeper; now it returns nothing.

The still running zookeeper server only says, e.g.:

INFO Closed socket connection for client /127.0.0.1:43548 (no session established for client) (org.apache.zookeeper.server.NIOServerCnxn)

Solution 1:

Solution 2:

When I start the single zookeeper server first (without the others already running), it just returns error messages (see errors above) and is obviously not running correctly, since echo stat | nc localhost 52181 | grep Mode again returns nothing.

When I then add the other 3 zookeepers to the cluster, everything runs well and the first zookeeper server is the leader.

Killing the first zookeeper leaves 3 running and one of them is the new leader.

Question:


docker-compose.yml files:

I start 3 zookeeper servers with docker-compose and the following docker-compose.yml:

---
version: '2'
services:
  zookeeper_1:
    image: confluentinc/cp-zookeeper:3.1.1
    network_mode: host
    environment:
      ZOOKEEPER_SERVER_ID: 1
      ZOOKEEPER_CLIENT_PORT: 22181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: localhost:22888:23888;localhost:32888:33888;localhost:42888:43888;localhost:52888:53888
  zookeeper_2:
    image: confluentinc/cp-zookeeper:3.1.1
    network_mode: host
    environment:
      ZOOKEEPER_SERVER_ID: 2
      ZOOKEEPER_CLIENT_PORT: 32181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: localhost:22888:23888;localhost:32888:33888;localhost:42888:43888;localhost:52888:53888
  zookeeper_3:
    image: confluentinc/cp-zookeeper:3.1.1
    network_mode: host
    environment:
      ZOOKEEPER_SERVER_ID: 3
      ZOOKEEPER_CLIENT_PORT: 42181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: localhost:22888:23888;localhost:32888:33888;localhost:42888:43888;localhost:52888:53888

Then I start a fourth one in the same manner:

---
version: '2'
services:
  zookeeper_4:
    image: confluentinc/cp-zookeeper:3.1.1
    network_mode: host
    environment:
      ZOOKEEPER_SERVER_ID: 4
      ZOOKEEPER_CLIENT_PORT: 52181
      ZOOKEEPER_TICK_TIME: 2000
      ZOOKEEPER_INIT_LIMIT: 5
      ZOOKEEPER_SYNC_LIMIT: 2
      ZOOKEEPER_SERVERS: localhost:22888:23888;localhost:32888:33888;localhost:42888:43888;localhost:52888:53888

Upvotes: 1

Views: 2571

Answers (1)

Benjamin Reed

Reputation: 433

One thing to keep in mind: ZooKeeper will only come up if a majority of the configured servers are running. So if you have 4 servers and bring down 3 of them, ZooKeeper will only come up again once you start two more, because 3 out of 4 is the smallest majority.
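To make the majority arithmetic concrete, here is a minimal sketch in Python. It assumes the default majority quorum with every listed server being a voting participant (no observers); the function names are made up for illustration and are not part of ZooKeeper. The server list is copied from the question's ZOOKEEPER_SERVERS setting.

```python
# Ensemble definition copied from the question's docker-compose.yml files.
ZOOKEEPER_SERVERS = (
    "localhost:22888:23888;localhost:32888:33888;"
    "localhost:42888:43888;localhost:52888:53888"
)


def ensemble_size(servers: str) -> int:
    """Count the entries in a ';'-separated server list."""
    return len([entry for entry in servers.split(";") if entry.strip()])


def quorum_size(total: int) -> int:
    """Smallest strict majority of `total` voting servers."""
    return total // 2 + 1


def additional_servers_needed(total: int, running: int) -> int:
    """How many more servers must start before a majority is running."""
    return max(0, quorum_size(total) - running)


if __name__ == "__main__":
    total = ensemble_size(ZOOKEEPER_SERVERS)
    print("configured servers:", total)
    print("quorum size:", quorum_size(total))
    # Scenario from the question: only the fourth server is still running.
    print("more servers needed:", additional_servers_needed(total, running=1))
```

Under the same assumptions this also matches the second experiment in the question: with 3 of the 4 servers still running, a majority is present, so a new leader can be elected.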

Which version of ZooKeeper are you using? If it is older than 3.5, or if you are using 3.5 without the reconfig commands, you will need to restart the servers when you change the configuration.

Upvotes: 2
