anshul410

Reputation: 884

Kafka cluster setup

I am new to Kafka. I have set up a basic single-node cluster using Ambari.

I want to understand the recommended configuration for a production server. Let's say that in production I will have 5 topics, each getting traffic in the range of 500,000 to 50,000,000 a day.

I am thinking of setting up a 3-4 node Kafka cluster using EC2 r5.xlarge instances.

I am mostly confused about the Zookeeper part. I understand that Zookeeper needs an odd number of nodes and that Zookeeper is installed on all Kafka nodes, so how do I run Kafka with an even number of nodes? If this is true, it will limit Kafka to an odd number of nodes as well.

Is it really necessary to install Zookeeper on all Kafka nodes? Can I install Zookeeper on one set of nodes and the Kafka brokers on another, and if so, how?

What if I want to run multiple Kafka clusters? Is it possible to manage multiple Kafka clusters through a single Zookeeper cluster, and if so, how?

I have only recently started learning Kafka, so any help would be appreciated.

Thanks,

Upvotes: 0

Views: 3023

Answers (2)

OneCricketeer

Reputation: 191993

Can I install Zookeeper on separate nodes and Kafka brokers on separate nodes, how ?

You can, and you should if you have the available resources.


Run zookeeper-server-start zookeeper.properties on an odd number of servers (max 5 or 7 for larger Kafka clusters).

On every other machine that will be a Kafka broker (not the same servers as Zookeeper), edit server.properties so that the zookeeper.connect property points to that set of Zookeeper machine addresses.

Then run kafka-server-start server.properties on every new Kafka broker.

From there, you can scale Kafka independently of Zookeeper.
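As a rough illustration of the server.properties step, here is a minimal Python sketch that prints the two lines that typically differ or need attention per broker. The hostnames zoo1, zoo2 and zoo3, the helper name render_broker_properties, and the choice of four brokers are assumptions made for the example; broker.id and zookeeper.connect are the real property names.

```python
def render_broker_properties(broker_id, zk_hosts, zk_port=2181):
    """Return the per-broker lines of server.properties as a string.

    Every broker gets a unique broker.id, while zookeeper.connect is the
    same comma-separated list of Zookeeper addresses on all of them.
    """
    connect = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    return f"broker.id={broker_id}\nzookeeper.connect={connect}\n"


# Hypothetical Zookeeper hostnames; replace with your own ensemble.
zk_hosts = ["zoo1", "zoo2", "zoo3"]

# An even number of brokers is fine: only the Zookeeper ensemble is odd.
for broker_id in range(4):
    print(render_broker_properties(broker_id, zk_hosts))
```

The rest of server.properties (listeners, log directories and so on) is left out here and would still need to be set for a real deployment.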

Is it possible to manage multiple Kafka clusters through single Zookeeper cluster

Look up Zookeeper chroots.

One Kafka cluster would be defined with a zookeeper.connect value of

zoo1:2181/kafka1

And a second

zoo1:2181/kafka2

Be careful not to mix those up if the machines shouldn't be in the same Kafka cluster.
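To make the chroot idea concrete, here is a small Python sketch that builds the zookeeper.connect value for two clusters sharing one ensemble. The helper name zk_connect and the extra hosts zoo2 and zoo3 are assumptions for illustration. With several Zookeeper hosts, the chroot path is written once, after the last host.

```python
def zk_connect(hosts, chroot="", port=2181):
    """Build a zookeeper.connect value, appending the chroot once at the end."""
    if chroot and not chroot.startswith("/"):
        raise ValueError("chroot must start with '/'")
    return ",".join(f"{host}:{port}" for host in hosts) + chroot


# Hypothetical ensemble shared by two separate Kafka clusters.
ensemble = ["zoo1", "zoo2", "zoo3"]

cluster_a = zk_connect(ensemble, "/kafka1")
cluster_b = zk_connect(ensemble, "/kafka2")

print(cluster_a)  # zoo1:2181,zoo2:2181,zoo3:2181/kafka1
print(cluster_b)  # zoo1:2181,zoo2:2181,zoo3:2181/kafka2
```

Every broker that belongs to the first cluster would use the first value and every broker in the second cluster the other one; a broker started with the wrong path joins the wrong cluster.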


You can find various CloudFormation, Terraform, or Ansible repos on GitHub for setting up Kafka in a distributed way in the cloud, or go for Kubernetes if you are familiar with it.

Upvotes: 2

Milos Pajic

Reputation: 336

I am mostly confused about zookeeper part. I understand zookeeper needs odd number of nodes and zookeeper is installed on all kafka nodes, then how do I run Kafka with even number of nodes. If this is true it will limit Kafka to odd number of nodes as well.

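Zookeeper can be installed on the same servers as Kafka, but it doesn't have to be. Running Zookeeper on an odd number of nodes is not a requirement, just a very good recommendation: Zookeeper needs a majority of its nodes to be up, so an even-sized ensemble tolerates no more failures than the odd-sized one with one node fewer. The number of Kafka brokers is not tied to this and can be even.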
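The odd-number recommendation follows from majority quorum arithmetic, which this short Python sketch works through. The function names are made up for the example; the only assumption is that the ensemble stays available while a strict majority of its nodes is up.

```python
def quorum_size(ensemble_size):
    """Smallest strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can fail while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


for n in range(3, 8):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

Going from 3 to 4 Zookeeper nodes still tolerates only one failure, and going from 5 to 6 still tolerates only two, so the extra even node adds cost without adding fault tolerance.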

Is it really needed to install Zookeeper on all Kafka nodes. Can I install Zookeeper on separate nodes and Kafka brokers on separate nodes, how ?

It is not required, and it's even better not to have Zookeeper and Kafka on the same server. Installing Zookeeper on another server is quite similar to installing it on the same one. Every Kafka broker needs its zookeeper.connect setting to point to all of the Zookeeper nodes.

What if I want to run multiple Kafka clusters. Is it possible to manage multiple Kafka clusters through single Zookeeper cluster, how if possible ?

It is possible. In this case it's recommended to have servers dedicated just to the Zookeeper ensemble, and in the zookeeper.connect setting you should use hostname:port/path instead of just hostname:port.

Upvotes: 2
