Shaik Mujahid Ali

Reputation: 2378

Kafka in distributed system

I am new to Kafka and am currently running it on a single machine. I want to run Kafka in a distributed environment across multiple machines, but I could not find proper documentation for this. Any documentation or suggestions would be really helpful.

Upvotes: 5

Views: 4301

Answers (2)

smadhava

Reputation: 63

Adding on to the previous answer by user2720864

Let us assume a Kafka system with the following configuration is needed:

7 Kafka nodes

3 ZooKeeper nodes

To achieve this, install Kafka on 7 different servers or VMs and set a different broker ID on each of them. This lets ZooKeeper tell the Kafka nodes apart for bookkeeping and maintenance. The ID is set in config/server.properties as broker.id=X.

To run ZooKeeper, you can reuse 3 of the Kafka servers above or use separate servers. Once you have decided which servers run ZooKeeper, edit config/server.properties on every broker to list them:

zookeeper.connect=hostname1:port1,hostname2:port2,hostname3:port3
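
As a minimal sketch of the two settings above, the following Python snippet renders the per-broker lines for a cluster of 7 brokers and 3 ZooKeeper nodes. The hostnames (zk1.example.com and so on) and port 2181 are placeholders, and `render_broker_config` is a hypothetical helper written for this illustration, not part of Kafka.

```python
# Hypothetical ZooKeeper addresses; replace with your own hosts and ports.
ZOOKEEPERS = [
    "zk1.example.com:2181",
    "zk2.example.com:2181",
    "zk3.example.com:2181",
]


def render_broker_config(broker_id, zookeepers):
    """Return the server.properties lines discussed above for one broker."""
    return "\n".join([
        f"broker.id={broker_id}",
        "zookeeper.connect=" + ",".join(zookeepers),
    ])


# One config per broker, with ids 1..7 so that every broker is unique.
configs = {
    broker_id: render_broker_config(broker_id, ZOOKEEPERS)
    for broker_id in range(1, 8)
}

if __name__ == "__main__":
    for broker_id, text in configs.items():
        print(f"# broker {broker_id}\n{text}\n")
```

Every broker gets the same zookeeper.connect value; only broker.id changes from machine to machine.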

In a distributed environment it is good to have 3 ZooKeeper nodes. Only one of them acts as the leader at any time; the other 2 are followers. If the leader fails, the remaining nodes elect a new leader among themselves, so the ensemble keeps working as long as a majority (2 of 3) is up.
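
The choice of 3 nodes comes down to a little arithmetic: ZooKeeper needs a majority of its nodes to be up in order to keep serving. A rough sketch of that calculation follows; the function names are made up for this example.

```python
def quorum_size(ensemble_size):
    """Smallest number of nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can fail while a majority is still available."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"nodes={n} quorum={quorum_size(n)} "
              f"tolerated_failures={tolerated_failures(n)}")
```

With these formulas, 2 nodes tolerate no more failures than 1 node does, while 3 nodes tolerate one failure, which is why odd ensemble sizes are the usual choice.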

I found this link very useful; it helped me clarify a lot of things about Kafka's architecture.

This is a good reference for all the configuration options in Kafka's property files.

Hope this helps!

Upvotes: 4

user2720864

Reputation: 8171

Basically you need to do the following:
1) Set up Kafka on all the machines.
2) Edit the config/server.properties file on each machine to give it a unique ID. You do that by setting the broker.id property in the config file, e.g. broker.id=1, broker.id=2. This ID must be unique for every broker; it is how each node is identified in a Kafka cluster.
3) Start Kafka on all nodes.
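
Since step 2 is the easiest one to get wrong when copying config files between machines, here is a small sketch that checks a set of config file contents for duplicate broker.id values. It assumes plain key=value lines; `parse_properties`, `duplicate_broker_ids`, and the example contents are made up for illustration and are not Kafka tooling.

```python
from collections import Counter


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def duplicate_broker_ids(config_texts):
    """Return the broker.id values that appear in more than one config."""
    ids = [parse_properties(text).get("broker.id") for text in config_texts]
    counts = Counter(i for i in ids if i is not None)
    return sorted(i for i, n in counts.items() if n > 1)


# Made-up contents of three machines' server.properties files.
example_configs = [
    "broker.id=1\nzookeeper.connect=zk1.example.com:2181",
    "broker.id=2\nzookeeper.connect=zk1.example.com:2181",
    "# copied from machine 2 by mistake\nbroker.id=2",
]

if __name__ == "__main__":
    print(duplicate_broker_ids(example_configs))
```

An empty result means every broker has its own ID; any value returned needs to be changed on one of the machines before starting the brokers.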

You can refer to Step 6: Setting up a multi-broker cluster on the official quick start page.

Also, here is a nice article worth taking a look at.

Upvotes: 3
