Reputation: 1474
I am building a Kafka data pipeline with 3 Kafka brokers and 3 ZooKeeper nodes, which would normally take 6 machines. I see two ways to reduce the number of machines:
1) Run one Kafka broker and one ZooKeeper node on the same physical machine. I searched the web and found a Quora answer saying this causes latency issues, because the broker and ZooKeeper compete for the same RAM.
2) Use Docker and run two containers per machine, one for ZooKeeper and one for the broker, and cap the ZooKeeper container's RAM so that more memory is left for the Kafka broker, which should reduce latency.
I would like to know the pros and cons of these two options.
Is it good practice to run ZooKeeper and a broker on the same machine using containers?
Thanks in advance
Upvotes: 0
Views: 2041
Reputation: 8026
A distinctive feature of Kafka is that it relies on sequential disk reads and writes to achieve its high performance. Having another application actively use the same physical drive (as ZooKeeper does, since it maintains a transaction log) will lower the maximum throughput you can get out of Kafka.
This doesn't rule out sharing a server, though, and as has been said, ZooKeeper is a very light service overall in terms of resources used. It only rules out sharing a drive if you need high performance (hundreds of megabytes of data per second out of each broker).
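One rough way to check the drive-sharing point on a co-located machine is to compare the device IDs of the two data directories. The sketch below is only an illustration: the function name and the two paths are hypothetical placeholders, so substitute whatever your broker's log.dirs and ZooKeeper's dataLogDir actually point to. Note the limits of the check: a matching device ID means the paths are on the same filesystem, but differing IDs do not prove separate physical drives, since two partitions of one disk also report different IDs.

```python
import os


def share_filesystem(path_a: str, path_b: str) -> bool:
    """Return True when both paths live on the same mounted filesystem.

    Compares st_dev, the ID of the device that holds each path.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Hypothetical locations, not taken from the thread.
    kafka_log_dir = "/var/lib/kafka/data"
    zk_txn_log_dir = "/var/lib/zookeeper/log"
    try:
        if share_filesystem(kafka_log_dir, zk_txn_log_dir):
            print("Kafka and ZooKeeper write to the same filesystem")
        else:
            print("Different filesystems (may still be one physical disk)")
    except FileNotFoundError as err:
        print(f"Path not found, adjust the placeholders: {err}")
```

If the two directories do turn out to share a filesystem and you need the throughput described above, pointing ZooKeeper's transaction log at a separate drive is the usual remedy.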
You may also want to consider that you would be putting both services in the same failure domain: a single machine failure takes down a broker and a ZooKeeper node together, which is usually not ideal if you can avoid it. It can be an acceptable tradeoff if you can't use or rent 3 very small machines for ZooKeeper and have to work with a fixed stock of servers.
Upvotes: 2
Reputation: 5387
If you are using ZooKeeper only for Kafka and no other process or application uses that ZooKeeper cluster, then ZooKeeper will use very little memory. Kafka itself also doesn't keep a lot of data in heap memory. So it is safe to run one Kafka broker and one ZooKeeper node on the same machine. You can restrict ZooKeeper's memory use by capping its maximum JVM heap size. If too little memory is left for the OS cache, Kafka's performance may suffer, but since ZooKeeper is not going to use much memory, you can largely ignore that.
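To make the memory reasoning above concrete, here is a back-of-the-envelope sketch of how much RAM would be left for the OS cache once both JVM heaps are capped (for example with the standard -Xmx JVM option). The function name, the 1 GB OS reserve, and the example sizes are all illustrative assumptions, not recommendations, and a JVM uses some memory beyond its heap, so treat the result as an upper bound.

```python
def page_cache_budget_mb(total_mb: int, kafka_heap_mb: int,
                         zk_heap_mb: int, os_reserve_mb: int = 1024) -> int:
    """Estimate the memory (in MB) left for the OS page cache.

    Subtracts both JVM heap caps and a rough reserve for the OS and
    other processes from the machine's total RAM.
    """
    remaining = total_mb - kafka_heap_mb - zk_heap_mb - os_reserve_mb
    if remaining < 0:
        raise ValueError("heap caps plus OS reserve exceed total memory")
    return remaining


if __name__ == "__main__":
    # Assumed example: 16 GB machine, 4 GB Kafka heap, 1 GB ZooKeeper heap.
    print(page_cache_budget_mb(16384, 4096, 1024))  # prints 10240
```

With these assumed numbers, capping ZooKeeper at 1 GB still leaves about 10 GB for the cache Kafka depends on, which is why the co-location is usually tolerable from a memory standpoint.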
Upvotes: 0