Reputation: 3748
Is it possible to read from multiple partitions using the Kafka SimpleConsumer? The SimpleConsumer example takes a single partition in the following calls:
PartitionMetadata metadata = findLeader(brokers, port, topic, partition);
SimpleConsumer consumer = new SimpleConsumer(leadBroker, port, 100000, 64 * 1024, clientName);
leadBroker = findNewLeader(leadBroker, topic, partition, port);
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
Upvotes: 6
Views: 11580
Reputation: 540
One thread reads from only one partition. To read from multiple partitions, you need to spawn multiple threads, each reading from a single partition. Run each reader in its own thread; otherwise you lose the benefits of having partitions and your performance will take a hit.
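A minimal sketch of the thread-per-partition idea, using only the Java standard library. `PartitionReader` and `readAll` are hypothetical names, not part of Kafka; `readPartition` stands in for the per-partition fetch loop from the SimpleConsumer example, and each call would create its own SimpleConsumer against that partition's leader.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PartitionReaders {
    /** Hypothetical stand-in for the fetch loop of the SimpleConsumer example. */
    interface PartitionReader {
        long readPartition(int partition) throws Exception; // returns messages read
    }

    /** Runs one reader task per partition and collects the per-partition counts. */
    static Map<Integer, Long> readAll(List<Integer> partitions, PartitionReader reader)
            throws InterruptedException {
        // One thread per partition, so no partition waits behind another.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        Map<Integer, Long> counts = new ConcurrentHashMap<>();
        for (int partition : partitions) {
            pool.submit(() -> {
                try {
                    counts.put(partition, reader.readPartition(partition));
                } catch (Exception e) {
                    counts.put(partition, -1L); // mark the partition as failed
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Dummy reader: pretends partition p yielded 10 * p messages.
        Map<Integer, Long> counts = readAll(Arrays.asList(0, 1, 2), p -> 10L * p);
        System.out.println(counts);
    }
}
```

In real code the lambda passed to `readAll` would wrap the leader lookup, fetch, and leader-failover logic from the wiki example for the given partition.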
To start with, you can run all consumers on one machine, but eventually you will have to use different machines for consuming. At that point you need to ensure that each partition is processed only once. Concretely, the problem to solve is that two threads (from different machines) may try to read from the same partition. At all times, you must allow only one of them to process it.
Additionally, you need to manage offsets yourself. You need to flush them to ZooKeeper at regular intervals.
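The periodic flush could look roughly like the sketch below. It is an illustration only: `OffsetCheckpointer` and `OffsetStore` are hypothetical names, and `OffsetStore.save` is a placeholder for whatever actually persists the offsets (for example, a write to a ZooKeeper node), which is not shown here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class OffsetCheckpointer {
    /** Hypothetical persistence hook; a real one might write to ZooKeeper. */
    interface OffsetStore {
        void save(Map<Integer, Long> offsetsByPartition);
    }

    private final Map<Integer, Long> latest = new ConcurrentHashMap<>();
    private final OffsetStore store;

    OffsetCheckpointer(OffsetStore store) {
        this.store = store;
    }

    /** Called by a reader thread after it has processed a message. */
    void record(int partition, long nextOffset) {
        // Keep the highest offset seen, so a late update cannot move it backwards.
        latest.merge(partition, nextOffset, Long::max);
    }

    /** Persists a snapshot of the offsets recorded so far. */
    void flush() {
        store.save(new HashMap<>(latest));
    }

    /** Flushes at a fixed interval on a background thread; caller shuts it down. */
    ScheduledExecutorService startPeriodicFlush(long intervalMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::flush, intervalMillis, intervalMillis,
                TimeUnit.MILLISECONDS);
        return scheduler;
    }

    public static void main(String[] args) {
        // In-memory store used only for demonstration.
        OffsetCheckpointer checkpointer =
                new OffsetCheckpointer(offsets -> System.out.println("saving " + offsets));
        checkpointer.record(0, 42L);
        checkpointer.flush();
    }
}
```

Offsets processed after the last flush are lost if the reader crashes, so on restart some messages may be read again; a shorter interval narrows that window at the cost of more writes.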
I suggest using the High Level Consumer. It is much easier to use than the Simple Consumer: it coordinates the different threads accessing the same partition and manages offsets on its own.
Upvotes: 5
Reputation: 2938
One instance of SimpleConsumer reads from a single partition. However, you can easily create multiple instances of SimpleConsumer and read different partitions sequentially or in parallel (from different threads).
The tricky part is coordinating readers on different machines so that they don't read from the same partition (assuming each message must be processed only once). You need to use the high-level consumer or write similar custom code to accomplish that.
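To give a rough idea of what such custom code has to guarantee, here is a sketch of a partition ownership registry. It is an in-process illustration only: `PartitionOwnership`, `tryClaim`, and `release` are hypothetical names, and the in-memory map would have to be replaced by a store that all machines share (the high-level consumer relies on ZooKeeper for this) before it could coordinate readers across machines.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PartitionOwnership {
    // partition -> id of the reader that currently owns it
    private final ConcurrentMap<Integer, String> owners = new ConcurrentHashMap<>();

    /** Atomically claims the partition; returns true only for the owning reader. */
    boolean tryClaim(int partition, String readerId) {
        String existing = owners.putIfAbsent(partition, readerId);
        return existing == null || existing.equals(readerId);
    }

    /** Releases the partition, but only if this reader is the current owner. */
    void release(int partition, String readerId) {
        owners.remove(partition, readerId);
    }

    public static void main(String[] args) {
        PartitionOwnership ownership = new PartitionOwnership();
        System.out.println(ownership.tryClaim(0, "reader-a")); // first claim succeeds
        System.out.println(ownership.tryClaim(0, "reader-b")); // already owned by reader-a
    }
}
```

A reader would call `tryClaim` before starting its fetch loop and skip the partition if the claim fails. A distributed version also has to release a claim when its owner dies, which this sketch does not handle.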
Upvotes: 5