Reputation: 123
I have the following questions regarding topics and partitions:
1) What is the difference between n topics with m partitions each and n*m topics? Would there be a difference between accessing m partitions through m threads and accessing n*m topics using n*m different processes?
2) What is a good use case that differentiates the high-level and low-level consumers?
3) In case of a failure (i.e. a message is not delivered), where can I find the error logs in Kafka?
Upvotes: 6
Views: 8408
Reputation: 2938
1) What is the difference between n topics with m partitions and n*m topics?
Every topic has at least one partition. A topic is just a named group of partitions, and the partitions are the actual streams of data. Code that uses a Kafka producer is normally not concerned with partitions; it just sends a message to a topic. By default the producer uses a round-robin approach to select the partition that stores a message (the exact default depends on the client version and on whether the message has a key), but you can plug in a custom partitioner if needed and select a partition based on the message's content.
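To make the two partition-selection strategies concrete, here is a minimal sketch in plain Python. It does not use a Kafka client; `round_robin_partitioner`, `content_partitioner`, the message layout, and the choice of CRC32 as the hash are all assumptions made for illustration, not Kafka's own implementation.

```python
import itertools
import zlib


def round_robin_partitioner(num_partitions):
    """Return a function that cycles through partition ids 0..num_partitions-1."""
    counter = itertools.cycle(range(num_partitions))
    return lambda message: next(counter)


def content_partitioner(num_partitions):
    """Return a function that maps a message's key to a stable partition id."""
    def choose(message):
        key = message["key"].encode("utf-8")
        # crc32 is deterministic across runs, unlike Python's built-in hash()
        return zlib.crc32(key) % num_partitions
    return choose


# Hypothetical messages; only the "key" field drives content-based selection.
messages = [
    {"key": "user-1", "value": "a"},
    {"key": "user-2", "value": "b"},
    {"key": "user-1", "value": "c"},
]

rr = round_robin_partitioner(3)
by_key = content_partitioner(3)

rr_choices = [rr(m) for m in messages]        # one partition after another
key_choices = [by_key(m) for m in messages]   # same key -> same partition
```

With content-based selection, the first and third messages share a key and therefore land in the same partition, which keeps their relative order.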
If there is only one partition, only one broker processes messages for the topic and appends them to a file. On the other hand, if there are as many partitions as brokers, message processing is parallelized and there is up to an m-times speedup (minus overhead). That assumes each broker runs on its own box and Kafka data storage is not shared among brokers.
If there are more partitions for a topic than brokers, Kafka tries to distribute them evenly among all brokers.
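As a rough picture of what "evenly" means, the sketch below spreads partition ids over brokers in round-robin order. This is a simplified model, not Kafka's actual assignment logic (which also has to place replicas); `assign_partitions` and the broker names are hypothetical.

```python
def assign_partitions(num_partitions, brokers):
    """Spread partition ids over brokers in round-robin order."""
    assignment = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        broker = brokers[partition % len(brokers)]
        assignment[broker].append(partition)
    return assignment


# 7 partitions over 3 brokers: no broker holds more than one extra partition.
layout = assign_partitions(7, ["broker-0", "broker-1", "broker-2"])
sizes = [len(parts) for parts in layout.values()]
```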
The same goes for reading from Kafka. If there is only one partition, consumer speed is limited by the maximum read speed of a single disk. If there are multiple partitions, messages from all partitions (on different brokers) can be retrieved in parallel.
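The parallel-read idea can be simulated without a Kafka client: one worker per partition, each reading its own stream from start to end. The in-memory `partitions` dict and `read_partition` are stand-ins invented for this sketch; a real consumer would fetch from brokers instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for three partitions of one topic.
partitions = {
    0: ["a0", "a1"],
    1: ["b0", "b1", "b2"],
    2: ["c0"],
}


def read_partition(partition_id):
    """Read one partition from start to end, preserving its internal order."""
    return partition_id, list(partitions[partition_id])


# One worker per partition, so the reads can proceed concurrently.
with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
    results = dict(pool.map(read_partition, partitions))
```

Each partition's messages come back in their original order, but nothing orders messages across partitions; Kafka likewise guarantees ordering only within a partition.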
1a) Would there be a difference between accessing m partitions through m threads and accessing n*m topics using n*m different processes?
You're mixing partitions and topics here; see my answer above.
2) A perfect use case differentiating the high-level and low-level consumers
High-level consumer: I just want to use Kafka as an extremely fast persistent FIFO buffer and not worry much about the details.
Low-level consumer: I want custom logic for consuming partition data, e.g. to start reading data from newly created topics without having to reconnect the consumer to the brokers.
3) In case of a failure (i.e. a message is not delivered), where can I find the error logs in Kafka?
Kafka uses log4j for logging. For the producer and consumer, where the log is stored depends on the log4j configuration. Kafka broker logs are normally stored in /var/log/kafka/, though this can differ depending on how Kafka was installed.
Upvotes: 19