
Reputation: 4532

When does kafka change leader?

My services have been working with Kafka for a year with no spontaneous leader changes. Over the last 2 weeks, however, leader changes have started happening quite often. The Kafka log shows:

[2015-09-27 15:35:14,826] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions [myTopic] (kafka.server.ReplicaFetcherManager)
[2015-09-27 15:35:14,830] INFO Truncating log myTopic-0 to offset 11520979. (kafka.log.Log)
[2015-09-27 15:35:14,845] WARN [Replica Manager on Broker 2]: Fetch request with correlation id 713276 from client ReplicaFetcherThread-0-2 on partition [myTopic,0] failed due to Leader not local for partition [myTopic,0] on broker 2 (kafka.server.ReplicaManager)
[2015-09-27 15:35:14,857] WARN [Replica Manager on Broker 2]: Fetch request with correlation id 256685 from client mirrormaker-1 on partition [myTopic,0] failed due to Leader not local for partition [myTopic,0] on broker 2 (kafka.server.ReplicaManager)
[2015-09-27 15:35:20,171] INFO [ReplicaFetcherManager on broker 2] Removed fetcher for partitions [myTopic,0] (kafka.server.ReplicaFetcherManager)

What can cause a leader switch? If this is covered somewhere in the Kafka documentation, please just point me to the link; I have failed to find it.

System configuration

kafka version: kafka_2.10-0.8.2.1

os: Red Hat Enterprise Linux Server release 6.5 (Santiago)

server.properties (differs from default):

broker.id=001
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.flush.interval.messages=10000
log.flush.interval.ms=1000
log.retention.bytes=-1
controlled.shutdown.enable=true
auto.create.topics.enable=false

Upvotes: 10

Answers (2)

Garry

Reputation: 688

I am assuming you have one topic with one partition and a replication factor of 2. That is not a good configuration for Kafka performance or for consumers.

Your logs are not clear enough to identify the cause of the leader switch. The main issue with your topic may be that it has only one leader because it has only one partition, so the single log for that partition keeps growing day by day. Kafka does some rebalancing internally (I have not confirmed the details). That may be the reason for your leader switch, but I am not sure.

Also, your 2nd log line says that the log was truncated. Can you please go through the logs in detail and check whether the leader switch happens only after a truncation?

As you mentioned, you have already checked your Kafka log directory files and their sizes. Please run the describe command when this issue occurs; the leader switch will be reflected there as well. Alternatively, if you can set up a dashboard that shows the leader over time, it will be easier to find the root cause.

bin/kafka-topics.sh --describe --zookeeper Zookeeperhost:Port --topic TopicName
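If a dashboard is not available, the same history can be approximated by polling the describe command and logging every change. The following is only a minimal sketch: the function names `extract_leaders` and `watch_leaders` are hypothetical, and the parser assumes each partition line of the describe output contains `Partition: <n>` and `Leader: <id>` fields, as kafka-topics.sh prints them in 0.8.x.

```shell
# Hypothetical helper: read `kafka-topics.sh --describe` output on stdin and
# print "<partition> <leader>" for every partition line.
# Assumes fields of the form "Partition: 0" and "Leader: 2".
extract_leaders() {
    awk '{
        p = ""; l = ""
        for (i = 1; i < NF; i++) {
            if ($i == "Partition:") p = $(i + 1)
            if ($i == "Leader:")    l = $(i + 1)
        }
        if (p != "" && l != "") print p, l
    }'
}

# Hypothetical helper: poll the topic and print a timestamped line whenever
# the partition-to-leader mapping differs from the previous poll.
# Arguments: zookeeper host:port, topic name, poll interval in seconds (default 30).
watch_leaders() {
    zk="$1"; topic="$2"; interval="${3:-30}"
    prev=""
    while :; do
        cur=$(bin/kafka-topics.sh --describe --zookeeper "$zk" --topic "$topic" | extract_leaders)
        if [ "$cur" != "$prev" ]; then
            printf '%s leaders: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(echo "$cur" | tr '\n' ';')"
            prev="$cur"
        fi
        sleep "$interval"
    done
}
```

Run from the Kafka installation directory, for example `watch_leaders Zookeeperhost:Port TopicName 30 >> leader-history.log`, and compare the timestamps with the broker logs.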

Suggestion: I suggest creating a new topic with more partitions (read the Kafka documentation to get a good idea of the optimal number of partitions) and writing to it instead. Alternatively, you can look into how to increase the number of partitions for the current topic.

One last thing: is the leader switch causing issues in your clients, or are you only worried about the warnings?

Upvotes: 0

Giddy up

Reputation: 91

It appears that the lead broker for that partition is down. It might be that the data directory (log.dirs) configured in server.properties is out of space and the broker cannot accommodate more data. Also, what is the replication factor of the topic, and how many brokers are in the cluster?
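One way to check the disk-space theory is to read log.dirs from server.properties and ask df about each directory. This is a rough sketch only: the function names `log_dirs_from_props` and `check_log_dirs` are hypothetical, and it assumes log.dirs is set explicitly in the file as a comma-separated list (if it is not set, the broker falls back to its default and this prints nothing).

```shell
# Hypothetical helper: print each directory listed in the log.dirs entry
# of a server.properties file, one per line.
log_dirs_from_props() {
    sed -n 's/^log\.dirs=//p' "$1" | tr ',' '\n'
}

# Hypothetical helper: print "<use%> <directory>" for every Kafka data directory.
# Argument: path to server.properties.
check_log_dirs() {
    log_dirs_from_props "$1" | while read -r dir; do
        [ -n "$dir" ] || continue
        # df -P uses the POSIX output format; field 5 of line 2 is the use percentage.
        used=$(df -P "$dir" | awk 'NR == 2 { print $5 }')
        echo "$used $dir"
    done
}
```

For example, `check_log_dirs config/server.properties` on each broker shows at a glance whether any data directory is close to full.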

Upvotes: 2
