Reputation: 1
Good morning,
A bit of background for you: We are currently putting together a POC to use Apache Kafka as a messaging queue for inbound log data for post-processing by Elastic Logstash. Currently I have 3 broker nodes configured to point to a single ZooKeeper node. I have a default replication factor of 3 and a minimum ISR of 2 to account for a single node failure (or availability zone in this case). When creating a topic I set a partition count of 10 and replication factor of 3 - Kafka duly goes and creates the topic - happy days! However, because I use SSL on my inbound interface (because it will be internet facing) I need to secure the topics to be writable only by a certain principal, as follows:
/opt/kafka-dq/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeper-001:2181 --add --allow-principal User:USER01 --producer --topic 'USER01_openvpn'
When this happens the ISR drops to a single node, and as I have a minimum ISR of 2 the partitions are taken offline, which causes filebeat (client end) to start throwing the following errors:
kafka/client.go:242 Kafka publish failed with: circuit breaker is open
The following errors are also seen in the Kafka server logs:
[2018-11-16 09:59:12,736] ERROR [Controller id=3] Received error in LeaderAndIsr response LeaderAndIsrResponse(responses={USER01_openvpn-3=CLUSTER_AUTHORIZATION_FAILED, USER01_openvpn-2=CLUSTER_AUTHORIZATION_FAILED...
[2018-11-16 10:09:46,852] ERROR [Controller id=2 epoch=23] Controller 2
epoch 23 failed to change state for partition USER01_openvpn-4 from
OnlinePartition to OnlinePartition (state.change.logger)
kafka.common.StateChangeFailedException: Failed to elect leader for
partition USER01_openvpn-4 under strategy
PreferredReplicaPartitionLeaderElectionStrategy
I have attempted to remedy this by adding an ACL for the ANONYMOUS user to all topics, but this actually caused the cluster to break further. For further clarity: whilst I have SSL enabled on the inbound interface, my inter-broker comms are plaintext.
The documentation around ACLs for the cluster itself is somewhat "woolly" at best, so I wondered how best to approach this issue.
Upvotes: 0
Views: 2029
Reputation: 26885
It looks like you are missing an ACL with ClusterAction on the Cluster resource for your brokers. This is required to allow them to exchange inter-broker messages.
As your brokers are using plaintext, you probably need to set this ACL on the ANONYMOUS principal.
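For reference, granting that ACL might look roughly like the following. This is only a sketch: it reuses the kafka-acls.sh path and ZooKeeper address from the question, the grant_cluster_action function name and the KAFKA_ACLS override variable are made up for illustration, and User:ANONYMOUS assumes the brokers reach each other over an unauthenticated plaintext listener.

```shell
# Hypothetical helper: grant ClusterAction on the Cluster resource to one principal.
# KAFKA_ACLS and ZK_CONNECT can be overridden; the defaults come from the question.
grant_cluster_action() {
  principal="$1"   # e.g. User:ANONYMOUS for brokers on a plaintext listener (assumption)
  "${KAFKA_ACLS:-/opt/kafka-dq/bin/kafka-acls.sh}" \
    --authorizer-properties "zookeeper.connect=${ZK_CONNECT:-zookeeper-001:2181}" \
    --add --allow-principal "$principal" \
    --operation ClusterAction --cluster
}

# Usage (not run automatically):
#   grant_cluster_action User:ANONYMOUS
```

Note this is a different grant from a topic-level ACL for ANONYMOUS: it targets the Cluster resource, which is what the CLUSTER_AUTHORIZATION_FAILED errors in the controller log point at.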
If you're using only SSL (without SASL), make sure you enable SSL client authentication; otherwise anybody could connect to your cluster and would get ClusterAction permissions, allowing them to cause havoc.
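One quick way to sanity-check this is to confirm the broker config actually demands a client certificate. The sketch below assumes the standard ssl.client.auth broker setting; the requires_client_cert function name and the config path in the usage comment are hypothetical, so adjust them to your install.

```shell
# Hypothetical helper: succeed only if the given broker properties file
# contains ssl.client.auth=required (i.e. clients must present a certificate).
requires_client_cert() {
  grep -Eq '^[[:space:]]*ssl\.client\.auth[[:space:]]*=[[:space:]]*required[[:space:]]*$' "$1"
}

# Usage (path is an assumption, not taken from the question):
#   requires_client_cert /opt/kafka-dq/config/server.properties \
#     && echo "client certificates are required" \
#     || echo "clients are NOT required to authenticate"
```

This only inspects the file on disk; a broker that has not been restarted since the setting changed may still be running with the old value.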
Upvotes: 1