Steve G

Reputation: 1

Kafka ACLs cause topic replication to fail

Good morning,

A bit of background for you: we are currently putting together a POC to use Apache Kafka as a messaging queue for inbound log data, for post-processing by Elastic Logstash. Currently I have 3 broker nodes configured to point to a single ZooKeeper node. I have a default replication factor of 3 and a minimum ISR of 2 to account for a single node failure (or availability zone failure, in this case). When creating a topic I set a partition count of 10 and a replication factor of 3. Kafka duly goes and creates the topic - happy days! However, because I use SSL on my inbound interface (it will be internet facing), I need to secure the topics so that they are writable only by a certain principal, as follows:

/opt/kafka-dq/bin/kafka-acls.sh --authorizer-properties zookeeper.connect=zookeeper-001:2181 --add --allow-principal User:USER01 --producer --topic 'USER01_openvpn'

When this happens the ISR drops to a single node, and as I have a minimum ISR of 2 the partitions are taken offline, which causes Filebeat (the client end) to start throwing the following errors:

kafka/client.go:242     Kafka publish failed with: circuit breaker is open

The following errors are also seen in the Kafka server logs:

[2018-11-16 09:59:12,736] ERROR [Controller id=3] Received error in LeaderAndIsr response LeaderAndIsrResponse(responses={USER01_openvpn-3=CLUSTER_AUTHORIZATION_FAILED, USER01_openvpn-2=CLUSTER_AUTHORIZATION_FAILED...

[2018-11-16 10:09:46,852] ERROR [Controller id=2 epoch=23] Controller 2 
epoch 23 failed to change state for partition USER01_openvpn-4 from 
OnlinePartition to OnlinePartition (state.change.logger)
kafka.common.StateChangeFailedException: Failed to elect leader for 
partition USER01_openvpn-4 under strategy 
PreferredReplicaPartitionLeaderElectionStrategy

I have attempted to remedy this by adding an ACL for the ANONYMOUS user to all topics, but this actually caused the cluster to break further. For further clarity: whilst I have SSL enabled on the inbound interface, my inter-broker communication is plaintext.

The documentation around ACLs for the cluster itself is somewhat "woolly" at best, so I wondered how best to approach this issue.

Upvotes: 0

Views: 2029

Answers (1)

Mickael Maison

Reputation: 26885

It looks like you are missing an ACL with ClusterAction on the Cluster resource for your brokers. This is required to allow them to exchange inter-broker messages.

As your brokers are using plaintext, you probably need to set this ACL on the ANONYMOUS principal.
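
As a minimal sketch, reusing the kafka-acls.sh path and ZooKeeper address from your question (adjust both if yours differ), granting ClusterAction on the Cluster resource to ANONYMOUS and then listing the cluster ACLs to check the result could look like this:

```shell
# Allow the brokers, which connect to each other over PLAINTEXT and are
# therefore seen as User:ANONYMOUS, to perform ClusterAction on the cluster.
/opt/kafka-dq/bin/kafka-acls.sh \
  --authorizer-properties zookeeper.connect=zookeeper-001:2181 \
  --add --allow-principal User:ANONYMOUS \
  --operation ClusterAction --cluster

# List the ACLs on the cluster resource to confirm the entry was added.
/opt/kafka-dq/bin/kafka-acls.sh \
  --authorizer-properties zookeeper.connect=zookeeper-001:2181 \
  --list --cluster
```

Once the ACL is in place, the CLUSTER_AUTHORIZATION_FAILED errors in the LeaderAndIsr responses should stop and the replicas should be able to rejoin the ISR.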

If you're using only SSL (without SASL), you want to make sure you do SSL authentication, otherwise anybody could connect to your cluster and would get ClusterAction permissions allowing them to cause havoc.
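
A sketch of the relevant broker setting, assuming it goes into each broker's server.properties and that your clients (Filebeat here) have certificates signed by a CA in the brokers' truststore:

```properties
# Require every client connecting over SSL to present a valid certificate.
# The authenticated principal is then derived from the certificate's
# distinguished name rather than being User:ANONYMOUS.
ssl.client.auth=required
```

With this in place, clients on the SSL listener no longer map to ANONYMOUS, so the ClusterAction ACL above only applies to connections on the plaintext listener. That listener should therefore only be reachable by the brokers themselves.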

Upvotes: 1
