Reputation: 1399
Large amounts of data are being pushed into one of our Kafka topics. Is there a way to determine which producer this data is coming from?
Upvotes: 7
Views: 5907
Reputation: 41
You can use record headers and hardcode a producer ID in a header before producing. That's what I'm doing in Node.js with rdkafka; the Java client supports headers as well.
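To illustrate the header idea, here is a minimal Python sketch. It assumes a producer object with a `produce(topic, value=..., headers=...)` method, as in confluent-kafka's `Producer`. The header names `producer-id` and `producer-host` and both helper functions are made up for this example, not any standard.

```python
import socket


def build_source_headers(producer_id, hostname=None):
    """Return record headers that identify the producing application.

    The header names used here are hypothetical; pick whatever your teams agree on.
    """
    host = hostname or socket.gethostname()
    # Header values are sent as bytes on the wire, so encode them explicitly.
    return [
        ("producer-id", producer_id.encode("utf-8")),
        ("producer-host", host.encode("utf-8")),
    ]


def produce_tagged(producer, topic, value, producer_id):
    """Send one record with identifying headers attached.

    `producer` is assumed to behave like confluent_kafka.Producer.
    """
    producer.produce(topic, value=value, headers=build_source_headers(producer_id))
```

This only helps for producers that cooperate; a client that does not set the headers stays anonymous.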
Upvotes: 2
Reputation: 191748
Without SASL or Authorizer-level auditing, no, there is no easy way other than tracking down a suspicious connected client-id via JMX.
I would suggest enforcing a standard message format and spreading the word to producer teams. For example, look at the CloudEvents spec, which includes a source field:
https://github.com/cloudevents/spec/blob/master/kafka-protocol-binding.md
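As a rough sketch of what that could look like: in the binding's binary content mode, CloudEvents attributes travel as Kafka headers prefixed with `ce_`. The helper names below and the `(headers, value)` record shape are assumptions for this example only. The first function builds the required attributes and the second tallies payload bytes per `ce_source` on the consuming side.

```python
import uuid
from collections import Counter


def cloudevent_headers(source, event_type, event_id=None):
    """Build the required CloudEvents attributes as binary-mode Kafka headers."""
    return [
        ("ce_specversion", b"1.0"),
        ("ce_id", (event_id or str(uuid.uuid4())).encode("utf-8")),
        ("ce_source", source.encode("utf-8")),
        ("ce_type", event_type.encode("utf-8")),
    ]


def bytes_by_source(records):
    """Sum payload sizes per ce_source.

    `records` is an iterable of (headers, value) pairs, where headers is a
    list of (key, bytes) tuples or None, as many Kafka clients return them.
    """
    totals = Counter()
    for headers, value in records:
        source = dict(headers or []).get("ce_source", b"<unknown>").decode("utf-8")
        totals[source] += len(value or b"")
    return totals
```

Calling `bytes_by_source(...).most_common(3)` on a sample of consumed records would show which sources account for most of the volume, with untagged records grouped under `<unknown>`.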
Upvotes: 5
Reputation: 9427
You can enable quotas for the clients/users, and then monitor which clients get throttled via two quota-related JMX MBeans - bandwidth and request rate:
Metric: Bandwidth quota metrics per (user, client-id), user or client-id
MBean: kafka.server:type={Produce|Fetch},user=([-.\w]+),client-id=([-.\w]+)
What it shows: Two attributes. throttle-time indicates the amount of time in ms the client was throttled. Ideally = 0. byte-rate indicates the data produce/consume rate of the client in bytes/sec. For (user, client-id) quotas, both user and client-id are specified. If per-client-id quota is applied to the client, user is not specified. If per-user quota is applied, client-id is not specified.
Metric: Request quota metrics per (user, client-id), user or client-id
MBean: kafka.server:type=Request,user=([-.\w]+),client-id=([-.\w]+)
What it shows: Two attributes. throttle-time indicates the amount of time in ms the client was throttled. Ideally = 0. request-time indicates the percentage of time spent in broker network and I/O threads to process requests from client group. For (user, client-id) quotas, both user and client-id are specified. If per-client-id quota is applied to the client, user is not specified. If per-user quota is applied, client-id is not specified.
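Once those MBeans are being scraped, ranking clients by byte-rate is straightforward. The sketch below assumes you have already dumped the metrics into a dict mapping MBean name to attribute values (how you collect them, for example through a JMX exporter, is up to you). The snapshot format and the `top_producers` helper are hypothetical.

```python
import re

# Matches the quota MBean names quoted above; user and client-id are each optional.
QUOTA_MBEAN = re.compile(
    r"^kafka\.server:type=(?P<type>Produce|Fetch|Request)"
    r"(?:,user=(?P<user>[-.\w]+))?"
    r"(?:,client-id=(?P<client_id>[-.\w]+))?$"
)


def top_producers(snapshot, limit=5):
    """Return (byte_rate, throttle_time, user, client_id) rows, heaviest first.

    `snapshot` maps an MBean name to a dict holding 'byte-rate' and 'throttle-time'.
    """
    rows = []
    for name, attrs in snapshot.items():
        match = QUOTA_MBEAN.match(name)
        if not match or match.group("type") != "Produce":
            continue  # only produce-side bandwidth metrics are of interest here
        rows.append((
            attrs.get("byte-rate", 0.0),
            attrs.get("throttle-time", 0.0),
            match.group("user"),
            match.group("client_id"),
        ))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]
```

The client-id at the top of that list is the one to chase, though keep in mind that client-id is set by the client itself and can be shared or left at its default.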
Upvotes: 4