Reputation: 2890
I have an API endpoint that accepts events with a specific user ID and some other data. I want those events broadcast to several external locations, and I want to explore using Kafka as a solution for that.
I have the following requirements:

1. Events for the same `UserID` should be delivered in order to the external locations.

Initially, from some reading I did, it felt like I want to have `N` consumers, where `N` is the number of external locations I want to broadcast to. That should fulfill requirement (3). I also probably want one producer, my API, that will push events to my Kafka cluster. Requirement (2) should come in automatically with Kafka.
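To make the producer side concrete, here is a minimal sketch using the Java Kafka client, assuming a broker on `localhost:9092` and a topic named `user-events` (both are placeholder names for this example, as are the user ID and payload). The important part is that the user ID is used as the record key, which is what later keeps each user's events in order.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // acks=all: the broker confirms the write before the API considers the event accepted
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String userId = "user-42";                                  // hypothetical user ID
            String payload = "{\"type\":\"click\",\"ts\":1700000000}";  // hypothetical event JSON
            // Keying the record by userId means the default partitioner hashes the key,
            // so all events for one user land on the same partition, in produced order.
            producer.send(new ProducerRecord<>("user-events", userId, payload));
        }
    }
}
```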
I was more confused about how to model the internal Kafka cluster side of things. Again, from the reading I did, it sounds like it's bad practice to have millions of topics, so having a single topic for each `userID` is not an option. The other option I read about is having one partition for each `userID` (let's say `M` partitions). That would allow requirement (1) to happen out of the box, if I understand correctly. But that would also mean I have `M` brokers, is that correct? That also sounds unreasonable.
What would be the best way to fulfill all requirements? As a start, I plan on hosting this with a local Kafka cluster.
Upvotes: 0
Views: 82
Reputation: 191711
You are correct that one topic per user is not ideal.

Partition count is not dependent on broker count (a single broker can host many partitions), so partitioning by `userID` within one topic is a better design.
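As an illustration of that point, here is a sketch using the Java `AdminClient` that creates one topic with 12 partitions on a single-broker local cluster; the topic name and partition count are arbitrary choices for the example. The producer's default partitioner hashes the record key, so keying by `userID` keeps each user's events on one partition without needing a partition (or broker) per user.

```java
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // 12 partitions, replication factor 1 on a single local broker:
            // partition count is a per-topic setting, independent of broker count.
            NewTopic topic = new NewTopic("user-events", 12, (short) 1);
            admin.createTopics(List.of(topic)).all().get();
        }
    }
}
```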
If a single external location is failing, that shouldn't delay delivery to the other locations. That is standard consumer-group behavior (give each external location its own consumer group), not something you solve with topic/partition design.
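For example, a sketch of the consumer side, assuming the same hypothetical `user-events` topic: each external location runs with its own `group.id`, so every location sees the full stream and tracks its own offsets independently, and a slow or failing location only falls behind within its own group.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class LocationConsumer {
    public static void main(String[] args) {
        // One consumer group per external location, e.g. "location-a" (hypothetical name).
        String location = args.length > 0 ? args[0] : "location-a";

        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, location);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("user-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Forward record.value() to this external location;
                    // records within a partition arrive in the order they were produced.
                    System.out.printf("%s -> user=%s event=%s%n", location, record.key(), record.value());
                }
            }
        }
    }
}
```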
Upvotes: 1