Reputation: 564
Kafka, as a pub/sub messaging system, has to store data locally and replicate it to avoid losing data when a broker crashes. My idea is to modify Kafka so that it writes data directly to HDFS. Kafka would then no longer need to do its own replication, which would make it simpler. Is this doable?
Upvotes: 0
Views: 540
Reputation: 32130
Doable, maybe. A good idea? Almost certainly not. Kafka itself persists data and manages replication and resilience across multiple nodes for both redundancy and performance. Bringing HDFS into the mix makes no sense at all.
Upvotes: 2
Reputation: 366
If you don't use replication, then when a broker fails you won't be able to send data to the partitions it hosts, and you won't be able to receive any data from them from that point on. Replication isn't just for saving data when a broker crashes; it also ensures the robustness of the system.
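To make the availability point concrete, here is a rough toy model in Python. It is not Kafka code: the `Partition` class and the broker names are made up for illustration, and it assumes the simplest possible failover rule (the first surviving replica becomes leader), ignoring in-sync replica tracking and the controller that real Kafka uses.

```python
from typing import List, Optional, Set


class Partition:
    """Toy partition (hypothetical, not a Kafka API): a list of replica brokers."""

    def __init__(self, replicas: List[str]) -> None:
        self.replicas = list(replicas)          # brokers holding a copy
        self.live: Set[str] = set(replicas)     # brokers currently up

    def fail(self, broker: str) -> None:
        # Mark a broker as crashed; a no-op if it holds no replica.
        self.live.discard(broker)

    def leader(self) -> Optional[str]:
        # Simplifying assumption: the first surviving replica leads.
        for broker in self.replicas:
            if broker in self.live:
                return broker
        return None

    def available(self) -> bool:
        # Clients can only produce or consume while some leader exists.
        return self.leader() is not None


unreplicated = Partition(["broker-1"])
replicated = Partition(["broker-1", "broker-2", "broker-3"])

# The same broker crashes in both setups.
for partition in (unreplicated, replicated):
    partition.fail("broker-1")

print(unreplicated.available())                      # False
print(replicated.available(), replicated.leader())   # True broker-2
```

With a single copy, the partition is unusable as soon as its only broker goes down, even if the bytes are safe somewhere else. With three copies, another broker takes over and clients keep working.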
Upvotes: 0