Prashant Bhardwaj

Reputation: 176

Why Kafka Streams creates topics for aggregations and joins

I recently created my first Kafka Streams application for learning. I used spring-cloud-stream-kafka-binding. It is a simple eCommerce system in which I read a topic called products, which gets a new entry whenever new stock of a product comes in. I aggregate the quantities to get the total quantity of each product.

I had two choices -

  1. Send the aggregated details (the KTable) to another Kafka topic called aggregated-products
  2. Materialize the aggregated data

I opted for the second option, and I found that the application created a Kafka topic by itself; when I consumed messages from that topic, I got the aggregated messages.

// productStream is the KStream<String, Product> read from the products topic (variable name assumed)
productStream
        .peek((k, v) -> LOGGER.info("Received product with key [{}] and value [{}]", k, v))
        .groupByKey()
        .aggregate(Product::new,
                (key, value, aggregate) -> aggregate.process(value),
                Materialized.<String, Product, KeyValueStore<Bytes, byte[]>>as(PRODUCT_AGGREGATE_STATE_STORE)
                        .withValueSerde(productEventSerde)
                        // .withKeySerde(keySerde) is not needed because keySerde is configured in application.properties
        );
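
For comparison, option 1 would have looked roughly like this sketch (not my actual code; the aggregatedTable variable name and the Produced serdes are assumptions):

// Hypothetical sketch of option 1: stream the aggregated KTable out to another topic.
// "aggregatedTable" stands for the KTable<String, Product> returned by aggregate() above.
aggregatedTable
        .toStream()
        .to("aggregated-products", Produced.with(Serdes.String(), productEventSerde));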

Using InteractiveQueryService, I am able to access this state store in my application to find out the total quantity available for a product.
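
The lookup looks roughly like this (a sketch only; the class name, bean wiring, and the Product accessor are assumptions):

@Component
public class ProductQuantityQuery {

    @Autowired
    private InteractiveQueryService interactiveQueryService; // provided by the Kafka Streams binder

    public Product currentAggregate(String productKey) {
        // The store name must match the one used in Materialized.as(...) above.
        ReadOnlyKeyValueStore<String, Product> store =
                interactiveQueryService.getQueryableStore(
                        PRODUCT_AGGREGATE_STATE_STORE, QueryableStoreTypes.keyValueStore());
        return store.get(productKey); // null if no stock has been recorded for this key
    }
}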

Now I have a few questions -

  1. Why did the application create a new Kafka topic?
  2. If the answer is "to store the aggregated data", how is that different from option 1, in which I could have sent the aggregated data myself?
  3. Where does RocksDB come into the picture?

The code of my application (which does more than what I explained here) can be accessed from this link -

https://github.com/prashantbhardwaj/kafka-stream-example/blob/master/src/main/java/com/appcloid/kafka/stream/example/config/SpringStreamBinderTopologyBuilderConfig.java

Upvotes: 0

Views: 891

Answers (1)

Lucas Brutschy

Reputation: 730

The internal topics you are seeing are called changelog topics and are used for fault tolerance. The state of the aggregation is stored both locally on disk, using RocksDB, and on the Kafka brokers in the form of a changelog topic, which is essentially a "backup": every update applied to the local RocksDB store is also written to the changelog topic.

If a task is moved to a new machine, or the local state is lost for some other reason, Kafka Streams can restore the local state by reading all changes from the changelog topic and applying them to a fresh RocksDB instance. After restoration has finished (i.e. the whole changelog topic has been processed), the new machine holds the same state and can continue processing where the old one stopped. There are a lot of intricate details to this (e.g. with the default settings, the state can be updated twice for the same input record when failures happen).
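
The name of the changelog topic follows the pattern <application.id>-<store name>-changelog. A minimal sketch of what that means for the store in the question (the application id and store name values below are just placeholders):

// Kafka Streams derives the changelog topic name from the application id and the store name:
// "<application.id>-<store name>-changelog"
String applicationId = "spring-stream-example";      // placeholder; whatever application.id the app uses
String storeName = "product-aggregate-state-store";  // placeholder; the value of PRODUCT_AGGREGATE_STATE_STORE
String changelogTopic = applicationId + "-" + storeName + "-changelog";
// e.g. "spring-stream-example-product-aggregate-state-store-changelog"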

See also https://developer.confluent.io/learn-kafka/kafka-streams/stateful-fault-tolerance/

Upvotes: 2
