Haily

Reputation: 39

What is a data stream in Kafka?

Why do people talk about data streams in the context of Kafka?

With a REST API, for example, I have never heard anything like "data stream".

Can someone explain what the term "data stream" really means in Kafka? I understand that Kafka has a producer and a consumer: the producer sends data to the broker, and the broker delivers it to the consumer.

Upvotes: 1

Views: 223

Answers (1)

Mykhailo Skliar

Reputation: 1367

A data stream in Kafka is an endless (unbounded) sequence of data.

For example, if you want to return all Twitter messages from yesterday, that is a collection with a fixed (although very large) number of items.

But if you want to return all Twitter messages since 1 January 2021, that is an endless stream of data, because you didn't specify the end of the period.
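The difference between the two cases can be sketched in a few lines of Python. This is only an illustration of bounded versus unbounded data, not Kafka code; the function names and the `message-N` values are made up for the example.

```python
from itertools import count, islice


def messages_for_range(start_id, end_id):
    """Bounded: both ends are known, so the result is a finite list."""
    return [f"message-{i}" for i in range(start_id, end_id)]


def messages_since(start_id):
    """Unbounded: no end is given, so this generator never finishes on its own."""
    for i in count(start_id):
        yield f"message-{i}"


# A bounded collection can be materialised completely.
bounded = messages_for_range(0, 3)

# An unbounded stream can only be consumed piece by piece;
# here we stop after three items, otherwise the loop would run forever.
first_three = list(islice(messages_since(0), 3))
```

Calling `list(messages_since(0))` without a limit would never return, which is exactly what "endless" means here.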

If you create a Kafka consumer and subscribe to all Twitter messages since 1 January 2021, you get an endless stream that keeps returning new data as it arrives.
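To make the idea of a consumer on an endless stream more concrete, here is a rough in-memory sketch. It does not use the real Kafka client API: `InMemoryTopic` and `SimpleConsumer` are hypothetical stand-ins that only imitate the general pattern of an append-only log and a consumer that remembers its position (offset).

```python
class InMemoryTopic:
    """Hypothetical stand-in for a topic: an append-only list of records."""

    def __init__(self):
        self._log = []

    def append(self, record):
        self._log.append(record)

    def read_from(self, offset, max_records=10):
        # Return up to max_records records starting at the given offset.
        return self._log[offset:offset + max_records]


class SimpleConsumer:
    """Hypothetical consumer that tracks how far it has read."""

    def __init__(self, topic, start_offset=0):
        self._topic = topic
        self.offset = start_offset

    def poll(self):
        # Fetch whatever has been written since the last poll and advance.
        records = self._topic.read_from(self.offset)
        self.offset += len(records)
        return records


topic = InMemoryTopic()
consumer = SimpleConsumer(topic)

topic.append("tweet 1")
topic.append("tweet 2")
first_batch = consumer.poll()   # the two records written so far
empty_batch = consumer.poll()   # nothing new yet, but the stream has not ended

topic.append("tweet 3")
next_batch = consumer.poll()    # only the record written after the last poll
```

An empty poll does not mean the stream is finished, only that nothing new has arrived yet. A real consumer would keep calling `poll()` in a loop for as long as the application runs.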

You can also talk about endless data streams in a reactive REST API, why not? If your Kafka consumer writes all Twitter messages to MongoDB, for example, which is a reactive database, then you can open a WebSocket connection to your reactive REST endpoint, which will continuously push new Twitter messages from MongoDB to your UI.

Instead of MongoDB you can also use Kafka directly, but you may not be able to keep a large amount of data for a long time (this depends on the retention settings of your Kafka cluster).

Unfortunately, most relational databases, such as MySQL or PostgreSQL, are not reactive yet, so you would have to build polling workarounds to get a reactive-like endless stream of data.
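Such a polling workaround might look roughly like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database purely so the example is self-contained; the `messages` table, its columns, and the `poll_new_rows` helper are assumptions made for illustration, and the same idea would apply to MySQL or PostgreSQL with their own drivers.

```python
import sqlite3


def poll_new_rows(conn, last_seen_id):
    """Return rows inserted after last_seen_id and the updated high-water mark."""
    rows = conn.execute(
        "SELECT id, text FROM messages WHERE id > ? ORDER BY id",
        (last_seen_id,),
    ).fetchall()
    if rows:
        last_seen_id = rows[-1][0]  # remember the highest id seen so far
    return rows, last_seen_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY AUTOINCREMENT, text TEXT)"
)

conn.execute("INSERT INTO messages (text) VALUES ('tweet 1')")
conn.execute("INSERT INTO messages (text) VALUES ('tweet 2')")
batch1, last_id = poll_new_rows(conn, 0)        # both existing rows

batch2, last_id = poll_new_rows(conn, last_id)  # nothing new since last poll

conn.execute("INSERT INTO messages (text) VALUES ('tweet 3')")
batch3, last_id = poll_new_rows(conn, last_id)  # only the newly inserted row
```

In a real application `poll_new_rows` would be called in a loop with a short sleep between calls. That gives the consumer something that behaves like an endless stream, at the cost of extra queries and some delay.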

Upvotes: 1
