Reputation: 51
I am new to Apache Kafka and its stream services related to API's, but was wondering if there was any formal documentation on where to obtain the initial raw data required for ingestion?
In essence, I want to try my hand at building a rudimentary crypto trading bot, but was under the impression that http APIs may have more latency than APIs that integrate with Kafka Streams. For example, I know RapidAPI has a library of http APIs that can be accessed that would help pull data, but was unsure if there was something similar if I wanted the data to be ingested through Kafka Streams. I guess I am under the impression that the two data sources will not be similar and will be different in some way, but am also unsure if this is not the case.
I tried digging around on Google, but it's not very clear on what APIs or source data is taken for Kafka Streams, or if they are the same/similar just handled differently.
If you have any insight or documentation that would be greatly appreciated. Also feel free to let me know if my understanding is completely false.
Upvotes: 2
Views: 1093
Reputation: 191733
any formal documentation on where to obtain the initial raw data required for ingestion?
Kafka accepts binary data. You can feed in serialized data from anywhere (although, you are restricted by (configurable) message size limits).
APIs that integrate with Kafka Streams
Kafka Streams is an intra-cluster library, it doesn't integrate with anything but Kafka.
If you want to periodically, poll/fetch an HTTP/1 API, then you would use a regular HTTP client, and a Kafka Producer.
Probably a similar answer with streaming HTTP/2 or websocket, although, still not able to use Kafka Streams, and you'd have to deal with batching records into a Kafka Producer request
You instead should look for Kafka Connect projects on the web that operate with HTTP, or opt for something like Apache NiFi as a broader project with lots of different "processors" like GetHTTP and ProduceKafka.
Once the data is in Kafka, you are welcome to use Kafka Streams/KSQL to do some processing
Upvotes: 1