Reputation: 4245
I am trying to save data to my database, InfluxDB, as it arrives, in a streaming fashion and with as little delay as possible. Currently I save it in batches.
Current setup - interval based
Currently I have an Airflow instance that reads the data from a REST API every 5 minutes and then saves it to InfluxDB.
Desired setup - continuous
Instead of saving data every 5 minutes, I would like to establish a connection via a WebSocket (I guess) and save the data as it arrives. I have never done this before and I am confused about how it is actually done. Some questions I have are:
As you can see, I am quite lost on where to start, what to read, and how to make sense of all this. Any advice on direction and/or guidance for a setup would be much appreciated.
Upvotes: 1
Views: 1242
Reputation: 338
If I understand correctly, you want to write a Java service that processes the incoming data. One solution is to implement a WebSocket endpoint, for example with Jetty.
From there you receive the data, for example in JSON format, and process it with the influxdb-java library, which you use to write to the database. influxdb-java lets you create and manage the data.
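To make the flow concrete, here is a minimal sketch of "receive a message, turn it into a point, write it to InfluxDB". To keep it dependency-free it uses the JDK's built-in WebSocket client (Java 11+) and the InfluxDB 1.x HTTP `/write` endpoint with line protocol, rather than Jetty and influxdb-java. Everything specific is an assumption for illustration: the feed URL, the database name `mydb`, the measurement name `temperature`, and above all the message format (a plain `sensor,value` string; a real JSON feed would need a JSON library to parse).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.WebSocket;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.CountDownLatch;

public class StreamToInflux {

    // Hypothetical endpoints; replace with your own feed and database.
    static final String FEED_URL = "wss://example.com/feed";
    static final String INFLUX_WRITE_URL =
            "http://localhost:8086/write?db=mydb&precision=ms";

    /** Escapes the characters that are special inside a line-protocol tag value. */
    static String escapeTag(String s) {
        return s.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ");
    }

    /** Builds one line-protocol record: measurement,tag=value field=value timestamp */
    static String toLineProtocol(String measurement, String sensor,
                                 double value, long timestampMs) {
        return measurement + ",sensor=" + escapeTag(sensor)
                + " value=" + value + " " + timestampMs;
    }

    /** Parses the assumed "sensor,value" message format into a line-protocol record. */
    static String messageToLine(String message, long timestampMs) {
        String[] parts = message.split(",", 2);
        double value = Double.parseDouble(parts[1].trim());
        return toLineProtocol("temperature", parts[0].trim(), value, timestampMs);
    }

    /** Writes every complete text message to InfluxDB as soon as it arrives. */
    static class InfluxListener implements WebSocket.Listener {
        private final HttpClient http;
        private final StringBuilder buffer = new StringBuilder();

        InfluxListener(HttpClient http) {
            this.http = http;
        }

        @Override
        public CompletionStage<?> onText(WebSocket ws, CharSequence data, boolean last) {
            buffer.append(data);            // a message may arrive in several fragments
            if (last) {
                String message = buffer.toString();
                buffer.setLength(0);
                String line = messageToLine(message, System.currentTimeMillis());
                HttpRequest request = HttpRequest.newBuilder(URI.create(INFLUX_WRITE_URL))
                        .POST(HttpRequest.BodyPublishers.ofString(line))
                        .build();
                // Fire-and-forget write; a real service should check the status and retry.
                http.sendAsync(request, HttpResponse.BodyHandlers.discarding());
            }
            ws.request(1);                  // ask for the next message
            return null;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        client.newWebSocketBuilder()
                .buildAsync(URI.create(FEED_URL), new InfluxListener(client))
                .join();
        new CountDownLatch(1).await();      // keep the JVM alive while messages stream in
    }
}
```

Writing one point per HTTP request keeps the delay low but can become expensive at high message rates; buffering a few hundred milliseconds of points and sending them in one request is a common compromise.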
I don't know Airflow or how you produce the data, so maybe there are built-in tools (InfluxDB sinks) that can save you some work in your context.
I hope this gives you some guidelines to start digging further.
Upvotes: 1