Reputation: 35724
I've got some monthly binary log files that I'd like to send to logstash (or possibly fluentd).
The issue I'm having is that (to the best of my knowledge) the bin files are not readable by logstash, so I would need to do one of the following.
Which of these options is the best way to read a custom bin file into logstash?
1. I've set up a nodejs-based script that can read a binary file and create a readable text version of the document. It can be run as a CLI tool or as an http service, and it can return only the lines after a set line number. Is it possible to integrate this with logstash directly, or indirectly (in a way that would not require me to rewrite the code)?
2. If not, is rewriting the script as a logstash plugin worthwhile?
3. If option 1 would not work, and option 2 would take too much time to implement, I'm considering generating text versions. Because the resulting documents are several GB in size, I'd like to remove the files, or if possible the parts of a file that have already been read. Is there any way to get feedback from logstash as to what has been read already?
P.S. I'm running on Windows Server, if it makes any difference.
Upvotes: 1
Views: 1410
Reputation: 16362
You threw out a lot of details, so hopefully I have them all straight.
If you have an http service, logstash has an http_poller input that can poll it.
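A minimal sketch of that input, assuming your converter is reachable at a hypothetical endpoint like http://localhost:3000/logs and returns JSON:

input {
  http_poller {
    urls => {
      # hypothetical name and endpoint for the node.js service
      binlog_converter => "http://localhost:3000/logs"
    }
    schedule => { "every" => "60s" }
    codec => "json"
  }
}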
I would not recommend writing a plugin for logstash. Things continue to change too rapidly in that ecosystem.
Creating plain text files is the easiest idea from a logstash perspective. Logstash doesn't tell you explicitly that it has processed a file, but you can look it up in the registry (on unix, a file named ".sincedb*", typically in /var/lib/logstash, which contains the inode number and a byte offset into the file) to see if the file has been 100% processed.
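For example, a file input along these lines (the paths are assumptions, adjust for your layout) pins the registry to a known location so you can compare the recorded offset against the file size:

input {
  file {
    # hypothetical directory for the converted text files
    path => "C:/converted-logs/*.txt"
    start_position => "beginning"
    # put the registry somewhere predictable so you can inspect it
    sincedb_path => "C:/logstash/sincedb"
  }
}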
There are lots of other ways to feed input to logstash, including tcp/udp inputs or brokers (rabbit, redis, etc.) which might fit into your workflow.
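As one hedged example, the logstash side of a redis broker setup could look like this (the list key is a made-up name your node.js script would push to):

input {
  redis {
    host => "localhost"
    data_type => "list"
    # hypothetical key name
    key => "binlog-events"
  }
}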
There may be Windows-related caveats to all of this, of course.
Upvotes: 2
Reputation: 2722
The easiest way would be to convert the binary format into JSON and feed that to logstash, either via a file or some other mechanism, primarily because when you throw JSON at logstash the filter configuration is extremely simple:
filter {
  if [type] == "my_json_type" {
    json {
      source => "message"
    }
  }
}
which will break down the JSON document into fields for you, including documents nested in the JSON. I recommend feeding that over a socket rather than files if we are talking large volumes, as logstash out of the box does not support any sort of notice when a file is "done with". So your input definition could look like:
tcp {
  port => 4567
  type => "my_json_type"
}
This will open a listening socket on port 4567 and treat each received line as, well, a line, and the filter above will then process it as a JSON document. Then in your node.js you can dispose of logs that you've already fed to logstash.
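A minimal node.js sketch of that sending side, assuming logstash is listening on localhost:4567 as configured above (the record fields are made up for illustration):

const net = require('net');

// hypothetical record produced by your binary-to-JSON converter
const record = { time: '2015-06-01T00:00:00Z', level: 'info', msg: 'example entry' };

const socket = net.connect(4567, 'localhost', () => {
  // one JSON document per line; the tcp input turns each line into one event
  socket.write(JSON.stringify(record) + '\n');
  socket.end();
});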
Upvotes: 1