I have Filebeat, Logstash, Elasticsearch and Kibana. Filebeat is on a separate server and is supposed to receive data in different formats (syslog, JSON, from a database, etc.) and send it to Logstash.
I know how to set up Logstash to handle a single format, but since there are multiple data formats, how would I configure Logstash to handle each one properly?
In fact, how can I set up both Logstash and Filebeat so that data in all these formats is sent from Filebeat and received by Logstash properly? I mean the config settings that handle sending and receiving data.
Upvotes: 2
Views: 5855
Reputation: 698
This works with version 7.5.1.
filebeat-multifile.yml (Filebeat is installed on one machine):
filebeat.inputs:
- type: log
  tags: ["gunicorn"]
  paths:
    - "/home/hduser/Data/gunicorn-100.log"
- type: log
  tags: ["apache"]
  paths:
    - "/home/hduser/Data/apache-access-100.log"
output.logstash:
  hosts: ["0.0.0.0:5044"]  # replace with the target Logstash IP
gunicorn-apache-log.conf (Logstash is installed on another machine):
input {
  beats {
    port => "5044"
    host => "0.0.0.0"
  }
}
filter {
  if "gunicorn" in [tags] {
    grok {
      match => { "message" => "%{USERNAME:u1} %{USERNAME:u2} \[%{HTTPDATE:http_date}\] \"%{DATA:http_verb} %{URIPATHPARAM:api} %{DATA:http_version}\" %{NUMBER:status_code} %{NUMBER:byte} \"%{DATA:external_api}\" \"%{GREEDYDATA:android_client}\"" }
      remove_field => "message"
    }
  }
  else if "apache" in [tags] {
    grok {
      match => { "message" => "%{IPORHOST:client_ip} %{DATA:u1} %{DATA:u2} \[%{HTTPDATE:http_date}\] \"%{WORD:http_method} %{URIPATHPARAM:api} %{DATA:http_version}\" %{NUMBER:status_code} %{NUMBER:byte} \"%{DATA:external_api}\" \"%{GREEDYDATA:gd}\" \"%{DATA:u3}\"" }
      remove_field => "message"
    }
  }
}
output {
  if "gunicorn" in [tags] {
    stdout { codec => rubydebug }
    elasticsearch {
      hosts => [...]
      index => "gunicorn-index"
    }
  }
  else if "apache" in [tags] {
    stdout { codec => rubydebug }
    elasticsearch {
      hosts => [...]
      index => "apache-index"
    }
  }
}
Run Filebeat from the binary. First give the config file the proper ownership and permissions:
sudo chown root:root filebeat-multifile.yml
sudo chmod go-w filebeat-multifile.yml
sudo ./filebeat -e -c filebeat-multifile.yml -d "publish"
Run Logstash from the binary:
./bin/logstash -f gunicorn-apache-log.conf
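Since the question also mentions JSON and syslog, the same tag-based branching could be extended to those formats. This is only a sketch: the tags "json" and "syslog" are hypothetical and would have to be set on the corresponding Filebeat inputs the same way "gunicorn" and "apache" are set above, and the stock SYSLOGLINE grok pattern may need adjusting for your syslog flavor.

```logstash
filter {
  if "json" in [tags] {
    # Parse the JSON text held in "message" into top-level event fields.
    json {
      source => "message"
    }
  }
  else if "syslog" in [tags] {
    # SYSLOGLINE is one of the grok patterns bundled with Logstash.
    grok {
      match => { "message" => "%{SYSLOGLINE}" }
    }
  }
}
```

By default, events that fail to parse get the tag _jsonparsefailure or _grokparsefailure, which makes it easier to spot a log that was routed to the wrong branch.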
Upvotes: 1
Reputation: 1366
If the "data formats" in your question are codecs, this has to be configured in the Logstash input. The following applies to Filebeat 1.x and Logstash 2.x, not the Elastic Stack 5.x. In our setup, we have two beats inputs; the first uses the default codec, "plain":
beats {
  port => 5043
}
beats {
  port => 5044
  codec => "json"
}
On the Filebeat side, we need two Filebeat instances, each sending its output to its respective port. It's not possible to tell Filebeat "route this prospector to that output".
Logstash documentation: https://www.elastic.co/guide/en/logstash/2.4/plugins-inputs-beats.html
Remark: If you ship with different protocols, e.g. the legacy logstash-forwarder / lumberjack, you need more ports.
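If later filters or outputs need to know which port an event arrived on, one option is to tag events at the input. A rough sketch, assuming the same two ports as above; the tag names plain_input and json_input are made up for illustration:

```logstash
input {
  beats {
    port => 5043
    # "tags" is a common input option; every event from this port gets the tag.
    tags => ["plain_input"]
  }
  beats {
    port => 5044
    codec => "json"
    tags => ["json_input"]
  }
}
filter {
  if "json_input" in [tags] {
    # processing for events that arrived on the JSON port
  }
}
```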
Upvotes: 1
Reputation: 4089
To separate different types of inputs within the Logstash pipeline, use the type field, and tags for further identification.
In your Filebeat configuration, you should use a different prospector for each data format. Each prospector can then be given a different document_type field.
For example:
filebeat:
  # List of prospectors to fetch data.
  prospectors:
    # Each - is a prospector. Below are the prospector specific configurations
    -
      # Paths that should be crawled and fetched. Glob based paths.
      # For each file found under this path, a harvester is started.
      paths:
        - "/var/log/apache/httpd-*.log"
      # Type to be published in the 'type' field. For Elasticsearch output,
      # the type defines the document type these entries should be stored
      # in. Default: log
      document_type: apache
    -
      paths:
        - /var/log/messages
        - "/var/log/*.log"
      document_type: log_message
In the above example, logs from /var/log/apache/httpd-*.log will have document_type: apache, while the other prospector has document_type: log_message.
This document_type field becomes the type field when Logstash is processing the event. You can then use if statements in Logstash to do different processing on different types.
For example:
filter {
  if [type] == "apache" {
    # apache specific processing
  }
  else if [type] == "log_message" {
    # log_message processing
  }
}
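Note that document_type only exists in older Filebeat releases (as far as I know it was removed in 6.0). On newer versions a similar effect can be had with a custom field set through Filebeat's fields option, which by default shows up in Logstash nested under [fields]. A sketch of the Logstash side, assuming a hypothetical custom field called log_type:

```logstash
filter {
  # log_type is an illustrative name; it must match the key set under
  # "fields" in the Filebeat input.
  if [fields][log_type] == "apache" {
    # apache specific processing
  }
  else if [fields][log_type] == "log_message" {
    # log_message processing
  }
}
```

If fields_under_root: true is set in Filebeat, the custom field sits at the top level of the event and the condition becomes [log_type] instead.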
Upvotes: 4