Reputation: 27
Recently I deployed ELK and started forwarding logs from nginx through logstash-forwarder.
The problem is that in Elasticsearch (1.4.2) / Kibana (4) the "bytes" value of the request is mapped as a string.
I use the standard configuration found everywhere.
Into the logstash patterns I added new patterns for nginx logs (two NGINXACCESS variants, one with upstream_time and one without):
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:http_host} %{IPORHOST:clientip} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time:float} %{NUMBER:upstream_time:float}
NGINXACCESS %{IPORHOST:http_host} %{IPORHOST:clientip} \[%{HTTPDATE:timestamp}\] \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent} %{NUMBER:request_time:float}
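For reference, a log_format that produces matching lines would look roughly like this (the format name "logstash" is arbitrary, just an illustration):
log_format logstash '$http_host $remote_addr [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" $request_time $upstream_response_time';
The second NGINXACCESS variant simply drops the trailing $upstream_response_time field.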
Then I added this config for logstash:
input {
  lumberjack {
    port => 5000
    type => "logs"
    ssl_certificate => "/etc/logstash/tls/certs/logstash-forwarder.crt"
    ssl_key => "/etc/logstash/tls/private/logstash-forwarder.key"
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  } else if [type] == "nginx" {
    grok {
      match => { "message" => "%{NGINXACCESS}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ]
    }
    geoip {
      source => "clientip"
    }
  }
}

output {
  elasticsearch_http {
    host => localhost
  }
}
But in Elasticsearch I still see it as a string, even when I define "bytes" as long:
(?:%{NUMBER:bytes:long}|-)
Does anybody know how to store "bytes" as a number type?
Thanks
Upvotes: 1
Views: 783
Reputation: 11571
You're on the right track with (?:%{NUMBER:bytes:long}|-), but "long" isn't a valid data type. Quoting the grok documentation (emphasis mine):

Optionally you can add a data type conversion to your grok pattern. By default all semantics are saved as strings. If you wish to convert a semantic's data type, for example change a string to an integer then suffix it with the target data type. For example %{NUMBER:num:int} which converts the num semantic from a string to an integer. Currently the only supported conversions are int and float.
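So, sticking to the supported conversions, the fragment in your pattern would become:

(?:%{NUMBER:bytes:int}|-)

With that change grok emits bytes as a JSON number instead of a string.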
Note that this doesn't control the data type that's actually used in the indexing on the Elasticsearch side, only the data type of the JSON document that's sent to Elasticsearch (which may or may not affect which mapping ES uses). In the JSON context there's no difference between ints and longs; scalar values are either numbers, bools, or strings.
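If you want to guarantee the type on the Elasticsearch side, pin the mapping yourself with an index template. Here's a minimal sketch, assuming the default logstash-* index naming; the template name nginx_bytes is just an example:

curl -XPUT 'http://localhost:9200/_template/nginx_bytes' -d '{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "bytes": { "type": "long" }
      }
    }
  }
}'

Indices created after the template is installed will map bytes as a long; by default ES will also coerce numeric strings into that field.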
Upvotes: 1