Reputation: 11741
I am getting a lot of the following errors when running Logstash to index documents into Elasticsearch:
[2019-11-02T18:48:13,812][WARN ][logstash.outputs.elasticsearch] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"my-index-2019-09-28", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x729fc561>], :response=>{"index"=>{"_index"=>"my-index-2019-09-28", "_type"=>"doc", "_id"=>"BhlNLm4Ba4O_5bsE_PxF", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [timestamp] of type [date] in document with id 'BhlNLm4Ba4O_5bsE_PxF'", "caused_by"=>{"type"=>"illegal_argument_exception", "reason"=>"Invalid format: \"2019-09-28 23:32:10.586\" is malformed at \" 23:32:10.586\""}}}}}
It clearly has a problem with how the date is formed, but I don't see what that problem could be. Below are excerpts from my Logstash config and the Elasticsearch template. I include these because I'm trying to use the timestamp field to construct the index name in my Logstash config: I copy timestamp into @timestamp, format that to YYYY-MM-dd, and use the stored metadata to build the index name.
Logstash config:
input {
  stdin { type => stdin }
}

filter {
  csv {
    separator => " " # this is a tab (\t), not just whitespace
    columns => ["timestamp", "field1", "field2", ...]
    convert => {
      "timestamp" => "date_time"
      ...
    }
  }
}

filter {
  date {
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss.SSS"]
    target => "@timestamp"
  }
}

filter {
  date_formatter {
    source => "@timestamp"
    target => "[@metadata][date]"
    pattern => "YYYY-MM-dd"
  }
}

filter {
  mutate {
    remove_field => [
      "@timestamp",
      ...
    ]
  }
}

output {
  amazon_es {
    hosts => ["my-es-cluster.us-east-1.es.amazonaws.com"]
    index => "my-index-%{[@metadata][date]}"
    template => "my-config.json"
    template_name => "my-index-*"
    region => "us-east-1"
  }
}
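To verify that [@metadata][date] comes out the way I expect, I can temporarily swap in a stdout output; rubydebug hides @metadata unless asked for it. This is just a debugging sketch, not part of the pipeline above.

output {
  # Temporary debug output; metadata => true makes @metadata visible,
  # so [@metadata][date] can be checked before anything is indexed.
  stdout { codec => rubydebug { metadata => true } }
}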
Template:
{
  "template" : "my-index-*",
  "settings" : {
    "index" : {
      "number_of_shards" : "12",
      "number_of_replicas" : "0"
    }
  },
  "mappings" : {
    "doc" : {
      "dynamic" : "false",
      "properties" : {
        "timestamp" : {
          "type" : "date"
        },
        ...
      }
    }
  }
}
When I inspect the raw data it looks like what the error is showing me, and that appears to be well formed, so I'm not sure what my issue is. Here is an example row; it's been redacted, but the problem field is untouched and is the first one:
2019-09-28 07:29:46.454 NA 2019-09-28 07:29:00 someApp 62847957802 62847957802
Upvotes: 1
Views: 2486
Reputation: 11741
Turns out the source of the problem was the convert block: Logstash is unable to understand the time format specified in the file. To address this I renamed the original timestamp field to unformatted_timestamp and applied the date filter I was already using:
filter {
  date {
    match => ["unformatted_timestamp", "yyyy-MM-dd HH:mm:ss.SSS"]
    target => "timestamp"
  }
}

filter {
  date_formatter {
    source => "timestamp"
    target => "[@metadata][date]"
    pattern => "YYYY-MM-dd"
  }
}
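The csv block then just names the first column unformatted_timestamp and drops the convert entry for it. A minimal sketch of that change (the remaining column names are unchanged from the question):

filter {
  csv {
    separator => " " # still a tab (\t), as before
    # First column renamed so the raw string is left alone;
    # the date filter above parses it into "timestamp" instead.
    columns => ["unformatted_timestamp", "field1", "field2", ...]
  }
}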
Upvotes: 1
Reputation: 7473
You are parsing your lines using the csv filter and setting the separator to a space, but your date is also split by a space. This way your first field, named timestamp, only gets the date 2019-09-28, and the time ends up in the field named field1.
You can solve this by creating a new field, named date_and_time for example, with the contents of the date and time fields:
csv {
  separator => " "
  columns => ["date","time","field1","field2","field3","field4","field5","field6"]
}

mutate {
  add_field => { "date_and_time" => "%{date} %{time}" }
}

mutate {
  remove_field => ["date","time"]
}
This will create a field named date_and_time with the value 2019-09-28 07:29:46.454. You can now use the date filter to parse this value into the @timestamp field, the default for Logstash:
date {
  match => ["date_and_time", "YYYY-MM-dd HH:mm:ss.SSS"]
}
This will leave you with two fields holding the same value, date_and_time and @timestamp. Since @timestamp is the default for Logstash, I would suggest keeping it and removing the date_and_time field that was created before:
mutate {
  remove_field => ["date_and_time"]
}
Now you can create your date-based index using the format YYYY-MM-dd, and Logstash will extract the date from the @timestamp field. Just change the index line in your output to this one:
index => "my-index-%{+YYYY-MM-dd}"
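Putting it all together, the filter section would look something like this. It is only a sketch that consolidates the snippets above and assumes the six redacted columns from the sample row; adjust the names to your real fields:

filter {
  csv {
    # Split on the separator; the date and time land in two columns.
    separator => " "
    columns => ["date","time","field1","field2","field3","field4","field5","field6"]
  }
  mutate {
    # Recombine date and time into a single parseable string.
    add_field => { "date_and_time" => "%{date} %{time}" }
  }
  date {
    # Parse into @timestamp, Logstash's default time field.
    match => ["date_and_time", "YYYY-MM-dd HH:mm:ss.SSS"]
  }
  mutate {
    # Drop the intermediate fields once @timestamp is set.
    remove_field => ["date","time","date_and_time"]
  }
}

With this in place, the %{+YYYY-MM-dd} sprintf format in the index option reads the date directly from @timestamp at output time.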
Upvotes: 0