Reputation: 5890
I'm trying to process an XML file into ES with Logstash, but after multiple attempts it's still not working. I'd highly appreciate your kind help. The config file is as follows:
input {
  file {
    path => "/data/logstashtest/*.xml"
    start_position => "beginning"
  }
}
filter {
  multiline {
    pattern => "^\s|</report>|^[A-Za-z].*"
    what => "previous"
  }
  xml {
    store_xml => "false"
    source => "message"
    xpath => [
      "/report/@logtype", "logtype",
      "/report/result/@name", "name",
      "/report/result/@start-epoch", "start-epoch",
      "/report/result/@generated-at", "generated-at"
    ]
  }
  date {
    match => [ "generated-at", "ISO8601" ]
  }
}
output {
  elasticsearch {
    protocol => "http"
    host => "localhost"
    port => 9200
    cluster => "mycluster"
    index => "mylog"
  }
  stdout { codec => rubydebug }
}
The XML source file is as follows:
<report reportname="" logtype="news">
<result name="financial news" logtype="news" start-epoch="1433134800" end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/>
</report>
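For reference, the attributes that the xpath settings target can be checked against this sample with a short stdlib script. This is just an illustration: Python's built-in ElementTree has no attribute-XPath support, so element.get() stands in for expressions like /report/@logtype.

```python
# Sketch: extract the same attributes the Logstash xpath config targets,
# using only Python's stdlib. ElementTree cannot evaluate attribute XPath
# (e.g. /report/@logtype), so element.get() is used instead.
import xml.etree.ElementTree as ET

doc = """<report reportname="" logtype="news">
<result name="financial news" logtype="news" start-epoch="1433134800" end-epoch="1433149199" generated-at="2015/06/01 04:10:17"/>
</report>"""

root = ET.fromstring(doc)          # the <report> element
result = root.find("result")       # the <result> element

print(root.get("logtype"))         # news
print(result.get("name"))          # financial news
print(result.get("start-epoch"))   # 1433134800
print(result.get("generated-at"))  # 2015/06/01 04:10:17
```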
Logstash runs on the same node as one of the ES nodes. I started it with the following command:
bin/logstash -f threatlog.conf
The output was:
[2015-09-09 17:55:29.811] WARN -- Concurrent: [DEPRECATED] Java 7 is deprecated, please use Java 8.
Java 7 support is only best effort, it may not work. It will be removed in next release (1.0).
Logstash startup completed
When I check the ES index, it is empty. I'm using Logstash 1.5.4. Thanks in advance!
Upvotes: 0
Views: 432
Reputation: 217554
The reason you see this is that Logstash keeps track of how far into each file it has already processed. The first time you launched Logstash you probably saw some output, and then none anymore. To keep starting over from the beginning until you get the config right, you need to set sincedb_path
to /dev/null
so Logstash doesn't persist its position in your XML files.
So change your input filter to this:
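To see why /dev/null helps, here is a minimal sketch of sincedb-style offset tracking (an illustration, not Logstash's actual code): once the stored offset reaches the end of the file, re-running the reader yields nothing new, which is exactly what happens when you restart Logstash on an already-processed file.

```python
# Minimal sketch of sincedb-style offset tracking (illustration only,
# not Logstash's implementation). The stored offset survives restarts,
# so a second run over the same file returns nothing.
import os
import tempfile

SINCEDB = os.path.join(tempfile.gettempdir(), "sincedb_demo")

def read_new_lines(path):
    """Read only the lines appended since the last call, persisting the offset."""
    offset = 0
    if os.path.exists(SINCEDB):
        with open(SINCEDB) as f:
            offset = int(f.read() or 0)
    with open(path) as f:
        f.seek(offset)           # skip everything already processed
        lines = f.readlines()
        new_offset = f.tell()
    with open(SINCEDB, "w") as f:
        f.write(str(new_offset)) # remember how far we got
    return lines
```

Pointing the "sincedb" at /dev/null is equivalent to never remembering the offset, so every run starts from the beginning again.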
input {
  file {
    path => "/data/logstashtest/*.xml"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
Then there's also a problem with your date filter: its pattern doesn't match the actual date format, so you'll get an error like the following one:
Failed parsing date from field {:field=>"generated-at", :value=>"2015/06/01 04:10:17", :exception=>"Invalid format: \"2015/06/01 04:10:17\" is malformed at \"/06/01 04:10:17\"", :config_parsers=>"ISO8601", :config_locale=>"default=fr_FR", :level=>:warn}
So in order to fix this, you simply need to change your date filter to use the correct date format:
date {
  match => [ "generated-at", "yyyy/MM/dd HH:mm:ss" ]
}
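As a quick sanity check of the format: Logstash's date filter takes Joda-Time patterns, and "yyyy/MM/dd HH:mm:ss" corresponds to strptime's "%Y/%m/%d %H:%M:%S". The sample field values can be verified like so (the epoch attributes are plain Unix timestamps):

```python
# Sanity-check the date pattern against the sample field values.
# Logstash's date filter uses Joda-Time patterns; the strptime
# equivalent of "yyyy/MM/dd HH:mm:ss" is "%Y/%m/%d %H:%M:%S".
from datetime import datetime, timezone

generated_at = datetime.strptime("2015/06/01 04:10:17", "%Y/%m/%d %H:%M:%S")
print(generated_at.isoformat())  # 2015-06-01T04:10:17

# start-epoch is a Unix timestamp (seconds since the epoch, UTC):
start = datetime.fromtimestamp(1433134800, tz=timezone.utc)
print(start.isoformat())  # 2015-06-01T05:00:00+00:00
```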
After that, you'll get a nice and properly formatted Logstash event:
{
"message" => "<report reportname=\"\" logtype=\"news\">\n <result name=\"financial news\" logtype=\"news\" start-epoch=\"1433134800\" end-epoch=\"1433149199\" generated-at=\"2015/06/01 04:10:17\"/>\n</report>",
"@version" => "1",
"@timestamp" => "2015-06-01T02:10:17.000Z",
"host" => "localhost",
"path" => "/data/text.xml",
"tags" => [
[0] "multiline"
],
"logtype" => [
[0] "news"
],
"name" => [
[0] "financial news"
],
"start-epoch" => [
[0] "1433134800"
],
"generated-at" => [
[0] "2015/06/01 04:10:17"
]
}
Upvotes: 2