Reputation: 1079
I have a CSV file in which one column may contain multi-line values.
ID,Name,Address
1, ABC, "Line 1
Line 2
Line 3"
As far as I know, the data above is a single record according to the CSV standard.
I have the following filter for Logstash:
filter {
  csv {
    separator => ","
    quote_char => "\""
    columns => ["ID", "Name", "Address"]
  }
}
output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    index => "TestData"
    protocol => "http"
  }
  stdout {}
}
But when I run it, it creates three records, all of them wrong: the first contains ID, Name, and only part of Address, while the next two contain Line 2 and Line 3 but no ID or Name.
How can I fix this? Am I missing something in the file parsing?
Upvotes: 1
Views: 1301
Reputation: 357
Have you tried the multiline codec?
You should add something like this to your input plugin:
codec => multiline {
  pattern => "^[0-9]"
  negate => "true"
  what => "previous"
}
It tells Logstash that every line not starting with a number should be merged with the previous line, so the whole quoted value reaches the csv filter as a single event.
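For context, here is a rough sketch of where the codec could sit, assuming the CSV is read with the file input. The path is a placeholder (the question does not show the input section), and start_position is only there so that an existing file is read from the top.

```
input {
  file {
    # Hypothetical path; replace with the location of your CSV file.
    path => "/path/to/data.csv"
    # Assumption: read the existing file from the beginning rather than tailing it.
    start_position => "beginning"
    codec => multiline {
      # A line starting with a digit begins a new record...
      pattern => "^[0-9]"
      # ...and every line that does NOT match the pattern...
      negate => "true"
      # ...is appended to the previous line.
      what => "previous"
    }
  }
}
```

Note that this relies on continuation lines never starting with a digit. An address line such as "221B Baker Street" would be treated as the start of a new record, so the pattern may need tightening for real data, for example by also matching the comma that follows the ID.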
Upvotes: 3