Reputation: 3231
So I'm trying to wrap my head around document_type vs document_id when using the JDBC importer from Logstash and exporting to Elasticsearch.
I finally wrapped my head around indexes. But let's pretend I'm pulling from a big table of sensor data (weather-related readings like temperature and humidity), where each row has a sensor ID and the time it was recorded.
And I want to keep polling the database every so often.
What would document_type vs document_id be in this instance? This is all going to be stored (or whatever you want to call it) against one index.
The document_type vs document_id distinction confuses me, especially in regard to the JDBC importer.
If I set document_id to, say, my primary key, won't it get overwritten each time? So I'll just have one document of data each time? (Which seems pointless.)
Upvotes: 0
Views: 718
Reputation: 1715
The jdbc input plugin creates one JSON document per row, with one field for each column. So, to keep consistent with your example, a row of your sensor data would be imported as a document that looks like this:
{
  "sensor_id": 567,
  "temp": 90,
  "humidity": 6,
  "timestamp": "{time}",
  "@timestamp": "{time}" // auto-created field, the time Logstash received the document
}
You were right when you said that if you set document_id to your primary key, it would get overwritten: each poll would re-index every row onto the same ID, so you'd keep exactly one document per row rather than a growing history. You can disregard document_id unless you want to update existing documents in Elasticsearch, which I don't imagine you would want to do with this type of data. Let Elasticsearch generate the document ID for you.
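For illustration, here's a sketch of the scenario the question describes: pointing document_id at the primary-key column in the elasticsearch output (the hosts and index values here are made up). With this config, every poll overwrites each row's document in place instead of accumulating new ones:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "sensors"
    # sprintf reference to the row's primary-key column;
    # each poll re-indexes (overwrites) the document with this ID
    document_id => "%{sensor_id}"
  }
}
```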
Now let's talk about document_type. If you want to set the document type, you need to set the type field in Logstash to some value (which will propagate into Elasticsearch). The type field in Elasticsearch is used to group similar documents. If all of the rows in the table you're importing with the jdbc plugin are of the same type (they should be!), you can set type in the jdbc input like this:
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    schedule => "* * * * *"
    statement => "SELECT * from songs where artist = :favorite_artist"
    ...
    type => "weather"
  }
}
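To complete the picture, here's a sketch of an output block to go with that input (this block is my own illustration, not from the question; the hosts and index values are assumptions). The type field set above travels with the event, and the elasticsearch output uses it as the document's _type by default, though you can also reference it explicitly:

```
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my_index"
    # optional: the event's type field is used as _type by default,
    # but you can set it explicitly with a sprintf reference
    document_type => "%{type}"
  }
}
```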
Now, in Elasticsearch you can take advantage of the type field by setting a mapping for that type. For example, you might want:
PUT my_index
{
  "mappings": {
    "weather": {
      "_all": { "enabled": false },
      "properties": {
        "sensor_id": { "type": "integer" },
        "temp": { "type": "integer" },
        "humidity": { "type": "integer" },
        "timestamp": { "type": "date" }
      }
    }
  }
}
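Once the mapping is in place, you can scope searches to that type. A hypothetical example (the threshold value is made up) that finds hot readings:

```
GET my_index/weather/_search
{
  "query": {
    "range": {
      "temp": { "gte": 85 }
    }
  }
}
```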
Hope this helps! :)
Upvotes: 1