Reputation: 5236
I have a PostgreSQL 10 database with a table; about 7,000 new rows arrive in the table every hour. In Logstash 6.4 I have the following .conf file, which creates an index in Elasticsearch.
.conf:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://@host:@port/@database"
    jdbc_user => "@username"
    jdbc_password => "@password"
    jdbc_driver_library => "C:\postgresql-42.2.5.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * from table_name"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "table_name"
  }
}
Questions:
1. How do I update the existing index with new data that appears in the table?
2. What is the maximum amount of data an index can store? Could there be an overflow?
Upvotes: 0
Views: 1554
Reputation: 3018
How do I update the existing index with new data that appears in the table?
The table_name index is automatically updated with new entries added to your database table (provided the jdbc input runs on a schedule). However, if any existing entries are updated in the database table, they are added to the index as new documents with new document ids. If you would instead like the existing documents in ES to be updated, use a column that has unique values and assign it as the document id. That way, when an existing entry in the database is updated, the corresponding document in ES is overwritten with the latest values.
Use document_id => "%{column_name_with_unique_values}" in the output configuration.
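For example, here is a sketch of the question's .conf with that change applied, assuming the table has a unique column named id (a hypothetical name, substitute your own); a schedule is also added, since without one the jdbc input runs the statement only once and then exits:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://@host:@port/@database"
    jdbc_user => "@username"
    jdbc_password => "@password"
    jdbc_driver_library => "C:\postgresql-42.2.5.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * from table_name"
    schedule => "0 * * * *"       # cron syntax: re-run the query at the top of every hour
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "table_name"
    document_id => "%{id}"        # stable id: an updated row overwrites its existing document
  }
}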
What is the maximum amount of data an index can store? Could there be an overflow?
It really depends on your resources. However, for optimal performance it is recommended to keep your shard size between 20 and 40 GB. If your index has 5 primary shards, you can store about 200 GB of data in a single index (5 × 40 GB). Above that, consider storing the data in a new index. Ideally, use time series indices, such as daily or monthly ones, so the index becomes easier to maintain, for example to archive, back up, and then purge.
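If you go with time series indices, the elasticsearch output can derive the index name from the event timestamp via a date pattern, for example (a sketch reusing the hosts from the question):
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "table_name-%{+YYYY.MM}"   # one index per month, e.g. table_name-2018.11
  }
}
Note that events from the jdbc input get @timestamp set at ingestion time, so each row is routed to the index for the month in which it was ingested.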
Upvotes: 2