Reputation: 931
Whenever I logstash restarts it start fetching database records from scratch, here is my .conf
file which I am executing
input {
jdbc {
# Postgres jdbc connection string to our database, mydb
jdbc_connection_string => "jdbc:postgresql://**************:5432/******"
# The user we wish to execute our statement as
jdbc_user => "*******"
jdbc_password => "******"
# The path to our downloaded jdbc driver
jdbc_driver_library => "/usr/share/logstash/lib/postgresql-42.2.9.jar"
# The name of the driver class for Postgresql
jdbc_driver_class => "org.postgresql.Driver"
jdbc_paging_enabled => true
# the column which should be stored to .logstash_jdbc_last_run
tracking_column => "unix_ts_in_secs"
use_column_value => true
tracking_column_type => "numeric"
schedule => "*/5 * * * * *"
jdbc_page_size => 500
# sql_last_value is stored in this file so each pipeline should have different
# file name so it doesn't get mixed
last_run_metadata_path => "/etc/logstash/conf.d/lastrun/.logstash_jdbc_last_run_alerts"
# our query
statement_filepath => "/etc/logstash/db-queries/alerts-query.sql"
}
}
filter {
mutate {
remove_field => ["unix_ts_in_secs"]
}
}
output {
elasticsearch {
index => "typec"
document_id => "%{id}"
hosts => ["************"]
user => "******"
password => "******"
}
}
File: alerts-query.sql
SELECT *
FROM alerts as a
WHERE (
extract(epoch FROM a."updated_at") > :sql_last_value
AND a."updated_at" < NOW()
)
ORDER BY a."updated_at" ASC
Is there any way to persist the :sql_last_value
and even if logstash restart it starts from the previously stored :sql_last_value
instead starting from 0
?
Upvotes: 0
Views: 4454
Reputation: 56
Another direction that may be useful for you - Logstash should create a file under <home_folder>/.logstash_jdbc_last_run
to persist this value between runs. Was it created? If so analyzing it could help.
Upvotes: 1
Reputation: 2908
The problem is the use_column_value setting.
Description from the docs:
When set to true, uses the defined tracking_column value as the :sql_last_value. When set to false, :sql_last_value reflects the last time the query was executed.
This means that :sql_last_value is not referring to the value in last_run_metadata_path but your tracking_column-value. When you restart Logstash this value is gone/unknown before the first execution. Hence the value defaults to 0 and Logstash fetches all records.
So if you want that the :sql_last_value survives restarts you should only rely on last_run_metadata_path and remove the use_column_value setting.
Upvotes: 4