Reputation: 2396
I set up my river with the following script:
curl -XPUT 'localhost:9200/_river/foo/_meta' -d '{
"type" : "jdbc",
"jdbc" : {
"url" : "jdbc:mysql://...:3306/....",
"user" : "...",
"password" : "...",
"sql" : "SELECT v.id as _id,v.name,v.entrydate, v.link, v.html,v.created AS _created,vc.name AS company, vp.name AS position FROM foo v LEFT JOIN foocompany vc ON vc.id=v.company LEFT JOIN fooposition vp ON vp.id=v.position ",
"fetchsize" : 100,
"bulk_size" : 100,
"max_bulk_requests" : 2,
"bulk_flush_interval" : "30s",
"strategy": "simple",
"poll": "30s",
"autocommit": true
}
}'
After the river has been running for some time, I get an exception, which is probably caused by the configuration of the MySQL server itself:
[2014-11-27 16:54:02,301][ERROR][org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverFlow] com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 10 milliseconds ago. The last packet sent successfully to the server was 52,296 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
java.io.IOException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 10 milliseconds ago. The last packet sent successfully to the server was 52,296 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
at org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.fetch(SimpleRiverSource.java:231)
at org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverFlow.move(SimpleRiverFlow.java:129)
at org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverFlow.run(SimpleRiverFlow.java:88)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet successfully received from the server was 10 milliseconds ago. The last packet sent successfully to the server was 52,296 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1129)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3720)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3609)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4160)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:928)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:2053)
at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:406)
at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:385)
at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:163)
at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7472)
at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:919)
at org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.close(SimpleRiverSource.java:613)
at org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.execute(SimpleRiverSource.java:263)
at org.xbib.elasticsearch.river.jdbc.strategy.simple.SimpleRiverSource.fetch(SimpleRiverSource.java:227)
... 3 more
Caused by: java.io.EOFException: Can not read response from server. Expected to read 4 bytes, read 0 bytes before connection was unexpectedly lost.
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3166)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3620)
... 15 more
The problem here is that reconfiguring MySQL is not an option in my setup, so I have to look for options elsewhere.
Upvotes: 0
Views: 652
Reputation: 7275
I've had many, many headaches with Elastic rivers. Not just the JDBC one, but also custom-written rivers, web crawler rivers, etc.
An important note is that rivers are being deprecated very soon (see "Preferred method of indexing bulk data into ElasticSearch?").
One of the problems I've seen is that the rivers don't always reliably start when Elastic gets restarted. Sometimes the rivers don't start at all, sometimes they do. Very frustrating.
The official recommendation from Elastic is to move the process outside of Elastic and pump the data in.
I've replaced all our JDBC rivers with small C# apps that run as cron jobs on Linux, on the same server as Elastic. This works great, and it's much more reliable and easier to start/restart. Re-creating rivers in Elastic was always a pain for me.
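To give an idea of what such an external job does, here is a minimal sketch in Python (not the C# I actually use). It only shows the part that turns database rows into a request body for the Elasticsearch `_bulk` endpoint. The function names `build_bulk_body` and `chunked`, the index name `foo`, and the sample rows are made up for illustration; fetching the rows from MySQL and sending the body over HTTP are left out. Depending on the Elasticsearch version, the action line may also need a `_type`.

```python
import json


def chunked(rows, size):
    """Yield lists of at most `size` rows, similar in spirit to bulk_size."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_bulk_body(rows, index_name, id_field="_id"):
    """Build a newline-delimited JSON body for the _bulk endpoint.

    Each row becomes two lines: an "index" action line and the document.
    The column named by `id_field` is used as the document id and is
    removed from the document source.
    """
    lines = []
    for row in rows:
        doc = dict(row)  # copy so the caller's row is not modified
        doc_id = doc.pop(id_field)
        action = {"index": {"_index": index_name, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        # default=str is a simplification so dates and similar values serialize
        lines.append(json.dumps(doc, default=str))
    if not lines:
        return ""
    # The bulk body must end with a newline
    return "\n".join(lines) + "\n"


# Hypothetical rows, shaped like the result of the SELECT in the question
sample_rows = [
    {"_id": 1, "name": "first", "company": "acme"},
    {"_id": 2, "name": "second", "company": None},
    {"_id": 3, "name": "third", "company": "initech"},
]

for batch in chunked(sample_rows, 2):
    body = build_bulk_body(batch, "foo")
    # In a real job, POST `body` to the _bulk endpoint here
    print(body, end="")
```

Since each cron run opens a fresh database connection, reads the rows, and exits, there is no long-lived connection sitting idle between polls.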
Upvotes: 2