Reputation: 315

Data transfer from SQL Server to ElasticSearch Node

I was using Elastic Search 1.7 version for my application and by using the concept of river i was filling up the data from SQL Server to ElasticSearch using the following article : https://www.pluralsight.com/blog/it-ops/elasticsearch-and-sql-server

URL: POST http://localhost:9200/_river/my_jdbc_river/_meta

JSON:

{

"type":"jdbc",

"schedule" : "0 0-59 0-23 ? * *",

"jdbc": {

"driver":"com.microsoft.sqlserver.jdbc.SQLServerDriver",

"url":"jdbc:sqlserver://127.0.0.1:1433;databaseName=MyDatabase",

"user":"my_db_user","password":"password",

"sql":"select ID as _id, FirstName, LastName, DateOfBirth from People",

"index":"people",

"type":"person"

}

}

But in the documentation of elastic search 2.0 it says the rivers are deprecated .Is there any better way to copy the data from SQL Server to elastic with elastic search 2.0

Any code sample of alternative would be really helpful

Thanks

Upvotes: 12

Answers (3)

banavalikar

Reputation: 282

If you are referring this post now (like me), ElasticSearch has moved on and there is a compatibility issue with the spring boot client that is used for this which does not support ES 5.x.x. So, you have to create your own spring boot client that is compatible/wait for ES to provide the support/use an alternative solution to bridge SQL Server to ES.

See: https://github.com/spring-projects/spring-data-elasticsearch/wiki/Spring-Data-Elasticsearch---Spring-Boot---version-matrix

Upvotes: 0

Fl4v

Reputation: 1062

You should take a look at this: https://github.com/jprante/elasticsearch-jdbc/wiki/jdbc-plugin-feeder-mode-as-an-alternative-to-the-deprecated-elasticsearch-river-api

EDIT : This is what I did, step by step, to synchronize data from SQL Server to ElasticSearch, without the deprecated river. I did it from scratch. You may have already done some of these steps if you already set up a river.

Download JDBC importer for your version of elasticsearch: https://github.com/jprante/elasticsearch-jdbc#compatiblity-matrix
Unzip it in elasticsearch folder. Note that you can unzip it anywhere, but don't put it in {elasticsearch}/plugin or, as it is not a plugin, it would cause elasticsearch to crash when starting.
Download JDBC drivers from Microsoft: https://www.microsoft.com/en-us/download/details.aspx?displaylang=en&id=11774 (I downloaded the ".exe" version)
Execute it. It creates an unzipped folder. Open it and copy sqljdbc4.jar, sqljdbc41.jar and sqljdbc42.jar to {elasticsearch-jdbc}/lib (be careful not to copy sqljdbc.jar)
Enable TCP/IP connection to SQL Server (see for example http://r2d2.cc/2015/08/05/the-tcpip-connection-to-the-host-localhost-port-1433-has-failed-microsoft-sql-server-2014/)
In Elasticsearch directory create directory {elasticsearch}\bin\feeder
Create jdbc_sqlserver.json file with this content:

    {  "type":"jdbc",
      "jdbc":{ 
         "url":"jdbc:sqlserver://localhost:1433;databaseName=test", 
         "user":"test",
         "password":"test",
         "sql":"SELECT id as _id, id, name,email FROM test",
         "index":"users",
         "type":"user", 
         "autocommit":"true"
         } 
     }

Create jdb_sqlserver.ps1 file with this content:

function Get-PSVersion {
     if (test-path variable:psversiontable) {$psversiontable.psversion} else {[version]"1.0.0.0"}
}
$powershell = Get-PSVersion
if($powershell.Major -le 2 ){
 Write-Error "Oh, so sorry, this script requires Powershell 3 (due to convertto-json)"
 exit
}
if((Test-Path env:\JAVA_HOME) -eq $false){
 Write-Error "Environment variable JAVA_HOME must be set to your java home"
 exit
}
curl -XDELETE "http://localhost:9200/planets"

$DIR = "D:\programs\elasticsearch\plugins\elasticsearch-jdbc-1.7.0.1\"
$FEEDER_CLASSPATH="$DIR\lib"
$FEEDER_LOGGER="file://$DIR\bin\log4j2.xml"

java -cp "$FEEDER_CLASSPATH\*" -"Dlog4j.configurationFile=$FEEDER_LOGGER" "org.xbib.tools.Runner" "org.xbib.tools.JDBCImporter" jdbc_sqlserver.json

Execute jdb_sqlserver.ps1 with powershell

This article was very helpful: http://r2d2.cc/2015/08/05/elastic-search-on-windows-with-mysql-sql-server-2014-and-import-feeder/

Upvotes: 17

jhilden

Reputation: 12429

Our plan for this is that when we write the record to SQL; then we then take the primary key and put that into a queuing system (RabbitMQ in our case). Then, there is a consumer on that queue (or many consumers) that extracts the data out of SQL and inserts it into ES.

Having an intermediate queue gives us a number of advantages:

We can take a lot more time to gather up all the data we want to insert into ES without our customers having to wait
If we ever want to re-index ES with more/different data we can just put that primary key back into the queue and it will gather up the data again.
If ES were ever down for maintance the data would still exist in the queue and would get picked back up when ES was back online.

Upvotes: 4

Data transfer from SQL Server to ElasticSearch Node

Answers (3)

Related Questions