Ragavan Thiru

Reputation: 89

Run ElasticSearch on top of a relational database

I would like to know whether it is possible to use ElasticSearch on top of a relational database.

  1. When I insert or delete a record in the relational database, will the change be reflected in ElasticSearch?
  2. If I insert a document in ElasticSearch, will it be persisted in the database?
  3. Does it use a cache or an in-memory database to facilitate search? If so, what does it use?

Upvotes: 0

Views: 3344

Answers (3)

DetourToNirvana

Reputation: 83

Came across this question while looking for a similar thing. Thought an update was due.

My Findings:

  1. Elasticsearch has now deprecated Rivers, though the above-mentioned jprante's River lives on...
  2. Another option I found was the Scotas Push Connector, which pushes inserts, updates and deletes from an RDBMS to Elasticsearch. Details here: http://www.scotas.com/product-scotas-push-connector.

    Example implementation here: http://www.scotas.com/blog/?p=90

Upvotes: 0

John Petrone

Reputation: 27487

There is no direct connection between Elasticsearch and relational databases - ES has its own datastore based on Apache Lucene.

That said, as others have noted, you can use the Elasticsearch River plugin for JDBC to load data from a relational database into Elasticsearch. Keep in mind that this approach has a number of limitations:

  1. It's one way only - The JDBC River for ES only reads from the source database - it does not push data from ES into the source database.

  2. Deletes are not handled - if you delete data in your source database after it has been indexed into ES, that deletion will not be reflected in ES. See the question "ElasticSearch river JDBC MySQL not deleting records" and https://github.com/jprante/elasticsearch-river-jdbc/issues/213

  3. It was not intended as a scalable, production-grade solution for integrating a relational database with Elasticsearch. According to a comment by the JDBC River's author in January 2014, it was designed as "a single node (non-scalable) solution" "for demonstration purposes." http://elasticsearch-users.115913.n3.nabble.com/Strategy-for-keeping-Elasticsearch-updated-with-MySQL-td4047253.html

To answer your questions directly (assuming you use the JDBC River):

  1. The JDBC River can pick up newly inserted records, but deletions of existing records are not propagated.

  2. Data does not flow from Elasticsearch into your relational database. That would need to be custom development work.

  3. Elasticsearch is built on top of Apache Lucene. Lucene in turn depends a great deal on file system caching at the OS level, which is why ES recommends keeping the heap size to no more than 50% of total memory, leaving plenty for the file system cache. In addition, the ES/Lucene stack makes use of a number of internal caches, such as the Lucene field cache and the filter cache. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-cache.html and http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-fielddata.html. Internally, the filter cache is implemented using a bitset: http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/
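To make the bitset idea concrete, here is a toy model of a filter cache. It is not how Elasticsearch or Lucene implement it internally; the `FilterCache` class and its methods are hypothetical, and a Python integer stands in for the bitset. It only illustrates the principle: each filter is evaluated once into one bit per document, and later searches combine the cached bitsets with a bitwise AND instead of re-scanning the documents.

```python
class FilterCache:
    """Toy filter cache: one cached bitset per (field, value) filter."""

    def __init__(self, docs):
        self.docs = docs      # list of dicts; the list position is the doc id
        self.cache = {}       # (field, value) -> int used as a bitset
        self.misses = 0       # how many times a filter had to be computed

    def bitset(self, field, value):
        key = (field, value)
        if key not in self.cache:
            self.misses += 1
            bits = 0
            for doc_id, doc in enumerate(self.docs):
                if doc.get(field) == value:
                    bits |= 1 << doc_id   # set the bit for a matching doc
            self.cache[key] = bits
        return self.cache[key]

    def search(self, *filters):
        """Return ids of the docs that match every (field, value) filter."""
        bits = (1 << len(self.docs)) - 1  # start with all docs matching
        for field, value in filters:
            bits &= self.bitset(field, value)
        return [i for i in range(len(self.docs)) if (bits >> i) & 1]


docs = [
    {"color": "red", "size": "L"},
    {"color": "blue", "size": "L"},
    {"color": "red", "size": "S"},
]
cache = FilterCache(docs)
red_and_large = cache.search(("color", "red"), ("size", "L"))
red_only = cache.search(("color", "red"))  # reuses the cached "red" bitset
```

The second search does not touch the documents at all for the `color` filter, which is the kind of saving a cached filter is meant to provide.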

Upvotes: 2

ThomasC

Reputation: 8165

1) You should take a look at the ElasticSearch JDBC river for inserts (I believe deleted rows aren't managed any more; see the developer's comment).

2) Unless you do it manually, it is not natively managed by ElasticSearch.

3) Indeed, ElasticSearch uses caches to improve performance, especially when using filters. The filter results are stored as bitsets (arrays of 0/1).

Upvotes: 1
