Vimson

Reputation: 89

SphinxSearch - Different Nodes using shared data

We are building a SphinxSearch cluster on Amazon EC2 instances. As a test, we ran several instances against the same shared file system (Amazon Elastic File System). The idea is that the cluster may have more than 10 nodes, but a single instance would index the documents and store the index on Elastic File System, where all the other nodes read it.

Our test worked fine, but are there any technical problems with this approach (locking issues, for example)?

Can someone please advise on this?

Thanks in advance.

Upvotes: 0

Views: 461

Answers (1)

Manticore Search

Reputation: 1482

If you're OK with having N copies of the index, you can do the following:

  • Build the index in one place, in a temp folder.
  • Rename the files so that their names include .new.
  • Distribute the index to all the other places using rsync or whatever you like. Some even broadcast it with UFTP.
  • Rotate the index on all the nodes at once by sending HUP to the searchd instances or, better, by running RELOAD INDEX (http://docs.manticoresearch.com/latest/html/sphinxql_reference/reload_index_syntax.html). It normally takes only a few ms, so the new index effectively replaces the previous one on all the nodes at the same time.
  • Previously (and perhaps still in Sphinx) rotating the index, either with --rotate or with RELOAD, had to wait if searchd was processing a long query. This was fixed in Manticore Search recently.
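
The steps above can be sketched in Python roughly as follows. This is only an illustration under several assumptions: the index name, host names and data directory are hypothetical, searchd is assumed to accept SphinxQL on port 9306 (this depends on your config), and the commands are only built as argument lists, not executed.

```python
from pathlib import Path


def add_new_infix(build_dir: Path, index_name: str) -> list[Path]:
    """Rename <index>.<ext> to <index>.new.<ext> for the files of one index."""
    renamed = []
    for path in sorted(build_dir.glob(f"{index_name}.sp*")):
        if path.suffix == ".spl":  # lock file, not part of the index data
            continue
        target = path.with_name(f"{index_name}.new{path.suffix}")
        path.rename(target)
        renamed.append(target)
    return renamed


def rsync_command(build_dir: Path, index_name: str, host: str, data_dir: str) -> list[str]:
    """Build an rsync call that copies the .new. files into a node's data dir."""
    files = sorted(str(p) for p in build_dir.glob(f"{index_name}.new.sp*"))
    return ["rsync", "-a", *files, f"{host}:{data_dir}/"]


def reload_command(host: str, index_name: str, port: int = 9306) -> list[str]:
    """Build a mysql client call that sends RELOAD INDEX to one searchd.

    Port 9306 is an assumption; use whatever SphinxQL listener you configured.
    """
    return ["mysql", "-h", host, "-P", str(port), "-e", f"RELOAD INDEX {index_name}"]
```

To actually run it, pass each list to subprocess.run(cmd, check=True): first the rsync commands for every node, and only after all copies have finished, the reload commands, so that the nodes switch to the new index at about the same time.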

This is a tried-and-true solution that people have used in production for years. If you really want to share the same files among multiple searchd instances, you can softlink all the files except .spl. However, to rotate the index in the searchd instances that use the links (not the actual files), you'll need to restart those instances. That doesn't look good in general, but in some special cases it may still be a good solution.
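
A minimal sketch of the softlink variant, assuming a hypothetical shared directory (for example an EFS mount) and a per-instance local data directory. The .spl file is skipped because it is the lock file, so each searchd creates its own next to the links.

```python
from pathlib import Path


def link_index_files(shared_dir: Path, local_dir: Path, index_name: str) -> list[Path]:
    """Symlink every file of one index from shared_dir into local_dir, except .spl."""
    local_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in sorted(shared_dir.glob(f"{index_name}.sp*")):
        if src.suffix == ".spl":  # lock file stays local to each searchd
            continue
        link = local_dir / src.name
        if link.is_symlink():  # replace a stale link, never a real file
            link.unlink()
        link.symlink_to(src.resolve())
        links.append(link)
    return links
```

The path setting of the index in each instance's config would then point at local_dir, and, as noted above, searchd has to be restarted to pick up a rebuilt index.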

Upvotes: 3
