Reputation: 14717
I need to send records to a search engine (Solr or ElasticSearch) to index.
In my design, a single field can have up to 5000 values, and for some records all 5000 values of that field (related by OR or AND) need to be sent to the search engine.
I have about 10 fields of this nature, plus 30 other fields (text, integer, etc.).
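For illustration, here is a rough sketch (Python with the requests library) of what one such record could look like when sent to ElasticSearch as a JSON document; the index name "records" and the field names are just placeholders:

    import requests

    doc = {
        "title": "example record",                      # one of the ~30 ordinary fields
        "tags": ["value-%d" % i for i in range(5000)],  # one multi-valued field with 5000 values
    }

    # ElasticSearch treats a JSON array as a multi-valued field; the exact URL
    # (document type vs. _doc) depends on the ElasticSearch version.
    requests.put("http://localhost:9200/records/_doc/1", json=doc)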
I wonder whether Solr or ElasticSearch can handle such a large number of values in a single field effectively, and which one does a better job.
What about millions of records in this situation?
What about real-time indexing when the index already holds millions of records and keeps growing? I understand Solr NRT (near real-time) search and ElasticSearch can both do real-time indexing, but I am not sure whether my situation poses new challenges.
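To make the question concrete, this is roughly how I imagine continuously feeding new records in, using ElasticSearch's _bulk endpoint (index name and fields are placeholders, and exact details may vary by version):

    import json
    import requests

    # Two tiny placeholder documents; real records would carry all ~40 fields.
    docs = [{"tags": ["value-1", "value-2"]}, {"tags": ["value-3"]}]

    # The bulk body is newline-delimited JSON: an action line, then the document.
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": "records"}}))
        lines.append(json.dumps(doc))
    body = "\n".join(lines) + "\n"

    requests.post(
        "http://localhost:9200/_bulk",
        data=body,
        headers={"Content-Type": "application/x-ndjson"},
    )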
Thanks for any input!
Cheers!
Upvotes: 0
Views: 647
Reputation: 4774
Both Solr and ElasticSearch are based on Lucene, which does the actual indexing, querying, and storing of documents. So performance, in terms of field and document size, should be pretty similar in both.
The choice between one or the other should probably be based on which one you find most enjoyable to work with. ElasticSearch, for example, has a JSON API for querying and indexing, while Solr relies mostly on XML for configuration and querying.
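As a rough example of that JSON API, a query that matches any of several values in a multi-valued field could look like this (index and field names are placeholders):

    import requests

    # A "terms" query matches documents whose "tags" field contains any of the
    # listed values (OR semantics).
    query = {"query": {"terms": {"tags": ["value-1", "value-42"]}}}

    resp = requests.post("http://localhost:9200/records/_search", json=query)
    print(resp.json())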
If you're going to have millions of documents and/or need to spread the indexing and query load across a cluster of machines, ElasticSearch has, in my opinion, an advantage because of how easy it is to shard the index and create replicas.
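For example, shard and replica counts can be set when the index is created (the numbers below are only illustrative, not a recommendation):

    import requests

    settings = {
        "settings": {
            "number_of_shards": 5,    # split the index across 5 primary shards
            "number_of_replicas": 1,  # keep one extra copy of each shard
        }
    }
    requests.put("http://localhost:9200/records", json=settings)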
Regarding real-time search, both will probably suit your needs. Both let you customize how frequently the index is "refreshed", which determines when newly indexed documents become visible in search results. For example, in ElasticSearch you can set the refresh interval to once a minute.
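For instance, the refresh interval is a dynamic index setting in ElasticSearch, so it can be changed on a live index (the index name here is a placeholder):

    import requests

    # Make newly indexed documents searchable roughly once a minute instead of
    # after every default refresh cycle (1 second by default).
    requests.put(
        "http://localhost:9200/records/_settings",
        json={"index": {"refresh_interval": "1m"}},
    )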
Upvotes: 3