ManojP

Reputation: 6248

How to improve Solr performance?

How we are using Solr: We store 7K documents in Solr, each with up to 3K attributes. Each attribute is indexed to enable search/sort on it. We fetch data based on search/filter criteria, with 400+ attributes attached to each returned document. When we search for some text with a single attribute in the field list (by setting fl="projectId"), it takes barely 1 second to display results on the Solr console, which is fine.

However, if we fetch 100+ attributes for the same search criteria (which returns roughly 100 values for each of the ~50 matched documents out of the 7K documents with up to 4K attributes), it takes ~20 seconds. When we need to fetch 400+ attributes for the matched documents, it takes a long time, ~90 seconds. Earlier it crashed with an OutOfMemoryError, which we fixed by increasing the RAM and JVM heap size.

MySQL data sync with Solr: Currently we use MySQL as the primary database and the Solr server as a secondary store. We sync MySQL data to the Solr server on a daily basis, and we also update Solr whenever we update any attribute in MySQL.

Using Solr result data in the application: The application dashboard shows documents with columns (attributes) pre-configured by the user. The user can apply search/filter criteria to populate the desired results on the dashboard, so the application fetches data matching those criteria from the Solr server.

We have tried many things, like increasing the heap size, RAM, and number of CPUs, but with no luck. Data is increasing day by day, which is causing a lot of issues. It works for a small number of projects or a small number of attributes, but whenever we try to fetch more attributes it takes too much time, and sometimes it crashes.

I am not sure whether we are using indexes properly.

Can anyone suggest better/alternate approach? Thanks in advance.

Upvotes: 4

Views: 1809

Answers (2)

Uri Shtand

Reputation: 1737

You can try to use facet search - multiple searches that reduce the number of candidates on each successive search.
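As a sketch of that drill-down idea (the field names `category` and `status` here are assumptions, not from the question), a first request can ask only for facet counts, which the UI then uses to narrow the next, more expensive search:

```python
from urllib.parse import urlencode

# Hypothetical facet request: return no documents (rows=0), only the
# per-field value counts the user can drill into on the next search.
params = [
    ("q", "*:*"),
    ("facet", "true"),
    ("facet.field", "category"),  # assumed field name
    ("facet.field", "status"),    # assumed field name
    ("rows", "0"),                # counts only, no document bodies yet
]
query_string = urlencode(params)
print(query_string)
```

Each successive search then adds the chosen facet value as a filter, so the candidate set shrinks before the full documents are ever fetched.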

The other way is to use filters extensively.

If you can turn parts of the query into filter queries (fq), that would probably improve performance by a good factor, since filter results are cached independently of the main query.
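A minimal sketch of such a request (the core name `projects` and the field names are assumptions for illustration) splits the scored full-text part (`q`) from the cacheable, unscored filters (`fq`), and restricts `fl` to the fields actually needed:

```python
from urllib.parse import urlencode

# Hypothetical query: q carries the scored search text, each fq is a
# cached filter, and fl limits which stored fields Solr returns.
params = [
    ("q", "title:widget"),            # full-text part, scored
    ("fq", "status:active"),          # filter query, cached separately
    ("fq", "projectId:[100 TO 200]"), # a second cached filter
    ("fl", "projectId"),              # return only the needed field
    ("rows", "50"),
]
query_string = urlencode(params)
url = "http://localhost:8983/solr/projects/select?" + query_string
print(url)
```

Repeating the same `fq` across searches is what makes the cache pay off: the filter's document set is computed once and reused.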

Upvotes: 1

jay

Reputation: 2077

Instead of getting 400 fields back for each document, you can fetch just the "id" of each document from Solr and then load those documents from MySQL, which is your permanent storage.

So, for example, if you get back 25 document ids per search, your application can fetch those 25 docs from MySQL (perhaps in a parallel call).
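A minimal sketch of this two-step fetch (the table name `documents` and column name `id` are assumptions, and the Solr and MySQL calls themselves are stubbed out) builds a parameterized IN clause from the ids Solr returned:

```python
# Hypothetical ids, as returned by a Solr search with fl=id;
# in practice something like: doc_ids = [d["id"] for d in solr_results]
doc_ids = [101, 102, 103]

# Parameterized IN clause: ids are bound by the driver, never
# interpolated into the SQL string directly.
placeholders = ", ".join(["%s"] * len(doc_ids))
sql = f"SELECT * FROM documents WHERE id IN ({placeholders})"
# Then, with a MySQL cursor: cursor.execute(sql, doc_ids)
print(sql)
```

This keeps the heavy stored-field retrieval on MySQL, where the full rows already live, and lets Solr do only what it is fast at: matching and ranking.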

In my experience, returning a larger number of fields increases QTime a lot.

Upvotes: 5
