Vishnu

Reputation: 724

Elasticsearch performance questions for a large volume of data

I am working on a fairly large production system in which I have indexed a large volume of data into Elasticsearch and need to search it with specific queries. While doing so, I have run into some performance-related questions.

Please consider this a follow-up question to this

  1. I return the nested data using inner hits. According to the documentation, using _source is not the best solution if there is a large set of nested objects to return. How can we overcome this? Can we use doc value fields? If yes, how?

  2. I read that the inner hits size defaults to 3 and that we can set it to a maximum of 100. If we need to return all the results, how can we fetch the data without affecting performance?

Upvotes: 1

Views: 1140

Answers (1)

Amit

Reputation: 32386

Regarding size,

You can specify a size as large as you need, as long as from + size does not exceed the default limit of 10,000, which is known as index.max_result_window, as specified in the index module doc. You can change the limit dynamically, but that is not recommended, as mentioned in the same link, and there are better alternatives to it.

More importantly, you need to define size on inner_hits, which are even more costly. That is the whole reason ES limits it to just 3 by default, while the default size for a normal query is 10.
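As a rough illustration of setting size on inner_hits, here is a minimal sketch in Python that only builds the search request body. The nested path `comments` and the field `comments.text` are hypothetical names, not taken from the question, and the value 50 is just an example that stays under the maximum of 100 mentioned in the question.

```python
import json


def nested_query_with_inner_hits(path, match_field, match_value, inner_size):
    """Build a search body for a nested query that returns inner hits.

    `inner_size` overrides the default inner_hits size of 3.
    """
    return {
        "size": 10,  # default size of a normal query, stated explicitly
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {match_field: match_value}},
                "inner_hits": {"size": inner_size},
            }
        },
    }


# Hypothetical nested path and field, used only for illustration.
body = nested_query_with_inner_hits("comments", "comments.text", "performance", 50)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request with whichever client you already use.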

Coming to doc_values,

Instead of fetching values from _source, you can use doc values as long as you are using fields on which they are enabled by default, such as keyword fields. For text fields they are not enabled by default, so you would have to enable them first, which has the following cons:

  1. You need to change the index mapping and reindex all content.
  2. It will take more space in your index.
  3. It is very costly on text fields, which is why it is disabled; more info is in the official doc.
  4. You already have this information in _source, and it is better to use that for performance reasons.
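For fields that already have doc values (keyword fields, for example), a request along these lines might work. This is only a sketch that builds the request body in Python. The nested path `comments` and the keyword field `comments.author` are hypothetical, and `match_all` stands in for whatever nested query you actually run.

```python
import json


def inner_hits_from_doc_values(path, fields, inner_size=3):
    """Build a nested query whose inner hits read doc values, not _source."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match_all": {}},  # placeholder for the real nested query
                "inner_hits": {
                    "_source": False,           # skip loading _source for inner hits
                    "docvalue_fields": fields,  # fields must have doc values enabled
                    "size": inner_size,
                },
            }
        }
    }


# Hypothetical nested path and keyword field, used only for illustration.
doc_values_body = inner_hits_from_doc_values("comments", ["comments.author"], inner_size=20)
print(json.dumps(doc_values_body, indent=2))
```

Whether this is actually faster than reading _source depends on the number and size of the fields, so it is worth measuring on your own data.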

Upvotes: 1
