Eli Johnes

Reputation: 351

About stored field vs docvalues in solr

Please help me understand the following regarding Solr:

1) Where are stored fields and docValues fields saved in Solr?

2) If we enable docValues for some fields, will normal query performance (search only, with no faceting or sorting applied) be better than with stored fields?

3) Is it advisable to replace all the stored fields with docValues?

Upvotes: 0

Views: 303

Answers (1)

Abhijit Bashetti

Reputation: 8678

DocValues are a way of recording field values internally that is more efficient for some purposes, such as sorting and faceting, than traditional indexing.

DocValue fields are now column-oriented fields with a document-to-value mapping built at index time. This approach promises to relieve some of the memory requirements of the fieldCache and make lookups for faceting, sorting, and grouping much faster.

Stored fields keep all field values for one document together in a row-stride fashion. When a document is retrieved, all of its field values are returned at once, so loading the relevant information about a single document is very fast.

However, if you need to scan a field (for faceting/sorting/grouping/highlighting) it will be a slow process, as you will have to iterate through all the documents and load each document's fields per iteration resulting in disk seeks.
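The difference between the two layouts can be illustrated with a toy in-memory model. This is only a sketch: the sample documents, field names, and helper functions are made up for illustration, and real Solr/Lucene storage is compressed and on disk rather than plain Python structures.

```python
# Toy model only: hypothetical documents, not Solr's real storage format.
docs = [
    {"id": 1, "title": "a", "price": 30},
    {"id": 2, "title": "b", "price": 10},
    {"id": 3, "title": "c", "price": 20},
]

# Row-oriented, like stored fields: one complete record per document.
stored = {d["id"]: d for d in docs}

# Column-oriented, like docValues: one array per field,
# indexed by the document's position.
doc_values = {"price": [d["price"] for d in docs]}


def fetch_document(doc_id):
    """Return every field of one document in a single lookup."""
    return stored[doc_id]


def sort_by_price_using_stored():
    """Sorting on the row layout has to touch every whole document."""
    ordered = sorted(stored.values(), key=lambda d: d["price"])
    return [d["id"] for d in ordered]


def sort_by_price_using_doc_values():
    """Sorting on the column layout reads only the 'price' column."""
    column = doc_values["price"]
    order = sorted(range(len(column)), key=lambda i: column[i])
    return [docs[i]["id"] for i in order]
```

Both sort functions return the same ordering; the point is that the column version never needs to load `title` or any other field it does not use.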

Field values retrieved during search queries are typically returned from stored values. However, non-stored docValues fields will also be returned along with the stored fields when all fields (or pattern-matching globs) are requested (e.g. “fl=*”), depending on the effective value of the useDocValuesAsStored parameter for each field. For schema versions >= 1.6, the implicit default is useDocValuesAsStored="true".
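As a rough illustration of the "fl=*" case, the request URL for such a query could be built like this. The host, port, and collection name ("mycollection") are assumptions for the example, and the helper function is hypothetical; only the `q` and `fl` parameters come from the discussion above.

```python
from urllib.parse import urlencode

# Assumed location of a local Solr instance and a hypothetical collection.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def build_select_url(query="*:*", field_list="*"):
    """Build a /select URL that asks for the given field list (fl)."""
    params = {"q": query, "fl": field_list}
    return SOLR_BASE + "/select?" + urlencode(params)


url = build_select_url()
```

With `fl=*`, whether a non-stored docValues field shows up in the response depends on that field's useDocValuesAsStored setting, as described above.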

When retrieving fields from their docValues form (using the /export handler, streaming expressions or if the field is requested in the fl parameter), two important differences between regular stored fields and docValues fields must be understood:

  1. Order is not preserved. For simply retrieving stored fields, the insertion order is the return order. For docValues, it is the sorted order.

  2. Multiple identical entries are collapsed into a single value. Thus if I insert values 4, 5, 2, 4, 1, my return will be 1, 2, 4, 5.
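The two differences above can be mimicked in a few lines. This is only a model of the behaviour as described here, not Solr code, and the function names are made up for illustration.

```python
def stored_view(values):
    """Stored fields: values come back in insertion order, duplicates kept."""
    return list(values)


def doc_values_view(values):
    """docValues as described above: sorted order, duplicates collapsed."""
    return sorted(set(values))


inserted = [4, 5, 2, 4, 1]
as_stored = stored_view(inserted)
as_doc_values = doc_values_view(inserted)
```

Here `as_stored` keeps the original sequence, while `as_doc_values` matches the 1, 2, 4, 5 result from the example in the list.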

In cases where the query returns only docValues fields, performance may improve, since returning stored fields requires disk reads and decompression, whereas returning docValues fields in the fl list only requires memory access.

In a low-memory environment, or when you don't need to index a field, docValues are well suited to faceting, grouping, filtering, sorting, and function queries.

For more details, please refer to DocValues.

Upvotes: 1
