Elasticsearch range filter inverted index

Question

I have about ten billion documents. One field of each document is a timestamp with millisecond precision, and I used the following mapping when indexing:

  timestamp:
    type: "date"
    format: "YYYY-MM-dd HH:mm:ss||YYYY-MM-dd HH:mm:ss.SSS"
    ignore_malformed: true
    doc_values: true

When searching, I use the range filter. Even though doc_values is enabled, my understanding is that the range filter internally uses the inverted index to search, and it is rather slow.

The documentation for the range filter says:

  The execution option controls how the range filter internally executes. The execution option accepts the following values:
  index: Uses the field’s inverted index in order to determine whether documents fall within the specified range.

If I change the mapping so that the field stores only the day, dropping the hours, minutes, seconds, and milliseconds:

  day:
    type: "date"
    format: "YYYY-MM-dd"
    ignore_malformed: true
    doc_values: true

then searching with the range filter on this field is faster.

Can someone help explain why the performance differs?

With the first mapping (second/millisecond precision), the inverted index (which I assume is internally some kind of hash table) has a huge number of keys. With the second mapping (day precision only), the inverted index has far fewer keys. Is that the reason?
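
To make the hypothesis above concrete, here is a toy model in Python. It is only a sketch for intuition and not how Elasticsearch or Lucene actually store date fields: it assumes the inverted index is a sorted list of distinct terms with a postings list per term, and that a range filter has to visit every distinct term inside the range. The names build_index and range_filter are made up for this illustration.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

MS_PER_DAY = 86_400_000


def build_index(values):
    """Toy inverted index: a sorted list of distinct terms plus term -> doc ids."""
    postings = defaultdict(list)
    for doc_id, value in enumerate(values):
        postings[value].append(doc_id)
    return sorted(postings), postings


def range_filter(index, low, high):
    """Return (matching doc ids, distinct terms visited) for low <= term <= high."""
    terms, postings = index
    start = bisect_left(terms, low)
    end = bisect_right(terms, high)
    matched = set()
    # Assumption of this toy model: every distinct term in the range is visited.
    for term in terms[start:end]:
        matched.update(postings[term])
    return matched, end - start


# 10,000 documents, one every 100 seconds (roughly 11.6 days of data).
timestamps_ms = [i * 100_000 for i in range(10_000)]
days = [ts // MS_PER_DAY for ts in timestamps_ms]

ms_index = build_index(timestamps_ms)
day_index = build_index(days)

# Select the first five whole days at both granularities.
ms_hits, ms_terms = range_filter(ms_index, 0, 5 * MS_PER_DAY - 1)
day_hits, day_terms = range_filter(day_index, 0, 4)

print("millisecond field: docs =", len(ms_hits), "terms visited =", ms_terms)
print("day field:         docs =", len(day_hits), "terms visited =", day_terms)
```

In this model both filters match the same 4320 documents, but the millisecond field visits 4320 distinct terms while the day field visits 5. Real date fields use a more elaborate encoding than a flat term list, so the actual gap may be smaller, but the intuition that fewer distinct values mean less work per range still applies.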

