Reputation: 1
What does Elasticsearch keep in memory that makes search so fast? Are all the JSON documents themselves in memory, or are only the inverted index and the mapping kept in memory 24/7?
Upvotes: 0
Views: 1048
Reputation: 6076
It is a good question, and the short answer is:
Inverted indexes are not guaranteed to be always stored in memory. I didn't manage to find direct proof, so I infer this from the following: the _cat/segments API reports an output parameter size.memory for each segment, and the official tuning advice is to "Give memory to the filesystem cache".
This means that Elasticsearch also stores index data on disk in quite a smart way, so that the filesystem itself helps with frequently run searches.
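If you want to look at size.memory on your own cluster, the request could be sketched roughly like this. It is only a sketch: the base URL and index name are placeholders, the helper names are mine, and the sample rows are made up rather than real cluster output. Depending on the Elasticsearch version, size.memory may also be reported as zero.

```python
import json
from urllib.parse import urlencode


def cat_segments_url(base_url, index):
    """Build a _cat/segments URL that returns JSON rows with sizes in bytes."""
    params = {
        "format": "json",  # JSON rows instead of the default text table
        "bytes": "b",      # report sizes as plain byte counts
        "h": "index,shard,segment,size,size.memory",  # columns to return
    }
    query = urlencode(params, safe=",")  # keep the commas in "h" readable
    return f"{base_url.rstrip('/')}/_cat/segments/{index}?{query}"


def disk_and_memory_totals(rows):
    """Sum on-disk size and reported memory per index from _cat/segments rows."""
    totals = {}
    for row in rows:
        disk, mem = totals.get(row["index"], (0, 0))
        totals[row["index"]] = (
            disk + int(row["size"]),
            mem + int(row["size.memory"]),
        )
    return totals


# Made-up rows in the shape of the JSON output (cat APIs return values as strings).
sample_rows = json.loads("""
[
  {"index": "logs", "shard": "0", "segment": "_0", "size": "1048576", "size.memory": "2048"},
  {"index": "logs", "shard": "0", "segment": "_1", "size": "524288", "size.memory": "1024"}
]
""")

print(cat_segments_url("http://localhost:9200", "logs"))
print(disk_and_memory_totals(sample_rows))
```

The point of comparing the two totals is that the memory figure is a small fraction of the segment size on disk, which is what you would expect if segments are not held in memory as a whole.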
One such trick is that each field in the mapping gets its own inverted index, which is small enough to be cached efficiently by the filesystem if it is queried frequently (fields you never query just occupy disk space).
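To illustrate the per-field idea, here is a toy sketch in Python. It is not how Lucene actually lays out its files; the documents, the whitespace tokenizer, and the function names are all made up for the example. It only shows that each field ends up with its own small term-to-documents structure, so a query on one field never touches the others.

```python
from collections import defaultdict


def build_field_indexes(docs):
    """Return {field: {term: sorted doc ids}}, i.e. one inverted index per field."""
    indexes = defaultdict(lambda: defaultdict(set))
    for doc_id, doc in docs.items():
        for field, text in doc.items():
            # Naive analysis: lowercase and split on whitespace.
            for term in str(text).lower().split():
                indexes[field][term].add(doc_id)
    return {
        field: {term: sorted(ids) for term, ids in terms.items()}
        for field, terms in indexes.items()
    }


def search(indexes, field, term):
    """Look a term up in the index of a single field only."""
    return indexes.get(field, {}).get(term.lower(), [])


docs = {
    1: {"title": "fast search", "body": "elasticsearch keeps index files on disk"},
    2: {"title": "slow scripts", "body": "scripts that read source can be slow"},
    3: {"title": "fast cache", "body": "the filesystem cache keeps hot files in memory"},
}
indexes = build_field_indexes(docs)

print(search(indexes, "title", "fast"))   # [1, 3]
print(search(indexes, "body", "keeps"))   # [1, 3]
print(search(indexes, "body", "fast"))    # []
```

If only title is ever queried, only the title structure needs to stay "hot"; the body one can sit untouched.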
No, the JSON documents are not all kept in memory: Elasticsearch stores them in a special field called _source. Retrieving it is not fast, which is why scripts that access _source may be slow to execute.
Yes, other data structures are involved too, for example the ones used for aggregations:
- doc_values, which are column-oriented storage for exact-value fields (this feature makes Elasticsearch a little bit of a columnar DB); again, they are not in memory to begin with and get "cached" upon frequent use;
- fielddata, which does a similar job but for text fields; it is actually stored in memory, but it is not efficient and is turned off by default.

Elasticsearch also uses more caching: the shard request cache and the node query cache. As you see, it is not as simple as "just put data in memory".
Hope that helps!
Upvotes: 4