jorgen

Reputation: 3593

Elasticsearch index data inconsistent between nodes in a way we can't explain

We are seeing inconsistent data between nodes, which makes our search results noticeably nondeterministic. We are running Elasticsearch 8.15.0.

Here is what we have looked into so far:

Our first hypothesis was that this is caused by deleted documents, as described in https://www.elastic.co/guide/en/elasticsearch/reference/current/consistent-scoring.html. This is supported by the fact that the nodes appear to have the same number of documents but different numbers of deleted documents.
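For anyone who wants to reproduce the comparison, here is a minimal sketch of how the per-copy counts could be checked. It assumes the JSON response of GET /<index>/_stats?level=shards has already been loaded into a dict; the helper names, the index name "my-index", the node IDs and all numbers are made up for illustration and are not from our cluster.

```python
def shard_copy_docs(stats, index):
    """Collect docs.count and docs.deleted for every copy of every shard.

    `stats` is assumed to be the parsed response of
    GET /<index>/_stats?level=shards.
    """
    per_shard = {}
    for shard_id, copies in stats["indices"][index]["shards"].items():
        per_shard[shard_id] = [
            {
                "node": copy["routing"]["node"],
                "primary": copy["routing"]["primary"],
                "count": copy["docs"]["count"],
                "deleted": copy["docs"]["deleted"],
            }
            for copy in copies
        ]
    return per_shard


def mismatched_shards(per_shard, field):
    """Return the shard IDs whose copies disagree on `field`."""
    return sorted(
        shard_id
        for shard_id, copies in per_shard.items()
        if len({copy[field] for copy in copies}) > 1
    )


# Made-up sample: same live document count on both copies of shard 0,
# but a different number of deleted documents.
SAMPLE_STATS = {
    "indices": {
        "my-index": {
            "shards": {
                "0": [
                    {"routing": {"node": "node-a", "primary": True},
                     "docs": {"count": 1000, "deleted": 12}},
                    {"routing": {"node": "node-b", "primary": False},
                     "docs": {"count": 1000, "deleted": 57}},
                ],
                "1": [
                    {"routing": {"node": "node-b", "primary": True},
                     "docs": {"count": 980, "deleted": 3}},
                    {"routing": {"node": "node-a", "primary": False},
                     "docs": {"count": 980, "deleted": 3}},
                ],
            }
        }
    }
}

if __name__ == "__main__":
    docs = shard_copy_docs(SAMPLE_STATS, "my-index")
    print("count mismatches:", mismatched_shards(docs, "count"))
    print("deleted mismatches:", mismatched_shards(docs, "deleted"))
```

A mismatch in "deleted" alone would fit the deleted-documents explanation; a mismatch in "count" would point at something else.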

Using dfs_query_then_fetch does not, however, solve the issue, which hints that it’s not that simple. Also, looking into the results from /_explain, we see that “number of documents containing term” differs between the nodes for the same terms.
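To make the /_explain comparison less manual, something like the following could pull the per-term document frequencies out of an explanation so that two nodes can be diffed. This is only a sketch: it assumes the nested "explanation" object (with "value", "description" and "details" keys) has been taken from the response, and the function name and the sample tree are made up.

```python
def term_doc_counts(explanation):
    """Walk an explanation tree depth-first and collect every
    'number of documents containing term' value, in document order."""
    found = []
    stack = [explanation]
    while stack:
        node = stack.pop()
        if "number of documents containing term" in node.get("description", ""):
            found.append(node["value"])
        # Push children reversed so they are popped in their original order.
        stack.extend(reversed(node.get("details", [])))
    return found


def idf_node(n, total):
    """Build a made-up idf sub-tree for the sample below."""
    return {
        "value": 0.0,
        "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
        "details": [
            {"value": n, "description": "n, number of documents containing term", "details": []},
            {"value": total, "description": "N, total number of documents with field", "details": []},
        ],
    }


# Made-up explanation for a two-term query.
SAMPLE_EXPLANATION = {
    "value": 0.0,
    "description": "sum of:",
    "details": [
        {"value": 0.0, "description": "weight(title:foo in 0), result of:",
         "details": [idf_node(12, 100)]},
        {"value": 0.0, "description": "weight(title:bar in 0), result of:",
         "details": [idf_node(40, 100)]},
    ],
}

if __name__ == "__main__":
    print(term_doc_counts(SAMPLE_EXPLANATION))
```

Running this against the explanation for the same document and query on each node gives two short lists that can be compared directly.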

We could work around the nondeterminism by using ?preference=... in the search, but since the search results from one of the nodes are noticeably better than those from the other, this seems less than ideal. We would also like to understand where the difference came from in the first place.
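For completeness, this is roughly how the search URL for pinning requests could be built, using the search API's preference query parameter. It is a sketch only: the base URL, the index name and the node ID "node-a" are placeholders, and the helper name is made up.

```python
from urllib.parse import quote, urlencode


def search_url(base_url, index, preference=None, search_type=None):
    """Build a _search URL with optional preference and search_type."""
    params = {}
    if preference is not None:
        params["preference"] = preference
    if search_type is not None:
        params["search_type"] = search_type
    url = f"{base_url.rstrip('/')}/{quote(index, safe='')}/_search"
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    # Pin to one node (placeholder node ID) to inspect its results in isolation.
    print(search_url("http://localhost:9200", "my-index",
                     preference="_only_nodes:node-a"))
    # A custom string keeps one user or session on the same shard copies.
    print(search_url("http://localhost:9200", "my-index",
                     preference="session-42",
                     search_type="dfs_query_then_fetch"))
```

Pinning to a single node is useful for comparing the two result sets side by side, but it only hides the difference rather than explaining it.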

Any idea what to look into?

Upvotes: 0

Views: 31

Answers (0)
