Stefan Weiss

Reputation: 461

How to disable query cache?

First of all, sorry that the question title is not 100% clear. It is easier to explain with a few lines of code:

query = {...}

while True:
    elastic_response = elastic_client.search(elastic_index, body=query, request_cache=False)
    if elastic_response["hits"]["total"] == 0:
        break
    else:
        for doc in elastic_response["hits"]["hits"]:
            print("delete {}".format(doc["_id"]))
            elastic_client.delete(index=elastic_index, doc_type=doc["_type"], id=doc["_id"])

I run a search, delete all the returned docs, and then run the search again to get the next batch.
BUT the search query gives me the same docs again! This results in a 404 exception on delete. It has to be some kind of cache, but I have not found anything, and "request_cache" doesn't help.

I can probably refactor this code to use batch delete, but I want to understand what is wrong here.

P.S. I'm using the official Python client.

Upvotes: 2

Views: 1925

Answers (1)

Andrei Stefan

Reputation: 52366

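If adding a sleep() after the deletes makes the documents go away, then it's not about a cache. It's about the refresh_interval and the near-real-time nature of Elasticsearch: a delete only becomes visible to searches after the next index refresh (once per second by default).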

So, call _refresh after your code leaves the for loop. Also, don't delete document by document; instead, build a _bulk request and delete your documents in batches, sized according to how many there are.
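
A rough sketch of what that could look like with the official Python client, assuming a version that still accepts `body=` on `search` and `bulk` (as the code in the question does). The helper names `build_delete_actions` and `delete_matching` are made up for illustration. The loop checks whether the hits list is empty instead of reading `hits.total`, since the shape of `total` differs between Elasticsearch versions.

```python
def build_delete_actions(hits):
    """Turn search hits into a list of bulk 'delete' actions."""
    actions = []
    for doc in hits:
        meta = {"_index": doc["_index"], "_id": doc["_id"]}
        # _type is only present on older clusters (the question's code uses it)
        if "_type" in doc:
            meta["_type"] = doc["_type"]
        actions.append({"delete": meta})
    return actions


def delete_matching(client, index, query):
    """Search, bulk-delete the hits, refresh, and repeat until nothing matches.

    `client` is assumed to be an elasticsearch.Elasticsearch instance.
    Returns the number of delete actions sent.
    """
    deleted = 0
    while True:
        response = client.search(index=index, body=query)
        hits = response["hits"]["hits"]
        if not hits:
            break
        # One _bulk request per batch instead of one delete call per document
        client.bulk(body=build_delete_actions(hits))
        # Make the deletes visible before the next search runs
        client.indices.refresh(index=index)
        deleted += len(hits)
    return deleted
```

Usage would be something like `delete_matching(elastic_client, elastic_index, query)`. Passing `refresh=True` to `bulk` should work in place of the explicit refresh call. Note that this sketch does not inspect the bulk response for per-item errors.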

Upvotes: 3
