Reputation: 461
First of all, sorry for the not-entirely-clear question title. It is easier to explain with a few lines of code:
query = {...}
while True:
    elastic_response = elastic_client.search(elastic_index, body=query, request_cache=False)
    if elastic_response["hits"]["total"] == 0:
        break
    for doc in elastic_response["hits"]["hits"]:
        print("delete {}".format(doc["_id"]))
        elastic_client.delete(index=elastic_index, doc_type=doc["_type"], id=doc["_id"])
I run a search, delete all the returned docs, and then search again to get the next batch.
BUT the search returns the same docs, which results in a 404 exception on delete! It has to be some kind of cache, but I haven't found anything; request_cache doesn't help.
I could probably refactor this code to use bulk delete, but I want to understand what is wrong here.
P.S. I'm using the official Python client.
Upvotes: 2
Views: 1925
Reputation: 52366
If adding a sleep() after the deletes makes the documents go away, then it's not about caching. It's about the refresh_interval and the near-real-time nature of Elasticsearch.
So, call _refresh after your code leaves the for loop. Also, don't delete the documents one by one; instead, build a _bulk request that deletes them in batches, sized according to how many there are.
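A minimal sketch of that advice using the official Python client's bulk helper. The helper function `build_delete_actions` is hypothetical (introduced here for illustration), while `elastic_client` and `elastic_index` are the names from the question; the exact client calls may differ slightly between client versions:

```python
# from elasticsearch import helpers  # bulk helper shipped with the official client

def build_delete_actions(hits, index):
    """Turn search hits into _bulk delete actions."""
    return [
        {
            "_op_type": "delete",   # tells helpers.bulk() to issue a delete
            "_index": index,
            "_type": hit["_type"],
            "_id": hit["_id"],
        }
        for hit in hits
    ]

# Usage sketch, replacing the per-document delete loop:
# actions = build_delete_actions(elastic_response["hits"]["hits"], elastic_index)
# helpers.bulk(elastic_client, actions)
# Then force a refresh so the next search no longer sees the deleted docs:
# elastic_client.indices.refresh(index=elastic_index)
```

Without the explicit refresh, the next search can still return the just-deleted documents until the index's refresh_interval (1 second by default) elapses.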
Upvotes: 3