Reputation: 7537
I'm using the delete-by-query plugin for Elasticsearch. I have an index products with an integer field size. I want to delete all documents with size 10, and there are over 5000 of them. If I try:
DELETE /products/product/_query?q=size:10
this query takes over 2 minutes.
I understand that the delete-by-query plugin is slow. From the documentation:
Internally, it uses Scroll and Bulk APIs to delete documents in an efficient and safe manner. It is slower [..] Queries which match large numbers of documents may run for a long time, as every document has to be deleted individually.
How can I mass-delete documents faster?
Upvotes: 8
Views: 7513
Reputation: 15642
ES 8.11, 2024-01
I don't know what the situation was in 2016, but maybe you could consider doing a bulk delete.
The downside of this is that it might be quite complicated to determine the _ids of all the Lucene documents (index documents) you need to delete. Typically you might have to run a _search query to find these _ids on the basis of your query. You must have these _ids to do a bulk delete.
Then you have the faff of making a bulk string conforming to the strict string format required. It's fairly feasible when you get the hang of it. And these bulk operations are pretty fast.
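To give an idea of what that bulk string looks like, here is a minimal Python sketch. It only builds the request body and doesn't talk to a cluster. The helper names extract_ids and build_bulk_delete_body are made up for illustration, and the sample response is hand-written in the shape a _search response normally has, so treat it as an assumption rather than real output.

```python
import json


def extract_ids(search_response):
    """Collect the _id of every hit in a _search response (hypothetical helper)."""
    return [hit["_id"] for hit in search_response["hits"]["hits"]]


def build_bulk_delete_body(index, ids):
    """Build an NDJSON body with one delete action line per _id (hypothetical helper)."""
    lines = [
        json.dumps({"delete": {"_index": index, "_id": str(doc_id)}})
        for doc_id in ids
    ]
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hand-written sample in the usual shape of a _search response (assumed data).
sample_response = {
    "hits": {
        "hits": [
            {"_index": "products", "_id": "1", "_source": {"size": 10}},
            {"_index": "products", "_id": "2", "_source": {"size": 10}},
        ]
    }
}

ids = extract_ids(sample_response)
body = build_bulk_delete_body("products", ids)
print(body, end="")
```

The resulting string would be sent as the body of POST /_bulk with the header Content-Type: application/x-ndjson. A delete action has no source line after it, unlike index or create, so it is one line per document. With thousands of _ids you would probably collect them by paging through the search results and send the deletes in batches rather than one huge request.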
Upvotes: 0
Reputation: 6357
You can't. This is the only supported way of deleting documents by query in the latest versions of Elasticsearch. Elasticsearch 1.x deletes much faster (but potentially in an unsafe manner). So if the speed really matters that much to you, you can go back to an older version of Elasticsearch.
Upvotes: 6