Reputation: 349
On Elasticsearch 1.4 I used to delete documents with the Delete By Query API, like this:
curl -XDELETE http://my_elasticsearch:9200/_all/_query?q=some_field:some_value
This wasn't perfect (it regularly caused OutOfMemoryError), but it worked well enough for my needs at the time.
Now I am using Elasticsearch 1.5, and the documentation says:
Deprecated in 1.5.0.
"Delete by Query will be removed in 2.0: it is problematic since it silently forces a refresh which can quickly cause OutOfMemoryError during concurrent indexing, and can also cause primary and replica to become inconsistent. Instead, use the scroll/scan API to find all matching ids and then issue a bulk request to delete them."
So I would like to do the same thing using the scroll/scan API, but I don't understand how to delete documents with it. Neither the REST API documentation nor the Java API documentation seems complete to me: both lack an example of deleting.
PS: I'm looking for an explanation in either Java or curl. Either is fine, since in the end I will need both.
Upvotes: 1
Views: 652
Reputation: 10180
I ran into this issue as well and could not find a good code example, so I'll show you what I came up with. I'm not sure it is the best way to do it, so please feel free to comment on how it could be refined. Note that I set the result size of the query to Integer.MAX_VALUE so that each search returns all (or as many as possible) of the documents that need to be deleted. The search and bulk delete are repeated until no matching documents remain.
import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.delete.DeleteRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.search.SearchHit;

// elasticSearchClient (an Elasticsearch Client) and LOGGER are fields of the enclosing class
private void deleteAllByQuery(final String index, final String type, final QueryBuilder query) {
    SearchResponse response = elasticSearchClient.prepareSearch(index)
            .setTypes(type)
            .setQuery(query)
            .setSize(Integer.MAX_VALUE)
            .execute().actionGet();
    SearchHit[] searchHits = response.getHits().getHits();
    while (searchHits.length > 0) {
        LOGGER.debug("Need to delete " + searchHits.length + " records");
        // Create bulk request; refresh so the next search no longer sees the deleted documents
        final BulkRequestBuilder bulkRequest = elasticSearchClient.prepareBulk().setRefresh(true);
        // Add one delete request per search hit to the bulk request
        for (final SearchHit searchHit : searchHits) {
            final DeleteRequest deleteRequest = new DeleteRequest(index, type, searchHit.getId());
            bulkRequest.add(deleteRequest);
        }
        // Run bulk request
        final BulkResponse bulkResponse = bulkRequest.execute().actionGet();
        if (bulkResponse.hasFailures()) {
            LOGGER.error(bulkResponse.buildFailureMessage());
        }
        // After deleting, check for more matching records
        response = elasticSearchClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(query)
                .setSize(Integer.MAX_VALUE)
                .execute().actionGet();
        searchHits = response.getHits().getHits();
    }
}
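For the curl side of the question, the same idea applies: collect the matching ids with a search, then send them to the _bulk endpoint as one delete action per line. As a rough sketch of what that request body looks like, the helper below builds it from a list of ids using only the JDK. The class name BulkDeleteBody, its methods, and the index/type names in main are hypothetical and only for illustration; the sketch also assumes the values contain no control characters.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Elasticsearch client library.
class BulkDeleteBody {

    // Wraps a value in double quotes, escaping backslashes and double quotes.
    // Assumption: the value contains no control characters.
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds a bulk request body with one delete action per id.
    // The bulk API expects every action line, including the last, to end with a newline.
    static String build(String index, String type, List<String> ids) {
        StringBuilder body = new StringBuilder();
        for (String id : ids) {
            body.append("{\"delete\":{\"_index\":").append(quote(index))
                .append(",\"_type\":").append(quote(type))
                .append(",\"_id\":").append(quote(id))
                .append("}}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Example ids; in practice they would come from the search hits.
        System.out.print(build("my_index", "my_type", Arrays.asList("1", "2")));
    }
}
```

The resulting text could be saved to a file and posted to http://my_elasticsearch:9200/_bulk with curl, using --data-binary so that the newlines are preserved.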
Upvotes: 2