Guinoutortue

Reputation: 349

Elasticsearch 1.5: deleting documents with curl without the Delete By Query API

On Elasticsearch 1.4 I used to delete documents with the Delete By Query API, like this:

    curl -XDELETE 'http://my_elasticsearch:9200/_all/_query?q=some_field:some_value'

This wasn't perfect (it regularly caused OutOfMemoryError), but it worked well enough for my needs at the time.

Now I am on Elasticsearch 1.5, and the documentation says:

Deprecated in 1.5.0.

"Delete by Query will be removed in 2.0: it is problematic since it silently forces a refresh which can quickly cause OutOfMemoryError during concurrent indexing, and can also cause primary and replica to become inconsistent. Instead, use the scroll/scan API to find all matching ids and then issue a bulk request to delete them."

So I would like to do the same thing with the scroll/scan API, but I don't understand how to delete with it. Neither the REST API documentation nor the Java API documentation seems complete to me: both lack a deletion example.

PS: I'd like to understand how to do this with either Java or curl (in the end I need both).

Upvotes: 1

Views: 652

Answers (1)

Kevin Crowell

Reputation: 10180

I ran into this issue as well and could not find a good code example, so here is what I came up with. I'm not sure it is the best way to do it, so feel free to comment on how it could be refined. Note that I set the query's result size to Integer.MAX_VALUE so that each search returns all (or as many as possible) of the documents that need to be deleted.

  1. Run query to get all IDs to be deleted
  2. Add delete requests for all IDs to a bulk request
  3. Run bulk request
  4. Re-run query to see if any more records need to be deleted
  5. Repeat if necessary

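    // Imports needed by this method (Elasticsearch 1.x Java client)
    import org.elasticsearch.action.bulk.BulkRequestBuilder;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.action.delete.DeleteRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.index.query.QueryBuilder;
    import org.elasticsearch.search.SearchHit;

    // Assumes the enclosing class provides an elasticSearchClient (Client) field and a LOGGER
    private void deleteAllByQuery(final String index, final String type, final QueryBuilder query) {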
        SearchResponse response = elasticSearchClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(query)
                .setSize(Integer.MAX_VALUE)
                .execute().actionGet();
    
        SearchHit[] searchHits = response.getHits().getHits();
    
        while (searchHits.length > 0) {
            LOGGER.debug("Need to delete " + searchHits.length + " records");
    
            // Create bulk request
            final BulkRequestBuilder bulkRequest = elasticSearchClient.prepareBulk().setRefresh(true);
    
            // Add search results to bulk request
            for (final SearchHit searchHit : searchHits) {
                final DeleteRequest deleteRequest = new DeleteRequest(index, type, searchHit.getId());
                bulkRequest.add(deleteRequest);
            }
    
            // Run bulk request
            final BulkResponse bulkResponse = bulkRequest.execute().actionGet();
            if (bulkResponse.hasFailures()) {
                LOGGER.error(bulkResponse.buildFailureMessage());
                // Stop here: documents that failed to delete would match again and loop forever
                break;
            }
    
            // After deleting, we should check for more records
            response = elasticSearchClient.prepareSearch(index)
                .setTypes(type)
                .setQuery(query)
                .setSize(Integer.MAX_VALUE)
                .execute().actionGet();
    
            searchHits = response.getHits().getHits();
        }
    }
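
For the curl side of the question, steps 2 and 3 above amount to sending one delete action line per ID to the `_bulk` endpoint. Here is a minimal sketch, using only the JDK, of building that request body. The class name `BulkDeleteBody`, its methods, and the index/type names are made up for illustration, and the hand-rolled escaping is a simplification; a JSON library would be the safer choice.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of any Elasticsearch library): builds the
// newline-delimited body that the bulk endpoint expects for delete actions.
final class BulkDeleteBody {

    // Minimal JSON string escaping for quotes, backslashes and control characters.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // One "delete" action per ID; every line, including the last, ends with a newline.
    static String build(String index, String type, List<String> ids) {
        StringBuilder body = new StringBuilder();
        for (String id : ids) {
            body.append("{\"delete\":{\"_index\":\"").append(escape(index))
                .append("\",\"_type\":\"").append(escape(type))
                .append("\",\"_id\":\"").append(escape(id))
                .append("\"}}\n");
        }
        return body.toString();
    }

    public static void main(String[] args) {
        // Example IDs; in practice these would come from the search hits.
        System.out.print(build("my_index", "my_type", Arrays.asList("1", "2")));
    }
}
```

The printed output could be saved to a file and sent with something like `curl -XPOST http://my_elasticsearch:9200/_bulk --data-binary @deletes.json` (the file name is hypothetical); `--data-binary` keeps the newlines that the bulk format depends on.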
    

Upvotes: 2
