Christophe Roussy

Reputation: 17029

ElasticSearch get only document ids, _id field, using search query on index

For a given query I want to get only the list of _id values without getting any other information (without _source, _index, _type, ...).

I noticed that using _source and requesting non-existent fields returns only minimal data, but can I get even less data in the response? Some answers suggest using the hits part of the response, but I do not want the other info.

Upvotes: 4

Views: 8303

Answers (2)

Tobias Ernst

Reputation: 4654

I suggest using elasticsearch_dsl for Python. It has a nice API.

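from elasticsearch_dsl import Document, Search

# "my-index" is a placeholder; this also assumes a default connection is configured
s = Search(index="my-index")

# don't return any fields, just the metadata
s = s.source(False)
# iterating executes the search and yields the first page of hits
results = list(s)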

Afterwards you can get the id with:

from typing import Union

first_result: Document = results[0]
# named doc_id to avoid shadowing the built-in id()
doc_id: Union[str, int] = first_result.meta.id

See the official documentation for more details: https://elasticsearch-dsl.readthedocs.io/en/latest/search_dsl.html#extra-properties-and-parameters
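
For comparison, here is a minimal sketch of the same idea without the DSL: a raw request body with `_source` disabled, plus a helper that pulls the `_id` values out of the response. The function names `build_ids_only_body` and `extract_ids` and the sample response are hypothetical and only for illustration; the sketch assumes the usual search response shape, where hits live under `hits.hits`.

```python
from typing import Any, Dict, List


def build_ids_only_body(query: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a query so that no _source is returned for any hit."""
    return {"query": query, "_source": False}


def extract_ids(response: Dict[str, Any]) -> List[str]:
    """Collect the _id of every hit in a search response dict."""
    return [hit["_id"] for hit in response.get("hits", {}).get("hits", [])]


# Hand-written sample, shaped like a search response with _source disabled.
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "my-index", "_id": "1", "_score": 1.0},
            {"_index": "my-index", "_id": "2", "_score": 1.0},
        ],
    }
}

body = build_ids_only_body({"match_all": {}})
ids = extract_ids(sample_response)
print(body)
print(ids)
```

If the remaining metadata (`_index`, `_score`, ...) should not be sent at all, the `filter_path` request parameter, for example `filter_path=hits.hits._id`, may trim the response down further.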

Upvotes: 2

Maninder

Reputation: 56

It is better to use scroll/scan to retrieve the result list, so that Elasticsearch doesn't have to rank and sort the results.

With the elasticsearch-dsl Python library this can be done as follows:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

es = Elasticsearch()
s = Search(using=es, index=ES_INDEX, doc_type=DOC_TYPE)

# only get ids, otherwise `fields` takes a list of field names
# (`fields` is from older elasticsearch-dsl releases; newer ones use `s.source(False)` instead)
s = s.fields([])
ids = [h.meta.id for h in s.scan()]

Upvotes: 4
