Reputation: 16850

Is there a Python ElasticSearch client that supports asynchronous requests?

I'm looking for an ElasticSearch Python client that can make asynchronous requests. For example, I'd like to write code like this:

query1_future = es.search('/foobar', query1_json)
query2_future = es.search('/baz', query2_json) # Submit query 2 right after query 1, don't wait for its response
query1 = query1_future.get()
query2 = query2_future.get()

However, I don't see any clients (PyES, or the official client, for example) supporting this. Further, the two I'm familiar with couple the request logic with the response processing logic, so modifying them myself seems difficult. Perhaps a sufficient interim solution would be to use the asynchronous version of Requests, grequests?

Also, it's worth pointing out that ElasticSearch's _msearch may be a better-performing option, but for real-world applications it'd require some code restructuring.
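For reference, here is a minimal sketch of how the two searches above might be batched into a single _msearch request body. The helper name build_msearch_body and the example queries are hypothetical; the only thing taken from Elasticsearch itself is the newline-delimited format, in which each search is a header line followed by a body line.

```python
import json

def build_msearch_body(searches):
    """Serialize (index, query) pairs into a newline-delimited _msearch payload.

    Hypothetical helper for illustration; not part of any client library.
    """
    lines = []
    for index, query in searches:
        lines.append(json.dumps({"index": index}))  # header line: which index to search
        lines.append(json.dumps(query))             # body line: the search request itself
    # The payload must end with a newline after the last line.
    return "\n".join(lines) + "\n"

# Example queries are placeholders standing in for query1_json and query2_json.
body = build_msearch_body([
    ("foobar", {"query": {"match_all": {}}}),
    ("baz", {"query": {"match_all": {}}}),
])
```

The responses come back in one array, in the same order as the searches were submitted, which is where the code restructuring comes in: callers have to hand over their queries together and pick their results out by position.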

Upvotes: 12

Answers (9)

Adam Wallner

Reputation: 2412

I've created an async ElasticSearch ORM called ESORM, which is based on Pydantic (v2.x).

You can easily create a model like this:

from esorm import ESModel

class User(ESModel):
    name: str
    age: int

or if you would like to use other ES field types:

from esorm.fields import byte, keyword, text

class User(ESModel):
    name: keyword
    age: byte
    cv: text

Nested documents:

class User(ESModel):
    name: text
    email: keyword
    age: byte = 18  # You can specify default values as in Pydantic

class Post(ESModel):
    title: text
    content: text
    writer: User  # User is a nested document

You can easily create mappings for your models, which will create your indices with the specified types automatically:

from esorm import setup_mappings  # Assumed to be exported at package level

# Create indices and mappings
async def prepare_es():
    import models  # Import your models
    # Here models argument is not needed, but you can pass it to prevent unused import warning
    await setup_mappings(models)

Search documents:

async def query():
    users = await User.search(
        query={
            'bool': {
                'must': [{
                    'range': {
                        'age': {
                            'gte': 18
                        }
                    }
                }]
            }
        }
    )

Everything is annotated, type-checked, and autocompleted by the IDE, even queries, because they are typed with TypedDict.

There are tons of other features in it.

Upvotes: 0

Ajay M

Reputation: 2660

This is an older question, but now, in 2019, there is an official async wrapper package: https://github.com/elastic/elasticsearch-py-async

I have had success using it against ES 5.x, but the issue is that the 5.x branch is not being maintained: https://github.com/elastic/elasticsearch-py-async/issues/46

Upvotes: 1

Eran H.

Reputation: 1239

Twistes is a good library if you are using Twisted.

Upvotes: 0

Tomáš Linhart

Reputation: 10210

Just came across this question. There is an official asynchronous Elasticsearch client based on asyncio:

https://github.com/elastic/elasticsearch-py-async
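As a rough sketch of what the asyncio approach buys you, the pattern from the question maps onto asyncio.gather. The fake_search coroutine below is a hypothetical stand-in that only sleeps; it is not the client's API, so substitute the real client's awaitable search call.

```python
import asyncio

async def fake_search(index, query, delay=0.05):
    """Hypothetical stand-in for an async client's search call."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return {"index": index, "query": query, "hits": []}

async def main():
    # Both coroutines are scheduled together, so query 2 is submitted
    # without waiting for query 1's response.
    return await asyncio.gather(
        fake_search("foobar", {"match_all": {}}),
        fake_search("baz", {"match_all": {}}),
    )

# gather returns the results in the order the awaitables were passed in.
query1, query2 = asyncio.run(main())
```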

Upvotes: 9

lextoumbourou

Reputation: 463

I've forked txes into txes2. It features a more PEP8 friendly interface, test coverage (unit and integration) and support for ES v1.x.

Still a work in progress, but probably a good choice for people using Twisted.

Upvotes: 2

Brian Anderson

Reputation: 621

My suggestion is to just stick with CURLing everything. There are so many different methods, filters, and queries that various "wrappers" have a hard time recreating all the functionality. In my view, it is similar to using an ORM for databases...what you gain in ease of use you lose in flexibility/raw power.

Give CURL a try for a while and see how that treats you. You can use external JSON formatters to check your JSON and the mailing list to look for examples, and the docs are OK if you use JSON.

Upvotes: -1

jcollie

Reputation: 687

I haven't used it yet, but I found this:

https://github.com/jkoelker/txes

Upvotes: 0

matagus

Reputation: 6206

There's this Tornado async client for ES.

Upvotes: 1

luart

Reputation: 1451

You can also consider the following options for performing I/O without blocking the main process while using existing clients:

Use multithreading on Jython or IronPython (they have no GIL and can take advantage of multiple CPU cores)
Use ProcessPoolExecutor on Python 3
Use gevent with socket monkey patching to force existing clients to work with gevent sockets, which effectively makes the client asynchronous but requires some additional code to manage results

Gevent is the most lightweight option (in RAM and CPU usage) and handles the most intensive I/O, but it is also the most complex of the listed solutions. Also note that it runs in a single process, so to take advantage of multiple cores it should be combined with the multiprocessing package.
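To illustrate the executor approach, here is a minimal sketch that wraps a blocking call in futures. Note that it swaps in ThreadPoolExecutor, which lives in the same concurrent.futures module as ProcessPoolExecutor and shares its interface; threads are enough here because CPython releases the GIL while a thread blocks on I/O. The blocking_search function is a hypothetical stand-in for an existing synchronous client's search call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_search(index, query):
    """Hypothetical stand-in for a synchronous client's search call."""
    time.sleep(0.05)  # simulates a blocking HTTP round trip
    return {"index": index, "query": query, "hits": []}

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately with a Future, so query 2 is sent
    # without waiting for query 1's response.
    query1_future = pool.submit(blocking_search, "foobar", {"match_all": {}})
    query2_future = pool.submit(blocking_search, "baz", {"match_all": {}})
    # result() blocks until the corresponding call has finished.
    query1 = query1_future.result()
    query2 = query2_future.result()
```

Replacing ThreadPoolExecutor with ProcessPoolExecutor keeps the same submit/result calls, as long as the function and its arguments can be pickled.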

Upvotes: 3
