Reputation: 16850
I'm looking for an ElasticSearch Python client that can make asynchronous requests. For example, I'd like to write code like this:
    query1_future = es.search('/foobar', query1_json)
    query2_future = es.search('/baz', query2_json)  # Submit query 2 right after query 1, without waiting for query 1's response
    query1 = query1_future.get()
    query2 = query2_future.get()
However, I don't see any clients (PyES, or the official client, for example) supporting this. Further, the two I'm familiar with couple the request logic with the response processing logic, so modifying them myself seems difficult. Perhaps a sufficient interim solution would be to use the asynchronous version of Requests, grequests?
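One way to get the future-style interface described above from a blocking client is to submit each call to a thread pool. The following is only a minimal sketch: `blocking_search` is a hypothetical stand-in for a real client call such as `es.search()`, and `Future.result()` from the standard library plays the role of `.get()`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_search(index, query):
    """Hypothetical stand-in for a blocking client call such as es.search()."""
    time.sleep(0.1)  # simulate network latency
    return {"index": index, "query": query}


# Two workers so that both requests can be in flight at the same time.
with ThreadPoolExecutor(max_workers=2) as pool:
    query1_future = pool.submit(blocking_search, "foobar", {"match_all": {}})
    # Submitted immediately; does not wait for query 1 to finish.
    query2_future = pool.submit(blocking_search, "baz", {"match_all": {}})
    query1 = query1_future.result()  # blocks until query 1 is done
    query2 = query2_future.result()
```

This keeps the client's request and response handling untouched, at the cost of one thread per in-flight request.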
Also, it's worth pointing out that ElasticSearch's _msearch may be a better-performing option, but for real-world applications it'd require some code restructuring.
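For reference, `_msearch` takes a newline-delimited body in which each search is a header line followed by a query line. A minimal sketch of building that payload is below; `build_msearch_body` is a hypothetical helper written for illustration, not part of any client, and the queries are placeholders.

```python
import json


def build_msearch_body(searches):
    """Serialize (index, query) pairs into the newline-delimited _msearch format.

    Each search contributes a header line naming the index, followed by a
    line holding the search body. The payload must end with a newline.
    """
    lines = []
    for index, body in searches:
        lines.append(json.dumps({"index": index}))
        lines.append(json.dumps(body))
    return "\n".join(lines) + "\n"


payload = build_msearch_body([
    ("foobar", {"query": {"match_all": {}}}),
    ("baz", {"query": {"term": {"user": "kimchy"}}}),
])
```

The payload would be sent in a single request to `/_msearch`, and the results come back in a `responses` array in the same order as the searches, which is where the restructuring comes in: the calling code has to collect its queries first and unpack the responses afterwards.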
Upvotes: 12
Views: 6953
Reputation: 2412
I've created an async ElasticSearch ORM called ESORM, which is based on Pydantic (v2.x).
You can easily create a model like this:
    from esorm import ESModel

    class User(ESModel):
        name: str
        age: int
or if you would like to use other ES field types:
    from esorm.fields import byte, keyword, text

    class User(ESModel):
        name: keyword
        age: byte
        cv: text
Nested documents:
    class User(ESModel):
        name: text
        email: keyword
        age: byte = 18  # You can specify default values as in Pydantic

    class Post(ESModel):
        title: text
        content: text
        writer: User  # User is a nested document
You can easily create mappings for your models, which will create your indices with the specified types automatically:
    from esorm import setup_mappings

    # Create indices and mappings
    async def prepare_es():
        import models  # Import your models
        # The models argument is not needed here, but you can pass it to prevent an unused-import warning
        await setup_mappings(models)
Search documents:
    async def query():
        users = await User.search(
            query={
                'bool': {
                    'must': [{
                        'range': {
                            'age': {
                                'gte': 18
                            }
                        }
                    }]
                }
            }
        )
Everything is annotated, type-checked, and autocompleted by the IDE, even queries, because they are typed with TypedDict.
There are tons of other features in it.
Upvotes: 0
Reputation: 2660
This is an older question, but now in 2019 there is an official async wrapper package: https://github.com/elastic/elasticsearch-py-async
I have had success using it against ES 5.x, but the issue is that the 5.x branch is not being maintained: https://github.com/elastic/elasticsearch-py-async/issues/46
Upvotes: 1
Reputation: 10210
Just came across this question. There is an official asynchronous Elasticsearch client based on asyncio:
https://github.com/elastic/elasticsearch-py-async
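With an asyncio-based client, the pattern from the question would roughly come down to scheduling both searches and awaiting them together. The sketch below only illustrates that pattern: `fake_search` is a hypothetical coroutine standing in for the client's awaitable search call, so no real client or server is involved.

```python
import asyncio


async def fake_search(index, body):
    """Hypothetical stand-in for an awaitable client call; no real client is used."""
    await asyncio.sleep(0.05)  # simulate network latency
    return {"index": index, "body": body}


async def run_queries():
    # gather() schedules both coroutines before waiting on either, and
    # returns the results in the order the awaitables were passed in.
    return await asyncio.gather(
        fake_search("foobar", {"query": {"match_all": {}}}),
        fake_search("baz", {"query": {"match_all": {}}}),
    )


query1, query2 = asyncio.run(run_queries())
```

Both requests are in flight at once on a single thread, so the total wait is roughly that of the slower query, not the sum of the two.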
Upvotes: 9
Reputation: 463
I've forked txes into txes2. It features a more PEP8 friendly interface, test coverage (unit and integration) and support for ES v1.x.
Still a work in progress, but probably a good choice for people using Twisted.
Upvotes: 2
Reputation: 621
My suggestion is to just stick with CURLing everything. There are so many different methods, filters, and queries that various "wrappers" have a hard time recreating all the functionality. In my view, it is similar to using an ORM for databases...what you gain in ease of use you lose in flexibility/raw power.
Give CURL a try for a while and see how that treats you. You can use external JSON formatters to check your JSON and the mailing list to look for examples, and the docs are OK if you use JSON.
Upvotes: -1
Reputation: 687
I haven't used it yet, but I found this:
https://github.com/jkoelker/txes
Upvotes: 0
Reputation: 1451
You can also consider the following options for performing I/O without blocking the main executing process while using existing clients:
Gevent is the most lightweight option (in terms of RAM and CPU resources) and can handle the most intensive I/O, but it's also the most complex of the listed solutions. Also note that it runs in a single process, so to take advantage of multiple cores the multiprocessing package should be used.
Upvotes: 3