Reputation: 2453
I am having a very bad week after choosing Elasticsearch with Graylog2. I am trying to run queries against the data in ES using Python.
I have tried the following clients.
ElasticUtils - another documented client, but without a complete sample. I get the following error with the code attached; I don't even understand how S() connects to the right host.
from elasticutils import get_es, S
es = get_es(hosts=HOST, default_indexes=[INDEX])
basic_s = S().indexes(INDEX).doctypes(DOCTYPE).values_dict()
results:
print basic_s.query(message__text="login/delete")
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 223, in __repr__
data = list(self)[:REPR_OUTPUT_SIZE + 1]
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 623, in __iter__
return iter(self._do_search())
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 573, in _do_search
hits = self.raw()
File "/usr/lib/python2.7/site-packages/elasticutils/__init__.py", line 615, in raw
hits = es.search(qs, self.get_indexes(), self.get_doctypes())
File "/usr/lib/python2.7/site-packages/pyes/es.py", line 841, in search
return self._query_call("_search", body, indexes, doc_types, **query_params)
File "/usr/lib/python2.7/site-packages/pyes/es.py", line 251, in _query_call
response = self._send_request('GET', path, body, querystring_args)
File "/usr/lib/python2.7/site-packages/pyes/es.py", line 208, in _send_request
response = self.connection.execute(request)
File "/usr/lib/python2.7/site-packages/pyes/connection_http.py", line 167, in _client_call
return getattr(conn.client, attr)(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/pyes/connection_http.py", line 59, in execute
response = self.client.urlopen(Method._VALUES_TO_NAMES[request.method], uri, body=request.body, headers=request.headers)
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 294, in urlopen
return self.urlopen(method, url, body, headers, retries-1, redirect) # Try again
File "/usr/lib/python2.7/site-packages/pyes/urllib3/connectionpool.py", line 255, in urlopen
raise MaxRetryError("Max retries exceeded for url: %s" % url)
pyes.urllib3.connectionpool.MaxRetryError: Max retries exceeded for url: /graylog2/message/_search
I wish the devs of these good projects would provide some complete examples. Even looking at the sources, I am at a complete loss.
Is there any solution or help out there for me with Elasticsearch and Python, or should I just drop all of this, pay for a nice Splunk account, and end this misery?
I am proceeding with using curl to download the entire JSON result and load it with the json module. I hope that works, though downloading 1 million messages from Elasticsearch via curl may just not happen.
Upvotes: 8
Views: 9336
Reputation: 1217
ElasticSearch recently (Sept 2013) released an official Python client, elasticsearch-py (elasticsearch on PyPI, also on GitHub), which is supposed to be a fairly direct mapping to the official ElasticSearch API. I haven't used it yet, but it looks promising, and at least it will match the official docs!
Edit: We started using it, and I'm very happy with it. ElasticSearch's API is pretty clean, and elasticsearch-py maintains that. Easier to work with and debug in general, plus decent logging.
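For reference, a minimal query with elasticsearch-py looks roughly like this; the host, index, and field names are placeholders for your setup, not something I have run against Graylog2:
from elasticsearch import Elasticsearch
# Placeholder host and index; adjust for your cluster.
es = Elasticsearch(['localhost:9200'])
results = es.search(index='graylog2', body={'query': {'match': {'message': 'login/delete'}}})
print results['hits']['total']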
Upvotes: 3
Reputation: 71
Explicitly setting the host resolved that error for me:
basic_s = S().es(hosts=HOST, default_indexes=[INDEX])
Upvotes: 7
Reputation: 416
I have found rawes to be quite usable: https://github.com/humangeo/rawes
It's a rather low-level interface but I have found it to be much less awkward to work with than the high-level ones. It also supports the Thrift RPC if you're into that.
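As a rough sketch of what that looks like (going from memory of the rawes README; treat the host, index, and query as placeholders and double-check there):
import rawes
# Placeholder host and index/type path; see the rawes README.
es = rawes.Elastic('localhost:9200')
results = es.get('graylog2/message/_search', data={'query': {'term': {'message': 'login'}}})
print results['hits']['total']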
Upvotes: 8
Reputation: 121
FWIW, PYES docs are here: http://packages.python.org/pyes/index.html
Usage: http://packages.python.org/pyes/manual/usage.html
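Going from those usage docs (untested here, and the API shifted between pyes versions, so treat the host and field names as placeholders):
from pyes import ES
from pyes.query import TermQuery
# Placeholder host and field; adjust for your Graylog2/ES setup.
conn = ES('127.0.0.1:9200')
results = conn.search(TermQuery('message', 'login'))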
Upvotes: 4
Reputation: 21
ElasticUtils has sample code: http://elasticutils.readthedocs.org/en/latest/sampleprogram1.html
If there are other things you need in the docs, just ask.
Upvotes: 2
Reputation: 9731
Honestly, I've had the most luck with just CURLing everything. ES has so many different methods, filters, and queries that various "wrappers" have a hard time recreating all the functionality. In my view, it is similar to using an ORM for databases...what you gain in ease of use you lose in flexibility/raw power.
Except most of the wrappers for ES aren't really that easy to use.
I'd give curl a try for a while and see how that treats you. You can use external JSON formatters to check your JSON and the mailing list to look for examples, and the docs are OK if you work in JSON.
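If you end up scripting it from Python instead of shelling out to curl, hitting the _search endpoint directly is only a few lines; this sketch uses the index and type from the question's traceback, and the host and query body are placeholders:
import json
import urllib2
# Placeholder host; the graylog2/message path comes from the question's traceback.
url = 'http://localhost:9200/graylog2/message/_search'
query = {'query': {'term': {'message': 'login'}}}
request = urllib2.Request(url, json.dumps(query), {'Content-Type': 'application/json'})
response = json.load(urllib2.urlopen(request))
print response['hits']['total']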
Upvotes: 7