Stéphanie C

Reputation: 829

level must be an integer, error in elasticsearch

I have implemented a small crawler in Python, and I wanted to try exporting the results to Elasticsearch as explained in this tutorial.

I've made the fix proposed in the comments, needed because of an update to the elasticsearch-for-scrapy plugin (cf. the GitHub link), and I've changed ELASTICSEARCH_UNIQ_KEY to an existing field in my scraper. Of course, I have installed the plugin and checked that my spider works: I get JSON output from the command scrapy crawl brand -o output.json, where brand is the name of my spider.

I've installed Elasticsearch and it is running; I have been able to reproduce some examples found here. But it doesn't work when I use the following command: scrapy crawl brand.
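For reference, the relevant part of my settings.py looks roughly like the sketch below; the server, index, type, and unique-key values are placeholders rather than my real configuration:

# scrapy-elasticsearch pipeline configuration (sketch; concrete values are placeholders)
ITEM_PIPELINES = {
    # pipeline path as given in the plugin's README
    'scrapyelasticsearch.scrapyelasticsearch.ElasticSearchPipeline': 500,
}

ELASTICSEARCH_SERVER = 'localhost'     # placeholder host
ELASTICSEARCH_PORT = 9200
ELASTICSEARCH_INDEX = 'scrapy'         # placeholder index name
ELASTICSEARCH_TYPE = 'items'           # placeholder type name
ELASTICSEARCH_UNIQ_KEY = 'url'         # an existing field of my items (placeholder)
ELASTICSEARCH_LOG_LEVEL = 'log.DEBUG'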

I added quotes around log.DEBUG in the line ELASTICSEARCH_LOG_LEVEL = 'log.DEBUG' in the settings.py file, since the name log is not recognized without them. But now I get the following error:

Traceback (most recent call last):
  File "C:\Users\stephanie\Downloads\WinPython-32bit-2.7.9.2\python-2.7.9\lib\site-packages\twisted\internet\defer.py", line 588, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "C:\Users\stephanie\Downloads\WinPython-32bit-2.7.9.2\python-2.7.9\lib\site-packages\scrapyelasticsearch\scrapyelasticsearch.py", line 70, in process_item
    self.index_item(item)
  File "C:\Users\stephanie\Downloads\WinPython-32bit-2.7.9.2\python-2.7.9\lib\site-packages\scrapyelasticsearch\scrapyelasticsearch.py", line 53, in index_item
    log.msg("Generated unique key %s" % local_id, level=self.settings.get('ELASTICSEARCH_LOG_LEVEL'))
  File "C:\Users\stephanie\Downloads\WinPython-32bit-2.7.9.2\python-2.7.9\lib\site-packages\scrapy\log.py", line 49, in msg
    logger.log(level, message, *[kw] if kw else [])
  File "C:\Users\stephanie\Downloads\WinPython-32bit-2.7.9.2\python-2.7.9\lib\logging\__init__.py", line 1220, in log
    raise TypeError("level must be an integer")
TypeError: level must be an integer
2015-08-04 02:06:02 [scrapy] INFO: Crawled 1 pages (at 1 pages/min), scraped 0 items (at 0 items/min)
2015-08-04 02:06:02 [scrapy] INFO: Closing spider (finished)
2015-08-04 02:06:02 [selenium.webdriver.remote.remote_connection] DEBUG: DELETE
http://127.0.0.1:49654/hub/session/209677e4-1577-4f05-a418-8554159d8c74/window {
"sessionId": "209677e4-1577-4f05-a418-8554159d8c74"}
2015-08-04 02:06:03 [selenium.webdriver.remote.remote_connection] DEBUG: Finished Request
2015-08-04 02:06:03 [scrapy] INFO: Dumping Scrapy stats:

I am using Python 2.7 and Elasticsearch 1.7.1. Do I have to do some additional configuration in Elasticsearch, or what else may cause this error? Thanks for your help.

Upvotes: 3

Views: 4752

Answers (1)

Joe Young

Reputation: 5885

I don't have an Elasticsearch setup to try this on, but you could try modifying settings.py. Add the following to the top of the file:

import logging

And change

ELASTICSEARCH_LOG_LEVEL = 'log.DEBUG'

to

ELASTICSEARCH_LOG_LEVEL = logging.DEBUG

If the above still doesn't work, you can try this instead:

ELASTICSEARCH_LOG_LEVEL = 10
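For context on why this should work: your traceback shows the plugin passing the setting straight to Scrapy's log.msg(), which hands it to Python's logging.Logger.log(), and that method raises TypeError("level must be an integer") whenever the level is not an int. Your value 'log.DEBUG' is a string, while logging.DEBUG is simply the integer 10, which is why the two suggestions above are equivalent. A minimal sketch of the corrected lines in settings.py:

import logging

# logging.DEBUG is the integer 10; Logger.log() rejects anything that is not an int
ELASTICSEARCH_LOG_LEVEL = logging.DEBUG

You can reproduce the same failure outside Scrapy:

import logging

logging.getLogger().log('log.DEBUG', 'msg')    # raises TypeError: level must be an integer
logging.getLogger().log(logging.DEBUG, 'msg')  # no exception; logging.DEBUG == 10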

Upvotes: 4
