Варков Эмир

Reputation: 21

Cannot set LOG_LEVEL when using CrawlerRunner

I'm running spiders using CrawlerRunner, and I need to set the logging level.

The documentation suggests using configure_logging. However, this function "assigns DEBUG and ERROR level to Scrapy and Twisted loggers respectively", and I need to set a different logging level.
Here is the code for launching the scraper:

import os
import sys
import logging

from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from scrapy.utils.project import get_project_settings
from scrapy.utils.reactor import install_reactor
from scrapy_app.models import Scraper

# reactor must be installed before importing
install_reactor("twisted.internet.asyncioreactor.AsyncioSelectorReactor")
from twisted.internet import ( # pylint: disable=wrong-import-position, wrong-import-order
    reactor,
)

SETTINGS_FILE_PATH = "scrapy_app.scrapy_app.settings"
os.environ.setdefault("SCRAPY_SETTINGS_MODULE", SETTINGS_FILE_PATH)

def setup_logging():
    configure_logging() # logs are there, but LOG_LEVEL is always DEBUG


def _crawl_spider(runner: CrawlerRunner, spider: str) -> None:
    """
    Adds a spider to the runner by its name without starting the reactor
    :param runner: runner that schedules the crawl
    :param spider: spider name
    """

    d = runner.crawl(spider)
    d.addBoth(lambda _: reactor.stop())  # type: ignore


def run_spider(spider: str) -> None:
    """
    Run a scraper by its name

    :param spider: Name of spider
    """

    setup_logging()
    runner = CrawlerRunner(get_project_settings())
    _crawl_spider(runner, spider)
    reactor.run() # type: ignore
  1. The documentation suggests using configure_logging, and while it works, it always sets the log level to DEBUG.
def setup_logging():
    configure_logging() # the logs are there, but LOG_LEVEL is always DEBUG
  2. Alternatively, the documentation suggests using logging.basicConfig. However, in my case, if I use only logging.basicConfig, no logs are output at all.
def setup_logging():
    logging.basicConfig(
        level=logging.INFO # no logging at all
    )
  3. Using logging.basicConfig and configure_logging together gives the same result as using configure_logging alone.

  4. Setting LOG_LEVEL in settings.py doesn't seem to change anything.

Upvotes: 0

Views: 49

Answers (1)

SuperUser

Reputation: 4822

You need to set LOG_LEVEL on the settings object first, and then pass that object as the argument to CrawlerRunner.

settings = get_project_settings()
settings['LOG_LEVEL'] = logging.INFO
runner = CrawlerRunner(settings)
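
To illustrate why the level has to reach the handler that actually writes the output, here is a minimal standard-library sketch with no Scrapy involved. It assumes the usual setup in which a handler on the root logger carries the level (Scrapy's configure_logging creates a root handler according to the settings it is given). The helper install_root_handler and the logger name demo.spider are made up for this example.

```python
import io
import logging


def install_root_handler(level, stream):
    """Hypothetical helper: attach a stream handler with the given level to the root logger."""
    handler = logging.StreamHandler(stream)
    handler.setLevel(level)  # accepts logging.INFO or the string "INFO"
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    root = logging.getLogger()
    root.setLevel(logging.NOTSET)  # let the handler decide what gets emitted
    root.addHandler(handler)
    return handler


# Capture the output in memory so it can be inspected afterwards.
buffer = io.StringIO()
handler = install_root_handler(logging.INFO, buffer)

log = logging.getLogger("demo.spider")  # illustrative logger name
log.debug("dropped: below the handler level")
log.info("kept: at the handler level")

# Clean up so the demo handler does not leak into other logging setups.
logging.getLogger().removeHandler(handler)

output = buffer.getvalue()
print(output)
```

Only the INFO record ends up in the buffer. If the handler had been created with DEBUG instead, both records would appear, which matches the symptom of a handler configured from default settings rather than from the settings object that was modified.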

Upvotes: 0
