Reputation: 6173
I am using Scrapy (1.5.0), which apparently uses Pillow (5.2.0). When I run my script with scrapy runspider my_scrapy_script.py, stdout gets flooded with useless logging messages, e.g.:
2018-07-11 14:41:07 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-07-11 14:41:07 [PIL.Image] DEBUG: Importing BlpImagePlugin
2018-07-11 14:41:07 [PIL.Image] DEBUG: Importing BmpImagePlugin
2018-07-11 14:41:07 [PIL.Image] DEBUG: Importing BufrStubImagePlugin
2018-07-11 14:41:07 [PIL.Image] DEBUG: Importing CurImagePlugin
... many more of the like ...
I tried disabling them by setting the logger level like this:
logging.getLogger('PIL.Image').setLevel(logging.WARNING)
etc. It didn't help. I also tried setting the root logger level like this:
logging.getLogger().setLevel(logging.WARNING)
also with no effect; higher levels don't help. Setting LOG_LEVEL = logging.WARNING and even LOG_ENABLED = False in the Scrapy settings makes no difference either.
If I set LOG_LEVEL to 'INFO', it prints:
2018-07-11 07:04:42 [scrapy.crawler] INFO: Overridden settings: {'LOG_ENABLED': False, 'LOG_LEVEL': 'INFO', 'SPIDER_LOADER_WARN_ONLY': True}
so it looks like the above-mentioned flood is produced before the script is loaded.
Upvotes: 2
Views: 3440
Reputation: 9585
As @Lodi suggested in a comment on the question, the only way I could stop Scrapy from flooding the production logs with debug messages (including the full HTML of the scraped pages) in a Django project using Celery was to disable propagation of the scrapy logger. So, what I did is:
settings.py:
import logging
if not DEBUG:
    logging.getLogger('scrapy').propagate = False
Also, I made my spider use a logger that descends from the 'scrapy' logger, since CrawlSpider.logger is not a descendant of the 'scrapy' logger. So in this case I used the scrapy.spiders logger to log messages from my spider, which inherits from CrawlSpider:
logger = logging.getLogger('scrapy.spiders')
And then use it with logger.debug(), logger.info(), etc.
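For illustration, here is a minimal sketch of how that can look inside a spider (the spider name and start URL are made up; only the logger wiring matters):

import logging
from scrapy.spiders import CrawlSpider

# Descendant of the 'scrapy' logger, so it obeys the propagate = False setting above
logger = logging.getLogger('scrapy.spiders')

class MySpider(CrawlSpider):  # hypothetical spider
    name = 'my_spider'
    start_urls = ['https://example.com']

    def parse_start_url(self, response):
        # Logged through 'scrapy.spiders' instead of self.logger
        logger.info('Parsed %s', response.url)
        return []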
Keep in mind that messages logged with a severity higher than debug and info, that is warning, error, critical and exception, will still be propagated although propagation is disabled on the scrapy logger. So you'll still see the DropItem exceptions logged.
Upvotes: 1
Reputation: 921
Another way:
import logging
from scrapy.utils.log import configure_logging

configure_logging(install_root_handler=True)
logging.disable(50)  # CRITICAL = 50
For the logging levels, see Python Logging Levels; for more information, see Scrapy Logging.
Another way, in the spider:
custom_settings = {
    'LOG_ENABLED': False,
    # ... other settings...
}
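A quick sketch of where that goes (the spider name here is hypothetical; custom_settings must be a class attribute so Scrapy picks it up before the crawler starts):

import scrapy

class QuietSpider(scrapy.Spider):  # hypothetical spider
    name = 'quiet_spider'
    custom_settings = {
        'LOG_ENABLED': False,
    }

    def parse(self, response):
        pass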
Upvotes: 1
Reputation: 1285
According to the documentation, start with an additional parameter:
https://doc.scrapy.org/en/latest/topics/logging.html
--loglevel/-L LEVEL
So it could be:
scrapy runspider my_scrapy_script.py --loglevel WARNING
Upvotes: 5
Reputation: 898
You could disable it completely with LOG_ENABLED=False. You could also pass settings during Scrapy invocation: scrapy runspider my_scrapy_script.py -s LOG_ENABLED=False
Upvotes: 0