Vaibhav Jain

Reputation: 5507

How to save Scrapy crawl Command output

I am trying to save the output of the scrapy crawl command. I have tried

scrapy crawl someSpider -o some.json -t json >> some.text

but it didn't work. Can somebody tell me how I can save the output to a text file? I mean the logs and the information printed by Scrapy.

Upvotes: 18

Views: 24099

Answers (7)

Kumar Deepam

Reputation: 1029

scrapy crawl someSpider --logfile some.text

This does exactly what you are looking for: it saves the output of the command that you see on the screen to a text file.

Upvotes: 0

Moein Kameli

Reputation: 976

You can save the output as a log file:

scrapy crawl someSpider -s LOG_FILE=fileName.log -L <loglevel>

The log level can be one of CRITICAL, ERROR, WARNING, INFO, or DEBUG, or you can pass --nolog for no logging at all. For more information, read the Scrapy docs.
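
For example, to write INFO-level output to a file (the file name here is just an example):

scrapy crawl someSpider -s LOG_FILE=scrapy_output.log -L INFO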

Upvotes: 0

tomjn

Reputation: 5389

For all scrapy commands you can add --logfile NAME_OF_FILE to log to a file, e.g.

scrapy crawl someSpider -o some.json --logfile some.text

There are two other useful command line options for logging:

  • -L or --loglevel to control the logging level e.g. -L INFO (the default is DEBUG)

  • --nolog to disable logging completely

These commands are documented here.
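
Putting these options together, for example, to save the items and keep a quieter log (the file names are just examples):

scrapy crawl someSpider -o some.json --logfile some.text -L INFO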

Upvotes: 9

Hackaholic

Reputation: 19763

You can use nohup:

nohup scrapy crawl someSpider &

The log will be stored in nohup.out.
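
This works because nohup also sends stderr (where Scrapy logs) to nohup.out. If you would rather choose the file name yourself, redirect explicitly; crawl.log below is just an example:

nohup scrapy crawl someSpider > crawl.log 2>&1 &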

Upvotes: 0

EVX

Reputation: 312

If you want to capture the output of the runspider command:

scrapy runspider scraper.py -o some.json -t json 2> some.text

This works as well.

Upvotes: 1

claire_

Reputation: 733

You can add these lines to your settings.py:

LOG_STDOUT = True
LOG_FILE = '/tmp/scrapy_output.txt'

And then start your crawl normally:

scrapy crawl someSpider
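
If you run your spider from a script rather than the scrapy CLI, you can pass the same two settings to CrawlerProcess instead of putting them in settings.py. A minimal sketch, where the spider name, URL, and parse logic are placeholders for illustration:

import scrapy
from scrapy.crawler import CrawlerProcess

class SomeSpider(scrapy.Spider):
    # Placeholder spider, for illustration only.
    name = "someSpider"
    start_urls = ["https://example.com"]

    def parse(self, response):
        yield {"title": response.css("title::text").get()}

# Same effect as putting LOG_STDOUT/LOG_FILE in settings.py.
process = CrawlerProcess(settings={
    "LOG_STDOUT": True,                    # route stdout (e.g. print()) into the log
    "LOG_FILE": "/tmp/scrapy_output.txt",  # write the log to this file
})
process.crawl(SomeSpider)
process.start()  # blocks until the crawl finishes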

Upvotes: 45

JoshuaBoshi

Reputation: 1286

You need to redirect stderr too; you are redirecting only stdout. You can redirect it like this:

scrapy crawl someSpider -o some.json -t json 2> some.text

The key is the number 2, which selects stderr as the source for the redirection.

If you would like to redirect both stderr and stdout into one file, you can use:

scrapy crawl someSpider -o some.json -t json &> some.text
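
Note that &> is a bash shortcut and is not portable. In a plain POSIX shell, redirect stdout to the file and then duplicate stderr onto stdout:

scrapy crawl someSpider -o some.json -t json > some.text 2>&1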

For more about output redirection: http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html

Upvotes: 21
