Reputation: 31
The syntax given at https://github.com/scrapinghub/portia#running-a-portia-spider is:
portiacrawl PROJECT_PATH SPIDER_NAME
I tried running:
portiacrawl D:/portia-master/slyd/data/projects/darkwoods example
portiacrawl slyd/data/projects/darkwoods example
portiacrawl slyd/data/projects/darkwoods
But they all give me the same help message:
Usage: portiacrawl <project dir/project zip> [spider] [options]
Allow to easily run slybot spiders on console. If spider is not given, print a
list of available spiders inside the project
Options:
  -h, --help            show this help message and exit
  --settings=SETTINGS   Give specific settings module (must be on python path)
  --logfile=LOGFILE     Specify log file
  -a NAME=VALUE         Add spider arguments
  -s NAME=VALUE         Add extra scrapy settings
  -o FILE, --output=FILE
                        dump scraped items into FILE (use - for stdout)
  -t FORMAT, --output-format=FORMAT
                        format to use for dumping items with -o (default:
                        jsonlines)
  -v, --verbose         more verbose
I am very new to portia, so I am confused about what to do. Can anyone give me a sample of what I should write for PROJECT_PATH? I am currently using portia via vagrant.
Upvotes: 1
Views: 2315
Reputation: 1
I have created portia-dashboard, which you can find on GitHub; a Docker image is also available on Docker Hub. With portia-dashboard, you can deploy a project, start a spider, or monitor job status with a mouse click in a simple web interface. Refer to the docs for detailed information on how to start a spider.
Upvotes: 0
Reputation: 133
You can use scrapyd to run the spider.
curl http://your_scrapyd_host:6800/schedule.json -d project=your_project_name -d spider=your_spider_name
This way you also get basic monitoring of the spider. I also found a quick and simple web interface that helps with working with the spider once it has been deployed with scrapyd: https://gist.github.com/MihaiCraciun/78f0a53b7a99587d178b
Hope it helps!
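If you would rather trigger the job from a script than from curl, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function names build_schedule_request and schedule_spider are made up for illustration, and the host, project, and spider values are placeholders just like in the curl command.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_schedule_request(host, project, spider, port=6800):
    """Build the POST request that the curl command sends to scrapyd.

    Hypothetical helper: host, project and spider are placeholders
    that you must replace with your own values.
    """
    url = "http://{}:{}/schedule.json".format(host, port)
    # curl -d sends the fields as a URL-encoded form body, so do the same here.
    body = urlencode({"project": project, "spider": spider}).encode("ascii")
    return Request(url, data=body, method="POST")


def schedule_spider(host, project, spider, port=6800):
    """Send the request and return scrapyd's reply as text (needs a running scrapyd)."""
    with urlopen(build_schedule_request(host, project, spider, port)) as resp:
        return resp.read().decode("utf-8")


# Example call (commented out because it needs a reachable scrapyd instance):
# print(schedule_spider("your_scrapyd_host", "your_project_name", "your_spider_name"))
```

Building the request is kept separate from sending it so you can inspect the URL and form body before anything goes over the network.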
Upvotes: 0
Reputation: 31
I forget which question it was, but someone mentioned that you should cd to the directory before using the portiacrawl command. After exploring the vagrant box for a while, I found the directory: it's at /vagrant/slyd/data/projects.
So to run portiacrawl, you just have to cd to the portia directory first and then run:
portiacrawl /vagrant/slyd/data/projects/[project name] [spider] [options]
I ran this command and it worked:
portiacrawl /vagrant/slyd/data/projects/darkwoods example
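For anyone who wants to launch this from a script inside the vagrant box, here is a rough Python sketch that assembles the same command line. The helper names portiacrawl_args and run_spider are hypothetical; the -o flag comes from the help message quoted in the question, and the output file name in the usage comment is only an example.

```python
import subprocess


def portiacrawl_args(project_dir, spider=None, output=None):
    """Assemble the portiacrawl command line as a list of arguments.

    Hypothetical helper. If spider is omitted, portiacrawl lists the
    spiders available in the project (per its help message).
    """
    args = ["portiacrawl", project_dir]
    if spider:
        args.append(spider)
    if output:
        # -o FILE: dump scraped items into FILE (from the help text)
        args += ["-o", output]
    return args


def run_spider(project_dir, spider=None, output=None):
    """Run portiacrawl and return its exit code (portiacrawl must be on PATH)."""
    return subprocess.call(portiacrawl_args(project_dir, spider, output))


# Example call (commented out because it needs portia installed):
# run_spider("/vagrant/slyd/data/projects/darkwoods", "example", output="items.jl")
```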
Upvotes: 1