Reputation: 865
I am trying to deploy a Scrapy project with scrapyd. I can run my project normally with:
cd /var/www/api/scrapy/dirbot
scrapy crawl dmoz
These are the steps I did:
1/ I ran
scrapy version -v
>> Scrapy : 0.16.3
lxml : 3.0.2.0
libxml2 : 2.7.8
Twisted : 12.2.0
Python : 2.7.3 (default, Aug 1 2012, 05:14:39) - [GCC 4.6.3]
Platform: Linux-3.2.0-31-virtual-x86_64-with-Ubuntu-12.04-precise
2/ Installed scrapyd with:
aptitude install scrapyd-0.16
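(As far as I understand, this package sets scrapyd up as a service listening on its default port, 6800, which I think I can check with something like:)
curl http://localhost:6800/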
3/ I have a Scrapy project at /var/www/api/scrapy/dirbot (http://domain.com/api/scrapy/dirbot). I edited scrapy.cfg:
[settings]
default = dirbot.settings
[deploy:scrapyd2]
url = http://domain.com/api/scrapy/dirbot/
username = vu
password = hoang
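(For comparison, I have seen deploy targets in the Scrapy docs that point at the scrapyd service itself, which listens on port 6800 by default, rather than at the project directory, so perhaps my target should look something like this instead:)
[deploy:scrapyd2]
url = http://localhost:6800/
username = vu
password = hoang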
4/ I tested with the deploy command:
scrapy deploy -l
>> scrapyd2 http://domain.com/api/scrapy/dirbot/
5/ But when I use the command
scrapy deploy -L scrapyd2
>> /usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/settings/deprecated.py:23: ScrapyDeprecationWarning: You are using the following settings which are deprecated or obsolete (ask [email protected] for alternatives):
BOT_VERSION: no longer used (user agent defaults to Scrapy now)
warnings.warn(msg, ScrapyDeprecationWarning)
Traceback (most recent call last):
File "/usr/local/bin/scrapy", line 5, in <module>
pkg_resources.run_script('Scrapy==0.16.3', 'scrapy')
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 499, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1235, in run_script
execfile(script_filename, namespace, namespace)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/EGG-INFO/scripts/scrapy", line 4, in <module>
execute()
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/cmdline.py", line 131, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/cmdline.py", line 76, in _run_print_help
func(*a, **kw)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/cmdline.py", line 138, in _run_command
cmd.run(args, opts)
File "/usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/commands/deploy.py", line 76, in run
f = urllib2.urlopen(req)
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 406, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 519, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 444, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 527, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
and
scrapy deploy scrapyd2 -p project
>> /usr/local/lib/python2.7/dist-packages/Scrapy-0.16.3-py2.7.egg/scrapy/settings/deprecated.py:23: ScrapyDeprecationWarning: You are using the following settings which are deprecated or obsolete (ask [email protected] for alternatives):
BOT_VERSION: no longer used (user agent defaults to Scrapy now)
warnings.warn(msg, ScrapyDeprecationWarning)
Building egg of project-1358597244
'build/scripts-2.7' does not exist -- can't clean it
zip_safe flag not set; analyzing archive contents...
Deploying project-1358597244 to http://domain.com/api/scrapy/dirbot/addversion.json
Deploy failed (404):
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL /api/scrapy/dirbot/addversion.json was not found on this server.</p>
<hr>
<address>Apache/2.2.22 (Ubuntu) Server at domain.com Port 80</address>
</body></html>
* I don't understand what a Python egg is. Can you give me an example of one? I don't know whether I have one or not. Maybe it is this file, /var/www/api/scrapy/dirbot/setup.py?
from setuptools import setup, find_packages

setup(
    name = 'project',
    version = '1.0',
    packages = find_packages(),
    entry_points = {'scrapy': ['settings = dirbot.settings']},
)
* How can I deploy my project? I don't know exactly what I did wrong, or whether I missed a step.
Thanks
Upvotes: 0
Views: 2675
Reputation: 2258
Well, from the error it seems that the website you want to crawl gives a 404 error; either you put the wrong website or there is some configuration error.
About the Python egg: there is a good answer about that, check it at What is a Python egg?
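For example, a setup.py like the one in your question can be built into an egg with setuptools (a rough sketch; the exact file name depends on the project metadata):
cd /var/www/api/scrapy/dirbot
python setup.py bdist_egg
# the egg (a zip archive of the package plus its metadata) ends up in dist/,
# e.g. dist/project-1.0-py2.7.egg
As far as I can tell from your output, scrapy deploy builds such an egg for you automatically (the "Building egg of project-1358597244" line) before uploading it.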
About setup.py: as far as I know, it is used to install a Python application from source, using the command python setup.py install
Edit: it seems I got confused with the pip command, sorry about that.
Upvotes: 1