Reputation: 2136
I want to deploy my scrapy project on a ip that is not listed in the scrapy.cfg file , because the ip can change and i want to automate the process of deploying. i tried giving the ip of the server directly in the deploy command but it did not work. any suggestion to do this?
Upvotes: 1
Views: 855
Reputation: 8624
First, you should consider assigning a domain to the server, so you can always get to it regardless of its dynamic IP. DynDNS comes handy at times.
Second, you probably won't do the first, because you haven't got access to the server, or for whatever other reason. In that case, I suggest mimicking above behavior by using your system's hosts
file. As described at wikipedia article:
The hosts file is a computer file used by an operating system to map hostnames to IP addresses.
For example, lets say you set your url
to remotemachine
in your scrapy.cfg
. You can write a script that would edit the hosts
file with the latest IP address, and execute it before deploying your spider. This approach has a benefit of having a system-wide effect, so if you are deploying multiple spiders, or using the same server for some other purpose, you don't have to update multiple configuration files.
This script could look something like this:
import fileinput
import sys
def update_hosts(hostname, ip):
if 'linux' in sys.platform:
hosts_path = '/etc/hosts'
else:
hosts_path = 'c:\windows\system32\drivers\etc\hosts'
for line in fileinput.input(hosts_path, inplace=True):
if hostname in line:
print "{0}\t{1}".format(hostname, ip)
else:
print line.strip()
if __name__ == '__main__':
hostname = sys.argv[1]
ip = sys.argv[2]
update_hosts(hostname, ip)
print "Done!"
Ofcourse,you should do additional argument checks, etc., this is just a quick example.
You can then run it prior deploying like this:
python updatehosts.py remotemachine <remote_ip_here>
If you want to take it a step further and add this functionality as a simple argument to scrapyd-deploy, you can go ahead and edit your scrapyd-deploy
file (its just a Python script) to add the additional parameter and update the hosts
file from within. But I'm not sure this is the best thing to do, since leaving this implementation separate and more explicit would probably be a better choice.
Upvotes: 2
Reputation: 474001
This is not something you can solve on the scrapyd
side.
According to the source code of scrapyd-deploy
, it requires the url
to be defined in the [deploy]
section of the scrapy.cfg
.
One of the possible workarounds could be having a placeholder in scrapy.cfg
which you would replace with a real IP address of the target server, before starting scrapyd-deploy
.
Upvotes: 1