Reputation: 5084
I have to send an extension setting to scrapy when I run a spider. It's really very easy when I use curl:
http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
But when I wanted to plug this in a python script based on the module requests, I was a bit confused about setting=DOWNLOAD_DELAY=2, because it doesn't follow the usual form (key=value). So I tried this:
r = requests.post("http://httpbin.org/get", params={'arg1': 'val1', 'setting=DOWNLOAD_DELAY': '2'})
But has no effect on the usual scrapy's behaviour.
Thanks in advance.
Upvotes: 3
Views: 1080
Reputation: 366083
Normally, in key-value pairs passed on the command line, you split on the first =
, not the second. So, do this:
r = requests.post("http://httpbin.org/get", params={'arg1': 'val1', 'setting': 'DOWNLOAD_DELAY=2'})
For example, in the GNU documentation for Program Argument Syntax Conventions:
Long options consist of ‘--’ followed by a name made of alphanumeric characters and dashes. Option names are typically one to three words long, with hyphens to separate words. Users can abbreviate the option names as long as the abbreviations are unique.
To specify an argument for a long option, write ‘--name=value’. This syntax enables a long option to accept an argument that is itself optional.
In other words, in --foo=bar=baz
, foo
is the name
, and bar=baz
is the value
, because =
is not an alphanumeric character or a dash.
Similarly, curl
handles the option -d foo=bar=baz
with foo
as the name
and bar=baz
as the value.
You can't directly infer that from any spec—in fact, you can't even directly infer that curl
follows the GNU argument syntax at all, since it's not a GNU program and (IIRC) does its own custom argument parsing. So, you'd have to read the source to be absolutely sure.
Or, more simply, test it. Capture the form-encoded request that curl
sends out. (If you don't know how to do that: try just running a fake server with netcat
, e.g., nc -kl 8888
on a Mac/BSD system, then curl http://localhost:8888/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1
, and see what shows up on the command line.)
But this kind of behavior is pretty much an implicit standard whenever you have name=value
pairs.
Upvotes: 4