smarber
smarber

Reputation: 5084

Convert post request performed by curl to python request based on requests module

I have to send an extension setting to scrapy when I run a spider. It's really very easy when I use curl:

 http://localhost:6800/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1

But when I wanted to plug this in a python script based on the module requests, I was a bit confused about setting=DOWNLOAD_DELAY=2, because it doesn't follow the usual form (key=value). So I tried this:

r = requests.post("http://httpbin.org/get", params={'arg1': 'val1', 'setting=DOWNLOAD_DELAY': '2'})

But has no effect on the usual scrapy's behaviour.

Thanks in advance.

Upvotes: 3

Views: 1080

Answers (1)

abarnert
abarnert

Reputation: 366083

Normally, in key-value pairs passed on the command line, you split on the first =, not the second. So, do this:

r = requests.post("http://httpbin.org/get", params={'arg1': 'val1', 'setting': 'DOWNLOAD_DELAY=2'})

For example, in the GNU documentation for Program Argument Syntax Conventions:

Long options consist of ‘--’ followed by a name made of alphanumeric characters and dashes. Option names are typically one to three words long, with hyphens to separate words. Users can abbreviate the option names as long as the abbreviations are unique.

To specify an argument for a long option, write ‘--name=value’. This syntax enables a long option to accept an argument that is itself optional.

In other words, in --foo=bar=baz, foo is the name, and bar=baz is the value, because = is not an alphanumeric character or a dash.

Similarly, curl handles the option -d foo=bar=baz with foo as the name and bar=baz as the value.

You can't directly infer that from any spec—in fact, you can't even directly infer that curl follows the GNU argument syntax at all, since it's not a GNU program and (IIRC) does its own custom argument parsing. So, you'd have to read the source to be absolutely sure.

Or, more simply, test it. Capture the form-encoded request that curl sends out. (If you don't know how to do that: try just running a fake server with netcat, e.g., nc -kl 8888 on a Mac/BSD system, then curl http://localhost:8888/schedule.json -d project=myproject -d spider=somespider -d setting=DOWNLOAD_DELAY=2 -d arg1=val1, and see what shows up on the command line.)

But this kind of behavior is pretty much an implicit standard whenever you have name=value pairs.

Upvotes: 4

Related Questions