Reputation: 2220
I'm looking for the safest and most convenient way to call a shell command from Python 3. Here is a PS to PDF conversion:
gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile="${pdf_file}" "${ps_file}"
I use subprocess and shlex, and avoid shell=True.
But I find the resulting command inconsistent:
cmd = ['gs', '-dBATCH', '-dNOPAUSE', '-sDEVICE=pdfwrite', '-sOutputFile={0}'.format(pdf_filename), ps_filename]
What am I missing? The subprocess.call() syntax looks so clean with space-separated arguments, and such a mess everywhere else.
What is the difference, when calling subprocess.call(cmd) (at the Python level, i.e. escaping, injection protection, quoting, etc.), between:
cmd = ['do', '--something', arg]
cmd = ['do', '--something {0}'.format(arg)]
If there is none, is this also a good way to do it?
cmd = ['gs', '-dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile={0} {1}'.format(pdf_filename, ps_filename)]
Another example of inconsistency:
hg merge -r 3 would be cmd = ['hg', 'merge', '-r', revision_id]
hg merge --rev=3 would be cmd = ['hg', 'merge', '--rev={0}'.format(revision_id)]
even though these are two ways of passing the same argument.
Upvotes: 1
Views: 3659
Reputation: 1421
The difference is that the command may have a --something option which accepts an argument, but it does not have a "--something foo" option, which is what the second form would be telling it. When you run a command in your shell, like wc -l myfile.txt, the shell splits that command line wherever it finds unquoted spaces, so the command that gets run is ['wc', '-l', 'myfile.txt'].
The subprocess module does not perform such splitting. You have to do it yourself (unless you use the 'shell' option, but that is generally less secure, so avoid it if you can).
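To see the splitting a POSIX shell would have done, the standard shlex module can help. This is only a minimal sketch; the file names are made up for illustration:

```python
import shlex

# shlex.split() mimics POSIX shell word splitting, including quote handling.
simple = shlex.split('wc -l myfile.txt')
quoted = shlex.split('wc -l "my file.txt"')

print(simple)  # ['wc', '-l', 'myfile.txt']
print(quoted)  # ['wc', '-l', 'my file.txt']
```

This is handy for a fixed command string. When parts of the command come from variables, building the list by hand avoids having to quote them at all.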
Some anti-examples...
Try to run a command named "wc -l myfile.txt". Of course, there is no "wc -l myfile.txt" command installed, only a "wc" command, so this will fail:
['wc -l myfile.txt']
Try to run a command "wc" with an option "-l myfile.txt". There is an "-l" option, but no "-l myfile.txt" option. This will fail:
['wc', '-l myfile.txt']
and a correct example:
['wc', '-l', 'myfile.txt']
This calls wc with the -l option (print only the line count) and myfile.txt as the only filename.
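One way to check what a program actually receives is to swap it for a child process that reports its own arguments. The sketch below assumes Python 3.7+ (for text=True) and uses a child Python interpreter as a hypothetical stand-in for wc; the helper name received_args is invented for this example:

```python
import json
import subprocess
import sys

# Stand-in for a real command: a child Python process that prints the
# arguments it was given as a JSON list.
SHOW_ARGS = [sys.executable, '-c',
             'import json, sys; print(json.dumps(sys.argv[1:]))']

def received_args(args):
    """Run the stand-in child with args and return the list it saw."""
    out = subprocess.check_output(SHOW_ARGS + list(args), text=True)
    return json.loads(out)

print(received_args(['-l', 'myfile.txt']))  # two separate arguments
print(received_args(['-l myfile.txt']))     # one argument containing a space
```

The second call shows that nothing splits the string on the way to the child: it arrives as a single argument with the space inside it.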
Something you may have found confusing is fragments like this:
'-sOutputFile={0}'
This is an 'inline' style of giving the argument of an option. If this is supported, the help for the program usually says so explicitly. Python does not split these -- the program receiving them does.
There are three main styles of 'inline' arguments. I'll use grep options to demo the first two:
--context=3
-C3
(the above two lines are equivalent)
The third type is only found in ImageMagick and a few other programs that tend to have reams of command-line arguments, such as gs:
-sOutputFile=foo
This is just a minor variation on the GNU standard --long-option=VALUE form shown above.
The GNU libc manual's "argument syntax" section gives a full explanation of these option passing conventions.
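That the receiving program does the splitting can be illustrated with Python's own argparse playing the role of that program. This is a sketch only: the parser below is a hypothetical stand-in modelled on grep's -C / --context option, not grep itself, and it does not cover the gs-style -sOutputFile=foo form.

```python
import argparse

# Hypothetical parser with one option that takes an integer value.
parser = argparse.ArgumentParser(prog='demo')
parser.add_argument('-C', '--context', type=int)

# Four spellings of the same option, as they would appear in a cmd list.
spellings = (['--context=3'], ['--context', '3'], ['-C3'], ['-C', '3'])
results = [parser.parse_args(argv).context for argv in spellings]

for argv, value in zip(spellings, results):
    print(argv, '->', value)
```

Each spelling is a different list at the subprocess level (one item or two), and it is the parser inside the program that treats them as equivalent.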
Regarding escaping: no escaping is done, and none is normally needed. The string values are passed to the command exactly as you specify them. Likewise, no quoting is done or needed, since each list item is already a separate argument in your Python code.
Regarding injection: shell injection is not possible unless you use the 'shell' option. Don't use the 'shell' option :).
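As a rough check of the injection point, a value containing shell metacharacters can be passed in a list and inspected on the other side. The hostile value and the reporting child process below are assumptions made up for this sketch (Python 3.7+ for text=True):

```python
import json
import subprocess
import sys

# Hypothetical hostile value, e.g. a file name supplied by a user.
hostile = 'foo.ps; echo INJECTED'

# Stand-in child process that reports its arguments as a JSON list.
child = [sys.executable, '-c',
         'import json, sys; print(json.dumps(sys.argv[1:]))']

# No shell is involved, so the ';' is never interpreted as a separator.
out = subprocess.check_output(child + [hostile], text=True)
seen = json.loads(out)
print(seen)  # ['foo.ps; echo INJECTED']
```

The child sees the whole string as one literal argument, and no second command runs.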
Upvotes: 6
Reputation: 43447
The difference between the two forms you asked about is easy to check:
arg = 'foo'
cmd = ['do', '--something', arg]
print(cmd)
cmd = ['do', '--something {0}'.format(arg)]
print(cmd)
>>>
['do', '--something', 'foo']
['do', '--something foo']
As you can see they are not the same.
In order to call your subprocess correctly, you should do this:
cmd = ['gs', '-dBATCH', '-dNOPAUSE', '-sDEVICE=pdfwrite', '-sOutputFile={0}'.format(pdf_filename), ps_filename]
subprocess.Popen(cmd, ...)
OR:
cmd = 'gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile={0} {1}'.format(pdf_filename, ps_filename)
subprocess.Popen(cmd, shell=True, ...)
The difference between using a list of arguments and a string:
When you use a list of arguments (without shell=True), the first item is the executable and the remaining items are passed to it directly as its arguments; no shell is involved.
When you send a string with shell=True, you let the shell parse the string and split it into arguments itself.
So ['do', '--something', 'foo'] is 3 arguments, while ['do', '--something foo'] is only 2 arguments.
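Putting the list form together for the gs command from the question might look like the sketch below. The function names build_gs_command and ps_to_pdf are invented for this example, and actually running ps_to_pdf assumes gs is installed and on the PATH:

```python
import subprocess

def build_gs_command(ps_filename, pdf_filename):
    """Return the argument list for the PS to PDF conversion."""
    return [
        'gs',
        '-dBATCH',
        '-dNOPAUSE',
        '-sDEVICE=pdfwrite',
        '-sOutputFile={0}'.format(pdf_filename),  # option and value: one argument
        ps_filename,                              # input file: its own argument
    ]

def ps_to_pdf(ps_filename, pdf_filename):
    """Run gs; raises CalledProcessError on a non-zero exit status."""
    subprocess.check_call(build_gs_command(ps_filename, pdf_filename))

# File names with spaces need no quoting in the list form.
cmd = build_gs_command('my report.ps', 'my report.pdf')
print(len(cmd))  # 6
print(cmd[4])    # -sOutputFile=my report.pdf
```

Keeping the list construction in its own function makes the argument boundaries easy to inspect without running gs.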
Upvotes: 2