TheOx

Reputation: 2228

Benefit of using os.mkdir vs os.system("mkdir")

Simple question that I can't find an answer to:

Is there a benefit of using os.mkdir("somedir") over os.system("mkdir somedir") or subprocess.call(), beyond code portability?

Answers should apply to Python 2.7.

Edit: the point was raised that using a variable (possibly containing user-supplied data) instead of a hard-coded directory name introduces the question of security. My original question was meant from a system perspective (i.e. what's going on under the hood), but security concerns are valid and should be part of a complete answer, as should directory names containing spaces.

Upvotes: 3

Views: 5572

Answers (1)

Charles Duffy

Reputation: 295649

Correctness

Think about what happens if your directory name contains spaces:

mkdir hello world

...creates two directories, hello and world. And if you just blindly wrap the name in quotes, that breaks when the name itself contains that kind of quote:

'mkdir "' + somedir + '"'

...does very little good when somedir contains hello "cruel world".d.
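As a rough illustration of the splitting described above, here is a minimal sketch using shlex.split, which approximates POSIX shell word splitting (it performs no expansions, so it is only a stand-in for a real shell). The import fallback to pipes.quote is there for Python 2.7, and the variable names are just for this example.

```python
import shlex

try:
    from shlex import quote  # Python 3.3+
except ImportError:
    from pipes import quote  # Python 2.7

somedir = 'hello "cruel world".d'

# Unquoted: the space splits the name into two separate arguments.
unquoted = shlex.split('mkdir hello world')

# Naive double quotes: the quotes embedded in the name end the quoting early,
# so mkdir still receives two mangled arguments.
naive = shlex.split('mkdir "' + somedir + '"')

# Proper shell quoting keeps the whole name as a single argument.
quoted = shlex.split('mkdir ' + quote(somedir))

print(unquoted)
print(naive)
print(quoted)
```

Quoting correctly is possible, but os.mkdir(somedir) avoids the problem entirely because no command string is ever parsed.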


Security

In the case of:

os.system('mkdir somedir')

...consider what happens if the value you're substituting for somedir is ./$(rm -rf /)/hello: the shell performs the command substitution and runs rm -rf / before mkdir ever sees its argument.

Also, calling os.system() (or subprocess.call() with shell=True) invokes a shell, which means that you can be open to bugs such as ShellShock; if your /bin/sh were provided by a ShellShock-vulnerable bash, and your code provided any mechanism for arbitrary environment variables to be present (as is the case with HTTP headers via CGI), this would provide an opportunity for code injection.
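For contrast, a small sketch of what happens when no shell is involved. It uses a harmless stand-in for the hostile value (the name `hostile` and the `$(echo injected)` payload are made up for this example) and works inside a temporary directory that is removed afterwards.

```python
import os
import shutil
import tempfile

# Hypothetical hostile input; a harmless stand-in for a destructive command.
hostile = '$(echo injected)'

base = tempfile.mkdtemp()
try:
    # os.mkdir never passes the name through a shell, so the
    # command-substitution syntax is treated as ordinary characters.
    os.mkdir(os.path.join(base, hostile))
    created = os.listdir(base)
finally:
    shutil.rmtree(base)

print(created)
```

The directory ends up with the literal name, and nothing inside the $( ) is executed.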


Performance

os.system('mkdir somedir')

...starts a shell:

/bin/sh -c 'mkdir somedir'

...which then needs to be linked and loaded; needs to parse its arguments; and needs to invoke the external command mkdir (meaning another link and load cycle).


A significant improvement is the following:

subprocess.call(['mkdir', '--', somedir], shell=False)

...which only invokes the external mkdir command, with no shell; however, as it involves a fork()/exec() cycle, this is still a significant performance penalty over the C-library mkdir() call.

In the case of os.mkdir(somedir), the Python interpreter directly invokes the appropriate syscall -- no external commands at all.
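To get a feel for the difference on your own machine, something like the following sketch could be used. It assumes an external mkdir command that accepts `--` is on the PATH; the helper names and the sample size of 50 are arbitrary choices for this example, and the actual numbers will vary by system.

```python
import os
import shutil
import subprocess
import tempfile
import time


def time_mkdirs(make_dir, base, prefix, count):
    """Create `count` directories with make_dir; return elapsed seconds."""
    start = time.time()
    for i in range(count):
        make_dir(os.path.join(base, '%s%d' % (prefix, i)))
    return time.time() - start


def mkdir_external(path):
    # Assumption: an external `mkdir` that understands `--` is available.
    subprocess.call(['mkdir', '--', path], shell=False)


base = tempfile.mkdtemp()
try:
    count = 50  # arbitrary sample size
    direct = time_mkdirs(os.mkdir, base, 'direct', count)
    external = time_mkdirs(mkdir_external, base, 'external', count)
    created = len(os.listdir(base))
finally:
    shutil.rmtree(base)

print('os.mkdir:       %.4f s' % direct)
print('external mkdir: %.4f s' % external)
```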


Error Handling

If you call os.mkdir('somedir') and it fails, an OSError is raised with the appropriate errno set, so you can trivially determine the type of the error.

If the mkdir external command fails, you get a failed exit status, but no handle on the actual underlying problem without parsing its stderr (which is written for humans, not machine readability, and which will vary in contents depending on the system's current locale).
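A minimal sketch of inspecting the errno from a failed os.mkdir call, which raises OSError. The helper name describe_mkdir_failure is invented for this example, and everything happens inside a temporary directory.

```python
import errno
import os
import shutil
import tempfile


def describe_mkdir_failure(path):
    """Try os.mkdir(path); return None on success, else the errno name."""
    try:
        os.mkdir(path)
    except OSError as e:
        # e.errno identifies the cause without parsing any message text.
        return errno.errorcode.get(e.errno, 'unknown')
    return None


base = tempfile.mkdtemp()
try:
    target = os.path.join(base, 'somedir')
    first = describe_mkdir_failure(target)   # directory is created
    second = describe_mkdir_failure(target)  # directory already exists
    missing = describe_mkdir_failure(os.path.join(base, 'no', 'such', 'parent'))
finally:
    shutil.rmtree(base)

print('%s %s %s' % (first, second, missing))
```

With the external command, both failures would show up only as a non-zero exit status plus a locale-dependent message on stderr.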

Upvotes: 14
