Reputation: 9576
I have a Perl script that processes a bunch of file names, and uses those file names inside backticks. But the file names contain spaces, apostrophes and other funky characters.
I want to be able to escape them properly (i.e. not using a random regex off the top of my head). Is there a CPAN module that correctly escapes strings for use in bash commands? I know I've solved this problem in the past, but I can't find anything on it this time. There seems to be surprisingly little information on it.
Upvotes: 7
Views: 9022
Reputation: 437278
tl;dr
The following subroutine safely quotes (escapes) a list of filenames (paths) on both Unix-like and Windows systems:
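#!/usr/bin/env perl
# Note: the s///r (non-destructive substitution) flag requires Perl 5.14 or later.
sub quoteforshell {
    return join ' ', map {
        $^O eq 'MSWin32'
            ? '"' . s/"/""/gr . '"'        # cmd.exe: enclose in "...", double embedded "
            : "'" . s/'/'\\''/gr . "'"     # POSIX sh: enclose in '...', replace ' with '\''
    } @_;
}
# Sample invocation
my $shellcmd = ($^O eq 'MSWin32' ? 'echo ' : 'printf "%s\n" ') .
    quoteforshell('\\foo/bar', 'I\'m here', '3" of snow', 'bar |&;()<>#!');
print `$shellcmd`;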
Output of the sample command on Unix-like systems, showing that all input arguments were passed through unmodified:
\foo/bar
I'm here
3" of snow
bar |&;()<>#!
On Unix-like systems, it should work with any strings (except ones with embedded NUL chars), not just filenames - see below for details.
On Windows, embedded " instances are escaped as "", which is the only safe way to do it, but, sadly, may not be what the target program expects - see below for details; note, however, that this is not a concern if you're only passing filenames on Windows, because " is not a legal filename character.
See the bottom of this post for a shell-less command-invocation alternative that bypasses the "-quoting problem on Windows.
On Unix-like platforms, qx// (the generalized form of `...`) and the single-argument forms of system and exec invoke the shell by passing the command to /bin/sh -c. /bin/sh is assumed to be POSIX-compatible (and may or may not be Bash on a given system).
The single-argument forms of system and exec may or may not involve a shell: Perl decides, based on the specific command passed, whether a shell is needed. For instance, if a command contains embedded (literal) single or double quotes, the shell is called. Since the solution below is based on embedding single-quoted tokens in the command string, it also works with the single-argument forms of system and exec.
In POSIX-compatible shells you can take advantage of single-quoted strings, which do not interpolate their contents in any way.
The only challenge is to escape single quotes (') themselves, which requires trickery, because, strictly speaking, embedding single quotes in a single-quoted string is not supported by the shell.
The trick is to replace every ' instance with '\'' (sic), which works around the problem by effectively splitting the input string into multiple single-quoted strings, with escaped ' instances (\') spliced in; the shell then reassembles the parts into a single string.
Here's a subroutine that takes a list of strings (filenames) and returns a space-separated string of quoted versions of those strings, which guarantees that the shell treats them literally:
sub quoteforsh { join ' ', map { "'" . s/'/'\\''/gr . "'" } @_ }
Example (uses most POSIX shell metacharacters):
my $shellcmd = 'printf "%s\n" ' .
quoteforsh('\\foo/bar', 'I\'m here', '3" of snow', 'bar |&;()<>#!');
print `$shellcmd`;
This passes the following to /bin/sh -c (shown here as a pure literal, without any quoting):
printf "%s\n" '\foo/bar' 'I'\''m here' '3" of snow' 'bar |&;()<>#!'
Note how each input string is enclosed in single quotes, and how the only character that needed quoting among all input strings was ', which, as discussed, was replaced with '\''.
This should output the input strings as-is, one on each line:
\foo/bar
I'm here
3" of snow
bar |&;()<>#!
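To check the quoting rule against your own inputs, a round-trip test can help. The following is only a sketch for Unix-like systems: it repeats the quoteforsh() definition from above, and roundtrip() is a hypothetical helper name introduced here for illustration. Each sample string is sent through /bin/sh with printf %s and compared with what comes back.

```perl
use strict;
use warnings;
use 5.014;    # for the s///r flag used by quoteforsh()

# Same definition as above: enclose in '...' and replace each ' with '\''.
sub quoteforsh { join ' ', map { "'" . s/'/'\\''/gr . "'" } @_ }

# Hypothetical helper: pass one string through the shell and return
# exactly what the shell handed to printf.
sub roundtrip {
    my ($str) = @_;
    my $cmd = 'printf %s ' . quoteforsh($str);
    return scalar `$cmd`;
}

my @samples = (
    "I'm here",
    '3" of snow',
    'bar |&;()<>#!',
    '$HOME `id` *',
    "two\nlines",
    "''",
);

for my $s (@samples) {
    my $got = roundtrip($s);
    print( ($got eq $s ? 'ok   ' : 'FAIL '), $s, "\n" );
}
```

Strings with embedded NUL characters are deliberately left out, since they cannot be passed as command-line arguments.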
On Windows, the analogous subroutine looks like this:
sub quoteforcmdexe { join ' ', map { '"' . s/"/""/gr . '"' } @_ }
This works analogously to quoteforsh() above, except that double quotes are used to enclose each argument, because cmd.exe doesn't support single-quoting.
The only character that needs escaping is ", which is escaped as "" - note, however, that for filenames this isn't strictly necessary, because Windows doesn't allow " instances in filenames.
However, there are limitations and pitfalls:
You cannot prevent the expansion of references to existing environment variables, such as %USERNAME%; by contrast, non-existing variables or isolated % instances are fine.
You should be able to escape % instances as %%, but while that works in a batch file, it inexplicably doesn't work from Perl: `perl "%%USERNAME%%.pl"` complains, e.g., about %jdoe%.pl not being found, implying that %USERNAME% was interpolated, despite the doubled % chars.
(On the other hand, isolated % instances in double-quoted strings don't need escaping the way they do in batch files.)
Escaping embedded " instances as "" is the only SAFE way to do it, but it is not what most target programs expect.
If you escape them as \" instead, part of the argument list may never be passed to the target program, with the remaining part either causing failure, unwanted redirection to a file, or, worse, unexpected execution of arbitrary commands.
Conversely, if you use the "" escaping that is safe for cmd.exe, you may break the target program's parsing.
Alternative: shell-less command invocation
If your command is an invocation of a single executable with all arguments to be passed as-is, there's no need to involve the shell at all, which also bypasses the "-quoting problem on Windows.
The following subroutine works on both Unix-like systems and Windows and is a shell-less alternative to qx// (`...`); it accepts the command to invoke as a list of arguments that are passed as-is:
sub qxnoshell {
    use IPC::Cmd;
    return unless @_;
    my @cmdargs = @_;
    if ($^O eq 'MSWin32') { # Windows
        # Ensure that the executable name ends in '.exe'
        $cmdargs[0] .= '.exe' unless $cmdargs[0] =~ m/\.exe$/i;
        unless (IPC::Cmd::can_run($cmdargs[0])) { # executable not found
            # Issue warning, as qx// would and open '-|' below does.
            my $warnmsg = "Executable '$cmdargs[0]' not found";
            scalar(caller) eq 'main' ? warn($warnmsg . "\n") : warnings::warnif('exec', $warnmsg);
            return;
        }
        for (@cmdargs[1..$#cmdargs]) {
            if (m'"') {
                s/"/\\"/g; # \-escape ALL embedded double-quotes (note the /g)
                $_ = '"' . $_ . '"'; # enclose as a whole in embedded double-quotes
            }
        }
    }
    # List form of open: the arguments are passed as-is, without a shell.
    open my $fh, '-|', @cmdargs or return;
    my @lines = <$fh>;
    close $fh;
    return wantarray ? @lines : join('', @lines);
}
Examples
# Unix: $out should receive literal '$$', which demonstrates that
# /bin/sh is not involved.
my $out = qxnoshell 'printf', '%s', '$$';
# Windows: $out should receive literal '%USERNAME%', which demonstrates
# that cmd.exe is not involved.
my $out = qxnoshell 'perl', '-e', 'print "%USERNAME%"';
The subroutine relies on IPC::Cmd.
open ..., '-|' on Windows still falls back on cmd.exe if the initial invocation attempt fails - the same applies to system() and exec(), incidentally.
To prevent this fallback to cmd.exe - which can have unintended consequences - the subroutine (a) ensures that the first list argument is an *.exe executable, (b) tries to locate it, and (c) only tries to invoke the command if the executable could be located.
On Windows, embedded double quotes in arguments are escaped as \".
Upvotes: 2
Reputation: 118118
Are you looking for quotemeta?
Returns the value of EXPR with all non-"word" characters backslashed.
Update: As hobbs points out in the comments, quotemeta is not intended for this purpose and, on further thought, might have problems with embedded NULs. String::ShellQuote, on the other hand, croaks upon encountering embedded NULs.
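For reference, a minimal sketch of how String::ShellQuote might be used. It assumes the module has been installed from CPAN (it is not part of core Perl), and try_shell_quote() is a hypothetical wrapper name used only here; it returns undef when the module is missing or when shell_quote() croaks.

```perl
use strict;
use warnings;

# Hypothetical wrapper: returns the arguments quoted for a Bourne-style
# shell, or undef if String::ShellQuote is unavailable or rejects the input.
sub try_shell_quote {
    my @args = @_;
    return unless eval { require String::ShellQuote; 1 };
    # shell_quote() croaks on embedded NULs, so trap that too.
    my $quoted = eval { String::ShellQuote::shell_quote(@args) };
    return $quoted;
}

my $quoted = try_shell_quote("I'm here.txt", '3" of snow.txt');
print defined $quoted
    ? "ls -l -- $quoted\n"
    : "String::ShellQuote is not available\n";
```

The resulting string is meant to be spliced into a command line for /bin/sh; it does not apply to cmd.exe on Windows.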
The safest way is to avoid the shell entirely. Using the list form of system can go a long way towards that (I found out to my dismay a few months ago that cmd.exe might still get involved on Windows), and I would recommend it.
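As an illustration, here is a small sketch of the list form of system. The running Perl interpreter ($^X) stands in for a real command, and the file names are made up for the example; no quoting is applied because each list element is handed to the program as a separate argument.

```perl
use strict;
use warnings;

# Made-up file names containing characters the shell would treat specially.
my @files = ("I'm here.txt", '3" of snow.txt', 'bar |&;()<>#!');

# $^X is the path of the running perl; it stands in for a real command.
# With more than one list element, no shell is involved on Unix-like systems.
my $status = system($^X, '-e', 'print "[$_]\n" for @ARGV', @files);
die "Command failed, wait status $?\n" if $status != 0;
```

This form does not capture the command's output; it only reports the exit status.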
If you need the output of the command, you are best off (safety-wise) opening a pipe yourself, as shown in hobbs' answer.
Upvotes: 3
Reputation: 239801
If you can manage it (i.e. if you're invoking some command directly, without any shell scripting or advanced redirection shenanigans), the safest thing to do is to avoid passing data through the shell entirely.
In perl 5.8+:
my @output_lines = do {
open my $fh, "-|", $command, @args or die "Failed spawning $command: $!";
<$fh>;
};
If it's necessary to support 5.6:
my @output_lines = do {
my $pid = open my $fh, "-|";
die "Couldn't fork: $!" unless defined $pid;
if (!$pid) {
exec $command, @args or die "Eek, exec failed: $!";
} else {
<$fh>; # This is the value of the C<do>
}
};
See perldoc perlipc for more information on this kind of business, and see also IPC::Open2 and IPC::Open3.
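As a rough sketch of the IPC::Open3 route (not part of the original answer): run_merged() below is a hypothetical helper that runs a command given as a list, without a shell, and collects stdout and stderr on a single handle. Passing undef as the third argument to open3 sends the child's stderr to the same handle as its stdout, which avoids having to read from two pipes at once.

```perl
use strict;
use warnings;
use IPC::Open3 qw(open3);

# Hypothetical helper: run a command given as a list (no shell) and return
# its combined stdout/stderr text plus its exit code.
sub run_merged {
    my @cmd = @_;
    # undef as third argument: child's stderr shares the stdout handle.
    my $pid = open3(my $in, my $out, undef, @cmd);
    close $in;                            # the child gets EOF on its stdin
    my $text = do { local $/; <$out> };   # slurp everything the child wrote
    $text = '' unless defined $text;
    waitpid $pid, 0;                      # reap the child; sets $?
    return ($text, $? >> 8);
}

# $^X (the running perl) stands in for a real command here.
my ($text, $exit) = run_merged(
    $^X, '-e', 'print "out: $ARGV[0]\n"; warn "err\n"; exit 3',
    q{it's $HOME},
);
print "exit code: $exit\n$text";
```

The relative order of the stdout and stderr lines in the merged text is not guaranteed, because the child may buffer its stdout.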
Upvotes: 6