Jasmine Lognnes

Reputation: 7097

Why is this map function so complicated?

When I run the script below, I get

$VAR1 = 'ssh -o Compression=yes -o ConnectTimeout=333 remoteIp \'mbuffer -q -s 128k -m mbufferSize -4 -I mbufferPort|zfs recv recvOpt dstDataSet\'';

which leads me to think that all $shellQuote does is convert an array to a string and add a ' at the beginning and end, plus a | between two arrays. But I can't figure out the purpose of the map function.

The script is a super simplified version of this in order to figure out what exactly $shellQuote does.

Question

$shellQuote looks very complicated. Does it do anything else I am missing?

#!/usr/bin/perl

use Data::Dumper;
use warnings;
use strict;

my $shellQuote = sub {
    my @return;

    for my $group (@_){
        my @args = @$group;
        for (@args){
            s/'/'"'"'/g;
        }
        push @return, join ' ', map {/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'}} @args;
    }

    return join '|', @return;
};

sub buildRemoteRefArray {
    my $remote = shift;

    my @sshCmdArray = (qw(ssh -o Compression=yes -o), 'ConnectTimeout=' . '333');

    if ($remote){
        return [@sshCmdArray, $remote, $shellQuote->(@_)];
    }

    return @_;
};


my @recvCmd = buildRemoteRefArray('remoteIp', ['mbuffer', (qw(-q -s 128k -m)), 'mbufferSize', '-4', '-I', 'mbufferPort'], ['zfs', 'recv', 'recvOpt', 'dstDataSet']);
my $cmd = $shellQuote->(@recvCmd);
print Dumper $cmd;

Upvotes: 1

Views: 101

Answers (2)

ikegami

Reputation: 386216

Ignore your code for a second and look at this one as it's a bit clearer.

# shell_quote_arg("foo bar") => 'foo bar'
sub shell_quote_arg {
   my ($s) = @_;
   return $s if $s =~ /^[-\/@=_0-9a-z]+\z/i;   # safe as-is; an empty string must still be quoted
   $s =~ s/'/'"'"'/g;       # '
   return qq{'$s'}
}

# shell_quote("echo", "foo bar") => echo 'foo bar'
sub shell_quote {
    return join ' ', map { shell_quote_arg($_) } @_;
}

my $remote_shell_cmd1 = shell_quote('mbuffer', 'arg1a', 'arg1b');
my $remote_shell_cmd2 = shell_quote('zfs', 'arg2a', 'arg2b');
my $remote_shell_cmd = join(' | ', $remote_shell_cmd1, $remote_shell_cmd2);

my $local_shell_cmd = shell_quote('ssh', $host, $remote_shell_cmd);

My shell_quote is used to build a shell command from a program name and its arguments. For example,

shell_quote('zfs', 'recv', 'recvOpt', 'dstDataSet')

returns

zfs recv recvOpt dstDataSet

So why not just use join(' ', 'zfs', 'recv', 'recvOpt', 'dstDataSet')? Because characters such as spaces, $ and ' have special meaning to the shell. shell_quote needs to do extra work if these are present. For example,

shell_quote('echo', q{He's got $100})

returns

echo 'He'"'"'s got $100'       # When run, uses echo to output: He's got $100

The shellQuote you showed does the same thing as my shell_quote, but it also does the join('|', ...) you see in my code.


By the way, notice that shellQuote is called twice. The first time, it's used to build the command to execute on the remote machine, as the following does:

my $remote_shell_cmd1 = shell_quote('mbuffer', 'arg1a', 'arg1b');
my $remote_shell_cmd2 = shell_quote('zfs', 'arg2a', 'arg2b');
my $remote_shell_cmd = join(' | ', $remote_shell_cmd1, $remote_shell_cmd2);

The second time, it's used to build the command to execute on the local machine, as the following does:

my $local_shell_cmd = shell_quote('ssh', $host, $remote_shell_cmd);
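
Putting the two steps together, a minimal runnable sketch might look like the following. The helper name build_local_cmd and the values passed to it (remoteIp, mbufferPort, dstDataSet) are placeholders for illustration, not part of the original script. The safe-character check is written here as a positive match so that an empty argument still gets quoted.

```perl
use strict;
use warnings;

# Quote one argument for sh; strings made only of safe characters pass through.
sub shell_quote_arg {
    my ($s) = @_;
    return $s if $s =~ /^[-\/@=_0-9a-z]+\z/i;
    $s =~ s/'/'"'"'/g;
    return qq{'$s'};
}

# Build one shell command from a program name and its arguments.
sub shell_quote {
    return join ' ', map { shell_quote_arg($_) } @_;
}

# Hypothetical helper: each element of @pipeline is an array ref [program, args...].
sub build_local_cmd {
    my ($host, @pipeline) = @_;
    # First pass: quote each remote command, then join them into a pipeline.
    my $remote = join ' | ', map { shell_quote(@$_) } @pipeline;
    # Second pass: the whole remote pipeline becomes a single argument to ssh.
    return shell_quote('ssh', $host, $remote);
}

print build_local_cmd(
    'remoteIp',
    ['mbuffer', '-q', '-I', 'mbufferPort'],
    ['zfs', 'recv', 'dstDataSet'],
), "\n";
# prints: ssh remoteIp 'mbuffer -q -I mbufferPort | zfs recv dstDataSet'
```

The remote pipeline contains spaces and a |, so the second pass wraps it in single quotes. That is why the local shell passes it to ssh as one argument instead of interpreting the pipe itself.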

Upvotes: 2

Jonathan Cast

Reputation: 4635

The map function, by which I assume you mean

map {/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'}} @args

checks each argument to see if it is a legal shell token or not. Legal shell tokens are passed through; anything with a suspicious character gets enclosed in '' quotes.

Bear in mind that your example has two calls to $shellQuote, not just one; you're printing:

print Dumper($shellQuote->(
    [
        qw(ssh -o Compression=yes -o),
            'ConnectTimeout=' . '333',
            'remoteIp',
            $shellQuote->(
                [
                    'mbuffer',
                        (qw(-q -s 128k -m)),
                        'mbufferSize',
                        '-4',
                        '-I',
                        'mbufferPort',
                ],
                [
                    'zfs',
                        'recv',
                        'recvOpt',
                        'dstDataSet',
                ],
            ),
    ]
));

Here I've indented the arguments to each shell command one step further than the command itself, to make the structure of the list clearer. So your '' quotes are coming from the outer $shellQuote, which recognizes that the inner $shellQuote has put spaces into its result; the | is coming from the inner $shellQuote, which uses it to combine the two array refs passed to it.

Breaking the map function down, map { expr } @args means 'evaluate expr for each element of @args and make a list of the results'.

/^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'} is a ternary expression (Googleable term). $_ is the current element of @args, and /re/i is true if and only if $_ matches the given regular expression (Googleable term) (case insensitive). The whole expression means 'if the current element of @args contains only the listed characters (ASCII letters, ASCII digits, and the characters -, /, @, =, and _), return it as-is; otherwise return it wrapped in single quotes'.
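
To make that concrete, here is a sketch of the same map written as an explicit loop. The sub name quote_unsafe_args and the sample arguments are made up for illustration. Like the map in the question, it assumes any ' inside an argument has already been escaped by the earlier for loop.

```perl
use strict;
use warnings;

# Equivalent of: map { /^[-\/@=_0-9a-z]+$/i ? $_ : qq{'$_'} } @args
sub quote_unsafe_args {
    my @args = @_;
    my @quoted;
    for my $arg (@args) {
        if ($arg =~ /^[-\/@=_0-9a-z]+$/i) {
            push @quoted, $arg;          # only safe characters: keep as-is
        } else {
            push @quoted, qq{'$arg'};    # anything else: wrap in single quotes
        }
    }
    return @quoted;
}

print join(' ', quote_unsafe_args('zfs', 'recv', 'my pool/data', '-F')), "\n";
# prints: zfs recv 'my pool/data' -F
```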

The for loop, before that, replaces each ' in each element of @args with '"'"', which is a particular way of embedding a single quote into a single-quoted string in sh.
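
As a small illustration of that idiom, the sketch below applies the substitution and then adds the surrounding quotes. The sub name sh_single_quote is hypothetical, and unlike the code in the question it always quotes.

```perl
use strict;
use warnings;

# Hypothetical helper: always wraps its argument in single quotes for sh.
sub sh_single_quote {
    my ($s) = @_;
    # For each ': close the single-quoted string, emit "'" inside double
    # quotes, then reopen the single-quoted string.
    $s =~ s/'/'"'"'/g;
    return "'$s'";
}

print sh_single_quote(q{it's}), "\n";   # prints: 'it'"'"'s'
```

The shell reads the result as three adjacent pieces, 'it', "'" and 's', and concatenates them back into it's.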

Upvotes: 3
