Zlatan Omerovic

Reputation: 4097

Parsing parameters from command line with RegEx and PHP

I have this as an input to my command line interface as parameters to the executable:

-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"

What I want is to get all of the parameters into a key-value (associative) array in PHP, like this:

$result = [
    'Parameter1' => '1234',
    'Parameter2' => '38518',
    'param3' => 'Test \"escaped\"',
    'param4' => '10',
    'param5' => '0',
    'param6' => 'TT',
    'param7' => 'Seven',
    'param8' => 'secret',
    'SuperParam9' => '4857',
    'SuperParam10' => '123',
];

The problem is that I'm still learning RegEx and not very good at it yet. All I have so far is this:

/(-[a-zA-Z]+)/gui

With this I can get all the parameters starting with a -...
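For reference, a minimal sketch of how that attempt behaves in PHP. It assumes the input is held in a single-quoted string (the variable names are only illustrative) and drops the `g` modifier, which PHP's preg functions do not accept; `preg_match_all` already finds every match.

```php
<?php
// Input copied from the question; single quotes keep the backslashes literal.
$input = '-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"';

// Same pattern as above, minus the "g" modifier.
preg_match_all('/(-[a-zA-Z]+)/ui', $input, $matches);

// $matches[1] holds one entry per parameter, but only the leading dash and
// the letters: digits such as the "1" in "-Parameter1" are not in the class.
print_r($matches[1]);
```

The names come back truncated at the first digit and the values are not captured at all, which is the gap the pattern still has to close.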

I could explode the whole string and parse it by hand, but there are far too many edge cases to think about.

Upvotes: 1

Views: 184

Answers (2)

Jan

Reputation: 43169

You could use the following expression:

--?
(?P<key>\w+)
(?|
    =(?P<value>[^-\s?"]+)
    |
    \h+"(?P<value>.*?)(?<!\\)"
    |
    \h+(?P<value>\H+)
)

See a demo on regex101.com.


In PHP, this would be:

<?php

$data = <<<DATA
-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"
DATA;

$regex = '~
            --?
            (?P<key>\w+)
            (?|
                =(?P<value>[^-\s?"]+)
                |
                \h+"(?P<value>.*?)(?<!\\\\)"
                |
                \h+(?P<value>\H+)
            )~x';

if (preg_match_all($regex, $data, $matches)) {
    $result = array_combine($matches['key'], $matches['value']);
    print_r($result);
}
?>


This yields

Array
(
    [Parameter1] => 1234
    [Parameter2] => 38518
    [param3] => Test \"escaped\"
    [param4] => 10
    [param5] => 0
    [param6] => TT
    [param7] => Seven
    [param8] => secret
    [SuperParam9] => 4857
    [SuperParam10] => 123
)

Upvotes: 2

Casimir et Hippolyte

Reputation: 89557

You can try the following pattern, which uses the branch reset feature (?|...|...) to handle the different possible formats of the values:

$str = '-Parameter1=1234 -Parameter2=38518 -param3 "Test \"escaped\"" -param4 10 -param5 0 -param6 "TT" -param7 "Seven" -param8 "secret" "-SuperParam9=4857?--SuperParam10=123"';

$pattern = '~ --?(?<key> [^= ]+ ) [ =]
(?|
    " (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) "
  |
    ([^ ?"]*)
)~x';

preg_match_all($pattern, $str, $matches);
$result = array_combine($matches['key'], $matches['value']);
print_r($result);


In a branch reset group, the capture groups have the same number or the same name in each branch of the alternation.

This means that (?<value> [^\\\\"]*+ (?s:\\\\.[^\\\\"]*)*+ ) is (obviously) the capture group named value, but ([^ ?"]*) is the same group too, because it sits at the same position in the other branch.
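To make the numbering behaviour concrete, here is a small sketch on a made-up input. The string `a="x y" b=z`, the simplified patterns, and the variable names are assumptions for illustration only, not part of the question. It compares an ordinary non-capturing group with a branch reset group.

```php
<?php
// Hypothetical mini-input: one quoted value and one bare value.
$input = 'a="x y" b=z';

// Ordinary group: each alternative gets its own number (2 and 3).
$plain = '~(\w+)=(?:"([^"]*)"|(\S+))~';

// Branch reset group: both alternatives capture into group 2.
$reset = '~(\w+)=(?|"([^"]*)"|(\S+))~';

preg_match_all($plain, $input, $p);
preg_match_all($reset, $input, $r);

// Without branch reset the values are split across two columns:
// $p[2] is ['x y', ''] and $p[3] is ['', 'z'].
print_r([$p[2], $p[3]]);

// With branch reset every value lands in $r[2], so the keys and values
// line up for array_combine().
$result = array_combine($r[1], $r[2]);
print_r($result);
```

With the ordinary group you would have to merge the two value columns yourself before combining them with the keys; the branch reset group avoids that step.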

Upvotes: 2
