kuzey beytar

Reputation: 3226

(PHP) Parse command

I want to extract the value that follows each command tag (GET, FROM, IN, etc.). My command is:

// My command
$_cmd = 'GET a, b FROM p IN a and c="I am from Sarajevo" or d>1 ';

// My parser
if(preg_match_all('/(GET|FROM|IN)\s+([^\s]+)/si',$_cmd, $m))
    $cmd = array_combine($m[1], $m[2]);

Output:

Array
(
  [GET] => a,
  [FROM] => p
  [IN] => a
  [from] => Sarajevo"
)

I am looking for this output:

Array
(
  [GET] => a, b
  [FROM] => p
  [IN] => a and c="I am from Sarajevo" or d>1
)

As you can see, the problems are whitespace inside the values and command tags that reappear inside strings (like from). How can I parse this command?

Upvotes: 3

Views: 146

Answers (5)

Saic Siquot

Reputation: 6513

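$_cmd = 'GET a, b FROM p IN a and c="I am from Sarajevo" or d>1 ';
// Case-sensitive on purpose: with the i modifier, " from " inside the quoted string would also split.
// PREG_SPLIT_NO_EMPTY drops the empty piece in front of the leading GET.
$tpar = preg_split('/\s+(GET|FROM|IN)\s+/', ' '.$_cmd.' ', -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
// array_walk() with 'trim' discards the trimmed values; array_map() returns them.
$tpar = array_map('trim', $tpar);

print_r($tpar);

// gives:
Array
(
    [0] => GET
    [1] => a, b
    [2] => FROM
    [3] => p
    [4] => IN
    [5] => a and c="I am from Sarajevo" or d>1
)
// the rest is straightforward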
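
For what it's worth, the remaining step could look roughly like this. It is only a sketch: the helper name pairClauses is made up for illustration, and it assumes the keywords are always uppercase and never occur in uppercase inside a value. It splits case-sensitively, drops empty pieces, and then walks the list two items at a time (keyword, then value).

```php
<?php
// Hypothetical helper (not from the answer above): keyword => value map.
// Assumption: keywords are uppercase and appear only as clause markers.
function pairClauses(string $cmd): array
{
    // Case-sensitive split; the captured keywords stay in the result list.
    $parts = preg_split(
        '/\s+(GET|FROM|IN)\s+/',
        ' ' . $cmd . ' ',
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    $parts = array_map('trim', $parts);

    $clauses = [];
    // The list alternates keyword, value, keyword, value, ...
    for ($i = 0; $i + 1 < count($parts); $i += 2) {
        $clauses[$parts[$i]] = $parts[$i + 1];
    }
    return $clauses;
}

$result = pairClauses('GET a, b FROM p IN a and c="I am from Sarajevo" or d>1 ');
print_r($result);
```

A keyword with no value after it is simply skipped by the loop, which may or may not be what you want.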

Upvotes: 1

mario

Reputation: 145482

You cannot easily parse that with a single regex. (It's doable, but not simple.)

You should use a simple tokenizer, where a regex again becomes a useful tool:

  preg_match_all('/\w+|".*?"|\W/', $_cmd = 'GET a, b FROM p IN a and c="I am from Sarajevo" or d>1 ', $list);

This gives you a flat token list (the whitespace tokens that \W also matches are omitted below). Find the clauses you are interested in, then re-merge the tokens that follow each one (though I'm confused about your use case):

[0] => Array
    (
        [0] => GET
        [1] => a
        [2] => ,
        [3] => b
        [4] => FROM
        [5] => p
        [6] => IN
        [7] => a
        [8] => and
        [9] => c
        [10] => =
        [11] => "I am from Sarajevo"
        [12] => or
        [13] => d
        [14] => >
        [15] => 1
    )
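
The re-merging step might be sketched as follows. The function name groupClauses and its keyword list are assumptions for illustration, not part of the original answer. Because a quoted string arrives as a single token, a "from" inside it is never compared against the keywords, so the keyword check can even be case-insensitive here.

```php
<?php
// Hypothetical helper: group tokens into clauses keyed by their keyword.
function groupClauses(string $cmd, array $keywords = ['GET', 'FROM', 'IN']): array
{
    // Tokens: words, double-quoted strings, or any other single character
    // (whitespace included, so the original spacing can be rebuilt).
    preg_match_all('/\w+|".*?"|\W/', $cmd, $m);

    $clauses = [];
    $current = null;
    foreach ($m[0] as $token) {
        $upper = strtoupper($token);
        if (in_array($upper, $keywords, true)) {
            // A bare keyword token starts a new clause.
            $current = $upper;
            $clauses[$current] = '';
        } elseif ($current !== null) {
            // Everything else, quoted strings included, is appended unchanged.
            $clauses[$current] .= $token;
        }
    }
    // Strip the whitespace tokens at both ends of each clause.
    return array_map('trim', $clauses);
}

$result = groupClauses('GET a, b FROM p IN a and c="I am from Sarajevo" or d>1 ');
print_r($result);
```

Note that an unquoted lowercase "in" or "from" in a condition would start a new clause with this sketch; compare $token directly instead of $upper if keywords must be uppercase.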

Upvotes: 8

SergeS

Reputation: 11779

if( preg_match_all('/(GET|FROM|IN)(.(?!(GET|FROM|IN)))+\s*/si',$_cmd, $m))

This means: after a keyword, match every character that is not directly followed by GET, FROM or IN, then any trailing whitespace.

Upvotes: 3

powtac

Reputation: 41070

You could remove the case-insensitive modifier i after the closing delimiter /, so that the lowercase from inside the string no longer matches. Also make sure there is at least one whitespace character after each keyword.
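
A rough sketch of that idea, assuming keywords are always written in uppercase. The lookahead that ends each value at the next keyword (or at the end of the string) is my own addition, not something taken from the question.

```php
<?php
$_cmd = 'GET a, b FROM p IN a and c="I am from Sarajevo" or d>1 ';

// No i modifier: only uppercase GET/FROM/IN count as keywords.
// Each keyword must be followed by whitespace; the value is matched lazily
// up to the next " KEYWORD " or the end of the string.
$pattern = '/(GET|FROM|IN)\s+(.*?)\s*(?=\s(?:GET|FROM|IN)\s|$)/s';

$cmd = [];
if (preg_match_all($pattern, $_cmd, $m)) {
    // $m[1] holds the keywords, $m[2] the matching values.
    $cmd = array_combine($m[1], $m[2]);
}
print_r($cmd);
```

An uppercase keyword inside a quoted string would still split the value, so this only covers input like the sample above.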

Upvotes: 1

Dor

Reputation: 7494

You need to develop a scripting language for this. Regexps aren't suitable for these purposes.

Upvotes: 1
