Shahid Najam Afridi

Reputation: 139

How to parse this comment block with PHP using a regular expression?

I am having a problem with the preg_match_all() function. What regular expression pattern would match this type of string?

Consider this code:

$str = '* Function do Something * @param String $variable1 * @param String $variable2 * @return String';

I want a pattern for preg_match() that parses this string into an array, separating the @param, @return, @author, etc. tags.

It should print the array like this:

array('param' => array('String $variable1', 'String $variable2'),
    'return' => 'String')

Upvotes: 0

Views: 1221

Answers (3)

RobertPitt

Reputation: 57268

Well, the common delimiter is the *, so first I would explode on it:

$segments = explode('*', $str);
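
For reference, here is a small sketch of what explode() hands back for the sample string from the question. It assumes the string is stored in $str with single quotes, so that $variable1 is not interpolated; $clean is only an illustrative name.

```php
<?php
// Sample string from the question (single-quoted, so no variable interpolation).
$str = '* Function do Something * @param String $variable1 * @param String $variable2 * @return String';

$segments = explode('*', $str);

// Trim each piece, then drop the empty one that precedes the leading "*".
$clean = array_values(array_filter(array_map('trim', $segments), 'strlen'));

print_r($segments);
print_r($clean);
```

The raw result has five elements: an empty string from before the leading *, followed by four segments that still carry their surrounding spaces.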

Next, there are spaces on either side of each *, so the segments need to be trimmed. But first we need an empty array to store the cleaned data in.

$results = array();

Then loop through each segment, trim it, and check for the @ symbol:

$first = true;
foreach ($segments as $segment) {
    // Strip leading/trailing whitespace and line breaks
    $segment = trim($segment);

    // Skip empty pieces, such as the one before the leading "*"
    if ($segment === '') {
        continue;
    }

    if ($first === true) {
        // Name: the first non-empty segment is the name.
        $results['name'] = $segment;
        $first = false;
    } else {
        // Params and return
        if ($segment[0] === "@") {
            // Find the first space, usually after the @xxxx text
            $pos = strpos($segment, ' ');
            if ($pos === false) {
                // Tag without a value, e.g. "@deprecated"
                $pos = strlen($segment);
            }

            // Get the tag name, so "param" for @param
            $index = substr($segment, 1, $pos - 1);
            // Rest of the string
            $value = (string) substr($segment, $pos + 1);
            switch ($index) {
                case 'param':
                case 'params':
                    $results['params'][] = $value;
                    break;
                case 'return':
                case 'returns':
                    $results['return'] = $value;
                    break;
                default:
                    $results[$index] = $value;
                    break;
            }
        }
    }
}

Hopefully you can see what this code block is doing; if not, here is a short explanation.

After exploding the string into segments, we loop through them. A small $first variable starts out as true so we know when we are on the first real segment. The reason is that the first line is the function name, and it does not have an @ symbol to mark it as a named line.

After that, we check whether the character at index 0 is @. If so, we cut up the string like this:

@param fun ...
0123456789 ...
^     ^

That is, we cut from index 1 up to the index of the space (6), which gives 'param'.

We then use substr() again to keep only the part of the string after that space (offset 7 onwards in this case), and a switch statement stores it under the right key.
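
The cut described above can be sketched on the example segment like this. It is only a minimal illustration on the "@param fun" string from the diagram, not a full parser.

```php
<?php
$segment = '@param fun';

// strpos() takes the haystack first, then the needle.
$pos = strpos($segment, ' ');            // 6

// Skip the "@" at offset 0; the tag name is $pos - 1 characters long.
$index = substr($segment, 1, $pos - 1);  // "param"

// Everything after the space is the value.
$value = substr($segment, $pos + 1);     // "fun"

var_dump($pos, $index, $value);
```

Note that the third argument of substr() is a length, not an end offset, which is why it is $pos - 1 here.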

This code is untested, so treat it as an outline of how to go about it rather than a finished solution. I hope it gets you going.

Some other resources:

  • Is there a good (standalone) PHPDoc parser class or function in PHP?

  • How to parse a phpDoc style comment block with PHP?

I really don't think a regular expression is the way to go, but if that's really what you want to do, then see "How to parse a phpDoc style comment block with PHP?".
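
If you do want to go the regex route on the one-line string from the question, a minimal sketch could look like the following. The pattern and the $tags name are assumptions for illustration, not taken from the linked question, and a description that itself contains a * would be cut short.

```php
<?php
$str = '* Function do Something * @param String $variable1 * @param String $variable2 * @return String';

// "@tag", whitespace, then everything up to the next "*" (or the end of the string).
preg_match_all('/@(\w+)\s+([^*]*)/', $str, $matches, PREG_SET_ORDER);

$tags = array();
foreach ($matches as $m) {
    // $m[1] is the tag name, $m[2] the text after it (trimmed of the trailing space).
    $tags[$m[1]][] = trim($m[2]);
}

print_r($tags);
```

With PREG_SET_ORDER each entry of $matches holds one complete match, which keeps the tag name and its value together in the loop.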

Upvotes: -1

bourbaki

Reputation:

Give this a try:

$str='* Function do Something * @param String $variable1 * @param String $variable2 * @return String';
$l = explode('*', $str);
$res = array();
foreach($l as $el) {
    if (preg_match('/@(\w+)\s+(.*)$/', trim($el), $m)) {
        $res[$m[1]][] = $m[2];
    }
}
print_r($res);

Output:

Array
(
    [param] => Array
        (
            [0] => String $variable1
            [1] => String $variable2
        )

    [return] => Array
        (
            [0] => String
        )

)
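
The question asked for "return" to be a plain string rather than a one-element array. One possible way to get there, sketched under the assumption that any tag with a single value should be unwrapped (collapse_single_values() is a hypothetical helper name):

```php
<?php
// Grouped result in the same shape as the output printed above.
$res = array(
    'param'  => array('String $variable1', 'String $variable2'),
    'return' => array('String'),
);

// Hypothetical helper: unwrap any tag that has exactly one value.
function collapse_single_values(array $tags)
{
    foreach ($tags as $name => $values) {
        if (count($values) === 1) {
            $tags[$name] = $values[0];
        }
    }
    return $tags;
}

$collapsed = collapse_single_values($res);
print_r($collapsed);
```

Be aware that a function with only one @param would then also end up with a string under "param", so you may prefer to unwrap only specific tags such as "return".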

Upvotes: 3

Eitrix

Reputation: 116

Try this:

preg_match_all('/(?<=[\s])[$@\w\s]*(?=[\s"])/i', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
// $result will be array of matches

This will match everything between those * characters as separate matches, so just drop the first match from the array if you don't need the function part, and use the rest for the parameters.
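
As a comparison, the same "pieces between the * characters, minus the first one" idea can be sketched with preg_split() instead of preg_match_all(). This is a swapped-in alternative rather than the pattern above, and the variable names are assumptions.

```php
<?php
$subject = '* Function do Something * @param String $variable1 * @param String $variable2 * @return String';

// Split on each "*" together with the whitespace around it; skip empty pieces.
$parts = preg_split('/\s*\*\s*/', $subject, -1, PREG_SPLIT_NO_EMPTY);

// The first piece is the function description; the rest are the @tag lines.
$description = array_shift($parts);

print_r($parts);
```

After array_shift(), $parts holds only the @param and @return lines, ready for further processing.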

Good luck.

Upvotes: 1
