Catalin Cardei

Reputation: 324

How to parse a PHP project, find all occurrences of a function call, and extract the parameter passed in each call?

Edited: The real name of the function is CB_t

Consider a project with several files and folders. Across the project we call a CB_t($string='') function several times with different parameters. What I need is to programmatically find every place where this function is called and create an array with the parameters. The parameter is always a string.

Code Sample:

File 1:

<?php
# Some code ....

$a = CB_t('A');
$b = CB_t("B");

# more code ...

File 2:

<?php
# Some code ....

$c = CB_t("ABC");
$d = CB_t('1938');

# more code ...

What I need is to parse all the code and create an array with all the parameters. In the case of the sample above, the array should look like:

['A','B','ABC','1938']

Below is what I have tried so far. It does not give good results, because the function is sometimes called with single quotes and sometimes with double quotes, and its name is sometimes written in upper case and sometimes in lower case.

    $search = "CB_t(";
    $path = realpath(ROOT); // ROOT is defined as the project root folder
    $fileList = new \RecursiveIteratorIterator(new \RecursiveDirectoryIterator($path), \RecursiveIteratorIterator::SELF_FIRST);
    $count = 0;   // files that contain the search string
    $counter = 0; // PHP files scanned
    foreach ($fileList as $item)
    {
        if ($item->isFile() && substr($item->getPathname(), -4) == '.php') // #1
        {
            $counter++;

            $file = file_get_contents($item->getPathname());

            if (strpos($file, $search) !== false) // #2
            {
                $count++;

                echo "<br>File no   : ".$count;
                echo "<br>Filename  : ".$item->getFilename();
                echo "<br>File path : ".$item->getPathname();
                echo "<hr>";
            } // End if #2

            unset($file);
        } // End if #1
    } // End foreach

I think this can be solved with regular expressions, but I am not very good at them.

Thanks in advance!

Upvotes: 2

Views: 917

Answers (2)

Clart Tent

Reputation: 1309

I'm not certain the regular expression is clever enough, but this should get you close:

$counter = 0; // PHP files scanned
$total = 0;   // CB_t calls found across all files

foreach ($fileList as $item)
{
    if ($item->isFile() && substr($item->getPathname(), -4) == '.php') // #1
    {
        $counter++;

        $file = file_get_contents($item->getPathname());
        $matches = array();

        // $count is the number of CB_t calls found in this file
        $count = preg_match_all('/\bCB_t\s*\(\s*[\'"](.*?)[\'"]\s*\)/i', $file, $matches);

        echo "<br>Matches   : ".$count;
        echo "<br>Filename  : ".$item->getFilename();
        echo "<br>File path : ".$item->getPathname();
        echo "<hr>";

        unset($file);

        $total += $count;
    } // End if #1

} // End foreach

The regular expression looks for CB_t (or cb_t; the i flag at the end makes it case-insensitive), followed by optional whitespace, an opening parenthesis, optional whitespace again, and a single or double quote. It then captures everything up to the next quote that is followed by optional whitespace and a closing parenthesis. (Obviously this won't match anywhere CB_t is called with a variable parameter, e.g. CB_t($somevar); you'd need to tweak it for that.)

It then uses the return value of preg_match_all to count the matches in each file; the captured parameters themselves end up in $matches[1]. (I've added a $total count too; I was using that in my own testing!)
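To get the array the question asks for, the captured groups can be merged file by file. The following is only a minimal sketch that reuses the pattern above. The function name collectCbtParams and the SKIP_DOTS flag are my own additions, not part of the original code, and the order of the results depends on the order in which the iterator visits the files.

```php
<?php
// Sketch only: collectCbtParams() is a hypothetical helper name.
function collectCbtParams(string $root): array
{
    $params = [];
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($files as $item) {
        // Only look at regular files with a .php extension.
        if (!$item->isFile() || strtolower($item->getExtension()) !== 'php') {
            continue;
        }

        $source = file_get_contents($item->getPathname());

        // Same pattern as above; $matches[1] holds the text between the quotes.
        if (preg_match_all('/\bCB_t\s*\(\s*[\'"](.*?)[\'"]\s*\)/i', $source, $matches)) {
            $params = array_merge($params, $matches[1]);
        }
    }

    return $params;
}
```

With the ROOT constant from the question, print_r(collectCbtParams(realpath(ROOT))); would print the collected parameters.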

One known problem with the regular expression is that it still counts a call to CB_t that appears in a comment or inside another string, e.g.

/* CB_t('fred'); */
$somevar= 'CB_t("fred")';

Both of these get counted.
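If those false positives matter, one alternative to a regular expression is PHP's built-in tokenizer, which reports comments and string literals as single tokens. The following is a rough sketch under my own assumptions, not a drop-in replacement: findCbtArgs is a hypothetical name, only calls with exactly one literal string argument are reported, escape sequences inside the literal are left as written, and a method call such as $obj->CB_t('x') would be reported too.

```php
<?php
// Sketch only: findCbtArgs() is a hypothetical helper name.
function findCbtArgs(string $source): array
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $found = [];

    // Return the index of the first non-whitespace token at or after $i.
    $skipSpace = function (int $i) use ($tokens, $count): int {
        while ($i < $count && is_array($tokens[$i]) && $tokens[$i][0] === T_WHITESPACE) {
            $i++;
        }
        return $i;
    };

    for ($i = 0; $i < $count; $i++) {
        $tok = $tokens[$i];
        // A function name is a T_STRING token; PHP function names are case-insensitive.
        if (!is_array($tok) || $tok[0] !== T_STRING || strcasecmp($tok[1], 'CB_t') !== 0) {
            continue;
        }

        // Expect "(" next, ignoring whitespace.
        $open = $skipSpace($i + 1);
        if ($open >= $count || $tokens[$open] !== '(') {
            continue;
        }

        // Expect a quoted string literal without interpolation.
        $arg = $skipSpace($open + 1);
        if ($arg >= $count || !is_array($tokens[$arg])
            || $tokens[$arg][0] !== T_CONSTANT_ENCAPSED_STRING) {
            continue;
        }

        // Expect ")" right after the literal, then strip the surrounding quotes.
        $close = $skipSpace($arg + 1);
        if ($close < $count && $tokens[$close] === ')') {
            $found[] = substr($tokens[$arg][1], 1, -1);
        }
    }

    return $found;
}

$sample = "<?php /* CB_t('fred'); */ \$somevar = 'CB_t(\"fred\")'; \$a = CB_t('A');";
print_r(findCbtArgs($sample)); // only 'A' is reported
```

Because the commented-out call is a single comment token and 'CB_t("fred")' is a single string token, neither is mistaken for a call.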

Hope it helps!

(Edited for careless pasting)

(Edited again to include Galvic's improved RegExp and to change the function name as requested.)

Upvotes: 1

user557597

Reputation:

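This might work. The expanded listing below carries some extra annotations for the branch reset group (?| ... ): both alternatives capture into group 1, so capture group 1 will contain the string content whichever quote style was used. The pattern matches the placeholder name f/F from the original question; for CB_t, replace [Ff] with the real name.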

 Edit - If you want to make the regex into a C-style string, here it is:    

 "[Ff]\\s*\\(\\s*(?|\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"|'([^'\\\\]*(?:\\\\.[^'\\\\]*)*)')\\s*\\)"
 ---------------------------------------------------------


      #  [Ff]\s*\(\s*(?|"([^"\\]*(?:\\.[^"\\]*)*)"|'([^'\\]*(?:\\.[^'\\]*)*)')\s*\)

      [Ff] 
      \s* 
      \(
      \s* 
      (?|
           " 
 br 1      (                              # (1 start)
                [^"\\]* 
                (?: \\ . [^"\\]* )*
    1      )                              # (1 end)
           "
        |  
           ' 
 br 1      (                              # (1 start)
                [^'\\]* 
                (?: \\ . [^'\\]* )*
    1      )                              # (1 end)
           '
      )
      \s* 
      \)

Edit2 - Usage example:

 $string =
 "
 f('hello')
 F(\"world\")
 ";

 preg_match_all
      ( 
          "/[Ff]\\s*\\(\\s*(?|\"([^\"\\\\]*(?:\\\\.[^\"\\\\]*)*)\"|'([^'\\\\]*(?:\\\\.[^'\\\\]*)*)')\\s*\\)/",
          $string,
          $matches,
          PREG_PATTERN_ORDER
      );
  print_r( $matches[1] );

 -----------------------------
 Result:
 Array
 (
     [0] => hello
     [1] => world
 )
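Since the function was later renamed to CB_t, here is a hedged adaptation of the same pattern. The \bCB_t prefix and the i flag are my assumptions about how to swap in the real name. The pattern is written as a nowdoc so the backslashes do not need to be doubled, and escape sequences in the captured text are left exactly as they appear in the source.

```php
<?php
// Nowdoc: backslashes reach the regex engine exactly as typed.
$pattern = <<<'REGEX'
/\bCB_t\s*\(\s*(?|"([^"\\]*(?:\\.[^"\\]*)*)"|'([^'\\]*(?:\\.[^'\\]*)*)')\s*\)/i
REGEX;

// Sample input, also a nowdoc so nothing is interpolated or unescaped.
$code = <<<'CODE'
$a = CB_t('A');
$b = cb_t( "say \"hi\"" );
$c = CB_t('it\'s');
CODE;

preg_match_all($pattern, $code, $matches);

// Thanks to the branch reset (?|...), both alternatives fill group 1.
print_r($matches[1]);
```

The three captured values are A, say \"hi\" and it\'s, so an escaped quote inside the argument does not end the match early.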

Upvotes: 1
