Xeoncross

Reputation: 57184

How do I find all the classes used in a PHP file?

I am trying to use the tokenizer to scan a file to find all the defined classes, anything they extend, any created instances, and anytime they were statically invoked.

<?php

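// Keep the raw source around: the assignment regex further down searches it as $text.
$text = file_get_contents($file);
$tokens = token_get_all($text);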

$used_classes = array();
$defined_classes = array();
$variable_classes = array();

foreach($tokens as $i => $token) {

    if(is_array($token)) {

        if(isset($tokens[$i - 2][0], $tokens[$i - 1][0])) {

            // new [class]
            if ($tokens[$i - 2][0] == T_NEW AND $tokens[$i - 1][0] == T_WHITESPACE) {

                if($tokens[$i][0] == T_STRING) {
                    $used_classes[$token[1]] = TRUE;

                // new $variable()
                } elseif($tokens[$i][0] == T_VARIABLE) {    

                    // @todo, this is really broken. However, do best to look for the assignment
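                    // Search for a string assignment to this specific variable, not a literal "$var"
                    if(preg_match('~' . preg_quote($token[1], '~') . '\s*=\s*([\'"])((?:(?!\1).)*)\1~', $text, $match)) {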
                        if(empty($extension_classes[$match[2]])) {
                            $used_classes[$match[2]] = TRUE;
                        }
                    } elseif($token[1] !== '$this') {
                        $variable_classes[$token[1]] = TRUE;
                    }
                }

            }

            // class [class]
            if ($tokens[$i - 2][0] == T_CLASS AND $tokens[$i - 1][0] == T_WHITESPACE) {

                if($tokens[$i][0] == T_STRING) {
                    $defined_classes[$token[1]] = TRUE;
                }
            }


            // @todo: find more classes \/

            // class [classname] extends [class] ???
            // [class]::method()???
        }
    }
}

How can I extend this code to find any additional instances of PHP classes like mentioned above?

Upvotes: 5

Views: 2923

Answers (4)

Danack

Reputation: 25701

Parsing and then interpreting PHP code is not something that can be solved well using a regex. You would need something much more clever, like a state machine, that can actually understand things like scope, class names, inheritance, etc. to be able to do what you want.

As it happens, I have written a PHP-to-JavaScript converter based on a state machine that does most of what you want:

all the defined classes

Yes, every class creates a ClassScope with all its variables listed, and its methods are created as FunctionScopes, so you can tell which methods a class has.

anything they extend

Yes, every class has its parent classes listed in ClassScope->$parentClasses.

any created instances

Nope, but it wouldn't be hard to add extra code to record these.

anytime they were statically invoked.

Nope - but that actually could be done with a regex.

Although it doesn't exactly solve your problem, the project as it stands would get you 95% of the way towards what you want to do, which would save a couple of weeks' work.
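
To give a rough idea of the "remember what came before" approach, here is a minimal sketch that walks the token stream and classifies each class name by its neighbouring tokens. It is not the converter mentioned above; the function name find_class_references() and the result keys are made up for illustration, and it does not resolve namespaces, `use` aliases, or classes named through variables.

```php
<?php
// Hypothetical helper (not from the original post): classifies class names
// by looking at the significant token before and after each name.
function find_class_references(string $source): array
{
    $found = ['defined' => [], 'extended' => [], 'instantiated' => [], 'static' => []];

    // Token ids that can carry a class name. The qualified-name tokens only
    // exist on PHP 8+, so they are looked up defensively.
    $nameTokens = [T_STRING];
    foreach (['T_NAME_QUALIFIED', 'T_NAME_FULLY_QUALIFIED'] as $constant) {
        if (defined($constant)) {
            $nameTokens[] = constant($constant);
        }
    }

    // Drop whitespace and comments so neighbours are always meaningful tokens.
    $tokens = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token) && in_array($token[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            continue;
        }
        $tokens[] = $token;
    }

    foreach ($tokens as $i => $token) {
        if (!is_array($token) || !in_array($token[0], $nameTokens, true)) {
            continue;
        }
        $name   = $token[1];
        $prev   = $tokens[$i - 1] ?? null;
        $next   = $tokens[$i + 1] ?? null;
        $prevId = is_array($prev) ? $prev[0] : null;
        $nextId = is_array($next) ? $next[0] : null;

        if ($prevId === T_CLASS) {                 // class Foo
            $found['defined'][$name] = true;
        } elseif ($prevId === T_EXTENDS) {         // ... extends Foo
            $found['extended'][$name] = true;
        } elseif ($prevId === T_NEW) {             // new Foo
            $found['instantiated'][$name] = true;
        } elseif ($nextId === T_DOUBLE_COLON && !in_array($name, ['self', 'parent'], true)) {
            $found['static'][$name] = true;        // Foo::something
        }
    }

    // Turn the name => true maps into plain lists of names.
    return array_map('array_keys', $found);
}
```

Calling find_class_references(file_get_contents($file)) returns four lists of names. Anything that depends on runtime values, such as `new $variable()`, is outside what this kind of scan can see.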

Upvotes: 2

Mark Leighton Fisher

Reputation: 5693

It looks like if you just load the code, you can then use the built-in Reflection API (ReflectionClass::__construct(), etc.) to examine each class.

To get the classes themselves, use the built-in get_declared_classes().

(Note: I have not tried this, so YMMV.)
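
A minimal sketch of that idea, assuming the file is safe to execute: compare get_declared_classes() before and after loading it, then inspect each new class with ReflectionClass. The helper name classes_declared_by() and the shape of the returned array are hypothetical. Classes pulled in by an autoloader while the file loads would also show up in the result, and a file that was already loaded reports nothing because require_once skips it.

```php
<?php
// Hypothetical helper: loads a file and reports the classes it declared.
// Caution: require_once executes the file, so only use this on trusted code.
function classes_declared_by(string $file): array
{
    $before = get_declared_classes();
    require_once $file;
    $new = array_diff(get_declared_classes(), $before);

    $report = [];
    foreach ($new as $class) {
        $reflection = new ReflectionClass($class);
        $parent = $reflection->getParentClass(); // false when there is no parent
        $report[$class] = [
            'parent'     => $parent ? $parent->getName() : null,
            'interfaces' => $reflection->getInterfaceNames(),
            'methods'    => array_map(
                function (ReflectionMethod $method) {
                    return $method->getName();
                },
                $reflection->getMethods()
            ),
        ];
    }

    return $report;
}
```

Reflection describes definitions only: it shows what a class extends and which methods it has, but not where the class is instantiated or statically invoked, so those two parts of the question still need some form of source analysis.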

Upvotes: 0

Ira Baxter

Reputation: 95334

I don't think you can do this by just analyzing tokens.

You need to know, for any class name, what actual definition it represents, including any inheritance relations, and whether it has been used in your code to implement an interface. The class/interface definition may be in another file; that file may be included under some condition. You may have the same class name defined differently in different files. So in general you need to process all the files that comprise your system at once.

What you need as a foundation is a tool that parses PHP and builds up real symbol tables. You might be able to compute your result from that. (Such a tool analyzes tokens as a starting place, but it is far more work than trivial token scanning.)

Upvotes: 0

Charles D'Angelo

Reputation: 76

Inclued is probably worth looking into here, though I don't think it will provide you with any data beyond which files/classes were included and how many times.

Upvotes: 0
