James

Reputation: 720

Parsing an HTML file for CSS references

I have a script which is supposed to collect all the CSS from a given URL or page. For some reason I cannot get it to detect linked stylesheets such as

<link rel="stylesheet" href="css/typography.css" /> 

I have tried everything I can think of. This is the code I am using, which collects on-page CSS and imports. Any help adding support for linked stylesheets would be great.

function scan($page_content){
    $i = 0;
    if(ereg("<style( *[\n]*.*)>\n*(.\n*)*<\/style>", $page_content)){
        if(preg_match_all("/(@\s*import\s* (url((\"|')?)?((\"|')?)|(\"|'){1}).+(\"|')?\)?)/", $page_content, $ext_stylesheets)){
            foreach($ext_stylesheets[0] as $stylesheet){
                $css_content[$i] = preg_replace("/(@\s*import\s*)|(url\(?((\"|')?))|(\"|'){1}|\)?(\"|')?;|(\s)/", "", $stylesheet);
                $i++;
            }
            $array = 1;
        }
        $inline_notused = $this->check_file($page_content, $page_content);
    }
    else die("No page styles, sorry!".$this->helptext);
}

Upvotes: 0

Views: 133

Answers (1)

uınbɐɥs

Reputation: 7351

Here's a DOM/XPath approach that collects both inline <style> blocks and linked stylesheets:

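function scan($html) {
    // Real-world HTML is often malformed; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $path = new DOMXPath($dom);
    // Inline <style> blocks and <link> elements, in document order.
    $nodes = $path->query('//style|//link');
    $style = '';
    foreach($nodes as $node) {
        if($node->tagName === 'style') {
            $style .= $node->nodeValue;
        } elseif(stripos($node->getAttribute('rel'), 'stylesheet') !== false) {
            // Linked stylesheet: emit it as an @import rule (the semicolon is required).
            $style .= "@import url('{$node->getAttribute('href')}');";
        } else {
            // Some other <link> (icon, canonical, ...): skip it.
            continue;
        }
        $style .= PHP_EOL;
    }
    return $style;
}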
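
If you would rather have the stylesheet URLs as a list (for example, to fetch each file yourself) than as @import rules, a variant along these lines might work. This is only a minimal sketch: the function name collect_stylesheet_hrefs and the sample markup are made up for illustration, and it assumes the rel attribute should be treated as a space-separated, case-insensitive token list.

```php
<?php
// Hypothetical helper, not part of the function above: returns the href of
// every <link> element whose rel attribute contains the token "stylesheet".
function collect_stylesheet_hrefs(string $html): array {
    // Collect libxml warnings internally so sloppy markup does not print notices.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($dom->getElementsByTagName('link') as $link) {
        // Split rel into lowercase tokens, e.g. "Alternate Stylesheet" -> [alternate, stylesheet].
        $rels = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $href = $link->getAttribute('href');
        if (in_array('stylesheet', $rels, true) && $href !== '') {
            $hrefs[] = $href;
        }
    }
    return $hrefs;
}

// Sample input (assumed for illustration): one stylesheet link and one icon link.
$html = '<html><head>'
      . '<link rel="stylesheet" href="css/typography.css" />'
      . '<link rel="icon" href="favicon.ico" />'
      . '</head><body></body></html>';

print_r(collect_stylesheet_hrefs($html));
```

With the sample markup, only css/typography.css should be returned; the icon link is skipped because its rel value has no stylesheet token. The hrefs come back exactly as written in the page, so relative paths would still need to be resolved against the page URL before fetching.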

Upvotes: 1
