user3355352

Regex for html tags that are not commented

I need to find all <link /> tags in the HTML that are not commented out.

For example, given this HTML:

<link rel="stylesheet" href="xyz/dzgt/style.css" />
<!--[if IE 7]>
<link rel="stylesheet" type="text/css" href="xyz/dzgt/ie7.css" />
<![endif]-->

I need a regexp that matches <link rel="stylesheet" href="xyz/dzgt/style.css" /> but not <link rel="stylesheet" type="text/css" href="xyz/dzgt/ie7.css" />, because the latter is surrounded by <!-- -->.

I can find all <link /> tags with the regex /<link.*href="(.*\.css)".*\/>/m, but it also matches the ones that are commented out. I only need the ones that are not.

Thanks for the help in advance!

Upvotes: 4

Views: 79

Answers (1)

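You should use the DOMDocument class instead of a regex to parse HTML. The parser turns everything between <!-- and --> into a comment node, so a <link /> inside a comment is never returned as an element.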

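<?php
$html = '<link rel="stylesheet" href="xyz/dzgt/style.css" />
<!--[if IE 7]>
<link rel="stylesheet" type="text/css" href="xyz/dzgt/ie7.css" />
<![endif]-->';

// Parse the markup; the conditional comment becomes a comment node, not elements.
$dom = new DOMDocument;
$dom->loadHTML($html);

// Only real <link> elements are returned, so the commented-out one is skipped.
foreach ($dom->getElementsByTagName('link') as $tag) {
    echo $tag->getAttribute('href'), PHP_EOL;
}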

Output:

xyz/dzgt/style.css
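
If you want the hrefs collected in an array instead of echoed, something along these lines might work. It is only a sketch: the function name uncommentedStylesheetHrefs is made up for illustration, and the rel="stylesheet" filter and the libxml warning suppression are my own assumptions, not something the question asked for.

```php
<?php
// Hypothetical helper (name chosen for illustration): returns the href of every
// <link rel="stylesheet"> element that is not inside an HTML comment.
function uncommentedStylesheetHrefs(string $html): array
{
    // DOMDocument::loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return [];
    }

    // Assumption: real-world markup may be imperfect, so collect libxml
    // warnings internally instead of printing them, then restore the setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($dom->getElementsByTagName('link') as $tag) {
        // rel may hold several space-separated tokens, e.g. "alternate stylesheet".
        $rel = preg_split('/\s+/', strtolower(trim($tag->getAttribute('rel'))));
        if (in_array('stylesheet', $rel, true) && $tag->hasAttribute('href')) {
            $hrefs[] = $tag->getAttribute('href');
        }
    }
    return $hrefs;
}

$html = '<link rel="stylesheet" href="xyz/dzgt/style.css" />
<!--[if IE 7]>
<link rel="stylesheet" type="text/css" href="xyz/dzgt/ie7.css" />
<![endif]-->';

print_r(uncommentedStylesheetHrefs($html));
```

For the sample markup from the question, the returned array should contain only xyz/dzgt/style.css, because the ie7.css link sits inside a comment node and is never visited by getElementsByTagName().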

Upvotes: 4
