Erik

Regular Expression for Link Tags in HTML

I need help with regular expressions. What I'm looking for is a regex that looks for link-tags like this:

<link rel="stylesheet" href="style.css" type="text/css">

Regardless of where href="" is positioned within the link tag, I would like to find it and put a variable named $url, followed by a /, in front of style.css. If the value already starts with http:// or https://, I don't want to put the variable in front of it.

I want every link-tag to be replaced.

Upvotes: 5

Views: 6752

Answers (5)

Kim Stacks

Reputation: 10812

I adapted @Juicy Scripter's answer.

It improves on that answer in the following way:

a) It works for single-quoted as well as double-quoted attribute values:

/**
 *
 * Take in HTML content as a string, find all the <link href="yada.css" ... >,
 * <script src="yada.js" ... > or <img src="yada.png" ... > tags (depending on $tag)
 * and add $prepend to the attribute values, except when they start with http: or https:
 *
 * @param $html String The html content
 * @param $prepend String The prepend we expect in front of all the matching attribute values
 * @param $tag String Which tags to process: 'css', 'js' or 'img'
 * @return String The new $html content after find and replace.
 *
 */
    protected static function _prependAttrForTags($html, $prepend, $tag) {
        if ($tag == 'css') {
            $element = 'link';
            $attr = 'href';
        }
        else if ($tag == 'js') {
            $element = 'script';
            $attr = 'src';
        }
        else if ($tag == 'img') {
            $element = 'img';
            $attr = 'src';
        }
        else {
            // wrong tag so return unchanged
            return $html;
        }
        // [^>]*? keeps each match inside a single tag, and ${n} stays
        // unambiguous even if $prepend starts with a digit
        // this checks for all the "yada.*"
        $html = preg_replace('/(<'.$element.'\b[^>]*?\s'.$attr.'=")(?!https?:)([^"]*)("[^>]*>)/i', '${1}'.$prepend.'${2}${3}', $html);
        // this checks for all the 'yada.*'
        $html = preg_replace('/(<'.$element.'\b[^>]*?\s'.$attr.'=\')(?!https?:)([^\']*)(\'[^>]*>)/i', '${1}'.$prepend.'${2}${3}', $html);
        return $html;
    }
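
As a variation on the function above, both quote styles can be handled in a single pass with preg_replace_callback, which also sidesteps trouble when the prefix contains characters that are special in a replacement string (such as $ or a backslash). This is only a sketch: the function name prependAttr is hypothetical, and it assumes that attribute values are always quoted.

```php
<?php
// Hypothetical helper, not part of the answer above: one pass for both quote styles.
function prependAttr(string $html, string $prepend, string $element, string $attr): string
{
    // Group 1: tag start up to "=", group 2: the quote character,
    // group 3: the old value (skipped if it starts with http:// or https://).
    $pattern = '/(<' . preg_quote($element, '/') . '\b[^>]*?\s'
             . preg_quote($attr, '/') . '\s*=\s*)(["\'])(?!https?:\/\/)(.*?)\2/i';

    $result = preg_replace_callback($pattern, function (array $m) use ($prepend) {
        // Rebuild the attribute with the prefix, reusing the original quote character.
        return $m[1] . $m[2] . $prepend . $m[3] . $m[2];
    }, $html);

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $html;
}

echo prependAttr("<link rel=\"stylesheet\" href='style.css'>", 'http://mydomain.com/', 'link', 'href'), "\n";
```

Because the callback builds the replacement with plain string concatenation, the prefix is inserted literally and never interpreted as a group reference.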

Upvotes: 0

null

Reputation: 7594

Try this regular expression:

/(<link.*href=["'])(style\.css)(["'][^>]*>)/gi 

Replace portion would look like

\1http://\2\3

or

$1http://$2$3

Note: You may need to escape one of the quotes based on how you quote the string.
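
In PHP, the same idea could look like the sketch below. PCRE has no g flag (preg_replace already replaces every match), and preg_quote escapes the dot in the file name so that it only matches a literal dot. The variable names and the sample markup are illustrative.

```php
<?php
$file = 'style.css';   // file name to look for (illustrative)
$html = '<link rel="stylesheet" href="style.css" type="text/css">';

// preg_quote turns "style.css" into "style\.css"; "~" is used as the delimiter.
$pattern = '~(<link\b[^>]*\shref=["\'])(' . preg_quote($file, '~') . ')(["\'][^>]*>)~i';

// ${1}, ${2} and ${3} refer to the three capture groups;
// "http://" is inserted in front of the file name, as in the replacement above.
$result = preg_replace($pattern, '${1}http://${2}${3}', $html);

echo $result, "\n";
// prints: <link rel="stylesheet" href="http://style.css" type="text/css">
```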

Upvotes: 2

Juicy Scripter

Reputation: 25918

You can use preg_replace like this to achieve the desired result:

$html = preg_replace('/(<link\b[^>]*?\shref=")(?!https?:)([^"]*)("[^>]*>)/', '$1'.$url.'$2$3', $html);

So this code (assuming it is stored in $html and $url = 'http://mydomain.com/'):

<link rel="stylesheet" href="style.css" type="text/css">
<link rel="stylesheet" href="style2.css" type="text/css">
<link rel="stylesheet" href="http://google.com/style3.css" type="text/css">
<link rel="stylesheet" href="style4.css" type="text/css">
<link rel="stylesheet" href="https://google.com/style5.css" type="text/css">
<link rel="stylesheet" href="some/path/to/style6.css" type="text/css">

Will be converted to this:

<link rel="stylesheet" href="http://mydomain.com/style.css" type="text/css">
<link rel="stylesheet" href="http://mydomain.com/style2.css" type="text/css">
<link rel="stylesheet" href="http://google.com/style3.css" type="text/css">
<link rel="stylesheet" href="http://mydomain.com/style4.css" type="text/css">
<link rel="stylesheet" href="https://google.com/style5.css" type="text/css">
<link rel="stylesheet" href="http://mydomain.com/some/path/to/style6.css" type="text/css">

Upvotes: 3

karim79

Reputation: 342635

The solution to this will never be pretty (or reliable) using a regex. I would recommend using a DOM parser instead and changing the attribute with one of its manipulation methods. Have a look at simplehtmldom:

http://simplehtmldom.sourceforge.net/

For example, take a look at this:

// Create DOM from string
$html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');

$html->find('div', 1)->class = 'bar';

$html->find('div[id=hello]', 0)->innertext = 'foo';

echo $html; // Output: <div id="hello">foo</div><div id="world" class="bar">World</div>
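
Applied to the question, the same DOM-based idea might look like the sketch below. It uses PHP's bundled DOMDocument class rather than simplehtmldom, so it is an alternative under that assumption and not simplehtmldom's own API. The function name prefixLinkHrefs is hypothetical.

```php
<?php
// Hypothetical helper: prefix every relative <link href> with $url and a slash.
function prefixLinkHrefs(string $html, string $url): string
{
    $doc = new DOMDocument();
    // Collect libxml warnings (raised for HTML5 or sloppy markup) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('link') as $link) {
        $href = $link->getAttribute('href');
        // Leave empty values and absolute http:// or https:// URLs alone.
        if ($href !== '' && !preg_match('~^https?://~i', $href)) {
            $link->setAttribute('href', rtrim($url, '/') . '/' . $href);
        }
    }

    // saveHTML() serializes the whole document, including the wrapper
    // elements that loadHTML() adds around a fragment.
    return $doc->saveHTML();
}

echo prefixLinkHrefs('<link rel="stylesheet" href="style.css" type="text/css">', 'http://mydomain.com');
```

Since the parser reads the attribute itself, the position of href inside the tag and the quote style no longer matter.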

Upvotes: 2

whichdan

Reputation: 1897

I'm guessing you're editing a single file - your text editor or IDE should be able to do a regex search/replace for you.

Try this:

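Search: href="(?!https?://)([^"]*)"

Replace: href="<?php echo $url; ?>/\1"

If you need to use this in PHP, use preg_replace. Just remember that the pattern needs a delimiter, such as a forward slash, before and after it, and that a forward slash used as the delimiter must be escaped wherever it appears inside the pattern.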
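
One caveat when excluding a prefix: a character class such as [^http] means "one character that is not h, t or p", so it rejects any value that merely starts with one of those letters, for example theme.css. A negative lookahead tests for the actual prefix. The sketch below compares the two in PHP, using ~ as the delimiter so that slashes need no escaping. The sample markup and the $url value are made up for illustration.

```php
<?php
$url  = 'http://mydomain.com';   // illustrative value
$html = '<link href="theme.css"> <link href="https://google.com/a.css">';

// Character class: "theme.css" starts with "t", so it is skipped by mistake.
$classBased = preg_replace('~href="([^http].*?)"~', 'href="' . $url . '/$1"', $html);

// Negative lookahead: only values starting with http:// or https:// are skipped.
$lookahead = preg_replace('~href="(?!https?://)([^"]*)"~', 'href="' . $url . '/$1"', $html);

echo $classBased, "\n";
echo $lookahead, "\n";
```

With this input the character-class version leaves the markup unchanged, while the lookahead version rewrites only the first tag.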

Upvotes: -2
