Joe

Reputation: 1772

preg_match_all not matching href section correctly

I'm having a problem matching the href section of a link using preg_match_all. It currently captures 3 sections (full link, URL only, link text only), which is perfect, but the URL-only part also captures any other attributes located after the href attribute.

Also, how do I make the "href" text case-insensitive?

Code:

$content = '<a href="http://www.google.com" target="_blank">Google</a> is a search engine. <a href="http://www.yahoo.com" title="yahoo" target="_blank">Yahoo</a> is a search engine.';

preg_match_all('/<a href="([^<]*)">([^<]*)<\/a>/', $content, $matches);

print_r($matches);

Result:

Array
(
    [0] => Array
        (
            [0] => <a href="http://www.google.com" target="_blank">Google</a>
            [1] => <a href="http://www.yahoo.com" title="yahoo" target="_blank">Yahoo</a>
        )

    [1] => Array
        (
            [0] => http://www.google.com" target="_blank
            [1] => http://www.yahoo.com" title="yahoo" target="_blank
        )

    [2] => Array
        (
            [0] => Google
            [1] => Yahoo
        )

)

Upvotes: 0

Views: 1980

Answers (1)

bizzehdee

Reputation: 21003

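Your pattern only stops at the closing > and does not account for any other attributes, so the first group, [^<]*, captures everything up to the last quote before it. Try:

/<a href="([^"]*)"[^>]*>([^<]*)<\/a>/i

This captures the href value up to its closing quote, skips over the rest of the attributes (using [^>]* rather than [^>]+ so that links with no extra attributes still match), and then captures the text right up to the next tag. The trailing i modifier makes the match case-insensitive, so HREF matches as well.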
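As a rough check, here is a minimal sketch that runs a pattern of this shape against the sample string from the question. It makes a few assumptions that are not in the original post: the second link is upper-cased to exercise the i modifier, \s+ is used instead of a single space after <a, and [^>]* is used so that a link with no extra attributes would also match. It still assumes href is the first attribute and that the link text contains no nested tags.

```php
<?php
// Sample input from the question; the second tag is upper-cased here
// (an assumption for illustration) to show the case-insensitive match.
$content = '<a href="http://www.google.com" target="_blank">Google</a> is a search engine. '
         . '<A HREF="http://www.yahoo.com" title="yahoo" target="_blank">Yahoo</A> is a search engine.';

// Group 1: href value up to its closing quote.
// [^>]*  : skip any remaining attributes (zero or more characters).
// Group 2: link text up to the next tag.
// i      : case-insensitive, so <A HREF=...> matches too.
$pattern = '/<a\s+href="([^"]*)"[^>]*>([^<]*)<\/a>/i';

$count = preg_match_all($pattern, $content, $matches);

// $matches[1] holds the URLs, $matches[2] holds the link text.
print_r($matches[1]);
print_r($matches[2]);
```

For anything beyond simple, predictable markup, an HTML parser is usually more robust than a regular expression, since attribute order and quoting can vary.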

Upvotes: 2
