Undefined

Reputation: 1929

Regex match if not after word

I have a regex that matches URLs and converts them into HTML links. If the URL is already part of a link, I don't want it to match. For example:

http://stackoverflow.com/questions/ask

Should match, but:

<a href="http://stackoverflow.com/questions/ask">Stackoverflow</a>

Shouldn't match

How can I create a regex to do this?

Upvotes: 3

Views: 5878

Answers (4)

Igor Luzhanov

Reputation: 830

Try this

/(?:(([^">']+|^)https?\:\/\/[^\s]+))/m

Upvotes: 0

Woot4Moo

Reputation: 24316

This link provides information. The accepted solution is like so:

   <a\s
      (?:(?!href=|target=|>).)*
      href="http://
      (?:(?!target=|>).)*

By removing the references to "target" this should work for you.
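As a rough sketch of what that might look like, here is the pattern with the "target" alternatives taken out, compiled in Python's verbose mode so it can keep its multi-line layout. The names `EXISTING_LINK` and `has_existing_link` are hypothetical, and the pattern only recognizes `http://` links with a double-quoted `href`, exactly as written in the answer.

```python
import re

# The answer's pattern with the "target" alternatives removed.
# re.VERBOSE ignores the whitespace and allows the trailing comments.
EXISTING_LINK = re.compile(r'''
    <a\s
    (?:(?!href=|>).)*     # any attributes that come before href
    href="http://
    (?:(?!>).)*           # rest of the opening tag, up to ">"
''', re.VERBOSE)


def has_existing_link(text):
    """Return True if text contains an <a ...> tag whose href starts with http://."""
    return EXISTING_LINK.search(text) is not None
```

This detects links that already exist rather than skipping them during a replacement, so it would have to be combined with the URL-matching step in some way.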

Upvotes: 0

Shiplu Mokaddim

Reputation: 57650

If your URL-matching regular expression is $URL, you can prefix it with a negative lookbehind:

(?<!href=[\"'])$URL

In PHP you'd write

preg_match("/(?<!href=[\"'])$URL/", $text, $matches);
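For comparison, here is a minimal Python sketch of the same idea. The `URL` pattern is a deliberately simple placeholder (an assumption, not part of the answer), and the lookbehind includes the `=` that sits between `href` and the opening quote in real markup.

```python
import re

# Placeholder URL pattern (an assumption for illustration; substitute your own).
URL = r"https?://[^\s\"'<>]+"

# Negative lookbehind: skip URLs that directly follow href=" or href='.
# Python needs fixed-width lookbehinds; href=["'] is always 6 characters.
BARE_URL = re.compile(r"(?<!href=[\"'])" + URL)


def find_bare_urls(text):
    """Return URLs that are not the value of an href attribute."""
    return BARE_URL.findall(text)
```

With the two inputs from the question, the bare URL is returned and the one inside the `href` attribute is skipped.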

Upvotes: 5

Jay

Reputation: 57919

You can use a negative lookbehind to assert that the URL is not immediately preceded by href=":

(?<!href=")

(Your url-matching pattern should go immediately after that.)
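To show how this could fit into the replacement step the question describes, here is a small Python sketch. `URL_PATTERN` and `linkify` are hypothetical names, and the URL pattern is a simplified stand-in for whatever the asker is actually using.

```python
import re

# Simplified stand-in URL pattern (an assumption; the question does not show its own).
URL_PATTERN = r"https?://[^\s\"<>]+"

# The negative lookbehind goes immediately before the URL pattern.
LINKIFY = re.compile(r'(?<!href=")(' + URL_PATTERN + ")")


def linkify(text):
    """Wrap bare URLs in <a> tags, leaving existing href values untouched."""
    return LINKIFY.sub(r'<a href="\1">\1</a>', text)
```

Note that the lookbehind only guards the `href` attribute value. A URL that appears as the visible text of an existing link is not preceded by `href="`, so it would still be matched.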

Upvotes: 2
