David Smith

Reputation: 39734

How can I convert URLs to Markdown syntax, but NOT interfere with URLs already in Markdown syntax?

A system I am writing uses Markdown to modify links, but I also want to make plain links active, so that typing http://www.google.com would become an active link. To do this, I am using a regex replacement to find URLs and rewrite them in Markdown syntax. The problem is that I cannot get the regex to skip links that are already in Markdown syntax.

I'm using the following code:

$value = preg_replace('@((?!\()https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)@', '[$1]($1)', $value);

This works well for plain links, such as http://www.google.com, but I need it to ignore links already in the Markdown format. I thought the section (?!\() would prevent it from matching URLs that follow a parenthesis, but it would seem that I am in error.

I realize that even this is not an ideal solution (if it worked), but this is pushing beyond my regex abilities.

Upvotes: 3

Views: 445

Answers (2)

Alan Moore

Reputation: 75222

I think (?<!\() is what you meant. If the match position is at the beginning of http://www.google.com, it's not the next character you need to check, but the previous one. In other words you need a negative lookbehind, not a negative lookahead.
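
A minimal sketch of that change, reusing the pattern from the question with only the lookahead swapped for a lookbehind. The function name linkify_plain_urls is made up for illustration and is not part of the original code.

```php
<?php
// Hypothetical helper: wraps bare URLs in Markdown link syntax.
// Same pattern as in the question, except that the negative lookahead (?!\()
// is replaced by a negative lookbehind (?<!\() placed before the capture group.
function linkify_plain_urls(string $value): string
{
    $pattern = '@(?<!\()(https?://([-\w\.]+)+(:\d+)?(/([\w/_\.]*(\?\S+)?)?)?)@';

    // $1 is the whole URL, so it serves as both link text and link target.
    return preg_replace($pattern, '[$1]($1)', $value);
}

// A bare URL is converted; a URL directly after "(" is left alone.
echo linkify_plain_urls('Visit http://www.google.com or [Google](http://www.google.com)'), "\n";
// prints: Visit [http://www.google.com](http://www.google.com) or [Google](http://www.google.com)
```

Note that the lookbehind only checks for an opening parenthesis. A URL used as the link text itself, as in [http://www.google.com](http://www.google.com), is preceded by "[" and would still be rewritten; widening the lookbehind to (?<![(\[]) might be one way to cover that case.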

Upvotes: 1

Dustin Getz

Reputation: 21801

Regexes are notoriously bad at stuff like this; you might end up with all sorts of clever HTML exploits you never could have thought of. IMO you should modify the Markdown script to flag Markdown URLs as it sees them, so you can ignore the flagged URLs when you later find all URLs with a very simple search that doesn't leave any complexity to hack.
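
A rough sketch of the "recognize Markdown links first, then do a simple search" idea. It does not modify the Markdown script itself; instead it approximates the flagging in a single pass with preg_replace_callback, where one alternative matches an existing inline Markdown link and the other matches a bare URL. The function name linkify_skipping_markdown and both sub-patterns are assumptions for illustration. They are deliberately simple and do not handle reference-style links, nested brackets, or trailing punctuation after a bare URL.

```php
<?php
// Hypothetical helper: converts bare URLs to Markdown links while passing
// existing inline Markdown links through unchanged.
function linkify_skipping_markdown(string $value): string
{
    // Alternative 1: an inline Markdown link of the form [text](target).
    // Alternative 2: a bare http/https URL, ending at whitespace or a bracket.
    $pattern = '@\[[^\]]*\]\([^)\s]*\)|https?://[^\s<>()\[\]]+@';

    return preg_replace_callback($pattern, function (array $m): string {
        $match = $m[0];
        if ($match[0] === '[') {
            // Already a Markdown link: return it untouched.
            return $match;
        }
        // Bare URL: use it as both link text and link target.
        return '[' . $match . '](' . $match . ')';
    }, $value);
}

echo linkify_skipping_markdown('Go to http://a.com and [A](http://a.com)'), "\n";
// prints: Go to [http://a.com](http://a.com) and [A](http://a.com)
```

Because the scan runs left to right and the Markdown alternative consumes the whole [text](target) construct, URLs inside an existing link are never seen by the bare-URL alternative. This says nothing about HTML safety; escaping the output is still a separate concern.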

Upvotes: 0
