jthompson

Reputation: 931

JavaScript Regexp to Wrap URLs and Emails in anchors

I searched high and low but cannot seem to find a definitive answer to this, as is often the case with regexps, so I thought I'd ask here.

I'm trying to put together a regular expression I can use in JavaScript to replace all instances of URLs and email addresses (it doesn't need to be especially strict) with anchor tags pointing to them.

Obviously this is something usually done very simply on the server side, but in this case it is necessary to work with plain text, so an elegant JavaScript solution to perform the replacements at runtime would be perfect.

The only problem is, as I've stated before, I have a huge regular-expression-shaped gaping hole in my skill set :(

I know that one of you has the answer at the tip of your fingers though :)

Upvotes: 3

Views: 3929

Answers (5)

JeremyWeir

Reputation: 24378

Here's a good article for urls...

https://blog.codinghorror.com/the-problem-with-urls/

Emails are more straightforward since they have to end in a .tld. You don't need to get fancy with that one since you're not validating, just matching, so off the top of my head...

[^\s]+@\w[\w.-]*\.[a-zA-Z]+
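A minimal sketch of how a loose pattern like this could be used in a replace. The helper name linkifyEmails is hypothetical, and the pattern is a slight variation that assumes the dot before the TLD is escaped and that '<' and '>' are excluded from the local part so surrounding markup is not swallowed:

```javascript
// Loose e-mail matcher (matching, not validating).
// Assumptions: the dot before the TLD is escaped, and '<' / '>' are
// excluded from the local part so neighbouring markup is left alone.
var looseEmail = /[^\s<>]+@\w[\w.-]*\.[a-zA-Z]+/g;

// linkifyEmails is a hypothetical helper name used for illustration.
function linkifyEmails(text) {
  // "$&" in the replacement string stands for the whole match.
  return text.replace(looseEmail, '<a href="mailto:$&">$&</a>');
}

console.log(linkifyEmails('Mail joe@example.com today'));
// → Mail <a href="mailto:joe@example.com">joe@example.com</a> today
```

Because the pattern must end in letters, a sentence-ending period right after the address is left outside the link.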

Upvotes: 1

Daniel LeCheminant

Reputation: 51091

Well, blindly using regexps from http://www.osix.net/modules/article/?id=586

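// Backslashes are doubled because the pattern is built from string
// literals; a single "\." in a string literal collapses to ".".
var emailRegex = 
   new RegExp(
   '([a-zA-Z0-9_\\-\\.]+)@((\\[[0-9]{1,3}' + 
   '\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.' + 
   ')+))([a-zA-Z]{2,4}|[0-9]{1,3})(\\]?)',
   "gi");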

var urlRegex = 
   new RegExp(
   '((https?://)' + 
   '?(([0-9a-z_!~*\'().&=+$%-]+:)?[0-9a-z_!~*\'().&=+$%-]+@)?' + // user:password@ 
   '(([0-9]{1,3}\\.){3}[0-9]{1,3}' + // IP- 199.194.52.184 
   '|' + // allows either IP or domain 
   '([0-9a-z_!~*\'()-]+\\.)*' + // tertiary domain(s)- www. 
   '([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\\.' + // second level domain 
   '[a-z]{2,6})' + // first level domain- .com or .museum 
   '(:[0-9]{1,4})?' + // port number- :80 
   '((/?)|' + // a slash isn't required if there is no file name 
   '(/[0-9a-z_!~*\'().;?:@&=+$,%#-]+)+/?))',
   "gi");

then

// $& is the whole match; $1 would only be the part before the @
text = text.replace(emailRegex, "<a href='mailto:$&'>$&</a>");

and

text = text.replace(urlRegex, "<a href='$1'>$1</a>");

might work.
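Since the URL pattern above also accepts an optional user@ prefix, two sequential replaces can end up wrapping an e-mail address twice. One way around that is a single pass with a replacer function, sketched below. The patterns here are deliberately loose stand-ins chosen for illustration (not the regexes above), and linkify is a hypothetical name:

```javascript
// Assumed, deliberately loose patterns for illustration only.
// Group 1 captures an e-mail address; group 2 captures a URL that starts
// with an explicit scheme or with "www.".
var linkPattern = /([\w.+-]+@[\w-]+(?:\.[\w-]+)+)|((?:https?:\/\/|www\.)[^\s<>"']+)/gi;

// linkify is a hypothetical helper name.
function linkify(text) {
  return text.replace(linkPattern, function (match, email, url) {
    if (email) {
      return '<a href="mailto:' + email + '">' + email + '</a>';
    }
    // A scheme-less match such as "www.example.com" needs a scheme,
    // otherwise the browser resolves the href as a relative path.
    var href = /^https?:\/\//i.test(url) ? url : 'http://' + url;
    return '<a href="' + href + '">' + url + '</a>';
  });
}

console.log(linkify('See www.example.com or mail bob@example.org'));
```

Each piece of text is matched at most once, so an address that has been turned into a mailto link is not picked up again as a URL. Note that this loose URL pattern keeps any punctuation that directly follows the URL.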

Upvotes: 4

ciscoheat

Reputation: 3947

Just adding a bit of information on email regexps: most of them seem to ignore that domain names can have the characters 'åäö' in them. So if you care about that, make sure that the solution you are using has åäöÅÄÖ in the domain part of the regexp.
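As a sketch of one alternative to listing such characters by hand: runtimes that support Unicode property escapes (ES2018, the "u" flag) can match any letter with \p{L}. The pattern and the variable names below are assumptions for illustration, not a validated e-mail grammar:

```javascript
// Assumption: the runtime supports Unicode property escapes (ES2018, "u" flag).
// \p{L} matches any letter and \p{N} any digit, so å, ä and ö in the
// domain are covered without being listed one by one.
var intlEmail = /[^\s<>@]+@[\p{L}\p{N}](?:[\p{L}\p{N}.-]*[\p{L}\p{N}])?\.\p{L}{2,}/gu;

// An ASCII-only pattern for comparison; \w never matches å, ä or ö.
var asciiEmail = /[^\s]+@\w[\w.-]*\.[a-zA-Z]+/;

console.log('kontakt@blåbär.se'.match(intlEmail)); // one match: the full address
console.log(asciiEmail.test('kontakt@blåbär.se')); // false
```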

Upvotes: 0

Tomalak

Reputation: 338316

As always, this ("this" being "processing HTML with regex") is going to be difficult and error-prone. The following will work on reasonably well-formed input only, but here's what I would do:

  1. find the element you want to process, take its innerHTML property value
  2. iteratively find everything that already is a link (/<a\b.+?<\/a>/ig)
  3. based on that, cut your string into "this isn't a link"- and "this is a link"-bits, appending all of them to a neatly ordered array
  4. process the "non-link" bits only (those that don't begin with "<a "), looking for URL- or e-mail-address patterns
  5. wrap every address you find in <a> tags
  6. join() the array back to a string
  7. set the innerHTML property to your new value

I am sure you will find regular expression examples that match e-mail addresses and URLs. Take the ones that suit you most, and use them in step 4.
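The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: linkifyHtml is a hypothetical name, the address pattern is a loose stand-in for whichever regexes you pick, and the non-link parts are assumed not to contain URLs inside other tags' attributes (an img src, for example, would be mangled).

```javascript
// linkifyHtml is a hypothetical helper; it works on an HTML string such as
// the value of an element's innerHTML property.
function linkifyHtml(html) {
  // Steps 2-3: split on existing anchors. Because the pattern has a
  // capturing group, split() keeps the anchors in the array, at odd indexes.
  var parts = html.split(/(<a\b[\s\S]*?<\/a>)/i);

  // Assumed, deliberately loose patterns: group 1 is an e-mail address,
  // the second alternative is a URL with an explicit http(s) scheme.
  var address = /([\w.+-]+@[\w-]+(?:\.[\w-]+)+)|(https?:\/\/[^\s<>"']+)/gi;

  // Steps 4-5: process only the non-link bits (even indexes).
  for (var i = 0; i < parts.length; i += 2) {
    parts[i] = parts[i].replace(address, function (match, email) {
      var href = email ? 'mailto:' + match : match;
      return '<a href="' + href + '">' + match + '</a>';
    });
  }

  // Step 6: join the array back into a string.
  return parts.join('');
}

// Steps 1 and 7, in a browser (el is an assumed element reference):
// el.innerHTML = linkifyHtml(el.innerHTML);
```

Splitting with a capturing group avoids a manual loop over matches: the existing links and the text between them land in one ordered array.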

Upvotes: 0

Chris Ballance

Reputation: 34347

Not a canned solution, but this will point you in the right direction.

I use Regex Coach to build and test my regexes. You can find plentiful examples of regexes for urls and email addresses online.

Upvotes: 1
