Reputation: 931
I searched high and low but cannot seem to find a definitive answer to this, as is often the case with regexps. So I thought I'd ask here.
I'm trying to put together a regular expression I can use in JavaScript to replace all instances of URLs and email addresses (doesn't need to be ever so strict) with anchor tags pointing to them.
Obviously this is something usually done very simply on the server-side, but in this case it is necessary to work with plain text, so an elegant JavaScript solution to perform the replacements at runtime would be perfect.
Only problem is, as I've stated before, I have a huge regular expression shaped gaping hole in my skill set :(
I know that one of you has the answer at the tip of your fingers though :)
Upvotes: 3
Views: 3929
Reputation: 24378
Here's a good article for urls...
https://blog.codinghorror.com/the-problem-with-urls/
Emails are more straightforward since they have to end in a .tld. You don't need to get fancy with that one since you're not validating, just matching, so off the top of my head...
[^\s]+@\w[\w.-]*\.[a-zA-Z]+
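As a quick check of how loose that kind of pattern is, here is a minimal sketch that runs it over some text. The sample string is made up for illustration, and the dot before the TLD is written as `\.` so that it only matches a literal dot.

```javascript
// Loose e-mail matcher in the spirit of the answer above.
// It is meant for spotting addresses in plain text, not for validating them.
var looseEmail = /[^\s]+@\w[\w.-]*\.[a-zA-Z]+/g;

// Hypothetical sample text, for illustration only.
var sample = "Write to jane.doe@example.com or bob@mail.example.org today";

// With the g flag, match() returns an array of every matched substring.
var found = sample.match(looseEmail);
console.log(found);
```

Because `[^\s]+` accepts anything that is not whitespace, punctuation glued to the front of an address (an opening parenthesis, say) would be swallowed too, which is probably acceptable for matching but not for validation.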
Upvotes: 1
Reputation: 51091
Well, blindly using regexps from http://www.osix.net/modules/article/?id=586
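// Backslashes are doubled because the pattern is built from string literals;
// a single backslash would be consumed by the string and never reach the regex.
var emailRegex =
new RegExp(
'([a-zA-Z0-9_\\-\\.]+)@((\\[[0-9]{1,3}' +
'\\.[0-9]{1,3}\\.[0-9]{1,3}\\.)|(([a-zA-Z0-9\\-]+\\.' +
')+))([a-zA-Z]{2,4}|[0-9]{1,3})(\\]?)',
"gi");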
var urlRegex =
new RegExp(
'((https?://)' +
'?(([0-9a-z_!~*\'().&=+$%-]+:)?[0-9a-z_!~*\'().&=+$%-]+@)?' + // user:password@
'(([0-9]{1,3}\\.){3}[0-9]{1,3}' + // IP- 199.194.52.184
'|' + // allows either IP or domain
'([0-9a-z_!~*\'()-]+\\.)*' + // tertiary domain(s)- www.
'([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\\.' + // second level domain
'[a-z]{2,6})' + // first level domain- .com or .museum
'(:[0-9]{1,4})?' + // port number- :80
// the path is tried first; without an end anchor, a leading (/?) would always win
'((/[0-9a-z_!~*\'().;?:@&=+$,%#-]+)+/?' +
'|/?))', // a slash isn't required if there is no file name
"gi");
then
text.replace(emailRegex, "<a href='mailto:$&'>$&</a>"); // $& is the whole match; $1 is only the part before the @
and
text.replace(urlRegex, "<a href='$1'>$1</a>");
might work
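Running two separate replacements has a catch: the second pass can match inside the anchors the first pass just produced. A minimal sketch of one way around that is to match both kinds of link in a single pass and escape everything else. The function names and the deliberately loose patterns below are assumptions for illustration, not the regexes quoted above.

```javascript
// Escape characters that are special in HTML so plain text is safe to insert.
function escapeHtml(s) {
  return s.replace(/[&<>"']/g, function (c) {
    return { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" }[c];
  });
}

// Assumed loose patterns: group 1 is an e-mail address, group 2 is a URL with
// an explicit http(s) scheme whose last character is not trailing punctuation.
var linkPattern =
  /([\w.+-]+@[\w-]+(?:\.[\w-]+)+)|(https?:\/\/[^\s<>"']*[^\s<>"'.,;:!?)])/gi;

// Hypothetical helper: turns plain text into HTML with links wrapped in <a>.
function linkify(text) {
  var out = "";
  var last = 0;
  var m;
  linkPattern.lastIndex = 0; // the g flag keeps state between exec() calls
  while ((m = linkPattern.exec(text)) !== null) {
    // Copy the plain text before this match, escaped.
    out += escapeHtml(text.slice(last, m.index));
    var href = m[1] ? "mailto:" + m[1] : m[2];
    out += '<a href="' + escapeHtml(href) + '">' + escapeHtml(m[0]) + "</a>";
    last = m.index + m[0].length;
  }
  // Copy whatever plain text follows the last match.
  return out + escapeHtml(text.slice(last));
}

var linked = linkify("Mail bob@example.org or see http://example.com/a?b=1&c=2.");
console.log(linked);
```

Each stretch of text is visited once, so a URL can never be matched inside an anchor that was just generated, and the sentence-ending period stays outside the link.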
Upvotes: 4
Reputation: 3947
Just adding a bit of information on email regexps: most of them seem to ignore that domain names can have characters such as 'åäö' in them. So if you care about that, make sure that the solution you are using allows åäöÅÄÖ in the domain part of the regexp.
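One way to allow such characters without listing them one by one is a Unicode property escape. This is only a sketch, and it assumes a JavaScript engine that supports `\p{...}` together with the `u` flag (ES2018 or later). The pattern and the sample text are made up for illustration.

```javascript
// \p{L} matches any letter, \p{N} any digit, \p{M} any combining mark,
// so å, ä and ö in the domain part are accepted like a-z.
var intlEmail =
  /[^\s@]+@[\p{L}\p{N}](?:[\p{L}\p{N}\p{M}.-]*[\p{L}\p{N}\p{M}])?\.\p{L}{2,}/gu;

// Hypothetical sample text, for illustration only.
var sample = "Kontakt: info@blåbär.se eller support@example.com";

var found = sample.match(intlEmail);
console.log(found);
```

On an older engine the same effect needs the characters spelled out in the class, for example `[a-z0-9åäö.-]` with the `i` flag.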
Upvotes: 0
Reputation: 338316
As always, this ("this" being "processing HTML with regex") is going to be difficult and error-prone. The following will work on reasonably well-formed input only, but here's what I would do:
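1. take the element's innerHTML property value
2. split() it on anchor tags (/(<a\b.+?<\/a>)/ig)
3. loop over the resulting array, skipping the elements that are already anchors (those starting with "<a ")
4. apply regex replacements to the remaining elements, looking for URL- or e-mail-address patterns and wrapping them in <a> tags
5. join() the array back to a string
6. set the innerHTML property to your new value
I am sure you will find regular expression examples that match e-mail addresses and URLs. Take the ones that suit you most, and use them in step 4.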
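The split-and-skip idea can be sketched on a plain string, which keeps it testable without a DOM. The function name and the simple URL pattern are assumptions for illustration; any URL or e-mail regex could be swapped in for the replacement step.

```javascript
// Hypothetical helper: wraps bare http(s) URLs in an HTML string in <a> tags
// while leaving existing <a>...</a> elements untouched.
function linkifyOutsideAnchors(html) {
  // The capturing group makes split() keep the anchors in the result array.
  var parts = html.split(/(<a\b.+?<\/a>)/ig);
  for (var i = 0; i < parts.length; i++) {
    // Skip chunks that are already links.
    if (/^<a\b/i.test(parts[i])) continue;
    // Assumed loose URL pattern; replace it with whichever regex suits you.
    parts[i] = parts[i].replace(/https?:\/\/[^\s<>"']+/gi, function (url) {
      return '<a href="' + url + '">' + url + "</a>";
    });
  }
  return parts.join("");
}

var result = linkifyOutsideAnchors(
  'See <a href="http://a.example/">this</a> and http://b.example/x'
);
console.log(result);
```

In a page this would be used roughly as `el.innerHTML = linkifyOutsideAnchors(el.innerHTML)`, where `el` is whatever element holds the text. It shares the caveat above: URLs sitting in the attributes of other tags, such as an image source, would be wrapped too, so it only suits reasonably well-formed and simple input.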
Upvotes: 0
Reputation: 34347
Not a canned solution, but this will point you in the right direction.
I use Regex Coach to build and test my regexes. You can find plentiful examples of regexes for urls and email addresses online.
Upvotes: 1