GivP

Reputation: 2654

Regex in JavaScript to remove links

I have a string in JavaScript that includes an a tag with an href. I want to remove every link together with its inner text. I know how to remove just the link tags and leave the inner text, but I want to remove the link completely.

For example:

var s = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";

I would like to use a regex so I'm left with:

s = "check this out. cool, huh?";

Upvotes: 16

Views: 25342

Answers (6)

mazy

Reputation: 690

The other examples do not remove all occurrences, because their regexes lack the g flag. Here is my solution, which strips every link tag and keeps the inner text:

str.replace(/<a\b[^>]*>/gm, '').replace(/<\/a>/gm, '')

Upvotes: 1

Paul Worlton

Reputation: 161

Just to clarify: stripping the link tags while leaving everything between them untouched is a two-step process. Remove the opening tag, then remove the closing tag.

txt.replace(/<a\b[^>]*>/i,"").replace(/<\/a>/i, "");

Working sample:

<script>
 function stripLink(txt) {
    return txt.replace(/<a\b[^>]*>/i,"").replace(/<\/a>/i, "");
 }
</script>

<p id="strip">
 <a href="#">
  <em>Here's the text!</em>
 </a>
</p>

<p>
 <input value="Strip" type="button" onclick="alert(stripLink(document.getElementById('strip').innerHTML))">
</p>

Upvotes: 16

Ionuț G. Stan

Reputation: 179109

I just left a comment about John Resig's HTML parser. Maybe it will help with your problem.

Upvotes: 1

ChristopheD

Reputation: 116137

This will strip out the first <a>...</a> element, including everything between the opening and closing tags:

mystr = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
alert(mystr.replace(/<a\b[^>]*>(.*?)<\/a>/i,""));

It's not really foolproof, but maybe it'll do the trick for your purpose...
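To illustrate how the pattern above might be extended, here is a minimal sketch that removes every link rather than only the first. The helper name removeLinks is hypothetical, and the sketch assumes that no attribute value contains a ">" character and that links are not nested. It adds the g flag, swaps .*? for [\s\S]*? so that link text spanning several lines still matches, and also consumes whitespace directly before the link so the asker's example comes out without a stray space.

```javascript
// Hypothetical helper: removes every <a>...</a> element, inner text included.
// Assumes attribute values contain no ">" and that links are not nested.
function removeLinks(html) {
  // \s*      : swallow whitespace right before the link, so
  //            "out <a ...>x</a>." becomes "out." rather than "out ."
  // [\s\S]*? : lazily match any characters, newlines included, up to the first </a>
  // g, i     : replace every occurrence, case-insensitively
  return html.replace(/\s*<a\b[^>]*>[\s\S]*?<\/a>/gi, '');
}

var s = "check this out <a href='http://www.google.com'>Click me</a>. cool, huh?";
console.log(removeLinks(s)); // check this out. cool, huh?
```

Dropping the leading \s* keeps the surrounding whitespace exactly as it was, which may be preferable when links sit in the middle of a sentence.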

Upvotes: 21

georgebrock

Reputation: 30033

If you only want to remove <a> elements, the following should work well:

s.replace(/<a [^>]+>[^<]*<\/a>/, '');

This should work for the example you gave, but it won't work when the link contains nested tags. For example, it wouldn't match this HTML:

<a href="http://www.google.com"><em>Google</em></a>
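The limitation can be checked directly. The sketch below runs the pattern from this answer against a plain link and against the nested example, then tries an assumed variant (not part of the original answer) that lazily accepts any characters up to the first closing tag. The variable names are illustrative only.

```javascript
// Pattern from the answer: the link text may not contain "<".
var simple = /<a [^>]+>[^<]*<\/a>/;
// Assumed variant: lazily allow anything, nested tags included, up to the first </a>.
var lazy = /<a [^>]+>[\s\S]*?<\/a>/;

var plain = "see <a href='http://www.google.com'>Google</a> now";
var nested = '<a href="http://www.google.com"><em>Google</em></a>';

console.log(plain.replace(simple, ''));   // link removed
console.log(nested.replace(simple, ''));  // unchanged: [^<]* stops at <em>
console.log(nested.replace(lazy, ''));    // link removed, result is an empty string
```

The lazy variant still stops at the first closing tag it finds, so it would mis-handle a link nested inside another link.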

Upvotes: 1

Chas. Owens

Reputation: 64909

Regexes are fundamentally bad at parsing HTML (see the question "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why). What you need is an HTML parser. See "Can you provide an example of parsing HTML with your favorite parser?" for examples using a variety of parsers.

Upvotes: 3
