Reputation: 111
Hey, this may have been asked elsewhere, but I couldn't seem to find it.
Essentially, I'm just trying to remove the a tags from a string using regex in JavaScript.
So I have this:
<a href="www.google.com">This is google</a>
and I want the output to be just "This is google". How would this be done in JavaScript using regex? Thanks in advance!
SOLUTION:
OK, so the solution I was given by my boss goes as follows:
The best way to do that is in two parts: first remove all the closing tags, then focus on removing the opening tag. It should be as straightforward as:
/<a\s+.*?>(.*)<\/a>/
Here .*? is the non-greedy version of "match anything", and the capturing group (.*) holds the link text.
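A minimal sketch of how a pattern along these lines could be used to strip the tags and keep the text. The helper name stripAnchorTags is made up for illustration, and the pattern is a tightened variant of the one above ([^>]* for the attributes, a lazy capture, and the g flag so every link is handled), not the exact regex from the post. It assumes no attribute value contains a ">" character.

```javascript
// Hypothetical helper, not part of the original solution.
function stripAnchorTags(input) {
  // <a\b[^>]*>   matches an opening <a ...> tag
  // ([\s\S]*?)   lazily captures the link text, including newlines
  // <\/a>        matches the closing tag
  // The g flag repeats the replacement for every link; i ignores tag case.
  return input.replace(/<a\b[^>]*>([\s\S]*?)<\/a>/gi, '$1');
}

console.log(stripAnchorTags('<a href="www.google.com">This is google</a>'));
// This is google
console.log(stripAnchorTags('See <a href="a.html">one</a> and <a href="b.html">two</a>.'));
// See one and two.
```

Using replace with '$1' swaps each whole element for its captured text, so any text around the links is left untouched.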
Upvotes: 1
Views: 1761
Reputation: 3535
Try this with a lookahead, then take the first capturing group.
(?=>).([^<]+)
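A short sketch of how that pattern might be applied, using the string from the question. The variable names are assumptions for illustration. Since (?=>) only asserts that the next character is ">" and the following "." then consumes it, a literal ">" gives the same result here.

```javascript
var str = '<a href="www.google.com">This is google</a>';

// (?=>) asserts that the next character is ">", then "." consumes it,
// and ([^<]+) captures everything up to the next "<".
var lookaheadMatch = /(?=>).([^<]+)/.exec(str);
// Same result without the lookahead: match ">" literally.
var plainMatch = />([^<]+)/.exec(str);

// exec returns null when nothing matches, so guard before reading group 1.
console.log(lookaheadMatch ? lookaheadMatch[1] : null); // This is google
console.log(plainMatch ? plainMatch[1] : null);         // This is google
```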
Upvotes: 0
Reputation: 73221
This shouldn't be done with regex at all, but like this for example:
var a = document.querySelectorAll('a');
// Convert the NodeList to an array, then collect each link's inner HTML.
var texts = [].slice.call(a).map(function(val){
    return val.innerHTML;
});
console.log(texts);
<a href="www.google.com">this is google</a>
If you only have a string with multiple <a href...> tags, you can create an element first:
var a_string = '<a href="www.google.com">this is google</a><a href="www.yahoo.com">this is yahoo</a>',
    el = document.createElement('p');
// Let the browser parse the string by assigning it to a detached element.
el.innerHTML = a_string;
var a = el.querySelectorAll('a');
var texts = [].slice.call(a).map(function(val){
    return val.innerHTML;
});
console.log(texts);
Upvotes: 2
Reputation: 11318
One more way is to use a capturing group. You basically match the whole element, but grab just the one captured result:
var re = /<a href=.+>(.+)<\/a>/;
var str = '<a href="http://www.somesite.com">this is google</a>';
var m = re.exec(str);
// exec returns null when nothing matches, so only read the group on a match.
if (m !== null) {
    console.log(m[1]);
}
https://regex101.com/r/rL0bT6/1 Note: code created by regex101.
Demo: http://jsfiddle.net/ry83mhwc/
Upvotes: -1
Reputation: 247
I don't know your case, but if you're using JavaScript you might be able to get the inside of the element with innerHTML. So, element.innerHTML might output This is google.
The reasoning is that Regex really isn't meant to parse HTML.
If you really, really want a Regexp, here you go:
var pattern = />(.*)</;
var string = '<a href="www.google.com">This is google</a>';
var matches = pattern.exec(string);
console.log(matches[1]); // This is google
This uses a match group to get the stuff between > and <. This won't work in every case, I guarantee it.
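To make that caveat concrete, here is a small sketch of one case where the pattern breaks. The two-link input string and the variable names are assumptions made up for illustration. Because .* is greedy, it runs from the first ">" to the last "<" on the line; the lazy .*? stops at the first "<" instead.

```javascript
var greedy = />(.*)</;
var lazy = />(.*?)</;
// Hypothetical input with two links on one line (not from the original question).
var twoLinks = '<a href="a.html">one</a> <a href="b.html">two</a>';

var greedyText = greedy.exec(twoLinks)[1];
var lazyText = lazy.exec(twoLinks)[1];

console.log(greedyText); // one</a> <a href="b.html">two
console.log(lazyText);   // one
```

Even the lazy version only returns the first link's text and still trips over nested tags or a ">" inside an attribute value, which is why the DOM-based approach is usually the safer one.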
Upvotes: 0