Reputation: 683
I have an HTML string:
"this is <b>bold</b>, and then again - <b>another bolded</b> one"
My desired result is a list of all tags plus the index of each tag:
results = [
  {
    tag: '<b>bold</b>',
    text: 'bold',
    index: 8
  },
  {
    tag: '<b>another bolded</b>',
    text: 'another bolded',
    index: 38
  }
]
I tried using this regex
/\<b\>(.*)\<\/b\>/
but it gives me this result instead
results = [
  {
    tag: '<b>bold</b>, and then again - <b>another bolded</b>',
    text: 'bold</b>, and then again - <b>another bolded',
    index: 8
  }
]
The JavaScript I use now is:
var func = function() {
  var text = "this is <b>bold</b>, and then again - <b>another bolded</b> one";
  var match = text.match(/\<b\>(.*)\<\/b\>/);
  var result = [
    {
      tag: match[0],
      text: match[1],
      index: match.index
    }
  ];
  return result;
}
Upvotes: 1
Views: 90
Reputation: 635
Try inserting a ? to make (.*) lazy instead of greedy:
/\<b\>(.*?)\<\/b\>/
https://javascript.info/regexp-greedy-and-lazy
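To make the difference concrete, here is a minimal sketch that runs both patterns against the string from the question. The variable names greedy and lazy are only illustrative.

```javascript
const text = "this is <b>bold</b>, and then again - <b>another bolded</b> one";

// Greedy: .* consumes as much as possible, so the match ends at the LAST </b>.
const greedy = text.match(/<b>(.*)<\/b>/);

// Lazy: .*? consumes as little as possible, so the match ends at the FIRST </b>.
const lazy = text.match(/<b>(.*?)<\/b>/);

console.log(greedy[1]); // bold</b>, and then again - <b>another bolded
console.log(lazy[1]);   // bold
```

Both matches start at the same index (8); only where they end differs.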
As for the indexes of the opening and closing tags: the index of the opening tag is already known, since it is match.index of the match against /\<b\>(.*?)\<\/b\>/.
For the closing tag, add the index of the opening tag in text to the index of the closing tag within match[0]:
{
  tag: match[0],
  text: match[1],
  index: match.index,
  closingTagIndex: match[0].match(/(<\/b\>)/).index + match.index
}
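The object above still describes only the first match, while the question asks for all of them. A minimal sketch of one way to collect every tag, assuming an environment that supports String.prototype.matchAll (ES2020). The helper name findBoldTags is hypothetical, and the closing tag index is computed from the match length rather than with a second regex:

```javascript
// Hypothetical helper: returns one entry per <b>...</b> pair found in the input.
function findBoldTags(text) {
  // matchAll requires the g flag; .*? keeps each match to a single tag pair.
  const pattern = /<b>(.*?)<\/b>/g;
  return Array.from(text.matchAll(pattern), (match) => ({
    tag: match[0],
    text: match[1],
    index: match.index,
    // The closing tag starts where the match ends, minus the length of "</b>".
    closingTagIndex: match.index + match[0].length - "</b>".length,
  }));
}

const results = findBoldTags(
  "this is <b>bold</b>, and then again - <b>another bolded</b> one"
);
console.log(results);
```

This sketch does not handle nested <b> tags or tags with attributes. For arbitrary HTML, a real parser is likely to be more robust than a regex.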
Upvotes: 3
Reputation: 18908
You can use replace to loop over the string, collecting the tag, text, and index of each match:
const string = "this is <b>bold</b>, and then again - <b>another bolded</b> one";
const matches = [];
string.replace(/<b>(.*?)<\/b>/g, (tag, text, index) => {
  matches.push({tag, text, index});
});
console.log(matches);
Upvotes: 3