Exception

Reputation: 8379

Regular expression to remove text outside the tags in a string

I have a string that contains XML, like the one below:

 var str= "<str>rvrv</str>rvrv<q1>vrvv</q1>vrvrv<q2>rtvrvr</q2>";

How can I remove the text outside the tags (text that does not belong to any element) using a regular expression? Please help me with this.

Upvotes: 0

Views: 1228

Answers (1)

Fabrizio Calderan

Reputation: 123367

Assuming your problem is only to remove text that is not enclosed in an element (and that the rest of the markup is well formed, so you don't have strings like

var str= "<str>lorem <b>ipsum</str>";

), you could use a regular expression like this:

var str= "<str>rvrv</str>rvrv<q1>vrvv</q1>vrvrv<q2>rtvrvr</q2>",
    // match() returns null when nothing matches, so fall back to an empty array
    elements = str.match(/<(.+?)>[^<]+<\/\1>/gi) || [];

console.log(elements.join(''));

and this returns

<str>rvrv</str><q1>vrvv</q1><q2>rtvrvr</q2>

Note: to detect closing tags I used a backreference (see http://www.regular-expressions.info/brackets.html)
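
As a rough sketch of how the pieces fit together, the pattern can be wrapped in a small function. The name `keepTaggedText` is hypothetical (it is not part of the answer above), and the `|| []` fallback is an addition to cover the case where `match` finds nothing and returns `null`.

```javascript
// Hypothetical helper (an illustration, not part of the original answer):
// keeps only the elements matched by the pattern and drops the text between them.
function keepTaggedText(input) {
  // <(.+?)>  captures everything between "<" and ">" as group 1
  // [^<]+    requires a non-empty body that contains no "<"
  // <\/\1>   the backreference \1 demands a closing tag with the same captured text
  var pattern = /<(.+?)>[^<]+<\/\1>/gi;
  // match() returns null when there is no match, so fall back to an empty array
  var elements = input.match(pattern) || [];
  return elements.join('');
}

console.log(keepTaggedText("<str>rvrv</str>rvrv<q1>vrvv</q1>vrvrv<q2>rtvrvr</q2>"));
// <str>rvrv</str><q1>vrvv</q1><q2>rtvrvr</q2>

// The malformed string mentioned above matches nothing, so the result is empty
console.log(keepTaggedText("<str>lorem <b>ipsum</str>") === ""); // true
```

Because everything between `<` and `>` is captured and must reappear in the closing tag, an element with attributes such as `<a href="x">text</a>` is not matched by this pattern. The same goes for an element whose body is empty or contains other tags: it is not matched as a whole.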

Upvotes: 3
