Reputation: 61
I have very simple HTML that is generated from a JSON database of strings like this:
"<div style=\"padding-top:59px;\"><a href=\"http://www.macm.org/en/index.html\"><img src=\"http://www.artimap.com/montreal/www.macm.org.jpg\"><br>www.macm.org/en/index.html</a><h1>Musée d'art contemporain de Montréal</h1><p></p><p>A major Canadian institution dedicated exclusively to contemporary art, the Musée offers a varied program ranging from presentations of its Permanent Collection to exhibitions of works by Québec, Canadian and international artists. The Permanent Collection comprises some 7,000 works, including the largest collection of art by Paul-Émile Borduas.</p><div><p>185, Sainte-Catherine West (corner Jeanne-Mance)</p><p>H2X 3X5</p></div><b>514 847-6226</b></div>"
I also have a variable, RESULTSshow, that is a concatenation of such strings, and another variable, searchterm, that holds the search term. I want to wrap each occurrence of searchterm in the results in <i>searchterm</i>. I am using these regular expressions and this function for each tag I am interested in, for example:
var REG=new RegExp(searchterm,'gmi');
var regFUN=function(x){return x.replace(REG,"<i>$&</i>");};
var reg = new RegExp('<p>(.*?)</p>','gmi');
RESULTSshow=RESULTSshow.replace(reg,regFUN);
(I do this for every tag I am interested in highlighting.)
This produces <i>searchterm</i>, but it also produces <<i>p</i>> if searchterm === "p", which has been bugging me for the last two days.
The problem is that if searchterm is "p", the replacement changes not only the text inside the tags but also the tags themselves, because regFUN receives the whole match, tags included.
How can I stop it from changing the tags? I really want to do this with a RegExp rather than by looping through the HTML (DOM), for the sake of speed.
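To make the behaviour concrete, here is a small reproduction next to one possible variant that rewrites only the captured inner text. It is a sketch, not a full fix: the sample string and the names broken and fixed are made up for illustration, and the variant still touches any tags nested inside the <p>.

```javascript
var searchterm = "p"; // example value, chosen to trigger the problem
var REG = new RegExp(searchterm, "gi");
var reg = /<p>(.*?)<\/p>/gi;

// Original approach: the callback argument x is the whole match,
// so the "p" in <p> and </p> gets wrapped as well.
var broken = "<p>Super</p>".replace(reg, function (x) {
  return x.replace(REG, "<i>$&</i>");
});
console.log(broken);

// Variant: highlight only the captured group, then re-add the tags.
var fixed = "<p>Super</p>".replace(reg, function (whole, inner) {
  return "<p>" + inner.replace(REG, "<i>$&</i>") + "</p>";
});
console.log(fixed);
```

The first log line shows the mangled tags, the second keeps <p> and </p> intact.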
Upvotes: 0
Views: 2894
Reputation: 61
Now using this wonderful little RegExp instead of the overly complicated first one:
REG=new RegExp("(?![^<>]*>)("+searchterm+")","gi");
RESULTSshow=RESULTSshow.replace(REG,'<i>$1</i>');
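One caveat with building the pattern by string concatenation: a search term such as "(" or "5.00" contains RegExp metacharacters and will either throw or match more than intended. A minimal sketch of escaping the term first follows; escapeRegExp and highlight are hypothetical helper names, not part of the original code.

```javascript
// Hypothetical helper: escape characters that are special in a RegExp.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical wrapper around the lookahead pattern used above.
function highlight(html, searchterm) {
  // An empty pattern would match at every position, so skip it.
  if (!searchterm) return html;
  var re = new RegExp("(?![^<>]*>)(" + escapeRegExp(searchterm) + ")", "gi");
  return html.replace(re, "<i>$1</i>");
}

var sample = "<p>Price (CAD): 5.00</p>";
console.log(highlight(sample, "p"));     // only the "P" in "Price" is wrapped
console.log(highlight(sample, "(cad)")); // parentheses are matched literally
```

With this in place, RESULTSshow = highlight(RESULTSshow, searchterm) would be the equivalent call.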
Upvotes: 1
Reputation: 2128
Well, assuming your HTML doesn't contain blocks like SCRIPT, CDATA, or STYLE, it's possible with a regex using a negative lookahead:
text = text.replace(/(?![^<>]*>)old/g, 'new');
That said, I'd use a light parser or a home-grown one for better support, without worrying about the speed. Note that you'll need to preprocess the source if your attributes may contain < or > characters.
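As a rough illustration of how the lookahead behaves, and of the attribute caveat, consider the sketch below. replaceOutsideTags is a hypothetical helper, and it assumes word contains no RegExp metacharacters.

```javascript
// Hypothetical helper wrapping the same lookahead.
function replaceOutsideTags(html, word, replacement) {
  // (?![^<>]*>) rejects a position when the next angle bracket after it
  // is ">", which is the case for positions inside a tag such as <a ...>.
  var re = new RegExp("(?![^<>]*>)" + word, "g");
  return html.replace(re, replacement);
}

// Normal case: the attribute value is left alone, the text is replaced.
var ok = replaceOutsideTags('<a title="old">old</a>', "old", "new");
console.log(ok);

// Caveat: a raw "<" inside an attribute misleads the lookahead,
// so the attribute value is rewritten too.
var leaky = replaceOutsideTags('<a title="old < x">old</a>', "old", "new");
console.log(leaky);
```

In the second call the next bracket after the attribute's "old" is "<", so the position looks like plain text to the regex.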
Try this:
<html>
<head>
<script>
function t() {
var text = "<html><head></head><body><p>SuperDuck</p><p>Jumps over the lazy dog</p></body></html>";
var a = text.replace(/(?![^<>]*>)(p)/g, '<i>$1</i>');
alert(a);
}
</script>
</head>
<body>
<button onclick="t();">hit me!</button>
</body>
</html>
Just replace the (p) in the regex with your own search term and you're ready to jump over =)
Upvotes: 0