Reputation: 153
I'm performing simple text search/replace on random HTML pages in jQuery, but I'm having issues ignoring terms that appear within an attribute. For example, if my term is jquery, I would like to ignore every occurrence inside the tag in
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script> jquery
while still hitting the one outside. Right now my code looks a little like this:
$("div#content").html($("div#content").text().replace(/(jquery)/g, "stuff"));
I've been looking at positive/negative lookahead/lookbehind, but I can't get it down right. I'm not able to use any external libraries besides jQuery, and I've seen this post already.
I suppose one solution could be to use some indexOf magic to search the sections I want, but I don't know if this is efficient or feasible for quick text searching.
Any suggestions would be greatly appreciated!
Upvotes: 0
Views: 539
Reputation: 153
I stumbled upon this just now. It pretty much handles the problems I had before by looking only at text nodes: https://stackoverflow.com/a/4515063/660036
I don't think this solution takes care of text spanning more than one text node, e.g. searching for 'quick' in
the <strong>qui</strong>ck brown fox
But the complexity needed to solve these cases is much higher than what I need right now =P
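For anyone landing here later, a minimal sketch of the text-node idea follows. It is not the code from the linked answer; replaceInTextNodes is a hypothetical helper name, and it only assumes that nodes expose the standard nodeType, nodeName, nodeValue and childNodes properties. Skipping script and style elements is my own addition, since their text is code rather than page copy.

```javascript
// Node type values defined by the DOM (Node.ELEMENT_NODE / Node.TEXT_NODE).
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// Hypothetical helper: walk `root` recursively and apply the replacement
// only to text nodes, so attribute values are never touched.
function replaceInTextNodes(root, pattern, replacement) {
  for (const child of Array.from(root.childNodes || [])) {
    if (child.nodeType === TEXT_NODE) {
      child.nodeValue = child.nodeValue.replace(pattern, replacement);
    } else if (child.nodeType === ELEMENT_NODE) {
      const tag = (child.nodeName || "").toLowerCase();
      // Assumption: <script> and <style> contents should be left alone.
      if (tag !== "script" && tag !== "style") {
        replaceInTextNodes(child, pattern, replacement);
      }
    }
  }
}
```

In a page with jQuery loaded, the call would look something like replaceInTextNodes($("div#content")[0], /jquery/g, "stuff"). As noted above, a term split across several text nodes is still not matched.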
Upvotes: 0
Reputation: 54836
This is a good case for using a hand-coded parser. It's pretty much the only approach that will allow you to reliably handle all the cases you want to handle.
Basically, think of the parser as a state-machine. It needs to read the input text, one character at a time, and for each character perform the appropriate action based upon that character and its current parsing state. This model makes it relatively trivial to ignore any text that appears within an HTML tag, while processing everything else.
Here's a simple example to get you started: http://jsfiddle.net/8BeEv/
Note that the example code does not currently handle escape sequences inside of HTML tags (a \> sequence inside of a tag will break it, for instance), malformed HTML, or other possible though generally rare error cases.
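In case the fiddle link goes stale, here is a rough sketch of the state-machine idea. It is not the code from the fiddle; replaceOutsideTags and its three states ("text", "tag", "quote") are my own illustrative choices. It tracks quoted attribute values so that a > inside quotes does not end the tag, but it makes no attempt to handle comments, malformed HTML, or the text content of script elements.

```javascript
// Hypothetical helper: replace `term` with `replacement` everywhere except
// inside HTML tags (anything between '<' and the matching '>').
function replaceOutsideTags(html, term, replacement) {
  if (term === "") return html; // nothing sensible to search for

  let state = "text"; // one of "text", "tag", "quote"
  let quote = "";     // the quote character that opened the current value
  let out = "";
  let i = 0;

  while (i < html.length) {
    const ch = html[i];
    if (state === "text") {
      if (ch === "<") {
        state = "tag";
      } else if (html.startsWith(term, i)) {
        // Match in plain text: emit the replacement and skip the term.
        out += replacement;
        i += term.length;
        continue;
      }
    } else if (state === "tag") {
      if (ch === '"' || ch === "'") {
        state = "quote";
        quote = ch;
      } else if (ch === ">") {
        state = "text";
      }
    } else if (ch === quote) {
      // Closing quote of an attribute value: back to the tag state.
      state = "tag";
    }
    out += ch;
    i += 1;
  }
  return out;
}
```

The result could then be written back with something like $("div#content").html(replaceOutsideTags($("div#content").html(), "jquery", "stuff")), keeping in mind that re-setting the HTML discards event handlers bound to the old elements.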
Upvotes: 1
Reputation: 5768
How about this:
(?<=[^\/])jquery
It searches for all jquery not preceded by a / ... unless there are other ways for the term jquery to appear in an attribute?
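A quick way to try the pattern and see where it stops working is sketched below. The sample strings and variable names are made up for illustration, and lookbehind needs a JavaScript engine that supports it (it was added to the language in ES2018, so older browsers will throw a syntax error).

```javascript
// The lookbehind requires one preceding character that is not a slash.
const pattern = /(?<=[^\/])jquery/g;

// Case from the question: both URL occurrences directly follow a "/".
const scriptTag =
  '<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script> jquery';
const afterScript = scriptTag.replace(pattern, "stuff");

// Counterexample: an attribute value that follows a quote instead of a "/".
const classAttr = '<div class="jquery">jquery</div>';
const afterClass = classAttr.replace(pattern, "stuff");

console.log(afterScript);
console.log(afterClass);
```

In the first string only the trailing jquery is replaced. In the second, the attribute value is replaced too, which is the "other ways" case mentioned above. Also note that an occurrence at the very start of the string is never matched, because the lookbehind needs a character in front of it.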
Upvotes: 1