Reputation: 990
Is it possible to select all words that are neither tags nor attributes inside tags? I have the inverse working (the expression below matches the tags and the {...} blocks), and I know I could do this in two phases: replace those matches first, then run a second JavaScript RegExp search. But I'd like to get it working with a single expression.
(<[^>]*>)|({[^>]*})
Input:
<p>Test image captions for GitBook:</p>
<p>Second image: <img scr="./image2.png" alt="image title" title="image title">asdf</img>{caption width="300" style="height:'300px'"} </p>
<p>Sample text and first image: <img scr="./image1.png" alt="image 1" /> {caption width="300" style="height:'300px'"} for testing ok...</p>
Expected output, with the words that should be matched wrapped in backticks:
<p>`Test` `image` `captions` `for` `GitBook`:</p>
<p>`Second` `image`: <img scr="./image2.png" alt="image title" title="image title">`asdf`</img>{caption width="300" style="height:'300px'"} </p>
<p>`Sample` `text` `and` `first` `image`: <img scr="./image1.png" alt="image 1" /> {caption width="300" style="height:'300px'"} `for` `testing` `ok`...</p>
Upvotes: 1
Views: 59
Reputation: 1
Try using .textContent and String.prototype.replace() with this RegExp:
/\{.*\}|:|\.+|\s{2}|\s$/gi
// Take the visible text of each <p>, then strip {...} blocks and punctuation.
var p = document.getElementsByTagName("p"), res = [];
for (var i = 0; i < p.length; i++) {
  res[i] = p[i].textContent.replace(/\{.*\}|:|\.+|\s{2}|\s$/gi, "");
}
console.log(res);
<!--
<p>`Test` `image` `captions` `for` `GitBook`:</p>
<p>`Second` `image`: <img scr="./image2.png" alt="image title" title="image title">`asdf`</img>{caption width="300" style="height:'300px'"} </p>
<p>`Sample` `text` `and` `first` `image`: <img scr="./image1.png" alt="image 1" /> {caption width="300" style="height:'300px'"} `for` `testing` `ok`...</p>
-->
<p>Test image captions for GitBook:</p>
<p>Second image: <img scr="./image2.png" alt="image title" title="image title">asdf</img>{caption width="300" style="height:'300px'"} </p>
<p>Sample text and first image: <img scr="./image1.png" alt="image 1" /> {caption width="300" style="height:'300px'"} for testing ok...</p>
Upvotes: 0
Reputation: 990
My question may not have been clear enough, because the answers use JavaScript code to process the matches. My goal was a solution that uses a single expression only. I finally found this expression, which satisfies my needs:
((?!([^<]+)?>)([\w]+)(?!([^\{]+)?\})([\w]+))
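For anyone who wants to try the lookahead idea against the sample input, here is a minimal sketch. It is not the expression above but a simplified variant of the same approach, and the names extractWords and sample are made up for illustration. It assumes the markup is well formed, that no literal ">" or "}" appears in the visible text, and that {...} blocks are not nested.

```javascript
// Hypothetical helper: returns the words that sit outside <...> tags and {...} blocks.
function extractWords(html) {
  // \b\w+\b       a whole word
  // (?![^<]*>)    not followed by a ">" before the next "<" (so not inside a tag)
  // (?![^{]*\})   not followed by a "}" before the next "{" (so not inside a {...} block)
  const outsideMarkup = /\b\w+\b(?![^<]*>)(?![^{]*\})/g;
  return html.match(outsideMarkup) || [];
}

// The three input lines from the question, joined into one string.
const sample = [
  `<p>Test image captions for GitBook:</p>`,
  `<p>Second image: <img scr="./image2.png" alt="image title" title="image title">asdf</img>{caption width="300" style="height:'300px'"} </p>`,
  `<p>Sample text and first image: <img scr="./image1.png" alt="image 1" /> {caption width="300" style="height:'300px'"} for testing ok...</p>`
].join("\n");

console.log(extractWords(sample));
```

On the sample input this returns the sixteen words marked in the expected output of the question. The \b anchors matter: without them the engine could backtrack and match part of a tag name or attribute value.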
Upvotes: 1
Reputation: 36703
You can try this:
var words = [];
$(function () {
  $("p").each(function () {
    // concat() returns a new array, so assign the result back to words
    words = words.concat($(this).text().split(" "));
  });
});
Once the ready callback has run, the words array contains all the words.
Upvotes: 0