Reputation: 26142
I need to capture all (English) words except abbreviations, whose pattern is:
"_any-word-symbols-including-dash."
(so there is an underscore at the beginning, a dot at the end, and any letters and dashes in the middle)
I tried something like this:
/\b([A-Za-z-^]+)\b[^\.]/g
but it seems that I don't understand how to work with negative matches.
UPDATE:
I need not just to match the words but to wrap them in some tags. For
"a some words _abbr-abbr. a here" I should get:
<w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>
So I need to use replace with the correct regex:
test.replace(/correct regex/, '<w>$1</w>')
Upvotes: 0
Views: 91
Reputation: 10003
Negative lookahead is (?!...).
So you can use:
/\b([^_\s]\w*(?!\.))\b/g
Unfortunately, there is no lookbehind in JavaScript (at least in engines that predate ES2018), so you can't do a similar trick with "not prefixed by _".
Example:
> a = "a some words _abbr. a here"
> a.replace(/\b([^_\s]\w*(?!\.))\b/g, "<w>$1</w>")
"<w>a</w> <w>some</w> <w>words</w> _abbr. <w>a</w> <w>here</w>"
Following your comment about "-", the updated regex is:
/\b([^_\s\-][\w\-]*(?!\.))\b/g
> "abc _abc-abc. abc".replace(/\b([^_\s\-][\w\-]*(?!\.))\b/g, "<w>$1</w>")
"<w>abc</w> _abc-abc. <w>abc</w>"
Upvotes: 2
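A note on the regexes above: because of the (?!\.) lookahead they skip any word that is directly followed by a dot, not only the _abbr. tokens, so the last word of a sentence stays unwrapped. Below is a rough sketch of two alternatives that key on the leading underscore instead. The function names wrapWordsAlternation and wrapWordsLookbehind are made up for illustration, and the second one assumes an engine with lookbehind support (ES2018 or later). Treat both as starting points under those assumptions rather than a definitive solution.

```javascript
// Sketch 1: works without lookbehind. The first alternative swallows a whole
// abbreviation ("_", then letters/dashes, then "."), so its parts are never
// seen as words. The second alternative captures an ordinary word
// (letters, with inner dashes allowed).
function wrapWordsAlternation(text) {
  return text.replace(
    /_[A-Za-z-]+\.|([A-Za-z]+(?:-[A-Za-z]+)*)/g,
    // `word` is undefined when the abbreviation alternative matched
    (match, word) => (word === undefined ? match : "<w>" + word + "</w>")
  );
}

// Sketch 2: assumes lookbehind is available (ES2018+).
// (?<![\w-]) rejects a start inside a longer token such as "_abbr-abbr";
// (?![\w-]) forces the match to run to the end of the token.
function wrapWordsLookbehind(text) {
  return text.replace(/(?<![\w-])([A-Za-z][A-Za-z-]*)(?![\w-])/g, "<w>$1</w>");
}

const sample = "a some words _abbr-abbr. a here";
console.log(wrapWordsAlternation(sample));
console.log(wrapWordsLookbehind(sample));
// both print: <w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>
```

With either sketch, "the end." becomes "<w>the</w> <w>end</w>.", since only a leading underscore marks a token as an abbreviation here.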