WHITECOLOR

Reputation: 26142

Capture words not followed by symbol

I need to capture all (English) words except abbreviations, whose pattern is:

"_any-word-symbols-including-dash." 

(so there is an underscore at the beginning, a dot at the end, and any letters and dashes in the middle)

I tried something like this:

/\b([A-Za-z-^]+)\b[^\.]/g

but it seems that I don't understand how to work with negative matches.

UPDATE:

I need not just to match the words but to wrap them in some tags:

For "a some words _abbr-abbr. a here" I should get:

<w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>

So I need to use replace with the correct regex:

test.replace(/correct regex/, '<w>$1</w>')

Upvotes: 0

Views: 91

Answers (1)

mishik

Reputation: 10003

Negative lookahead is written (?!...).

So you can use:

/\b([^_\s]\w*(?!\.))\b/g

Unfortunately, there is no lookbehind in JavaScript (in engines before ES2018), so you can't use a similar trick for "not prefixed by _".

Example:

> a = "a some words _abbr. a here"
> a.replace(/\b([^_\s]\w*(?!\.))\b/g, "<w>$1</w>")
"<w>a</w> <w>some</w> <w>words</w> _abbr. <w>a</w> <w>here</w>"
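
In engines that do support lookbehind (it was added to the language in ES2018), the "not prefixed by _" check can be written directly. The following is only a sketch under that assumption; the helper name wrapWords is hypothetical and does not come from the question:

```javascript
// Hypothetical helper; assumes an ES2018+ engine with lookbehind support.
function wrapWords(text) {
  // (?<![\w-])               : a match may not start right after a word char or a dash,
  //                            so nothing inside "_abbr-abbr" can begin a match
  // [A-Za-z]+(?:-[A-Za-z]+)* : letters, optionally joined by single dashes
  // (?![\w.-])               : the word may not be followed by a word char, dash, or dot
  const pattern = /(?<![\w-])([A-Za-z]+(?:-[A-Za-z]+)*)(?![\w.-])/g;
  return text.replace(pattern, "<w>$1</w>");
}

console.log(wrapWords("a some words _abbr-abbr. a here"));
// <w>a</w> <w>some</w> <w>words</w> _abbr-abbr. <w>a</w> <w>here</w>
```

Like the regex above, this keeps the "not followed by a dot" rule, so an ordinary word at the end of a sentence (for example "here.") is left unwrapped as well.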

Following your comment about -, the updated regex is:

/\b([^_\s\-][\w\-]*(?!\.))\b/g

> "abc _abc-abc. abc".replace(/\b([^_\s\-][\w\-]*(?!\.))\b/g, "<w>$1</w>")
"<w>abc</w> _abc-abc. <w>abc</w>"
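
One caveat: with more than one dash in the abbreviation (for example "_abc-abc-abc."), the regex above appears to wrap the middle part, because backtracking can stop right after a dash. An alternative that avoids lookarounds altogether is to match the abbreviation first and leave it untouched, and only wrap what the capture group catches. This is a sketch of that idea, not the method from the answer; the helper name wrapWordsSkippingAbbr is hypothetical:

```javascript
// Hypothetical helper: "match what to skip, capture what to keep".
function wrapWordsSkippingAbbr(text) {
  // Alternative 1: an abbreviation such as "_abbr-abbr." (consumed, returned unchanged).
  // Alternative 2 (group 1): an ordinary word, letters optionally joined by dashes.
  const pattern = /_[A-Za-z-]+\.|([A-Za-z]+(?:-[A-Za-z]+)*)/g;
  return text.replace(pattern, (match, word) =>
    // `word` is undefined when the abbreviation alternative matched.
    word === undefined ? match : `<w>${word}</w>`
  );
}

console.log(wrapWordsSkippingAbbr("abc _abc-abc-abc. abc"));
// <w>abc</w> _abc-abc-abc. <w>abc</w>
```

Because the abbreviation is consumed as a whole before any word can match inside it, this version also wraps ordinary words that happen to be followed by a dot, such as the last word of a sentence.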

Upvotes: 2
