Reputation: 615
sentence: WOMACK ARMY HOSPITAL null 2817 ~ Reilly ft Rd~ 28310
expected: WOMACK ARMY HOSPITAL null 2817 ~ Reilly~ 28310
Word groups are separated by tabs.
Between the tilde symbols (~), I need to remove every word that contains two letters or fewer.
My current regex doesn't match anything.
find what: ~[^ ]{1,2}~
replace with: nothing
This needs to work multi-line.
Upvotes: 1
Views: 261
Reputation: 627087
You may use
(?:\G(?!^)|~)[^~\n]*?\K[^\n\w]*\b\w{1,2}\b(?=[^~\n]*~)
Replace with an empty string. See the regex demo online.
Note that I added \n to the negated character classes to make sure you only match within lines (without overflowing from one line to another).
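If you ever need the same clean-up outside Notepad++, here is a minimal Python sketch. It is not the regex above: Python's built-in re module supports neither \G nor \K, so this swaps in a two-step approach (find each ~...~ segment on a line, then delete the short words inside it). The names SEGMENT, SHORT_WORD and strip_short_words are made up for illustration. As with \w above, digits and underscores count as word characters; use [^\W\d_] in place of \w if you only want letters.

```python
import re

# A tilde-delimited segment that stays on one line (hypothetical helper pattern).
SEGMENT = re.compile(r"~[^~\n]*~")
# A whole word of 1 or 2 word characters, plus any separators just before it.
SHORT_WORD = re.compile(r"[^\w\n~]*\b\w{1,2}\b")


def strip_short_words(text: str) -> str:
    """Remove 1-2 character words, but only inside ~...~ segments."""
    def clean(match: re.Match) -> str:
        # Only the text of the matched segment is rewritten.
        return SHORT_WORD.sub("", match.group(0))

    return SEGMENT.sub(clean, text)


if __name__ == "__main__":
    line = "WOMACK ARMY HOSPITAL\tnull\t2817\t~ Reilly ft Rd~\t28310"
    print(strip_short_words(line))
```

Because each segment is matched as an explicit pair of tildes, text outside the tildes (such as the tab-separated columns) is never touched, and a tilde left unclosed on its line is simply skipped.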
Details
(?:\G(?!^)|~) - the end of the previous match or a tilde
[^~\n]*? - 0+ chars other than tilde and newline, as few as possible
\K - omit the text matched so far
[^\n\w]* - 0+ chars other than word and newline chars, as many as possible
\b\w{1,2}\b - 1 or 2 char words (replace \w with \pL to only match letters)
(?=[^~\n]*~) - there must be a tilde after 0+ chars other than tilde and newline (to make sure we have a closing ~ on the same line).
Notepad++ settings: use the Regular expression search mode.
Upvotes: 1