Reputation: 678
I have strings like these:
7. Smith, John (1919-2006).
8. McKane, Vicky (1949-2012).
I would like to match the leading number ("7. ", "8. ") and the trailing dates in parentheses (" (1919-2006).", " (1949-2012)."), so that only the names remain.
I have come up with this regex:
\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.\s*
It does the trick, but it also matches newline characters, so replacing the matches with an empty string "" in MS Word gives the following:
Smith, JohnMcKane, Vicky
EDIT: Here is the VBA code I use:
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.Global = True
RegEx.MultiLine = True
RegEx.Pattern = "\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.\s*"
With Selection
    .Text = RegEx.Replace(.Text, "")
End With
Upvotes: 1
Views: 1981
Reputation: 626689
I have tested it a lot, and the best I could achieve is
[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()]*\))\.[ \t]*
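Or, since you have the MultiLine option on (written here as a VBA string expression, so that vbCr and vbLf are excluded from the parenthesized part):
"^[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()" & vbCr & vbLf & "]*\))\.[ \t]*$"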
Both will result in
Smith, John
McKane, Vicky
Note that \s can be safely replaced with [ \t] to match only regular ASCII horizontal whitespace (space and tab).
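To see the difference between \s and [ \t] concretely, here is a minimal sketch using Python's re module rather than VBScript.RegExp, so treat it as an illustration only. The sample text and the "\n" line break are assumptions based on the strings in the question.

```python
import re

# Two entries separated by a line break, modelled on the question's input.
text = "7. Smith, John (1919-2006).\n8. McKane, Vicky (1949-2012)."

# Original pattern: \s* also matches the line break, so it gets removed.
with_s = re.compile(r"\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.\s*")

# Revised pattern: [ \t]* matches only spaces and tabs.
with_space_tab = re.compile(r"[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()]*\))\.[ \t]*")

merged = with_s.sub("", text)
kept = with_space_tab.sub("", text)

print(repr(merged))  # 'Smith, JohnMcKane, Vicky'
print(repr(kept))    # 'Smith, John\nMcKane, Vicky'
```

The only change between the two patterns is the whitespace class, which is what keeps the names on separate lines.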
Word always adds a trailing paragraph separator if the whole contents are selected before the replacement. If you select everything except the last separator, the additional separator won't be inserted.
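So, you may use this workaround:
' Select the whole document, then shrink the selection by one character
' so that the final paragraph separator is excluded.
ActiveDocument.Content.Select
Selection.MoveLeft Unit:=wdCharacter, Count:=1, Extend:=wdExtend
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.Global = True
RegEx.MultiLine = True
RegEx.Pattern = "^[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()" & vbCr & vbLf & "]*\))\.[ \t]*$"
With Selection
    ' Replace every match in the selected text with an empty string.
    .Text = RegEx.Replace(.Text, "")
End With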
Upvotes: 1
Reputation: 1160
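[^\S\r\n]
will match any whitespace character that isn't a line break. Word separates paragraphs with a carriage return, so \r has to be excluded along with \n. Use it in place of every \s, otherwise the leading \s* of the first alternative still consumes the line break before the next number:
/[^\S\r\n]*[0-9]\.[^\S\r\n]*|[^\S\r\n]*(?:\([^()]*\))\.[^\S\r\n]*/g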
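A small Python sketch of this negated-class idea follows. Python's re stands in for VBScript.RegExp here, and the "\r" paragraph separator is an assumption about what Word returns in Selection.Text. It compares a class that excludes only \n with one that excludes both \r and \n, used in every whitespace position.

```python
import re

# Assumed input: Word-style text with "\r" between paragraphs.
text = "7. Smith, John (1919-2006).\r8. McKane, Vicky (1949-2012)."

def build(ws):
    """Build the removal pattern around a given whitespace class."""
    return re.compile(rf"{ws}*[0-9]\.{ws}*|{ws}*\([^()]*\)\.{ws}*")

lf_only = build(r"[^\S\n]")    # still matches "\r"
cr_lf = build(r"[^\S\r\n]")    # matches neither "\r" nor "\n"

lf_only_result = lf_only.sub("", text)
cleaned = cr_lf.sub("", text)

print(repr(lf_only_result))  # 'Smith, JohnMcKane, Vicky'
print(repr(cleaned))         # 'Smith, John\rMcKane, Vicky'
```

Excluding both line-break characters leaves the separator untouched whichever of the two the host application uses.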
However, I would suggest an alternative way to do what you are trying to do.
/\d\.\s(.*)\s\(.*\)\./g
This will match the lines you asked about and put the names into a capture group for easy retrieval later.
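As a rough illustration of the capture-group approach, the sketch below runs the same pattern through Python's re.findall (an assumption for demonstration purposes; in VBA the group would be read from each match's SubMatches collection instead). The sample text is modelled on the question.

```python
import re

# Sample input modelled on the question's strings.
text = "7. Smith, John (1919-2006).\n8. McKane, Vicky (1949-2012)."

# Group 1 captures the name between the leading number and the dates.
pattern = re.compile(r"\d\.\s(.*)\s\(.*\)\.")

# With a single group, findall returns the captured text of each match.
names = pattern.findall(text)
result = "\n".join(names)

print(names)  # ['Smith, John', 'McKane, Vicky']
```

Extracting the names and rejoining them avoids the question of which whitespace to delete, at the cost of rebuilding the text rather than editing it in place.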
Upvotes: 0