menteith

Reputation: 678

RegEx to not match new line characters in Word

I have strings like this one:

  1. Smith, John (1919-2006).
  2. McKane, Vicky (1949-2012).

I would like to match

7. Smith, John (1919-2006).

8. McKane, Vicky (1949-2012).

I have come up with this pattern: \s*[0-9]\.\s*|\s*(?:\([^()]*\))\.\s*. It does the trick, but it also matches newline characters, so replacing the matches with an empty string "" in MS Word gives the following:

Smith, JohnMcKane, Vicky

EDIT: Here is the VBA code I use:

With selection
        Dim RegEx As Object
        Set RegEx = CreateObject("VBScript.RegExp")
        RegEx.Global = True
        RegEx.MultiLine = True
        RegEx.Pattern = "\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.\s*"
        .Text = RegEx.Replace(.Text, "")
End With

Upvotes: 1

Views: 1981

Answers (2)

Wiktor Stribiżew

Reputation: 626689

I have tested it a lot, and the best I could achieve is

[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()]*\))\.[ \t]*

Or - since you have the Multiline option on:

^[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()" & vbCr & vbLf & "]*\))\.[ \t]*$

Both will result in

Smith, John
McKane, Vicky

Note that \s can be safely replaced with [ \t] to match only regular ASCII horizontal whitespace (spaces and tabs), so line breaks are left alone.
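To see the difference outside Word, here is a minimal Python sketch. It uses Python's `re` module rather than VBScript.RegExp, so it only approximates what Word does. The sample text and the use of "\r" as the paragraph separator are assumptions made for the demo.

```python
import re

# Sample text; "\r" stands in for Word's paragraph mark (an assumption for this demo).
text = "1. Smith, John (1919-2006).\r2. McKane, Vicky (1949-2012).\r"

# Pattern from the question: \s also matches "\r" and "\n", so line breaks are consumed.
greedy = re.compile(r"\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.\s*")

# First pattern from this answer: [ \t] matches only spaces and tabs.
horizontal = re.compile(r"[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()]*\))\.[ \t]*")

with_s = greedy.sub("", text)
with_blank = horizontal.sub("", text)

print(repr(with_s))      # the names run together
print(repr(with_blank))  # the separators survive
```

With `\s`, the separator after each closing period is swallowed by the trailing `\s*`. With `[ \t]`, it is left in place.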

An extra paragraph separator is added at the end only if the whole document content is selected before the replacement. If you select everything except the last separator, the additional separator is not inserted.

So, you may use this workaround:

ActiveDocument.Content.Select
Selection.MoveLeft Unit:=wdCharacter, Count:=1, Extend:=wdExtend
With Selection
        Dim RegEx As Object
        Set RegEx = CreateObject("VBScript.RegExp")
        RegEx.Global = True
        RegEx.MultiLine = True
        RegEx.Pattern = "^[ \t]*[0-9]\.[ \t]*|[ \t]*(?:\([^()" & vbCr & vbLf & "]*\))\.[ \t]*$"
        .Text = RegEx.Replace(.Text, "")
End With


Upvotes: 1

dustinroepsch

Reputation: 1160

[^\S\n]

Will match any whitespace character that isn't a newline:

/\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.[^\S\n]*/g
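One caveat, shown in the Python sketch below: any `\s*` left in the pattern can still consume a line break, in particular the leading `\s*` of the first alternative. The `adapted` pattern is my own variation that uses the class at every whitespace position. It is not the pattern as posted. The demo assumes lines end in "\n". Text read from Word typically uses "\r" for paragraph marks, in which case the analogous class would be `[^\S\r\n]`.

```python
import re

# Assumption for this demo: lines are separated by "\n".
text = "1. Smith, John (1919-2006).\n2. McKane, Vicky (1949-2012).\n"

# Pattern as posted: only the final \s* was replaced.
posted = re.compile(r"\s*[0-9]\.\s*|\s*(?:\([^()]*\))\.[^\S\n]*")

# [^\S\n] reads as "not non-whitespace and not newline": whitespace except "\n".
ws = r"[^\S\n]"

# Adaptation (not from the answer): use the class at every whitespace position.
adapted = re.compile(rf"{ws}*[0-9]\.{ws}*|{ws}*(?:\([^()]*\))\.{ws}*")

posted_result = posted.sub("", text)
adapted_result = adapted.sub("", text)

print(repr(posted_result))   # the leading \s* before "2." still eats the "\n"
print(repr(adapted_result))  # each name stays on its own line
```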

However, I would suggest an alternative way to do what you are trying to do.

/\d\.\s(.*)\s\(.*\)\./g

Will match the lines you've asked, but put the names into a Capture Group for easy retrieval later.
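A rough illustration of the capture-group approach in Python follows. It uses Python's `re` rather than a JavaScript or VBScript engine, and assumes lines are separated by "\n", so that `.` stops at each line end.

```python
import re

# Assumption for this demo: lines are separated by "\n", which "." does not match.
text = "1. Smith, John (1919-2006).\n2. McKane, Vicky (1949-2012).\n"

# Same pattern as above; the single capture group holds the name.
pattern = re.compile(r"\d\.\s(.*)\s\(.*\)\.")

# With exactly one group, findall returns a list of that group's text per match.
names = pattern.findall(text)

print(names)
```

Instead of deleting everything around the names, this collects the names directly, so they can be joined with whatever separator is needed.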

Upvotes: 0
