Radu Puspana

Reputation: 115

VBA 6 : regex not recognizing complicated string

I have this string "1X214X942,0SX", where each X represents a non-breaking space (character code 160) and S represents an ordinary space character.

I am trying to recognize it with this regex:

(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})

but it doesn't work: the non-breaking space is not recognized as whitespace, so the expression only catches 942,0.

I checked in RegExr and it matches the whole string there (http://gskinner.com/RegExr/?2v8ic), so is something wrong with my VBA?

Please advise!

Here is my code :

Sub ChangeNumberFromFRformatToENformat()

Dim SectionText As String
Dim RegEx As Object, RegC As Object, RegM As Object
Dim i As Integer


Set RegEx = CreateObject("vbscript.regexp")
With RegEx
    .Global = True
    .MultiLine = False
    .Pattern = "(\d{1,3}\s(\d{3}\s)*\d{3}(\,\d{1,3})?|\d{1,3}\,\d{1,3})"
End With

For i = 1 To ActiveDocument.Sections.Count()

    SectionText = ActiveDocument.Sections(i).Range.Text

    If RegEx.test(SectionText) Then
        Set RegC = RegEx.Execute(SectionText)

        For Each RegM In RegC

            Call ChangeThousandAndDecimalSeparator(RegM.Value)

        Next 'For Each RegM In RegC

        Set RegC = Nothing
        Set RegM = Nothing

    End If

Next 'For i = 1 To ActiveDocument.Sections.Count()

Set RegEx = Nothing

End Sub

Upvotes: 2

Views: 1023

Answers (2)

Callie J

Reputation: 31296

The sequence \s doesn't match the non-breaking space, but there are other ways to match it. The atoms to consider in the regexp are \xnn and \nnn: these match a character by its hexadecimal value or its octal value, respectively.

Thus, to match a non-breaking space (character code 160), specify \xA0 or \240 instead of \s.
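As a minimal sketch of that idea, the pattern from the question can use a character class that accepts either ordinary whitespace or \xA0. The procedure name TestNbspPattern and the variable names are made up for illustration, and the sample string is rebuilt from the description in the question (X = character 160, S = space).

```vba
' Sketch only: TestNbspPattern, re and sample are illustrative names.
Sub TestNbspPattern()
    Dim re As Object
    Dim sample As String

    ' Rebuild the string from the question: X = ChrW(160), S = " "
    sample = "1" & ChrW(160) & "214" & ChrW(160) & "942,0 " & ChrW(160)

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    ' [\s\xA0] accepts ordinary whitespace or the non-breaking space
    re.Pattern = "(\d{1,3}[\s\xA0](\d{3}[\s\xA0])*\d{3}(,\d{1,3})?|\d{1,3},\d{1,3})"

    If re.Test(sample) Then
        ' Print the first match so it can be compared with the input
        Debug.Print re.Execute(sample)(0).Value
    End If

    Set re = Nothing
End Sub
```

Keeping \s inside the class means numbers separated by ordinary spaces should still match as before.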

Upvotes: 4

Tomalak

Reputation: 338228

The problem is that \s does not include the non-breaking space in the VBScript.RegExp implementation.

Try this instead:

With RegEx
    .Global = True
    .MultiLine = False
    .Pattern = "(\d{1,3}X(\d{3}X)*\d{3}(,\d{1,3})?|\d{1,3},\d{1,3})"
    .Pattern = Replace(.Pattern, "X", "[\s" & Chr(160) & "]")
End With

Regular expression patterns match the literal characters you put into them, including the non-breaking space. A reliable way to put a non-breaking space into a VB string is to create it with Chr(160).
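If it is unclear which characters actually separate the digits in the document text, a small diagnostic along these lines can help. This is a sketch, not part of the original answer: DumpSeparatorCodes is a hypothetical helper that prints the position and character code of every non-digit character in the Immediate window.

```vba
' Hypothetical helper: list the code of every non-digit character in s.
Sub DumpSeparatorCodes(ByVal s As String)
    Dim i As Long
    Dim ch As String

    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        ' "#" in a Like pattern matches exactly one digit
        If Not ch Like "#" Then
            Debug.Print i, AscW(ch)
        End If
    Next i
End Sub
```

It could be called with a matched value or a section's text, for example DumpSeparatorCodes ActiveDocument.Sections(1).Range.Text. A code of 160 indicates a non-breaking space and 32 an ordinary space.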

Upvotes: 2
