Extract Regex Value in Excel Visual basic

Question

I have a Visual Basic code ready but it extracts everything but the matching regex.

The sheet looks like this

Here's the code I use:

Function simpleCellRegex(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String

strPattern = "[a-zA-Z]{4}\d{4}$"

If strPattern <> "" Then
    strInput = Myrange.Value
    strReplace = ""
    
    With regEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .pattern = strPattern
    End With
    
    If regEx.test(strInput) Then
        simpleCellRegex = regEx.Replace(strInput, strReplace)
    Else
        simpleCellRegex = "Not matched"
    End If
End If
End Function

I use Excel 2019 on Windows.

JvdV · Accepted Answer

VBA:

You could still do this via regex replace adjusting your pattern, but rather pull the value directly from the matchobject, for example:

Function simpleCellRegex(s As String) As String

With CreateObject("vbscript.regexp")
    .Pattern = "(?:^|\s)([a-zA-Z]{4}\d{4})$"
    If .Test(s) Then
        simpleCellRegex = .Execute(s)(0).submatches(0)
    Else
        simpleCellRegex = "No Match"
    End If
End With

End Function

You'll notice I edited the pattern to (?:^|\s)([a-zA-Z]{4}\d{4})$. This would ensure that the previous character is either non-existing or is a whitespace. This should prevent possible false positives when string is part of a larger substring. Even word-boundaries won't help in case of 'test-abcd1000'. See an online demo. This does, however, require us to acces the submatches from the matchobject. Both indexes are zero-based. Another option is to use Trim(.Execute(s)(0)) to get rid of the potential whitespace.

FILTERXML():

Though regex is a viable option, it does require your workbook to be macro-enabled. You could opt for pattern validation with native function, for example FILTERXML():

=IFERROR(FILTERXML(""&SUBSTITUTE(A2;" ";"")&"";"//s[last()][translate(substring(.,1,4), 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''][string-length()=8][substring(.,5)*0=0]");"No Match")

The meaning of the xpath used:

//s[last()] - Get last element in array. In regex term; validate we are looking at the text before the end-line anchor;
[translate(substring(.,1,4), 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''] - When first four characters are translated (alpha to empty string) check if remainder equals empty;
[string-length()=8] - Check if lenght of string is 8 characters;
[substring(.,5)*0=0] - Check that substring from 5th characters onwards equals zero when multiplied by zero.

Extract Regex Value in Excel Visual basic

Answers (1)

Related Questions