Reputation: 13
I have a Visual Basic code ready but it extracts everything but the matching regex.
Here's the code I use:
Function simpleCellRegex(Myrange As Range) As String
Dim regEx As New RegExp
Dim strPattern As String
Dim strInput As String
Dim strReplace As String
Dim strOutput As String
strPattern = "[a-zA-Z]{4}\d{4}$"
If strPattern <> "" Then
strInput = Myrange.Value
strReplace = ""
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = strPattern
End With
If regEx.test(strInput) Then
simpleCellRegex = regEx.Replace(strInput, strReplace)
Else
simpleCellRegex = "Not matched"
End If
End If
End Function
I use Excel 2019 on Windows.
Upvotes: 0
Views: 202
Reputation: 75990
VBA:
You could still do this via regex replace adjusting your pattern, but rather pull the value directly from the matchobject, for example:
Function simpleCellRegex(s As String) As String
With CreateObject("vbscript.regexp")
.Pattern = "(?:^|\s)([a-zA-Z]{4}\d{4})$"
If .Test(s) Then
simpleCellRegex = .Execute(s)(0).submatches(0)
Else
simpleCellRegex = "No Match"
End If
End With
End Function
You'll notice I edited the pattern to (?:^|\s)([a-zA-Z]{4}\d{4})$
. This would ensure that the previous character is either non-existing or is a whitespace. This should prevent possible false positives when string is part of a larger substring. Even word-boundaries won't help in case of 'test-abcd1000'. See an online demo. This does, however, require us to acces the submatches from the matchobject. Both indexes are zero-based. Another option is to use Trim(.Execute(s)(0))
to get rid of the potential whitespace.
FILTERXML():
Though regex is a viable option, it does require your workbook to be macro-enabled. You could opt for pattern validation with native function, for example FILTERXML()
:
=IFERROR(FILTERXML("<t><s>"&SUBSTITUTE(A2;" ";"</s><s>")&"</s></t>";"//s[last()][translate(substring(.,1,4), 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ','')=''][string-length()=8][substring(.,5)*0=0]");"No Match")
The meaning of the xpath used:
//s[last()]
- Get last element in array. In regex term; validate we are looking at the text before the end-line anchor;[translate(substring(.,1,4), 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ','')='']
- When first four characters are translated (alpha to empty string) check if remainder equals empty;[string-length()=8]
- Check if lenght of string is 8 characters;[substring(.,5)*0=0]
- Check that substring from 5th characters onwards equals zero when multiplied by zero.Upvotes: 1