atgold18
atgold18

Reputation: 31

How to extract the first instance of digits in a cell with a specified length in VBA?

I have the following Text sample:

Ins-Si_079_GM_SOC_US_VI SI_SOC_FY1920_US_FY19/20_A2554_Si Resp_2_May

I want to get the number 079, So what I need is the first instance of digits of length 3. There are certain times the 3 digits are at the end, but they usually found with the first 2 underscores. I only want the digits with length three (079) and not 19, 1920, or 2554 which are different lengths.

Sometimes it can look like this with no underscore:

1920 O-B CLI 353 Tar Traf

Or like this with the 3 digit number at the end:

Ins-Si_GM_SOC_US_VI SI_SOC_FY1920_US_FY19/20_A2554_Si Resp_2_079

There are also times where what I need is 2 digits but when it's 2 digits its always at the end like this:

FY1920-Or-OLV-B-45 

How would I get what I need in all cases?

Upvotes: 0

Views: 469

Answers (3)

FaneDuru
FaneDuru

Reputation: 42236

Try this function, please:

Function ExtractThreeDigitsNumber(x As String) As String
   Dim El As Variant, arr As Variant, strFound As String
   
   If InStr(x, "_") > 0 Then
        arr = Split(x, "_")
   Elseif InStr(x, "-") > 0 Then
        arr = Split(x, "-")
   Else
        arr = Split(x, " ")
   End If
    For Each El In arr
        If IsNumeric(El) And Len(El) = 3 Then strFound = El: Exit For
    Next
    If strFound = "" Then
        If IsNumeric(Right(x, 2)) Then ExtractThreeDigitsNumber = Right(x, 2)
    Else
        ExtractThreeDigitsNumber = strFound
    End If
End Function

It can be called in this way:

Sub testExtractThreDig()
    Dim x As String
    x = "Ins-Si_079_GM_SOC_US_VI SI_SOC_FY1920_US_FY19/20_A2554_Si Resp_2_May"
    Debug.Print ExtractThreeDigitsNumber(x)
End Sub

Upvotes: 2

JvdV
JvdV

Reputation: 75900

If VBA is not a must you could try:

=TEXT(INDEX(FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"_"," "),"-"," ")," ","</s><s>")&"</s></t>","//s[.*0=0][string-length()=3 or (position()=last() and string-length()=2)]"),1),"000")

enter image description here

It worked for your sample data.


Edit: Some explaination.

  • SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"_"," "),"-"," ")," ","</s><s>") - The key part to transform all three potential delimiters (hyphen, underscore and space) to valid XML node end- and startconstruct.

  • The above concatenated using ampersand into a valid XML construct (adding a parent node <t>).

  • FILTERXML can be used to now 'split' the string into an array.

  • //s[.*0=0][string-length()=3 or last() and string-length()=2] - The 2nd parameter of FILTERXML which should be valid XPATH syntax. It reads:

     //s                                                                 'Select all <s> nodes with
                                                                          following conditions:
     [.*0=0]                                                             'Check if an <s> node times zero
                                                                          returns zero (to check if a node
                                                                          is numeric.                              '
     [string-length()=3 or (position()=last() and string-length()=2)]    'Check if a node is 3 characters
                                                                          long OR if it's the last node and
                                                                          only 2 characters long.
    
  • INDEX(.....,1) - I mentioned in the comments that usually this is not needed, but since ExcelO365 might spill the returned array, we may as well implemented to prevent spilling errors for those who use the newest Excel version. Now we just retrieving the very first element of whatever array FILTERXML returns.

  • TEXT(....,"000") - Excel will try delete leading zeros of a numeric value so we use TEXT() to turn it into a string value of three digits.

Now, if no element can be found, this will return an error however a simple IFERROR could fix this.

Upvotes: 3

T.M.
T.M.

Reputation: 9948

You can split the listed items and check for 3 digits via Like:

Function Get3Digits(s As String) As String
Dim tmp, elem
tmp = Split(Replace(Replace(s, "-", " "), "_", " "), " ")
For Each elem In tmp
    If elem Like "###" Then Get3Digits = elem: Exit Function
Next
If Get3Digits = vbNullString Then Get3Digits = IIf(Right(s, 2) Like "##", Right(s, 2), "")

End Function

Edited due to comment:

I would execute a 2 digit search when there are no 3 didget numbers before the end part and the last 2 digits are 2. if 3 digits are fount at end then get 3 but if not then get 2. there are times when last is a number but only one number. I would only want to get last if there are 2 or 3 numbers. The - would not be relevant to the 2 digets. if nothing is found that is desired then would return " ".

Upvotes: 3

Related Questions