Mubash
Mubash

Reputation: 13

Best formula/method to extract a standard set of numbers from a string?

I have the following strings from which I need to extract 6 digit numbers. Since these strings are generated by another software, they occur interchangeably and I cannot control it. Is there any one method that would extract both 6-digit numbers from each of these strings?

Branch '100235 to 100236 Ckt 1' specified in table 'East Contingency' for record with primary key = 21733 was not found in branch or transformer data.

Loadflow branch ID '256574_701027_1' defined in supplemental branch table was not found in branch or transformer input.

Transmission element from bus number 135415 to bus number 157062 circuit ID = 1 defined for corridor 'IESO-NYISO' was not found in input data

I don't know VBA, but I can learn it if it means I can get the 6 digit numbers using a single method. thanks

I have been using LEFT(), RIGHT() & MID() previously, but it means manually applying the appropriate formula for individual string.

Upvotes: 1

Views: 216

Answers (3)

Tom Sharpe
Tom Sharpe

Reputation: 34210

A couple more suggestions (if you need them):

(1) Replacing all non-digit characters with a space then splitting the resulting string:

=LET(numbers,TEXTSPLIT(TRIM(REDUCE("",MID(A1,SEQUENCE(1,LEN(A1)),1),LAMBDA(a,c,IF(is.digit(c),a&c,a&" "))))," "),FILTER(numbers,LEN(numbers)=6))

Here I've defined a function is.digit as

=LAMBDA(c, IF(c = "", FALSE, AND(CODE(c) > 47, CODE(c) < 58)))

(tl;dr I quite like doing it this way because it hides the implementation details of is.digit and creates a rudimentary form of encapsulation)

(2) A UDF - based on the example here and called as

=RegexTest(A1)

Option Explicit

Function RegexTest(s As String) As Double()

    Dim regexOne As Object
    Dim theNumbers As Object
    Dim Number As Object
    Dim result() As Double
    Dim i As Integer
    
    Set regexOne = New RegExp
    
    ' Not sure how you would extract numbers of length 6 only, so extract all numbers...
    
    regexOne.Pattern = "\d+"
    regexOne.Global = True
    regexOne.IgnoreCase = True
    
    Set theNumbers = regexOne.Execute(s)
    
    i = 1
    
    For Each Number In theNumbers
    
    '...Check the length of each number here
    
        If Len(Number) = 6 Then
            ReDim Preserve result(1 To i)
            result(i) = CDbl(Number)
            i = i + 1
        End If
    Next
    
    RegexTest = result

End Function

Note - if you wanted to preserve leading zeroes you would need to omit the Cdbl() and return the numbers as strings. Returns an error if no 6-digit numbers are found.

Upvotes: 1

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60224

If you have Microsoft 365, you can use this formula:

=LET(arr,TEXTSPLIT(SUBSTITUTE(SUBSTITUTE(A1,"'"," "),"_"," ")," "),
FILTER(arr,ISNUMBER(-arr)*(LEN(arr)=6)))

Thanks to @TomSharpe for this shorter version, using an array constant within TEXTSPLIT to add on possible delimiters.

=LET(arr,TEXTSPLIT(A1,{"_"," ",","," ","'"}),FILTER(arr,(LEN(arr)=6)*ISNUMBER(-arr)))

Data
enter image description here

Output
enter image description here

Upvotes: 1

Jos Woolley
Jos Woolley

Reputation: 9062

An alternative is:

=LET(ζ,MID(A1,SEQUENCE(,LEN(A1)-5),6),ξ,MID(ζ,SEQUENCE(6),1),FILTER(ζ,MMULT(SEQUENCE(,6,,0),1-ISERR(0+ξ))=6))

Upvotes: 1

Related Questions