Patrick Coffey
Patrick Coffey

Reputation: 1

Extract different numbers from multiple strings

In a .csv spreadsheet, I have multiple strings with incrementing numerical values contained in each, and I need to extract the numbers from each string. For example, here are two strings:

DEVICE1.CM1 - 4.1.1.C1.CA_VALUE (A)
DEVICE1.CM2 - 6.7.1.C2.CA_VALUE (A)
DEVICE1.CM1 - 4.1.2.C1.CA_VALUE (A) 
DEVICE1.CM1 - 4.1.2.C2.CA_VALUE (A) 
DEVICE1.CM1 - 4.1.2.C3.CA_VALUE (A) 
DEVICE1.CM1 - 5.1.1.C1.CA_VALUE (A) 
DEVICE1.CM1 - 5.1.1.C2.CA_VALUE (A) 
DEVICE1.CM1 - 5.10.1.C3.CA_VALUE (A) 
DEVICE1.CM1 - 6.13.1.C10.CA_VALUE (A)

And I am looking to extract "4.1.1.C1" from the first string, and "6.7.1.C2" from the second string.

I have over 1000 strings, each with a different incremental value in the form of "#.#.#.C.#" and all of the options I have tried so far involve searching for a specific value to extract, rather than extracting all values of that general form. Is there any reasonable way to accomplish this?

Upvotes: 0

Views: 184

Answers (3)

T.M.
T.M.

Reputation: 9948

The fixed structure

(items) are always preceeded by " - " and followed by ".CA_VALUE (A)"

allows to isolate the code string via Split as follows:

  • consider ".CA_VALUE (A)" as closing delimiter, but change occurrence(s) to "- "
  • execute Split now on the resulting string using only the first delimiter (StartDelim "- ")
  • isolate the second token (index: 1 as split results are zero-based)
Function ExtractCode(ByVal s As String) As String
    Const StartDelim   As String = "- "
    Const ClosingDelim As String = ".CA_VALUE (A)"
    ExtractCode = Split(Replace(s, ClosingDelim, StartDelim), StartDelim)(1)
End Function


Another approach with focus on splitting via point delimiters //Edit 2021-11-20

If you want to experiment with a fixed start position of your 4-items code in a split array (based on point delimiters "."), you might also consider the following approach:

  • split via point delimiters "."
  • filter only the 3rd,4th,5th and 6th item via WorksheetFunction.Index (by its columns argument)
  • join the resulting items again via connecting points "."

a) Using (Excel) version MS 365

Function ExtractCode(ByVal s As String, Optional startPos As Long = 3) As Variant
    Const delim As String = "."
    Dim tmp
    tmp = Split(Replace(s, "- ", delim), delim)   ' normalize hyphen to point delimiter
    With Application.WorksheetFunction
        ExtractCode = Join(.Index(tmp, 0, .Sequence(1, 4, startPos)), ".")
    End With
End Function

b) Make it backwards compatible

Just change the function result assignment to

        ExtractCode = Join(.Index(tmp, 0, Evaluate("{1,2,3,4}-1+" & startPos)), ".")

which in both cases changes the Index column argument to a 1-based column number Array(3,4,5,6)

Upvotes: 1

FunThomas
FunThomas

Reputation: 29276

I am not a big fan of regular expressions because they are often hard to read, but this is a typical example where you should use them. Read carefully the Q&A BigBen linked to in the comments.

Function extractCode(s As String) As String
    Static rx As RegExp
    If rx Is Nothing Then Set rx = New RegExp
    rx.Pattern = "\d+\.\d+\.\d+\.C\d"
    If rx.Test(s) Then
        extractCode = rx.Execute(s)(0)
    End If
End Function

(You will need to add the reference to the Microsoft VBScript Regular Expression library)

--> Updated my answer, you need to escape the dot, else it is a placeholder for any character and the pattern would also match something like "4x1y2zC3",

Upvotes: 3

Solar Mike
Solar Mike

Reputation: 8375

So here goes: enter image description here

MID(A1,FIND("-",A1,1)+2,(FIND("_",A1,1)-FIND("-",A1,1))-5)

Upvotes: 1

Related Questions