Reputation: 355
I had originally asked the question below and found a good formula excel formula find part number in file path text string
So after using the =MID(A1,SEARCH("????-??????-??",A1),14)
formula to find the part number names things seemed fine until I ran into data that looks like the sample set below.
c:\users\stuff\folder_name\folder_name_9876-vv-123-Ag\1234-000001-01_ baskets_1.pdf
c:\users\stuff\folder_name_9876-vv-123-Ag\random file_1234-000001-02_ baskets_2.pdf
c:\users\stuff\folder_name_XXXX-XXXXXX-XX\1234-000001-03_ baskets_3.pdf
c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf
c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf
c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf
I am looking thru a few hundred thousand file names and I know I am missing some important files due to the false positive triggers. Like 9876-vv-123-Ag
it technically fits the ????-??????-??
format but is not what I want to extract from the string. Now to the question, is there a way to apply =MID(A1,SEARCH("????-??????-??",A1),14)
that can look from the right side of the string and work its way left to ensure that it grabs the last sub-string in the full string that meets the ????-??????-??
format?
or is there a way to set the wildcard search to numeric only instead of anything in that ????-??????-??
format?
Upvotes: 0
Views: 4247
Reputation: 60324
If there are no other ambiguities, you can search just the file name portion of the file path, and return the matching string:
=MID(TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",99)),99)),SEARCH("????-??????-??",TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",99)),99))),14)
If there are other ambiguities, you will need to be more specific
This portion of the formula: TRIM(RIGHT(SUBSTITUTE(A1,"\",REPT(" ",99)),99))
returns the last substring where the string is split on the "\" character. If it might be the case that the file name might have more than 99 characters, then increase the 99 appropriately, or use something like LEN(A1)
instead.
If you want to use a VBA routine, the following routine processed 500,000 rows in less than 4 seconds. It does depend on the part number consisting of only digits, but the change would be trivial if other patterns need to be included.
Option Explicit
Sub partNum()
Dim RE As Object
Dim R As Range, WS As Worksheet
Dim vSrc As Variant, vRes As Variant
Dim I As Long
Set WS = Worksheets("sheet1")
With WS
vSrc = .Range(.Cells(1, 1), .Cells(.Rows.Count, 1).End(xlUp))
Set R = .Range(.Cells(1, 2), .Cells(UBound(vSrc, 1), 2))
End With
ReDim vRes(1 To UBound(vSrc, 1), 1 To 1)
Set RE = CreateObject("vbscript.regexp")
With RE
.Pattern = "\d{4}-\d{6}-\d{2}"
For I = 1 To UBound(vSrc)
If .test(vSrc(I, 1)) = True Then vRes(I, 1) = .Execute(vSrc(I, 1))(0)
Next I
End With
R.EntireColumn.Clear
R = vRes
End Sub
VBA Function (stand-alone)
Option Explicit
Function getPartNum(S As String) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.Pattern = "\d{4}-\d{6}-\d{2}(?!.*\\)"
If .test(S) = True Then getPartNum = .Execute(S)(0)
End With
End Function
Upvotes: 1
Reputation: 54838
Works only for 2 possible occurrences of the string choosing the last.
=IF(ISERROR(SEARCH("????-??????-??";A2));"";IF(ISERROR(SEARCH("????-??????-??";A2;SEARCH("????-??????-??";A2)+1));MID($A2;SEARCH("????-??????-??";A2);LEN("????-??????-??"));MID($A2;SEARCH("????-??????-??";A2;SEARCH("????-??????-??";A2)+1);LEN("????-??????-??"))))
Sub SearchQ2()
Const cStrSource As String = "A2"
Const cStrTarget As String = "B2"
Const cStrSearch As String = "????-??????-??"
Dim vntRange As Variant
Dim lngSearch As Long
Dim intStart As Integer
Dim lng1 As Long
Dim strTemp As String
vntRange = Range(cStrSource).Resize(Cells(Rows.Count, Range("A1").Column) _
.End(xlUp).Row - Range(cStrSource).Row + 1)
For lng1 = 1 To UBound(vntRange)
intStart = 1
Do
On Error Resume Next
lngSearch = WorksheetFunction.Search(cStrSearch, _
vntRange(lng1, 1), intStart)
If Err Then
Exit Do
Else
strTemp = Mid(vntRange(lng1, 1), lngSearch, Len(cStrSearch))
intStart = lngSearch + Len(cStrSearch) + 1
End If
Loop
vntRange(lng1, 1) = strTemp
Next
Range(cStrTarget).Resize(Cells(Rows.Count, Range("A1").Column) _
.End(xlUp).Row - Range(cStrSource).Row + 1) = vntRange
End Sub
It works fine for a hundred, maybe a few hundred cells. This might be an inspiration for someone to write an efficient function.
It loops through the string to find the last occurrence of the SearchString.
Function SearchQ(SearchString As String, Cell As Range) As String
Application.Volatile
Dim lngSearch As Long
Dim intStart As Integer
intStart = 1
Do
On Error Resume Next
lngSearch = WorksheetFunction.Search(SearchString, _
Cell.Cells(1, 1).Text, intStart)
If Err Then
Exit Do
Else
SearchQ = Mid(Cell.Cells(1, 1).Text, lngSearch, Len(SearchString))
intStart = lngSearch + Len(SearchString) + 1
End If
Loop
End Function
Upvotes: 0