Reputation: 71
I'm a VBA noob. I need help working out this filter: My data has ~50,000 rows and 100 columns. The column I want to filter has values like TL-98.263138472% BD-1.736861528%. I want to filter out all the values in VBA where TL>90%. I can think of a long way of doing it - where I create a loop, break down each cell, then look at TL, then the 4 numbers next to it. But it sounds like it would take forever. Wondering if there's a faster/easier way to do it? Also wondering, if it's even worth it. If it would take even more than 2 seconds, then I would rather not do it with VBA. I have not coded it yet, wanted to see if anyone has better ideas than what I came up with. Thanks in advance! Adding an example of my data below:
Upvotes: 2
Views: 328
Reputation: 166196
Pretty fast in my tests:
Sub tester()
Dim ws As Worksheet, t
Dim i As Long, rng As Range, rngFilt As Range, arr, arrFilt
' For i = 2 To 50000 'create some dummy data
' Cells(i, "A") = "TL-" & 50 + (Rnd() * 60) & "% BD-1.736861528%"
' Next i
' [B2:CV50000].value="blah" 'fill rest of table
t = Timer
Set ws = ActiveSheet
If ws.FilterMode Then ws.ShowAllData
Set rng = ws.Range("A1:A" & ws.Cells(Rows.Count, "A").End(xlUp).Row) 'range of values to filter
Set rngFilt = rng.Offset(0, 110) 'a range off to the right to filter on
arr = rng.Value
arrFilt = rngFilt.Value 'for holding filtering flags
arrFilt(1, 1) = "Filter" 'column header
For i = 2 To UBound(arr, 1)
arrFilt(i, 1) = IIf(FilterOut(arr(i, 1)), "Y", "N")
Next i
rngFilt.Value = arrFilt
rngFilt.AutoFilter field:=1, Criteria1:="N"
Debug.Print Timer - t
End Sub
'does this value need to be filtered out?
Function FilterOut(v) As Boolean
Dim pos As Long
pos = InStr(v, "TL-")
If pos > 0 Then
v = Mid(v, pos + 3)
pos = InStr(v, "%")
If pos > 0 Then
v = Left(v, pos - 1)
'Debug.Print v
If IsNumeric(v) Then FilterOut = v > 90
End If
End If
End Function
This ran in <0.3 sec for me, on a 50k row X 100 col dataset
Upvotes: 2
Reputation: 54807
The Code
Option Explicit
Sub CopyData()
Dim T As Double: T = Timer
' Read Data: Write the values from the source range to an array.
' Define constants.
Const SRC_NAME As String = "Sheet1"
Const SRC_COLUMN As Long = 44
Const CRIT_STRING_LEFT As String = "TL-"
Const CRIT_VALUE_GT As Double = 90
Const DST_NAME As String = "Sheet2"
' Reference the workbook.
Dim wb As Workbook: Set wb = ThisWorkbook ' workbook containing this code
' Reference the source range.
Dim sws As Worksheet: Set sws = wb.Sheets(SRC_NAME)
Dim srg As Range: Set srg = sws.Range("A1").CurrentRegion
Dim srCount As Long: srCount = srg.Rows.Count
Dim cCount As Long: cCount = srg.Columns.Count
' Write to the array (practically this line uses up all the time).
Dim Data(): Data = srg.Value ' assumes at least two cells in 'srg'
Debug.Print "Read Data: " & Format(Timer - T, "0.000s")
T = Timer
' Modify Data: Write the critical values to the top of the array.
Dim cLen As Long: cLen = Len(CRIT_STRING_LEFT)
Dim dr As Long: dr = 1 ' skip headers
Dim sr As Long, c As Long
Dim cPos As Long, cNum As Double, cString As String
For sr = 2 To srCount ' skip headers
cString = CStr(Data(sr, SRC_COLUMN))
cPos = InStr(1, cString, CRIT_STRING_LEFT, vbTextCompare)
If cPos > 0 Then
cString = Right(cString, Len(cString) - cPos - cLen + 1)
cString = Replace(cString, "%", "")
cNum = Val(cString) ' 'Val' doesn't work with "!,@,#,$,%,&,^"
If cNum > CRIT_VALUE_GT Then ' 'Evaluate' is too slow!
dr = dr + 1
For c = 1 To cCount
Data(dr, c) = Data(sr, c)
Next c
End If
End If
Next sr
Debug.Print "Modify Data: " & Format(Timer - T, "0.000s")
T = Timer
' Write Data: Write the values from the array to the destination range.
If dr = 0 Then Exit Sub ' no filtered values
' Reference the destination range.
Dim dws As Worksheet: Set dws = wb.Sheets(DST_NAME)
Dim drg As Range: Set drg = dws.Range("A1").Resize(dr, cCount)
' Write to the range (practically this line uses up all the time).
drg.Value = Data
' Clear below
drg.Resize(dws.Rows.Count - drg.Row - dr + 1).Offset(dr).Clear
Debug.Print "Write Data: " & Format(Timer - T, "0.000s")
MsgBox "Data copied.", vbInformation
End Sub
The Result (Time Passed)
On a sample of 50k
rows by 100
columns of data with 26k
matches, the code finished in under 5s
:
Read Data: 1.336s
Modify Data: 0.277s
Write Data: 3.375s
There were no blank cells and each cell in the criteria column contained the criteria string with a percentage hence it should be faster on your data. Your feedback is expected.
Upvotes: 1
Reputation: 1827
It you want to look into non VBA solutions, You could use a helper column to decide it it's worth filtering out.
First we need to find "TL-" in the string, then find "%" After that:
MID(A4,FIND("TL-",A4)+3,FIND("%",A4,FIND("TL-",A4)+3)-FIND("TL-",A4)-3)
This will just return us that value sub string, regardless or position.
Now we need to convert it into a value... and I'm told that --( ) isn't the correct way to convert a string to a value... but i keep using it and it keeps working.
Anyway, finally we test if that is larger than 90 like:
=IF(--(MID(A4,FIND("TL-",A4)+3,FIND("%",A4,FIND("TL-",A4)+3)-FIND("TL-",A4)-3))>90,"Remove","Keep")
Upvotes: 1