AnJ
AnJ

Reputation: 614

Optimize copying filtered data

I have a table with around 100k rows and 40 columns.

I need to copy some of the rows to another workbook based an array with strings that match column values.

cond_list = ["value1", "value2", "value3" ...]

This condition can match 5k rows or more.

I tried a simple solution to use AutoFilter and copy visible cells:

' Filter source data
src_wks.ListObjects("Table1").Range.AutoFilter _
  Field:=src_wks.ListObjects("Table1").ListColumns("Column1").Index, _
  Criteria1:=cond_list, Operator:=xlFilterValues
        
' Copy and paste
src_wks.UsedRange.SpecialCells(xlCellTypeVisible).Copy
dst_wks.Range("A1").PasteSpecial Paste:=xlPasteValues

Filtering takes a fraction of a second, but then execution of this line takes more than 10 minutes. I have to run this code like 20 times so it is unacceptable.

src_wks.UsedRange.SpecialCells(xlCellTypeVisible).Copy

I tried to modify the code following this comment: https://stackoverflow.com/a/22789329/7214068

I tried to copy whole data first and then remove hidden rows:

' Copy and Paste whole table
dst_wks.UsedRange.Offset(1, 0).Value = ""
addr = src_wks.UsedRange.Address
dst_wks.Range(addr).Value = src_wks.UsedRange.Value

' Filter data
dst_wks.ListObjects("Table1").Range.AutoFilter _
  Field:=dst_wks.ListObjects("Table1").ListColumns("Column1").Index, _
  Criteria1:=cond_list, Operator:=xlFilterValues

' Remove rest
Application.DisplayAlerts = False ' Suppress "delete row?" promt
Dim i, numRows As Long
numRows = dst_wks.UsedRange.Rows.Count
For i = numRows To 1 Step -1
    If (dst_wks.Range("A" & i).EntireRow.Hidden = True) Then
        dst_wks.Range("A" & i).Delete
    End If
Next i
Application.DisplayAlerts = True

Copying whole data takes less than two seconds. But then it again hangs on for loop and takes more than 10 minutes.

Upvotes: 0

Views: 171

Answers (3)

Anabas
Anabas

Reputation: 356

I am not sure about how your data looks like, but from my opinion, it is not efficienct to use filter. Here I will post a demo for your reference. Better to use SQL.

Sub filterProcess()
    Dim filterArray
    Dim conn As Object
    Set conn = CreateObject("adodb.connection")
    strPath = ThisWorkbook.FullName

    If Application.Version < 12 Then
        connString = "Provider=Microsoft.jet.OLEDB.4.0;Extended Properties=Excel 8.0;Data Source=" & strPath
    Else
        connString = "Provider=Microsoft.ACE.OLEDB.12.0; Extended Properties = 'Excel 12.0; HDR=YES;IMEX=0'; Data Source = " & strPath
    End If

    filterArray = Array("ta001", "01", "A")


    conn.Open connString
    strSQL = " SELECT * FROM [a$]  where [title1] = '" & filterArray(0) & "'" & " and [title2] = '" & filterArray(1) & "'" & "and [title3] = '" & filterArray(2) & "'"
    Set rst = conn.Execute(strSQL)
   Worksheets.Add

    For j = 0 To rst.Fields.Count - 1
        Cells(1, j + 1) = rst.Fields(j).Name

    Next

    ActiveSheet.Range("A2").CopyFromRecordset rst

    rst.Close
    conn.Close

    Set conn = Nothing


End Sub

enter image description here

Upvotes: 1

Ryan Wildry
Ryan Wildry

Reputation: 5677

An alternate approach (there are several ways to do this) could be to use a SQL statement to query the data from the sheet in question, then copy it to a new sheet. This might be preferable if the conditions for selecting data become more complex.

I have my data setup like this on Sheet1:

enter image description here

Code

Option Explicit
Private Const adCmdText As Long = 1
Private Const adStateOpen As Long = 1

Public Sub DisplayView(Conditions As String)
    Dim dbField       As Variant
    Dim fieldCounter  As Long
    Dim dbConnection  As Object
    Dim dbRecordset   As Object
    Dim dbCommand     As Object
    Dim OutputSheet   As Excel.Worksheet

    Set dbConnection = CreateObject("ADODB.Connection")
    Set dbRecordset = CreateObject("ADODB.Recordset")
    Set dbCommand = CreateObject("ADODB.Command")

    Set OutputSheet = ThisWorkbook.Worksheets("Sheet2")

    'Do a quick check to determine the correct connection string
    'if one of these don't work, have a look here --> https://www.connectionstrings.com/excel/
    If Left$(ThisWorkbook.FullName, 4) = "xlsm" Then
        dbConnection.connectionstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & _
        ThisWorkbook.FullName & ";Extended Properties='Excel 12.0 Macro;HDR=YES';"
    Else
        dbConnection.connectionstring = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & _
        ThisWorkbook.FullName & ";Extended Properties='Excel 12.0;HDR=YES';"
    End If

    'Open the connection and query
    dbConnection.Open
    With dbCommand
        .ActiveConnection = dbConnection
        .CommandType = adCmdText
        .CommandText = "Select * from [Sheet1$] where Column1 in (" & Conditions & ")" 'Update Sheet where applicable
        Set dbRecordset = .Execute
    End With

    'Clear the Output Sheet
    OutputSheet.Cells.Clear

    'Add Headers to output
    For Each dbField In dbRecordset.Fields
        fieldCounter = fieldCounter + 1
        OutputSheet.Cells(1, fieldCounter).Value2 = dbField.Name
    Next

    'Dump the found records
    OutputSheet.Range("A2").CopyFromRecordset dbRecordset
    If dbConnection.State = adStateOpen Then dbConnection.Close
End Sub

'Run from here
Public Sub ExampleRunner()
    Dim t As Double
    t = Timer
    DisplayView "'value1','value2','value3'" 'Send it a quoted csv of values you are looking for
    Debug.Print "Getting data took: " & Timer - t & " seconds"
End Sub

This is taking about 4-5 seconds on my machine to pull back a few thousand records from a total data set size of 100,000.

Upvotes: 1

HTH
HTH

Reputation: 2031

you could try :

  • the technique from accepted solution of the SO question you linked

    i.e.: loop through Areas and work with Value properties

  • reference src_wks.ListObjects("Table1").Range also for copy/paste values operation

as follows:

Dim area As Range
With src_wks.ListObjects("Table1").Range ' reference your table Range
    ' Filter referenced range
    .AutoFilter _
        Field:=src_wks.ListObjects("Table1").ListColumns("Column1").Index, _
        Criteria1:=cond_list, Operator:=xlFilterValues

    ' Copy and paste values from each single referenced range "visible" area
    For Each area In .SpecialCells(xlCellTypeVisible).Areas
        With area
            dst_wks.Cells(Rows.Count, 1).End(xlUp).Offset(1).Resize(.Rows.Count, .Columns.Count).Value = .Value
        End With
    Next
End With

and if you could also Sort your table, that could dramatically speed it up more

Upvotes: 0

Related Questions