Sham
Sham

Reputation: 43

VBA-Excel and large data sets causes program to crash

First time poster and new to programming in general. I have a project in which i have to build a financial model to mine for data in excel. I have succeeded in building said model on VBA. I have ran tests on 3,000 line dataset and it was successful. I will briefly explain what it does.

I track a given stock on a given day on multiple exchanges. I download the data (roughly 935,000 lines) The first step is copy all the data for a given exchange (roughly 290,000) onto a new sheet (this takes roughly 8min), then I create a new column to log the bid ask spread (12secs), the next step is what Im having trouble with, I basically rank each line of data twice, one column for Bid size and one column for Ask size. I created a function which uses excel Percentile function and ranks based on where the given bid and ask size lands. As of right now, I have been running the Macro for the last 35min and has yet to execute. I cant attempt the other macros since each macro depends on the previous one.

So my basic issue is that since my data set is large, my model keeps crashing.The code seems to be fine when working with the test data, and it doesn't throw any errors when I run the program, but with the larger data set it just crashes. Does anyone have any suggestions? Is this normal with such large amounts of data?

Thanks in advance. Sham

Here is the sub and function thats giving me the trouble, the sub takes in the required inputs to run the function and then pops into the assigned cell. The code is suppose to repeat the process for three separate sheets. For now, Id like it to work on one sheet, hence used the comments to not include the loop

Sub Bucketting()

Dim firstRow As Long
Dim lastRow As Long
Dim counter As Long
Dim bidRange As Range
Dim offerRange As Range
Dim bidScroll As Range
Dim offerScroll As Range
Dim Ex As String
Dim i As Integer

'For i = 1 To 1 Step 1 'Sheet Selection Process
 '   If i = 1 Then
  '      Ex = "Z"
   ' ElseIf i = 2 Then
    '    Ex = "P"
   ' Else
    '    Ex = "T"
   ' End If

Sheets("Z").Select 'Sheet selected

With ActiveSheet

    firstRow = .UsedRange.Cells(1).Row + 1
    lastRow = .UsedRange.Rows.Count

   Set bidRange = .Range("F2:F" & lastRow)
   Set offerRange = .Range("G2:G" & lastRow)

    For counter = lastRow To firstRow Step -1

        Set bidScroll = .Range("F" & counter)
        Set offerScroll = .Range("G" & counter)

        With .Cells(counter, "J")
        .Value = DECILE_RANK(bidRange, bidScroll)
        End With

        With .Cells(counter, "K")
        .Value = DECILE_RANK(offerRange, offerScroll)
        End With

    Next counter

End With

Range("J1").Select
ActiveCell = "Bid Rank"

ActiveCell.Offset(0, 1) = "Offer Rank"

'Next i

End Sub

 Function DECILE_RANK(DataRange, RefCell)

    'Credit: BJRaid 
    'DECILE_RANK(The Range of data)
    'Declares the function that can be called in the spreadsheet cell  - enter '=DECILE_RANK(A5:A50,A5)

    'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are

    DEC1 = Application.WorksheetFunction.Percentile(DataRange, 0.1)
    DEC2 = Application.WorksheetFunction.Percentile(DataRange, 0.2)
    DEC3 = Application.WorksheetFunction.Percentile(DataRange, 0.3)
    DEC4 = Application.WorksheetFunction.Percentile(DataRange, 0.4)
    DEC5 = Application.WorksheetFunction.Percentile(DataRange, 0.5)
    DEC6 = Application.WorksheetFunction.Percentile(DataRange, 0.6)
    DEC7 = Application.WorksheetFunction.Percentile(DataRange, 0.7)
    DEC8 = Application.WorksheetFunction.Percentile(DataRange, 0.8)
    DEC9 = Application.WorksheetFunction.Percentile(DataRange, 0.9)


    ' Calculate the Decile rank that the reference cell value sits within

    If (RefCell <= DEC1) Then DECILE_RANK = 1
    If (RefCell > DEC1) And (RefCell <= DEC2) Then DECILE_RANK = 2
    If (RefCell > DEC2) And (RefCell <= DEC3) Then DECILE_RANK = 3
    If (RefCell > DEC3) And (RefCell <= DEC4) Then DECILE_RANK = 4
    If (RefCell > DEC4) And (RefCell <= DEC5) Then DECILE_RANK = 5
    If (RefCell > DEC5) And (RefCell <= DEC6) Then DECILE_RANK = 6
    If (RefCell > DEC6) And (RefCell <= DEC7) Then DECILE_RANK = 7
    If (RefCell > DEC7) And (RefCell <= DEC8) Then DECILE_RANK = 8
    If (RefCell > DEC8) And (RefCell <= DEC9) Then DECILE_RANK = 9
    If (RefCell > DEC9) Then DECILE_RANK = 10

End Function

Upvotes: 4

Views: 13798

Answers (3)

Lance Roberts
Lance Roberts

Reputation: 22842

The problem is that your looping through each row individually, the Excel way is to try and work with whole ranges at once whenever possible. I would load the ranges into arrays, then modify your DECILE_RANK code to work with the items in the array.

Note that variant arrays that read ranges in are 2-D.

Here is the fully functioning code including my custom VBA array slicer. Note that it was only tested on a small dataset:

Sub Bucketting()

Dim lastRow As Long
Dim bidArray As Variant
Dim offerArray As Variant

Sheets("Sheet1").Select 'Sheet selected

With ActiveSheet

  lastRow = .UsedRange.Rows.Count + 1

  bidArray = .Range("F2:F" & lastRow)
  offerArray = .Range("G2:G" & lastRow)

  Range("J2:J" & lastRow).Value = GetArraySlice2D(DECILE_RANK(bidArray), "column", 1, 1, 0)
  Range("K2:K" & lastRow).Value = GetArraySlice2D(DECILE_RANK(offerArray), "column", 1, 1, 0)

End With

Range("J1").Select
ActiveCell = "Bid Rank"

ActiveCell.Offset(0, 1) = "Offer Rank"

End Sub

Function DECILE_RANK(DataRange As Variant) As Variant

' Credit:     BJRaid
' DECILE_RANK(The Range of data)
' Declares the function that can be called in the spreadsheet cell  - enter '=DECILE_RANK(A5:A50,A5)

Dim DEC(0 To 10) As Variant
Dim i As Integer, j As Integer

'Using the percentile worksheet function calculate where the 10th, 20th etc percentile of the reference range are
DEC(0) = 0
For i = 1 To 9
  DEC(i) = Application.WorksheetFunction.Percentile(DataRange, 0.1 * i)
Next i
DEC(10) = Application.WorksheetFunction.Max(DataRange)

' Calculate the Decile rank that the reference cell value sits within
For i = 1 To UBound(DataRange, 1)
  For j = 1 To 10
    If ((DataRange(i, 1) > DEC(j - 1)) And (DataRange(i, 1) <= DEC(j))) Then
      DataRange(i, 1) = j
      Exit For
    End If
  Next j
Next i

DECILE_RANK = DataRange

End Function

Public Function GetArraySlice2D(Sarray As Variant, Stype As String, Sindex As Integer, Sstart As Integer, Sfinish As Integer) As Variant

' this function returns a slice of an array, Stype is either row or column
' Sstart is beginning of slice, Sfinish is end of slice (Sfinish = 0 means entire
' row or column is taken), Sindex is the row or column to be sliced (NOTE:
' 1 is always the first row or first column)
' an Sindex value of 0 means that the array is one dimensional 3/20/09 Lance Roberts

Dim vtemp() As Variant
Dim i As Integer

On Err GoTo ErrHandler

Select Case Sindex
    Case 0
        If Sfinish - Sstart = UBound(Sarray) - LBound(Sarray) Then
            vtemp = Sarray
        Else
            ReDim vtemp(1 To Sfinish - Sstart + 1)
            For i = 1 To Sfinish - Sstart + 1
                vtemp(i) = Sarray(i + Sstart - 1)
            Next i
        End If
    Case Else
        Select Case Stype
            Case "row"
                If Sfinish = 0 Or (Sstart = LBound(Sarray, 2) And Sfinish = UBound(Sarray, 2)) Then
                    vtemp = Application.WorksheetFunction.Index(Sarray, Sindex, 0)
                Else
                    ReDim vtemp(1 To Sfinish - Sstart + 1)
                    For i = 1 To Sfinish - Sstart + 1
                        vtemp(i) = Sarray(Sindex, i + Sstart - 1)
                    Next i
                End If
            Case "column"
                If Sfinish = 0 Or (Sstart = LBound(Sarray, 1) And Sfinish = UBound(Sarray, 1)) Then
                    vtemp = Application.WorksheetFunction.Index(Sarray, 0, Sindex)
                Else
                    ReDim vtemp(1 To Sfinish - Sstart + 1)
                    For i = 1 To Sfinish - Sstart + 1
                        vtemp(i) = Sarray(i + Sstart - 1, Sindex)
                    Next i
                End If
        End Select
End Select
GetArraySlice2D = vtemp
Exit Function

ErrHandler:
    Dim M As Integer
    M = MsgBox("Bad Array Input", vbOKOnly, "GetArraySlice2D")

End Function

Upvotes: 1

Phil.Wheeler
Phil.Wheeler

Reputation: 16858

I'm not sure if this will directly address your problem, but have you considered using Application.ScreenUpdating = False? Don't forget to set it back to true once your data has processed.

Upvotes: 0

Jody
Jody

Reputation: 8291

935,000 lines is a lot for excel. Like, really a lot. Barring saying using a real database, If your application is literally putting a =Percentile(...) in each cell, I would recommend Trying to use another tool for that. Perhaps something within VBA itself. More generally, use something outside of a cell - then store the result value in the cell. There is a lot of overhead in maintaining those formulas that are interdependent on 935k rows of data.

Upvotes: 2

Related Questions