CaptainABC
CaptainABC

Reputation: 1239

Combine rows and sum values by unique identifiers vba excel

I am in kind of a pickle :(

I have the below data and the task is to identify the unique records and combine them summing the values.

Let me explain, below is the data:

OrgData http://im80.gulfup.com/uDNyW7.png

So the end result that I need to get is the data per visit of each client with total of the price and the item name to be kept as the first item:

EndData http://im75.gulfup.com/PvkIWz.png

I have tried using a helper column which is a combination of the "Client ID" and the "Date"

For i = 1 to Lastrow
Worksheets("Sheet1").Range("F" & i).Value = Worksheets("Sheet1").Range("A" & i).Value & _
Worksheets("Sheet1").Range("C" & i).Value
Next i

I then tried to copy the helper column to a temp sheet and remove duplicates and then for each of the remaining values I used autofilter by the helper column value and then summed the result of column D and wrote that to a new sheet.

Set rng = Sheet1.Range("D2:D" & lastrow2)
total = Application.WorksheetFunction.Sum(rng.SpecialCells(xlCellTypeVisible))

But given that my sheet has more than 60K + rows, it takes forever.

I am sure that there is a better approach out there, but just can't think of any.

Upvotes: 1

Views: 2179

Answers (3)

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60174

Here is a VBA solution using a User Defined Object: cVisit which has the five properties of ID, Name, Date, Price and Item.

EDIT: I ran some timing tests and, depending on the distribution of duplicates in the source data, it runs in five to fifteen seconds on my machine with a data source of 60,000 rows.

First insert a class module, rename it cVisit, and paste the following code:


Option Explicit
Private pID As String
Private pName As String
Private pDT As Date
Private pPrice As Double
Private pItem As String

Public Property Get ID() As String
    ID = pID
End Property
Public Property Let ID(Value As String)
    pID = Value
End Property

Public Property Get Name() As String
    Name = pName
End Property
Public Property Let Name(Value As String)
    pName = Value
End Property

Public Property Get DT() As Date
    DT = pDT
End Property
Public Property Let DT(Value As Date)
    pDT = Value
End Property

Public Property Get Price() As Double
    Price = pPrice
End Property
Public Property Let Price(Value As Double)
    pPrice = Value
End Property

Public Property Get Item() As String
    Item = pItem
End Property
Public Property Let Item(Value As String)
    pItem = Value
End Property

Then, in a regular module:


Option Explicit
Sub DailyVisits()
    Dim wsSrc As Worksheet, vSrc As Variant, rSrc As Range
    Dim vRes() As Variant, wsRes As Worksheet, rRes As Range
    Dim cV As cVisit, colVisits As Collection
    Dim I As Long
    Dim sKey As String
Set wsSrc = Worksheets("sheet1")
Set wsRes = Worksheets("sheet1")
Set rRes = wsRes.Range("H1")

'Read source data into an array as it is much faster to iterate through a VBA array
' than a worksheet
With wsSrc
    Set rSrc = .Range("a1", .Cells(.Rows.Count, "A").End(xlUp)).Resize(columnsize:=5)
    vSrc = rSrc
End With

'Collect all the visits into a Collection keyed to Client ID and Date
Set colVisits = New Collection
On Error Resume Next
For I = 2 To UBound(vSrc, 1)
    Set cV = New cVisit
    With cV
        .ID = vSrc(I, 1)
        .Name = vSrc(I, 2)
        .DT = vSrc(I, 3)
        .Price = vSrc(I, 4)
        .Item = vSrc(I, 5)
        sKey = CStr(.ID & "|" & .DT)
        colVisits.Add cV, sKey

        'If the record for this ID and date already exists, then add the 
        'price to the existing record.  Else a new record gets added
        If Err.Number = 457 Then  
            With colVisits(sKey)
                .Price = .Price + cV.Price
            End With
        ElseIf Err.Number <> 0 Then Stop
        End If
        Err.Clear
    End With
Next I
On Error GoTo 0

'To minimize chance of out of memory errors with large database
Erase vSrc
vSrc = rSrc.Rows(1)


'Write the collection to a "results" array
'then write it to the worksheet and format
ReDim vRes(0 To colVisits.Count + 1, 1 To 5)
For I = 1 To UBound(vRes, 2)
    vRes(0, I) = vSrc(1, I)
Next I
For I = 1 To colVisits.Count
    With colVisits(I)
        vRes(I, 1) = .ID
        vRes(I, 2) = .Name
        vRes(I, 3) = .DT
        vRes(I, 4) = .Price
        vRes(I, 5) = .Item
    End With
Next I

With rRes.Resize(UBound(vRes), UBound(vRes, 2))
    .EntireColumn.Clear
    .Value = vRes
    With .Rows(1)
        .Font.Bold = True
        .HorizontalAlignment = xlCenter
    End With
    .Columns(3).NumberFormat = "d/mm/yyyy"
    .Columns(4).NumberFormat = "$#,##0.00"
    .EntireColumn.AutoFit
End With


End Sub

Adjust your source and results worksheet as you like, and the first cell of the results range and Run.

Upvotes: 1

pnuts
pnuts

Reputation: 59450

OP wants VBA but has also mentioned "what else can I try" so on the excuse that this may allow for other possibilities, a formula basis solution might be to:

  1. Work on a copy.
  2. Add a column (say A, with =IF(OR(B1<>B2,D1<>D2),"*","") in A2 copied down to suit (ie ~60k rows) and add * at the bottom of the list. (Hopefully this will cover the situation where different days are next to one another but with the same client ID, though that is not shown in the example).
  3. Copy A and Paste Special Values over the top (may be possible to skip until part of step 6).
  4. There should now be asterisks to mark the rows from which Item names are to be retained (and where the totals are required).
  5. In G2 and copied down to suit: =IF(ISBLANK(A2),"",SUM(INDIRECT("E"&ROW()&":E"&ROW()+MATCH("~*",A3:A$65000,0)-1)))
  6. Select, Copy and Paste Special, Values over the top.
  7. Filter to select (Blank) in ColumnA and delete all visible except header.
  8. Remove the filter.

Should be a lot quicker that multiple Subtotals, but still may not be suitable if to be repeated frequently. However corresponding steps could be built into a subroutine, or the above recorded for a macro.

Upvotes: 0

DannyBland
DannyBland

Reputation: 493

A simple way to do this would be to combine the two cells so in F2 type

=A2 & D2

Then sort column E, then run a subtotal on your data, that sums column D at every change in column F.

Upvotes: 0

Related Questions