Instant Breakfast
Instant Breakfast

Reputation: 1445

Find value intersections in multiple columns

I am well out of my depth here: Can this be done? And if so, what methods should I consider?

I periodically receive a spreadsheet that contains a variable number of sheets. Each sheet has the same header row, but different values in the rows beneath. In one column is an identifying number that indicates a unique user, and I need to determine if there is an intersection between any of the Identifier columns on those worksheets. Here is a simplified example, in which the first and third worksheet have an intersection of abc789 but there is no intersecting value in Worksheet 2. I want to know when there is an intersection, and between which worksheets:

Worksheet 1:

ID_Number
•   abc123
•   abc456
•   abc789

Worksheet 2:

ID_Number
•   abc234
•   abc345
•   abc912

Worksheet 3:

ID_Number
•   abc789
•   abc567
•   abc678

If it can be done, I'm suspicious of another problem: doing it in a way that works for 3 sheets today and 10 sheets tomorrow! To answer that question I tried setting variables for an unknown number of columns to compare like this, but clearly failed:

Dim iArraySize As Integer
Dim iTabCounter As Integer
Dim iLoopCounter As Integer

iTabCounter = ActiveWorkbook.Sheets.Count

For iLoopCounter = 3 To iTabCounter
    iArraySize = ActiveWorkbook.Sheets(iLoopCounter).Range("C2", Range("C2").End(xlDown)).Count
    dim aID & iloopcounter as Variant 'this line fails on compile with "expected end of statement" highlighting the ampersand
    aID1 = Range("C2", Range("C2").End(xlDown)).Value
Next iLoopCounter

Is this a lost cause? Should I resolve myself to manual examination?

Upvotes: 3

Views: 4165

Answers (3)

tigeravatar
tigeravatar

Reputation: 26660

This will output a list of all ID's that were found more than once and what sheets they were found in on a summary sheet:

Sub tgr()

    Const strIDCol As String = "A"
    Const lHeaderRow As Long = 1

    Dim cllIDs As Collection
    Dim ws As Worksheet
    Dim IDCell As Range
    Dim arrUnqIDs(1 To 65000) As Variant
    Dim arrMatches(1 To 65000) As String
    Dim ResultIndex As Long
    Dim lUnqIDCount As Long

    Set cllIDs = New Collection

    For Each ws In ActiveWorkbook.Sheets
        With Range(ws.Cells(lHeaderRow + 1, strIDCol), ws.Cells(ws.Rows.Count, strIDCol).End(xlUp))
            If .Row > lHeaderRow Then
                For Each IDCell In .Cells
                    On Error Resume Next
                    cllIDs.Add IDCell.Text, LCase(IDCell.Text)
                    On Error GoTo 0
                    If cllIDs.Count > lUnqIDCount Then
                        lUnqIDCount = cllIDs.Count
                        arrUnqIDs(lUnqIDCount) = IDCell.Text
                        arrMatches(lUnqIDCount) = ws.Name
                    Else
                        ResultIndex = WorksheetFunction.Match(IDCell.Text, arrUnqIDs, 0)
                        arrMatches(ResultIndex) = arrMatches(ResultIndex) & "|" & ws.Name
                    End If
                Next IDCell
            End If
        End With
    Next ws

    If lUnqIDCount > 0 Then
        With Sheets.Add(Before:=ActiveWorkbook.Sheets(1))
            With .Range("A1:B1")
                .Value = Array("Intersecting ID's", "Intersected in Sheets...")
                .Font.Bold = True
            End With
            .Range("A2").Resize(lUnqIDCount).Value = Application.Transpose(arrUnqIDs)
            .Range("B2").Resize(lUnqIDCount).Value = Application.Transpose(arrMatches)
            .UsedRange.AutoFilter 2, "<>*|*"
            .UsedRange.Offset(1).EntireRow.Delete
            .UsedRange.AutoFilter
            .Range("A1").CurrentRegion.EntireColumn.AutoFit
        End With
    End If

    Set cllIDs = Nothing
    Set ws = Nothing
    Set IDCell = Nothing
    Erase arrUnqIDs
    Erase arrMatches

End Sub

Upvotes: 3

Graham Anderson
Graham Anderson

Reputation: 1239

The following code will display message boxes showing where the same ID numbers are found on diferent worksheets in the workbook. It assumes the identifier column is column A and that there are no blank cells within the data in column A

Sub CheckSub()
Const iIDENTIFIER_COLUMN = 1
Dim wsCurrentWorksheet As Worksheet
Dim wsWorksheetToCheck As Worksheet
Dim lCurrentRow As Long
Dim lCheckRow As Long
Dim iWorkbookNumber As Integer
Dim iWorkbookCount As Integer
Dim iCheckbookNumber As Integer

iWorkbookCount = ThisWorkbook.Sheets.Count
For iWorkbookNumber = 1 To iWorkbookCount
    lCurrentRow = 2
    Set wsCurrentWorksheet = ThisWorkbook.Sheets(iWorkbookNumber)
    Do While wsCurrentWorksheet.Cells(lCurrentRow, iIDENTIFIER_COLUMN).Value <> Empty
        For iCheckbookNumber = iWorkbookNumber To iWorkbookCount
            Set wsWorksheetToCheck = ThisWorkbook.Sheets(iCheckbookNumber)
            If wsCurrentWorksheet.Name <> wsWorksheetToCheck.Name Then
                lCheckRow = 2
                Do While wsWorksheetToCheck.Cells(lCheckRow, iIDENTIFIER_COLUMN).Value <> Empty
                    If wsCurrentWorksheet.Cells(lCurrentRow, iIDENTIFIER_COLUMN).Value = _
                        wsWorksheetToCheck.Cells(lCheckRow, iIDENTIFIER_COLUMN).Value Then
                            MsgBox (wsCurrentWorksheet.Cells(lCurrentRow, iIDENTIFIER_COLUMN).Value _
                            & " found on " & wsCurrentWorksheet.Name & " and " & wsWorksheetToCheck.Name)
                    End If
                    lCheckRow = lCheckRow + 1
                Loop
            End If
        Next iCheckbookNumber
        lCurrentRow = lCurrentRow + 1
    Loop
Next iWorkbookNumber
End Sub

Upvotes: 1

chiliNUT
chiliNUT

Reputation: 19573

It needs some work but heres a script that will print out all the dupes on all sheets in a column. Its not very robust, you have to specify the range, and it prints everything twice

Sub printDupes()
    For Each ws In ActiveWorkbook.Worksheets 'go thru each worksheet
        For Each idnumber In ws.Range("A2:A4") 'look at each idnumber in id column in selected worksheet
            For Each otherWs In ActiveWorkbook.Worksheets 'go thru each OTHER worksheet
             If ws.Name <> otherWs.Name Then 'skip it if its the same sheet
                For Each otherIdNumber In otherWs.Range("A2:A4") 'go thru each idnumber in the OTHER worksheet (the one you are comparing to)
                 If otherIdNumber.Value = idnumber.Value Then 'if you find a match
                 Debug.Print idnumber.Value 'print the value
                 Debug.Print otherWs.Name & "!" & otherIdNumber.Address 'print the address of the id we were looking at
                 Debug.Print ws.Name & "!" & idnumber.Address 'print address of the match
                 End If

                Next otherIdNumber
                End If
            Next otherWs

        Next idnumber

    Next ws
End Sub

this will work for your particular example, replace A2:A4 with a large range

Upvotes: 1

Related Questions