Cbohn
Cbohn

Reputation: 5

Excel VBA listing all unique values from a column, and all corresponding values to each unique value from another column

I am a novice with VBA/Macros and do not know the technical terms or the best ways to perform many functions. However, I was able to create a Macro that partially does what I need it to, but I can't seem to figure out the rest.

The Situation: I have vulnerability scan results that I'm attempting to parse a certain way so that I can get relevant information out properly so that I can provide it in a clean and concise way to my teams. Currently I get the vuln results into an excel file which has all of the information per vuln, per host (so 1 vuln showing on 6 hosts, produces 6 rows). My current Macro is able to distill the information down so that I can view it as 1 row per vuln (indicated by a unique PluginID), with all of the hosts concatenated into 1 cell delimited by a line break. The current macro accomplishes this well, however there is another column within the vuln results that is very important - "Vulnerability Proof". On some vulns, the "Vulnerability Proof" is the same for all hosts with the vuln, and on some its different. I need to be able to list all unique "vulnerability proofs" with all of the hosts that the proof is related to, so that I get a concise list of "these hosts have this proof, these hosts have this proof etc."

Here some sample source data:

PluginID Description Host Vuln Proof
Plugin123 CVE-Plugin123 Host1 Version 1.2.3 detected
Plugin123 CVE-Plugin123 Host2 Version 1.2.3 detected
Plugin123 CVE-Plugin123 Host3 Version 4.5.6 detected
Plugin456 Plugin456-2021 Vuln Host1 Version 7.8.9 detected
Plugin456 Plugin456-2021 Vuln Host2 Version 10.11.12 detected
Plugin456 Plugin456-2021 Vuln Host3 Version 10.11.12 detected

Currently when I run my macro (below), I get the following output:

PluginID Description Host Vuln Proof
Plugin123 CVE-Plugin123 Host1, Host2, Host3 Version 1.2.3 detected, Version 1.2.3 detected, Version 4.5.6 detected
Plugin456 Plugin456-2021 Vuln Host1, Host2, Host3 Version 7.8.9 detected, Version 10.11.12 detected, Version 10.11.12 detected

This, while achieving the goal of reviewing by 1 row per vuln, doesn't give a very good way to review the vulnerability proof if/when that proof is long, and/or there are many, many hosts reporting the vuln.

How I'd like to receive the output:

PluginID Description Host Vuln Proof
Plugin123 CVE-Plugin123 Host1, Host2, Host3 Host1, Host 2: Version 1.2.3 detected
Host 3: Version 4.5.6 detected
Plugin456 Plugin456-2021 Vuln Host1, Host2, Host3 Host 1: Version 7.8.9 detected
Host 2, Host3: Version 10.11.12 detected

My current Macro looks like this:

Sub CombineDupRows()
    Dim cCR As cCombinedRows
    Dim colCR As Collection
    Dim wsSrc As Worksheet, wsRes As Worksheet, rRes As Range
    Dim vSrc As Variant, vRes() As Variant
    Dim S As String
    Dim I As Long, J As Long
    
Sheets("Results").Activate
    
Set wsSrc = Worksheets("Source")
Set wsRes = Worksheets("Results")
    Set rRes = wsRes.Cells(1, 1)
    
With wsSrc
    vSrc = .Range("a1", .Cells(.Rows.Count, "A").End(xlUp)).Resize(columnsize:=4)
End With
Application.ScreenUpdating = False
'collect source data
Set colCR = New Collection
On Error Resume Next
For I = 1 To UBound(vSrc)
    Set cCR = New cCombinedRows
    With cCR
        For J = 1 To 2
            .Key(J) = CStr(vSrc(I, J))
        Next J
        
        .Phrase(0) = vSrc(I, 3)
        .ProofPhrase(0) = vSrc(I, 4)
        
        'The key will be the concatenation of columns 1-2
        S = Join(.Keys, Chr(1))
        
        'if key is duplicate, add phrase to existing collection item
        On Error Resume Next
            colCR.Add cCR, S
            Select Case Err.Number
                Case 457  'duplicate key
                    Err.Clear
                    colCR(S).Phrase(UBound(colCR(S).Phrases) + 1) = .Phrase(0)
                    colCR(S).ProofPhrase(UBound(colCR(S).ProofPhrases) + 1) = .ProofPhrase(0)
                Case Is <> 0  'some other error.  Stop for debugging
                    Debug.Print Err.Number, Err.Description, Err.Source
                    Stop
                End Select
        On Error GoTo 0
    End With
Next I

'Create results array
ReDim vRes(1 To colCR.Count, 1 To 4)
For I = 1 To colCR.Count
    With colCR(I)
        For J = 1 To 2
            vRes(I, J) = colCR(I).Key(J)
        Next J
        vRes(I, J) = Join(.Phrases, " " & Chr(13) & Chr(10))
        vRes(I, 4) = Join(.ProofPhrases, " " & Chr(13) & Chr(10))
    End With
Next I

Set rRes = rRes.Resize(UBound(vRes, 1), UBound(vRes, 2))
With rRes
    .Value = vRes
End With

As part of this, I also had to create a Class Module which looks like this:

Option Explicit
Private pKeys() As String
Private pKey As String
Private pPhrases() As String
Private pPhrase As String
Private ProofpPhrases() As String
Private ProofpPhrase As String

Private Sub Class_Initialize()
    ReDim pKeys(0)
    ReDim pPhrases(0)
    ReDim ProofpPhrases(0)
End Sub


Public Property Get Keys() As String()
    Keys = pKeys
End Property

Public Property Get Key(index As Long) As String
    Key = pKeys(index)
End Property

Public Property Let Key(index As Long, strValue As String)
    If index > UBound(pKeys) Then ReDim Preserve pKeys(index)
    pKeys(index) = strValue
End Property

Public Property Get Phrases() As String()
    Phrases = pPhrases
End Property

Public Property Get Phrase(index As Long) As String
    Phrase = pPhrases(index)
End Property

Public Property Let Phrase(index As Long, strValue As String)
    If index > UBound(pPhrases) Then ReDim Preserve pPhrases(index)
    pPhrases(index) = strValue
End Property

Public Property Get ProofPhrases() As String()
    ProofPhrases = ProofpPhrases
End Property

Public Property Get ProofPhrase(index As Long) As String
    ProofPhrase = ProofpPhrases(index)
End Property

Public Property Let ProofPhrase(index As Long, strValue As String)
    If index > UBound(ProofpPhrases) Then ReDim Preserve ProofpPhrases(index)
    ProofpPhrases(index) = strValue
End Property

I've attempted to add some string types for the host again which could be used to provide some sort of concatenating in the proof column in the results, but I haven't been successful thus far even with that (even though its not even fully what I am looking for). I've spent a good amount of time tinkering and attempting to get this to work, and am finally asking for help. What can I add to the Macro to get this to work to get results out the way I'm looking for?

Upvotes: 0

Views: 335

Answers (2)

Ron Rosenfeld
Ron Rosenfeld

Reputation: 60224

I've been trying to learn Power Query (available in Windows Excel 2010+ and Office 365), so I will present this six-step procedure as an alternative.

To use Power Query

  • Select some cell in your Data Table
  • Data => Get&Transform => from Table/Range
  • When the PQ Editor opens: Home => Advanced Editor
  • Make note of the Table Name in Line 2
  • Paste the M Code below in place of what you see
  • Change the Table name in line 2 back to what was generated originally.
  • Read the comments and explore the Applied Steps to understand the algorithm

M Code

let

//Change Name in next line to actual name in your workbook
    Source = Excel.CurrentWorkbook(){[Name="Table22"]}[Content],

//Set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"PluginID", type text}, 
        {"Description", type text}, 
        {"Host", type text}, 
        {"Vuln Proof", type text}}),

//Group by PluginID
    #"Grouped Rows" = Table.Group(#"Changed Type", {"PluginID"}, {

//If there is more than on associated Description, not sure what you want to do
//  so we only return the first entry
        {"Description", each [Description]{0}},

// Aggregate the Hosts
        {"Host", each Text.Combine([Host], ", ")},

// Combine Host:Vuln Proofs by grouping each subgroup by the Vuln Proof
// and extracting the associated Hosts
        {"Vuln Proof", (t)=> Table.Group(t, "Vuln Proof",{
            {"Hosts", (x)=> Text.Combine(x[Host],", ")}})}}),

// Add custom column to create a list of the vuln proof and hosts from each subtable
    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", 
        each List.Transform(List.Zip({[Vuln Proof][Hosts],[Vuln Proof][Vuln Proof]}),
                                each Text.Combine(_,":"))),

//Then concatenate the list of hostes with the vuln proof
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", 
        {"Custom", each Text.Combine(List.Transform(_, Text.From), "#(lf)"), type text}),

//remove unneeded column        
    #"Removed Columns" = Table.RemoveColumns(#"Extracted Values",{"Vuln Proof"})
in
    #"Removed Columns"

enter image description here

Upvotes: 1

Tim Williams
Tim Williams

Reputation: 166331

Here's how I'd tackle it - sorry did not have time to try to understand/adapt your code...

Sub Combine()
    Const SEP As String = "~~~"
    Dim r As Long, dictHosts As Object, dictProofs As Object, kp, k, host, proof
    Dim data, dict As Object, c As Range, s As String
    
    Set dictHosts = CreateObject("scripting.dictionary")
    Set dictProofs = CreateObject("scripting.dictionary")

    'read all data to array for processing
    data = ActiveSheet.Range("A2:D" & ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row).Value
    
    'process and combine
    For r = 1 To UBound(data, 1)
        k = data(r, 1) & SEP & data(r, 2)
        host = data(r, 3)
        proof = data(r, 4)
        'tracking vuln vs hosts
        If Not dictHosts.exists(k) Then dictHosts.Add k, New Collection 'new vuln
        dictHosts(k).Add host                              'add host for this vuln
        
        'tracking vuln vs host<>proof
        If Not dictProofs.exists(k) Then dictProofs.Add k, CreateObject("scripting.dictionary")
        Set dict = dictProofs(k)
        If Not dict.exists(proof) Then dict.Add proof, New Collection
        dict(proof).Add host
    Next r
    
    'output results
    Set c = ActiveSheet.Range("G1")
    For Each k In dictHosts
        c.Value = Split(k, SEP)(0)               'plugin ID
        c.Offset(0, 1).Value = Split(k, SEP)(1)  'description
        c.Offset(0, 2).Value = Join(ColToArray(dictHosts(k)), ", ") 'list of hosts
        s = ""
        Set dict = dictProofs(k)
        'build the proof: hosts string
        For Each kp In dict
            s = s & IIf(s <> "", vbLf, "") & kp & ": " & Join(ColToArray(dict(kp)), ",")
        Next kp
        c.Offset(0, 3).Value = s
        
        Set c = c.Offset(1, 0)
    Next k
End Sub

'Utility function: get an array from a Collection
Function ColToArray(col As Collection)
    Dim v, arr, i
    ReDim arr(0 To col.Count - 1)
    For i = 1 To col.Count
        arr(i - 1) = col(i)
    Next i
    ColToArray = arr
End Function

Upvotes: 1

Related Questions