skkakkar
skkakkar

Reputation: 2828

Why Dictionary takes second key item occurance instead of first one

I have to match column items of worksheet "Sheet1" which may have 30K rows to "Sheet2" items which may have around 5K rows.

In my sample data I have taken questions of Stack Overflow site as My actual data closely resembles it. Sheet2 data generally should be unique but chances of human error remains and there can be some duplicates in which one of the values can be erroneous.

My specific query is why Dictionary is outputting second occurance of key item value instead of the first occurance item value. I am using a scripting dictionary as a lookup and reading/writing all values as arrays for maximum speed.

Sheet2 has following sample data.

Q_DESC                                                                               Q_ID   Q_NO
vba  array  assign  values  for  loop                                                300003 34595616
adding  values  to  list  using  vlookup                                             300008 34517577
indirect  function  excel                                                            300018 34437000
multiple  replace  excel  vba  vba                                                   300004 34538197
indirect  function  excel                                                            300018 34438222
how  to  reference  a  cell  in  another  excel  worksheet  using  id  from  a  formula300020 34401572

Sheet1 has following sample data.

Q_DESC                                                                                Q_ID  Q_NO
copy  excel  worksheet  range  and  paste  into  outlook  as  a  picture             300002 34599462
vba  array  assign  values  for  loop                                                300003 34595616
use  a  for  to  move  from  cells                                                   300004 34538197
move  rows  to  another  sheet  if  meeting  criteria                                300005 34534837
subscript  out  of  range  error  excel  copy  paste                                 300006 34532985
extract  data  from  excel  file  joined  by  type  of  data  in  sheets             300007 34518747
adding  values  to  list  using  vlookup                                             300008 34517577
vba  workbook  close  function  cannot  make  this  work  within  required  location 300009 34514214
excel  vba  how  to  maintain  a  cell  value  from  an  instant  of  a  changing  variable 300010  34500540
excel  vba  open  csv  in  notepad  and  copy  all  to  excel                        300011 34498063
if  function  with  8  conditions   300012  34481381
excel  2013  how  to  copy  selective  columns  form  different  workbook            300013 34483403
formula  in  cells  missing  after  running  vba  macro                              300014 34464421
how  to  work  with  selection  range  in  excel  cell                               300015 34460417
multiple  replace  excel  vba  vba                                                   300016 34447116
copying  without  the  clipboard  and  without  formatting                           300017 34443932
indirect  function  excel                                                            300018 34438222
hiding  columns  based  on  cell  value  using  vba                                  300019 34424332
how  to  reference  a  cell  in  another  excel  worksheet  using  id  from  a  formula300020   34401572

Following code fragment for dictionary key and items value gives output mentioned subsequently.

 Dim key As Variant
    For Each key In dict.Keys
        Debug.Print key, dict(key)
    Next key

Results obtained are

    vba  array  assign  values  for  loop                                           34595616 
    adding  values  to  list  using  vlookup                                        34517577 
    indirect  function  excel                                                       34438222 
    multiple  replace  excel  vba  vba                                              34538197 
    how  to  reference  a  cell  in  another  excel  worksheet  using  id  from  a  formula    34401572 

I need help in understanding as to why for key indirect function excel it is printng second item value i.e. 34438222 instead of first occurance value which is 34437000 as key already exists.

Code adopted by me is based on one of the answers provided by Tim Williams on this site as mentioned below.

 Sub MatchItems()

    Dim ws1 As Worksheet
    Dim rws As Worksheet, t, arr1, arr2
    Dim dict As Object, rw As Range, res(), arr, nR As Long, i As Long

    Set ws1 = ThisWorkbook.Sheets("Sheet1")
    Set rws = ThisWorkbook.Sheets("Sheet2")
    Set dict = CreateObject("scripting.dictionary")

    t = Timer

    'create a lookup from two arrays
    arr1 = rws.Range("a1").CurrentRegion.Columns(1).Value
    arr2 = rws.Range("a1").CurrentRegion.Columns(3).Value
    For i = 2 To UBound(arr1, 1)
        dict(arr1(i, 1)) = arr2(i, 1)
    Next i
    Dim key As Variant
    For Each key In dict.Keys
        Debug.Print key, dict(key)
    Next key
    Debug.Print "created lookup", Timer - t

    'get the values to look up
    arr = ws1.Range(ws1.Range("A2"), ws1.Cells(Rows.Count, 1).End(xlUp))
    nR = UBound(arr, 1)        '<<number of "rows" in your dataset
    ReDim res(1 To nR, 1 To 1) '<< resize the output array to match

    'perform the lookup
    For i = 1 To nR
        If dict.exists(arr(i, 1)) Then
            res(i, 1) = dict(arr(i, 1))
        Else
            res(i, 1) = ""
        End If
    Next i

    ws1.Range("E2").Resize(nR, 1).Value = res '<< populate the results

    Debug.Print "Done", Timer - t

End Sub

Upvotes: 0

Views: 82

Answers (2)

Josh Eller
Josh Eller

Reputation: 2065

It sounds like you don't want to write to the dictionary if a value for that key already exists. If that's the case, just replace this:

For i = 2 To UBound(arr1, 1)
    dict(arr1(i, 1)) = arr2(i, 1)
Next i

With this:

For i = 2 To UBound(arr1, 1)
    If Not dict.exists(arr1(i, 1)) Then dict(arr1(i, 1)) = arr2(i, 1)
Next i

Upvotes: 2

shrivallabha.redij
shrivallabha.redij

Reputation: 5902

Your answer lies in this piece of code which basically loops through the whole column with repeated values.

'create a lookup from two arrays
arr1 = rws.Range("a1").CurrentRegion.Columns(1).Value
arr2 = rws.Range("a1").CurrentRegion.Columns(3).Value
For i = 2 To UBound(arr1, 1)
    dict(arr1(i, 1)) = arr2(i, 1)
Next i

where the last record (and not second record) gets populated in the dict. You should use Exists method or alternatively you can modify the for loop as below for returning first record e.g.

'create a lookup from two arrays
arr1 = rws.Range("a1").CurrentRegion.Columns(1).Value
arr2 = rws.Range("a1").CurrentRegion.Columns(3).Value
For i = UBound(arr1, 1) To LBound(arr1, 1) Step -1
    dict(arr1(i, 1)) = arr2(i, 1)
Next i

Hth

Upvotes: 1

Related Questions