Reputation: 2706
I am looking for alternatives to VLOOKUP with better performance in the context described below.
The context is the following: every lookup is an exact match (the last argument to VLOOKUP is FALSE).
A schema to explain:
Reference sheet: ("sheet1")
A B
1
2 key1 data1
3 key2 data2
4 key3 data3
... ... ...
99999 key99998 data99998
100000 key99999 data99999
100001 key100000 data100000
100002
Lookup sheet:
A B
1
2 key51359 =VLOOKUP(A2;sheet1!$A$2:$B$100001;2;FALSE)
3 key41232 =VLOOKUP(A3;sheet1!$A$2:$B$100001;2;FALSE)
4 key10102 =VLOOKUP(A4;sheet1!$A$2:$B$100001;2;FALSE)
... ... ...
99999 key4153 =VLOOKUP(A99999;sheet1!$A$2:$B$100001;2;FALSE)
100000 key12818 =VLOOKUP(A100000;sheet1!$A$2:$B$100001;2;FALSE)
100001 key35032 =VLOOKUP(A100001;sheet1!$A$2:$B$100001;2;FALSE)
100002
On my Core i7 M 620 @2.67 GHz, this computes in ~10 minutes
Are there alternatives to VLOOKUP with better performance in this context?
Upvotes: 23
Views: 46318
Reputation: 1
A useful fix: check for a blank cell when building the dictionary. If the key cell is blank, leave the loop with Exit For.
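The blank-cell check could be sketched as below. This is only an illustration under assumptions: the helper name BuildLookupDict is hypothetical, it uses late binding (CreateObject) instead of a reference to Microsoft Scripting Runtime, and it treats a truly empty cell (not a formula returning "") as the end of the data.

```vba
' Hypothetical helper: builds a key -> data dictionary from refRange,
' stopping at the first blank key cell.
Function BuildLookupDict(refRange As Range, dataCol As Long) As Object
    Dim dict As Object
    Dim keyCell As Range

    ' Late binding, so no reference to Microsoft Scripting Runtime is needed.
    Set dict = CreateObject("Scripting.Dictionary")

    For Each keyCell In refRange.Columns(1).Cells
        ' A blank key is taken to mark the end of the data: stop reading.
        If IsEmpty(keyCell.Value) Then Exit For
        ' Add raises a run-time error if the key is already present.
        dict.Add keyCell.Value, keyCell.Offset(0, dataCol - 1).Value
    Next keyCell

    Set BuildLookupDict = dict
End Function
```

Without the check, every blank cell after the first would try to add the same empty key again, and dict.Add would fail on the duplicate.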
Upvotes: -1
Reputation:
You may also want to consider the “double VLOOKUP” method (not my idea; seen elsewhere). I tested it on 100,000 lookup values on sheet 2 (randomly sorted), with a data set on sheet 1 identical to the one you’ve described, and timed it at just under 4 seconds. The code is also a bit simpler.
Sub FastestVlookup()
    With Sheet2.Range("B1:B100000")
        .FormulaR1C1 = _
            "=IF(VLOOKUP(RC1,Sheet1!R1C1:R100000C1,1)=RC1,VLOOKUP(RC1,Sheet1!R1C1:R100000C2,2),""N/A"")"
        .Value = .Value
    End With
End Sub
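Both VLOOKUP calls in this formula omit the last argument, so they run in approximate-match mode, which uses a binary search and only returns reliable results when the key column is sorted in ascending order. A minimal sketch of a sort step to run first, assuming the reference data sits in Sheet1!A1:B100000 with no header row (the sub name SortReferenceKeys is hypothetical):

```vba
Sub SortReferenceKeys()
    ' Sort the reference data by its key column (A), ascending,
    ' so that approximate-match VLOOKUP can binary-search it.
    ' Assumption: data is in Sheet1!A1:B100000 and has no header row.
    With Sheet1.Range("A1:B100000")
        .Sort Key1:=.Cells(1, 1), Order1:=xlAscending, Header:=xlNo
    End With
End Sub
```

The first VLOOKUP in the formula then verifies that the key found really equals the key searched for, which is what turns the approximate match back into an exact one.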
Upvotes: 6
Reputation: 51
Switch to Excel 2013 and use the Data Model. There you can define a column of unique ID keys in both tables and bind the two tables with a relationship in a Pivot Table. Then, if absolutely necessary, you can use GETPIVOTDATA() to fill the first table. I had a ~250K-row table doing VLOOKUP into a similar ~250K-row table, and I stopped Excel calculating it after an hour. With the Data Model it took less than 10 seconds.
Upvotes: 5
Reputation: 2706
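I considered the following alternatives: a VLOOKUP array formula, MATCH + INDEX, and a VBA function using a dictionary.
Their compared performance is summarized in the conclusion.
All three use the same reference sheet.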
1) Lookup sheet: (vlookup array formula version)
A B
1
2 key51359 {=VLOOKUP(A2:A100001;sheet1!$A$2:$B$100001;2;FALSE)}
3 key41232 formula in B2
4 key10102 ... extends to
... ... ...
99999 key4153 ... cell B100001
100000 key12818 ... (select whole range, and press
100001 key35032 ... CTRL+SHIFT+ENTER to make it an array formula)
100002
2) Lookup sheet: (match+index version)
A B C
1
2 key51359 =MATCH(A2;sheet1!$A$2:$A$100001;) =INDEX(sheet1!$B$2:$B$100001;B2)
3 key41232 =MATCH(A3;sheet1!$A$2:$A$100001;) =INDEX(sheet1!$B$2:$B$100001;B3)
4 key10102 =MATCH(A4;sheet1!$A$2:$A$100001;) =INDEX(sheet1!$B$2:$B$100001;B4)
... ... ... ...
99999 key4153 =MATCH(A99999;sheet1!$A$2:$A$100001;) =INDEX(sheet1!$B$2:$B$100001;B99999)
100000 key12818 =MATCH(A100000;sheet1!$A$2:$A$100001;) =INDEX(sheet1!$B$2:$B$100001;B100000)
100001 key35032 =MATCH(A100001;sheet1!$A$2:$A$100001;) =INDEX(sheet1!$B$2:$B$100001;B100001)
100002
3) Lookup sheet: (vbalookup version)
A B
1
2 key51359 {=vbalookup(A2:A50001;sheet1!$A$2:$B$100001;2)}
3 key41232 formula in B2
4 key10102 ... extends to
... ... ...
50000 key91021 ...
50001 key42 ... cell B50001
50002 key21873 {=vbalookup(A50002:A100001;sheet1!$A$2:$B$100001;2)}
50003 key31415 formula in B50002 extends to
... ... ...
99999 key4153 ... cell B100001
100000 key12818 ... (select whole range, and press
100001 key35032 ... CTRL+SHIFT+ENTER to make it an array formula)
100002
NB: For some reason outside this code (apparently an Excel internal limit), vbalookup fails to return more than 65,536 values at a time, so I had to split the array formula in two.
The associated VBA code:
Function vbalookup(lookupRange As Range, refRange As Range, dataCol As Long) As Variant
    Dim dict As New Scripting.Dictionary
    Dim myRow As Range
    Dim I As Long, J As Long
    Dim vResults() As Variant

    ' 1. Build a dictionary from the reference range
    For Each myRow In refRange.Columns(1).Cells
        ' Add key (column 1) : data (column dataCol) to the dictionary
        dict.Add myRow.Value, myRow.Offset(0, dataCol - 1).Value
    Next myRow

    ' 2. Use it over all lookup data; cells with no match are left Empty
    ReDim vResults(1 To lookupRange.Rows.Count, 1 To lookupRange.Columns.Count) As Variant
    For I = 1 To lookupRange.Rows.Count
        For J = 1 To lookupRange.Columns.Count
            If dict.Exists(lookupRange.Cells(I, J).Value) Then
                vResults(I, J) = dict(lookupRange.Cells(I, J).Value)
            End If
        Next J
    Next I

    vbalookup = vResults
End Function
NB: Scripting.Dictionary requires a reference to Microsoft Scripting Runtime, which must be added manually (Tools->References menu in the Excel VBA window).
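A possible variant, not benchmarked here: reading each range into a Variant array once is a common VBA optimization over touching cells one by one, and late binding removes the need for the manual reference. The sketch below makes several assumptions: the name vbalookupFast is hypothetical, both ranges contain more than one cell (so .Value returns a 2-D array), and a repeated key silently keeps its last value instead of raising an error as dict.Add does.

```vba
Function vbalookupFast(lookupRange As Range, refRange As Range, dataCol As Long) As Variant
    Dim dict As Object
    Dim refData As Variant, lookupData As Variant
    Dim vResults() As Variant
    Dim I As Long, J As Long

    ' Late binding: no reference to Microsoft Scripting Runtime required.
    Set dict = CreateObject("Scripting.Dictionary")

    ' Read both ranges into memory with one call each.
    ' Assumption: each range has more than one cell, so .Value is a 2-D array.
    refData = refRange.Value
    lookupData = lookupRange.Value

    ' 1. Build the dictionary (the last value wins if a key is repeated).
    For I = 1 To UBound(refData, 1)
        dict(refData(I, 1)) = refData(I, dataCol)
    Next I

    ' 2. Look up every value; cells with no match are left Empty.
    ReDim vResults(1 To UBound(lookupData, 1), 1 To UBound(lookupData, 2))
    For I = 1 To UBound(lookupData, 1)
        For J = 1 To UBound(lookupData, 2)
            If dict.Exists(lookupData(I, J)) Then
                vResults(I, J) = dict(lookupData(I, J))
            End If
        Next J
    Next I

    vbalookupFast = vResults
End Function
```

It is entered the same way as vbalookup, as an array formula such as {=vbalookupFast(A2:A50001;sheet1!$A$2:$B$100001;2)}.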
Conclusion:
In this context, VBA using a dictionary is 100x faster than using VLOOKUP and 20x faster than MATCH/INDEX
Upvotes: 23