Reputation: 205
I have created a fictitious example below with a pricing lookup table Table2 and a transactions table Table1. Tables below.
Table1
+----------+--------+
| CATEGORY | VOLUME |
+----------+--------+
| Orange | 12 |
+----------+--------+
| Orange | 25 |
+----------+--------+
| Grape | 40 |
+----------+--------+
| Grape | 700 |
+----------+--------+
Table2
+----------+-------+-------+
| CATEGORY | RANGE | PRICE |
+----------+-------+-------+
| Orange | 10 | 2.50 |
+----------+-------+-------+
| Orange | 20 | 2.00 |
+----------+-------+-------+
| Orange | 30 | 1.50 |
+----------+-------+-------+
| Grape | 50 | 5.00 |
+----------+-------+-------+
| Grape | 100 | 2.00 |
+----------+-------+-------+
The hope is to reference the lookup table and pull the PRICE associated with a RANGE less than or equal to VOLUME for a given CATEGORY. In cases where VOLUME is larger than any of the RANGEs in the lookup table, it would pull the PRICE for the highest RANGE.
Desired Output
+----------+--------+-------+
| CATEGORY | VOLUME | PRICE |
+----------+--------+-------+
| Orange | 12 | 2.00 |
+----------+--------+-------+
| Orange | 25 | 1.50 |
+----------+--------+-------+
| Grape | 40 | 5.00 |
+----------+--------+-------+
| Grape | 700 | 2.00 |
+----------+--------+-------+
My head immediately went to using double unary --( array functions, but that performance would not work when both tables are hundreds of thousands of records.
Any ideas much appreciated. Using Excel 2016.
Upvotes: 1
Views: 110
Reputation: 15297
I would suggest using an SQL statement against the two worksheets/tables, and pasting the results using CopyFromRecordset
.
SQL:
SELECT t1.CATEGORY, t1.VOLUME, (
SELECT PRICE
FROM [Table2$] AS t2
WHERE t1.CATEGORY = t2.CATEGORY
AND (
t1.VOLUME <= t2.RANGE
OR t2.RANGE = MAXRANGE
)
ORDER BY t2.RANGE
) AS FINALPRICE
FROM [Table1$] AS t1
LEFT JOIN (
SELECT CATEGORY, MAX(RANGE) AS MAXRANGE
FROM [Table2$ AS t2a]
GROUP BY CATEGORY
) AS MAXRANGES ON t1.CATEGORY = MAXRANGES.CATEGORY
Add a reference (Tools -> References...) to the maximum version of Microsoft ActiveX Data Objects (on my machine it's 6.1).
It assumes three worksheets — Table1
, Table2
and Results
. (It is also possible to use named ranges, or specific cell ranges, as tables.)
Sub main()
Dim conn As New ADODB.Connection
With conn
.Provider = "Microsoft.ACE.OLEDB.12.0"
.ConnectionString = "Data Source=""" & ActiveWorkbook.FullName & """;" & _
"Extended Properties=""Excel 12.0;HDR=Yes"""
.Open
End With
Dim sql As String
sql = _
"SELECT t1.CATEGORY, t1.VOLUME, ( " & _
"SELECT TOP 1 PRICE " & _
"FROM [Table2$] AS t2 " & _
"WHERE t1.Category = t2.Category " & _
"AND ( " & _
"t1.VOLUME <= t2.RANGE " & _
"OR t2.RANGE = MAXRANGE " & _
") " & _
"ORDER BY t2.RANGE " & _
") AS FINALPRICE " & _
"FROM [Table1$] AS t1 " & _
"LEFT JOIN ( " & _
"SELECT CATEGORY, MAX(RANGE) AS MAXRANGE " & _
"FROM [Table2$] AS t2a " & _
"GROUP BY CATEGORY " & _
") AS MAXRANGES ON t1.CATEGORY = MAXRANGES.CATEGORY"
Dim rs As ADODB.Recordset
Set rs = conn.Execute(sql)
Worksheets("Results").Cells(1, 1).CopyFromRecordset rs
conn.Close
Set conn = Nothing
Set rs = Nothing
End Sub
Upvotes: 1
Reputation: 2066
Not sure if you can convert your Table 2 into a pivot table. If you can, the below should work
Convert your table 2 into a pivot table. Add two helper columns( "Row" and "Column")
Row =MATCH(E11,$E$3:$E$5,0)
this gives you the category row in the pivot table
Column =IFERROR(1/AGGREGATE(14,6,1/(($F$2:$O$2>F11)*COLUMN($F$2:$O$2)*(OFFSET($F$2:$O$2,H11,0)>0)),1)-COLUMN($E$2),COLUMN($O$2)-COLUMN($E$2))
Here I'm also using array operations(for which you had concern about performance) but that is limited to two row arrays and not the whole data you have. Still can't say much about performance
$F$2:$O$2>F11
gives all ranges greater than current volume
COLUMN($F$2:$O$2)
gives the column number
OFFSET($F$2:$O$2,H11,0)>0
checks in the current category row, which cells have values.
The combination of above three conditions gives you the column numbers that has a range greater than the required volume and has a corresponding price to it.
The 1/AGGREGATE()
part gives you the minimum column number which is subtracted from the first column's number COLUMN($E$2)
.
if the volume is larger than max range, the formula will give error which is captured in the IFERROR part. COLUMN($O$2)-COLUMN($E$2)
gives you the last column number in the pivot table.
Once you have the row and column number, the price can be got by the below formula
=INDEX($E$3:$O$5,H11,I11+1)
Note the +1 for the column number in this formula
Here is an image to see the references correctly
To check performance I generated a data set of 391 categories with each having 40 ranges. Doesn't make excel slow....but your case might be different
Let me know if you need help with understanding. My explanations might not be enough :P
Upvotes: 1